How could Watson win Jeopardy when it provides "stupid" answers in the test?

Question by Thomas1st (20) | Nov 13, 2014 at 04:23 AM | Tags: watson, question-answer, ibmcloud

Hi,

I would like to test Watson in order to learn whether it makes sense to build applications on top of it. Since Watson already won Jeopardy! in 2011, I started testing the Q&A service, but I'm totally disappointed. Selecting the topic Travel and asking "What is the capital of the United States?" delivers... garbage! The answers in the field of healthcare are also of lower quality than a Google search. Is there anything that needs to be set up on "http://watson-qa-demo.mybluemix.net/" so that Watson will answer questions with the quality it already showed three years ago?

Thank you -Thomas

People who like this

  0   Show 1
Comment
10 |3000 characters needed characters left characters exceeded
  • Viewable by all users
  • Viewable by moderators
  • Viewable by moderators and the original poster
Comment by Accessiblesoft (92) | Nov 13, 2014 at 08:37 AM

I was about to post the exact same thread! Oh my god, this thing is dumb as dirt. The current chatbots are actually better than WATSON. I mean way better, and I don't think chatbots are any good really. This is a big failure and it must be embarrassing for them.

9 answers


Answer by Jerome Pesenti (181) | Nov 14, 2014 at 04:57 PM

Hi Thomas and AccessibleSoft,

As grulex guessed, the Q&A API is only a small part of the Jeopardy system. It is limited to answering with a single answer type: passages extracted from the provided content. In the coming months, we will release a series of services that will allow developers to build the full Jeopardy stack, in particular to answer factoid-type questions and to select the proper answer-type pipeline.

So no "national security" or "cheating" theory needed ;-). This is just an engineering problem. We are trying to release the research assets in consumable bytes. We've found that passage answers provide the most value to our early customer and that's why we started there.

Now, we realize that these two examples, based on some limited fixed content and limited training (they will get better over time as we collect more questions), aren't very useful. They are really meant as a quick preview of the APIs. The real value comes to those who can provide their own content and do their own training, a functionality we'll open up to everybody early next year.

Jerome
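
For readers who want to poke at the passage-style answers Jerome describes, here is a minimal sketch of a REST call to the beta Q&A service. The endpoint path, dataset name, and payload shape are assumptions based on the beta documentation of the time, not a verified contract; the actual URL and credentials come from your own Bluemix service binding.

```python
import requests

# Assumed beta endpoint; substitute the URL and credentials from your
# Bluemix "question_and_answer" service binding (VCAP_SERVICES).
QA_URL = "https://gateway.watsonplatform.net/question-and-answer-beta/api/v1/question/travel"
USERNAME = "YOUR_SERVICE_USERNAME"   # placeholder
PASSWORD = "YOUR_SERVICE_PASSWORD"   # placeholder

def ask(question_text, items=3):
    """Send a question and return the raw JSON response (schema assumed)."""
    payload = {
        "question": {
            "questionText": question_text,
            "evidenceRequest": {"items": items},  # number of passages requested (assumed field)
        }
    }
    resp = requests.post(
        QA_URL,
        json=payload,
        auth=(USERNAME, PASSWORD),
        headers={"Accept": "application/json", "X-SyncTimeout": "30"},
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    # Whatever comes back is a ranked list of passages, not a single factoid,
    # which is exactly the limitation Jerome points out above.
    print(ask("What is the capital of the United States?"))
```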

Comment by Accessiblesoft (92) | Nov 14, 2014 at 06:10 PM

Ah thank you for clearing that up.


Answer by Chris Madison (551) | Nov 13, 2014 at 06:49 AM

Hi Thomas,

The Travel and Healthcare corpora that you are working with have not been fully trained. Over time, you will find that these publicly available corpora will become more accurate as we have the opportunity to train Watson with questions from the community.

The version of Watson that we are using is domain independent. Think of it as a child that needs to be taught. Of course, this child has been preloaded with a lot of content!

Domain adaptation requires training (educating) Watson on a set of question/answer pairs. Generating question/answer pairs for both the travel and healthcare corpora would have been a very time-consuming task for the small team creating the corpora.

The questions asked by the community and the answers provided by Watson are captured in Watson's adaptive learning component. When the Feedback API is used, the relevancy rating provided by the user is captured and aids in training Watson.

Right now, you may receive answers that are not very relevant, but over time, Watson will learn.
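
To illustrate the feedback loop Chris describes, here is a rough sketch of submitting a relevancy rating back to the service. The path, field names, and rating scale are assumptions for illustration only; consult the QA Service Feedback API documentation for the real contract.

```python
import requests

# Assumed feedback endpoint and payload; both are illustrative, not the
# documented contract. Credentials come from your Bluemix service binding.
FEEDBACK_URL = "https://gateway.watsonplatform.net/question-and-answer-beta/api/v1/feedback"

def send_feedback(question_id, answer_id, rating, auth):
    """Report how relevant a returned answer was (hypothetical schema)."""
    payload = {
        "questionId": question_id,  # id returned with the original answer (assumed field)
        "answerId": answer_id,      # id of the passage being rated (assumed field)
        "rating": rating,           # e.g. 0 = irrelevant, 1 = partial, 2 = relevant (assumed scale)
    }
    resp = requests.post(FEEDBACK_URL, json=payload, auth=auth)
    resp.raise_for_status()
    return resp.status_code
```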

Comment by Accessiblesoft (92) | Nov 13, 2014 at 08:46 AM

Wow, Chris, you don't actually believe that bull, do you? So this WATSON is not trained at all... why? What happened since 2011? The team training the health and travel corpora must be very bad at their job.

That explanation doesn't really fly. If you have to train the QA pairs, then what is the service supposed to be doing? A developer might as well spend the time and use obsolete AIML for that. Dude, the AI is supposed to do the hard work automatically. I created a real AI that does this, inspired by WATSON. Apparently my inspiration has exceeded the mentor.

Comment by Chris Madison (551), replying to Accessiblesoft (92) | Nov 13, 2014 at 09:58 AM

Yes, I actually do believe my bull. I work pretty closely with the team.

Comment by Accessiblesoft (92), replying to Chris Madison (551) | Nov 13, 2014 at 10:10 AM

Perhaps you guys should focus on identifying the format of the question first. A polar question implies the format of a yes or no answer. Nothing about WATSON appears to show that it knows the format of the question, and as stated it gives worse results than the Google search engine. The AI is supposed to return a refined answer, saving you the tedious work of reading through content and links, etc. The promotional videos show WATSON understanding the nuances of human language. This would be coded at the core, not trained. Training would only be a selective process of good/bad content.


Answer by G3E2_Sumit_Agrawal (44) | Nov 17, 2014 at 10:28 AM

You should check the start of the first episode of Jeopardy!, where it explains what Watson is!

Think of this as a three-part solution: a large, auto-scaling cloud environment for this supercomputer; an intelligence engine which can 'understand'; and an end-user API, this question-answer API.

Now we need to teach that middle part. For example, if we do big data analytics, we write map-reduce jobs or statistical rules/queries for R. Similarly, we need to train the system to 'understand' in the way we would like it to understand. It's a wonderful child we need to put through education to become an engineer, designer, chef, pilot, etc. I would be surprised if Watson came already trained for one use case, as we would then have trouble using it for completely different domain-specific solutions.

So it works once you implement it for your domain and for your customer. A little information is useless, and that is what you are facing. Try to train it on hundreds of questions and grow your content. It's like giving books to Watson to learn from, then giving it an exam and marking it!


Answer by LKrishna (164) | Nov 17, 2014 at 01:30 PM

Thanks, and great points. To your point about training, we are in the process of actually doing that for travel and healthcare. The QA service is currently trained with a minimal set of ground-truth questions and a minimal corpus. Also, it can only answer certain types of questions that are more descriptive than factoid. There is no support for polar questions either.

By using the Feedback API (see the documentation for the QA Service), we are hoping to collect more real-world questions and hope to get better results through training. Having said that, there is more than just training that is going to do the trick. For instance, for Jeopardy we had to induce other resource types such as glossaries, lexicons, and knowledge bases for the open domain. I call this Domain Adaptation.

Following your example: to put an engineer through a test, he needs to be able to understand the nuances of the domain by learning from different sources, and to assimilate the jargon, acronyms, relations between various subjects, etc. This has not yet been done for this QA Service. Over the next several revs, we will be adding capabilities that will allow the QA service to come pre-trained with various resources, and give users the ability to upload their own content and resources to train and test.

Hope this helps.
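
As a concrete picture of what "ground truth" means in this discussion, here is a small, hypothetical sketch: question/answer pairs for a travel domain, split into a training set and a held-out test set, matching the "give it books, then give it an exam" analogy above. The pair format is illustrative only and is not the service's upload format.

```python
import random

# Hypothetical ground-truth pairs for a travel domain (illustrative format only).
ground_truth = [
    {"question": "Do I need any vaccinations for the Dominican Republic?",
     "answer": "Routine vaccines are recommended; check current travel advisories."},
    {"question": "What is the capital of the United States?",
     "answer": "Washington, D.C."},
    {"question": "Do I need a visa to visit Canada as a US citizen?",
     "answer": "No visa is required for short tourist visits."},
    # ...hundreds more pairs in a realistic domain adaptation effort
]

def split_ground_truth(pairs, test_fraction=0.2, seed=42):
    """Hold out a fraction of the pairs as an 'exam' for the trained system."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

train_set, test_set = split_ground_truth(ground_truth)
print(f"{len(train_set)} training pairs, {len(test_set)} held out for evaluation")
```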

Comment by Accessiblesoft (92) | Nov 17, 2014 at 02:17 PM

The Jeopardy AI must have had advanced NLP first, right? It has to break the question down into its parts of speech, and thus its meaning. Then detect the type of question. Then convert the question into a statement so that it can match it against the database and score it. This should actually be easier now, because on Jeopardy the questions came as riddles, which are challenging even for humans to understand. That code would have been difficult to build, and that is the code we want to see made available with the QA service.

2011: speak in riddles, and WATSON knows the exact answer.

2014: ask something specific, and WATSON vaguely links some data.


Answer by Thomas1st (20) | Nov 13, 2014 at 11:33 AM

Hi Chris,

Thank you for answering, but given my example, I don't understand how Watson can even learn/improve. No answer qualifies for a "learning process"... or the process is simply too long (I've tried multiple questions). For example, you might also try "Do I need any vaccinations in the Dominican Republic?" (since there was a sample question for visas already). Since most of the questions I tried might have been thrown at Watson in Jeopardy as well, I don't understand why you don't simply train Watson with Wikipedia and several encyclopedias. Shouldn't it at least be able to provide meaningful answers to easy questions? Or, if question/answer pairs really are needed, maybe feed Watson with Trivial Pursuit or Jeopardy games? Why don't you offer the chance to ask the (improved) Jeopardy version of Watson some questions? In the video, as we all know, the answers were always one clear sentence.

As a decision maker, it is rather risky to build on top of Watson after testing it with the expectation that IBM has improved it over the last few years. I really would love to see Watson working in one given context, representing the 2014 state of the art. Please don't get me wrong: I want to see this thing happen, and I understand the difficulty of asking highly qualified scientists and engineers to "train" Watson instead of improving him/it. But unless the community learns that Watson is well on its way to "passing the Turing test" in the near future, there is a gap. As soon as you can build a bridge here, things should work in favor of IBM.

@60R7_Chris_Madison


Answer by grulex (16) | Nov 13, 2014 at 08:18 PM

I don't think Watson won Jeopardy by using only the Q&A API that you guys are testing. It might be that they used something like this just to generate thousands of hypotheses to narrow down the search area. The Developer Cloud has other Watson services as well. That is my guess. Many big companies use Watson. They would not be paying for something that is not worth it. Personally, I really need a solution like this to build an AI system and I really hope it works better than this example (watson-qa-demo.mybluemix.net).


Answer by Accessiblesoft (92) | Nov 13, 2014 at 11:27 AM

The 2011 version of WATSON appeared to have a "natural language processor" to identify the question and the nuances of human language within the query. The 2014 version available to developers certainly does not have the same processor. It has a very weak string match ability that any novice programmer could have written. It does appear to have some vaguely related content, but answers are not easily identifiable or retrievable by the current software implementation. Training it could be an even bigger waste of time, because the NLP probably will not find that answer as it did in 2011.


Answer by Accessiblesoft (92) | Nov 13, 2014 at 11:53 AM

Yes, Thomas poses some very good points. Also look at his example question, "Do I need any vaccinations in the Dominican Republic?". In my software I break the NLP down and first identify the format of the question. It is polar, usually implying a yes or no answer. Then the subject is "I", the action is "need", the preposition is "in", and "Dominican Republic" implies a place. The AI has what it needs here to narrow the search right down. This is baby stuff. In my software, the encoding of factoids is kept short and non-ambiguous for better retrieval. Humans speak with what is known as "the principle of least effort", as long as the brevity does not muddle the meaning. This is what is expected of WATSON, or of any serious AI. When I ask WATSON "Is a stroke always fatal?", a long factoid about rabies is not what I want to hear. lol
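
For what it's worth, the kind of question-format detection being argued about here can be sketched with a few simple heuristics. This is a generic illustration, not Watson's pipeline and not the commenter's software; a production system would use a real parser and a trained answer-type classifier.

```python
# Naive question-format heuristics: polar (yes/no) vs. wh-question vs. other.
# Illustrative only; real systems parse the sentence and classify the
# expected answer type (person, place, date, passage, ...).

POLAR_LEADS = ("is", "are", "do", "does", "did", "can", "could",
               "will", "would", "should", "has", "have", "am", "was", "were")
WH_WORDS = ("what", "which", "who", "whom", "whose", "where", "when", "why", "how")

def question_format(text: str) -> str:
    words = text.strip().lower().rstrip("?").split()
    if not words:
        return "unknown"
    if words[0] in WH_WORDS:
        return "wh-question"   # expects a factoid or a description
    if words[0] in POLAR_LEADS:
        return "polar"         # expects yes/no, possibly with justification
    return "other"

print(question_format("Do I need any vaccinations in the Dominican Republic?"))  # polar
print(question_format("What is the capital of the United States?"))              # wh-question
print(question_format("Is a stroke always fatal?"))                              # polar
```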


Answer by Swar (1) | Dec 23, 2014 at 10:11 PM

Based on whatever time I have invested so far, I am disappointed, at least with the Q&A API/service. I did spend time with the Twitter bot app and tried searching for nearby "Indian restaurants" (with my location enabled properly), and Watson's recommendations and confidence levels were really disappointing. Is the existing version not even ready for beta testing? If so, why did IBM release it to the developer community? I am really curious to know whether it is Watson or whether I am missing something.

Regards

Swarraj
