Hi, I am building an appointment booking system for a hospital. I need Natural Language Classifier (NLC) to return a low confidence level for questions that are not covered by the training set. I had a look at this link, where it's advised to set up a "notRelevant" class with a set of sample questions that do not match any of the normal classes.
I have tried this, and NLC returns "notRelevant" only for questions that resemble those samples (obviously that's the behaviour of NLC: it answers based on the sample set).
But covering every negative scenario for a domain is not possible. For example, my domain is appointment booking in a hospital, but if I ask NLC "Tell me world wonders", it returns one of the classes from the training set with a high confidence level.
Is there any solution for handling negative / non-domain scenarios?
Answer by @chughts (11657) | Nov 12, 2015 at 02:44 AM
The hurdle you are hitting is that if you train the Natural Language Classifier (NLC) to recognize only hospital-related statements, it will see a hospital-related statement in everything you give it.
If you take a look at the response that you get, you will notice that it gives a confidence level against every classification in the corpus you created, and not only the classification that it ranks with the highest confidence level. You get higher confidence levels the closer NLC can match the statement with a classification statement that it has been trained with.
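As a rough illustration of reading those per-class confidence levels, here is a minimal Python sketch. It assumes the classify response has already been parsed into a dict with a `classes` list of `class_name` / `confidence` entries. The sample values, the class names and the `min_confidence` threshold are made up for illustration and are not real NLC output.

```python
def ranked_classes(response):
    """Return (class_name, confidence) pairs, highest confidence first."""
    classes = response.get("classes", [])
    return sorted(
        ((c["class_name"], c["confidence"]) for c in classes),
        key=lambda pair: pair[1],
        reverse=True,
    )


def top_class_or_none(response, min_confidence=0.8):
    """Return the top class name, or None if its confidence is below the threshold.

    min_confidence is an assumed cut-off; tune it against your own data.
    """
    ranked = ranked_classes(response)
    if not ranked or ranked[0][1] < min_confidence:
        return None
    return ranked[0][0]


# Hand-written sample response for illustration only (hypothetical class names).
sample_response = {
    "text": "Tell me world wonders",
    "top_class": "book_appointment",
    "classes": [
        {"class_name": "cancel_appointment", "confidence": 0.31},
        {"class_name": "book_appointment", "confidence": 0.62},
        {"class_name": "reschedule_appointment", "confidence": 0.07},
    ],
}

for name, confidence in ranked_classes(sample_response):
    print(f"{name}: {confidence:.2f}")
```

A threshold like this only helps when the top confidence for off-topic text is actually low, which, as the question points out, is not guaranteed when every class in the training set is hospital-related.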
In your case you might want to nest your classifiers, i.e. start off with a simple "relevant" / "notrelevant" split, then pass the statement to your hospital classifier only once you know it has a high confidence level of being relevant.
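The nesting idea could be sketched roughly as follows. This is only an outline under assumptions: `classify` stands for whatever function you use to call NLC and get back a parsed response dict, the classifier IDs `gate-classifier` and `hospital-classifier` are hypothetical, and `fake_classify` is a stand-in so the example runs without the service.

```python
def classify_nested(text, classify, gate_id, domain_id, min_relevance=0.7):
    """Run the relevance classifier first, then the domain classifier.

    Returns the domain classifier's top class, or None when the statement
    does not look relevant enough. min_relevance is an assumed threshold.
    """
    gate = classify(gate_id, text)
    relevance = next(
        (c["confidence"] for c in gate["classes"] if c["class_name"] == "relevant"),
        0.0,
    )
    if relevance < min_relevance:
        return None  # treat as a non-domain question
    return classify(domain_id, text)["top_class"]


def fake_classify(classifier_id, text):
    """Stand-in for a real NLC call; returns hand-made responses."""
    if classifier_id == "gate-classifier":
        relevant = 0.9 if "appointment" in text.lower() else 0.1
        return {
            "top_class": "relevant" if relevant >= 0.5 else "notrelevant",
            "classes": [
                {"class_name": "relevant", "confidence": relevant},
                {"class_name": "notrelevant", "confidence": 1 - relevant},
            ],
        }
    return {
        "top_class": "book_appointment",
        "classes": [{"class_name": "book_appointment", "confidence": 0.95}],
    }


print(classify_nested("I need an appointment tomorrow", fake_classify,
                      "gate-classifier", "hospital-classifier"))
print(classify_nested("Tell me world wonders", fake_classify,
                      "gate-classifier", "hospital-classifier"))
```

The first-stage classifier still needs broad "notrelevant" training examples, so this narrows the problem but does not remove it.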