Will Chaparro | Published August 15, 2017
Artificial intelligence, Knowledge discovery
Recently, IBM Watson Discovery Service introduced a new capability called Relevancy Training. Relevancy Training lets you teach Watson which results should rank higher than others so that your users get the right answer to their question faster. You can train your private search collections either by using a tooling-only approach or by using the Discovery APIs. In this tutorial, I explain how to use the tooling to train your private search collection.
Relevancy Training is a process that lets you take a query, look at the search results returned for that query, and tell Watson what the ordering should be. This way, you train Watson with example queries that are representative of the queries your users enter, along with explicit ratings of the search results.
After Watson has enough information from you, it starts to learn about the patterns and structure of the search collection and the queries your users enter. Watson uses machine learning techniques to find specific signals in queries that can be applied against the corpus. It also identifies similarities between what it has learned and new queries that users enter. By using these signals and patterns, it can differentiate between “good” and “bad” documents. Watson then reorders the search results based on the training it received.
Of course, Watson is only as good as its teacher, so it is important that any training it receives is performed by someone who knows the data. The training questions should also be representative of what your users will enter. I recommend that you select the queries randomly from your records of actual user queries. Do not handpick examples that look like “good” queries to you. If you do, you are likely to bias your training data toward the queries that you would like users to ask rather than the queries that users actually ask.
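As a rough illustration of random selection, here is a minimal sketch that draws a sample from a list of logged queries. It assumes the log has already been read into a list of strings, one query per entry. The function name `sample_queries` and the sample data are hypothetical and are not part of Watson Discovery.

```python
import random


def sample_queries(log_lines, k, seed=None):
    """Return up to k distinct queries chosen uniformly at random.

    Hypothetical helper for illustration; not part of any Watson SDK.
    """
    # Collapse extra whitespace, skip blank entries, and treat queries that
    # differ only by letter case as duplicates (the first spelling is kept).
    distinct = {}
    for line in log_lines:
        query = " ".join(line.split())
        if query:
            distinct.setdefault(query.lower(), query)
    candidates = list(distinct.values())
    # A seeded generator makes the selection repeatable for review.
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))


if __name__ == "__main__":
    logged = [
        "How do cells divide?",
        "how do cells divide?",
        "gene expression in yeast",
        "protein folding",
    ]
    print(sample_queries(logged, k=2, seed=42))
```

Deduplicating first gives every distinct query the same chance of being picked. If you would rather have frequently asked queries appear more often in the training set, sample from the raw log entries instead.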
So, how do you approach getting these queries?
If you are replacing an existing search system, then there is a good chance that you have logs of queries that actual users are asking the system. Source your queries from these logs to seed the relevancy training for your Watson Discovery-based solution. Enter the queries from your logs, view the results from Watson, and then tell Watson which results are good and which are bad.
If you are starting with a new implementation, I advise you to initially deploy the system without Relevancy Training, but ensure that you are logging queries. Then, use the queries that you have logged to train Watson using Relevancy Training.
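Query logging for a new deployment can be very simple. The sketch below assumes that your application has a point where it receives the user's query text before sending it to Discovery, and that appending one JSON record per line to a local file is acceptable. The function names `log_query` and `read_logged_queries` and the record fields are hypothetical choices for this example.

```python
import json
import time


def log_query(log_path, query_text, timestamp=None):
    """Append one user query to a JSON Lines file (hypothetical format)."""
    record = {
        # Use the supplied timestamp if given, otherwise the current time.
        "timestamp": time.time() if timestamp is None else timestamp,
        "query": query_text.strip(),
    }
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_logged_queries(log_path):
    """Return the logged query strings in the order they were recorded."""
    with open(log_path, encoding="utf-8") as log_file:
        return [json.loads(line)["query"] for line in log_file if line.strip()]
```

Call `log_query` each time a user submits a search. When you are ready to train, `read_logged_queries` gives you the list of real queries to draw from.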
As an example, you might be looking for information within content-rich, detailed publications like these from the Public Library of Science.
To search these documents, I’ll upload them to a Watson Discovery Service instance. After I upload the documents, I’ll search against them. Then, I’ll start the Relevancy Training process by uploading queries. For each query, I’ll review the results that are returned and rate them as being Relevant or Not Relevant. After I’ve satisfied the learning requirements for Watson, I’ll let Watson learn from the information that I provided. Finally, I’ll try some of those searches again with a newly trained Watson.
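To make the rating step concrete, here is a minimal sketch of how one rated query could be represented as data. The tooling handles this for you, so treat this purely as an illustration. The field names (`natural_language_query`, `examples`, `document_id`, `relevance`) and the numeric scores 10 and 0 are assumptions modeled on the Discovery API's training data format and should be confirmed against the documentation. The function name `build_training_query` is hypothetical.

```python
# Assumed numeric scores for the two ratings; confirm against the documentation.
RELEVANT = 10
NOT_RELEVANT = 0


def build_training_query(query_text, ratings):
    """Build one training record from a query and its rated results.

    ratings maps a document ID to True (Relevant) or False (Not Relevant).
    The record layout is an assumption made for illustration.
    """
    if not ratings:
        raise ValueError("rate at least one result for each query")
    return {
        "natural_language_query": query_text,
        "examples": [
            {
                "document_id": document_id,
                "relevance": RELEVANT if is_relevant else NOT_RELEVANT,
            }
            for document_id, is_relevant in ratings.items()
        ],
    }


if __name__ == "__main__":
    # The document IDs below are made up for this example.
    record = build_training_query(
        "How do cells divide?",
        {"doc-001": True, "doc-002": False},
    )
    print(record)
```

Each record pairs a single query with the results that were rated for it, which mirrors the review loop described above: enter a query, look at the results, and mark each one as Relevant or Not Relevant.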
This tutorial assumes that you have some familiarity with IBM Cloud and Watson Discovery Service. You’ll need an IBM Cloud account to begin. If you don’t have an IBM Cloud account, you can request a free trial here. If you already have an IBM Cloud account and a Discovery instance, you can skip to Step 5. If you already have a collection, you can skip to Step 9.
This tutorial demonstrated how you can use Relevancy Training in the Watson Discovery Service to teach Watson to make better judgments when ordering search results. This in turn gives your users the answers to their questions faster. Now that you know how to use Relevancy Training, you can begin applying this technique to other business applications.
Use Relevancy Training in any application that is providing documents as search results. You can use this capability in product support cases to help agents find answers to customer questions quickly, research scenarios to scan the latest publications, training applications to help knowledge workers get up to speed, enterprise applications to surface the most relevant answers to FAQs, and many other potential use cases.
See the documentation for details on how to try out Relevancy Training in IBM Watson Discovery Service.
For more information, you can watch a webinar on Relevancy Training or see what other webinars are offered in the Building with Watson webinar series. There is also a good blog post, How to get the most out of Relevancy Training.