Archived content

Archived date: 2019-05-21

This content is no longer being updated or maintained. The content is provided “as is.” Given the rapid evolution of technology, some content, steps, or illustrations may have changed.

Watson Assistant, formerly known as Watson Conversation, provides an excellent technical foundation for building applications that can understand and respond using natural language, such as messaging channels, web applications, or virtual assistants. Thanks to the popularity of this type of technology, chatbots are found virtually everywhere.

However, not all chatbots are created equal, so users’ expectations might vary widely. Some chatbots might feel like the famous Watson Jeopardy AI, whereas others might feel more like an interactive FAQ, offering only generic and lengthy responses to specific questions. Users might be delighted or underwhelmed based on their very first interaction, and that experience influences whether they return. Unfortunately, extensive training is not always feasible for a nascent chatbot. When chatbots are insufficiently trained, they might tell the user what they are trained on, which doesn’t necessarily help a user with a specific question. Or they might degrade to perform a search that falls short of what a proper search interface can provide.

For many people, search remains an effective means of finding answers, and users might be faced with a perplexing choice between exploring a new chatbot and staying with search. These are two different solutions that might overlap in functionality, especially if the chatbot is forced to degrade to search frequently. Should the chatbot not meet their initial expectations, users might revert to search and avoid the chatbot in the future. When a choice is offered, first impressions are critical to long-term adoption. Once users abandon your chatbot for an alternative, you might not get them back, even if vast improvements are made later.

Another consideration is the role of natural language. Chatbots require a rich vocabulary that includes synonyms, idioms, and technical terms in the appropriate knowledge base. Users might be forced to guess at and then memorize a specific vocabulary, and misunderstandings are possible. Natural language support is a valuable tool, but recognizing its limitations can make for a better overall user experience. For example, don't let natural language support get in the way of using keywords. In the right hands, a keyword search that uses advanced features such as quotation marks, booleans, exclusions, and inclusions can be amazingly effective. The ideal solution allows both natural language and keyword search terms to be used as appropriate.

When we were looking at integrating this technology into an existing application, we explored a new approach that would overcome many of these shortcomings while taking full advantage of this exciting new technology.

Hybrid search: A different approach to the chatbot you grew up with

Our approach sidesteps these problems by integrating both search and chatbot technologies into a single user interface based on a typical search paradigm, eliminating the need to choose between competing systems. The existing search system has a robust keyword-based search engine that uses Apache Solr and has been fine-tuned to stand on its own as a solution. On this foundation, we surface Watson Assistant along with the results from Solr. This “hybrid search” approach aligns with what users already expect from a traditional search engine. However, because Watson Assistant also fully understands natural language, when it has been trained on a specific intent it can respond with something the user did not initially expect: a direct answer that goes beyond what a traditional search engine could deliver, or a self-service form.

There are many advantages to this approach. Because a traditional search paradigm is the foundation, search results are always returned, but Watson offers a solution only when it has been specifically trained for one. This is extremely helpful for scaling up the Watson component over time with more training. While traditional chatbots degrade to search, the hybrid search approach becomes smarter over time as it is trained on more solutions.

Another point of divergence is that we limit the use of natural language to identifying the correct Watson intent, and instead provide a restricted, semi-guided presentation of the conversation dialog. For example, if the user entered the phrase “I want to reset my mobile device” into a traditional chatbot, the dialog might present a list and ask the user which device to reset. At this point, the user could enter a valid response or something totally different that could disrupt the flow of the conversation. At a minimum, it could reset the conversation back to the root and go down a different path, or cause Watson to ask the question again.

Our approach was to restrict the user to the set of selections for which Watson has received training. For the same prompt about resetting a device, our hybrid search responds back with a set of predefined alternatives, one for each device that is registered along with one for “not listed.” This reduces the risk of misunderstanding that would have been present had natural language been leveraged and takes away the guesswork.
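As a rough illustration, the restricted prompt can be thought of as a payload that enumerates every valid selection up front. The sketch below is a minimal example; the function and field names are illustrative, not the actual Watson Assistant schema:

```python
# Sketch of building a restricted-choice response (hypothetical data model;
# field names are illustrative, not the actual Watson payload format).

def build_device_prompt(registered_devices):
    """Return a prompt whose answers are limited to known selections."""
    options = [{"label": d, "value": d} for d in registered_devices]
    # Always include an escape hatch so the user is never stuck.
    options.append({"label": "Not listed", "value": "not_listed"})
    return {
        "text": "Which device would you like to reset?",
        "response_type": "option",
        "options": options,
    }

prompt = build_device_prompt(["iPhone 8", "ThinkPad T480"])
```

Because the user can only pick from the list, free-form replies never derail the dialog flow at this step.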

A hybrid approach

Expanding on the idea of providing only valid choices, we also added intelligence to the front end to detect the different types of responses that can be returned from Watson. For example, if we determine the response is a “support option” (for example, a phone number, creating a ticket, or starting a chat with a human), then the information is displayed in a card-type format that the user can interact with. If a list of documents is returned, the hybrid UI retrieves information about the documents so the user can choose without having to open them. Watson can still notify the hybrid UI to collect either formatted (for example, a serial number) or free-form text. These types of controls are the basis for interactive forms that can provide a richer support experience.
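In sketch form, the front-end dispatch looks something like the following; the response-type strings and widget names are placeholders for our internal conventions, not a published API:

```python
# Hedged sketch of the front end's response dispatcher. The response types
# and renderer names below are illustrative stand-ins.

def render(response):
    """Choose a UI treatment based on the type of Watson response."""
    kind = response.get("type")
    if kind == "support_option":
        # Phone number, ticket creation, or live chat: show as a card.
        return {"widget": "card", "data": response["option"]}
    if kind == "document_list":
        # Fetch titles/abstracts so users can choose without opening each one.
        return {"widget": "doc_list", "data": response["documents"]}
    if kind == "input_request":
        # Formatted (e.g., a serial number) or free-form text entry.
        return {"widget": "form", "format": response.get("format", "free")}
    # Fall back to plain conversational text.
    return {"widget": "text", "data": response.get("text", "")}
```

Centralizing this dispatch in one place made it straightforward to add new interactive response types later.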

Type-ahead (rapid results) when searching with Watson

Our application implements a type-ahead results pattern where, as the user enters information in the search bar, we show them a quick list of the top results. When we started integrating Watson into the search, we came across a few limitations. As a user types a search string, the app checks with each character typed to see whether there is a result that it can take them to directly. This call needs to be very fast; the application cannot wait for Watson, sitting in the IBM Cloud, to create a session, perform intent matching, and then compose a response. Another issue is that the response that comes back from Watson might be a form, document, support definition, or request for more information, which doesn't fit the rapid search or type-ahead paradigm. Finally, there is the cost factor: Watson is a metered service where you pay for usage, and type-ahead searches outnumber full searches by roughly 10 to 20 times.

Type-ahead suggestions

To allow us to include Watson-based intents in the rapid results, we developed a custom service that extracts information from the Watson Assistant corpora that the application uses. In this new service, we used the existing Watson APIs to retrieve all of the intents and the dialog information for each corpus. The dialog information is used to extract the top-level intents (that is, the entry points into a conversation flow for a user). This set is then further restricted by removing non-critical intents like “conversation_start” and “anything_else.” After we filter down to the core intents, we extract the examples for each intent from Watson. Logically, we consider an intent and its examples to be a “document.” The resulting documents are stored in Solr along with the other content that was already indexed. This process runs nightly.
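The nightly intent-to-document mapping can be sketched roughly as follows. The function and field names are hypothetical; the real service retrieves the intents and examples through the Watson Assistant APIs before handing the documents to Solr:

```python
# Sketch of turning Watson intents into indexable "documents" (names are
# hypothetical; the production service uses the Watson Assistant APIs).

NON_CRITICAL = {"conversation_start", "anything_else"}

def intents_to_documents(intents):
    """Map each core intent and its examples to one Solr-indexable document."""
    docs = []
    for intent in intents:
        if intent["name"] in NON_CRITICAL:
            continue  # skip housekeeping intents that shouldn't surface
        docs.append({
            "id": f"intent:{intent['name']}",
            "type": "watson_intent",
            # The first example is what the UI later displays to the user.
            "examples": [ex["text"] for ex in intent["examples"]],
        })
    return docs
```

Storing the examples as a searchable field lets Solr match a partially typed query against an intent just as it would against any other indexed content.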

Processing showing documents

Now when the user begins typing a search query, we show results from the original content in Solr along with the key intents that could start a conversation in Watson. For our purposes, we boosted and added branding to the Watson intents above other content to showcase the new technology. From a user standpoint, in the type-ahead results area they see the first example of each matched intent as a possible result. If they select it from the pull-down menu, we take them to the full hybrid search results with the example already passed into Watson as if the user had keyed it in. This allows a Watson session to be created with a guaranteed positive match.

As users become more familiar with the Watson integration in our application, we expect they will begin conducting more natural language processing (NLP) searches. Previously, a user would enter a keyword-based search like “connect to vpn” to find help on how to connect to the corporate VPN. When they realize that Watson is also looking at what they enter, savvy users might start entering in full phrases like “I can’t connect to VPN from home.”

The use of NLP phrases raises the issue of potential degradation in the quality of search results. From tests that we've done, Solr by itself does not do a good job with NLP sentences. To help level the results, we added functions from the Apache OpenNLP component. This component can look at a sentence and extract the different parts of speech. We use this capability to extract the noun and verb phrases, effectively building a keyword search from the entered sentence. Feeding the previous NLP example into the processor, the results look like:

I_PRP can't_MD connect_VB to_TO VPN_NNP from_IN home_NN

Note that in the previous output, the suffix on each word is an abbreviation of its part of speech.

Extracting just the noun and verb phrases, we get the set of keywords “connect VPN home.” This logic is applied only if the input string exceeds a specific threshold of words (for example, five or more). The new set of keywords is used together with the original string to build a combined result that should include both exact matches and the keywords. Note that only the original string is sent to Watson, which performs its own NLP intent matching. Effectively, we get the best of both worlds from a user standpoint.
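The keyword extraction step can be sketched as a small function over the tagger's `word_TAG` output. The five-word threshold and the noun/verb filter mirror the description above; in production this logic runs inside the service against real OpenNLP output:

```python
# Minimal sketch of reducing a POS-tagged sentence to noun/verb keywords.
# Parses "word_TAG" output; the threshold value is illustrative.

MIN_WORDS = 5  # only rewrite longer, sentence-like queries

def extract_keywords(tagged):
    """Keep nouns (NN*) and verbs (VB*); return None for short queries."""
    tokens = tagged.split()
    if len(tokens) < MIN_WORDS:
        return None  # short queries are passed through unchanged
    keep = []
    for token in tokens:
        word, _, tag = token.rpartition("_")
        if tag.startswith("NN") or tag.startswith("VB"):
            keep.append(word)
    return " ".join(keep)

extract_keywords("I_PRP can't_MD connect_VB to_TO VPN_NNP from_IN home_NN")
# → "connect VPN home"
```

Returning None for short inputs lets the caller fall back to the original string, so terse keyword queries are never rewritten.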

Technical solution

Now for some background on the application. It uses a micro-service architecture that is deployed using Docker images: each service is hosted in a separate Docker image and deployed in a high-availability environment. To support the functions discussed above, we made the following changes to the application's architecture. Note that what is shown is a simplified view of the architecture that omits a number of unaffected services for clarity.

flowchart showing the solution

The following is a quick description of the components that are listed above:

  • Support site UI: This is the web front-end application that users employ to interact with the system.
  • Content service: This micro-service is used to perform searches and retrieve information from the content store on a number of different types of documents. It uses a combination of CouchDB and Solr to facilitate this.
  • Apache Solr: A fast, open source search indexing engine.
  • Aggregation service: This new service provides a single composite interface that the UI calls to retrieve and combine information from both Watson and existing application sources. We put it in place as an expansion point for when we integrate more Watson services and external data from other systems.
  • Orchestration service: Because we will be using the same workspaces for other presentation layers (for example, chatbot or voice response units), we created this layer to be a wrapper and integration service for Watson Assistant. It lets us augment and enhance responses from a dialog flow with information that is stored outside of the conversation.
  • Apache OpenNLP: An open source natural language processing component that we use to identify parts of speech.

When a user enters a search term, the following flows are used. The first one represents the type-ahead flow. Note that the search happens entirely inside the application, with the indexed intents representing the potential entry points into Watson Assistant.

flowchart showing type-ahead solution
  1. The UI sends the current search query entered along with other pertinent information about the user (personalization, authentication level, and so on) into the type-ahead service entry point.
  2. The aggregation service determines the size of the input phrase. If it is above a specified number of words, OpenNLP is called to extract the key noun-verb phrase.
  3. The aggregation service calls Solr to see whether the current phrase matches an indexed intent's examples. Because the user is actively typing, the response needs to be rapid, so the Solr query returns only the top 2 or 3 intent matches.
  4. In parallel, the aggregation service also calls the existing content service's entry point to retrieve the type-ahead results. It passes in either the original search string or the noun-verb phrase, depending on what happened in Step 2 above. This result set is also limited to the top 2 or 3 results.
  5. The result of this flow is that the user sees a set of potential top hits from the existing content along with the top intents (we display the first example from each intent) as entry points into a conversation. If the user selects a Watson intent, it fires off the hybrid search process that is described below.
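The steps above can be sketched as follows, with the two lookups issued concurrently. The service clients passed in are stand-ins for the real micro-service calls, and the result limit is illustrative:

```python
# Hedged sketch of the type-ahead aggregation. The solr_intents,
# content_search, and extract_keywords callables are hypothetical clients.

from concurrent.futures import ThreadPoolExecutor

TOP_N = 3  # keep the pull-down list fast and short

def type_ahead(query, solr_intents, content_search, extract_keywords):
    # Step 2: rewrite long natural-language queries into keywords.
    content_query = extract_keywords(query) if len(query.split()) >= 5 else query
    with ThreadPoolExecutor() as pool:
        # Steps 3 and 4 run in parallel to keep latency low.
        intents = pool.submit(solr_intents, query, TOP_N)
        results = pool.submit(content_search, content_query, TOP_N)
        return {"intents": intents.result(), "results": results.result()}
```

Because both lookups stay inside the application (no round trip to Watson), the per-keystroke latency and metered-usage concerns described earlier do not apply here.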

In the hybrid search flow, Watson is invoked as part of the process. If Watson returns with a high enough degree of confidence that it can help with the user’s query, then the user is guided through the interaction along with the existing results from a legacy search. The flow is:

flowchart showing hybrid search
  1. The UI sends the search query along with other pertinent information about the user (for example, personalization or authentication level) into the hybrid search entry point.
  2. The aggregation service calls the orchestration service to start a session with Watson.
  3. The orchestration service calls Watson and creates a session.
  4. The orchestration service determines whether Watson's response meets a confidence threshold; if it does, the service augments the response with any needed data. If the confidence level is not high enough, the orchestration service tells the aggregation service that Watson has nothing for the user.
  5. The aggregation service determines the size of the input phrase. If it is above a specified number of words, OpenNLP is called to extract the key noun-verb phrase.
  6. The aggregation service also calls the existing content service's entry point to retrieve the keyword-based results. It passes in either the original search string or the noun-verb phrase, depending on what happened in Step 5 above.
  7. The result of this flow is that the user sees the keyword-based results from the existing content, along with the Watson-guided interaction when Watson's confidence is high enough.
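The confidence gate at the heart of this flow can be sketched as follows. The threshold value and the client interfaces are illustrative; in the real system the Watson call goes through the orchestration service:

```python
# Hedged sketch of the hybrid search decision: surface Watson's response
# only above a confidence threshold, while always returning keyword results.

CONFIDENCE_THRESHOLD = 0.7  # illustrative value, tuned in practice

def hybrid_search(query, ask_watson, content_search,
                  threshold=CONFIDENCE_THRESHOLD):
    watson = ask_watson(query)  # orchestration: session + intent matching
    assistant = None
    if watson and watson["confidence"] >= threshold:
        assistant = watson["response"]  # augmented by the orchestration layer
    # Keyword results are returned regardless, so the user always gets hits.
    return {"assistant": assistant, "results": content_search(query)}
```

This is what lets the system "degrade" gracefully: a low-confidence Watson response is simply omitted, and the page looks like an ordinary search results page.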


Hybrid search is a new way to use Watson Assistant without some of the limitations of a traditional chatbot. This approach gives you the best of both worlds: a robust search engine coupled with AI technology that evolves as Watson receives more training over time. Advanced users retain the power of keywords and advanced search techniques, while others can simply ask a question in natural language. While the integration with search and rapid results has presented some unique architectural challenges, we've outlined technical solutions that can help other adopters get their own hybrid search application off the ground.