When it comes to conversation, how you say something can be as important as what you say. Conversations among humans are rarely dry and dull. They are filled with emotions such as joy, anger, excitement and sadness. These emotions are conveyed in tones such as open, agreeable, analytical, and confident or tentative.

Why does a conversation between a human and a bot have to be dry and dull? The answer is: it doesn’t. At IBM Watson, we are working on making conversations between bots and humans more compassionate, natural and personalized.

This blog post gives you step-by-step details on how you can use the IBM Watson Tone Analyzer service to detect various tones and emotions conveyed by users during conversation with a bot. After emotions and tones are detected, developers can build compassionate responses into the dialog flows. At run time, the bots respond more naturally to users in the conversations.

This post uses a simple food coaching application as an example. The bot is created using Watson Conversation and asks you whether you ate your meal of the day. Depending on your answer, the application takes a particular conversation path. If you answer yes, the bot asks if you ate anything unhealthy. Finally, the bot asks you how you feel about the meal.

If you did not eat, the bot asks you about how you feel about not eating. The bot then infers the tone of your message to create a compassionate response and a pointer to healthy eating. The screen capture below shows a few of the conversation flows.


The code to create this bot is available here.


Creating the bot

Creating a conversational agent that is tone aware involves three steps:

  1. Get the user’s tone for each conversation turn from Tone Analyzer.
  2. Add the tone to the request payload and send it to Conversation.
  3. Use tone to define rules that tell Conversation how to decide between multiple paths in the dialog tree.

Step 1: Get the user’s tone for each conversation turn from Tone Analyzer

The first step is to determine the user’s emotion using the Tone Analyzer API. It takes a chunk of written text as input and returns a JSON object that contains the tones it has detected, with a confidence score for each. The following snippet shows the emotion tones for the user message, “I don’t want to talk to you no more, you empty-headed animal food trough wiper! I fart in your general direction! Your mother was a hamster and your father smelt of elderberries!” (Look up that quotation if you don’t recognize it already!)

        {
          "tones": [
            {"score": 0.604187, "tone_id": "anger", "tone_name": "Anger"},
            {"score": 0.013697, "tone_id": "disgust", "tone_name": "Disgust"},
            {"score": 0.002734, "tone_id": "fear", "tone_name": "Fear"},
            {"score": 0.235987, "tone_id": "joy", "tone_name": "Joy"},
            {"score": 0.204725, "tone_id": "sadness", "tone_name": "Sadness"}
          ],
          "category_id": "emotion_tone",
          "category_name": "Emotion Tone"
        }

Note that the tone values returned by Tone Analyzer are in a raw format, so you’ll have to figure out how you want to interpret and use them in your specific application.  The team behind Tone Analyzer provides some guidelines on how to interpret the scores to help you get started.  For emotion tones, they suggest a threshold score of 0.5.

For the food coach bot, you must determine the dominant emotion for each of the user’s conversation turns. The dominant emotion is the emotion with the maximum score, with the caveat that it be at least 0.5. If there isn’t an emotion with a score greater than or equal to 0.5, the dominant emotion is set to “neutral.” Circling back to the sample text input above, the dominant emotion expressed by the user would be anger.
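
The dominant-emotion rule can be sketched in a few lines of JavaScript. This is only an illustration of the rule as described here, not code from the food coach repository; the function name dominantEmotion and its threshold parameter are assumptions made for the example.

```javascript
// Illustrative helper (an assumption for this post, not repository code).
// Returns the tone_id with the highest score, or "neutral" when no
// score reaches the threshold.
function dominantEmotion(tones, threshold = 0.5) {
  let best = null;
  for (const tone of tones) {
    if (best === null || tone.score > best.score) {
      best = tone;
    }
  }
  return best !== null && best.score >= threshold ? best.tone_id : "neutral";
}

// Scores copied from the sample Tone Analyzer response in Step 1.
const sampleTones = [
  { score: 0.604187, tone_id: "anger", tone_name: "Anger" },
  { score: 0.013697, tone_id: "disgust", tone_name: "Disgust" },
  { score: 0.002734, tone_id: "fear", tone_name: "Fear" },
  { score: 0.235987, tone_id: "joy", tone_name: "Joy" },
  { score: 0.204725, tone_id: "sadness", tone_name: "Sadness" },
];

console.log(dominantEmotion(sampleTones)); // anger
console.log(dominantEmotion([{ score: 0.3, tone_id: "joy" }])); // neutral
```

Keeping the threshold as a parameter makes it easy to experiment with values other than the suggested 0.5.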

For help in creating an instance of the Tone Analyzer, and getting the credentials needed to send requests to the service, please refer to this tutorial.

Step 2: Add the emotional tone to the context variable of the Conversation API payload and send it to the service

The second step is to add the tones retrieved in Step 1 to a JSON request payload, and send it to the Watson Conversation Service. The Conversation payload includes two objects: an input object and a context object. You can put any application state you want to pass along to the Conversation Service into the context object. The state is then available for use in defining the rules Conversation uses to dynamically determine the path to take in your application’s dialog tree. This example uses the dominant emotion in the encoded rules that tell Conversation how to determine the path to take in a dialog tree. The details of how this is done are covered in Step 3.

A simple example of a Conversation payload that contains an emotion tone in the context:

  {
    "input": {
      "text": "Hungry ! Greens are overrated !!!!!"
    },
    "context": {
      "user": {
        "emotion": {"current": "anger"}
      }
    }
  }

If you’d like to use the social and language tones returned by Tone Analyzer, the food coach includes these tones as well as the emotion tones. See the initUser function in tone_detection.js for a recommended way of encoding this data. You’ll notice that the dominant emotion has been labeled as the “current” emotion to distinguish it from an emotion history object. In some applications, it’s useful to keep track of and use the emotion history throughout a conversation. You’ll notice in updateUserTone that one of the parameters is maintainHistory. This is a Boolean variable that can be toggled to include tone history or not.
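
To make the idea of a “current” emotion plus an optional history concrete, here is a minimal sketch. It is not the updateUserTone implementation from tone_detection.js: the function name setCurrentEmotion, the history array, and the exact context layout are assumptions chosen to match the simple payload shown above.

```javascript
// Hypothetical helper (not the repository's updateUserTone).
// Stores the dominant emotion under context.user.emotion.current and,
// when maintainHistory is true, appends it to an assumed history array.
function setCurrentEmotion(payload, emotion, maintainHistory) {
  payload.context = payload.context || {};
  payload.context.user = payload.context.user || {};
  const state = payload.context.user.emotion || { current: null };
  payload.context.user.emotion = state;

  if (maintainHistory) {
    // Record each turn's emotion in order, including the newest one.
    state.history = state.history || [];
    state.history.push(emotion);
  }
  state.current = emotion;
  return payload;
}

// Two turns of the same conversation, with history switched on.
const examplePayload = { input: { text: "Hungry ! Greens are overrated !!!!!" } };
setCurrentEmotion(examplePayload, "anger", true);
setCurrentEmotion(examplePayload, "neutral", true);
console.log(JSON.stringify(examplePayload.context, null, 2));
```

In a real application the context returned by the previous Conversation response would be passed back in on the next turn, so the history accumulates across the whole conversation.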


Putting Steps 1 and 2 together

The invokeToneConversation function in app.js implements Steps 1 and 2 by calling two functions from tone_detection.js. The function is shown below, with the call that implements each step marked in a comment.

function invokeToneConversation(payload, res) {
  // Step 1: ask Tone Analyzer for the tone of the user's input.
  toneDetection.invokeToneAsync(payload, toneAnalyzer)
  .then( (tone) => {
    // Step 2: add the tone to the payload's context, then send it to Conversation.
    toneDetection.updateUserTone(payload, tone, maintainToneHistory);
    conversation.message(payload, function(err, data) {
      …
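
The listing above depends on the tone call returning a promise. The following sketch shows one way a callback-style service call can be wrapped so that it can be chained with .then. It is not the repository's invokeToneAsync: the name toneAsync is hypothetical, and the stub client, including its tone(params, callback) shape, is an assumption that stands in for the real Tone Analyzer client so the example runs without credentials.

```javascript
// Hypothetical wrapper: turns a callback-style tone call into a Promise.
function toneAsync(toneClient, text) {
  return new Promise((resolve, reject) => {
    toneClient.tone({ text: text }, (err, result) => {
      if (err) {
        reject(err);
      } else {
        resolve(result);
      }
    });
  });
}

// Stub standing in for the real service client (an assumption for this
// sketch); it always reports anger for any non-empty text.
const stubToneClient = {
  tone(params, callback) {
    if (!params.text) {
      callback(new Error("text is required"));
      return;
    }
    callback(null, { tones: [{ tone_id: "anger", score: 0.6 }] });
  },
};

toneAsync(stubToneClient, "Greens are overrated")
  .then((result) => console.log(result.tones[0].tone_id))
  .catch((err) => console.error(err.message));
```

Because errors reject the promise, a single .catch at the end of the chain can handle failures from either the tone call or later steps.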

Step 3: Define rules in dialog nodes using tone

The third step is to add rules to the dialog nodes, and where appropriate, to provide different responses based on tone. The example below is pulled from the food coach workspace and has a node with four child nodes. The first child is a help node (#help) and doesn’t use tone. The next three child nodes have entry conditions and rules that are based on tone. The rules are specified at the top of the node (black background), and the output to be returned if the node’s rules are met is at the bottom (white background).

In this example, if at the parent node the intent of the user is determined to be #yes with a confidence > 0.7, and the person’s input includes a @food entity, Conversation returns “That seems like a healthy meal […] How do you feel about what you ate?” Based on the tone in the user’s subsequent response to the question about how they feel, one of the three child nodes is selected. For example, if the tone is one of joy ($user.tone.current.emotion == "joy"), the second child node is selected and Conversation responds with “Good, good. Keep at it!”
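
The selection among child nodes can be pictured as an ordered list of conditions in which the first match wins. The sketch below only simulates that idea in JavaScript; in practice the conditions are configured in the Conversation workspace, not written in application code. The node list, the turn object, and every output except the joy response quoted above are placeholders invented for illustration.

```javascript
// Hypothetical stand-ins for the four child nodes. Only the joy response
// is quoted from the post; the other tones and outputs are placeholders.
const childNodes = [
  { name: "help", matches: (turn) => turn.intent === "help", output: "(placeholder help response)" },
  { name: "joy", matches: (turn) => turn.emotion === "joy", output: "Good, good. Keep at it!" },
  { name: "anger", matches: (turn) => turn.emotion === "anger", output: "(placeholder response for anger)" },
  { name: "sadness", matches: (turn) => turn.emotion === "sadness", output: "(placeholder response for sadness)" },
];

// Evaluate nodes top to bottom and return the first whose condition holds,
// or null when none of them matches.
function selectChildNode(nodes, turn) {
  for (const node of nodes) {
    if (node.matches(turn)) {
      return node;
    }
  }
  return null;
}

const selected = selectChildNode(childNodes, { intent: "feeling", emotion: "joy" });
console.log(selected.output);
```

Because evaluation stops at the first match, the order of the nodes matters: here a help request is answered before any tone-based node is considered.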

Download the sample application

Integrating Tone Analyzer into Watson Conversation isn’t difficult. If you would like to start with this food coach application, the code and Conversation template, along with installation instructions, are available here. Packaged code for the tone-conversation integration pattern is also available in the Watson Developer Cloud SDKs, in a number of programming languages.

Please tell us how you make your bots more compassionate by sharing your stories in the comments below.

3 comments on "Creating a tone aware conversational agent using Watson Tone Analyzer and Watson Conversation services"

  1. You can further augment this functionality with speech by adding expressive and transformable speech from our text to speech services. See the demo at https://text-to-speech-demo.mybluemix.net/.

    • Rama Akkiraju October 19, 2016

      We are doing that as we speak, Michael. This demo will also be available on Pepper Robot with speech incorporated into it.

  2. Vibha Sinha October 19, 2016

    Thanks for the pointer, Michael. You are absolutely correct that adding expressive speech adds another dimension of tone expression. We did integrate it with expressive speech (though that is not part of the public demo). There is one point in the conversation flow where the conversational agent expresses apology, and another where it expresses uncertainty.
