Overview

Skill Level: Intermediate

A basic knowledge of Unity3D and C# is assumed

Using the tools described, you can build a Unity3D front-end that listens to your spoken questions and speaks the responses back to you, in exactly the same manner as if you were using a text-based chat bot, except this is way cooler!

Ingredients

IBM Cloud account - sign up for one, if you don't already have one.

Unity3D development environment - Downloads (as per the EULA guidelines, determine the correct version for you)

Unity developer account - sign up for one (so you can access the Unity Asset Store)

(free) UMA2 plug-in - available from the Unity Asset Store

(paid) SALSA plug-in (RandomEyes and LipSync)* - available from the Unity Asset Store

(free) IBM Watson SDK - available from the Unity Asset Store

IBM Watson APIs - Watson Assistant / Speech to Text / Text to Speech - available from IBM Cloud

 

*other products are available; I just happened to choose this one due to its simplicity, ease of use and very helpful website

 

https://developer.ibm.com/code/open/projects/watson-developer-cloud-unity-sdk/

watson_unity_sdk

 

 

I apologise up-front that this article is very screenshot heavy, but I do feel the screenshots add value as a walk-through guide - you can always see the setting values in my (working) environment and compare them with yours if something isn't working.

I'm afraid that, because the SALSA license/software has to be purchased, I will not be supplying a GitHub repo for this project - but, as you will see below, all of the components are available for you to install, configure and set up.  The only code is the .cs file, and I've screenshotted enough of its contents for you to recreate it yourself.

UPDATE: Okay, for ease of reading, I've uploaded the .cs file to a GitHub repo as-is.

 

Here's a sneak peek of the end result: 

 

 

This ARTICLE (from SOULMACHINES) does a really good job of explaining why using a human interface is going to change the way we interact with computers....

 

UPDATE: Okay, following on from a few comments posted about errors with the latest Watson SDK (2.4.0) breaking the code, I tested this myself to replicate the issue.  I downloaded the latest Watson SDK 2.4.0 .zip file, extracted it and overwrote the Assets/Watson folder with its contents.

When starting up Unity and opening the Project I see the following message (and this matches the error raised in the comments below):

WatsonSDK_240_error

(oddly, even though I downloaded v2.4.0 the changelog file still only showed 2.3.0 - but I assure you it was the 2.4.0 release I tested with)

 

Modify the following 2 lines of code in the WatsonTTS.cs file and everything will then work fine for you - I tested it after the code change and all works as expected.

WatsonTTS_mods_for_WatsonSDK_240

(Thanks to Crazy Minnow for the info. in the comments)
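
For reference, here is a sketch of what the two modified callback signatures look like after the change, based on Crazy Minnow's comment below - the extra customData parameter is the new part, so check it against your own file as names may differ slightly:

//Watson SDK 2.4.0: the two STT callbacks now take an extra customData parameter
private void OnSTTRecognize(SpeechRecognitionEvent result, Dictionary<string, object> customData = null)
{
    // ... existing recognition handling unchanged ...
}

private void OnSTTRecognizeSpeaker(SpeakerRecognitionEvent result, Dictionary<string, object> customData = null)
{
    // ... existing speaker-label handling unchanged ...
}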

Step-by-step

  1. Create and setup your IBM Watson API services

    As defined in the ingredients section, it is assumed that you have an IBM Cloud account. Sign in to your account and select [Catalog] [Watson]. The services that we are interested in are highlighted below:

    catalog_watson_apis_01

    (As you can see, there are many more Watson API services that you can investigate and integrate for version 2.0)

    Select the [Speech to Text] service and select [Create]:

    catalog_watson_apis_STT_01

    Once created, we need to take a copy of the [Service Credentials] as we’ll need them within the Unity3D app:

    catalog_watson_apis_STT_02

     

    Now, repeat the same thing for the [Text to Speech] service:

    catalog_watson_apis_TTS_01

    catalog_watson_apis_TTS_02

    Finally, we need to create a [Watson Assistant] service:

    catalog_watson_apis_WA_01

    catalog_watson_apis_WA_02

    Now, we have all the [Service Credentials] that we shall need to include from the Unity3D Watson SDK.

     

    For the sake of this article, we shall create some quick and simple conversation Intent/Entity and Dialog flows within the [Watson Assistant] service.

    To perform this, we need to click on [Manage] and then click on [Open tool]:

    catalog_watson_apis_newWA_01

    Then select the [Workspaces] tab from the Introduction screen.

    This will show the tiles of Workspaces.  Create a new Workspace (or re-use an existing one).  I am using a pre-existing Workspace:

    catalog_watson_apis_WA_04

    We need to click on [View Details] in order to get the Workspace ID (which we need in order to connect to this Workspace):

    catalog_watson_apis_WA_05

    Once we have that value, we can click on the Workspace to see the Intents/Entities/Dialogs:

    catalog_watson_apis_WA_06

    As you can see, I have some pre-built Intents (some copied from the [Content Catalog]), but for this recipe I just set up the [#General_Jokes] Intent:

    catalog_watson_apis_WA_07

    I set up 17 examples, which is a reasonable number of examples for an Intent.

    I also set up some Entities; I’ll include them here just to show the [@startConversation] Entity, as you’ll see it in the [Dialog] tab shortly:

    catalog_watson_apis_WA_08

    Switch to the [Dialog] tab; by default you should have a [Welcome] node.  This is the initial node that is executed when you first call the [Watson Assistant] API.  As you can see below, it gets executed on an initial connection [conversation_start], responds with the text “HAL 9000 initiated and awaiting your command” and then waits at this point for the user’s input:

    catalog_watson_apis_WA_09

    We shall create a new top-level node and link it to the [#General_Jokes] Intent. If the #General_Jokes Intent is identified and triggered, the flow follows this node and on into its child nodes, but first it returns the response “Seriously, you want me to tell you a joke?” to the user and waits for a reply:

    catalog_watson_apis_WA_10

    If the user responds with “Yes” (or something positive) then we shall reply with a “random” response that happens to be a joke (I didn’t say they were quality jokes….but you can change that).  (Take note of the <waitX> tags within the responses; we’ll come back to those later on.)

    catalog_watson_apis_WA_11

    We create a child node for a non-Yes response; here we’re catering for any non-Yes reply rather than an exact “no” response (but you can modify that if you like).  As you can see, if you respond with anything other than Yes, we just reply with “Okay, not a problem, no funny time for you then”.

    catalog_watson_apis_WA_12

    That’s pretty much all we really need to set up within the [Watson Assistant] service for now – you can extend it as much as you see fit.

     

  2. Setup Unity3D

    After you’ve downloaded and installed Unity, when you start it you’ll be given the option to create a [New] project or [Open] an existing one.  Select [New].

    unity_01

    Select [Add Asset Package]; you need to add [IBM Watson Unity SDK], [SALSA With RandomEyes] (assuming you have purchased it via the Unity Asset Store) and [UMA 2 – Unity Multipurpose Avatar]:

    unity_02

    You will then be presented with an “empty” Unity3D project, like so:

    unity_03

    Follow the instructions as defined on the SALSA website here

     

    You need to go to the SALSA website and download the UMA_DCS asset, select [Assets], [Import Package], [Custom Package] and select the .unitypackage file that you downloaded:

    unity_08

     

    This will then give you access to the Example/Scenes [SalsaUmaSync 1-Click Setup] – this has the GameObjects pre-selected and set up for us to use out-of-the-box:

    unity_09

    Double-click this scene to open it in the Unity IDE.

    unity_10

    Click on the [Main Camera] and make sure that the [Audio Listener] is selected (this is the Microphone input):

    unity_05

    Just make sure that this has a tick so that it is active.

    All that we shall add extra is a Canvas/Text GameObject, like so:

    unity_06

    This is purely so that we can output to the screen what has been detected by the [Audio Listener] and then converted via the [Speech to Text] service.

    Ensure that you have all of the Components added to your GameObject like so:

    unity_07

    As we’ll be enhancing this, we will add a new folder called [Scripts] and we shall add a new file called [WatsonTTS.cs]

     

    As you can see, in the Inspector view we can add the [Service Credentials] from the IBM Watson API services that we captured earlier.

    unity_13

    You see a [Preview] of the file if you single-click it; if you double-click, it will open in the editor you have defined.  I have chosen to use the MonoDevelop IDE, as we shall see in the next step.

    The one modification that I have made is to add extra Wardrobe items to the UMA character; to do this, do the following:

    unity_11

    I changed the [Active Race] to Female and added the [Wardrobe Recipes]:

    unity_12

    One last modification is to change where the UMA avatar is viewed from the Camera perspective, so that we can zoom into just the head of the avatar:

    unity_14

    By default, the UMA character has the “Locomotion” animation assigned to it, which makes it look around randomly and is a little distracting – if I had more time, I would customise this to a smaller range; we’ll do that for version 2.0.  For now, we’ll just remove the animation:

    unity_15

    We have not covered the content of the [WatsonTTS.cs] file yet, but once you’ve created it and you press the [>] Run button you will see your 3D Digital Human, like so:

    unity_run_01

    Thanks to the SALSA plug-in, the Avatar will automatically move its eyes, blink and, when it speaks, lip-sync to the words that are spoken.

    Not bad for spending less than 30 minutes getting this far!  Imagine what you could do in “hours” or “days” of refining and getting to know more about the UMA modelling etc…  As I say, I’ve used the most basic out-of-the-box example here so I could focus on the integration with the IBM Watson API services, but I will return to refining and enhancing the UMA and SALSA setup and configuration.

     

    YES! It was commented that the above female avatar looked a bit scary, so I switched her for the male avatar – very simple to do.  I repeated the same exercise of adding Pants / T-Shirt / Haircut and eyebrows, and in minutes we now have this avatar:

    unity_male_3d_uma_02

    Okay, still not 100% perfect, but pretty good for something this quick and easy – we can iron out the finer details once we get it all working together.

  3. Explanation of the WatsonTTS.cs C# file used to control everything

     

    The code used as a baseline is already included in the .cs scripts within the Watson/Examples/ServiceExamples/Scripts folder.

    watsonTTS_01

    As mentioned in the previous step, we shall create a new C# script file with the following contents.

    To start with, we need to include all the libraries that we shall be using.  Then you’ll notice that we have field declarations for the Watson API credentials that we recorded earlier; we declare them like this so we don’t have to hard-code them into the .cs file.

    You’ll also notice that we have private variables declared that we’ll use within the script.

    watsonTTS_02 
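
    As a rough sketch (not the exact file – the field names here are illustrative, so match them against the Inspector screenshot below), the top of the script looks something like this:

    using UnityEngine;
    using UnityEngine.UI;
    using System.Collections.Generic;
    using IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1;
    using IBM.Watson.DeveloperCloud.Services.SpeechToText.v1;
    using IBM.Watson.DeveloperCloud.Services.Conversation.v1;   // or the Assistant namespace, depending on your SDK version
    using IBM.Watson.DeveloperCloud.Logging;
    using IBM.Watson.DeveloperCloud.Utilities;

    public class WatsonTTS : MonoBehaviour
    {
        // Service credentials – set in the Unity Inspector rather than hard-coded
        [SerializeField] private string STT_username, STT_password, STT_url;
        [SerializeField] private string TTS_username, TTS_password, TTS_url;
        [SerializeField] private string CONV_username, CONV_password, CONV_url;
        [SerializeField] private string CONV_workspaceID, CONV_versionDate;

        public Text ResultsField;       // the Canvas/Text element that shows what Speech to Text heard
        public AudioSource audioSrc;    // the Audio Source that SALSA lip-syncs against

        // private working variables used throughout the script
        private SpeechToText _speechToText;
        private TextToSpeech _textToSpeech;
        private Conversation _conversation;
        private Dictionary<string, object> _context;   // conversation context, carried between calls
        private string TTS_content;
        private bool play = false, check = false;
        private float wait = 0f;

        // ... Start(), Update() and the various callback methods follow, as described below ...
    }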

    As we do not hardcode the Watson API values in the .cs script, you have to insert the values within the Unity IDE itself, like so:

    unity_07

    Now, back to the C# coding. The structure of a .cs file for Unity is to have a Start() method that is executed as an initialiser and an Update() method that is executed every frame (if you’ve ever coded for an Arduino, it’s a very similar setup).

    watsonTTS_03

    The Start() method uses the credentials defined in the IDE and the Watson SDK to prepare the objects for later usage.

    In the second part, we execute the code to make an initial connection to the Watson Assistant service, just passing the text “first hello” and the results will be returned to the OnCONVMessage callback method.
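
    A minimal sketch of that Start() method is shown below. The exact SDK call signatures changed slightly between Watson SDK releases, and the failure callback name (OnCONVFail) is illustrative, so treat this as an outline rather than the author’s exact file:

    void Start()
    {
        LogSystem.InstallDefaultReactors();

        // Build credentials from the values entered in the Inspector and create the service objects
        Credentials sttCredentials  = new Credentials(STT_username, STT_password, STT_url);
        Credentials ttsCredentials  = new Credentials(TTS_username, TTS_password, TTS_url);
        Credentials convCredentials = new Credentials(CONV_username, CONV_password, CONV_url);

        _speechToText = new SpeechToText(sttCredentials);
        _textToSpeech = new TextToSpeech(ttsCredentials);
        _conversation = new Conversation(convCredentials);
        _conversation.VersionDate = CONV_versionDate;

        // Kick off the conversation – the response is returned to the OnCONVMessage callback
        _conversation.Message(OnCONVMessage, OnCONVFail, CONV_workspaceID, "first hello");
    }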

    As you can see, the “response” object is passed to this callback method and contains the JSON response from the Watson Assistant service.

    watsonTTS_04

    In the response we are passed the “context” variable; we copy this to the local _context variable so that we can pass it as an input each time we make a call to the Watson Assistant service, keeping track of the “context” values of the conversation.

    You can also see above that we extract the output:text JSON value, as this contains the text that is returned by the Watson Assistant Dialog node.

    Just as an example, I have left in some custom action tags that are contained within the Dialog node response.  As you can see above, we can detect these action tags within the conversation text itself and replace them with the values that the Text to Speech API service requires.  The reason for these break pauses will become clearer later on.  We store the text to be converted in the global variable TTS_content.

    As you can then see, we set the play variable to true.  This will then get picked up on the next cycle of the Update() method.
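
    The tail of that callback therefore looks roughly like the sketch below. This is illustrative only – “conversationText” stands in for the output:text value already pulled out of the response JSON, and the assumption here is that the custom <waitX> tags are swapped for SSML pauses that the Text to Speech service understands:

    // Swap the custom action tags for the break pauses that Text to Speech expects (assumed SSML <break> tags)
    TTS_content = conversationText
        .Replace("<wait3>", "<break time=\"3s\"></break>")
        .Replace("<wait5>", "<break time=\"5s\"></break>");

    // Trigger the Update() loop to play the TTS message on its next cycle
    play = true;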

     

    watsonTTS_05

    As you can see, the first check we make in the Update() method is on the value of the play variable.  Why do we do this?  Well….if we are going to call the Text to Speech service and play the speech to the user, we need to stop the Microphone from listening, otherwise we’ll end up with a self-talking avatar that is speaking and listening to itself.  Not what we want.  We want to play the message and, when it has finished, start listening for the user’s input via the microphone.

    There’s probably a better way to do it from within Unity, but I found that the above code worked for me.  We perform a check (the variable is set in another method, as you’ll see shortly) and count down the length of the clip that is being played.  This way, we can determine when the Avatar has finished speaking / playing the clip and then start listening via the microphone again.
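
    Putting that together, the Update() method looks something like the sketch below. The play branch is quoted in the comments further down this page; the countdown branch is my reading of the wait/check logic described above (Active is the property from the Speech to Text example script that starts and stops the listening), so the exact way the microphone is re-enabled may differ in your file:

    // Update is called once per frame
    void Update()
    {
        if (play)
        {
            Debug.Log("play=true");
            play = false;
            Active = false;          // stop the Speech to Text service listening while the avatar speaks
            GetTTS();
        }

        if (check)
        {
            wait -= Time.deltaTime;  // count down the length of the clip being played
            if (wait <= 0f)
            {
                check = false;
                Active = true;       // clip finished – start listening to the microphone again
            }
        }
    }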

    Going back to the check on the play variable – as we saw earlier, at the end of the onCONVMessage() callback method we set play to true, so this will call the GetTTS() method.

    watsonTTS_06

    The GetTTS() method calls the Watson Text to Speech API; the only thing we set here is the voice to use, and we pass the TTS_content variable that contains the text to convert.  The callback goes to the HandleToSpeechCallback() method.

    As you can see, the clip object is returned and we assign it to the Audio Source and Play() the clip.  Here, we set the wait variable to the length of the clip and set the check variable to true – again, we use these values within the Update() method.
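
    Those two methods, which are also reproduced in the comments further down this page, look roughly like this:

    //called by Update() when play == true
    private void GetTTS()
    {
        _textToSpeech.Voice = VoiceType.en_US_Michael;   // or en_US_Allison, en_GB_Kate, etc.
        _textToSpeech.ToSpeech(HandleToSpeechCallback, OnTTSFail, TTS_content, true);
    }

    void HandleToSpeechCallback(AudioClip clip, Dictionary<string, object> customData = null)
    {
        if (Application.isPlaying && clip != null && audioSrc != null)
        {
            audioSrc.spatialBlend = 0.0f;
            audioSrc.clip = clip;
            audioSrc.Play();

            // set flag values that are picked up in the Update() loop
            wait = clip.length;
            check = true;
        }
    }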

     

    Going back up the file, we have the out-of-the-box (OOTB) content from the sample files for the Speech to Text service.

    watsonTTS_07

    As you can see above, when the StartRecording() method is executed it will call the RecordHandling() method as shown below:

    watsonTTS_08

    This starts the microphone listening, takes the captured speech and streams the content to the Speech to Text service.

    watsonTTS_09

    As you are speaking, the Speech to Text service will attempt to convert the speech “live” and show the output in the Canvas Text element on the screen.

    Once the speech has finished (the result is determined to be .final rather than .interim), we take that text and call the Watson Assistant API via the Watson SDK, passing the input text and the Context variable (as this is the second or later conversation call, we need to keep passing the growing Context variable value).
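
    In sketch form, the recognition callback looks something like the code below. Again, this is illustrative – the result structure follows the SDK’s ExampleStreaming script, and names such as ResultsField and OnCONVFail are assumptions rather than the author’s exact identifiers:

    private void OnSTTRecognize(SpeechRecognitionEvent result, Dictionary<string, object> customData = null)
    {
        if (result == null || result.results.Length == 0)
            return;

        foreach (var res in result.results)
        {
            foreach (var alt in res.alternatives)
            {
                // Show the live (interim or final) transcription in the Canvas Text element
                ResultsField.text = string.Format("{0} ({1}, {2:0.00})", alt.transcript, res.final ? "Final" : "Interim", alt.confidence);

                if (res.final)
                {
                    Debug.Log("[DEBUG] " + alt.transcript);

                    // Send the finished utterance to Watson Assistant, carrying the context forward
                    MessageRequest messageRequest = new MessageRequest()
                    {
                        input = new Dictionary<string, object>() { { "text", alt.transcript } },
                        context = _context
                    };
                    _conversation.Message(OnCONVMessage, OnCONVFail, CONV_workspaceID, messageRequest);
                }
            }
        }
    }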

     

    That does seem like quite a lot, but it is actually pretty simple and does exactly what is required.  Next, we’ll see what it actually does.

  4. Preview and Debug within Unity

    This is what your Unity IDE screen should now look like if you are viewing the “Scene” tab and have the “SALSA_UMA2_DCS” GameObject selected:

    unity_preview_01 

     As you can see, I have the Active Race now set to [HumanMaleDCS] and I have added some Wardrobe Recipes from the examples folder.

     

    When you press the [>] Run button, the Avatar will be displayed in the “scene” window within the IDE and you will see the Debug.Log() output values displayed underneath.  This is where you can keep track of what is going on within the .cs code:

    unity_preview_02

    As you can see, I have output when the “play” variable is set to true; this triggers the action in the Update() method.  This is actually where the Speech for the welcome/greeting message is happening.  The output with “Server state is listening” is where the Speech has finished and the Microphone is now active and listening.  The “[DEBUG] tell me a joke” output is showing me what the Speech to Text service recognised and will then be passing to the Watson Assistant service.  As I say, this is a good way to see the output of each step and to analyse the information in more detail.  If you select a line in the DEBUG output, you will see there is a small window at the bottom of the panel that shows you more in-depth information – this is really useful for reviewing the contents of the JSON messages passed back and forth.
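
    As a small illustration, those log lines come from two different loggers – standard Unity logging and the Watson SDK’s own Log class (the second line below is quoted from the code in the comments further down this page):

    Debug.Log("play=true");                            // Unity console logging – shows up in the Console/Debug panel
    Log.Debug("WatsonTTS", "Attempting synthesize.");  // the Watson SDK logger, routed to the same console output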

     

    If you wish to “see” your avatar outside of the Unity IDE environment, then from the File menu, select Build Settings:

    unity_build_01

    Here you will need to press [Add Open Scenes] if your scene is not in the list initially.  You then select [PC, Mac & Linux Standalone] and choose the Target Platform you wish to output for.  You can then press [Build] and it will output (in this case for Mac) a .app application that you can run by double-clicking it; the Unity player will start up, your avatar will initialise, and you can start talking and communicating as much or as little as you like!

    If you select [Player Settings…] you will see, in the IDE Inspector on the right, more specific details that you can set about the output of the .app itself: you can change the Splash Image, the amount of RAM allocated to the app, your own specific details, etc.

     

  5. Running the app from a Mac

    I made a few minor settings changes that I want to raise here – I’m sure that if you are following along, you will have got this far and thought, “But when I view my UMA human avatar, it isn’t zoomed in on the head? How do I do that?”

    First of all, select the “Main Camera” GameObject and look in the Inspector to see the [X, Y, Z] values I’ve set for the main camera:

    unity_settings_01

    Now, click on the “SALSA_UMA2_DCS” GameObject – this is the actual human avatar:

    unity_settings_02

    You can see that I have modified the “Position” values.  You might ask, “How did I know to set them to these values?”  Well, good question!

    If you press the [>] Run button in the Unity IDE and can see the UMA human on the screen, you can directly modify the values in the Inspector and the changes happen in real-time.  This way, you can play around with the values of the “Main Camera” and the “SALSA_UMA2_DCS” GameObjects and get the view that you want.  Be aware though! Write down the values you change them to; once you press the [>] Run button to stop, those values will revert back to their previous settings.  You will then have to modify them manually again.

    One last change I made was to replace a default animation value that is set – you may not want to do this, but I found it a bit distracting and I will attempt to write my own animation model in the future.  If you do not change this value, then when you see your UMA human avatar it’ll be moving about, rocking its head and body, swinging around a bit like it’s been in the bar for a few hours.  I didn’t want this, so I set the animation to ‘none’; that is why my UMA human avatar is fixed and focused looking forward and just its eyes and mouth move:

    unity_settings_03

    As you can see, there are some default UMA animations that you can use.

     

    This is all great, but the ultimate goal is to see it actually running and working!

    For that I’ve captured a couple of videos that you can view below:

    (if you’re really interested, yes that is my custom car: https://www.lilmerc.co.uk/ )

     

     

    As you can hear/see, it did not always behave as I expected.  I need to work on adding more content to my Watson Assistant Intents / Entities and change my Dialog flow to include a reference to the Intents[0].confidence level, so that when I am mis-heard saying “no” and it thinks I said “note”, it handles it more gracefully.  Now that I have the baseline working, though, I can spend more time refining these areas.

    I’m going to give this tutorial a little look too, as I think I might be needing to do this: https://developer.ibm.com/recipes/tutorials/improve-your-speech-to-text-accuracy-by-using-ibm-watson-speech-to-text-acoustic-model-customization-service/

     

    As you can see above, I’ve spent more time writing this up than it actually took me to make.  My goal now will be to enhance things further (when I get some time), such as looking more into what the SALSA components can do for me; making the LipSync more realistic; perhaps adding more visual feature movements to the UMA human avatar; having key trigger words that perform certain reactions, such as having the head tilt to one side when listening or having the UMA digital avatar squint and wrinkle its forehead slightly when responding to questions…

    ….and then there is the other side: I can look into tapping into the IBM Watson Tone Analyzer service to detect the tone of the user and change the UMA digital avatar’s responses…. oh, and then there is the ability to Build & Deploy to WebGL….and to iOS and Android phones…..oooooo and then there is the Virtual Reality output from Unity too……

    Anyway, there is always scope for doing more; this is genuinely just the start… I hope you find it a useful starting point for your own projects.  Good luck!

     

    https://developer.ibm.com/code/open/projects/watson-developer-cloud-unity-sdk/

     

29 comments on "Create a 3D Digital Human with IBM Watson Assistant and Unity3D"

  1. Hello Tony,
    Really appreciated your work, awesome tutorial… just stuck at one point: I am getting no intents and contexts in the OnCONVMessage callback, hence it is throwing a NullPointerException there… what am I doing wrong?
    Thank you in advance.

  2. Tony_Pigram May 18, 2018

    I just noticed in my line 279 for onConvMessage() the red-box highlighter is actually covering up the first ‘(‘ – apologies for that.

    So, the response should come back in the response object. Can you Log.Debug() that object to see what you are receiving? Or even put a Debug stop on line 283 and use the Unity3D MonoDevelop to “see” what that object contains.

    One last point – have you set up the Watson Assistant service with a couple of Intents, Entities and some Dialog content, and have you tested that you have the correct connectivity values – i.e. that it is connecting to the API okay?

  3. I haven’t created a new workspace of my own; instead I added the existing car dashboard workspace into mine. Connectivity seems to be fine, I am just facing an issue with STT. Assistant and TTS are working well.
    I am able to hear the welcome message from UMA but my commands are not working with STT.

  4. Tony_Pigram May 18, 2018

    The STT code comes directly from the Watson SDK example code. Double-check your code against the code I used here: https://github.com/YnotZer0/unity3d_watsonSdk/blob/master/WatsonTTS.cs

    If you are hearing the “welcome message” then it means that the code has executed the code at #102 and has triggered the onConvMessage() at #279. If STT is not working, then I’d double-check that you’ve configured the STT config details within the IDE config values, make sure that you have added the Audio Listener and activated it:

    “Click on the [Main Camera] and make sure that the [Audio Listener] is selected (this is the Microphone input)”

    Then when you are running, you should see the DEBUG output like the 2nd screenshot in the section above: https://developer.ibm.com/recipes/tutorials/create-a-3d-digital-human-with-ibm-watson-assistant-and-unity3d/#r_step4

    You should see that the SpeechToTextOnListeningMessage DEBUG output says that the ‘Server state is listening’ – if that is the case it is initialised and ready to listen and stream the audio to the STT service. Do you see this initialising for you?

    p.s. apparently there was an issue with the STT service returning 500-errors yesterday, maybe that was related?

  5. Surprisingly it’s working on windows and not on the mac, same code untouched. This is so strange I did not find any evidence of this issue. I am aware you also have developed this tutorial on the mac machine. Let’s see😕

  6. I even did microphone testing on mac, they seem to be working fine.

  7. [05/27/2018 18:41:32][Unity][CRITICAL] Unity Exception IndexOutOfRangeException: Array index is out of range. : WatsonTTS.HandleToSpeechCallback (UnityEngine.AudioClip clip, System.Collections.Generic.Dictionary`2 customData) (at Assets/WatsonTTS.cs:382)
    IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1.TextToSpeech.ToSpeechResponse (IBM.Watson.DeveloperCloud.Connection.Request req, IBM.Watson.DeveloperCloud.Connection.Response resp) (at Assets/Watson/Scripts/Services/TextToSpeech/v1/TextToSpeech.cs:470)
    IBM.Watson.DeveloperCloud.Connection.RESTConnector+c__Iterator0.MoveNext () (at Assets/Watson/Scripts/Connection/RESTConnector.cs:548)
    IBM.Watson.DeveloperCloud.Utilities.Runnable+Routine.MoveNext () (at Assets/Watson/Scripts/Utilities/Runnable.cs:131)
    UnityEngine.SetupCoroutine.InvokeMoveNext (IEnumerator enumerator, IntPtr returnValueAddress) (at C:/buildslave/unity/build/Runtime/Export/Coroutines.cs:17)

    UnityEngine.Debug:LogError(Object)
    IBM.Watson.DeveloperCloud.Debug.DebugReactor:ProcessLog(LogRecord) (at Assets/Watson/Scripts/Debug/DebugReactor.cs:60)
    IBM.Watson.DeveloperCloud.Logging.LogSystem:ProcessLog(LogRecord) (at Assets/Watson/Scripts/Logging/Logger.cs:206)
    IBM.Watson.DeveloperCloud.Logging.Log:Critical(String, String, Object[]) (at Assets/Watson/Scripts/Logging/Logger.cs:294)
    IBM.Watson.DeveloperCloud.Logging.LogSystem:UnityLogCallback(String, String, LogType) (at Assets/Watson/Scripts/Logging/Logger.cs:167)
    UnityEngine.Application:CallLogCallback(String, String, LogType, Boolean)

    Hey Tony,

    I am facing an index out of range error. Please help.

    • Tony_Pigram May 28, 2018

      Hi rajax,

      Before the following line:
      _textToSpeech.ToSpeech(HandleToSpeechCallback, OnTTSFail, TTS_content, true);

      are you able to output the content of “TTS_content” so I can see what you are passing to the TTS service? (I’m also assuming you have the correct values in the Inspector for the service variables)

      I note the following: https://answers.unity.com/questions/1473488/why-am-i-getting-a-list-index-out-of-range-error-f.html
      It might be related to the version of the SDK that you are using? Does the default sample work for you?

      • digital acid May 29, 2018

        Hi Tony, thanks for the awesome tutorial. I am also facing the same error as rajax. The only change is I am using iClone characters. If you have a remedy I’d be happy to try it. I’ve tried the same version of Unity and different ones, as well as different SDKs. All the Watson examples work, just not when combined all together in one script with SALSA. If you can contact me privately that would be great. Again, thanks for the awesome tutorial!

  8. Jean-LucD June 13, 2018

    Hi, have this errors.
    Assets/Scripts/WatsonTTS.cs(129,31): error CS0123: A method or delegate `WatsonTTS.OnSTTRecognize(IBM.Watson.DeveloperCloud.Services.SpeechToText.v1.SpeechRecognitionEvent)’ parameters do not match delegate `IBM.Watson.DeveloperCloud.Services.SpeechToText.v1.SpeechToText.OnRecognize(IBM.Watson.DeveloperCloud.Services.SpeechToText.v1.SpeechRecognitionEvent, System.Collections.Generic.Dictionary)’ parameters

    Assets/Scripts/WatsonTTS.cs(129,31): error CS0123: A method or delegate `WatsonTTS.OnSTTRecognizeSpeaker(IBM.Watson.DeveloperCloud.Services.SpeechToText.v1.SpeakerRecognitionEvent)’ parameters do not match delegate `IBM.Watson.DeveloperCloud.Services.SpeechToText.v1.SpeechToText.OnRecognizeSpeaker(IBM.Watson.DeveloperCloud.Services.SpeechToText.v1.SpeakerRecognitionEvent, System.Collections.Generic.Dictionary)’ parameters

    can you have a solution ?
    Thanks

    • Tony_Pigram June 13, 2018

      Hi Jean-LucD,

      It looks like that line in your post that is throwing the error is the following one:
      //around line 129
      _speechToText.StartListening(OnSTTRecognize, OnSTTRecognizeSpeaker);

      I am aware that the Watson SDK has changed a couple of versions since I posted the article, so I went to check the Watson SDK here: https://github.com/watson-developer-cloud/unity-sdk

      I then took a look at the SDK source-code for the .StartListening() method: https://github.com/watson-developer-cloud/unity-sdk/blob/develop/Scripts/Services/SpeechToText/v1/SpeechToText.cs

      //around line 500
      ///

      /// This starts the service listening and it will invoke the callback for any recognized speech.
      /// OnListen() must be called by the user to queue audio data to send to the service.
      /// StopListening() should be called when you want to stop listening.
      ///

      /// All recognize results are passed to this callback.
      /// Speaker label goes through this callback if it arrives separately from recognize result.
      /// Returns true on success, false on failure.
      public bool StartListening(OnRecognize callback, OnRecognizeSpeaker speakerLabelCallback = null, Dictionary customData = null)
      {

      Then I took a look at the example code for the SDK (that I notice was changed 5 days ago): https://github.com/watson-developer-cloud/unity-sdk/blob/develop/Examples/ServiceExamples/Scripts/ExampleStreaming.cs

      //around line 122
      _service.StartListening(OnRecognize, OnRecognizeSpeaker);

      That matches my original code. The SDK code implies that if it is not passed 3 params it will just set the 3rd param to null.

      The error you are receiving “parameters do not match delegate” implies that the call is not being passed the parameters it is expecting, which is interesting. I wonder – have you tried the Example code and proven it is working with your setup/Unity Watson SDK version?

      • Hi Tony_Pigram,

        With the latest Watson API, the two SpeechToText.StartListening callback methods [OnSTTRecognize] and [OnSTTRecognizeSpeaker] now require a two-parameter signature. If you add [Dictionary customData] as a second parameter to each method in your github code, the code will continue to work with the latest API updates.

        Also, in the image [catalog_watson_apis_WA_12.png], the [#General_Jokes] child node [true] should be [false].

        Thanks!

      • The comment section ate the Dictionary example, it should be as below but replace the brackets with angle brackets:
        Dictionary[string, object] customData

  9. Jean-LucD June 14, 2018

    Hi Tony and CrazyM, problem solved and ready for new challenges. Thanks so much for your time and comments. As I say: each line of code makes a better world.

  10. LDMarkley July 03, 2018

    Thank you so much for the wonderful tutorial. I followed along with it in the hopes of learning how to integrate the SDK in my project. I’ve followed your instructions completely and I’m running into four errors when I attempt to run my project. I’ve checked that I have the latest version of the SDK and the latest version of Unity. If you have any suggestions I’d appreciate the feedback as I’m not sure what is causing these issues. The chatbot DOES say the opening welcome line, so at least that part is working. I’ve also verified that all of my credentials have been entered correctly and my microphone does work with the TTS Example scene.

    First error – [Unity][CRITICAL] Unity Exception ArgumentNullException: Argument cannot be null.
    Parameter name: text : IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1.TextToSpeech.ToSpeech (IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1.SuccessCallback`1 successCallback, IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1.FailCallback failCallback, System.String text, Boolean usePost, System.Collections.Generic.Dictionary`2 customData) (at Assets/Watson/Scripts/Services/TextToSpeech/v1/TextToSpeech.cs:392)

    Second error – ArgumentNullException: Argument cannot be null.
    Parameter name: text
    IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1.TextToSpeech.ToSpeech (IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1.SuccessCallback`1 successCallback, IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1.FailCallback failCallback, System.String text, Boolean usePost, System.Collections.Generic.Dictionary`2 customData) (at Assets/Watson/Scripts/Services/TextToSpeech/v1/TextToSpeech.cs:392)

    Third error – [Unity][CRITICAL] Unity Exception IndexOutOfRangeException: Array index is out of range. : WatsonTTS.HandleToSpeechCallback (UnityEngine.AudioClip clip, System.Collections.Generic.Dictionary`2 customData) (at Assets/Scripts/WatsonTTS.cs:379)

    Fourth error – IndexOutOfRangeException: Array index is out of range.
    WatsonTTS.HandleToSpeechCallback (UnityEngine.AudioClip clip, System.Collections.Generic.Dictionary`2 customData) (at Assets/Scripts/WatsonTTS.cs:379)
    IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1.TextToSpeech.ToSpeechResponse (IBM.Watson.DeveloperCloud.Connection.Request req, IBM.Watson.DeveloperCloud.Connection.Response resp) (at Assets/Watson/Scripts/Services/TextToSpeech/v1/TextToSpeech.cs:470)

    • Tony_Pigram July 09, 2018

      Hi LDMarkley,

      Hmmm….. that looks like a repeat of the issue rajax was having (I wonder if he resolved it?).

      That code originated from the sample source here:
      https://github.com/watson-developer-cloud/unity-sdk/blob/develop/Examples/ServiceExamples/Scripts/ExampleTextToSpeech.cs

      Lines 117, 188, 256 and 261->270. As you can see they are like-for-like. The ONLY difference is that I reference the usage of audioSrc for playing the clip that I have assigned to the SALSA 3D (script) component – you should see it shown in the ‘Audio Source’ property.

      It is most odd because that code is re-used, so if you’ve heard the Welcome message, it has received the response back from the WA service, called the TTS service passing the content text, and received a result that can be played. I was going to ask if you could debug output the TTS_content field inside GetTTS() to see that it is actually passing content to the TTS service?

      It’s annoying that I cannot replicate the issue or make it fail myself.
      I manually set TTS_content= ” “;
      To see if I could emulate passing an empty space value, but that returns a very different error message:
      [Unity][CRITICAL] Unity Exception ArgumentException: Length of created clip must be larger than 0 : UnityEngine.AudioClip.Create (System.String name, Int32 lengthSamples, Int32 channels, Int32 frequency, Boolean stream, UnityEngine.PCMReaderCallback pcmreadercallback, UnityEngine.PCMSetPositionCallback pcmsetpositioncallback)

      I manually set TTS_content = “”;
      That has no value at all and again I get a very different error:
      [Unity][CRITICAL] Unity Exception ArgumentNullException: Argument cannot be null. Parameter name: text : IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1.TextToSpeech.ToSpeech

      I’m at a loss as I cannot replicate the issue by trying to break the code, it just won’t fail for me…. Are you able to provide any further indepth debugging analysis?

      • LDMarkley July 13, 2018

        Sorry for the delay in getting back to you, it’s been a hectic week. So keep in mind I’m new to all this when you read my response so if I sound like I have no idea what I’m talking about I probably don’t. 🙂
        I was fiddling with different things trying to debug the errors I mentioned and I discovered completely by accident that unchecking the “Play” checkmark under where I put in my IBM credentials cleared errors 1 and 2. I’m genuinely not sure why. I did do the debug print like you said and there was indeed data being passed to the TTS service. I have no idea why unchecking the play checkmark worked but it did.
        For errors three and four I found out, again, by basically just poking at the code to see what I could do that if I comment out the Coroutine calls at line 379 and 380, errors 3 and 4 clear and the whole thing runs perfectly. Again, I have no idea why it didn’t like the coroutine since it was only taking care of the eye motion but that seems to have been where the problem lay.

        • Tony_Pigram July 16, 2018

          Ah! okay, I’m not sure why switching the play variable from true to false solved anything, but glad it did. Line 72 should have set it to false by default.

          Then line 342 will set it to true, so that the text from the Conversation/Assistant service can now be played/sent to the TTS service:

          //trigger the Update to PLAY the TTS message
          play = true;

          There is a continuous check being made by the Update() function (further down the code):

          // Update is called once per frame
          void Update () {

          if (play)
          {
          Debug.Log (“play=true”);
          play = false;
          Active = false;
          GetTTS();
          }

          So, that will pickup the fact that play is now true and then set itself to false and then call GetTTS():

          //called by Update() when play=true;
          private void GetTTS()
          {
          // Synthesize
          // Log.Debug(“WatsonTTS”, “Attempting synthesize.”);
          _textToSpeech.Voice = VoiceType.en_US_Michael; // .en_US_Allison; //.en_GB_Kate;
          _textToSpeech.ToSpeech(HandleToSpeechCallback, OnTTSFail, TTS_content, true);
          }

          Which is then passed the Text returned from the Conversation/Assistant service:

          void HandleToSpeechCallback(AudioClip clip, Dictionary customData = null)
          {
          if (Application.isPlaying && clip != null && audioSrc != null)
          {
          audioSrc.spatialBlend = 0.0f;
          audioSrc.clip = clip;
          audioSrc.Play();

          //set flag values that can be picked up in the Update() loop
          wait = clip.length;
          check = true;
          }
          }

          And it was in the center of that code that we had the StartCoroutine() code that was throwing your error. Good debugging!

          Ah, the StartCoroutine() code for lookleft/lookright was an experiment that I was fiddling around with that didn’t work out – I had those 2 extra elements in my project that’s why it worked for me and not for you – thank you for helping debug that, I’ve removed reference to them as it didn’t do what I was hoping for anyway.

          Thanks for figuring that out and apologies for leaving in those 2 lines of code – am glad you got to grips with what is happening in the code now though!

  11. Tony, thanks for the recipe. I have the example working well in Unity, but I’m also a bit confused why I can’t have a general conversation with the virtual character. If I recall correctly, I was able to use both Watson and Google Assistant (in the command line) and have a simple conversation with the program – much as you would with an Alexa or other home device. Is there a way to get Watson to have a similarly conversational quality in this example? Thanks again…

    • Tony_Pigram July 27, 2018

      Well, the virtual character is just the user interface to the intelligence (or lack of) behind the scenes. You might be referencing what we refer to as chit-chat or small-talk: https://dialogflow.com/docs/small-talk
      My Watson Assistant workspace had about 3 specific Intents created and 1 dialog flow for the structured conversation response – purely as a tester to prove it works.
      There is nothing to stop you extending your WA workspace to have more Intents (I believe WA even provides a load of sample utterances/Intents for you now via “Content Catalog”: https://www.ibm.com/blogs/watson/2018/03/the-future-of-watson-conversation-watson-assistant/ , you can add those and build out your dialog flow responses as needed)

      I once connected up my 3D-printed robotic head to use IBM Visual Recognition, STT and TTS service and as a tester passed the conversation text to https://www.cleverbot.com/ – this has an API that allows you to pass text to it and you receive responses back. It is nonsense chit-chat conversation and you do not have to build out a Dialog flow, like you would do in Watson Assistant.

      Its usage is very limited, but it is a fun way to have some bizarre and random conversations. You should be able to pretty easily swap out the calls to the Watson Assistant service with calling Cleverbot in your Unity project code.

  12. tiancaipipi110 August 10, 2018

    did you add another “yes” intent to create the child node of “General_Jokes”?

  13. tiancaipipi110 August 10, 2018

    also where did the “General_positive_response” come from? did you make that at front too? so why not add it to the tutorial? Also, in the video you said “no”, but in the response it was recognizing “true”, what’s going on there?

    • Tony_Pigram August 12, 2018

      “General_positive_response” comes from the [Content Catalog] tab of the Workspace – click on that tab and you can see there are some pre-made Intents with utterances to save you having to come up with them yourself. They are a good starting point. It wasn’t relevant for the tutorial, hence not mentioning it, just showing it in the images.
      If you scroll up the comments, CrazyM points out that the “true” node should be a “false” node, as it will always go to the true node no matter what was uttered. This tutorial wasn’t meant to be about Conversation design – that you can create yourself; I just created a very simple Intent to show that interactions can happen.

  14. tiancaipipi110 August 10, 2018

    how did the “TextStreaming” get in there? How did you “add” the [Wardrobe Recipes]? where did you add it to? Isn’t it always there? What to put in the “CONV_version date”? I’ve tried with workspace created/modified date, neither works.

    • Tony_Pigram August 12, 2018

      If you scroll up and find the text “All that we shall add extra is a Canvas/Text GameObject, like so:” you will see where the TextStreaming is added.
      [Wardrobe Recipes] is a folder that is present when you add the UMA, if you search above for “I changed the [Active Race] to Female and added the [Wardrobe Recipes]:” you can see that at the bottom of the screen, the files are present, if you drag them into the section in the top (both are shown in red boxes for you), you can then “add” wardrobe items to your character.

      Okay, CONV_version_date is the date used by the Watson Assistant APIs: see here https://www.ibm.com/watson/developercloud/assistant/api/v1/curl.html?curl#versioning
      If you search the article for “Ensure that you have all of the Components added to your GameObject like so” you can click on the image and see the value I was using.

  15. BlueNucleus August 27, 2018

    Do you think this code can run in mobile environment if I export it as an Android App?

  16. After playing with this for a while I’m back with a follow-up question, which I suppose is mostly hypothetical. Is there anything you’d suggest to improve response time? Running my own example, it felt like the delay between saying something and the bot responding was just long enough to be a bit awkward. I know there are a lot of factors at play here; I was just wondering about ways that the conversation could feel a bit more “natural”, so to speak.
