Overview

Skill Level: Intermediate

Get a feel for the Watson Speech To Text, Language Translation, and Text To Speech services as they integrate with and leverage data on the Watson IoT Platform to translate continuously streamed voice. The recipe showcases the integration of the Watson IoT Platform with Watson cognitive services.

Ingredients

Hardware Requirements:

  • Raspberry Pi with at least 8 GB SD Card
  • Head-set with Mic - USB only

Software Requirements:

  • Maven
  • Git
  • IBM Bluemix Account
  • Jazzhub Account

Service fees may apply. Estimated monthly cost: < $50

Information

This application may require more than 256 MB of RAM to deploy. If memory usage exceeds 256 MB on a free account (Bluemix Trial or Standard account), the application might not work as expected.

We suggest that you upgrade to a Pay-As-You-Go or Subscription account to enjoy the full range of Bluemix services.

Step-by-step

  1. Introduction

    Words have the ability to make or break a conversation. If the conversation is between two individuals whose languages have different origins, then a translator is the need of the hour. The translator has the responsibility to ensure that the communication between these two individuals continues uninterrupted, in the right context, and in a meaningful way. International help desks, call centers, business process outsourcing (BPO), and international travel and tourism, to name a few, all have the primary requirement of involving a translator to ensure quality customer service and a pleasant experience for their customers and visitors.

    IBM Watson offers a set of niche services on Bluemix that make the scenario described above much smoother. Watson Speech To Text, Language Translation, Text To Speech, and the Watson IoT Platform are the Bluemix services whose capabilities are leveraged in this recipe to demonstrate how voice spoken in one language is translated into another language as the audio is streamed continuously.

    Moreover, having the data on the Watson IoT Platform opens up many more possibilities, such as management and configuration of IoT devices, gateways, drones, droids, and more.

    The Echo Translation code sample for this recipe is written using the Java Client Library for the IBM Watson IoT Platform. The sample is intended to run on both Linux and Windows environments, as long as the requirements are met. The recipe also makes use of the Java SDK for IBM Watson services, which is available for cloning or download from its GitHub repository.

    Refer to the WIoTP documentation to further understand the IBM Watson IoT Platform capabilities and to come up with your own application, device, and gateway samples in the world of the Internet of Things.

  2. What is demonstrated in this recipe?

    This section details the architecture of the Echo Translation sample, the components involved in its implementation, and the role played by each of the Watson services that are part of this architecture.

    The execution of the Echo Translation sample can be broadly classified into device-side execution and execution on the Bluemix platform.

    • Execution on the Device:

      The Echo Translation sample uses the mic connected to the Raspberry Pi device as the source of audio, in the user's preferred language. The Raspberry Pi device hosts the Echo Translation sample code, which comprises code snippets to perform both Speech To Text (STT) and Text To Speech (TTS) on the device itself. The output audio stream from the Text To Speech service can be written directly to a file on the device file system in one of the supported audio formats.
    • Execution on the Bluemix Platform:

      The Bluemix platform hosts the set of Watson services that are part of the Echo Translation sample: the Speech To Text (STT) service, the Watson Language Translation (LT) service, the Text To Speech (TTS) service, and the Watson IoT Platform (WIoTP). The sample code on the device uses the authentication credentials of STT and TTS to execute the respective code snippets. The Watson Language Translation service plays the role of the vital cog, as it translates the incoming data from the chosen source language to the preferred target language. The WIoTP plays a significant role, as it is the communication bridge between the Watson services used in the Echo Translation model.

     

    Image 1: High Level Architecture of Translating Echo

     

    The execution of the Echo Translation sample, as performed by the three main Watson services, is described as follows:

    Speech to Text

    A mic is connected to the Raspberry Pi device. As the user speaks through the mic in the preferred language, the audio is streamed to the Watson Speech To Text service, which converts the audio stream into text. This text is then published to the Watson IoT Platform, which opens up an array of possibilities that can be achieved through WIoTP.
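
    The following is a minimal sketch of the publish step, assuming the Java Client Library for the IBM Watson IoT Platform (com.ibm.iotf); the event name "transcript" and the payload field "text" are illustrative choices, not taken from the sample code.

      import java.util.Properties;

      import com.google.gson.JsonObject;
      import com.ibm.iotf.client.device.DeviceClient;

      public class TranscriptPublisherSketch {
          public static void main(String[] args) throws Exception {
              // Load the same device.properties used elsewhere in this recipe
              Properties props = new Properties();
              props.load(TranscriptPublisherSketch.class.getResourceAsStream("/device.properties"));

              DeviceClient client = new DeviceClient(props);
              client.connect();

              // Illustrative payload carrying the text returned by Speech To Text
              JsonObject event = new JsonObject();
              event.addProperty("text", "hello, good morning");

              // Publish a device event (QoS 0); the Node-RED flow on Bluemix
              // subscribes to device events and feeds the text to Language Translation
              client.publishEvent("transcript", event, 0);

              client.disconnect();
          }
      }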

    Language Translation

    The text available in WIoTP is now fed as input to the Watson Language Translation service. Based on the chosen target language, the translation service generates the output text, translating the original data into the preferred language. The output text, now available in the target language, is loaded back into WIoTP.
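
    In this recipe the translation step is performed by the Watson Translate node in Node-RED (configured later in this recipe), but the equivalent call through the Java SDK for IBM Watson services looks roughly like the sketch below. Class and method names are assumptions against the 3.x Watson Developer Cloud Java SDK and its Language Translation v2 interface.

      import com.ibm.watson.developer_cloud.language_translation.v2.LanguageTranslation;
      import com.ibm.watson.developer_cloud.language_translation.v2.model.Language;
      import com.ibm.watson.developer_cloud.language_translation.v2.model.TranslationResult;

      public class TranslateSketch {
          public static void main(String[] args) {
              // Credentials come from the Language Translation service bound to your application
              LanguageTranslation service = new LanguageTranslation();
              service.setUsernameAndPassword("<lt-username>", "<lt-password>");

              // English to Spanish, mirroring the Node-RED node configuration used later
              TranslationResult result = service
                      .translate("how are you doing today", Language.ENGLISH, Language.SPANISH)
                      .execute();
              System.out.println(result.getFirstTranslation());
          }
      }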

    Text to Speech

    The translated text in the preferred language, made available on WIoTP, is then passed on to the Watson Text To Speech service. This service reads through the text content and generates the corresponding audio stream, which can either be played directly on the attached audio output or be written to the file system as an audio file (WAV, FLAC, OGG, etc.).
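
    As a device-side illustration of the second option (writing the audio to the file system), the sketch below synthesizes a translated string to a WAV file. It assumes the 3.x Watson Developer Cloud Java SDK and a Spanish voice; the input text and output file name are illustrative.

      import java.io.InputStream;
      import java.nio.file.Files;
      import java.nio.file.Paths;

      import com.ibm.watson.developer_cloud.text_to_speech.v1.TextToSpeech;
      import com.ibm.watson.developer_cloud.text_to_speech.v1.model.AudioFormat;
      import com.ibm.watson.developer_cloud.text_to_speech.v1.model.Voice;

      public class SpeakTranslationSketch {
          public static void main(String[] args) throws Exception {
              TextToSpeech service = new TextToSpeech();
              service.setUsernameAndPassword("<tts-username>", "<tts-password>");

              // Synthesize the translated text into a WAV stream
              InputStream audio = service
                      .synthesize("buenos días, como estas hoy", Voice.ES_ENRIQUE, AudioFormat.WAV)
                      .execute();

              // Persist to the device file system instead of streaming to the speaker
              Files.copy(audio, Paths.get("translated-output.wav"));
          }
      }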

    Note 1: The Raspberry Pi device does not have the capability to record voice through the audio jack (3.5 mm port). Hence, it is highly recommended to use a headset with mic that connects to the Raspberry Pi through a USB port.

    Note 2: The Echo Translation sample is currently available in Java and hence is platform independent. This allows the sample to be used on the Windows platform as well (any laptop or desktop that has either a built-in mic or USB connectivity). If you are using Eclipse, import the sample as an ‘Existing Maven Project’ and build it; the environment should then be ready for execution.

    Note 3: We have not explicitly tested this Echo Translation Sample on the Mac platform, but considering the platform independence of Java, we assume that the sample should work fine on it.

     

    In this section, you were taken through the implementation architecture of the Voice Transmission – Echo Translation sample, which showcases live streaming of voice and its translation from a source language to a target language.

  3. Create & Deploy Node-RED application and Watson services in Bluemix

    This section walks you through the steps to deploy the Bluemix services that are relevant to the demonstration of the Voice Transmission – Echo Translation sample.

    As discussed in the architecture diagram, the recipe makes use of the Watson IoT Platform, Speech To Text, Language Translation, and Text To Speech services on Bluemix to execute the sample. It also uses the Node-RED Starter service, which takes care of the sample execution on the Bluemix platform.

    The following steps walk you through the deployment process:

    1. The Voice Transmission – Echo Translation sample is made available on the GitHub repository. Click on the Create Toolchain button, provided below, to initiate the deployment process of the Echo Translation sample.

      Image 2: Create Toolchain action button, to initiate sample deployment process

      Note: If you are using the United Kingdom region in your Bluemix environment, then use the following Deploy to Bluemix button to deploy the setup under your United Kingdom region. Users of the US South region can ignore this step.

      deploy

    2. Provide a custom name for the application on the Create Toolchain page (or in the Deploy to Bluemix option) and click on Create to deploy the application with the relevant Bluemix services.
    3. The deployment process takes a couple of minutes to complete. Behind the scenes, the following set of processes is executed without any intervention:

      – Clones the application code into your Jazzhub account
      – Creates the application in Bluemix
      – Creates the necessary services for this application
      – Deploys the application in Bluemix
      – Binds the services to the application

    4. Once the deployment process concludes successfully, you can access the Voice Transmission – Echo Translation application by clicking on the View App link. Alternatively, you can always access the application from the application entry available on the Bluemix Dashboard.

      You need to append the keyword “red” to your application URL to access the Node-RED editor.

      For example, your application URL would be something similar to:

      http://<Unique-Application-Name>.mybluemix.net

      Append /red/ to the URL to get into the Node-RED editor:

      http://<Unique-Application-Name>.mybluemix.net/red/


    5. Minimize the browser tab where you have the Application URL open, for now.
    6. Open the Bluemix Dashboard and click on the application that you have currently deployed. The application should now list all the Watson services that are bound to it.

      Image 3: Listing of Watson services that are bound to your application

    7. Click on each Watson service individually to obtain the authentication credentials for the STT, LT, and TTS services. Choose the Service Credentials tab, click on New Credential, provide a name for your reference, and click on Add to generate a new set of credentials for the bound service. Click on View Credentials to view the new set of credentials generated for you.

      Image 4: Authentication credentials of Watson services that are bound to your application.

      Make a note of the credentials. If you don’t, that is still fine, since you can always obtain the credentials again as described in the step above.

    In this section, you were briefed on the steps to deploy the Voice Transmission – Echo Translation sample, open the Node-RED application, and obtain the authentication credentials for each of the Watson services that play a critical role in the processing of the Echo Translation sample on the Bluemix platform.

  4. Register your Device In Watson IoT Platform

    This section helps you register the device with the Watson IoT Platform and obtain the credentials needed to publish device events to the Watson IoT Platform dashboard.

    In order to access the full capabilities of the IBM Watson IoT Platform you must create an organization and register one or more devices in it.

    If you have already moved away from it, open the Bluemix Dashboard and click on the application that you have deployed. You should see the Watson IoT Platform listed under the bound services. Click on the WIoTP service and then click on the Launch button to launch the WIoTP dashboard.

    Carry out the steps in this recipe to register your device(s) in the IBM Watson IoT Platform. Once registration succeeds, you should obtain the device credentials and also generate an API key. Note them down to work with the Voice Transmission – Echo Translation sample over the course of this recipe.

    This section explained how you can register devices using the dashboard and obtain security credentials.

  5. Setting up the environment on the RPi Device

    In this section, you will be guided through the steps to prepare the environment on the Raspberry Pi device, on which you will be running the Voice Transmission sample.

    Set up the Voice Transmission – Echo Translation sample on the Raspberry Pi, which forms the device-side execution of the sample. The steps are as follows:

    1. Obtain the IP Address of the Raspberry Pi device and connect to it using SSH
    2. Set up the Raspbian environment with Maven, if it is not installed already:
      sudo apt-get update
      sudo apt-get install maven
    3. Git is present in the latest Raspbian OS, so you need not install it again. However, if you have an older version of Raspbian that does not include Git, install it using the following command:

      sudo apt-get install git-all 
    4. Obtain the Voice Transmission sample by cloning the iot-cognitive-samples project using git clone as follows:

      git clone https://github.com/ibm-watson-iot/iot-cognitive-samples.git
    5. This recipe demonstrates the Voice Transmission – Echo Translation sample. Hence, navigate to the ‘echo-translation‘ source directory within the ‘voice-transmission‘ directory, under the iot-cognitive-samples directory structure, as shown below:
      cd iot-cognitive-samples/voice-transmission/echo-translation
    6. Build the Echo Translation Sample using the Maven command
      mvn clean package -Dmaven.test.skip=true

       

    Build & Compile on Eclipse:

    In an Eclipse environment (on either Linux or Windows), the cloned iot-cognitive-samples project can be imported as an Existing Maven Project. To compile it, right-click on the project in the Package Explorer area, choose Run As, and select Maven build:

    VoiceTransmission --> Right Click --> Run As --> Maven build

    You will be prompted with the Edit Configuration and Launch pop-up window. Enter ‘clean package -Dmaven.test.skip=true‘ in the Goals field, choose Apply, and click on Run to build and compile the project.

    Goals: clean package -Dmaven.test.skip=true

     

    Monitor the messages on the command prompt as the build process progresses. Ensure that the build process concludes successfully without any errors or issues. A successful build generates a new directory, named target/classes, that holds all the compiled files.

    By the end of this section, you should have a valid setup of Maven and Git and a successfully compiled Echo Translation sample.

  6. Edit the properties file

    In this section, you will familiarize yourself with the device configuration parameters, whose values are critical to the successful execution of this recipe.

    Now, on the Raspberry Pi device, open the device properties file device.properties, located under target/classes. Edit the properties file to update the device registration details and the authentication credentials of the Speech To Text and Text To Speech services. After editing the ‘device.properties’ file, the entries should look similar to those shown below:

    ## Device Registration detail
    Organization-ID = xxxxxx
    Device-Type = iotsample-deviceType
    Device-ID = Device02
    Authentication-Method = token
    Authentication-Token = xxxxxxxx

    ## Optional fields
    Clean-Session = true

    ## Speech To Text Credentials
    stt-username = xxxxxxxx-c7df-zzzz-a6fd-xyzxyzxyzxyz
    stt-password = xyzxyzxyzxyz

    ## Text To Speech Credentials
    tts-username = xxxxxxxx-c7df-zzzz-a6fd-xyzxyzxyzxyz
    tts-password = xyzxyzxyzxyz

    Listing 1: Configuration of parameters as listed under the device.properties file
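
    As a quick check, the minimal sketch below, assuming the keys shown in Listing 1, reads the credential entries back out of device.properties; the Organization-ID, Device-* and Authentication-* keys are consumed directly by the DeviceClient when the Properties object is passed to it.

      import java.io.FileInputStream;
      import java.util.Properties;

      public class ReadDevicePropertiesSketch {
          public static void main(String[] args) throws Exception {
              Properties props = new Properties();
              props.load(new FileInputStream("target/classes/device.properties"));

              // Watson service credentials used by the STT and TTS code snippets
              System.out.println("STT user: " + props.getProperty("stt-username"));
              System.out.println("TTS user: " + props.getProperty("tts-username"));

              // Device registration details consumed by the WIoTP DeviceClient
              System.out.println("Org: " + props.getProperty("Organization-ID")
                      + ", Device: " + props.getProperty("Device-ID"));
          }
      }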

    This section helped you understand the configuration parameters and the importance of their values, to successfully execute the Echo Translation sample.

  7. Creating the Node-RED flow

    This section briefs you on the steps needed to configure the Node-RED flow that performs the language translation process.

    In the prior section, ‘Setting up the environment on the RPi Device’, you performed steps to set up the sample for device-side execution. In this section, you will be introduced to the steps that help configure the Node-RED flow.

    1. Coming back to the browser, maximize the application window that you minimized earlier. If you have already closed the tab or window, click on the application URL in the Bluemix application. This step opens the Node-RED interface.
       http://<unique-application-name>.mybluemix.net/red/
    2. The Node-RED flow that makes up the Bluemix-side execution is made available on the GitHub repository. Copy the contents of the file and import the flow through the clipboard in the Node-RED application.
      Import --> Clipboard --> <Paste the Contents> --> <Click Import>

      The imported Node-RED flow should look similar to the one shown in the following image:

      Image 5: Node-RED Flow depicting the configuration for Bluemix Side execution

    3. You generated the device credentials and application API keys while carrying out the steps in the section ‘Register your Device In Watson IoT Platform’. Similarly, you obtained the credentials for the Watson Language Translation service as you worked through the section ‘Create & Deploy Node-RED application and Watson services in Bluemix‘. Now, update the nodes configured in the Node-RED flow with the credentials you obtained.

    IBM IoT In Node: Double-click on the IBM IoT In node at the beginning of the flow. Choose API Key for Authentication from the drop-down menu, update the API Key entry with the API key and API token credentials, and choose Device Event for Input Type. Leave the rest at their defaults, i.e. keep the All check boxes checked, provide a custom name for the node, and click on Done to complete the node updates.

    Watson Translate Node: Double-click on the Watson Translate node (the green node named English – Spanish). Enter the authentication credentials to open up the next set of options. Choose Translate for Mode, Conversational for Domains, English for Source language, and Spanish for Target language. Click on Done to complete the node updates.

    IBM IoT Out Node: Double-click on the IBM IoT Out node, located at the end of the flow. The entries for Authentication and API Key should now be auto-populated, based on the choices made in the IoT In node. Choose Device Command for Output Type, update Device Type and Device ID with the details you obtained earlier, enter cmd for Command Type and json for Format, and set Data to {"hi":"hello"}. Provide a custom name for the node and click on Done to complete the node updates.

    Leave the other Node-RED nodes intact. Deploy the flow and observe the Debug console to ensure there are no errors or warnings from the flow. With this, we have successfully enabled the Node-RED flow, which picks up the output of the Speech To Text service from the Watson IoT Platform, translates it, and sends the output to the Text To Speech service through the IoT Platform.
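
    On the device side, the translated text sent by the IBM IoT Out node arrives as a device command. The following is a hedged sketch of how such a command can be received with the Java Client Library for the Watson IoT Platform; the command name cmd mirrors the IoT Out node configuration above, while the class name and the handling logic are illustrative.

      import java.util.Properties;

      import com.ibm.iotf.client.device.Command;
      import com.ibm.iotf.client.device.CommandCallback;
      import com.ibm.iotf.client.device.DeviceClient;

      public class TranslationCommandListenerSketch implements CommandCallback {

          @Override
          public void processCommand(Command cmd) {
              // The Node-RED flow publishes the translated text as a JSON payload
              System.out.println("Received " + cmd.getCommand() + ": " + cmd.getPayload());
              // ...hand the translated text over to Text To Speech here...
          }

          public static void main(String[] args) throws Exception {
              Properties props = new Properties();
              props.load(TranslationCommandListenerSketch.class.getResourceAsStream("/device.properties"));

              DeviceClient client = new DeviceClient(props);
              client.setCommandCallback(new TranslationCommandListenerSketch());
              client.connect();   // stays connected; processCommand() runs for each command
          }
      }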

    This section helped you set up the Bluemix-side execution part of the Voice Transmission – Echo Translation sample, by configuring the Node-RED flow to perform the language translation operation on the Bluemix platform.

  8. Initiating the Watson Voice translation

    In this section, you will be walked through the steps to initiate the Voice Transmission – Echo Translation sample.

    The following set of steps will kick start the execution of the Voice Transmission – Echo Translation sample:

    1. Come back to the Raspberry Pi device. Navigate to the source directory where you performed the Maven build command, i.e.
      cd iot-cognitive-samples/voice-transmission/echo-translation
    2. Execute the following Maven command to run the Echo Translation sample:
      mvn exec:java -Dexec.mainClass="com.ibm.watsoniot.FriendlyWatsonLanguageTranslator"
    3. As the execution begins, you should see a series of messages indicating a successful connection to the Watson IoT Platform, after which the sample waits for audio input from the user:
      Jul 22, 2016 1:02:13 PM com.ibm.iotf.client.AbstractClient createClient
      INFO: main: Org ID = ofz8jz
      Client ID = d:ofz8jz:iotsample-deviceType:Device05
      Jul 22, 2016 1:02:13 PM com.ibm.iotf.client.AbstractClient connect
      INFO: main: Connecting client d:ofz8jz:iotsample-deviceType:Device05 to ssl://ofz8jz.messaging.internetofthings.ibmcloud.com:8883 (attempt #1)...
      Jul 22, 2016 1:02:17 PM com.ibm.iotf.client.AbstractClient connect
      INFO: main: Successfully connected to the IBM Watson IoT Platform

      Listing 2: Message output showing the connectivity to Watson IoT Platform during the execution of the Echo Translation sample

    4. Now, hold the mic at an appropriate distance from your mouth and speak into it to initiate the conversation. Audio streaming stays open for the next 300 seconds (5 minutes), as per the current sample, which can always be tweaked to increase or decrease the duration. As the conversation is in progress, you should see the following excerpts from the execution:
      publishing tanscript: hello : true
      Playback started.
      34758
      Playback completed.
      publishing tanscript: good morning : true
      Playback started.
      59500
      Playback completed.
      publishing tanscript: how are you doing today : true
      Playback started.
      64500
      Playback completed.

      Listing 3: Message output showing the audio streaming excerpts during the execution of the Echo Translation sample

      The publishing transcript lines display the text output produced by the Speech To Text processing. The message Playback started indicates that streaming of the translated audio output from the Text To Speech processing has started and is in progress on the destination audio output. Once the audio streaming is completed, the message Playback completed is written to the prompt.
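
      For reference, playing back a WAV file generated by Text To Speech can be done with the standard javax.sound.sampled API, as in the minimal sketch below; the file name and console messages are illustrative, not taken from the sample.

        import java.io.File;

        import javax.sound.sampled.AudioInputStream;
        import javax.sound.sampled.AudioSystem;
        import javax.sound.sampled.Clip;

        public class PlayTranslationSketch {
            public static void main(String[] args) throws Exception {
                AudioInputStream stream = AudioSystem.getAudioInputStream(new File("translated-output.wav"));
                Clip clip = AudioSystem.getClip();
                clip.open(stream);

                System.out.println("Playback started.");
                clip.start();
                // Block until the clip has played through, then release the line
                Thread.sleep(clip.getMicrosecondLength() / 1000);
                clip.close();
                System.out.println("Playback completed.");
            }
        }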

      Execution on Eclipse:

      To execute the Voice Transmission – Echo Translation sample in Eclipse, run the program FriendlyWatsonLanguageTranslator.java as follows:

      FriendlyWatsonLanguageTranslator.java --> Right Click --> Run As --> Java Application

       

      That concludes the execution of the Echo Translation sample.

    In this section, you were introduced to the steps needed to execute the Voice Transmission – Echo Translation sample successfully.

  9. Conclusion

    This Watson IoT recipe, Translated Echo – Leverage Watson IoT Platform to translate voice, showcased the possibilities of live streaming audio and translating it into another language on the fly, without having to write it to disk, thus saving time and enhancing performance. This scenario is best suited for environments where continuous streaming of audio is a primary requirement. The hassles of storing audio and triggering its injection, either spontaneously or at defined time intervals, are eliminated here.

    Watch this space, as the next set of Watson IoT recipes is scheduled to utilize and enhance the current implementation.
