As part of the Watson development platform’s continued expansion, IBM is today introducing the latest set of cognitive services to move into General Availability (GA), which will drive new Watson-powered applications. They include the GA release of IBM Watson Language Translation (a merger of Language Identification and Machine Translation), IBM Speech to Text, and IBM Text to Speech. These cognitive speech and language services are open to anyone, enabling application developers and IBM’s growing ecosystem to develop and commercialize new cognitive computing solutions that can do the following:
  • Translate news, patents, or conversational documents across several languages (Language Translation)
  • Produce transcripts from speech in multi-media files or conversational streams, capturing vast information for a myriad of business uses. This Watson cognitive service also benefits from a recent IBM conversational speech transcription breakthrough to advance the accuracy of speech recognition (Speech to Text)
  • Make their web, mobile, and Internet of Things applications speak with a consistent voice across all Representational State Transfer (REST)-compatible platforms (Text to Speech)
Organizations are already building applications with these services, which IBM opened up in beta over the past year on the Watson Developer Cloud on IBM Bluemix. Developers have used these APIs to build prototype applications in only two days at IBM hackathons, demonstrating the versatility and ease of use of the services.

Supported Capabilities

We have made several updates since the beta releases, inspired by feedback from our user community. Language Translation now supports:
  • Language Identification – identifies the language of the textual input if it is one of the 62 supported languages
  • The News domain – targeted at news articles and transcripts, it translates English to and from French, Spanish, Portuguese, or Arabic
  • The Conversational domain – targeted at conversational colloquialisms, it translates English to and from French, Spanish, Portuguese, or Arabic
  • The Patent domain – targeted at technical and legal terminology, it translates Spanish, Portuguese, Chinese, or Korean to English
Speech to Text now supports:
  • New wideband and narrowband telephony language support – U.S. English and Spanish
  • Broader vocabulary coverage, and improved accuracy for U.S. English
Text to Speech now supports:
  • U.S. English, U.K. English, Spanish, French, Italian, and German
  • A subset of SSML (Speech Synthesis Markup Language) for U.S. English, U.K. English, French, and German (see the documentation for more details)
  • Improved programming support for applications stored outside of Bluemix
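As an illustration of the SSML support mentioned above, here is a minimal sketch that builds a small SSML fragment in Python. The `speak` and `break` elements come from the W3C SSML standard; whether they fall within the subset this service accepts is an assumption, so check the documentation before relying on them. The helper name `build_ssml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_ssml(first, second, pause_ms=500):
    """Return an SSML fragment: two pieces of text separated by a pause.

    Assumption: the target service accepts a bare <speak> root and the
    standard <break time="..."/> element.
    """
    speak = ET.Element("speak")
    speak.text = first
    # <break> is an empty element; the text after it is stored as its tail.
    pause = ET.SubElement(speak, "break", time=f"{pause_ms}ms")
    pause.tail = second
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Hello.", " Welcome to the demo.")
print(ssml)
```

Building the markup with an XML library rather than string concatenation means characters such as `&` or `<` in the spoken text are escaped automatically.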

Pricing and Freemium Tiers

Trial Bluemix accounts remain free. Please visit www.bluemix.net to register and get free instant access to a 30-day trial without a credit card. Use of the Speech to Text, Text to Speech, and Language Translation services is free during this trial period. After the trial period, pricing for Language Translation will be:
  • $0.02 per thousand characters. The first million characters per month are free.
  • An add-on charge of $3.00 per thousand characters for usage of the Patent model in Language Translation.
After the trial period, pricing for Speech to Text will be:
  • $0.02 per minute. The first thousand minutes per month are free.
  • An add-on charge of $0.02 per minute for usage of narrowband (telephony) models. The first thousand minutes per month are free.
After the trial period, pricing for Text to Speech will be:
  • $0.02 per thousand characters. The first million characters per month are free.
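To make the pricing above concrete, the following sketch estimates a monthly bill from the listed rates. The rates and free allowances are taken from this post; the function names are hypothetical, and two points are assumptions because the post does not spell them out: Patent characters count toward base Language Translation usage and the Patent add-on has no free allowance, while the narrowband add-on has its own free allowance of one thousand minutes.

```python
from decimal import Decimal

# Rates from the announcement, in USD.
RATE_PER_1K_CHARS = Decimal("0.02")          # Language Translation, Text to Speech
PATENT_ADDON_PER_1K_CHARS = Decimal("3.00")  # Patent model add-on
RATE_PER_MINUTE = Decimal("0.02")            # Speech to Text
NARROWBAND_ADDON_PER_MINUTE = Decimal("0.02")
FREE_CHARS = 1_000_000
FREE_MINUTES = 1_000


def billable(used, free):
    """Usage remaining after the monthly free allowance is subtracted."""
    return max(used - free, 0)


def translation_cost(total_chars, patent_chars=0):
    """Monthly Language Translation cost.

    Assumption: patent_chars is included in total_chars and additionally
    incurs the add-on on every character (no free allowance).
    """
    base = Decimal(billable(total_chars, FREE_CHARS)) / 1000 * RATE_PER_1K_CHARS
    addon = Decimal(patent_chars) / 1000 * PATENT_ADDON_PER_1K_CHARS
    return base + addon


def speech_to_text_cost(total_minutes, narrowband_minutes=0):
    """Monthly Speech to Text cost.

    Assumption: narrowband_minutes is included in total_minutes, and the
    add-on applies only beyond its own free thousand minutes.
    """
    base = Decimal(billable(total_minutes, FREE_MINUTES)) * RATE_PER_MINUTE
    addon = (Decimal(billable(narrowband_minutes, FREE_MINUTES))
             * NARROWBAND_ADDON_PER_MINUTE)
    return base + addon


def text_to_speech_cost(total_chars):
    """Monthly Text to Speech cost."""
    return Decimal(billable(total_chars, FREE_CHARS)) / 1000 * RATE_PER_1K_CHARS


print(translation_cost(1_500_000))      # 500,000 billable characters
print(speech_to_text_cost(1_500, 1_200))
print(text_to_speech_cost(2_000_000))
```

`Decimal` is used instead of floating point so that small per-unit rates do not accumulate rounding error in the totals.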

Transition Plan

We look forward to continuing our partnership with the many clients, business partners, and creative developers who have built innovative applications using the beta versions of the four services: Speech to Text, Text to Speech, Machine Translation, and Language Identification. If you have used these beta services, please migrate your applications to the GA services by August 10, 2015. After this date the beta plans for these services will no longer be available. For details about upgrading, see:

We’re eager to see the next round of cognitive applications based on the Speech and Translation Services. For questions, join the discussion in our Forum, or send an email to HiWatson@us.ibm.com with “Speech” or “Translation” in your inquiry.

IBM is placing the power of Watson in the hands of developers and an ecosystem of partners, entrepreneurs, tech enthusiasts, and students with a growing platform of Watson services (APIs) to create an entirely new class of apps and businesses that make cognitive computing systems the new computing standard.

8 comments on "IBM Watson Language Translation and Speech Services – General Availability"

  1. good information.

  2. Interesting… this suggests that the keyboard’s end is close.

  3. Rick Seger July 28, 2015

    The future looks bright. However, I do need to take issue with one key point/use case you described. Everyone in Education circles knows that “when you write… you learn”. Let’s not do damage to the learning process. The real value comes when you can utilize the voice and handwritten notes together. Now that’s when the future looks really interesting… and not just for students.

    • Rick, I completely agree. Taking notes is critical for the absorption process. But there are many times when detailed note-taking is difficult, so having transcription as an alternative or augmentation would be a real plus.
