Since its launch, you’ve made the IBM Watson Developer Cloud one of IBM’s most vibrant and innovative communities on Bluemix. Today more than 5,000 partners, developers, data hobbyists, entrepreneurs, students and others have contributed to building 6,000+ apps infused with Watson’s cognitive computing capabilities.

A few months ago we released eight beta Watson services so that this community can test drive them, think of new ways to apply and tap into Watson’s capabilities, and harden each service as we prepare them for general availability. The services – which range from Language Identification and Machine Translation to Visualization Rendering and User Modeling – are being embedded into a new class of cognitive apps.

One example is Red Ant’s Sell Smart mobile app, a retail sales trainer that lets employees easily identify unique customer buying preferences by analyzing demographics, purchase history, wish lists, pricing and other product information. Another is eyeQ’s eyeQ Insights, which helps retailers understand how consumers make purchasing decisions while standing in the store.

Today, we are excited to announce the arrival of five new beta services on the Watson Developer Cloud. The following free beta services are available now on Bluemix:

  • Speech to Text
  • Text to Speech
  • Visual Recognition
  • Concept Insights
  • Tradeoff Analytics

We’ve included an overview of each service below. Our team will continue to add more services to the Watson Developer Cloud as they become available. Stay tuned.

New services

Speech to Text

Speech to Text is a cloud-based, real-time service that uses low-latency speech recognition to convert speech into text for voice-controlled mobile applications, transcription services, and more. Transcriptions are continuously sent back to the client and retroactively corrected as more speech is heard, helping the system learn.

The service is based on more than 50 years of speech research at IBM. It uses state-of-the-art algorithms based on convolutional neural networks or “deep learning”. Using these algorithms, the Watson team has published the best accuracy results (10.4% word error rate vs. 12.5% for the second best as of today) on the popular Switchboard Hub5-2000 benchmark, and provided technology that has been deployed on more than 500 million smartphones. This is the first time in 10 years that the IBM team is delivering speech technology broadly to developers. While the base algorithms are solid, the service will keep getting better as it gets more usage and training data.
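For readers unfamiliar with the metric quoted above, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer’s output into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition in Python. The function name `word_error_rate` and the sample sentences are hypothetical; this is not the scoring tool used for the Switchboard benchmark, which applies its own text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: real benchmark scoring also normalizes
    punctuation, casing, and hesitations before comparing.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion in 7 words.
wer = word_error_rate(
    "turn the lights on in the kitchen",
    "turn the light on in kitchen",
)
print(f"{wer:.1%}")  # prints 28.6%
```

By this measure, a 10.4% word error rate corresponds to roughly one word-level error for every ten words in the reference transcript.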

Use Cases:

  • Enable voice control over apps, embedded devices or accessories
  • Provide transcription of meetings and conference calls in real-time
  • Critical building block for “Speech-to-Speech” translation

Documentation | Demo

Text to Speech

Text to Speech converts textual input into speech, and provides the option of three voices in English or Spanish, including the American English voice used by Watson in the 2011 Jeopardy match. Text to Speech generates synthesized audio output complete with appropriate cadence and intonation.

The user can input any English or Spanish text to generate speech output. Potential applications include assistance for the vision-impaired, reading-based education tools, and a range of mobile apps.

Use Cases:

  • Assistance for the vision-impaired, reading and language education
  • Enable the audio reading of texts and emails to drivers
  • Critical building block for “Speech-to-Speech” translation

Documentation | Demo

Visual Recognition

Visual Recognition analyzes the visual appearance of images or video frames to understand what is happening in a scene.

The Visual Recognition service includes an unmatched number of preset classifiers and trained labels (2,000+) and a taxonomy that recognizes 150+ different sports. It can ingest batches of 1,000+ images and recognize multiple labels in a single picture. Like the Speech to Text service, Visual Recognition relies on deep learning: convolutional neural networks are used as semantic classifiers that recognize many visual entities such as settings, objects, and events. Input JPEG images into the service and you will receive a set of labels and probability scores such as “soccer, 0.7” or “baseball, 0.3”.
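As a rough idea of how an app might consume label and score pairs like these, here is a minimal Python sketch that keeps only the confident labels and orders them by score. The `Label` type, the `top_labels` helper, and the 0.5 threshold are all assumptions made for illustration; the actual response format is described in the service documentation.

```python
from typing import List, NamedTuple


class Label(NamedTuple):
    """One (label, probability score) pair, e.g. ("soccer", 0.7).

    Hypothetical container, not the service's response schema.
    """
    name: str
    score: float


def top_labels(labels: List[Label], min_score: float = 0.5,
               limit: int = 3) -> List[Label]:
    """Return up to `limit` labels scoring at least `min_score`,
    highest score first."""
    confident = [label for label in labels if label.score >= min_score]
    return sorted(confident, key=lambda label: label.score,
                  reverse=True)[:limit]


# Scores taken from the example in the text above.
scores = [Label("baseball", 0.3), Label("soccer", 0.7)]
for label in top_labels(scores):
    print(label.name, label.score)  # prints: soccer 0.7
```

The threshold is a design choice for the app: a lower value surfaces more candidate labels per image, a higher one favors precision.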

Use Cases:

  • Organize and ingest large collections of digital images
  • Build semantic association between images from multiple users
  • Understand consumer shopping preferences based on image queries

Documentation | Demo

Concept Insights

Concept Insights handles text conceptually, delivering a search capability that can uncover insights in text that traditional keyword searches would miss.

Concept Insights links user-provided documents with a pre-existing graph of concepts based on Wikipedia (e.g. ‘The New York Times’, ‘Machine learning’, etc.). Two types of links are identified: explicit links when a document directly mentions a concept, and implicit links which connect the user’s documents to relevant concepts that are not directly mentioned. Users of this service can also search for documents that are relevant to a concept or collection of concepts by exploring the explicit and implicit links.
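The difference between explicit and implicit links can be illustrated with a toy example. The Python sketch below is a deliberately simplified assumption, not the service’s algorithm: it treats a concept as explicitly linked when its name appears in the document text, and as implicitly linked when it is a direct neighbor of an explicit concept in a tiny hand-made graph. The graph contents and the `link_document` function are hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical miniature concept graph: concept -> related concepts.
CONCEPT_GRAPH: Dict[str, Set[str]] = {
    "Machine learning": {"Artificial intelligence", "Statistics"},
    "Artificial intelligence": {"Machine learning"},
    "Statistics": {"Machine learning"},
    "The New York Times": {"Journalism"},
    "Journalism": {"The New York Times"},
}


def link_document(text: str,
                  graph: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (explicit, implicit) concept links for one document.

    Explicit: the concept name occurs in the text (case-insensitive).
    Implicit: a graph neighbor of an explicit concept that is not
    itself mentioned. Both rules are simplifying assumptions.
    """
    lowered = text.lower()
    explicit = {concept for concept in graph if concept.lower() in lowered}
    implicit: Set[str] = set()
    for concept in explicit:
        implicit |= graph[concept]
    return explicit, implicit - explicit


explicit, implicit = link_document(
    "Our team applies machine learning to reader data.", CONCEPT_GRAPH)
print(sorted(explicit))  # ['Machine learning']
print(sorted(implicit))  # ['Artificial intelligence', 'Statistics']
```

A search for the concept “Statistics” could then return this document through its implicit link, even though the word never appears in it.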

Use Cases:

  • Improve search queries with results that are more conceptually related
  • Locate sources of expertise across large or complex organizations
  • Deepen customer engagement on externally facing websites

To learn more about Concept Insights, check out this great video and blog post by Luis A. Lastras, a research scientist in Watson Group. Luis and his team joined the Watson Group from the famous IBM Research Labs located in Yorktown Heights, New York, and brought you the Concept Insights service within one year.

Documentation | Demo

Tradeoff Analytics

Tradeoff Analytics enables dynamic real-time ‘tradeoff’ decisions across static or changing parameters, all delivered in an interactive visual display. Tradeoff Analytics enables better decision-making by dynamically weighing multiple, often conflicting, goals. This service uses Pareto filtering techniques to identify the optimal alternatives across multiple criteria. It then uses various analytical and visual approaches to help the decision maker explore tradeoffs and alternatives.

Tradeoff Analytics can be used to help make complex decisions such as which mortgage to take, which treatment option to follow, or which car to purchase.
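To make the idea of Pareto filtering concrete, here is a minimal Python sketch applied to a car purchase. An option is dropped when some other option is at least as good on every criterion and strictly better on at least one; what remains are the genuine tradeoffs. The function names, the criteria, and the car data are hypothetical, and this is the textbook technique rather than the service’s implementation.

```python
from typing import Dict, List

# Hypothetical criteria: minimize price, maximize mileage and safety.
OBJECTIVES: Dict[str, str] = {"price": "min", "mpg": "max", "safety": "max"}


def dominates(a: dict, b: dict, objectives: Dict[str, str]) -> bool:
    """True if `a` is no worse than `b` on every objective and
    strictly better on at least one."""
    strictly_better = False
    for key, direction in objectives.items():
        # Flip the sign of "max" criteria so that smaller is always better.
        x, y = (a[key], b[key]) if direction == "min" else (-a[key], -b[key])
        if x > y:
            return False
        if x < y:
            strictly_better = True
    return strictly_better


def pareto_front(options: List[dict],
                 objectives: Dict[str, str]) -> List[dict]:
    """Keep the options that no other option dominates."""
    return [
        option for option in options
        if not any(dominates(other, option, objectives) for other in options)
    ]


cars = [
    {"name": "A", "price": 20000, "mpg": 30, "safety": 4},
    {"name": "B", "price": 25000, "mpg": 35, "safety": 5},
    {"name": "C", "price": 26000, "mpg": 33, "safety": 4},  # worse than B on all three
    {"name": "D", "price": 18000, "mpg": 28, "safety": 3},
]
print([car["name"] for car in pareto_front(cars, OBJECTIVES)])  # ['A', 'B', 'D']
```

Car C is filtered out because car B beats it on every criterion. A, B, and D each win on something, so choosing among them is a real tradeoff left to the decision maker.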

Use Cases:

  • Enable retailers and manufacturers to determine product mix
  • Allow consumers to compare and contrast competitive products or services
  • Help physicians select optimal treatment options based on multiple criteria

Documentation | Demo

14 comments on “Five new services expand IBM Watson capabilities to images, speech, and more”

  1. MichaelDHolmes February 05, 2015

    Congratulations to the team on a job well done exposing this exciting new functionality to the development community. It’s a great testament to the confidence IBM places in the creativity of developers to come up with ways of implementing Watson’s capabilities in new and exciting ways that we at IBM would likely never have thought of independently. The development community is a fantastic source of innovation and we can’t wait to see the new ways developers find to put Watson to work!

  2. The future is now!

  3. Great work guys.

    We have been waiting to leverage Watson’s capabilities to solve our NLP problems for a while now.

    It is great to see that all these services are available as an API now.

    Again, great job!!

  4. Great, waiting for the French version for Text Speech

  5. […] eight services available now becomes thirteen, thanks to the addition of five new services including Concept Insights, Speech-to-Text Translation, Text-to-Speech Translation, Tradeoff […]

  6. This is very promising. To have a text to speech application that works well with a good voice is wonderful. Many thanks to the people at IBM that made this happen.

  7. […] it always drove me nuts that it only worked in Chrome.  Last month the IBM Watson team released 5 new services, and guess what… Speech Recognition and Speech Synthesis are […]

  8. This is insightful. Am learning a lot everyday from this talented and selfless team. Hoping to do my final year project using Bluemix.
