Transcribe audio in real time or from an audio file

Summary

Using Node.js and React components, create a web app that takes audio from your microphone or from a file and transcribes the speech into text. The app uses IBM® Watson™ Speech to Text to provide a selection of models, with support for multiple languages. Watson Speech to Text is available on IBM Cloud and with the Watson API Kit on IBM Cloud Pak® for Data.

Description

Built with React components and a Node.js server, the speech-to-text web app takes audio input from your microphone or from a file. The audio is streamed through a WebSocket to allow real-time transcription. You can watch the text appear and update while you speak.
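To illustrate how the text can appear and then change while you speak, here is a minimal sketch of folding the service's WebSocket messages into the displayed transcript. It assumes interim results are enabled and that each message carries a result_index and a results array whose entries hold a final flag and a list of alternatives. The helper names applyMessage and toText are hypothetical and are not taken from the app's source.

```javascript
// Sketch: fold Speech to Text WebSocket messages into the text on screen.
// applyMessage and toText are hypothetical helper names, not the app's own.

// Return a new results list with one parsed service message applied.
function applyMessage(results, message) {
  // Status messages such as {"state": "listening"} carry no results.
  if (!Array.isArray(message.results)) return results;
  const next = results.slice();
  const start = message.result_index ?? 0;
  message.results.forEach((result, i) => {
    // A later message with the same index replaces the earlier hypothesis.
    next[start + i] = {
      final: Boolean(result.final),
      transcript: result.alternatives[0].transcript,
    };
  });
  return next;
}

// Join all results received so far into the string to display.
function toText(results) {
  return results
    .filter(Boolean)
    .map((result) => result.transcript.trim())
    .join(' ');
}
```

In a browser, a message handler could call applyMessage(results, JSON.parse(event.data)) and render toText of the returned list. Keeping the final flag lets a React component style interim text differently from settled text.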

The Node.js server authenticates with the service using your credentials. The web app requests a temporary token from the server so that your credentials are never sent to the browser.
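The token exchange can be sketched as follows. This is not the pattern's actual server code, which may rely on the Watson SDK instead. It is a minimal illustration that assumes an IAM API key, Node.js 18 or later (for the global fetch), and a hypothetical /api/token route. The function names and the shape of the payload returned to the browser are assumptions made for this example.

```javascript
// Sketch of a token endpoint. Route, function names, and payload shape are
// illustrative assumptions, not the pattern's actual implementation.
const http = require('node:http');

const IAM_TOKEN_URL = 'https://iam.cloud.ibm.com/identity/token';

// Build the form-encoded request that exchanges an API key for an access token.
function buildIamTokenRequest(apiKey) {
  const body = new URLSearchParams({
    grant_type: 'urn:ibm:params:oauth:grant-type:apikey',
    apikey: apiKey,
  });
  return {
    url: IAM_TOKEN_URL,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/x-www-form-urlencoded',
        Accept: 'application/json',
      },
      body: body.toString(),
    },
  };
}

// Exchange the key for a short-lived token. fetchImpl can be swapped in tests.
async function fetchAccessToken(apiKey, fetchImpl = fetch) {
  const { url, options } = buildIamTokenRequest(apiKey);
  const response = await fetchImpl(url, options);
  if (!response.ok) throw new Error(`Token request failed: ${response.status}`);
  const { access_token: accessToken } = await response.json();
  return accessToken;
}

// Only the token and the service URL go to the browser, never the API key.
function toBrowserPayload(accessToken, serviceUrl) {
  return { accessToken, url: serviceUrl };
}

// Create (but do not start) an HTTP server exposing the hypothetical route.
function createTokenServer({ apiKey, serviceUrl }) {
  return http.createServer(async (req, res) => {
    if (req.method !== 'GET' || req.url !== '/api/token') {
      res.writeHead(404).end();
      return;
    }
    try {
      const token = await fetchAccessToken(apiKey);
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(toBrowserPayload(token, serviceUrl)));
    } catch (err) {
      res.writeHead(502, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: 'Could not obtain a token' }));
    }
  });
}
```

To try it, call createTokenServer with an API key and service URL read from environment variables, then call listen on the returned server. Because the key is used only inside the server process, the browser receives nothing but a token that expires on its own.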

This app is intended to get you started. A speech-to-text app is a fun example, but the real value comes when you use this code to add speech input to your own application.

Watson Speech to Text is available on IBM Cloud and with the Watson API Kit on IBM Cloud Pak for Data. With IBM Cloud Pak for Data, you can provision Watson Speech to Text on your own private cloud or wherever Red Hat OpenShift runs.

After you complete this code pattern, you will understand how to:

  • Stream audio to the Watson Speech to Text service using a WebSocket
  • Integrate Watson Speech to Text in a web app
  • Use React components and a Node.js server

Flow

[Figure: Transcribe audio flow diagram]

  1. The user supplies audio input to the application (running locally, in IBM Cloud, or in IBM Cloud Pak for Data).
  2. The application sends the audio data to the Watson Speech to Text service through a WebSocket connection.
  3. As the audio is processed, the Watson Speech to Text service returns the transcribed text and related metadata, which the application displays.
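The flow above can be sketched from the client side as a short message sequence: open a WebSocket to the service's recognize endpoint, send a JSON start message, send the audio as binary chunks, then send a JSON stop message. The sketch assumes a temporary access token is passed as a query parameter and that the socket is already open when sendAudio is called. The helper names are hypothetical, and the model and content type in the usage comment are examples only.

```javascript
// Sketch of the client-side message sequence. Helper names are hypothetical.

// Turn the service's HTTPS URL into the WebSocket recognize URL.
function buildRecognizeUrl(serviceUrl, accessToken, model) {
  const base = serviceUrl.replace(/\/+$/, '').replace(/^http/, 'ws');
  const url = new URL(`${base}/v1/recognize`);
  url.searchParams.set('access_token', accessToken);
  url.searchParams.set('model', model);
  return url.toString();
}

// Text message that opens a recognition request with interim results enabled.
function startMessage(contentType) {
  return JSON.stringify({
    action: 'start',
    'content-type': contentType,
    interim_results: true,
  });
}

// Text message that tells the service no more audio is coming.
function stopMessage() {
  return JSON.stringify({ action: 'stop' });
}

// Send start, then each binary audio chunk, then stop, over an open socket.
function sendAudio(socket, chunks, contentType) {
  socket.send(startMessage(contentType));
  for (const chunk of chunks) {
    socket.send(chunk);
  }
  socket.send(stopMessage());
}

// Usage in a browser (not run here; values are examples):
//   const socket = new WebSocket(
//     buildRecognizeUrl(serviceUrl, accessToken, 'en-US_BroadbandModel'));
//   socket.onopen = () => sendAudio(socket, audioChunks, 'audio/flac');
//   socket.onmessage = (event) => console.log(JSON.parse(event.data));
```

For microphone input the chunks arrive over time, so a real client would send each chunk as it is captured and send the stop message only when recording ends. The all-at-once loop here fits the file-upload case.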

Instructions

Find the detailed steps for this pattern in the readme file. The steps show you how to:

  1. Provision the Watson Speech to Text service.
  2. Deploy the server.
  3. Use the web app.