Get IBM Watson credentials
Log in at https://console.bluemix.net/catalog/services/speech-to-text and generate credentials for the Speech to Text service.
Go to Service Credentials and copy the username and password values; we're going to use them soon.
Create Box Metadata template
To be able to attach the recognized speech to audio files we need to create a Metadata template on Box:
- Access the Metadata section on your Box admin panel
- Create a new template called "Audio data"
- Create a metadata field named "transcript" of type "text"
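If you prefer to script this instead of clicking through the admin panel, the template above can be described as a small JSON body. The sketch below only builds that body and does not call Box. It assumes the shape Box documents for its metadata template endpoint (`POST /2.0/metadata_templates/schema`), where a text field is declared with type `string`; the helper name `build_template_payload` is ours, not part of any library.

```python
import json


def build_template_payload(display_name, field_name):
    """Build a JSON-serializable body describing a Box metadata template.

    Assumption: the body follows the shape Box documents for
    POST /2.0/metadata_templates/schema. Nothing is sent over the network.
    """
    return {
        "scope": "enterprise",
        "displayName": display_name,
        "fields": [
            {
                "type": "string",  # Box's API name for a free-text field
                "key": field_name,
                "displayName": field_name,
            }
        ],
    }


payload = build_template_payload("Audio data", "transcript")
print(json.dumps(payload, indent=2))
```

The admin panel route described above remains the simplest option; this is only meant to show what the template boils down to.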
Step 1: On New File uploaded
Now let's open Stamplay's visual flow builder to put together this automation. The workflow starts every time a new file is uploaded to a given folder.
- Initialize the project and create a flow
- Select the Box connector and then the "New File uploaded" trigger
- Connect your Box account by granting access to Stamplay
- You can either search for or copy the Id of the folder where audio files will be uploaded (you can find the folder Id from the URL)
- You'll be asked to test the trigger, so upload a file to the target folder and wait until Stamplay confirms that data has been received successfully. Then click Save.
Step 2: Get the uploaded file
The next step is to grab the file that has been uploaded on Box so that we can pass it to IBM Watson.
- Hover over the + icon on the first step of your flow and click Action
- Select the Box component and then pick the Download File action
- This action requires a valid Box file Id, which you can grab from the previous step by using the data mapper. Click the button next to the input field to open it.
- Select the data pill id (body.source.id); it sits right after the data pill type
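A data pill such as body.source.id is essentially a dotted path into the output of a previous step. As a rough illustration of what the data mapper does for us, here is a minimal sketch that follows such a path through nested dictionaries. The trigger output below is hypothetical: the tutorial only tells us that `type` and `id` live under `body.source`, so everything else about the payload is an assumption.

```python
def resolve_pill(payload, path):
    """Follow a dotted path such as 'body.source.id' through nested dicts."""
    value = payload
    for key in path.split("."):
        value = value[key]  # raises KeyError if the path does not exist
    return value


# Hypothetical trigger output; only 'type' and 'id' are named in the tutorial.
trigger_output = {"body": {"source": {"type": "file", "id": "1234567890"}}}

file_id = resolve_pill(trigger_output, "body.source.id")
print(file_id)
```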
Step 3: Recognize the Audio
Now we're ready to pass this file to IBM Watson.
- Add one more action to your workflow
- Select IBM Watson Speech to Text and then pick Recognize Speech From Audio
- Paste the credentials that you previously copied from Bluemix, click Connect and then Continue.
- In the Content Type field, select the audio file format that you expect to be uploaded
- In the second field, named File, pass the URL data pill that you can grab from the result of the Download File step.
Step 4: Retrieve the result of the Speech Recognition
Speech recognition is a complex process, so the service does not return a result right after it is called. For this reason we need to retrieve the result with a separate action step.
- Add one more action, select the IBM Watson Speech to Text connector and pick the Retrieve Text From Speech Recognition action
- To fill the Job Id field, pass the Id from the result of the previous Recognize Speech From Audio action
- After that, let's turn on the flow so we can see if the recognition works
Upload an audio file to the target Box folder. If everything has been configured correctly, the flow will start and after a while we'll get the resulting transcript.
To see if the flow ran successfully, open the History section (the run may show as pending for several minutes, depending on the size of the audio file).
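The "submit a job, then come back for the result" pattern used in Steps 3 and 4 can be sketched as a small polling loop. This is not how Stamplay implements it (we don't know that); it is a generic illustration. `fetch_job` is a hypothetical callable standing in for the "retrieve" action, and the `status` values are an assumption modelled on the states an asynchronous recognition job typically reports.

```python
import time


def wait_for_job(fetch_job, job_id, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Poll fetch_job(job_id) until the job reports a terminal status.

    fetch_job is a hypothetical callable returning a dict with a 'status' key.
    """
    for _ in range(max_attempts):
        job = fetch_job(job_id)
        if job["status"] == "completed":
            return job
        if job["status"] == "failed":
            raise RuntimeError(f"Recognition job {job_id} failed")
        sleep(interval)  # still waiting or processing: try again later
    raise TimeoutError(f"Recognition job {job_id} still pending after {max_attempts} checks")


# Stand-in for the real service: pending twice, then completed.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "results": []},
])


def fake_fetch(job_id):
    return next(_responses)


# A no-op sleep keeps the demo instant.
job = wait_for_job(fake_fetch, "job-123", sleep=lambda seconds: None)
print(job["status"])
```

The larger the audio file, the more polling rounds you should expect, which is why a run can sit in a pending state for a while.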
Step 5: Put the transcript together
IBM Watson Speech Recognition returns the transcript in the form of a list of sentences, so we need to append them to each other to get a single piece of text that will be applied as metadata.
For this we're going to use variables. Open the settings of the flow and create a variable named fulltranscript.
Step 6: Iterate over the list of Watson's results
We need to append the single sentences extracted by IBM Watson to each other. To do this:
- Add a Loop step to the flow
- In the List field, pass the results data pill that you can find inside the Retrieve Text From Speech Recognition action
Step 7: Iterate over the list of IBM Watson's results
Now we need to concatenate every single sentence, so we'll use the previously created variable, fulltranscript, and update it incrementally until it contains the full text.
The logic is the following: for each result, fulltranscript is updated by appending the new piece of text to the variable's current value.
Consider a list of sentences "Hi", "my name", "is Giuliano". Before the loop starts, fulltranscript is empty (""). The execution will go like this:
- "" + "Hi"
- "Hi" + "my name"
- "Hi my name" + "is Giuliano"
Final result: "Hi my name is Giuliano".
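The same accumulation can be written out in a few lines of Python. This is just a sketch of the logic, not Stamplay code. Note one assumption: to end up with "Hi my name is Giuliano" the pieces need a space between them, so the hypothetical helper `append_sentence` inserts one whenever the variable is not empty.

```python
def append_sentence(current, sentence):
    """Return the variable's new value: current value + the new sentence."""
    sentence = sentence.strip()
    if not current:
        return sentence  # first item: nothing to separate from
    return f"{current} {sentence}"


sentences = ["Hi", "my name", "is Giuliano"]

fulltranscript = ""  # the flow variable starts empty
for sentence in sentences:  # one pass per item, like the Loop step
    fulltranscript = append_sentence(fulltranscript, sentence)

print(fulltranscript)  # Hi my name is Giuliano
```

In the flow itself, keep the separator in mind when you build the new variable value, otherwise the sentences will run together.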
Let's do this in our flow:
- Hover your cursor over the + icon of the Loop step and select In, then Action (actions inside a Loop are executed once for each item in the list)
- Select the Variable component and then pick the Set/Update flow variable action
- Select the fulltranscript variable from the dropdown
- Set the new variable value to the current value + the item processed by the Loop.
Step 8: Apply the Metadata
Finally we can grab the full text stored in the variable and apply it to the Metadata of the file on Box.
- Add one last action outside of the Loop
- Select the Box component and then the Create Metadata on File action
- The File Id is the same one we used for the Download File action, and we can grab it again from the results of the very first step of our workflow
- Pick the Metadata template previously created (Audio data)
- Stamplay will load the fields of the Metadata template, and we simply need to pass the content of the fulltranscript variable to the transcript field.
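For the curious, here is roughly what this last action amounts to at the HTTP level. The sketch only builds a description of the request and sends nothing. It assumes the endpoint shape Box documents for attaching metadata to a file (`POST /2.0/files/{file_id}/metadata/{scope}/{template_key}`); the template key `audioData` and the helper name `build_metadata_request` are hypothetical, and your template's real key may differ.

```python
def build_metadata_request(file_id, template_key, transcript):
    """Describe the HTTP request that attaches a transcript to a Box file.

    Assumption: the URL follows Box's documented
    POST /2.0/files/{file_id}/metadata/{scope}/{template_key} shape.
    Authentication headers are deliberately left out of this sketch.
    """
    return {
        "method": "POST",
        "url": f"https://api.box.com/2.0/files/{file_id}/metadata/enterprise/{template_key}",
        "headers": {"Content-Type": "application/json"},
        "json": {"transcript": transcript},  # one key per template field
    }


# 'audioData' is a hypothetical key for the "Audio data" template.
request = build_metadata_request("1234567890", "audioData", "Hi my name is Giuliano")
print(request["url"])
```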
Upload a new file and after a few minutes you'll be able to see its metadata enriched by this powerful combo!
At Stamplay we make it easy for people to automate processes and create high value integrations by tying together different apps.
Sign up for a free trial and start automating your processes on Stamplay with Jira, Intercom and hundreds of other apps now.
If you need help connecting your apps, or have an API that you want to make easy to connect with, tweet us at @stamplay and/or drop us a mail at firstname.lastname@example.org