Can we enable speaker diarization and smart formatting in the Speech to Text service at the same time? We need both for our requirement: we have to distinguish between agent and customer, and we also need customer addresses and phone numbers in smart format.
Looking at the example at https://speech-to-text-demo.mybluemix.net/, only one or the other works, primarily because smart formatted output is not available in the timestamped words, and we need the timestamped words in order to diarize the text. In the example below, the timestamped text holds “twenty thousand dollars” whereas the transcript reads $20000. Is there any way we can achieve both speaker diarization and smart formatting? Please suggest possible options. Thanks!

[ "twenty", 27.19, 27.51 ],
[ "thousand", 27.51, 27.83 ],
[ "dollars", 27.83, 28.35 ] ],
"confidence": 0.935,
"transcript": "thank you for calling this is Dave speaking how can I help you hi Dave I filled out an application last night and the last page it says to call and give more information I'd be more than happy to assist you with that I'll have to ask you some additional questions okay okay vehicle that you're looking to purchase are you purchasing from an individual or from a dealer I'm an individual okay all right %HESITATION any special occasion for the car purchase no I just want a new car okay and looks like you applied for $20000 " } ],
"final": true } ],
"result_index": 0 }
Answer by @chughts (8573) | Apr 17, 2017 at 01:41 PM
What you are seeing may just be a feature of the sample application, as the API documentation (https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/#recognize_sessionless_nonmp12) doesn't mention any exclusivity between the speaker_labels and smart_formatting options. So I guess you can try specifying both.
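As a rough illustration of "specifying both", the query string for a recognize request could be built as below. This is only a sketch: the base URL is a hypothetical placeholder (use the endpoint of your own service instance), the helper name build_recognize_url is made up for this example, and adding the timestamps parameter is an assumption on my part since the thread only mentions speaker_labels and smart_formatting by name.

```python
from urllib.parse import urlencode

# Hypothetical placeholder, NOT a real endpoint: substitute the recognize
# URL of your own Speech to Text service instance.
BASE_URL = "https://example.invalid/speech-to-text/api/v1/recognize"


def build_recognize_url(base_url: str = BASE_URL) -> str:
    """Return a recognize URL that asks for both options at once."""
    params = {
        "speaker_labels": "true",    # option named in the thread
        "smart_formatting": "true",  # option named in the thread
        "timestamps": "true",        # assumption: request word timings explicitly
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_recognize_url())
```

The audio itself would still be sent in the request body with your service credentials; that part is left out here because it depends on the client library you use.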
I have tried specifying both, but the timestamped words aren't smart formatted.
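To make the problem concrete, here is a minimal sketch of how timestamped words are typically joined with speaker labels to get per-speaker text. It assumes each timestamp is a [word, from, to] triple (as in the question) and that each speaker label is a dict with "from", "to" and "speaker" keys whose times match a word exactly. The function name, the speaker numbers and the first sample word are invented for illustration; only the three "twenty thousand dollars" entries come from the question.

```python
from typing import Dict, List, Optional, Tuple

Word = Tuple[str, float, float]  # (word, start time, end time)


def group_words_by_speaker(
    timestamps: List[Word], speaker_labels: List[Dict]
) -> List[Tuple[Optional[int], str]]:
    """Merge consecutive words with the same speaker into one turn."""
    # Look up the speaker by the exact (from, to) pair of each word.
    speaker_at = {(lab["from"], lab["to"]): lab["speaker"] for lab in speaker_labels}
    turns: List[Tuple[Optional[int], List[str]]] = []
    for word, start, end in timestamps:
        speaker = speaker_at.get((start, end))  # None if no label matches
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(word)
        else:
            turns.append((speaker, [word]))
    return [(speaker, " ".join(words)) for speaker, words in turns]


if __name__ == "__main__":
    words = [
        ("okay", 26.50, 26.80),  # invented sample word
        ("twenty", 27.19, 27.51),
        ("thousand", 27.51, 27.83),
        ("dollars", 27.83, 28.35),
    ]
    labels = [  # invented speaker assignments
        {"from": 26.50, "to": 26.80, "speaker": 0},
        {"from": 27.19, "to": 27.51, "speaker": 1},
        {"from": 27.51, "to": 27.83, "speaker": 1},
        {"from": 27.83, "to": 28.35, "speaker": 1},
    ]
    print(group_words_by_speaker(words, labels))
```

Because the turns are assembled from the timestamped words, they carry the unformatted "twenty thousand dollars" rather than the $20000 that appears in the transcript field.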
Just to clarify what I think you are saying: are you using the speaker labels option or the smart formatting option?
Answer by Srividhya_Narayanan (1) | Apr 19, 2017 at 05:56 AM
I am using both speaker_labels true and smart_formatting true. Any responses, please? I badly need speaker labeling as we are transcribing call center conversations. Smart formatting is also needed because the user mentions a phone number and we need it to be properly formatted.