Can we enable diarization and smart formatting in the Speech to Text service at the same time? We need both for our requirement: we have to distinguish between agent and customer, and we also need customer addresses and phone numbers in smart format.
Looking at the example at https://speech-to-text-demo.mybluemix.net/, only one or the other works, primarily because the smart-formatted output is not available in the timestamped words, and we need the timestamped words in order to diarize the text. In the example below, the timestamped words hold “twenty thousand dollars” whereas the transcript reads $20000. Is there any way we can achieve both speaker diarization and smart formatting? Please suggest possible options. Thanks!

            [ "twenty", 27.19, 27.51 ],
            [ "thousand", 27.51, 27.83 ],
            [ "dollars", 27.83, 28.35 ]
          ],
          "confidence": 0.935,
          "transcript": "thank you for calling this is Dave speaking how can I help you hi Dave I filled out an application last night and the last page it says to call and give more information I'd be more than happy to assist you with that I'll have to ask you some additional questions okay okay vehicle that you're looking to purchase are you purchasing from an individual or from a dealer I'm an individual okay all right %HESITATION any special occasion for the car purchase no I just want a new car okay and looks like you applied for $20000 "
        }
      ],
      "final": true
    }
  ],
  "result_index": 0
}
What you are seeing may just be a limitation of the sample application: the API documentation (https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/#recognize_sessionless_nonmp12) doesn't mention any exclusivity between the speaker_labels and smart_formatting options. So I guess you can try specifying both in the same request.
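As a rough sketch of what "specifying both" could look like, the snippet below only builds the query string for a sessionless recognize request with speaker_labels, smart_formatting, and timestamps all set to true. The base URL and the helper name build_recognize_url are assumptions for illustration; check the endpoint for your own service instance, and note that this does not show how the service combines the two options in its response.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration only; use the URL from your own service credentials.
BASE_URL = "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize"


def build_recognize_url(base_url=BASE_URL):
    """Return a recognize URL that requests diarization and smart formatting together."""
    params = {
        "speaker_labels": "true",    # speaker diarization
        "smart_formatting": "true",  # format numbers, phone numbers, amounts, etc.
        "timestamps": "true",        # per-word timestamps, needed to match words to speakers
    }
    return base_url + "?" + urlencode(params)


url = build_recognize_url()
print(url)
```

The audio itself would then be sent as the body of a POST request to this URL with your service credentials.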
I am using both speaker_labels=true and smart_formatting=true. Any responses, please? I badly need speaker labeling because we are transcribing call center calls. Smart formatting is also needed because the user mentions a phone number and we need it to be properly formatted.
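For the diarization side, here is a minimal sketch of how timestamped words can be grouped into speaker turns, assuming the response carries a speaker_labels list whose entries have "from", "to", and "speaker" fields that match the word timestamps exactly. The function name words_by_speaker, the speaker ids, and the leading "okay" entry are hypothetical; the other three words and their times come from the example above. This does not by itself put smart-formatted text into the turns, since the words used are whatever the timestamps array contains.

```python
def words_by_speaker(timestamps, speaker_labels):
    """Group [word, start, end] entries into consecutive (speaker, text) turns."""
    # Map each (start, end) pair to its speaker id; assumes exact time matches.
    speaker_at = {(label["from"], label["to"]): label["speaker"]
                  for label in speaker_labels}
    turns = []
    for word, start, end in timestamps:
        speaker = speaker_at.get((start, end))  # None if no label matches
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(word)           # same speaker: extend the current turn
        else:
            turns.append((speaker, [word]))     # speaker changed: start a new turn
    return [(speaker, " ".join(words)) for speaker, words in turns]


# Word timestamps: the last three come from the question; "okay" is hypothetical.
timestamps = [
    ["okay", 26.50, 26.90],
    ["twenty", 27.19, 27.51],
    ["thousand", 27.51, 27.83],
    ["dollars", 27.83, 28.35],
]
# Hypothetical speaker labels (0 = customer, 1 = agent), for illustration only.
speaker_labels = [
    {"from": 26.50, "to": 26.90, "speaker": 0},
    {"from": 27.19, "to": 27.51, "speaker": 1},
    {"from": 27.51, "to": 27.83, "speaker": 1},
    {"from": 27.83, "to": 28.35, "speaker": 1},
]

turns = words_by_speaker(timestamps, speaker_labels)
for speaker, text in turns:
    print(f"speaker {speaker}: {text}")
```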