WAI
Status Update
Comments
ar...@google.com <ar...@google.com> #2
Thanks for your issue report. Can you confirm whether this consistently takes the same amount of time? Could you also tell us which alternative APIs you used as a benchmark?
If API operations are taking a long time, one potential workaround is to use the speech.asyncrecognize method to process jobs in the background [1].
Note that the Speech API currently makes no claims about performance. As a Beta release, it is not subject to any SLA and is not intended for real-time usage in critical applications (see the footnote at [2]).
[1] https://cloud.google.com/speech/reference/rest/v1beta1/speech/asyncrecognize
[2] https://cloud.google.com/speech/
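As a rough illustration of the background-job workaround, here is a minimal Python sketch. It assumes the v1beta1 REST shape as best recalled from the reference at [1]: a JSON body with `config` (`encoding`, `sampleRate`, `languageCode`) and `audio.content` (base64), and a returned operation that carries a `done` flag. The endpoint constant, the default values, and the `fetch_operation` callable are assumptions made for illustration and should be checked against the reference before use.

```python
import base64
import time

# Assumed endpoint for the v1beta1 method documented at [1]; verify before use.
ASYNC_URL = "https://speech.googleapis.com/v1beta1/speech:asyncrecognize"


def build_request(audio_bytes, encoding="LINEAR16", sample_rate=16000,
                  language="en-US"):
    """Build the JSON body for an asyncrecognize call (field names assumed)."""
    return {
        "config": {
            "encoding": encoding,
            "sampleRate": sample_rate,
            "languageCode": language,
        },
        # Inline audio is sent base64-encoded.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def wait_for_operation(fetch_operation, name, interval=1.0, timeout=60.0,
                       sleep=time.sleep):
    """Poll a long-running operation until it reports done or times out.

    fetch_operation is a hypothetical callable supplied by the caller; it
    takes the operation name and returns the operation as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        operation = fetch_operation(name)
        if operation.get("done"):
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation %s did not finish in time" % name)
        sleep(interval)
```

The caller would POST the body from `build_request` to the endpoint with its own HTTP client and credentials, then pass the returned operation name to `wait_for_operation`. Keeping the HTTP call behind a callable is only a design choice for the sketch, so that the polling logic can be exercised without network access.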
we...@gmail.com <we...@gmail.com> #3
Hi. Transcribing audio files of around 20 seconds in length (only English tested) consistently takes between 15 and 20 seconds, using the curl command listed in the original post.
The benchmarks were:
1. MS Bing Speech API [1], which takes roughly 6-7 seconds to transcribe audio of similar length
2. IBM Watson Speech to Text [2], which also gets the job done in less than 10 seconds
The speech.asyncrecognize method is relatively OK for real-time operations, and I am gladly using it now. However, when we have to transcribe audio files in bulk, the speech.asyncrecognize method takes even longer to complete than the syncrecognize method.
Surprisingly, one of my friends is a Chrome developer, and the Chrome Speech API (which should be powered by the same engine) is amazingly fast. I passed my audio files to him for testing, and the transcription finished in well under 4 seconds. I wonder what is going on in the backend.
[1] https://www.microsoft.com/cognitive-services/en-us/speech-api
[2] https://www.ibm.com/watson/developercloud/speech-to-text.html
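To make comparisons like the ones above repeatable, the timings can be collected over several runs and normalized by audio length. The following is a minimal Python sketch; `transcribe` is a hypothetical zero-argument callable that wraps whichever API request is being measured, and none of the names below come from any of the services mentioned.

```python
import statistics
import time


def time_calls(transcribe, runs=5, clock=time.perf_counter):
    """Call transcribe() several times and return each wall-clock duration."""
    durations = []
    for _ in range(runs):
        start = clock()
        transcribe()
        durations.append(clock() - start)
    return durations


def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio length; below 1.0 is faster than real time."""
    return processing_seconds / audio_seconds


def summarize(durations, audio_seconds):
    """Report the median duration and the matching real-time factor."""
    median = statistics.median(durations)
    return {"median_s": median, "rtf": real_time_factor(median, audio_seconds)}
```

For the figures in the original description, an 18-second file that takes 16 seconds gives a real-time factor of about 0.89, while a 3-second turnaround on the same file gives about 0.17.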
ar...@google.com <ar...@google.com> #4
The Chrome Web Speech API is actually a different implementation, which has been around since Chrome 25, released on February 21, 2013 [1]. The Web Speech API is a web standard which is supported across multiple browsers [2] and uses traditional speech processing algorithms.
The Cloud Speech API is a new technology which uses Google's advanced deep learning neural network algorithms. This is similar to how the Bing Speech API and Watson Speech to Text API work. As the Speech API is still a beta product, we don't make any claims that it will be more performant than established solutions. However, speed will improve over time as our algorithms become more optimized and more resources are committed to the service.
I will close this issue, as performance benchmarks aren't currently considered a defect with regard to expected behavior. Rest assured, however, that this is something we're constantly working on improving. For now, if accuracy and language support are less important and speed is more of a concern, the Chrome Web Speech API is a good alternative.
For more general discussion on performance, I'd recommend posting a topic to the 'cloud-speech-discuss' forum [3].
[1] googlechromereleases.blogspot.com/2013/02/stable-channel-update_21.html
[2] https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API
[3] https://groups.google.com/forum/#!topic/cloud-speech-discuss/
Description
I tested an audio file of about 18s in length using the HTTP interface. The speech engine takes almost 16s to return a result. Am I using the wrong parameters, or does the Speech API need some additional work after the beta? By the way, other speech recognition APIs take about 3s for this file.
I am testing the REST interface directly using curl, with synchronous speech recognition (v1beta1). My OS is Mac OS X El Capitan.
time curl "