Fixed
Status Update
Comments
jm...@google.com <jm...@google.com> #2
The Speech API engineering team is aware of this issue and is investigating solutions. There is no ETA for a fix at this time, but all further updates will be posted here.
jm...@google.com <jm...@google.com> #3
The Speech API engineering team has been able to successfully reproduce the issue in different scenarios, and has identified a possible fix which they have now started to implement. Note that there is still no ETA for a full resolution, but you can continue to expect updates here.
jm...@google.com <jm...@google.com> #4
The Speech API engineering team has identified a fix and is currently reviewing it to ensure that it is ready for production release. There is still no ETA for when the fix will roll out to production, but all further updates will continue here.
jm...@google.com <jm...@google.com> #5
The fix is currently planned for release in the next product update. The production rollout has a rough ETA of mid-October.
jm...@google.com <jm...@google.com> #6
The engineering team has confirmed that the fix has been approved for release, with an updated rough ETA of 2-3 weeks for the production rollout.
jm...@google.com <jm...@google.com> #7
The engineering team has confirmed that the fix should now be fully released in production.
Description
When streaming audio to the Speech API with use_enhanced set to true and the model set to "video" or "phone_call", if 60 seconds of audio is sent and no onFinal event occurs in that time, the stream blocks for around 45 seconds before it resumes returning results. Tested sample rates were 44100 Hz and 48000 Hz.
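One way to confirm the stall described above is to record the arrival time of each streaming response and look for long gaps. The following is a minimal sketch, not part of the original report: `find_stalls` and its `threshold_s` parameter are hypothetical names, and the timestamps are synthetic values chosen to mimic a 45-second block after 60 seconds, not measured data.

```python
from typing import Iterable, List, Tuple


def find_stalls(
    arrival_times: Iterable[float], threshold_s: float = 10.0
) -> List[Tuple[float, float]]:
    """Return (gap_start, gap_duration) for every gap between
    consecutive response arrival times that exceeds threshold_s."""
    stalls = []
    previous = None
    for t in arrival_times:
        if previous is not None and t - previous > threshold_s:
            stalls.append((previous, t - previous))
        previous = t
    return stalls


# Synthetic timestamps (seconds since stream start): a response every
# 0.5 s up to 60 s, then nothing until 105 s. Illustrative only.
example = [i * 0.5 for i in range(1, 121)] + [105.0, 105.5]
print(find_stalls(example))
```

In a real reproduction, the timestamps would be collected by calling something like `time.monotonic()` each time a response arrives from the stream.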
What you expected to happen:
The stream continues to transcribe after 60 seconds of audio.
Steps to reproduce:
Stream an audio file with no natural pauses in speech for 60 seconds, with enhanced models enabled and the model set to "video" or "phone_call".
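The reproduction step above could be prepared roughly as follows. This is a hedged sketch only: it assumes 16-bit mono PCM audio, and `pcm_chunks` and `REPRO_SETTINGS` are hypothetical names introduced for illustration. The settings dictionary simply restates the values from this report (plus the tested sample rate) and is not a client-library call.

```python
from typing import Iterator

# Settings from the report, restated as a plain dict (not a library object).
REPRO_SETTINGS = {
    "use_enhanced": True,
    "model": "video",  # or "phone_call"
    "sample_rate_hertz": 44100,  # 48000 was also tested
}


def pcm_chunks(
    audio: bytes,
    sample_rate_hz: int,
    chunk_ms: int = 100,
    bytes_per_sample: int = 2,  # assumption: 16-bit mono PCM
) -> Iterator[bytes]:
    """Split raw PCM audio into chunks of roughly chunk_ms each,
    suitable for sending one at a time over a streaming request."""
    samples_per_chunk = (sample_rate_hz * chunk_ms) // 1000
    chunk_size = samples_per_chunk * bytes_per_sample
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset:offset + chunk_size]


# One second of placeholder bytes, used only to check the chunk sizes.
one_second = bytes(REPRO_SETTINGS["sample_rate_hertz"] * 2)
chunks = list(pcm_chunks(one_second, REPRO_SETTINGS["sample_rate_hertz"]))
print(len(chunks), len(chunks[0]))
```

To trigger the issue, the chunks would need to come from at least 60 seconds of continuous speech and be sent at roughly real-time pace.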
Other information (workarounds you have tried, documentation consulted, etc):