Customer Issue P2
Status Update
Comments
fa...@google.com <fa...@google.com> #2
I have informed our engineering team of this feature request. There is currently no ETA for its implementation.
A current workaround is to check the "vertices" of the "boundingPoly" [1] returned for each entry in "textAnnotations". If the rectangles calculated from those vertices are taller than they are wide (height > width), then your image is sideways.
[1]https://cloud.google.com/vision/reference/rest/v1/images/annotate#boundingpoly
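A rough sketch of that check in Python, for anyone who wants to try it. It assumes the response has already been parsed into dictionaries shaped like the REST JSON; the function name `looks_sideways` and the majority-vote rule are illustrative assumptions, not part of the Vision API. It also assumes the first annotation spans the whole detected text and should be skipped, and that a vertex may omit "x" or "y" when the value is 0.

```python
def looks_sideways(text_annotations):
    """Guess whether an image is rotated by 90 degrees.

    text_annotations: list of dicts, each with a "boundingPoly" holding
    a list of "vertices" (dicts with optional "x" and "y" keys).
    Returns True when more word boxes are taller than wide.
    """
    tall = 0
    wide = 0
    # Assumption: the first annotation covers the whole text, so skip it.
    for annotation in text_annotations[1:]:
        vertices = annotation["boundingPoly"]["vertices"]
        # Assumption: a missing coordinate means 0.
        xs = [v.get("x", 0) for v in vertices]
        ys = [v.get("y", 0) for v in vertices]
        width = max(xs) - min(xs)
        height = max(ys) - min(ys)
        if height > width:
            tall += 1
        else:
            wide += 1
    # Majority vote (an illustrative choice) so one narrow word does not decide.
    return tall > wide
```

Note that this only separates upright from sideways. It cannot tell a 90 degree rotation from a 270 degree one, or upright from upside down.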
ma...@googlemail.com <ma...@googlemail.com> #3
I also need this problem solved :)
Description
When using the Speech-to-Text API to transcribe UK telephony calls, short utterances are transcribed correctly with the language code “en-US”, but not with the “en-GB” language code.
When an "8,000 Hz G.711 A-law" audio file is converted to LINEAR16, transcription works correctly for “en-US”, but when it is converted via a linear lookup table it does not work correctly for the “en-GB” language code.
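For reference, a minimal sketch of what an A-law to LINEAR16 conversion through a lookup table might look like. It follows the standard G.711 A-law expansion and outputs little-endian 16-bit samples. The names `alaw_byte_to_linear`, `ALAW_TABLE` and `alaw_to_linear16` are hypothetical, and this is not necessarily the conversion used to produce the files in this report.

```python
import struct


def alaw_byte_to_linear(a_val: int) -> int:
    """Expand one G.711 A-law byte to a signed 16-bit PCM sample."""
    a_val ^= 0x55                  # undo the even-bit inversion
    t = (a_val & 0x0F) << 4        # 4-bit mantissa
    seg = (a_val & 0x70) >> 4      # 3-bit segment number
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    # After the XOR, a set sign bit means a positive sample.
    return t if a_val & 0x80 else -t


# 256-entry lookup table, one PCM value per possible A-law byte.
ALAW_TABLE = [alaw_byte_to_linear(b) for b in range(256)]


def alaw_to_linear16(data: bytes) -> bytes:
    """Convert raw A-law bytes to little-endian LINEAR16 bytes."""
    samples = [ALAW_TABLE[b] for b in data]
    return struct.pack("<%dh" % len(samples), *samples)
```

The input here is headerless A-law data; a WAV container would need its header handled separately.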
What you expected to happen:
The transcription results for "en-GB" should be the same as the results for "en-US".
Steps to reproduce:
1. Reproduced the scenario with the following commands:
gcloud ml speech recognize 'gs://BUCKET_NAME/audio_file' --language-code=en-US --sample-rate=8000 --encoding=linear16
gcloud ml speech recognize 'gs://BUCKET_NAME/audio_file' --language-code=en-GB --sample-rate=8000 --encoding=linear16
Other information (workarounds you have tried, documentation consulted, etc):