Assigned
Status Update
Comments
pu...@google.com <pu...@google.com> #2
I have informed our engineering team of this feature request. There is currently no ETA for its implementation.
A current workaround would be to check the "vertices" of the "boundingPoly" [1] returned for the "textAnnotations". If the calculated rectangle's height is greater than its width, then your image is sideways.
[1]https://cloud.google.com/vision/reference/rest/v1/images/annotate#boundingpoly
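A minimal sketch of that check, assuming the response has already been parsed from JSON into Python dicts. The function names are hypothetical, and two details are assumptions rather than part of the suggestion above: that the first "textAnnotations" entry covers the whole detected text (so it is skipped here), and that zero-valued "x" or "y" fields may be omitted from a vertex. Note that it only compares the extent of each text box, not the dimensions of the image itself.

```python
def is_taller_than_wide(vertices):
    """Return True if the rectangle spanned by the vertices is taller than wide."""
    # Assumption: a missing "x" or "y" key means the coordinate is 0.
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return (max(ys) - min(ys)) > (max(xs) - min(xs))


def looks_sideways(text_annotations):
    """Guess whether the image is sideways from the shape of its text boxes.

    text_annotations is the parsed "textAnnotations" list of one response.
    """
    # Assumption: entry 0 is the full text; the rest are individual words.
    words = text_annotations[1:] or text_annotations
    tall = sum(
        is_taller_than_wide(a["boundingPoly"]["vertices"]) for a in words
    )
    # Majority vote, so one short or vertical word does not decide the result.
    return tall > len(words) / 2
```

Voting across the word boxes is a design choice made for this sketch; a single annotation could be checked with `is_taller_than_wide` alone.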
dy...@gmail.com <dy...@gmail.com> #3
I also need this problem solved :)
ka...@babelforce.com <ka...@babelforce.com> #4
same :D
pu...@google.com <pu...@google.com> #5
+1
dy...@gmail.com <dy...@gmail.com> #6
+1
dy...@gmail.com <dy...@gmail.com> #7
This needs more attention. It's not just a display issue as described in the report. The co-ordinates returned in 'boundingPoly' are incorrect if the image was taken on a phone. All the x points should be y and vice versa.
The workaround does not make sense as "boundingPoly" [1] "vertices" for "textAnnotations" does not indicate the image dimensions - it indicates the dimensions of the relevant text block inside the image.
pu...@google.com <pu...@google.com> #8
+1
ju...@govpros.us <ju...@govpros.us> #9
Would be great if this could be implemented.
Description
Audio such as "one one one one one one one one" returns too many repeated characters in the transcript, for example: "1111111111111". I would expect only 8 x "1".
Please note that this is not only a problem when using the Speech-to-Text API with, for example, the Golang SDK. I see this issue on ALL Google products, Android voice input, or
What you expected to happen:
Correct recognition.
Steps to reproduce:
- go to
- say: "one one one one one one one one"
- see the final result transcript come out as "11111111111...." (more "1"s than the 8 expected)
Locally I am using model "latest_short"
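For anyone reproducing this, the comparison in the last step can be sketched as below. It only inspects a transcript string that has already been returned and does not call any API; the function name is hypothetical.

```python
def extra_repeats(transcript, expected, token="1"):
    """Return how many more times token appears in transcript than expected."""
    return transcript.count(token) - expected


# Eight spoken "one"s but thirteen "1"s in the transcript: 5 extra.
surplus = extra_repeats("1" * 13, expected=8)
```

A result of 0 means the transcript matches what was said, and a positive result is the number of extra repetitions.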
Other information (workarounds you have tried, documentation consulted, etc):
I tried other models which kind of have similar problems.