Assigned
Status Update
Comments
nr...@google.com <nr...@google.com> #2
I have forwarded this request to the engineering team. We will update this issue with any progress updates and a resolution.
Best Regards,
Josh Moyer
Google Cloud Platform Support
Description
Problem you have encountered:
TEXT_DETECTION does not correctly detect text in images that contain both English and Japanese. The second language (Japanese) is ignored. An example image, "ocr-test-image.jpg", is attached.
What you expected to happen:
The multi-language text is correctly detected by TEXT_DETECTION.
Steps to reproduce:
Submit the example image to the TEXT_DETECTION API.
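For anyone trying to reproduce this without the client library, here is a minimal sketch of the JSON body for the REST `images:annotate` method. The helper name `build_annotate_request` and the placeholder image bytes are my own illustration and not part of the original report; sending the request (endpoint URL, authentication) is left out on purpose.

```python
import base64
import json


def build_annotate_request(image_bytes, feature_type="TEXT_DETECTION", language_hints=None):
    """Build a JSON-serializable body for the Vision API images:annotate method.

    Hypothetical helper for illustration only. `feature_type` can be
    "TEXT_DETECTION" or "DOCUMENT_TEXT_DETECTION"; `language_hints` is an
    optional list of language codes such as ["en", "ja"].
    """
    request = {
        # The REST API expects the image content as a base64-encoded string.
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [{"type": feature_type}],
    }
    if language_hints:
        request["imageContext"] = {"languageHints": list(language_hints)}
    return {"requests": [request]}


# Placeholder bytes stand in for the contents of "ocr-test-image.jpg".
payload = build_annotate_request(b"placeholder image bytes", language_hints=["en", "ja"])
print(json.dumps(payload, indent=2))
```

Swapping `feature_type` or dropping `language_hints` gives the other request variants mentioned in this report.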
Other information (workarounds you have tried, documentation consulted, etc):
- The python cloud vision library (
- Providing language hints (en, ja) does not improve the output.
- Using DOCUMENT_TEXT_DETECTION instead of TEXT_DETECTION does not improve the output either.
- If the English text is removed, the Japanese text is detected correctly, so this is not an issue with the quality or resolution of the Japanese text. For example, "ocr-test-image-no-en.jpg" is the example image with the English text removed, and its text is detected correctly.
- In my experience, the issue with multi-language detection started about a week ago (during the week of May 16 to May 22, 2022).
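One rough way to check whether a given response "ignored" the Japanese text is to look for Japanese characters in the detected text. The sketch below is an illustration only: the function name `contains_japanese` is hypothetical, and the Unicode ranges are a simplification (the kanji range is shared with Chinese, so this is a heuristic rather than real language identification).

```python
def contains_japanese(text):
    """Return True if `text` has any hiragana, katakana, or CJK ideograph.

    Heuristic only: CJK Unified Ideographs (U+4E00 to U+9FFF) also cover
    Chinese characters, so a True result does not prove the text is Japanese.
    """
    for ch in text:
        cp = ord(ch)
        if (
            0x3040 <= cp <= 0x309F      # Hiragana
            or 0x30A0 <= cp <= 0x30FF   # Katakana
            or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs (kanji)
        ):
            return True
    return False
```

Applying this to the detected text for "ocr-test-image.jpg" and for "ocr-test-image-no-en.jpg" would show whether the Japanese portion is missing only in the mixed-language case.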