Comments
si...@google.com <si...@google.com> #2
Hello,
To assist us in conducting a thorough investigation, we kindly request your cooperation in providing the following information regarding the reported issue:
- Has this scenario ever worked as expected in the past?
- Do you see this issue constantly or intermittently?
- If this issue is seen intermittently, how often do you observe it? Is there any specific scenario or time at which it is observed?
- To help us understand the issue better, please provide detailed steps to reliably reproduce the problem.
- It would be greatly helpful if you could attach screenshots of the output related to this issue.
Your cooperation in providing these details will enable us to dive deeper into the matter and work towards a prompt resolution. We appreciate your assistance and look forward to resolving this issue for you.
Thank you for your understanding and cooperation.
lj...@gmail.com <lj...@gmail.com> #3 Restricted
lj...@gmail.com <lj...@gmail.com> #4
lj...@gmail.com <lj...@gmail.com> #5
si...@google.com <si...@google.com> #6
Hello,
The Engineering team parsed the second screenshot you shared with the Document AI OCR parser, and it captured the vertical text.
However, in the other two samples the vertical text was not captured.
Could you please share the following details:
- Name of the user
- More sample documents that cause this issue
lj...@gmail.com <lj...@gmail.com> #7 Restricted
si...@google.com <si...@google.com> #8
Hello,
Could you please provide the name of the user who is facing this issue?
lj...@gmail.com <lj...@gmail.com> #9
lj...@gmail.com <lj...@gmail.com> #10
It's weott blim.
I have also submitted feedback in the Document OCR interface, which includes screenshots. You can take a look.
lj...@gmail.com <lj...@gmail.com> #11
However, if you use the images where I have already marked the errors, there might not be any issues.
For example, compare the original image above with the image on which the errors are marked.
lj...@gmail.com <lj...@gmail.com> #12
lj...@gmail.com <lj...@gmail.com> #13
Did you receive my latest error sample?
si...@google.com <si...@google.com> #14
Hello,
This update has been provided to the Engineering team to further investigate the issue. Future updates regarding this issue will be provided here.
lj...@gmail.com <lj...@gmail.com> #15
lj...@gmail.com <lj...@gmail.com> #16
Can you access them?
si...@google.com <si...@google.com> #17
Yes, all the images are accessible.
lj...@gmail.com <lj...@gmail.com> #18
Please ask the engineering team to use my original images to reproduce the errors. I have performed OCR on the same image multiple times, and these errors always occur. Then, please improve the OCR algorithm.
lj...@gmail.com <lj...@gmail.com> #19
Here is a new sample.
lj...@gmail.com <lj...@gmail.com> #20
Is Google Drive's OCR also part of Document AI?
The issues I have reported above also exist in Google Drive's OCR. Google Drive's OCR is also very useful, and I hope its OCR algorithm can be optimized as well.
lj...@gmail.com <lj...@gmail.com> #21
lj...@gmail.com <lj...@gmail.com> #22 Restricted
lj...@gmail.com <lj...@gmail.com> #23
lj...@gmail.com <lj...@gmail.com> #24
Problem you have encountered:
Most of the errors are from Russian.
1. As shown in the images, an OCR recognition error has been found; the symbol "-" appears in the wrong position.
2. Sometimes whole words also appear in the wrong position, similar to the situation with the symbol "-". This occurs from time to time even when the text in the OCR sample image is completely vertical, parallel, and neat, as shown in error sample 20230411172209.
3. Some Russian characters exhibit obvious recognition errors, as shown in the error sample 20230411170324.
4. Uppercase and lowercase letters are confused with each other, even when the original image is clear.
5. Some uppercase initials cannot be correctly recognized by OCR.
Please note while examining the error sample that the symbol "-" not only appears in the wrong position but also frequently goes missing.
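One way to check the "-" errors mechanically is to compare where the symbol falls relative to the surrounding letters in a hand-typed reference and in the OCR output. The following is a minimal sketch, not part of the original report: the function names are hypothetical, the sample strings are made up for illustration, and it assumes the letters themselves were recognized without insertions or deletions (otherwise the offsets shift).

```python
def symbol_offsets(text, symbol="-"):
    """Return, for each occurrence of `symbol`, how many letters/digits precede it."""
    offsets = []
    seen = 0
    for ch in text:
        if ch == symbol:
            offsets.append(seen)
        elif ch.isalnum():
            seen += 1  # whitespace and other punctuation are ignored
    return offsets


def compare_symbol_positions(expected, ocr_text, symbol="-"):
    """Compare symbol positions between a reference text and OCR output.

    Assumption: both texts contain the same letters in the same order.
    """
    want = set(symbol_offsets(expected, symbol))
    got = set(symbol_offsets(ocr_text, symbol))
    return {
        "missing": sorted(want - got),      # expected, but absent at that offset
        "unexpected": sorted(got - want),   # present where none was expected
    }


if __name__ == "__main__":
    # Illustrative strings only, not taken from the attached samples.
    print(compare_symbol_positions("кто-то там", "кто то-там"))
    print(compare_symbol_positions("из-за", "изза"))
```

A "-" that moved shows up as one entry under `missing` and one under `unexpected`; a "-" that was dropped shows up under `missing` only.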
What you expected to happen:
I found that Google Drive's OCR, Document OCR, and Cloud Vision OCR all have this problem. Please improve those three OCRs to fix this problem.
Steps to reproduce:
All error sample images are included in the zip file.
Each sample comes as two files, an original image and an error sample, and the error locations are labeled in every error sample.
Please ask the engineering team to use my original images to reproduce the errors. I have performed OCR on the same image multiple times, and these errors always occur.
Processor version ID: pretrained-ocr-v1.2-2022-11-10
lj...@gmail.com <lj...@gmail.com> #25
It has been four days and I have not received a response or assignment for my bug report.
My new bug report: Russian OCR recognition problems report 2023-5 [280659979] - Visible to Public - Issue Tracker
lj...@gmail.com <lj...@gmail.com> #26
I want to know: did <si...@google.com> and <gc...@google.com> receive my Japanese OCR error update?
I just need an answer.
lj...@gmail.com <lj...@gmail.com> #27
I want to know: did <si...@google.com> and <gc...@google.com> receive my Japanese OCR error update?
I just need an answer.
Description
Problem you have encountered:
When recognizing Japanese vertical text, the OCR often omits the first character after a line break. I found that Google Drive's OCR, Document OCR, and Cloud Vision OCR all have this issue.
See the image in my case report.
What you expected to happen:
Please improve the aforementioned three OCRs to address this problem.
I'm not sure if this is related to specific characters, but there are many other cases that I haven't taken screenshots of. I hope you can check more thoroughly.
Steps to reproduce:
Run OCR on my image and the problem will appear.
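To confirm the symptom on a given image without reading the output by eye, the OCR text can be checked against a hand-typed transcription of each vertical line. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the sample text is illustrative rather than taken from the attached image, and the OCR output is assumed to be available as a plain string.

```python
def lines_missing_first_char(reference_lines, ocr_text):
    """Return indexes of reference lines whose first character the OCR dropped.

    A line is flagged when the full line is absent from the OCR text
    but the same line without its first character is present.
    """
    # Remove all whitespace so OCR line breaks do not affect matching.
    flat = "".join(ocr_text.split())
    flagged = []
    for index, line in enumerate(reference_lines):
        line = line.strip()
        if len(line) < 2:
            continue  # too short to tell a dropped character from noise
        if line not in flat and line[1:] in flat:
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    # Illustrative transcription of two vertical lines (hypothetical sample).
    reference = ["吾輩は猫で", "ある。名前は"]
    ocr_output = "吾輩は猫で\nる。名前は"  # the first character of line 2 is missing
    print(lines_missing_first_char(reference, ocr_output))
```

Running the same check on the output of each of the three OCR services would show whether they drop the same characters.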
Other information (workarounds you have tried, documentation consulted, etc):
OCR version: pretrained-ocr-v1.2-2022-11-10