Assigned
Status Update
Comments
gs...@google.com <gs...@google.com> #2
Work is in progress to provide feedback on the type of text, whether typed or handwritten. However, there is no precise release date for this feature yet.
Confidence levels are provided per page. No marked differences are expected for blocks of text, which sometimes coincide with pages. Why do you need confidence levels for blocks of text? A feature request needs the support of a well-motivated use case to make its point convincingly. A detailed description, preferably in step-by-step format, would be appreciated.
jo...@gmail.com <jo...@gmail.com> #3
Why is a confidence score for text blocks more useful to consumers of the Cloud Vision API than confidence at the page level?
In use cases where the scanned image contains fields that are critical for downstream processes, it is important to know the confidence of each extracted field value. Examples of such fields in scanned images are 'Invoice Amount' or 'Payment date'. Not all text blocks on a scanned page have the same importance in a business process. In production applications, where there is no ground truth for the extracted fields, consumers typically have to use the confidence score to decide whether an extracted value is good enough to send directly to the next process or should go to an exception queue for manual review. To implement this, consumers need a confidence score for each extracted value; a confidence score at page level is not sufficient.
How is the confidence score for text blocks used?
During the development phase, test scanned documents with ground truth are used to determine a heuristic threshold for each extracted value's confidence score, based on observations across the test documents. This threshold is then used to decide whether the extracted value for a field is likely to match the correct value in the document.
In summary, we need confidence for text blocks because text blocks, not pages, are the granularity of consumption in applications that use the API.
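As an illustration, here is a minimal sketch of that routing logic. The threshold values, field names, and the per-block confidence input are assumptions, since per-block confidence is exactly what this request asks for.

```python
# Minimal sketch: route extracted field values based on a per-block confidence
# score. The per-block confidence and the field-to-block mapping are assumed,
# not something the API provides today.

EXCEPTION_QUEUE = []   # fields that need manual review
NEXT_PROCESS = []      # fields safe to pass downstream

# Heuristic thresholds per field, derived from test documents with ground truth.
THRESHOLDS = {"invoice_amount": 0.95, "payment_date": 0.90}

def route_field(field_name, value, block_confidence):
    """Send a field downstream or to the exception queue based on confidence."""
    if block_confidence >= THRESHOLDS.get(field_name, 0.90):
        NEXT_PROCESS.append((field_name, value))
    else:
        EXCEPTION_QUEUE.append((field_name, value, block_confidence))

# A block recognized as the invoice amount with high confidence goes downstream.
route_field("invoice_amount", "1,250.00", 0.97)
# A low-confidence payment date goes to manual review instead.
route_field("payment_date", "2020-05-14", 0.62)
```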
Hope this helps. I will be happy to provide any more details.
Regards
John
gs...@google.com <gs...@google.com> #4
Engineering has been made aware of your feature request, and will address it in due course. No estimated time to implementation has been set. Meanwhile, you may follow developments in this thread.
mi...@gmail.com <mi...@gmail.com> #7
Any updates?
dc...@gmail.com <dc...@gmail.com> #8
I need this feature too. Is it on the roadmap now?
kr...@softagent.se <kr...@softagent.se> #9
I'm in need of this too. My use case is library cards that are mostly best handled with TEXT_DETECTION, which gives stellar OCR quality for typewriter-written cards. I am mainly interested in the full text only. But some of the cards are handwritten, and thus need DOCUMENT_TEXT_DETECTION. The problem is that the setting has to be chosen beforehand for thousands of card images, and I have no clue how to make that choice dynamic for each card. What I *really* need is an AUTO_TEXT_DETECTION that auto-selects the best approach based on Vision's internal detection. But that is for another feature request...
At least if I can detect from the response that the contents are handwritten, as per this feature request, then I can request another OCR pass with DOCUMENT_TEXT_DETECTION. That means doing the work twice, but it would give the best possible results.
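A minimal sketch of that two-pass idea, where the handwriting indication this request asks for is treated as a hypothetical `is_handwritten` attribute on blocks (the client calls themselves exist today):

```python
# Two-pass workaround: first pass with TEXT_DETECTION, second pass with
# DOCUMENT_TEXT_DETECTION if the response indicates handwriting.
# The `is_handwritten` attribute is hypothetical; it is what this issue requests.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

def ocr_card(content: bytes) -> str:
    image = vision.Image(content=content)
    response = client.text_detection(image=image)

    handwritten = any(
        getattr(block, "is_handwritten", False)  # hypothetical attribute
        for page in response.full_text_annotation.pages
        for block in page.blocks
    )
    if handwritten:
        # Second pass, better suited for handwriting.
        response = client.document_text_detection(image=image)
    return response.full_text_annotation.text
```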
mb...@gmail.com <mb...@gmail.com> #10
I need this too. The use case is signed insurance cards, where I am NOT interested in the OCR of the signature but only in the printed text.
Is this still on the roadmap?
fl...@yokoy.ai <fl...@yokoy.ai> #11
Would love to see this too. Use case is feature engineering for a classifier that predicts the country of origin of a document (receipt).
sa...@gmail.com <sa...@gmail.com> #12
Any updates? I really need to see which parts of the text are handwritten and which parts are typed/printed.
as...@gmail.com <as...@gmail.com> #13
Looking for this feature. It would be extremely useful for a use case we are building for students.
na...@gmail.com <na...@gmail.com> #14
Having this feature would be of great help. When can we expect it to be resolved?
jl...@lexmachina.com <jl...@lexmachina.com> #15
I also need this feature in Google Vision; its absence is pushing us towards paying for Amazon Textract instead, which does distinguish between handwriting and printed text.
ri...@gmail.com <ri...@gmail.com> #16
Dear Google team, any update?
ar...@deloitte.com <ar...@deloitte.com> #17
Hi Google Team,
We are also in need of this feature. It is impacting a critical business requirement.
Do we have any ETA for this feature?
u....@gmail.com <u....@gmail.com> #18
I need this feature. Is there a timeline for development?
pe...@gmail.com <pe...@gmail.com> #19
In dire need of this feature. When can we expect this?
el...@kadow.club <el...@kadow.club> #20
Hello,
Any news regarding this feature?
sa...@gmail.com <sa...@gmail.com> #21
Hi @google
Can we get an update here?
ca...@notablehealth.com <ca...@notablehealth.com> #22
+1 any update here?
an...@gmail.com <an...@gmail.com> #23
Any update here? I need this functionality to read only printed text.
Description
Problem you have encountered:
The results from the DOCUMENT_TEXT_DETECTION API do not provide information about the nature of the text detected and recognized. As a user, I cannot distinguish which parts of an image are typed text and which are handwritten. I also have no information about the confidence of the text detection and recognition from the GCP Vision API.
What you expected to happen:
The results from the DOCUMENT_TEXT_DETECTION API should include additional information about the nature of the text detected and recognized. Each bounding box should carry an attribute indicating whether the text inside it is typed or handwritten.
The API should also provide a confidence score for each text bounding box.
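Purely as an illustration of the request, a block in the response could carry the two attributes like this. The field names and values below are assumptions, not an existing API.

```python
# Hypothetical shape of a block inside fullTextAnnotation with the two
# requested attributes; field names are illustrative only.
block = {
    "boundingBox": {"vertices": [{"x": 10, "y": 20}, {"x": 210, "y": 20},
                                 {"x": 210, "y": 60}, {"x": 10, "y": 60}]},
    "blockType": "TEXT",
    "textType": "HANDWRITTEN",   # requested: TYPED or HANDWRITTEN
    "confidence": 0.87,          # requested: per-block recognition confidence
}
```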
Steps to reproduce:
Call the API with a scanned image for a filled-in form that has both typed text and handwritten text.
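For example, a sketch of that call with the existing Python client (the file name is illustrative); the attributes requested in this issue are not available on the returned blocks:

```python
# Reproduction sketch: run DOCUMENT_TEXT_DETECTION on a scanned, filled-in form
# that contains both typed and handwritten text.
from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open("filled_form.jpg", "rb") as f:   # illustrative file name
    image = vision.Image(content=f.read())

response = client.document_text_detection(image=image)
for page in response.full_text_annotation.pages:
    for block in page.blocks:
        # The attributes requested in this issue (typed vs. handwritten,
        # per-block confidence) are not exposed here.
        print(block.bounding_box)
```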
Other information (workarounds you have tried, documentation consulted, etc):