Inconsistent text in the full ocr text annotation compared to individual text annotations [235127617]

Assigned

Bug

Status Update

No update yet.

Description

mj...@gmail.com

created issue #1

Jun 7, 2022 01:39PM

(I'm reposting this on the public tracker since we had no reaction to the post on the confidential one)

Problem you have encountered: Ever since the OCR model was updated, we noticed a strange regression which caused us some problems with our application's text processing.

Consider the attached image (it's a sample specimen of the Dutch passport). When running an image annotation request with the TEXT_DETECTION feature, using the old model (builtin/legacy), we get the appropriate response (old_response.json). The first entity annotation of the response contains the full text of the image. You can find a fragment in the full text which says: \nP NLD Nederlandse-\n. When looking through the individual text annotations I can find the matching annotation, which contains the description "Nederlandse-" (notice the dash character):

	{
		"Mid": "",
		"Locale": "",
		"Description": "Nederlandse-",
		"Score": 0,
		"Confidence": 0,
		"Topicality": 0,
		"BoundingPoly": {
			"Vertices": [
				{
					"X": 248,
					"Y": 110
				},
				{
					"X": 315,
					"Y": 111
				},
				{
					"X": 315,
					"Y": 120
				},
				{
					"X": 248,
					"Y": 119
				}
			],
			"NormalizedVertices": []
		},
		"Locations": [],
		"Properties": []
	},

After the model update, the same request using the latest model (new_response.json) returns a response where the individual annotations do not add up to the full text annotation. In the full text annotation I can still see the \nP NLD Nederlandse-\n fragment (with the dash character), but in the individual text annotations the matching annotation is missing the dash character. The dash is not found in any other annotation either, so it's completely gone:

	{
		"Mid": "",
		"Locale": "",
		"Description": "Nederlandse",
		"Score": 0,
		"Confidence": 0,
		"Topicality": 0,
		"BoundingPoly": {
			"Vertices": [
				{
					"X": 247,
					"Y": 110
				},
				{
					"X": 312,
					"Y": 111
				},
				{
					"X": 312,
					"Y": 121
				},
				{
					"X": 247,
					"Y": 120
				}
			],
			"NormalizedVertices": []
		},
		"Locations": [],
		"Properties": []
	},

What you expected to happen:

I expect that in image annotation responses for the TEXT_DETECTION or DOCUMENT_TEXT_DETECTION features, the first annotation element contains the full text of the image, and the following annotation elements completely match the full text when added up.
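To make the expectation concrete, here is a minimal sketch of the check we have in mind. It assumes the annotations are available as a list of dicts with a "Description" key (as in the fragments above), that the first element holds the full text, and that the remaining elements come in reading order. The function name and the sample lists are made up for illustration; the "P" and "NLD" entries are abbreviated stand-ins, not copied from the attached responses.

```python
def annotations_add_up(annotations):
    """Return True if the individual descriptions account for the full text.

    Whitespace is ignored on both sides, because separators such as
    spaces and newlines only appear in the full text annotation.
    """
    if not annotations:
        return True
    # First element: full text of the image.
    full_text = "".join(annotations[0]["Description"].split())
    # Remaining elements: individual tokens, assumed to be in reading order.
    pieces = "".join(
        "".join(a["Description"].split()) for a in annotations[1:]
    )
    return full_text == pieces


# Hypothetical, abbreviated sample data modelled on the fragments above.
old_style = [
    {"Description": "P NLD Nederlandse-\n"},
    {"Description": "P"},
    {"Description": "NLD"},
    {"Description": "Nederlandse-"},
]
new_style = [
    {"Description": "P NLD Nederlandse-\n"},
    {"Description": "P"},
    {"Description": "NLD"},
    {"Description": "Nederlandse"},
]

print(annotations_add_up(old_style))  # True
print(annotations_add_up(new_style))  # False
```

With the old model's output the check passes; with the new model's output it fails because of the missing dash.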

Steps to reproduce:

Run the provided image.png file through Google Cloud Vision with the TEXT_DETECTION feature and check the response for the inconsistency.

Alternatively, upload the image to the GCV test application at https://cloud.google.com/vision/docs/drag-and-drop, where the same inconsistency is visible.

I'd like to add that this issue is currently happening to quite a lot of files (a dash before a line break is present in the full text, but stripped in the individual element annotations). This is the only non-sensitive file I have found so far.
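For triaging a larger batch of responses, a rough sketch along the following lines could report which characters are lost. It only compares character counts and ignores order and whitespace, so it is a coarse filter rather than an exact diff. The function name and sample values are hypothetical.

```python
from collections import Counter


def dropped_characters(full_text, descriptions):
    """Count non-whitespace characters present in the full text but not
    covered by the individual descriptions (order is ignored)."""
    in_full = Counter("".join(full_text.split()))
    in_words = Counter("".join("".join(d.split()) for d in descriptions))
    # Counter subtraction keeps only characters with a positive remainder.
    return dict(in_full - in_words)


# Hypothetical sample modelled on the fragment from the new response.
sample_full = "P NLD Nederlandse-\n"
sample_words = ["P", "NLD", "Nederlandse"]

print(dropped_characters(sample_full, sample_words))  # {'-': 1}
```

An empty result means every character of the full text is covered by some individual annotation.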

new_response.json (18 KB)

old_response.json (19 KB)

image.png (98 KB)

Comments

ds...@google.com <ds...@google.com> Nov 24, 2023 07:41AM

Assigned to ds...@google.com.

ds...@google.com <ds...@google.com> #2 Nov 27, 2023 04:19AM

Reassigned to gc...@google.com.

I have forwarded this request to the engineering team. We will update this issue with any progress updates and a resolution.

Best Regards,
Josh Moyer
Google Cloud Platform Support
