Feature Request P2
Status Update
Comments
fa...@gmail.com <fa...@gmail.com> #2
Attach test files and results
od...@google.com <od...@google.com> #3
Hi there,
I was unable to reproduce your issue; it is possible that your images' resolution is not sufficient, as indicated in the Best Practices for the Vision API's TEXT_DETECTION feature [1]. Additionally, looking at the JSON results you provided, I was unable to identify the Chinese characters that were converted to vertical lines.
In order to help you further, can you tell me which exact characters detected in the two images are converted into vertical lines for you?
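In the meantime, you can double-check the pixel dimensions of your images before sending them. A minimal sketch in plain Java (the file name is just an example):

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class SizeCheck {
    public static void main(String[] args) throws Exception {
        BufferedImage img = ImageIO.read(new File("1.jpg"));
        // Compare these dimensions against the recommended sizes in [1].
        System.out.println(img.getWidth() + "x" + img.getHeight());
    }
}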
[1]https://cloud.google.com/vision/docs/best-practices#image_sizing
fa...@gmail.com <fa...@gmail.com> #4
Please see the attached 2-result.png.
In 2.json, you can see there is a boundingPoly whose description is "006".
That's what I called a vertical line.
The boundingPoly to its left, "997C", is also vertical.
{
  "boundingPoly": {
    "vertices": [
      { "x": 770, "y": 269 },
      { "x": 794, "y": 268 },
      { "x": 798, "y": 370 },
      { "x": 774, "y": 371 }
    ]
  },
  "description": "006"
},
…
{
  "boundingPoly": {
    "vertices": [
      { "x": 744, "y": 270 },
      { "x": 765, "y": 268 },
      { "x": 778, "y": 414 },
      { "x": 757, "y": 415 }
    ]
  },
  "description": "997C"
},
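You can see from the vertices alone that these boxes are vertical. A minimal sketch (the values are copied from the "006" entry above) that computes the box's width and height:

import java.util.Arrays;

public class BoxCheck {
    public static void main(String[] args) {
        // Vertices copied from the "006" entry in 2.json.
        int[] xs = { 770, 794, 798, 774 };
        int[] ys = { 269, 268, 370, 371 };
        int width  = Arrays.stream(xs).max().getAsInt() - Arrays.stream(xs).min().getAsInt(); // 28 px
        int height = Arrays.stream(ys).max().getAsInt() - Arrays.stream(ys).min().getAsInt(); // 103 px
        System.out.println("vertical? " + (height > width)); // prints "vertical? true"
    }
}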
fa...@gmail.com <fa...@gmail.com> #5
I found another image which has the same issue.
I tested two resolutions:
a. 2688x1512
b. 1600x900
The results show that 2688x1512 is OK, but 1600x900 has the issue.
I expect 1600x900 to be big enough to get good results.
od...@google.com <od...@google.com> #6
Thanks for the additional information. I am still investigating this issue and will update it with further information tomorrow, February 14th.
od...@google.com <od...@google.com> #7
Upon further investigation, I was able to retrieve more accurate results after scaling your first provided image (card_297356213.jpg) to a higher resolution (2000x1125 pixels). Note that the two images you provided were taken from slightly different perspectives, so the observed inaccuracy of the TEXT_DETECTION feature on the first one (card_297356213.jpg) could have been caused by it being slightly skewed. Lastly, when you encounter difficulties with the Vision API's character detection, try increasing the resolution of your image first.
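For reference, here is a minimal sketch of that kind of upscaling with plain java.awt; the 2000-pixel target width matches the resolution I tried, but any comparable value should work:

import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class Upscale {
    public static void main(String[] args) throws Exception {
        BufferedImage src = ImageIO.read(new File("card_297356213.jpg"));
        int targetWidth = 2000;  // e.g. 2000x1125, as tried above
        int targetHeight = src.getHeight() * targetWidth / src.getWidth();
        BufferedImage dst = new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                           RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, targetWidth, targetHeight, null);
        g.dispose();
        ImageIO.write(dst, "jpg", new File("card_297356213_scaled.jpg"));
    }
}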
By all indications, the platform seems to work as intended; therefore, I will be marking this issue as invalid for categorization purposes.
fa...@gmail.com <fa...@gmail.com> #8
I'm surprised that the issue became Invalid.
It's obvious that the engine is not good enough.
Unless the API returns a response telling the user to provide a higher resolution, the user has no way to know they could get a better result after scaling up.
Please consider making the engine more robust. Another way is to add a new hint parameter to the API, so that the user can ask it to recognize horizontal or vertical lines only.
p.s. The last two images are scaled/cropped from the original image.
od...@google.com <od...@google.com> #9
I see how the ability to specify the text orientation with a parameter would be a nice feature for retrieving more accurately ordered text detection results. Still, I will need you to provide the following additional information if you would like to see this feature implemented:
1. Please describe the feature or enhancement you are requesting.
2. What business case or problem would this feature help you to solve?
3. What workarounds or alternatives have you considered? In what way were these unsuitable for your needs?
4. What version of the product are you using? On what operating system?
5. Please provide any additional information below.
ar...@google.com <ar...@google.com> #10
I'd just like to clarify why this report is considered invalid as a *defect*, but is still a perfectly valid feature request. A defect means the service doesn't work as intended, but the best practices documentation does say:
"Generally, the Vision API requires images to be a sufficient size so that important features within the request can be easily distinguished. Sizes smaller or larger than these recommended sizes may work. However, smaller sizes may result in lower accuracy." [1]
The ideal resolution depends on the size of the features being detected, and other factors matter as well, such as the contrast and sharpness of the image. Also, I'd like to point out that the original images where the issue is demonstrated are photos of cards that were printed at a much lower resolution than that of the photo itself; you can quite clearly see the pixels when the image is zoomed in.
I've gone ahead and filed a feature request for the text orientation hint as we already have enough information.
[1]https://cloud.google.com/vision/docs/best-practices
"Generally, the Vision API requires images to be a sufficient size so that important features within the request can be easily distinguished. Sizes smaller or larger than these recommended sizes may work. However, smaller sizes may result in lower accuracy." [1]
The actual ideal resolution depends on the size of the features being detected, and there are other factors as well such as contrast and sharpness of the image. Also, I'd like to point out that the original images where the issue is demonstrated are photos of cards that were printed out at a much lower resolution than that of the image. You can quite clearly see the pixels when the image is zoomed in.
I've gone ahead and filed a feature request for the text orientation hint as we already have enough information.
[1]
fa...@gmail.com <fa...@gmail.com> #11
Hi, thanks for your help.
I'm glad this has been accepted as a feature request.
I did another experiment.
I cropped card_297356213.jpg to make card_297356213_a.jpg and card_297356213_b.jpg (a crop sketch follows below).
The result responses are card_297356213_a.json and card_297356213_b.json.
* card_297356213_a
As you can see, almost all characters are correctly recognized (except one), but they are grouped into vertical lines.
* card_297356213_b
The three phone numbers are correctly returned as three lines.
That's why I think the problem is not caused by insufficient image resolution or image features.
The part to be improved is the way characters are combined into lines.
This issue probably only happens with some Chinese fonts.
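For reference, the crops mentioned above were made by simply cutting regions out of the original image. A minimal java.awt sketch; the region below is a placeholder, since I did not record the exact crop rectangles:

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class Crop {
    public static void main(String[] args) throws Exception {
        BufferedImage src = ImageIO.read(new File("card_297356213.jpg"));
        // Placeholder rectangle (x, y, width, height); not the actual region used.
        BufferedImage crop = src.getSubimage(0, 0, src.getWidth() / 2, src.getHeight() / 2);
        ImageIO.write(crop, "jpg", new File("card_297356213_a.jpg"));
    }
}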
p.s. I set the language hint:
// Hint the OCR toward Traditional Chinese (requires java.util.Arrays)
String[] languages = { "zh-TW" };
imageContext.setLanguageHints(Arrays.asList(languages));
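For context, the hint attaches to the annotate request roughly like this. This is only a sketch against the google-api-services-vision v1 client listed in the description below; "vision" is assumed to be an authenticated client built elsewhere:

import com.google.api.services.vision.v1.Vision;
import com.google.api.services.vision.v1.model.*;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class TextDetect {
    // "vision" must be an authenticated Vision client (auth setup omitted).
    static BatchAnnotateImagesResponse detect(Vision vision, String path) throws Exception {
        byte[] imageBytes = Files.readAllBytes(Paths.get(path));
        AnnotateImageRequest request = new AnnotateImageRequest()
            .setImage(new Image().encodeContent(imageBytes))  // base64-encodes the bytes
            .setFeatures(Arrays.asList(new Feature().setType("TEXT_DETECTION")))
            .setImageContext(new ImageContext().setLanguageHints(Arrays.asList("zh-TW")));
        return vision.images()
            .annotate(new BatchAnnotateImagesRequest().setRequests(Arrays.asList(request)))
            .execute();
    }
}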
Description
1. Do not set a language hint, or set it to "zh-TW".
2. Send a "TEXT_DETECTION" request with 1.jpg and 2.jpg.
3. Get the results 1.json and 2.json.
What is the expected output? What do you see instead?
The characters should be in the correct order, line by line.
But in the bad results, some of the text becomes a vertical line.
What version of the product are you using? On what operating system?
compile 'com.google.apis:google-api-services-vision:v1-rev30-1.22.0'
compile 'com.google.apis:google-api-services-vision:v1-rev41-1.22.0'
Please provide any additional information below.
Most card images are OK, but the two attached images give bad results.