Feature Request P2
Status Update
Comments
fa...@gmail.com <fa...@gmail.com> #2
Attach test files and results
od...@google.com <od...@google.com> #3
Hi there,
I was unable to reproduce your issue; it is possible that your images' resolution is not sufficient, as indicated in the Best Practices for the Vision API's TEXT_DETECTION feature [1]. Additionally, looking at the JSON results you provided, I was unable to identify the Chinese characters that were converted to vertical lines.
In order to help you further, can you tell me which exact characters detected in the two images are converted into vertical lines for you?
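In the meantime, you can double-check the pixel dimensions of your images before sending them. A minimal sketch in plain Java (the file name is just an example):

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class SizeCheck {
    public static void main(String[] args) throws Exception {
        BufferedImage img = ImageIO.read(new File("1.jpg"));
        // Compare these dimensions against the recommended sizes in [1].
        System.out.println(img.getWidth() + "x" + img.getHeight());
    }
}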
[1]https://cloud.google.com/vision/docs/best-practices#image_sizing
fa...@gmail.com <fa...@gmail.com> #4
Please see the attached 2-result.png.
In 2.json, you can see there is a boundingPoly whose description is "006".
That's what I called a vertical line.
The boundingPoly to its left, "997C", is also vertical.
{
  "boundingPoly": {
    "vertices": [
      { "x": 770, "y": 269 },
      { "x": 794, "y": 268 },
      { "x": 798, "y": 370 },
      { "x": 774, "y": 371 }
    ]
  },
  "description": "006"
},
…
{
  "boundingPoly": {
    "vertices": [
      { "x": 744, "y": 270 },
      { "x": 765, "y": 268 },
      { "x": 778, "y": 414 },
      { "x": 757, "y": 415 }
    ]
  },
  "description": "997C"
},
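You can see from the vertices alone that these boxes are vertical. A minimal sketch (the values are copied from the "006" entry above) that computes the box's width and height:

import java.util.Arrays;

public class BoxCheck {
    public static void main(String[] args) {
        // Vertices copied from the "006" entry in 2.json.
        int[] xs = { 770, 794, 798, 774 };
        int[] ys = { 269, 268, 370, 371 };
        int width  = Arrays.stream(xs).max().getAsInt() - Arrays.stream(xs).min().getAsInt(); // 28 px
        int height = Arrays.stream(ys).max().getAsInt() - Arrays.stream(ys).min().getAsInt(); // 103 px
        System.out.println("vertical? " + (height > width)); // prints "vertical? true"
    }
}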
fa...@gmail.com <fa...@gmail.com> #5
I found another image which has the same issue.
I tested two resolutions:
a. 2688x1512
b. 1600x900
The results show that 2688x1512 is OK, but 1600x900 has the issue.
I expect 1600x900 to be big enough to get good results.
od...@google.com <od...@google.com> #6
Thanks for the additional information. I am still investigating this issue and will update it with further information tomorrow, February 14th.
od...@google.com <od...@google.com> #7
Upon further investigation, I was able to retrieve more accurate results after scaling your first provided image (card_297356213.jpg) to a higher resolution (2000x1125 pixels). Note that the two images you provided were taken from slightly different perspectives, so the observed inaccuracy of the TEXT_DETECTION feature on the first one (card_297356213.jpg) could have been caused by it being slightly skewed. Lastly, when you encounter difficulties with the Vision API's character detection, try increasing the resolution of your image first.
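For reference, here is a minimal sketch of that kind of upscaling with plain java.awt; the 2000-pixel target width matches the resolution I tried, but any comparable value should work:

import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class Upscale {
    public static void main(String[] args) throws Exception {
        BufferedImage src = ImageIO.read(new File("card_297356213.jpg"));
        int targetWidth = 2000;  // e.g. 2000x1125, as tried above
        int targetHeight = src.getHeight() * targetWidth / src.getWidth();
        BufferedImage dst = new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                           RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, targetWidth, targetHeight, null);
        g.dispose();
        ImageIO.write(dst, "jpg", new File("card_297356213_scaled.jpg"));
    }
}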
By all indications, the platform seems to work as intended; therefore, I will be marking this issue as invalid for categorization purposes.
fa...@gmail.com <fa...@gmail.com> #8
I'm surprised that the issue became Invalid.
It's obvious that the engine is not good enough.
Unless the API returns a response telling the user to provide a higher resolution, the user has no way to know they could get a better result after scaling up.
Please consider making the engine more robust. Another way is to add a new hint parameter to the API, so that the user can ask it to recognize horizontal or vertical lines only.
p.s. The last two images are scaled/cropped from the original image.
od...@google.com <od...@google.com> #9
I see how the ability to specify the text orientation with a parameter would be a nice feature for retrieving more accurately ordered text detection results. Still, I will need you to provide the following additional information if you would like to see this feature implemented:
1. Please describe the feature or enhancement you are requesting.
2. What business case or problem would this feature help you to solve?
3. What workarounds or alternatives have you considered? In what way were these unsuitable for your needs?
4. What version of the product are you using? On what operating system?
5. Please provide any additional information below.
ar...@google.com <ar...@google.com> #10
I'd just like to clarify why this report is considered invalid as a *defect*, but is still a perfectly valid feature request. A defect means the service doesn't work as intended, but the best practices documentation does say:
"Generally, the Vision API requires images to be a sufficient size so that important features within the request can be easily distinguished. Sizes smaller or larger than these recommended sizes may work. However, smaller sizes may result in lower accuracy." [1]
The ideal resolution depends on the size of the features being detected, and other factors matter as well, such as the contrast and sharpness of the image. Also, I'd like to point out that the original images where the issue is demonstrated are photos of cards that were printed at a much lower resolution than that of the photo itself; you can quite clearly see the pixels when the image is zoomed in.
I've gone ahead and filed a feature request for the text orientation hint as we already have enough information.
[1]https://cloud.google.com/vision/docs/best-practices
"Generally, the Vision API requires images to be a sufficient size so that important features within the request can be easily distinguished. Sizes smaller or larger than these recommended sizes may work. However, smaller sizes may result in lower accuracy." [1]
The actual ideal resolution depends on the size of the features being detected, and there are other factors as well such as contrast and sharpness of the image. Also, I'd like to point out that the original images where the issue is demonstrated are photos of cards that were printed out at a much lower resolution than that of the image. You can quite clearly see the pixels when the image is zoomed in.
I've gone ahead and filed a feature request for the text orientation hint as we already have enough information.
[1]
fa...@gmail.com <fa...@gmail.com> #11
Hi, thanks for your help.
I'm glad this has been accepted as a feature request.
I did another experiment.
I cropped card_297356213.jpg to make card_297356213_a.jpg and card_297356213_b.jpg (a crop sketch follows below).
The result responses are card_297356213_a.json and card_297356213_b.json.
* card_297356213_a
As you can see, almost all characters are correctly recognized (except one), but they are grouped into vertical lines.
* card_297356213_b
The three phone numbers are correctly returned as three lines.
That's why I think the problem is not caused by insufficient image resolution or image features.
The part to be improved is the way characters are combined into lines.
This issue probably only happens with some Chinese fonts.
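For reference, the crops mentioned above were made by simply cutting regions out of the original image. A minimal java.awt sketch; the region below is a placeholder, since I did not record the exact crop rectangles:

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class Crop {
    public static void main(String[] args) throws Exception {
        BufferedImage src = ImageIO.read(new File("card_297356213.jpg"));
        // Placeholder rectangle (x, y, width, height); not the actual region used.
        BufferedImage crop = src.getSubimage(0, 0, src.getWidth() / 2, src.getHeight() / 2);
        ImageIO.write(crop, "jpg", new File("card_297356213_a.jpg"));
    }
}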
p.s. I set the language hint:
// Hint the OCR toward Traditional Chinese (requires java.util.Arrays)
String[] languages = { "zh-TW" };
imageContext.setLanguageHints(Arrays.asList(languages));
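For context, the hint attaches to the annotate request roughly like this. This is only a sketch against the google-api-services-vision v1 client listed in the description below; "vision" is assumed to be an authenticated client built elsewhere:

import com.google.api.services.vision.v1.Vision;
import com.google.api.services.vision.v1.model.*;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class TextDetect {
    // "vision" must be an authenticated Vision client (auth setup omitted).
    static BatchAnnotateImagesResponse detect(Vision vision, String path) throws Exception {
        byte[] imageBytes = Files.readAllBytes(Paths.get(path));
        AnnotateImageRequest request = new AnnotateImageRequest()
            .setImage(new Image().encodeContent(imageBytes))  // base64-encodes the bytes
            .setFeatures(Arrays.asList(new Feature().setType("TEXT_DETECTION")))
            .setImageContext(new ImageContext().setLanguageHints(Arrays.asList("zh-TW")));
        return vision.images()
            .annotate(new BatchAnnotateImagesRequest().setRequests(Arrays.asList(request)))
            .execute();
    }
}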
Description
1. Do not set a language hint, or set it to "zh-TW".
2. Send a "TEXT_DETECTION" request with 1.jpg and 2.jpg.
3. Get the results 1.json and 2.json.
What is the expected output? What do you see instead?
The characters should be in the correct order, line by line.
But in the bad results, some of the text becomes a vertical line.
What version of the product are you using? On what operating system?
compile 'com.google.apis:google-api-services-vision:v1-rev30-1.22.0'
compile 'com.google.apis:google-api-services-vision:v1-rev41-1.22.0'
Please provide any additional information below.
Most card images are OK, but the two attached images give bad results.