Assigned
Comments
ar...@google.com <ar...@google.com> #2
I have informed our engineering team of this feature request. There is currently no ETA for its implementation.
A current workaround would be to check the "vertices" of the "boundingPoly" [1] for each of the returned "textAnnotations". If the calculated rectangle's height is greater than its width, then your image is sideways.
[1] https://cloud.google.com/vision/reference/rest/v1/images/annotate#boundingpoly
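The check described above could be sketched roughly as follows. This is only an illustration under assumptions: the `Vertex` class and `looksSideways` helper are hypothetical names, not part of any Vision client library, and the vertices are assumed to have already been read out of a "boundingPoly" in the response. Which annotation to test (for example, a single word or line) is left to the caller.

```java
import java.util.List;

final class OrientationCheck {
    /** Hypothetical holder for one "vertices" entry (pixel coordinates). */
    static final class Vertex {
        final int x;
        final int y;

        Vertex(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Returns true when the axis-aligned rectangle enclosing the vertices
     * is taller than it is wide, which is the heuristic from the comment.
     */
    static boolean looksSideways(List<Vertex> vertices) {
        if (vertices.isEmpty()) {
            throw new IllegalArgumentException("boundingPoly has no vertices");
        }
        int minX = Integer.MAX_VALUE, maxX = Integer.MIN_VALUE;
        int minY = Integer.MAX_VALUE, maxY = Integer.MIN_VALUE;
        for (Vertex v : vertices) {
            minX = Math.min(minX, v.x);
            maxX = Math.max(maxX, v.x);
            minY = Math.min(minY, v.y);
            maxY = Math.max(maxY, v.y);
        }
        int width = maxX - minX;
        int height = maxY - minY;
        return height > width;
    }

    public static void main(String[] args) {
        // Made-up sample boxes: a wide one and a tall one.
        List<Vertex> wide = List.of(
                new Vertex(10, 10), new Vertex(210, 10),
                new Vertex(210, 40), new Vertex(10, 40));
        List<Vertex> tall = List.of(
                new Vertex(10, 10), new Vertex(40, 10),
                new Vertex(40, 210), new Vertex(10, 210));
        System.out.println("wide box sideways? " + looksSideways(wide));
        System.out.println("tall box sideways? " + looksSideways(tall));
    }
}
```

Note that this heuristic only distinguishes upright from sideways text; it cannot tell a 90-degree rotation from a 270-degree one.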
Description
This will create a feature request which anybody can view and comment on.
Please describe your requested enhancement. Good feature requests will solve common problems or enable new use cases.
What you would like to accomplish:
Currently, only .PDF, .TIF or .TIFF formats are allowed as valid input for the Entity Extraction method models.predict. These formats are more complex than a simple .txt file, and OCR is applied to them for entity extraction.
How this might work:
.txt files in Cloud Storage could be used as the input document for the predict method, which would make the workflow simpler. See the code below for an example. It would work similarly to the way PDF input is currently accepted, or it could redirect to Text Snippet, but without having to send all the text in the request.
If applicable, reasons why alternative solutions are not sufficient:
The other workaround is to use Text Snippet. However, this requires reading the text from the .txt file and sending it in the request, which is cumbersome when multiple models are involved and the text must be split, which might affect the model's accuracy.
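To show why this workaround is cumbersome, here is a rough sketch of the client-side splitting it implies. Everything in it is an assumption for illustration: `SnippetSplitter`, `split`, and the `maxChars` value are hypothetical, the real size limit has to come from the API documentation, and the call that sends each chunk as a Text Snippet is deliberately left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class SnippetSplitter {
    /**
     * Splits text into chunks of at most maxChars characters, breaking at
     * whitespace when possible so that words are not cut in half.
     */
    static List<String> split(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + maxChars, text.length());
            if (end < text.length()) {
                // Look for the last whitespace at or before the cut point.
                int ws = lastWhitespace(text, start, end);
                if (ws > start) {
                    end = ws;
                }
            }
            String chunk = text.substring(start, end).trim();
            if (!chunk.isEmpty()) {
                chunks.add(chunk);
            }
            start = end;
        }
        return chunks;
    }

    private static int lastWhitespace(String text, int start, int end) {
        for (int i = end; i > start; i--) {
            if (Character.isWhitespace(text.charAt(i))) {
                return i;
            }
        }
        return -1; // no usable break point; the caller cuts at maxChars
    }

    public static void main(String[] args) throws IOException {
        // Pass a local .txt path as the first argument, or use the sample text.
        String text = args.length > 0
                ? Files.readString(Path.of(args[0]))
                : "alpha beta gamma";
        int maxChars = 10; // hypothetical limit, for demonstration only
        for (String chunk : split(text, maxChars)) {
            System.out.println("[" + chunk + "]");
        }
    }
}
```

A split like this can separate an entity from its surrounding context, which is the accuracy concern raised above.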
Other information (workarounds you have tried, documentation consulted, etc):
Follow these steps to reproduce:
Use a .txt file as test input for your model. You will receive an error since it's not supported.
App.java is the code that generates the error:
INVALID_ARGUMENT: List of found errors: 1.Field: payload.document.input_config.gcs_source.input_uris[0]; Message: The file extension 'txt' is not supported.
Note that you need to modify project Id, Model Id and source filename variables.
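Until .txt input is supported, a client could at least catch the problem before calling predict. The sketch below is one hypothetical way to do that: `InputUriCheck` and its methods are made-up names, the bucket and file names are placeholders, and the supported set simply mirrors the formats listed in this request (.PDF, .TIF, .TIFF).

```java
import java.util.Locale;
import java.util.Set;

final class InputUriCheck {
    // Extensions this request lists as currently accepted for document input.
    private static final Set<String> SUPPORTED = Set.of("pdf", "tif", "tiff");

    /** Returns the lower-cased file extension of a URI, or "" if there is none. */
    static String extensionOf(String uri) {
        int slash = uri.lastIndexOf('/');
        int dot = uri.lastIndexOf('.');
        // A dot that appears before the last slash belongs to a directory or bucket name.
        if (dot <= slash || dot == uri.length() - 1) {
            return "";
        }
        return uri.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    static boolean isSupported(String uri) {
        return SUPPORTED.contains(extensionOf(uri));
    }

    public static void main(String[] args) {
        // Placeholder URIs; replace with your own Cloud Storage paths.
        String[] uris = {"gs://my-bucket/report.PDF", "gs://my-bucket/notes.txt"};
        for (String uri : uris) {
            System.out.println(uri + " supported? " + isSupported(uri));
        }
    }
}
```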