Request for new functionality
Description
-Fix issues with, or improve, the OCR quality of Document AI for the Hindi language.
Issues Encountered:
-Some characters are replaced.
-Some characters are inserted between lines (e.g. a hyphen, a period, single/double quotes, or random characters).
-The vertical line (danda, "।"), which serves the same role as a period (full stop) in English, is recognized as the number 1.
-Double quotes are detected as different characters (different bytes).
-Extra spaces are inserted.
-All dash variants are detected as the same dash used in English.
-Some characters are not detected correctly.
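A few of the issues above (the danda read as "1", double quotes returned as different code points, and extra spaces) can be partly masked by post-processing the OCR text. The following is only a minimal sketch under stated assumptions, not part of Document AI: the function name `normalize_hindi_ocr` is hypothetical, and the danda rule is a heuristic that only rewrites a "1" or "|" attached directly to a Devanagari character, so it may still misfire on some inputs and does not recover replaced or undetected characters.

```python
import re
import unicodedata

DANDA = "\u0964"  # Devanagari danda, the Hindi full stop

# Map typographic quote variants to their plain ASCII equivalents.
_QUOTES = str.maketrans({
    "\u201c": '"', "\u201d": '"', "\u201e": '"',
    "\u2018": "'", "\u2019": "'",
})

# Heuristic (assumption): a "1" or "|" glued to a Devanagari character and
# followed by whitespace or end of text is most likely a misread danda.
_FALSE_DANDA = re.compile(r"(?<=[\u0900-\u097F])[1|](?=\s|$)")


def normalize_hindi_ocr(text: str) -> str:
    """Hypothetical cleanup pass over OCR output for Hindi text."""
    text = unicodedata.normalize("NFC", text)  # consistent code points
    text = text.translate(_QUOTES)             # unify quote characters
    text = _FALSE_DANDA.sub(DANDA, text)       # restore likely dandas
    text = re.sub(r"[ \t]{2,}", " ", text)     # collapse runs of spaces
    return text
```

A numeral separated from the preceding word by a space, as in "कक्षा 1 में", is left unchanged, which is why the rule requires the "1" to be attached to the Devanagari character.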
How this might work:
-Hindi characters should be recognized correctly.
If applicable, reasons why alternative solutions are not sufficient:
-The current alternative is only a short-term resolution.
Other information (workarounds you have tried, documentation consulted, etc):
-The workaround applied is to use the builtin/weekly model, as it shows fewer issues and inconsistencies in the results.