
The relevant term is "bounding box", since you probably need the confidence level of each character or word, not just the image. I built such an interface. I think the effort is only worth it if you really have many millions of pages.
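To make the bounding-box idea concrete, here is a rough sketch of how a review interface might pick out the words worth a human look. Everything in it is an assumption for illustration: the `OcrWord` fields, the 0-100 confidence scale, the threshold of 80, and the sample values are all hypothetical and not tied to any particular OCR engine's output.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    """One recognized word with its bounding box (hypothetical layout)."""
    text: str
    left: int          # x of the box's top-left corner, in pixels
    top: int           # y of the box's top-left corner, in pixels
    width: int
    height: int
    confidence: float  # assumed 0-100 scale; real engines differ


def words_needing_review(words, threshold=80.0):
    """Return the words whose confidence is below the threshold."""
    return [w for w in words if w.confidence < threshold]


# Made-up sample data standing in for real OCR output.
sample = [
    OcrWord("Invoice", 40, 32, 120, 28, 96.5),
    OcrWord("N0.", 170, 32, 48, 28, 61.0),
    OcrWord("12345", 226, 32, 90, 28, 91.2),
]

flagged = words_needing_review(sample)
for w in flagged:
    # A UI could draw this box over the page image to highlight the word.
    print(w.text, (w.left, w.top, w.width, w.height), w.confidence)
```

With a box and a confidence per word, the interface only has to highlight the flagged regions on the page image instead of asking someone to reread the whole page.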

Niels recently posted a lot about other OCR engines: https://www.linkedin.com/posts/niels-rogge-a3b7a3127_lots-of...


