OCR

OCR (Optical Character Recognition) is a very useful pattern-recognition technique for reading text from images. It is widely used to improve UX, especially on mobile, where typing is not very comfortable.

The project I was involved in was a mobile application presenting the client's product portfolio, their offer, some social features, …, and a lottery. To enter the lottery, the user had to type in a serial number from the product packaging. This was really inconvenient, so the client decided to add OCR functionality to the application.

Together with the dev team, we decided to use Tesseract (https://github.com/tesseract-ocr/tesseract), an open-source OCR engine (Google is not its original author, but today the project is under its banner). In short, it is a very powerful tool, and its main strength lies not in a wide range of supported formats or fonts but in how far it can be customized through machine learning. You can train it on your own data to improve the speed and accuracy of recognition. This turned out to be extremely useful in our project, because the client had used a very difficult font with a lot of gaps in each character (something like the example below). We also used a combination of image filters such as blur and erosion (see below).

[Image: ocr]
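
To give a rough idea of what the blur and erosion filters do, here is a minimal sketch in plain Java. It is not the code from the project: the class name OcrPreprocess and its methods are hypothetical, and it assumes a grayscale image stored as a 2D int array (0 = black, 255 = white) with dark characters on a light background and a fixed 3x3 neighbourhood. Under those assumptions, erosion (taking the minimum over the neighbourhood) thickens dark strokes, which may help close small gaps in a character before the image is handed to the OCR engine, while the blur smooths out noise.

```java
/** Hypothetical preprocessing helpers for a grayscale image (0 = black, 255 = white). */
final class OcrPreprocess {

    private OcrPreprocess() {}

    /** 3x3 box blur: each pixel becomes the integer average of its existing neighbours. */
    static int[][] boxBlur(int[][] src) {
        int h = src.length, w = src[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sum = 0, count = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int ny = y + dy, nx = x + dx;
                        // Skip neighbours that fall outside the image.
                        if (ny >= 0 && ny < h && nx >= 0 && nx < w) {
                            sum += src[ny][nx];
                            count++;
                        }
                    }
                }
                out[y][x] = sum / count;
            }
        }
        return out;
    }

    /** 3x3 erosion: each pixel becomes the darkest (minimum) value among its existing neighbours. */
    static int[][] erode(int[][] src) {
        int h = src.length, w = src[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int min = 255;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int ny = y + dy, nx = x + dx;
                        if (ny >= 0 && ny < h && nx >= 0 && nx < w) {
                            min = Math.min(min, src[ny][nx]);
                        }
                    }
                }
                out[y][x] = min;
            }
        }
        return out;
    }
}
```

For example, calling OcrPreprocess.erode on a single row of two dark pixels separated by one light pixel returns a row in which that light gap has become dark. In a real Android application the same operations would more likely run on camera frames through an image-processing library than on hand-written loops like these.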

Techniques: JNI, Tesseract, Android Camera API

 
