Kiosk CamApp Project | OCR | Python

Gurram Harshavardhan Netha
3 min read · May 30, 2021


My father runs a Customer Service Point (CSP) of SBI in our rural village. These points are commonly known as zero banks.

For any transaction to happen, we need to manually enter an 11-digit number, called the CIF, on the home page of the website. This step is the HERO thing, as we have to do it for every single customer who visits our CSP.

Pain Point:

Any process done manually takes a hell of a lot of time. And when it involves numbers printed on a card, which are really small, it gets hectic.

  • Digits are small and hard to distinguish
  • An error in a single digit leads to a ~45 sec delay for the customer

Approach:

An OCR app that detects the CIF number in an input image, enters it in the respective field, and then submits it.

Let’s code!

Step 1: Configuration of camera.

We need to take input from an external camera. Any mobile phone will do.

A quick Google search suggested this app: CamDroid.

Step 2: Capture Image

To capture an image from the connected mobile, we use OpenCV.
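As a rough idea of what the capture step can look like, here is a minimal sketch. It assumes the phone camera app exposes either a webcam index or a stream URL that OpenCV can open; the function name `capture_frame` and the example address in the comment are my own placeholders, not something taken from the app.

```python
def capture_frame(source=0):
    """Grab a single frame from a camera index or a stream URL."""
    # Imported inside the function so this file still loads on a machine
    # where OpenCV is not installed yet.
    import cv2

    cap = cv2.VideoCapture(source)
    try:
        if not cap.isOpened():
            raise RuntimeError(f"Could not open camera source: {source!r}")
        ok, frame = cap.read()
        if not ok:
            raise RuntimeError("Camera opened, but no frame was returned")
        return frame
    finally:
        # Always free the camera so the next capture can reuse it.
        cap.release()


# Example call (hypothetical address, use whatever your camera app shows):
# frame = capture_frame("http://192.168.1.5:8080/video")
```

Passing `0` instead of a URL picks the first webcam attached to the computer, which is handy for testing without the phone.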

Step 3: OCR

OCR, short for Optical Character Recognition, detects text in an image.

pytesseract is a Python wrapper for the Tesseract engine that makes OCR possible with just 10 lines of code.

Easy peasy!
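Here is a small sketch of how the OCR step might be wired up. It assumes the card image contains the CIF as an unbroken run of exactly 11 digits; the helper names `find_cif` and `read_cif` and the `--psm 6` page segmentation setting are my assumptions, so tune them for your own cards.

```python
import re

# Exactly 11 digits, not part of a longer run of digits.
CIF_PATTERN = re.compile(r"(?<!\d)\d{11}(?!\d)")


def find_cif(text):
    """Return the first 11-digit number in the OCR text, or None."""
    match = CIF_PATTERN.search(text)
    return match.group(0) if match else None


def read_cif(image):
    """Run Tesseract on the image and pull the CIF out of the result."""
    # Imported here so find_cif can be used and tested without Tesseract.
    import pytesseract

    # "--psm 6" treats the image as one uniform block of text (assumption).
    text = pytesseract.image_to_string(image, config="--psm 6")
    return find_cif(text)
```

Keeping the digit matching in its own function means a 10-digit phone number or a 16-digit card number on the same card is ignored instead of being typed into the CIF field.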

Step 4: Paste it in the SBI window and submit

Here comes the tricky part! The web app that SBI’s work happens on has a great set of security restrictions.

  • It only works in IE (Internet Explorer).
  • Right click is disabled.
  • Can’t view the page source.
  • Can’t scrape.
  • Ctrl + V doesn’t work…etc!

Well, every problem has a solution.

“PyAutoGUI to the rescue.”

It is a module that automates mouse and keyboard actions. Simple!

Now, find the coordinates of the input box on the screen and boom! 💥

A 1-minute task, in just 5 seconds!
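The typing step could look roughly like the sketch below. The coordinates are placeholders (one way to find yours is to hover over the input box and print `pyautogui.position()`), and submitting with the Enter key is my assumption about how the form behaves. `enter_cif` is a hypothetical helper name.

```python
def enter_cif(cif, field_xy, typing_interval=0.05):
    """Click the CIF input box, type the number, and press Enter."""
    # Refuse anything that is not exactly 11 plain digits, so a bad OCR
    # read never reaches the bank page.
    if not (cif.isascii() and cif.isdigit() and len(cif) == 11):
        raise ValueError(f"Not an 11-digit CIF: {cif!r}")

    # Imported here because pyautogui needs a desktop session to load.
    import pyautogui

    x, y = field_xy
    pyautogui.click(x, y)                           # focus the input box
    pyautogui.write(cif, interval=typing_interval)  # type key by key
    pyautogui.press("enter")                        # submit (assumption)


# Example call with made-up coordinates:
# enter_cif("12345678901", (640, 360))
```

Typing the digits key by key is what gets around the disabled Ctrl + V, since the page only sees ordinary keystrokes.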

Improvements from base version:

  • pygetwindow: To activate Internet Explorer. (The app now works even if Internet Explorer is minimised.)
  • The same module also helped me clear the clutter, i.e. minimise all unnecessary windows. This gives a cool UX for elderly people who are not very comfortable with technology.
  • playsound: To play a beep that confirms a CIF number was detected in the image.
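These improvements can be sketched as follows. This assumes a Windows desktop (where pygetwindow is best supported), a browser window whose title contains "Internet Explorer", and a sound file named `beep.wav`; the helper names and the file name are placeholders of mine.

```python
def focus_browser(title_fragment="Internet Explorer"):
    """Minimise other windows and bring the browser window to the front."""
    # Imported here because pygetwindow only loads on supported desktops.
    import pygetwindow as gw

    matches = gw.getWindowsWithTitle(title_fragment)
    if not matches:
        return False  # browser is not running, nothing to do
    target = matches[0]

    fragment = title_fragment.lower()
    for window in gw.getAllWindows():
        # Skip untitled system windows, the browser itself, and windows
        # that are already out of the way.
        if not window.title or fragment in window.title.lower():
            continue
        if not window.isMinimized:
            window.minimize()

    if target.isMinimized:
        target.restore()
    target.activate()
    return True


def beep(sound_path="beep.wav"):
    """Play a short sound to confirm that a CIF number was detected."""
    from playsound import playsound

    playsound(sound_path)
```

Calling `focus_browser()` right before the typing step makes sure the keystrokes land in the bank page and not in whatever window happened to be on top.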


OCR Tesseract:

Webcam Input OpenCV:

PyAutoGUI Documentation:

PyGetWindow Documentation: