Question

IQBot and handwritten numbers. I am trying to extract handwritten numbers from a document with IQ Bot using the OCR engine Tesseract4. The OCR does not recognize certain digits correctly. For example, it interprets "4" as "&" (see the screenshot).

  • 22 March 2022
  • 5 replies
  • 49 views

I selected number as the data type. Is it possible to force IQ Bot to output a number as the extraction result, to reduce the number of possible wrong interpretations?
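If the engine itself cannot be restricted to digits, one fallback is to coerce the raw OCR text afterwards in Python. The following is a minimal sketch, not an IQ Bot feature: the function name `coerce_to_digits` and the look-alike table are hypothetical, and only the "&" to "4" pair comes from this post. The other pairs are common OCR confusions added as assumptions.

```python
import re
from typing import Optional

# Hypothetical look-alike table. Only "&" -> "4" is taken from this thread;
# the remaining pairs are assumed, commonly seen OCR confusions.
CONFUSIONS = str.maketrans({
    "&": "4", "O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8",
})


def coerce_to_digits(raw: str, allowed_extra: str = "/") -> Optional[str]:
    """Map look-alike characters to digits and drop whitespace.

    Returns the cleaned string, or None if anything other than digits
    (and the characters in allowed_extra) is still present.
    """
    cleaned = raw.translate(CONFUSIONS)
    cleaned = re.sub(r"\s+", "", cleaned)
    pattern = rf"[0-9{re.escape(allowed_extra)}]+"
    return cleaned if re.fullmatch(pattern, cleaned) else None
```

A value that comes back as `None` still contains unexpected characters and is better routed to manual validation than corrected by guesswork.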


5 replies


Hi @Dominik Täffner,

Is it possible to share the document here?

You can extract the values using IQ Bot; however, the extraction results will depend on the document quality.

IQ Bot Pre-processor package - Automation Anywhere ...


Hi @Dominik Täffner,

Can you try with Google Vision API OCR?

Dear Tamil,

I attached an example PDF of the numbers to be extracted.

Dear Chandu,

I have already tried GoogleVision API OCR. See below for the extraction result for the first number in the example. IQ Bot extracts the vertical lines of the extraction boxes, but unfortunately not all of them; otherwise it would have been possible to clean the output systematically via Python logic.

I also tried to extract the number digit by digit to avoid extracting the vertical lines. This does not work either, as GoogleVision API OCR recognizes the number as one SIR (system identified region). Only Tesseract4 allows extracting the number digit by digit.

Original Number

1 2 3 4 5 / 5 6 7 8 9 1

GoogleVision API OCR extraction result

1 /2/341516171819 니 11
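For the digit-by-digit route that Tesseract4 allows, the per-box results could be checked and joined in Python. This is only a sketch under assumptions: `assemble_number` is a hypothetical helper, and how the individual box values reach the script (here a plain list of strings, left to right) is not something IQ Bot prescribes.

```python
from typing import List, Tuple


def assemble_number(boxes: List[str], separators: str = "/") -> Tuple[str, List[int]]:
    """Join per-box OCR results into one value.

    Each box must hold exactly one ASCII digit (or an allowed separator).
    Any other box is replaced by "?" and its index is reported, so the
    value can be sent to manual validation instead of being guessed.
    """
    valid = "0123456789" + separators
    parts = []
    suspects = []
    for index, raw in enumerate(boxes):
        text = raw.strip()
        if len(text) == 1 and text in valid:
            parts.append(text)
        else:
            parts.append("?")
            suspects.append(index)
    return "".join(parts), suspects
```

For example, `assemble_number(["1", "2", "3", "&", "5"])` gives `("123?5", [3])`, which marks the fourth box as the one to review.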


Hi @Dominik Täffner,

Thanks for sharing the document here.

I have tried to extract the value using Microsoft Azure OCR, but I didn't get the correct value. I believe that Computer Vision will help us to extract the value.

I will give it a try and update you later.
