Question

IQ Bot and handwritten numbers. I am trying to extract handwritten numbers from a document with IQ Bot using the Tesseract4 OCR engine. The OCR does not recognize certain digits correctly; for example, it interprets "4" as "&" (see the screenshot).

Forum|Forum|4 years ago
March 22, 2022
5 replies
115 views

dominiktaffner
Cadet | Tier 2

I selected Number as the data type. Is it possible to force IQ Bot to output a number as the extraction result, to reduce the number of possible misinterpretations?
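If the extraction itself cannot be constrained, one way to approximate a numeric-only result is to post-process the extracted string in Python. The sketch below is an assumption, not an IQ Bot feature: the function name `to_digits` and the `CONFUSIONS` table are hypothetical, and only the "&" to "4" entry comes from this thread; the other entries are common OCR look-alikes added for illustration.

```python
from typing import Optional

# Hypothetical confusion table. Only "&" -> "4" is reported in this thread;
# the remaining entries are assumed look-alikes and should be tuned to real data.
CONFUSIONS = {"&": "4", "O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}


def to_digits(raw: str) -> Optional[str]:
    """Map look-alike characters to digits.

    Whitespace is dropped. Returns None when the result is empty or still
    contains anything other than the ASCII digits 0-9.
    """
    cleaned = "".join(CONFUSIONS.get(ch, ch) for ch in raw if not ch.isspace())
    if cleaned and cleaned.isascii() and cleaned.isdigit():
        return cleaned
    return None
```

Returning `None` instead of a best guess keeps doubtful values visible, so they can be routed to manual validation rather than silently accepted.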

Tamil Arasu10
Most Valuable Pathfinder
March 22, 2022

Hi @Dominik Täffner ,

Is it possible to share the document here?

You can extract the values using IQ Bot; however, the extraction results will depend on the document quality.

IQ Bot Pre-processor package - Automation Anywhere ...

_Tamil_

ChanduMohammad
Most Valuable Pathfinder
March 23, 2022

Hi @Dominik Täffner ,

Can you try with Google Vision API OCR?

Chand

dominiktaffner
Author
Cadet | Tier 2
March 23, 2022

Dear Tamil,

I attached an example pdf of the numbers to be extracted.

dominiktaffner
Author
Cadet | Tier 2
March 23, 2022

Dear Chandu,

I have already tried the Google Vision API OCR. See below for the extraction result for the first number in the example. IQ Bot also extracts the vertical lines of the extraction boxes, but unfortunately not all of them; otherwise it would have been possible to clean the output systematically via Python logic.

I also tried to extract the number digit by digit to avoid extracting the vertical lines. This does not work either, as the Google Vision API OCR recognizes the number as one SIR (system identified region). Only Tesseract4 allows extracting the number digit by digit.

Original Number

1 2 3 4 5 / 5 6 7 8 9 1

GoogleVision API OCR extraction result

1 /2/341516171819 니 11
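Even when the output cannot be cleaned systematically, it can at least be checked so that bad extractions are flagged instead of passed on. The sketch below assumes a layout of five digits, a slash, and six digits; that layout is inferred only from the single example above and may not hold for other documents. The names `EXPECTED` and `needs_review` are hypothetical.

```python
import re

# Assumed layout, inferred from the one example in this thread:
# five digits, a slash, six digits (ASCII digits only).
EXPECTED = re.compile(r"[0-9]{5}/[0-9]{6}")


def needs_review(extracted: str) -> bool:
    """Return True when the extracted text does not match the assumed layout."""
    # Remove all whitespace so "1 2 3 4 5 / ..." and "12345/..." compare equally.
    compact = "".join(extracted.split())
    return EXPECTED.fullmatch(compact) is None


print(needs_review("1 2 3 4 5 / 5 6 7 8 9 1"))    # original number
print(needs_review("1 /2/341516171819 니 11"))    # OCR result quoted above
```

A check like this does not recover the correct digits; it only separates results that fit the expected shape from those that need a manual look.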

Tamil Arasu10
Most Valuable Pathfinder
March 23, 2022

Hi @Dominik Täffner ,

Thanks for sharing the document here.

I have tried to extract the value using Microsoft Azure OCR, but I didn't get the correct value. I believe that Computer Vision will help us to extract the value.

I will give it a try and update you later.

_Tamil_
