Solved

Document Automation Failed to capture correct Invoice Number

  • October 24, 2024

The bot confuses ‘I’ and ‘1’ during extraction: it extracted ‘1NQR’ instead of ‘INQR’.

If the invoice number contains PSINV8888, the bot extracted PS1NV, which is wrong.

If the invoice number is I0782, the bot extracted 10782.

How can we resolve this issue?

Best answer by Dineshkumar Muthu

Hi @Adarsha G 

This problem often arises from poor document quality: when a scan is noisy or low in contrast, OCR cannot reliably tell similar-looking characters apart. To address this, enhance the quality of the documents before processing them.

Here are some steps you can take to improve document quality:

  1. Denoise: Reduce noise in the document to make characters clearer.
  2. Adjust Grayscale: Convert the document to grayscale so the text contrasts more clearly with the background.
  3. Adjust Contrast: Increase the contrast to make the text stand out more against the background.
  4. Adjust Brightness: Modify the brightness so the text is easily readable.
  5. Thresholding: Convert the document to a binary format (black and white) to improve text recognition.
  6. Remove RGB: Strip the color information that might interfere with OCR.

These steps should help with character recognition issues such as confusing ‘I’ and ‘1’. For more detailed guidance, refer to the image enhancement options in the Automation Anywhere documentation.

Applying these preprocessing techniques should improve the accuracy of the bot’s data extraction.
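To make the steps above concrete, here is a minimal pure-Python sketch of what each one does to the pixel values. It is only an illustration under stated assumptions, not how Document Automation implements its image enhancement options: the function names (`to_grayscale`, `denoise`, `adjust`, `threshold`, `preprocess`) and the default parameter values are hypothetical, and the image is assumed to be a plain list of rows of `(r, g, b)` tuples. In practice you would use the product's built-in options or an imaging library.

```python
from statistics import median
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[int]]


def to_grayscale(rgb: RGBImage) -> GrayImage:
    """Remove RGB / grayscale: collapse each pixel to one luminance value."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def denoise(gray: GrayImage) -> GrayImage:
    """Denoise: 3x3 median filter (edges use the neighbours that exist)."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(int(median(window)))
        out.append(row)
    return out


def adjust(gray: GrayImage, contrast: float = 1.0,
           brightness: int = 0) -> GrayImage:
    """Contrast / brightness: scale around mid-gray, shift, clamp to 0..255."""
    return [[max(0, min(255, round(contrast * (p - 128) + 128 + brightness)))
             for p in row]
            for row in gray]


def threshold(gray: GrayImage, cutoff: int = 128) -> GrayImage:
    """Thresholding: every pixel becomes pure black (0) or pure white (255)."""
    return [[255 if p >= cutoff else 0 for p in row] for row in gray]


def preprocess(rgb: RGBImage, contrast: float = 1.5, brightness: int = 10,
               cutoff: int = 128) -> GrayImage:
    """Chain the steps; the default values are arbitrary illustrations."""
    gray = denoise(to_grayscale(rgb))
    return threshold(adjust(gray, contrast, brightness), cutoff)


# A light page with one dark speck of noise in the middle.
page = [[(200, 200, 200)] * 3 for _ in range(3)]
page[1][1] = (30, 30, 30)
cleaned = preprocess(page)  # the speck is filtered out before thresholding
```

Note that an aggressive median filter can also erase thin strokes, which is exactly what distinguishes ‘I’ from ‘1’, so it is worth checking each setting against a few sample invoices rather than applying everything at maximum strength.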

 

 
