Question

Extract PDF table from multiple PDFs

  • 20 December 2022
  • 3 replies
  • 526 views


How can I extract tables from multiple PDFs when the PDFs are unstructured?


3 replies


Hi @Harika 1170,

I’d recommend checking out IQ Bot or AA's new offering, Document Automation, for extracting unstructured table data from digital documents.

Check out the links below for more info:

https://docs.automationanywhere.com/bundle/enterprise-v2019/page/enterprise-cloud/topics/iq-bot/native/iq-bot-workflow.html

https://docs.automationanywhere.com/bundle/enterprise-v2019/page/enterprise-cloud/topics/iq-bot/train/iqb-training-map-table.html


Dear @Harika 1170,

To extract tables from multiple unstructured PDFs using Automation Anywhere, you can follow these steps:

  1. You can use the IQ Bot package to extract data from PDF documents.

  2. First, you will need to install a PDF library or tool that can be used to parse and extract data from PDF files. Some popular options include iText, Apache PDFBox, and PDFTron.

  3. Next, you will need to use the "Read PDF" action to read the contents of each PDF file into a variable. This action allows you to specify the path of the PDF file and the page range you want to extract.

  4. Once you have read the contents of the PDF into a variable, you can use a PDF library or tool to parse the data and extract the table you are interested in. This will typically involve using functions or methods provided by the library or tool to locate and extract the table data from the PDF contents.

  5. Finally, you can use the "Write CSV/Excel" action to write the extracted table data to a CSV or Excel file. This action allows you to specify the data you want to write and the location where you want to save the file.
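
For steps 2 to 5, here is a minimal standalone sketch in Python of the same read, parse, and write flow, run outside Automation Anywhere. It swaps in the open-source pdfplumber library instead of iText, PDFBox, or PDFTron, purely as an assumption for illustration. The names `extract_tables_with_pdfplumber` and `export_tables`, the folder arguments, and the output file naming are all hypothetical. It only works on digital (text-based) PDFs, and table detection on unstructured layouts may need tuning or may miss tables entirely.

```python
import csv
from pathlib import Path
from typing import Callable, List

Table = List[List[str]]


def extract_tables_with_pdfplumber(pdf_path: Path) -> List[Table]:
    """Return every table pdfplumber detects in one PDF.

    Assumption: pdfplumber is installed (`pip install pdfplumber`).
    """
    import pdfplumber  # third-party; imported lazily so the rest runs without it

    tables: List[Table] = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                # pdfplumber returns None for empty cells; normalise to ""
                tables.append([[cell or "" for cell in row] for row in table])
    return tables


def export_tables(
    pdf_dir,
    out_dir,
    extract: Callable[[Path], List[Table]] = extract_tables_with_pdfplumber,
) -> List[Path]:
    """Write each table found in each PDF under pdf_dir to its own CSV file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        for index, table in enumerate(extract(pdf_path), start=1):
            # Hypothetical naming scheme: <pdf name>_table<N>.csv
            csv_path = out_dir / f"{pdf_path.stem}_table{index}.csv"
            with csv_path.open("w", newline="", encoding="utf-8") as handle:
                csv.writer(handle).writerows(table)
            written.append(csv_path)
    return written
```

Calling `export_tables("invoices", "tables_out")` (both folder names are placeholders) would produce one CSV per detected table. The extractor is passed in as a parameter so a different PDF library can be substituted without touching the loop or the CSV writing.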

Regards,


Hi @Harika 1170,

In addition to what @ChanduMohammad has suggested: from A360 Enterprise V.25 onwards, AA provides an all-new feature called Document Automation.

It uses two pre-trained models, one from AA and one from Google (a separate license would be required), with which you can perform all kinds of document extraction.

This will be a perfect option for you, especially if you don’t have an IQ Bot license.
