Question

How can I extract specific text from PDF?



Hi,

I’m using the Community Edition of AA and trying to build a bot that extracts one small set of digits. When I select the PDF extract text action, the bot extracts all the text on the page. How can I extract only the specific digits I want?

Thanks, 

5 replies

Tamil Arasu10
Most Valuable Pathfinder
  • 3276 replies
  • July 20, 2023

Hi @john whittaker ,

After extracting all the text, you can use the String package to retrieve only the required digits.
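As a rough illustration of the "extract everything, then filter with string operations" idea, here is a minimal Python sketch rather than AA bot actions. The function name `find_digit_runs` and the sample text are hypothetical; the real text would come from the PDF extract step, and the 4-digit length is only an assumption based on the example number 4206.

```python
import re
from typing import List


def find_digit_runs(text: str, length: int = 4) -> List[str]:
    """Return every standalone run of exactly `length` digits in `text`."""
    # The lookarounds keep a match from being part of a longer number.
    pattern = rf"(?<!\d)\d{{{length}}}(?!\d)"
    return re.findall(pattern, text)


# Hypothetical text standing in for the full-page extraction result.
sample = "Title: Survivor, Episode 4206, registered 07/20/2023"
matches = find_digit_runs(sample)
```

Note that a bare digit pattern also picks up other numbers of the same length, such as the year in this sample, so it usually needs to be combined with a nearby label or a known position in the text.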

Do you want to extract the digits directly from the PDF? Could you please share a screenshot of the PDF or an example PDF file?

 



Hi, thanks for your response! I attached a screenshot of the upper portion of the PDF, which contains an episode number for the TV series Survivor, which I copyright. In this case, I’m trying to extract the episode number (4206) and then use that number to rename the PDF file. I have a lot of certificates, each with a different number, so I want to repeat this process for every file.
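The rename step described above could look roughly like this in Python. This is only a sketch of the logic, not an AA action; `rename_with_number` is a hypothetical helper, and it assumes the episode number has already been extracted as a string.

```python
from pathlib import Path


def rename_with_number(pdf_path: Path, number: str) -> Path:
    """Rename `pdf_path` to '<number>.pdf' in the same folder; return the new path."""
    target = pdf_path.with_name(f"{number}{pdf_path.suffix}")
    # Refuse to overwrite a file that already carries this number.
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    return pdf_path.rename(target)


# Usage (hypothetical file name):
# rename_with_number(Path("certificates/scan_001.pdf"), "4206")
```

Running this in a loop over a folder of certificates would repeat the process for each file, with the number extracted fresh for every PDF.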


Tamil Arasu10
Most Valuable Pathfinder
  • 3276 replies
  • July 25, 2023

Did you convert the PDF file to a text file? Are you able to see the required value in the results?


  • Navigator | Tier 3
  • 20 replies
  • July 26, 2023

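1) Extract the text from the PDF with structure preserved. This gives you the page text as a string.

2) Use a before/after string operation to pull out the value that sits between two known pieces of text.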
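The two steps above can be sketched in Python as follows. This only illustrates the before/after logic and is not the AA string action itself; `text_between`, the label "Episode No.", and the sample text are all assumptions, since the actual wording on the certificate is not shown in the thread.

```python
from typing import Optional


def text_between(text: str, before: str, after: str) -> Optional[str]:
    """Return the trimmed text between `before` and `after`, or None if either is missing."""
    start = text.find(before)
    if start == -1:
        return None
    start += len(before)
    # Search for the closing marker only after the opening one.
    end = text.find(after, start)
    if end == -1:
        return None
    return text[start:end].strip()


# Hypothetical extracted text; the real label on the certificate may differ.
sample = "SURVIVOR\nEpisode No. 4206\nRegistration details follow"
episode = text_between(sample, "Episode No.", "\n")
```

Anchoring on the text before and after the value tends to be more reliable than matching digits alone, because other numbers on the page are ignored.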

 

 

Thanks 

Aravindh s

Aravindhkumar002@gmail.com 



I was out of town for several days and just got back. I’ll try converting the PDF to a text file. Thanks.

