Solved

String package: How to extract only alphabet from string



Hi, I need some help. Can I extract only the alphabetic characters from a string and exclude the numeric values?

Example string: My name is Liyana 6547

I just want to display “My name is Liyana”. The string value is random and not fixed.

Best answer by Zaid Chougle


12 replies

Zaid Chougle
  • Navigator | Tier 3
  • 217 replies
  • Answer
  • February 16, 2024

Hey @liyananadia ,

You can use String → Replace: pass \d+ as the regex to find and leave the replacement value blank. That removes the digits and gives you the output you want.
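For anyone who wants to check the pattern outside the bot, here is a minimal Python sketch of the same idea. It is not the String package itself; the function name `strip_digits` is made up for illustration. Note that removing the digits leaves the space that preceded them, so the sketch also trims the result.

```python
import re

def strip_digits(text: str) -> str:
    # Replace every run of digits with an empty string,
    # then trim the whitespace left behind at the ends.
    return re.sub(r"\d+", "", text).strip()

print(strip_digits("My name is Liyana 6547"))  # My name is Liyana
```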


  • Author
  • Cadet | Tier 2
  • 3 replies
  • February 20, 2024

Hi @Zaid Chougle. Thank you so much for your answer. It works now!

 


  • Navigator | Tier 3
  • 20 replies
  • February 3, 2025

Hi Team,

 

I have an OCR output string containing various data, and I need to extract two specific pieces of information:

  1. The PAN ID, which follows the pattern of five letters, four digits, and one letter (e.g., ABCDE1234F).
  2. The last four digits of an Aadhaar card number. It appears as xxxx1234 among all the other data in the same OCR string, but I only need the numeric portion (1234).

    Below are the regex patterns I tried in the String package to extract these values:

    1. For the PAN card: ^[A-Z]{2}[A-Z]{3}[0-9]{4}[A-Z]{1}$$
    2. For the last four Aadhaar digits: [^xxxx][0-9]{4}

    However, neither returns the specific value; I get the whole string back instead.
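A likely reason these two patterns miss, sketched in Python rather than in the bot: `^` and `$` only match when the whole string is a PAN, and `[^xxxx]` is a negated character class (the same as `[^x]`, meaning "any one character that is not x"), not "preceded by xxxx". The sample text below is a shortened, made-up excerpt, and the sketch assumes the `$$` in the original is meant as a single end-of-string anchor.

```python
import re

# Shortened, made-up excerpt in the same shape as the OCR text in this thread.
ocr_text = "KEELA STREET, xxxxx2606 THURAIYUR NA ABCDE1234F VALID"

# Anchored with ^ and $: only matches if the entire string is a PAN.
anchored = re.search(r"^[A-Z]{5}[0-9]{4}[A-Z]$", ocr_text)

# Same pattern without anchors: finds the PAN anywhere in the text.
unanchored = re.search(r"[A-Z]{5}[0-9]{4}[A-Z]", ocr_text)

# [^xxxx] is a negated character class, equivalent to [^x].
negated = re.search(r"[^x][0-9]{4}", ocr_text)

print(anchored)            # None
print(unanchored.group())  # ABCDE1234F
print(negated.group())     # E1234
```

The last result shows the negated class picking up the end of the PAN instead of the masked Aadhaar digits.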

Shreya.Kumar
  • Pathfinder Community Team
  • 71 replies
  • February 3, 2025

@Vaandu Could you share an example/dummy of what your complete output looks like? We’ll be able to help you better based on that.

Depending on any recurring patterns in the output format, you could use a combination of the String Extract and Substring actions to get the data you need. For example, to extract the Aadhaar info, you could look for a keyword and use String Extract to isolate the complete Aadhaar number in a string variable. Then you can use the Substring action and specify:

  • the starting position of the substring (length of string - 3)
  • the length of the substring (4).
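The same arithmetic as a small Python sketch (the helper name `last_four` is hypothetical). One assumption to flag: "length of string - 3" gives the right start only if positions are counted from 1. Python counts from 0, so the equivalent start index there is length - 4.

```python
def last_four(value: str) -> str:
    # 0-based start index of the final four characters.
    start = len(value) - 4
    return value[start:start + 4]

print(last_four("xxxxx2606"))  # 2606
```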

 


  • Navigator | Tier 3
  • 20 replies
  • February 3, 2025

Hi ​@Shreya.Kumar ,

Thanks for your reply,

the String value is  “ Pudukkottal.Tamil Nadu,ca,622202 215 KEELA STREET, xxxxx2606 THURAIYUR PANCHAYAT ARIMALAM PANCHAYAT KEEZHAA NILAI POST.Thuralyur. Pudukkottal Tama Nadu,India.622202  NA ABCDE1234F VIVEKANANTHAN M MURUGESAN 26-04-1992 ABCDE1234F VIVEKANANTHAN MURUGESAN NA NA VALID ABCDE1234F 80.00% 0.00% 100% 100% 89.00% 8900% 100% “

I need to retrieve “ABCDE1234F” (the PAN ID) and the last four digits of the Aadhaar ID from “xxxxx2606”.

What is a feasible way to retrieve this data?


Shreya.Kumar
  • Pathfinder Community Team
  • 71 replies
  • February 3, 2025

@Vaandu Since the output data is unstructured, I tried an approach using the Regex Tools package in addition to my earlier suggestion.

I also read in the topic you started (now moved to this thread) that you tried using regex pattern strings. I tried those too but didn’t get a match, so I simplified the patterns instead:


  • For PAN: [A-Z]{5}[0-9]{4}[A-Z]
  • For Aadhaar: (\D{5})(\d{4})

That worked for me using the Extract a Regex Match action. I stored the extracted strings in different variables. To get the last 4 digits of the Aadhaar number, I used the Substring action that I mentioned in my earlier answer.
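As a rough Python equivalent of those two steps (not the Regex Tools package itself), the sketch below takes the first match of each pattern and then keeps the second capture group for the last four digits. The input is a shortened excerpt of the string posted above; on other inputs the first `(\D{5})(\d{4})` match may be a different token.

```python
import re

# Shortened excerpt of the OCR string from this thread.
ocr_text = ("215 KEELA STREET, xxxxx2606 THURAIYUR PANCHAYAT NA "
            "ABCDE1234F VIVEKANANTHAN M MURUGESAN 26-04-1992 ABCDE1234F")

# First occurrence of five letters, four digits, one letter.
pan_match = re.search(r"[A-Z]{5}[0-9]{4}[A-Z]", ocr_text)
pan = pan_match.group() if pan_match else None

# First occurrence of five non-digits followed by four digits.
aadhaar_match = re.search(r"(\D{5})(\d{4})", ocr_text)
# Group 2 holds only the four digits.
aadhaar_last4 = aadhaar_match.group(2) if aadhaar_match else None

print(pan)            # ABCDE1234F
print(aadhaar_last4)  # 2606
```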

 

Hope this helps!


  • Navigator | Tier 3
  • 20 replies
  • February 3, 2025

Hi ​@Shreya.Kumar ,

I'm glad the solution worked for the PAN ID! 🎉 Thanks for the innovative solution.

For the Aadhaar ID, the regex isn't working because the masked number may appear as "xxxx2606" (4 x's) or "xxxxx2606" (5 x's). With the current solution it retrieves “VKYC    2448” for the Aadhaar value.

Can I change the Aadhaar regex to “[x]{4,5}(\d{4})”?

 


Shreya.Kumar
  • Pathfinder Community Team
  • 71 replies
  • February 3, 2025

@Vaandu Could you check, using message boxes, exactly which step is not working? Message boxes work much like print statements, so you should be able to check the value of the extracted string after each action.


  • Navigator | Tier 3
  • 20 replies
  • February 3, 2025

 

hi ​@Shreya.Kumar 

I tried another set of input and tested it with the Message Box action. The PAN ID regex worked correctly, but the Aadhaar ID regex did not behave as expected. Your regex (\D{5})(\d{4}) retrieves 5 non-numeric characters followed by 4 digits. For example, the message box showed "VKYCM3345", when it should have retrieved "xxxx2606". So the regex returned a value that correctly matches its pattern, just not the one I need.

I then tried (x{4,5})\d{4}, so that “x” is a fixed character followed by 4 digits, but it did not work.
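For comparison, this is how the same idea behaves in Python's `re` module, which may differ from the bot's regex action. The capture group is moved onto the digits so that only the four numbers come back; the function name `masked_last_four` and the sample inputs are made up for illustration.

```python
import re
from typing import Optional

def masked_last_four(text: str) -> Optional[str]:
    # Four or five literal x characters, then capture exactly four digits.
    match = re.search(r"x{4,5}(\d{4})", text)
    return match.group(1) if match else None

print(masked_last_four("VKYCM3345 KEELA STREET, xxxx2606 NA"))   # 2606
print(masked_last_four("VKYCM3345 KEELA STREET, xxxxx2606 NA"))  # 2606
```

Because the pattern requires literal x characters, a token such as VKYCM3345 is skipped.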

 

 


Shreya.Kumar
  • Pathfinder Community Team
  • 71 replies
  • February 4, 2025

@Vaandu I found a workaround to this - I tried your regex expression with “Extract All Matches” instead of “Extract a Regex Match” and it worked.

This approach gives you all matching strings in a list and you could loop through them to check the correct one for your application. Hope this helps!
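A small Python sketch of that "collect every match, then loop and pick the right one" approach, using the broad pattern from earlier in the thread. The rule for choosing the right match (the token starts with at least four x characters) is an assumption based on the masked format described above, and the sample text is made up.

```python
import re

# Made-up excerpt containing several tokens that fit the broad pattern.
ocr_text = "VKYCM3345 STREET, xxxxx2606 NA ABCDE1234F M 26-04-1992"

# Every non-overlapping run of five non-digits followed by four digits.
candidates = [m.group(0) for m in re.finditer(r"\D{5}\d{4}", ocr_text)]

# Loop through the candidates and keep only the masked ones.
last_fours = []
for candidate in candidates:
    token = candidate.strip()
    if token.startswith("xxxx"):
        last_fours.append(token[-4:])

print(candidates)  # ['VKYCM3345', 'xxxxx2606', 'ABCDE1234']
print(last_fours)  # ['2606']
```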

 


  • Navigator | Tier 3
  • 20 replies
  • February 6, 2025

@Shreya.Kumar ,

Thank you for the insightful idea on extracting patterns!
 

I had been achieving a 100% success rate using another method, which involved applying the same regex through a VBS script. This approach allowed me to obtain highly accurate results without the need for any additional action packages. The extracted string data was then stored in an Excel file.

Your idea provided a feasible solution that significantly enhanced my process. Below is a breakdown of my VBS script and the string output:

Additional thanks to @Tamil Arasu10.

 

