Quick Tip: Headless Web Scraping

August 26, 2020

Automation Anywhere Team

In this session, we’ll look at two approaches for extracting text from a web application, including a unique application of the REST Web Services package to perform browser-less web scraping.

Video Recap:

Recorder
1. Use the Recorder in conjunction with the Automation Anywhere Chrome Extension to highlight objects on a webpage and extract them.
2. The Recorder action requires an established browser session to interact with.
3. The Recorder can be a great option for use cases where some navigation through multiple pages is needed, or where objects on the page are loaded dynamically at runtime.
REST Get
1. Use the REST Get method to read the page's full HTML without the need for a browser. The response is returned as a dictionary, where Body is the key that holds the full HTML text.
2. The String package can then be used to extract specific text from the REST response.
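For readers who think in code, the REST Get approach can be sketched in plain Python. This is only an analogy, not the Automation Anywhere actions themselves: the function names rest_get and extract_between, the User-Agent header, and the sample HTML are all assumptions made for illustration. The one detail taken from the tip above is that the page HTML sits under the Body key of the response dictionary.

```python
from urllib.request import Request, urlopen


def rest_get(url: str, timeout: float = 10.0) -> dict:
    """Fetch a page without a browser (hypothetical helper).

    Returns a dictionary with a 'Body' key to mirror the shape
    described in the tip; the real action's response may hold more keys.
    """
    # Some sites reject requests without a User-Agent; this value is an assumption.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return {"Body": response.read().decode(charset, errors="replace")}


def extract_between(text: str, before: str, after: str) -> str:
    """Return the text between the first `before` marker and the next `after` marker."""
    start = text.find(before)
    if start == -1:
        return ""
    start += len(before)
    end = text.find(after, start)
    if end == -1:
        return ""
    return text[start:end].strip()


# Made-up HTML standing in for response["Body"], so the sketch runs offline.
sample_body = (
    "<html><head><title>Order Status</title></head>"
    "<body><span id='total'> $42.10 </span></body></html>"
)
title = extract_between(sample_body, "<title>", "</title>")
total = extract_between(sample_body, "<span id='total'>", "</span>")
print(title, total)
```

Against a live page, the same extraction would run on rest_get(url)["Body"]. Note that this only sees the HTML the server returns; content loaded dynamically by scripts will not be there, which is the case where the Recorder is the better fit.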

Bonus Tip

When using the REST Get method to return the full HTML of a page, or the Recorder Capture action to return the innerHTML of an object, consider pairing it with the String Split action. String Split turns repeating HTML elements into a list, which can then be iterated through to extract the contents of each element. This works well for repeating divs made to look like a table, or for Bootstrap-style cards that repeat across a page.
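The split-then-iterate idea can be sketched in Python as follows. Again, this is an illustration rather than the String Split action itself: the card markup, the delimiter, and the helper name text_between are assumptions, and a real page would need its own delimiter chosen from the repeating element's opening tag.

```python
def text_between(text: str, before: str, after: str) -> str:
    """Return the text between two markers, or '' if either is missing (hypothetical helper)."""
    start = text.find(before)
    if start == -1:
        return ""
    start += len(before)
    end = text.find(after, start)
    return text[start:end].strip() if end != -1 else ""


# Made-up markup: Bootstrap-style cards repeating inside a row.
html = (
    '<div class="row">'
    '<div class="card"><h5>Alpha</h5><p>10</p></div>'
    '<div class="card"><h5>Beta</h5><p>20</p></div>'
    "</div>"
)

# Split on the opening tag of the repeating element. The first piece is
# whatever came before the first card, so it is dropped.
delimiter = '<div class="card">'
chunks = html.split(delimiter)[1:]

# Each chunk now holds one card; pull the same fields out of every one.
rows = []
for chunk in chunks:
    name = text_between(chunk, "<h5>", "</h5>")
    value = text_between(chunk, "<p>", "</p>")
    rows.append((name, value))

print(rows)
```

Splitting on the opening tag keeps each card's contents together in one list item, so the per-item extraction stays the same no matter how many cards the page has.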

Be sure to stay tuned for more Quick Tips!
