In this session, we’ll look at two approaches for extracting text from a web application, including a unique application of the REST Web Services package to perform browser-less web scraping.
Video Recap:
- Recorder
- Use the Recorder in conjunction with the Automation Anywhere Chrome Extension to highlight objects on a webpage and extract them.
- The Recorder action requires an established browser session to interact with.
- Recorder can be a great option for use cases that need navigation across multiple pages or where objects on the page are loaded dynamically at runtime.
- REST Get
- Use the REST Get method to read the page's full HTML without the need for a browser. The response is returned as a dictionary, where Body is the key that holds the full HTML text.
- The String package can then be used to extract specific text from the REST response.
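For readers who think in code, the REST Get flow can be sketched in Python. This is only an analogy to the Automation Anywhere actions, not their implementation: `rest_get` and `text_between` are hypothetical helper names, the `Body` key simply mirrors the dictionary described above, and the sample HTML is made up so the example runs without a network connection.

```python
from urllib.request import urlopen


def rest_get(url, timeout=10.0):
    """Fetch a page without a browser and return it as a dictionary.

    Hypothetical helper: the 'Body' key mirrors the REST response
    dictionary described in the recap.
    """
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return {"Body": resp.read().decode(charset, errors="replace")}


def text_between(source, start, end):
    """Return the text between the first `start` marker and the next `end` marker.

    Returns an empty string if either marker is missing.
    """
    begin = source.find(start)
    if begin == -1:
        return ""
    begin += len(start)
    stop = source.find(end, begin)
    if stop == -1:
        return ""
    return source[begin:stop].strip()


# Made-up response standing in for a real REST Get result.
sample_response = {
    "Body": "<html><head><title> Quick Tips </title></head>"
            "<body><span id='price'>19.99</span></body></html>"
}

title = text_between(sample_response["Body"], "<title>", "</title>")
price = text_between(sample_response["Body"], "<span id='price'>", "</span>")
```

Against a live page, the same extraction would run on `rest_get(url)["Body"]` for a URL of your choosing. Because no browser is involved, content that is loaded dynamically at runtime will not appear in the response.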
Bonus Tip
When using the REST Get method to return the full HTML of a page, or the Recorder Capture action to return the innerHTML of an object, consider pairing it with the String Split action. Splitting on a repeating HTML element turns the markup into a list, which can be iterated through to extract the contents of repeating divs made to look like a table or of Bootstrap-style cards that repeat across a page.
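The split-then-iterate idea can be sketched in Python as follows. Again, this is an illustration rather than the String Split action itself: `split_cards` and `strip_tags` are hypothetical names, and the card markup and its `class="card"` delimiter are assumptions chosen for the example.

```python
import re


def split_cards(html, delimiter='<div class="card">'):
    """Split HTML on a repeating opening tag and return one chunk per element.

    The chunk before the first delimiter is page markup, not a card,
    so it is dropped.
    """
    return html.split(delimiter)[1:]


def strip_tags(fragment):
    """Replace remaining tags with spaces and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", fragment)
    return " ".join(text.split())


# Made-up markup: two Bootstrap-style cards inside a row.
sample_html = (
    '<div class="row">'
    '<div class="card"><h5>Alpha</h5><p>10</p></div>'
    '<div class="card"><h5>Beta</h5><p>20</p></div>'
    '</div>'
)

# Iterate through the list produced by the split and pull out each card's text.
cards = [strip_tags(chunk) for chunk in split_cards(sample_html)]
```

Choosing the delimiter is the main design decision: it should be a string that appears exactly once per repeating element, such as the element's opening tag with its class attribute.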
Be sure to stay tuned for more Quick Tips!