Question

Scraping URLs from Google Search Results

  • August 14, 2024
  • 4 replies
  • 347 views

I’m trying to build a bot in A360 that searches for a string in Google and clicks on each of the links returned. The issue is that Google randomises the DOM path each time, so it’s almost impossible to loop through the links using a counter in the DOMXPath. I also looked at scraping the URLs from the page source, but they’re all embedded in JS. Any ideas?

 


  • Flight Specialist | Tier 4
  • August 15, 2024

DOMXPath is working consistently for me on Google searches. Try capturing the entire box at the top of a search result.

That has a DOMXPath of: //div[@id='rso']/div[3]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]

The first div index increments with each result, so I’ve inserted the row variable there: //div[@id='rso']/div[$nSearchResultsRow.Number:toString$]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]

Make sure the only things you are using for the object properties are the HTML Tag, the DOMX Path, and maybe the HTML HasFrame.

I’ve placed the recorder action in a loop that runs 5 times. I start with nSearchResultsRow equal to 3, since that seems to be the first result row, and increment it on each pass through the loop.
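For anyone who wants to see the row substitution outside A360, here is a minimal Python sketch of how the paths come out. The names XPATH_TEMPLATE and result_xpaths are made up for illustration, and {row} stands in for $nSearchResultsRow.Number:toString$. The start row of 3 and the count of 5 are just the values mentioned above, not anything Google guarantees.

```python
# Hypothetical template mirroring the DOMXPath quoted above; {row} plays the
# role of $nSearchResultsRow.Number:toString$ in the A360 expression.
XPATH_TEMPLATE = "//div[@id='rso']/div[{row}]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]"


def result_xpaths(first_row=3, count=5):
    """Return one XPath per search-result row, starting at first_row."""
    return [XPATH_TEMPLATE.format(row=row)
            for row in range(first_row, first_row + count)]


# Print the five paths the loop would visit (rows 3 through 7).
for xpath in result_xpaths():
    print(xpath)
```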

To get the URL I’m grabbing the “HTML InnerText” property which for example looks something like this:

Best Vegan Chocolates: Ideal for Plant-Based TreatsDallmann Confectionshttps://dallmannconfections.com › collections › vegan-c..

So you would need to use the string tools to isolate the URL out of there!
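As a rough illustration of that string step, this Python sketch pulls the first URL-like substring out of the InnerText sample above. The pattern and the function name extract_site are assumptions of mine, not A360 actions; in the bot you would do the equivalent with the String package or a regex. Note that the displayed path is truncated ("vegan-c.."), so only the site address before the first "›" can be recovered this way.

```python
import re

# Match from "http://" or "https://" up to the first whitespace character or
# the "›" breadcrumb separator shown in the result header.
URL_PATTERN = re.compile(r"https?://[^\s›]+")


def extract_site(inner_text):
    """Return the first URL-like substring in inner_text, or None if absent."""
    match = URL_PATTERN.search(inner_text)
    return match.group(0) if match else None


sample = ("Best Vegan Chocolates: Ideal for Plant-Based Treats"
          "Dallmann Confectionshttps://dallmannconfections.com › collections › vegan-c..")
print(extract_site(sample))
```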

Since the number of results will vary, you will need to error trap for when you run out of rows, at which point you click the “More Options” button to expand the results, and then the Next button. Note that clicking Next probably resets the rows, so you’ll need to set the variable back to 3 to start scraping again.


  • Author
  • Cadet | Tier 2
  • August 16, 2024

This is the first thing I tried, but the order of results can be random, i.e. if you enter a completely different search string, the DOMXPath index might sometimes be -1 and not follow an incremental order. I ended up going headless instead: a REST GET with the search URL as the URI, followed by string manipulation on the response to capture all the URLs.
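A sketch of what that string manipulation could look like in Python, for comparison. It assumes the response body is plain HTML and that result links appear either as absolute URLs or wrapped in a redirect of the form /url?q=<target>&... ; that redirect shape is an assumption about Google's markup and may not match what you get back. LinkCollector and extract_result_urls are hypothetical names, and the sample HTML is invented.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_result_urls(html):
    """Return absolute http(s) URLs found in html, unwrapping /url?q= links."""
    collector = LinkCollector()
    collector.feed(html)
    urls = []
    for href in collector.hrefs:
        parsed = urlparse(href)
        if parsed.path == "/url":
            # Assumed redirect form: the real target sits in the "q" parameter.
            target = parse_qs(parsed.query).get("q", [None])[0]
            if target and target.startswith(("http://", "https://")):
                urls.append(target)
        elif parsed.scheme in ("http", "https"):
            urls.append(href)
    return urls


# Invented sample standing in for the body of the REST GET response.
sample_html = (
    '<a href="/url?q=https://example.com/page&amp;sa=U">A</a>'
    '<a href="/search?q=x">B</a>'
    '<a href="https://example.org/">C</a>'
)
print(extract_result_urls(sample_html))
```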

Thanks for your suggestion.


CaptainPathfinder

That’s an interesting question, Niico_MoS !

I think the following community members may be able to help 

@ChanduMohammad 

@Zaid Chougle 

@Tamil Arasu10 

@Paul Hawkins 

@Padmakumar 


  • Cadet | Tier 2
  • June 9, 2026

The main issue is that Google search results are not stable for automation because the page structure keeps changing and most links are loaded dynamically using JavaScript, which makes DOM or XPath-based looping unreliable.

A better approach is to avoid scraping the Google page altogether and instead use a search API like Google Custom Search or SerpAPI, where you get clean results in JSON format and can easily loop through the URLs in A360.
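To give an idea of the API route, here is a small Python sketch against the Google Custom Search JSON API. The endpoint and the key, cx and q parameters are the documented ones, and results come back under "items" with a "link" field each. The function names, the placeholder key and engine ID, and the sample response body are made up for illustration. In A360 the same request would go through the REST Web Services GET action.

```python
import json
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_request_url(query, api_key, engine_id):
    """Build the GET URL for one search; api_key and engine_id are your own."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    return API_ENDPOINT + "?" + urlencode(params)


def links_from_response(body):
    """Return the "link" of every entry under "items" in a JSON response body."""
    data = json.loads(body)
    return [item["link"] for item in data.get("items", []) if "link" in item]


# Placeholder credentials; substitute real values before calling the API.
print(build_request_url("vegan chocolate", "YOUR_API_KEY", "YOUR_ENGINE_ID"))

# Invented, heavily trimmed response body used only to show the parsing step.
sample_body = json.dumps({
    "items": [
        {"title": "A", "link": "https://example.com/a"},
        {"title": "B", "link": "https://example.org/b"},
    ]
})
print(links_from_response(sample_body))
```

A query with no matches may come back without an "items" key at all, which is why the sketch falls back to an empty list.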

Another option is to use a web scraping API that handles page rendering and data extraction for you, such as the Geekflare Web Scraping API, which helps avoid issues with dynamic DOM structures.

If you still want to use the browser, a more workable method is to collect all visible links on the results page, filter out unwanted ones like Google internal links or ads, and then loop through only the valid website URLs.
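The filtering step could look roughly like this in Python. The blocked domain list is a guess at what counts as "Google internal" and would need adjusting, the function names are hypothetical, and the sketch makes no attempt to detect ads, which would need their own rule based on whatever markup they carry on the page.

```python
from urllib.parse import urlparse

# Hypothetical list of domains treated as Google-internal; extend as needed.
BLOCKED_DOMAINS = ("google.com", "googleusercontent.com", "gstatic.com")


def is_external_result(url):
    """True for absolute http(s) URLs whose host is not a blocked domain."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    if not host:
        return False
    return not any(host == domain or host.endswith("." + domain)
                   for domain in BLOCKED_DOMAINS)


def filter_links(urls):
    """Keep external result URLs, dropping duplicates but preserving order."""
    seen = set()
    kept = []
    for url in urls:
        if is_external_result(url) and url not in seen:
            seen.add(url)
            kept.append(url)
    return kept


# Invented list standing in for the links collected from the results page.
collected = [
    "https://www.google.com/preferences",
    "https://example.com/a",
    "/search?q=x",
    "https://example.com/a",
    "https://maps.google.com/x",
]
print(filter_links(collected))
```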