Google Sheets can double as a lightweight scraping tool. One use is a script that checks whether a given URL is indexed on Google by running a 'site:' search for it (for example 'site:domain.com'). The script runs through Apps Script, so the request is sent from Google's own servers and there's no need for proxies or other tools. The same code can be adapted to pull data from other sources too, such as Reddit.

Step #1: Open the Script Editor in Google Sheets (Extensions > Apps Script; older versions of Sheets list it under Tools > Script editor).

Step #2: Paste the code below into the script editor and save the file with Ctrl+S.
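function checkIfPageIsIndexed(url)
{
  // Build the "site:" query; encode it so characters like & or # in the URL stay inside q.
  var searchUrl = "https://www.google.com/search?q=" + encodeURIComponent("site:" + url);
  var options = {
    'muteHttpExceptions': true,  // return error responses instead of throwing
    'followRedirects': false
  };
  var response = UrlFetchApp.fetch(searchUrl, options);
  // A redirect, block, or CAPTCHA page is not a search result, so do not report it as indexed.
  if ( response.getResponseCode() !== 200 )
    return "Unable to check (HTTP " + response.getResponseCode() + ")";
  var html = response.getContentText();
  // Google's English "no results" sentence; the trailing dot is escaped to match literally.
  if ( html.match(/Your search -.*- did not match any documents\./) )
    return "URL is Not Indexed";
  return "URL is Indexed";
}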
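
Since the live check depends on whatever page Google sends back, it can help to try the URL-building and matching logic offline first. Here is a minimal sketch of that idea. The helper names `buildSiteQueryUrl` and `classifySearchHtml` are made up for illustration and are not part of the script above, and the match assumes the English wording of Google's "no results" message.

```javascript
// Hypothetical helpers (names are illustrative, not from the original script).

// Build the Google search URL for a "site:" query, encoding the target
// so characters such as "&", "#" or "?" stay inside the q parameter.
function buildSiteQueryUrl(url) {
  return "https://www.google.com/search?q=" + encodeURIComponent("site:" + url);
}

// Classify a results page by looking for the "no results" sentence.
// Assumption: the page is in English and uses the wording matched below.
function classifySearchHtml(html) {
  var notFound = /Your search -.*- did not match any documents\./;
  return notFound.test(html) ? "URL is Not Indexed" : "URL is Indexed";
}
```

Both functions are plain JavaScript, so they can be run outside Google Sheets with saved sample HTML before the fetch call is wired in.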

Step #3: Execute the Function in Google Sheets
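Once saved, the function can be called from a cell like any built-in formula, for example =checkIfPageIsIndexed(A2) with the URL in A2. When a custom function is given a whole range, Apps Script passes it a two-dimensional array instead of a single value, so checking a column in one formula needs a small wrapper. A rough sketch, where `checkIndexedForRange` and `CHECK_INDEXED` are hypothetical names chosen for this example:

```javascript
// Hypothetical wrapper; `checker` stands in for checkIfPageIsIndexed.
function checkIndexedForRange(input, checker) {
  // A single cell arrives as a plain value.
  if (!Array.isArray(input)) {
    return checker(input);
  }
  // A range arrives as rows of cells; return results in the same shape.
  return input.map(function (row) {
    return row.map(function (cell) {
      // Leave empty cells empty instead of querying Google for them.
      return cell === "" ? "" : checker(cell);
    });
  });
}

// Hypothetical entry point to call from a sheet, e.g. =CHECK_INDEXED(A2:A10).
// Assumes checkIfPageIsIndexed is defined in the same script project.
function CHECK_INDEXED(input) {
  return checkIndexedForRange(input, checkIfPageIsIndexed);
}
```

Each non-empty cell still triggers its own search request, so long columns may run into blocked or rate-limited responses.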

Step #4: Use the Output
Bonus: Implement Conditional Formatting to Highlight Cells based on Indexing Status
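The formatting can be set by hand under Format > Conditional formatting, or from the same script project. The sketch below assumes the status text is exactly what the function returns; the function name `highlightIndexStatus`, the range, and the colors are arbitrary choices for illustration.

```javascript
// Sketch: color status cells in the given A1 range.
// `sheet` is a Sheet object, e.g. SpreadsheetApp.getActiveSheet().
function highlightIndexStatus(sheet, a1Range) {
  var range = sheet.getRange(a1Range);

  // Red-ish background for URLs reported as not indexed.
  var notIndexedRule = SpreadsheetApp.newConditionalFormatRule()
    .whenTextEqualTo("URL is Not Indexed")
    .setBackground("#F4C7C3")
    .setRanges([range])
    .build();

  // Green-ish background for URLs reported as indexed.
  var indexedRule = SpreadsheetApp.newConditionalFormatRule()
    .whenTextEqualTo("URL is Indexed")
    .setBackground("#B7E1CD")
    .setRanges([range])
    .build();

  // Append to the existing rules so other formatting on the sheet is kept.
  var rules = sheet.getConditionalFormatRules();
  rules.push(notIndexedRule, indexedRule);
  sheet.setConditionalFormatRules(rules);
}
```

It could be run once from the script editor with something like highlightIndexStatus(SpreadsheetApp.getActiveSheet(), "B2:B100"), assuming the results sit in column B.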

The resulting output may resemble the following.