Brighton SEO April 2023 – Basic Automation Hacks for SEO
Article Highlights
On the Thursday of BrightonSEO April 2023, I spoke on the main stage on the topic of ‘Basic Automation Hacks for SEO’. This was a massive personal goal for me, and something I’ve wanted to do since first coming to the conference over 5 years ago. I really enjoyed both creating and delivering this talk, and hope that I offered value to those attending. It helps that a lot of what I spoke about is stuff that I use on a frequent basis as part of my role at Blue Array.
There were a number of different hacks that I covered in the talk, each of which is detailed a little further below:
- Screaming Frog Crawl Scheduling
- Google Sheets
- Formulae
- Macros
- Apps Script
- Python
- Automator
- AI
I’ve provided a number of links to further resources from within this post, as well as full scripts that can be used as part of your own automations.
There’s a link to the full slides here, if you’d like to see them, but this post should cover all your needs!
Screaming Frog Crawl Scheduling
Screaming Frog, everyone’s favourite crawling tool, has a mode that allows you to schedule crawls to run at a pre-set time. We use this at Blue Array for a number of cases – mostly to run crawls overnight, when fewer people are using the site and our computers aren’t busy running calls and other work!
There’s official documentation from the folks at Screaming Frog on Crawl Scheduling, which covers a lot of the details.
There is a huge amount of customisation available within the Scheduling option for the spider, which allows you to configure your crawl exactly how you want. You can also choose to have the scheduled crawl run as a one-off, or on a monthly, weekly or daily basis.
It’s also worth mentioning that you can export your crawls straight into Google Sheets directly from Screaming Frog, which allows for automatic report creation. More on this can be found in the official documentation on automatic reporting on the Screaming Frog website.
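If you’d rather trigger crawls from your own scheduler instead of the built-in Scheduling option, the Screaming Frog command line can run crawls headlessly. Below is a minimal Python sketch of that idea – it assumes the CLI launcher is available on your PATH as screamingfrogseospider (the Linux binary name; macOS and Windows launcher paths differ) and that the --crawl, --headless, --save-crawl and --output-folder flags from the official command line documentation are available in your version.

import subprocess
from datetime import datetime

# Assumptions: the Screaming Frog CLI is on the PATH as "screamingfrogseospider"
# and the flags below match those in the official command line documentation.
SITE = "https://www.example.com/"   # hypothetical site to crawl
OUTPUT_FOLDER = "/path/to/crawls"   # hypothetical output location

def run_headless_crawl(site, output_folder):
    # Kick off a headless crawl and save the crawl file to the output folder
    command = [
        "screamingfrogseospider",
        "--crawl", site,
        "--headless",
        "--save-crawl",
        "--output-folder", output_folder,
    ]
    print(f"{datetime.now():%Y-%m-%d %H:%M} – starting crawl of {site}")
    subprocess.run(command, check=True)

if __name__ == "__main__":
    run_headless_crawl(SITE, OUTPUT_FOLDER)

You could then point a cron job (or a scheduled task on Windows) at this script to run overnight, in the same spirit as the built-in scheduler.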
Google Sheets
I previously delivered a talk on Google Sheets for SEO as part of London SEO XL in 2022. Within this talk, I covered a wide range of formulae that can help speed up a number of different SEO processes. I touched on identifying duplicates, using Index Match, Regex, Translate, and more.
Formulae
I mentioned a single formula hack in the automation talk that can save (and has saved) countless hours of data manipulation within Sheets: extracting text from a URL based on its position. To do this, I used the SPLIT and INDEX formulae.
Potential applications for this are pretty wide-ranging, including extracting UIDs, product names, categories and subcategories… Let’s look at an example URL:
https://www.argos.co.uk/browse/home-and-furniture/wardrobes/c:797726/
This URL follows a format that is common on the Argos website:
- We first have the domain name – argos.co.uk
- Then /browse/ – this is the main section of the site where we can browse high level category pages & subcategories
- Then, we have the high-level category – /home-and-furniture/
- Then, we have the subcategory – /wardrobes/
- Finally, there is a UID – /c:797726/
Say, for instance, we wanted to extract the ‘home-and-furniture’ section from the URL. To do this, we have to use the =SPLIT formula, to split the full URL up into manageable sections. Splitting by the “/” is going to be the easiest way of doing this, as each slash within this URL signifies its own section.
The formula we should then use in ‘B2’, if our first URL is in cell ‘A2’, is =SPLIT(A2, "/"). This will split the URL out into a number of different columns, as shown in the below image:
While we could copy the column that we want and paste it into another sheet, a neater approach is to wrap the =SPLIT in an INDEX formula, so we get only the column that we want and not the other data. To pull out the ‘home-and-furniture’ section from the URL, we need to reference the 4th column of data – as that’s where the =SPLIT has placed it.
In cell B2, we should update our formula to =INDEX(SPLIT(A2, "/"), 4). This will then extract only the 4th column of data from our initial SPLIT, as shown below:
We can change the ‘4’ to a ‘5’ within the above formula to get the subcategory that appears within the URL:
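If you ever need to do the same extraction outside of Sheets – say on a crawl export with hundreds of thousands of URLs – the same split-and-pick-a-position idea is only a few lines of Python. This is just a sketch of the equivalent logic, using the example URL from above; the url_segment helper is purely illustrative.

# The same SPLIT + INDEX idea in plain Python: split each URL on "/" and
# pick a segment by position (empty segments are dropped, as SPLIT does by default).
urls = [
    "https://www.argos.co.uk/browse/home-and-furniture/wardrobes/c:797726/",
]

def url_segment(url, position):
    # Return the Nth non-empty segment of a URL (1-based, like the INDEX column number)
    segments = [part for part in url.split("/") if part]
    return segments[position - 1] if len(segments) >= position else ""

for url in urls:
    print(url_segment(url, 4))  # -> "home-and-furniture"
    print(url_segment(url, 5))  # -> "wardrobes"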
Macros
Beyond formulae, there are many cool things that we can do within Google Sheets. One of those things is using Macros. There have been many cases where we’ve had to apply consistent formatting to a number of tabs within a spreadsheet, or where we have had to make consistent edits on a weekly basis (some reporting, for instance).
Macros allow you to perform those same tasks across different sheets – for example, freezing the top row, making it bold, and then editing the font and colour of the text.
The best way to start using Macros is to use the ‘Record Macro’ Function. This can be found in the ‘Extensions > Macros > Record Macro’ menu.
This will allow you to play around with the function, and get to understand it a bit more. As soon as you click on ‘Record Macro’, you’ll see the following within your sheet:
When you’ve carried out all the actions that you’d like to be recorded, click on ‘Save’. You can give it a custom name, so be sure to be descriptive! You’ll then be able to access the Macro from within the same ‘Macros’ menu from where you started recording.
Google Apps Script
Google Apps Script is a great way of getting into the more advanced side of Google Sheets usage, and will allow you to perform data analysis and manipulation that’s simply not possible with formulae alone.
A good starting point for Apps Script is looking further into any Macros that you may have recorded, and seeing the code that creates them. For example, a simple macro that freezes the top row, applies bold formatting and changes the font size to ‘14’ has the following code:
function freezeAndBoldTopRow() {
  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheets = spreadsheet.getSheets();
  for (var i = 0; i < sheets.length; i++) {
    var sheet = sheets[i];
    sheet.setFrozenRows(1);
    sheet.getRange(1, 1, 1, sheet.getLastColumn()).setFontWeight('bold').setFontSize(14);
  }
}
You can also use Google Apps Script within Google Slides. For example, if you wanted to set consistent formatting for all slides within a slide deck, you could use the following code:
function applyFontToAllSlides() {
  var presentation = SlidesApp.getActivePresentation();
  var slides = presentation.getSlides();
  var fontName = "Arial"; // Replace with the desired font name
  var fontSize = 12; // Replace with the desired font size

  // Define the slide numbers to exclude from the formatting changes
  var excludeSlides = [1, 2, 3]; // Replace with the slide numbers that should be skipped

  // Loop through all slides
  for (var i = 0; i < slides.length; i++) {
    // Skip slides that are in the excludeSlides array
    if (excludeSlides.indexOf(i + 1) !== -1) {
      continue;
    }

    // Set the font name and size for the text in every shape on the slide
    var shapes = slides[i].getShapes();
    for (var j = 0; j < shapes.length; j++) {
      var text = shapes[j].getText();
      if (!text.isEmpty()) {
        text.getTextStyle().setFontFamily(fontName).setFontSize(fontSize);
      }
    }
  }
}
Note the area that allows you to exclude certain slides – useful if you’d like to leave a few slides untouched where you already have custom formatting in place!
For further details on all things Google Sheets, I’d recommend signing up for Ben L Collins’ newsletter – he shares a weekly update with a wide range of different Sheets formulae and advanced usage tips. He has a number of great courses, too!
Python
There’s a huge amount of stuff that you can do with Python – within the talk, I focused on a specific use case:
“I have 100,000 keywords & I need their monthly Search Volume”
Using the SEMRush API, we can query for large numbers of keywords, and the Search Volume associated with them. The script that I mentioned, and used, can be seen below:
import csv
import requests
import concurrent.futures
import time

# Enter API key
api_key = "xxxxxxxxxxxxxxxxxxxxx"

# Enter database
database = "uk"

# Enter date (YYYYMM15)
date = 20230415

# Set up the SEMRush API endpoint
url = "https://api.semrush.com/"

# Define the shared parameters for the API request
params = {
    "type": "phrase_all",
    "key": api_key,
    "database": database,
    #"export_columns": "Ph,Nq,Hs",
    #"display_date": date,
}

# Define a function to make an API request for a single keyword
def make_api_request(keyword):
    # Copy the shared parameters so concurrent requests don't overwrite each other
    request_params = dict(params, phrase=keyword)
    with requests.Session() as session:
        response = session.get(url, params=request_params)
    return response.text.splitlines()

# Open the sample file containing the list of keywords (csv)
with open("//Users/tom/Documents/sample/all_keywords_only.csv", "r") as csv_file:
    csv_reader = list(csv.DictReader(csv_file))
    num_keywords = len(csv_reader)  # count the number of keywords

# Open the output csv file and write a header row
with open("//Users/tom/Documents/sample/Full_April_UK_Search_Volume.csv", "w", newline="") as output_file:
    csv_writer = csv.writer(output_file)
    csv_writer.writerow(["Date", "Database", "Keyword", "Search Volume"])

    # Loop through the keywords in the input and make an API request for each one
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Map each future back to its keyword, as results complete out of order
        future_to_keyword = {
            executor.submit(make_api_request, row["Keyword"]): row["Keyword"]
            for row in csv_reader
        }
        start_time = time.monotonic()
        num_processed = 0
        for future in concurrent.futures.as_completed(future_to_keyword):
            keyword = future_to_keyword[future]
            lines = future.result()
            data = csv.DictReader(lines, delimiter=";")
            for row in data:
                try:
                    search_volume = row["Search Volume"]
                    date1 = row["Date"]
                    csv_writer.writerow([date1, database, keyword, search_volume])
                except KeyError:
                    print(f"Skipping row '{keyword}' with missing 'Search Volume' column")
            num_processed += 1
            if num_processed % 100 == 0:  # print progress every 100 keywords
                elapsed_time = time.monotonic() - start_time
                print(f"Processed {num_processed}/{num_keywords} keywords in {elapsed_time:.1f} seconds")

        elapsed_time = time.monotonic() - start_time
        print(f"Finished processing {num_processed}/{num_keywords} keywords in {elapsed_time:.1f} seconds")
This script takes a provided list of keywords from a .csv file (with a ‘Keyword’ column header), queries SEMRush for the monthly search volume in a specific regional database (in this case the UK), and then writes the results to a final .csv file. Progress is printed to the screen as the script runs, so you know how far through it is. The script can query roughly 100 keywords every 3 seconds, depending on your system. Further detail on the SEMRush API can be found within their official documentation.
After collating all the data from the API, we’re faced with another problem – this amount of data is not going to be easy to wrangle within Sheets or Excel. The end goal is to be able to show the data within a custom report in Google Looker Studio.
This is where Google BigQuery comes into play. We can use BigQuery to house enormous amounts of data, and as we’re building a dashboard within Looker Studio, all we have to do is change our data source from a Google Sheet to the BigQuery table. You can also perform data manipulation within BigQuery before it reaches Looker Studio, allowing for faster performance of your reports.
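As a rough illustration of that step, here’s a minimal Python sketch that loads the CSV produced by the script above into a BigQuery table using the google-cloud-bigquery client library. The project, dataset and table names are hypothetical, and it assumes you’ve already authenticated (for example with Application Default Credentials) and created the dataset.

from google.cloud import bigquery

# Hypothetical destination table – replace with your own project, dataset and table
table_id = "my-project.seo_data.search_volume_april_uk"

# The client picks up Application Default Credentials
client = bigquery.Client()

# Let BigQuery detect the schema from the CSV header row
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,  # replace the table on each run
)

with open("//Users/tom/Documents/sample/Full_April_UK_Search_Volume.csv", "rb") as source_file:
    load_job = client.load_table_from_file(source_file, table_id, job_config=job_config)

load_job.result()  # wait for the load job to finish
table = client.get_table(table_id)
print(f"Loaded {table.num_rows} rows into {table_id}")

Once the data is in BigQuery, you simply point Looker Studio at that table as the data source.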
For more information about BigQuery, there’s a great introduction and course on it directly from Google, in this CloudSkillsBoost course.
Automator
If you have an Apple Mac, you will have access to a fantastic in-built automation platform – Automator. This is a great way of automating a lot of basic processes, for example:
- Rename a large number of files
- Resize images
- Combine PDFs
- Schedule tasks to run, or applications to open, at certain times
While this is an application that is rarely talked about, it’s something that is incredibly useful and valuable to have in your toolset.
‘AI’
Within the talk, I finally touched on AI, and the wide-ranging impact that it can have on automation. For example, you can use ChatGPT to help you write Apps Script code, or to provide ideas for a number of different automations. It’s not going to replace the role of humans in SEO (yet), but it should be treated like an incredibly enthusiastic junior developer: you need to provide the right prompts, and be able to guide it when it goes wrong.
There’s a huge amount of automation potential out there, increased exponentially by the recent advances in AI technology. The goal of the automation covered in this talk is not to take people’s jobs, but to automate the boring stuff and allow humans to spend more time on the things that they actually enjoy doing. At Blue Array, we use automation to carry out a number of different tasks, allowing us to provide the best possible customer service to our clients.