BrightonSEO – Command Line (Terminal) Hacks for SEO Cheat Sheet
Posted by Tom Pool on May 4, 2018
On Friday 27th April 2018, I spoke at one of the UK’s largest SEO events – BrightonSEO, kicking off the day as part of an awesome Technical SEO Session with Emily Mace from Oban Digital (who talked about utilising Hreflang) and Tom Anthony from Distilled (who covered HTTP/2) also speaking.
This was the first time that I have done any public speaking (apart from getting up in front of people at school), and Brighton SEO was something that I had wanted to do since first coming to this event over two years ago. Being able to go up in front of an audience of peers and people that I look up to was a huge experience for me.
Thanks to Kelvin Newman and the Brighton SEO team for allowing me to speak, and for putting on such a fantastic event twice every year. From the conference itself to the pre and post event parties, the whole Brighton SEO experience is one of the best there is if you work in this industry.
Since my talk I’ve received a number of requests for more information about the command line hacks that I covered. As such, I’ve put together this “Cheat Sheet” of commands that were mentioned within my Brighton SEO Talk (“Command Line Hacks For SEO”).
These commands are for use within Mac’s ‘Terminal’ or the Linux ‘Shell’.
How to access Terminal
To get to this tool on your Mac, go to your Applications, then open up Utilities, and then click on Terminal.
When you open up Terminal, you *should* see a window containing a command prompt.
You can then type the commands you want to use straight into it.
But what if you are on a Windows PC?
As highlighted by Tom Anthony, if you are on a Windows PC you can install Cygwin, which allows you to use these commands as if you were on a Mac/Linux computer.
For each of these commands I have included a short description of what the command does, along with the syntax and working examples that you can copy, edit and paste right into Terminal.
curl
curl is a command used to transfer data to or from a server. Below are the modifiers that I highlighted in my talk:
Basic usage
- curl {URL that you want to test}
- curl https://www.bluearray.co.uk
Download Screaming Frog (or other application) (-O)
- curl -O {URL of file you want to download}
- curl -O https://download.screamingfrog.co.uk/products/seo-spider/ScreamingFrogSEOSpider-9.2.dmg
Check header response (-I) (Capital i)
- curl -I {URL that you want header for}
- curl -I https://www.bluearray.co.uk
Follow redirects (-L)
- curl -L {URL that you want to test}
- curl -L http://www.bluearray.co.uk
These commands can be used together, for example:
- curl -I -L {URL that you want to test}
- curl -I -L http://www.bluearray.co.uk
(This would follow redirects & only show the header responses)
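When a URL passes through several redirects, -I and -L together print a full set of headers for every hop. Below is a minimal sketch of trimming that output down to just the status codes. The function name status_codes is made up for this example, and -s (silent mode, which hides curl's progress meter) is an extra curl modifier that was not part of my talk.

```shell
# Hypothetical helper: reads raw header text (as printed by `curl -sIL`)
# on standard input and prints the status code of each hop, one per line.
status_codes() {
  # Strip carriage returns, then print the second field of every
  # status line (e.g. "HTTP/1.1 301 Moved Permanently" -> 301).
  tr -d '\r' | awk '/^HTTP\// {print $2}'
}

# Usage (needs network access):
#   curl -sIL http://www.bluearray.co.uk | status_codes
```

Each line of the result is one hop in the chain, so a 301 followed by a 200 means a single redirect.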
Bonus hacks
Using a different User Agent (-A):
- curl -A "User-Agent" {URL that you want to test}
- curl -A "Googlebot" https://www.bluearray.co.uk
For a full list of modifiers, type “man curl” or “info curl” (without the quotes) into your terminal to bring up the manual page with an in-depth exploration of the command & modifiers.
sort
sort is a command that is used to sort data in a number of different ways, with the addition of modifiers:
Basic usage (Sorting A-Z On screen)
- sort {name of the file that you want to sort}
- sort keywordfile.csv
Basic usage (Sorting A-Z & printing results to file)
- sort {filename} > {new filename}
- sort keyworddata.csv > sortedkeyworddata.csv
Sorting by Z-A & printing results to file (-r)
- sort -r {filename} > {new filename}
- sort -r keywords.csv > reversesortedkeyworddata.csv
Sorting by Column 2 – by number, in reverse(-k -t, -n -r)
- sort -k{column number} -t{column separator} -n (sort by number) -r (optional, reverses the order) {filename} > {new filename}
- sort -k2 -t, -n -r filename.csv > reversenumbersortedbycolumn2data.csv
Sorting By Number (so 10 comes after 9, not 1)
- sort -n {filename.csv} > {numbersorteddata.csv}
- sort -n keywords.csv > numbersortedkeywords.csv
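To see the column sort in action, here is a small sketch using made-up keyword data (the file names and numbers are invented for illustration). One detail worth knowing: -k2 on its own sorts from column 2 to the end of the line, whereas -k2,2 restricts the sort to column 2 only, which is usually what you want when a file has more than two columns.

```shell
# Made-up sample data: keyword in column 1, search volume in column 2.
demo=$(mktemp -d)
printf 'seo tools,900\nseo audit,1200\nseo agency,90\n' > "$demo/keywords.csv"

# -t,     split columns on commas
# -k2,2   sort on column 2 only (start and end at field 2)
# -n -r   compare as numbers, largest first
sort -t, -k2,2 -n -r "$demo/keywords.csv" > "$demo/sortedbyvolume.csv"

cat "$demo/sortedbyvolume.csv"
# seo audit,1200
# seo tools,900
# seo agency,90
```

Without -n, the same data would be compared as text, and 90 would be treated as larger than 1200.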
head/tail
head displays the first 10 lines of a file by default, and tail displays the last 10 lines:
Basic usage (head)
- head {filename}
- head data.csv
Basic usage (tail)
- tail {filename}
- tail data.csv
You can use the ‘>’ symbol to print the results of the tail/head to a new file:
- head {filename} > {toptenrowsfilename}
- head data.csv > topten.csv
Returning the top ‘x’ rows (-n)
- head -n(number of rows you want) {filename} > {xrowsofdata}
- head -n100 data.csv > top100rows.csv
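head and tail become more useful when chained with sort using a pipe (|). The sketch below, built on made-up data, keeps the header row of a CSV and then returns the top two rows by volume. It relies on tail -n +2, which prints everything from line 2 onwards instead of the last few lines. The file names and figures are assumptions for illustration only.

```shell
demo=$(mktemp -d)
printf 'keyword,volume\nseo tools,900\nseo audit,1200\nseo agency,90\n' > "$demo/data.csv"

{
  head -n 1 "$demo/data.csv"        # keep the header row
  tail -n +2 "$demo/data.csv" |     # every row from line 2 onwards
    sort -t, -k2,2 -n -r |          # highest volume first
    head -n 2                       # top 2 rows
} > "$demo/top2.csv"

cat "$demo/top2.csv"
# keyword,volume
# seo audit,1200
# seo tools,900
```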
cat
cat stands for ‘concatenate’, and is a command that can be used to display files and combine them together:
Combining .csv files (make sure your .csv files are in one folder)
- cat *.{fileextension} > {combineddatafile}
- cat *.csv > combinedcsvfiles.csv
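One thing to watch with the combine command: if the combined file is saved into the same folder with the same extension, it will match the *.csv pattern the next time you run the command. A minimal sketch of one way around this, assuming your exports sit in their own folder (the folder and file names here are made up), is to write the combined file somewhere else:

```shell
demo=$(mktemp -d)
mkdir "$demo/exports"
printf 'seo tools,900\n' > "$demo/exports/january.csv"
printf 'seo audit,1200\n' > "$demo/exports/february.csv"

# The shell expands *.csv to every matching file, in alphabetical order.
# Writing the result outside the exports folder keeps it out of that list
# if the command is run again.
cat "$demo"/exports/*.csv > "$demo/combined.csv"

cat "$demo/combined.csv"
# seo audit,1200
# seo tools,900
```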
Bonus hacks
Adding a line count at the start of a line:
- cat -n {filename} > {linecountfilename}
- cat -n file.csv > linecountfile.csv
Displaying Files On Screen (This is the main intended purpose of the command, and does not require any modifiers)
- cat {filename}
- cat file.csv
sed
sed stands for ‘Stream Editor’, and it allows you to filter and transform text. The different methods of filtering can be changed using the following modifiers.
Adding text to the start of a row:
- sed -e 's/^/{Text That You Want To Add}/' {file} > {newfile.csv}
- Make sure to escape any forward slashes in your text using a backslash (‘\’)!
- sed -e 's/^/http:\/\/www.domain.com/' urlfile.csv > modifiedurlfile.csv
Find and replace:
- sed -e 's/{Text You Are Searching For}/{Text To Replace With}/' {file} > {newfile}
- sed -e 's/http:/https:/' httpurllist.csv > httpsurllist.csv
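Escaping every slash in a URL gets fiddly quickly. sed accepts other characters as the delimiter in place of /, so a sketch like the one below (using made-up paths) avoids the backslashes altogether by using | instead. The second command also adds ^ to the search so that only an http: at the very start of a line is replaced.

```shell
demo=$(mktemp -d)
printf '/blog/\n/contact/\n' > "$demo/urlfile.csv"

# Using | as the delimiter means the slashes in the URL need no escaping.
sed -e 's|^|http://www.domain.com|' "$demo/urlfile.csv" > "$demo/modifiedurlfile.csv"

# Anchoring with ^ limits the replacement to the start of each line.
sed -e 's|^http:|https:|' "$demo/modifiedurlfile.csv" > "$demo/httpsurllist.csv"

cat "$demo/httpsurllist.csv"
# https://www.domain.com/blog/
# https://www.domain.com/contact/
```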
Bonus hacks
Multiple find and replace using file of commands (-f):
- sed -f {sedscriptname} < {filetorunsedon} > {manipulatedfilename}
- sed -f sedscript < oldfile.csv > newfile.csv
*sedscript contains directives like so:
s/{texttoreplace1}/{texttoreplacewith1}/g
s/{texttoreplace2}/{texttoreplacewith2}/g
s/{texttoreplace3}/{texttoreplacewith3}/g
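Here is a small end-to-end sketch of the script-file approach, using invented URLs and an invented pair of replacements. It creates the sedscript file, runs it against a file, and shows the result:

```shell
demo=$(mktemp -d)
printf 'http://www.domain.com/old-blog/post-1\nhttp://www.domain.com/old-blog/post-2\n' > "$demo/oldfile.csv"

# Each line of the script is one find and replace, applied in order.
cat > "$demo/sedscript" <<'EOF'
s/http:/https:/g
s/old-blog/blog/g
EOF

sed -f "$demo/sedscript" < "$demo/oldfile.csv" > "$demo/newfile.csv"

cat "$demo/newfile.csv"
# https://www.domain.com/blog/post-1
# https://www.domain.com/blog/post-2
```

The g at the end of each directive replaces every match on a line, not just the first one.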
awk
awk is a small programming language that runs in the terminal, and you can use it to process text and data.
Extracting Columns 1 and 2:
- awk -F "{field separator}" '{print ${columns}}' {keywordfile} > {newfile}
- awk -F "," '{print $1 "," $2}' semrushexport.csv > columns1and2.csv
*you can add another "," ${column number} to print that column as well
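If you are pulling out several columns, typing "," between each one gets repetitive. A minimal sketch of an alternative is to set awk's output separator (OFS) once. The column names and values below are invented for illustration and are not a real export format. This simple comma split also assumes that no field contains a comma inside quotes.

```shell
demo=$(mktemp -d)
printf 'keyword,volume,cpc\nseo tools,900,2.10\nseo audit,1200,3.40\n' > "$demo/semrushexport.csv"

# -F, splits input on commas; OFS="," joins the printed columns with commas.
awk -F, 'BEGIN { OFS = "," } { print $1, $3 }' "$demo/semrushexport.csv" > "$demo/columns1and3.csv"

cat "$demo/columns1and3.csv"
# keyword,cpc
# seo tools,2.10
# seo audit,3.40
```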
Print line by pattern/match (404,418,503 etc):
- awk '/{text or string to search for}/ {print $0}' {file to search in} > {file to put results in}
- awk '/404/ {print $0}' combinedlogs.log > log404.log
- awk '/Googlebot/ {print $0}' combinedlogs.log > gbothits.log
*$0 stands for the full line
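A pattern like /404/ matches those characters anywhere on the line, including inside a URL. The sketch below shows a stricter version that checks a single column. It assumes a log layout where the status code is the 9th space-separated field and the requested URL is the 7th. Your own logs may differ, so check a line first. The two log lines are made up for the example.

```shell
demo=$(mktemp -d)
cat > "$demo/combinedlogs.log" <<'EOF'
66.249.66.1 - - [27/Apr/2018:09:00:00 +0000] "GET /missing-page HTTP/1.1" 404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.1 - - [27/Apr/2018:09:00:05 +0000] "GET /error-404-guide HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
EOF

# /404/ matches "404" anywhere on the line, so both lines are returned.
awk '/404/ {print $0}' "$demo/combinedlogs.log" > "$demo/anymatch.log"

# Assumption: field 9 is the status code and field 7 is the URL.
# Only the request that really returned a 404 is printed.
awk '$9 == 404 {print $7}' "$demo/combinedlogs.log" > "$demo/urls404.txt"

cat "$demo/urls404.txt"
# /missing-page
```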
Bonus hacks
Print lines over a certain length:
- awk 'length(${line}) > {number}' {file to check} > {file to print data to}
- awk 'length($0) > 115' urldata.csv > longurl.csv
- Removing Duplicates (blog post – check it out!)
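For reference, one widely used awk idiom for removing duplicate lines is sketched below. It is shown here as a general example on made-up data, and is not necessarily the method used in the linked post. The array name seen is arbitrary.

```shell
demo=$(mktemp -d)
printf '/blog/\n/contact/\n/blog/\n' > "$demo/urls.txt"

# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++
# is true only for first occurrences, and awk prints those lines.
awk '!seen[$0]++' "$demo/urls.txt" > "$demo/uniqueurls.txt"

cat "$demo/uniqueurls.txt"
# /blog/
# /contact/
```

Unlike sorting first, this keeps the lines in their original order.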
wc
wc is the word count utility that is built into terminal, and is one of the easier commands to use.
Count the number of lines (-l) (lowercase L):
- wc -l {filename}
- wc -l googlebot.log
Bonus hacks
Count the number of words:
- wc -w {filename}
- wc -w textfile.txt
Basic usage (no modifiers)
- wc {file}
*returns 3 columns: lines, words & bytes (for plain ASCII text, bytes and characters are the same).
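wc also works on the end of a pipe, which saves writing an intermediate file when all you want is a number. A minimal sketch, using three shortened, made-up log lines, that counts Googlebot hits directly:

```shell
demo=$(mktemp -d)
cat > "$demo/combinedlogs.log" <<'EOF'
"GET /a HTTP/1.1" 200 "Googlebot/2.1"
"GET /b HTTP/1.1" 200 "Mozilla/5.0"
"GET /c HTTP/1.1" 404 "Googlebot/2.1"
EOF

# Reading from a pipe, wc -l prints just the number (no filename).
# tr -d ' ' removes the padding spaces that some versions of wc add.
hits=$(awk '/Googlebot/' "$demo/combinedlogs.log" | wc -l | tr -d ' ')
echo "Googlebot hits: $hits"
# Googlebot hits: 2
```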
Bonus command – Caffeinate
Have you ever needed to run a crawl on your local machine, only to come back a few hours later and find that your computer has gone to sleep? If you want to stop this from happening, this Mac-only command provides a dose of super caffeine to your machine, and prevents it from sleeping for as long as the command is running.
Just type ‘caffeinate’ into your terminal to run this command. To stop, just press ctrl+c.
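caffeinate can also be given another command to run, in which case it keeps the machine awake only until that command finishes. The sketch below wraps this in a small function so the same line also works on Linux, where caffeinate does not exist. The function name keep_awake is made up for this example, and the echo stands in for a long-running job.

```shell
# Hypothetical helper: run a command under caffeinate where it exists
# (macOS), and run it normally everywhere else.
keep_awake() {
  if command -v caffeinate >/dev/null 2>&1; then
    # -i prevents idle sleep; caffeinate exits when the command finishes.
    caffeinate -i "$@"
  else
    "$@"
  fi
}

keep_awake echo "long job finished"
```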
Summary
All of the above command line hacks worked on my machine at the time of writing, and you should be able to use them via a simple copy and paste. I will also write a number of more in-depth guides to some of the more complex commands in the near future, so keep an eye out for those.
If you want further guidance around a particular command, or would like to see a particular process carried out through command line, please do not hesitate to get in contact with me.