BrightonSEO – Command Line (Terminal) Hacks for SEO Cheat Sheet

May 4, 2018

On Friday 27th April 2018, I spoke at one of the UK’s largest SEO events, BrightonSEO. I kicked off the day as part of an awesome Technical SEO session, alongside Emily Mace from Oban Digital (who talked about utilising hreflang) and Tom Anthony from Distilled (who covered HTTP/2).

Tom Pool at Brighton SEO

This was the first time that I had done any public speaking (apart from getting up in front of people at school), and speaking at Brighton SEO was something that I had wanted to do since first coming to this event over two years ago. Being able to go up in front of an audience of peers and people that I look up to was a huge experience for me.

Tom Pool on stage at Brighton SEO

Thanks to Kelvin Newman and the Brighton SEO team for allowing me to speak, and for putting on such a fantastic event twice every year. From the conference itself to the pre and post event parties, the whole Brighton SEO experience is one of the best there is if you work in this industry.

Since my talk I’ve received a number of requests for more information about the command line hacks that I covered. As such, I’ve put together this “Cheat Sheet” of commands that were mentioned within my Brighton SEO Talk (“Command Line Hacks For SEO”).

These commands are for use within the Mac ‘Terminal’ or a Linux ‘Shell’.

How to access Terminal

To get to this tool on your Mac, go to your Applications, then open up Utilities, and then click on Terminal.

When you open up Terminal, you *should* see a window containing a command prompt.

You can then type the commands you want to use straight into here.

But what if you are on a Windows PC?

As highlighted by Tom Anthony, if you are on a Windows PC you can install Cygwin, which allows you to use these commands as if you were on a Mac/Linux computer.

For each of these commands I have included a short description of what the command does, along with the syntax and working examples that you can copy, edit and paste right into Terminal.

curl

curl is a command used to transfer data to or from a server. Below are the modifiers that I highlighted in my talk:

Basic usage

  • curl {URL that you want to test}
  • curl https://www.bluearray.co.uk

Download Screaming Frog (or other application) (-O) (Capital o)

  • curl -O {URL of the file that you want to download}

Check header response (-I) (Capital i)

  • curl -I {URL that you want header for}
  • curl -I https://www.bluearray.co.uk

Follow redirects (-L)

  • curl -L {URL that you want to test}
  • curl -L http://www.bluearray.co.uk

These commands can be used together, for example:

  • curl -I -L {URL that you want to test}
  • curl -I -L http://www.bluearray.co.uk
    (This would follow redirects & only show the header responses)

Bonus hacks

Using a different User Agent (-A):

  • curl -A "User-Agent" {URL that you want to test}
  • curl -A "Googlebot" https://www.bluearray.co.uk

For a full list of commands, type “man curl” or “info curl” into your terminal to bring up the manual page with an in depth exploration of the command & modifiers.
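As a rough sketch of how the modifiers above can be combined, the loop below reads a list of URLs and prints the final status code and final URL for each one after following redirects. The function name check_urls and the file name urls.txt are made up for illustration. -s, -o, -L and -w are standard curl options, and %{http_code} and %{url_effective} are curl's own write-out variables.

```shell
# check_urls is a hypothetical helper name; pass it a file with one URL per line.
check_urls() {
  while IFS= read -r url; do
    # Skip blank lines in the input file.
    [ -z "$url" ] && continue
    # -s silences the progress meter, -o /dev/null discards the page body,
    # -L follows redirects, -w prints the final status code and final URL.
    curl -s -o /dev/null -L -w '%{http_code} %{url_effective}\n' "$url"
  done < "$1"
}

# Example call (urls.txt is an assumed file name):
# check_urls urls.txt
```

Each output line pairs a status code with the URL that curl ended up on, so redirected URLs are easy to spot.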

sort

sort is a command that is used to sort data in a number of different ways, with the addition of modifiers:

Basic usage (Sorting A-Z On screen)

  • sort {name of the file that you want to sort}
  • sort keywordfile.csv

Basic usage (Sorting A-Z & printing results to file)

  • sort {filename} > {new filename}
  • sort keyworddata.csv > sortedkeyworddata.csv

Sorting by Z-A & printing results to file (-r)

  • sort -r {filename} > {new filename}
  • sort -r keywords.csv > reversesortedkeyworddata.csv

Sorting by Column 2 – by number, in reverse (-k, -t, -n, -r)

  • sort -k{column number} -t{column separator} -n -r {filename} > {new filename}
    (-n sorts by number; -r is optional and reverses the order)
  • sort -k2 -t, -n -r filename.csv > reversenumbersortedbycolumn2data.csv

Sorting By Number (so 10 comes after 9, not 1)

  • sort -n {filename.csv} > {numbersorteddata.csv}
  • sort -n keywords.csv > numbersortedkeywords.csv
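To see the column sort in action, here is a small worked example on invented data (the file names and numbers are placeholders, not real keyword data). It uses -k2,2 rather than -k2. This is a slightly stricter form that limits the sort key to column 2 only, whereas -k2 runs from column 2 to the end of the line.

```shell
# Build a small sample file (keyword,search volume); the data is invented.
printf '%s\n' 'seo audit,900' 'seo tools,12000' 'seo agency,5400' > keywords_sample.csv

# -t, sets the comma as the column separator, -k2,2 restricts the sort key
# to column 2 only, -n compares numerically and -r puts the largest first.
sort -t, -k2,2 -n -r keywords_sample.csv > keywords_by_volume.csv

cat keywords_by_volume.csv
```

Without -n, the rows would be compared as text, so 900 would be treated as larger than 12000.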

head/tail

By default, head displays the first 10 lines of a file, and tail displays the last 10 lines of a file:

Basic usage (head)

  • head {filename}
  • head data.csv

Basic usage (tail)

  • tail {filename}
  • tail data.csv

You can use the ‘>’ symbol to print the results of the tail/head to a new file:

  • head {filename} > {toptenrowsfilename}
  • head data.csv > topten.csv

Returning the top ‘x’ rows (-n)

  • head -n{number of rows you want} {filename} > {xrowsofdata}
  • head -n100 data.csv > top100rows.csv
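Most exports start with a header row, which is usually worth handling separately. The sketch below, on an invented sample file, uses tail -n +2 (start output at line 2) to drop the header, and ‘>>’ (append to a file instead of overwriting it) to build a small preview that keeps the header. All file names here are placeholders.

```shell
# Sample export with a header row; the contents are invented.
printf '%s\n' 'url,clicks' '/a,10' '/b,20' '/c,30' > export_sample.csv

# tail -n +2 starts output at line 2, so the header row is dropped.
tail -n +2 export_sample.csv > rows_only.csv

# head -n 1 keeps just the header; appending (>>) the first two data rows
# produces a small preview file that still has its header.
head -n 1 export_sample.csv > preview.csv
head -n 2 rows_only.csv >> preview.csv

cat preview.csv
```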

cat

cat stands for ‘concatenate’, and is a command that can be used to display files and combine them together:

Combining .csv files (make sure your .csv files are in one folder)

  • cat *.{fileextension} > {combineddatafile}
  • cat *.csv > combinedcsvfiles.csv
    (the * matches every file in the folder that ends in .csv)

Bonus hacks

Adding a line count at the start of a line:

  • cat -n {filename} > {linecountfilename}
  • cat -n file.csv > linecountfile.csv

Displaying Files On Screen (This is the main intended purpose of the command, and does not require any modifiers)

  • cat {filename}
  • cat file.csv
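When the files being combined each have their own header row, a plain cat repeats that header in the middle of the combined file. One possible way around this, sketched below on two invented files, is to write the header once and then append only the data rows from each file. The file names are placeholders, and the output name is deliberately chosen so that it does not match the part*.csv pattern.

```shell
# Two sample exports sharing the same header; file names and data are invented.
printf '%s\n' 'keyword,volume' 'seo audit,900' > part1.csv
printf '%s\n' 'keyword,volume' 'seo tools,12000' > part2.csv

# Write the header once, taken from the first file.
head -n 1 part1.csv > combined_keywords.csv

# Append every file's rows, skipping each file's own header line.
for f in part*.csv; do
  tail -n +2 "$f" >> combined_keywords.csv
done

cat combined_keywords.csv
```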

sed

sed stands for ‘Stream Editor’, and it allows you to filter and transform text. The different methods of filtering can be changed using the following modifiers.

Adding text to the start of a row:

  • sed -e 's/^/{Text That You Want To Add}/' {file} > {newfile.csv}
  • Make sure to escape any forward slashes in your text with a backslash (‘\’)!
  • sed -e 's/^/http:\/\/www.domain.com/' urlfile.csv > modifiedurlfile.csv

Find and replace:

  • sed -e 's/{Text You Are Searching For}/{Text To Replace With}/' {file} > {newfile}
  • sed -e 's/http:/https:/' httpurllist.csv > httpsurllist.csv
    (This replaces the first match on each line; add a g after the final / to replace every match)

Bonus hacks

Multiple find and replace using file of commands (-f):

  • sed -f {sedscriptname} < {filetorunsedon} > {manipulatedfilename}
  • sed -f sedscript < oldfile.csv > newfile.csv
    *sedscript contains directives like so:
    s/{texttoreplace1}/{texttoreplacewith1}/g
    s/{texttoreplace2}/{texttoreplacewith2}/g
    s/{texttoreplace3}/{texttoreplacewith3}/g
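As an illustration of the script-file approach, the sketch below builds a two-line sed script and runs it over an invented URL list. The domain, paths and file names are placeholders. It also uses | instead of / as the delimiter, which sed allows, so the forward slashes in the URLs do not need escaping.

```shell
# Sample URL list; the domain and paths are placeholders.
printf '%s\n' 'http://example.com/old-blog/post-1' 'http://example.com/old-blog/post-2' > oldurls_sample.csv

# The sed script: one substitution per line, applied in order to every row.
# Using | as the delimiter avoids having to escape the forward slashes.
cat > sedscript_sample <<'EOF'
s|^http:|https:|
s|/old-blog/|/blog/|g
EOF

sed -f sedscript_sample < oldurls_sample.csv > newurls_sample.csv

cat newurls_sample.csv
```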

awk

awk is a small programming language that runs within Terminal, and you can use it to process text and data.

Extracting Columns 1 and 2:

  • awk -F "{file separator}" '{print ${columns}}' {keywordfile} > {newfile}
  • awk -F "," '{print $1 "," $2}' semrushexport.csv > columns1and2.csv
    *you can just add in another "," ${column number} to print that column as well

Print line by pattern/match (404,418,503 etc):

  • awk '/{text or string to search for}/ {print $0}' {file to search in} > {file to put results in}
    ($0 means the full line)
  • awk '/404/ {print $0}' combinedlogs.log > log404.log
  • awk '/Googlebot/ {print $0}' combinedlogs.log > gbothits.log

Bonus hacks

Print lines over a certain length:

  • awk 'length($0) > {number}' {file to check} > {file to print data to}
  • awk 'length($0) > 115' urldata.csv > longurl.csv
  • Removing Duplicates (blog post – check it out!)
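One thing to watch with pattern matching is that /404/ matches the text "404" anywhere in the line, including inside a URL. A possible refinement, sketched below, is to compare a specific column instead. This assumes the common "combined" access log layout, where splitting on spaces puts the requested URL in column 7 and the status code in column 9. Your own logs may use a different layout, and the log lines here are invented.

```shell
# Two invented log lines in the common "combined" access log layout.
cat > access_sample.log <<'EOF'
66.249.66.1 - - [27/Apr/2018:09:00:01 +0000] "GET /page-404-guide HTTP/1.1" 200 5120 "-" "Googlebot/2.1"
66.249.66.1 - - [27/Apr/2018:09:00:02 +0000] "GET /missing HTTP/1.1" 404 312 "-" "Googlebot/2.1"
EOF

# /404/ would match both lines, because "404" also appears in the first URL.
# Comparing column 9 (the status code in this layout) matches only real 404s.
awk '$9 == 404 {print $7}' access_sample.log > urls_404.txt

cat urls_404.txt
```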

wc

wc is the word count utility that is built into terminal, and is one of the easier commands to use.

Count the number of lines (-l) (lowercase L):

  • wc -l {filename}
  • wc -l googlebot.log

Bonus hacks

Count the number of words:

  • wc -w {filename}
  • wc -w textfile.txt

Basic usage (no modifiers)

  • wc {file}
    *returns 3 columns – lines, words & bytes (for plain ASCII text, bytes and characters are the same).
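wc also works on the output of another command, not just on files. The sketch below uses a pipe (‘|’), which passes the output of one command straight into the next, to count Googlebot hits without creating an intermediate file. The log lines and file name are invented for the example.

```shell
# Three invented log lines; only two mention Googlebot.
printf '%s\n' \
  '1.1.1.1 "GET / HTTP/1.1" 200 "Googlebot/2.1"' \
  '2.2.2.2 "GET / HTTP/1.1" 200 "Mozilla/5.0"' \
  '1.1.1.1 "GET /blog HTTP/1.1" 200 "Googlebot/2.1"' > hits_sample.log

# awk prints the matching lines, and the pipe (|) hands them straight to wc -l.
# tr -d ' ' strips the padding that some versions of wc add around the number.
googlebot_hits=$(awk '/Googlebot/' hits_sample.log | wc -l | tr -d ' ')

echo "Googlebot hits: $googlebot_hits"
```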

Bonus command – Caffeinate

Have you ever needed to run a crawl on your local machine, only to come back a few hours later and find that your computer has gone to sleep? If you want to stop this from happening, this Mac-only command provides a dose of super caffeine to your machine, and prevents it from sleeping for as long as the command is running.

Just type ‘caffeinate’ into your terminal to run this command. To stop, just press ctrl+c.

Summary

All of the above command line hacks worked on my machine at the time of writing, and you should be able to use them via a simple copy and paste. I will also write a number of more in-depth guides to some of the more complex commands in the near future, so keep an eye out for those.

If you want further guidance around a particular command, or would like to see a particular process carried out through command line, please do not hesitate to get in contact with me.

 

Tom is Blue Array's resident technical SEO geek. When not educating people about the differences between a 404 and 410, you'll likely find him gaming until the early hours, or tearing up Berkshire's mountain bike trails.
