SEO XPath Cheatsheet


XPath (XML Path Language) is a language that helps you navigate through an XML document using a path-like syntax. Even though this might sound like gibberish, XPath has been one of the most used tools in my arsenal for data extraction, competitor research and keyword research.

Not only that, but in conjunction with Google Sheets and Screaming Frog, XPath has made me faster, more productive and better at what I do.

Here are some examples of XPath attribute selectors I use daily.

Meta Description: //meta[@name='description']/@content
Canonical URL: //link[@rel='canonical']/@href
Robots Meta Tags: //meta[@name='robots']/@content
All links on a page: //@href
Any element having the class “example”: //*[@class='example']
Any element having the id “example”: //*[@id='example']
All hrefs of an a tag within a specific class: //*[@class='example']//a/@href

As the examples above show, XPath works the same way with any HTML element. You can pull all the list items from within an unordered list (//ul/li) or, more specifically, the link URLs of all the list items within an unordered list that lives inside a div with a given class or id (//div[@class='class']//ul/li/a/@href).
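To see these selectors run outside a browser or a spreadsheet, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. Treat it as an illustration only: the sample page, the `attr` helper and the URLs are made up for the example, the markup is assumed to be well-formed XML (real-world HTML often is not), and ElementTree supports only a subset of XPath, so paths start with `.` and the attribute is read with `.get()` rather than a trailing `/@content`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page used only for this illustration.
PAGE = """<html>
  <head>
    <meta name="description" content="A short page summary."/>
    <meta name="robots" content="noindex, follow"/>
    <link rel="canonical" href="https://example.com/page"/>
  </head>
  <body>
    <div class="example"><a href="/one">One</a><a href="/two">Two</a></div>
    <a href="/three">Three</a>
  </body>
</html>"""

root = ET.fromstring(PAGE)


def attr(path, name):
    """Return attribute `name` of the first element matching `path`, or None."""
    node = root.find(path)
    return node.get(name) if node is not None else None


# Roughly equivalent to //meta[@name='description']/@content
description = attr(".//meta[@name='description']", "content")
# Roughly equivalent to //link[@rel='canonical']/@href
canonical = attr(".//link[@rel='canonical']", "href")
# Roughly equivalent to //meta[@name='robots']/@content
robots = attr(".//meta[@name='robots']", "content")
# Roughly equivalent to //@href (every element that carries an href)
all_hrefs = [el.get("href") for el in root.findall(".//*[@href]")]
# Roughly equivalent to //*[@class='example']//a/@href
example_hrefs = [a.get("href") for a in root.findall(".//*[@class='example']//a")]

print(description, canonical, robots, all_hrefs, example_hrefs, sep="\n")
```

Note that the "all links" selector also returns the canonical URL, because //@href matches the href attribute on any element, not just on a tags.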

How to use XPath in Google Sheets

If you are wondering how you can use XPath in Google Sheets, then =IMPORTXML is your friend. It takes a URL and an XPath query as its two arguments. Here’s a fun little SEO XPath Cheatsheet that I created for you to use.


Simply replace the URL in the top left corner with the one you want to retrieve the above info for, and everything will happen automagically. In the meantime, click on the fields to check the XPath I used, and maybe use it in your own projects as well.

Feel free to copy the spreadsheet and use it as a guide.
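If you keep many of these formulas in one sheet, it can help to generate them instead of typing each one. The sketch below builds IMPORTXML formula strings in Python. The `importxml_formula` helper and the cell reference A1 are hypothetical names chosen for this example, and it assumes the XPath uses single quotes so that it can sit inside the formula's double quotes.

```python
def importxml_formula(url_cell: str, xpath: str) -> str:
    """Build a Google Sheets IMPORTXML formula for the URL stored in url_cell."""
    if '"' in xpath:
        # Double quotes would end the formula's string argument early.
        raise ValueError("use single quotes inside the XPath")
    return f'=IMPORTXML({url_cell}, "{xpath}")'


# Selectors taken from the list of attribute selectors earlier in the article.
SELECTORS = {
    "Meta Description": "//meta[@name='description']/@content",
    "Canonical URL": "//link[@rel='canonical']/@href",
    "Robots Meta Tags": "//meta[@name='robots']/@content",
}

# A1 is assumed to be the cell that holds the URL to inspect.
formulas = {label: importxml_formula("A1", xp) for label, xp in SELECTORS.items()}

for label, formula in formulas.items():
    print(f"{label}: {formula}")
```

Each printed formula can be pasted into a cell next to its label, with the URL to check placed in A1.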

How to use XPath directly in the browser

Using XPath in Google Sheets is extremely useful, but I usually start by testing the commands directly in the browser. That saves me time, because I can confirm a command pulls exactly what I want before I move it into the spreadsheet.

XPath is versatile like that. You can use it pretty much everywhere.

Here’s a cool little video on how to use XPath in your browser.

To use XPath in your browser, open the developer tools console and wrap any of the above commands in the following snippet: $x(""). For example, $x("//link[@rel='canonical']/@href") returns the canonical URL of the current page.

There are practically unlimited usage examples of XPath for SEOs, and most of them would be unique to the individual website. Remember, XPath is extremely flexible. I like to think that if you play around with it for just a day or two, it will become a tool you can reliably depend on.

Advanced XPath Commands

There are more than 200 built-in XPath functions. Depending on your day-to-day work requirements, you can use anywhere from one to hundreds of these functions. Here are some examples of XPath functions for more advanced usage.

Select element by position, e.g. position 2: //ul/li[2]
Similar to above: //ul/li[position()=2]
Select last element: //ul/li[last()]
Select all except first child: //ul/li[position()>1]
Select position of nodeset instead of sibling (see “Nodeset selection by position” below): (//div[@class='top10'])[1]
Select all elements that do not contain the word “XSLT”: //table//ul/li[not(contains(text(),'XSLT'))]
Select all elements that contain the word “XSLT”: //table//ul/li[contains(text(),'XSLT')]
Select all elements that start with the word “XSLT”: //table//ul/li[starts-with(text(),'XSLT')]
Select all elements that end with the word “XSLT” (XPath 2.0 only, see “XPath 2.0” below): //table//ul/li[ends-with(text(),'XSLT')]
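The behaviour of these selectors can be checked on a small list. The following is a rough Python sketch using the standard library's xml.etree.ElementTree, with a made-up list as sample data. ElementTree only understands the [2] and [last()] forms natively. It has no position()>1, contains(), starts-with() or ends-with(), so those four are emulated here with plain Python on the item text rather than evaluated as XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample list used only for this illustration.
LIST = """<ul>
  <li>XSLT basics</li>
  <li>Learning XPath</li>
  <li>Intro to XSLT</li>
  <li>CSS selectors</li>
</ul>"""

ul = ET.fromstring(LIST)

# Native ElementTree path support: position and last().
second = ul.find("./li[2]").text        # like //ul/li[2]
last = ul.find("./li[last()]").text     # like //ul/li[last()]

# Emulated in Python, since ElementTree lacks these XPath functions.
items = [li.text for li in ul.findall("./li")]
all_but_first = items[1:]                                  # like li[position()>1]
containing = [t for t in items if "XSLT" in t]             # like contains()
not_containing = [t for t in items if "XSLT" not in t]     # like not(contains())
starting = [t for t in items if t.startswith("XSLT")]      # like starts-with()
ending = [t for t in items if t.endswith("XSLT")]          # like ends-with()

print(second)          # Learning XPath
print(last)            # CSS selectors
print(all_but_first)   # ['Learning XPath', 'Intro to XSLT', 'CSS selectors']
print(containing)      # ['XSLT basics', 'Intro to XSLT']
print(not_containing)  # ['Learning XPath', 'CSS selectors']
print(starting)        # ['XSLT basics']
print(ending)          # ['Intro to XSLT']
```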

Nodeset selection by position

Here’s a good example of selecting using nodeset position instead of sibling.

Let’s say you want to select a div with a class of “top10” that exists in the footer of a website. That footer might contain four similar divs with the same class, each inside a different parent element. Adding the position [1], as in //div[@class='top10'][1], selects all of them, since each one is the first matching child of its own parent. Adding the position [2] selects none of them. Because these divs are not siblings, the position is evaluated separately for each parent. By wrapping the selection in parentheses, as in (//div[@class='top10'])[1], you apply the position to the whole nodeset instead of to each group of siblings.
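The difference can be reproduced with a small Python sketch, again using the standard library's xml.etree.ElementTree and a made-up footer. Two assumptions apply: ElementTree does not support the parenthesised form (...)[1], so the nodeset-level selection is emulated by indexing the Python list of matches, and the class predicate is left out of the per-parent paths because every div in the sample already has the class “top10”.

```python
import xml.etree.ElementTree as ET

# Hypothetical footer: four "top10" divs, each inside a different parent.
FOOTER = """<footer>
  <section><div class="top10">A</div></section>
  <section><div class="top10">B</div></section>
  <section><div class="top10">C</div></section>
  <section><div class="top10">D</div></section>
</footer>"""

footer = ET.fromstring(FOOTER)

# Position applied per parent: every div is the first div of its own section.
per_parent_first = [d.text for d in footer.findall(".//div[1]")]
# No section has a second div, so nothing matches.
per_parent_second = [d.text for d in footer.findall(".//div[2]")]

# Nodeset-level position, emulating (//div[@class='top10'])[1]:
# collect all matches in document order, then index the resulting list.
all_matches = footer.findall(".//div[@class='top10']")
first_overall = all_matches[0].text
second_overall = all_matches[1].text

print(per_parent_first)   # ['A', 'B', 'C', 'D']
print(per_parent_second)  # []
print(first_overall)      # A
print(second_overall)     # B
```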

Here’s a small explainer video

XPath 2.0

All of the above functions work properly in Chrome Dev Tools except ends-with(). Chrome only supports XPath 1.0, and ends-with() was introduced in XPath 2.0.
