Technical SEO: Log File Analysis

Learn how you can perform an SEO log file analysis using the free, invaluable data sitting on your web server. No technical knowledge needed.


SEO specialists rely on data from Google Search Console or other third-party tools to understand what Googlebot “sees” when it crawls their sites. While these tools offer a vast library of information on keyword research, content optimization, and even many of the errors and issues Googlebot might encounter when it crawls a page, they don’t tell the whole story. A log file analysis provides an in-depth view of how search engines actually see your site, along with a list of the issues they encounter during the time they spend on it.

In this article, we are going to discuss how you can find and analyze these files, and how doing so will support your overall SEO strategy.

Log File Example

One of the most undervalued, underutilized tools at pretty much everyone’s disposal is the server log file.

A server log file is a file that your server generates and populates automatically, recording information about the activities the server performs.

One example is the web server log file, which contains a history of the page requests made to the server. Here’s what a single row, representing one web server request, looks like:

139.167.118.2 - - [27/Apr/2020:15:47:11 +1000] "GET /seo-consultant/ HTTP/1.1" 304 328 "https://fourth-p.com/" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.108 Safari/537.36 UCBrowser/13.0.2.1289"

In most cases, web server logs do not collect user-specific information; instead, they record generic details such as the timestamp, client IP address, page requested, request method, response code, response size, user-agent, and referrer.

In some cases, this information is broken down into multiple log files, such as error logs, access logs, referrer logs, and agent logs.
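If you are comfortable with a little scripting, you can even pull these fields apart yourself. Here is a minimal Python sketch that parses the example row above; it assumes the common Apache/NGINX “combined” log format, so if your server uses a custom format, the regular expression will need adjusting.

import re

# The common Apache/NGINX "combined" log format (an assumption; check yours):
# host identity authuser [timestamp] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '                    # client IP, identity, auth user
    r'\[(?P<timestamp>[^\]]+)\] '              # request timestamp
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '   # HTTP method and requested path
    r'(?P<status>\d{3}) (?P<size>\S+) '        # response code and size in bytes
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

line = ('139.167.118.2 - - [27/Apr/2020:15:47:11 +1000] '
        '"GET /seo-consultant/ HTTP/1.1" 304 328 "https://fourth-p.com/" '
        '"Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/537.36 '
        '(KHTML, like Gecko) Chrome/57.0.2987.108 Safari/537.36 UCBrowser/13.0.2.1289"')

match = LOG_PATTERN.match(line)
if match:
    # Print each named field on its own line.
    for field, value in match.groupdict().items():
        print(f"{field}: {value}")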

Now, as you have probably guessed, a server log file is, plain and simple, a hot mess. It’s extremely valuable, but you can barely make sense of it if you just open the file and try to read it. Thankfully, there are quite a few tools out there that will help us not only “beautify” the files but also filter through them and extract actionable data.

How to Find your Web Server Log Files

Depending on the platform used, web server log files can live in different places. But no matter what hosting service you use, your server always generates those files. It might be hard to find them, but they are there. 

If you are not able to follow the next steps, or you can follow them but can’t find the files I am referring to, reach out to your hosting provider, the agency you hired to create your site, or a developer you know and trust who is willing to spend a few minutes helping you out.

The most common way to find your log files is to simply open your file manager or access your FTP and look for the folder called “logs” within the root directory. If you are using cPanel, you can usually find them under its Raw Access Logs feature.

For some of the most common web hosts, follow the instructions below to find your log files:

Bluehost

https://my.bluehost.com/hosting/help/315

HostGator

https://www.hostgator.com/help/article/raw-access-logs

Hostinger

https://www.hostinger.com/how-to/where-can-i-find-error-logs-for-my-website

Dreamhost

https://help.dreamhost.com/hc/en-us/articles/216512197-Viewing-your-access-and-error-logs-via-SFTP

Siteground

https://www.siteground.com/kb/where_are_the_server_log_files_for_my_site/

A2 Hosting

https://www.a2hosting.com/kb/cpanel/cpanel-logging-features/raw-access-log

GoDaddy

https://au.godaddy.com/help/working-with-error-logs-in-web-and-classic-hosting-1197

How to Make Sense of your Log Files

Once you have your log files in your hands, it’s time to start getting some actionable data out of them. As we previously discussed, log files are not the most readable of files, and you could spend days looking at them without making any real progress.

Thankfully, some tools can help us make sense of them by putting the information in the right columns, enabling filtering, and identifying issues.

For the purposes of this article, I will use the Screaming Frog Log File Analyser, which you can download for free from their website; note that the free version only works for up to 1,000 rows in a single project. Since log files are usually massive (depending on your traffic), I would suggest buying the paid version and using the full application if you want to start working with log files. For what it’s worth, I am not an affiliate and have no ties with the tool; I use Sumo Logic for larger clients and Screaming Frog for smaller clients.

Importing Log Files to Screaming Frog

Creating a project and importing log files into the Screaming Frog Log File Analyser is an easy and seamless process. Screaming Frog provides you with a list of preselected user-agents, and also lets you choose which user-agents you do or do not want to track.

Screaming Frog Log File Analyser

I would suggest leaving this as-is for your first try; once you have a basic idea of how the tool works, filter through the data and set your preferences.

Making sense of the data

Importing the log files is one thing and making sense of them is another. 

Log file analysis can be a powerful method for SEO, and several of the things you will find by doing it will support your SEO in the long term.

Here’s a list of things that SEO log file analysis can help you with. 

Crawl Frequency 

Log files reveal the crawl frequency of specific pages, products, or items on your site. This is a heavily underestimated tactic that a lot of SEOs keep skipping. From personal experience, my opinion is that Google does not waste time crawling the same page numerous times if there is no reason to do so.

From experiments I have performed on client sites (and my own), there is a direct correlation between crawl frequency and rankings.
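If you want to check crawl frequency without a tool, a sketch like the one below counts Googlebot requests per URL. The combined log format and the filename access.log are assumptions to adjust for your own setup, and keep in mind that the user-agent string can be spoofed, so verify real Googlebot hits via a reverse DNS lookup before acting on the numbers.

import re
from collections import Counter

# Count how often a Googlebot user-agent requested each URL.
# Assumes the combined log format and a file named "access.log".
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

crawl_counts = Counter()
with open("access.log") as log:
    for line in log:
        match = LOG_PATTERN.match(line)
        if match and "Googlebot" in match.group("agent"):
            crawl_counts[match.group("path")] += 1

# Print the 20 most frequently crawled URLs.
for path, hits in crawl_counts.most_common(20):
    print(f"{hits:6d}  {path}")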

Identify Errors & Issues

Log files identify possible errors and issues that search engine bots encounter while crawling your site. For example, 4xx and 5xx response codes are a good starting point: populate a list, then go out and fix them.

Errors such as these will eventually cause the URLs on which bots encountered them to be crawled less and less over time. Make sure to also check for redirections that you could resolve.
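As a starting point for that list, here is a small sketch that pulls every 4xx and 5xx response out of a log file; again, access.log and the combined format are assumptions.

import re
from collections import Counter

# Collect every 4xx/5xx response so you can build a fix-it list.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)

errors = Counter()
with open("access.log") as log:
    for line in log:
        match = LOG_PATTERN.match(line)
        if match and match.group("status")[0] in "45":
            errors[(match.group("status"), match.group("path"))] += 1

# One row per status/URL pair, with how many times it was hit.
for (status, path), hits in sorted(errors.items()):
    print(f"{status}  {hits:5d}x  {path}")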

Crawl Budget Optimization

Crawl budget optimization is a real thing. By excluding low-value pages from being crawled, you’ll improve the chances that the pages you actually want crawled get crawled. Make sure to read my guide on increasing your crawl budget by reducing indexable pages.
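To spot crawl-budget waste in your own logs, you can count bot hits on URLs you consider low value. The sketch below uses parameterised URLs (anything containing a query string) as a purely illustrative definition of “low value”; substitute your own patterns, such as faceted navigation or session IDs.

import re
from collections import Counter

# Flag bot hits on parameterised URLs as potential crawl-budget waste.
# The "contains ?" rule is illustrative only; swap in your own criteria.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

wasted = Counter()
with open("access.log") as log:
    for line in log:
        match = LOG_PATTERN.match(line)
        if match and "bot" in match.group("agent").lower() and "?" in match.group("path"):
            wasted[match.group("path")] += 1

for path, hits in wasted.most_common(20):
    print(f"{hits:6d}  {path}")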

Identify Orphan Pages

You can compare a list of URLs with your log file data to identify pages that are not getting crawled. I like keeping a clean list in a spreadsheet (perhaps built from a sitemap) and comparing it with the log files. If a URL is not getting crawled, or never got crawled, maybe it is time to start examining your information architecture a bit more closely.
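Here is a rough sketch of that comparison. It assumes you have exported your sitemap paths into a plain-text file, one path per line; both filenames are placeholders.

import re

# Compare expected paths (from "sitemap_paths.txt") against the paths
# that actually appear in "access.log".
LOG_PATTERN = re.compile(r'\S+ \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*"')

crawled = set()
with open("access.log") as log:
    for line in log:
        match = LOG_PATTERN.match(line)
        if match:
            crawled.add(match.group("path"))

with open("sitemap_paths.txt") as sitemap:
    expected = {line.strip() for line in sitemap if line.strip()}

# Sitemap paths that never show up in the logs are orphan candidates.
for path in sorted(expected - crawled):
    print(path)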

Identify Slow Pages

You can easily identify, first hand, pages that take too long to load for Googlebot. There is no need to swap between 10 different apps (unless you need in-depth analysis). Log files can tell you how much a page weighs and, if your server logs response times, exactly how long it took a bot to load it.

As I said above, engines don’t like wasting time. If your pages are heavy and slow down search engine bots, then you need to optimize them for speed.
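One caveat: the default combined log format records the response size in bytes, but not the serving time; timings only appear if your server is configured to log them (for example, Apache’s %D or NGINX’s $request_time). The sketch below therefore ranks URLs by average response size to surface heavy pages; if your logs do include a timing field, extend the regex and average that instead.

import re
from collections import defaultdict

# Rank URLs by average response size, heaviest first.
# "access.log" and the combined log format are assumptions.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*" \d{3} (?P<size>\d+)'
)

sizes = defaultdict(list)
with open("access.log") as log:
    for line in log:
        match = LOG_PATTERN.match(line)
        if match:
            sizes[match.group("path")].append(int(match.group("size")))

# Sort by average bytes per response and show the top 20.
heaviest = sorted(sizes.items(), key=lambda kv: sum(kv[1]) / len(kv[1]), reverse=True)
for path, byte_counts in heaviest[:20]:
    print(f"{sum(byte_counts) / len(byte_counts):10.0f} bytes avg  {path}")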

All of the above data is freely available to you, right there within your grasp. There is no need to pay for SEO tools or hire an SEO agency, and this type of information is invaluable if you want your site to succeed.
