Find broken links and link redirections

Broken and redirected links are often the first low-hanging fruit to pick in SEO. Let's see how Visual SEO Studio can help you locate them all.

Links inspection at chain factory, 1943 (detail cut from public domain photo)

What are broken links?
What causes broken links?
Are broken links bad for SEO?
What are link redirections?
What are redirections?
Are redirections bad for SEO?
What causes link redirections?
How do I locate and fix broken links and redirections?
Method 1: Using the session views
Method 2: Using the "HTTP Issues" panel
Method 3 (recommended): Using Links Inspector

 

Broken links and redirections are normally among the first things to look at and fix when taking care of a web site.

When you open one of the session views (Crawl View, Folder View or Tabular View) in Visual SEO Studio, you will notice the "HTTP Issues" panel at the bottom.

The "HTTP Issues" panel in Visual SEO Studio

The "HTTP Issues" panel lists all the requests performed by the SEO spider where the server did not respond with a "200 OK" HTTP status code.
The panel can be resized at will to show more content.

The most common issues are 404 (Not Found) errors and 301/302 redirections.
If the spider encountered such status codes, it means there are internal links to be fixed.

What are broken links?

So-called "broken links" are links pointing to resources that do not exist. While exploring your website by following links, the spider encounters a link pointing to a wrong URL and tries to follow it; since the web page is not there, the web server returns an error code. Typically it is a "404 Not Found" HTTP status code, but it could be a variation like "410 Gone".
In Visual SEO Studio such URLs are highlighted in red.
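
The definition above can be illustrated with a minimal Python sketch. It is not how Visual SEO Studio works internally: it only collects the links of one page and reports those whose target answers with 404 or 410. The `fetch_status` callable, the sample page and the status table are assumptions made for illustration; in real use `fetch_status` would perform an HTTP request.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URLs of all <a href="..."> links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the URL of the page.
            self.links.append(urljoin(self.base_url, href))


def find_broken_links(page_url, html, fetch_status):
    """Return the links found in `html` whose status code is 404 or 410."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return [url for url in collector.links if fetch_status(url) in (404, 410)]


# Hypothetical page and status table standing in for a real web server.
PAGE = '<p><a href="/about">About</a> <a href="/old-page">Old</a></p>'
STATUSES = {
    "https://example.com/about": 200,
    "https://example.com/old-page": 404,
}

broken = find_broken_links("https://example.com/", PAGE, STATUSES.get)
print(broken)
```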

What causes broken links?

Broken links happen all the time. Sites change over time, the Web changes, and links break.
The most common reasons are:

  • a link is created with a wrong, untested destination URL
  • a link destination page is removed, and the link is left there
  • a link destination page is moved, and the link is not updated (and no redirection was set)

CMS automation generally reduces them for internal links, but not for external links.

Are broken links bad for SEO?

Partially, yes. They certainly damage user experience. In terms of SEO, if the correct page is not linked from anywhere else, search engines cannot find it, index it and rank it, which hurts your SEO.
Even if the link should not be there at all, a broken link is a waste of PageRank flow: the "juice" flowing out of a page is split among all the links on that page, so the fraction assigned to the broken link benefits no page.
So yes, broken links should always be fixed.

What are link redirections?

Link redirections are HTTP redirections encountered when following a link URL.
In Visual SEO Studio such URLs are highlighted in blue.
So you might ask:

What are redirections?

When a web page changes its URL address, it is good practice to set an HTTP redirection from the old URL to the new one. By doing this, visitors accessing your site from an old link or bookmark can find the content almost transparently. Search engines can understand that a page has moved and transfer the assets the old page had gained in their eyes (PageRank for Google, among other link signals) to the new URL.

This is particularly important in case of a site migration. Suppose you changed from the http:// to the https:// version, or from the www. to the naked version, or you changed your domain name, or the entire URL structure. Then you should set automatic redirections from all the old URLs to the new ones, in order not to lose your hard-earned search engine positioning.

The way a redirection works is:
When a user-agent (for example a browser, but it could also be a search engine spider) is asked for a URL, it performs an "HTTP request" asking the web server to serve the resource. If there is a redirection in place, the web server responds with a redirection HTTP status code, along with the new URL address. The user-agent will then make a new request to the new URL.
The process implies an additional HTTP request for each redirection (they can be chained), so getting the desired resource takes a little longer, but most of the time the user is not aware of it.
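
The request/response cycle just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the behavior of any specific browser or spider: `fetch` is a hypothetical callable returning a `(status_code, location)` pair, and `fake_fetch` with its `RESPONSES` table stands in for real HTTP requests.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(url, fetch, max_hops=10):
    """Follow redirections starting at `url` and return every (url, status) hop.

    `fetch(url)` must return a (status_code, location) pair, where
    `location` is the redirect target, or None when there is none.
    """
    hops = []
    for _ in range(max_hops + 1):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or location is None:
            break
        # Each redirection costs one more request; the target may be relative.
        url = urljoin(url, location)
    return hops


# Hypothetical server responses, used instead of real HTTP requests.
RESPONSES = {
    "http://example.com/old": (301, "https://example.com/old"),
    "https://example.com/old": (301, "/new"),
    "https://example.com/new": (200, None),
}


def fake_fetch(url):
    return RESPONSES.get(url, (404, None))


chain = follow_redirects("http://example.com/old", fake_fetch)
for hop_url, hop_status in chain:
    print(hop_status, hop_url)
```

In this made-up example the final page is reached only at the third request: two redirections were chained before the "200 OK" response.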

Redirections can be permanent (HTTP 301 "Moved Permanently" or HTTP 308 "Permanent Redirect") or temporary (HTTP 302 "Found" or HTTP 307 "Temporary Redirect"); there are also other redirection codes, but these are the most common, especially 302s and 301s.
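
As a small illustration, the status codes named above could be grouped as follows. The labels returned by the hypothetical `classify` helper are our own wording, chosen for this sketch only.

```python
PERMANENT_REDIRECTS = {301, 308}  # "Moved Permanently", "Permanent Redirect"
TEMPORARY_REDIRECTS = {302, 307}  # "Found", "Temporary Redirect"


def classify(status):
    """Map an HTTP status code to a coarse label (labels are illustrative)."""
    if status in PERMANENT_REDIRECTS:
        return "permanent redirection"
    if status in TEMPORARY_REDIRECTS:
        return "temporary redirection"
    if 300 <= status < 400:
        return "other 3xx status"
    if status in (404, 410):
        return "broken link"
    if 200 <= status < 300:
        return "ok"
    return "other issue"


for code in (200, 301, 302, 307, 308, 404, 410):
    print(code, classify(code))
```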

Permanent redirections are "cacheable". This means that browsers and search engines, once they meet a permanently redirected URL, can safely use the new URL the next time they are asked for the old one. For example, if a user clicks on a link pointing to a redirected URL, the next time that link is clicked the browser will remember the redirection and skip the intermediate step (unless the browser's local cache has been cleared).
In case of a permanent redirection, Google can transfer the accumulated PageRank to the new URL; in some instances it does so for temporary redirections too, if it deems they are used wrongly. Other search engines have similar mechanisms.

How to set up a redirection depends on the type of web server (Apache? Nginx? IIS? Other?), the CMS in use (WordPress? Joomla? Drupal? Other?) and the server-side technology (PHP? ASP.NET? Java? Other?), so we cannot give a definitive solution here.
You will find the question "how to set up a redirect" asked many times on Google. Many answers point to a file called .htaccess, assuming your web server is Apache, but according to w3techs.com in July 2019 only about 43.9% of sites ran on Apache, 34.5% used WordPress, and 79.1% were based on PHP. While Apache, WordPress and PHP are the most used technologies, the asker's reality could differ.
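
Purely to show what a redirection looks like on the server side, here is a minimal sketch using Python's standard `http.server` module. It is not a recommendation for production and almost certainly not what your own stack uses; the `REDIRECTS` table, the paths and the port are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from old paths to new paths.
REDIRECTS = {"/old-page": "/new-page"}


def redirect_target(path):
    """Return the new path for a moved page, or None if it has not moved."""
    return REDIRECTS.get(path)


class RedirectingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is not None:
            # 301 tells browsers and search engines the move is permanent.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
            return
        body = b"Hello from " + self.path.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve on localhost until interrupted (not called automatically)."""
    HTTPServer(("localhost", port), RedirectingHandler).serve_forever()
```

Calling `serve()` and then requesting `/old-page` on localhost should make the browser land on `/new-page`. On Apache, Nginx, IIS or inside a CMS the same result is obtained through their own configuration or plugins.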

Are redirections bad for SEO?

HTTP redirections per se are a cure, not an illness. Old URLs must be redirected, possibly forever. Or at least until you are sure that search engines have completed the transfer of assets after a site migration, no external links to the old URLs exist any longer, no old URL is stored in people's browser bookmarks, and no old URL is printed on old leaflets, pens or gadgets a user could stumble upon... which is to say, for as long as you care about your website business.

So if redirects are a good thing, why are we talking about them?
When you set up an HTTP redirect, you also have to update all internal links pointing to the old URL. Failing to do so will result in a link redirection.

A link redirection is not as bad as a broken link: browsers and search engines can follow redirections and land on the final destination content.
The price to pay is:

  • longer times to serve the destination page
  • slightly worse user experience
  • increased probability of abandonment (especially in case of HTTP timeouts)
  • search engines taking longer to index your content
    That's because search engine spiders don't normally follow a redirection as soon as they meet it: they report it to the search engine, which will schedule a visit to the newly found URL.
  • increased consumption of crawl budget

Redirect chains, i.e. cases where several redirections are concatenated and several HTTP requests are necessary to reach the final URL, are particularly harmful.
Google is known to follow only a limited number of chained redirections (up to five, according to past statements); after that, your destination URL may not be discovered unless it is linked from somewhere else.

Even worse are redirection loops: the user-agent will stop following the chain when a loop is detected and will not be able to reach your destination page. In this case, though, there is a technical error in the way the HTTP redirects are set up.
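
Chains and loops can be spotted mechanically once the redirections are known. The following Python sketch assumes the redirections are available as a simple old-URL-to-new-URL dictionary (a made-up `REDIRECTS` table here); the default `max_hops=5` merely mirrors the figure mentioned above and is not an official limit.

```python
def resolve(url, redirects, max_hops=5):
    """Return (final_url, hops, problem) for `url` in a redirect map.

    `problem` is None, "loop" or "too many hops".
    """
    seen = {url}
    hops = 0
    while url in redirects:
        if hops == max_hops:
            return url, hops, "too many hops"
        url = redirects[url]
        hops += 1
        if url in seen:
            # We are back at a URL already visited: the chain never ends.
            return url, hops, "loop"
        seen.add(url)
    return url, hops, None


# Hypothetical redirect table: one two-step chain and one loop.
REDIRECTS = {"/a": "/b", "/b": "/c", "/x": "/y", "/y": "/x"}

print(resolve("/a", REDIRECTS))
print(resolve("/x", REDIRECTS))
```

Internal links pointing to `/a` or `/b` would best be updated to point straight to the final URL returned by `resolve`.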

So, link redirections should be fixed.

Note:
There are a limited number of cases where it is legitimate for a spider to encounter a (temporary!) redirection.
Some sites redirect visitors entering at the root address to a localized version of their home page. This is normally done with a policy based on the IP address or, more often, on the user's language as detected by reading the HTTP "Accept-Language" header.
While we don't normally recommend the technique (because neither the IP address nor the HTTP header is fully reliable, and we believe a forced redirection to be a poor user experience), these are cases where a redirection encountered by a spider is not to be considered an error.
This, though, is normally implemented with a 302 redirect for users hitting the root address via an external link; internal links are not normally redirected.
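
For readers curious about how such a language-based policy might look, here is a simplified Python sketch. The parsing of the "Accept-Language" header is deliberately basic, and the supported languages, the fallback and the `/<lang>/` URL scheme are assumptions made for the example.

```python
def preferred_language(accept_language, supported, default="en"):
    """Pick the best supported language from an Accept-Language header value."""
    candidates = []
    for item in accept_language.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        candidates.append((quality, tag))
    # Highest quality first; the sort is stable, so header order breaks ties.
    for quality, tag in sorted(candidates, key=lambda c: -c[0]):
        primary = tag.split("-")[0]  # "it-it" -> "it"
        if quality > 0 and primary in supported:
            return primary
    return default


def localized_home(accept_language):
    """Return the (status, location) a hypothetical bilingual site would send."""
    lang = preferred_language(accept_language, supported={"en", "it"})
    return 302, f"/{lang}/"


print(localized_home("it-IT,it;q=0.9,en;q=0.8"))
print(localized_home("fr;q=0.9, en;q=0.5"))
```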

What causes link redirections?

A link redirection typically happens when an HTTP redirection is created and the links pointing to the old URL are not updated accordingly.
When both redirects and links are automated by the CMS (Content Management System) in use, this shouldn't happen, but reality is often different. Besides, as in the case of broken links, CMS automation cannot prevent it from happening with external links.

How do I locate and fix broken links and redirections?

Visual SEO Studio assists you in locating broken links and link redirections in several ways.
The recommended method is using the Links Inspector, available in the Professional Edition only.
If you don't need a thorough inspection of all website links, or if your site is small or has few issues, you may find the other methods more immediate; they are all very straightforward.

Method 1: Using the session views

Session views (Crawl View, Folder View and Tabular View) in Visual SEO Studio highlight the URLs crawled by following broken links (in red) and redirected links (in blue).

Visual SEO Studio lets you easily locate the link where the broken URL was found.
The quickest way is to choose the "Find referrer link to URL" context menu option: the referrer page will be selected, and the crawled link highlighted in DOM views.


Feature "Find referrer link to URL" in action

Note: clicking on the Legend link shows an explanation of the colors used.

This method is also available in the free Community Edition.

Note: as said, the way to fix the link depends on the CMS used by your website. Suppose you are using WordPress. A common scenario is this: having already logged in to WordPress with an editor account, use the "Browse URL" command on the referrer page, switch to edit page mode, locate the link (a quick way is using the "Find" command of your browser to search for the URL or the anchor text), correct the link URL, and save the modified page. Repeat for every broken or redirected URL, then crawl again and repeat as needed.
If you have a Professional Edition license, or a time-limited Trial license, there is a more powerful and straightforward way based on the Links Inspector feature; we'll illustrate it soon.

Method 2: Using the "HTTP Issues" panel

As already stated, the "HTTP Issues" panel lists all the requests performed by the SEO spider where the server did not respond with a "200 OK" HTTP status code. The panel can be resized at will, and its content exported to Excel or CSV.

The panel provides an easy way to see all broken links, redirections or any other HTTP issue. They can even be filtered by type (for example, 4xx codes are considered errors, and 3xx codes are marked as warnings).

Right-clicking on a single issue row opens a context menu with several commands, as in the session views. You can easily locate the referrer page where the link containing the broken URL was originally met.

The same "Find referrer link to URL" command is available from there. Alternatively, you can select the referrer page node in the main view, and from the "Page Links" panel locate the links pointing to the missing/redirected resource.

Context menu "Show in DOM" in the "Page Links" panel

Each row also lets you view the related link either in Content view (showing the page HTML) or in DOM view.


"Show in DOM" and "Show in code" features in action

This method is also available in the free Community Edition.

Method 3 (recommended): Using Links Inspector

Locating and fixing each single referrer link can go a long way. Often those links are in the website boilerplate (header, footer, navigational menus, sidebars...), so fixing them once fixes them everywhere.
But there are many cases where this is not enough, because the same broken or redirected URL may be linked from several places. With the limited Community Edition you would have to re-crawl several times to locate them all.
The most powerful solution is using the Links Inspector feature.

Links Inspector links list (several columns are hidden)

Links Inspector is an extremely powerful and complete tool; it can do much more than simply locate broken or redirected links.
For the task at hand, the easiest way is to launch Links Inspector, select the site's naked domain name to list all internal links, and filter the result set by choosing the option "Broken or redirected links".
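
The idea of listing every link and filtering the broken or redirected ones can be sketched in Python as follows. This is not how Links Inspector is implemented; the `PAGE_LINKS` and `STATUSES` structures are hypothetical stand-ins for crawl data.

```python
from collections import defaultdict


def links_to_fix(page_links, statuses):
    """Group every broken or redirected link by its target URL.

    page_links: {source page URL: [linked URLs]}
    statuses:   {URL: HTTP status code returned when it was crawled}
    Returns {target URL: (status, sorted list of source pages)}.
    """
    referrers = defaultdict(set)
    for source, targets in page_links.items():
        for target in targets:
            status = statuses.get(target)
            if status is None:
                continue  # the target was never crawled
            if 300 <= status < 400 or status in (404, 410):
                referrers[target].add(source)
    return {t: (statuses[t], sorted(s)) for t, s in referrers.items()}


# Hypothetical crawl data.
PAGE_LINKS = {
    "/": ["/about", "/old"],
    "/blog": ["/old", "/missing"],
    "/about": ["/"],
}
STATUSES = {"/": 200, "/about": 200, "/blog": 200, "/old": 301, "/missing": 404}

report = links_to_fix(PAGE_LINKS, STATUSES)
for target, (status, sources) in report.items():
    print(status, target, "linked from", sources)
```

Having all the referrers of a problematic URL at once is what removes the need to fix one link, re-crawl, and find the next.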

Links list, filtering options and helper buttons

You also have at your disposal a handy column with all the status codes in a chain.
The usual context menu commands let you locate the individual links in page HTML or DOM.

You can run the "Browse source page" command and fix the issue as described previously. This time you have access to the whole set of links in your site, so there is no need to re-crawl iteratively. A new crawl once the fixes are done is always recommended, though.