Manual: Tabular View
The feature "Tabular View" of Visual SEO Studio, documented in detail.
Tabular View
The tabular view gives you an overview of the information you should consider in your first analysis of a website from an SEO perspective.
The view consists of a main table plus auxiliary sheets in the right and bottom panes. Auxiliary sheets have dedicated help pages, accessible by simply clicking on them.
What follows is a detailed description of all view controls and fields. You can also learn more by reading the Tabular View, Crawl data in a grid page.
Head info and tools
View switcher
All views share a view switcher at the top of the window, to quickly switch from one data representation to another.
Clicking one of its buttons will open the corresponding view, or select it if already open.
- Manage Sessions
- Crawl View
- Folder View
- Tabular View
Start URL
The address from which the spider started visiting the website. At the start of a new exploration you will typically enter the website Home Page, usually the "root" address. For explorations of lists of URLs the field is not populated.
Session name
You can give your sessions an optional descriptive name. The name can be assigned when choosing the crawl parameters, or at a later time.
Find Pages...
Clicking on the button will pop up a dialog window to let you search pages starting from a text string.
Legend
Clicking on this link will expand a Legend detailing the meaning of icons and colors used.
For a more comprehensive explanation, read Understanding colors used for URLs.
Table shortcut buttons
The columns of the main table can be shown or hidden. Note that by default only the columns most relevant for a first analysis are shown, but many more important columns can be displayed by clicking on Choose columns... from the context menu, which you get by clicking (on Windows) on the upper left corner of the table, or (on Mac) by right-clicking any cell of the table.
From the context menu it is also possible to activate other functionalities, also accessible from the command icons at the right above the table:
- to show or hide columns of the table
- to search for a specific value in the table cells
- to add columns with data from Google Search Analytics
- to add columns with Bing/Yahoo Traffic Page data
- to add columns with data from Moz
- to export the content of the shown columns to an Excel document
- to export the content of the shown columns to a CSV file
- to export the content of the shown columns to an XML Sitemap
Context menu
Right clicking on an item will pop up a contextual menu:
Tabular View context menu
Context menu command items are:
- Go to Referrer URL: Selects in the main view the node related to the "Referrer" URL, i.e. the address where the spider found the link to the resource.
- Copy URL: Copies the URL of the selected resource to the clipboard.
- Browse URL: Opens the URL of the selected resource in the default browser.
- Take Screenshot...: Opens the Take a Screenshot dialog window, which lets you see a preview, choose the desired resolution, and take a full-height screenshot of the web page.
- Screenshot History: Opens the Screenshot History view to show all screenshots taken for the selected resource over time.
- Find pages linking to the URL: Opens in a new Tabular View all pages linking to the resource.
- Find all links to the URL: Opens the Links Inspector to locate all links pointing to the resource.
- Find referrer link to the URL: Selects the DOM view in the right pane and highlights there the HTML node where the spider found the link to the resource.
Column headers
Let's now have a closer look at which columns it is possible to show, with a brief description of how to use them.
Icon
The icon column gives an indication of the state of the explored resource. You can view an explanation of the icons and colors used thanks to the Legend, which you can activate with the link above the table on the left.
Used icons and their meaning are:
- when the explored resource did not generate errors
- if we got an error while exploring a resource (e.g. when a resource is not found producing a 404 error)
- the warning icon is not necessarily an error, but means the result of the exploration of the resource needs special attention
- when the web server returns a 418 HTTP error
- when the resource has not been explored
Crawl Depth
The depth of the page in the site link structure, also known as "link depth", i.e. the number of clicks needed to reach it starting from the Home Page.
Knowing a page's depth from the main URL is important because search engines give a page more or less importance relative to its distance from the main URL: the closer, the more important it is.
Note: this is a simplification; in the case of Google, for example, the Home Page is usually the page with the highest PageRank (a Google measure to assess the importance of a page; other search engines use similar models), so the pages connected with a single link to the Home Page are the ones receiving the most PageRank.
Furthermore, the greater the distance, the less likely the page is to be reached and explored by the search engine spiders, because of the normally limited Crawl Budget (simplifying: the number of pages a search engine can explore within a certain time slot when visiting a website).
Thus, place the pages you want to give more weight closer to the Home Page.
Link Depth is also important from a user perspective: it would be hard for users to find a piece of content starting from the Home Page if it takes many clicks to reach it.
A common usability rule calls for each page to be reachable in three clicks or fewer. This is not always possible in the case of very large websites; nevertheless you should choose a link structure that minimizes each page's link depth.
Crawl Progressive
Indicates the progressive number assigned to the resource during the crawler exploration.
Thanks to this progressive number you can get an idea of how a search engine spider would explore your website, a piece of information you should take into account when dealing with Crawl Budget issues, typical of large websites.
For example, you may realize the spider takes exploration paths towards content areas you consider less important than the ones you think are more strategic; in such a case you should intervene on the website link structure.
Note: the crawl progressive number is an approximation:
Visual SEO Studio uses an exploration pattern called Breadth-first, which has been shown to be the most efficient at finding important content in the absence of external signals; the actual exploration order can change slightly because of the parallelization used for speed reasons during the crawl process. Using a single crawl thread would make it strictly repeatable.
Search engine exploration patterns are, for their part, highly asynchronous, and exploration priority is weighted - in Google's case - by the resource's PageRank, which could be inflated by external links.
Authority Name
The combination of protocol, host name and, if different from the default value, port number.
An important piece of information you can see from the Authority Name, for example, is whether the URL is protected by the secure HTTPS protocol.
It can also be handy to have the authority name shown in the case of explorations of URL lists, or of sites with several sub-domains.
Authority
The combination of host name and, if different from the default value, port number.
Path (encoded)
The path of the required resource, with URL encoding when required.
Due to a limitation of the HTTP protocol, a URL "running on the wire" can only contain ASCII characters (i.e. Western characters with no diacritics). URL encoding replaces special characters (diacritics, spaces, non-Western alphabet letters, ...) with their escape sequences.
Many URLs are composed only of ASCII characters, and since they do not need encoding, the encoded and decoded versions of their path look the same; but let's have a look at an example URL path written in Cyrillic:
Path: /о-компании
(a typical URL path for a company page; it translates from Russian as /about-company)
Since the HTTP protocol cannot convey non-ASCII characters, in order to permit these human-readable URL paths the characters are transparently encoded by the browser before being sent on the wire to request the resource from the web server, transforming the example path into:
Path (encoded): /%D0%BE-%D0%BA%D0%BE%D0%BC%D0%BF%D0%B0%D0%BD%D0%B8%D0%B8
The encoding used is called percent-encoding.
Visual SEO Studio by default shows URLs and Paths in their decoded, human-readable form, but the user might want to see the encoded version to investigate URL issues.
Path (decoded)
The resource path (URL decoded, thus in human-readable form).
Page Type
This column shows an icon representing the type of resource crawled. Possible values are:
- When the crawled resource is a robots.txt file
- when the crawled resource is an XML page (this is the case of a Sitemap)
- when the crawled resource is an HTML page
- when the crawled resource is an image
URL
Uniform Resource Locator, the resource address.
For better search engine optimization it is preferable to have "friendly" URLs (i.e. URLs that anticipate the page content) that are not too long.
Status
The HTTP response code received from the web server upon requesting the resource.
Response codes can be summarized in five standard classes:
- 1xx Informative response – the request was received and its processing is ongoing (it is very unlikely you will ever see a 1xx response code)
- 2xx Success – request was received successfully, understood, accepted and served (it is the response code you normally want to see).
- 3xx Redirection – the requested resource is no longer at the address used
- 4xx Client Error – request has a syntax error or cannot be honored
- 5xx Server Error – the web server was unable to honor an apparently valid request
Some very common responses are for example 200 (OK - the standard response for HTTP requests successfully served) and 301 (Moved Permanently - used when a page URL is changed and you don't want to "break" external links to the old URL, nor lose the page indexation on search engines, and want to preserve its PageRank).
Redirects work as follows: when an old URL is requested, the web server answers the client (a browser, or a search engine spider) with an HTTP 3xx code to report that the address has changed, adding the new address in the HTTP header. The client will then have to request the resource at the new address with a new HTTP call and, in case of a permanent redirect, could remember the redirection for the future in order to avoid making a double call when the link to the old address is clicked again.
Redirects can be implemented on the server side using several methods, depending on the technology used and the platform the web server is running on: for example by configuring the .htaccess file on Apache web servers with generic or specific rules; with dedicated plugins in a WordPress installation; or, in the case of websites built with ASP.NET technology, with rules expressed in the web.config file, directives set in the single page, or the logic of the CMS engine used.
Having redirects is not an error per se, but if they are detected - as normally happens - during a normal site crawl navigating internal links, it is a sign that such internal links were not updated after the URL change. It is recommended to update the internal links with the new URLs in order not to slow down the user navigation experience and not to waste the crawl budget allotted by the search engine.
Particular attention should be given to the 4xx response codes, which Visual SEO Studio rightly reports as errors.
The 4xx codes you will stumble upon are usually 404 (Resource not found) and the nearly identical 410 (Resource no longer existing). Their presence is a symptom of a broken link that should be corrected, because users and search engines cannot reach the link destination page.
5xx response codes are errors that occurred on the web server while it was trying to build the resource to return to the browser or the spider.
They could be a temporary issue, but they should normally not be ignored; it is better to report them to the developer and investigate on the server side. 5xx errors are a very bad user experience, make visitors abandon the website, and can potentially cause de-indexation by search engines if repeated over time.
For a more in-depth description of HTTP response codes you can consult the following page on Wikipedia: HTTP status codes
Status Code
The textual description of the HTTP response code received from the web server upon requesting the resource.
Crawl Status
States whether the resource was crawled and, if not, details with a brief description the reason why it was not visited.
Title
The HTML page title, as read from the <title> HTML tag.
This is one of the page elements with the greatest SEO relevance for good positioning in search engines. The title should describe the page content briefly and effectively. It should not be duplicated (no other pages should have the same title) and should not be excessively long, to avoid truncation in the SERP (use the "SERP preview" tool to verify that the title is shown in its entirety).
In the past it was common to place the main keyword among its first words; today synonyms are also correctly interpreted by search engines to categorize a page. Keep in mind that today's search engines are much better than in the past at understanding a page's semantic content, so ensure your titles are really aligned with the page contents.
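As a purely illustrative sketch (the page name and wording are invented), a descriptive and reasonably short title could look like this:
<title>Handmade leather bags - Acme Leatherworks</title>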
Content Type
The MIME type of the resource, along with the (optional) character encoding type used (charset). The datum shown is the one read from the HTTP header.
The MIME type informs the browser of the resource type; this way the browser knows whether it will have to render it as a normal web page, or play it some other way (like for example a sound file or a video). Typically, a website's pages have MIME type text/html.
Character Set
The Character-Set of the resource, read from the HTTP header.
Knowing the character encoding used is useful when you need to diagnose web page rendering errors, when you see "strange" characters. The most widely used charset today is UTF-8, which besides supporting multilingual diacritics and non-Western alphabets is more compact than other Unicode charsets and backward compatible with the old 128-character ASCII format. Being more compact, it improves page download performance.
The charset can also be specified in the page HTML head, but reading it from the HTTP header has an advantage: the browser can know the file encoding before attempting to read it, with increased performance.
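As a hypothetical example, a UTF-8 declaration placed in the HTML head looks like this (the HTTP header equivalent would be a Content-Type value such as text/html; charset=UTF-8):
<meta charset="UTF-8" />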
Canonical URL
The canonical URL, computed by combining the canonical links read from the meta tag and the HTTP header.
The Canonical URL is the one search engines do index.
Note that when no canonical URL is specified at page level (by using the <link> tag with attributes rel="canonical" and href="...", or by using the equivalent HTTP header), the URL used to explore the page will be, for the search engine, the URL to index.
The reason why it is important to specify the URL you want a search engine to index is to avoid the same page being indexed more than once when it can be reached via more than one URL.
A typical example is the "faceted" navigation of e-commerce sites, where the same product page can be reached via different URLs depending on the filters used to retrieve it. A search engine could index several copies of the same page, diluting their value and putting them in competition with each other, or could see them as internal duplicates. Specifying the canonical URL prevents all these potential problems and gives a certain degree of control over which URL will be used for indexing.
Another typical use case is with web servers that do not discriminate character casing in URLs (like MS IIS): internally linking a URL with the wrong casing would lead the web server to still resolve the resource, but a search engine would index two distinct pages when in reality there is only one.
Web servers sensitive to character casing (and thus compliant with the URL standard) can also have duplicate URL issues: a page could answer to different URLs (it just takes adding some querystring parameters to the address), and for a search engine they would be distinct resources; specifying a canonical URL avoids the problem.
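As a hypothetical example (URL invented for illustration), a filtered product page of an e-commerce site could declare its preferred version in the HTML head as follows:
<link rel="canonical" href="https://www.example.com/products/leather-bag" />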
Canonical Path
The path of the canonical URL to be used for indexing.
Canonical Path (decoded)
The path of the canonical URL to be used for indexing (URL decoded).
Redirected To
The HTTP header Location, used with 30x redirect status codes.
In case of non-ASCII characters in the URL, it is shown with URL encoding.
Upon a 30x response code the browser will navigate to the URL stated in the Location field read from the HTTP header.
Redirected To (decoded)
The HTTP header Location, used with 30x redirect status codes (URL decoded).
Meta Description
The description snippet suggested to be shown in the SERP.
It is specified within the HTML head section using the <meta> tag with attributes name="description" and content="...".
It should be attractive in order to increase the CTR (Click-Through Rate), the probability that a user clicks on the link in the SERP to visit the page.
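A hypothetical example of the full syntax (the description text is invented for illustration):
<meta name="description" content="Handmade leather bags crafted in Italy. Free shipping on orders over 50 euros." />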
H1 title (first found)
The first H1 title found in the page HTML.
The H1 title should explain the page's purpose to the user in a concise and effective way.
It is usually recommended to have only one per page, although this is not strictly needed, nor is it an error in terms of HTML syntax. The H1 title is also a good occasion to recall the main keyword or an alias, a little like the HTML <title> tag, even though it doesn't carry the same relative weight in terms of SEO.
It is also useful to know which is the first H1 because, where no <title> tag is specified in the head section of the HTML, search engines use the first H1 found as the title for the snippet in the SERP.
H1 nr.
Number of H1 titles within the page.
Even if having more than one H1 title within the same page is not strictly to be considered an error, most SEOs consider having just one a good practice; that's why Visual SEO Studio reports the number of H1 titles found.
It has to be considered that many WordPress themes made the unhappy choice to wrap the logo and secondary elements in the page boilerplate with H1 tags (or other Hx), so it is good to have an automated way to detect them.
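As a purely illustrative sketch (names and paths are invented), this is the kind of markup the column helps detect: a theme wrapping the logo in an H1 in addition to the H1 describing the actual page content:
<header>
<h1><a href="/"><img src="/logo.png" alt="Acme Leatherworks"></a></h1> <!-- boilerplate H1 added by the theme -->
</header>
<main>
<h1>Handmade leather bags</h1> <!-- the H1 that actually describes the page content -->
</main>
Such a page would report 2 in the "H1 nr." column, and the boilerplate heading, coming first in the HTML, would appear in the "H1 title (first found)" column.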
Robots meta tag
The directives to the generic search engine bot, read from the robots meta tag in the HTML head.
Directives are indications the webmaster wants to give to the search engine bots about how to treat a visited page or its content.
The syntax of the tag in the HTML is:
<meta name="robots" content="..." />
The reported value is the content of the content attribute; it can be composed of multiple directives separated by commas.
Common non-default values you normally want to care about when looking after a website's SEO aspects are noindex and nofollow; the first one prevents a page from being indexed (or, better, from appearing in the SERP), the second instructs the search engine not to follow the links contained within the page itself (which the search engine might do anyway for discovery purposes, but without official endorsement of the destination URLs).
For a complete read on all directives accepted by Google you can see Robots meta tag, data-nosnippet, and X-Robots-Tag specifications; but keep in mind there's not only Google! ;)
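For example, a hypothetical page you want to keep out of the index, and whose links you do not want followed, would carry:
<meta name="robots" content="noindex, nofollow" />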
(bot name) meta tag
The directives to a specific search engine bot, read from the dedicated meta tag in the HTML head.
This piece of information is analogous to the one related to the Robots meta tag, but instead of being addressed to any search bot, it is dedicated to a specific spider, determined by its user-agent name.
For example in case the bot were Google's (user-agent name: "googlebot"), the syntax of the tag in the HTML would be:
<meta name="googlebot" content="..." />
The reported value is the content of the content attribute; it can be composed of multiple directives separated by commas.
The bot considered by Visual SEO Studio is the one set in the used crawl options.
X-Robots-Tag
The HTTP header X-Robots-Tag, used for directives to the generic search engine bot.
The directives are the same you can find in the robots meta tag within the head section of the HTML. In the case of non-HTML resources, for example PDF documents, such directives can only be expressed with the HTTP header.
Note that in case of conflicting directives in the HTML head and in the HTTP header, the bot will consider the most restrictive one (e.g. if the directive in the HTML head were index and the directive in the HTTP header were noindex, the more restrictive noindex would be considered).
X-Robots-Tag (bot name)
The HTTP header X-Robots-Tag, used for directives to a specific search engine bot.
Index
Index value, computed from all generic and bot-specific directives via robots meta tag and X-Robots-Tag HTTP headers.
Possible values are:
- based on all directives computed, the resource is indexable
- based on all directives computed, the resource is NOT indexable
Follow
Follow value, computed from all generic and bot-specific directives via robots meta tag and X-Robots-Tag HTTP headers.
Notice that if a nofollow were found in the HTML head or HTTP header, it will make all page links unexplorable, even if they don't have the rel="nofollow" attribute (a link-level example is shown after the list below).
Possible values are:
- based on all directives computed, page links are explorable
- based on all directives computed, page links are NOT explorable
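For comparison, a link-level nofollow (affecting a single link rather than the whole page) is expressed with the rel attribute on the anchor itself; a hypothetical example:
<a href="https://www.example.com/some-page" rel="nofollow">anchor text</a>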
Meta Keywords
Not used by search engines; it might reveal your SEO strategies to competitors.
Meta Refresh
This meta tag was once used to refresh the page after a certain amount of time, or to redirect to other pages.
Today it is rarely used because, in the first case, techniques like client-side scripts are preferred. If used to redirect to another page, it should be considered an error because generally PageRank would not flow; you should always prefer a redirection based on HTTP 3xx response codes.
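For reference, a hypothetical meta refresh redirecting to another page after 5 seconds would look like this (as said above, an HTTP 3xx redirect should be preferred):
<meta http-equiv="refresh" content="5; url=https://www.example.com/new-page" />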
Referrer Path
The path of the URL of the resource where the link to the present resource was followed.
The crawl paths taken by a bot during a website exploration help understand the website link structure.
The Referrer URL is not necessarily the only URL linking to the resource, just the one the Visual SEO Studio spider followed to discover it.
You can locate all links to the resource with the context menu entry Find all links to the URL.
Referrer Path (decoded)
The path of the URL of the resource where the link to the present resource was followed (URL decoded).
TTFB (ms)
Time To First Byte: the time elapsed between the HTTP call to the web server and the first byte received by the spider.
High values for all pages may indicate performance problems of the web server hosting the website.
Download Time (ms)
The time it took to download the resource.
High values for all pages may indicate performance problems of the web server hosting the website. High values for single pages (or rather, resources) likely indicate content that is too heavy (in the case of images consider reducing their size; for an in-depth analysis see the "Images Inspector" tool).
Use this piece of information along with the value of the "Content Length" column: a high download time with a high Content Length indicates a resource that is too heavy; a high download time with a low Content Length indicates performance problems on the server side.
Download Date
The date when the resource was downloaded.
Content Length
The size of the resource in bytes, read from the Content-Length HTTP header.
Use this piece of information along with the value of the "Download Time (ms)" column: a high download time with a high Content Length indicates a resource that is too heavy; a high download time with a low Content Length indicates performance problems on the server side.
Truncated
States whether the size of the resource exceeded the download limit.
The maximum amount that can be downloaded for a resource can be customized before crawling a website by using the option "Maximum Download Size per URL (KB)". Notice that a limit is necessary to avoid so-called "spider traps".
Possible values are:
- Blank, when the spider has downloaded the resource completely
- when the spider could NOT download the resource completely
Size (bytes)
The size of the resource in bytes (stored).
The maximum size will be the value set before crawling a website with the option "Maximum Download Size per URL (KB)". Bigger resources will be truncated.
The Size column value is generally equal to the one in the Content Length column, except in cases of truncation, or when the Content-Length HTTP header is not specified.
File size
The size of the file, expressed in an easy-to-read format (in bytes, KB, or MB... depending on the actual byte size).
Google Search Analytics integration
The following columns are added to the table after clicking the button, and report Google Search Analytics data for a specified date range.
For more information please read Google Search Console integration.
Google: Clicks
The number of times the page has been clicked after appearing in the Google SERP.
Google: Impressions
The number of times the page appeared in the Google SERP.
Google: CTR
The Click-Through Rate is the percentage ratio between the number of times the page has been clicked and the number of times it appeared in the SERP (CTR = clicks / impressions × 100).
A low CTR could mean that the meta description does not entice the user to click on the page; in this case we suggest using the "SERP preview" tool to verify that the preview in the SERP has a correct title and a meaningful description.
Google: Position
The average position of the page in the SERP (in the specified time window).
Bing/Yahoo integration
The following columns are added to the table after clicking the button, and report Bing/Yahoo Traffic Page data for a specified date range.
For more information please read Bing Webmaster Tools integration.
Bing: Clicks from Search
The number of times the page has been clicked after appearing in the Bing SERP.
Bing: Appeared in Search
The number of times the page appeared in the Bing SERP.
Bing: Click-Through Rate
The Click-Through Rate is the percentage ratio between the number of times the page has been clicked and the number of times it appeared in the SERP (CTR = clicks / impressions × 100).
A low CTR could mean that the meta description does not entice the user to click on the page; in this case we suggest using the "SERP preview" tool to verify that the preview in the SERP has a correct title and a meaningful description.
Bing: Avg Search Click Position
The average position of the page in the SERP (in the specified time window) when it was clicked.
Bing: Avg Search Appearance Position
The average position of the page in the SERP (in the specified time window).
Moz integration
The following columns are added to the table after clicking the button, and report the URL Metrics from Moz.
For more information please read Mozscape API Integration.
Moz: Domain Authority
A score from 0 to 100 (the higher, the better) provided by Moz estimating the authority of the domain to which the resource belongs; in other words, the likelihood of the domain's resources ranking well in search engine results.
Moz: Page Authority
A score from 0 to 100 (the higher, the better) provided by Moz estimating the authority of the resource; in other words, the likelihood of the page ranking well in search engine results.
Moz: Links
The number of external links found by Moz to the resource.
Moz: External Equity Links
The number of external links to the resource found by Moz that are able to pass equity.