Readability scores in Visual SEO Studio
Understanding readability scores
Examples of use
Histogram views improved
Easier content auditing: word count on steroids
Plain text viewer
Some changes to the Community Edition
Conclusions, and what’s next
Content Auditing was a previously uncovered area. The new release 1.3.0 of Visual SEO Studio, code-named ‘TheReader’, enters it in a disruptive way by adding “Readability Analysis”.
Note: How search engines consider ease of reading is a long-debated question; we detailed our opinion and did our best to illustrate unbiased facts in the article Readability and Search Engines.
Readability scores in Visual SEO Studio
The new feature is the first of a series of tools geared toward Content Auditing, a previously uncovered area. It makes it easy to compute readability scores, the average readability level, and the number of words, sentences and characters of a site's pages.
Readability Analysis in Visual SEO Studio
The optional (and recommended) use of an XPath expression permits restricting the analysis to the content part of the web page, ignoring the boilerplate (navigational menus, header, footer, side panels...).
At present the feature is available for texts in English and Italian.
Readability formulas are language-dependent, and often also depend on a correct count of the number of syllables.
Visual SEO Studio is currently translated into seven languages, and it is likely that analysis of texts written in the other languages will be supported as well.
[Update: with Visual SEO Studio 1.3.1 readability analysis was extended to French and Spanish texts, and rel. 1.4.0 added support for generic "other languages". See the page Readability Analysis for further details.]
Readability formulas used for English texts:
- Flesch Reading Ease Score (FRES)
- Flesch-Kincaid grade level
- Gunning's Fog Index
- SMOG index
- Automated Readability Index (ARI)
- Coleman-Liau Index
- and the average of the five aforementioned school grade levels.
Readability formulas used for Italian texts:
Note: "Readability Analysis" is available only in the Professional Edition. You can evaluate it for free for 30 days by registering the Trial version.
Understanding readability scores
Readability scores are well-accepted formulas that attempt to measure how easy a text is to read.
Some give a score from 0 to 100 (e.g. Flesch Reading Ease Score); most try to assess the grade level a reader needs in order to read the text (using US educational levels).
I don’t aim to give a crash course on Readability in these few lines, and prefer to point to external documentation.
You’ll notice that even without deep knowledge of the scoring systems, getting an overview of text readability with Visual SEO Studio is straightforward, thanks to the color palette used to highlight the scores: green(ish) values mean easy-to-read texts, red(dish) values mean hard-to-read ones.
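For readers who like to see the arithmetic, here is a minimal Python sketch of the two Flesch formulas, one producing a 0 to 100 style score and the other a US grade level. The constants are the published ones; the syllable counter is a deliberately naive vowel-group heuristic I made up for illustration, and nothing here is meant to describe how Visual SEO Studio computes its scores internally.

```python
import re

def count_syllables(word):
    """Naive heuristic (an assumption): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def text_stats(text):
    """Return (sentences, words, syllables) using simple regex splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables

def flesch_reading_ease(text):
    """Higher means easier to read."""
    sentences, words, syllables = text_stats(text)
    if sentences == 0 or words == 0:
        raise ValueError("text has no sentences or words")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(text):
    """Approximate US school grade needed to read the text."""
    sentences, words, syllables = text_stats(text)
    if sentences == 0 or words == 0:
        raise ValueError("text has no sentences or words")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

EASY = "The cat sat on the mat. It was happy."
HARD = ("Comprehensive readability evaluation necessitates "
        "considerable computational sophistication.")

if __name__ == "__main__":
    for label, sample in (("easy", EASY), ("hard", HARD)):
        print(label,
              round(flesch_reading_ease(sample), 1),
              round(flesch_kincaid_grade(sample), 1))
```

Short sentences made of short words push the Reading Ease score up and the grade level down, which is why both formulas are so sensitive to the syllable count.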
“Readability Analysis” usage examples
Readability Analysis works on a set of already crawled pages, which Visual SEO Studio calls a saved “crawl session”.
Working on a stored session has the advantage that you can try several times and refine your parameters, or dig into your data multiple times to inspect different sections. You don’t need to crawl each time: your data is already there.
Let’s look at some simple usage examples for the new feature.
Example 1: analyze the English version of this site's blog
The first problem to solve is finding an XPath expression that locates the content part of the blog pages. I’m a lazy person, so this time I’ll spare myself writing it and use the browser to get it:
With Chrome (all major browsers have similar functionality) I invoke the “Inspect” option over the content, locate an HTML element containing the text I want to evaluate, and select the option “Copy XPath”.
Copy XPath from the browser
Et voilà! In the clipboard is the value “//*[@id="main"]/div/div[1]/div[1]”
I paste it into the “XPath to content” field. The language of the text is set to “English”, so pages will be evaluated using only the readability scores that target the English language.
Analysis options, simple view
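To get a feel for what such an expression selects, here is a small Python sketch using the standard library's ElementTree, which supports a limited XPath subset. The sample page is a made-up, well-formed fragment (real-world HTML usually needs a forgiving HTML parser), and the helper name extract_content is my own, not part of any tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: navigation and footer are boilerplate,
# the readable text sits deep inside the "main" container.
PAGE = """<html><body>
<div id="nav">Home | Blog | Contact</div>
<div id="main"><div>
  <div><div>Readable text lives here.</div><div>Related links</div></div>
  <div>Sidebar</div>
</div></div>
<div id="footer">Copyright notice</div>
</body></html>"""

def extract_content(xhtml, xpath):
    """Return the whitespace-normalized text of the first node matching xpath."""
    root = ET.fromstring(xhtml)
    # ElementTree expects paths relative to the root, so "//x" becomes ".//x".
    if xpath.startswith("//"):
        xpath = "." + xpath
    node = root.find(xpath)
    if node is None:
        return ""
    return " ".join("".join(node.itertext()).split())

if __name__ == "__main__":
    print(extract_content(PAGE, '//*[@id="main"]/div/div[1]/div[1]'))
```

Only the content block is returned; the menu, the sidebar and the footer never reach the readability computation.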
Then I want to restrict the analysis to the blog pages only. They all start with the same path, so I expand the options panel, type “/SEO-Blog/” in the “Path filter” field, and select the “Allow” syntax of robots.txt.
Note: I could obtain the same result with the RegEx “^/SEO-Blog/”, but as the saying goes “When your solution is a Regular Expression, you have got a problem.” (I admit this was simple enough to be readable, though).
Analysis options, expanded view
Used criteria:
Text language: English
XPath: //*[@id="main"]/div/div[1]/div[1]
Allow: /SEO-Blog/
The result will look very similar to the first screenshot of the article.
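To see why the “Allow” filter and the RegEx are interchangeable in this case, here is a minimal Python sketch. It treats the “Allow” rule as a plain path-prefix match, which is a simplification (robots.txt rules can also carry wildcards), and all sample paths other than the /SEO-Blog/ prefix are made up for illustration.

```python
import re

def allowed_by_prefix(path, prefix="/SEO-Blog/"):
    """Simplified robots.txt-style 'Allow': the path must start with the prefix."""
    return path.startswith(prefix)

def allowed_by_regex(path, pattern=r"^/SEO-Blog/"):
    """Same filter expressed as a regular expression anchored at the start."""
    return re.search(pattern, path) is not None

# Hypothetical URL paths from a crawl session.
SAMPLE_PATHS = [
    "/SEO-Blog/",
    "/SEO-Blog/some-post",
    "/download",
    "/it/SEO-Blog/some-post",
]

if __name__ == "__main__":
    for path in SAMPLE_PATHS:
        print(path, allowed_by_prefix(path), allowed_by_regex(path))
```

Both functions agree on every path; the anchor `^` is what keeps the RegEx from matching the prefix in the middle of a longer path.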
Example 2: discariche-in-italia.blogspot.it
Used criteria:
Text language: Italian
XPath: //div[@itemprop="blogPost"]
Regex: ^/(\d{4})/(\d{2})/
skip non-canonical URLs: false
Here I put some effort into writing both the XPath and the Regex, which locates the blog posts assuming their URLs begin with the form /yyyy/mm/
The site is hosted on blogspot, and is thus subject to the stupid rule that sets the canonical to the .com version but redirects to the .it version if you have an Italian IP address. For this reason, I unchecked the option "skip non-canonical URLs".
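The path filter used in this example can be tried out with a few lines of Python. This is only a sketch of the Regex behavior: the sample paths are invented, and the helper post_month is a hypothetical name used to show what the two capture groups contain.

```python
import re

# Same pattern as in the criteria above: /yyyy/mm/ at the start of the path.
POST_PATH = re.compile(r"^/(\d{4})/(\d{2})/")

def post_month(path):
    """Return (year, month) if the path looks like a dated blog post, else None."""
    match = POST_PATH.match(path)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

if __name__ == "__main__":
    # Hypothetical paths: one dated post and two non-post pages.
    for path in ("/2014/03/some-post.html", "/p/about.html", "/search/label/news"):
        print(path, post_month(path))
```

Pages that do not start with a four-digit year and a two-digit month are left out of the analysis.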
Example 3: pimpmytrip.it
This is a travel blog. I cannot impose filters based on the URL path because it uses a flat URL structure, but I can, with an added computational cost, let the XPath expression locate the blog posts for me.
Used criteria:
Text language: Italian
XPath: //div[@id[starts-with(., 'post-')]]
Regex: ^/\w+
The Regex is there to exclude the Home Page, which lists summaries of the blog posts using the same "id"s used within the posts.
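The two criteria can be imitated in Python to see what they select. The standard library's ElementTree does not support the XPath starts-with() function, so this sketch performs the equivalent test in plain Python; the sample page, the paths and the helper names are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page with two posts and a sidebar.
PAGE = """<html><body>
<div id="sidebar">Recent posts</div>
<div id="post-12">First post body</div>
<div id="post-7">Second post body</div>
</body></html>"""

def post_blocks(xhtml):
    """Text of every <div> whose id starts with 'post-',
    mirroring //div[@id[starts-with(., 'post-')]]."""
    root = ET.fromstring(xhtml)
    return [
        "".join(div.itertext()).strip()
        for div in root.iter("div")
        if div.get("id", "").startswith("post-")
    ]

# Same pattern as in the criteria: at least one word character after the slash.
NOT_HOME = re.compile(r"^/\w+")

def is_candidate(path):
    """True for any path except the bare Home Page '/'."""
    return NOT_HOME.search(path) is not None

if __name__ == "__main__":
    print(post_blocks(PAGE))
    print(is_candidate("/"), is_candidate("/some-trip"))
```

The Home Page path "/" has nothing after the slash, so it fails the Regex and its post summaries are not counted twice.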
Histogram views improved
One of the most appreciated tools, Histogram views are a powerful way to get a quick overview of the state of your site with respect to a given measure.
While the full histogram tells you much more of the story, sometimes you need a single number to communicate. We listened to our users and added computation of Mean and Median values.
Histogram views now show computation of Mean, Median and number of elements
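Why show both numbers? A tiny Python sketch with made-up word counts illustrates how the two can diverge when a single page is very different from the rest. The figures are purely hypothetical and are not taken from any real crawl.

```python
from statistics import mean, median

# Hypothetical word counts for five pages; one long page skews the data.
WORD_COUNTS = [120, 150, 180, 200, 2350]

if __name__ == "__main__":
    print("elements:", len(WORD_COUNTS))
    print("mean:", mean(WORD_COUNTS))      # pulled up by the 2350-word page
    print("median:", median(WORD_COUNTS))  # the middle value, unaffected by it
```

Here the mean is 600 words while the median is 180: the median is the better single number when a few outliers dominate, and the histogram itself tells you when that is the case.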
Easier content auditing: word count on steroids
Something that was missing in Visual SEO Studio, word count is a rough, quick way to locate thin content pages.
Not only does the tool now permit it, but you can also leverage the XPath selection to analyze the actual content, excluding the boilerplate. Or, you can decide to analyze only a particular section of the text.
Word count of a page section
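The effect of excluding the boilerplate is easy to show with a few lines of Python. This is a rough sketch: the texts are invented, words are simply split on whitespace, and the threshold used to flag thin content is an arbitrary value chosen for the example, not a recommendation.

```python
def count_words(text):
    """Rough word count: split on any whitespace."""
    return len(text.split())

# Hypothetical page parts.
NAVIGATION = "Home Blog Contact"
CONTENT = "Short post about readability."
FOOTER = "Copyright Privacy"

FULL_PAGE = " ".join((NAVIGATION, CONTENT, FOOTER))

THIN_CONTENT_THRESHOLD = 5  # arbitrary example value

def is_thin(text, threshold=THIN_CONTENT_THRESHOLD):
    """Flag texts with fewer words than the threshold."""
    return count_words(text) < threshold

if __name__ == "__main__":
    print("whole page:", count_words(FULL_PAGE), "thin:", is_thin(FULL_PAGE))
    print("content only:", count_words(CONTENT), "thin:", is_thin(CONTENT))
```

Counting the whole page hides the problem, because menus and footers inflate the total; counting only the selected content exposes the thin page.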
Plain text viewer
A new right pane view shows the plain text of the extracted content, that is, the text obtained by stripping off all formatting. Like the Contents windows, it permits searching, selecting and copying selections of text.
Plain Text viewer
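As an illustration of what “stripping off all formatting” means, here is a minimal Python sketch built on the standard library's html.parser. It is an assumption-laden approximation (it drops tags, skips script and style content, and treats a handful of block-level tags as word separators), not a description of how Visual SEO Studio extracts its text.

```python
from html.parser import HTMLParser

# Tags treated as separators so adjacent blocks do not get glued together.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "li", "br", "tr"}
SKIPPED_TAGS = {"script", "style"}

class PlainTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)

def to_plain_text(html):
    """Return the visible text with tags removed and whitespace collapsed."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())

SAMPLE = "<h1>Title</h1><p>Some <b>bold</b> text.</p><script>var x = 1;</script>"

if __name__ == "__main__":
    print(to_plain_text(SAMPLE))
```

The result is the kind of text the readability formulas and the word count operate on: words and punctuation only, with no markup.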
Some changes to the Community Edition
Features "Custom Filters" and "GA Suggestions" are now only available in the Professional Edition.
This was not a decision taken lightly: users love the product and love the free Community Edition, so much so that far too many don't feel the need to purchase the Professional license for their daily consulting/agency work.
We appreciate the product being popular, and want to keep the free version free even for light professional use, yet we cannot risk impairing our economic survival by being our own main competitor.
The two features are very "Pro": for example, an Analytics health-check is something you perform only occasionally on your own site. It can be done with the 30-day Trial.
We wish to point out that the Community Edition is still highly competitive against the competition.
Conclusions, and what’s next
This release comes with Readability Analysis, the first of a series of tools dedicated to Content Auditing.
There's more: for a complete and boring list, see the Release Notes.
Another new feature, Data Extraction, is in an advanced state of development and will see the light very soon.
All new major features will for the most part be available only in the Professional Edition, while general improvements (and many will come) will be available in both editions.
Now, there are plenty of texts online to double check for readability. Don’t waste time and launch “Readability Analysis”!