The new 0.9.0 version of Visual SEO Studio has a brand new engine which dramatically reduces memory consumption to enable auditing much larger web sites.
A new engine to crawl much more
Thanks to this new architecture, the previous limit on the maximum number of crawlable pages can be lifted.
While the new engine would permit much more, the limit is temporarily set to 150K URLs.
Crawl options: The SEO tool can now analyze much larger web sites
As said, the new engine permits much more; this is a screenshot from a test:
a ~400K URLs crawl, obtained during a pre-release test
We prefer to raise the bar progressively to remove other bottlenecks in the data presentation layer, to preserve a pleasant user experience.
The end goal is to be able to crawl and inspect datasets of 500K to 1M URLs without flaws.
Previous versions of the software limited a crawl to 75K URLs. This limit was imposed to minimize the chance of crashes caused by exceeding the amount of memory granted by the OS. Visual SEO Studio uses "memory check points" during a site crawl to prevent such conditions, but other processes could compete for the same resources, and some crashes slipped through the safety net.
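The "memory check points" idea can be illustrated with a minimal Python sketch. It is not the product's actual code: the function name, the `memory_in_use` callback, and the `budget_bytes` and `check_every` parameters are all assumptions made for the example. The loop compares memory use against a budget at regular intervals and stops crawling once the budget is exceeded.

```python
from typing import Callable, Iterable, List


def crawl_with_checkpoints(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    memory_in_use: Callable[[], int],
    budget_bytes: int,
    check_every: int = 100,
) -> List[str]:
    """Fetch pages until the URLs run out or a check point finds memory over budget.

    All names are hypothetical; `memory_in_use` stands for whatever probe the
    host application provides to report its current memory use in bytes.
    """
    pages: List[str] = []
    for count, url in enumerate(urls, start=1):
        pages.append(fetch(url))
        # Check point: only probe memory every `check_every` pages.
        if count % check_every == 0 and memory_in_use() > budget_bytes:
            break  # stop gracefully instead of risking an out-of-memory crash
    return pages
```

Because the probe runs only at check points, another process can still grab memory between two checks, which is the kind of gap that let some crashes slip through.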
The new 'Sasquatch' engine dramatically reduces memory consumption at any given moment, even when crawling much larger sites, so you'll likely see an improvement in overall PC responsiveness.
The whole application has been reviewed and tested with huge datasets. Reports are processed faster, crawled pages take less time to persist (so yes, crawling is slightly faster if the web site throughput permits it), and processed data takes less time to display. Progress percentages for ongoing tasks are also much more precise, e.g. when loading from disk and processing data for reports ("HTML/URL/GA" Suggestions and Custom Filters) and views.
Furthermore, the application is much smarter when it has to process large datasets by inspecting the pages' HTML content: to speed up processing it temporarily allocates peaks of memory, but it auto-tunes itself to never exceed the available memory and releases that memory gradually as soon as it is no longer needed.
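One simple way to picture this kind of auto-tuning is to size each processing batch from the memory that is currently free. The following Python sketch is only an illustration under stated assumptions: `plan_batch_size`, its parameters, and the 50% reserve are hypothetical and are not taken from the product.

```python
def plan_batch_size(
    available_bytes: int,
    avg_page_bytes: int,
    max_batch: int = 5000,
    reserve_fraction: float = 0.5,
) -> int:
    """Return how many pages to load at once without exhausting free memory.

    Hypothetical helper: a fraction of the free memory is kept in reserve,
    and the rest is divided by the average page size.
    """
    usable = int(available_bytes * (1.0 - reserve_fraction))
    fits = usable // max(1, avg_page_bytes)
    # Always process at least one page, never more than the configured cap.
    return max(1, min(max_batch, fits))
```

Calling the helper again before each batch lets the batch size shrink when free memory drops and grow back when it is released.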
It's not only about memory footprint and performance; the software is now also more stable:
besides intrinsically fixing all the memory-related instabilities, we fixed all the major pending issues.
Visual SEO Studio 0.8.34 'Caronte' opens a bright future
Released in July 2015, "Caronte" gives the SEO audit tool the ability to migrate user data to new formats.
Visual SEO Studio 0.8.34, code-named 'Caronte'
can migrate existing crawled data to new formats
(Gustave Doré, 1857 - Public Domain image)
The software was of course already able to update its data format, but the ability to also migrate users' data permits more dramatic changes.
So far those changes had been kept on hold, in order not to bother tens of thousands of users who might want to keep their historical data. We did regularly introduce minor changes to extend the data format, but none of them required migrating users' data.
In order to really raise the bar, we needed a way to bring users' data from any format to a new one.
For the record, a complex data migration is what happened with the subsequent "Sasquatch" release, where all the pages' HTML content was moved from one table to another (it's not a big secret that Visual SEO Studio stores crawled data in a relational database).
For this migration, we tested thoroughly against all worst-case scenarios: we used 4-5 GB projects on x86 virtual machines with limited resources.
Caronte has amply proved itself up to the task, and thanks to it we'll be able to further improve the product.
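To give an idea of what moving a column's content from one table to another can look like, here is a minimal Python sketch using SQLite. It is not the product's migration code: the `pages` and `page_content` tables, their columns, and the `migrate_html` function are hypothetical. Rows are moved in small batches, each in its own transaction, so memory use stays bounded and an interrupted run can simply be restarted.

```python
import sqlite3


def migrate_html(conn: sqlite3.Connection, batch_size: int = 500) -> int:
    """Move the `html` column of a hypothetical `pages` table into `page_content`.

    Returns the number of rows moved. Table and column names are assumptions.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_content "
        "(page_id INTEGER PRIMARY KEY, html TEXT)"
    )
    moved = 0
    while True:
        rows = conn.execute(
            "SELECT id, html FROM pages WHERE html IS NOT NULL LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            break
        with conn:  # one transaction per batch: commit on success, roll back on error
            conn.executemany(
                "INSERT OR REPLACE INTO page_content (page_id, html) VALUES (?, ?)",
                rows,
            )
            conn.executemany(
                "UPDATE pages SET html = NULL WHERE id = ?",
                [(page_id,) for page_id, _ in rows],
            )
        moved += len(rows)
    return moved
```

Clearing the old column in the same transaction as the copy means a row is never counted twice, even if the migration is run again after a failure.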
Visual SEO Studio 0.8.33 'Angkor' brings 64-bit support
Another factor which had already improved robustness against "out of memory" crashes was the support for 64-bit CPU architecture, introduced with the June 2015 release code-named "Angkor".
Visual SEO Studio 0.8.33 was code-named 'Angkor' because it was published while Fred was working remotely from Cambodia (Fred's own work)
The previous versions were compiled for the x86 (32-bit) CPU architecture. They could run on x64 OS machines as well, but this solution imposed a serious limit: Windows does not assign more than 2 GB of memory to a 32-bit process, which prevents applications from scaling.
We knew that at least 70% of our users were running Visual SEO Studio on a 64-bit OS, and it looked like a serious waste not to leverage all that power, so we studied how to benefit all those users while keeping the installation even simpler.
This has been achieved smoothly with a single setup file - rewritten from scratch with a new user interface - which transparently installs the correct version depending on the machine bitness. The user doesn't have to bother about any detail, as the setup will install the correct version, even when updating.
The new setup is not significantly bigger, as the two versions differ only in a small portion of the installed files.
Since 'Angkor', we haven't recorded any memory-related crash. Now, with 'Sasquatch', we expect those to be a thing of the past. Period.
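The decision a bitness-aware setup has to make can be sketched in a few lines of Python. This is only an illustration, not the installer's real logic: `pick_build` and the `X64_MACHINES` set are hypothetical names, and the sketch deliberately treats every machine type it does not recognize as 32-bit.

```python
import platform

# Machine identifiers treated as 64-bit x86 in this sketch (an assumption).
X64_MACHINES = {"amd64", "x86_64"}


def pick_build(machine: str = "") -> str:
    """Return "x64" or "x86" for the given machine type.

    When no machine string is passed, the current machine is probed with
    platform.machine(); unknown values fall back to the 32-bit build.
    """
    machine = (machine or platform.machine()).lower()
    return "x64" if machine in X64_MACHINES else "x86"
```

Falling back to the 32-bit build is the conservative choice here, since the text above notes that x86 builds also run on x64 machines.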
Conclusions, and what's next
It has been hard to freeze all the new stuff in order to prioritize the architectural changes made in summer 2015. We have a lot of stuff queuing for prime time.
In the near future, besides honing things to further extend the crawl limits as promised, we'll return to enriching the product with new SEO features.
Some development branches have been kept on hold for a long time and are now roaring!
Now, remember that big e-commerce site you had to audit? There's Visual SEO Studio 'Sasquatch' ready to gnaw it!