A new spider engine, 2-3 times faster
A more pleasant color scheme
Conclusions, and what’s next
A major milestone, release 1.8 "Partridges" is mostly dedicated to increasing crawl speed.
Usability and User Experience improvements also make it more pleasant to use.
A brand new multi-threaded spider engine, now 2-3 times faster
Visual SEO Studio's crawler now makes better use of multi-core architectures and parallel processing. It remains adaptive to avoid overloading the web server: it continually tests the server throughput by opening a number of concurrent connections and then giving the server time to complete them.
New crawl option: maximum number of concurrent connections
How did we make Visual SEO Studio faster?
- We rewrote the crawler engine to use multiple connections wisely
- We leveraged parallel processing to extract links of crawled pages
- We removed some bottlenecks that occurred during large crawls
- We dramatically reduced the probability of the UI freezing during a crawl, by making the crawl tree and the output window much quicker, more responsive and smoother when updating
- We adapted the memory check-points system (used by the spider to check availability of memory to complete a task) to the new multi-threaded engine and added other minor optimizations
- We updated the HTML parsing library, obtaining a further speedup
Depending on the scenario, the crawler can now be 2-3 times faster (a 100-200% gain), and there is still room for improvement.
The number of connections you set is a maximum. The program may lower it based on the number of processors/cores and other evaluations; beyond a certain number, extra connections would only increase the instantaneous memory consumption. That number varies from case to case, depending on the client, the server and the network.
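To make the idea of a "maximum" concrete, here is a minimal sketch of how a requested value could be capped by the number of cores. It is an illustration only, not the program's actual logic: the function name `effective_connections` and the constant `CONNECTIONS_PER_CORE` are assumptions, and the "other evaluations" mentioned above are not modeled.

```python
import os

# Hypothetical tuning constant: how many connections one core is assumed
# to keep busy. The real program's heuristics are not published.
CONNECTIONS_PER_CORE = 4


def effective_connections(requested_max, cores=None):
    """Return the number of connections to use, never above requested_max."""
    if requested_max < 1:
        raise ValueError("requested_max must be at least 1")
    if cores is None:
        cores = os.cpu_count() or 1  # os.cpu_count() may return None
    # The user setting is an upper bound; the core-based cap can only lower it.
    return max(1, min(requested_max, cores * CONNECTIONS_PER_CORE))
```

With these assumed numbers, a setting of 32 on a 2-core machine would be lowered to 8, while a setting of 5 on an 8-core machine would be used as is.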
For non-verified sites the engine continues to respect any existing crawl-delay (max 2 secs) found for each domain/subdomain; nevertheless, parallelism still helps when requests are directed to different (sub)domains. A further improvement comes from the engine being less "adaptive" about the courtesy delay (if set): it now performs a new HTTP request as soon as the strict courtesy delay has passed, whereas before it also waited for the previous request to complete.
When crawling a URL list, a further speedup comes from crawl-delays now being timed separately for each distinct (sub)domain.
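The per-(sub)domain timing described above can be sketched as follows. This is a simplified model under stated assumptions, not the engine's real code: the class `PolitenessClock` and its `reserve` method are hypothetical names, "max 2 secs" is read as a cap on the delay that is honored, and time is passed in explicitly so the behavior is easy to follow.

```python
from urllib.parse import urlsplit

MAX_CRAWL_DELAY = 2.0  # seconds; assumed reading of the "max 2 secs" cap


class PolitenessClock:
    """Tracks, per host, the earliest time the next request may start."""

    def __init__(self, crawl_delays):
        # crawl_delays: host -> crawl-delay in seconds (e.g. from robots.txt)
        self._delays = {h: min(d, MAX_CRAWL_DELAY) for h, d in crawl_delays.items()}
        self._next_start = {}

    def reserve(self, url, now):
        """Return when a request for `url` may start, and book that slot."""
        host = urlsplit(url).hostname or ""
        start = max(now, self._next_start.get(host, now))
        # The delay runs from the START of this request, not its completion.
        self._next_start[host] = start + self._delays.get(host, 0.0)
        return start


clock = PolitenessClock({"www.example.com": 10.0, "blog.example.com": 1.0})
first = clock.reserve("https://www.example.com/a", now=0.0)   # 0.0
second = clock.reserve("https://www.example.com/b", now=0.0)  # 2.0 (10 s capped to 2 s)
other = clock.reserve("https://blog.example.com/", now=0.0)   # 0.0, different subdomain
```

Because each host has its own timer, a URL list spanning several (sub)domains does not have to wait for one global delay, which is where the extra speedup would come from.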
Performance depends on the entire chain: the bandwidth available to the client, the state of the intermediate network, and the throughput of the web server.
Beyond a certain number, adding threads does not help: it only increases the instantaneous memory consumption and could even slow down or hang the server. That's why the Visual SEO Studio crawler is adaptive and reduces pressure if needed.
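One common way to build such adaptive behavior is an additive-increase / multiplicative-decrease controller. The sketch below is only an example of that general technique and should not be taken as the algorithm Visual SEO Studio uses; the class `AdaptiveLimit`, the `slow_factor` threshold and the use of average response time as the signal are all assumptions.

```python
class AdaptiveLimit:
    """Additive-increase / multiplicative-decrease cap on concurrent connections."""

    def __init__(self, max_connections, slow_factor=1.5):
        self.max_connections = max_connections
        self.slow_factor = slow_factor  # hypothetical "server is struggling" threshold
        self.limit = 1
        self.baseline = None  # fastest average response time seen so far

    def record_batch(self, avg_response_time):
        """Update the limit after a batch of concurrent requests has completed."""
        if self.baseline is None or avg_response_time < self.baseline:
            self.baseline = avg_response_time
        if avg_response_time > self.baseline * self.slow_factor:
            # Responses got much slower: halve the pressure on the server.
            self.limit = max(1, self.limit // 2)
        else:
            # Server keeps up: probe with one more connection, up to the maximum.
            self.limit = min(self.max_connections, self.limit + 1)
        return self.limit
```

Fed with batch timings of 0.2, 0.2, 0.2 and then 0.9 seconds, this controller would grow the limit to 2, 3 and 4 connections and then drop back to 2.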
Current limitations are:
Community Edition:
- Max 2 threads (default is 2)
Professional Edition:
- Max 32 threads for verified sites (default is 5)
- Max 5 threads for non-verified sites (default is 5)
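The limits above can be read as a small lookup table. The following sketch encodes them that way; the names `THREAD_LIMITS` and `resolve_threads` are hypothetical, and it assumes the Community Edition limit applies to verified and non-verified sites alike.

```python
# Limits as listed above: (maximum threads, default threads).
THREAD_LIMITS = {
    ("community", True): (2, 2),      # assumed identical for verified sites
    ("community", False): (2, 2),
    ("professional", True): (32, 5),
    ("professional", False): (5, 5),
}


def resolve_threads(edition, site_verified, requested=None):
    """Return the thread count to use: the default, or the request clamped to the maximum."""
    maximum, default = THREAD_LIMITS[(edition.lower(), site_verified)]
    if requested is None:
        return default
    return max(1, min(requested, maximum))
```

For example, asking for 64 threads on a verified site with the Professional Edition would resolve to 32, and asking for 8 with the Community Edition would resolve to 2.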
Crawl speed is now comparable to competitors' products, yet we managed to remain the most polite player in the crowd: it is still almost impossible to overload and DoS a web server with the Visual SEO Studio crawler.
This is important to us: there is no point in crawling a site fast and crashing it midway; it is not wise to slow your client's e-commerce site, which is selling 24/7, to a snail's pace; and it is not right to overload a web server without permission to do so.
With Partridges we succeeded in giving fast crawls to all users, on Free and Paid plans alike, without giving up these values.
A more pleasant color scheme
Color scheme changed for some UI elements.
With 1.7 we addressed some usability issues by highlighting the most important "call to action" buttons, which users sometimes did not notice at first sight.
We did that by changing their size and color. For the latter we used an "acid green" which users certainly did notice (so we improved usability), but which most would find annoying because it is too bright. In practice, we increased Usability and worsened User Experience.
A new pastel green color scheme
This new color scheme corrects the mistake by using a much more pleasant pastel green. Along with it, we changed the background of the left-side Command Pad panel to match.
Conclusions, and what's next
With Visual SEO Studio "Partridges" we brought fast crawling to all users, paid and free. As said, there is still room for improvement by optimizing other parts of the process. Expect the SEO tool to become even faster in the future.
There are other improvements, minor new features and fixes published with 1.8; for a complete list please read the Release Notes.
On the features front, we are working full steam... wait and see.
Now, I guess there's that big crawl you have long been putting off. Unleash "Partridges"!
Comments are open on the linked Facebook page.