Manual: Crawl Multiple Sites
The feature "Crawl Multiple Sites" of Visual SEO Studio, documented in detail.
Crawl Multiple Sites
Sometimes SEO professionals need to quickly crawl several sites at the same time, for competitor analysis or to check prospective customers' sites.
This option is a helper to explore up to five sites in parallel using the most basic crawl settings. It is conceived for quick opportunity scouting: only web pages, a limited number of them (enough to understand whether a website has areas to work on first, though), and very few advanced options.
For more in-depth analysis, use the Crawl a Site... feature.
Start URL (e.g. https://www.example.com/)
Insert here the address from where you want the spider to start visiting the website.
Most of the time you will use the website's Home Page URL, which is usually the "root" address (e.g.
https://www.example.com/), but you can also decide to start from another page.
If you do not specify a protocol (http:// or https://), http:// will be assumed.
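The defaulting behavior described above can be sketched as follows. This is an illustrative helper, not the tool's actual code; the function name is hypothetical, and "http" is simply the default scheme the manual states.

```python
from urllib.parse import urlparse

def normalize_start_url(url: str) -> str:
    # Hypothetical helper: prepend the default scheme when the user
    # omits it; "http" is the default stated in the manual above.
    if urlparse(url).scheme:
        return url
    return "http://" + url
```

For example, `normalize_start_url("www.example.com/")` yields `http://www.example.com/`, while an address that already carries a scheme is left untouched.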
Session Name (optional)
You can give your crawl session an optional descriptive name for your own convenience. You will also be able to add or change it at a later time.
Clicking on the link will expand the window and let you access further crawl parameters.
Maximum Crawl Depth
The Maximum Crawl Depth is how deep in the website link structure you want the spider to go.
For some websites with many levels of paginated content you may want to increase this value.
The reason this parameter exists, instead of assuming an infinite depth, is so-called "spider traps":
some are intentional, some are not. Take the classic example of the "infinite calendar" found on many blog sites: each day of the calendar links to a virtual page, and there are links to move to the next month... forever! Without limits such as a maximum crawl depth or a maximum number of visitable pages, a web crawler would never stop visiting such a site.
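How such limits stop a spider trap can be shown with a minimal breadth-first crawl sketch. This is not Visual SEO Studio's implementation; `get_links` is a hypothetical callback returning the URLs linked from a page.

```python
from collections import deque

def crawl(start, get_links, max_depth, max_pages):
    # Breadth-first crawl bounded by both depth and page count.
    # `get_links` is a hypothetical callback: page URL -> linked URLs.
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links past the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

On an "infinite calendar" where every page links to a fresh next-month page, this loop still terminates: either the depth limit stops it from queueing deeper links, or the page budget runs out first.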
Maximum number of pages/images
The maximum number of pages you want the spider to download. The default is 500 pages, as this option is conceived for quick opportunity scouting, not for in-depth analysis (use the Crawl a Site... feature for that).
Only pages count; other files such as robots.txt, XML Sitemap files, and other assets are not taken into account. HTTP redirections do not count either.
Maximum number of concurrent connections
SEO spiders speed up website visits by using multiple concurrent HTTP connections, i.e. by requesting several web pages at the same time.
Visual SEO Studio does the same, although its adaptive crawl engine can decide to push less if it detects that the web server would get overloaded.
This control lets you tell the spider how hard it may push if the web server keeps responding quickly.
The Visual SEO Studio edition, and whether the website is among the Verified Websites, can influence how fast the spider is allowed to crawl:
for verified sites you can set up to 32 concurrent connections; for non-verified sites, the maximum is 5.
The free Community Edition can only use 2 concurrent connections at most.
Warning: increasing the number of connections could slow down or hang the server if it cannot keep up with the requests; do it at your own risk (that is why you can force more on verified sites only).
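The idea of a fixed cap on concurrent downloads can be sketched with a worker pool. This is an illustrative sketch, not the product's adaptive engine; `fetch` stands in for a hypothetical function that downloads one URL.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_all(urls, fetch, max_connections):
    # Download pages with at most `max_connections` requests in
    # flight at once; `fetch` is a hypothetical one-URL downloader.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        return list(pool.map(fetch, urls))
```

Whatever the length of the URL list, no more than `max_connections` requests hit the server at the same time, which is the kind of ceiling the setting above configures.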