Manual: Crawl XML Sitemap
The feature "Crawl XML Sitemap" of Visual SEO Studio, documented in detail.
Crawl XML Sitemap
This powerful function permits you to audit XML Sitemaps by crawling all their listed URLs.
Sitemaps can be crawled recursively, and are presented nested within the intuitive user interface.
Not only can you crawl regular or index Sitemaps: the program goes a step further and lets you crawl all the XML Sitemaps listed within a robots.txt file using the Sitemap: directive.
To learn more about the feature, please read the Crawl XML Sitemaps and robots.txt page.
XML Sitemap or robots.txt URL
Insert here the address of the XML Sitemap you want to audit, or of the robots.txt file.
The URLs listed in the XML Sitemaps will be downloaded and shown nested below the Sitemap node.
If you insert the URL of an Index Sitemap, there will be two levels of nesting with the Index Sitemap at the top: all XML Sitemaps listed in the Index Sitemap are downloaded first, and then the URLs of each Sitemap are downloaded.
Analogously, if you insert the URL of a robots.txt file which uses the Sitemap: directive, there will be three nesting levels.
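The nesting described above (robots.txt, then Sitemaps, then URLs) can be illustrated with a minimal Python sketch. It is not how Visual SEO Studio is implemented; the function names sitemaps_from_robots and parse_sitemap are hypothetical, and the sketch only parses text that has already been downloaded.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs named by 'Sitemap:' lines (hypothetical helper)."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


def parse_sitemap(xml_text: str) -> tuple[str, list[str]]:
    """Return ('index' or 'urlset', list of <loc> values) (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag == SITEMAP_NS + "sitemapindex" else "urlset"
    child = "sitemap" if kind == "index" else "url"
    locs = [
        el.findtext(SITEMAP_NS + "loc", default="").strip()
        for el in root.findall(SITEMAP_NS + child)
    ]
    return kind, [loc for loc in locs if loc]
```

An 'index' result holds further Sitemap addresses to parse in turn, while a 'urlset' result holds the page URLs to crawl; starting from a robots.txt file adds one more level on top.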
If you do not specify a protocol, http:// will be assumed.
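As an illustration of the protocol default, a small Python sketch could normalize the address as follows. The function name with_default_scheme is hypothetical, and the exact rules applied by the program may differ.

```python
import re

# A scheme is a letter followed by letters, digits, '+', '.' or '-', then '://'.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def with_default_scheme(address: str) -> str:
    """Prepend 'http://' when the address carries no protocol (hypothetical helper)."""
    address = address.strip()
    if _SCHEME.match(address):
        return address
    return "http://" + address
```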
Session Name (optional)
You can give your crawl session an optional descriptive name for your own convenience. You will also be able to add it or change it at a later time.
Clicking on the link will expand the window to let you access further crawl parameters.
Use HTTP Authentication
Access to websites under development may be restricted via HTTP authentication.
Clicking on the button opens a window where you can configure the credentials used to audit an XML Sitemap of a website whose access is restricted via HTTP authentication.
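For readers unfamiliar with HTTP authentication, the following Python sketch shows what a request carrying credentials looks like. It assumes the Basic scheme, which the text above does not specify; the function name authenticated_request and the example address are hypothetical.

```python
import base64
import urllib.request


def authenticated_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build (without sending) a request with a Basic Authorization header."""
    # Basic authentication sends "user:password" encoded as Base64.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Example: a request for a Sitemap on a protected staging site (hypothetical URL).
request = authenticated_request("http://staging.example.com/sitemap.xml", "user", "pass")
```

Since Base64 is an encoding and not encryption, credentials sent this way are only protected when the connection itself uses HTTPS.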
Maximum number of concurrent connections
SEO spiders try to speed up website visits by using multiple concurrent HTTP connections, i.e. requesting more web pages at the same time.
Visual SEO Studio does the same, although its adaptive crawl engine can decide to push less if it detects that the web server would get overloaded.
This control lets you tell the spider how much harder it may push if the web server keeps responding fast.
The Visual SEO Studio edition and whether the website is among the Verified Websites can influence the ability of the spider to crawl faster:
For verified sites you can set up to 32 concurrent connections. For non-verified sites, the maximum is 5.
The free Community Edition can only use 2 concurrent connections at most.
Warning: increasing the number of concurrent connections could slow down or hang the server if it cannot keep up with the requests; do it at your own risk (that's why you can force more on verified sites only).
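The limits above can be summarized in a short Python sketch that caps a requested number of connections and runs downloads through a bounded worker pool. The names connection_cap and crawl are hypothetical, the fetch function is supplied by the caller, and the adaptive throttling performed by the real crawl engine is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor


def connection_cap(community_edition: bool, verified_site: bool) -> int:
    """Upper limit on concurrent connections, per the rules stated above."""
    if community_edition:
        return 2
    return 32 if verified_site else 5


def crawl(urls, fetch, requested: int, cap: int) -> list:
    """Apply fetch to every URL using at most min(requested, cap) workers."""
    workers = max(1, min(requested, cap))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the input URLs.
        return list(pool.map(fetch, urls))
```

For example, crawl(urls, fetch, requested=10, cap=connection_cap(False, False)) would run with 5 workers, since the site is not verified.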