Visual Real-Time Crawler
See in real time how a search engine spider crawls your site, with a simulated time scale... and understand how it works under the hood!
The Visual SEO Studio spider crawls websites mimicking the way a search engine spider would[1], and the program shows the crawl paths taken in real time.
Simulated G-Time shows how long a search engine spider would take to visit the site.
Main features:
- Crawls the way a search engine spider does
- Can Pause and Resume the crawl session
- Crawled data is always persisted and can be retrieved later
- Reconstructs site link structure in real-time
- Supports Unicode paths in URLs
- Supports Unicode in domain names (e.g. .рф domains)
- Can launch several crawl sessions in parallel
- Can crawl Ajax-based sites
- and more...
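To illustrate what supporting Unicode paths and domain names involves, here is a minimal sketch in Python. It is not Visual SEO Studio's code: the function name to_ascii_url is hypothetical, and it assumes the usual convention of Punycode (IDNA) for the host name and percent-encoded UTF-8 for the rest of the URL.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Return an ASCII-only form of a URL that may contain Unicode.

    Hypothetical helper for illustration only. Uses Python's built-in
    "idna" codec (IDNA 2003) for the host and percent-encoding for
    path, query and fragment.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("URL has no host name: " + url)

    # Host: each Unicode label becomes an "xn--" Punycode label.
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host += ":" + str(parts.port)

    # Path, query, fragment: percent-encode non-ASCII characters as UTF-8,
    # leaving existing escapes and common separators untouched.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")

    return urlunsplit((parts.scheme, host, path, query, fragment))


if __name__ == "__main__":
    print(to_ascii_url("http://пример.рф/страница"))
```

A URL that is already plain ASCII passes through unchanged, so a helper like this can be applied to every URL found during a crawl.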
Among other things, the Progress panel on the right side shows the "Simulated G-Time" in real time (yet another unique feature of Visual SEO Studio!): the time it would take Googlebot, or any other low-frequency agent, to crawl your site. It makes crystal clear how important it is to sculpt an efficient crawl path into your web pages.
You will often be surprised by the unexpected crawl path a spider would take from your site Home Page... check it out, the sooner the better!
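As a rough idea of what a simulated crawl time means, the sketch below estimates how long a low-frequency agent would need to visit a site. It is an assumption-laden illustration, not the formula Visual SEO Studio uses: the function name and the fixed 10-second interval between requests are both hypothetical.

```python
from datetime import timedelta


def simulated_crawl_time(page_count: int,
                         seconds_between_requests: float = 10.0) -> timedelta:
    """Estimate the crawl duration for a polite, low-frequency agent.

    Assumes (hypothetically) one request every `seconds_between_requests`
    seconds, so the total is simply pages multiplied by the interval.
    """
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    return timedelta(seconds=page_count * seconds_between_requests)


if __name__ == "__main__":
    # 1,000 pages at one request every 10 seconds.
    print(simulated_crawl_time(1000))  # prints 2:46:40
```

Even with these simple assumptions, the estimate shows why pages buried deep in the link structure may wait a long time before a slow crawler reaches them.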
See also:
- Video tutorial: Crawling a website
- Help page Manual: Crawl a Site
- Help page Manual: Visual Real-Time Crawler
- Crawl URL Lists: off-site analysis
- Multi Site Crawling
- Crawl Tree View
- Pigafetta bot, Visual SEO Studio spider
- Frequently Asked Questions: Crawl issues
[1] The Visual SEO Studio spider crawls websites using breadth-first exploration, an efficient way to explore a website that finds the most important content early in the exploration process when no external link signals are available.
Every search engine uses its own exploration path, which might not be entirely reproducible: it is also prioritized by external signals such as PageRank, it results from a multi-step process (where fetching URLs, parsing links and queuing URLs to be visited are performed independently by separate components), and it is part of the search engine's trade secrets. Nevertheless, breadth-first exploration reproducibly resembles the general crawl path taken by most search engine bots.
More on web crawler selection policies can be read on Wikipedia.
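The breadth-first exploration described in this note can be sketched in a few lines of Python. This is a simplified illustration, not the Visual SEO Studio spider itself: it walks a small in-memory link graph (the links dictionary and its paths are made up) instead of fetching real pages.

```python
from collections import deque


def breadth_first_order(links: dict, start: str) -> list:
    """Return pages in the order a breadth-first crawl would visit them.

    `links` maps each page to the pages it links to (hypothetical data).
    """
    seen = {start}           # pages already queued, to avoid revisits
    queue = deque([start])   # FIFO queue: the essence of breadth-first
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


if __name__ == "__main__":
    site = {
        "/": ["/products", "/about"],
        "/products": ["/products/a", "/"],
        "/about": ["/contact"],
        "/products/a": ["/deep"],
    }
    print(breadth_first_order(site, "/"))
    # ['/', '/products', '/about', '/products/a', '/contact', '/deep']
```

Pages one click away from the start page are visited before any page two clicks away, which is why content linked close to the Home Page is found early.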
No registration required