Data Extraction: data mine web pages

"Data Extraction" pulls sections of text or HTML fragments out of crawled pages using XPath expressions, organizes them in a table and lets you export them.

Data Extraction lets you data mine web pages by scraping content located with XPath selectors. Results are collected in a neat table.
It works on pre-crawled data sets, so you can test, adjust your aim and refine. You save tonnes of time by not having to repeat the exploration at every iteration.

Visual SEO Studio "Data Extraction"

  • You can add any number of custom columns.
  • All columns are resizable, movable, hidable and sortable.
  • You can extract a section as Plain Text, Inner HTML or Outer HTML.
  • If an XPath expression returns a collection of nodes, you can choose whether to keep only the first result (the default behaviour) or all of them.
  • In the latter case, additional results are reported in the same column to ease data analysis and consumption.
  • Each choice is customizable for each single column of extracted data.
  • Export results to Excel or CSV format.
  • You can save and load column sets.
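
To make the per-column options above more concrete, here is a minimal sketch in Python. It is not how Visual SEO Studio is implemented; it only illustrates the idea of one XPath expression per column, an output mode (plain text, inner HTML, outer HTML) and a "first result or all results" switch. The sample page, the column names, the `render` and `extract` helpers and the " | " separator are all assumptions made for illustration, and the standard library parser used here supports only a subset of XPath and requires well-formed markup.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical pre-crawled page, kept well-formed so the stdlib parser accepts it.
PAGE = (
    "<html><body>"
    "<h1>Red <em>shoes</em></h1>"
    "<p class='price'>19.90</p>"
    "<p class='price'>24.50</p>"
    "</body></html>"
)

def render(node, mode):
    """Render one matched node as 'text', 'inner' or 'outer'."""
    if mode == "text":
        # Plain text: all text inside the node, tags dropped.
        return "".join(node.itertext())
    if mode == "inner":
        # Inner HTML: leading text plus every child serialized with its tail.
        children = "".join(ET.tostring(c, encoding="unicode") for c in node)
        return (node.text or "") + children
    if mode == "outer":
        # Outer HTML: the node itself, without text that follows its end tag.
        clone = copy.copy(node)  # shallow copy, the original tree is untouched
        clone.tail = None
        return ET.tostring(clone, encoding="unicode")
    raise ValueError(f"unknown mode: {mode}")

def extract(root, xpath, mode="text", keep_all=False, sep=" | "):
    """Return one cell value: first match by default, or all matches joined."""
    nodes = root.findall(xpath)
    if not keep_all:
        nodes = nodes[:1]
    return sep.join(render(n, mode) for n in nodes)

# Hypothetical column set: (column name, XPath, mode, keep all results?)
COLUMNS = [
    ("Title", ".//h1", "text", False),
    ("Title markup", ".//h1", "inner", False),
    ("First price", ".//p[@class='price']", "text", False),
    ("All prices", ".//p[@class='price']", "text", True),
]

root = ET.fromstring(PAGE)
row = {name: extract(root, xp, mode, keep) for name, xp, mode, keep in COLUMNS}
for name, value in row.items():
    print(f"{name}: {value}")
```

Each entry in `COLUMNS` carries its own mode and first-or-all choice, mirroring the point that every option is set per column. Joining multiple matches into a single cell is one possible reading of "reported in the same column"; the separator is arbitrary.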

Extracted data cells have "Show in code" and "Show in DOM" context menu options. In addition, the "Content", "DOM" and "Session" right panel views are available.

"Show in DOM" in Data Extraction

You can open the page behind each item's URL with the "Browse URL" context menu item.

Remember: each grid in Visual SEO Studio can be exported to Excel or CSV formats.

Export to Excel

Note: "Data Extraction" is available only in the Professional Edition. You can evaluate it for free for 15 days by registering the Trial version.

See also: