What types of websites/data can Octoparse scrape?
Updated over a week ago

Octoparse supports scraping 98% of all websites, including those that use AJAX, JavaScript, and other dynamic content. It can also interact with forms, drop-down lists, infinite scrolling, and more.

As a rule of thumb, any data that can be copied and pasted from a website can be scraped using Octoparse. More specifically, if the target data is present in the website's HTML source code (even if it is not visible on the webpage), it can be scraped using Octoparse.


1. Elements visible on the webpage:

  • Text

  • Image URL

  • Links (URLs)

  • Inner/Outer HTML code

  • Attribute Value

For more information, see: Extract attributes of a web element (text, URL, HTML, etc.)
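
To make the element types above concrete, here is a minimal sketch of where each one lives in a page's HTML. It uses Python's standard-library `html.parser`, not Octoparse itself, and does not reflect how Octoparse works internally. The sample markup, the `data-sku` attribute, and the `ElementCollector` class are all hypothetical names made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical snippet of a product page (not taken from any real site).
SAMPLE_HTML = """
<div class="product" data-sku="A-100">
  <a href="/items/a-100">Blue Mug</a>
  <img src="/img/a-100.png" alt="Blue Mug photo">
</div>
"""


class ElementCollector(HTMLParser):
    """Collects text, link URLs, image URLs, and one attribute value."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.links = []
        self.image_urls = []
        self.attributes = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])          # Links (URLs)
        elif tag == "img" and "src" in attrs:
            self.image_urls.append(attrs["src"])      # Image URL
        if "data-sku" in attrs:
            # Attribute value (data-sku is an assumed example attribute)
            self.attributes["data-sku"] = attrs["data-sku"]

    def handle_data(self, data):
        # Keep only non-empty text nodes
        if data.strip():
            self.text.append(data.strip())


collector = ElementCollector()
collector.feed(SAMPLE_HTML)
print(collector.text, collector.links, collector.image_urls, collector.attributes)
```

Each field collected here corresponds to one of the data types in the list: the text node, the `href` of a link, the `src` of an image, and the value of an arbitrary attribute.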


2. Any information hidden in the source code, such as:

  • Page URL

  • Page Title

  • Metadata

  • HTML source code

  • Current Time
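
As an illustration of what "hidden in the source code" means, the sketch below reads the page title and metadata from a page's `<head>`, which a visitor does not see in the page body. It again uses Python's standard-library `html.parser` rather than Octoparse, and the sample page and the `HeadCollector` class are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page source (not taken from any real site).
SAMPLE_PAGE = """
<html>
  <head>
    <title>Blue Mug | Example Shop</title>
    <meta name="description" content="A sturdy blue mug.">
    <meta property="og:type" content="product">
  </head>
  <body><h1>Blue Mug</h1></body>
</html>
"""


class HeadCollector(HTMLParser):
    """Collects the page title and <meta> name/property -> content pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "content" in attrs:
            # Metadata may be keyed by either "name" or "property"
            key = attrs.get("name") or attrs.get("property")
            if key:
                self.metadata[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Accumulate text only while inside <title>
        if self._in_title:
            self.title += data


head = HeadCollector()
head.feed(SAMPLE_PAGE)
print(head.title)
print(head.metadata)
```

None of these values appear in the visible page body, yet all of them are present in the HTML source, which is why they can still be extracted.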



3. What types of websites can't Octoparse scrape?

Currently, Octoparse cannot scrape data from:

  • XML sitemaps

  • PDF files


If you find it time-consuming to scrape data from complex websites or just want to concentrate on running your business to its full potential, please feel free to reach out to us for our Data Service.
