Octoparse supports scraping 98% of all websites, including dynamic websites built with AJAX and JavaScript. It also makes it easy to interact with forms, drop-down lists, infinite scrolling, and more.
As a rule of thumb, any data that can be copied and pasted from a website can be scraped using Octoparse. More specifically, if the target data is found within the website's HTML source code (even if it is not visible on the webpage), it can be scraped using Octoparse.
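The rule of thumb above can be checked by hand before building a task. The following is a minimal sketch, not part of Octoparse: it uses Python's standard library to test whether a target string appears anywhere in a page's HTML source, including in elements hidden from view. The names `TextCollector`, `is_in_source`, and `SAMPLE_HTML` are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects every text node and attribute value found in the HTML source."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Attribute values count as source data even though they are not rendered.
        for _name, value in attrs:
            if value:
                self.chunks.append(value)

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def is_in_source(html, target):
    """Return True if `target` occurs in any text node or attribute value."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return any(target in chunk for chunk in parser.chunks)


# Hypothetical page: the <span> is hidden by CSS but still present in the source.
SAMPLE_HTML = """
<html><body>
  <h1>Blue Widget</h1>
  <span style="display:none" data-sku="BW-1001">In stock: 42</span>
</body></html>
"""

print(is_in_source(SAMPLE_HTML, "In stock: 42"))  # hidden text
print(is_in_source(SAMPLE_HTML, "BW-1001"))       # attribute value
print(is_in_source(SAMPLE_HTML, "Red Widget"))    # not in the source at all
```

If the string is missing from the raw source, the data is probably loaded later by a script, and the page needs to be rendered before the check is meaningful.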
1. Elements visible on the webpage:
Text
Image URL
Links (URLs)
Inner/Outer HTML code
Attribute Value
For more information, see: Extract attributes of a web element (text, URL, HTML, etc.)
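To make the field types listed above concrete, here is a rough sketch of what each one corresponds to for a single element. It is not Octoparse's implementation: it uses Python's `xml.etree.ElementTree`, assumes the snippet is well-formed markup, and the names `SNIPPET`, `inner_html`, and `outer_html` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for part of a product page.
SNIPPET = """
<div class="product" data-id="1001">
  <a href="https://example.com/item/1001">Blue Widget</a>
  <img src="https://example.com/img/1001.png" alt="Blue Widget photo" />
</div>
"""


def outer_html(el):
    """Markup of the element itself, including its own tag."""
    return ET.tostring(el, encoding="unicode").strip()


def inner_html(el):
    """Markup between the element's opening and closing tags."""
    parts = [el.text or ""]
    for child in el:
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts).strip()


root = ET.fromstring(SNIPPET.strip())
link = root.find("a")
image = root.find("img")

fields = {
    "text": link.text,                     # Text
    "image_url": image.get("src"),         # Image URL
    "link_url": link.get("href"),          # Link (URL)
    "inner_html": inner_html(link),        # Inner HTML
    "outer_html": outer_html(link),        # Outer HTML
    "attribute_value": root.get("data-id"),  # Attribute value
}

for name, value in fields.items():
    print(f"{name}: {value}")
```

Real pages are rarely well-formed XML, so a tolerant HTML parser would be needed in practice; the sketch only illustrates how the field types differ.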
2. Any information hidden in the source code, such as:
Page URL
Page Title
Metadata
HTML source code
Current Time
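As an illustration of these page-level fields, the sketch below gathers them from a page's source with Python's standard library. It is an assumption-laden example rather than a description of how Octoparse works internally: `PageInfoParser`, `page_level_fields`, and the sample page are hypothetical, and the current time is simply the time at which the function runs.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser


class PageInfoParser(HTMLParser):
    """Collects the <title> text and <meta> name/property -> content pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_level_fields(page_url, html):
    """Return the page-level fields for one page as a dictionary."""
    parser = PageInfoParser()
    parser.feed(html)
    parser.close()
    return {
        "page_url": page_url,
        "page_title": parser.title.strip(),
        "metadata": parser.meta,
        "html_source": html,
        "current_time": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical page used only to exercise the function.
SAMPLE_PAGE = """<html><head>
<title>Blue Widget | Example Shop</title>
<meta name="description" content="A very blue widget.">
<meta property="og:type" content="product">
</head><body><h1>Blue Widget</h1></body></html>"""

info = page_level_fields("https://example.com/item/1001", SAMPLE_PAGE)
print(info["page_title"])
print(info["metadata"])
```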
Check out for more details:
3. What types of websites can't Octoparse scrape?
Currently, Octoparse is not capable of scraping data from:
XML sitemaps
PDF files
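When a list of target URLs is assembled in advance, it can help to spot these two formats before adding them to a task. The sketch below is one possible pre-check, not an Octoparse feature: it assumes the first bytes of the response body are already available and relies on the `%PDF-` signature that PDF files start with and on the `urlset` / `sitemapindex` root elements used by the sitemap protocol. The function name `unsupported_kind` is hypothetical.

```python
def unsupported_kind(body: bytes):
    """Return "pdf", "xml-sitemap", or None based on the start of a response body."""
    head = body.lstrip()[:1024]
    if head.startswith(b"%PDF-"):
        return "pdf"
    lowered = head.lower()
    if b"<urlset" in lowered or b"<sitemapindex" in lowered:
        return "xml-sitemap"
    return None


# Hypothetical response bodies for illustration.
pdf_body = b"%PDF-1.7\n..."
sitemap_body = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>'
)
html_body = b"<!DOCTYPE html><html><body>Hello</body></html>"

print(unsupported_kind(pdf_body))
print(unsupported_kind(sitemap_body))
print(unsupported_kind(html_body))
```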
If you find it time-consuming to scrape data from complex websites or just want to concentrate on running your business to its full potential, please feel free to reach out to us for our Data Service.