When to use Auto-detect
Use Auto-detect when:
- The page has repeated items such as products, listings, reviews, or search results
- You want a quick starting workflow
- You are not sure which elements to select manually
- You want Octoparse to suggest fields and pagination logic
- You want to quickly preview what data can be extracted from a target website before committing to a full configuration
- You plan to review and adjust the generated workflow afterward
How Auto-detect works
Auto-detect analyzes the page's DOM structure and visual layout, uses similarity calculation and feature combination to locate repeated data regions, and then generates extraction rules automatically.
Run Auto-detect
Octoparse scans the page, detects repeated data regions, and generates extraction fields. It also attempts to identify pagination or next-page buttons so the task can move through multiple result pages automatically.
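To make the idea of "repeated data regions" concrete, here is a minimal sketch in Python. It is not Octoparse's actual algorithm, which is not documented here. It only illustrates one simple way to spot repetition: counting how many direct children of the same parent share a tag and class. The names `RepeatedRegionFinder`, `find_repeated_regions`, and the `min_repeats` threshold are hypothetical and chosen for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class RepeatedRegionFinder(HTMLParser):
    """Illustrative only: finds parents whose direct children repeat a (tag, class) signature."""

    # Elements that never have a closing tag, so they are not pushed on the stack.
    VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self, min_repeats=3):
        super().__init__()
        self.min_repeats = min_repeats  # assumed threshold, not an Octoparse setting
        self.stack = []                 # open elements as (signature, Counter of child signatures)
        self.candidates = []            # (count, parent_signature, child_signature)

    def handle_starttag(self, tag, attrs):
        signature = (tag, dict(attrs).get("class") or "")
        if self.stack:
            self.stack[-1][1][signature] += 1  # count this element under its parent
        if tag not in self.VOID_TAGS:
            self.stack.append((signature, Counter()))

    def handle_endtag(self, tag):
        if not any(sig[0] == tag for sig, _ in self.stack):
            return  # stray closing tag: ignore it
        while self.stack:
            signature, children = self.stack.pop()
            self.record(signature, children)
            if signature[0] == tag:
                break

    def record(self, signature, children):
        for child_signature, count in children.items():
            if count >= self.min_repeats:
                self.candidates.append((count, signature, child_signature))


def find_repeated_regions(html, min_repeats=3):
    """Return candidate list regions, most frequently repeated first."""
    finder = RepeatedRegionFinder(min_repeats)
    finder.feed(html)
    finder.close()
    while finder.stack:  # flush elements that were never closed
        finder.record(*finder.stack.pop())
    return sorted(finder.candidates, reverse=True)


SAMPLE = """
<div class="results">
  <div class="item"><h2>Lamp</h2><span class="price">$20</span></div>
  <div class="item"><h2>Desk</h2><span class="price">$150</span></div>
  <div class="item"><h2>Chair</h2><span class="price">$85</span></div>
  <a class="next" href="?page=2">Next</a>
</div>
"""

regions = find_repeated_regions(SAMPLE)
# regions == [(3, ('div', 'results'), ('div', 'item'))]
```

In this sample, the three `div.item` elements under `div.results` are reported as one candidate list, while the single `a.next` link is not. A real detector would also need to weigh visual layout and compare the inner structure of each item, which this sketch leaves out.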
Confirm page navigation
Review pagination, scrolling, or next-page actions. Auto-detect may identify a next-page button or an infinite-scroll pattern. Verify that it works correctly before running the task at scale.
What to review after detection
After Auto-detect generates a workflow, check:
| Area | What to verify | How to fix |
|---|---|---|
| Fields | Are the correct values captured? | Re-select the element or adjust the XPath in the Data Preview panel |
| Field names | Are column names clear and meaningful? | Rename fields before export |
| Pagination | Does the task move to the next page correctly? | Manually set the next-page button or scroll action |
| Detail pages | Does the workflow open item details when needed? | Add a click action to enter detail pages |
| Duplicates | Are repeated or unwanted elements included? | Delete unwanted fields or rows in the Data Preview panel |
| Missing values | Are some rows missing important fields? | Add fields manually in the Data Preview panel |
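Some of these checks can also be run on a small sample of extracted data outside Octoparse. The sketch below is one possible approach, not an Octoparse feature: it flags rows with missing required fields, exact duplicate rows, and price values that do not look like prices. The function name `review_rows`, the field names `title` and `price`, and the price pattern are assumptions made for illustration. In practice the rows might come from an exported CSV file read with `csv.DictReader`.

```python
import re

# Assumed price format for this sketch: optional "$", digits, optional cents.
PRICE_PATTERN = re.compile(r"^\$?\d+(?:\.\d{1,2})?$")


def review_rows(rows, required_fields, price_field=None):
    """Return (row_index, problem) tuples for rows that need a closer look."""
    problems = []
    seen = set()
    for index, row in enumerate(rows):
        # Missing values: a required field is absent or blank.
        for field in required_fields:
            if not (row.get(field) or "").strip():
                problems.append((index, f"missing {field}"))

        # Duplicates: the same field/value pairs appeared in an earlier row.
        key = tuple(sorted(row.items()))
        if key in seen:
            problems.append((index, "duplicate row"))
        seen.add(key)

        # Wrong field captured: the price column holds something that is not a price.
        if price_field:
            value = (row.get(price_field) or "").strip()
            if value and not PRICE_PATTERN.match(value):
                problems.append((index, f"unexpected {price_field} value: {value!r}"))
    return problems


rows = [
    {"title": "Lamp", "price": "$20"},
    {"title": "Desk", "price": "20% off"},  # discount label captured instead of the price
    {"title": "", "price": "$85"},          # missing title
    {"title": "Lamp", "price": "$20"},      # duplicate of the first row
]

problems = review_rows(rows, required_fields=["title", "price"], price_field="price")
# problems == [(1, "unexpected price value: '20% off'"),
#              (2, "missing title"),
#              (3, "duplicate row")]
```

Each flagged row points back to an area in the table above: a bad price suggests re-selecting the element, a missing title suggests adding or fixing a field, and a duplicate suggests removing unwanted elements from the detected list.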
Known limitations
Auto-detect works best on pages with clear, repeated data structures. It may produce incomplete or inaccurate results in certain situations:
| Situation | What may happen |
|---|---|
| Non-standard or irregular page layout | Fields may be misidentified or missing |
| JavaScript-rendered dynamic content | Data that loads after the initial page render may not be detected |
| Websites with anti-scraping measures (e.g. Indeed, LinkedIn) | Auto-detect may fail to load or parse the page |
| Complex nested structures | The detected list may include incorrect or duplicate elements |
| Promotional or decorative text next to the target value | Auto-detect may capture the wrong field (e.g. a discount label instead of the real price) |
When manual editing is needed
Manual adjustments may be needed when:
- The page layout is irregular
- Important fields are outside the detected list
- The website loads content dynamically
- Pagination is not detected correctly
- The page requires login, filters, popups, or user interaction
- You need to extract data from detail pages
Related pages
Templates
Start with a prebuilt workflow for common websites.
No-code builder
Adjust or build workflows manually after Auto-detect.
Refine data
Clean and reformat extracted fields before export.