Workflow overview
Build
Start from a URL, template, or custom task. Select the data fields you want and define actions such as clicking, scrolling, pagination, and opening detail pages.
Test
Run a small sample to confirm that Octoparse captures the right fields, records, and page sequence.
Run
Execute the task locally for testing or in the cloud for scheduled, unattended, and larger-scale extraction.
Build the task
A task defines how Octoparse interacts with a website. You can build a task by:

- Using a template
- Letting Auto-detect identify page data automatically
- Selecting elements manually in the no-code builder
- Adding actions such as click, scroll, loop, pagination, and wait
- Refining field values before export
Test the extraction logic
Before running a task at scale, test a small sample. Check whether:

| Check | Why it matters |
|---|---|
| Fields are correct | Prevents exporting the wrong values |
| Field names are clear | Makes downstream data easier to use |
| Pagination works | Ensures the task moves across result pages |
| Detail pages open correctly | Confirms nested page workflows are captured |
| Sample output looks clean | Reduces cleanup after export |
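
Some of the checks above can also be run outside Octoparse on a small exported sample. The following is a minimal sketch, assuming the sample was exported as CSV. The field names (`title`, `price`, `url`), the sample rows, and the `check_sample` helper are hypothetical and are not part of Octoparse.

```python
import csv
import io

# Hypothetical sample export. In practice this text would come from a
# CSV file exported after a small test run.
SAMPLE_CSV = """title,price,url
Widget A,19.99,https://example.com/a
Widget B,,https://example.com/b
Widget A,19.99,https://example.com/a
"""

# Assumed field names for this illustration.
EXPECTED_FIELDS = ["title", "price", "url"]


def check_sample(csv_text, expected_fields):
    """Summarise basic quality checks on a CSV sample."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    header = reader.fieldnames or []

    # Fields we expected but did not find in the header row.
    missing_fields = [f for f in expected_fields if f not in header]

    # Number of blank values per column.
    empty_counts = {
        f: sum(1 for r in rows if not (r.get(f) or "").strip())
        for f in header
    }

    # Rows that exactly repeat an earlier row.
    seen = set()
    duplicates = 0
    for r in rows:
        key = tuple(r.get(f) for f in header)
        if key in seen:
            duplicates += 1
        seen.add(key)

    return {
        "rows": len(rows),
        "missing_fields": missing_fields,
        "empty_counts": empty_counts,
        "duplicates": duplicates,
    }


report = check_sample(SAMPLE_CSV, EXPECTED_FIELDS)
print(report)
```

A non-empty `missing_fields` list or a high blank count in one column usually points to a selector or field mapping worth revisiting. Repeated rows may indicate that pagination is revisiting the same page.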
Run locally or in the cloud
Octoparse supports different run options depending on the task and your plan.

| Run type | Best for |
|---|---|
| Local extraction | Testing, debugging, and tasks that rely on your local environment |
| Cloud extraction | Scheduled, unattended, and higher-volume extraction |
| Boost mode | Cloud tasks that need more speed or concurrency, when supported |
Export the data
After a task runs, Octoparse stores the extracted results as structured records. Common export destinations include:

- CSV
- Excel
- JSON
- HTML
- XML
- Google Sheets
- Databases
- Cloud storage
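
Once exported, the records can be read with standard tooling. The following is a minimal sketch of loading a CSV or JSON export into a list of dictionaries with the Python standard library. It assumes the JSON export is an array of objects with one object per record. That layout, the `load_records` helper, and the sample data are assumptions for illustration only.

```python
import csv
import io
import json


def load_records(text, fmt):
    """Parse exported text into a list of dict records.

    fmt is "csv" or "json". The JSON layout (an array of objects)
    is an assumption for this sketch.
    """
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        data = json.loads(text)
        if not isinstance(data, list):
            raise ValueError("expected a JSON array of records")
        return data
    raise ValueError(f"unsupported format: {fmt}")


# Hypothetical exports containing the same single record.
csv_text = "title,price\nWidget A,19.99\n"
json_text = '[{"title": "Widget A", "price": "19.99"}]'

csv_records = load_records(csv_text, "csv")
json_records = load_records(json_text, "json")
print(csv_records)
print(json_records)
```

CSV values always arrive as strings, so numeric fields such as prices need explicit conversion before analysis. JSON can carry numbers natively, depending on how the fields were exported.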
Related pages
Local vs cloud extraction
Compare run environments and choose the right execution mode.
Export formats
Learn which output formats Octoparse supports.