Octoparse CLI is a command-line tool for creating, running, managing, and exporting Octoparse tasks from a terminal. Unlike the desktop app, Octoparse CLI is designed for scripted workflows, CI/CD pipelines, server environments, and automated data pipelines.

What Octoparse CLI does

Create tasks from a URL

Inspect any URL and generate a local task file automatically, manually, or through an AI agent workflow.

Find and inspect tasks

List cloud tasks, search by keyword, and inspect task details by task ID.

Run tasks locally

Run Octoparse tasks with the embedded local engine and independent Chrome.

Control task runs

Start or stop cloud extraction, and pause, resume, stop, or clean up local runs.

Export data

Export local or cloud task data as XLSX, CSV, HTML, JSON, or XML.

Automate with agents

Use JSON output, JSONL event streams, and exit codes to integrate the CLI into scripts and AI agents.
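As a rough illustration of exit-code integration, the sketch below wraps any command and turns its exit code into a log line. The helper name run_step is hypothetical, and the only assumption about Octoparse CLI is the usual shell convention that zero means success and non-zero means failure; specific exit code values are not documented here.

```shell
# Run a command and translate its exit code into a log line.
# run_step is a hypothetical helper, not part of Octoparse CLI.
# Assumption: 0 means success, anything else means failure.
run_step() {
  label="$1"
  shift
  status=0
  "$@" || status=$?   # capture the exit code without aborting the script
  if [ "$status" -eq 0 ]; then
    echo "[ok] $label"
  else
    echo "[failed] $label (exit code $status)"
  fi
  return "$status"
}

# Example call (TASK_ID is a placeholder variable you would set yourself):
# run_step "export" octoparse data export "$TASK_ID" --source local --format xlsx
```

Because run_step returns the original exit code, a calling script or agent can still branch on the result with if, &&, or ||.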

How it works

Octoparse CLI runs the embedded Octoparse engine directly for local extraction. It uses independent Chrome and does not require the Electron desktop client. Cloud extraction is controlled through backend APIs. Local extraction is controlled by the local engine.
Local run status is tracked by the CLI and is not synchronized with the status shown in the Octoparse desktop client.

Requirements

Before using Octoparse CLI, make sure you have:
  • Node.js 20 or newer
  • npm 8 or newer
  • An Octoparse account (API key or OAuth)
  • Access to the tasks you want to run or export
See Installation for platform requirements and setup steps, including Linux arm64 limitations. Functional commands require authentication, including local runs that use --task-file or .otd files. Setup and diagnostic commands such as --help, --version, doctor, browser doctor, capabilities, and auth run without authentication.
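The Node.js and npm minimums above can be checked before installing. The following is a minimal sketch, not an official installer check: the helper names major_version, meets_minimum, and check_tool are hypothetical, and it relies only on the standard node --version and npm --version commands.

```shell
# Extract the major version from strings such as "v20.11.1" or "10.2.4".
major_version() {
  version="${1#v}"                 # drop a leading "v" if present
  printf '%s\n' "${version%%.*}"   # keep the digits before the first dot
}

# Succeed when the major version in $1 is at least the minimum in $2.
meets_minimum() {
  [ "$(major_version "$1")" -ge "$2" ] 2>/dev/null
}

# Report whether an installed tool satisfies its minimum major version.
check_tool() {
  tool="$1"
  minimum="$2"
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: not found (need $minimum or newer)"
    return 1
  fi
  found="$("$tool" --version)"
  if meets_minimum "$found" "$minimum"; then
    echo "$tool: $found meets the minimum ($minimum)"
  else
    echo "$tool: $found is older than $minimum"
    return 1
  fi
}

# Minimums taken from the requirements list above.
check_tool node 20 || true
check_tool npm 8 || true
```

Once the CLI is installed, octoparse doctor is the documented way to verify the local runtime; this sketch only covers the two prerequisites that exist before installation.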

Common workflow

1. Install Octoparse CLI

Install the CLI globally with npm.

2. Authenticate

Log in with OAuth or an Octoparse API key. For CI, use the OCTO_ENGINE_API_KEY environment variable.
Do not expose API keys in Git commits, scripts, docs, screenshots, shared logs, or CI output.
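For CI jobs, it can help to fail early when the key is missing while never printing its value. The sketch below assumes only the OCTO_ENGINE_API_KEY variable named above; the helper name require_api_key is hypothetical.

```shell
# Succeed only when OCTO_ENGINE_API_KEY is set and non-empty.
# The value itself is never echoed, so it cannot leak into CI output.
# require_api_key is a hypothetical helper, not part of Octoparse CLI.
require_api_key() {
  if [ -z "${OCTO_ENGINE_API_KEY:-}" ]; then
    echo "OCTO_ENGINE_API_KEY is not set" >&2
    return 1
  fi
  echo "OCTO_ENGINE_API_KEY is set"
}

# Example: stop the job before any functional command runs.
# require_api_key || exit 1
```

Supply the key through your CI provider's secret store rather than writing it into the pipeline file.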

3. Create or find a task

Use octoparse detect <url> to generate a task file from a URL, or octoparse task list to find an existing task ID.

4. Run or control the task

Run the task locally, start or stop a cloud run, or check run status.

5. Export data

Export collected data from local or cloud results.
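The steps above can be chained in a script. This is a sketch under stated assumptions: it uses only commands that appear on this page, the octo wrapper and workflow function are hypothetical names, and DRY_RUN defaults to printing each command instead of running it. Whether octoparse run returns only after the run finishes is not stated here, so treat the ordering as illustrative.

```shell
# Print each command instead of running it unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, execute it otherwise.
octo() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "octoparse $*"
  else
    octoparse "$@"
  fi
}

# Chain the workflow for one task ID; && stops at the first failing step.
workflow() {
  task_id="$1"
  octo doctor &&
    octo task inspect "$task_id" &&
    octo run "$task_id" &&
    octo local status "$task_id" &&
    octo data export "$task_id" --source local --format xlsx
}

# Example (replace the placeholder with a real task ID):
# DRY_RUN=0 workflow "your-task-id"
```

Starting in dry-run mode lets you review the exact command sequence before anything runs against your account.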

Quick command examples

Replace <taskId> with your actual task ID and <url> with the page you want to scrape.
octoparse --help
octoparse doctor
octoparse auth login
octoparse detect <url> --auto --goal "Extract product titles and prices" --output task.json
octoparse task list
octoparse task inspect <taskId>
octoparse run <taskId>
octoparse cloud start <taskId>
octoparse local status <taskId>
octoparse data export <taskId> --source local --format xlsx

Current limitations

Octoparse CLI v1 does not support the kernel browser or legacy workflows. The kernel browser is the browser mode used by older Octoparse runtime workflows. Legacy workflows are tasks created with older Octoparse task definitions that the current CLI runtime does not support. If a task is not supported, rebuild or update it in the current Octoparse desktop app, then run it again with the CLI.

Next steps

Install Octoparse CLI

Set up Node.js, install the CLI with npm, and verify the local runtime.

Create tasks from a URL

Use detect to generate a task file from any URL.

Browse all commands

Review task, detect, local run, cloud run, authentication, and export commands.