Octoparse CLI

Octoparse CLI for terminal, CI/CD, and AI agents.

The web scraping engine your team can run from a laptop, a CI pipeline, or inside the AI agent you're shipping next quarter — same binary, same contract.

Read the Docs

Free trial · no credit card
Cross-platform
Stable contract

Octoparse CLI: three teams, one binary

Same CLI. Same exit codes. Same JSON contract, whether it's running on a laptop, in CI, or inside an agent loop.

Developers

One npm install. Run web scraping locally with a bundled engine — your scraped rows never leave the machine.

DevOps Teams

Drop the CLI into GitHub Actions, Docker, Airflow, or cron. Stable exit codes, env-var auth that never touches disk — passes security review on the first round.

AI Agents

Hand the CLI to Claude, Cursor, or your own agent loop. JSONL streaming lets the agent plan the next step before the run finishes.

One command. Three deployment stories

Same binary on your laptop, your CI pipeline, or inside an agent — predictable enough to put on the on-call rotation.

“I just need this CSV before standup.”

A growth analyst pulls competitor pricing every morning into a Jupyter notebook. One run + one data export — fresh sheet before coffee, no Selenium to babysit.

$ octoparse run lp-pricing
✓ 248 rows → pricing.csv

Setup time: ~90 seconds

Daily runtime: < 3 min

“Wire it into our weekly data pull.”

A retail data team runs scheduled extractions in CI every Monday 06:00 UTC. Stable exit codes route success downstream, failures straight to on-call — zero containers to maintain.

# .github/workflows/pull.yml
- run: octoparse run $TASK --json
- run: dbt build

On-call pages: 7 → 0 / month

Stack saved: Selenium fleet
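The exit-code routing described in this story can be sketched as a small wrapper. This is a minimal illustration, not Octoparse's documented behavior: it assumes the usual convention that exit code 0 means success and anything else means failure, and the `run_task` and `route` helpers are hypothetical names.

```python
import subprocess
import sys


def run_task(command: list[str]) -> int:
    """Run a CLI command and return its exit code."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode


def route(exit_code: int) -> str:
    """Map an exit code to a pipeline decision.

    Assumption: 0 means success, any other value means failure.
    """
    return "continue" if exit_code == 0 else "page-on-call"
```

In a scheduled job, something like `route(run_task(["octoparse", "run", task_name, "--json"]))` would decide whether the downstream `dbt build` step runs, where `task_name` stands in for whatever task the team has defined.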

“Hand the CLI to my agent as a tool.”

A vertical-AI startup exposes the CLI inside Claude / Cursor as a structured tool. JSONL streaming gives the agent row-by-row feedback so it can plan the next step before the run finishes.

tool: octoparse.run
stream: jsonl
next_action: enrich rows

Robustness: guaranteed

Time to set up: < 2 seconds
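Row-by-row feedback of this kind can be sketched as a generator that parses each JSONL line as it arrives, so a caller can act before the stream ends. This is a sketch under assumptions: the `price` field and the `plan_next_action` rule are hypothetical, and the actual row schema is whatever the task emits.

```python
import json
from typing import Iterable, Iterator


def iter_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty JSONL line, as soon as it arrives."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        yield json.loads(line)


def plan_next_action(row: dict) -> str:
    """Hypothetical planner: flag rows with a missing price for enrichment."""
    return "enrich" if row.get("price") is None else "keep"
```

Because `iter_rows` accepts any iterable of lines, it can wrap the text-mode standard output of a subprocess as readily as an in-memory list, which keeps the agent side easy to test without a live run.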

Why teams keep coming back to Octoparse

Six reasons our customers pick Octoparse and stay.

Global coverage out of the box

200+ ready-to-run templates: Amazon, LinkedIn, Google Maps, YouTube, Yelp, Hacker News, Reddit, and more. One REST shape, the same canonical fields, no XPath or selector maintenance.

8 years of scraping infrastructure

Browser pool, proxy rotation, anti-bot, pagination, structured export — battle-tested since 2018.

Your data. Your rules.

Your runs, your bytes. We don't resell, redistribute, or train on the data we extract for you. Set a retention window, hit delete, gone. Every run gets a trace_id you can audit or replay.

Structured output, every format

JSON, JSONL, CSV, XLSX, XML — same canonical shape. Stream straight into Snowflake via Airbyte, dbt, Airflow, or your own ETL.
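To illustrate what a shared shape across formats buys you, here is a minimal sketch that turns JSONL rows into CSV using only the Python standard library. It assumes every row carries the same fields as the first one; the `jsonl_to_csv` helper and the sample field names are hypothetical and not part of the CLI.

```python
import csv
import io
import json


def jsonl_to_csv(jsonl_text: str) -> str:
    """Convert JSONL text to CSV, using the first row's keys as the header."""
    rows = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    if not rows:
        return ""
    buffer = io.StringIO()
    # Assumption: all rows share the first row's fields; extra keys raise ValueError.
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

When rows really do share one set of fields, the conversion needs no per-source mapping code, which is the practical benefit of a canonical shape.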

Built for AI from day one

Works natively with Claude, GPT, Cursor, Cline, Dify, and LangChain. JSONL streaming means your agent can plan the next step before the run finishes.

Best value in the category

Free trial — no credit card. Transparent metered pricing after. Teams report replacing in-house scraping stacks at 1/18 the cost of headcount.

Quiet enough to never page you

Built on eight years of scraping infrastructure — and on feedback from teams already running it in production.

3M+ cloud run hours

99.97% cloud availability · 90d

31 OS benchmarks

8y scraping infrastructure

"We went from a Selenium fleet on three EC2 boxes to one CLI invocation in a GitHub Action."

Ravi P., Staff DevOps · D2C retail platform

"Our agent loop calls it as a tool. JSONL streaming means it can plan the next step before the run finishes. Game-changer for product UX."

Elena N., Founding engineer · vertical-AI startup

"Stable exit codes, env-var auth — passed our security review on the first round. That almost never happens with scraping tools."

Thomas K., Security architect · Enterprise SaaS

Powering data & AI teams at

Lumen Labs · Northwind · Quanta AI · Drift Retail · Helio Capital · Mosaic.io · Plurabank · FieldNote · Stride Health · Argon Foods · Pivotsoft · Cobalt & Co.

Frequently asked questions

How is the CLI different from the API or MCP?

Is the CLI free?

Which operating systems are supported?

How do I use it in GitHub Actions / Docker / on a server?

When should I run locally vs in the cloud?

Retire the scraper. Keep the data

Free trial. No credit card. Most teams have it running in CI before the daily standup.

Start free trial

Talk to sales