Is Octoparse able to handle CAPTCHA/reCAPTCHA?
Thursday, August 16, 2018
Captcha and reCaptcha are common anti-scraping techniques used by many websites. A site may ask you to solve a Captcha before you can log in to your account or access its data.
Although Octoparse cannot solve a Captcha automatically, there are workarounds for this issue.
Manually enter Captcha in local extraction
1. When building a task, click the text box in the built-in browser and enter the Captcha manually.
2. Set a sufficient wait time before the login-button click, or before whichever step follows the Captcha, so that you have time to solve it.
3. When running the task locally, you can enter the Captcha manually, or solve other types of Captcha, in the extraction window.
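For readers who script their own scrapers, the wait in step 2 can be sketched as a polling loop instead of a fixed pause. This is not how Octoparse implements its wait time; `wait_until` and its parameters are hypothetical names, and `is_done` stands in for whatever check tells your script that the Captcha has been solved by hand.

```python
import time


def wait_until(is_done, timeout_s=120.0, poll_s=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll is_done() until it returns True or timeout_s seconds elapse.

    Returns True if the condition was met, False on timeout.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

A generous timeout gives the person at the keyboard time to solve the Captcha, while polling lets the script continue as soon as they are done.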
Save cookies to avoid encountering Captcha
Entering the Captcha manually every time you run a task is inconvenient, and it cannot be done in cloud extraction. If the website supports cookies, you can save the cookies to stay logged in.
To learn how to save login cookies, see this tutorial: [Click here ]
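The same idea can be illustrated outside Octoparse with Python's standard library. The sketch below is an assumption-laden example, not Octoparse's own mechanism: `open_session` and `make_login_cookie` are hypothetical helpers, and the cookie built by `make_login_cookie` only stands in for one a real site would set after a successful login.

```python
import http.cookiejar
import time
import urllib.request


def open_session(cookie_file):
    """Return (opener, jar), reloading cookies from cookie_file if it exists."""
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    try:
        jar.load()  # reuse cookies saved by an earlier run
    except FileNotFoundError:
        pass  # first run: nothing saved yet
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def make_login_cookie(name, value, domain, lifetime_s=3600):
    """Hypothetical stand-in for a cookie a site sets after login."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=int(time.time()) + lifetime_s, discard=False,
        comment=None, comment_url=None, rest={},
    )
```

In a real script you would log in once through `opener.open(...)`, call `jar.save()`, and let later runs pick up the saved cookies via `open_session`, so the login Captcha is not shown again while the session stays valid.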
A Captcha encountered during the scraping process cannot currently be solved. We suggest slowing down the extraction by using the wait time function. [Click here ]
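In a hand-written scraper, slowing down usually means pausing between requests. The following is a minimal sketch under stated assumptions: `fetch_all`, `polite_delay`, and the default delays are hypothetical, `fetch` is whatever function your script uses to download one page, and none of this reflects Octoparse's internal wait time setting.

```python
import random
import time


def polite_delay(base_s=2.0, jitter_s=1.0, sleep=time.sleep):
    """Sleep for base_s plus a random extra of up to jitter_s seconds."""
    delay = base_s + random.uniform(0, jitter_s)
    sleep(delay)
    return delay


def fetch_all(urls, fetch, base_s=2.0, jitter_s=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            # No pause is needed before the very first request.
            polite_delay(base_s, jitter_s, sleep)
        results.append(fetch(url))
    return results
```

The random jitter keeps requests from arriving at a perfectly regular interval; suitable delay values depend on the site and are a matter of trial and error.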