Is Octoparse able to handle CAPTCHA/reCAPTCHA?
Tuesday, December 3, 2019
A newer version of this tutorial is available. Check it for the latest instructions.
Captcha and reCaptcha are common anti-scraping techniques used by many websites. A site may ask you to solve a Captcha before you can log in to your account or access the data.
Although Octoparse cannot deal with Captcha automatically, there are workarounds to this issue.
Manually enter Captcha in local extraction.
1. Click the text box to enter Captcha manually in the built-in browser when building a task.
2. Set up enough wait time before the step that clicks the login button, or before whichever step follows solving the Captcha, so that you have time to enter it.
3. When running the task locally, you can manually enter the Captcha or solve other types of Captcha in the extraction window.
Save cookies to avoid encountering Captcha
Manually entering the Captcha every time you run a task is inconvenient, and it cannot be done in cloud extraction. If the website supports cookies, you can save the cookies to remain logged in.
For details on how to save login cookies, see this tutorial: [Click here]
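In Octoparse, cookies are saved through the task settings as described in the linked tutorial. For readers curious about the underlying idea, the following is a minimal Python sketch of persisting login cookies to a file and reloading them on the next run, using only the standard library. It is not how Octoparse implements the feature; the function names `make_opener` and `save_cookies` and the file name are hypothetical.

```python
import http.cookiejar
import urllib.request


def make_opener(cookie_file):
    """Build a URL opener backed by a cookie jar stored at cookie_file.

    Cookies saved by an earlier run are loaded if the file exists,
    so a previously established login session can be reused.
    """
    jar = http.cookiejar.LWPCookieJar(cookie_file)
    try:
        # ignore_discard=True also restores session cookies,
        # which login systems commonly use.
        jar.load(ignore_discard=True)
    except FileNotFoundError:
        pass  # First run: no saved cookies yet.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def save_cookies(jar):
    """Write the jar's cookies (including session cookies) to its file."""
    jar.save(ignore_discard=True)
```

A typical flow under these assumptions would be: log in once through `opener.open(...)`, solving the Captcha by hand, call `save_cookies(jar)`, and on later runs call `make_opener` with the same file so requests carry the saved session. This only helps while the website still accepts the saved cookies.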
A Captcha encountered during the scraping process cannot currently be solved. We suggest slowing down the extraction by using the wait time function. [Click here]
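Octoparse sets wait time through its own workflow options. As a rough illustration of the same idea outside Octoparse, the Python sketch below pauses for a base interval plus a random extra delay between consecutive requests. The function name `fetch_all`, its parameters, and the default timings are assumptions chosen for illustration, not recommended values.

```python
import random
import time


def fetch_all(urls, fetch, base_seconds=2.0, jitter_seconds=1.0,
              sleep=time.sleep):
    """Call fetch(url) for each URL, waiting between consecutive calls.

    The wait is base_seconds plus a random amount up to jitter_seconds,
    so requests are spaced out rather than sent back to back.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # No wait before the first request, only between requests.
            sleep(base_seconds + random.uniform(0, jitter_seconds))
        results.append(fetch(url))
    return results
```

Here `fetch` is any function that takes a URL and returns the page content. Passing `sleep` as a parameter makes the delay easy to replace in tests.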
If you only need a Captcha-solving service, you may find more solutions from 2Captcha or similar service providers.