How to Bypass a CAPTCHA When Extracting Data From Web Pages

Have you ever been asked to read blurred letters and type them into a box? That’s a CAPTCHA.

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a method that websites use to tell robots and humans apart when they access a page. CAPTCHAs exist to stop automated access, such as a scripted login. The result is an ongoing struggle between CAPTCHA providers and those who want to beat the system by bypassing them.

Many websites use CAPTCHAs to keep robots away from their pages, which makes it tricky to extract data from them. So, is it possible to bypass the CAPTCHA when extracting data from web pages?

There are ways to get around a CAPTCHA. With the right technique, a program can get past the verification code. The most common way is to hook your program up to a service in an offshore center, where someone sits in front of a screen all day filling in those little authentication boxes.
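Whichever approach is used, a program first has to notice that the page it fetched is a CAPTCHA challenge rather than the data it wanted. The snippet below is only a minimal sketch of that detection step in Python. The function name `looks_like_captcha` and the marker list are assumptions made for illustration: `g-recaptcha` and `h-captcha` are the widget class names used by two common providers, and real pages may need other markers.

```python
# Illustrative marker strings (an assumption, not a complete list).
# "g-recaptcha" and "h-captcha" are widget class names used by two
# common providers; "captcha" is a broad catch-all.
CAPTCHA_MARKERS = ("g-recaptcha", "h-captcha", "captcha")


def looks_like_captcha(html: str) -> bool:
    """Return True if the page source contains any CAPTCHA marker."""
    lowered = html.lower()  # case-insensitive substring match
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


if __name__ == "__main__":
    challenge_page = '<form><div class="g-recaptcha"></div></form>'
    normal_page = "<html><body><p>Product list</p></body></html>"
    print(looks_like_captcha(challenge_page))
    print(looks_like_captcha(normal_page))
```

A check like this lets an extraction job skip the page or flag it for a human instead of saving the challenge page as if it were data. Plain substring matching is crude and can give false positives on pages that merely mention the word.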

So far, Octoparse does not handle CAPTCHAs, but we plan to catch up.

