First of all, I am neither a lawyer nor an expert. This article is based only on my experience working at Octoparse. If you're facing a real legal problem, please seek legal assistance accordingly.
Web crawling, also known as data scraping, is a technique in which a computer program collects large amounts of data from websites, extracting regularly formatted data and processing it into easy-to-read structured formats. Its uses, for businesses and individuals alike, are countless.
Is web crawling legal? Well, it depends, since there's a lot of uncertainty regarding the legality of web crawling. If you're doing web crawling for your own purposes, it is generally considered legal, as it falls under the fair use doctrine. The complications start if you want to use scraped data for other purposes, especially commercial ones.
To quote Wikipedia.org: "eBay v. Bidder's Edge, 100 F.Supp.2d 1058 (N.D. Cal. 2000), was a leading case applying the trespass to chattels doctrine to online activities. In 2000, eBay, an online auction company, successfully used the 'trespass to chattels' theory to obtain a preliminary injunction preventing Bidder's Edge, an auction data aggregator, from using a 'crawler' to gather data from eBay's website. The opinion was a leading case applying 'trespass to chattels' to online activities, although its analysis has been criticized in more recent jurisprudence."
As long as you are not crawling at a disruptive rate and the source is public, you should generally be fine.
I suggest you check the websites you plan to crawl for any Terms of Service clauses related to scraping of their intellectual property. If it says "no scraping or crawling", I think you should respect that.
Here are my suggestions.
1. Scrape websites discreetly. Don't scrape websites at a disruptive or abusive rate, without regard to the load you're placing on the target servers.
2. Use the data discreetly. It's better for everyone concerned if you find a way to use the info you scraped without being forced to reveal its source. You may run into problems using scraped data if it is copyrighted. Use the data for legal purposes only.
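To make the first suggestion a bit more concrete, here is a minimal Python sketch of one way to keep a crawler from hitting a site too fast. The names Throttle and polite_fetch, the two-second delay, and the User-Agent string are all my own assumptions for illustration. None of them is a legal threshold or an official standard, so pick values that suit the site you are crawling.

```python
import time
import urllib.request
from urllib.parse import urlparse


class Throttle:
    """Enforce a minimum delay between successive requests to the same host.

    Hypothetical helper for illustration; the default delay is an assumption.
    """

    def __init__(self, min_delay_seconds=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay_seconds
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # host -> time of the most recent request

    def wait(self, url):
        """Sleep if needed so requests to this URL's host stay spaced out.

        Returns the number of seconds slept.
        """
        host = urlparse(url).netloc
        last = self._last_request.get(host)
        pause = 0.0
        if last is not None:
            elapsed = self._clock() - last
            pause = max(0.0, self.min_delay - elapsed)
        if pause > 0:
            self._sleep(pause)
        # Record the time after any pause, so the next gap is measured from here.
        self._last_request[host] = self._clock()
        return pause


def polite_fetch(urls, throttle, user_agent="example-crawler/0.1"):
    """Fetch each URL in turn, pausing between requests to the same host."""
    pages = {}
    for url in urls:
        throttle.wait(url)
        request = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(request, timeout=10) as response:
            pages[url] = response.read()
    return pages
```

You would call it with something like polite_fetch(["https://example.com/page1", "https://example.com/page2"], Throttle(2.0)), where the URLs are placeholders. Requests to the same host are spaced at least two seconds apart, while requests to different hosts are not delayed by each other.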
Author: The Octoparse Team