
Build your blog fast with Web Scraping

Thursday, August 10, 2017

Building a blog fast usually comes down to one concept: content curation. “Content curation is the process of gathering information relevant to a particular topic or area of interest” (quoted from Wikipedia). In simpler terms, content curation means sorting through large amounts of content on the web and presenting the best posts in a meaningful, organized way.

A new blog can grow very fast with the right strategy, and content curation is one of the best. You share content instead of creating it, which saves you a lot of time while still attracting an audience to your blog. The hard part is finding the right content: reading through everything on the Internet is not practical, but there is a way that I want to share with you.

In just two steps, you will be able to identify the best content for your blog.


Step 1. Find websites relevant to your blog.

Almost every website has a theme. Once you have settled on your own blog’s theme, look for websites that are relevant to it and that perform well in their market, and note them down in a list.


Step 2. Use Web Scraper Octoparse to extract information for you

Now it’s time to discover the right content for your blog. For a new blog, content should be popular first and relevant second: give more weight to an article’s popularity than to its relevance to your blog, and a connection through a few keywords is enough. So when you extract data with Octoparse, the only things you need to focus on are each article’s views, ratings, and similar metrics. Here is a set of data that I scraped with Octoparse; let’s see what we can do with it. (Find out how to use Octoparse in Tutorials)



(Extraction running on Octoparse)



The data shown above is what I exported from Octoparse. It lists each article’s total views, today’s views, and title; the first two fields indicate how popular an article is. We can sort the data in Excel by either total views or today’s views to find the hottest articles, and pick the top five. Skim the articles you have chosen to check that their content suits your blog. Then post the selected articles on your blog, and remember to credit the original sources.
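If you would rather not sort by hand in Excel, the same ranking can be sketched in a few lines of Python. This is only a rough illustration, not part of Octoparse itself: it assumes the export was saved as a CSV file, and the column names (title, total_views, today_views) and the sample rows are made up for the example, so adjust them to match your own export.

```python
import csv
import io


def parse_views(raw):
    """Convert a view count such as '1,204' or ' 87 ' to an int."""
    return int(raw.replace(",", "").strip())


def top_articles(csv_file, sort_by="total_views", n=5):
    """Return the n rows with the highest value in the sort_by column."""
    rows = list(csv.DictReader(csv_file))
    rows.sort(key=lambda row: parse_views(row[sort_by]), reverse=True)
    return rows[:n]


# Made-up sample rows standing in for an exported file (hypothetical columns).
SAMPLE = """title,total_views,today_views
Post A,"1,204",15
Post B,980,140
Post C,15300,9
Post D,410,77
Post E,2750,31
Post F,66,66
"""

if __name__ == "__main__":
    # Rank by today's views; pass sort_by="total_views" for all-time popularity.
    for row in top_articles(io.StringIO(SAMPLE), sort_by="today_views"):
        print(row["today_views"], row["title"])
```

To run it on a real export, open the file with open("your_export.csv", newline="", encoding="utf-8") and pass the file object to top_articles in place of the sample string.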

Of course, that’s not all the effort it takes to build a blog. There is far more to learn, and this article describes just one of the ways to do it. I hope it gives you some inspiration.




Author: the Octoparse team


In case you'd like to start scraping for your blog now, I've prepared some typical web scraping tutorials for your reference:

Web Scraping Case Study | Scraping Articles from News24

How to Scrape WordPress Posts

Scrape Articles from CNN Money


