How to Scrape A Website Without Programming Skills?

4/13/2016 12:11:37 AM

In most cases, writing a crawler that can extract information from websites takes a lot of time and effort. But what if you had a web scraping tool that requires no programming skills? Octoparse is exactly what you need if you’re new to web scraping.

Usually, when we talk about scraping a website, we mean scraping the content inside its pages. Here we share an easy way for beginners to do it.

A website usually has a homepage, list pages, content pages, as well as labels and classifications. The most important of these are the content pages.

Let’s walk through an example.


After setting the basic information for your task, click “Next”.

Then open the page you want to scrape in the browser, for instance the Apartment for Rent category in San Francisco.


Scroll down the page to the bottom.

To scrape across multiple pages, you need to create a “Loop to keep clicking next page”. On this site, however, when you click the “Next page” button, the usual “Loop click next page” option does not appear.


In this case, you need to configure the pagination scraping rule in another way.

To do this, drop a “Loop” item into the Workflow Designer. Choose a “Loop Mode” under “Advanced Options”. Select “Single Element Option”.

Make sure you have located the pagination link correctly.

Then click the XPath, copy it, and paste it into the text box.

Click “Save”.



Next, drop a “Click element” action into the “Loop item”.

Choose “Click Loop items” under “Advanced Options”.

Now you’ve configured pagination crawling.
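If you are curious what such a pagination loop does behind the scenes, here is a minimal Python sketch of the same idea. It is not how Octoparse is implemented. The sample pages, the `NEXT_XPATH` expression and the `crawl_pages` function are all hypothetical and only illustrate "find the single pagination element by its XPath, click it, repeat until it is gone".

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample pages standing in for a real website.
PAGES = {
    "/page1": '<html><body><a class="next" href="/page2">next</a></body></html>',
    "/page2": '<html><body><a class="next" href="/page3">next</a></body></html>',
    "/page3": '<html><body><p>last page, no pagination link</p></body></html>',
}

# Assumed XPath of the single pagination element (what you paste into the text box).
NEXT_XPATH = ".//a[@class='next']"


def crawl_pages(start_url, max_pages=50):
    """Follow the pagination link until it no longer exists."""
    visited = []
    url = start_url
    while url is not None and len(visited) < max_pages:
        visited.append(url)
        root = ET.fromstring(PAGES[url])   # "open" the page
        next_link = root.find(NEXT_XPATH)  # locate the single loop element
        # "Click" the element by following its link, or stop when it is missing.
        url = next_link.get("href") if next_link is not None else None
    return visited


print(crawl_pages("/page1"))
```

The `max_pages` limit is only a safety net so that the loop cannot run forever if a site keeps linking to a next page.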


Then create a list of items as usual.

Click on the first title > Select “Create a list of items” > Add current item to the list > Continue to edit the list.

Click on the second title > Add current item to the list > Finish creating list > Click Loop to process.
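Adding the first and second titles to the list tells the tool which elements on the page look alike. As a rough illustration only, the Python sketch below collects every element that matches one shared pattern. The sample markup, the `item_xpath` default and the `build_item_list` function are assumptions made for this example, not part of Octoparse.

```python
import xml.etree.ElementTree as ET

# Hypothetical list page: two listing titles and one unrelated link.
LIST_PAGE = """<html><body>
<p><a class="title" href="/apt/1">Sunny studio</a></p>
<p><a class="title" href="/apt/2">Two-bedroom flat</a></p>
<p><a class="other" href="/about">About</a></p>
</body></html>"""


def build_item_list(page_source, item_xpath=".//a[@class='title']"):
    """Return (title, link) pairs for every element matching the shared pattern."""
    root = ET.fromstring(page_source)
    return [(a.text, a.get("href")) for a in root.findall(item_xpath)]


for title, link in build_item_list(LIST_PAGE):
    print(title, link)
```

Each pair in the list corresponds to one item the loop would click through to reach a detail page.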


Now you’re on the detail page.

Then start scraping the data you need. Click on the name and choose “Extract text”.

You can also extract other information in this way.
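In code terms, “Extract text” amounts to reading the visible text of a chosen element. The sketch below shows that idea in Python under stated assumptions: the detail page markup, the `FIELDS` mapping and the `extract_fields` function are invented for illustration and do not come from Octoparse.

```python
import xml.etree.ElementTree as ET

# Hypothetical detail page for one listing.
DETAIL_PAGE = """<html><body>
<h1 id="name">Sunny studio</h1>
<span id="price">$1,800</span>
</body></html>"""

# Assumed mapping from output field name to the XPath of the element holding it.
FIELDS = {"name": ".//h1[@id='name']", "price": ".//span[@id='price']"}


def extract_fields(page_source, fields):
    """Extract the text of each configured element; missing elements become None."""
    root = ET.fromstring(page_source)
    record = {}
    for field, xpath in fields.items():
        node = root.find(xpath)
        record[field] = "".join(node.itertext()).strip() if node is not None else None
    return record


print(extract_fields(DETAIL_PAGE, FIELDS))
```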


To get rid of the dollar sign, select the field you want to reformat. Click the “Customize Field” button. Choose “Reformat extracted data”. Click “Add step”. Click “Replace strings”. Copy the dollar sign and paste it into the “Replace” box.

Don’t type anything in the “With” box. Click “Calculate”, and the dollar sign will be removed. Then click “OK”, and click “done”.


Now there’s no dollar sign in the data you captured.
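The “Replace strings” step with an empty “With” box behaves like a plain find-and-replace. Here is a small Python sketch of the same operation. The `replace_strings` helper and the sample prices are hypothetical and only mirror the “Replace” and “With” boxes described above.

```python
def replace_strings(value, replace, with_=""):
    """Replace every occurrence of `replace`; an empty `with_` simply deletes it."""
    return value.replace(replace, with_)


# Hypothetical extracted prices, still carrying the dollar sign.
prices = ["$1,800", "$2,450"]
cleaned = [replace_strings(price, "$") for price in prices]

print(cleaned)
```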

Once you have finished configuring the scraping rule, click “Next”.

Now choose “Local extraction” to run the task on your computer.

If you have a large amount of data to scrape, choose “Cloud Extraction” to run your task in the cloud.


The extracted data will be shown in this pane, where we can also see the configured rule of the task. You can also check the built-in browser to see whether the task runs as expected.

Export the results to an Excel file or one of the other available formats, such as a database, and save the file to your computer.
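For comparison, here is roughly what an export step looks like when done by hand in Python. This sketch writes CSV rather than a native Excel file, because CSV needs only the standard library and opens in Excel. The records and the `to_csv_text` function are hypothetical.

```python
import csv
import io

# Hypothetical scraped records, one dictionary per listing.
RECORDS = [
    {"name": "Sunny studio", "price": "1,800"},
    {"name": "Two-bedroom flat", "price": "2,450"},
]


def to_csv_text(records, columns=("name", "price")):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(columns), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv_text(RECORDS))
```

To save the result to your computer, write the returned text to a file, for example with `open("results.csv", "w", newline="", encoding="utf-8")`, where the file name is again just a placeholder.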


You now know how to scrape a website without programming skills.





Author: The Octoparse Team



