How to Extract Information from Yellow Page Websites

Wednesday, April 06, 2016 9:19 AM

 

Step 1. Complete the basic information. ➜ Click “Next”.

 

Step 2 (Design Workflow)

1) Enter the target URL in the built-in browser and load the website. (In this example, I’ll search for all car-related businesses in LA.)

 

2) Click the “Next” pagination link. ➜ Then choose “Loop click Next Page”. (Note: If you want to extract information from every page of the search results, you need to add a page navigation action.)

 

3) Then go back to the top of the page. ➜ Select the first car business result. Now I need to create a list of items that link to the detail pages. To do this, I select the first company name, which links to its detail page.

Then choose “create a list of items”. ➜ “Add current item to the list”. ➜ “Continue to edit the list”.

 

4) Then select the second one. ➜ “Add current item to the list” ➜ “Finish creating list” ➜ “Loop” to repeat the selection for every item in the list.

Now the task will navigate to a car business detail page each time it runs the “Loop Item” action.
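For readers who think in code, the idea behind "create a list of items" can be sketched in Python. This is only an illustration, not how Octoparse works internally: the HTML snippet, the "business-name" class, and the LinkCollector name are all made up for the example. It assumes that the company names you clicked share one CSS class, so every link with that class belongs to the list.

```python
from html.parser import HTMLParser

# Hypothetical search-result markup; a real yellow page site will differ.
SEARCH_PAGE = """
<ul>
  <li><a class="business-name" href="/biz/a">A Auto</a></li>
  <li><a class="business-name" href="/biz/b">B Motors</a></li>
  <li><a class="business-name" href="/biz/c">C Cars</a></li>
</ul>
<a class="next" href="/search?page=2">Next</a>
"""

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag whose class matches css_class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        is_item = tag == "a" and attr_map.get("class") == self.css_class
        if is_item and "href" in attr_map:
            self.links.append(attr_map["href"])

def collect_item_links(html, css_class):
    """Return the detail-page links for all items that look like the examples."""
    collector = LinkCollector(css_class)
    collector.feed(html)
    return collector.links

print(collect_item_links(SEARCH_PAGE, "business-name"))
```

The "Next" link is skipped because its class is different, which mirrors why you pick two similar items in the tool: the list should contain only the business links.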

 

5) Next, I’ll extract the name. ➜ Click on the name. ➜ Choose “Extract text”.

(Note: If a field has no information to extract, it will be left blank in the result.)

(Then I’ll extract the business hours, address, and telephone number in the same way. You can extract any information you want from this page.)
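The "leave blank when missing" behavior can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the detail-page HTML, the class names used as field names, and the FieldExtractor helper are hypothetical, and the parser only handles simple, non-nested elements.

```python
from html.parser import HTMLParser

# Hypothetical detail page; note that it has no "hours" element.
DETAIL_PAGE = """
<div>
  <h1 class="name">Sunset Auto Repair</h1>
  <p class="address">123 Example St, Los Angeles, CA</p>
  <p class="phone">555-0100</p>
</div>
"""

class FieldExtractor(HTMLParser):
    """Collect the text of elements whose class equals a wanted field name."""

    def __init__(self, fields):
        super().__init__()
        self.record = {name: "" for name in fields}  # every field starts blank
        self._current = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.record:
            self._current = css_class

    def handle_data(self, data):
        if self._current is not None:
            self.record[self._current] += data.strip()

    def handle_endtag(self, tag):
        self._current = None  # stop reading once the element closes

def extract_fields(html, fields):
    """Return one record; fields that are not on the page stay as empty strings."""
    extractor = FieldExtractor(fields)
    extractor.feed(html)
    return extractor.record

print(extract_fields(DETAIL_PAGE, ["name", "hours", "address", "phone"]))
```

Because every field starts as an empty string, a business without listed hours still produces a complete row, just with a blank in that column.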

 

6) All the selected content will appear in Data Fields. ➜ Click a "Field Name" to rename it.

 

7) I’ve now configured the task to extract information from one page of the search results, but I want to extract data from all of the search results. Next, in the Workflow Designer, drag the entire “Loop Item” so that it sits before “click to paginate”.
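Why the order matters is easier to see as nested loops. The following Python sketch uses a made-up in-memory "site" (the SITE and DETAILS dictionaries and the run_task function are hypothetical) rather than real web requests, and it only models the workflow order described above: process every item on the current page first, then paginate.

```python
# Hypothetical data standing in for search-result pages and detail pages.
SITE = {
    "page1": {"items": ["biz/a", "biz/b"], "next": "page2"},
    "page2": {"items": ["biz/c"], "next": None},
}
DETAILS = {
    "biz/a": {"name": "A Auto", "phone": "555-0101"},
    "biz/b": {"name": "B Motors", "phone": ""},
    "biz/c": {"name": "C Cars", "phone": "555-0103"},
}

def run_task(site, details, start):
    """Extract one row per business across all result pages."""
    rows = []
    page = start
    while page is not None:               # pagination loop
        for link in site[page]["items"]:  # "Loop Item": visit each detail page
            rows.append(details[link])    # extract the data fields
        page = site[page]["next"]         # "click to paginate" comes last
    return rows

for row in run_task(SITE, DETAILS, "page1"):
    print(row)
```

If the item loop ran outside the pagination loop, only the businesses on a single page would be extracted, which is the situation this step fixes.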

 

8) Once you have finished configuring the extraction rule, click “Next”.

 

Step 3 (Extraction Options)

You can choose not to load images to speed up the extraction, but this may sometimes cause problems on certain websites. Click “Next”.

 

Step 4 (Done)

Now the task is complete! Choose “Local extraction” to run the task on your computer.

 

The extracted data will be shown in the "Data Extracted" pane. Click the export button to export the results to an Excel file, a database, or other formats, and save the file to your computer.
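If you ever need to produce a similar file yourself, the export step can be approximated with Python's standard csv module. This is a minimal sketch, not Octoparse's exporter: the sample rows and the export_to_csv name are assumptions, and it writes CSV (which Excel can open) rather than a native Excel file.

```python
import csv
import io

# Hypothetical extracted rows; a blank phone shows how empty fields are kept.
ROWS = [
    {"name": "A Auto", "phone": "555-0101"},
    {"name": "B Motors", "phone": ""},
]

def export_to_csv(rows, field_names):
    """Return the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=field_names, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

print(export_to_csv(ROWS, ["name", "phone"]))
```

To save the result to disk instead of printing it, the same writer can be pointed at a file opened with open("results.csv", "w", newline="").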

 

The result looks pretty good!

 

 

 

Happy Data Hunting!

 

 

 

 

 

 

Author: The Octoparse Team

 

 

 
