How to Extract Information from Yellow Page Websites
Wednesday, April 06, 2016 9:19 AM
Step 1. Complete the basic information. ➜ Click “Next”.
Step 2 (Design Workflow)
1) Enter the target URL in the built-in browser and load the website. (In this example, I’ll search for all car-related businesses in LA.)
2) Click on the “Next” pagination link. ➜ Then choose “Loop click Next Page”. (Note: If you want to extract information from every page of the search results, you need to add a page navigation action.)
3) Then go back to the top. ➜ Select the first car business result. I need to create a list of items that navigates to each detail page, so I select the first company name that links to its detail page.
Then choose “create a list of items”. ➜ “Add current item to the list”. ➜ “Continue to edit the list”.
4) Select the second company name. ➜ “Add current item to the list” ➜ “Finish creating list” ➜ “Loop” to repeat the selection.
Now it will navigate to the car business detail page every time it does the “Loop Item” action.
5) Next I’ll extract the name. ➜ Click on the name. ➜ Choose “Extract text”.
(Note: If there’s no information to extract for a field, that field is left blank in the result.)
(Then I’ll extract the business hours, address, and telephone numbers. You can extract any information you want on this page.)
6) All the extracted content will be listed in Data Fields. ➜ Click the "Field Name" to rename a field.
7) I’ve now configured the task to extract information from one page of the search results, but I want to extract data from all of the search results. Next, in the Workflow Designer, drag the entire “Loop Item” so that it sits before “click to paginate”.
8) Once you are done configuring the extraction rules, click “Next”.
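The order of actions configured above (go through every item on a results page, extract its fields, and only then click to paginate) can be sketched in plain Python. This is only an illustration of the workflow logic, not how Octoparse works internally: the pages are a hypothetical in-memory stand-in for a yellow pages site, and all names (RESULT_PAGES, DETAIL_PAGES, extract_fields, run_task) are made up for this example.

```python
# Hypothetical stand-in for a yellow pages site: each results page lists the
# keys of its detail pages and names the next results page (None on the last).
RESULT_PAGES = {
    "page1": {"listings": ["a", "b"], "next": "page2"},
    "page2": {"listings": ["c"], "next": None},
}
DETAIL_PAGES = {
    "a": {"name": "Ace Auto Repair", "hours": "9-5", "address": "1 Main St", "phone": "555-0100"},
    "b": {"name": "Best Car Wash", "address": "2 Oak Ave"},  # no hours or phone listed
    "c": {"name": "City Motors", "hours": "10-6", "phone": "555-0102"},
}
FIELDS = ["name", "hours", "address", "phone"]


def extract_fields(detail):
    """Return one row; a field missing from the page is left blank."""
    return {field: detail.get(field, "") for field in FIELDS}


def run_task(start_page):
    """Visit every listing on every results page, following the next link."""
    rows = []
    page = start_page
    while page is not None:                          # pagination loop
        for key in RESULT_PAGES[page]["listings"]:   # the "Loop Item" part
            rows.append(extract_fields(DETAIL_PAGES[key]))
        page = RESULT_PAGES[page]["next"]            # paginate after the items are done
    return rows


rows = run_task("page1")
for row in rows:
    print(row)
```

If the pagination step ran before the item loop, the listings on the first results page would be skipped, which is why the “Loop Item” has to come first.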
Step 3 (Extraction Options)
You can choose not to load images to speed up the extraction, but this may sometimes cause problems on certain websites. Click “Next”.
Step 4 (Done)
Now the task is complete! Choose “Local extraction” to run the task on your computer.
The extracted data will be shown in the "Data Extracted" pane. Click the export button to export the results to an Excel file, a database, or another format, and save the file to your computer.
The result looks pretty good!
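If you want to picture what an exported file contains, here is a minimal sketch that writes rows like the ones above to CSV text using Python's standard csv module. It is not Octoparse's export feature; the helper name rows_to_csv and the sample rows are assumptions made for illustration. Note that a blank field stays in the file as an empty cell.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample rows; the second one has no phone number.
sample_rows = [
    {"name": "Ace Auto Repair", "phone": "555-0100"},
    {"name": "Best Car Wash", "phone": ""},
]
csv_text = rows_to_csv(sample_rows, ["name", "phone"])
print(csv_text)
```

A file written this way opens directly in Excel or any other spreadsheet program.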
Happy Data Hunting!
Author: The Octoparse Team