Extract URLs of Webpages
Wednesday, July 20, 2016 6:05 AM
In this article, I will show you how to extract the URLs of webpages.
If you want to add an extra field for the URL to your extracted data, follow the steps below.
(Example site: http://www.coleparmer.com/Category/Analytical_Balances/113)
From this category, I’ll extract three fields (Product Name, Price, and URL) for each item, 139 results in total.
Step 1. Type the URL of the target website into the browser, and then press “Enter” to open it.
Step 2. Scroll down to the bottom of the page, select the “Next” button, and then choose “Loop click the element” in the pop-up window.
Step 3. Go back to the first item. Click the title, then in the pop-up window, select “Create a list of items” > “Add current item to the list” > “Continue to edit the list”.
Step 4. Select the second item. Click the title, then select “Add current item to the list” > “Finish Creating List” > “Loop”.
Step 5. Click the title and then choose “Extract text”. Extract the price in the same way.
Step 6. Extract the URL of the page. To do this, click the “Gear” button, select “Define data extracted” > “Extract from browser window”, and then check the “Extract page URL” option. Lastly, click “OK” and “Save”.
Step 7. In the Workflow Designer, drag the “Loop Item” action so that it comes before the “Click to paginate” action. This way, the items on each page are extracted before the task moves on to the next page.
Step 8. Click “Next” in the top right corner of the interface, and then choose “Local Extraction” to run this task.
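For readers who like to see the logic behind the clicks, here is a rough Python sketch of what the workflow above does: loop over the items on a page, extract the name and price, record the URL of the page, and then follow “Next”. It is only an illustration, not how the tool works internally. The HTML markup, the class names (title, price, next), the example.com addresses, and the names PAGES, ListingParser, and scrape are all assumptions I made up for the example; the real site's markup will differ. The pages are kept in memory so the sketch runs without a network connection.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical stand-in for a paginated category: two tiny pages keyed by URL.
PAGES = {
    "http://example.com/list?page=1": """
        <div class="item"><a class="title" href="/p/1">Balance A</a>
          <span class="price">$100.00</span></div>
        <div class="item"><a class="title" href="/p/2">Balance B</a>
          <span class="price">$250.00</span></div>
        <a class="next" href="/list?page=2">Next</a>
    """,
    "http://example.com/list?page=2": """
        <div class="item"><a class="title" href="/p/3">Balance C</a>
          <span class="price">$75.50</span></div>
    """,
}


class ListingParser(HTMLParser):
    """Collects name, price, and page URL for each item on one page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.rows = []        # one dict per item found on this page
        self.next_url = None  # absolute URL of the "Next" link, if any
        self._field = None    # field currently receiving text, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class")
        if tag == "a" and cls == "title":
            # A new item starts here; store the URL of the current page with it.
            self.rows.append({"name": "", "price": "", "page_url": self.page_url})
            self._field = "name"
        elif tag == "span" and cls == "price" and self.rows:
            self._field = "price"
        elif tag == "a" and cls == "next":
            # Resolve the relative link against the current page URL.
            self.next_url = urljoin(self.page_url, attrs.get("href", ""))

    def handle_data(self, data):
        if self._field and self.rows:
            self.rows[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None


def scrape(start_url, fetch=PAGES.get):
    """Extract the items on each page first, then move on to the next page."""
    rows, url, seen = [], start_url, set()
    while url and url not in seen:
        seen.add(url)  # guard against pagination loops
        html = fetch(url)
        if html is None:
            break
        parser = ListingParser(url)
        parser.feed(html)
        rows.extend(parser.rows)
        url = parser.next_url
    return rows


if __name__ == "__main__":
    for row in scrape("http://example.com/list?page=1"):
        print(row["name"], row["price"], row["page_url"])
```

Note that the item loop runs before the jump to the next page, which mirrors the order set up in Step 7. To try this against a live site, you would replace the fetch argument with a function that downloads the page, and adjust the tag and class checks to match that site's HTML.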
Now it's done!
Happy Data Hunting!