
Scraping Product Detail Pages from eBay.com

Thursday, December 29, 2016 10:50 PM


 

Scraping online shops like eBay and Amazon has become an important source of data. It lets you conveniently compare best-selling products by price, features, and product description.

 

In this tutorial, you will learn how to scrape product data from eBay.

 

To save time, you can go to Task Template on the main screen of the Octoparse scraping tool and start directly from the ready-to-use eBay templates. With a template, there is no need to configure the scraping task yourself. For further details, see: Task Templates

 

If you would like to know how to build the task from scratch, you may continue reading the following tutorial. We will scrape data such as the name, condition, price, and more info from the product details page with Octoparse.

 

To follow along, you can use this example URL:

https://www.ebay.com/sch/Digital-Cameras-/31388/i.html

 

We are going to scrape product information for digital cameras on eBay. The main steps of the tutorial are listed below. [Download task file]

1. Create a Go To Web Page - to open the target web page

2. Auto-detect web page data - to create the workflow

3. Select the link to scrape data from the detail page

4. Extract data from the product detail page

5. Modify the XPath of the data fields

6. Start extraction - to run the task and get the data

 

1. Create a Go To Web Page - to open the target web page

  • Enter the example URL and click Start

 

2. Auto-detect web page data - to create the workflow

  • Click Auto-detect web page data and wait for the detection to complete
  • Delete unwanted fields or modify field names in the Data Preview section
  • Uncheck Add a page scroll
  • Choose Create workflow on the Tips panel

 

You will now get the workflow below.

octoparse-workflow

 

If all the data you need can be scraped from the listing page, you can stop here and jump to step 6, Start extraction. If you want to open each product detail page to get more information, follow the steps below.

 

3. Select the link to scrape data from the detail page

  • Choose Click on link(s) to scrape the linked page(s)
  • Choose Title_URL from the drop-down option 
  • Choose Confirm

scraped-linked-pages 

 

Octoparse will automatically go to the first product detail page.
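
Conceptually, the workflow now has two levels: a loop over the listing rows and, inside it, a visit to each row's Title_URL. The Python sketch below mirrors that structure only. It is not how Octoparse is implemented, and the LISTING rows, the DETAIL_PAGES mapping, the example.com URLs, and the scrape helper are all hypothetical stand-ins so that the snippet runs without network access.

```python
# Hypothetical stand-ins for the listing page and the detail pages.
LISTING = [
    {"Title": "Camera A", "Title_URL": "https://example.com/itm/1"},
    {"Title": "Camera B", "Title_URL": "https://example.com/itm/2"},
]
DETAIL_PAGES = {
    "https://example.com/itm/1": {"Condition": "New", "Price": "$100.00"},
    "https://example.com/itm/2": {"Condition": "Used", "Price": "$50.00"},
}


def open_detail_page(url):
    """Pretend to open a detail page; returns an empty dict if unknown."""
    return DETAIL_PAGES.get(url, {})


def scrape(listing, open_page):
    """For each listing row, follow Title_URL and merge in the detail fields."""
    rows = []
    for item in listing:
        detail = open_page(item["Title_URL"])
        rows.append({**item, **detail})
    return rows


rows = scrape(LISTING, open_detail_page)
for row in rows:
    print(row)
```

Each output row combines the fields from the listing page with the fields from the linked detail page, which is the shape of data the finished task produces.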

 

4. Extract data from the product detail page

  • Click the element(s) you want and choose Extract the text of the element
  • Double click on the field name to rename it if needed

 

Tip: Check the following tutorials for what kind of data you can scrape: (1) Extract element text/URL/image/HTML/attribute  (2) Generate data (fixed value, date & time)
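
To make the difference between these data types concrete, here is a small Python sketch that uses only the standard library. The HTML fragment, its URLs, and the variable names are made up for illustration, and the mapping to Octoparse's extraction options is an assumption rather than a description of its internals.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment; real eBay markup is more complex.
FRAGMENT = (
    "<div>"
    '<a href="https://example.com/itm/1" title="Camera A">Camera <b>A</b></a>'
    '<img src="https://example.com/img/1.jpg" />'
    "</div>"
)

root = ET.fromstring(FRAGMENT)
link = root.find("a")
image = root.find("img")

text = "".join(link.itertext())               # element text, nested tags flattened
url = link.get("href")                        # link URL
image_url = image.get("src")                  # image URL
attribute = link.get("title")                 # any other attribute
html = ET.tostring(link, encoding="unicode")  # the element's own markup

print(text, url, image_url, attribute, html, sep="\n")
```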

5. Modify the XPath of the data fields

Some data fields, such as MPN or UPC, do not appear on every product page, or their position varies from page to page. For these fields you may need to modify the XPath to make the extraction more precise. No worries! We have prepared some frequently used XPaths for you, so you can simply use the element XPaths provided below.

  • Click More
  • Click on Customize XPath

customize-xpath

 

  • Replace the XPath with the revised one
    • MPN: //td[contains(text(),'MPN')]/following-sibling::td[1]
    • EAN: //td[contains(text(),'EAN')]/following-sibling::td[1]
    • UPC: //td[contains(text(),'UPC')]/following-sibling::td[1]
    • Item Weight: //td[contains(text(),'Item Weight')]/following-sibling::td[1]
  • Click Apply to save
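
These XPaths all follow one pattern: find the table cell whose text contains the label, then take the first td that follows it in the same row. The Python sketch below spells out that logic by hand, because the standard library's ElementTree supports only a limited XPath subset that does not include contains() or following-sibling. The SAMPLE table and the value_after_label helper are hypothetical. Real eBay pages use different and more complex markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified item-specifics table (not real eBay markup).
SAMPLE = """
<table>
  <tr><td>Brand:</td><td>Canon</td><td>MPN:</td><td>1234C002</td></tr>
  <tr><td>UPC:</td><td>0123456789012</td><td>Item Weight:</td><td>1.2 lb</td></tr>
</table>
"""


def value_after_label(root, label):
    """Mimic //td[contains(text(),label)]/following-sibling::td[1].

    Returns the text of the first match, or None if the label is absent.
    """
    for row in root.iter("tr"):
        cells = list(row)
        for i, cell in enumerate(cells):
            if cell.tag == "td" and label in (cell.text or ""):
                # Take the first <td> after the label cell in the same row.
                for sibling in cells[i + 1:]:
                    if sibling.tag == "td":
                        return (sibling.text or "").strip()
    return None


root = ET.fromstring(SAMPLE)
for label in ("MPN", "EAN", "UPC", "Item Weight"):
    print(label, "->", value_after_label(root, label))
```

Because the lookup is anchored on the label text rather than on the cell's position, a field that moves between pages is still found, and a field that is missing (EAN in this sample) simply yields no value.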

 

Tip: You can check the XPath tutorials below to write XPaths for other fields if needed: (1) What is XPath (2) Locate and scrape an element via nearby text

 

 

6. Start extraction - to run the task and get the data

  • Click the save icon
  • Click the run icon in the upper left
  • Select Run task on your device to run the task on your computer, or select Run task in the Cloud to run it on our Cloud servers (for premium users only)

 

Here is the sample output. 

sample data

 

Was this article helpful? Contact us any time if you need our help!

 

Author: The Octoparse Team 
