Scrape seller info from AliExpress
Updated over a week ago

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

AliExpress is an online retail service based in China and owned by the Alibaba Group. In this tutorial, we are going to show you how to scrape seller info from AliExpress.


If you would like to build the task from scratch, continue with the tutorial below or watch the video.


The main steps are shown in the menu on the right, and you can download the sample task file here.


1. Create a Go to Web Page step - to open the target website

  • Paste the URL on the home page and click Start

AliExpress requires a login to see the product information, so you need to log in to your AliExpress account first.

  • Switch to Browse mode

  • Fill in your login info and click Sign in

  • After logging into your account, turn off the Browse mode.

  • Click Go to Web Page -> Options

  • Tick Use Cookie and click Use cookie from the current page

  • Tick the Scroll down the page after it is loaded box, and change the Repeat number to 10.

  • Click on Apply to save the changes


2. Set up a Pagination Loop - to scrape data from multiple pages

  • Click on the Next button, then select Loop click next page

  • Click on Click to Paginate -> Options and set the AJAX timeout to 15s

  • Tick the Scroll down the page after it is loaded box then adjust the number of Repeats to 10

  • Click Apply to save the changes
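
To make the idea behind the pagination loop concrete, here is a minimal Python sketch of the same pattern: process the current page, check whether a next page exists, and move on until there is none. It is only an illustration, not how Octoparse is implemented. The `PAGES` data and the `scrape_all_pages` function are hypothetical stand-ins for the search result pages and the loop.

```python
# Hypothetical stand-in for the search results: each "page" is a list of store names.
PAGES = [
    ["Store A", "Store B"],
    ["Store C"],
    ["Store D", "Store E"],
]


def scrape_all_pages(pages):
    """Visit pages in order, collecting items until there is no next page."""
    collected = []
    if not pages:
        return collected                    # nothing to scrape
    index = 0
    while True:
        collected.extend(pages[index])      # extract data from the current page
        has_next = index + 1 < len(pages)   # is there a Next button to click?
        if not has_next:
            break                           # last page reached, so the loop ends
        index += 1                          # "click" Next and move to the next page
    return collected


stores = scrape_all_pages(PAGES)
print(stores)
```

In the real task, the AJAX timeout and the scroll setting give each new page time to load its content before extraction starts; the sketch skips that because its pages are already in memory.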


3. Set up a Loop Item - to click each store link in turn and enter the detail page

  • Click on any two store names on the page, then click Loop click each URL


4. Create a Hover On step - to show the seller details

  • Click on the store name, then select Hover on the selected element


5. Extract Data - to get the information needed

  • Click on each piece of info you need, then select Text

  • Repeat until you have all the info needed. In this case, we will extract the store name, store number, item as described, communication, and shipping speed.

Here we need to change the XPaths of the store number and the open date of the store.

  • Switch to the Vertical View

  • Change the XPath under Field Settings

Store number: //span[contains(.,'Store No.')]

Open date: //span[.='This store has been open since ']/following-sibling::span

  • Double-click on the field names to rename them
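
To see what the two XPaths above select, here is a small Python sketch. The markup in `SNIPPET` is hypothetical (the real AliExpress HTML may differ), and so are the helper names. Python's built-in `xml.etree.ElementTree` does not support `contains()` or `following-sibling`, so the sketch reproduces their behavior with plain loops instead of running the XPaths directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for the seller pop-up; the real page structure may differ.
SNIPPET = """
<div class="store-popup">
  <a class="store-name">Example Store</a>
  <span>Store No.1234567</span>
  <div>
    <span>This store has been open since </span>
    <span>Mar 3, 2015</span>
  </div>
</div>
"""


def full_text(el):
    """Return all text inside an element, like '.' in an XPath predicate."""
    return "".join(el.itertext())


def store_number_span(root):
    """Mimic //span[contains(.,'Store No.')]: first span whose text contains the label."""
    for span in root.iter("span"):
        if "Store No." in full_text(span):
            return span
    return None


def open_date_span(root):
    """Mimic //span[.='This store has been open since ']/following-sibling::span."""
    label = "This store has been open since "
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag == "span" and full_text(child) == label:
                # Return the first later sibling that is also a span.
                for sibling in children[i + 1:]:
                    if sibling.tag == "span":
                        return sibling
    return None


root = ET.fromstring(SNIPPET.strip())
store_no = full_text(store_number_span(root)).strip()
open_date = full_text(open_date_span(root)).strip()
print(store_no)
print(open_date)
```

Note that the open date XPath matches the label text exactly, including the trailing space, so it stops matching if the wording on the page changes. The store number XPath uses `contains()`, so it returns the whole span text, label included.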


6. Run the task - to get your target data

Below is what the final workflow looks like. If everything is in place, you can go ahead and run the task.

[Screenshot: final workflow]

Here is the sample output:

[Screenshot: sample output]