Find Suppliers on AliExpress with Data Scraping and Cleansing
Friday, December 2, 2022
Do you know where to find suppliers for your e-commerce business? It may be a good idea to check out AliExpress. It is a top global e-commerce platform for both retail and wholesale, and many sellers on Amazon and Shopify source their stock from it.
This article will introduce how you can use data to find reliable suppliers on AliExpress.
Taking "toy car" as an example, we will guide you step by step through extracting data on suppliers, products, prices, ratings, and the number sold on AliExpress. We will then clean and aggregate the scraped data to get the average price and rating for each supplier, as well as the total number sold. With this information in hand, you can more easily see which suppliers are dependable and suitable for your business.
Why scrape data from AliExpress
There are more than 100 million products listed on AliExpress. Almost everything you need in daily life can be found there, such as cell phones, clothes, toys, and surveillance devices. You can collect prices, ratings, reviews, and supplier information from it, just as you can from Amazon or Shopify.
AliExpress differs from most other e-commerce platforms in that it shows how many units of a product have been sold, which the majority of platforms do not. This data can help you identify market trends and potential suppliers, and ultimately develop a deeper understanding of the products and the market.
Scrape data with Octoparse
Octoparse is a no-code web crawler that makes data scraping easier and faster without requiring you to write code. If this is your first time using the tool, go to the Octoparse website to download the software and install it on your device, then sign up for a free account and log in.
Step 1: Create a new task
Enter the target URL below into the search bar and click "Start". The page will load in the Octoparse built-in browser after a few seconds.
Target URL: https://www.aliexpress.com/wholesale?catId=0&initiative_id=SB_20221122192506&SearchText=toy+car&spm=a2g0o.home.1000002.0&dida=y
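The keyword in the target URL is carried by the `SearchText` query parameter. As a minimal sketch, a URL for a different keyword could be built as below. The helper name `build_search_url` is hypothetical, and the code assumes (the article does not confirm this) that the remaining parameters such as `initiative_id`, `spm`, and `dida` are session or tracking values that can be left out.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.aliexpress.com/wholesale"


def build_search_url(keyword: str) -> str:
    """Build a search URL for a keyword (hypothetical helper)."""
    # Assumption: catId and SearchText are enough to reach the results page;
    # the other parameters in the article's URL are omitted here.
    params = {"catId": 0, "SearchText": keyword}
    # urlencode escapes spaces as "+", matching "toy+car" in the article's URL.
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("toy car"))
# https://www.aliexpress.com/wholesale?catId=0&SearchText=toy+car
```

If the shortened URL does not load the expected results, fall back to copying the full URL from your browser's address bar after searching on the site.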
Step 2: Select the data we need
2.1 Once the page is loaded, click "Auto-detect webpage data" on the Tips panel. Then Octoparse will look through the whole page, and give you a data preview. Data contained in this preview will be highlighted in red on the page.
2.2 Preview all the selected data fields at the bottom. By clicking on the name of the data fields, you can check the exact location of each field on the page. Next, remove unnecessary data and rename the data fields as needed. In this case, we'll keep the product URL, price, sold, rating and supplier for further use.
Auto-detection is Octoparse's best guess at the data you might need. If it does not pick up the data fields you want, you can select the elements manually by following the instructions on the Tips panel.
Step 3: Create and modify the workflow
After selecting all the data you want, click "Create Workflow". Then a workflow will show up on the right-hand side. You can gain an overview of the entire scraping process and check if the steps work properly by clicking through them.
Step 4: Run the task and export the data
After everything is set to go, click "Run" to start the scraping process. You need to pick whether to run this task on your device or in the Cloud, and in Standard Mode or Boost Mode. Then, Octoparse will take care of the rest for you. Once the run is finished, export the data as a CSV file.
Clean and analyze data with QuickTable
A look at the scraped dataset shows that it is still a bit messy. Before diving into any further analysis, we should do some simple data cleansing. We'll use a tool called QuickTable, a handy Excel alternative for handling large and messy datasets.
Step 1: Upload the data file and create a new recipe
Sign up for a free account on QuickTable, then log in. Create a new project named "Toy Car Analysis on AliExpress", and upload the scraped CSV file into the project as a new dataset. Once the data file is successfully uploaded, you can open it and click the "Save Recipe" button to create a new recipe.
Step 2: Keep and rename columns
Keep the columns "Title_URL", "mgxne1", "mgxne2", "_1knf9", "expam", and "ox0kz". Then, rename them as "product link", "price fragment 1", "price fragment 2", "sold", "rating", and "supplier" respectively.
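For readers who prefer a script, the same keep-and-rename step can be sketched in Python with the standard `csv` module. The column names are the ones listed above; they are auto-generated, so your own export will likely use different ones. The sample row and the helper name `keep_and_rename` are made up for illustration.

```python
import csv
import io

# Original column name -> readable name, as listed in this step.
RENAME = {
    "Title_URL": "product link",
    "mgxne1": "price fragment 1",
    "mgxne2": "price fragment 2",
    "_1knf9": "sold",
    "expam": "rating",
    "ox0kz": "supplier",
}


def keep_and_rename(rows):
    """Keep only the columns in RENAME and give them readable names."""
    return [
        {new: row.get(old, "") for old, new in RENAME.items()}
        for row in rows
    ]


# Made-up sample standing in for the exported CSV file.
sample = io.StringIO(
    "Title_URL,mgxne1,mgxne2,_1knf9,expam,ox0kz,extra\n"
    "https://example.com/item/1,12,99,350 sold,4.8,Shop A,unused\n"
)
cleaned = keep_and_rename(csv.DictReader(sample))
print(cleaned[0])
```

To run it on a real export, replace `sample` with a file opened via `open(path, newline="", encoding="utf-8")`.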
Step 3: Recover the price information
You may have noticed that product prices were split improperly in the file. This is because there is a dot between the two parts of the price on the original website. Using the "Merge" menu in QuickTable, you can merge the two columns in a second and create a new column with the correct prices.
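The merge itself amounts to putting the dot back between the two fragments. The sketch below is not QuickTable's implementation; it assumes that fragment 1 holds the whole part of the price (possibly with a currency prefix) and fragment 2 the decimal part, which you should check against your own data. The function name `merge_price` is hypothetical.

```python
import re


def merge_price(fragment_1: str, fragment_2: str) -> float:
    """Rejoin two price fragments with the dot that was lost in the split."""
    # Assumption: fragment 1 is the whole part, fragment 2 the decimal part.
    # Currency symbols, spaces, and thousands separators are stripped.
    whole = re.sub(r"\D", "", fragment_1)
    decimals = re.sub(r"\D", "", fragment_2)
    return float(f"{whole or '0'}.{decimals or '0'}")


print(merge_price("US $12", "99"))  # 12.99
```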
Step 4: Extract the number value of the "sold" column
Take another look at the column "sold", and you'll find that the data is in string format and cannot be used for calculations directly. So we need to have it extracted into numerical values before we perform any sort of calculations with it. You can use "Text->Substring->Extract number" to extract the numerical values into a new column. Then, remove the original column and rename the newly-created column to "sold".
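In script form, extracting the number is a short regular expression. This is a sketch under the assumption that the values look like "350 sold" or "1,200 sold"; the exact text in your export may differ, and abbreviated counts such as "1.2k" would need extra handling. The function name `extract_number` is hypothetical and is not how QuickTable's "Extract number" is necessarily implemented.

```python
import re


def extract_number(text: str) -> int:
    """Return the first integer found in a string, or 0 if there is none."""
    # Match a run of digits that may contain thousands separators.
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return 0
    return int(match.group().replace(",", ""))


print(extract_number("350 sold"))    # 350
print(extract_number("1,200 sold"))  # 1200
```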
We now have a clean dataset, and it's time to dive into the numbers. Looking at the data, we can see that a supplier might sell several different products on AliExpress, so we'll first group all the data by supplier. Click the "Group by" button, and select "supplier" in the "Group by" list box.
To get the average price and rating, as well as the total number sold, add three calculations: the average of the price, the average of the rating, and the sum of "sold". Click on Save and you will get the results in three columns.
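The grouping and the three calculations can also be expressed in a few lines of Python. This sketch assumes each row already has a numeric `price`, `rating`, and `sold` plus a `supplier` name; the sample rows and the function name `summarize_by_supplier` are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_supplier(rows):
    """Group rows by supplier, then average price/rating and sum sold."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["supplier"]].append(row)
    return {
        supplier: {
            "avg price": round(mean(r["price"] for r in items), 2),
            "avg rating": round(mean(r["rating"] for r in items), 2),
            "total sold": sum(r["sold"] for r in items),
        }
        for supplier, items in groups.items()
    }


# Made-up rows standing in for the cleaned dataset.
rows = [
    {"supplier": "Shop A", "price": 10.0, "rating": 4.0, "sold": 100},
    {"supplier": "Shop A", "price": 20.0, "rating": 5.0, "sold": 50},
    {"supplier": "Shop B", "price": 5.0, "rating": 4.0, "sold": 10},
]
for supplier, stats in summarize_by_supplier(rows).items():
    print(supplier, stats)
```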
Now that you have the average product price of each supplier, you can see which suppliers sell products at prices that fit your budget. Additionally, a high total number sold combined with a high average rating points to suppliers that are well established and have a good reputation.
When looking for suppliers online, there are many things to take into account. Price, rating, and sales are only the basics. Additional data such as the number of orders, location, and shipping fees should be considered as well. With data extraction, you can gather the information your analysis needs and eventually find the right suppliers for your e-commerce business.
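As one possible way to act on these numbers, the sketch below filters a per-supplier summary by budget, rating, and sales, then ranks the matches by total sold. The thresholds, the sample summary, and the function name `shortlist` are all hypothetical; choose cut-offs that suit your own business.

```python
def shortlist(summary, max_avg_price, min_avg_rating, min_total_sold):
    """Return supplier names meeting all thresholds, best sellers first."""
    picks = [
        name
        for name, stats in summary.items()
        if stats["avg price"] <= max_avg_price
        and stats["avg rating"] >= min_avg_rating
        and stats["total sold"] >= min_total_sold
    ]
    return sorted(picks, key=lambda name: summary[name]["total sold"], reverse=True)


# Made-up summary in the shape: supplier -> averages and total.
summary = {
    "Shop A": {"avg price": 15.0, "avg rating": 4.5, "total sold": 150},
    "Shop B": {"avg price": 5.0, "avg rating": 4.0, "total sold": 10},
    "Shop C": {"avg price": 12.0, "avg rating": 4.8, "total sold": 900},
}
print(shortlist(summary, max_avg_price=20.0, min_avg_rating=4.5, min_total_sold=100))
# ['Shop C', 'Shop A']
```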