Scrape product info from Flipkart

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

Flipkart, India's biggest e-commerce platform, has a large user base and a major share of the Indian market. Millions of people shop on the website every day, and it sells almost everything needed in daily life.

With data from the website, you can easily compare similar products. In this tutorial, we will use Octoparse to scrape the image URL, product title, price, and other info for T-shirts on Flipkart.

INFO.jpg

To follow along with this tutorial, use the following URL for reference:

The main steps are shown in the menu on the right, and you can download the sample task file here.


1. Create a Go to Web Page - to open the target website

To start scraping, first enter the URL of the target website.

  • Enter the Flipkart search URL into the search box at the center of the home screen

  • Click Start to create a new task in Custom Task


2. Start auto-detection - to create a workflow

The auto-detect function in Octoparse can identify the structure of the page and automatically generate a collection process.

  • Click Auto-detect web page data on the Tips box and wait for the detection to complete

  • Delete unwanted fields

  • Untick Add a page scroll and then click Create workflow

The workflow is then generated as below:

WORKFLOW.jpg

3. Modify XPath for the pagination - to successfully go to the next page

Pagination only works reliably when the XPath matches the Next button exactly, so the auto-generated XPath needs to be replaced.

  • Click Pagination

  • Enter the modified XPath in Matching XPath: //span[normalize-space()='Next']

  • Click Apply to save the modification
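
To see why this XPath is a good fit, it helps to look at what normalize-space() does: it trims leading and trailing whitespace and collapses inner whitespace runs, so a span containing "Next" surrounded by line breaks still matches, while "Next steps" does not. The sketch below imitates that matching rule in Python. The HTML snippet is a made-up, simplified stand-in for a pagination bar (not Flipkart's real markup), and the helper names normalize_space and find_next_spans are hypothetical. Python's built-in ElementTree does not support normalize-space() in its XPath subset, so the function is emulated by hand here.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified pagination markup; the real Flipkart page differs.
SAMPLE = """
<nav>
  <a href="/search?page=1"><span>Previous</span></a>
  <a href="/search?page=3"><span>
      Next
  </span></a>
  <a href="/help"><span>Next steps</span></a>
</nav>
"""


def normalize_space(text):
    """Imitate XPath normalize-space(): collapse whitespace runs, then trim."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


def find_next_spans(root):
    """Return <span> elements whose normalized text is exactly 'Next'."""
    return [
        span
        for span in root.iter("span")
        if normalize_space("".join(span.itertext())) == "Next"
    ]


root = ET.fromstring(SAMPLE)
matches = find_next_spans(root)
print(len(matches))  # number of spans the rule would select in this sample
```

Only the span whose text is exactly "Next" is selected, which is the behavior you want from the pagination XPath: one unambiguous element to click on each page.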


4. Run the task - to get your target data

  • Click the Save button first to save all the settings you have made

  • Then click Run to run your task either locally or in the Cloud

  • Select Standard Mode to run the task on your local device

  • Wait for the task to complete

Below is sample data from the local run. You can export the data in Excel, CSV, HTML, or JSON format.

data.jpg
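
Once the data is exported, comparing similar products takes only a few lines of code. The sketch below reads CSV text and sorts the products from cheapest to most expensive. The column names (Title, Price, Image_URL) and the sample rows are invented for illustration; your export will use whatever field names you kept in the task. It also assumes prices are whole rupee amounts such as "₹1,049".

```python
import csv
import io

# Hypothetical export sample; real column names depend on the fields kept in the task.
EXPORT = """Title,Price,Image_URL
Solid Men Round Neck T-Shirt,₹299,https://example.com/a.jpg
Printed Men Polo T-Shirt,"₹1,049",https://example.com/b.jpg
Striped Women T-Shirt,₹449,https://example.com/c.jpg
"""


def parse_price(raw):
    """Turn a price string such as '₹1,049' into an integer (whole rupees assumed)."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return int(digits) if digits else None


def cheapest_first(csv_text):
    """Parse exported CSV text and return the priced rows sorted by price, ascending."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["price_value"] = parse_price(row["Price"])
    priced = [row for row in rows if row["price_value"] is not None]
    return sorted(priced, key=lambda row: row["price_value"])


for row in cheapest_first(EXPORT):
    print(row["price_value"], row["Title"])
```

To use this with a real export, read the file contents (for example with open(path, encoding="utf-8")) and pass the text to cheapest_first, adjusting the column names to match your task.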

TIP: Local runs are great for quick runs and small amounts of data. For more complicated tasks or large volumes of data, Run in the Cloud is recommended for higher speed. You can try this premium feature by signing up for the 14-day free trial here. Tasks can be scheduled hourly, daily, or weekly, with data delivered regularly.
