You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

Flipkart, India's biggest e-commerce platform, has a large user base and holds a major share of the Indian market. Millions of people shop on the website every day. Flipkart sells almost everything people need in daily life.

With data from the website, you can easily compare similar products. In this tutorial, we will use Octoparse to scrape data such as the image URL, product title, price, and other details of T-shirts on Flipkart.

To follow along with the tutorial, please use the following URL for reference:

https://www.flipkart.com/search?q=t+shirts&as=on&as-show=on&otracker=AS_Query_TrendingAutoSuggest_1_0_na_na_na&otracker1=AS_Query_TrendingAutoSuggest_1_0_na_na_na&as-pos=1&as-type=HISTORY&suggestionId=t+shirts&requestId=6b1b2bb2-7abd-458f-b79e-311ce7af47cd
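The reference URL is long, so it can help to see how it breaks down. The snippet below is a minimal sketch that uses Python's standard library to split it into its parts. It assumes, based only on the URL itself, that the `q` parameter carries the search term; the purpose of the other parameters is not documented here, so they are simply listed.

```python
from urllib.parse import urlsplit, parse_qs

# The reference URL from this tutorial, split across lines for readability.
url = (
    "https://www.flipkart.com/search?q=t+shirts&as=on&as-show=on"
    "&otracker=AS_Query_TrendingAutoSuggest_1_0_na_na_na"
    "&otracker1=AS_Query_TrendingAutoSuggest_1_0_na_na_na"
    "&as-pos=1&as-type=HISTORY&suggestionId=t+shirts"
    "&requestId=6b1b2bb2-7abd-458f-b79e-311ce7af47cd"
)

parts = urlsplit(url)
params = parse_qs(parts.query)  # maps each parameter name to a list of values

# A "+" in a query string decodes to a space.
search_term = params["q"][0]

print(parts.netloc, parts.path)
print("search term:", search_term)
for name in sorted(params):
    print(name, "=", params[name][0])
```

To scrape a different search, the same approach can be used to check what a new URL contains before pasting it into Octoparse.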

The main steps are shown in the menu on the right, and you can download the sample task file here.

1. Create a Go to Web Page - to open the target website

To begin, enter the URL of the target website.

Enter the Flipkart search URL into the search box at the center of the home screen
Click Start to create a new task in Custom Task

2. Start auto-detection - to create a workflow

The auto-detect function in Octoparse can identify the structure of the page and automatically generate a collection process.

Click Auto-detect web page data on the Tips box and wait for the detection to complete

Delete unwanted fields

Untick Add a page scroll and then click Create workflow

The workflow is then generated as below:

3. Modify XPath for the pagination - to successfully go to the next page

For pagination to work correctly, an accurate XPath is essential.

Click Pagination
Input the modified XPath in Matching XPath: //span[normalize-space()='Next']
Click Apply to save the modification
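To see what the expression //span[normalize-space()='Next'] selects, the sketch below reproduces its logic in Python. It is an illustration under stated assumptions, not Octoparse's own implementation: the HTML fragment is invented for the example, and because the standard library's `xml.etree.ElementTree` supports only a limited XPath subset without functions, `normalize-space()` is emulated by a small helper (`normalize_space`, a hypothetical name).

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pagination markup, invented for illustration only;
# Flipkart's real HTML is more complex and may change.
SAMPLE = """
<nav>
  <a href="/search?page=1"><span>Previous</span></a>
  <a href="/search?page=3"><span>
      Next
  </span></a>
  <a href="/help"><span>Next steps</span></a>
</nav>
"""

def normalize_space(text):
    """Mimic XPath normalize-space(): collapse whitespace runs, then trim."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")

def find_next_spans(root):
    """Return the <span> elements that //span[normalize-space()='Next'] would match."""
    return [
        span for span in root.iter("span")
        if normalize_space("".join(span.itertext())) == "Next"
    ]

root = ET.fromstring(SAMPLE.strip())
matches = find_next_spans(root)
print(len(matches), "matching span(s)")
```

The match is exact after whitespace is normalized, so a span reading "Next" surrounded by line breaks is selected, while "Previous" and "Next steps" are not. That is why this expression is a reasonably precise target for the next-page button.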

4. Run the task - to get your target data

Click the Save button first to save all the settings you have made
Then click Run to run your task either locally or in the cloud

Select Standard Mode to run the task on your local device

Wait for the task to complete

Below is sample data from the local run. Excel, CSV, HTML, and JSON formats are available for export.
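Once the data is exported, comparing products takes only a few lines of code. The following is a minimal sketch for a CSV export, under several assumptions: the rows are made up for illustration, the column names `Title` and `Price` are hypothetical (yours depend on the field names in your task), and prices are assumed to be whole rupee amounts.

```python
import csv
import io

# Made-up rows standing in for an exported CSV file; the column names
# ("Title", "Price") are hypothetical and depend on your task's fields.
SAMPLE_CSV = """Title,Price
Plain Cotton T-Shirt,₹299
Striped Polo T-Shirt,"₹1,049"
Printed Round Neck T-Shirt,₹449
"""

def parse_price(raw):
    """Turn a price string such as '₹1,049' into an integer (whole rupees assumed)."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return int(digits) if digits else None

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    row["PriceValue"] = parse_price(row["Price"])

# Sort cheapest first so similar products can be compared side by side,
# skipping rows whose price could not be parsed.
by_price = sorted(
    (r for r in rows if r["PriceValue"] is not None),
    key=lambda r: r["PriceValue"],
)
for r in by_price:
    print(r["PriceValue"], r["Title"])
```

To work with a real export, replace `io.StringIO(SAMPLE_CSV)` with an opened file, for example `open(path, newline="", encoding="utf-8")`, after checking the encoding of your exported file.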

TIP: Local runs are great for quick runs and small amounts of data. If you are dealing with more complicated tasks or large amounts of data, Run in the Cloud is recommended for higher speed. You are welcome to try this premium feature by signing up for the 14-day free trial here. Tasks can be scheduled hourly, daily, or weekly, with data delivered regularly.

Scrape company info from Goodfirms.co

Scrape product info from Myntra (Jabong)

Scrape data from Duckduckgo search results

Scrape product info from Bol.com

Scrape product info from Etsy