
How to Bulk Download Images from a Website?

Tuesday, May 10, 2016 3:46 AM


 

Many product webpages use image carousels to display multiple images as slides which you can usually flip through manually.

In this tutorial, I will show you how to extract the URLs of images into your desired format so that you can bulk download them.  

 

You may need this link to follow along:

 https://www.ebay.com/itm/Lenovo-Legion-Y540-15-6-144Hz-i7-9750H-16GB-RAM-256GB-SSD-GTX-1660-Ti-Office/303553933195

 An image carousel

Open the link in Octoparse's built-in browser to create a new task.

1. Scrape one image into one column

Simply select one of the images, and select "Extract the URL of the selected image" on the Tips panel.

Repeat the same process for each of the other images, so that each image URL is extracted into its own column.

 

2. Scrape images into different lines

It is also possible to scrape images to different lines of the same column using a loop extract action. 

1) Select the first image

2) Go on to select the second image and choose "Extract image URLs". 
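Once the image URLs are exported with one URL per line, the bulk download itself can be done outside Octoparse. The following is a minimal Python sketch, not part of Octoparse: the helper names `filename_for` and `download_all` and the `images` output folder are assumptions made for illustration.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def filename_for(url, index):
    """Derive a local file name from an image URL, prefixed with its row index."""
    # Use the last path segment; fall back to "image" if the path has none.
    name = os.path.basename(urlparse(url).path) or "image"
    return f"{index:03d}_{name}"


def download_all(urls, out_dir="images"):
    """Download each URL into out_dir and return the list of saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        target = os.path.join(out_dir, filename_for(url, index))
        urlretrieve(url, target)  # network call; may raise on network or HTTP errors
        saved.append(target)
    return saved
```

Calling `download_all(list_of_urls)` with the exported URLs would save the files as `001_...`, `002_...`, and so on. The index prefix keeps files apart when several URLs end in the same file name.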

 

3. Scrape all images into one column

There are two ways to scrape all images into one column.

Option 1. Merge the extracted image URLs into one line

Once you've extracted the image URLs into different lines with a loop (following the steps in "Scrape images into different lines"), you can merge those lines into one single line.

1) Click the "More" icon for the data field, then select "Merge multiple rows of data into one"
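To illustrate what merging does to the data, here is a small Python sketch of the same idea applied to an exported list of URLs. The ` | ` separator and the function names are assumptions for illustration; the separator Octoparse actually uses may differ.

```python
SEPARATOR = " | "  # assumed separator, chosen because URLs do not contain spaces


def merge_rows(rows, separator=SEPARATOR):
    """Join the non-empty row values into one single-line string."""
    cleaned = (value.strip() for value in rows if value)
    return separator.join(value for value in cleaned if value)


def split_merged(cell, separator=SEPARATOR):
    """Reverse merge_rows: turn one merged cell back into a list of values."""
    return [part.strip() for part in cell.split(separator) if part.strip()]


rows = [
    "https://example.com/img/1.jpg",  # placeholder URLs, not real images
    "https://example.com/img/2.jpg",
    "",
]
merged = merge_rows(rows)
```

Splitting the merged value again with `split_merged(merged)` gives back the individual URLs, which is useful if you later need one URL per line for downloading.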

 

Option 2. Scrape the HTML code of the carousel and extract the image URLs from the code with a regular expression

1) Select the entire carousel and select "Extract the outer HTML of the selected element"

2) Click the "More" icon for the field and select "Clean data".

3) Click "Add Step" and choose "Matching with Regular Expression"

4) Inspect the code to find the starting value and ending value of the image URL.

5) Click "Try the RegEx tool"

6) Enter the "Start with" and "End with" values to generate a RegEx and apply the setting. 

7) Tick "Match all" and confirm
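The "Start with" / "End with" idea in the steps above can be sketched in Python as follows. This is only an illustration of the technique, not the expression Octoparse generates. The sample HTML, the `src="` and `"` boundary values, and the function names are assumptions.

```python
import re


def build_pattern(start, end):
    """Build a regex that captures the shortest text between start and end."""
    return re.compile(re.escape(start) + r"(.*?)" + re.escape(end), re.DOTALL)


def match_all(html, start, end):
    """Return every captured value, similar in spirit to ticking "Match all"."""
    return build_pattern(start, end).findall(html)


# Hypothetical carousel markup with placeholder URLs.
html = (
    '<ul><li><img src="https://example.com/img/1.jpg" alt="a"></li>'
    '<li><img src="https://example.com/img/2.jpg" alt="b"></li></ul>'
)
urls = match_all(html, 'src="', '"')
```

The lazy quantifier `.*?` stops at the first ending value it meets, so each match covers a single URL instead of running on to the last quote in the HTML.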

 

Tips!

1. The image URLs scraped are thumbnail URLs. If you need to get the full image URLs, you can continue to add steps to reformat the field.

Please check this tutorial:

How to scrape the full image URLs instead of thumbnails?
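As a rough illustration of reformatting a thumbnail URL, the Python sketch below rewrites a size marker in the file name. It assumes the thumbnail size is encoded as `s-l<number>` at the end of the URL (for example `s-l64.jpg`) and that a larger number such as 1600 returns a larger image. Both are assumptions that should be checked against the URLs you actually scraped. The function name and the sample URL are hypothetical.

```python
import re

# Assumed naming scheme: ".../s-l<digits>.<extension>" at the end of the URL.
THUMBNAIL_SIZE = re.compile(r"/s-l\d+(\.\w+)$")


def to_full_size(url, size=1600):
    """Replace the assumed size marker; URLs without it are returned unchanged."""
    return THUMBNAIL_SIZE.sub(rf"/s-l{size}\g<1>", url)


thumbnail = "https://example.com/images/g/abc/s-l64.jpg"  # placeholder URL
full = to_full_size(thumbnail)
```

URLs that carry a query string after the file extension would not match this pattern and would need a looser expression.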

 

 

If you need any help with task configuration or data collection, submit a ticket to our support team! We'll get back to you within 24 hours.

 

Happy Data Hunting!

Author: The Octoparse Team
