How to Bulk Download Images from a Website?
Tuesday, May 10, 2016 3:46 AM
Many product webpages use image carousels to display multiple images as slides which you can usually flip through manually.
In this tutorial, I will show you how to extract the URLs of images into your desired format so that you can bulk download them.
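Once the image URLs have been exported, the download step itself could be scripted. The following is a minimal sketch in Python, assuming the URLs are already available as a plain list; the function names `filename_for` and `download_all` and the `images` folder are illustrative choices, not part of Octoparse.

```python
import os
import urllib.request
from urllib.parse import urlparse


def filename_for(url, index):
    """Derive a local file name from the URL path, falling back to an index."""
    name = os.path.basename(urlparse(url).path)
    return name if name else f"image_{index}.jpg"


def download_all(urls, out_dir="images"):
    """Download each URL into out_dir and return the list of saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for i, url in enumerate(urls, start=1):
        target = os.path.join(out_dir, filename_for(url, i))
        try:
            urllib.request.urlretrieve(url, target)
            saved.append(target)
        except OSError as err:  # URLError and HTTPError are subclasses of OSError
            print(f"skipped {url}: {err}")
    return saved
```

Calling `download_all(urls)` with the exported list saves each file under its original name. Two URLs that end in the same file name would overwrite each other, so a real script may want to add the index to every name.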
You may need this link to follow along:
Open the link in Octoparse's built-in browser to create a new task.
1. Scrape one image into one column
Simply select one of the images, and select "Extract the URL of the selected image" on the Tips panel.
Repeat the same process to fetch all the other image URLs.
2. Scrape images into different lines
It is also possible to scrape images to different lines of the same column using a loop extract action.
1) Select the first image
2) Go on to select the second image and choose "Extract image URLs".
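Conceptually, the loop visits each image in the carousel and writes one URL per row. The sketch below illustrates that idea in Python with the standard-library HTML parser; it is not how Octoparse works internally, and the sample markup and the class name `ImageSrcCollector` are made up for the example.

```python
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.rows.append(src)


# Hypothetical carousel markup; the real page's structure will differ.
sample_html = """
<ul class="carousel">
  <li><img src="https://example.com/thumbs/a.jpg"></li>
  <li><img src="https://example.com/thumbs/b.jpg"></li>
  <li><img src="https://example.com/thumbs/c.jpg"></li>
</ul>
"""

collector = ImageSrcCollector()
collector.feed(sample_html)
for row in collector.rows:
    print(row)  # one URL per line, like one URL per row of the data field
```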
3. Scrape all images into one column
There are two ways to achieve scraping all images into one column.
Option 1. Merge the extracted image URLs into one line
Once you've extracted the image URLs into different lines with a loop (following the steps in "Scrape images into different lines"), you can merge those lines into one single line.
1) Click the "More" icon for the data field, then select "Merge multiple rows of data into one"
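The effect of the merge can be pictured as joining the row values into one string. Here is a small Python sketch of that idea; the `"; "` separator is an assumption made for the example, and the separator Octoparse actually uses may differ.

```python
def merge_rows(rows, separator="; "):
    """Join the non-empty values from several rows into a single line."""
    return separator.join(value.strip() for value in rows if value.strip())


# Hypothetical rows, as they might look after the loop extraction.
rows = [
    "https://example.com/thumbs/a.jpg",
    "https://example.com/thumbs/b.jpg",
    "",  # empty rows are skipped
    "https://example.com/thumbs/c.jpg",
]

merged = merge_rows(rows)
print(merged)
```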
Option 2. Scrape the HTML code of the carousel and match out the image URLs from the code
1) Select the entire carousel and select "Extract the outer HTML of the selected element"
2) Click the "More" icon for the field and select "Clean data".
3) Click "Add Step" and choose "Matching with Regular Expression"
4) Inspect the code to find the starting value and ending value of the image URL.
5) Click "Try the RegEx tool"
6) Enter the "Start with" and "End with" values to generate a RegEx and apply the setting.
7) Tick "Match all" and confirm
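The "Start with" / "End with" approach amounts to capturing the shortest text between two fixed markers, and "Match all" corresponds to returning every match rather than only the first. The Python sketch below shows the same idea with the `re` module; the expression the RegEx tool generates may look different, and the HTML snippet and the helper name `build_pattern` are made up for the example.

```python
import re


def build_pattern(start, end):
    """Build a regex that captures the shortest text between start and end."""
    return re.compile(re.escape(start) + r"(.*?)" + re.escape(end), re.DOTALL)


# Hypothetical outer HTML of a carousel.
html = (
    '<img src="https://example.com/thumbs/a.jpg" alt="a">'
    '<img src="https://example.com/thumbs/b.jpg" alt="b">'
)

pattern = build_pattern('src="', '"')

first_only = pattern.search(html).group(1)  # without "Match all"
all_urls = pattern.findall(html)            # with "Match all"

print(first_only)
print(all_urls)
```

Escaping the start and end values with `re.escape` keeps characters such as `.` or `?` in the markers from being read as regex syntax.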
Tips! 1. The image URLs scraped are thumbnail URLs. If you need the full-size image URLs, you can add further steps to reformat the field. Please check this tutorial:
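As an illustration of the kind of reformatting involved, the sketch below rewrites a thumbnail URL into a full-size one in Python. It rests entirely on an assumption: that the site marks thumbnails with a `_<width>x<height>` suffix before the file extension. Real sites use many different schemes, so inspect a thumbnail URL and its full-size counterpart before adapting the pattern.

```python
import re

# Assumption: thumbnails differ from full-size images only by a
# "_<width>x<height>" suffix right before the file extension.
SIZE_SUFFIX = re.compile(r"_\d+x\d+(?=\.\w+$)")


def to_full_size(thumbnail_url):
    """Remove the assumed size suffix; URLs without it are returned unchanged."""
    return SIZE_SUFFIX.sub("", thumbnail_url)


# Hypothetical thumbnail URL.
print(to_full_size("https://example.com/img/shoe_80x80.jpg"))
```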
If you need any help with task configuration or data collection, submit a ticket to our support team! We'll get back to you within 24 hours.
Happy Data Hunting!
Author: The Octoparse Team