undefined
Blog > Web Scraping > Post

How to scrape and bulk download images from any website

Tuesday, February 23, 2021

Table of Contents

How to get images from a website

Get prepared with a web scraping tool

Get all image URLs in 3 steps

Bulk download images in seconds

Takeaway

 

How to get images from a website

As you may notice, in many cases we need to save a list of images from a website and it can be very tiring and tedious work just by clicking and saving images one by one.

In fact, a web scraping tool is a perfect pick to automate this job. Instead of endless clicks across web pages, you just need to configure a task within 5 minutes and the robot will get you all the image URLs. Copy them into a bulk image downloader, you will get things done in just 10 minutes.

 

Get prepared with a web scraping tool

The first step is to download the web scraping tool on your computer. Octoparse is the tool we will use in this task. Don’t worry. This is a foolproof guide and you don’t need any programming experience to get started!

 

download free web scraping tool

 

Note: Octoparse offers a free plan for our users. You don’t need to pay for any features mentioned in this guide.


 

Get all image URLs in 3 steps

#Step 1: Create a task

1)Launch Octoparse. Enter the URL of the webpage we are scraping from. Then click the “Start” button to proceed.

 

start building an image scraper


Sample URL: https://www.aliexpress.com/w/wholesale-pillow-decor-home.html?spm=a2g0o.home.100000001.2.650c2145tkm2Bo

(If this link goes invalid, just use another product listing link in Aliexpress)

 

2)As the page renders in the Octoparse, find the Tips Panel on the upper right and click Auto-detect web page data to proceed auto-detection. Auto-detection is a feature that can help you detect and select valuable data on the page. You don’t even do the point and click, Octoparse offer options for you.


auto detect all images on the web page

 

3)As the detection completes, a data preview box will pop up where you can check what data you are getting from the current set. Click Switch auto-detect results to pick across different options. Click Create workflow to confirm and create a task.

 

create an image scraper

That’s it! Now you have built your Aliexpress scraper!

 

#Step 2: Edit the task

In this step, we will check the pagination and amend the Xpath if necessary. What’s pagination? Well, it is a setting to let the scraper click through pages so whenever it finishes scraping the current page, it will click on the next page and continue the scraping.

 

1)Check the workflow created by Octoparse on the left and click the “Pagination” loop. Scroll down the page on the built-in browser to where the “Next” button lies so that we can see if the scrape has selected the right button to turn pages.


check the pagination

 

2)The picture shows that the scraper has selected the “Previous” button(in the red rectangle) instead of “Next”. Clicking Previous is not a way to turn pages, robot. Let’s correct it.

 

check the pagination on Octoparse

3)Double click the Pagination loop, you will reach in the settings interface. In the bar there is a line of code, that’s auto-created Xpath to locate the “Next” button. How to amend it?
Click the little arrow on it’s right, and click the “Next” button in the browser. Yea, you are telling the robot, hey this is the button I want to click. Fairly simple right?

 

amending xpath by clicking

 

4)The last move in the step is to set automatic scrolls. This can make sure the page gets completely loaded before the actual scraping takes place. Just a few clicks will do!

 

 

set automatic scrolls

 

5)Go back to the workflow and double click “Go to Web Page”, go to “after loading page” and check “scroll down the page after it is loaded”, set a 50 times scroll-down with a half-second between 2 scrolls.

 

Congratulations. Now you have successfully configured a scraper!

 

#Step 3: Run the task

Just click the “Run” button on the above and run the task on your device. You will get thousands of data lines in a few minutes. That’s the speed of Octoparse! Once you get a hang of this, you must regret wasting time on manual jobs getting web data in the early days.


run the image scraper

 

This is the data Octoparse got for me in 5 minutes. In fact, it scrapes not only all the images URLs, but also the according product details. This could be extremely helpful for people who are doing product research and eCommerce analysis.

 

Bulk download images in seconds

With all image URLs in an Excel file, what you need now is a bulk download tool. I would recommend the Chrome extension: Tab Save.

 

bulk download images

 

Copy and paste the image URLs into Tab Save and click to download, all images will be saved in your computer in seconds.




all images downloaded

 

The whole process may take just 10 minutes and you will get thousands of images (even product details accordingly) from Aliexpress. Come on, this is definitely worth a try!



Takeaway

    1. Octoparse is a web scraping tool that offers a free plan for users. It is particularly designed for no-code use. The best tool for starting your web scraping project from scratch!
    2. You can use auto-detection feature in Octoparse to get data from any website. This will free you from confusing scraper building steps that you are very likely facing in other tools.
    3. You don’t need to write code to amend Xpath! Point and click is the most friendly way for you to teach the robot what to do.
    4. “How long will it take to scrape 2000+ image URLs?”
      “5 Minutes.”

 

 

Related resources

 

How to Extract Data from Twitter Without Coding

Top 5 Social Media Scraping Tools for 2020

Scrape video information from YouTube

Scrape public posts from Facebook

 

 

Download Octoparse to start web scraping or contact us for any
question about web scraping!

Contact Us Download
We use cookies to enhance your browsing experience. Read about how we use cookies and how you can control them by clicking cookie settings. If you continue to use this site, you consent to our use of cookies.
Accept decline