Web Scraping Case Study | Scraping Data from YouTube

Monday, April 24, 2017 5:01 AM

 

In this tutorial, I will show you, step by step, how to scrape data from YouTube.com with Octoparse.

Features covered

  • Build a list
  • Expand Selection Area
  • Local Extraction

 

Now, let's get started!

 

Step 1. Set up basic information 

  • Click "New Task" to start a new task, then fill in the basic information

 

Step 2. Navigate to the target website

 

Step 3. Create a list of items

To proceed with the extraction, we'll first need to build a list of items to extract from. 

Move your cursor over the YouTube listings that share a similar layout. These are the sections you will extract data from.

  • Click anywhere on the first listing section

Notice that the first click did not select the whole section. We need to expand the selection so that the entire section is included.

  • Click “Expand the selection area” until the outlined box includes all the content you want to scrape.
  • When prompted, click “Create a list of items” (sections with similar layout)
  • Click “Add current item to the list”

 

Now that the first item has been added to the list, we need to add the remaining items.

  • Click “Continue to edit the list”
  • Click a second section with a similar layout and, as before, expand the selection to include the whole section
  • Click “Add current item to the list” again

Now all the sections have been added to the list.

  • Click “Finish Creating List”
  • Click “loop”. This tells Octoparse to click on each section in the list and extract the selected data
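
For readers who want a code-level picture of what "a list of items with similar layout" plus a loop means, here is a minimal Python sketch. It does not show how Octoparse works internally. The markup, the `video-item` class name, and the helper names are all hypothetical stand-ins, and only the standard library is used.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a page of similarly laid-out listings.
# It is NOT YouTube's real HTML.
SAMPLE_HTML = """
<div class="video-item"><a href="/watch?v=a1">First video</a></div>
<div class="video-item"><a href="/watch?v=b2">Second video</a></div>
<div class="sidebar"><a href="/about">About</a></div>
"""


class ListingCollector(HTMLParser):
    """Collects the first link and the text of every block with a given class."""

    def __init__(self, item_class):
        super().__init__()
        self.item_class = item_class
        self.items = []
        self._current = None  # the listing being read, or None when outside one
        self._depth = 0       # nesting depth of <div> tags inside the listing

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._current is None:
            # Outside a listing: only a matching <div> starts a new item.
            if tag == "div" and self.item_class in classes:
                self._current = {"url": None, "title": ""}
                self._depth = 1
            return
        if tag == "div":
            self._depth += 1
        elif tag == "a" and self._current["url"] is None:
            self._current["url"] = attrs.get("href")

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data.strip()

    def handle_endtag(self, tag):
        if self._current is not None and tag == "div":
            self._depth -= 1
            if self._depth == 0:
                # The whole section has been read; add it to the list.
                self.items.append(self._current)
                self._current = None


def collect_listings(html, item_class="video-item"):
    parser = ListingCollector(item_class)
    parser.feed(html)
    return parser.items


listings = collect_listings(SAMPLE_HTML)
for item in listings:  # the "loop" over every section in the list
    print(item["url"], item["title"])
```

The sidebar block is skipped because it does not share the listing layout, which is the same reason the selection has to be expanded to cover one whole section before the list is built.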

 

Step 4. Select the data to be extracted

We have now arrived at the detail page we want to capture data from. Click on the specific data to extract.

  • Click the video info data field (in this example, “How Tall is Giant”)
  • Select “Extract text”
  • Follow the same steps to extract the other data.
  • Click "Save"

 

Step 5. Rename Data Fields

Rename any fields if necessary.
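
If the extracted data is later post-processed in code, the same renaming can be done with a simple mapping. The sketch below is only an illustration: the default names `Field1` and `Field2`, the sample row, and the `rename_fields` helper are assumptions, not names taken from Octoparse.

```python
# Hypothetical default field names mapped to clearer ones.
RENAMES = {"Field1": "title", "Field2": "views"}


def rename_fields(rows, renames):
    """Return new rows with keys renamed; keys not in the mapping are kept."""
    return [{renames.get(key, key): value for key, value in row.items()}
            for row in rows]


# Made-up sample row, not real extracted data.
rows = [{"Field1": "Sample video", "Field2": "1234", "url": "/watch?v=a1"}]
renamed = rename_fields(rows, RENAMES)
print(renamed)
```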

 

Step 6. Start running your task

  • After saving your extraction configuration, click “Next”
  • Select “Local Extraction”
  • Click “OK” to run the task on your computer.

Octoparse will automatically extract all the selected data. Check the "Data Extracted" pane for the extraction progress.

 

Step 7. Check the data and export

  • Check the data extracted
  • Click the "Export" button to export the results to an Excel file, a database, or another format, and save the file to your computer
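
As a rough illustration of what an export produces, the sketch below writes a few rows to CSV text with Python's standard `csv` module. CSV is used here as a simple stand-in for the export formats. The rows, the field names, and the `rows_to_csv` helper are made up for the example.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample rows, not real extracted data.
rows = [
    {"title": "Sample video", "views": "1234"},
    {"title": "Another video", "views": "56"},
]
csv_text = rows_to_csv(rows, ["title", "views"])
print(csv_text)
```

To save the result to disk instead, the same writer can be pointed at a file opened with `open(path, "w", newline="")`.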

Good job on completing this case tutorial. You can download and run this example on your own.
