Scrape cryptocurrency information from Yahoo Finance

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

A cryptocurrency is a digital or virtual currency that is secured by cryptography, which makes it nearly impossible to counterfeit or double-spend. Many cryptocurrencies are decentralized networks based on blockchain technology—a distributed ledger enforced by a disparate network of computers.

Cryptocurrency traders need to monitor price fluctuations closely because prices can change within seconds. Octoparse can run a scraping task on a schedule so that the collected data stays up to date.

In this tutorial, we are going to show you how to scrape cryptocurrency info from Yahoo Finance.

For Yahoo Finance, you can also use our ready-to-use "Task Template" on the main screen of the Octoparse scraping tool. All you need to do is enter a few parameters and the task is ready to go. For further details, see: Task Templates

To follow along, you may want to use this URL in the tutorial:


We will scrape data such as the Symbol and Name from the cryptocurrency chart with Octoparse.


The main steps are outlined below, and you can download the sample task file here.


1. Create a Go to Web Page action - to open the target web page

  • Enter the page URL on the home screen and click Start to create a new task


2. Auto-detect web page data - to create the workflow

  • Choose Auto-detect web page data and wait for detection to complete

  • Click Switch auto-detect results on the Tips panel until the table data is selected

  • Uncheck Add a page scroll

  • Click Create workflow

  • Click the Click to Paginate action

  • Increase the AJAX timeout to 7-10 seconds

  • Click Apply to save

4.png

3. Extract data - to refine the data fields

  • Switch to Vertical View

  • Rename fields by double-clicking each field name

  • Delete unwanted fields by selecting them and clicking the trash bin icon


Note: A field name can only include letters, numbers, and "_". Also, it must start with a letter.
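
The naming rule in the note can be checked before you rename fields. The snippet below is a minimal sketch of that rule in Python, not part of Octoparse itself. The helper name is_valid_field_name is hypothetical, and the sketch assumes that "letters" means ASCII letters.

```python
import re

# Letters, digits, and "_" only; the first character must be a letter.
# Assumption: "letters" is read as ASCII letters A-Z and a-z.
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_field_name(name: str) -> bool:
    """Return True if `name` satisfies the field naming rule in the note."""
    return FIELD_NAME.fullmatch(name) is not None


# Hypothetical candidate names, used only to exercise the rule.
for candidate in ["Symbol", "Market_Cap", "24h_Change", "Price (USD)"]:
    print(candidate, is_valid_field_name(candidate))
```

Names such as "24h_Change" fail because they start with a digit, and "Price (USD)" fails because of the space and parentheses.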

We need to modify the XPath for some fields to make the data scraping more precise.

  • Price: //fin-streamer[@data-field="regularMarketPrice"]

  • Marketcap: //fin-streamer[@data-field="marketCap"]
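
To see what these XPaths select, you can try the same attribute match outside Octoparse. The sketch below uses Python's standard xml.etree.ElementTree module on a simplified, hypothetical table row. The real Yahoo Finance markup is larger and may differ, and the values shown are made up for illustration. ElementTree supports only a subset of XPath, so the paths are written relative to the row (.//) rather than the whole document (//).

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified table row; values are illustrative only.
ROW = """
<tr>
  <td><a>BTC-USD</a></td>
  <td><fin-streamer data-field="regularMarketPrice">100.50</fin-streamer></td>
  <td><fin-streamer data-field="regularMarketChange">-1.25</fin-streamer></td>
  <td><fin-streamer data-field="marketCap">2.5B</fin-streamer></td>
</tr>
"""


def field_text(row, data_field):
    """Return the text of the fin-streamer whose data-field matches, or None."""
    node = row.find(f".//fin-streamer[@data-field='{data_field}']")
    return node.text if node is not None else None


row = ET.fromstring(ROW.strip())
price = field_text(row, "regularMarketPrice")
market_cap = field_text(row, "marketCap")
print(price, market_cap)
```

Matching on the data-field attribute rather than on column position is what makes the selection precise: the row above contains three fin-streamer elements, and each XPath picks exactly one of them.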


4. Modify XPath of Pagination - to fix endless scraping

The auto-generated XPath of Pagination needs to be modified. Otherwise the task never stops, because Octoparse keeps scraping the last page over and over. Check out details about this issue here.

  • Click on Pagination

  • Enter the new XPath: //button[not(@disabled)]//span[text()="Next"]

  • Click Apply to confirm
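
The effect of the not(@disabled) condition can be illustrated with a small Python sketch. It assumes that the Next button carries a disabled attribute on the last page, which is what the XPath above implies but which the page markup may implement differently. Python's standard ElementTree does not support not(), so the helper find_next_target (a hypothetical name) emulates the XPath by hand on two simplified, hypothetical fragments.

```python
import xml.etree.ElementTree as ET


def find_next_target(page_xml: str):
    """Roughly emulate //button[not(@disabled)]//span[text()="Next"]."""
    root = ET.fromstring(page_xml.strip())
    for button in root.iter("button"):
        if "disabled" in button.attrib:
            continue  # not(@disabled): skip disabled buttons
        for span in button.iter("span"):
            if span.text == "Next":
                return span
    return None


# Hypothetical fragments: a middle page and the last page.
MIDDLE_PAGE = '<div><button><span>Next</span></button></div>'
LAST_PAGE = '<div><button disabled="disabled"><span>Next</span></button></div>'

print(find_next_target(MIDDLE_PAGE) is not None)  # True
print(find_next_target(LAST_PAGE) is not None)    # False
```

When nothing matches on the last page, there is no element left to click, so the pagination loop has a natural stopping point.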



5. Run the task - to get your target data

  • Click Save

  • Click Run on the upper left side

  • Select Run on your device to run the task on your computer, or select Run in the Cloud to run the task in the Cloud (for premium users only). You can also schedule a task to update the data frequently

You can export the resulting data in one of the provided formats, such as Excel, CSV, or JSON, or send it to your database.

Here is the sample output.

13.png