How to Scrape Data from Yahoo Finance

In business, a company's financial health is crucial, and this information reaches the public through various platforms and forums. Yahoo Finance is one of the most widely used websites for viewing and extracting financial details of companies around the world. This article introduces a no-code Yahoo Finance scraper and a Python approach, so you can scrape Yahoo Finance easily and quickly.

Is Yahoo Finance Web Scraping Legal?

In short, yes. Most of the data available on the Yahoo Finance website is public information. You should still pay attention to your local web scraping laws and rules when scraping and using this data.

You can scrape the following main types of data from Yahoo Finance:

News updates on stock markets
Current stock prices of companies
Rising or falling trends in a company's stock price
Mutual funds and ETFs
Values of currencies, including cryptocurrencies

Scraping this information can be extremely useful to businesses: analyzing the data can yield the insights needed to improve business strategies.

Does Yahoo Finance have an API?

The Yahoo Finance API was a service that provided financial data and analytics to developers and businesses. It offered a range of APIs that gave developers access to a variety of financial data, including current and historical stock prices, exchange rates, financial statements, news and articles, and more.

However, Yahoo Finance officially discontinued its API service in 2017. That means you need to find alternative ways to get data from Yahoo Finance, such as scraping the website.

Scrape Data from Yahoo Finance Without Coding

There are multiple ways to scrape Yahoo Finance, and some of them require no coding knowledge. Web scraping tools and software are available online. Here we'll introduce Octoparse, a widely used web scraping tool that offers a free plan. It extracts bulk data from multiple websites and supports up to 10,000 links in one go.

Octoparse can auto-detect data on the Yahoo Finance site, and it also has preset templates that let you scrape data within a few clicks. It allows you to export the extracted data in multiple formats, including Excel, CSV, and databases. You can also schedule scraping tasks to run hourly, daily, or weekly.

3 steps to extract Yahoo Finance data

After you have downloaded and installed Octoparse, follow the steps below to scrape data from Yahoo Finance.

Step 1: Enter the URL copied from Yahoo Finance

Open the Yahoo Finance page you want to scrape and copy its link. Paste the URL into the Octoparse search bar and click the Search button. Octoparse will start auto-detecting the page data.

Step 2: Make changes to your workflow

Wait for auto-detection to finish, then create a workflow. You can see the extracted data in the preview section, and Octoparse lets you change the data fields you want to extract.

Step 3: Run the workflow to extract Yahoo Finance data

After all the changes have been saved, click the Run button to scrape the data. You can download the results to your local device or save them to a database.

If you still have questions about the details, see the tutorial "Scrape cryptocurrencies information from Yahoo Finance", or learn the steps to extract stock info from Yahoo Finance.

How to Pull Yahoo Finance Data with Python

To scrape Yahoo Finance with Python, we can use several open-source Python modules. One of the simplest and most beginner-friendly ways to scrape financial data is to use the BeautifulSoup library. Let us look at this method step by step.

Step 1: Install the dependencies on the device you are using.

pip install beautifulsoup4
pip install requests
pip install pandas

Step 2: Import the modules

#import modules
import requests
from bs4 import BeautifulSoup
import pandas as pd

Step 3: Get the webpage URL and check for errors.

# Request the page and keep the response object
my_url = "https://finance.yahoo.com/news"
response = requests.get(my_url)

# Check whether the request succeeded and preview the returned HTML
print("response.ok : {} , response.status_code : {}".format(response.ok, response.status_code))
print("Preview of response.text : ", response.text[:500])

Step 4: Create a function to retrieve HTML data of the webpage as a Beautiful Soup object.

# Utility function to download a webpage and return a BeautifulSoup document
def get_page(url):
    response = requests.get(url)
    if not response.ok:
        print('Status code:', response.status_code)
        raise Exception('Failed to load page {}'.format(url))
    page_content = response.text
    doc = BeautifulSoup(page_content, 'html.parser')
    return doc

# Function call
doc = get_page(my_url)
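
Some sites respond differently to requests that carry the default requests User-Agent, and a request without a timeout can hang indefinitely. The sketch below shows one possible way to harden the download step from get_page. The fetch_html name, the session parameter, and the header value are illustrative assumptions, not something Yahoo Finance documents or requires.

```python
import requests

# Assumption: a generic browser-like User-Agent; the exact value is arbitrary.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def fetch_html(url, session=None, timeout=10):
    """Return the page body as text, raising on HTTP errors."""
    # A caller may pass its own session, e.g. to reuse connections.
    if session is None:
        session = requests.Session()
    response = session.get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses.
    response.raise_for_status()
    return response.text
```

The returned text can be passed to BeautifulSoup(html, 'html.parser') in the same way get_page uses response.text.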

Step 5: Extract and store the information.

# Find the anchor tags that share the class used for news headlines.
# The class name reflects the page markup and may change over time.
a_tags = doc.find_all('a', {'class': "js-content-viewer"})
print(len(a_tags))

news_list = []

# Print every headline found and collect it in a list
for i, tag in enumerate(a_tags, start=1):
    news = tag.text
    news_list.append(news)
    print("Headline " + str(i) + ": " + news)

# Save the headlines to a CSV file
news_df = pd.DataFrame(news_list, columns=['headline'])
news_df.to_csv('Market_News.csv', index=False)
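
Headline text scraped this way often carries stray whitespace and duplicates, and the link behind each headline is usually worth keeping too. The following sketch shows one way to handle that, run on a small inline HTML sample so it works offline. The sample markup and the extract_headlines helper are hypothetical; the js-content-viewer class is taken from the step above, and the live page may use different markup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<div>
  <a class="js-content-viewer" href="/news/a.html"> Stocks rise </a>
  <a class="js-content-viewer" href="/news/b.html">Oil slips</a>
  <a class="js-content-viewer" href="/news/a.html">Stocks rise</a>
  <a class="other" href="/about">About</a>
</div>
"""


def extract_headlines(html, css_class="js-content-viewer"):
    """Return unique (title, href) pairs for anchors with the given class."""
    doc = BeautifulSoup(html, "html.parser")
    seen = set()
    rows = []
    for tag in doc.find_all("a", class_=css_class):
        title = tag.get_text(strip=True)
        href = tag.get("href", "")
        # Skip empty titles and repeats of the same headline and link.
        if title and (title, href) not in seen:
            seen.add((title, href))
            rows.append((title, href))
    return rows


for number, (title, href) in enumerate(extract_headlines(SAMPLE_HTML), start=1):
    print(f"Headline {number}: {title} ({href})")
```

The resulting list of pairs can be passed to pd.DataFrame(rows, columns=['headline', 'link']) before saving to CSV.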

Final Thoughts

Scraping financial information can give you a competitive edge in the business world. Now that you have learned the basics of scraping Yahoo Finance, try extracting data from multiple sources and compiling it into a useful set of information. Explore different ways to apply the insights from this information so that the extracted data is put to maximum use. I hope this article inspires you to dive into the world of financial web scraping.
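
As a starting point for compiling data from several sources, the sketch below merges headline lists into one pandas table and tags each row with where it came from. The source names and headlines are made-up placeholders, and combine_sources is a hypothetical helper rather than part of any library.

```python
import pandas as pd


def combine_sources(headlines_by_source):
    """Build one DataFrame from a {source name: [headline, ...]} mapping."""
    frames = [
        pd.DataFrame({"source": name, "headline": headlines})
        for name, headlines in headlines_by_source.items()
    ]
    combined = pd.concat(frames, ignore_index=True)
    # Keep the first occurrence when two sources carry the same headline.
    return combined.drop_duplicates(subset="headline").reset_index(drop=True)


# Placeholder data: in practice each list would come from a scraper.
data = {
    "yahoo": ["Stocks rise", "Oil slips"],
    "other_site": ["Oil slips", "Gold steady"],
}
table = combine_sources(data)
print(table)
```

Tagging each row with its source keeps the combined table traceable, which helps when two sites report the same story differently.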