How to Extract and Monitor Stock Prices from Yahoo! Finance
Wednesday, June 23, 2021
Hi! Today, Octoparse is going to show you how to extract and monitor stock prices from Yahoo! Finance. OK, let’s get started.
Auto-detect feature in Octoparse
The new Octoparse 8.2 interface is very intuitive. Once you paste a URL into the address bar, Octoparse automatically starts parsing the webpage and guesses what content you want to extract. In this case, we are going to enter the URL of the most active stocks page on Yahoo! Finance.
After you paste the URL, click the Start button and wait a moment while the page loads.
Before we move on to the next steps, let’s find out the reasons for scraping stock prices.
Why do we want to scrape stock prices?
Well, when you extract stock prices regularly and keep feeding the data into your research and data models, you can train your machine learning algorithms on it, which may later help you make better-informed decisions in the investment market.
As you may know, one of the applications of Octoparse is price monitoring. Not only does it track and monitor prices on a page, but it also extracts raw data from your competitors and scrapes real-time data within a few mouse clicks.
4 steps to scrape and extract stock prices
With the auto-detection feature, it takes 4 steps to scrape and extract stock prices.
Step 1: Switch Auto-detect results
When you see the link “Switch auto-detect results,” click it a few times until the table of most active stocks is highlighted in the browser.
Step 2: Take care of the pagination and generate a workflow
We need to take care of the pagination option. Click the "Edit link" and select the "Next page" button on the page to let Octoparse know where exactly the next button is.
Click the confirm button once you have selected the next button. Then uncheck "Add a page scroll" and generate a workflow.
Step 3: Set up the pagination option and review the workflow
As you can see, a workflow is generated on the left-hand side. The next thing we need to set up is the pagination option. Hover the mouse cursor over the Pagination bar and click the gear icon. Expand the "Exit Loop" panel and enter 2. This means we want the loop to quit after the next button on the webpage has been clicked twice.
Let’s review our workflow. First, it executes "Go to Web Page". Next, it enters a pagination loop, which contains the actual extraction: the "Extract Data" step extracts what you have preselected. Then the loop clicks the next button for you until the exit condition is met.
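If it helps to see the workflow as logic rather than as boxes, here is a rough Python sketch of the same idea. It is not how Octoparse is implemented. It assumes each pass through the loop extracts the current page once and then clicks the next button once. The names `run_workflow`, `load_page`, and `make_fake_pages` are made up for illustration, and the 25 rows per page are made-up sample data rather than real Yahoo! Finance output.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]


def run_workflow(
    load_page: Callable[[int], Optional[List[Row]]],
    max_iterations: int = 2,
) -> List[Row]:
    """Mimic the workflow: go to the page, then loop (extract, click next)."""
    rows: List[Row] = []
    page_index = 0  # "Go to Web Page" lands on the first page
    for _ in range(max_iterations):  # the "Exit Loop" setting
        page = load_page(page_index)
        if page is None:  # no more pages, so stop early
            break
        rows.extend(page)  # the "Extract Data" step
        page_index += 1  # clicking the next button
    return rows


def make_fake_pages(page_count: int, rows_per_page: int) -> List[List[Row]]:
    """Build placeholder pages so the sketch runs without a browser."""
    return [
        [
            {"Symbol": f"SYM{p * rows_per_page + i}", "Price": "0.00"}
            for i in range(rows_per_page)
        ]
        for p in range(page_count)
    ]


pages = make_fake_pages(page_count=4, rows_per_page=25)


def load_page(index: int) -> Optional[List[Row]]:
    # Hypothetical stand-in for loading a page in the built-in browser.
    return pages[index] if index < len(pages) else None


rows = run_workflow(load_page, max_iterations=2)
print(len(rows))  # 50 with these made-up pages
```

With the exit value set to 2, only the first two of the four fake pages are read, which is why the loop setting controls how much data you end up with.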
Step 4: Save and run the task
Ok, let’s save the task and click the "Run" button to fire off our scraping task.
There are a few options for where and when you want to run the task. For this example, select "Run task on your device".
OK, we can see the extracted data start coming in, and the task will stop after 50 lines of data are extracted.
Export the data and open up the spreadsheet
When it is completed, click the "Export data" button.
Now open the saved spreadsheet and compare the data with what you previewed. Make sure the data looks good. As we scroll down the spreadsheet, the last row is row 51, which makes sense because the first row is the header: 51 minus 1 is 50 rows of data.
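If you would rather not count rows by scrolling, the same check can be sketched in a few lines of Python. This assumes you exported the data as a CSV file. The function name `count_data_rows` and the column names in the sample are made up for illustration, and the sample text below stands in for a real export so the snippet runs on its own.

```python
import csv
import io
from typing import List, TextIO, Tuple


def count_data_rows(handle: TextIO) -> Tuple[List[str], int]:
    """Return the header row and the number of non-empty data rows."""
    reader = csv.reader(handle)
    header = next(reader, [])  # the first row is the header
    data_rows = sum(1 for row in reader if any(cell.strip() for cell in row))
    return header, data_rows


# Made-up stand-in for an exported file: 1 header line plus 50 data lines.
sample_lines = ["Symbol,Name,Price"] + [
    f"SYM{i},Example {i},0.00" for i in range(50)
]
sample = io.StringIO("\n".join(sample_lines) + "\n")

header, data_rows = count_data_rows(sample)
print(header)
print(data_rows)  # 51 lines in total minus 1 header line = 50
```

To run it on your own export, pass an opened file instead of the sample, for example `open("export.csv", newline="", encoding="utf-8")`, where `export.csv` is a placeholder for whatever name you saved the file under.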
There you have it. In only a few minutes we have extracted the most active stocks from Yahoo! Finance and saved the data to a spreadsheet.
We have much more to come on Octoparse and web scraping. If you like our channel, please give us a thumbs up and subscribe. Thank you!