Scrape Betting Odds for Sports Analytics
Thursday, June 07, 2018
The 2018 FIFA World Cup, held in Russia from June 14th to July 15th, is around the corner. Dynamic betting odds scraped from online betting agencies are an important statistical resource for sports analytics, such as winner prediction and team valuation, or simply for placing lower-risk bets.
In this article, I would like to address the following three questions:
- Why should we scrape betting odds?
- How could we scrape betting odds more easily and quickly?
- How could we automatically and consistently update betting odds in a database?
Why Scrape Betting Odds?
Professional betting agencies make their fortune by setting their odds in a way that maximizes their profits and minimizes their risks. They gather comprehensive information to make their predictions and then adjust their odds accordingly.
On one hand, the movement of the betting odds reflects where people place their bets: the more money is bet on an outcome, the lower its odds become. On the other hand, betting agencies also adjust their odds on their own initiative to maximize their profit, for example by lowering the odds on the likely winner to avoid losing too much money.
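To make the link between odds and expectations concrete, here is a minimal sketch that converts odds into implied probabilities. It assumes the odds are in decimal format (the total payout per unit staked), and the function name and sample numbers are made up for illustration. The amount by which the raw probabilities add up to more than 1 is the agency's built-in margin.

```python
def implied_probabilities(decimal_odds):
    """Return (normalized implied probabilities, bookmaker margin).

    decimal_odds maps an outcome name to its decimal odds. Decimal
    format is an assumption; the article does not specify the format.
    """
    for outcome, odds in decimal_odds.items():
        if odds <= 1.0:
            raise ValueError(f"decimal odds must be above 1.0, got {odds} for {outcome}")

    # The raw implied probability of an outcome is 1 / decimal odds.
    raw = {outcome: 1.0 / odds for outcome, odds in decimal_odds.items()}
    total = sum(raw.values())

    # Whatever exceeds 1.0 is the margin built into the odds.
    margin = total - 1.0

    # Rescale so the probabilities sum to exactly 1.
    fair = {outcome: p / total for outcome, p in raw.items()}
    return fair, margin


# Illustrative odds only, not scraped data.
fair, margin = implied_probabilities({"home": 1.8, "draw": 3.6, "away": 4.5})
for outcome, probability in fair.items():
    print(f"{outcome}: {probability:.3f}")
print(f"margin: {margin:.3f}")
```

If the odds on an outcome drop between two scrapes, its implied probability rises, which is the movement described above.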
Beyond this, it is also important to find out how strongly the betting agencies’ odds correlate with the actual match outcomes. We can scrape the agencies’ reported odds and the favored outcome of each game, then compare them to the actual match results. In this way, we can evaluate how much the betting agencies’ odds are really worth.
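One possible way to run this comparison is sketched below. It is my own choice of metrics, not a method prescribed by any tool: it counts how often the favorite (the outcome with the lowest odds) actually won, and computes a Brier score, where lower means the implied probabilities were closer to what happened. Decimal odds, the function name, and the sample matches are all assumptions for illustration.

```python
def evaluate_odds(matches):
    """Compare odds with results.

    matches is a list of (decimal_odds, actual_outcome) pairs, where
    decimal_odds maps outcome names to decimal odds (assumed format).
    Returns (favorite hit rate, mean Brier score).
    """
    if not matches:
        raise ValueError("need at least one match")

    hits = 0
    brier_total = 0.0
    for odds, actual in matches:
        # Turn the odds into probabilities that sum to 1.
        raw = {outcome: 1.0 / value for outcome, value in odds.items()}
        total = sum(raw.values())
        probs = {outcome: p / total for outcome, p in raw.items()}

        # The favorite is the outcome with the highest implied probability.
        favorite = max(probs, key=probs.get)
        if favorite == actual:
            hits += 1

        # Brier score: squared error against the 0/1 result, summed over outcomes.
        brier_total += sum(
            (p - (1.0 if outcome == actual else 0.0)) ** 2
            for outcome, p in probs.items()
        )

    return hits / len(matches), brier_total / len(matches)


# Made-up matches for illustration only.
sample = [
    ({"home": 1.8, "draw": 3.6, "away": 4.5}, "home"),
    ({"home": 2.9, "draw": 3.2, "away": 2.5}, "draw"),
]
hit_rate, brier = evaluate_odds(sample)
print(f"favorite hit rate: {hit_rate:.2f}, mean Brier score: {brier:.3f}")
```

With enough matches, a high hit rate and a low Brier score would suggest that the agency's odds are a useful predictor.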
How Could We Scrape Betting Odds?
In this article, I will show you how to scrape the betting odds from an odds comparison site. You can also download the scraping task and run it on your end.
To follow along, you will need an Octoparse account and the free app downloaded to your computer.
Step 1: Create the Task and Open the Web Page
1.1. Create the task with Advanced Mode and enter the URL of the betting website. Then click "Save URL" at the bottom of the interface.
1.2. Toggle the "Workflow" button, which allows us to check our workflow conveniently.
Step 2: Select the Data and Extract
2.1. In the built-in browser, click on one country name and then click on the expansion button at the bottom of the "Action Tips". Octoparse will expand the selection from "TD" to "TR", which means it selects the entire row.
2.2. Click "Select all sub-elements" in the "Action Tips" so that Octoparse selects all the data in this row.
2.3. Click "Select all" in the "Action Tips" so that Octoparse selects the data in every row of the table. Then select "Extract data".
Octoparse will now automatically populate the data fields with the data from the rows.
Step 3: Filter the Extracted Data
If the automatically generated data fields are what you want, you can skip this step. In this case, however, I re-select each data field to make sure every column contains the data I want.
3.1. Delete all the automatically generated data fields, click on the data that we want to collect in the row, then select "Extract text of the selected element" from the "Action Tips". Repeat the same steps to extract the other data (odds) in the row one by one.
3.2. Edit the field name and customize the data field if needed. Then click "OK" to save all the settings.
Tip: We can add the current time of extraction by clicking on "Add predefined fields" at the bottom of the "Data field".
Step 4: Run the Scraper and Get the Data
The workflow is now complete. Click "Save" and then "Start Extraction" to get the betting odds.
When the data extraction is finished, we can export the data to Excel, CSV, JSON, HTML, or a database for further analysis.
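As a starting point for that further analysis, here is a minimal sketch for reading a CSV export. The column names (team, odds, scraped_at) and the sample rows are assumptions, since they depend on how you named your data fields. The sketch accepts both decimal odds ("5.5") and fractional odds ("9/2", which equals 5.5 in decimal form), and it skips rows whose odds cannot be parsed.

```python
import csv
import io


def parse_odds(text):
    """Convert an odds string to decimal odds.

    Fractional odds "a/b" become a / b + 1; anything else is read
    as a decimal number.
    """
    text = text.strip()
    if "/" in text:
        numerator, denominator = text.split("/", 1)
        return float(numerator) / float(denominator) + 1.0
    return float(text)


def load_odds(file_obj):
    """Read rows from an open CSV file, skipping rows with unusable odds."""
    rows = []
    for row in csv.DictReader(file_obj):
        try:
            odds = parse_odds(row.get("odds") or "")
        except (ValueError, ZeroDivisionError):
            continue  # empty cell, placeholder text, or malformed fraction
        rows.append(
            {
                "team": (row.get("team") or "").strip(),
                "odds": odds,
                "scraped_at": (row.get("scraped_at") or "").strip(),
            }
        )
    return rows


# A made-up export; replace io.StringIO(...) with open("your_export.csv", newline="").
sample_export = io.StringIO(
    "team,odds,scraped_at\n"
    "Brazil,9/2,2018-06-07T10:00\n"
    "Germany,5.5,2018-06-07T10:00\n"
    "Unknown,-,2018-06-07T10:00\n"
)
for record in load_odds(sample_export):
    print(record)
```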
How Could We Automatically and Consistently Update Betting Odds in a Database?
Octoparse offers two solutions for automatically delivering dynamic betting odds to our database or system.
Solution A: Standard Plan
First, schedule the task in Cloud Extraction at the frequency we want, even at 5-minute intervals. The task will then run automatically on that schedule. This is especially important while a match is ongoing, as the betting odds can change dramatically during that time.
Second, connect to the Octoparse API. In this way, we can have the extracted data delivered automatically to our database in real time, without opening the Octoparse app.
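On the receiving side, the database part could look roughly like the sketch below. It deliberately does not call the Octoparse API: the `delivered` list simply stands in for whatever rows the API hands over, and the table name, column names, and sample values are assumptions. SQLite is used only because it ships with Python. Because a task that runs every few minutes may deliver the same rows more than once, the sketch uses a primary key on team and timestamp so that repeated deliveries are ignored.

```python
import sqlite3


def save_snapshot(conn, rows):
    """Store scraped rows and return how many of them were new."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS odds_snapshots (
            team       TEXT NOT NULL,
            odds       REAL NOT NULL,
            scraped_at TEXT NOT NULL,
            PRIMARY KEY (team, scraped_at)
        )
        """
    )
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO odds_snapshots (team, odds, scraped_at) "
            "VALUES (:team, :odds, :scraped_at)",
            rows,
        )
    # Rows skipped by OR IGNORE do not count as changes.
    return conn.total_changes - before


def odds_history(conn, team):
    """Return (scraped_at, odds) pairs for one team, oldest first."""
    cursor = conn.execute(
        "SELECT scraped_at, odds FROM odds_snapshots "
        "WHERE team = ? ORDER BY scraped_at",
        (team,),
    )
    return cursor.fetchall()


# Made-up rows standing in for data delivered by the API.
delivered = [
    {"team": "Brazil", "odds": 5.5, "scraped_at": "2018-06-07T10:00"},
    {"team": "Brazil", "odds": 5.0, "scraped_at": "2018-06-07T10:05"},
]
conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
print(save_snapshot(conn, delivered))  # number of new rows
print(save_snapshot(conn, delivered))  # the same delivery again adds nothing
print(odds_history(conn, "Brazil"))
```

Keeping every snapshot, rather than overwriting the latest value, preserves the movement of the odds over time.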
Solution B: Professional Plan
By connecting to the Octoparse Advanced API, we can control the task (run or stop it) and retrieve the data from within our own system.
Beyond this, we can have more crawlers (up to 250) and up to 20 concurrent cloud extraction tasks. This means we can import dynamic data (betting odds or team information) into our database from up to 20 sources or websites at the same time.
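Once odds arrive from several sources, one simple thing we can do is compare them side by side. The sketch below picks the highest decimal odds offered for each team across all sources. The source names, the nested dictionary layout, and the numbers are assumptions for illustration.

```python
def best_odds(sources):
    """Find the highest odds per team across sources.

    sources maps a source name to {team: decimal odds}.
    Returns {team: (best odds, source name)}.
    """
    best = {}
    for source, table in sources.items():
        for team, odds in table.items():
            # Keep the first source seen on ties; replace only on strictly better odds.
            if team not in best or odds > best[team][0]:
                best[team] = (odds, source)
    return best


# Made-up odds from two hypothetical sources.
scraped = {
    "site_a": {"Brazil": 5.0, "Germany": 5.5},
    "site_b": {"Brazil": 5.5, "Germany": 5.25},
}
for team, (odds, source) in best_odds(scraped).items():
    print(f"{team}: {odds} at {source}")
```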
The value of a scraping tool is that it lets us extract large amounts of web data from different websites at any time, more easily and quickly. With the same method, we can scrape data from other websites to add to our dataset. We can then build up our own metrics and run a combined analysis to predict the winner.
Happy Data Hunting!