The 2018 FIFA World Cup ran from June 14 to July 15, 2018. Dynamic betting odds scraped from online betting agencies are a valuable data source for sports analytics, such as winner prediction and team valuation, or simply for placing a lower-risk bet.
In this article, I would like to address the following three questions:
- Why should we scrape betting odds?
- How can we scrape betting odds easily and quickly?
- How can we automatically keep the betting odds in our database up to date?
Why should we scrape betting odds?
Professional betting agencies make their fortune by setting odds that maximize profit and guard against large payouts. They build statistical models on large pools of data, calculate average odds, and adjust their predictions when outliers appear.
On one hand, the way the odds move over time reflects where people are placing their bets: the more bets a side attracts, the lower its odds go. On the other hand, betting agencies hedge their odds to reduce the possibility of large payouts.
Is it possible to come up with a method to beat the betting agencies? First and foremost, we need to find the correlation between the agencies' odds and the actual outcomes. We can scrape the odds reported by betting agencies along with the actual result of each game, then compare the two and build a prediction model.
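Before comparing odds to outcomes, it helps to convert odds into probabilities. A minimal sketch of that arithmetic, using hypothetical decimal odds for a home win / draw / away win market (the figures are illustrative, not real quotes):

```python
def implied_probability(decimal_odds):
    """Convert decimal betting odds to the implied win probability."""
    return 1.0 / decimal_odds

def overround(odds_list):
    """Bookmaker margin: the implied probabilities of a full market
    sum to more than 1; the excess is the agency's built-in edge."""
    return sum(implied_probability(o) for o in odds_list) - 1.0

# Hypothetical home / draw / away decimal odds for one match
odds = [2.10, 3.40, 3.60]
probs = [implied_probability(o) for o in odds]
margin = overround(odds)  # roughly 0.048, i.e. a ~4.8% bookmaker margin
```

Comparing these implied probabilities against actual match results over many games is what lets us test whether the agencies' predictions are well calibrated.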
How to Scrape Betting Odds?
In this article, I will show you how to scrape betting odds from an odds comparison site. You can also download the scraping task and run it on your end.
To follow along, you need an Octoparse account and the free app installed on your computer.
Step 1: Create the Task and Open the Website
1.1. Create the task with Advanced Mode. Enter the URL of the betting website, then click "Save URL" at the bottom of the interface.
1.2. Toggle the "Workflow" switch. This lets us check the workflow conveniently.
Step 2: Select the Data and Extract
2.1. In the built-in browser, click one country name, then click the expansion button at the bottom of the "Action Tips" panel. Octoparse will expand the selection from a table cell (TD) to a table row (TR).
2.2. Click the "Select all sub-elements" command in the "Action Tips" panel. This lets Octoparse select all the data in the same row.
2.3. Click the "Select all" command in the "Action Tips" panel so that Octoparse selects the data from every row in the table. Finally, click the "Extract data" command.
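The row/sub-element selection that Octoparse performs visually can be sketched in plain Python. This uses the standard library's XML parser on a simplified, well-formed stand-in for the odds table; a real page would need a robust HTML parser such as lxml or BeautifulSoup:

```python
import xml.etree.ElementTree as ET

# A simplified, well-formed stand-in for the odds table on the page
html = """<table>
  <tr><td>France</td><td>5.50</td></tr>
  <tr><td>Brazil</td><td>5.00</td></tr>
</table>"""

table = ET.fromstring(html)
# "Select all sub-elements" of every row: each <td> inside each <tr>
rows = [[td.text for td in tr.findall("td")] for tr in table.findall("tr")]
# rows == [["France", "5.50"], ["Brazil", "5.00"]]
```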
Now Octoparse will show the extracted information in the data field.
Step 3: Filter the Extracted Data
3.1. If the extracted information in the data field is what you expected, you can skip this step. If not, re-select the data and repeat the step above until you get the right result. Otherwise, make sure the XPath is correct. (To learn more about XPath, click here.)
3.2. Edit the field name and customize the data field if needed. Then click "OK" to save all the settings.
Tips: We can add the current time of extraction by clicking "Add predefined fields" at the bottom of the data field list.
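The same two ideas, an XPath positional predicate to pick out a specific cell and a "current extraction time" field, can be sketched with the standard library (the table snippet is a hypothetical stand-in for the real page):

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical odds table: team name, then one column per betting agency
html = """<table>
  <tr><td>France</td><td>5.50</td><td>5.25</td></tr>
  <tr><td>Brazil</td><td>5.00</td><td>4.80</td></tr>
</table>"""

root = ET.fromstring(html)
records = []
for tr in root.findall("tr"):
    records.append({
        # "td[1]" is an XPath-style positional predicate: the first cell
        "team": tr.find("td[1]").text,
        # the remaining cells hold each agency's odds
        "odds": [td.text for td in tr.findall("td")[1:]],
        # the "current extraction time" predefined field
        "scraped_at": datetime.now(timezone.utc).isoformat(),
    })
```

Timestamping every row is what makes the later "odds history" analysis possible, since each scrape is only a snapshot.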
Step 4: Run the Scraper and Get the Data
The overall workflow is complete. Just click "Save" and then "Start Extraction" to get the betting odds.
When the extraction is finished, the data can be exported to Excel, CSV, JSON, or HTML, or delivered to a database for further analysis.
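For the CSV and JSON targets, the export step amounts to a few lines of standard-library Python (the rows here are illustrative):

```python
import csv
import json

# Illustrative scraped rows
rows = [
    {"team": "France", "odds": 5.50},
    {"team": "Brazil", "odds": 5.00},
]

# CSV export: one header row, then one line per record
with open("odds.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["team", "odds"])
    writer.writeheader()
    writer.writerows(rows)

# JSON export: the same records as an array of objects
with open("odds.json", "w") as f:
    json.dump(rows, f, indent=2)
```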
How Can We Automatically Keep the Betting Odds in the Database Up to Date?
Solution A: Standard Plan
First, schedule the task with Cloud Extraction at the frequency you want, for example a 5-minute interval, so the task runs automatically every 5 minutes. This is critical for keeping the data fresh so you won't miss any odds movement.
Second, connect to the Octoparse API. This way, the extracted data is delivered automatically to the database without opening the Octoparse app.
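On the database side, each scheduled run should append a new timestamped snapshot rather than overwrite the old one, so the odds history survives. A minimal sketch with SQLite (the table name and columns are assumptions for illustration; the rows would come from the scraper or the API):

```python
import sqlite3

def store_odds(conn, rows):
    """Append one scraped batch of (team, odds) rows, stamping each run."""
    conn.execute("""CREATE TABLE IF NOT EXISTS odds
                    (team TEXT, odds REAL, scraped_at TEXT)""")
    # Inserting (not upserting) preserves the full history of odds movement
    conn.executemany(
        "INSERT INTO odds VALUES (?, ?, datetime('now'))", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
# One scheduled run delivers one batch; later runs simply append more
store_odds(conn, [("France", 5.50), ("Brazil", 5.00)])
count = conn.execute("SELECT COUNT(*) FROM odds").fetchone()[0]
```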
Solution B: Professional Plan
Connecting to the Octoparse Advanced API lets you control tasks (start or stop) and pull the data into your own system.
Beyond this, you can run more crawlers, up to 250, with 20 concurrent Cloud Extraction tasks. That means you could import dynamic data (betting odds or team information) into your database from up to 20 sources/websites at once.
The value of a scraping tool is that it lets us extract web data at large scale from different websites concurrently. With the same method, we can scrape information from other websites and enrich our database, expanding our metrics to conduct a more comprehensive analysis and predict the winner.
Article in Spanish: Scraping Análisis de Cuotas de Apuestas Deportivas
You can also read web scraping articles on the official website
Author: Surie M.(Octoparse Team)
Edit: Ashley Weldon