Profits or Losses? Expected Value From Soccer Betting

If I always bet on the favorite team, would I win or lose money? What if I bet on the second favorite or the least favorite?

This project examines the expected value of soccer betting. The goal is to show how well bookmakers set the odds for soccer matches in order to secure their profit.

It is aimed at readers who want to know how casinos make money and what lies behind the saying 'The House always wins'.

The data used in this project were obtained from https://www.football-data.co.uk/ and contain information about past soccer games from some of the European leagues, including results and odds. If you want to know more about the data, visit the site.
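Before loading the data, it may help to see what "expected value" means for a single bet. The sketch below is only an illustration: it assumes decimal odds, uses made-up odds for one match, and estimates the "fair" probabilities by rescaling the implied probabilities so they sum to 1, which is one common simplification and not necessarily how any bookmaker works. The function name `expected_profit` is hypothetical.

```python
def expected_profit(odd, win_probability, stake=1.0):
    """Expected profit of one bet at decimal odds, given a win probability."""
    win = win_probability * (odd - 1.0) * stake      # profit when the bet wins
    loss = (1.0 - win_probability) * stake           # stake lost otherwise
    return win - loss

# Made-up decimal odds for home win, draw, away win (not from the data set)
odds = [2.10, 3.40, 3.60]

# Implied probability of each outcome is the inverse of its decimal odd
implied = [1.0 / o for o in odds]

# The implied probabilities add up to more than 1; the excess is the margin
overround = sum(implied) - 1.0

# Assumption: the margin is spread proportionally over the three outcomes
fair = [p / sum(implied) for p in implied]

# Expected profit per unit staked on each outcome
ev = [expected_profit(o, p) for o, p in zip(odds, fair)]

print(f"overround: {overround:.4f}")
print("expected profit per unit stake:", [round(e, 4) for e in ev])
```

Under this proportional assumption every outcome has the same negative expected profit, equal to `1 / sum(implied) - 1`, so no choice of outcome beats the margin on average.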

Modules

#First import modules
import pandas as pd
import numpy as np
import os
import re
import matplotlib.pyplot as plt

Load and Transform

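#Then read every csv file in the working directory and merge them
#(match on the .csv extension so other files with 'csv' in their name are skipped)
files = [f for f in os.listdir() if f.endswith('.csv')]
#Concatenate once instead of growing the DataFrame inside a loop
df = pd.concat([pd.read_csv(f) for f in files], ignore_index=True)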
#Take only the necessary columns (FTHG=Goals scored by the home team, FTAG=Goals scored by the away team,
#B365H=Bet365 home odd, B365D=Bet365 draw odd, B365A=Bet365 away odd)
df = df[['Date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'B365H', 'B365D', 'B365A']]
#Create the Result column (0=Home Win, 1=Draw, 2=Away Win)
df['Result'] = np.where(df['FTHG'] > df['FTAG'], 0, np.where(df['FTHG'] == df['FTAG'], 1,2))
#Create a Column for the Win odd
df['WinOdd'] = np.where(df['Result']==0, df['B365H'], np.where(df['Result']==1, df['B365D'], df['B365A']))
# Create a new column with the min odd
df['MinOdd'] = df[['B365H', 'B365D', 'B365A']].min(axis=1)
# Create a new column with the middle odd
df['MiddleOdd'] = df[['B365H', 'B365D', 'B365A']].apply(lambda x: sorted(x)[1], axis=1)
#Create a new column with the max odd
df['MaxOdd'] = df[['B365H', 'B365D', 'B365A']].max(axis=1)
#Create the combined odd for the two least probable outcomes
df['OddCNF'] = 1/((1/df['MiddleOdd'])+(1/df['MaxOdd']))
#Create the combined odd for the two most probable outcomes
df['OddSNF'] = 1/((1/df['MiddleOdd'])+(1/df['MinOdd']))
#Create Column with the most probable result according to Odds
#(if neither the home nor the away odd is strictly the lowest, this falls back to 1=Draw)
df['Fav'] = np.where((df['B365H'] < df['B365D']) & (df['B365H'] < df['B365A']), 0, 
                     np.where((df['B365A'] < df['B365D']) & (df['B365A'] < df['B365H']), 2, 1))
#Create Column with the least probable result according to Odds
df['Les_favorite'] = df[['B365H', 'B365D', 'B365A']].idxmax(axis=1).map({'B365H': 0, 'B365D': 1, 'B365A': 2})
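To sanity-check the derived columns, the same formulas can be run on a tiny frame. This is a sketch with made-up odds, not rows from the data set; the name `toy` is hypothetical, and `np.sort` is swapped in for the row-wise `apply` only to keep the example short. One way to read a combined odd is as the payout per unit staked when the stake is split across two outcomes so that the return is the same whichever of them occurs.

```python
import numpy as np
import pandas as pd

# Made-up odds for two matches (illustrative only)
toy = pd.DataFrame({
    'B365H': [2.0, 5.0],
    'B365D': [4.0, 4.0],
    'B365A': [4.0, 1.6],
})

# Sort the three odds of each match from lowest to highest
sorted_odds = np.sort(toy[['B365H', 'B365D', 'B365A']].to_numpy(), axis=1)
toy['MinOdd'] = sorted_odds[:, 0]
toy['MiddleOdd'] = sorted_odds[:, 1]
toy['MaxOdd'] = sorted_odds[:, 2]

# Same combined-odd formulas as in the cell above
toy['OddCNF'] = 1 / ((1 / toy['MiddleOdd']) + (1 / toy['MaxOdd']))
toy['OddSNF'] = 1 / ((1 / toy['MiddleOdd']) + (1 / toy['MinOdd']))

print(toy.round(4))
```

In the first match the two less likely outcomes both pay 4.0, so covering both halves the payout to a combined odd of 2.0, the same as the favorite's odd in this toy case.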