
I am working as a sports journalist at a major online sports media company, specializing in soccer analysis and reporting. I've been watching both men's and women's international soccer matches for a number of years, and my gut instinct tells me that more goals are scored in women's international football matches than men's. This would make an interesting investigative article that our subscribers are bound to love, but I'll need to perform a valid statistical hypothesis test to be sure!

While scoping this project, I acknowledge that the sport has changed a lot over the years, and performances likely vary a lot depending on the tournament, so I decide to limit the data used in the analysis to only official FIFA World Cup matches (not including qualifiers) since 2002-01-01.

I created two datasets containing the results of every official men's and women's international football match since the 19th century, which I scraped from a reliable online source. This data is stored in two CSV files: women_results.csv and men_results.csv.

The question I am trying to determine the answer to is:

Are more goals scored in women's international soccer matches than men's?

I assume a 10% significance level, and use the following null and alternative hypotheses:

$H_0$: The mean number of goals scored in women's international soccer matches is the same as men's.

$H_A$: The mean number of goals scored in women's international soccer matches is greater than men's.

# Imports
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import mannwhitneyu

# Load Data
men = pd.read_csv("men_results.csv", parse_dates=["date"])
women = pd.read_csv("women_results.csv", parse_dates=["date"])

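# Filter Data (official World Cup matches since 2002-01-01);
# .copy() avoids SettingWithCopyWarning when columns are added below
men_fifa_recent = men[(men["date"] >= "2002-01-01") & (men["tournament"] == "FIFA World Cup")].copy()
women_fifa_recent = women[(women["date"] >= "2002-01-01") & (women["tournament"] == "FIFA World Cup")].copy()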

# Add group column and calculate goals scored
men_fifa_recent["group"] = "men"
women_fifa_recent["group"] = "women"
men_fifa_recent["goals_scored"] = men_fifa_recent["home_score"] + men_fifa_recent["away_score"]
women_fifa_recent["goals_scored"] = women_fifa_recent["home_score"] + women_fifa_recent["away_score"]

# Plot distributions for EDA
plt.hist(men_fifa_recent["goals_scored"], alpha=0.7, label="Men")
plt.hist(women_fifa_recent["goals_scored"], alpha=0.7, label="Women")
plt.legend()
plt.title("Goals Scored Distribution")
plt.xlabel("Goals Scored")
plt.ylabel("Frequency")
plt.show()

# Combine datasets for analysis
combined = pd.concat([men_fifa_recent, women_fifa_recent], axis=0, ignore_index=True)

# Perform Mann-Whitney U Test
result, p_val = mannwhitneyu(
    x=women_fifa_recent["goals_scored"],
    y=men_fifa_recent["goals_scored"],
    alternative="greater"
)

# Print the results
result_dict = {"p_val": float(p_val), "result": float(result)}
print(result_dict)

# Determine hypothesis test result
alpha = 0.10
if p_val < alpha:
    print("Reject the null hypothesis: Significantly more goals are scored in women's matches than in men's.")
else:
    print("Fail to reject the null hypothesis: No significant evidence that more goals are scored in women's matches.")

# Boxplot comparison
plt.figure(figsize=(10, 6))
sns.boxplot(data=combined, x="group", y="goals_scored")
plt.title("Goals Scored Comparison by Gender (Boxplot)")
plt.xlabel("Group")
plt.ylabel("Goals Scored")
plt.show()

# KDE comparison
plt.figure(figsize=(10, 6))
# fill=True replaces the deprecated shade=True argument
sns.kdeplot(data=men_fifa_recent, x="goals_scored", label="Men", fill=True, bw_adjust=0.5)
sns.kdeplot(data=women_fifa_recent, x="goals_scored", label="Women", fill=True, bw_adjust=0.5)
plt.title("Goals Scored Distribution by Gender (KDE Plot)")
plt.xlabel("Goals Scored")
plt.ylabel("Density")
plt.legend()
plt.show()

Visual Analysis

Histograms:

The distribution for men's goals scored shows a pronounced peak around 2-3 goals, with a rapid decrease in frequency for higher goal counts.

The distribution for women's goals scored also peaks around 2-4 goals but appears slightly broader and less concentrated than men's, indicating a greater variability in scoring.
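Because goal totals are whole numbers, the two histograms are easier to compare when they share bins centred on each goal count. The following is a minimal sketch of that idea, not part of the original analysis; the helper name `integer_bin_edges` and the sample arrays are made up for illustration.

```python
import numpy as np

def integer_bin_edges(*samples):
    """Shared bin edges centred on whole goal counts (-0.5, 0.5, 1.5, ...)."""
    top = max(int(np.max(s)) for s in samples)
    return np.arange(-0.5, top + 1.5, 1.0)

# Made-up goal totals for illustration only (not the World Cup data).
men_like = np.array([0, 1, 2, 2, 3])
women_like = np.array([1, 2, 3, 3, 7])

edges = integer_bin_edges(men_like, women_like)
men_counts, _ = np.histogram(men_like, bins=edges)
women_counts, _ = np.histogram(women_like, bins=edges)

print(edges)
print(men_counts, women_counts)
```

Passing the same `bins=edges` to both `plt.hist` calls would put the men's and women's bars on identical, one-goal-wide bins.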

Boxplots:

The boxplot for men's matches indicates a median around 2 goals, with most scores falling between 1 and 3 goals (the interquartile range). There are a few outliers, extending up to 8 goals.

The boxplot for women's matches shows a slightly higher median, approximately 3 goals, and a wider interquartile range, suggesting more variability. Notably, women's matches exhibit a greater number of high-scoring outliers, with some matches reaching up to 13 goals.
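The medians, interquartile ranges, and outliers read off the boxplots can also be computed directly, which avoids estimating them by eye. This is a rough sketch assuming the usual 1.5 × IQR whisker rule that seaborn's boxplot uses by default; the function name `summarize_goals` and the sample values are hypothetical.

```python
import pandas as pd

def summarize_goals(goals: pd.Series) -> dict:
    """Return the quantities a boxplot draws, using the 1.5 * IQR whisker rule."""
    q1, median, q3 = goals.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Points beyond the fences are the ones drawn as individual outliers
    outliers = goals[(goals < lower_fence) | (goals > upper_fence)]
    return {
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        "iqr": float(iqr),
        "n_outliers": int(outliers.size),
        "max": float(goals.max()),
    }

# Made-up goal totals for illustration only (not the World Cup data).
example = pd.Series([0, 1, 1, 2, 2, 2, 3, 3, 4, 9])
summary = summarize_goals(example)
print(summary)
```

Applied to `men_fifa_recent["goals_scored"]` and `women_fifa_recent["goals_scored"]`, the same function would give the numbers behind each box.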

KDE Plots:

The KDE plot for men's goals (purple) displays a relatively sharp and narrow peak, reinforcing the concentration of scores around 2-3 goals.

The KDE plot for women's goals (green) is broader and less peaked, extending further along the x-axis, which visually confirms the greater spread and propensity for higher goal counts in women's matches.

Statistical Test Results

The Mann-Whitney U test yielded the following result:

P-value: p=0.005106

The analysis concluded that the null hypothesis should be rejected.

Since p<α (0.005106<0.1) at the chosen significance level of α=0.1, we reject the null hypothesis. Because the test was one-sided (alternative="greater"), this is statistical evidence that more goals are scored in women's matches than in men's, rather than evidence of a difference in either direction. Strictly speaking, the Mann-Whitney U test compares the two samples by rank rather than by their means, so the result is best read as women's matches tending to produce higher goal totals.
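The p-value says the effect is unlikely to be chance, but not how large it is. One common way to express that is to divide the U statistic by the number of men/women match pairs, which estimates the probability that a randomly chosen women's match has more goals than a randomly chosen men's match (ties counted as half). The sketch below is an addition to the original analysis; it assumes SciPy 1.7 or later, where `mannwhitneyu` returns the U statistic for the first sample, and the function name and toy arrays are hypothetical.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def probability_of_superiority(x, y) -> float:
    """Estimate P(X > Y) + 0.5 * P(X == Y) from the U statistic of x."""
    # Assumes SciPy >= 1.7, where the returned statistic is U for the first sample
    u_stat, _ = mannwhitneyu(x, y, alternative="greater")
    return float(u_stat) / (len(x) * len(y))

# Made-up goal totals for illustration only (not the World Cup data).
women_like = np.array([3, 4, 5, 6])
men_like = np.array([1, 2, 3, 4])

effect = probability_of_superiority(women_like, men_like)
print(effect)
```

A value of 0.5 would mean neither group tends to score more; values closer to 1 mean the first sample usually has the higher total.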

Conclusion

The analysis of FIFA World Cup match data from 2002 onwards reveals observable differences in the distribution of goals scored between men's and women's matches. Visual evidence from histograms, boxplots, and KDE plots consistently suggests that while both genders typically score a low number of goals per game, women's matches tend to have a wider spread of scores and a higher frequency of matches with a greater number of goals, including more high-scoring outliers.

The Mann-Whitney U test, with a p-value of 0.005106 against a significance level of 0.1, provides strong statistical evidence to reject the null hypothesis. This supports the alternative hypothesis that more goals are scored in women's FIFA World Cup matches than in men's over the period analyzed.