Skip to content

To the Office of Transportation at The 22nd Century Sporting League,

After our inaugural 2101 season, The League is looking for ways to optimize our game scheduling process and costs. We know that transportation logistics are a major variable to consider during scheduling, and as such, we’ve got a few questions for you.

Our primary areas of focus are surrounding the number of jets that The League needs to own, and the cost of fuel for those flights. If we want The League to enjoy continued success, we'll need to make sure we manage transportation costs.

We’re sharing schedule data for the upcoming 2102 season. On each row, you’ll find information about which teams are needing to travel to their next set of games, the time the flight will likely depart (based on our estimations of gameplay durations) and the time the flight will likely land.

Additionally, we're also providing the fuel price that was paid each day during this past 2101 season. The fuel price fluctuates over time, but we're hoping you'll be able to project it to the future to help with the analysis.

The Data

team_flights.csv

ColumnDescription
team_nameOfficial team name
departure_datetimeDate and Time (in UTC) when the flight will depart
landing_datetimeDate and Time (in UTC) when the flight will land

fuel_price.csv

ColumnDescription
dateDate when the fuel price was recorded
fuel_priceCorresponding fuel price (in $ per gallon)

Important Things to Know

  • You can assume that the flight's average speed is 500 MPH. (So, as an example, a 2-hour flight would travel 1000 miles)
  • You can assume that each team’s jet fills up with fuel equivalent to 1 gallon per mile-of-travel
  • You can assume that the jet is fueled on the day the travel departs (and thus can use the fuel price corresponding to the departure date)

Load the Data and Preprocess

# Import required modules
import pandas as pd
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
import matplotlib.pyplot as plt


# Explore the data
team_flights = pd.read_csv('./team_flights.csv', parse_dates=["departure_datetime", "landing_datetime"])
fuel_price = pd.read_csv('./fuel_prices_2101.csv',
                         parse_dates=["date"])

# Sort by date for time series analysis
fuel_price = fuel_price.sort_values("date").reset_index(drop=True)

Calculate Maximum Teams in Flight

# Create a list of events (departures + landings)
events = []

for _, row in team_flights.iterrows():
    events.append((row["departure_datetime"], 1))  # Flight takes off
    events.append((row["landing_datetime"], -1))   # Flight lands

# Convert to DataFrame and sort by time
events_df = pd.DataFrame(events, columns=["timestamp", "change"])
events_df = events_df.sort_values("timestamp").reset_index(drop=True)

# Compute number of active flights over time
events_df["active_flights"] = events_df["change"].cumsum()

# Find the max number of simultaneous flights
max_teams_in_flight = events_df["active_flights"].max()

# Plot the number of flights over time
plt.figure(figsize=(12, 6))
plt.plot(events_df["timestamp"], events_df["active_flights"], label="Teams in Flight")
plt.axhline(max_teams_in_flight, color="red", linestyle="--", label=f"Max Teams in Flight: {max_teams_in_flight}")
plt.xlabel("Time")
plt.ylabel("Number of Teams in Flight")
plt.title("Teams in Flight Over Time")
plt.legend()
plt.show()

print("Maximum number of teams in flight:", max_teams_in_flight)

Forecast Fuel Prices Using ARIMA

print(fuel_price.head())
print(team_flights.columns)
from statsmodels.tsa.stattools import adfuller
from pmdarima import auto_arima

# Check stationarity
adf_test = adfuller(fuel_price["price"])
print(f"ADF Statistic: {adf_test[0]}, p-value: {adf_test[1]}")

# Fit ARIMA model (automatically selects best parameters)
arima_model = auto_arima(fuel_price["price"], seasonal=False, trace=True, suppress_warnings=True)

# Predict fuel prices for 365 days in 2102
future_dates = pd.date_range(start="2102-01-01", periods=365, freq="D")
future_forecast = arima_model.predict(n_periods=365)

# Store forecasted prices
fuel_forecast_df = pd.DataFrame({"date": future_dates, "price": future_forecast})

# Plot forecast
plt.figure(figsize=(12, 6))
plt.plot(fuel_price["date"], fuel_price["price"], label="Actual 2101 Prices")
plt.plot(fuel_forecast_df["date"], fuel_forecast_df["price"], label="Forecasted 2102 Prices", linestyle="dashed")
plt.xlabel("Date")
plt.ylabel("Fuel Price ($ per gallon)")
plt.title("Fuel Price Forecast for 2102")
plt.legend()
plt.show()

Compute Total Fuel Spend for 2102

# Merge flight data with fuel prices (match departure date)
team_flights["departure_date"] = team_flights["departure_datetime"].dt.date
# Ensure 'date' is in datetime format
fuel_forecast_df["date"] = pd.to_datetime(fuel_forecast_df["date"])
fuel_forecast_df["date"] = fuel_forecast_df["date"].dt.date

# Merge fuel prices with team flights
team_flights = team_flights.merge(fuel_forecast_df, left_on="departure_date", right_on="date", how="left")

# Calculate distance (miles)
team_flights["flight_duration"] = (team_flights["landing_datetime"] - team_flights["departure_datetime"]).dt.total_seconds() / 3600  # Convert to hours
team_flights["distance_miles"] = team_flights["flight_duration"] * 500

# Calculate total fuel consumption
team_flights["fuel_needed_gallons"] = team_flights["distance_miles"]

# Calculate fuel cost
team_flights["fuel_cost"] = team_flights["fuel_needed_gallons"] * team_flights["price"]

# Compute total fuel spend for 2102
total_fuel_spend_2102_dollars = team_flights["fuel_cost"].sum()

print("Total fuel spend for 2102 (in dollars):", total_fuel_spend_2102_dollars)
max_teams_in_flight = max_teams_in_flight.astype('float')
print(max_teams_in_flight)