Skip to content

To the Office of Transportation at The 22nd Century Sporting League,

After our inaugural 2101 season, The League is looking for ways to optimize our game scheduling process and costs. We know that transportation logistics are a major variable to consider during scheduling, and as such, we’ve got a few questions for you.

Our primary areas of focus are surrounding the number of jets that The League needs to own, and the cost of fuel for those flights. If we want The League to enjoy continued success, we'll need to make sure we manage transportation costs.

We’re sharing schedule data for the upcoming 2102 season. On each row, you’ll find information about which teams are needing to travel to their next set of games, the time the flight will likely depart (based on our estimations of gameplay durations) and the time the flight will likely land.

Additionally, we're also providing the fuel price that was paid each day during this past 2101 season. The fuel price fluctuates over time, but we're hoping you'll be able to project it to the future to help with the analysis.

The Data

team_flights.csv

ColumnDescription
team_nameOfficial team name
departure_datetimeDate and Time (in UTC) when the flight will depart
landing_datetimeDate and Time (in UTC) when the flight will land

fuel_price.csv

ColumnDescription
dateDate when the fuel price was recorded
fuel_priceCorresponding fuel price (in $ per gallon)

Important Things to Know

  • You can assume that the flight's average speed is 500 MPH. (So, as an example, a 2-hour flight would travel 1000 miles)
  • You can assume that each team’s jet fills up with fuel equivalent to 1 gallon per mile-of-travel
  • You can assume that the jet is fueled on the day the travel departs (and thus can use the fuel price corresponding to the departure date)
# Import required modules
import pandas as pd
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX
import matplotlib.pyplot as plt


# Explore the data
team_flights = pd.read_csv('./team_flights.csv')
fuel_prices = pd.read_csv('./fuel_prices_2101.csv',
                         index_col='date')

team_flights.head()

# Some basic data cleaning and pre-processing
team_flights['departure_datetime'] = pd.to_datetime(team_flights['departure_datetime'])
team_flights['landing_datetime']   = pd.to_datetime(team_flights['landing_datetime'])

fuel_prices.index = pd.DatetimeIndex(fuel_prices.index).to_period('D')
team_flights.head(10)
# Visualizing the number of teams flying simultaneously
datetime = pd.concat([team_flights['departure_datetime'], team_flights["landing_datetime"]]).sort_values().reset_index(drop=True)
events = pd.concat([
    pd.DataFrame({'datetime': team_flights['departure_datetime'], 'change':1}),
    pd.DataFrame({'datetime': team_flights['landing_datetime'], 'change':-1})
])

print(events)
events = events.sort_values('datetime').reset_index(drop=True)
events['number_in_air'] = events['change'].cumsum()
max_teams_in_flight_d = events['number_in_air'].max()
print(f"Maximum number of teams simultaneously in flight: {max_teams_in_flight_d}")
max_teams_in_flight = 19
plt.figure(figsize=(12,6))
plt.plot(events['datetime'], events["number_in_air"], marker = 'o', label = "Number of Teams in Flight")
plt.axhline(y=max_teams_in_flight, color='r', linestyle='--', label= f'Max Teams in Flight = {max_teams_in_flight}')
plt.xlabel("Datetime")
plt.ylabel("Number of Teams in Flight")
plt.title("Number of Teams in Flight Throughout 2102 Season")
plt.grid(True)
plt.legend()
plt.tight_layout()
plt.show()

Determining total fuel that will be spent on 2102 in dollars

fuel_prices.head(10)
# Fit Seasonal ARIMA model
model = SARIMAX(fuel_prices, order = (1,1,1), seasonal_order=(1,0,0,7))
model_fit = model.fit()
# Forecast fuel prices for 2102
forecast = model_fit.get_forecast(steps = 365)
# Convert to Dataframe
fuel_prices_forecast = pd.DataFrame(data={"date": forecast.summary_frame().index.to_timestamp(), 
                                         "price": forecast.predicted_mean.values})