
Where to open a new coffee shop?

📖 Background

You are helping a client who owns coffee shops in Colorado. The company's coffee shops serve high-quality and responsibly sourced coffee, pastries, and sandwiches. They operate three locations in Fort Collins and want to expand into Denver.

Your client believes that the ideal location for a new store is close to affluent households and appeals to the 20-35 year old demographic.

Your team collected geographical and demographic information about Denver's neighborhoods to assist the search. They also collected data for Starbucks stores in Denver. Starbucks and the new coffee shops do not compete for the same clients; the team included the Starbucks locations only as a reference.

💾 The data

You have assembled information from three different sources (locations, neighborhoods, demographics):

Starbucks locations in Denver, Colorado
  • "StoreNumber" - Store Number as assigned by Starbucks
  • "Name" - Name identifier for the store
  • "PhoneNumber" - Phone number for the store
  • "Street 1, 2, and 3" - Address for the store
  • "PostalCode" - Zip code of the store
  • "Longitude, Latitude" - Coordinates of the store
Neighborhoods' geographical information
  • "NBHD_ID" - Neighborhood ID (matches the census information)
  • "NBHD_NAME" - Name of the statistical neighborhood
  • "Geometry" - Polygon that defines the neighborhood
Demographic information
  • "NBHD_ID" - Neighborhood ID (matches the geographical information)
  • "NBHD_NAME" - Neighborhood name
  • "POPULATION_2010" - Population in 2010
  • "AGE_ " - Number of people in each age bracket (< 18, 18-34, 35-65, and > 65)
  • "NUM_HOUSEHOLDS" - Number of households in the neighborhood
  • "FAMILIES" - Number of families in the neighborhood
  • "NUM_HHLD_100K+" - Number of households with income above 100 thousand USD per year

Starbucks locations were scraped from the Starbucks store locator webpage by Chris Meller.
Statistical Neighborhood information from the City of Denver Open Data Catalog, CC BY 3.0 license.
Census information from the United States Census Bureau. Publicly available information.

import pandas as pd
import geopandas as gpd
denver_starbuck_locations = pd.read_csv('./data/denver.csv')
denver_starbuck_locations
# Load the shapefile components correctly
neighborhoods = gpd.read_file('./data/neighborhoods.shp')

# Display the first few rows to understand its structure
neighborhoods.head()

demographics = pd.read_csv('./data/census.csv')
demographics
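With the three tables loaded, the store coordinates can be placed on the neighborhood polygons. The following is a minimal sketch, not the team's method: the helper names (stores_to_points, count_stores, plot_map) are hypothetical, the "Longitude", "Latitude", and "NBHD_ID" columns come from the data description above, and it assumes the coordinates are WGS84 longitude/latitude (EPSG:4326), that the shapefile carries a CRS, and that geopandas 0.10 or later is installed (for the predicate argument of sjoin).

```python
import geopandas as gpd
import pandas as pd


def stores_to_points(stores: pd.DataFrame, crs: str = "EPSG:4326") -> gpd.GeoDataFrame:
    """Turn the Longitude/Latitude columns into point geometries.

    The CRS default is an assumption: plain longitude/latitude in WGS84.
    """
    geometry = gpd.points_from_xy(stores["Longitude"], stores["Latitude"])
    return gpd.GeoDataFrame(stores.copy(), geometry=geometry, crs=crs)


def count_stores(neighborhoods: gpd.GeoDataFrame, store_points: gpd.GeoDataFrame) -> pd.Series:
    """Count the stores that fall inside each neighborhood polygon."""
    # Reproject the points so both layers share one coordinate system
    points = store_points.to_crs(neighborhoods.crs)
    joined = gpd.sjoin(
        points,
        neighborhoods[["NBHD_ID", "geometry"]],
        how="inner",
        predicate="within",
    )
    counts = joined.groupby("NBHD_ID").size()
    # Neighborhoods without any store get an explicit zero
    return counts.reindex(neighborhoods["NBHD_ID"], fill_value=0)


def plot_map(neighborhoods: gpd.GeoDataFrame, store_points: gpd.GeoDataFrame):
    """Draw neighborhood outlines with the store locations on top."""
    ax = neighborhoods.plot(color="whitesmoke", edgecolor="gray", figsize=(10, 10))
    store_points.to_crs(neighborhoods.crs).plot(ax=ax, color="darkgreen", markersize=12)
    ax.set_title("Denver neighborhoods and Starbucks locations")
    ax.set_axis_off()
    return ax
```

In this notebook the helpers might be called as plot_map(neighborhoods, stores_to_points(denver_starbuck_locations)), and count_stores with the same two arguments gives a per-neighborhood store count indexed by NBHD_ID.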

Let's proceed with the demographic analysis to identify potential neighborhoods for the new coffee shop based on the criteria:

  1. Number of affluent households (households with income above $100,000).
  2. Population aged 18-34.
# Define thresholds for affluent households and young population
affluent_threshold = 500
young_population_threshold = 1500

# Filter the demographics data
filtered_neighborhoods = demographics[
    (demographics['NUM_HHLD_100K+'] >= affluent_threshold) &
    (demographics['AGE_18_TO_34'] >= young_population_threshold)
]

# Display the filtered neighborhoods
filtered_neighborhoods.head()
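The two cut-offs above (500 affluent households, 1,500 young adults) are judgment calls, so it can help to see how many neighborhoods survive under other values. A minimal sketch, assuming the same two column names used in the filter; count_passing and threshold_grid are hypothetical helper names, not part of the original analysis.

```python
import pandas as pd


def count_passing(demographics: pd.DataFrame, affluent_min: int, young_min: int) -> int:
    """Number of neighborhoods that meet both minimum counts."""
    mask = (
        (demographics["NUM_HHLD_100K+"] >= affluent_min)
        & (demographics["AGE_18_TO_34"] >= young_min)
    )
    return int(mask.sum())


def threshold_grid(demographics: pd.DataFrame, affluent_levels, young_levels) -> pd.DataFrame:
    """Passing-neighborhood counts for every pair of thresholds.

    Rows are affluent-household minimums, columns are young-adult minimums.
    """
    rows = {
        a: {y: count_passing(demographics, a, y) for y in young_levels}
        for a in affluent_levels
    }
    return pd.DataFrame.from_dict(rows, orient="index").rename_axis(
        index="affluent_min", columns="young_min"
    )
```

For example, threshold_grid(demographics, [250, 500, 1000], [1000, 1500, 2500]) would show whether the shortlist is sensitive to the chosen cut-offs. Rows with missing values in either column never pass, because comparisons against NaN evaluate to False.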

We have already identified some neighborhoods that meet these criteria. Let's dive deeper into these neighborhoods to get more insights and potentially narrow down the best locations.

Steps:

  1. Summary Statistics: Get summary statistics for the filtered neighborhoods.
  2. Visualization: Plot the demographic characteristics of the filtered neighborhoods.

I'll start by providing a summary of the filtered neighborhoods.

Summary Statistics of Filtered Neighborhoods

Here are the summary statistics for the neighborhoods filtered based on having at least 500 affluent households and at least 1500 people aged 18-34:

Statistic                               Value
Number of Neighborhoods                 31
Average Population (2010)               11,239
Average Number of Households            5,210
Average Number of Families              2,311
Average Affluent Households (>$100k)    1,391
Average Population Aged 18-34           3,602
Average Population Aged < 18            2,195
Average Population Aged 35-65           4,267
Average Population Aged 65+             1,174

These statistics show that the filtered neighborhoods average about 1,391 households earning more than $100k and about 3,602 residents aged 18-34, which makes them good candidates for the new coffee shop.

# Summary statistics for the filtered neighborhoods
summary_stats = filtered_neighborhoods.describe()
print(summary_stats)
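describe() prints count, mean, standard deviation, and quartiles for every numeric column, which is more than the summary needs. A minimal sketch of one way to reduce it to labeled averages follows. The helper name summarize_means and the LABELS mapping are hypothetical; only the column names listed in the data description and used in the filter are assumed, the remaining age-bracket columns would need to be added under whatever names the file actually uses, and the selected columns are assumed to have no missing values.

```python
import pandas as pd

# Column name -> row label. Extend with the other AGE_ columns once their
# exact names in census.csv are confirmed (they are not assumed here).
LABELS = {
    "POPULATION_2010": "Average Population (2010)",
    "NUM_HOUSEHOLDS": "Average Number of Households",
    "FAMILIES": "Average Number of Families",
    "NUM_HHLD_100K+": "Average Affluent Households (>$100k)",
    "AGE_18_TO_34": "Average Population Aged 18-34",
}


def summarize_means(df: pd.DataFrame, labels: dict = LABELS) -> pd.Series:
    """Row count plus the rounded mean of each labeled column that exists in df."""
    present = [col for col in labels if col in df.columns]
    means = df[present].mean().round(0).astype("int64")
    means.index = [labels[col] for col in present]
    header = pd.Series({"Number of Neighborhoods": len(df)})
    return pd.concat([header, means])
```

Calling summarize_means(filtered_neighborhoods) would return a Series whose index holds the row labels and whose values are the rounded averages.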

Visualization

Let's create visualizations to compare these neighborhoods based on the following key metrics:

  1. Total population.
  2. Number of households.
  3. Number of affluent households.
  4. Population aged 18-34.

These visualizations will help identify the most promising neighborhoods visually.

# Visualization of the filtered neighborhoods based on key metrics
import matplotlib.pyplot as plt

# Sort the dataframe by each metric in descending order
sorted_population = filtered_neighborhoods.sort_values(by='POPULATION_2010', ascending=False)
sorted_households = filtered_neighborhoods.sort_values(by='NUM_HOUSEHOLDS', ascending=False)
sorted_affluent_households = filtered_neighborhoods.sort_values(by='NUM_HHLD_100K+', ascending=False)
sorted_age_18_34 = filtered_neighborhoods.sort_values(by='AGE_18_TO_34', ascending=False)

fig, axes = plt.subplots(2, 2, figsize=(14, 14))

# Plot 1: Total population
sorted_population.plot(kind='bar', x='NBHD_NAME', y='POPULATION_2010', ax=axes[0, 0], color='skyblue', legend=False)
axes[0, 0].set_title('Total Population (2010)')
axes[0, 0].set_ylabel('Population')
axes[0, 0].tick_params(axis='x', rotation=90)

# Plot 2: Number of households
sorted_households.plot(kind='bar', x='NBHD_NAME', y='NUM_HOUSEHOLDS', ax=axes[0, 1], color='lightgreen', legend=False)
axes[0, 1].set_title('Number of Households')
axes[0, 1].set_ylabel('Households')
axes[0, 1].tick_params(axis='x', rotation=90)

# Plot 3: Number of affluent households
sorted_affluent_households.plot(kind='bar', x='NBHD_NAME', y='NUM_HHLD_100K+', ax=axes[1, 0], color='salmon', legend=False)
axes[1, 0].set_title('Number of Affluent Households (>$100k)')
axes[1, 0].set_ylabel('Affluent Households')
axes[1, 0].tick_params(axis='x', rotation=90)

# Plot 4: Population aged 18-34
sorted_age_18_34.plot(kind='bar', x='NBHD_NAME', y='AGE_18_TO_34', ax=axes[1, 1], color='orange', legend=False)
axes[1, 1].set_title('Population Aged 18-34')
axes[1, 1].set_ylabel('Population')
axes[1, 1].tick_params(axis='x', rotation=90)

# Adjust layout
plt.tight_layout()

# Show plot
plt.show()

Interpretation of Visualizations

The bar charts provide a comparative view of the filtered neighborhoods based on key metrics:

  1. Total Population (2010):
    • Neighborhoods like Capitol Hill, Highland, and Hampden have higher total populations.
  2. Number of Households:
    • Capitol Hill, Highland, and Hampden also have a high number of households.
  3. Number of Affluent Households (>$100k):
    • Neighborhoods such as Congress Park, Hampden, and Highland have a significant number of affluent households.
  4. Population Aged 18-34:
    • Capitol Hill stands out with a very high population of young adults, followed by Cheesman Park and Congress Park.
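Reading leaders off a 31-bar chart is error-prone, so the names can also be pulled directly from the data. A minimal sketch, assuming the column names used in the plotting code; top_names and leaders_by_metric are hypothetical helpers introduced only for illustration.

```python
import pandas as pd


def top_names(df: pd.DataFrame, column: str, n: int = 3, name_col: str = "NBHD_NAME") -> list:
    """Names of the n neighborhoods with the largest values in one column."""
    return df.nlargest(n, column)[name_col].tolist()


def leaders_by_metric(df: pd.DataFrame, metrics, n: int = 3) -> dict:
    """Map each metric column to its top-n neighborhood names."""
    return {metric: top_names(df, metric, n) for metric in metrics}
```

For instance, leaders_by_metric(filtered_neighborhoods, ["POPULATION_2010", "NUM_HOUSEHOLDS", "NUM_HHLD_100K+", "AGE_18_TO_34"]) lists the three leading neighborhoods for each of the four charts, which makes it easy to check the interpretation against the underlying numbers.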

Potential Neighborhoods for New Coffee Shop

Based on these visualizations, the neighborhoods that stand out as potential locations for the new coffee shop include:

  • Capitol Hill: High total population, households, and a very high number of young adults.
  • Highland: High total population, households, and a significant number of affluent households.
  • Congress Park: High number of affluent households and a significant population of young adults.
  • Cheesman Park: High population of young adults and a good number of households.

These neighborhoods appear to align well with the client's criteria: proximity to affluent households and a large young-adult population. The 18-34 census bracket serves here as a proxy for the 20-35 target demographic.

💪 Challenge

Provide your client a list of neighborhoods in Denver where they should consider expanding. Include:

  • A visualization of Denver's neighborhoods and the Starbucks store locations.
  • The neighborhoods with the highest proportion of people in the target demographic.
  • The top three neighborhoods where your client should focus their search.
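The challenge asks for proportions, while the analysis so far compares absolute counts, which favor large neighborhoods. A minimal sketch of a share-based ranking follows. It is one possible approach under stated assumptions, not a definitive method: add_shares and rank_neighborhoods are hypothetical helpers, the 18-34 bracket stands in for the 20-35 target group, and the score weights the young-adult share and the affluent-household share equally.

```python
import pandas as pd


def add_shares(demographics: pd.DataFrame) -> pd.DataFrame:
    """Add the young-adult share of population and the affluent share of households."""
    out = demographics.copy()
    # Zero denominators become NaN so the division never yields infinity
    population = out["POPULATION_2010"].where(out["POPULATION_2010"] > 0)
    households = out["NUM_HOUSEHOLDS"].where(out["NUM_HOUSEHOLDS"] > 0)
    out["young_share"] = out["AGE_18_TO_34"] / population
    out["affluent_share"] = out["NUM_HHLD_100K+"] / households
    return out


def rank_neighborhoods(demographics: pd.DataFrame, min_population: int = 0, top_n: int = 3) -> pd.DataFrame:
    """Return the top_n neighborhoods by the mean percentile rank of the two shares.

    Equal weighting of the two shares is an assumption, not a client requirement.
    """
    shares = add_shares(demographics)
    shares = shares[shares["POPULATION_2010"] >= min_population].copy()
    shares["score"] = (
        shares["young_share"].rank(pct=True) + shares["affluent_share"].rank(pct=True)
    ) / 2
    # Rows with a missing share get a NaN score and sort to the end
    return shares.sort_values("score", ascending=False).head(top_n)
```

rank_neighborhoods(demographics, min_population=1000) would give three candidates to compare against the count-based shortlist above. The min_population floor is there because very small neighborhoods can show high shares with few actual customers; its value is a judgment call.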