Union Square Business Distribution Analysis

Objective:

To analyze the types, categories, and distribution of businesses within the Union Square area, providing insights for urban planning, business development, and community services.

In this project, I will use NY Union Square Partnership (USP) Business List data from Kaggle

Here is the data description:

Field Name	Description
`Category`	The broad category of the business (e.g., Food and Drink, Retail).
`Type of Business`	More specific business type (e.g., Restaurants, Apparel and Footwear).
`Business`	Name of the business.
`Address`	Address of the business.
`City/State`	City and State where the business is located.
`Zip Code`	Zip code of the business location.
`Phone Number`	Contact phone number of the business.
`Borough`	Borough in which the business is located.
`Latitude`	Latitude coordinate of the business location.
`Longitude`	Longitude coordinate of the business location.
`Community Board`	Administrative identifier for community board.
`Council District`	Administrative identifier for council district.
`Census Tract`	Administrative identifier for census tract.
`BIN`	Administrative identifier for BIN (Building Identification Number)
`BBL`	Administrative identifier for BBL (Borough Block Lot).
`NTA`	Administrative identifier for NTA (Neighborhood Tabulation Area)
`Location 1`	JSON-style string containing latitude and longitude.
Administrative columns `Zip Codes`, `Community Districts`, `Borough Boundaries`, `City Council Districts`, and `Police Precincts`.	Various administrative and geographical identifiers.

DataFrameas

unions

variable

--code to save the CSV file as unions
SELECT * FROM 'union.csv';

DataFrameas

df6

variable

-- Stored Procedure for Counting Businesses by a Specified Column
CREATE PROCEDURE CountBusiness(IN column_name VARCHAR(255))
BEGIN
    SET @sql_query = CONCAT(
        'SELECT ', column_name, ', COUNT(*) AS Business_Count ',
        'FROM unions ',
        'GROUP BY ', column_name, ' ',
        'ORDER BY Business_Count DESC'
    );

    PREPARE stmt FROM @sql_query;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END;

DataFrameas

df1

variable

-- Count the number of businesses by category using CountBusiness SP
CALL CountBusiness('Category');

DataFrameas

df

variable

-- Count the number of businesses by type using CountBusiness SP
CALL CountBusiness('"Type of Business"');

The procedure is created to dynamically count businesses based on any specified column. It constructs the SQL query dynamically using the input parameter.

Counts the number of businesses for each category and orders them in descending order to identify the most common business categories.

Counts the number of businesses for each type and orders them in descending order to identify the most common business types.

DataFrameas

df4

variable

-- Count the number of businesses by ZIP code using CountBusiness SP
CALL CountBusiness('"Zip Code"');

Counts the number of businesses in each ZIP code to understand the geographical distribution of businesses.

DataFrameas

df2

variable

-- Calculate the density of businesses in various community districts
SELECT "Community Districts", COUNT(*) AS Business_Count
FROM unions
WHERE "Community Districts" IS NOT NULL
GROUP BY "Community Districts"
ORDER BY Business_Count DESC;

Counts the number of businesses in each community district to understand business density and identify districts with the highest concentration of businesses.

DataFrameas

df5

variable

-- Identify the top 5 most diverse business categories (categories with the most different types of businesses)
SELECT Category, COUNT(DISTINCT "Type of Business") AS Type_Count
FROM unions
GROUP BY Category
ORDER BY Type_Count DESC
LIMIT 5;

‌
‌
‌

Union Square Business Distribution Analysis

.mfe-app-workspace-kj242g{position:absolute;top:-8px;}.mfe-app-workspace-11ezf91{display:inline-block;}.mfe-app-workspace-11ezf91:hover .Anchor__copyLink{visibility:visible;}Union Square Business Distribution Analysis

Objective:

Union Square Business Distribution Analysis