Skip to content
Union Square Business Distribution Analysis
  • AI Chat
  • Code
  • Report
  • Union Square Business Distribution Analysis

    Objective:

    To analyze the types, categories, and distribution of businesses within the Union Square area, providing insights for urban planning, business development, and community services.

    In this project, I will use NY Union Square Partnership (USP) Business List data from Kaggle

    Here is the data description:

    Field NameDescription
    CategoryThe broad category of the business (e.g., Food and Drink, Retail).
    Type of BusinessMore specific business type (e.g., Restaurants, Apparel and Footwear).
    BusinessName of the business.
    AddressAddress of the business.
    City/StateCity and State where the business is located.
    Zip CodeZip code of the business location.
    Phone NumberContact phone number of the business.
    BoroughBorough in which the business is located.
    LatitudeLatitude coordinate of the business location.
    LongitudeLongitude coordinate of the business location.
    Community BoardAdministrative identifier for community board.
    Council DistrictAdministrative identifier for council district.
    Census TractAdministrative identifier for census tract.
    BINAdministrative identifier for BIN (Building Identification Number)
    BBLAdministrative identifier for BBL (Borough Block Lot).
    NTAAdministrative identifier for NTA (Neighborhood Tabulation Area)
    Location 1JSON-style string containing latitude and longitude.
    Administrative columns Zip Codes, Community Districts, Borough Boundaries, City Council Districts, and Police Precincts.Various administrative and geographical identifiers.
    Spinner
    DataFrameavailable as
    unions
    variable
    --code to save the CSV file as unions
    SELECT * FROM 'union.csv';
    Spinner
    DataFrameavailable as
    df6
    variable
    -- Stored Procedure for Counting Businesses by a Specified Column
    CREATE PROCEDURE CountBusiness(IN column_name VARCHAR(255))
    BEGIN
        SET @sql_query = CONCAT(
            'SELECT ', column_name, ', COUNT(*) AS Business_Count ',
            'FROM unions ',
            'GROUP BY ', column_name, ' ',
            'ORDER BY Business_Count DESC'
        );
    
        PREPARE stmt FROM @sql_query;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
    END;
    Spinner
    DataFrameavailable as
    df1
    variable
    -- Count the number of businesses by category using CountBusiness SP
    CALL CountBusiness('Category');
    Spinner
    DataFrameavailable as
    df
    variable
    -- Count the number of businesses by type using CountBusiness SP
    CALL CountBusiness('"Type of Business"');

    The procedure is created to dynamically count businesses based on any specified column. It constructs the SQL query dynamically using the input parameter.

    Counts the number of businesses for each category and orders them in descending order to identify the most common business categories.

    Counts the number of businesses for each type and orders them in descending order to identify the most common business types.

    Spinner
    DataFrameavailable as
    df4
    variable
    -- Count the number of businesses by ZIP code using CountBusiness SP
    CALL CountBusiness('"Zip Code"');

    Counts the number of businesses in each ZIP code to understand the geographical distribution of businesses.

    Spinner
    DataFrameavailable as
    df2
    variable
    -- Calculate the density of businesses in various community districts
    SELECT "Community Districts", COUNT(*) AS Business_Count
    FROM unions
    WHERE "Community Districts" IS NOT NULL
    GROUP BY "Community Districts"
    ORDER BY Business_Count DESC;

    Counts the number of businesses in each community district to understand business density and identify districts with the highest concentration of businesses.

    Spinner
    DataFrameavailable as
    df5
    variable
    -- Identify the top 5 most diverse business categories (categories with the most different types of businesses)
    SELECT Category, COUNT(DISTINCT "Type of Business") AS Type_Count
    FROM unions
    GROUP BY Category
    ORDER BY Type_Count DESC
    LIMIT 5;