City of Rome Weather Analysis & Prediction

    In this workbook I analyze how Rome's climate has changed over the last four decades. In particular, I focus on temperature-related aspects: the annual average temperature and the number of fog days. At the end, we will see how Rome's climate could change in the near future.

    Unfortunately, the precipitation data are not reliable, so I will not use them in the analysis.

    1. Configuration, Data Mining & Data Cleaning

    In this section we will collect and clean data.

    Config

    Set variables for city, start year / month and end year / month.

    city = 'Roma'
    
    start_year = 1980
    start_month = 1
    
    end_year = 2020
    end_month = 12

    Config #2

    Import libraries and define the data structures used later.

    # Import libraries
    import requests, csv
    import pandas as pd
    import numpy as np
    
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    # Set seaborn parameters
    sns.set(rc = {'figure.figsize': (8, 4)}, font = 'calibri')
    sns.set_context('notebook')
    sns.set_style('whitegrid', {'grid.linestyle': ':', 'axes.spines.right': False, 'axes.spines.top': False})
    
    # Define data structures: Italian month names (used in the URLs), column names and target data types
    month_list = ['Gennaio', 'Febbraio', 'Marzo', 'Aprile', 'Maggio', 'Giugno', 'Luglio', 'Agosto', 'Settembre', 'Ottobre', 'Novembre', 'Dicembre']
    
    header = ['city', 'date', 't_avg_c', 't_min_c', 't_max_c', 'dew_point_c', 'humidity_%', 'visibility_km', 'wind_avg_kmh', 'wind_max_kmh', 'gust_kmh', 'air_pressure_asl_mb', 'air_pressure_avg_mb', 'rain_mm', 'phenomena']
    
    convert_dict = {'t_avg_c': float, 't_min_c': float, 't_max_c': float, 'dew_point_c': float, 'humidity_%': int, 'visibility_km': int, 'wind_avg_kmh': int, 'wind_max_kmh': int, 'gust_kmh': int, 'air_pressure_asl_mb': int, 'air_pressure_avg_mb': int, 'rain_mm': float}

    Data Scraping

    Download the monthly weather data from www.ilmeteo.it and store the rows in the weather_list list (this takes a while).

    weather_list = []
    
    # Reuse a single session for all requests instead of opening one per month
    with requests.Session() as s:
    
        for year in range(start_year, end_year + 1):
    
            # start_month applies only to the first year, end_month only to the last one
            first_month = start_month if year == start_year else 1
            last_month = end_month if year == end_year else 12
    
            for month in range(first_month - 1, last_month):  # month_list is zero-indexed
    
                CSV_URL = 'https://www.ilmeteo.it/portale/archivio-meteo/' + city + '/' + str(year) + '/' + month_list[month] + '?format=csv'
    
                download = s.get(CSV_URL)  # Request the monthly csv file
                decoded_content = download.content.decode('utf-8')  # Decode csv content
    
                records = csv.reader(decoded_content.splitlines(), delimiter = ';')  # Read csv content
                weather_list += list(records)[1:]  # Append the rows to "weather_list", skipping the header row
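
To make the period handling easier to check without hitting the website, here is a minimal offline sketch of how the configuration expands into one URL per month. The helper name month_urls and the constant MONTHS_IT are hypothetical (they are not part of the workbook), and the sketch assumes the start month should apply only to the first year and the end month only to the last year.

```python
# Italian month names, as used in the archive URLs
MONTHS_IT = ['Gennaio', 'Febbraio', 'Marzo', 'Aprile', 'Maggio', 'Giugno',
             'Luglio', 'Agosto', 'Settembre', 'Ottobre', 'Novembre', 'Dicembre']

def month_urls(city, start_year, start_month, end_year, end_month):
    """Yield one archive URL per month of the inclusive period (hypothetical helper)."""
    for year in range(start_year, end_year + 1):
        # Partial years are only possible at the two ends of the period
        first = start_month if year == start_year else 1
        last = end_month if year == end_year else 12
        for month in range(first, last + 1):
            # Months are 1-based in the config, the list is 0-based
            yield ('https://www.ilmeteo.it/portale/archivio-meteo/'
                   f'{city}/{year}/{MONTHS_IT[month - 1]}?format=csv')

urls = list(month_urls('Roma', 1980, 1, 2020, 12))
print(len(urls))  # 41 years * 12 months = 492
print(urls[0])

short = list(month_urls('Roma', 2019, 11, 2020, 2))  # Nov 2019 to Feb 2020
print(len(short))  # 4
```

Building the list of URLs first also makes it simple to count the requests before starting the download.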

    Data Cleaning

    Load the weather data into the weather_df DataFrame and clean it.

    weather_df = pd.DataFrame(weather_list, columns = header)
    
    # Replace empty cells with NaN, missing phenomena with 'none', decimal commas with dots
    weather_df = weather_df.replace(r'^\s*$', np.nan, regex = True)
    weather_df['phenomena'] = weather_df['phenomena'].fillna('none')
    weather_df = weather_df.replace(',', '.', regex = True)
    
    # Drop NaN
    weather_df.dropna(inplace = True)
    
    # Convert with correct data types
    weather_df['date'] = pd.to_datetime(weather_df['date'], dayfirst = True)
    weather_df = weather_df.astype(convert_dict)
    
    
    display(weather_df.head())
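
The effect of each cleaning step is easier to see on a tiny example. The sketch below applies the same sequence of operations to three invented rows (not real ilmeteo.it data) with only a subset of the columns; the names raw and clean are illustrative.

```python
import numpy as np
import pandas as pd

# Invented rows in the raw string form produced by the csv reader
raw = pd.DataFrame(
    [['Roma', '1/1/1980', '7,5', '80', ''],
     ['Roma', '2/1/1980', '', '75', 'Pioggia'],
     ['Roma', '3/1/1980', '9,0', '70', 'Nebbia']],
    columns=['city', 'date', 't_avg_c', 'humidity_%', 'phenomena'])

clean = raw.replace(r'^\s*$', np.nan, regex=True)        # empty cells -> NaN
clean['phenomena'] = clean['phenomena'].fillna('none')   # no phenomenon recorded
clean = clean.replace(',', '.', regex=True)              # decimal comma -> dot
clean = clean.dropna()                                   # drops the row without a temperature
clean['date'] = pd.to_datetime(clean['date'], dayfirst=True)
clean = clean.astype({'t_avg_c': float, 'humidity_%': int})

print(clean)
print(clean.dtypes)
```

Note that dropna removes a day as soon as any single measurement is missing, so the cleaned table can contain fewer days than the raw download.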

    Data Cleaning #2

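    Reduce the 'phenomena' column to a few categories: storm, rain, snow and fog. The rules are applied in order, so a day that matches more than one rule keeps the first matching label.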

    weather_df.loc[weather_df['phenomena'].str.contains('temporale|grandine'), 'phenomena'] = 'storm'
    weather_df.loc[weather_df['phenomena'].str.contains('pioggia'), 'phenomena'] = 'rain'
    weather_df.loc[weather_df['phenomena'].str.contains('neve'), 'phenomena'] = 'snow'
    weather_df.loc[weather_df['phenomena'].str.contains('nebbia'), 'phenomena'] = 'fog'
    
    print(weather_df['phenomena'].unique())
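
Because the four replacements run one after another, their order acts as a priority: once a label has been rewritten, later patterns no longer match it. The sketch below shows this on invented labels; the real column may use different spellings or casing, and the rules list is just an illustrative way to write the same four steps.

```python
import pandas as pd

# Invented raw labels, lowercase to match the case-sensitive patterns
phenomena = pd.Series(['none', 'pioggia temporale', 'pioggia', 'neve', 'pioggia nebbia'])

# (pattern, label) pairs in priority order, mirroring the four steps above
rules = [('temporale|grandine', 'storm'),
         ('pioggia', 'rain'),
         ('neve', 'snow'),
         ('nebbia', 'fog')]

for pattern, label in rules:
    mask = phenomena.str.contains(pattern)
    phenomena.loc[mask] = label

print(phenomena.tolist())
# ['none', 'storm', 'rain', 'snow', 'rain']
```

In this example the last day combines rain and fog and ends up as 'rain', which means fog days that also had rain are not counted as fog under this ordering.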

    2. Exploratory Analysis

    In this section we will analyze the correlation between the weather variables.

    Analysis

    Group weather data by year.