Creating a Plotly Histogram with Olympics Data

    Olympics

    This is a historical dataset on the modern Olympic Games, from Athens 1896 to Rio 2016. Each row consists of an individual athlete competing in an Olympic event and which medal was won (if any).

    Not sure where to begin? Scroll to the bottom to find challenges!

    import pandas as pd
    
    olympic_data = pd.read_csv("data/athlete_events.csv.gz")
    olympic_data.head()

    Data Dictionary

    Column    Explanation
    id        Unique number for each athlete
    name      Athlete's name
    sex       M or F
    age       Age of the athlete
    height    In centimeters
    weight    In kilograms
    team      Team name
    noc       National Olympic Committee 3-letter code
    games     Year and season
    year      Integer
    season    Summer or Winter
    city      Host city
    sport     Sport
    event     Event
    medal     Gold, Silver, Bronze, or NA
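Because the medal column uses NA for "no medal", it helps to know how pandas reads that value. The sketch below uses a made-up three-row sample (not rows from the real file) to show that `pd.read_csv` treats the literal string `NA` as missing by default, so `notna()` can be used to pick out medal-winning rows.

```python
import io

import pandas as pd

# Hypothetical sample in the same layout as the data dictionary;
# these rows are invented for illustration only.
sample_csv = io.StringIO(
    "id,name,sex,age,medal\n"
    "1,Athlete A,M,24,Gold\n"
    "2,Athlete B,F,31,NA\n"
    "3,Athlete C,M,NA,Bronze\n"
)
sample = pd.read_csv(sample_csv)

# "NA" is parsed as a missing value, so it shows up in isna()
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Keep only the rows where a medal was actually won
medal_rows = sample[sample["medal"].notna()]
print(len(medal_rows))
```

The same `isna().sum()` call on `olympic_data` gives a quick view of how many ages, heights, and weights are missing before plotting.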

    Source and license of the dataset. The dataset is a consolidated version of data from www.sports-reference.com.

    Don't know where to start?

    Challenges are brief tasks designed to help you practice specific skills:

    • 🗺️ Explore: In which year and city did the Netherlands win the highest number of medals in their history?
    • 📊 Visualize: Create a plot visualizing the relationship between the number of athletes countries send to an event and the number of medals they receive.
    • 🔎 Analyze: In which sports does the height of an athlete increase their chances of earning a medal?
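One possible starting point for the Explore challenge is a filter followed by a group-by. The sketch below runs on a small invented DataFrame (the `toy` frame and its rows are hypothetical) and assumes the Netherlands appears under the NOC code `NED`; on the real data, replace `toy` with `olympic_data`.

```python
import pandas as pd

# Hypothetical rows for illustration only; real values come from olympic_data.
toy = pd.DataFrame(
    {
        "noc": ["NED", "NED", "NED", "NED", "USA"],
        "year": [2000, 2000, 2008, 2000, 2000],
        "city": ["Sydney", "Sydney", "Beijing", "Sydney", "Sydney"],
        "medal": ["Gold", None, "Silver", "Bronze", "Gold"],
    }
)

# Keep Dutch rows that carry a medal, then count them per Games
ned_medals = toy[(toy["noc"] == "NED") & toy["medal"].notna()]
medals_per_games = (
    ned_medals.groupby(["year", "city"]).size().sort_values(ascending=False)
)

# The first index entry is the (year, city) pair with the most medal rows
best_year, best_city = medals_per_games.index[0]
print(best_year, best_city)
```

Since each row is one athlete in one event, this counts a team medal once per team member. Dropping duplicates on year, event, and medal first would count each team medal once instead.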

    Scenarios are broader questions to help you develop an end-to-end project for your portfolio:

    You are working as a data analyst for an international judo club. The owner of the club is looking for new ways to leverage data for competition. One idea they have had is to use past competition data to estimate the threat of future opponents. They have provided you with a dataset of past Olympic data and want to know whether you can use information such as the height, weight, age, and national origin of a judo competitor to estimate the probability that they will earn a medal.

    You will need to prepare a report that is accessible to a broad audience. It should outline your steps, findings, and conclusions.
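A natural first step for this scenario is to turn the medal column into a yes/no target and look at simple medal rates before fitting any model. The sketch below uses invented rows (the `toy` frame is hypothetical) and assumes judo rows are labelled `"Judo"` in the sport column.

```python
import pandas as pd

# Hypothetical rows for illustration only; swap in olympic_data for real use.
toy = pd.DataFrame(
    {
        "sport": ["Judo", "Judo", "Judo", "Judo", "Swimming"],
        "noc": ["JPN", "JPN", "FRA", "FRA", "USA"],
        "medal": ["Gold", "Silver", None, "Bronze", "Gold"],
    }
)

# Restrict to judo and build a binary target: 1 if any medal was won
judo = toy[toy["sport"] == "Judo"].copy()
judo["won_medal"] = judo["medal"].notna().astype(int)

# Share of judo entries per country that ended with a medal
medal_rate_by_noc = judo.groupby("noc")["won_medal"].mean()
print(medal_rate_by_noc)
```

The `won_medal` column can then serve as the outcome for a classifier that uses height, weight, age, and country as inputs.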



    Histogram with Plotly Express

    olympic_data.shape
    import plotly.express as px
    
    # Create a histogram: pass the DataFrame and name the column to plot
    fig = px.histogram(olympic_data, x="age",
                       title="Distribution of Athletes' Age")
    fig.show()

    Histogram with Plotly GO

    import plotly.graph_objects as go
    
    fig = go.Figure(data=[go.Histogram(x=olympic_data.age)])
    fig.update_layout(title=dict(text="Distribution of Athletes' Age"))
    fig.show()

    Changing the title font

    # Create histogram
    fig = go.Figure(data=[go.Histogram(x=olympic_data.age)])
    
    fig.update_layout(
        # Set the global font
        font = {
            "family": "Times New Roman",
            "size": 16
        },
        # Update title font
        title = {
            "text": "Distribution of Athletes' Age",
            "y": 0.9,  # Sets the y position of the title with respect to `yref`
            "x": 0.5,  # Sets the x position of the title with respect to `xref`
            "xanchor": "center",  # Sets the title's horizontal alignment with respect to its x position
            "yanchor": "top",  # Sets the title's vertical alignment with respect to its y position
            "font": { # Only configures font for title
                "family":"Arial",
                "size":20,
                "color": "red"
            }
        }
    )
    
    # Add X and Y labels
    fig.update_xaxes(title_text="Age")
    fig.update_yaxes(title_text="Number of Athletes")
    
    # Display plot
    fig.show()

    Changing the bin size

    # Create histogram
    fig = go.Figure(
        data=[
            go.Histogram(
                x=olympic_data.age,
                xbins=go.histogram.XBins(size=5),  # Each bin spans 5 years of age
            )
        ]
    )
    
    fig.update_layout(
        # Set the global font
        font = {
            "family": "Times New Roman",
            "size": 16
        },
        # Update title font
        title = {
            "text": "Distribution of Athletes' Age",
            "y": 0.9,  # Sets the y position of the title with respect to `yref`
            "x": 0.5,  # Sets the x position of the title with respect to `xref`
            "xanchor": "center",  # Sets the title's horizontal alignment with respect to its x position
            "yanchor": "top",  # Sets the title's vertical alignment with respect to its y position
            "font": { # Only configures font for title
                "family":"Arial",
                "size":20,
                "color": "red"
            }
        }
    )
    
    # Add X and Y labels
    fig.update_xaxes(title_text="Age")
    fig.update_yaxes(title_text="Number of Athletes")
    
    # Display plot
    fig.show()

    Changing the color of the bins
