Stock Exchange Data

    This dataset consists of stock exchange data since 1965 for several indexes. It contains the daily stock prices along with the volume traded each day.

    Not sure where to begin? Scroll to the bottom to find challenges!

    import pandas as pd
    from sklearn.preprocessing import minmax_scale
    import numpy as np
    
    df = pd.read_csv("stock_data.csv", index_col=None, parse_dates=['Date'])
    df.head()

    Data Dictionary

    Column      Explanation
    Index       Ticker symbol for indexes
    Date        Date of observation
    Open        Opening price
    High        Highest price during trading day
    Low         Lowest price during trading day
    Close       Close price
    Adj Close   Close price adjusted for stock splits and dividends
    Volume      Number of shares traded during trading day
    CloseUSD    Close price in terms of USD

    Source of dataset.

    Don't know where to start?

    Challenges are brief tasks designed to help you practice specific skills:

    • 🗺️ Explore: Which index has produced the highest average annual return?
    • 📊 Visualize: Create a plot visualizing a 30-day moving average for an index of your choosing.
    • 🔎 Analyze: Compare the volatilities of the indexes included in the dataset.
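One possible starting point for the three challenges above is sketched below. It assumes a frame with the `Index`, `Date` (parsed as datetime) and `CloseUSD` columns from the data dictionary. The function names, the use of each year's last close for annual returns, and the use of the standard deviation of daily percentage returns as a volatility measure are illustrative assumptions, not a prescribed method.

```python
import pandas as pd


def moving_average(df, index_name, window=30, price_col="CloseUSD"):
    """Rolling mean of the close price for one index, ordered by date."""
    prices = (
        df.loc[df["Index"] == index_name]
        .set_index("Date")[price_col]
        .sort_index()
    )
    # The first (window - 1) values are NaN because the window is not yet full
    return prices.rolling(window).mean()


def average_annual_return(df, price_col="CloseUSD"):
    """Mean year-over-year return per index, based on each year's last close."""
    ordered = df.sort_values("Date")
    year = ordered["Date"].dt.year.rename("Year")
    year_end = ordered.groupby(["Index", year])[price_col].last()
    annual = year_end.groupby(level="Index").pct_change()
    return annual.groupby(level="Index").mean().sort_values(ascending=False)


def daily_volatility(df, price_col="CloseUSD"):
    """Standard deviation of observation-to-observation returns per index."""
    ordered = df.sort_values(["Index", "Date"])
    returns = ordered.groupby("Index")[price_col].pct_change()
    return returns.groupby(ordered["Index"]).std().sort_values(ascending=False)
```

With the notebook's `df`, `moving_average(df, 'NYA').plot()` would draw the 30-day average for NYA, while `average_annual_return(df)` and `daily_volatility(df)` each return one value per index, sorted from highest to lowest.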

    Scenarios are broader questions to help you develop an end-to-end project for your portfolio:

    You are working for an investment firm that is looking to invest in index funds. They have provided you with a dataset containing the returns of 13 different indexes. Your manager has asked you to make short-term forecasts for several of the most promising indexes to help them decide which would be a good fund to include. Your analysis should also include a discussion of the associated risks and volatility of each fund you focus on.

    You will need to prepare a report that is accessible to a broad audience. It should outline your motivation, steps, findings, and conclusions.



    # Plot the USD close price of every index over time
    df.set_index('Date').groupby('Index')['CloseUSD'].plot(legend=True);
    # Min-max scale the close price within each index so all curves share a 0-1 range
    df['closeusd_scale'] = df.groupby('Index')['CloseUSD'].transform(lambda x: minmax_scale(x))
    df.set_index('Date').groupby('Index')['closeusd_scale'].plot(legend=True, alpha=.7);
    # Row labels of the NYA and GSPTSE observations
    df.index[df['Index'].isin(['NYA', 'GSPTSE'])]
    list(filter(lambda x: 'k' in x, ['kyle kosnoff', 'jonny', 'sabrina']))
    # df.filter(like='scale', axis=1)
    # Compare NYA and GSPTSE close prices on the dates both indexes traded
    nya = df.loc[df.Index=='NYA', ['Date', 'CloseUSD']]
    gsptse = df.loc[df.Index=='GSPTSE', ['Date', 'CloseUSD']]
    corr_test = pd.merge(nya, gsptse, on='Date', suffixes=('_nya', '_gsptse'))
    corr_test = corr_test.set_index('Date').sort_index()
    # Pearson (linear), then Kendall and Spearman (rank-based) correlation of the price levels
    corr_test.corr()
    corr_test.corr(method='kendall')
    corr_test.corr(method='spearman')
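
The correlations above are computed on price levels. Two series that both trend upward over decades can show a high level correlation for that reason alone, so a common complementary check is to correlate period-over-period returns instead. A minimal sketch, assuming a date-indexed frame with one price column per index (such as `corr_test`); the helper name `return_correlation` is hypothetical:

```python
import pandas as pd


def return_correlation(prices, method="pearson"):
    """Correlate period-over-period returns of the columns of a date-indexed price frame."""
    # Convert prices to percentage changes and drop the first row, which has no prior price
    returns = prices.sort_index().pct_change().dropna()
    return returns.corr(method=method)
```

With the frame built above, `return_correlation(corr_test)` gives the Pearson version; passing `method='kendall'` or `method='spearman'` mirrors the rank-based calls.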