Stock Exchange Data

This dataset consists of stock exchange data since 1965 for several indexes. It contains the daily stock prices along with the volume traded each day.

Not sure where to begin? Scroll to the bottom to find challenges!

import pandas as pd
from sklearn.preprocessing import minmax_scale
import numpy as np

df = pd.read_csv("stock_data.csv", index_col=None, parse_dates=['Date'])
df.head()

Data Dictionary

Column      Explanation
Index       Ticker symbol for indexes
Date        Date of observation
Open        Opening price
High        Highest price during trading day
Low         Lowest price during trading day
Close       Close price
Adj Close   Close price adjusted for stock splits and dividends
Volume      Number of shares traded during trading day
CloseUSD    Close price in terms of USD

Source of dataset.

Don't know where to start?

Challenges are brief tasks designed to help you practice specific skills:

  • 🗺️ Explore: Which index has produced the highest average annual return?
  • 📊 Visualize: Create a plot visualizing a 30-day moving average for an index of your choosing.
  • 🔎 Analyze: Compare the volatilities of the indexes included in the dataset.
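One possible starting point for the three challenges is sketched below. It is a sketch under stated assumptions, not a reference solution: the sample frame is made up (only the column names Index, Date and CloseUSD come from the data dictionary), "annual return" is taken as the change between consecutive year-end closes, the moving average spans 30 rows (trading days, not calendar days), and volatility is annualized with the common convention of 252 trading days per year.

```python
import numpy as np
import pandas as pd

# Made-up sample that reuses the dataset's column names; the values are NOT real prices.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=520)

def fake_index(name, start, sigma):
    """Build a random-walk price series for a hypothetical index."""
    log_returns = rng.normal(0.0003, sigma, size=len(dates))
    prices = start * np.exp(np.cumsum(log_returns))
    return pd.DataFrame({"Index": name, "Date": dates, "CloseUSD": prices})

sample = pd.concat(
    [fake_index("AAA", 100, 0.02), fake_index("BBB", 50, 0.005)],
    ignore_index=True,
).sort_values(["Index", "Date"])

# Explore: average calendar-year return, computed from year-end closes.
year_end = sample.groupby(["Index", sample["Date"].dt.year])["CloseUSD"].last()
yearly_return = year_end / year_end.groupby(level="Index").shift(1) - 1
avg_annual_return = yearly_return.groupby(level="Index").mean()

# Visualize: 30-row moving average per index (plot it with .plot() in a notebook).
sample["ma_30"] = sample.groupby("Index")["CloseUSD"].transform(
    lambda s: s.rolling(30).mean()
)

# Analyze: annualized volatility of daily returns (252 is an assumed convention).
sample["daily_return"] = sample.groupby("Index")["CloseUSD"].transform(
    lambda s: s / s.shift(1) - 1
)
volatility = sample.groupby("Index")["daily_return"].std() * np.sqrt(252)

print(avg_annual_return)
print(volatility)
```

With the real data, `sample` would be replaced by `df`. The first year of each index has no earlier year-end close, so its annual return is missing and is skipped by the mean.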

Scenarios are broader questions to help you develop an end-to-end project for your portfolio:

You are working for an investment firm that is looking to invest in index funds. They have provided you with a dataset containing the returns of 13 different indexes. Your manager has asked you to make short-term forecasts for several of the most promising indexes to help them decide which would be a good fund to include. Your analysis should also include a discussion of the associated risks and volatility of each fund you focus on.

You will need to prepare a report that is accessible to a broad audience. It should outline your motivation, steps, findings, and conclusions.
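For the forecasting part of the scenario, it can help to establish simple baselines before trying anything more elaborate. The sketch below is one hedged way to do that, not a recommended model: it uses a made-up price series, holds out the last 20 rows (the 20-row horizon is an arbitrary choice), and compares a naive last-value forecast with a drift forecast using mean absolute error. The names `horizon`, `naive`, `drift` and `mae` are illustrative.

```python
import numpy as np
import pandas as pd

# Made-up closing prices for one hypothetical index; NOT real data.
rng = np.random.default_rng(2)
dates = pd.bdate_range("2021-01-01", periods=300)
close = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, size=300))),
    index=dates,
    name="CloseUSD",
)

# Hold out the last `horizon` rows to score the forecasts.
horizon = 20
train, test = close.iloc[:-horizon], close.iloc[-horizon:]

# Naive baseline: every future day is forecast as the last observed close.
naive = pd.Series(train.iloc[-1], index=test.index)

# Drift baseline: extend the average daily log return seen in the training window.
mean_log_return = np.log(train).diff().mean()
steps = np.arange(1, horizon + 1)
drift = pd.Series(train.iloc[-1] * np.exp(mean_log_return * steps), index=test.index)

def mae(forecast, actual):
    """Mean absolute error between a forecast and the held-out prices."""
    return float((forecast - actual).abs().mean())

scores = {"naive": mae(naive, test), "drift": mae(drift, test)}
print(scores)
```

Any model proposed in the report can then be judged by whether it beats these baselines on held-out data.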



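# Plot the USD close of every index on one set of axes
df.set_index('Date').groupby('Index')['CloseUSD'].plot(legend=True);

# Normalize close data by index (min-max scale to [0, 1] within each index).
# transform keeps the original row index, so the result aligns with df.
df['closeusd_scale'] = df.groupby('Index')['CloseUSD'].transform(minmax_scale)
df.set_index('Date').groupby('Index')['closeusd_scale'].plot(legend=True, alpha=.7);

# Scratch cell: boolean mask for two indexes; .index returns the labels of all rows, not only the matches
((df.Index=='NYA')|(df.Index=='GSPTSE')).index
# Scratch cell: filter() keeps the strings that contain 'k'
list(filter(lambda x: 'k' in x, ['kyle kosnoff', 'jonny', 'sabrina']))
# df.filter(like='scale', axis=1)

# Align NYA and GSPTSE closes on the dates both indexes share
nya = df.loc[df.Index=='NYA', ['Date', 'CloseUSD']]
gsptse = df.loc[df.Index=='GSPTSE', ['Date', 'CloseUSD']]
corr_test = pd.merge(nya, gsptse, on='Date', suffixes=('_nya', '_gsptse'))
corr_test = corr_test.set_index('Date').sort_index()

# Correlation of price levels: Pearson (default), then Kendall and Spearman
corr_test.corr()
corr_test.corr(method='kendall')
corr_test.corr(method='spearman')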
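The correlations above are computed on price levels. Two series that both trend over time can show a high level correlation simply because of the shared trend, so a common complement is to correlate daily returns instead. The sketch below illustrates the idea on made-up data; the `toy` frame only mimics the column names of `corr_test` and is not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Made-up prices for two hypothetical series that share a common daily shock; NOT real data.
rng = np.random.default_rng(1)
dates = pd.bdate_range("2021-01-01", periods=250, name="Date")
shared = rng.normal(0, 0.01, size=250)
toy = pd.DataFrame(
    {
        "CloseUSD_nya": 100 * np.exp(np.cumsum(shared + rng.normal(0, 0.003, size=250))),
        "CloseUSD_gsptse": 80 * np.exp(np.cumsum(shared + rng.normal(0, 0.003, size=250))),
    },
    index=dates,
)

# Daily simple returns; the first row has no previous close and is dropped.
returns = (toy / toy.shift(1) - 1).dropna()

level_corr = toy.corr()        # correlation of price levels
return_corr = returns.corr()   # correlation of daily returns

print(level_corr)
print(return_corr)
```

With the real data, `toy` would be replaced by `corr_test`, which already has one close column per index and Date as its index.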