Skip to content

Tech Stock Prices

This dataset consists of the daily stock prices and volume of ten different tech companies: Apple (AAPL), Amazon (AMZN), Alibaba (BABA), Salesforce (CRM), Facebook (FB), Alphabet (GOOG), Intel (INTC), Microsoft (MSFT), Nvidia (NVDA), and Tesla (TSLA).

There are ten CSV files in the data/ folder named with the stock symbol for each of the ten companies listed above. Looking for another company? You can download it from Yahoo Finance and upload it to your workspace.

Not sure where to begin? Scroll to the bottom to find challenges!

import pandas as pd

aapl = pd.read_csv("data/AAPL.csv", parse_dates=[0]).assign(nom='Apple') # parse_date to import dates in datetime format
amzn = pd.read_csv("data/AMZN.csv", parse_dates=[0]).assign(nom='Amazon')
nvidia = pd.read_csv('data/NVDA.csv', parse_dates=[0]).assign(nom='Nvidia')
intel = pd.read_csv('data/INTC.csv', parse_dates=[0]).assign(nom='Intel')
alibaba = pd.read_csv('data/BABA.csv', parse_dates=[0]).assign(nom='Alibaba')

Data Dictionary

ColumnExplanation
DateDate of observation
OpenOpening price
HighHighest price during trading day
LowLowest price during trading day
CloseClose price
Adj CloseAdjusted close price adjusted for splits and dividend and/or capital gain distribution
VolumeNumber of shares traded during trading day

Source of dataset.

Don't know where to start?

Challenges are brief tasks designed to help you practice specific skills:

  • ๐Ÿ—บ๏ธ Explore: Which of the ten companies has the highest closing price based on the most recent data?
  • ๐Ÿ“Š Visualize: Create a plot that visualizes the closing price at the end of each month for the 10 tech stocks.
  • ๐Ÿ”Ž Analyze: Which of the ten companies have experienced the greatest percent increase in closing price over the course of their existence?

Scenarios are broader questions to help you develop an end-to-end project for your portfolio:

You have been hired as an analyst for a small investment firm. They currently specialize in commodities, focusing on coffee, cocoa, and sugar. However, they are now interested in expanding to technology companies. Your manager has asked you to explore the returns and volatilities of the ten stocks provided and contrast them with the three commodities they currently invest in.

They also want you to recommend how tech stocks could be integrated into a portfolio with the existing commodities they invest in to minimize risk while gaining exposure to the new market.

You will need to prepare a report that is accessible to a broad audience. It should outline your motivation, steps, findings, and conclusions.

Commodity prices can be found here.

Working for max and min values in a Series .nlargest(slice, col) .max() .idxmax() .argmax()

# on volume 

# nlargest method - return rows with max values 
aapl.nlargest(5, 'Volume') # return rows

# max method - return only the max value
aapl.Volume.max() #return max value

# idxmax method - return row index number of the max_value
aapl.Volume.idxmax() #return 262

# argmax method - works on Series or numpy nd.array
aapl.Volume.argmax() #return 262

Working for max and min values in all dataframe.values with argmax() .select_dtypes(exclude/include) .ravel() to flatten the array

# .values return a numpy array and works fine with argmax(). Caution, argmax in ndims array returns index of a flatten array.
# numpy array converts values in dominant dtype so you need to get only float or int
(aapl.set_index('Date')
 .select_dtypes(exclude=['object'])
                .values.argmax()) # returns 1577 with max total value in a flatten array

# with select_dtypes(), you can exclude specific dtypes working well with argmax
(aapl.select_dtypes(exclude=['datetime','object']).values
 .argmax()) # returns 1577 - the array with max total value in a flatten array

# select_dtypes() and values.argmax(axis=0) to find all local column maxima indices
aapl.select_dtypes(include=['float','int']).values.argmax(axis=0) # returns an array with max value indice for each column

# checking if 1577 indice of a flatten array returns the same as max Volume 
(aapl.select_dtypes(exclude='datetime').values.ravel())[1577] # return the max volume value

iloc, iat, query

# iloc with single square brackets return Series
aapl.iloc[262]

# iloc with double square brackets return DataFrame
aapl.iloc[[262]]

# iat return a sole value at a specified position in value form
aapl.iat[262,-1]

# indexing with .values array
aapl.set_index('Date').values[2998] # returns the 2998th row of the array

# using query
aapl.query(" Date == '2011-01-18' ") # return 262th row

.diff() method

# finding the variation above diff >5
aapl.set_index('Date')[['Open', 'Close']].diff(axis=1).sort_values('Close', ascending=False).loc[lambda x: x.Close > 5]

# create a new column with assign and variation and check var_day above 6%
aapl.assign( var_day = aapl.Close / aapl.Open * 100 - 100).set_index('Date').sort_values('var_day', ascending=False).loc[lambda x: x.var_day > 6 ]

# check min variation intraday
aapl.assign( var_ext = aapl.High / aapl.Low * 100 - 100).set_index('Date').nsmallest(3, 'var_ext') # 0,41 %

# check max variation intraday
aapl.assign( var_ext = aapl.High / aapl.Low * 100 -100 ).set_index('Date').nlargest(3, 'var_ext') # 29% !!!

Find all local maxima on close

aapl[(aapl.Close.shift(1)<aapl.Close) & (aapl.Close.shift(-1)<aapl.Close)]
โ€Œ
โ€Œ
โ€Œ