Project: Enriching Stock Market Data using the OpenAI API
Enriching stock market data using the OpenAI API
The Nasdaq-100 is a stock market index made up of 101 equity securities issued by 100 of the largest non-financial companies listed on the Nasdaq stock exchange. It helps investors compare stock prices with previous prices to determine market performance.
In this project you are provided with two CSV files containing Nasdaq-100 stock information:
- nasdaq100_CA.csv: contains information about companies in the index such as symbol, name, etc. For this analysis, only companies headquartered in California have been selected.
- nasdaq100_price_change.csv: contains price changes per stock across periods including (but not limited to) one day, five days, one month, six months, one year, etc.
As an AI developer, you will leverage the OpenAI API to classify companies into sectors and produce a summary of sector and company performance for this year, for the companies in the index that are headquartered in California.
CSV with Nasdaq-100 stock data
In this project, two CSV files are available: nasdaq100_CA.csv and nasdaq100_price_change.csv.
nasdaq100_CA.csv
symbol,name,headQuarter,dateFirstAdded,cik,founded
AAPL,Apple Inc.,"Cupertino, CA",,0000320193,1976-04-01
ABNB,Airbnb,"San Francisco, CA",,0001559720,2008-08-01
ADBE,Adobe Inc.,"San Jose, CA",,0000796343,1982-12-01
...
nasdaq100_price_change.csv
symbol,1D,5D,1M,3M,6M,ytd,1Y,3Y,5Y,10Y,max
AAPL,-1.7254,-8.30086,-6.20411,3.042,15.64824,42.99992,8.47941,60.96299,245.42031,976.99441,139245.53954
ABNB,2.1617,-2.21919,9.88336,19.43286,19.64241,68.66902,23.64013,-1.04347,-1.04347,-1.04347,-1.04347
ADBE,0.5409,-1.77817,9.16191,52.0465,38.01522,57.22723,21.96206,17.83037,109.05718,1024.69214,251030.66399
ADI,0.9291,-4.03352,2.58486,3.65887,5.01602,17.02062,8.09735,63.42847,92.81874,286.77518,26012.63736
...
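The two files share a symbol column, and the price sample lists a ticker (ADI) that does not appear in the company sample, so rows cannot be assumed to line up by position. A minimal sketch of joining them on symbol, using a few of the sample rows above with the price columns trimmed to 1D and ytd for brevity (the variable names are illustrative, not part of the project):

```python
import io
import pandas as pd

# Sample rows copied from the two files shown above
# (company columns trimmed; price columns trimmed to 1D and ytd).
ca_csv = io.StringIO(
    'symbol,name,headQuarter\n'
    'AAPL,Apple Inc.,"Cupertino, CA"\n'
    'ABNB,Airbnb,"San Francisco, CA"\n'
    'ADBE,Adobe Inc.,"San Jose, CA"\n'
)
price_csv = io.StringIO(
    "symbol,1D,ytd\n"
    "AAPL,-1.7254,42.99992\n"
    "ABNB,2.1617,68.66902\n"
    "ADBE,0.5409,57.22723\n"
    "ADI,0.9291,17.02062\n"
)

companies = pd.read_csv(ca_csv)
prices = pd.read_csv(price_csv)

# Join on the ticker so each company receives its own ytd value,
# regardless of row order or extra rows in the price file.
merged = companies.merge(prices[["symbol", "ytd"]], on="symbol", how="left")
print(merged)
```

A left join keeps every company row and simply drops price rows with no matching company, which is usually what an enrichment step like this wants.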
# Start your code here!
import os
import pandas as pd
from openai import OpenAI
# Instantiate an API client
client = OpenAI()
# Continue coding here
# Read both CSV files
nasdaq100_price_change = pd.read_csv('nasdaq100_price_change.csv')
nasdaq100_ca = pd.read_csv('nasdaq100_CA.csv')
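# Add the ytd column from nasdaq100_price_change to nasdaq100_ca, matching rows on symbol
# (the price file is not restricted to the California subset, so assigning by row position would misalign values)
nasdaq100_ca = nasdaq100_ca.merge(nasdaq100_price_change[['symbol', 'ytd']], on='symbol', how='left')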
# Create formatted prompt template
prompt = """Classify the company {company_name} into one of these sectors: Technology, Consumer Cyclical, Industrials, Utilities, Healthcare, Communication, Energy, Consumer Defensive, Real Estate, Financial. Respond with only the sector name."""
# Use a loop to classify every stock
for company in nasdaq100_ca["symbol"]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{'role': 'user', 'content': prompt.format(company_name=company)}],
        temperature=0.0,  # a low temperature makes the classification more repeatable
    )
    # Strip stray whitespace so identical sectors are counted together
    sector = response.choices[0].message.content.strip()
    # Add sector information to existing stock data
    nasdaq100_ca.loc[nasdaq100_ca["symbol"] == company, "sector"] = sector
#Check the count of sectors
nasdaq100_ca["sector"].value_counts()
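The sector counts are only meaningful if every reply uses one of the ten names listed in the prompt, and a chat model may add punctuation, change casing, or wrap the name in a sentence. A minimal sketch of a clean-up step follows; normalize_sector and the "Unknown" fallback are hypothetical helpers introduced here for illustration, not part of the project template:

```python
# Sector names copied from the classification prompt.
ALLOWED_SECTORS = [
    "Technology", "Consumer Cyclical", "Industrials", "Utilities", "Healthcare",
    "Communication", "Energy", "Consumer Defensive", "Real Estate", "Financial",
]


def normalize_sector(raw, fallback="Unknown"):
    """Map a free-text model reply onto one of the allowed sector names."""
    # Drop surrounding whitespace, then surrounding periods and quotes.
    cleaned = raw.strip().strip(".\"'").casefold()
    # First pass: the reply is exactly one sector name (ignoring case).
    for sector in ALLOWED_SECTORS:
        if cleaned == sector.casefold():
            return sector
    # Second pass: the reply embeds a sector name in a longer sentence.
    for sector in ALLOWED_SECTORS:
        if sector.casefold() in cleaned:
            return sector
    return fallback


print(normalize_sector(" technology.\n"))
print(normalize_sector("The sector is Healthcare"))
print(normalize_sector("Biotech"))
```

Applied to the table, this could look like nasdaq100_ca["sector"] = nasdaq100_ca["sector"].map(normalize_sector), after which any "Unknown" rows can be inspected by hand.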
# Build the summary prompt from the full table, not just the last company classified
summary_prompt = """Based on this Nasdaq-100 stock data (CSV columns: symbol, name, sector, ytd):
{company_data}
Provide:
1. Summary of stock performance this year
2. The three best sectors
3. A few companies per sector"""
company_data = nasdaq100_ca[["symbol", "name", "sector", "ytd"]].to_csv(index=False)
response = client.chat.completions.create(
    model='gpt-3.5-turbo',
    messages=[{'role': 'user', 'content': summary_prompt.format(company_data=company_data)}]
)
# Store the recommendations
stock_recommendations = response.choices[0].message.content
print(stock_recommendations)
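Because the prompt asks the model to pick the three best sectors, it can help to compute the same ranking directly with pandas and compare it against the generated text. The sketch below assumes a table with symbol, sector, and ytd columns; the ytd values are taken from the sample rows, while the sector labels are assumed purely for illustration:

```python
import pandas as pd

# Illustrative rows only: ytd values come from the sample data,
# the sector labels are assumed for the sake of the example.
df = pd.DataFrame({
    "symbol": ["AAPL", "ABNB", "ADBE"],
    "sector": ["Technology", "Consumer Cyclical", "Technology"],
    "ytd": [42.99992, 68.66902, 57.22723],
})

# Average year-to-date change and company count per sector, best first.
sector_summary = (
    df.groupby("sector")["ytd"]
    .agg(mean_ytd="mean", companies="count")
    .sort_values("mean_ytd", ascending=False)
)

# Keep at most the three best-performing sectors.
top_sectors = sector_summary.head(3)
print(top_sectors)
```

Ranking by the mean is one choice among several; a median or a count-weighted measure may be more robust when a sector holds only one or two companies.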