Enriching stock market data using the OpenAI API
The Nasdaq-100 is a stock market index made up of 101 equity securities issued by 100 of the largest non-financial companies listed on the Nasdaq stock exchange. It helps investors compare stock prices with previous prices to determine market performance.
In this project you are provided with two CSV files containing Nasdaq-100 stock information:
- nasdaq100.csv: contains information about companies in the index such as symbol, name, etc.
- nasdaq100_price_change.csv: contains price changes per stock across periods including (but not limited to) one day, five days, one month, six months, one year, etc.
As an AI developer, you will leverage the OpenAI API to classify companies into sectors and produce a summary of sector and company performance for this year.
CSV with Nasdaq-100 stock data
Two CSV files are available in this project: nasdaq100.csv and nasdaq100_price_change.csv.
nasdaq100.csv
symbol,name,headQuarter,dateFirstAdded,cik,founded
AAPL,Apple Inc.,"Cupertino, CA",,0000320193,1976-04-01
ABNB,Airbnb,"San Francisco, CA",,0001559720,2008-08-01
ADBE,Adobe Inc.,"San Jose, CA",,0000796343,1982-12-01
ADI,Analog Devices,"Wilmington, MA",,0000006281,1965-01-01
...
nasdaq100_price_change.csv
symbol,1D,5D,1M,3M,6M,ytd,1Y,3Y,5Y,10Y,max
AAPL,-1.7254,-8.30086,-6.20411,3.042,15.64824,42.99992,8.47941,60.96299,245.42031,976.99441,139245.53954
ABNB,2.1617,-2.21919,9.88336,19.43286,19.64241,68.66902,23.64013,-1.04347,-1.04347,-1.04347,-1.04347
ADBE,0.5409,-1.77817,9.16191,52.0465,38.01522,57.22723,21.96206,17.83037,109.05718,1024.69214,251030.66399
ADI,0.9291,-4.03352,2.58486,3.65887,5.01602,17.02062,8.09735,63.42847,92.81874,286.77518,26012.63736
...
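As a rough illustration of how the two files relate, the sketch below joins them on the shared symbol column. It uses the first two sample rows shown above in place of the real files; the variable names (companies, price_changes, merged) are illustrative and not part of the project.

```python
import io

import pandas as pd

# Sample rows copied from the two files shown above. In the real project,
# the file paths would be passed to pd.read_csv instead of StringIO buffers.
COMPANIES_CSV = '''symbol,name,headQuarter,dateFirstAdded,cik,founded
AAPL,Apple Inc.,"Cupertino, CA",,0000320193,1976-04-01
ABNB,Airbnb,"San Francisco, CA",,0001559720,2008-08-01
'''

PRICE_CHANGE_CSV = '''symbol,1D,5D,1M,3M,6M,ytd,1Y,3Y,5Y,10Y,max
AAPL,-1.7254,-8.30086,-6.20411,3.042,15.64824,42.99992,8.47941,60.96299,245.42031,976.99441,139245.53954
ABNB,2.1617,-2.21919,9.88336,19.43286,19.64241,68.66902,23.64013,-1.04347,-1.04347,-1.04347,-1.04347
'''

# Reading cik as a string keeps the leading zeros of the identifier.
companies = pd.read_csv(io.StringIO(COMPANIES_CSV), dtype={"cik": str})
price_changes = pd.read_csv(io.StringIO(PRICE_CHANGE_CSV))

# Attach the year-to-date change to each company by ticker symbol.
merged = companies.merge(price_changes[["symbol", "ytd"]], on="symbol", how="left")
print(merged[["symbol", "name", "ytd"]])
```

A left join keeps every company even if a symbol is missing from the price-change file, in which case its ytd value is NaN.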
Before you start
In order to complete the project you will need to create a developer account with OpenAI and store your API key as an environment variable. Instructions for these steps are outlined below.
Create a developer account with OpenAI
- Go to the API signup page.
- Create your account (you'll need to provide your email address and your phone number).
- Go to the API keys page.
- Create a new secret key.
- Take a copy of it. (If you lose it, delete the key and create a new one.)
Add a payment method
OpenAI sometimes provides free credits for the API, but it's not clear if that is worldwide or what the conditions are. You may need to add debit/credit card details.
The API costs $0.002 per 1,000 tokens for GPT-3.5-turbo, and 1,000 tokens is about 750 words. This project should cost less than 1 US cent (but if you rerun tasks, you will be charged every time).
- Go to the Payment Methods page.
- Click Add payment method.
- Fill in your card details.
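The pricing quoted above can be turned into a quick back-of-the-envelope estimate. The sketch below only restates the two figures given in this section ($0.002 per 1,000 tokens and roughly 750 words per 1,000 tokens); the function names are illustrative, and actual billing depends on OpenAI's current price list and the real token count.

```python
PRICE_PER_1K_TOKENS_USD = 0.002  # rate quoted above for GPT-3.5-turbo
WORDS_PER_1K_TOKENS = 750        # rule of thumb quoted above


def estimate_tokens(word_count: float) -> float:
    """Approximate the number of tokens for a given number of words."""
    return word_count * 1000 / WORDS_PER_1K_TOKENS


def estimate_cost_usd(token_count: float) -> float:
    """Approximate the cost in US dollars for a given number of tokens."""
    return token_count / 1000 * PRICE_PER_1K_TOKENS_USD


tokens = estimate_tokens(1500)
cost = estimate_cost_usd(tokens)
print(f"{tokens:.0f} tokens cost about ${cost:.4f}")
```

Both the prompt and the reply count towards the total, so the word count passed in should cover the two together.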
Add an environment variable with your OpenAI key
- In DataLab, click on "Environment" in the menu.
- Click on "Environment variables" to add environment variables.
- In the "Name" field, type "OPENAI_API_KEY". In the "Value" field, paste in your secret key.
- Click "Create", and follow the instructions to copy the environment variable for use via the os library.
See this article for further guidance.
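Once the variable is set, it can be read through the os library. The helper below is a minimal sketch under that assumption; load_api_key is a hypothetical name introduced here, not part of DataLab or the OpenAI library.

```python
import os


def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored in the given environment mapping.

    Raises a RuntimeError with a readable message if the variable is
    missing or empty, instead of failing later inside an API call.
    """
    key = env.get(name)
    if not key:
        raise RuntimeError(f"Environment variable {name} is not set")
    return key
```

The key could then be passed to the client as openai.OpenAI(api_key=load_api_key()), which keeps the secret out of the notebook itself.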
# Start your code here!
import os
import datetime
import json
import re

import openai
import pandas as pd
import yfinance as yf

# Read the secret key from the environment variable created above;
# never paste the key itself into the notebook.
client = openai.OpenAI(
    api_key=os.environ["OPENAI_API_KEY"]
)
# Load the Nasdaq-100 constituents list and symbols
nasdaq100 = pd.read_csv('nasdaq100.csv')
symbols = nasdaq100['symbol'].tolist()
# Define date range for current year
today = datetime.date.today()
start_of_year = datetime.date(today.year, 1, 1)
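# Download daily prices for all symbols.
# auto_adjust=False keeps the separate 'Adj Close' column, which newer
# yfinance releases otherwise drop by default.
data = yf.download(symbols, start=start_of_year, end=today, auto_adjust=False)
adj_close = data['Adj Close']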
# Compute YTD performance (percent change from first to last price) for each symbol
ytd_perf = {}
for sym in symbols:
    series = adj_close.get(sym)
    if series is not None:
        series = series.dropna()
        if not series.empty:
            ytd_perf[sym] = ((series.iloc[-1] - series.iloc[0]) / series.iloc[0]) * 100
        else:
            ytd_perf[sym] = None
    else:
        ytd_perf[sym] = None
# Map YTD performance into DataFrame
nasdaq100['ytd'] = nasdaq100['symbol'].map(ytd_perf)
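The per-symbol calculation above is the percentage change between the first and last available prices of the year. The sketch below isolates that formula in a small function and runs it on a made-up price series; ytd_return_pct and the example values are illustrative only.

```python
import pandas as pd


def ytd_return_pct(prices: pd.Series):
    """Percent change from the first to the last non-missing price.

    Returns None when the series has no usable prices.
    """
    clean = prices.dropna()
    if clean.empty:
        return None
    first, last = clean.iloc[0], clean.iloc[-1]
    return float((last - first) / first * 100)


# Made-up prices; the leading NaN mimics a symbol with no data on the first day.
example = pd.Series([float("nan"), 50.0, 60.0, 75.0])
print(ytd_return_pct(example))  # 50.0
```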
# Classify stocks into sectors using OpenAI API
tickers = symbols
sector_prompt = (
    "Classify each of the following Nasdaq-100 tickers into one of: Technology, Consumer Cyclical, Industrials, Utilities, Healthcare, Communication, Energy, Consumer Defensive, Real Estate, or Financial. "
    "Respond with a JSON object mapping each ticker to its sector.\n" + json.dumps(tickers)
)
sector_resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": sector_prompt}]
)
sector_content = sector_resp.choices[0].message.content
try:
    sector_mapping = json.loads(sector_content)
except json.JSONDecodeError:
    # Fall back to the first {...} span if the reply contains text around the JSON
    obj = re.search(r"\{.*?\}", sector_content, re.DOTALL)
    sector_mapping = json.loads(obj.group(0)) if obj else {}
nasdaq100['sector'] = nasdaq100['symbol'].map(sector_mapping)
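Model replies are not guaranteed to be clean JSON or to stay within the requested sector list, so it may be worth validating them before mapping. The sketch below is one possible approach, not the method used above: parse_sector_mapping is a hypothetical helper, it uses a greedy pattern to capture the outermost braces, and it silently drops any sector that is not in the list from the prompt.

```python
import json
import re

# Sector names copied from the classification prompt above.
ALLOWED_SECTORS = {
    "Technology", "Consumer Cyclical", "Industrials", "Utilities",
    "Healthcare", "Communication", "Energy", "Consumer Defensive",
    "Real Estate", "Financial",
}


def parse_sector_mapping(text, allowed=ALLOWED_SECTORS):
    """Extract a ticker-to-sector dict from a model reply.

    Returns an empty dict when no JSON object can be recovered, and
    drops entries whose sector is not one of the allowed names.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Greedy match: from the first '{' to the last '}' in the reply.
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match is None:
            return {}
        try:
            data = json.loads(match.group(0))
        except json.JSONDecodeError:
            return {}
    if not isinstance(data, dict):
        return {}
    return {
        ticker: sector
        for ticker, sector in data.items()
        if isinstance(sector, str) and sector in allowed
    }


reply = 'Here you go:\n{"AAPL": "Technology", "ABNB": "Travel"}\nHope that helps.'
print(parse_sector_mapping(reply))  # {'AAPL': 'Technology'}
```

Tickers dropped this way end up with a missing sector after the map step, which makes unclassified companies easy to spot and retry.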
# Provide summary recommendations YTD via OpenAI API and store as string
summary_prompt = (
    "Here is the YTD performance of Nasdaq-100 stocks by sector (JSON array of records):\n" +
    json.dumps(nasdaq100.to_dict(orient='records')) +
    "\nPlease recommend the three best sectors and at least three top-performing companies per sector in a python dataframe."
)
summary_resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": summary_prompt}]
)
# Store the model's reply. Note that this is a plain string containing
# the model's text, not a pandas DataFrame.
stock_recommendations = summary_resp.choices[0].message.content
# Display the stock recommendations
stock_recommendations
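Because the model's reply is free text, it can be useful to compute the same ranking locally and compare. The sketch below is one way to do that with pandas, assuming a table with symbol, sector and ytd columns like the one built above. The top_sectors helper is hypothetical, the ytd values are the sample figures from nasdaq100_price_change.csv, and the sector labels in the demo are placeholders chosen for illustration rather than model output.

```python
import pandas as pd


def top_sectors(df, n_sectors=3, n_companies=3):
    """Return the best companies from the best sectors by YTD change.

    Sectors are ranked by their mean ytd; within each selected sector,
    companies are ranked by their own ytd in descending order.
    """
    ranked = df.dropna(subset=["sector", "ytd"])
    best = ranked.groupby("sector")["ytd"].mean().nlargest(n_sectors).index
    return (
        ranked[ranked["sector"].isin(best)]
        .sort_values(["sector", "ytd"], ascending=[True, False])
        .groupby("sector")
        .head(n_companies)
        .reset_index(drop=True)
    )


# ytd values from the sample rows; sector labels are illustrative placeholders.
demo = pd.DataFrame({
    "symbol": ["AAPL", "ABNB", "ADBE", "ADI"],
    "sector": ["Technology", "Consumer Cyclical", "Technology", "Technology"],
    "ytd": [42.99992, 68.66902, 57.22723, 17.02062],
})
print(top_sectors(demo, n_sectors=2, n_companies=2))
```

Ranking sectors by mean ytd is only one choice; a median or a market-cap-weighted average would give different results for sectors with a few outliers.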