Prompt Engineering with GPT and LangChain
LangChain is a framework that is extremely helpful for prompt engineering and for integrating generative AI capabilities into applications and data platforms. It has many capabilities, some of which will not be introduced until later modules, but we will start with a gentle introduction to some of its easier-to-understand concepts.
You'll build an AI agent that uses Python and GPT to perform sentiment analysis on financial headlines.
In more detail, you'll cover:
- Getting set up with an OpenAI developer account and integration with Workspace.
- Interacting with OpenAI models through the langchain framework.
- Using prompt templates to write reusable, dynamic prompts.
- Working with LLM chains.
- Automatically parsing the output of an LLM to be used downstream.
- Working with langchain agents and tools.
- Using the OpenAI Moderation API to filter explicit content.
For this project, we are using two small samples: `financial_headlines.txt` and `reddit_comments.txt`. These 5-6 line samples are kept short to make evaluation easy, but keep in mind that the same code and prompt engineering techniques can scale to much larger datasets.
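As a sketch of how such a newline-delimited sample file could be read into a list of strings (the headlines below are hypothetical placeholders, not the actual file contents):

```python
from io import StringIO

# Stand-in for open("financial_headlines.txt") -- these example
# headlines are made up, not the actual file contents.
sample_file = StringIO(
    "Tech stocks rally as earnings beat expectations\n"
    "Oil prices slide on demand concerns\n"
    "Central bank holds interest rates steady\n"
)

# Read one headline per line, dropping trailing newlines and blank lines.
headlines = [line.strip() for line in sample_file if line.strip()]
print(headlines)
```

The same list-of-strings shape is what we will feed into our prompts later.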
Before you begin
You'll need a developer account with OpenAI.
See getting-started.ipynb for steps on how to create an API key and store it in Workspace. In particular, you'll need to follow the instructions in the "Getting started with OpenAI" and "Setting up Workspace Integrations" sections.
Task 0: Setup
We need to install a few packages, one of which is the `langchain` package. It is currently being developed quickly, sometimes with breaking changes, so we pin its version. `langchain` depends on a recent version of `typing_extensions`, so we need to update that package too, again pinning the version.
Instructions
Run the following code to install `openai`, `langchain`, `typing_extensions`, and `pandas`.
# Install openai.
!pip install openai==0.28.0
# Install langchain.
!pip install langchain==0.0.293
# Install typing-extensions.
!pip install typing-extensions==4.8.0
# Install pandas.
!pip install pandas==2.0.3
For this project, we first need to import the `os` and `openai` packages to set the API key from the environment variable you just created.
Instructions
- Import the `os` package.
- Import the `openai` package.
- Set `openai.api_key` to the `OPENAI_API_KEY` environment variable.
# Import the os package.
import os
# Import the openai package.
import openai
# Set openai.api_key to the OPENAI_API_KEY environment variable.
# Use os.getenv to avoid KeyError if the environment variable is not set.
openai.api_key = os.getenv("OPENAI_API_KEY")
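Note that `os.getenv` silently returns `None` when the variable is missing, and the failure only surfaces later as an authentication error. As a sketch, a small helper (`load_api_key` is our own name, not part of the `openai` package) can fail fast with a clearer message:

```python
import os

def load_api_key(var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = os.getenv(var_name)
    if key is None:
        raise RuntimeError(
            f"{var_name} is not set; check your Workspace integration."
        )
    return key
```

You could then write `openai.api_key = load_api_key()` instead of the bare `os.getenv` call.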
For the `langchain` package, let's start by importing its `OpenAI` and `ChatOpenAI` classes, which are used to interact with completion models and chat completion models, respectively.
Completion models, such as GPT-1, GPT-2, GPT-3, and GPT-3.5, work as advanced autocomplete models. Given a snippet of text as input, they complete the text up to a stopping point: an end-of-sequence token (a natural way of stopping), the model's maximum output token limit, and so on.
Chat completion models, such as GPT-3.5-Turbo (the ChatGPT model) and GPT-4, are designed for conversational use. These models are typically further fine-tuned for conversations, keep a prompt/conversation history, and allow access to a system message, which we can use as a meta prompt to define a role, a tone of voice, a scope, etc.
Completion models and chat completion models tend to work with different classes and functions in the SDK. For that reason, we will start by importing both classes.
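Before touching either class, the difference in input shape can be sketched with plain Python (no API calls; the example texts are our own, not from the project files):

```python
# A completion model takes a single text string to continue:
completion_prompt = "The sentiment of 'Stocks surge on strong earnings' is"

# A chat completion model takes a list of role-tagged messages,
# including an optional system message that sets the model's role:
chat_messages = [
    {"role": "system", "content": "You are a financial sentiment analyst."},
    {"role": "user", "content": "Classify: 'Stocks surge on strong earnings'"},
]
```

This is why the two model families are served by separate classes: one wraps a string-in/string-out interface, the other a messages-in/message-out interface.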