
Getting Started With OpenAI Structured Outputs

Learn how to get started with OpenAI Structured Outputs, understand its new syntax, and explore its key applications.
Sep 11, 2024  · 9 min read

In August 2024, OpenAI announced a powerful new feature in their API — Structured Outputs. With this feature, as the name suggests, you can ensure LLMs will generate responses only in the format you specify. This capability will make it significantly easier to build applications that require precise data formatting. 

In this tutorial, you will learn how to get started with OpenAI Structured Outputs, understand its new syntax, and explore its key applications.


Importance of Structured Outputs in AI Applications

Consistently formatted responses are crucial for many tasks, such as data entry, information retrieval, question answering, and multi-step workflows. You may have experienced how LLMs can generate outputs in wildly different formats, even when the prompt is the same.

For example, consider this simple classify_sentiment function powered by GPT-4o:
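A minimal version of such a function might look like the following sketch (the exact system message and prompt wording are assumptions):

from openai import OpenAI

client = OpenAI()

def classify_sentiment(review):
    """Classify a hotel review's sentiment using plain prompting, with no format constraints."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a sentiment classifier assistant."},
            {
                "role": "user",
                "content": f"Classify the sentiment of the following hotel review as positive, negative, or neutral:\n\n{review}",
            },
        ],
    )
    return response.choices[0].message.content

We then run it over a few sample reviews: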

# List of hotel reviews
reviews = [
   "The room was clean and the staff was friendly.",
   "The location was terrible and the service was slow.",
   "The food was amazing but the room was too small.",
]
# Classify sentiment for each review and print the results
for review in reviews:
   sentiment = classify_sentiment(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

Output:

Review: The room was clean and the staff was friendly.
Sentiment: Positive
Review: The location was terrible and the service was slow.
Sentiment: Negative
Review: The food was amazing but the room was too small.
Sentiment: The sentiment of the review is neutral.

Even though the first two responses came in the same single-word format, the last one is an entire sentence. A downstream application that depended on this output and expected a single-word response would break.

We can fix this problem with some prompt engineering, but it is a time-consuming, iterative process. Even with a perfect prompt, we can’t be 100% sure the responses will conform to our format in future requests. Unless, of course, we use Structured Outputs:

def classify_sentiment_with_structured_outputs(review):
   """Sentiment classifier with Structured Outputs"""
   ...  # full implementation shown later in this tutorial
# Classify sentiment for each review with Structured Outputs
for review in reviews:
   sentiment = classify_sentiment_with_structured_outputs(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

Output:

Review: The room was clean and the staff was friendly.
Sentiment: {"sentiment":"positive"}
Review: The location was terrible and the service was slow.
Sentiment: {"sentiment":"negative"}
Review: The food was amazing but the room was too small.
Sentiment: {"sentiment":"neutral"}

With the new function, classify_sentiment_with_structured_outputs, the responses are all in the same format.
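
Because the shape is now guaranteed, downstream code can parse these responses without defensive checks (a minimal sketch using one of the strings shown above):

import json

raw = '{"sentiment":"positive"}'       # one of the responses printed above
label = json.loads(raw)["sentiment"]   # always "positive", "negative", or "neutral"
print(label)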

This ability to force language models into a rigid output format is significant, saving you countless hours of prompt engineering and reducing reliance on other open-source tools.

Getting Started With OpenAI Structured Outputs

In this section, we will break down structured outputs using the example of the sentiment analyzer function.

Setting Up Your Environment

Prerequisites

Before you begin, ensure you have the following:

  • Python 3.7 or later installed on your system.
  • An OpenAI API key. You can obtain this by signing up on the OpenAI website.

Setting Up the OpenAI API

1. Install the OpenAI Python package: Open your terminal and run the following command to install or update the OpenAI Python package to the latest version:

$ pip install -U openai

2. Set up your API key: You can set your API key as an environment variable or directly in your code. To set it as an environment variable, run:

$ export OPENAI_API_KEY='your-api-key'
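
Alternatively, you can pass the key directly when constructing the client (shown here with a placeholder value; prefer the environment variable for anything beyond quick experiments):

from openai import OpenAI

client = OpenAI(api_key="your-api-key")  # placeholder; do not hard-code real keys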

3. Verify the installation: Create a simple Python script to verify the installation:

from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
   model="gpt-4o-mini",
   messages=[
       {"role": "system", "content": "You are a helpful assistant."},
       {"role": "user", "content": "Say hello!"}
   ],
   max_tokens=5
)
>>> print(response.choices[0].message.content.strip())
Hello! How can I

Run the script to ensure everything is set up correctly. You should see the model’s response printed in the terminal.

In addition to the OpenAI package, you will need the Pydantic library to define and validate JSON schemas for Structured Outputs. Install it using pip:

$ pip install pydantic

With these steps, your environment is now set up to use OpenAI’s Structured Outputs feature.

Defining an Output Schema Using Pydantic

To use Structured Outputs, you need to define the expected output structure using Pydantic models. Pydantic is a data validation and settings management library for Python that allows you to define data models using Python type annotations. These models can then be used to enforce the structure of the outputs generated by OpenAI’s models.

Here is an example Pydantic model for specifying the format for our review sentiment classifier:

from pydantic import BaseModel
from typing import Literal
class SentimentResponse(BaseModel):
   sentiment: Literal["positive", "negative", "neutral"]

In this example:

  • SentimentResponse is a Pydantic model that defines the expected structure of the output.
  • The model has a single field, sentiment, which can only take one of three literal values: "positive", "negative", or "neutral".
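
Before wiring this into an API call, you can check what the model accepts on its own (a quick sketch using plain Pydantic; the invalid value is made up for illustration):

from pydantic import ValidationError

valid = SentimentResponse(sentiment="positive")   # accepted
try:
    SentimentResponse(sentiment="meh")            # not one of the allowed literals
except ValidationError as err:
    print("rejected:", err.errors()[0]["msg"])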

When we pass this model as part of our OpenAI API requests, the sentiment field of every response will contain exactly one of those values.

Let’s see how.

Using the parse Helper

To enforce our Pydantic schema in OpenAI requests, all we have to do is pass it to the response_format parameter of the chat completions API. Roughly, here is what it looks like:

response = client.beta.chat.completions.parse(
   model=MODEL,
   messages=[...],
   response_format=SentimentResponse
)

Notice that instead of client.chat.completions.create, we are using the client.beta.chat.completions.parse method. .parse() is a new method in the Chat Completions API written specifically for Structured Outputs.

Now, let’s put everything together by rewriting the reviews sentiment classifier with Structured Outputs. First, we make the necessary imports, define the Pydantic model, the system prompt, and a prompt template:

from openai import OpenAI
from pydantic import BaseModel
from typing import Literal
class SentimentResponse(BaseModel):
   sentiment: Literal["positive", "negative", "neutral"]
client = OpenAI()
MODEL = "gpt-4o-mini"
SYSTEM_PROMPT = "You are a sentiment classifier assistant."
PROMPT_TEMPLATE = """
   Classify the sentiment of the following hotel review as positive, negative, or neutral:\n\n{review}
"""

Then, we write a new function that uses the .parse() helper method:

# Function to classify sentiment using OpenAI's chat completions API with structured outputs
def classify_sentiment_with_structured_outputs(review):
   response = client.beta.chat.completions.parse(
       model=MODEL,
       messages=[
           {"role": "system", "content": SYSTEM_PROMPT},
           {"role": "user", "content": PROMPT_TEMPLATE.format(review=review)},
       ],
       response_format=SentimentResponse
   )
   return response.choices[0].message

The important line in the function is response_format=SentimentResponse, which is what actually enables Structured Outputs.

Let’s test it on one of the reviews:

# List of hotel reviews
reviews = [
   "The room was clean and the staff was friendly.",
   "The location was terrible and the service was slow.",
   "The food was amazing but the room was too small.",
]
result = classify_sentiment_with_structured_outputs(reviews[0])
>>> print(result.content)
{"sentiment":"positive"}

Here, result is a message object:

>>> type(result)
openai.types.chat.parsed_chat_completion.ParsedChatCompletionMessage[SentimentResponse]

Apart from its .content attribute, which retrieves the response text, it has a .parsed attribute that returns the parsed information as an instance of our Pydantic model:

>>> result.parsed
SentimentResponse(sentiment='positive')

As you can see, we get an instance of the SentimentResponse class. This means we can access the sentiment as a string, via the .sentiment attribute, instead of indexing into a dictionary:

>>> result.parsed.sentiment
'positive'

Nesting Pydantic Models for Defining Complex Schemas

In some cases, you may need to define more complex output structures that involve nested data. Pydantic allows you to nest models within each other, enabling you to create intricate schemas that can handle a variety of use cases. This is particularly useful when dealing with hierarchical data or when you need to enforce a specific structure for complex outputs.

Let’s consider an example where we need to extract detailed user information, including their name, contact details, and a list of addresses. Each address should include fields for the street, city, state, and zip code. This requires more than one Pydantic model to build the correct schema.

Step 1: Define the Pydantic models

First, we define the Pydantic models for the address and user information:

from pydantic import BaseModel
from typing import List
# Define the Pydantic model for an address
class Address(BaseModel):
   street: str
   city: str
   state: str
   zip_code: str
# Define the Pydantic model for user information
class UserInfo(BaseModel):
   name: str
   email: str
   phone: str
   addresses: List[Address]

In this example:

  • Address is a Pydantic model that defines the structure of an address.
  • UserInfo is a Pydantic model that includes a list of Address objects, along with fields for the user's name, email, and phone number.

Step 2: Use the nested Pydantic models in API calls

Next, we use these nested Pydantic models to enforce the output structure in an OpenAI API call:

SYSTEM_PROMPT = "You are a user information extraction assistant."
PROMPT_TEMPLATE = """ Extract the user information from the following text:\n\n{text}"""
# Function to extract user information using OpenAI's chat completions API with structured outputs
def extract_user_info(text):
   response = client.beta.chat.completions.parse(
       model=MODEL,
       messages=[
           {"role": "system", "content": SYSTEM_PROMPT},
           {"role": "user", "content": PROMPT_TEMPLATE.format(text=text)},
       ],
       response_format=UserInfo
   )
   return response.choices[0].message
# Example text containing user information
text = """John DoeEmail: john.doe@example.comPhone: 123-456-7890Addresses:- 123 Main St, Springfield, IL, 62701- 456 Elm St, Shelbyville, IL, 62702"""
# Extract user information and print the results
user_info = extract_user_info(text)

The sample text is totally unreadable and lacks spaces between key pieces of information. Let’s see if the model succeeds. We will use the json library to prettify the response:

import json
data = json.loads(user_info.content)
pretty_response = json.dumps(data, indent=2)
print(pretty_response)

Output:

{
 "name": "John Doe",
 "email": "john.doe@example.com",
 "phone": "123-456-7890",
 "addresses": [
   {
     "street": "123 Main St",
     "city": "Springfield",
     "state": "IL",
     "zip_code": "62701"
   },
   {
     "street": "456 Elm St",
     "city": "Shelbyville",
     "state": "IL",
     "zip_code": "62702"
   }
 ]
}

As you can see, the model correctly captured a single user’s information along with their two separate addresses based on our provided schema.
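
Since we used the .parse() helper here as well, the .parsed attribute gives us typed access to the nested data without any manual JSON handling (a brief sketch, continuing from the user_info result above):

user = user_info.parsed            # a UserInfo instance
print(user.name)                   # John Doe
print(user.addresses[0].city)      # Springfield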

In short, by nesting Pydantic models, you can define complex schemas that handle hierarchical data and enforce specific structures for intricate outputs.

Function Calling with Structured Outputs

One of the most widely used features of newer language models is function calling (also called tool calling). This capability allows you to connect language models to user-defined functions, effectively giving the models access to the outside world.

Some common examples are:

  • Retrieving real-time data (e.g., weather, stock prices, sports scores)
  • Performing calculations or data analysis
  • Querying databases or APIs
  • Generating images or other media
  • Translating text between languages
  • Controlling smart home devices or IoT systems
  • Executing custom business logic or workflows

We won’t go into the details of how function calling works here, but you can read our OpenAI Function Calling tutorial.

What’s important to know is that Structured Outputs makes function calling with OpenAI models much easier. In the past, the functions you passed to OpenAI models required writing complex JSON schemas that spell out every function parameter and its type. Here is an example:

{
   "type": "function",
   "function": {
       "name": "get_current_weather",
       "description": "Get the current weather",
       "parameters": {
           "type": "object",
           "properties": {
               "location": {
                   "type": "string",
                   "description": "The city and state, e.g. San Francisco, CA",
               },
               "format": {
                   "type": "string",
                   "enum": ["celsius", "fahrenheit"],
                   "description": "The temperature unit to use. Infer this from the users location.",
               },
           },
           "required": ["location", "format"],
       },
   }
}

Even though the get_current_weather function has only two parameters, its JSON schema quickly becomes large and is error-prone to write manually.

This is solved in Structured Outputs by using Pydantic models again:

from pydantic import BaseModel
from typing import Literal
def get_weather(location: str, unit: str, condition: str):
   # Implementation details...
   pass
class WeatherData(BaseModel):
   location: str
   unit: Literal["celsius", "fahrenheit"]
   condition: Literal["sunny", "cloudy", "rainy", "snowy"]

First, you write the function itself and its logic. Then, you describe its expected input parameters with a Pydantic model.

Then, to convert the Pydantic model into a compatible JSON schema, you call pydantic_function_tool:

>>> import openai
>>> openai.pydantic_function_tool(WeatherData)
{'type': 'function',
'function': {'name': 'WeatherData',
 'strict': True,
 'parameters': {'properties': {'location': {'title': 'Location',
    'type': 'string'},
   'unit': {'enum': ['celsius', 'fahrenheit'],
    'title': 'Unit',
    'type': 'string'},
   'condition': {'enum': ['sunny', 'cloudy', 'rainy', 'snowy'],
    'title': 'Condition',
    'type': 'string'}},
  'required': ['location', 'unit', 'condition'],
  'title': 'WeatherData',
  'type': 'object',
  'additionalProperties': False}}}

Here is how to use this tool as part of a request:

import openai
from openai import OpenAI
client = OpenAI()
tools = [openai.pydantic_function_tool(WeatherData)]
messages = [
   {
       "role": "system",
       "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user.",
   },
   {
       "role": "user",
       "content": "What is the weather in Tokyo?",
   }
]
response = client.chat.completions.create(
   model=MODEL, messages=messages, tools=tools
)
tool_call = response.choices[0].message.tool_calls[0]
>>> tool_call
ChatCompletionMessageToolCall(id='call_QnZZ0DmNN2cxw3bN433JQNIC', function=Function(arguments='{"location":"Tokyo","unit":"celsius","condition":"sunny"}', name='WeatherData'), type='function')

We pass the Pydantic model in a compatible JSON format to the tools parameter of the Chat Completions API. Then, depending on our query, the model decides whether to call the tool or not.

Since our query in the above example is “What is the weather in Tokyo?”, we see a call in the tool_calls of the returned message object.
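
Because the model may also answer directly without calling any tool, it is safer to check tool_calls before indexing into it (a minimal sketch):

message = response.choices[0].message
if message.tool_calls:
    tool_call = message.tool_calls[0]   # the model chose to call the WeatherData tool
else:
    print(message.content)              # the model answered in plain text instead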

Remember, the model doesn’t call the get_weather function but generates arguments for it based on the Pydantic schema we provided:

arguments = json.loads(tool_call.function.arguments)
>>> arguments
{'location': 'Tokyo', 'unit': 'celsius', 'condition': 'sunny'}

It is up to us to call the function with the provided arguments:

some_result = get_weather(**arguments)
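
If you then want the model to phrase a final answer using that result, a common pattern is to append the tool result to the conversation and call the model again (a sketch; because get_weather above is a stub, its return value is just a placeholder here):

import json

# Append the assistant's tool call and our tool result to the conversation
messages.append(response.choices[0].message)
messages.append(
    {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(some_result),  # None for the stub, so this sends "null"
    }
)

# Ask the model for a final, natural-language answer
follow_up = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
print(follow_up.choices[0].message.content)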

If you want the model to generate the arguments for the function and call it at the same time, you are looking for an AI agent. 

We have a separate LangChain Agents tutorial if you are interested.

Best Practices When Using OpenAI Structured Outputs

While using Structured Outputs, there are a number of best practices and recommendations to keep in mind. In this section, we will outline some of them.

  1. Use Pydantic models to define output schemas, as they provide a clean and type-safe way to define expected output structures.
  2. Keep schemas simple and specific to get the most accurate results.
  3. Use appropriate data types (str, int, float, bool, List, Dict) to accurately represent your data.
  4. Use Literal types for enums to define specific allowed values for fields.
  5. Handle model refusals. When using the new .parse() method, the message objects have a new .refusal attribute to indicate a refusal:
text = """John DoeEmail: john.doe@example.comPhone: 123-456-7890Addresses:- 123 Main St, Springfield, IL, 62701- 456 Elm St, Shelbyville, IL, 62702"""
user_info = extract_user_info(text)
if user_info.refusal:
   print(user_info.refusal)
else:
   print(user_info.content)

Output:

{"name":"John Doe","email":"john.doe@example.com","phone":"123-456-7890","addresses":[{"street":"123 Main St","city":"Springfield","state":"IL","zip_code":"62701"},{"street":"456 Elm St","city":"Shelbyville","state":"IL","zip_code":"62702"}]}

  6. Provide clear and concise descriptions for each field in your Pydantic models to improve the precision of the model's outputs:

from pydantic import BaseModel, Field
class Person(BaseModel):
   name: str = Field(..., description="The person's full name")
   age: int = Field(..., description="The person's age in years")
   occupation: str = Field(..., description="The person's current job or profession")

These practices will go a long way in making the most effective use of Structured Outputs in your applications.

Conclusion

In this tutorial, we have learned how to get started with a new OpenAI API feature: Structured Outputs. We have seen how this feature forces language models to produce outputs in the format we specify. We have learned how to use it in combination with function calling and explored some best practices to make the most of the feature.


Structured Outputs FAQs

How do Pydantic models work with Structured Outputs?

Pydantic models are used to define the schema for the desired output structure, which is then passed to the OpenAI API to enforce the response format.

Can Structured Outputs be used with function calling?

Yes, Structured Outputs can be used with function calling to simplify the process of defining function parameters and expected outputs.

What are the benefits of using Structured Outputs?

Benefits include consistent response formats, reduced need for post-processing, improved reliability in AI applications, and easier integration with existing systems.

Are there any limitations to using Structured Outputs?

While powerful, Structured Outputs may limit the AI's flexibility in responses and require careful schema design to balance structure with the desired level of detail in outputs.


