
OpenAI Announces GPT-4 Turbo With Vision: What We Know So Far

Discover the latest update from OpenAI, GPT-4 Turbo with vision, and its key features, including improved knowledge cutoff, an expanded context window, budget-friendly pricing, and more.
Updated Nov 2023  · 7 min read

At the recent OpenAI DevDay, the organization made a much-anticipated announcement: the introduction of GPT-4 Turbo, an improvement on its groundbreaking GPT-4 model. Here, we take a comprehensive look at what GPT-4 Turbo is, its key features, and how it can benefit developers and users.

In separate articles, you can read more about GPTs and the ChatGPT Store, and the Assistants API, which were also announced at DevDay.

What is GPT-4 Turbo?

GPT-4 Turbo is an update to the existing GPT-4 large language model. It brings several improvements, including a greatly increased context window and access to more up-to-date knowledge. OpenAI has gradually been improving the capabilities of GPT-4 in ChatGPT with the addition of custom instructions, ChatGPT plugins, DALL-E 3, and Advanced Data Analysis. This latest update brings a host of exciting new features.

What is GPT-4 Turbo With Vision?

GPT-4 Turbo with vision is a variant of GPT-4 Turbo that can interpret images, including an optical character recognition (OCR) capability. That is, you can provide it with an image, and it can return any text contained in the image. For example, you can input a photo of a menu, and it will return the food choices written in that photo. Likewise, you can provide a photo of an invoice and automatically extract the vendor name and item details.

The "with vision" features will be available in ChatGPT by default, and available to developers by selecting a "gpt-4-vision" model in the OpenAI API.

GPT-4 Turbo Key Features

GPT-4 Turbo has several improvements from previous models, enhancing its capabilities. Here are some key features that make it stand out:

Improved knowledge cutoff

Sam Altman promises to make sure ChatGPT stays up-to-date

The existing versions of GPT-3.5 and GPT-4 had a knowledge cutoff of September 2021. That means they cannot answer questions about real-world events that happened after that time, unless given access to external data sources.

GPT-4 Turbo extends the knowledge cutoff by nineteen months to April 2023. This means that GPT-4 Turbo has access to information and events up to that date, making it a more informed and reliable source of information. Furthermore, OpenAI’s CEO, Sam Altman, promised that "[OpenAI] will try to never let [GPT] get that out of date again."

128K context window

The context window of a large language model (LLM) is a measure of how long its memory of the conversation lasts. If a model has a context window of 4,000 tokens (about 3,000 words), then everything in the chat beyond 4,000 tokens ago is ignored, and the responses can become less accurate or even contradictory with previous responses. This is a problem for working with longer documents, or for chatbots holding extended conversations.

GPT-4 has a maximum context length of 32k (32,000) tokens. GPT-4 Turbo increases this to 128k tokens (about 240 pages at 400 words per page). This exceeds the 100k maximum context of Anthropic's Claude 2 model and brings it in line with Nous Research's YARN-MISTRAL-7b-128k model.
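As a rough check on whether a document fits in the window, you can count its tokens with OpenAI's tiktoken library. Here's a quick sketch; the file name is a hypothetical placeholder, and the preview models use the same cl100k_base encoding as GPT-4.

```python
import tiktoken

# GPT-4 models use the cl100k_base encoding
enc = tiktoken.encoding_for_model("gpt-4")

text = open("long_report.txt").read()  # hypothetical document
n_tokens = len(enc.encode(text))

print(f"{n_tokens} tokens")
print("Fits in GPT-4 Turbo's 128k window:", n_tokens <= 128_000)
```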

It remains to be seen whether the longer context window yields consistently accurate responses across the whole window. Recent research from Stanford University showed that existing long-context models could only provide accurate responses when retrieving information from near the start or end of the document.

It's also worth noting that 128k appears to be merely a stepping stone toward the dream of "infinite context." Early-stage research from Microsoft and Xi'an Jiaotong University aims to scale LLMs to a billion tokens of context.

GPT goes on sale

OpenAI has responded to increased competition in the LLM market by pricing GPT-4 Turbo at budget-friendly rates for developers. When using the OpenAI API, GPT-4 Turbo input tokens cost one-third of the GPT-4 price, down from 3 US cents to 1 US cent per 1000 tokens. Output tokens are half price, down from 6 US cents to 3 US cents per 1000 tokens.

The same trend continues with GPT-3.5 Turbo models, offering 3x cheaper input tokens at 0.1 US cents per 1000 tokens and 2x cheaper output tokens at 0.2 US cents per 1000 tokens.

Additionally, fine-tuned GPT-3.5 Turbo 4K model input tokens are now 4x more affordable, with the price dropping from 1.2 US cents to 0.3 US cents per 1000 tokens, and output tokens are 2.7x cheaper, dropping from 1.6 US cents to 0.6 US cents per 1000 tokens. The training price remains the same at 0.8 US cents per 1000 tokens.
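To put these numbers in context, here's a quick sketch that estimates a request's cost from the prices quoted above (confirm current rates on OpenAI's pricing page before relying on them):

```python
# Prices per 1,000 tokens, in US dollars, as quoted above
PRICES = {
    "gpt-4-turbo": {"input": 0.01, "output": 0.03},
    "gpt-4": {"input": 0.03, "output": 0.06},
    "gpt-3.5-turbo": {"input": 0.001, "output": 0.002},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the US-dollar cost of a single API request."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1000

# Example: a 10,000-token prompt with a 1,000-token reply
print(f"GPT-4 Turbo: ${estimate_cost('gpt-4-turbo', 10_000, 1_000):.2f}")  # $0.13
print(f"GPT-4:       ${estimate_cost('gpt-4', 10_000, 1_000):.2f}")        # $0.36
```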

These price adjustments aim to make advanced AI models more cost-effective for developers.

GPT goes multi-modal: image prompts & text-to-speech

"GPT-4 Turbo with vision" was announced as coming soon. You will soon be able to prompt GPT-4 Turbo using images as prompts by inputting them directly in the chat box. The tool will then be able to generate captions or provide a description of what the image depicts. It will also handle text-to-speech requests.

Function calling updates

Function calling is a feature for developers incorporating generative AI into their applications. It enables them to describe functions of their app or external APIs to GPT-4 Turbo. With the ability to call multiple functions in a single message, this feature streamlines interaction with the model. For example, a user can send a single message requesting multiple actions, eliminating the need for repeated back-and-forth exchanges with the model.
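Here's a minimal sketch of what this looks like with the v1 OpenAI Python SDK. The get_weather function schema is a hypothetical example; with parallel function calling, a single user message can produce several tool calls at once.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

# Hypothetical app function described to the model as a tool
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "The city name"}
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "What's the weather in Paris and in Tokyo?"}],
    tools=tools,
)

# With parallel function calling, one message can yield several tool calls
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```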

How To Access GPT-4 Turbo

Access to GPT-4 Turbo is available to ‘all paying developers,’ meaning if you have API access you can simply pass "gpt-4-1106-preview" as the model name in the OpenAI API. Likewise, for GPT-4 Turbo with vision, you can pass "gpt-4-vision-preview" as the model name.
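In practice, switching to the new model is a one-line change in an existing OpenAI API call. Here's a minimal sketch using the v1 Python SDK; the prompt is a hypothetical example.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # the GPT-4 Turbo preview
    messages=[{"role": "user", "content": "Summarize the GPT-4 Turbo announcement."}],
)

print(response.choices[0].message.content)
```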

Note that these preview models are not yet considered suitable for production usage. However, as part of the announcement, Altman also promised a production-ready version will be available in the coming weeks.

For non-developers, GPT-4 Turbo will likely become available to ChatGPT Plus and ChatGPT Enterprise users in the coming weeks.

Rate limits

Access to GPT models via the OpenAI API is rate-limited. That is, you can only make a limited number of requests to the API in a given time period. OpenAI has now published clearer guidelines on how the rate limits work, so your application won't be unexpectedly cut off.

Additionally, the rate limits for GPT-4 have doubled.

As GPT-4 Turbo is currently in the preview stage, the rate limits for GPT-4 Turbo are set at 20 requests per minute and 100 requests per day. OpenAI has indicated that they won't be accommodating rate limit increases for this model at this time. However, they likely will once a public version is available.
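Given these preview-stage limits, it's worth handling rate-limit errors gracefully. Here's one common pattern, a sketch using exponential backoff with the v1 Python SDK:

```python
import time

from openai import OpenAI, RateLimitError

client = OpenAI()  # reads OPENAI_API_KEY from your environment

def chat_with_retry(messages, model="gpt-4-1106-preview", max_retries=5):
    """Call the chat API, backing off exponentially on rate-limit errors."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except RateLimitError:
            time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ... before retrying
    raise RuntimeError("Still rate-limited after retries")

reply = chat_with_retry([{"role": "user", "content": "Hello!"}])
print(reply.choices[0].message.content)
```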

Final Thoughts

The announcement of GPT-4 Turbo offers an exciting glimpse into the future of generative AI, and we can’t wait to get to grips with it. If you’re just starting out with all things GPT, check out our Introduction to ChatGPT course. For those looking for a more in-depth look, our tutorial on using GPT-3.5 and GPT-4 via the OpenAI API in Python has plenty to explore.


Author
Richie Cotton

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.
