
OpenAI Announces GPT-4 Turbo With Vision: What We Know So Far

Discover the latest update from OpenAI, GPT-4 Turbo with vision, and its key features, including improved knowledge cutoff, an expanded context window, budget-friendly pricing, and more.
Nov 8, 2023  · 7 min read

At the recent OpenAI DevDay, the company made a much-anticipated announcement: the introduction of GPT-4 Turbo, an improvement on its groundbreaking GPT-4 model. Here, we take a comprehensive look at what GPT-4 Turbo is, its key features, and how it can benefit developers and users.

In separate articles, you can read more about GPTs and the ChatGPT Store, and the Assistants API, which were also announced at DevDay.

What is GPT-4 Turbo?

GPT-4 Turbo is an update to the existing GPT-4 large language model. It brings several improvements, including a greatly increased context window and access to more up-to-date knowledge. OpenAI has gradually been improving the capabilities of GPT-4 in ChatGPT with the addition of custom instructions, ChatGPT plugins, DALL-E 3, and Advanced Data Analysis. This latest update brings a host of exciting new features.

What is GPT-4 Turbo With Vision?

GPT-4 Turbo with vision is a variant of GPT-4 Turbo that can interpret images, including an optical character recognition (OCR) capability. That is, you can provide it with an image, and it can describe the image or return any text contained in it. For example, you can input a photo of a menu, and it will return the food choices written in that photo. Likewise, you can provide a photo of an invoice and automatically extract the vendor name and item details.

The "with vision" features will be available in ChatGPT by default, and available to developers by selecting a "gpt-4-vision" model in the OpenAI API.

GPT-4 Turbo Key Features

GPT-4 Turbo has several improvements from previous models, enhancing its capabilities. Here are some key features that make it stand out:

Improved knowledge cutoff

Sam Altman promises to make sure ChatGPT stays up-to-date

The existing versions of GPT-3.5 and GPT-4 had a knowledge cutoff of September 2021. That means they cannot answer questions about real-world events that happened after that time, unless given access to external data sources.

GPT-4 Turbo extends the knowledge cutoff by nineteen months to April 2023. This means that GPT-4 Turbo has access to information and events up to that date, making it a more informed and reliable source of information. Furthermore, OpenAI’s CEO, Sam Altman, promised that "[OpenAI] will try to never let [GPT] get that out of date again."

128K context window

The context window of a large language model (LLM) is a measure of how long its memory of the conversation lasts. If a model has a context window of 4,000 tokens (about 3,000 words), then everything in the chat beyond 4,000 tokens ago is ignored, and the responses can become less accurate or even contradictory with previous responses. This is a problem for working with longer documents, or for chatbots holding extended conversations.
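To get a feel for how text maps to tokens, you can count them with OpenAI's tiktoken library. Here's a minimal sketch (the sample sentence is just an example):

```python
import tiktoken

# Load the tokenizer used by GPT-4 models.
encoding = tiktoken.encoding_for_model("gpt-4")

text = "The context window limits how much of the conversation the model can see at once."
tokens = encoding.encode(text)

# English text usually works out to roughly 3/4 of a word per token.
print(f"{len(tokens)} tokens for {len(text.split())} words")
```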

GPT-4 has a maximum context length of 32k (32,000) tokens. GPT-4 Turbo increases this to 128k tokens (about 240 pages at 400 words per page). This exceeds the 100k maximum context of Anthropic's Claude 2 model and brings it in line with Nous Research's Yarn-Mistral-7b-128k model.

It remains to be seen whether the longer context window results in consistently accurate responses across the whole window. Recent research from Stanford University showed that existing long-context models could only provide accurate responses when retrieving information from near the start or end of the document.

It's also worth noting that 128k seems to be merely a stepping stone toward the dream of "infinite context." Early-stage research from Microsoft and Xi'an Jiaotong University aims to scale LLMs to a billion tokens of context.

GPT goes on sale

OpenAI has responded to increased competition in the LLM market and reduced the price of GPT-4 Turbo to be budget-friendly for developers. When using the OpenAI API, GPT-4 Turbo input tokens now cost one-third of the previous price, down from 3 US cents to 1 US cent per 1000 tokens. Output tokens are now half price, down from 6 US cents to 3 US cents per 1000 tokens.

The same trend continues with GPT-3.5 Turbo models, offering 3x cheaper input tokens at 0.1 US cents per 1000 tokens and 2x cheaper output tokens at 0.2 US cents per 1000 tokens.

Additionally, fine-tuned GPT-3.5 Turbo 4K model input tokens are now 4x more affordable, with the price dropping from 1.2 US cents to 0.3 US cents per 1000 tokens, and output tokens are 2.7x cheaper, dropping from 1.6 US cents to 0.6 US cents per 1000 tokens. The training price remains the same at 0.8 US cents per 1000 tokens.

These price adjustments aim to make advanced AI models more cost-effective for developers.
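For a back-of-the-envelope check, here's a small sketch that estimates the cost of a single request using the GPT-4 Turbo prices quoted above (prices are per 1000 tokens and may change):

```python
# Prices in US dollars per 1000 tokens (GPT-4 Turbo, as announced at DevDay).
INPUT_PRICE_PER_1K = 0.01
OUTPUT_PRICE_PER_1K = 0.03

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in dollars of one GPT-4 Turbo API call."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# A 100,000-token document summarized in 1,000 tokens of output.
print(f"${estimate_cost(100_000, 1_000):.2f}")  # $1.03
```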

GPT goes multi-modal: image prompts & text-to-speech

"GPT-4 Turbo with vision" was announced as coming soon. You will soon be able to prompt GPT-4 Turbo using images as prompts by inputting them directly in the chat box. The tool will then be able to generate captions or provide a description of what the image depicts. It will also handle text-to-speech requests.

Function calling updates

Function calling is a feature for developers incorporating generative AI into their applications. It enables them to describe functions of their app or external APIs to GPT-4 Turbo. With the ability to call multiple functions in a single message, this feature streamlines the interaction with the model. For example, users can send a single message requesting multiple actions, eliminating the need for multiple back-and-forth interactions with the model.
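As a sketch of what this looks like in the API, here's a hypothetical get_weather function described to the model via the tools parameter (the function name and schema are made up for illustration):

```python
from openai import OpenAI

client = OpenAI()

# Describe an app function to the model so it can ask to call it.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "What's the weather in Paris and in Tokyo?"}],
    tools=tools,
)

# One user message can trigger several function calls, each returned as a separate entry.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```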

How To Access GPT-4 Turbo

Access to GPT-4 Turbo is available to ‘all paying developers,’ meaning that if you have API access, you can simply pass "gpt-4-1106-preview" as the model name in the OpenAI API. Likewise, for GPT-4 Turbo with vision, you can pass "gpt-4-vision-preview" as the model name.
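For example, with the official Python client, a minimal sketch looks like this:

```python
from openai import OpenAI

client = OpenAI()  # uses the OPENAI_API_KEY environment variable

# Request a completion from the GPT-4 Turbo preview model.
response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Summarize the GPT-4 Turbo announcement in one sentence."}],
)

print(response.choices[0].message.content)
```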

Note that these preview models are not yet considered suitable for production usage. However, as part of the announcement, Altman also promised a production-ready version will be available in the coming weeks.

For non-developers, GPT-4 Turbo will likely become available to ChatGPT Plus and ChatGPT Enterprise users in the coming weeks.

Rate limits

Access to GPT models via the OpenAI API is rate-limited. That is, you can only make a limited number of requests to the API in a given time period. OpenAI has now published clearer guidelines on how the rate limits work, so your application won't be unexpectedly cut off.

Additionally, the rate limits for GPT-4 have doubled.

As GPT-4 Turbo is currently in the preview stage, the rate limits for GPT-4 Turbo are set at 20 requests per minute and 100 requests per day. OpenAI has indicated that they won't be accommodating rate limit increases for this model at this time. However, they likely will once a public version is available.
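If you do hit the limit during the preview, the API raises a rate-limit error that you can catch and retry with a backoff. A rough sketch (the delay values and helper name are arbitrary):

```python
import time

from openai import OpenAI, RateLimitError

client = OpenAI()

def ask_with_retry(prompt: str, max_attempts: int = 5) -> str:
    """Call GPT-4 Turbo, backing off and retrying if the rate limit is hit."""
    for attempt in range(max_attempts):
        try:
            response = client.chat.completions.create(
                model="gpt-4-1106-preview",
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except RateLimitError:
            # Wait longer after each failed attempt: 1s, 2s, 4s, ...
            time.sleep(2 ** attempt)
    raise RuntimeError("Rate limit still exceeded after retries")
```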

Final Thoughts

The announcement of GPT-4 Turbo offers an exciting glimpse into the future of generative AI, and we can’t wait to get to grips with it. If you’re just starting out with all things GPT, check out our Introduction to ChatGPT course. For those looking for a more in-depth look, our tutorial on using GPT-3.5 and GPT-4 via the OpenAI API in Python has plenty to explore.


Author: Richie Cotton

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.
