What are Foundation Models?

Discover the key technology that is powering the generative AI boom
Updated Aug 2023  · 9 min read

BERT, GPT-3, DALL-E 2, LLaMA, BLOOM: these models are some of the stars of the AI revolution we’ve been witnessing since the release of ChatGPT. What do they have in common? You guessed it: they are all foundation models.

Foundation models are a recent development in AI. These models are developed from algorithms designed to optimize for generality and versatility of output. They are based on large-scale neural networks that are often trained on a broad range of data sources and large amounts of data to accomplish a wide range of downstream tasks, including some for which they were not specifically developed and trained.

The popularization of foundation models is agitating the classical debate of Narrow AI vs Artificial General Intelligence (AGI), also known as Strong AI. Narrow AI refers to AI systems designed for specific tasks but which are unable to perform tasks outside their planned scope. By contrast, AGI is a hypothetical AI system that can understand, learn, and apply knowledge across a wide range of tasks, much like a human.

While foundation models are still incapable of thinking like humans, they are delivering groundbreaking results that bring us closer to the threshold of AGI. That’s why data professionals and non-experts alike should be familiar with these models.

For newcomers to the subject, our AI Essentials Skill Track will give you a thorough introduction to next-generation AI models. For those with existing skills, our article on generative AI projects gives you the chance to put your knowledge to the test.

Let’s take a closer look at foundation models!

What are Foundation Models? Understanding Key Concepts

Foundation model is a relatively recent term that can overlap with other popular concepts, such as Generative AI, transformer, and large language models (LLMs).

Yet the terminology of AI is still contested. Here is a list of definitions that will help you navigate the rapidly-evolving field of AI:

  • Generative AI. It’s a broad term used to describe AI systems whose primary function is to generate content, in contrast with AI systems designed for other tasks, such as classification or prediction.
  • Transformer. Transformers have revolutionized the field of deep learning. They provide an innovative architecture for handling sequential data more effectively. Transformers are particularly well-suited for processing text, and that’s why they have become a cornerstone in the field of natural language processing (NLP) and natural language generation (NLG). However, transformers have also been used with other data types, like images, with equally successful results.
  • Large Language Model. LLMs are AI systems used to model and process human language. The transformer is the underlying technology behind LLMs. They are called “large” because they have hundreds of millions or even billions of parameters, which are pre-trained using a massive corpus of text data.
  • Foundation Model. It’s a broad term for AI models designed to produce a wide and general variety of outputs. They are capable of a range of possible tasks and applications, including text, image, video, or audio generation. A distinctive feature of these models is that they can work as standalone systems or serve as a ‘foundation’ for other applications. For example, the LLM called GPT works as the foundation model of ChatGPT (see the sketch after this list).

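To make these definitions concrete, here is a minimal sketch of reusing a pretrained language model as the foundation for a simple downstream task, using the Hugging Face transformers library. The model (GPT-2) and the prompt are illustrative choices made for this example, not part of the original article.

```python
# A minimal sketch: a small pretrained model (GPT-2, chosen only because it is
# lightweight) reused as the "foundation" for a downstream text-generation task.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Foundation models are", max_new_tokens=20)
print(result[0]["generated_text"])
```

The same pretrained weights could equally serve other downstream applications, such as summarization or question answering, which is exactly what makes the model a ‘foundation’.
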
How Do Foundation Models Work?

The underpinning technology of foundation models, irrespective of the task they are designed for and the type of data they are trained on, is the transformer.

Developed by Google researchers in 2017, transformers provide an alternative to traditional recurrent neural networks (RNNs) and convolutional neural networks (CNNs) for handling sequential data, such as text.

In language models, transformers work by predicting the next word in a sequence to form a coherent response. This is done with a mechanism called attention, which weighs the influence of each word in the context when generating the next one.
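
To build intuition for how attention weighs words, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside a transformer. The token vectors and dimensions are random, illustrative values, not taken from any real model.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends over all keys; the output is a weighted mix of values."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # similarity of queries to keys
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                                    # context-aware token vectors

# Three toy "token" vectors of dimension 4
rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(tokens, tokens, tokens).shape)  # (3, 4)
```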

Training transformers involves two steps: pre-training and fine-tuning.

Pre-training

In this phase, transformers are trained on large amounts of raw (text) data, with the internet as the primary data source.

The training is done using self-supervised learning, a type of training that doesn’t require humans to label the data: the learning signal comes from the data itself, for example by predicting the next word in a sentence.

The goal of pre-training is to learn the statistical patterns of the language. Since the mainstream strategy for achieving better performance with transformers is to increase the size of the model (i.e., the number of parameters) and the amount of data used during pre-training, this phase is normally time-consuming and costly.
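
As a simple illustration of why no human labeling is needed, here is a toy sketch of how next-word prediction turns raw text into (input, target) training pairs. The sentence and vocabulary are made up for this example.

```python
# Self-supervised pretraining: the target for each position is simply the next
# token in the raw text, so the labels come for free from the data itself.
text = "foundation models are trained on large amounts of data".split()

# Toy vocabulary mapping each word to an integer id (illustrative only)
vocab = {word: idx for idx, word in enumerate(sorted(set(text)))}
token_ids = [vocab[word] for word in text]

# Inputs are all tokens except the last; targets are the same ids shifted by one
inputs, targets = token_ids[:-1], token_ids[1:]

for i, (x, y) in enumerate(zip(inputs, targets)):
    print(f"given '{text[i]}' (id {x}) -> predict '{text[i + 1]}' (id {y})")
```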

Fine-tuning

Pre-training allows a transformer to gain a basic understanding of language, but it’s not enough to perform specific practical tasks. That’s why the model undergoes a fine-tuning phase, where it is trained on a narrower, domain-specific dataset generated with the help of human reviewers following certain guidelines.
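
As an illustration, here is a minimal fine-tuning sketch using the Hugging Face transformers library. The base model (GPT-2), the two hand-written examples, and the hyperparameters are assumptions chosen to keep the example small; a real fine-tuning run would use a much larger, carefully curated dataset.

```python
# A minimal fine-tuning sketch; all names and hyperparameters are illustrative.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no padding token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical domain-specific examples written by human reviewers
examples = ["Question: What is churn? Answer: Customers leaving a service.",
            "Question: What is an outlier? Answer: A data point far from the rest."]
dataset = Dataset.from_dict({"text": examples}).map(
    lambda row: tokenizer(row["text"], truncation=True), remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # updates the pretrained weights on the narrow dataset
```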

Modality

Another important feature of foundation models is modality. Depending on the type of data foundation models can take as inputs, they can be unimodal or multimodal. The former can only take one type of data and generate the same type of output, while the latter can receive multiple modalities of input and generate multiple types of output (for example, GPT-4 can accept both image and text inputs and generate text outputs).
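
To show what a multimodal request can look like in practice, here is a hedged sketch using the OpenAI Python client. The model name, message format, and image URL are assumptions made for illustration and may not match the current API exactly.

```python
# Illustrative multimodal request: image + text in, text out.
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

response = client.chat.completions.create(
    model="gpt-4o",  # assumed name of a multimodal GPT-4-class model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is shown in this chart."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/chart.png"}},  # placeholder URL
        ],
    }],
)
print(response.choices[0].message.content)  # the model's text output
```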

Understanding how transformers work in detail is beyond the scope of this article. For a more thorough explanation, check out our article What is ChatGPT, where we put the question directly to ChatGPT, or learn about Transformers and Hugging Face for a more technical view.

If you want to get more details on how LLMs work, our Large Language Models (LLMs) Concepts Course is a great starting point.

Applications of Foundation Models

Foundation models can be used as standalone systems or as the basis for countless downstream AI systems and applications. While the majority of modern foundation models are designed to generate text or code and perform NLP tasks, a growing number of systems can generate other types of output, such as images or music.

Below you can find a table with some of the most popular foundation models.

| Foundation model | Downstream AI system | Applications |
| --- | --- | --- |
| LaMDA (Google) | Bard | Experimental conversational AI chat service. |
| GPT-3.5 (OpenAI) | ChatGPT | Allows you to have human-like conversations. |
| GPT-4 (OpenAI) | DataLab AI Assistant | Allows DataLab users to code better and smarter. |
| Codex (OpenAI) | GitHub Copilot | Suggests code and entire functions in real time. |
| AudioLM (Google) | MusicLM | Creates music based on text descriptions. |
| BLOOM (Hugging Face) | No downstream application; can be used directly | Multiple NLP tasks. Trained on 46 languages and 13 programming languages. |
| LLaMA (Meta) | No downstream application; can be used directly | Helps researchers advance their work in this subfield of AI. |
| DALL-E 2 (OpenAI) | No downstream application; can be used directly | Creates realistic images and art from a description in natural language. |

Challenges and Concerns with Foundation Models

Foundation models are at the forefront of AI and have the potential to power countless applications. However, it’s important to consider their potential risks and challenges.

Here is a non-exhaustive list of risks associated with the widespread adoption of foundation models:

  • Lack of transparency. Algorithmic opacity is one of the main concerns associated with foundation models, often described as ‘black box’ models, that is, models so complex that it’s impossible to track their reasoning. AI providers are often reluctant to provide information about their models on the grounds of business confidentiality. However, enhancing transparency is essential to know the cost and impact of foundation models, as well as assess their safety and effectiveness.
  • Bias and discrimination. Biased foundation models can result in unfair decisions that often exacerbate discrimination against minority groups. IBM Research is exploring ways to minimize this bias.
  • Privacy issues. Foundation models are trained with vast amounts of data, often comprising personal data. This can lead to issues and risks related to data privacy and security.
  • Ethical considerations. Foundation models can sometimes lead to decisions that have serious implications for our lives, with significant impacts on our fundamental rights. We explored the ethics of generative AI in a separate post.

The Future of Foundation Models

Foundation models are fuelling the current generative AI boom. The potential applications are so vast that every sector and industry, including data science, is likely to be affected by the adoption of AI in the coming years.

While we’re still far from achieving Artificial General Intelligence, the development of foundation models represents an important milestone in the AI race. Companies, regulators, and society in general should be aware of the current state of AI as a precondition for ensuring transparency, fairness, and accountability.

DataCamp is working hard to provide comprehensive and accessible resources to help everyone keep up to date with AI developments.


Author: Javier Canales Luna