
What are Foundation Models?

Discover the key technology that is powering the generative AI boom
Aug 2023  · 9 min read

BERT, GPT-3, DALL-E 2, LLaMA, BLOOM: these models are some of the stars of the AI revolution we’ve been witnessing since the release of ChatGPT. What do they have in common? You guessed it: they are all foundation models.

Foundation models are a recent development in AI. These models are developed from algorithms designed to optimize for generality and versatility of output. They are based on large-scale neural networks that are often trained on a broad range of data sources and large amounts of data to accomplish a wide range of downstream tasks, including some for which they were not specifically developed and trained.

The popularization of foundation models is agitating the classical debate of Narrow AI vs Artificial General Intelligence (AGI), also known as Strong AI. Narrow AI refers to AI systems designed for specific tasks but which are unable to perform tasks outside their planned scope. By contrast, AGI is a hypothetical AI system that can understand, learn, and apply knowledge across a wide range of tasks, much like a human.

While foundation models are still incapable of thinking like humans, they are delivering groundbreaking results that bring us closer to the threshold of AGI. That’s why data professionals and non-experts should be familiar with these models.

For newcomers to the subject, our AI Essentials Skill Track will help you get a deep overview of next-generation AI models. For those with existing skills, our article on generative AI projects gives you the chance to put your knowledge to the test.

Let’s take a closer look at foundation models!

What are Foundation Models? Understanding Key Concepts

Foundation model is a relatively recent term that can overlap with other popular concepts, such as Generative AI, transformer, and large language models (LLMs).

Yet the terminology of AI is still contested. Here is a list of definitions that will help you navigate the rapidly-evolving field of AI:

  • Generative AI. A broad term used to describe AI systems whose primary function is to generate content, in contrast with AI systems designed for other tasks, such as classification and prediction.
  • Transformer. Transformers have revolutionized the field of deep learning. They provide an innovative architecture for handling sequential data more effectively. Transformers are particularly well-suited for processing text, and that’s why they have become a cornerstone in the field of natural language processing (NLP) and natural language generation (NLG). However, transformers have also been used with other data types, like images, with equally successful results.
  • Large Language Model. LLMs are AI systems used to model and process human language. Transformers are the underlying technology behind LLMs. They are called “large” because they have hundreds of millions or even billions of parameters, which are learned by pre-training on a massive corpus of text data.
  • Foundation Model. It’s a broad term to define AI models designed to produce a wide and general variety of outputs. They are capable of a range of possible tasks and applications, including text, video, image, or audio generation. A singular feature of these models is that they can be standalone systems or used as a ‘foundation’ for other applications. For example, the LLM called GPT works as the foundation model of ChatGPT.
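To get a feel for what “billions of parameters” means, a standard back-of-envelope rule is that each transformer block holds roughly 12 × d_model² weights (about 4 × d_model² in the attention projections and 8 × d_model² in the feed-forward layers). This is an approximation, not an exact count, but plugging in GPT-3’s published configuration lands close to its reported size:

```python
# Rough parameter count for a GPT-3-scale transformer.
# Each block has ~12 * d_model**2 weights:
#   ~4 * d_model**2 for the Q, K, V, and output projections in attention,
#   ~8 * d_model**2 for the two feed-forward layers.
def approx_params(n_layers: int, d_model: int) -> int:
    per_block = 12 * d_model ** 2
    return n_layers * per_block

# GPT-3's published configuration: 96 layers, hidden size 12,288.
total = approx_params(96, 12288)
print(f"{total / 1e9:.0f}B parameters")  # close to the reported 175B
```

The estimate ignores embedding matrices and biases, which is why it comes in slightly under the headline figure.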

How Do Foundation Models Work?

The underpinning technology of foundation models, irrespective of the task they are designed for and the type of data they use for training, is the transformer.

Developed by Google researchers in 2017, transformers provide an alternative to traditional recurrent neural networks (RNNs) and convolutional neural networks (CNNs) for handling sequential data, such as text.

Transformers work by predicting the next word in a sequence to form a coherent response. This process is done with a mechanism called attention that weighs the influence of different words when generating a response.
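The attention step above can be sketched in a few lines of plain Python. This is a simplified, single-query version of scaled dot-product attention (real transformers apply it to many positions and heads at once, with learned projection matrices): each key is scored against the query, the scores are turned into weights with a softmax, and the output is the weighted average of the value vectors.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    """Single-query scaled dot-product attention over a list of key/value vectors."""
    d = len(query)
    # Score each key against the query (dot product, scaled by sqrt of dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)  # how much each position influences the output
    # Output = weighted average of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]

# Toy example: the query matches the first key most closely,
# so the output leans toward the first value vector.
out = attention([1.0, 0.0], keys=[[1.0, 0.0], [0.0, 1.0]],
                values=[[10.0, 0.0], [0.0, 10.0]])
```

Because the weights sum to 1, the output always stays a blend of the value vectors, tilted toward the positions the model judges most relevant.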

Training transformers involves two steps: pretraining and fine-tuning.


Pre-training

In this phase, transformers are trained on large amounts of raw (text) data, with the internet as the primary data source.

The training is done using self-supervised learning, a type of training that doesn’t require humans to label the data by hand.
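The reason no human labeling is needed is that the labels come from the text itself: for next-token prediction, every position in a raw document yields a training example whose “label” is simply the token that follows. A minimal sketch (using a toy whitespace tokenizer; real models use subword tokenizers):

```python
def make_training_pairs(text: str, context_size: int = 3):
    """Turn raw text into (context, next_token) training examples.
    The 'label' is just the next token, so no human annotation is needed."""
    tokens = text.split()  # toy whitespace tokenizer
    pairs = []
    for i in range(len(tokens) - context_size):
        context = tokens[i:i + context_size]
        target = tokens[i + context_size]
        pairs.append((context, target))
    return pairs

pairs = make_training_pairs("the cat sat on the mat")
# → [(['the', 'cat', 'sat'], 'on'),
#    (['cat', 'sat', 'on'], 'the'),
#    (['sat', 'on', 'the'], 'mat')]
```

Every extra document scraped from the web automatically becomes more labeled training data, which is what makes pre-training at internet scale feasible.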

The goal of pre-training is to learn the statistical patterns of the language. Since the mainstream strategy to achieve better performance of transformers is by increasing the size of the model (i.e., increasing the parameters) and the amount of data used during pre-training, this phase is normally time-consuming and costly.


Fine-tuning

Pre-training allows a transformer to gain a basic understanding of language, but it’s not enough to perform specific practical tasks. That’s why the model undergoes a fine-tuning phase, where it is trained on a narrower, domain-specific dataset generated with the help of human reviewers following certain guidelines.
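The two-stage idea can be illustrated with a deliberately tiny stand-in for a language model. Real fine-tuning updates neural-network weights by gradient descent; this toy bigram counter only shows the shape of the process: broad generic data first, then a small domain-specific dataset that shifts the model’s behavior.

```python
from collections import defaultdict, Counter

class BigramModel:
    """Toy stand-in for a language model: predicts the most likely next word."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, text: str, weight: int = 1):
        tokens = text.split()
        for a, b in zip(tokens, tokens[1:]):
            self.counts[a][b] += weight

    def predict(self, word: str) -> str:
        return self.counts[word].most_common(1)[0][0]

model = BigramModel()
# "Pre-training" on broad, generic text.
model.train("the bank of the river the bank of the river")
print(model.predict("bank"))  # → 'of'

# "Fine-tuning" on a narrow, domain-specific dataset shifts behavior.
model.train("bank account bank account bank account", weight=5)
print(model.predict("bank"))  # → 'account'
```

The same data can be read both ways; it is the second, targeted pass that specializes the general-purpose model for its downstream task.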


Another important feature of foundation models is modality. Depending on the type of data foundation models can take as inputs, they can be unimodal or multimodal. The former can only take one type of data and generate the same type of output, while the latter can receive multiple modalities of input and generate multiple types of outputs (for example, GPT-4 can accept both image and text inputs and generate text outputs).
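As a purely illustrative sketch (the types and function below are hypothetical, not any real model’s API), the unimodal/multimodal distinction is essentially about the interface: a multimodal model accepts a mix of input types while still producing, in GPT-4’s case, a single text output.

```python
from dataclasses import dataclass

@dataclass
class TextInput:
    text: str

@dataclass
class ImageInput:
    pixels: bytes

def multimodal_respond(inputs):
    """Toy multimodal interface: accepts a mix of text and image inputs,
    always produces a text output (like GPT-4's text+image -> text setup)."""
    parts = []
    for item in inputs:
        if isinstance(item, TextInput):
            parts.append(f"read text ({len(item.text)} chars)")
        elif isinstance(item, ImageInput):
            parts.append(f"saw image ({len(item.pixels)} bytes)")
    return "Model " + " and ".join(parts)

print(multimodal_respond([TextInput("What is in this picture?"),
                          ImageInput(b"\x89PNG")]))
```

A unimodal model, by contrast, would accept only one of these input types and emit the same type back.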

Understanding how transformers work can be tricky and is beyond the scope of this article. For a more detailed explanation, check out our article What is ChatGPT, where we asked the question directly to ChatGPT, and learn about Transformers and Hugging Face to get a more technical view.

If you want to get more details on how LLMs work, our Large Language Models (LLMs) Concepts Course is a great starting point.

Applications of Foundation Models

Foundation models can be used as standalone systems or as the basis for countless downstream AI systems and applications. While the majority of modern foundation models are designed to generate text or code and perform NLP tasks, there is an increasing number of systems capable of generating other types of outputs, such as images or music.

Below you can find a table with some of the most popular foundation models.

| Foundation model | Downstream AI system | Description |
| --- | --- | --- |
| LaMDA (Google) | Bard | Experimental conversational AI chat service. |
| GPT-3.5 (OpenAI) | ChatGPT | Allows you to have human-like conversations. |
| GPT-3 (OpenAI) | DataCamp AI Assistant | Allows DataCamp Workspace users to code better and smarter. |
| Codex (OpenAI) | GitHub Copilot | Suggests code and entire functions in real time. |
| AudioLM (Google) | MusicLM | Creates music based on text descriptions. |
| BLOOM (Hugging Face) | None; can be used directly | Multiple NLP tasks. Trained on 46 natural languages and 13 programming languages. |
| LLaMA (Meta) | None; can be used directly | Helps researchers advance their work in this subfield of AI. |
| DALL-E 2 (OpenAI) | None; can be used directly | Creates realistic images and art from a description in natural language. |

Challenges and Concerns with Foundation Models

Foundation models are at the forefront of AI and have the potential to power countless applications. However, it’s important to consider their potential risks and challenges.

Here is a non-exhaustive list of risks associated with the widespread adoption of foundation models:

  • Lack of transparency. Algorithmic opacity is one of the main concerns associated with foundation models, often described as ‘black box’ models, that is, models so complex that it’s impossible to track their reasoning. AI providers are often reluctant to provide information about their models on the grounds of business confidentiality. However, enhancing transparency is essential to know the cost and impact of foundation models, as well as assess their safety and effectiveness.
  • Bias and discrimination. Biased foundation models can result in unfair decisions that often exacerbate discrimination against minority groups. IBM Research is exploring ways to minimize this bias.
  • Privacy issues. Foundation models are trained with vast amounts of data, often comprising personal data. This can lead to issues and risks related to data privacy and security.
  • Ethical considerations. Foundation models can sometimes lead to decisions that have serious implications in our life, with significant impacts on our fundamental rights. We explored the ethics of generative AI in a separate post.

The Future of Foundation Models

Foundation models are fuelling the current generative AI boom. The potential applications are so vast that every sector and industry, including data science, is likely to be affected by the adoption of AI in the coming years.

While we’re still far from achieving Artificial General Intelligence, the development of foundation models represents an important milestone in the AI race. Companies, regulators, and society, in general, should be aware of the current state of AI, as a precondition to ensure transparency, fairness, and accountability.

DataCamp is working hard to provide comprehensive and accessible resources for everyone to keep up to date with AI developments.

Javier Canales Luna

Start Learning AI Today!

AI Fundamentals

AdvancedSkill Level
10 hours hr
Discover the fundamentals of AI, dive into models like ChatGPT, and decode generative AI secrets to navigate the dynamic AI landscape.
See DetailsRight Arrow
Start Course
See MoreRight Arrow

ChatGPT in Space: How AI Can Transform Deep Space Missions

Explore how tools like ChatGPT could revolutionize space travel by improving communication, data quality, and astronaut well-being. Learn about the challenges and solutions for AI in space.
James Chapman's photo

James Chapman

7 min

The Top 5 Vector Databases

A comprehensive guide to the best vector databases. Master high-dimensional data storage, decipher unstructured information, and leverage vector embeddings for AI applications.
Moez Ali's photo

Moez Ali

14 min

What is Similarity Learning? Definition, Use Cases & Methods

While traditional supervised learning focuses on predicting labels based on input data and unsupervised learning aims to find hidden structures within data, similarity learning is somewhat in between.
Abid Ali Awan's photo

Abid Ali Awan

9 min

What is Machine Listening? Definition, Types, Use Cases

Where humans rely on years of experience and context, machines require vast amounts of data and training to "listen".
Abid Ali Awan's photo

Abid Ali Awan

8 min

Building Ethical Machines with Reid Blackman, Founder & CEO at Virtue Consultants

Reid and Richie discuss the dominant concerns in AI ethics, from biased AI and privacy violations to the challenges introduced by generative AI.
Richie Cotton's photo

Richie Cotton

57 min

Intro to Causal AI Using the DoWhy Library in Python

This tutorial provides an introduction to causal AI using the DoWhy library in Python. It discusses fundamental principles and offers code examples.
Paul Hünermund 's photo

Paul Hünermund

14 min

See MoreSee More