BERT, GPT-3, DALL-E 2, LLaMA, BLOOM: these models are some of the stars of the AI revolution we’ve been witnessing since the release of ChatGPT. What do these models have in common? You guessed it: they are all foundation models.
Foundation models are a recent development in AI. They are built from algorithms designed to optimize for generality and versatility of output: large-scale neural networks trained on broad, massive datasets so they can accomplish a wide range of downstream tasks, including some they were not specifically developed and trained for.
The popularization of foundation models is agitating the classical debate of Narrow AI vs Artificial General Intelligence (AGI), also known as Strong AI. Narrow AI refers to AI systems designed for specific tasks but which are unable to perform tasks outside their planned scope. By contrast, AGI is a hypothetical AI system that can understand, learn, and apply knowledge across a wide range of tasks, much like a human.
While foundation models are still incapable of thinking like humans, they are delivering groundbreaking results that bring us closer to the threshold of AGI. That’s why data professionals and non-experts alike should be familiar with these models.
For newcomers to the subject, our AI Essentials Skill Track will help you get a deep overview of next-generation AI models. For those with existing skills, our article on generative AI projects gives you the chance to put your knowledge to the test.
Let’s take a closer look at foundation models!
What are Foundation Models? Understanding Key Concepts
Foundation model is a relatively recent term that can overlap with other popular concepts, such as Generative AI, transformer, and large language models (LLMs).
Yet the terminology of AI is still contested. Here is a list of definitions that will help you navigate the rapidly-evolving field of AI:
- Generative AI. It’s a broad term used to describe AI systems whose primary function is to generate content, in contrast with other AI systems designed for other tasks, such as classification and prediction.
- Transformer. Transformers have revolutionized the field of deep learning. They provide an innovative architecture for handling sequential data more effectively. Transformers are particularly well-suited for processing text, and that’s why they have become a cornerstone in the field of natural language processing (NLP) and natural language generation (NLG). However, transformers have also been used with other data types, like images, with equally successful results.
- Large Language Model. LLMs are AI systems used to model and process human language. Transformers are the underlying technology behind LLMs. They are called “large” because they have hundreds of millions or even billions of parameters, which are pre-trained on a massive corpus of text data.
- Foundation Model. It’s a broad term for AI models designed to produce a wide and general variety of outputs. They are capable of a range of possible tasks and applications, such as text, video, image, or audio generation. A distinctive feature of these models is that they can work as standalone systems or serve as the ‘foundation’ for other applications. For example, the LLM called GPT works as the foundation model of ChatGPT.
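The relationship between a foundation model and a downstream application can be pictured with a short Python sketch. `FoundationModel` and `ChatApp` are hypothetical stand-ins invented for illustration, not real APIs: the base model exposes a generic `generate` method, and the chat application wraps it with a conversation template, much as ChatGPT wraps GPT.

```python
class FoundationModel:
    """Hypothetical stand-in for a pre-trained base model (e.g., GPT)."""

    def generate(self, prompt: str) -> str:
        # A real model would run a transformer here; we just echo the prompt.
        return f"<completion of: {prompt!r}>"


class ChatApp:
    """Downstream application built on top of the foundation model,
    the way ChatGPT is built on top of GPT."""

    def __init__(self, base: FoundationModel):
        self.base = base
        self.history: list[str] = []

    def ask(self, user_message: str) -> str:
        self.history.append(f"User: {user_message}")
        # The app adds its own conversation template before calling the base model.
        prompt = "\n".join(self.history) + "\nAssistant:"
        reply = self.base.generate(prompt)
        self.history.append(f"Assistant: {reply}")
        return reply


model = FoundationModel()   # usable as a standalone system...
chat = ChatApp(model)       # ...or as the 'foundation' of an application
print(chat.ask("What is a foundation model?"))
```

The key design point is that `ChatApp` owns no model of its own: all generation is delegated to the base model, and the application only contributes the conversational wrapper.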
How Do Foundation Models Work?
The underpinning technology of foundation models, irrespective of the task they are designed for and the type of data they are trained on, is the transformer.
Developed by Google researchers in 2017, transformers provide an alternative to traditional recurrent neural networks (RNNs) and convolutional neural networks (CNNs) for handling sequential data, such as text.
Generative transformers work by repeatedly predicting the next word in a sequence to form a coherent response. They do this with a mechanism called attention, which weighs the influence of the other words in the sequence when generating each new word.
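The weighting step at the heart of attention can be sketched in plain Python with toy numbers: each word vector is scored against a query via a dot product, the scores are softmax-normalized into weights, and the output is the weighted mix of the value vectors. Real transformers use learned query/key/value projections and many attention heads; this minimal sketch only shows the weighting idea.

```python
import math

def attention(query, keys, values):
    # Score each key against the query (dot product).
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Softmax turns the scores into weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Toy 2-D "word vectors": the query is closest to the first key,
# so the first value vector dominates the output (roughly [6.9, 3.1]).
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
out = attention([1.0, 0.2], keys, values)
print(out)
```

Because the weights always sum to 1, the output stays a blend of the inputs; attention decides how much each word contributes, not what the words mean.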
Training transformers involves two steps: pre-training and fine-tuning.
Pre-training
In the pre-training phase, transformers are trained on large amounts of raw (text) data, with the internet as the primary data source.
The training is done using self-supervised learning, a type of training that doesn’t require humans to label the data: the training signal (for example, the next word in a sentence) comes from the raw text itself.
The goal of pre-training is to learn the statistical patterns of the language. Since the mainstream strategy to achieve better performance of transformers is by increasing the size of the model (i.e., increasing the parameters) and the amount of data used during pre-training, this phase is normally time-consuming and costly.
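Self-supervision can be illustrated with a toy next-word counter: the “labels” are simply the next word in the raw text, so no human annotation is needed. This is a sketch of the idea only; real pre-training learns billions of neural-network parameters, not a count table.

```python
from collections import Counter, defaultdict

def pretrain_counts(corpus: str):
    """Learn next-word statistics from raw text; the text supervises itself."""
    words = corpus.split()
    table = defaultdict(Counter)
    # Every (word, next_word) pair in the raw text is a free training example.
    for current, nxt in zip(words, words[1:]):
        table[current][nxt] += 1
    return table

def predict_next(table, word: str) -> str:
    """Predict the most frequent follower of `word`."""
    return table[word].most_common(1)[0][0]

raw_text = "the cat sat on the mat and the cat slept"
model = pretrain_counts(raw_text)
print(predict_next(model, "the"))  # prints "cat", its most frequent follower
```

Scaling this idea up, more text and a richer model mean better estimates of the statistical patterns of language, which is why pre-training is so data- and compute-hungry.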
Fine-tuning
Pre-training allows a transformer to gain a basic understanding of language, but it’s not enough to perform specific practical tasks. That’s why the model undergoes a fine-tuning phase, where it is trained on a narrower, domain-specific dataset generated with the help of human reviewers following certain guidelines.
Another important feature of foundation models is modality. Depending on the type of data they can take as inputs, foundation models can be unimodal or multimodal. The former can only take one type of data and generate the same type of output, while the latter can take multiple types of inputs and generate multiple types of outputs (for example, GPT-4 can accept both image and text inputs and generate text outputs).
A detailed explanation of how transformers work is beyond the scope of this article. To go further, check out our article What is ChatGPT, where we put the question directly to ChatGPT, or our tutorial on Transformers and Hugging Face for a more technical view.
If you want to get more details on how LLMs work, our Large Language Models (LLMs) Concepts Course is a great starting point.
Applications of Foundation Models
Foundation models can be used as standalone systems or as the basis for countless downstream AI systems and applications. While the majority of modern foundation models are designed to generate text or code and perform NLP tasks, an increasing number of systems can generate other types of outputs, such as images or music.
Below you can find a table with some of the most popular foundation models.
| Foundation model | Downstream AI system | Capabilities |
| --- | --- | --- |
|  | Experimental, conversational AI chat service | Allows you to have human-like conversations. |
|  | Allows DataCamp Workspace users to code better and smarter | Suggests code and entire functions in real time. |
|  |  | Creates music based on text descriptions. |
| BLOOM (Hugging Face) | No downstream application; can be used directly | Multiple NLP tasks. Trained on 46 natural languages and 13 programming languages. |
|  | No downstream application; can be used directly | Helps researchers advance their work in this subfield of AI. |
| DALL-E 2 (OpenAI) | No downstream application; can be used directly | Creates realistic images and art from a description in natural language. |
Challenges and Concerns with Foundation Models
Foundation models are at the forefront of AI and have the potential to power countless applications. However, it’s important to consider their potential risks and challenges.
Here is a non-exhaustive list of risks associated with the widespread adoption of foundation models:
- Lack of transparency. Algorithmic opacity is one of the main concerns associated with foundation models, often described as ‘black box’ models, that is, models so complex that it’s impossible to track their reasoning. AI providers are often reluctant to provide information about their models on the grounds of business confidentiality. However, enhancing transparency is essential to know the cost and impact of foundation models, as well as assess their safety and effectiveness.
- Bias and discrimination. Biased foundation models can result in unfair decisions that often exacerbate discrimination against minority groups. IBM Research is exploring ways to minimize this bias.
- Privacy issues. Foundation models are trained with vast amounts of data, often comprising personal data. This can lead to issues and risks related to data privacy and security.
- Ethical considerations. Foundation models can sometimes lead to decisions that have serious implications in our life, with significant impacts on our fundamental rights. We explored the ethics of generative AI in a separate post.
The Future of Foundation Models
Foundation models are fuelling the current generative AI boom. The potential applications are so vast that every sector and industry, including data science, is likely to be affected by the adoption of AI in the coming years.
While we’re still far from achieving Artificial General Intelligence, the development of foundation models represents an important milestone in the AI race. Companies, regulators, and society in general should be aware of the current state of AI as a precondition for ensuring transparency, fairness, and accountability.
DataCamp is working hard to provide comprehensive and accessible resources for everyone to keep updated with AI development. Check them out:
- Introduction to Meta AI's LLaMA: Empowering AI Innovation - This blog post introduces LLaMA, a collection of state-of-the-art foundation language models.
- An Introduction to Statistical Machine Learning - This tutorial explains how statistical techniques underpin machine learning models, which is a crucial foundation for understanding Foundation Models.
- 5 Projects You Can Build with Generative AI Models (with examples) - This blog post provides practical project ideas that involve generative AI models, which are a type of Foundation Model.
- How to Ethically Use Machine Learning to Drive Decisions - This blog post discusses the importance of ethical considerations when using machine learning models, including Foundation Models.
- AI Essentials Skill Track - This learning track provides a comprehensive introduction to AI, including models like ChatGPT, which is a type of Foundation Model.
Start Learning AI Today!