What is Terraform? Get Started With Infrastructure as Code

Read our step-by-step beginner's guide to using Terraform, and learn how to efficiently automate and manage your Azure, AWS, and Google Cloud infrastructure.
Jul 12, 2024  · 10 min read

Infrastructure as Code (IaC) is a process that automates the provisioning and management of infrastructure through code. While physical hardware configuration and interactive configuration tools can be used to provision infrastructure, IaC offers several advantages such as version control, repeatability, and scalability. One of the top IaC tools available is Terraform, a solution developed by HashiCorp in 2014 and used by more than 500,000 organizations worldwide. Let’s dive into how Terraform works and how to use it for modern IT operations.

As we get started learning about Terraform, know that familiarity with at least one major cloud provider (AWS, Azure, Google Cloud, etc.) is a prerequisite because Terraform is used to manage cloud infrastructure. Our Understanding Cloud Computing course provides a solid foundation if you’re new to cloud computing.

What is Terraform?

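Terraform is a source-available tool (open source under the MPL 2.0 license until HashiCorp moved it to the Business Source License in August 2023) that lets you define infrastructure components and their relationships using a high-level configuration language.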

In Terraform’s human-readable configuration files, you can specify the desired state of your infrastructure and Terraform automatically works out how to get to that state. These files can be versioned, shared, and reused to provide a consistent way to manage your infrastructure, from compute and storage resources, to DNS and SaaS features.

Terraform can be used with various cloud providers, in multi-cloud infrastructures, and on-premises environments.

Key Features of Terraform

Let’s take a closer look at the fundamental aspects that set Terraform apart: 

HashiCorp Configuration Language

Terraform uses a high-level language called HashiCorp Configuration Language (HCL), designed specifically for defining infrastructure as code. HCL is declarative: you describe what the infrastructure should look like rather than scripting the steps to build it, which makes it more abstract and readable than low-level scripting or manual configuration. You might have encountered other declarative formats before, such as YAML or JSON.

HCL follows a block structure, where each nested block represents resources and their configurations. Resources are explicitly defined with names and attributes.

resource "aws_instance" "example" {
  ami           = "ami-123456"
  instance_type = "t2.micro"
}
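
To illustrate how blocks, arguments, and references fit together, here is a slightly larger sketch. The variable name instance_type, the tag value, and the output name are illustrative choices rather than anything Terraform requires, and the AMI ID is the same placeholder as above, not a real image.

```hcl
# Input variable: a named, typed value with a default (name is illustrative).
variable "instance_type" {
  type    = string
  default = "t2.micro"
}

# Resource block: resource type "aws_instance", local name "example".
resource "aws_instance" "example" {
  ami           = "ami-123456" # placeholder AMI ID, replace with a real one
  instance_type = var.instance_type

  # Map argument holding the resource tags.
  tags = {
    Name = "hcl-example"
  }
}

# Output block: exposes an attribute of the resource after apply.
output "instance_id" {
  value = aws_instance.example.id
}
```

References such as var.instance_type and aws_instance.example.id are how one block reads values from another.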

Execution plan

After you define your infrastructure’s desired state, Terraform will generate an execution plan. This plan lists the steps Terraform needs to take to achieve that state, so you can review the changes before they are applied. Reviewing the plan before you apply it helps you catch unwanted modifications, such as accidental resource deletion.

State management

Terraform maintains a state file that automatically tracks the current state of your infrastructure and serves as the source of truth when determining what changes need to be made. By default, it is stored locally and called terraform.tfstate. 
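
A local state file is fine for experimenting, but teams usually keep state in a shared location. As a minimal sketch, assuming you already have an S3 bucket for this purpose (the bucket name and key below are hypothetical), a remote backend could be declared like this:

```hcl
terraform {
  backend "s3" {
    bucket = "my-terraform-state-bucket" # hypothetical bucket, must already exist
    key    = "dev/terraform.tfstate"     # path of the state object inside the bucket
    region = "us-west-2"
  }
}
```

After adding or changing a backend block, you need to run terraform init again so Terraform can set up the new state location.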

Providers

Providers are plugins that interact with the APIs of cloud platforms and other services, allowing Terraform to manage a wide variety of resources. Official providers, such as those for AWS, Azure, and Google Cloud, are maintained by HashiCorp, while partner providers, such as those for GitHub and Datadog, are maintained by the companies behind those services. There are also community-developed providers, which can be found on the Terraform Registry or GitHub.
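
In a configuration, you declare which providers a project needs in a required_providers block. The sketch below pins the AWS provider; the version constraint is an assumption for illustration, so pick one that suits your project.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws" # registry address of the provider
      version = "~> 5.0"        # assumed constraint, adjust as needed
    }
  }
}
```

Terraform downloads the matching provider plugin when you run terraform init.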

Resource graph

Terraform builds a resource graph showing your infrastructure’s resources and the relationships between them. This graph enables Terraform to generate plans effectively, handle dependencies between resources, and make sure that resources are created, updated, and deleted in the correct order. This graph is also a great way to visualize your infrastructure and understand the impact of the changes you are looking to make. 

[Figure: Example Terraform resource graph, showing AWS resources and the dependency relationships between them. Source: HashiCorp documentation]
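
Dependencies in the graph usually come from references between resources. In the following sketch (resource names are hypothetical and the AMI ID is a placeholder), the instance refers to the security group's ID, so Terraform knows to create the security group first:

```hcl
resource "aws_security_group" "web" {
  name = "web-sg" # hypothetical name
}

resource "aws_instance" "web" {
  ami           = "ami-123456" # placeholder AMI ID
  instance_type = "t2.micro"

  # Referencing the security group's ID creates an implicit dependency.
  vpc_security_group_ids = [aws_security_group.web.id]
}
```

Running terraform graph in the project directory prints the dependency graph in DOT format, which you can render with a tool such as Graphviz.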

Getting Started with Terraform

Let’s walk through our first Terraform workflow. If you are new to DevOps or cloud computing in general, I would recommend taking our Understanding Cloud Computing and Introduction to DevOps courses before you go any further.

Installation and setup

In this Terraform tutorial, we will be creating resources on AWS. If you want to follow along and do not have an account, go to AWS and sign up.

  1. Download Terraform from the official website. Pick the right binary for your operating system and follow the installation instructions.
  2. Once you have installed Terraform, open a new terminal window and run terraform -version to confirm that it is installed properly.
  3. Install the AWS CLI.
  4. Configure the AWS CLI by running aws configure in your terminal. You will be prompted to enter your Access Key ID, your Secret Access Key, your default region name, and your default output format. The command saves these values in the shared AWS configuration files (~/.aws/credentials and ~/.aws/config), which Terraform's AWS provider reads automatically.
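
If you keep several sets of credentials, you can tell the AWS provider which named profile to use. This is an optional sketch that assumes the profile is called default, which is the one aws configure writes unless you specify another:

```hcl
provider "aws" {
  region  = "us-west-2"
  profile = "default" # named profile from the shared AWS credentials file
}
```

When the profile argument is omitted, the provider falls back to its standard credential lookup, so this setting is only needed when you want a specific profile.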

First Terraform configuration

Create a new directory and your first configuration file. Let’s call it main.tf, and tell Terraform we want to create an AWS EC2 instance. The code, written in HCL, looks like this:

provider "aws" {
  region = "us-west-2"
}

resource "aws_instance" "example" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"
}
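
AMI IDs are specific to a region and are retired over time, so a hardcoded ID may not exist in your account's region. As an alternative sketch, the resource block above could be replaced with a lookup of a recent Amazon Linux image. The name filter is an assumed naming pattern for Amazon Linux 2023 images, so verify it against the images available in your region.

```hcl
# Look up the most recent matching AMI in the provider's region.
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-2023.*-x86_64"] # assumed naming pattern
  }
}

resource "aws_instance" "example" {
  ami           = data.aws_ami.amazon_linux.id # ID resolved at plan time
  instance_type = "t2.micro"
}
```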

Initializing the project

Now let’s initialize our Terraform project by running the following command:

$ terraform init

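This command sets up the working directory and downloads the required provider plugins. You do not need to run it for everyday changes to your resources, only when you add or change providers, modules, or backend configuration.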

Plan

We now want to generate our execution plan. Remember, the plan will highlight the changes Terraform will need to make to reach our desired state (in this case, a running EC2 instance). Run:

$ terraform plan

You should see something like this:

Terraform will perform the following actions:

  # aws_instance.example will be created
  + resource "aws_instance" "example" {
      + ami                           = "ami-0c55b159cbfafe1f0"
      + instance_type                 = "t2.micro"
      ...
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Apply

Having reviewed the plan, we can now execute it and provision the resources. In your terminal, run:

$ terraform apply

You will be asked to confirm the action. Type “yes” to proceed.

Do you want to perform these actions? 
Terraform will perform the actions described above. 
Only 'yes' will be accepted to approve. 
Enter a value: yes 

You should see the following:

aws_instance.example: Creating...
aws_instance.example: Still creating... [10s elapsed]
aws_instance.example: Creation complete after 15s [id=i-0abcdef1234567890]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

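Congrats, you provisioned your first EC2 instance on AWS using Terraform! When you no longer need the instance, run terraform destroy to remove it and avoid ongoing charges.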
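
To surface details of the new instance without opening the AWS console, you could add output blocks to main.tf. This is a small sketch; the output names are arbitrary, while id and public_ip are attributes of the aws_instance resource.

```hcl
output "instance_id" {
  description = "ID of the EC2 instance"
  value       = aws_instance.example.id
}

output "instance_public_ip" {
  description = "Public IP address of the EC2 instance"
  value       = aws_instance.example.public_ip
}
```

The values are printed at the end of the next terraform apply, and terraform output shows them again at any time.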

Benefits of Using Terraform

We mentioned before that using Terraform provides many benefits, like consistency and repeatability, and a reduced rate of human errors thanks to execution plans. 

Other benefits include:

  • Rapid Provisioning: Terraform can create and update resources in parallel, which is particularly useful for spinning up development, testing, and staging environments quickly.
  • Increased Collaboration: Paired with a version control system like Git, Terraform is a great way to increase collaboration and transparency over your infrastructure. Changes can be reviewed, approved, and tracked just like application code.
  • Efficient Disaster Recovery: If anything goes wrong, Terraform can quickly recreate the infrastructure from scratch using the same configuration files. Data stored in those resources still needs its own backups.
  • Community Support: Terraform encourages the use of modules, which are reusable configurations that can be shared and promote best practices when it comes to managing infrastructure. Terraform also boasts an active community and an extensive ecosystem of providers and modules, which means you’ll likely find something for your tech stack no matter how niche it is.
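
As an illustration of module reuse, the sketch below calls a community VPC module from the Terraform Registry. The version constraint, names, and address ranges are assumptions chosen for the example.

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # assumed version constraint

  name = "example-vpc" # illustrative values below
  cidr = "10.0.0.0/16"

  azs            = ["us-west-2a", "us-west-2b"]
  public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
}
```

A single module block like this stands in for the many individual resources (VPC, subnets, route tables) that the module defines internally.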

Common Use Cases for Terraform

Whether you are a Data Engineer building pipelines or a Data Scientist looking to deploy your solutions to production, Terraform is a great tool to manage your infrastructure needs. Here are some of the most common use cases:

Multi-cloud management

It is not always easy to manage infrastructure that spans multiple cloud providers. Terraform allows you to do just that with a single tool. On top of that, Terraform can manage on-premises resources, making it a good fit for organizations that operate in a hybrid environment.
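
As a rough sketch of what this looks like in practice, a single configuration can declare two providers and create a resource in each. The project ID and bucket names are hypothetical, and both clouds require bucket names to be globally unique.

```hcl
provider "aws" {
  region = "us-west-2"
}

provider "google" {
  project = "my-gcp-project" # hypothetical project ID
  region  = "us-west1"
}

# Object storage bucket on AWS.
resource "aws_s3_bucket" "data" {
  bucket = "example-data-bucket-aws" # hypothetical name
}

# Object storage bucket on Google Cloud.
resource "google_storage_bucket" "data" {
  name     = "example-data-bucket-gcp" # hypothetical name
  location = "US"
}
```

Both resources go through the same plan and apply workflow, even though they live in different clouds.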

CI/CD workflows

Terraform integrates well with CI/CD workflows, which means that you can deploy your infrastructure as part of your software delivery process. You can also automatically create and destroy preview, dev, testing, or staging environments as and when needed.
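
One common pattern, sketched here under the assumption that your pipeline passes in the target environment, is to parameterize the configuration with an input variable. The variable name, allowed values, and resource are illustrative, and the AMI ID is a placeholder.

```hcl
variable "environment" {
  type        = string
  description = "Target environment, set by the pipeline"

  validation {
    condition     = contains(["preview", "dev", "testing", "staging"], var.environment)
    error_message = "The environment must be one of: preview, dev, testing, staging."
  }
}

resource "aws_instance" "app" {
  ami           = "ami-123456" # placeholder AMI ID
  instance_type = "t2.micro"

  tags = {
    Name        = "app-${var.environment}"
    Environment = var.environment
  }
}
```

A pipeline step could then run terraform apply -var="environment=staging", or set the TF_VAR_environment environment variable instead. Each environment would also need its own state, for example through separate workspaces or backend keys.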

Kubernetes clusters

Terraform can manage Kubernetes clusters on different cloud providers like AWS (EKS), Azure (AKS), or GCP (GKE). This includes setting up the necessary infrastructure, managing Kubernetes resources within the cluster, and configuring node pool sizes and autoscaling settings so that clusters can scale with workload demands.
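
For resources inside a cluster, Terraform uses the Kubernetes provider. The following minimal sketch assumes a cluster already exists and that a kubeconfig for it is available at the path shown; the namespace name is hypothetical.

```hcl
provider "kubernetes" {
  config_path = "~/.kube/config" # assumes a kubeconfig for an existing cluster
}

resource "kubernetes_namespace" "data_pipelines" {
  metadata {
    name = "data-pipelines" # hypothetical namespace
  }
}
```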

Terraform vs. Other IaC Tools

Other popular IaC tools include Ansible, Chef, Puppet and AWS CloudFormation. Each tool has its strengths and weaknesses. Take a look:

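| Feature | Terraform | Ansible | Chef | Puppet | CloudFormation |
| --- | --- | --- | --- | --- | --- |
| Open Source | Source-available (BSL since 2023) | Yes | Yes | Yes | No |
| Declarative Syntax | Yes | No | No | Yes | Yes |
| Multi-Cloud Support | Yes | Yes | Yes | Yes | No, AWS only |
| State Management | Yes | No | No | Yes | Yes |
| Execution Plans | Yes | No | No | No | Yes (change sets) |
| Dependency Management | Yes | Limited | Limited | Yes | Yes |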

As we can see, while all these tools are powerful in their own right, Terraform stands out for its comprehensive feature set. However, this list is not exhaustive, and each tool serves slightly different use cases depending on requirements. For instance, Ansible's procedural nature makes it well suited to configuration management and ad-hoc tasks.

Conclusion

In the past 10 years, Terraform has helped a large number of organizations manage their IT infrastructure. With its declarative approach, automation capabilities, and support for multi-cloud environments, Terraform has become one of the most popular IaC tools, and is likely to remain a major player in the infrastructure management space as cloud adoption continues to grow.

Now that you understand the basics of Terraform, you can dive into more advanced configurations and real-life scenarios. Take a look at the HashiCorp Terraform tutorials to learn how to use Terraform for common tasks and use cases, or check out our 14 Essential Data Engineering Tools to Use in 2024 blog post to understand how Terraform fits into a Data Engineer toolkit.


Author
Marie Fayard

Senior Software Engineer, Technical Writer and Advisor with a background in physics. Committed to helping early-stage startups reach their potential and making complex concepts accessible to everyone.

Frequently Asked Questions

Is Terraform suitable for small-scale projects or is it primarily for enterprise use?

Terraform is a helpful tool, no matter your project size! Whether you have 5 resources to manage or 1,000, you get the same benefits: version-controlled, reviewable, and repeatable infrastructure.

How much does Terraform cost?

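The Terraform CLI is free to use. It was open source under the MPL 2.0 license until August 2023, when HashiCorp moved it to the Business Source License, so it is now source-available rather than open source. Bear in mind that your cloud provider(s) will charge you for the resources provisioned and managed via Terraform.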

Can Terraform manage databases and other stateful services?

Yes, Terraform can manage databases and other stateful services. You can define instances, clusters and configurations in your IaC files.

How does Terraform handle secrets and sensitive information in configuration files?

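Terraform supports several practices for handling secrets: passing values through environment variables, marking variables as sensitive so they are redacted from CLI output, and integrating with secret management services like HashiCorp Vault or AWS Secrets Manager. Note that sensitive values can still end up in the state file in plain text, so the state itself should be stored securely.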
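
As a minimal sketch of the sensitive-variable approach, applied to a managed database, the configuration below keeps the password out of the code. The identifier, username, and sizing values are hypothetical.

```hcl
variable "db_password" {
  type      = string
  sensitive = true # redacted in plan and apply output
}

resource "aws_db_instance" "example" {
  identifier          = "example-db" # hypothetical identifier
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20 # storage size in GiB
  username            = "app_user"
  password            = var.db_password
  skip_final_snapshot = true # convenient for experiments, not for production
}
```

The password can be supplied at run time through the TF_VAR_db_password environment variable, so it never needs to be written into a configuration file.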
