10 Docker Project Ideas: From Beginner to Advanced
Hands-on experience is essential for mastering Docker. Docker is a core tool in modern software development and data science, allowing you to build, deploy, and manage applications in containers.
In this article, I provide examples of beginner, intermediate, and advanced Docker projects focusing on multi-stage builds, optimizing Docker images, and applying Docker in data science. These projects are designed to deepen your understanding of Docker and improve your practical skills.
Getting Started with Docker Projects
Before jumping into the projects, ensure you have Docker installed on your machine. Depending on your OS (Windows, macOS, Linux), you can download Docker from the official Docker website.
You will also need a basic understanding of:
- Dockerfiles (to define what’s inside your containers)
- Docker Compose (for multi-container applications)
- Basic CLI commands like docker build, docker run, and docker-compose up
If you need to brush up on these concepts, check out the Introduction to Docker or the Containerization and Virtualization Concepts courses.
Let's get started!
Docker Projects for Beginners
When starting with Docker, choosing projects that match your skill level while challenging you to learn new concepts is important. Here are some project ideas to get you started:
Project 1: Setting up a simple web server
In this project, you'll create a Docker container that runs a basic web server using Nginx. Nginx is one of the most popular open-source web servers for reverse proxying, load balancing, and more. By the end of this project, you will have learned how to create and run containers with Docker and expose ports so the application can be accessed from your local machine.
Difficulty level: Beginner
Technologies used: Docker, Nginx
Step-by-step instructions
- Install Docker: Make sure Docker is installed on your system.
- Create the project directory: Create a new folder and an index.html file inside it that will be served by Nginx.
- Write the Dockerfile: A Dockerfile is a script that defines the environment of the container. It tells Docker what base image to use, what files to include, and what ports to expose:
FROM nginx:alpine
COPY ./index.html /usr/share/nginx/html
EXPOSE 80
- Build the Docker image: Navigate to your project folder and build the image using:
docker build -t my-nginx-app .
- Run the container: Start the container and map port 80 of the container to port 8080 on your machine:
docker run -d -p 8080:80 my-nginx-app
- Access the web server: Open your browser and navigate to http://localhost:8080 to see your created page.
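To confirm everything is wired up correctly from the command line, you can check the running container and fetch the page with curl (the container ID in the stop command is a placeholder to replace with your own):
# List running containers and confirm the 8080->80 port mapping
docker ps
# Fetch the page without opening a browser (requires curl)
curl http://localhost:8080
# Stop the container when you're done (replace <container_id> with the ID shown by docker ps)
docker stop <container_id>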
Project 2: Dockerizing a Python script
This project involves containerizing a simple Python script that processes data from a CSV file using the pandas library. The goal is to learn how to manage dependencies and execute Python scripts inside Docker containers, making the script portable and executable in any environment.
Difficulty level: Beginner
Technologies used: Docker, Python, pandas
Step-by-step instructions
- Write the Python script: Create a script named process_data.py that reads and processes a CSV file. Here’s an example script:
import pandas as pd

# The data/ folder is mounted into the container at runtime (see the docker run command below)
df = pd.read_csv('data/data.csv')
print(df.describe())
- Create a requirements.txt file: This file lists the Python libraries the script needs. In this case, we only need pandas:
pandas
- Write the Dockerfile: This file will define the environment for the Python script:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "process_data.py"]
- Build the Docker image:
docker build -t python-script .
- Run the container:
docker run -v $(pwd)/data:/app/data python-script
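If you don’t have a CSV handy, you can create a small example file in a data/ folder before running the container (the column names and values below are just placeholders):
# Create a tiny sample CSV on the host
mkdir -p data
printf "id,value\n1,10\n2,20\n3,30\n" > data/data.csv
# Mount the folder and run the containerized script
docker run -v "$(pwd)/data:/app/data" python-script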
Project 3: Building a simple multi-container application
This project will help you get familiar with Docker Compose by building a multi-container application. You’ll create a simple web application using Flask as the frontend and MySQL as the backend database. Docker Compose allows you to manage multiple containers that work together.
Difficulty level: Beginner
Technologies used: Docker, Docker Compose, Flask, MySQL
Step-by-step instructions
- Write the Flask application: Create a simple Flask app (saved as app.py) that connects to a MySQL database and displays a message. Here’s an example:
from flask import Flask
import mysql.connector
app = Flask(__name__)
def get_db_connection():
    connection = mysql.connector.connect(
        host="db",
        user="root",
        password="example",
        database="test_db"
    )
    return connection

@app.route('/')
def hello_world():
    connection = get_db_connection()
    cursor = connection.cursor()
    cursor.execute("SELECT 'Hello, Docker!'")
    result = cursor.fetchone()
    connection.close()
    return str(result[0])

if __name__ == "__main__":
    app.run(host='0.0.0.0')
- Create the docker-compose.yml file: Docker Compose defines and runs multi-container Docker applications. In this file, you will define the Flask app and the MySQL database services:
version: '3'
services:
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: example
      MYSQL_DATABASE: test_db
    ports:
      - "3306:3306"
    volumes:
      - db_data:/var/lib/mysql
  web:
    build: .
    ports:
      - "5000:5000"
    depends_on:
      - db
    environment:
      FLASK_ENV: development
    volumes:
      - .:/app
volumes:
  db_data:
- Write the Dockerfile for Flask: This will create the Docker image for the Flask application (make sure your requirements.txt lists flask and mysql-connector-python):
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]
- Build and run the containers: Use Docker Compose to bring up the entire application:
docker-compose up --build
- Access the Flask app: Go to http://localhost:5000 in your browser.
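If the page doesn’t load on the first try, the MySQL container may still be initializing. The commands below (the service names web and db come from the compose file above) help you check what’s going on and clean up afterwards:
# Show the status of both services
docker-compose ps
# Follow the logs of the Flask and MySQL containers
docker-compose logs -f web db
# Tear everything down (add -v to also remove the db_data volume)
docker-compose down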
Intermediate-Level Docker Projects
The following projects are for those with a solid understanding of Docker basics. These will introduce more complex concepts, such as multi-stage builds and optimization techniques.
Project 4: Multi-stage build for a Node.js application
Multi-stage builds help reduce Docker image sizes by separating the build and runtime environments. In this project, you will containerize a Node.js application using multi-stage builds.
Difficulty level: Intermediate
Technologies used: Docker, Node.js, Nginx
Step-by-step instructions
- Create a simple Node.js app: Write a basic Node.js server (saved as server.js, with express listed as a dependency in your package.json) that returns a simple message. Here’s an example:
const express = require('express');
const app = express();
app.get('/', (req, res) => res.send('Hello from Node.js'));
app.listen(3000, () => console.log('Server running on port 3000'));
- Write the Dockerfile with multi-stage build: The first stage builds the app, and the second stage runs the app with a lighter base image.
# Stage 1: Build
FROM node:14 as build-stage
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Add the following line if there's a build step for the app
# RUN npm run build
# Stage 2: Run
FROM node:14-slim
WORKDIR /app
COPY --from=build-stage /app .
EXPOSE 3000
ENV NODE_ENV=production
CMD ["node", "server.js"]
- Build the image:
docker build -t node-multi-stage .
- Run the container:
docker run -p 3000:3000 node-multi-stage
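To see the payoff of the multi-stage approach, compare image sizes with docker images; the slim runtime stage is typically much smaller than an image built on node:14 all the way through:
# List Node-related images and compare their sizes
docker images | grep node
# Inspect the size of the multi-stage image directly
docker images node-multi-stage --format "{{.Repository}}: {{.Size}}"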
Project 5: Dockerizing a machine learning model with TensorFlow
This project will involve containerizing a machine learning model using TensorFlow. The goal is to create a portable environment where you can run TensorFlow models across various systems without worrying about the underlying setup.
Difficulty level: Intermediate
Technologies used: Docker, TensorFlow, Python
Step-by-step instructions
- Write the Python script: Create a script named model.py that loads and runs a pre-trained TensorFlow model:
import tensorflow as tf
model = tf.keras.applications.MobileNetV2(weights='imagenet')
print("Model loaded successfully")
- Write the Dockerfile: Define the environment for TensorFlow inside Docker:
FROM tensorflow/tensorflow:latest
WORKDIR /app
COPY . .
CMD ["python", "model.py"]
- Build the image:
docker build -t tensorflow-model .
- Run the container:
docker run tensorflow-model
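As a quick sanity check, you can override the default command to print the TensorFlow version bundled in the image, or drop into an interactive shell for debugging (this assumes the base image provides bash, as the official tensorflow/tensorflow images do):
# Print the TensorFlow version inside the container
docker run --rm tensorflow-model python -c "import tensorflow as tf; print(tf.__version__)"
# Open an interactive shell for debugging
docker run --rm -it tensorflow-model bash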
Project 6: Creating a data science environment with Jupyter and Docker
This project focuses on creating a reproducible data science environment using Docker and Jupyter notebooks. The environment will include popular Python libraries like pandas, NumPy, and scikit-learn.
Difficulty level: Intermediate
Technologies used: Docker, Jupyter, Python, scikit-learn
Step-by-step instructions
- Create the docker-compose.yml file: Define the Jupyter Notebook service and the necessary libraries. Here’s an example:
version: '3'
services:
  jupyter:
    image: jupyter/scipy-notebook
    ports:
      - "8888:8888"
    volumes:
      - ./notebooks:/home/jovyan/work
- Start the Jupyter Notebook: Use Docker Compose to start the Jupyter Notebook.
docker-compose up
- Access the Jupyter Notebook: Open your browser and go to http://localhost:8888.
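The Jupyter Docker Stacks images generate an access token at startup, so you will likely need the tokenized login URL from the container logs (the service name jupyter matches the compose file above):
# Print the startup logs, which include the tokenized login URL
docker-compose logs jupyter
# Narrow the output down to the line containing the token
docker-compose logs jupyter | grep "token="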
Advanced-Level Docker Projects
These advanced-level projects will focus on real-world applications and advanced Docker concepts, such as deep learning pipelines and automated data pipelines.
Project 7: Reducing a Docker image size for a Python application
In this project, you'll optimize a Docker image for a Python application by using minimal base images like Alpine Linux and implementing multi-stage builds to keep the image size as small as possible.
Difficulty level: Advanced
Technologies used: Docker, Python, Alpine Linux
Step-by-step instructions
- Write the Python script: Create a script named script.py that analyzes data using pandas. Here’s an example script:
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())
- Optimize the Dockerfile: Use multi-stage builds and Alpine Linux to create a lightweight image.
# Stage 1: Build stage
FROM python:3.9-alpine as build-stage
WORKDIR /app
COPY requirements.txt .
# If no pre-built wheels are available for Alpine (musl), build tools may be needed:
# RUN apk add --no-cache build-base
RUN pip install --no-cache-dir -r requirements.txt
COPY script.py .
# Stage 2: Run stage
FROM python:3.9-alpine
WORKDIR /app
# Copy the installed packages and the script from the build stage
COPY --from=build-stage /usr/local/lib/python3.9/site-packages /usr/local/lib/python3.9/site-packages
COPY --from=build-stage /app/script.py .
CMD ["python", "script.py"]
- Build the image:
docker build -t optimized-python-app .
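The script expects a data.csv, which is not baked into the image, so mount one in at runtime; comparing sizes against the earlier python-script image shows what the slimmer base buys you (the paths below assume data.csv sits in your current directory):
# Compare image sizes
docker images | grep -E "optimized-python-app|python-script"
# Run the optimized image with a CSV mounted into the working directory
docker run --rm -v "$(pwd)/data.csv:/app/data.csv" optimized-python-app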
Project 8: Dockerizing a deep learning pipeline with PyTorch
This project involves containerizing a deep learning pipeline using PyTorch. The focus is optimizing the Dockerfile for performance and size, making it easy to run deep learning models in different environments.
Difficulty level: Advanced
Technologies used: Docker, PyTorch, Python
Step-by-step instructions
- Write the Python script: Create a script named model.py that loads a pre-trained PyTorch model and performs inference. Here’s an example:
import torch
model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
print("Model loaded successfully")
- Write the Dockerfile: Define the environment for PyTorch:
FROM pytorch/pytorch:1.9.0-cuda11.1-cudnn8-runtime
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY model.py .
CMD ["python", "model.py"]
- Build the image:
docker build -t pytorch-model .
- Run the container:
docker run pytorch-model
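The first run downloads the ResNet-18 weights, so expect some network traffic. If you have an NVIDIA GPU and the NVIDIA Container Toolkit installed on the host, you can also expose it to this CUDA-enabled image; otherwise the model simply runs on CPU:
# Run with GPU access (requires the NVIDIA Container Toolkit on the host)
docker run --rm --gpus all pytorch-model
# Quick check that CUDA is visible inside the container
docker run --rm --gpus all pytorch-model python -c "import torch; print(torch.cuda.is_available())"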
Project 9: Automating data pipelines with Apache Airflow and Docker
In this project, you’ll set up and containerize an Apache Airflow environment to automate data pipelines. Apache Airflow is a popular workflow orchestration tool that is widely used in data engineering.
Difficulty level: Advanced
Technologies used: Docker, Apache Airflow, Python, PostgreSQL
Step-by-step instructions
- Create the docker-compose.yml file: Define the Airflow services and the PostgreSQL database:
version: '3'
services:
  postgres:
    image: postgres:latest
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
  webserver:
    image: apache/airflow:latest
    environment:
      AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
      AIRFLOW__CORE__EXECUTOR: LocalExecutor
    depends_on:
      - postgres
    ports:
      - "8080:8080"
    volumes:
      - ./dags:/opt/airflow/dags
    command: ["webserver"]
  scheduler:
    image: apache/airflow:latest
    environment:
      AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
      AIRFLOW__CORE__EXECUTOR: LocalExecutor
    depends_on:
      - postgres
      - webserver
    volumes:
      - ./dags:/opt/airflow/dags
    command: ["scheduler"]
volumes:
  postgres_data:
- Start the Airflow environment: Use Docker Compose to bring up the Airflow environment:
docker-compose up
- Access the Airflow UI: Open your browser and go to http://localhost:8080.
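Note that with recent apache/airflow images, the UI login will not work until the metadata database has been initialized and an admin user created. Here is a sketch of those one-time steps, assuming the image's default entrypoint runs its arguments as Airflow CLI commands (the username and password are examples only):
# Initialize the Airflow metadata database in PostgreSQL
docker-compose run --rm webserver db init
# Create an admin user for the web UI
docker-compose run --rm webserver users create \
  --username admin --password admin \
  --firstname Admin --lastname User \
  --role Admin --email admin@example.com
# Restart the services so the webserver and scheduler pick everything up
docker-compose up -d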
Project 10: Deploying a data science API with FastAPI and Docker
Build and deploy a data science API using FastAPI. You’ll containerize the API using Docker and focus on optimizing it for production environments.
Difficulty level: Advanced
Technologies used: Docker, FastAPI, Python, scikit-learn
Step-by-step instructions
- Write the FastAPI application: Create a simple API (saved as app.py) that uses a machine learning model for predictions. Here’s an example:
from fastapi import FastAPI
import pickle

app = FastAPI()

# Load the trained model once at startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.post("/predict/")
def predict(data: list):
    # Convert the NumPy result to a plain list so it can be serialized to JSON
    return {"prediction": model.predict(data).tolist()}
- Write the Dockerfile: Create a Dockerfile that defines the environment for FastAPI (your requirements.txt should list fastapi, uvicorn, and scikit-learn):
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
- Build the image:
docker build -t fastapi-app .
- Run the container:
docker run -p 8000:8000 fastapi-app
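You can exercise the endpoint with curl; the payload below is only an example and has to match whatever features your pickled model was trained on (here, four numeric features per row):
curl -X POST http://localhost:8000/predict/ \
  -H "Content-Type: application/json" \
  -d "[[5.1, 3.5, 1.4, 0.2]]"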
Tips for Working on Docker Projects
As you work through these projects, keep the following tips in mind:
- Start small: Begin with simpler projects to build confidence, then move on to more complex tasks.
- Document your progress: Keep a detailed log of your projects to track your learning and to use as a reference for future endeavors.
- Join Docker communities: Engage with online forums and local meetups to share your experiences, ask questions, and learn from others.
- Experiment and customize: Don't be afraid to tweak the projects, try different approaches, and explore new Docker features.
- Keep learning: Continue to expand your Docker knowledge by exploring advanced topics and tools such as Kubernetes, Docker Swarm, or microservices architecture.
Conclusion
Mastering Docker involves more than just learning commands and configurations. It’s about understanding how Docker fits into modern application development, data science workflows, and infrastructure management.
The projects I shared in this guide provide ideas to help you build the foundational skills and hands-on experience needed to excel in real-world scenarios.
At this point, I suggest solidifying your knowledge by continuing with the Become a Data Engineer track.
FAQs
What are the best practices for writing efficient Dockerfiles?
Best practices for writing efficient Dockerfiles include minimizing the number of layers by combining commands, using multi-stage builds to reduce image size, selecting lightweight base images, caching dependencies, and avoiding including unnecessary files in the final image.
What is a multi-stage build in Docker?
A multi-stage build is a method for optimizing Docker images by separating the build and runtime environments. This results in smaller, more secure images.
How can you reduce the size of a Docker image?
Use minimal base images, manage dependencies efficiently, and employ multi-stage builds to reduce image size and improve performance.
How do I troubleshoot common errors when building Docker images?
Common errors when building Docker images include permission issues, incorrect Dockerfile syntax, and failed dependency installation. To troubleshoot, check Docker's build logs, ensure you’re using the correct base image, and confirm that any paths or file permissions are set correctly. Running docker build --no-cache can help identify caching issues.
Can I use Docker with Kubernetes for these projects?
Yes, once you're comfortable with Docker, Kubernetes can be the next step. Kubernetes helps manage containerized applications at scale. You can deploy your Docker projects on a Kubernetes cluster to manage multiple instances, handle scaling, and automate deployments.
What are some best practices for managing Docker volumes and persistent data?
When working with Docker volumes, it's important to use named volumes to ensure data persistence across container restarts. Regularly backup your volumes and monitor for any performance bottlenecks due to disk I/O. Avoid storing sensitive data in containers directly; use secure storage solutions or external databases instead.
What is the purpose of the ENTRYPOINT directive in a Dockerfile?
The ENTRYPOINT directive in a Dockerfile specifies the command that will always run when a container starts. It allows the container to be treated as an executable, where arguments can be passed during runtime, enhancing flexibility.
What is the difference between CMD and ENTRYPOINT in a Dockerfile?
Both CMD and ENTRYPOINT specify commands to run when a container starts. However, CMD provides default arguments that can be overridden, while ENTRYPOINT defines the command that always runs. ENTRYPOINT is useful for creating containers that act like executables, whereas CMD is more flexible for specifying default commands.
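As a quick illustration with a hypothetical image named my-image: with CMD ["python", "app.py"], anything you type after the image name replaces the default command entirely, whereas with ENTRYPOINT ["python"], the extra arguments are appended to it:
# CMD ["python", "app.py"]: the whole default command is replaced
docker run my-image python other_script.py
# ENTRYPOINT ["python"]: arguments are appended, so this runs "python other_script.py"
docker run my-image other_script.py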
AWS Certified Cloud Solutions Architect, DevOps, Cloud Engineer with extensive understanding of high availability architecture and concepts. I possess knowledge of cloud engineering and DevOps and am adept at utilizing open-source resources to execute enterprise applications. I build cloud-based applications using AWS, AWS CDK, AWS SAM, CloudFormation, Serverless Framework, Terraform, and Django.
Learn more about Docker with the Intermediate Docker course and the Containerization and Virtualization track!