
10 Docker Project Ideas: From Beginner to Advanced

Learn Docker with these hands-on project ideas for all skill levels, from beginner to advanced, focused on building and optimizing data science applications.
Oct 8, 2024  · 22 min read

Hands-on experience is essential for mastering Docker, a core tool in modern software development and data science that lets you build, deploy, and manage applications in containers.

In this article, I provide examples of beginner, intermediate, and advanced Docker projects focusing on multi-stage builds, optimizing Docker images, and applying Docker in data science. These projects are designed to deepen your understanding of Docker and improve your practical skills.

Getting Started with Docker Projects

Before jumping into the projects, ensure you have Docker installed on your machine. Depending on your OS (Windows, macOS, Linux), you can download Docker from the official Docker website.

You will also need a basic understanding of:

  • Dockerfiles (to define what’s inside your containers)
  • Docker Compose (for multi-container applications)
  • Basic CLI commands like docker build, docker run, docker-compose up, etc.
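For reference, a typical workflow for the projects below uses just a handful of commands (the image name and ports here are placeholders):

docker build -t my-image .         # build an image from the Dockerfile in the current directory
docker run -d -p 8080:80 my-image  # run a container in the background and map a port
docker ps                          # list running containers
docker-compose up                  # start all services defined in docker-compose.yml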

If you need to brush up on these concepts, check out the Introduction to Docker or the Containerization and Virtualization Concepts courses.

Let's get started!

Docker Projects for Beginners

When starting with Docker, it's important to choose projects that match your skill level while still challenging you to learn new concepts. Here are some project ideas to get you started:

Project 1: Setting up a simple web server

In this project, you'll create a Docker container that runs a basic web server using Nginx. Nginx is one of the most popular open-source web servers and is also widely used for reverse proxying and load balancing. By the end of this project, you will have learned how to create and run containers with Docker and expose ports so the application can be accessed from your local machine.

Difficulty level: Beginner

Technologies used: Docker, Nginx

Step-by-step instructions

  • Install Docker: Make sure Docker is installed on your system.
  • Create the project directory: Create a new folder and an index.html file inside it that will be served by Nginx (a sample page is shown after these steps).
  • Write the Dockerfile: A Dockerfile is a script that defines the environment of the container. It tells Docker what base image to use, what files to include, and what ports to expose:
FROM nginx:alpine
COPY ./index.html /usr/share/nginx/html
EXPOSE 80
  • Build the Docker image: Navigate to your project folder and build the image using:
docker build -t my-nginx-app .
  • Run the container: Start the container and map port 80 of the container to port 8080 on your machine:
docker run -d -p 8080:80 my-nginx-app
  • Access the web server: Open your browser and navigate to http://localhost:8080 to see your created page.
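The steps above assume an index.html exists in the project folder; any valid HTML page works. A minimal placeholder might look like this:

<!DOCTYPE html>
<html>
  <head>
    <title>My Nginx App</title>
  </head>
  <body>
    <h1>Hello from Nginx in Docker!</h1>
  </body>
</html>

When you're finished, docker ps shows the running container and docker stop <container-id> shuts it down.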

Project 2: Dockerizing a Python script

This project involves containerizing a simple Python script that processes data from a CSV file using the pandas library. The goal is to learn how to manage dependencies and execute Python scripts inside Docker containers, making the script portable and executable in any environment.

Difficulty level: Beginner

Technologies used: Docker, Python, pandas

Step-by-step instructions

  • Write the Python script: Create a script named process_data.py that reads and processes a CSV file. Here’s an example script:
import pandas as pd

# The CSV lives in the data/ folder, which is mounted into the container at runtime
df = pd.read_csv('data/data.csv')
print(df.describe())
  • Create a requirements.txt file: This file lists the Python libraries the script needs. In this case, we only need pandas:
pandas
  • Write the Dockerfile: This file will define the environment for the Python script:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "process_data.py"]
  • Build the Docker image:
docker build -t python-script .
  • Run the container:
docker run -v $(pwd)/data:/app/data python-script
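To test the container end to end, you could create a small sample CSV in a data/ folder before running it (the file contents here are just an illustration):

mkdir -p data
printf "name,score\nalice,90\nbob,85\n" > data/data.csv
docker build -t python-script .
docker run -v $(pwd)/data:/app/data python-script

The container should print summary statistics for the two rows and then exit.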

Project 3: Building a simple multi-container application

This project will help you get familiar with Docker Compose by building a multi-container application. You'll create a simple web application with Flask, backed by a MySQL database. Docker Compose allows you to manage multiple containers that work together.

Difficulty level: Beginner

Technologies used: Docker, Docker Compose, Flask, MySQL

Step-by-step instructions

  • Write the Flask application: Create a simple Flask app named app.py that connects to a MySQL database and displays a message. Here's an example:
from flask import Flask
import mysql.connector

app = Flask(__name__)

def get_db_connection():
    connection = mysql.connector.connect(
        host="db",
        user="root",
        password="example",
        database="test_db"
    )
    return connection

@app.route('/')
def hello_world():
    connection = get_db_connection()
    cursor = connection.cursor()
    cursor.execute("SELECT 'Hello, Docker!'")
    result = cursor.fetchone()
    connection.close()
    return str(result[0])

if __name__ == "__main__":
    app.run(host='0.0.0.0')
  • Create the docker-compose.yml file: Docker Compose defines and runs multi-container Docker applications. In this file, you will define the Flask app and the MySQL database services:
version: '3'
services:
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: example
      MYSQL_DATABASE: test_db
    ports:
      - "3306:3306"
    volumes:
      - db_data:/var/lib/mysql

  web:
    build: .
    ports:
      - "5000:5000"
    depends_on:
      - db
    environment:
      FLASK_ENV: development
    volumes:
      - .:/app

volumes:
  db_data:
  • Write the Dockerfile for Flask: This will create the Docker image for the Flask application:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]
  • Build and run the containers: Use Docker Compose to bring up the entire application:
docker-compose up --build
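The Flask image installs dependencies from a requirements.txt that isn't listed in the steps above; based on the imports in app.py, it needs at least:

flask
mysql-connector-python

Once both services are up, opening http://localhost:5000 should display Hello, Docker!. Note that MySQL can take a little while to initialize on its first startup, so the very first request may fail until the database is ready.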


Intermediate-Level Docker Projects

The following projects are for those with a solid understanding of Docker basics. These will introduce more complex concepts, such as multi-stage builds and optimization techniques.

Project 4: Multi-stage build for a Node.js application

Multi-stage builds help reduce Docker image sizes by separating the build and runtime environments. In this project, you will containerize a Node.js application using multi-stage builds.

Difficulty level: Intermediate

Technologies used: Docker, Node.js, Nginx

Step-by-step instructions

  • Create a simple Node.js app: Write a basic Express server and save it as server.js (a minimal package.json for it is shown after these steps). Here's an example:
const express = require('express');
const app = express();
 
app.get('/', (req, res) => res.send('Hello from Node.js'));
 
app.listen(3000, () => console.log('Server running on port 3000'));
  • Write the Dockerfile with multi-stage build: The first stage builds the app, and the second stage runs the app with a lighter base image.
# Stage 1: Build
FROM node:14 as build-stage
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Add the following line if there's a build step for the app
# RUN npm run build

# Stage 2: Run
FROM node:14-slim
WORKDIR /app
COPY --from=build-stage /app .
EXPOSE 3000
ENV NODE_ENV=production
CMD ["node", "server.js"]
  • Build the image:
docker build -t node-multi-stage .
  • Run the container:
docker run -p 3000:3000 node-multi-stage
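The build stage copies package*.json and runs npm install, so the project needs a package.json that lists Express as a dependency. A minimal example (the name and version numbers are placeholders) might look like this:

{
  "name": "node-multi-stage-demo",
  "version": "1.0.0",
  "main": "server.js",
  "scripts": {
    "start": "node server.js"
  },
  "dependencies": {
    "express": "^4.18.2"
  }
}

With the container running, http://localhost:3000 should return the "Hello from Node.js" message.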

Project 5: Dockerizing a machine learning model with TensorFlow

This project will involve containerizing a machine learning model using TensorFlow. The goal is to create a portable environment where you can run TensorFlow models across various systems without worrying about the underlying setup.

Difficulty level: Intermediate

Technologies used: Docker, TensorFlow, Python

Step-by-step instructions

  • Write the Python script: Create a script named model.py that loads a pre-trained TensorFlow model:
import tensorflow as tf
model = tf.keras.applications.MobileNetV2(weights='imagenet')
print("Model loaded successfully")
  • Write the Dockerfile: Define the environment for TensorFlow inside Docker:
FROM tensorflow/tensorflow:latest
WORKDIR /app
COPY . .
CMD ["python", "model.py"]
  • Build the image:
docker build -t tensorflow-model .
  • Run the container:
docker run tensorflow-model
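Loading the model only confirms that the environment works. If you want the container to exercise the model as well, one way you might extend model.py is to run a prediction on a random, image-shaped input (purely illustrative, not real data):

import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights='imagenet')

# Run inference on a random tensor just to confirm the pipeline works end to end
dummy = np.random.rand(1, 224, 224, 3).astype("float32")
preds = model.predict(dummy)
print(tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=3))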

Project 6: Creating a data science environment with Jupyter and Docker

This project focuses on creating a reproducible data science environment using Docker and Jupyter notebooks. The environment will include popular Python libraries like pandas, NumPy, and scikit-learn.

Difficulty level: Intermediate

Technologies used: Docker, Jupyter, Python, scikit-learn

Step-by-step instructions

  • Create the docker-compose.yml file: Define the Jupyter Notebook service and the necessary libraries. Here’s an example:
version: '3'
services:
  jupyter:
    image: jupyter/scipy-notebook
    ports:
      - "8888:8888"
    volumes:
      # The Jupyter Docker Stacks images run as the jovyan user, so mount into its work directory
      - ./notebooks:/home/jovyan/work
  • Start the Jupyter Notebook: Use Docker Compose to start the Jupyter Notebook.
docker-compose up
  • Access the Jupyter Notebook: Open your browser and go to http://localhost:8888, using the access token printed in the container logs.
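If you don't have the startup output handy, you can retrieve the tokenized URL from the logs of the jupyter service defined above:

docker-compose logs jupyter

Look for a line containing http://127.0.0.1:8888/?token=... and open it in your browser. Notebooks saved under the work directory will appear in the ./notebooks folder on your machine.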

Advanced-Level Docker Projects

These advanced-level projects will focus on real-world applications and advanced Docker concepts, such as deep learning pipelines and automated data pipelines.

Project 7: Reducing a Docker image size for a Python application

In this project, you'll optimize a Docker image for a Python application by using minimal base images like Alpine Linux and implementing multi-stage builds to keep the image size as small as possible.

Difficulty level: Advanced

Technologies used: Docker, Python, Alpine Linux

Step-by-step instructions

  • Write the Python script: Create a script named script.py that analyzes data using pandas. Here's an example script:
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())
  • Optimize the Dockerfile: Use multi-stage builds and Alpine Linux to create a lightweight image.
# Stage 1: Build stage
FROM python:3.9-alpine as build-stage
WORKDIR /app
COPY requirements.txt .
# Note: if no pre-built wheel is available for Alpine (musl), pandas may require build tools such as gcc and musl-dev
RUN pip install --no-cache-dir -r requirements.txt
COPY script.py .

# Stage 2: Run stage
FROM python:3.9-alpine
WORKDIR /app
# Copy both the installed packages and the script, so pandas is available at runtime
COPY --from=build-stage /usr/local/lib/python3.9/site-packages /usr/local/lib/python3.9/site-packages
COPY --from=build-stage /app/script.py .
CMD ["python", "script.py"]
  • Build the image:
docker build -t optimized-python-app .
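Because only script.py is baked into the final stage, the CSV has to be provided at runtime, and you can check the resulting image size with docker images:

# Provide the CSV at runtime via a bind mount
docker run -v $(pwd)/data.csv:/app/data.csv optimized-python-app
# Inspect the size of the optimized image
docker images optimized-python-app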

Project 8: Dockerizing a deep learning pipeline with PyTorch

This project involves containerizing a deep learning pipeline using PyTorch. The focus is optimizing the Dockerfile for performance and size, making it easy to run deep learning models in different environments.

Difficulty level: Advanced

Technologies used: Docker, PyTorch, Python

Step-by-step instructions

  • Write the Python script: Create a script named model.py that loads a pre-trained PyTorch model:
import torch
model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
print("Model loaded successfully")
  • Write the Dockerfile: Define the environment for PyTorch:
FROM pytorch/pytorch:1.9.0-cuda11.1-cudnn8-runtime
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY model.py .
CMD ["python", "model.py"]
  • Build the image:
docker build -t pytorch-model .
  • Run the container:
docker run pytorch-model 
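The pytorch/pytorch runtime image ships with CUDA libraries, but the container only gets access to a GPU if you pass it through explicitly, which requires the NVIDIA Container Toolkit on the host; otherwise the model simply runs on the CPU:

docker run --gpus all pytorch-model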

Project 9: Automating data pipelines with Apache Airflow and Docker

In this project, you'll set up and containerize an Apache Airflow environment to automate data pipelines. Apache Airflow is a popular workflow orchestration tool that is widely used in data engineering.

Difficulty level: Advanced

Technologies used: Docker, Apache Airflow, Python, PostgreSQL

Step-by-step instructions

  • Create the docker-compose.yml file: Define the Airflow services and the PostgreSQL database:
version: '3'
services:
  postgres:
    image: postgres:latest
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data

  webserver:
    image: apache/airflow:latest
    environment:
      AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
      AIRFLOW__CORE__EXECUTOR: LocalExecutor
    depends_on:
      - postgres
    ports:
      - "8080:8080"
    volumes:
      - ./dags:/opt/airflow/dags
    command: ["webserver"]

  scheduler:
    image: apache/airflow:latest
    environment:
      AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
      AIRFLOW__CORE__EXECUTOR: LocalExecutor
    depends_on:
      - postgres
      - webserver
    volumes:
      - ./dags:/opt/airflow/dags
    command: ["scheduler"]

volumes:
  postgres_data:
  • Start the Airflow environment: Use Docker Compose to bring up the Airflow environment:
docker-compose up
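Depending on the Airflow version pulled by the latest tag, you may need to initialize the metadata database and create an admin user before the web UI at http://localhost:8080 is usable, and newer releases expect the connection string as AIRFLOW__DATABASE__SQL_ALCHEMY_CONN rather than AIRFLOW__CORE__SQL_ALCHEMY_CONN, so pinning a specific 2.x image tag is often more predictable than latest. A typical one-time initialization against the services above (the username and password are placeholders) looks like this:

docker-compose run webserver airflow db init
docker-compose run webserver airflow users create \
  --username admin --password admin \
  --firstname Admin --lastname User \
  --role Admin --email admin@example.com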

Project 10: Deploying a data science API with FastAPI and Docker

Build and deploy a data science API using FastAPI. You’ll containerize the API using Docker and focus on optimizing it for production environments.

Difficulty level: Advanced

Technologies used: Docker, FastAPI, Python, scikit-learn

Step-by-step instructions

  • Write the FastAPI application: Create a simple API in a file named app.py that uses a pickled machine learning model (model.pkl) for predictions. Here's an example:
from fastapi import Body, FastAPI
import pickle

app = FastAPI()

# Load a scikit-learn model serialized with pickle (model.pkl must be in the build context)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.post("/predict/")
def predict(data: list = Body(...)):
    # The model expects a 2D array of feature rows; convert the NumPy result to a plain list for JSON
    return {"prediction": model.predict(data).tolist()}
  • Write the Dockerfile: Create a Dockerfile that defines the environment for FastAPI:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
  • Build the image:
docker build -t fastapi-app .
  • Run the container:
docker run -p 8000:8000 fastapi-app
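The image installs dependencies from a requirements.txt that isn't shown above; based on the imports and the uvicorn command in the Dockerfile, it needs at least:

fastapi
uvicorn
scikit-learn

With the container running, you could test the endpoint with curl, posting a list of feature rows (the feature values below are placeholders and depend on the model you pickled):

curl -X POST http://localhost:8000/predict/ \
  -H "Content-Type: application/json" \
  -d "[[5.1, 3.5, 1.4, 0.2]]"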

Tips for Working on Docker Projects

As you work through these projects, keep the following tips in mind:

  • Start small: Begin with simpler projects, then move on to more complex tasks. Building confidence with straightforward tasks first is critical.
  • Document your progress: Keep a detailed log of your projects to track your learning and to use as a reference for future endeavors.
  • Join Docker communities: Engage with online forums and local meetups to share your experiences, ask questions, and learn from others.
  • Experiment and customize: Don't be afraid to tweak the projects, try different approaches, and explore new Docker features.
  • Keep learning: Continue to expand your Docker knowledge by exploring advanced topics and tools such as Kubernetes, Docker Swarm, or microservices architecture.

Conclusion

Mastering Docker involves more than just learning commands and configurations. It’s about understanding how Docker fits into modern application development, data science workflows, and infrastructure management.

The projects I shared in this guide provide ideas for building the foundational skills and hands-on experience needed to excel in real-world scenarios.

At this point, I suggest solidifying your knowledge with courses like Introduction to Docker or Containerization and Virtualization Concepts.

FAQs

What are the best practices for writing efficient Dockerfiles?

Best practices for writing efficient Dockerfiles include minimizing the number of layers by combining commands, using multi-stage builds to reduce image size, selecting lightweight base images, caching dependencies, and avoiding including unnecessary files in the final image.

What is a multi-stage build in Docker?

A multi-stage build is a method for optimizing Docker images by separating the build and runtime environments. This results in smaller, more secure images.

How can you reduce the size of a Docker image?

Use minimal base images, manage dependencies efficiently, and employ multi-stage builds to reduce image size and improve performance.

How do I troubleshoot common errors when building Docker images?

Common errors when building Docker images include permission issues, incorrect Dockerfile syntax, and failed dependency installation. To troubleshoot, check Docker's build logs, ensure you're using the correct base image, and confirm that paths and file permissions are set correctly. Building with docker build --no-cache can help rule out caching issues.

Can I use Docker with Kubernetes for these projects?

Yes, once you're comfortable with Docker, Kubernetes can be the next step. Kubernetes helps manage containerized applications at scale. You can deploy your Docker projects on a Kubernetes cluster to manage multiple instances, handle scaling, and automate deployments.

What are some best practices for managing Docker volumes and persistent data?

When working with Docker volumes, it's important to use named volumes to ensure data persists across container restarts. Regularly back up your volumes and monitor for performance bottlenecks caused by disk I/O. Avoid storing sensitive data directly in containers; use secure storage solutions or external databases instead.

What is the purpose of the ENTRYPOINT directive in a Dockerfile?

The ENTRYPOINT directive in a Dockerfile specifies the command that will always run when a container starts. It allows the container to be treated as an executable, where arguments can be passed during runtime, enhancing flexibility.

What is the difference between CMD and ENTRYPOINT in a Dockerfile?

Both CMD and ENTRYPOINT specify commands to run when a container starts. However, CMD provides default arguments that can be overridden, while ENTRYPOINT defines the command that always runs. ENTRYPOINT is useful for creating containers that act like executables, whereas CMD is more flexible for specifying default commands.

