In the world of modern data pipelines and microservices, Apache Kafka stands out as a go-to solution for handling real-time event streams and integrating distributed systems. As more engineering teams move toward scalable, event-driven architectures, Kafka becomes central to reliable data movement.
Alongside this trend, Docker has emerged as the leading way for developers to manage, share, and deploy complex services without worrying about inconsistent environments or local issues.
By using Docker to run Kafka, teams can quickly spin up production-like clusters for testing, proof-of-concept demos, or even production workloads.
This article is aimed at intermediate backend developers, DevOps engineers, and anyone involved in managing data platforms who wants to make working with Kafka straightforward and reproducible.
If you are new to Kafka or Docker, be sure to refer to our Introduction to Apache Kafka course or Docker for Beginners: A Practical Guide to Containers.
What is Kafka, and Why Use Docker?
Apache Kafka is a distributed streaming platform designed for high-throughput, fault-tolerant, and scalable messaging.
It acts as a durable, high-speed pipeline between data producers and consumers. Whenever you need to connect microservices, orchestrate data flows, or process real-time analytics, Kafka is often the tool of choice.
You can learn how to build real-time data processing applications with Kafka Streams with our Kafka Streams Tutorial. The tutorial covers core concepts, Java & Python implementations, and step-by-step examples for building scalable streaming applications.
Running Kafka can be complex. Even experienced engineers find the complexity of brokers, topics, partitions, and necessary components like Zookeeper or KRaft frustrating.
Docker simplifies the process by packaging up binaries, dependencies, and configurations into containers.
With Docker, engineers can start a Kafka cluster on any machine, share identical setups with their colleagues, and avoid the common "works on my machine" headache. Dockerized Kafka is especially popular for:
- Local development and testing, where fast spin-up and teardown are invaluable
- Isolated integration environments in CI/CD pipelines
- Mimicking multi-broker clusters on a single host
- Training and demonstration setups
Explore Apache Kafka with our Apache Kafka for Beginners guide. Learn the basics, get started, and uncover advanced features and real-world applications of this powerful event-streaming platform.
Fundamentals of Kafka in Docker
Before diving into orchestration, it helps to break down both Kafka’s technical pieces and how Docker helps recreate its distributed nature on a laptop or server.
As you will learn in our Learn Docker from Scratch tutorial, Docker is a popular tool for simplifying the deployment, scaling, and management of applications such as Kafka using containerization.
Kafka architecture basics
Kafka’s backbone is its broker: a server process that receives, stores, and serves messages. Each cluster can consist of multiple brokers, distributing data and load.
Inside each broker are topics (named channels for organizing messages) and partitions (sub-channels that spread events across many servers for parallelism and durability). Producers write data to topics, while consumers subscribe to and process those messages.
Coordination is key. Traditionally, Kafka has relied on Zookeeper to handle broker metadata, partition leadership, and overall cluster health. Newer deployments may use KRaft mode, which internalizes this coordination into Kafka itself, removing the Zookeeper dependency. ZooKeeper was deprecated in Kafka 3.5 and removed entirely in Kafka 4.0.
Learn how to containerize machine learning applications with Docker and Kubernetes using our How to Containerize an Application Using Docker tutorial.
KRaft mode: Simplifying Kafka's architecture
Starting with version 2.8, Kafka introduced KRaft mode, which lets it manage metadata internally without relying on Zookeeper; KRaft was declared production-ready in version 3.3. This shift simplifies deployments and reduces the number of components to manage.
To set up Kafka in KRaft mode using Docker Compose:
version: '3.8'
services:
  kafka:
    image: apache/kafka:latest
    container_name: kafka
    ports:
      - "9092:9092"
      - "9093:9093"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@localhost:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_LOG_DIRS: /var/lib/kafka/data
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_LOG_RETENTION_HOURS: 168
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      CLUSTER_ID: "Mk3OEYBSD34fcwNTJENDM2Qk"
    volumes:
      - ./data:/var/lib/kafka/data
This configuration sets up a single-node Kafka cluster in KRaft mode, eliminating the need for Zookeeper.
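Once the container is up, you can quickly verify that the broker responds. This check assumes the apache/kafka image's standard script location under /opt/kafka/bin:
# List topics to confirm the broker is reachable (a fresh cluster has none yet)
docker exec -it kafka /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092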
Docker essentials
To run Kafka with Docker, you only need a few basics: Docker Engine (the runtime) and often Docker Compose for multi-container management.
Compose files allow you to specify several services (Kafka, Zookeeper, Kafka UI, etc.), their networks, environment variables, and any storage mounts. Docker’s networking makes it possible to emulate multi-broker clusters, even on a single machine, by giving each broker its own container, hostname, and ports.
One of the best ways to get familiar with Docker is through projects. Practice using these 10 Docker Project Ideas.
Setting Up Kafka with Docker Compose
Docker Compose is often the preferred way to run applications that require multiple interacting containers. Instead of juggling shell scripts or manual commands, Compose lets you define all services and their links in a single YAML file.
This approach reduces configuration drift, speeds up onboarding, and guarantees everyone can use the same stack for development.
Minimal Compose setup
Here’s a basic docker-compose.yml that starts Kafka and Zookeeper for local development:
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
    ports:
      - "2181:2181"
  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    ports:
      - "9092:9092"
    volumes:
      - kafka_data:/var/lib/kafka/data
volumes:
  kafka_data:
This minimal setup maps the relevant ports, links Kafka to Zookeeper, and mounts a Docker volume, so messages do not vanish if the containers restart. Be sure to use environment variables to configure broker identity, listener addresses, and connection endpoints between services.
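To start the stack and confirm that both containers are healthy, the standard Compose workflow applies:
# Start Zookeeper and Kafka in the background
docker-compose up -d

# Check that both services are running
docker-compose ps

# Follow the broker logs while it starts up
docker-compose logs -f kafka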
A thorough understanding of Kafka is essential in acing a data engineering interview. Prepare for your next data engineering interview with our extensive list of Kafka interview questions and answers using these 20 Kafka Interview Questions for Data Engineers.
Extended configuration
Professional setups often add tools such as Kafka UI, Schema Registry, or REST Proxy for easier management and inspection of messages. These can be added to the services block in your compose file. For example:
  kafka-ui:
    image: provectuslabs/kafka-ui:latest
    ports:
      - "8080:8080"
    environment:
      KAFKA_CLUSTERS_0_NAME: "Local"
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: "kafka:9092"
Persisting Kafka topics and logs across restarts is essential for testing recovery scenarios or long-running jobs. Always mount Docker volumes to /var/lib/kafka/data for each broker and /var/lib/zookeeper for Zookeeper, if in use.
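You can confirm where Docker stores this data on the host. Note that Compose prefixes volume names with the project name, so myproject_kafka_data below is illustrative:
# List volumes, then inspect the Kafka data volume
docker volume ls
docker volume inspect myproject_kafka_data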
Choosing the right image is key. Here is a quick summary:
| Kafka Docker Image | Maintainer | Features | Best Use Case |
| --- | --- | --- | --- |
| Confluent | Confluent | Full set, many add-ons | Production-like setups, advanced testing |
| Bitnami | Bitnami | Clean, minimal | Local development, resource-constrained environments |
| Apache Kafka (KRaft) | Apache | Zookeeper-less, simplified setup | Modern deployments, simplified architecture |
The How to Learn Apache Kafka in 2025 tutorial goes into more detail about Kafka, including:
- What makes Apache Kafka popular?
- Main features of Apache Kafka
- Various applications of Apache Kafka
Interacting with Kafka in Docker
Once Kafka is running via Docker Compose, you can immediately create topics and send data, just as you would with any standard deployment.
CLI access and Kafka commands
Use docker-compose exec or docker exec to run Kafka CLI tools. Note that the Confluent images ship the tools without the .sh suffix, while the Apache image uses the .sh scripts under /opt/kafka/bin. For example, to create a topic with the Confluent image from the Compose setup above, you can execute:
docker-compose exec kafka kafka-topics --create --topic demo --bootstrap-server localhost:9092
You can also use kafka-console-producer and kafka-console-consumer in the same way. This provides a fast feedback loop for integration testing or experimentation without polluting your local environment.
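For example, a quick round trip on the demo topic, run in two terminals:
# Terminal 1: produce messages interactively (Ctrl+C to exit)
docker-compose exec kafka kafka-console-producer --topic demo --bootstrap-server localhost:9092

# Terminal 2: consume everything from the beginning of the topic
docker-compose exec kafka kafka-console-consumer --topic demo --from-beginning --bootstrap-server localhost:9092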
Programmatic access from apps
Applications and scripts can connect to your containerized Kafka using the host’s advertised bootstrap servers. For local apps, set bootstrap.servers=localhost:9092 or the equivalent.
Common clients include the official Kafka libraries for Python, Java, Node.js, and Go. Make sure your application’s network stack can reach the advertised ports and addresses.
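Before debugging client code, a quick reachability check from the host can rule out port-mapping problems; this assumes the netcat (nc) utility is installed:
# Verify the mapped Kafka port is reachable from the host
nc -vz localhost 9092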
Docker Networking and Kafka Connectivity
Kafka's networking configuration frequently presents challenges. Understanding internal and external listeners and how to expose services is vital.
Internal vs external listeners
Kafka brokers use listeners to control client connections. Two common setups are:
- PLAINTEXT://:9092 for local, unsecured traffic, especially for dev
- SSL or SASL_SSL listeners for encrypted or authenticated connections
In docker-compose, expose the right ports and ensure KAFKA_ADVERTISED_LISTENERS matches the actual host and port your apps use. If running in a VM or cloud, set this configuration to the public IP and mapped port.
| Aspect | Internal Listener | External Listener |
| --- | --- | --- |
| Common Protocol | PLAINTEXT://:9092 | SSL://, SASL_SSL://, or mapped PLAINTEXT:// |
| Security | Unencrypted, unauthenticated | Encrypted and/or authenticated |
| Typical Usage | Local development, intra-container traffic | Remote clients, cloud VMs, production access |
| Docker Compose Setup | Expose port 9092 inside network | Map 9092 to host, set KAFKA_ADVERTISED_LISTENERS to public IP |
| Configuration Focus | Simplicity and speed | Reliability, security, public access |
| Networking Scope | Localhost or internal Docker network | Public IP or external-facing domain |
Configuring multiple listeners for diverse client access
In Docker environments, it's common to set up multiple listeners to handle different client access scenarios:
environment:
  KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
  KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:29092,EXTERNAL://localhost:9092
  KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
  KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
This setup allows internal Docker clients to connect via the INTERNAL listener and external clients to connect via the EXTERNAL listener.
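Assuming the Compose service is named kafka and uses the Confluent image, the same topic listing can be requested through each listener to confirm both paths work:
# Via the EXTERNAL listener, the address host-side apps use
docker-compose exec kafka kafka-topics --list --bootstrap-server localhost:9092

# Via the INTERNAL listener, using the Docker service DNS name
docker-compose exec kafka kafka-topics --list --bootstrap-server kafka:29092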
Troubleshooting connectivity
Common errors include ‘broker not available’, ‘connection refused’, or client timeouts. Verify:
- Ports are exposed and mapped correctly
- Broker advertised listeners match your client’s target address
- All containers are healthy, using docker-compose ps
- DNS resolution works between containers (use the service name, e.g., kafka:9092)
- Cross-container links resolve; docker network inspect helps debug them (see the commands below)
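A few commands cover most of these checks; the network name again depends on your Compose project, so myproject_default is a placeholder:
# Confirm container state and port mappings
docker-compose ps

# Inspect the Compose network to see which containers are attached
docker network inspect myproject_default

# Tail broker logs for bind or listener errors
docker-compose logs --tail=50 kafka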
Kafka Docker in Production-Like Environments
Beyond local testing, Dockerized Kafka environments also help with staging and CI/CD.
Container orchestration strategies
Orchestrators like Docker Swarm and Kubernetes can manage multi-broker Kafka setups, rolling updates, and service discovery. Each broker gets its own container, with attached persistent storage.
In Kubernetes, StatefulSets handle ordered deployment and create stable DNS names for each broker.
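As a rough sketch of the shape such a manifest takes (not a production-ready configuration: the image, replica count, and storage size are illustrative, broker env vars are omitted, and real deployments often use an operator such as Strimzi):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka      # a headless Service gives each pod a stable DNS name: kafka-0.kafka, kafka-1.kafka, ...
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: apache/kafka:latest   # illustrative; real clusters pin a version
          ports:
            - containerPort: 9092
          volumeMounts:
            - name: data
              mountPath: /var/lib/kafka/data
  volumeClaimTemplates:    # one PersistentVolumeClaim per broker pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi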
Logging and monitoring
Direct Kafka logs to the Docker log driver or external storage for analysis. Many teams pipe logs to Elasticsearch, Loki, or Splunk for troubleshooting. Pair Dockerized Kafka with Prometheus and Grafana for cluster monitoring.
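At a minimum, you can cap local log growth with Docker’s logging options in the Compose file; a small sketch using the default json-file driver, to be merged into the kafka service definition:
  kafka:
    logging:
      driver: json-file
      options:
        max-size: "10m"   # rotate after 10 MB
        max-file: "3"     # keep at most three rotated files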
Optimization Tips for Kafka in Docker
Kafka’s performance depends on careful resource and storage management.
Resource allocation
Give each broker container enough CPU, RAM, and disk to mimic production as closely as possible. Tune Docker resource limits and pass JVM options with KAFKA_HEAP_OPTS (heap sizing) and KAFKA_JVM_PERFORMANCE_OPTS (garbage collector and other performance flags).
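A hedged Compose fragment illustrating the idea; the specific limits and JVM flags are examples to adapt, and deploy.resources limits require a recent Docker Compose (older versions used the mem_limit and cpus keys instead):
  kafka:
    deploy:
      resources:
        limits:
          cpus: "2"       # cap the broker at two CPUs
          memory: 4g      # and 4 GB of RAM
    environment:
      KAFKA_HEAP_OPTS: "-Xms2g -Xmx2g"   # fixed-size heap avoids resize pauses
      KAFKA_JVM_PERFORMANCE_OPTS: "-XX:+UseG1GC -XX:MaxGCPauseMillis=20"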
Persistent storage
Always use Docker volumes or bind mounts for /var/lib/kafka/data and /var/lib/zookeeper; storing data only inside the container means you lose it on restart, which defeats any durability tests. High disk throughput is critical for sustained performance, especially in CI/CD load testing.
Best Practices and Common Pitfalls
Stability, maintainability, and team productivity all benefit from a few best practices.
Configuration management
Manage secrets and configurations externally with .env files or mounted config directories. Never hardcode sensitive data in your compose YAML. For reused setups, consider templating tools or config management frameworks.
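For instance, Compose automatically reads a .env file in the project directory, so values like ports or image tags can live outside the YAML; the variable names here are illustrative:
# .env (keep out of version control if it holds secrets)
KAFKA_PORT=9092
KAFKA_IMAGE_TAG=latest
The compose file then references these variables:
  kafka:
    image: confluentinc/cp-kafka:${KAFKA_IMAGE_TAG}
    ports:
      - "${KAFKA_PORT}:9092"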
Avoiding common mistakes
- Do not store data inside containers; always mount volumes
- Check for port conflicts on your host
- Avoid using default admin passwords; secure exposed ports, even for local development
- Keep Docker images up to date; patch for CVEs and deprecated dependencies
Strengthening Kafka Security
To secure your Kafka deployment:
- Enable TLS Encryption: Protect data in transit by configuring SSL/TLS for all connections.
- Implement Authentication: Use SASL mechanisms (e.g., SCRAM, GSSAPI) to authenticate clients.
- Set Up Authorization: Define Access Control Lists (ACLs) to control client permissions.
- Regularly Rotate Credentials: Change passwords and keys periodically to minimize risk.
- Monitor and Audit: Enable audit logs to track access and changes within the cluster.
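As a concrete example of the authentication and authorization items, Kafka’s own CLI can create SCRAM credentials and ACLs. This is a sketch that assumes a broker already configured with SASL/SCRAM and an authorizer; the user name, password, and topic are placeholders:
# Create SCRAM-SHA-256 credentials for a user
docker-compose exec kafka kafka-configs --bootstrap-server localhost:9092 \
  --alter --add-config 'SCRAM-SHA-256=[password=alice-secret]' \
  --entity-type users --entity-name alice

# Allow that user to read from the demo topic
docker-compose exec kafka kafka-acls --bootstrap-server localhost:9092 \
  --add --allow-principal User:alice --operation Read --topic demo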
Conclusion
Docker makes it remarkably easy to start, tweak, and experiment with Apache Kafka clusters for integration, learning, or even production-like testing. Containers insulate you from host OS issues and dependency mismatches, providing confidence that your dev and test environments match what you expect. As you grow from simple local clusters to orchestrated Kafka platforms, investing in robust Docker setups and best practices will save your team time and frustration.
Explore Kafka and Docker further with one of our comprehensive courses.
Docker Kafka FAQs
Do I need Zookeeper to run Kafka in Docker?
Not necessarily. While traditional Kafka setups rely on Zookeeper, newer versions support KRaft mode, which eliminates the need for Zookeeper by internalizing cluster coordination. However, many Docker Compose examples still use Zookeeper by default for compatibility and stability. ZooKeeper was deprecated in Kafka 3.5 and removed entirely in Kafka 4.0.
Can I use Kafka in Docker for production workloads?
It is possible, but not always recommended without orchestration tools like Kubernetes or Docker Swarm. For production, consider persistence, monitoring, security, scaling, and failover support. Docker is ideal for development and testing, but production use requires careful planning.
Why is my Kafka client getting connection errors when using Docker?
Most client connection issues stem from incorrect KAFKA_ADVERTISED_LISTENERS settings. Make sure the listener value reflects the address the client uses to connect (e.g., localhost:9092 for local dev) and that Docker ports are exposed correctly.
Which Kafka Docker image should I use, Confluent, Bitnami, or Apache?
Use Confluent’s image for full-featured setups and advanced testing, Bitnami’s for straightforward local development, and Apache’s for minimal, customizable builds. Choose based on your use case, resource needs, and security requirements.
How do I persist data across Kafka container restarts?
Always use Docker volumes or bind mounts mapped to /var/lib/kafka/data and /var/lib/zookeeper (if used). Container filesystems are ephemeral; without external storage, all messages and topics will be lost on restart.
