
Kafka Docker Explained: Setup, Best Practices, and Tips

Learn how to set up Apache Kafka with Docker using Compose. Discover best practices, common pitfalls, and tips for development and testing environments.
Jun 11, 2025  · 9 min read

In the world of modern data pipelines and microservices, Apache Kafka stands out as a go-to solution for handling real-time event streams and integrating distributed systems. As more engineering teams move toward scalable, event-driven architectures, Kafka becomes central to reliable data movement. 

Alongside this trend, Docker has emerged as the leading way for developers to manage, share, and deploy complex services without worrying about inconsistent environments or local issues. 

By using Docker to run Kafka, teams can quickly spin up production-like clusters for testing, proof-of-concept demos, or even production workloads. 

This article is aimed at intermediate backend developers, DevOps engineers, and anyone involved in managing data platforms who wants to make working with Kafka straightforward and reproducible.

If you are new to Kafka or Docker, be sure to refer to our Introduction to Apache Kafka course or Docker for Beginners: A Practical Guide to Containers.

What is Kafka, and Why Use Docker?

Apache Kafka is a distributed streaming platform designed for high-throughput, fault-tolerant, and scalable messaging. 

It acts as a durable, high-speed pipeline between data producers and consumers. Whenever you need to connect microservices, orchestrate data flows, or process real-time analytics, Kafka is often the tool of choice.

You can learn how to build real-time data processing applications with Kafka Streams with our Kafka Streams Tutorial. The tutorial covers core concepts, Java & Python implementations, and step-by-step examples for building scalable streaming applications.

Running Kafka can be complex. Even experienced engineers find brokers, topics, partitions, and coordination components like Zookeeper or the KRaft controller quorum frustrating to manage.

Docker simplifies the process by packaging up binaries, dependencies, and configurations into containers. 

With Docker, engineers can start a Kafka cluster on any machine, share identical setups with their colleagues, and avoid the common "works on my machine" headache. Dockerized Kafka is especially popular for:

  • Local development and testing, where fast spin-up and teardown are invaluable
  • Isolated integration environments in CI/CD pipelines
  • Mimicking multi-broker clusters on a single host
  • Training and demonstration setups

Explore Apache Kafka with our Apache Kafka for Beginners guide. Learn the basics, get started, and uncover advanced features and real-world applications of this powerful event-streaming platform.

Fundamentals of Kafka in Docker

Before diving into orchestration, it helps to break down both Kafka’s technical pieces and how Docker helps recreate its distributed nature on a laptop or server. 

As you will learn in our Learn Docker from Scratch tutorial, Docker is a popular tool for simplifying the deployment, scaling, and management of applications such as Kafka using containerization.

Kafka architecture basics

Kafka’s backbone is its broker: a server process that stores, receives, and serves messages. Each cluster can consist of multiple brokers, distributing data and load. 

Inside each broker are topics (named channels for organizing messages) and partitions (sub-channels that spread events across many servers for parallelism and durability). Producers write data to topics, while consumers subscribe to and process those messages.

Coordination is key. Traditionally, Kafka has relied on Zookeeper to handle broker metadata, partition leadership, and overall cluster health. Newer deployments use KRaft mode, which internalizes this coordination into Kafka itself, removing the Zookeeper dependency. ZooKeeper was deprecated in Kafka 3.5 and removed entirely in Kafka 4.0.

Learn how to containerize machine learning applications with Docker and Kubernetes using our How to Containerize an Application Using Docker tutorial. 

KRaft mode: Simplifying Kafka's architecture

Kafka 2.8 introduced KRaft mode, which lets Kafka manage metadata internally without relying on Zookeeper; it was declared production-ready in version 3.3. This shift simplifies deployments and reduces the number of components to manage.

To set up Kafka in KRaft mode using Docker Compose:

version: '3.8'
services:
  kafka:
    image: apache/kafka:latest
    container_name: kafka
    ports:
      - "9092:9092"
      - "9093:9093"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@localhost:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_LOG_DIRS: /var/lib/kafka/data
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_LOG_RETENTION_HOURS: 168
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      CLUSTER_ID: "MkU3OEVBNTcwNTJENDM2Qk"  # any base64-encoded UUID; generate one with kafka-storage.sh random-uuid
    volumes:
      - ./data:/var/lib/kafka/data

This configuration sets up a single-node Kafka cluster in KRaft mode, eliminating the need for Zookeeper.
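
With this file saved as docker-compose.yml, a quick smoke test might look like the following (the /opt/kafka/bin path is assumed from the apache/kafka image layout):

docker-compose up -d
docker-compose exec kafka /opt/kafka/bin/kafka-topics.sh --create --topic smoke-test --bootstrap-server localhost:9092
docker-compose exec kafka /opt/kafka/bin/kafka-topics.sh --describe --topic smoke-test --bootstrap-server localhost:9092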

Docker essentials

To run Kafka with Docker, you only need a few basics: Docker Engine (the runtime) and often Docker Compose for multi-container management. 

Compose files allow you to specify several services (Kafka, Zookeeper, Kafka UI, etc.), their networks, environment variables, and any storage mounts. Docker's networking makes it possible to emulate multi-broker clusters, even on a single machine, by giving each broker its own container, hostname, and ports, as the sketch below shows.
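
As a rough sketch of that idea, the fragment below runs two Confluent brokers against one Zookeeper, each with its own hostname and host port (names and ports are illustrative, and the offsets-topic replication factor is lowered to fit two brokers):

version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka1:
    image: confluentinc/cp-kafka:latest
    depends_on: [zookeeper]
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka1:29092,EXTERNAL://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 2
  kafka2:
    image: confluentinc/cp-kafka:latest
    depends_on: [zookeeper]
    ports:
      - "9093:9093"
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka2:29092,EXTERNAL://localhost:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 2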

One of the best ways to get familiar with Docker is through projects; practice with these 10 Docker Project Ideas.

Setting Up Kafka with Docker Compose

Docker Compose is often the preferred way to run applications that require multiple interacting containers. Instead of juggling shell scripts or manual commands, Compose lets you define all services and their links in a single YAML file. 

This approach reduces configuration drift, speeds up onboarding, and guarantees everyone can use the same stack for development.

Minimal Compose setup

Here’s a basic docker-compose.yml that starts Kafka and Zookeeper for local development:

version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
    ports:
      - "2181:2181"
  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    ports:
      - "9092:9092"
    volumes:
      - kafka_data:/var/lib/kafka/data
volumes:
  kafka_data:

This minimal setup maps the relevant ports, links Kafka to Zookeeper, and mounts a Docker volume, so messages do not vanish if the containers restart. Be sure to use environment variables to configure broker identity, listener addresses, and connection endpoints between services.
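
Before moving on, it is worth confirming the broker is reachable (Confluent images install the CLI tools without the .sh suffix):

docker-compose up -d
docker-compose exec kafka kafka-topics --list --bootstrap-server localhost:9092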

A thorough understanding of Kafka is essential for acing a data engineering interview. Prepare for your next one with these 20 Kafka Interview Questions for Data Engineers.

Extended configuration

Professional setups often add tools such as Kafka UI, Schema Registry, or REST Proxy for easier management and inspection of messages. These can be added to the services block in your compose file. For example:

kafka-ui:
  image: provectuslabs/kafka-ui:latest
  ports:
    - "8080:8080"
  environment:
    KAFKA_CLUSTERS_0_NAME: "Local"
    # This address is resolved from inside the Docker network, so the broker
    # must advertise a listener reachable as kafka:9092; the minimal setup
    # above advertises only localhost:9092, which other containers cannot
    # use (see the multi-listener example in the networking section).
    KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: "kafka:9092"

Persisting Kafka topics and logs across restarts is essential for testing recovery scenarios or long-running jobs. Always mount Docker volumes to /var/lib/kafka/data for each broker and /var/lib/zookeeper for Zookeeper, if in use.

Choosing the right image is key. Here is a quick summary:

| Kafka Docker Image | Maintainer | Features | Best Use Case |
|--------------------|------------|----------|---------------|
| Confluent | Confluent | Full set, many add-ons | Production-like setups, advanced testing |
| Bitnami | Bitnami | Clean, minimal | Local development, resource-constrained environments |
| Apache Kafka (KRaft) | Apache | Zookeeper-less, simplified setup | Modern deployments, simplified architecture |

The How to Learn Apache Kafka in 2025 tutorial goes into more detail about Kafka, including:

  • What makes Apache Kafka popular?
  • Main features of Apache Kafka
  • Various applications of Apache Kafka

Interacting with Kafka in Docker

Once Kafka is running via Docker Compose, you can immediately create topics and send data, just as you would with any standard deployment.

CLI access and Kafka commands

Use docker-compose exec or docker exec to run the Kafka CLI tools bundled in the broker container. Note that the script names vary by image: Confluent images install the tools without the .sh suffix (e.g., kafka-topics), while the apache/kafka and Bitnami images keep the .sh scripts in their Kafka bin directory. For example, to create a topic on the Confluent-based setup above:

docker-compose exec kafka kafka-topics --create --topic demo --bootstrap-server localhost:9092

You can also use kafka-console-producer and kafka-console-consumer in the same way. This provides a fast feedback loop for integration testing or experimentation without polluting your local environment.
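
For instance, writing a message and reading it back might look like this (a quick sketch against the demo topic created above):

docker-compose exec kafka bash -c 'echo "hello" | kafka-console-producer --topic demo --bootstrap-server localhost:9092'
docker-compose exec kafka kafka-console-consumer --topic demo --from-beginning --max-messages 1 --bootstrap-server localhost:9092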

Programmatic access from apps

Applications and scripts can connect to your containerized Kafka using the host’s advertised bootstrap servers. For local apps, set bootstrap.servers=localhost:9092 or the equivalent. 

Common clients include the official Kafka libraries for Python, Java, Node.js, and Go. Make sure your application's network stack can reach the advertised ports and addresses.
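
For illustration, a minimal round trip with the confluent-kafka Python client might look like this (a sketch assuming the single-broker setup above is running and the confluent-kafka package is installed via pip):

from confluent_kafka import Producer, Consumer

# Produce one message to the advertised listener mapped to the host
producer = Producer({"bootstrap.servers": "localhost:9092"})
producer.produce("demo", value=b"hello from docker")
producer.flush()  # block until delivery completes

# Read it back with a fresh consumer group
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "demo-group",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["demo"])
msg = consumer.poll(timeout=10.0)  # returns None if nothing arrives in time
if msg is not None and msg.error() is None:
    print(msg.value())
consumer.close()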

Docker Networking and Kafka Connectivity

Kafka's networking configuration frequently presents challenges. Understanding internal and external listeners and how to expose services is vital.

Internal vs external listeners

Kafka brokers use listeners to control client connections. Two common setups are:

  • PLAINTEXT://:9092 for local, unsecured traffic, especially for dev
  • SSL or SASL_SSL listeners for encrypted or authenticated connections

In docker-compose, expose the right ports and ensure KAFKA_ADVERTISED_LISTENERS matches the actual host and port your apps use. If running in a VM or cloud, set this configuration to the public IP and mapped port.

| Aspect | Internal Listener | External Listener |
|--------|-------------------|-------------------|
| Common Protocol | PLAINTEXT://:9092 | SSL://, SASL_SSL://, or mapped PLAINTEXT:// |
| Security | Unencrypted, unauthenticated | Encrypted and/or authenticated |
| Typical Usage | Local development, intra-container traffic | Remote clients, cloud VMs, production access |
| Docker Compose Setup | Expose port 9092 inside network | Map 9092 to host, set KAFKA_ADVERTISED_LISTENERS to public IP |
| Configuration Focus | Simplicity and speed | Reliability, security, public access |
| Networking Scope | Localhost or internal Docker network | Public IP or external-facing domain |

Configuring multiple listeners for diverse client access

In Docker environments, it's common to set up multiple listeners to handle different client access scenarios:

environment:
  KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
  KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:29092,EXTERNAL://localhost:9092
  KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
  KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL

This setup allows internal Docker clients to connect via the INTERNAL listener and external clients to connect via the EXTERNAL listener.
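
For the advertised addresses to work end to end, the compose service also has to publish the EXTERNAL port to the host. A minimal pairing, assuming the service is named kafka on the default compose network:

ports:
  - "9092:9092"   # EXTERNAL listener, reachable from the host as localhost:9092
# Other containers on the same network skip the port mapping and
# connect to the INTERNAL listener at kafka:29092 instead.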

Troubleshooting connectivity

Common errors include "broker not available", "connection refused", or client timeouts. Verify that:

  • Ports are exposed and mapped correctly
  • Broker advertised listeners match your client's target address
  • All containers are healthy (check with docker-compose ps)
  • DNS resolution works between containers (use the service name, e.g. kafka:9092)

If cross-container links still fail, docker network inspect helps reveal which containers actually share a network.

Kafka Docker in Production-Like Environments

Beyond local testing, Dockerized Kafka environments also help with staging and CI/CD.

Container orchestration strategies

Orchestrators like Docker Swarm and Kubernetes can manage multi-broker Kafka setups, rolling updates, and service discovery. Each broker gets its own container with attached persistent storage.

In Kubernetes, StatefulSets handle ordered deployment and create stable DNS names for each broker.
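
As an illustration only, a stripped-down StatefulSet for a three-broker cluster might look like the sketch below; the matching headless Service, listener environment wiring, and security settings are omitted, and all names are hypothetical:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka-headless   # stable DNS via a separate headless Service
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: apache/kafka:latest
          ports:
            - containerPort: 9092
  volumeClaimTemplates:          # one persistent volume per broker
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi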

Logging and monitoring

Direct Kafka logs to the Docker log driver or external storage for analysis. Many teams pipe logs to Elasticsearch, Loki, or Splunk for troubleshooting. Pair Dockerized Kafka with Prometheus and Grafana for cluster monitoring.
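
To keep broker logs from filling the host disk, for example, you can bound Docker's default json-file driver in the compose service (the sizes are illustrative):

logging:
  driver: json-file
  options:
    max-size: "10m"   # rotate after 10 MB
    max-file: "3"     # keep at most three rotated files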

Optimization Tips for Kafka in Docker

Kafka’s performance depends on careful resource and storage management.

Resource allocation

Give each broker container enough CPU, RAM, and disk to mimic production as closely as possible. Tune Docker resource limits and pass JVM options with KAFKA_JVM_PERFORMANCE_OPTS for heap and garbage collector settings.
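
A hedged compose fragment along these lines (the limits and JVM flags are illustrative starting points, not tuned recommendations):

kafka:
  image: confluentinc/cp-kafka:latest
  mem_limit: 2g   # hard memory cap for the container
  cpus: 2         # CPU quota
  environment:
    KAFKA_HEAP_OPTS: "-Xms1g -Xmx1g"   # fixed-size JVM heap
    KAFKA_JVM_PERFORMANCE_OPTS: "-XX:+UseG1GC -XX:MaxGCPauseMillis=20"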

Persistent storage

Always use Docker volumes or bind mounts for /var/lib/kafka/data and /var/lib/zookeeper; storing data inside the container means you lose it on restart, which defeats any durability tests. High disk throughput is critical for sustained performance, especially in CI/CD load testing.

Best Practices and Common Pitfalls

Stability, maintainability, and team productivity all benefit from a few best practices.

Configuration management

Manage secrets and configurations externally with .env files or mounted config directories. Never hardcode sensitive data in your compose YAML. For reused setups, consider templating tools or config management frameworks.
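
Compose substitutes ${VAR} references from a .env file in the project directory, so image tags and host ports can stay out of the YAML. A small example (the values are illustrative):

# .env — keep this file out of version control
KAFKA_IMAGE_TAG=7.6.0
KAFKA_HOST_PORT=9092

# docker-compose.yml
services:
  kafka:
    image: confluentinc/cp-kafka:${KAFKA_IMAGE_TAG}
    ports:
      - "${KAFKA_HOST_PORT}:9092"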

Avoiding common mistakes

  • Do not store data inside containers; always mount volumes
  • Check for port conflicts on your host
  • Avoid using default admin passwords; secure exposed ports, even for local development
  • Keep Docker images up to date; patch for CVEs and deprecated dependencies

Strengthening Kafka Security

To secure your Kafka deployment, focus on the following (a configuration sketch follows the list):

  • Enable TLS Encryption: Protect data in transit by configuring SSL/TLS for all connections.
  • Implement Authentication: Use SASL mechanisms (e.g., SCRAM, GSSAPI) to authenticate clients.
  • Set Up Authorization: Define Access Control Lists (ACLs) to control client permissions.
  • Regularly Rotate Credentials: Change passwords and keys periodically to minimize risk.
  • Monitor and Audit: Enable audit logs to track access and changes within the cluster.
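
As an illustrative sketch, several of these measures combine in a broker's server.properties like this (hostnames, paths, and passwords are placeholders, and the SCRAM credentials must be created separately):

listeners=SASL_SSL://0.0.0.0:9094
advertised.listeners=SASL_SSL://broker.example.com:9094
security.inter.broker.protocol=SASL_SSL
sasl.enabled.mechanisms=SCRAM-SHA-512
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-512
ssl.keystore.location=/etc/kafka/secrets/kafka.keystore.jks
ssl.keystore.password=changeit
ssl.truststore.location=/etc/kafka/secrets/kafka.truststore.jks
ssl.truststore.password=changeit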

Conclusion

Docker makes it remarkably easy to start, tweak, and experiment with Apache Kafka clusters for integration, learning, or even production-like testing. Containers insulate you from host OS issues and dependency mismatches, providing confidence that your dev and test environments match what you expect. As you grow from simple local clusters to orchestrated Kafka platforms, investing in robust Docker setups and best practices will save your team time and frustration. 

Explore more about Kafka and Docker with our Containerization and Virtualization with Docker and Kubernetes track.

Docker Kafka FAQs

Do I need Zookeeper to run Kafka in Docker?

Not necessarily. While traditional Kafka setups rely on Zookeeper, newer versions support KRaft mode, which eliminates the need for Zookeeper by internalizing cluster coordination. However, many Docker Compose examples still use Zookeeper by default for compatibility and stability. ZooKeeper was deprecated in Kafka 3.5 and removed entirely in Kafka 4.0.

Can I use Kafka in Docker for production workloads?

Yes, but with caveats. Running Kafka in plain Docker, without an orchestration tool like Kubernetes or Docker Swarm, is rarely recommended for production. Plan for persistence, monitoring, security, scaling, and failover support. Docker is ideal for development and testing, but production use requires careful planning.

Why is my Kafka client getting connection errors when using Docker?

Most client connection issues stem from incorrect KAFKA_ADVERTISED_LISTENERS settings. Make sure the listener value reflects the address the client uses to connect (e.g., localhost:9092 for local dev) and that Docker ports are exposed correctly.

Which Kafka Docker image should I use, Confluent, Bitnami, or Apache?

Use Confluent’s image for full-featured setups and advanced testing, Bitnami’s for straightforward local development, and Apache’s for minimal, customizable builds. Choose based on your use case, resource needs, and security requirements.

How do I persist data across Kafka container restarts?

Always use Docker volumes or bind mounts mapped to /var/lib/kafka/data and /var/lib/zookeeper (if used). Container filesystems are ephemeral; without external storage, all messages and topics will be lost on restart.


Author: Derrick Mwiti