Kafka Topic Cleanup Configuration Guide
Cleanup Policy
The Kafka cleanup policy defines how Kafka manages old messages within a topic, whether they are deleted after a certain time or compacted to retain only the most recent value for each key.
Delete Policy (Default)
The delete policy is the default retention strategy in Kafka. With this policy, Kafka automatically removes records based on either a time duration or a configured size threshold. This prevents topics from consuming unlimited storage space.
Key Configuration Options:
- retention.ms: Specifies how long (in milliseconds) Kafka should retain messages. For example, setting it to 86400000 will keep messages for 24 hours.
- retention.bytes: Defines the maximum size in bytes for a topic partition. When this size is reached, older messages are deleted to free up space. For example, 1073741824 equals 1 GB.
Kafka automatically monitors these settings and deletes log segments that exceed the specified limits.
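As a concrete illustration, both limits can also be set at topic creation time. This is only a sketch: the topic name access-logs is hypothetical, and the command assumes a broker reachable at localhost:9092 (as in the Docker setup later in this guide):

# Create a topic that keeps messages for 24 hours or up to 1 GB per partition,
# whichever limit is reached first (hypothetical topic name).
kafka-topics.sh --create --topic access-logs --bootstrap-server localhost:9092 \
  --partitions 1 --replication-factor 1 \
  --config retention.ms=86400000 \
  --config retention.bytes=1073741824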
Note: The setting log.retention.check.interval.ms, which controls how frequently Kafka checks for expired log segments, is a broker-level property and is not available as a topic-level override. In newer Kafka versions, especially in KRaft mode (the ZooKeeper-less architecture), the broker manages this check with sensible defaults, so there is usually no need to configure it manually.
Compact Policy
The compact policy offers a different way to manage data. Instead of deleting records based on age or size, Kafka retains only the latest message for each unique key. This is especially useful when you need to maintain a complete view of the current state for each key, such as user settings or configuration changes.
Key Configuration Options:
- cleanup.policy=compact: Enables compaction for the topic. Kafka will keep only the most recent value for each key.
- min.cleanable.dirty.ratio=0.5: Determines when compaction should start. A value of 0.5 means Kafka begins cleaning when 50% of the log contains outdated messages.
With this policy, the log cleaner handles compaction automatically based on system activity and available resources.
Unlike the delete policy, compaction is not driven by retention time: Kafka keeps the latest record for each key regardless of how old it is.
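As a rough sketch of how this behaves in practice, you could create a small compacted topic and write keyed messages to it. The topic name user-settings and the key/value pairs below are assumptions for the example, and the commands assume a broker at localhost:9092:

# Create a compacted topic (hypothetical name and example settings).
kafka-topics.sh --create --topic user-settings --bootstrap-server localhost:9092 \
  --partitions 1 --replication-factor 1 \
  --config cleanup.policy=compact \
  --config min.cleanable.dirty.ratio=0.5

# Produce keyed messages; everything before the ":" separator is the key.
kafka-console-producer.sh --topic user-settings --bootstrap-server localhost:9092 \
  --property parse.key=true --property key.separator=:
# Example input: user1:theme=dark, then user1:theme=light

# Read the topic back with keys printed; after compaction only the latest
# value per key (theme=light for user1) remains.
kafka-console-consumer.sh --topic user-settings --bootstrap-server localhost:9092 \
  --from-beginning --property print.key=true

Keep in mind that the log cleaner only compacts inactive (already rolled) segments, so on a small test topic the older values may remain visible for a while.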
In summary, newer versions of Kafka include enhancements that improve how retention policies work, especially under the KRaft architecture. Kafka now handles certain internal timings, such as the log deletion check interval, automatically, which means you no longer need to manage properties like log.retention.check.interval.ms manually.
To quickly get started with Kafka in a local development environment, you can use Docker. Below is a docker-compose.yml file configured to run Kafka in KRaft mode (ZooKeeper-less) using the Bitnami Kafka image:
services:
  kafka:
    image: docker.io/bitnami/kafka:3.9.0
    container_name: kafka
    ports:
      - "9092:9092"
    volumes:
      - kafka_data:/bitnami
    environment:
      KAFKA_CFG_NODE_ID: 0
      KAFKA_CFG_PROCESS_ROLES: controller,broker
      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 0@kafka:9093
      KAFKA_CFG_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CFG_INTER_BROKER_LISTENER_NAME: PLAINTEXT

volumes:
  kafka_data:

1. Save the content above into a file named docker-compose.yml.
2. Open a terminal and navigate to the directory containing the file.
3. Run the following command to start the Kafka container:
docker compose up

This will pull the bitnami/kafka:3.9.0 image and start a container named kafka.
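(Optional) One way to confirm that the broker started correctly is with standard Docker commands:

# Show the status of containers defined in this compose file.
docker compose ps

# Follow the broker logs until Kafka reports it has started.
docker logs -f kafka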
4. Once the container is running, you can access the Kafka CLI tools by entering the container:
docker exec -it kafka sh

5. Kafka’s CLI scripts are located in:
/opt/bitnami/kafka/bin

To navigate to this directory:
cd /opt/bitnami/kafka/bin

To create a topic:
kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1

Here are some commands to update topic configuration for retention using Kafka’s command line interface:
- Set the maximum size:
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my-topic --alter --add-config retention.bytes=1073741824

- Set the retention period:
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my-topic --alter --add-config retention.ms=86400000

- Set the cleanup policy for a topic to compact:
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my-topic --alter --add-config cleanup.policy=compact

- Set the minimum cleanable dirty ratio:
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my-topic --alter --add-config min.cleanable.dirty.ratio=0.5

- To verify topic configurations:
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my-topic --describe
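If you later want the topic to fall back to the broker-wide defaults, the overrides can be removed again. The example below reuses the same my-topic and the standard --delete-config option of kafka-configs.sh:

# Remove topic-level overrides so the broker defaults apply again.
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my-topic --alter --delete-config retention.ms,retention.bytes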

