Migrate from ZooKeeper to KRaft on Confluent Platform

Migrating from ZooKeeper to KRaft means moving the existing metadata of an Apache Kafka® cluster from brokers that use ZooKeeper to store metadata to brokers that use a KRaft quorum controller. This topic walks you through how to perform the migration. To learn more about KRaft, see KRaft Overview for Confluent Platform.

Important

Migration of production clusters is generally available in Confluent Platform version 7.6.1; however, it is recommended that you migrate on Confluent Platform version 7.7.0 or later. Do not attempt a production migration on any earlier version of Confluent Platform, or on Kafka version 3.6.1 or earlier.

Phases of migration

There are several phases of migration:

  • Phase 1: The initial phase, when brokers use ZooKeeper to manage metadata and a ZooKeeper-based controller is running.
  • Phase 2: You provision and start the KRaft controllers.
  • Phase 3: You configure the brokers for migration and load the broker metadata from ZooKeeper into the KRaft controller.
  • Phase 4: A hybrid, dual-write phase in which the KRaft controller holds the cluster metadata while you move the brokers to KRaft mode. Some brokers may be running in KRaft mode while others are still in ZooKeeper mode, and the KRaft controller also writes the metadata to ZooKeeper.
  • Phase 5: The final phase, when the controllers are taken out of migration mode, brokers write metadata only to KRaft, and the metadata is no longer copied to ZooKeeper.

Phase 1: Cluster is running in ZooKeeper mode

Note the following before you start migration:

  • When you are migrating a cluster from ZooKeeper mode to KRaft mode, changing the metadata version (inter.broker.protocol.version) is not supported.

  • Upgrade to Confluent Platform 7.8.0. Before you migrate from ZooKeeper to KRaft, you should upgrade to the latest Confluent Platform version. See Upgrade Confluent Platform for the considerations and steps to do this.

  • You cannot revert the cluster to ZooKeeper after you have finalized the migration. Steps to roll back at earlier phases are noted in the Reverting to ZooKeeper mode section.

  • During migration, if a ZooKeeper-mode broker is running with multiple log directories, any directory failure will cause the broker to shut down. Brokers with broken log directories can only migrate to KRaft once the directories are repaired. For more information, see KAFKA-16431. To check whether a broker is configured with multiple log directories, see the sketch after this list.

  • Review the KRaft Limitations and Known Issues before you migrate. Do not migrate if you are using unsupported features.

  • To help with debugging, enable TRACE level logging for metadata migration. Add the following line to the log4j.properties file found in the CONFLUENT_HOME/etc/kafka/ directory:

    log4j.logger.org.apache.kafka.metadata.migration=TRACE
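
The following is a minimal sketch for checking whether a broker is configured with multiple log directories. It assumes the broker configuration file is at /etc/kafka/server.properties; adjust the path for your installation:

    # log.dirs (or log.dir) holds a comma-separated list of directories;
    # more than one entry means the broker uses multiple log directories.
    grep -E '^log\.dirs?=' /etc/kafka/server.properties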
    

Phase 2: Start KRaft controllers

This phase involves configuring and starting one or more KRaft controllers.

  1. Retrieve the cluster ID.

    Before you start a KRaft controller, you must format storage for your Kafka cluster with the ID of the existing cluster. You can get this ID with the zookeeper-shell tool. For example:

    ./bin/zookeeper-shell localhost:2181
    
    Connecting to localhost:2181
    Welcome to ZooKeeper!
    
    get /cluster/id
    
    {"version":"1","id":"WZEKwK-bS62oT3ZOSU0dgw"}
    

    Save this ID; you will use it later, after you configure the KRaft controllers.
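
    Alternatively, you can read the cluster ID from the meta.properties file in any broker's log directory. A quick sketch, assuming the default log directory /tmp/kafka-logs (adjust the path for your deployment):

    # Each ZooKeeper-mode broker records the cluster ID in its log directory
    grep cluster.id /tmp/kafka-logs/meta.properties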

  2. Configure and deploy a KRaft controller quorum.

    Deploy a set of KRaft controllers that will take over from ZooKeeper. Configure each of the KRaft controllers with the following:

    • A node.id that is unique across all brokers and controllers.
    • Migration enabled with zookeeper.metadata.migration.enable=true.
    • ZooKeeper connection configuration.
    • Other KRaft-mode required properties, such as controller.quorum.voters and controller.listener.names. Following is an example controller.properties file for a controller listening on port 9093:

    Note

    The KRaft controller node.id values must be different from any existing ZooKeeper broker broker.id property. In KRaft mode, the brokers and controllers share the same node ID namespace.

    process.roles=controller
    node.id=3000
    controller.quorum.voters=3000@localhost:9093
    controller.listener.names=CONTROLLER
    listeners=CONTROLLER://:9093

    # Enable the migration
    zookeeper.metadata.migration.enable=true

    # ZooKeeper client configuration
    zookeeper.connect=localhost:2181

    # Enable migrations for cluster linking
    confluent.cluster.link.metadata.topic.enable=true

    # Other configuration entries ...
    
  3. Format storage on each node with the ID and the controller configuration file. For example:

    ./bin/kafka-storage format --config ./etc/kafka/kraft/controller.properties --cluster-id WZEKwK-bS62oT3ZOSU0dgw
    

    You might see output like the following:

    Formatting /tmp/kraft-controller-logs with metadata version 3.8
    
  4. Start each controller, specifying the configuration file with migration enabled.

    ./bin/kafka-server-start ./etc/kafka/kraft/controller.properties
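
    Optionally, verify that the controller quorum is healthy and has elected a leader. A sanity-check sketch, assuming the kafka-metadata-quorum tool that ships with your installation and a controller listening on localhost:9093:

    # Describe the KRaft quorum status (leader ID, high watermark, and so on)
    ./bin/kafka-metadata-quorum --bootstrap-controller localhost:9093 describe --status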
    

Phase 3: Migrate broker metadata from ZooKeeper to KRaft

Once the KRaft controllers are started, you will reconfigure each broker for KRaft migration and restart the broker. You can do a rolling restart to help ensure cluster availability during the migration. Metadata migration automatically starts when all of the brokers have been restarted.

Set the following for brokers:

  • Set inter.broker.protocol.version to 3.8.
  • Enable migration with zookeeper.metadata.migration.enable=true.
  • If you use cluster linking, enable the cluster linking metadata topic with confluent.cluster.link.metadata.topic.enable=true.
  • Set the ZooKeeper connection configuration (zookeeper.connect).
  • Set any other KRaft-mode required properties, such as controller.quorum.voters and controller.listener.names.
  • Add the controller.listener.names entries to listener.security.protocol.map.

Following is an example configuration file for a broker that is ready for the KRaft migration.

# Sample ZK broker server.properties listening on 9092
broker.id=0
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://localhost:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT

# Set the IBP
inter.broker.protocol.version=3.8

# Enable the migration
zookeeper.metadata.migration.enable=true

# Cluster linking metadata topic enabled
confluent.cluster.link.metadata.topic.enable=true

# ZooKeeper client configuration
zookeeper.connect=localhost:2181

# KRaft controller quorum configuration
controller.quorum.voters=3000@localhost:9093
controller.listener.names=CONTROLLER

Restart each broker with the modified configuration file. When all of the Kafka brokers that use ZooKeeper for metadata management have been restarted with the migration properties set, the migration automatically begins. When migration is complete, you should see the following entry in the active controller log at the INFO level.

Completed migration of metadata from ZooKeeper to KRaft.

You can also check the znode type of the /controller znode; after the migration completes, you should see kraftControllerEpoch in its contents.
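
One way to inspect the znode directly is with the zookeeper-shell tool, which also accepts a command as arguments. A minimal sketch, assuming ZooKeeper at localhost:2181:

# After the migration, the /controller registration should include kraftControllerEpoch
./bin/zookeeper-shell localhost:2181 get /controller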

Phase 4: Migrate the brokers to use the KRaft controller

At this point, the metadata migration has been completed, but the Kafka brokers are still running in ZooKeeper mode.

The KRaft controller is running in migration mode, and it will send remote procedure calls (RPCs) such as UpdateMetadata and LeaderAndIsr to the ZooKeeper-mode brokers.

To migrate the brokers to KRaft, reconfigure them as KRaft brokers and restart them. Using the broker configuration from the previous phase as an example, make the following changes:

  • Replace broker.id with node.id (keeping the same ID value).
  • Add process.roles=broker.
  • Remove the ZooKeeper configuration entries.
  • If you are using ACLs, change the authorizer class. For more information, see ACL concepts.
  • Remove the migration entries.

Following is an example of how a server.properties file for a migrated broker might look. Note that ZooKeeper-specific properties are commented out.

process.roles=broker
node.id=0
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://localhost:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT

# Remove the IBP (not used in KRaft mode)
# inter.broker.protocol.version=3.8

# Remove the migration enabled flag
# zookeeper.metadata.migration.enable=true

# Remove the cluster linking metadata topic setting
# confluent.cluster.link.metadata.topic.enable=true

# Remove ZooKeeper client configuration
# zookeeper.connect=localhost:2181

# Keep the KRaft controller quorum configuration
controller.quorum.voters=3000@localhost:9093
controller.listener.names=CONTROLLER

# If using ACLs, change the authorizer from the AclAuthorizer used with ZooKeeper to the StandardAuthorizer used with KRaft.
authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer

Configure and restart each broker in a rolling fashion, as in the sketch below. After the rolling restart completes, all of the brokers are running in KRaft mode.
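
A minimal rolling-restart sketch, assuming hypothetical host names and that each broker runs as a systemd service named confluent-server (both are placeholders; substitute your own hosts, service name, and health check):

# Hypothetical rolling restart: one broker at a time
for host in broker-1 broker-2 broker-3; do
  ssh "$host" 'sudo systemctl restart confluent-server'
  # Wait for the broker to rejoin and for under-replicated partitions
  # to return to zero before restarting the next broker.
  sleep 60
done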

Phase 5: Take KRaft controllers out of migration mode

The final step of the migration is to remove the migration entries, such as zookeeper.metadata.migration.enable, to take the controllers out of migration mode, and to remove the ZooKeeper configuration entries. Following is an example controller.properties file for a controller that has been migrated to KRaft mode and is listening on port 9093. Note that the ZooKeeper-specific properties are commented out.

process.roles=controller
node.id=3000
controller.quorum.voters=3000@localhost:9093
controller.listener.names=CONTROLLER
listeners=CONTROLLER://:9093

# Remove the migration enabled flag
# zookeeper.metadata.migration.enable=true

# Remove the cluster linking metadata topic setting
# confluent.cluster.link.metadata.topic.enable=true

# Remove ZooKeeper client configuration
# zookeeper.connect=localhost:2181

After this step, restart each controller, and your cluster should be fully migrated to KRaft mode.
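
To confirm that the brokers and controllers are operating as a KRaft quorum, you can describe the metadata quorum from a broker. A sanity-check sketch, assuming a broker listening on localhost:9092:

# Brokers and controllers should all report as KRaft quorum participants
./bin/kafka-metadata-quorum --bootstrap-server localhost:9092 describe --replication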

Reverting to ZooKeeper mode

If the cluster is still in migration mode, it is possible to revert it to ZooKeeper mode.

The process to follow for reverting depends on how far the migration has progressed. The following describes, for each completed migration phase, the steps to revert and the notes that apply at that phase.

To revert successfully, you must have completed the steps of each phase fully and in the correct order. If you did not fully complete a phase, back out those changes and use the specified revert steps.

Migration phase completed: You have provisioned the KRaft controller quorum (Phase 2).

Steps to revert:

  1. Deprovision the KRaft controller quorum.

Notes: No additional steps are required.

Migration phase completed: You have enabled migration on the brokers (Phase 3).

Steps to revert:

  1. Deprovision the KRaft controller quorum.
  2. Using the zookeeper-shell tool, run rmr /controller so that one of the brokers can become the new ZooKeeper-mode controller.
  3. On each broker, remove the zookeeper.metadata.migration.enable, controller.listener.names, and controller.quorum.voters configurations, and replace node.id with broker.id.
  4. Finally, perform a rolling restart of all brokers.

Notes: You must perform the zookeeper-shell step quickly to minimize the amount of time that the cluster lacks a controller. Until the /controller znode is deleted, you can ignore any errors in the broker log about failing to connect to the KRaft controller. Those error logs should disappear after the second roll to ZooKeeper mode.

Migration phase completed: You have migrated the brokers to KRaft (Phase 4).

Steps to revert:

  1. On each broker, remove the process.roles configuration and replace node.id with broker.id.
  2. Restore the zookeeper.connect configuration property to its previous value. If your cluster requires other ZooKeeper configurations for brokers, such as zookeeper.ssl.protocol, re-add those configurations as well.
  3. Perform a rolling restart of all brokers.
  4. Deprovision the KRaft controller quorum.
  5. Connect to ZooKeeper with the zookeeper-shell tool and run rmr /controller so that one of the brokers can become the new ZooKeeper-mode controller.
  6. On each broker, remove the following configuration properties from the configuration file: zookeeper.metadata.migration.enable, controller.listener.names, and controller.quorum.voters.
  7. Finally, perform a second rolling restart of all brokers. When this is done, the rollback is complete.

Notes: You must perform the zookeeper-shell step quickly to minimize the amount of time that the cluster lacks a controller. Until the /controller znode is deleted, you can ignore errors in the broker log about failing to connect to the KRaft controller. Those error logs should disappear after the second roll to pure ZooKeeper mode. Make sure that on the first cluster roll, zookeeper.metadata.migration.enable remains set to true; do not set it to false until the second cluster roll.

Migration phase completed: The controllers are migrated and the migration is finalized (Phase 5).

Steps to revert: None. If you have finalized the migration, you cannot revert to ZooKeeper mode.

Notes: To help ensure that the migration will be successful, you can wait a week or two before finalizing it, and use this time to validate KRaft mode for your cluster. This requires that you run the ZooKeeper cluster for a longer period of time, but may prevent issues after migration.
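
If you prefer to run the znode deletion non-interactively, the zookeeper-shell tool accepts the command as arguments. A sketch, assuming ZooKeeper at localhost:2181:

# Delete the /controller znode so that a ZooKeeper-mode broker can become the controller
./bin/zookeeper-shell localhost:2181 rmr /controller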