
Upgrading Bitnami Kafka from Version 20.0.0 to 28.2.5

This section provides the instructions for upgrading Bitnami Kafka from version 20.0.0 to 28.2.5 in a Kubernetes environment. System administrators and DevOps engineers can use this guide as a technical reference for managing and maintaining Kafka clusters.

Resource specifications

This table provides key configuration parameters and environment details required for upgrading Bitnami Kafka from version 20.0.0 to 28.2.5 in a Kubernetes cluster using Helm:

Parameter Value
Kubernetes Version v1.28
Helm Version v3
Release Name kafka
Namespace mdsp-bk-kafka
Chart bitnami/kafka
From Version 20.0.0
To Version 28.2.5
Kafka Replicas 3 (0, 1, 2)
Test Topic test
Downtime Window 20 minutes

Prerequisites

The following requirements must be met to proceed with the Bitnami Kafka upgrade.

Pull required Docker images

Note

Ensure all required Docker images are pulled in advance.

bitnami/jmx-exporter:0.20.0-debian-12-r17
bitnami/kafka:3.7.0-debian-12-r6
bitnami/kafka-exporter:1.7.0-debian-12-r27
bitnami/zookeeper:3.9.2-debian-12-r6
docker.io/bitnami/jmx-exporter:0.20.0-debian-12-r17
docker.io/bitnami/kafka:3.7.0-debian-12-r6
docker.io/bitnami/kafka-exporter:1.7.0-debian-12-r27
docker.io/bitnami/zookeeper:3.9.2-debian-12-r6

Info

Expect approximately 20 minutes of business downtime during the upgrade.

Verify Insights Hub GUI Pre-Upgrade

Ensure the Insights Hub GUI is functioning properly before starting the upgrade. Verify that there are no pre-existing Kafka-related issues.

Get Kafka release revision

Record the Kafka release revision details so that the changes can be rolled back if the upgrade fails.

To get Kafka release revision details, execute the following:

  • If ArgoCD is used, the release revision can be checked from the ArgoCD application's history and rollback view.

  • If Helm is used, run the following command to get the release revision:

    helm -n mdsp-bk-kafka history kafka
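
If the upgrade later needs to be reverted and Helm is used, the recorded revision can be passed to helm rollback. A minimal sketch (replace <REVISION> with the number from the history output; PVC changes made in later steps may still have to be undone manually):

    # roll back the kafka release to a previously recorded revision
    helm -n mdsp-bk-kafka rollback kafka <REVISION>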
    

1. Stop Kafka access

To prevent any impact on the upgrade process, ensure that Kafka does not receive client traffic while the upgrade is in progress. Deleting the client-facing services blocks new connections:

NAMESPACE=mdsp-bk-kafka
kubectl -n $NAMESPACE delete svc kafka  kafka-zookeeper
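
As an optional check, confirm that the two services are gone (the other services created by the chart, such as the headless ones, are expected to remain):

# kafka and kafka-zookeeper should no longer be listed
kubectl -n $NAMESPACE get svc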

2. Retain existing Persistent Volumes (PV)

Preserve the PersistentVolumes (PVs) and set the reclaim policy to "Retain".

NAMESPACE=mdsp-bk-kafka
rm -f pv_list.txt
for REPLICA in 0 1 2;
do
  OLD_PVC="data-kafka-${REPLICA}"
  PV_NAME=$(kubectl -n $NAMESPACE get pvc $OLD_PVC -o jsonpath="{.spec.volumeName}")
  # Store old volume name to pv_list.txt file
  echo $PV_NAME >> pv_list.txt
  # Modify PV reclaim policy
  kubectl  -n $NAMESPACE  patch pv $PV_NAME -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
done
# check that the reclaim policy of the PVs is now Retain
kubectl get pv|grep $NAMESPACE
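
To check the policy on each volume explicitly, the names collected in pv_list.txt can be queried one by one (a sketch; the expected output is "Retain" for every PV):

for PV_NAME in `cat pv_list.txt`; do
  kubectl get pv $PV_NAME -o jsonpath='{.spec.persistentVolumeReclaimPolicy}{"\n"}'
done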

3. Generate new PersistentVolumeClaim (PVC) manifest

Create YAML manifests for the new PVCs.

NAMESPACE=mdsp-bk-kafka
for REPLICA in 0 1 2
do
  OLD_PVC="data-kafka-${REPLICA}"
  NEW_PVC="data-kafka-broker-${REPLICA}"
  NEW_PVC_MANIFEST_FILE="$NEW_PVC.yaml"
  # Create new PVC manifest
  kubectl  -n $NAMESPACE  get pvc $OLD_PVC -o json | jq ".metadata.name = \"$NEW_PVC\"|with_entries(select([.key] |inside([\"metadata\", \"spec\", \"apiVersion\", \"kind\"]))) | del(.metadata.annotations, .metadata.creationTimestamp,.metadata.finalizers, .metadata.resourceVersion,.metadata.selfLink, .metadata.uid)"> $NEW_PVC_MANIFEST_FILE
done
# check that the manifests exist and that their content is correct
ls -l data-kafka-broker-*.yaml;cat data-kafka-broker-*.yaml
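
For orientation, each generated manifest should look roughly like the sketch below. The storage class, requested size, and volumeName are taken from the existing PVCs in your environment, so the values shown here are placeholders, and labels copied from the original PVC may also appear under metadata:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-kafka-broker-0
  namespace: mdsp-bk-kafka
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Ti
  storageClassName: longhorn-ssd
  volumeName: pvc-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx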

4. Delete Statefulset and old PVCs

Delete old Kafka statefulset and PVCs.

NAMESPACE=mdsp-bk-kafka
kubectl -n $NAMESPACE delete sts "kafka"
# check that the kafka pods are being deleted; they should be gone from this point on
kubectl  -n $NAMESPACE  get pod

for REPLICA in 0 1 2;
do
  kubectl  -n $NAMESPACE delete pvc data-kafka-${REPLICA}
done

# check that the PVCs are deleted; they should no longer appear
kubectl  -n $NAMESPACE  get pvc

# check that the PVs still exist; they remain because the reclaim policy is Retain
kubectl get pv|grep $NAMESPACE
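
Pod termination is not instant; optionally wait until the old broker pods are fully gone before continuing (a sketch that assumes the old pods are named kafka-0, kafka-1 and kafka-2):

for REPLICA in 0 1 2; do
  kubectl -n $NAMESPACE wait --for=delete pod/kafka-${REPLICA} --timeout=120s
done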

5. Re-enable Persistent Volumes (PVs) and create a new PVC

Verify detachment and prepare PVs for reuse.

NAMESPACE=mdsp-bk-kafka
for PV_NAME in `cat pv_list.txt`;do
    echo $PV_NAME
    # clear the old claim reference so the PV can be bound by a new PVC
    kubectl -n $NAMESPACE patch pv $PV_NAME -p '{"spec":{"claimRef": null}}'
done
# after the patch the PVs no longer reference the old claim, so filtering by namespace returns nothing
kubectl get pv|grep $NAMESPACE
# but the underlying Longhorn volumes still exist
for PV_NAME in `cat pv_list.txt`;do
    kubectl get volumes.longhorn.io -A|grep $PV_NAME
done
# if a volume is not detached yet, wait for it to detach or detach it manually from the Longhorn UI or CLI, e.g.:
# kubectl -n longhorn-system get volumeattachments.storage.k8s.io |egrep 'pvc-b876d645-9438-4c65-ad35-a061fe5e4830|pvc-b38cf148-aaf5-4f59-b801-84e15cbb43fa|pvc-0daa9f83-0c65-4398-9d40-3b4621f4aecf'
# (the PV names above are examples; use the names from pv_list.txt) and delete the attachments for the affected volumes if needed
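
If an attachment has to be removed by hand, a cleanup along these lines can be used (a sketch; <PV_NAME> and <ATTACHMENT_NAME> are placeholders for the values in your cluster):

# find the VolumeAttachment that still references the PV
kubectl get volumeattachments.storage.k8s.io | grep <PV_NAME>
# delete the stale attachment so the volume can detach
kubectl delete volumeattachments.storage.k8s.io <ATTACHMENT_NAME>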

6. Create new PVC using existing PV

Apply the new PVC manifests so that they bind to the retained PVs.

NAMESPACE=mdsp-bk-kafka
for REPLICA in 0 1 2
do
    kubectl -n $NAMESPACE apply -f data-kafka-broker-$REPLICA.yaml
done
# get new pvc
kubectl  -n $NAMESPACE  get pvc
# get pv
kubectl  get pv|grep $NAMESPACE

7. Upgrade Kafka

Update the Kafka values in the values.yaml file, including any additional parameters that need to be modified (such as persistence size, extraConfig, authentication protocol, users and passwords), and then proceed with the upgrade.

Name                               Value
global.storageClass                "longhorn-ssd"
extraConfig                        log.dirs=/bitnami/kafka/data
                                   delete.topic.enable=false
                                   auto.create.topics.enable=true
                                   num.recovery.threads.per.data.dir=1
                                   allow.everyone.if.no.acl.found=true
                                   super.users=User:admin
heapOpts                           -Xmx2048m -Xms2048m
listeners.client.protocol          PLAINTEXT
listeners.controller.protocol      PLAINTEXT
listeners.interbroker.protocol     PLAINTEXT
listeners.external.protocol        PLAINTEXT
sasl.interbroker.user              admin
sasl.client.users                  ["user"]
controller.replicaCount            0
broker.replicaCount                3
broker.minId                       0
broker.resources.limits            {"cpu": "1","memory": "8.5Gi"}
broker.resources.requests          {"cpu": "250m","memory": "2.5Gi"}
broker.affinity                    nodeAffinity:
                                     requiredDuringSchedulingIgnoredDuringExecution:
                                       nodeSelectorTerms:
                                         - matchExpressions:
                                             - key: "iaas"
                                               operator: "In"
                                               values:
                                                 - "true"
broker.tolerations                 - effect: "NoSchedule"
                                     key: "domain"
                                     operator: "Equal"
                                     value: "iaas"
broker.persistence.size            2Ti
metrics.kafka.enabled              true
metrics.kafka.resources            limits:
                                     cpu: 200m
                                     memory: 256Mi
                                   requests:
                                     cpu: 100m
                                     memory: 128Mi
metrics.kafka.affinity             nodeAffinity:
                                     requiredDuringSchedulingIgnoredDuringExecution:
                                       nodeSelectorTerms:
                                         - matchExpressions:
                                             - key: "iaas"
                                               operator: "In"
                                               values:
                                                 - "true"
metrics.kafka.tolerations          - effect: "NoSchedule"
                                     key: "domain"
                                     operator: "Equal"
                                     value: "iaas"
metrics.jmx.enabled                true
metrics.jmx.resources              limits:
                                     cpu: 200m
                                     memory: 256Mi
                                   requests:
                                     cpu: 100m
                                     memory: 128Mi
metrics.serviceMonitor.enabled     true
metrics.serviceMonitor.namespace   monitoring
kraft.enabled                      false
zookeeper.enabled                  true
zookeeper.replicaCount             3
zookeeper.affinity                 nodeAffinity:
                                     requiredDuringSchedulingIgnoredDuringExecution:
                                       nodeSelectorTerms:
                                         - matchExpressions:
                                             - key: "iaas"
                                               operator: "In"
                                               values:
                                                 - "true"
zookeeper.tolerations              - effect: "NoSchedule"
                                     key: "domain"
                                     operator: "Equal"
                                     value: "iaas"
zookeeper.resources                requests:
                                     memory: 256Mi
                                     cpu: 250m
zookeeper.persistence.size         5Gi
networkPolicy.enabled              false
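
The dotted parameter names above correspond to nested keys in values.yaml. A shortened, illustrative excerpt (not the complete file, and omitting the affinity, tolerations, resources and metrics blocks) could look like this:

global:
  storageClass: "longhorn-ssd"

extraConfig: |
  log.dirs=/bitnami/kafka/data
  delete.topic.enable=false
  auto.create.topics.enable=true
  num.recovery.threads.per.data.dir=1
  allow.everyone.if.no.acl.found=true
  super.users=User:admin

heapOpts: "-Xmx2048m -Xms2048m"

listeners:
  client:
    protocol: PLAINTEXT
  controller:
    protocol: PLAINTEXT
  interbroker:
    protocol: PLAINTEXT
  external:
    protocol: PLAINTEXT

sasl:
  interbroker:
    user: admin
  client:
    users: ["user"]

controller:
  replicaCount: 0

broker:
  replicaCount: 3
  minId: 0
  persistence:
    size: 2Ti

kraft:
  enabled: false

zookeeper:
  enabled: true
  replicaCount: 3
  persistence:
    size: 5Gi

networkPolicy:
  enabled: false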

Info

When upgrading with ArgoCD, you need to rename the Kafka repository and select the "PRUNE" option to remove old versions of resources, such as services (svc).

NAMESPACE=mdsp-bk-kafka
cd kafka-28.2.5/
helm -n $NAMESPACE upgrade kafka ./ -f values.yaml
# if the upgrade fails, scale the kafka and zookeeper statefulsets step by step and delete the zookeeper statefulset, then retry
# kubectl -n longhorn-system get volumeattachments.storage.k8s.io |egrep 'pvc-b876d645-9438-4c65-ad35-a061fe5e4830|pvc-b38cf148-aaf5-4f59-b801-84e15cbb43fa|pvc-0daa9f83-0c65-4398-9d40-3b4621f4aecf'  # check whether the volumes are already attached
# verify that the new pods come up
kubectl -n $NAMESPACE get pod
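
Optionally wait for the new workloads to become ready before continuing (a sketch; the statefulset names kafka-broker and kafka-zookeeper are assumed to match the names created by the chart):

kubectl -n $NAMESPACE rollout status statefulset/kafka-broker --timeout=600s
kubectl -n $NAMESPACE rollout status statefulset/kafka-zookeeper --timeout=600s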

8. Validate Kafka data

Consume messages from the Kafka topic test to verify that no data was lost. This assumes that the test topic already contains messages.

NAMESPACE=mdsp-bk-kafka
kubectl -n $NAMESPACE exec -it kafka-client -- bash
>> NAMESPACE=mdsp-bk-kafka;cd /opt/bitnami/kafka/bin/
>> kafka-console-consumer.sh --bootstrap-server kafka.$NAMESPACE.svc.cluster.local:9092 --topic test --from-beginning
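
To additionally confirm that the cluster accepts new writes, a message can be produced to the same topic from inside the client pod (a sketch, reusing the assumptions above):

>> kafka-console-producer.sh --bootstrap-server kafka.$NAMESPACE.svc.cluster.local:9092 --topic test
>> # type a test message, press Enter, then exit with Ctrl+C and re-run the consumer to see it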

9. Restart all services

  • Restart all services that use Kafka. The following script prints a restart command for every deployment that references Kafka, based on a JSON dump of all deployments:
# export all deployments once so the script can inspect them
kubectl get deployments -A -o json > alldeploy.json

cat alldeploy.json | jq -c '.items[]' | while IFS= read -r deployment; do
  namespace=$(echo "$deployment" | jq -r '.metadata.namespace')
  name=$(echo "$deployment" | jq -r '.metadata.name')

  # only consider namespaces starting with "domain" and the kafka-connector namespace
  if [[ "$namespace" == domain* || "$namespace" == "kafka-connector" ]]; then
    # check whether any container env references a secretKeyRef key among kafka_host, kafka_host_port, kafka_host_port_ha, kafka_port, kafka_zookeeper_host or kafka_zookeeper_host_ha
    if echo "$deployment" | jq -e '.spec.template.spec.containers[].env[]? | select(.valueFrom.secretKeyRef.key == "kafka_host" or .valueFrom.secretKeyRef.key == "kafka_host_port" or .valueFrom.secretKeyRef.key == "kafka_host_port_ha" or .valueFrom.secretKeyRef.key == "kafka_port" or .valueFrom.secretKeyRef.key == "kafka_zookeeper_host" or .valueFrom.secretKeyRef.key == "kafka_zookeeper_host_ha")' > /dev/null; then
      # print the restart command for each matching deployment
      echo "kubectl -n $namespace rollout restart deployment $name"
    fi
  fi
done
  • Manually restart core services:
# kafka connector
kubectl -n kafka-connector rollout restart deploy kafka-connector-job-cp-kafka-connect || kubectl -n kafka-connector rollout restart deployment confluent-cp-kafka-connect

# core
kubectl -n mindsphere-core rollout restart deploy hypergate
kubectl -n mindsphere-core rollout restart deploy hypergate-proxy
kubectl -n mindsphere-core rollout restart deploy mindgate
kubectl -n mindsphere-core rollout restart deploy mindgate-oscloud

# iot
kubectl -n mindsphere-iots rollout restart deploy idl-access-token-svc
kubectl -n mindsphere-iots rollout restart deploy idl-metadata-svc
kubectl -n mindsphere-iots rollout restart deploy idl-notification-listener
kubectl -n mindsphere-iots rollout restart deploy iot-cts-aggregate-svc
kubectl -n mindsphere-iots rollout restart deploy iot-cts-coldstore-jobs
kubectl -n mindsphere-iots rollout restart deploy iot-cts-data-ingest
kubectl -n mindsphere-iots rollout restart deploy iot-cts-data-svc
kubectl -n mindsphere-iots rollout restart deploy iot-cts-iav-writer
kubectl -n mindsphere-iots rollout restart deploy iot-cts-throttling-consumer
kubectl -n mindsphere-iots rollout restart deploy iot-cts-writer
kubectl -n mindsphere-iots rollout restart deploy iot-ts-billing-ingestion-size-extractor
kubectl -n mindsphere-iots rollout restart deploy iot-ts-streaming-svc
kubectl -n mindsphere-iots rollout restart deploy iot-ts-subscription-writer
kubectl -n mindsphere-strt rollout restart deploy energy-prediction-ts-aggregator
kubectl -n mindsphere-strt rollout restart deploy ep-agg-worker

# advs
kubectl -n mindsphere-advs rollout restart deploy assetmanagement
kubectl -n mindsphere-advs rollout restart deploy assettenantservice
kubectl -n mindsphere-advs rollout restart deploy assettypemanagement
kubectl -n mindsphere-advs rollout restart deploy eventmanagement
kubectl -n mindsphere-advs rollout restart deploy eventmanagement-entity-manager

# conn
kubectl -n mindsphere-conn rollout restart deploy agentonlinedetector
kubectl -n mindsphere-conn rollout restart deploy customparserproxy
kubectl -n mindsphere-conn rollout restart deploy datasourceconfigurationparser
kubectl -n mindsphere-conn rollout restart deploy eventparser
kubectl -n mindsphere-conn rollout restart deploy exchange
kubectl -n mindsphere-conn rollout restart deploy fileparser
kubectl -n mindsphere-conn rollout restart deploy messagerouter
kubectl -n mindsphere-conn rollout restart deploy recordrecoveryservice
kubectl -n mindsphere-conn rollout restart deploy timeseriesparser

# uts
kubectl -n mindsphere-core rollout restart deploy coremasterscheduler-athenards
kubectl -n mindsphere-core rollout restart deploy coremasterscheduler-athenardsnb
kubectl -n mindsphere-core rollout restart deploy coremasterscheduler-general
kubectl -n mindsphere-core rollout restart deploy coremasterscheduler-kpi
kubectl -n mindsphere-core rollout restart deploy coremasterscheduler-metering
kubectl -n mindsphere-core rollout restart deploy coremasterscheduler-partition
kubectl -n mindsphere-core rollout restart deploy corereportservice
kubectl -n mindsphere-core rollout restart deploy utsreportservice
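
If desired, confirm that the restarted deployments finish rolling out; for example:

kubectl -n mindsphere-core rollout status deploy hypergate
kubectl -n mindsphere-iots rollout status deploy iot-ts-streaming-svc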

After the upgrade process is complete, verify the Insights Hub GUI to check for any Kafka-related issues and verify that the Kafka services are running correctly.


Last update: January 27, 2025