Installing KServe

This section describes how to set up KServe, a Kubernetes-native AI/ML model serving platform, on a cluster. It covers the prerequisites, installation steps, and configuration needed to deploy KServe for managing and serving machine learning models.

Prerequisites

The following requirements must be met to install KServe:

Resource Requirements

| Component | Workload | CPU limit (m) | Memory limit (Mi) | CPU request (m) | Memory request (Mi) |
| --- | --- | --- | --- | --- | --- |
| Istio | istio-ingressgateway | 2000 | 1024 | 100 | 128 |
| Istio | istiod | | | | |
| knative-serving | activator | 1000 | 600 | 300 | 60 |
| knative-serving | autoscaler | 1000 | 1000 | 100 | 100 |
| knative-serving | autoscaler-hpa | 300 | 400 | 30 | 30 |
| knative-serving | controller | 1000 | 1000 | 100 | 100 |
| knative-serving | net-istio-controller | 300 | 400 | 30 | 40 |
| knative-serving | net-istio-webhook | 200 | 200 | 20 | 20 |
| knative-serving | webhook | 500 | 500 | 100 | 100 |
| KServe | kube-rbac-proxy | | | | |
| KServe | kserve-controller-manager | 100 | 300 | 100 | 300 |
| Cert-Manager | cert-manager | | | | |
| Cert-Manager | cert-manager-cainjector | | | | |
| Cert-Manager | cert-manager-webhook | | | | |
| Total | | 6400 | 5424 | 880 | 878 |

To check whether a node has enough capacity for scheduling, attempt to schedule a pod with high CPU or memory requests and review the resulting events to confirm whether the node has sufficient resources.
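
For example, a minimal probe pod can request roughly the total from the table above. This is a sketch; the pod name and pause image are illustrative, and any small image works:

    cat <<'EOF' | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: resource-probe   # illustrative name
    spec:
      restartPolicy: Never
      containers:
        - name: probe
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: 880m      # total CPU requests from the table
              memory: 878Mi  # total memory requests from the table
    EOF

    # A Pending pod's events name the missing resource (e.g. "Insufficient cpu").
    kubectl describe pod resource-probe
    kubectl delete pod resource-probe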

Deployment Version

Ensure the deployment uses KServe v0.12.0 with Knative Serving v1.13.1, matching the image tags listed under Pre-pulled images below.

Node Preparation

Check whether the iptable_nat kernel module is loaded on each node; if it is not, load it together with nf_nat:

    lsmod | grep iptable_nat
    sudo modprobe iptable_nat
    sudo modprobe nf_nat
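
To keep the modules loaded across reboots, one option (a sketch, assuming a systemd-based distribution that reads /etc/modules-load.d; the file name is illustrative) is:

    cat <<'EOF' | sudo tee /etc/modules-load.d/kserve-nat.conf
    iptable_nat
    nf_nat
    EOF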

Deployment Order

Follow this deployment sequence:

| Sl no | Service | Namespace |
| --- | --- | --- |
| 1 | cert-manager | cert-manager |
| 2 | istio | istio-system |
| 3 | knative-serving-crd | knative-serving |
| 4 | knative-serving | knative-serving |
| 5 | kserve-crd | mdsp-bk-kserve |
| 6 | kserve | mdsp-bk-kserve |

Pre-pulled images

Ensure that all required images are pulled and saved in an archive.

Packaging and transferring images

  1. The following images must be pulled beforehand:

    gcr.io/knative-releases/knative.dev/net-istio/cmd/controller:v1.13.1
    gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook:v1.13.1
    gcr.io/knative-releases/knative.dev/serving/cmd/activator:v1.13.1
    gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:v1.13.1
    gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler:v1.13.1
    gcr.io/knative-releases/knative.dev/serving/cmd/controller:v1.13.1
    gcr.io/knative-releases/knative.dev/serving/cmd/queue:v1.13.1
    gcr.io/knative-releases/knative.dev/serving/cmd/webhook:v1.13.1
    gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1
    istio/pilot:1.20.0
    istio/proxyv2:1.20.0
    kserve/agent:v0.12.0
    kserve/kserve-controller:v0.12.0
    kserve/modelmesh-controller:v0.11.2
    kserve/router:v0.12.0
    kserve/sklearnserver:v0.12.0
    kserve/storage-initializer:v0.12.0
    kserve/xgbserver:v0.12.0
    quay.io/jetstack/cert-manager-cainjector:v1.14.3
    quay.io/jetstack/cert-manager-controller:v1.14.3
    quay.io/jetstack/cert-manager-webhook:v1.14.3
    
  2. Use the following commands to pull all necessary images.

    docker pull gcr.io/knative-releases/knative.dev/net-istio/cmd/controller:v1.13.1
    docker pull gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook:v1.13.1
    docker pull gcr.io/knative-releases/knative.dev/serving/cmd/activator:v1.13.1
    docker pull gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:v1.13.1
    docker pull gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler:v1.13.1
    docker pull gcr.io/knative-releases/knative.dev/serving/cmd/controller:v1.13.1
    docker pull gcr.io/knative-releases/knative.dev/serving/cmd/queue:v1.13.1
    docker pull gcr.io/knative-releases/knative.dev/serving/cmd/webhook:v1.13.1
    docker pull gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1
    docker pull istio/pilot:1.20.0
    docker pull istio/proxyv2:1.20.0
    docker pull kserve/agent:v0.12.0
    docker pull kserve/kserve-controller:v0.12.0
    docker pull kserve/modelmesh-controller:v0.11.2
    docker pull kserve/router:v0.12.0
    docker pull kserve/sklearnserver:v0.12.0
    docker pull kserve/storage-initializer:v0.12.0
    docker pull kserve/xgbserver:v0.12.0
    docker pull quay.io/jetstack/cert-manager-cainjector:v1.14.3
    docker pull quay.io/jetstack/cert-manager-controller:v1.14.3
    docker pull quay.io/jetstack/cert-manager-webhook:v1.14.3
    
  3. Package the pulled images into a single tarball.

    docker save gcr.io/knative-releases/knative.dev/net-istio/cmd/controller:v1.13.1 gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook:v1.13.1 gcr.io/knative-releases/knative.dev/serving/cmd/activator:v1.13.1 gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:v1.13.1 gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler:v1.13.1 gcr.io/knative-releases/knative.dev/serving/cmd/controller:v1.13.1 gcr.io/knative-releases/knative.dev/serving/cmd/queue:v1.13.1 gcr.io/knative-releases/knative.dev/serving/cmd/webhook:v1.13.1 gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1 istio/pilot:1.20.0 istio/proxyv2:1.20.0 kserve/agent:v0.12.0 kserve/kserve-controller:v0.12.0 kserve/modelmesh-controller:v0.11.2 kserve/router:v0.12.0 kserve/sklearnserver:v0.12.0 kserve/storage-initializer:v0.12.0 kserve/xgbserver:v0.12.0 quay.io/jetstack/cert-manager-cainjector:v1.14.3 quay.io/jetstack/cert-manager-controller:v1.14.3 quay.io/jetstack/cert-manager-webhook:v1.14.3 |gzip > kserveallimg.tar.gz
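
     Optionally, confirm the archive was written correctly before transferring it; the listing shows the saved image manifests and layer directories:

    ls -lh kserveallimg.tar.gz
    tar -tzf kserveallimg.tar.gz | head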
    
  4. Send kserveallimg.tar.gz to the partner and have them load the images on their nodes using the following command:

    docker load -i kserveallimg.tar.gz
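
     Optionally, verify that the images are present after loading:

    docker images | grep -E 'kserve|knative|istio|cert-manager|kube-rbac-proxy'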
    
  5. To use a different registry (for example, Azure Container Registry), update the registry in the chart configuration. The images.txt file should then list the images under the Azure registry as follows:

    xiotlpccertcr.azurecr.io/jetstack/cert-manager-cainjector:v1.14.3
    xiotlpccertcr.azurecr.io/jetstack/cert-manager-controller:v1.14.3
    xiotlpccertcr.azurecr.io/jetstack/cert-manager-webhook:v1.14.3
    xiotlpccertcr.azurecr.io/istio/pilot:1.20.0
    xiotlpccertcr.azurecr.io/istio/proxyv2:1.20.0
    xiotlpccertcr.azurecr.io/knative-releases/knative.dev/net-istio/cmd/controller:v1.13.1
    xiotlpccertcr.azurecr.io/knative-releases/knative.dev/net-istio/cmd/webhook:v1.13.1
    xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/activator:v1.13.1
    xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:v1.13.1
    xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/autoscaler:v1.13.1
    xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/controller:v1.13.1
    xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/queue:v1.13.1
    xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/webhook:v1.13.1
    xiotlpccertcr.azurecr.io/kserve/agent:v0.12.0
    xiotlpccertcr.azurecr.io/kserve/kserve-controller:v0.12.0
    xiotlpccertcr.azurecr.io/kserve/lgbserver:v0.12.0
    xiotlpccertcr.azurecr.io/kserve/modelmesh-controller:v0.11.2
    xiotlpccertcr.azurecr.io/kserve/paddleserver:v0.12.0
    xiotlpccertcr.azurecr.io/kserve/pmmlserver:v0.12.0
    xiotlpccertcr.azurecr.io/kserve/router:v0.12.0
    xiotlpccertcr.azurecr.io/kserve/sklearnserver:v0.12.0
    xiotlpccertcr.azurecr.io/kserve/storage-initializer:v0.12.0
    xiotlpccertcr.azurecr.io/kserve/xgbserver:v0.12.0
    xiotlpccertcr.azurecr.io/kubebuilder/kube-rbac-proxy:v0.13.1
    xiotlpccertcr.azurecr.io/nvcr.io/nvidia/tritonserver:23.05-py3
    xiotlpccertcr.azurecr.io/pytorch/torchserve-kfs:0.9.0
    xiotlpccertcr.azurecr.io/seldonio/mlserver:1.3.2
    xiotlpccertcr.azurecr.io/tensorflow/serving:2.6.2
    xiotlpccertcr.azurecr.io/xiot-lpc-anls/serving-runtimes/statsmodels:0.12.0
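
     The exact override location depends on the charts; as a purely hypothetical illustration, a registry override in a values file might look like this (the key names are not taken from the actual charts):

    # Hypothetical values.yaml fragment -- real keys depend on the chart.
    kserve:
      image:
        repository: xiotlpccertcr.azurecr.io/kserve/kserve-controller
        tag: v0.12.0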
    
  6. If you are using a partner's registry, create a script (mirror_images.sh) that pulls each image, retags it for the target registry, and pushes it (a usage example follows the script):

    #!/bin/bash
    # Mirror images from the source registry to the partner's registry.
    SOURCE_REGISTRY="our_registry_url"      # e.g. xiotlpccertcr.azurecr.io
    TARGET_REGISTRY="partner_harbor_url"    # the partner's Harbor registry
    IMAGE_LIST="images.txt"                 # one image reference per line
    while IFS= read -r IMAGE; do
      # Strip the source registry prefix, keeping the repository path and tag.
      IMAGE_PATH_TAG="${IMAGE#${SOURCE_REGISTRY}/}"
      NEW_IMAGE="${TARGET_REGISTRY}/${IMAGE_PATH_TAG}"
      echo "Pulling image: $IMAGE"
      docker pull "$IMAGE" || { echo "Failed to pull $IMAGE"; exit 1; }
      echo "Tagging image: $NEW_IMAGE"
      docker tag "$IMAGE" "$NEW_IMAGE"
      echo "Pushing image: $NEW_IMAGE"
      docker push "$NEW_IMAGE" || { echo "Failed to push $NEW_IMAGE"; exit 1; }
      # Remove the local copies to save disk space.
      docker rmi "$IMAGE" "$NEW_IMAGE"
    done < "$IMAGE_LIST"
    echo "All images have been pulled, tagged, and pushed successfully."
    
  7. Create a Docker registry secret named partner-registry-secret in each required namespace (cert-manager, istio-system, knative-serving, mdsp-bk-kserve):

    for namespace in cert-manager istio-system knative-serving mdsp-bk-kserve; do
      kubectl -n "$namespace" delete secret partner-registry-secret --ignore-not-found
      kubectl -n "$namespace" create secret docker-registry partner-registry-secret \
        --docker-server="<partner-harbor-url>" \
        --docker-username="<partner-username>" \
        --docker-password="<partner-password>" \
        --docker-email=myemail@example.com
    done
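
     If the charts do not reference the pull secret themselves, one option (a sketch) is to attach it to the default service account in each namespace:

    for namespace in cert-manager istio-system knative-serving mdsp-bk-kserve; do
      kubectl -n "$namespace" patch serviceaccount default \
        -p '{"imagePullSecrets": [{"name": "partner-registry-secret"}]}'
    done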
    
  8. Enable the iptables NAT modules on the nodes.

    sudo modprobe iptable_nat
    sudo modprobe nf_nat
    
  9. Alternatively, run a batch command to load the modules on every node listed in nodes.txt. The second command removes the sshpass invocation, which contains the password, from the shell history:

    cat nodes.txt | xargs -I {} sshpass -p 'your_password' ssh -o StrictHostKeyChecking=no your_username@{} 'modprobe iptable_nat && modprobe nf_nat'
    history -d $(history | tail -n 2 | head -n 1 | awk '{print $1}')
    

Installing services using Helm and ArgoCD

  1. Clone the Helm charts repository from GitLab (contact the Siemens OPS team for the repository URL).

  2. Navigate to the kserve directory within the cloned repository and install the services in the order given above:

    cd kserve
    helm -n cert-manager install cert-manager cert-manager
    helm -n istio-system install istio istio
    helm -n knative-serving install knative-serving-crd knative-serving/knative-serving-crd
    helm -n knative-serving install knative-serving knative-serving/knative-serving
    helm -n mdsp-bk-kserve install kserve-crd kserve/kserve-crd
    helm -n mdsp-bk-kserve install kserve kserve/kserve
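
     Before moving on to the next chart, confirm that the pods in the corresponding namespace are Running and Ready, for example:

    for ns in cert-manager istio-system knative-serving mdsp-bk-kserve; do
      kubectl -n "$ns" get pods
    done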
    
  3. Alternatively, you can install the services using ArgoCD with the following application definitions:

  - name: cert-manager
    repoURL: <repo-url> # Please contact Siemens OPS team
    repoPath: manifests/oss/kserve/cert-manager
    targetRevision: lpc-rancher-china
    namespace: cert-manager


  - name: istio
    repoURL: <repo-url> # Please contact Siemens OPS team
    repoPath: manifests/oss/kserve/istio
    targetRevision: lpc-rancher-china
    namespace: istio-system


  - name: knative-serving-crd
    repoURL: <repo-url> # Please contact Siemens OPS team
    repoPath: manifests/oss/kserve/knative-serving/knative-serving-crd
    targetRevision: lpc-rancher-china


  - name: knative-serving
    repoURL: <repo-url> # Please contact Siemens OPS team
    repoPath: manifests/oss/kserve/knative-serving/knative-serving
    targetRevision: lpc-rancher-china
    namespace: knative-serving


  - name: kserve-crd
    repoURL: <repo-url> # Please contact Siemens OPS team
    repoPath: manifests/oss/kserve/kserve/kserve-crd
    targetRevision: lpc-rancher-china


  - name: kserve
    repoURL: <repo-url> # Please contact Siemens OPS team
    repoPath: manifests/oss/kserve/kserve/kserve
    targetRevision: lpc-rancher-china
    namespace: mdsp-bk-kserve
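
Once all services are deployed, a quick smoke test can confirm that KServe serves a model end to end. This is a sketch: the sample model URI comes from the public KServe examples and assumes outbound network access, so in an air-gapped setup substitute a storageUri reachable from your cluster.

    cat <<'EOF' | kubectl apply -f -
    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: sklearn-iris
      namespace: default   # use any user namespace
    spec:
      predictor:
        model:
          modelFormat:
            name: sklearn
          storageUri: gs://kfserving-examples/models/sklearn/1.0/model
    EOF

    # The service is ready when the READY column shows True.
    kubectl -n default get inferenceservice sklearn-iris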

Last update: January 28, 2025