Installing KServe¶
This section provides the steps to set up KServe, a Kubernetes-based AI/ML model serving platform, on a Kubernetes cluster. It includes the necessary prerequisites, installation steps, and configuration guidelines for deploying KServe to manage and serve machine learning models.
Prerequisites¶
The following requirements must be met to install KServe:
Resource Requirements¶
Service | Component | CPU Limits (m) | Memory Limits (Mi) | CPU Requests (m) | Memory Requests (Mi) |
---|---|---|---|---|---|
Istio | istio-ingressgateway | 2000 | 1024 | 100 | 128 |
Istio | istiod | | | | |
knative-serving | activator | 1000 | 600 | 300 | 60 |
knative-serving | autoscaler | 1000 | 1000 | 100 | 100 |
knative-serving | autoscaler-hpa | 300 | 400 | 30 | 30 |
knative-serving | controller | 1000 | 1000 | 100 | 100 |
knative-serving | net-istio-controller | 300 | 400 | 30 | 40 |
knative-serving | net-istio-webhook | 200 | 200 | 20 | 20 |
knative-serving | webhook | 500 | 500 | 100 | 100 |
KServe | kube-rbac-proxy | | | | |
KServe | kserve-controller-manager | 100 | 300 | 100 | 300 |
Cert-Manager | cert-manager | | | | |
Cert-Manager | cert-manager-cainjector | | | | |
Cert-Manager | cert-manager-webhook | | | | |
Total | | 6400 | 5424 | 880 | 878 |
To check whether a node has enough resources for scheduling, try to schedule a pod with high CPU or memory requests and review its event log to confirm whether the node has sufficient capacity.
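A minimal sketch of this check is shown below. The pod name `resource-check`, the `busybox` image, and the request values are illustrative only; adjust them to the capacity you want to probe, and make sure a small utility image is reachable from the node.

```bash
# Illustrative only: schedule a pod with deliberately high resource requests.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: resource-check
spec:
  restartPolicy: Never
  containers:
  - name: resource-check
    image: busybox
    command: ["sleep", "60"]
    resources:
      requests:
        cpu: "4"
        memory: "4Gi"
EOF

# If the pod stays Pending, the Events section explains why (e.g. "Insufficient cpu").
kubectl describe pod resource-check

# Compare against what the node can actually allocate.
kubectl describe node <node-name> | grep -A 7 "Allocatable"

# Clean up the test pod.
kubectl delete pod resource-check
```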
Deployment Version¶
Ensure the deployment version is set to KServe Version: v1.13.1.
Node Preparation¶
Check whether the `iptable_nat` kernel module is loaded on each node, and load the NAT modules if it is not:

    lsmod | grep iptable_nat
    sudo modprobe iptable_nat
    sudo modprobe nf_nat
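Modules loaded with `modprobe` do not persist across reboots. As a hedged sketch (assuming the nodes use systemd-modules-load; the file name `kserve.conf` is an arbitrary choice), they can be loaded automatically at boot:

```bash
# Assumption: nodes read /etc/modules-load.d/ at boot (systemd-modules-load).
cat <<'EOF' | sudo tee /etc/modules-load.d/kserve.conf
iptable_nat
nf_nat
EOF
```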
Deployment Order¶
Follow the deployment sequence:
Sl no | Service | Namespace |
---|---|---|
1 | cert-manager | cert-manager |
2 | istio | istio-system |
3 | knative-serving-crd | |
4 | knative-serving | knative-serving |
5 | kserve-crd | |
6 | kserve | mdsp-bk-kserve |
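Before moving from one step in the sequence to the next, it is worth confirming that the pods in the namespace just deployed are healthy. A simple sketch:

```bash
# Check each namespace in deployment order; pods should be Running or Completed
# before the next service is installed.
for ns in cert-manager istio-system knative-serving mdsp-bk-kserve; do
  echo "--- $ns ---"
  kubectl get pods -n "$ns"
done
```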
Pre-pulled images¶
Ensure that all required images are pulled and saved in an archive.
Packaging and transferring images¶
- The following images are pulled beforehand:

      gcr.io/knative-releases/knative.dev/net-istio/cmd/controller:v1.13.1
      gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook:v1.13.1
      gcr.io/knative-releases/knative.dev/serving/cmd/activator:v1.13.1
      gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:v1.13.1
      gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler:v1.13.1
      gcr.io/knative-releases/knative.dev/serving/cmd/controller:v1.13.1
      gcr.io/knative-releases/knative.dev/serving/cmd/queue:v1.13.1
      gcr.io/knative-releases/knative.dev/serving/cmd/webhook:v1.13.1
      gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1
      istio/pilot:1.20.0
      istio/proxyv2:1.20.0
      kserve/agent:v0.12.0
      kserve/kserve-controller:v0.12.0
      kserve/modelmesh-controller:v0.11.2
      kserve/router:v0.12.0
      kserve/sklearnserver:v0.12.0
      kserve/storage-initializer:v0.12.0
      kserve/xgbserver:v0.12.0
      quay.io/jetstack/cert-manager-cainjector:v1.14.3
      quay.io/jetstack/cert-manager-controller:v1.14.3
      quay.io/jetstack/cert-manager-webhook:v1.14.3
- Use the following commands to pull all necessary images:

      docker pull gcr.io/knative-releases/knative.dev/net-istio/cmd/controller:v1.13.1
      docker pull gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook:v1.13.1
      docker pull gcr.io/knative-releases/knative.dev/serving/cmd/activator:v1.13.1
      docker pull gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:v1.13.1
      docker pull gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler:v1.13.1
      docker pull gcr.io/knative-releases/knative.dev/serving/cmd/controller:v1.13.1
      docker pull gcr.io/knative-releases/knative.dev/serving/cmd/queue:v1.13.1
      docker pull gcr.io/knative-releases/knative.dev/serving/cmd/webhook:v1.13.1
      docker pull gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1
      docker pull istio/pilot:1.20.0
      docker pull istio/proxyv2:1.20.0
      docker pull kserve/agent:v0.12.0
      docker pull kserve/kserve-controller:v0.12.0
      docker pull kserve/modelmesh-controller:v0.11.2
      docker pull kserve/router:v0.12.0
      docker pull kserve/sklearnserver:v0.12.0
      docker pull kserve/storage-initializer:v0.12.0
      docker pull kserve/xgbserver:v0.12.0
      docker pull quay.io/jetstack/cert-manager-cainjector:v1.14.3
      docker pull quay.io/jetstack/cert-manager-controller:v1.14.3
      docker pull quay.io/jetstack/cert-manager-webhook:v1.14.3
- Package the pulled images into a single tarball:

      docker save \
        gcr.io/knative-releases/knative.dev/net-istio/cmd/controller:v1.13.1 \
        gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook:v1.13.1 \
        gcr.io/knative-releases/knative.dev/serving/cmd/activator:v1.13.1 \
        gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:v1.13.1 \
        gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler:v1.13.1 \
        gcr.io/knative-releases/knative.dev/serving/cmd/controller:v1.13.1 \
        gcr.io/knative-releases/knative.dev/serving/cmd/queue:v1.13.1 \
        gcr.io/knative-releases/knative.dev/serving/cmd/webhook:v1.13.1 \
        gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1 \
        istio/pilot:1.20.0 \
        istio/proxyv2:1.20.0 \
        kserve/agent:v0.12.0 \
        kserve/kserve-controller:v0.12.0 \
        kserve/modelmesh-controller:v0.11.2 \
        kserve/router:v0.12.0 \
        kserve/sklearnserver:v0.12.0 \
        kserve/storage-initializer:v0.12.0 \
        kserve/xgbserver:v0.12.0 \
        quay.io/jetstack/cert-manager-cainjector:v1.14.3 \
        quay.io/jetstack/cert-manager-controller:v1.14.3 \
        quay.io/jetstack/cert-manager-webhook:v1.14.3 \
        | gzip > kserveallimg.tar.gz
- Send `kserveallimg.tar.gz` to the partner, and have them load the images on their nodes using the following command:

      docker load -i kserveallimg.tar.gz
- To use a different registry (for example, Azure), edit the registry in the chart configuration. The `images.txt` file should list the Azure registry images as follows:

      xiotlpccertcr.azurecr.io/jetstack/cert-manager-cainjector:v1.14.3
      xiotlpccertcr.azurecr.io/jetstack/cert-manager-controller:v1.14.3
      xiotlpccertcr.azurecr.io/jetstack/cert-manager-webhook:v1.14.3
      xiotlpccertcr.azurecr.io/istio/pilot:1.20.0
      xiotlpccertcr.azurecr.io/istio/proxyv2:1.20.0
      xiotlpccertcr.azurecr.io/knative-releases/knative.dev/net-istio/cmd/controller:v1.13.1
      xiotlpccertcr.azurecr.io/knative-releases/knative.dev/net-istio/cmd/webhook:v1.13.1
      xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/activator:v1.13.1
      xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:v1.13.1
      xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/autoscaler:v1.13.1
      xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/controller:v1.13.1
      xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/queue:v1.13.1
      xiotlpccertcr.azurecr.io/knative-releases/knative.dev/serving/cmd/webhook:v1.13.1
      xiotlpccertcr.azurecr.io/kserve/agent:v0.12.0
      xiotlpccertcr.azurecr.io/kserve/kserve-controller:v0.12.0
      xiotlpccertcr.azurecr.io/kserve/lgbserver:v0.12.0
      xiotlpccertcr.azurecr.io/kserve/modelmesh-controller:v0.11.2
      xiotlpccertcr.azurecr.io/kserve/paddleserver:v0.12.0
      xiotlpccertcr.azurecr.io/kserve/pmmlserver:v0.12.0
      xiotlpccertcr.azurecr.io/kserve/router:v0.12.0
      xiotlpccertcr.azurecr.io/kserve/sklearnserver:v0.12.0
      xiotlpccertcr.azurecr.io/kserve/storage-initializer:v0.12.0
      xiotlpccertcr.azurecr.io/kserve/xgbserver:v0.12.0
      xiotlpccertcr.azurecr.io/kubebuilder/kube-rbac-proxy:v0.13.1
      xiotlpccertcr.azurecr.io/nvcr.io/nvidia/tritonserver:23.05-py3
      xiotlpccertcr.azurecr.io/pytorch/torchserve-kfs:0.9.0
      xiotlpccertcr.azurecr.io/seldonio/mlserver:1.3.2
      xiotlpccertcr.azurecr.io/tensorflow/serving:2.6.2
      xiotlpccertcr.azurecr.io/xiot-lpc-anls/serving-runtimes/statsmodels:0.12.0
- If you are using a partner's registry, create a script (`mirror_images.sh`) to pull, tag, and push the images. A sketch for verifying the mirrored images follows this list.

      #!/bin/bash
      SOURCE_REGISTRY="our_registry_url"
      TARGET_REGISTRY="partner_harbor_url"
      IMAGE_LIST="images.txt"

      while IFS= read -r IMAGE; do
        # Strip the source registry prefix and retag the image for the target registry.
        IMAGE_PATH_TAG="${IMAGE#${SOURCE_REGISTRY}/}"
        NEW_IMAGE="${TARGET_REGISTRY}/${IMAGE_PATH_TAG}"

        echo "Pulling image: $IMAGE"
        docker pull "$IMAGE" || { echo "Failed to pull $IMAGE"; exit 1; }

        echo "Tagging image: $NEW_IMAGE"
        docker tag "$IMAGE" "$NEW_IMAGE"

        echo "Pushing image: $NEW_IMAGE"
        docker push "$NEW_IMAGE" || { echo "Failed to push $NEW_IMAGE"; exit 1; }

        # Remove local copies to save disk space.
        docker rmi "$IMAGE" "$NEW_IMAGE"
      done < "$IMAGE_LIST"

      echo "All images have been pulled, tagged, and pushed successfully."
- Create a docker registry secret in each of the namespaces used by the deployment (`cert-manager`, `istio-system`, `knative-serving`, `mdsp-bk-kserve`). One common way to make pods use this secret is shown in the sketch after this list.

      for namespace in cert-manager istio-system knative-serving mdsp-bk-kserve; do
        # Remove any stale secret before recreating it.
        kubectl -n $namespace delete secret partner-registry-secret --ignore-not-found
        kubectl -n $namespace create secret docker-registry partner-registry-secret \
          --docker-server=<partner-harbor-url> \
          --docker-username=<partner-username> \
          --docker-password=<partner-password> \
          --docker-email=myemail@example.com
      done
- Enable the iptables NAT module on the nodes:

      modprobe iptable_nat
- Run a batch command to apply it to all nodes (`nodes.txt` contains one node address per line). The final `history -d` command removes the line containing the plaintext password from the shell history:

      cat nodes.txt | xargs -I {} sshpass -p 'your_password' ssh -o StrictHostKeyChecking=no your_username@{} 'modprobe iptable_nat && modprobe nf_nat'
      history -d $(history | tail -n 2 | head -n 1 | awk '{print $1}')
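As a hedged follow-up to the mirroring script, the following sketch checks that every image listed in `images.txt` is resolvable in the registry. It assumes the Docker CLI is already logged in to that registry and that `docker manifest inspect` is available (on older Docker releases it must be enabled as an experimental CLI feature).

```bash
# Assumption: "docker login" has already been run against the registry that
# hosts the images listed in images.txt.
while IFS= read -r image; do
  if docker manifest inspect "$image" > /dev/null 2>&1; then
    echo "OK      $image"
  else
    echo "MISSING $image"
  fi
done < images.txt
```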
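The registry secret created above only takes effect if the pods reference it. Whether the Helm charts already expose an `imagePullSecrets` value depends on the charts themselves; as a hedged sketch of one common alternative (not taken from the charts), the secret can be attached to the `default` service account in each namespace:

```bash
# Assumption: the workloads use the "default" service account and the charts do
# not already configure imagePullSecrets; prefer a chart value if one exists.
for namespace in cert-manager istio-system knative-serving mdsp-bk-kserve; do
  kubectl -n "$namespace" patch serviceaccount default \
    -p '{"imagePullSecrets":[{"name":"partner-registry-secret"}]}'
done
```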
Installing services using Helm and ArgoCD¶
- Clone the Helm charts repository from GitLab. Contact the Siemens OPS team for the repository URL and access.
- Navigate to the `kserve` directory within the cloned repository and install the services in the order listed above:

      cd kserve
      helm -n cert-manager install cert-manager cert-manager
      helm -n istio-system install istio istio
      helm -n knative-serving install knative-serving-crd knative-serving/knative-serving-crd
      helm -n knative-serving install knative-serving knative-serving/knative-serving
      helm -n mdsp-bk-kserve install kserve-crd kserve/kserve-crd
      helm -n mdsp-bk-kserve install kserve kserve/kserve
- Alternatively, install the services using ArgoCD with application definitions such as the following. A smoke-test sketch for either installation path is given at the end of this section.
      - name: cert-manager
        repoURL: <repo-url> # Please contact Siemens OPS team
        repoPath: manifests/oss/kserve/cert-manager
        targetRevision: lpc-rancher-china
        namespace: cert-manager
      - name: istio
        repoURL: <repo-url> # Please contact Siemens OPS team
        repoPath: manifests/oss/kserve/istio
        targetRevision: lpc-rancher-china
        namespace: istio-system
      - name: knative-serving-crd
        repoURL: <repo-url> # Please contact Siemens OPS team
        repoPath: manifests/oss/kserve/knative-serving/knative-serving-crd
        targetRevision: lpc-rancher-china
      - name: knative-serving
        repoURL: <repo-url> # Please contact Siemens OPS team
        repoPath: manifests/oss/kserve/knative-serving/knative-serving
        targetRevision: lpc-rancher-china
        namespace: knative-serving
      - name: kserve-crd
        repoURL: <repo-url> # Please contact Siemens OPS team
        repoPath: manifests/oss/kserve/kserve/kserve-crd
        targetRevision: lpc-rancher-china
      - name: kserve
        repoURL: <repo-url> # Please contact Siemens OPS team
        repoPath: manifests/oss/kserve/kserve/kserve
        targetRevision: lpc-rancher-china
        namespace: mdsp-bk-kserve
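Once all six releases are installed (via Helm or ArgoCD), a minimal smoke test is to create a sample InferenceService, following the public KServe quickstart. This is illustrative only: the `sklearn-iris` name and the public `storageUri` come from the KServe examples and would need to be replaced with a model in your own storage in an air-gapped environment.

```bash
# Illustrative smoke test based on the public KServe quickstart example.
cat <<'EOF' | kubectl apply -f -
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
EOF

# The service is ready when the READY column reports True.
kubectl get inferenceservice sklearn-iris
```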