Installing Ceph

This section provides the detailed process of deploying a Ceph cluster using Rook in a Kubernetes environment. It covers key aspects such as infrastructure requirements, resource planning and configuration updates.

Ceph is an open-source distributed storage system that provides object, block and file storage in a single unified system.

Prerequisites

The following requirements must be met to install the Ceph cluster using Rook:

  • Kubernetes cluster requirements:

    • Minimum of 3 Kubernetes nodes.
    • Kubernetes version: v1.17 or higher.
  • Node hardware specifications: Each node must have at least 4 CPU cores and 8 GB of RAM.

  • Storage requirements:

    • Raw disk devices (unpartitioned and unformatted). Verify using: lsblk -f.
    • Ensure the FSTYPE field is empty to confirm the disk is unformatted.
  • Logical volume management: LVM2 must be installed and configured on each node.
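
The disk check above can be scripted. The following is a minimal sketch, assuming lsblk from util-linux is available on the node; the helper name unformatted_disks is hypothetical. It lists whole disks whose FSTYPE column is empty. It does not look at partitions and may also list loop or ROM devices, so any disk it reports should still be inspected with lsblk -f before use.

```shell
# Print the names of whole disks that carry no filesystem signature.
# Reads lines in the format produced by: lsblk -dn -o NAME,FSTYPE
# (a line with a single field means the FSTYPE column is empty).
unformatted_disks() {
  awk 'NF == 1 { print $1 }'
}

# Run against the local node only if lsblk is present.
if command -v lsblk >/dev/null 2>&1; then
  lsblk -dn -o NAME,FSTYPE | unformatted_disks
fi
```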

Rook Architecture

The image illustrates a storage orchestration framework for Ceph within Kubernetes.

[Figure: Rook architecture]

Planning the Ceph cluster deployment

This section outlines the resource and software planning required for deploying a Ceph cluster using Rook in a Kubernetes environment.

  • Resource planning:

    Each OSD requires 2 GB of memory, so four OSDs consume 8 GB in total. The node itself requires an additional 8 GB, which gives a combined memory requirement of 16 GB per node.

    Node    CPU       Memory   Disk
    ceph1   4 cores   16 GB    Flexible based on requirements
    ceph2   4 cores   16 GB    Flexible based on requirements
    ceph3   4 cores   16 GB    Flexible based on requirements
    ceph4   4 cores   16 GB    Flexible based on requirements
  • Software versions:

    Image   Version
    Rook    1.7
    Ceph    v16.2.6
  • Service resource planning:

    Service   CPU Request   CPU Limit   Memory Request   Memory Limit   Comments
    rgw       500m          500m        1024Mi           1024Mi         Multiple replicas recommended
    mgr       500m          500m        1024Mi           1024Mi         For monitoring, 512Mi is recommended
    mon       500m          500m        1024Mi           1024Mi         Adjust to 2 GB if needed
    osd       1 core        1 core      2048Mi           2048Mi         1 Gbit/s network recommended
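
The per-node memory figure in the resource planning item is simple arithmetic: 4 × 2 GB for the OSDs plus 8 GB for the node itself. A small sketch using shell integer arithmetic makes it easy to redo the sum for a different OSD count; the function name node_memory_gb is hypothetical.

```shell
# node_memory_gb OSD_COUNT OSD_GB BASE_GB
# Total memory for one node: OSD memory plus the base allowance for the node.
node_memory_gb() {
  echo $(( $1 * $2 + $3 ))
}

node_memory_gb 4 2 8   # prints 16
```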

Updating parameters

  1. Retrieve the required packages by executing the following commands:

    git clone --branch release-1.7 https://github.com/rook/rook.git  
    cd rook/cluster/examples/kubernetes/ceph  
    

    Note: In the code blocks below, --- separates the individual fragments to be modified. It is not part of the actual YAML and must not be written into the files.

  2. Update the namespace (namespace: mdsp-bk-ceph) and the following settings in the cluster.yaml file:

    namespace: mdsp-bk-ceph # namespace:cluster
    ---
    spec:
      mgr:
        dashboard:
          port: 8443
    ---
    rulesNamespace: mdsp-bk-ceph
    ---
    placement:
      all:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: "ceph"
                operator: In
                values:
                - "true"
        podAffinity:
        podAntiAffinity:
        tolerations:
        - effect: "NoExecute"
          key: "domain"
          operator: "Equal"
          value: "ceph"
    ---
      mgr:
        limits:
          cpu: "0.6"
          memory: "1024Mi"
        requests:
          cpu: "0.2"
          memory: "512Mi"
      mon:
        limits:
          cpu: "0.5"
          memory: "1024Mi"
        requests:
          cpu: "0.2"
          memory: "1024Mi"
      osd:
        limits:
          cpu: "1"
          memory: "2048Mi"
        requests:
          cpu: "0.5"
          memory: "2048Mi"
    ---
    storage:
      useAllNodes: true
      useAllDevices: false
      deviceFilter: "^[sv]d[c-f]"
    
  3. Search for the following keys in the dashboard-external-https.yaml file and update their values:

    namespace: mdsp-bk-ceph # namespace:cluster
    ---
    rook_cluster: mdsp-bk-ceph # namespace:cluster
    ---
    rook_cluster: mdsp-bk-ceph
    
  4. Modify the content below in the object.yaml file.

    namespace: mdsp-bk-ceph # namespace:cluster
    ---
    instances: 2
    ---
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: "ceph"
              operator: In
              values:
              - "true"
      tolerations:
      - effect: "NoExecute"
        key: "domain"
        operator: "Equal"
        value: "ceph"
    ---
    resources:
      # The requests and limits set here allow the object store gateway Pod(s) to use half of one CPU core and 1 gigabyte of memory
      limits:
        cpu: "500m"
        memory: "1024Mi"
      requests:
        cpu: "100m"
        memory: "200Mi"
    
  5. Search for the following key in the object-user.yaml file and update its value:

    namespace: mdsp-bk-ceph # namespace:cluster
    
  6. Search for the following keys in the operator.yaml file and update both the ConfigMap and the Deployment:

    # configmap
    ROOK_CSI_ENABLE_CEPHFS: "false"
    ROOK_CSI_ENABLE_RBD: "false"
    CSI_ENABLE_CEPHFS_SNAPSHOTTER: "false"
    CSI_ENABLE_RBD_SNAPSHOTTER: "false"
    CSI_FORCE_CEPHFS_KERNEL_CLIENT: "false"
    CSI_PROVISIONER_NODE_AFFINITY: "ceph=true"
    CSI_PROVISIONER_TOLERATIONS: |
      - effect: NoExecute
        key: domain
        operator: Exists
    CSI_PLUGIN_TOLERATIONS: |
      - effect: NoExecute
        key: domain
        operator: Exists
    
    CSI_PLUGIN_NODE_AFFINITY: "ceph=true"
    
    ---
    namespace: mdsp-bk-ceph
    ---
    # Deployment
    namespace: mdsp-bk-ceph # namespace:operator
    # Add the following under spec.template.spec
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: "ceph"
              operator: "In"
              values:
              - "true"
    tolerations:
    - effect: "NoExecute"
      key: "domain"
      operator: "Equal"
      value: "ceph"
    
    # Under env, modify the following entries
    - name: AGENT_TOLERATION
      value: "NoExecute"
    - name: AGENT_NODE_AFFINITY
      value: "ceph=true"  
    - name: DISCOVER_TOLERATION
      value: "NoSchedule"
    - name: DISCOVER_AGENT_NODE_AFFINITY
      value: "ceph=true"
    
  7. Update the content of the rgw-external.yaml file.

    namespace: mdsp-bk-ceph
    ---
    rook_cluster: mdsp-bk-ceph
    ---
    rook_cluster: mdsp-bk-ceph
    
  8. Update the content of the toolbox.yaml file.

    namespace: mdsp-bk-ceph
    ---
    # Insert the following content under spec.template.spec
    
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: "ceph"
                    operator: "In"
                    values:
                    - "true"
          tolerations:
          - effect: "NoExecute"
            key: "domain"
            operator: "Equal"
            value: "ceph"
    
  9. Update the content of the monitoring/service-monitor.yaml file.

    namespace: monitoring
    labels:
      team: rook
      release: "prometheus-operator"
    ---
    namespaceSelector:
      matchNames:
        - mdsp-bk-ceph
    ---
    rook_cluster: mdsp-bk-ceph
    
  10. Update the content of the monitoring/rbac.yaml file.

    sed -i 's/namespace: rook-ceph/namespace: mdsp-bk-ceph/g' monitoring/rbac.yaml
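
The deviceFilter value set in cluster.yaml is a regular expression matched against device names. The sketch below approximates that matching with grep -E to preview which names an expression would select; the helper name and the sample device names are illustrative assumptions, and Rook's own matching may differ in detail. Inside a bracket expression a | is a literal character rather than alternation, so [sv] is enough to accept either s or v.

```shell
# preview_device_filter REGEX
# Reads device names on stdin and prints those the expression matches.
preview_device_filter() {
  grep -E -- "$1" || true   # an empty result is not treated as an error
}

printf '%s\n' sda sdb sdc sdf sdg vdd nvme0n1 |
  preview_device_filter '^[sv]d[c-f]'
# Selected names: sdc, sdf, vdd
```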
    

Deploying Ceph

This section outlines the sequential deployment steps along with the corresponding commands.
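
The manifests prepared earlier require the node label ceph=true (nodeAffinity) and tolerate a taint with key domain, value ceph and effect NoExecute. If the target nodes do not carry that label yet, the pods cannot be scheduled. The following sketch only prints the kubectl commands that would apply the label and the taint; the helper name print_node_prep is hypothetical, and the taint itself is an assumption inferred from the tolerations, not something this guide mandates.

```shell
# Print (not run) the commands that label and taint each Ceph node.
# The label ceph=true matches the nodeAffinity rules; the taint
# domain=ceph:NoExecute is an assumption inferred from the tolerations.
print_node_prep() {
  for node in "$@"; do
    echo "kubectl label node $node ceph=true --overwrite"
    echo "kubectl taint node $node domain=ceph:NoExecute --overwrite"
  done
}

# Node names taken from the resource planning table.
print_node_prep ceph1 ceph2 ceph3 ceph4
```

Review the printed commands before running them: a NoExecute taint evicts pods already running on the node that do not tolerate it.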

  1. Create the CRDs and other common resources.

    kubectl create -f crds.yaml -f common.yaml
    
  2. Deploy the operator.

    kubectl create -f operator.yaml
    
  3. Create the Ceph storage cluster.

    kubectl create -f cluster.yaml
    
  4. Create the S3 object store interface (RGW).

    kubectl create -f object.yaml
    
  5. Create the object storage user and generate credentials for the S3 interface.

    kubectl create -f object-user.yaml
    
  6. Deploy the external RGW service.

    kubectl create -f rgw-external.yaml
    
  7. Deploy the dashboard UI.

    kubectl create -f dashboard-external-https.yaml
    
  8. Create the toolbox for Ceph cluster command operations.

    kubectl create -f toolbox.yaml
    
  9. Create monitoring resources (if needed).

    kubectl create -f monitoring/rbac.yaml
    kubectl create -f monitoring/service-monitor.yaml
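
The sequence above can be wrapped in a small script. This is a sketch under stated assumptions, not a supported installer: the function names are hypothetical, the optional monitoring manifests are left out, and it is meant to be run from the directory that contains the edited manifests. A pre-flight check reports missing files before anything is created.

```shell
# Manifests in the order used by the deployment steps (monitoring excluded).
MANIFESTS="crds.yaml common.yaml operator.yaml cluster.yaml object.yaml
object-user.yaml rgw-external.yaml dashboard-external-https.yaml toolbox.yaml"

# missing_manifests [DIR]: print each expected manifest that is absent from DIR.
missing_manifests() {
  for f in $MANIFESTS; do
    [ -f "${1:-.}/$f" ] || echo "$f"
  done
}

# create_in_order [DIR]: create resources one file at a time and stop at the
# first failure. Defined here but not invoked automatically.
create_in_order() {
  for f in $MANIFESTS; do
    kubectl create -f "${1:-.}/$f" || return 1
  done
}
```

A typical use would be to run missing_manifests first and call create_in_order only when it prints nothing.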
    

Configuring Ceph

  1. Get the name of the Rook Ceph tools pod by executing the following command:

    TOOLS_POD=$(kubectl -n mdsp-bk-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}')
    
  2. Open a shell in the tools pod, then run the radosgw-admin commands inside it to add permissions for my-user:

    kubectl -n mdsp-bk-ceph exec -it "$TOOLS_POD" -- bash
    radosgw-admin caps add --uid=my-user --caps="roles=*"
    radosgw-admin caps add --uid=my-user --caps="user-policy=*"
    radosgw-admin caps add --uid=my-user --caps="usage=*"
    radosgw-admin user modify --uid=my-user --system
    radosgw-admin user info --uid=my-user
    
  3. Modify the RGW Ceph configuration. Since the modification is specific to RGW, only the RGW needs to be restarted. Apply the configuration changes as follows:

    kubectl -n mdsp-bk-ceph patch cm rook-config-override --type merge -p '{"data":{"config": "[global]\nrgw_sts_key = \"mindspheremdspcd\"\nrgw_s3_auth_use_sts = true\n"}}'
    
  4. Restart the RGW deployment.

    kubectl -n mdsp-bk-ceph rollout restart deploy rook-ceph-rgw-my-store-a
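
Before restarting RGW, it can help to confirm that the override really contains the STS settings. A minimal sketch: the helper has_sts_settings is hypothetical and only inspects configuration text passed on stdin; the commented kubectl line shows one way the ConfigMap content could be fed to it.

```shell
# has_sts_settings: succeed if the Ceph config text on stdin contains both
# STS-related options set by the patch above.
has_sts_settings() {
  cfg=$(cat)
  printf '%s\n' "$cfg" | grep -q '^rgw_sts_key *=' &&
    printf '%s\n' "$cfg" | grep -q '^rgw_s3_auth_use_sts *= *true'
}

# Example against the live ConfigMap (requires cluster access):
# kubectl -n mdsp-bk-ceph get cm rook-config-override \
#   -o jsonpath='{.data.config}' | has_sts_settings && echo "override OK"
```

After the restart, kubectl -n mdsp-bk-ceph rollout status deploy rook-ceph-rgw-my-store-a waits until the new RGW pod is ready.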
    

Extracting Access Key (AK) and Secret Key (SK)

This section describes the extraction and secure delivery of the AK/SK.

  1. Set the shell variables used by the following commands:

    • config (kubeconfig file): ~/.kube/config
    • NS (namespace): mdsp-bk-ceph
  2. Run the following command to extract and decode the access key from the Kubernetes secret:

    kubectl --kubeconfig=$config -n $NS get secret rook-ceph-object-user-my-store-my-user -o yaml | grep '^  AccessKey' | awk '{print $2}' | base64 -d
    
  3. Run the following command to extract and decode the secret key from the Kubernetes secret:

    kubectl --kubeconfig=$config -n $NS get secret rook-ceph-object-user-my-store-my-user -o yaml | grep '^  SecretKey' | awk '{print $2}' | base64 -d
    
  4. Provide the extracted access key and secret key to the Siemens OPS team securely.
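
As an alternative to the grep and awk pipeline in steps 2 and 3, the same fields can be read with a jsonpath expression, which does not depend on the indentation of the YAML output. This is a sketch; the function name get_s3_key is hypothetical, and it relies on $config and $NS being set as in the commands above.

```shell
# get_s3_key FIELD: print the decoded value of FIELD (AccessKey or SecretKey)
# from the object user secret. Uses $config and $NS from the calling shell.
get_s3_key() {
  kubectl --kubeconfig="$config" -n "$NS" \
    get secret rook-ceph-object-user-my-store-my-user \
    -o jsonpath="{.data.$1}" | base64 -d
}

# Usage (requires cluster access):
#   config=~/.kube/config
#   NS=mdsp-bk-ceph
#   get_s3_key AccessKey
#   get_s3_key SecretKey
```

Avoid echoing the decoded values into shared terminals or logs, since they are the credentials to be handed over securely.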


Last update: January 29, 2025