  1. ONAP Operations Manager
  2. OOM-2418

Readiness-check 2.0.2 not working properly for stateful set


    • Type: Bug
    • Resolution: Done
    • Priority: High
    • Guilin Release
    • Frankfurt Release

      This is a sample of a default description for OOM bug questions

      The readiness-check script for stateful sets compares the wrong fields of the spec and status.

      It currently checks that status.updatedReplicas = spec.replicas and that status.currentReplicas = spec.replicas, which stops working as soon as the revision of the stateful set changes.

      For instance, if you deploy a mariadb cluster from scratch, currentReplicas and updatedReplicas will have the same value, equal to spec.replicas.

      Here's an example with one replica:

      status:
        collisionCount: 0
        currentReplicas: 1
        currentRevision: onap-mariadb-galera-59f77768cf
        observedGeneration: 1
        readyReplicas: 1
        replicas: 1
        updateRevision: onap-mariadb-galera-59f77768cf
        updatedReplicas: 1

      Now, if I change the stateful set definition (for instance, by changing the livenessProbe value), the status becomes:

      status:
        collisionCount: 0
        currentReplicas: 1
        currentRevision: onap-mariadb-galera-59f77768cf
        observedGeneration: 2
        readyReplicas: 1
        replicas: 1
        updateRevision: onap-mariadb-galera-b8bfbd7cd

      As you can see, updateRevision now differs from currentRevision and updatedReplicas is no longer reported, even though I have not deleted the pod yet.

      If I delete the pod and let it restart, the status becomes:

      status:
        collisionCount: 0
        currentRevision: onap-mariadb-galera-59f77768cf
        observedGeneration: 2
        readyReplicas: 1
        replicas: 1
        updateRevision: onap-mariadb-galera-b8bfbd7cd
        updatedReplicas: 1

      Now currentReplicas is no longer there, because no pods remain that were created with the stateful set revision `currentRevision`; the stateful set only contains one pod, created with the revision `updateRevision`.

      This prevents all pods depending on mariadb from starting, because the script always returns false: currentReplicas is missing, so it is never equal to spec.replicas.
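
The failure can be reproduced without a cluster. The sketch below is a hedged reconstruction of the comparison described in this issue, not the actual readiness-check source: the function name `is_ready_current` and the plain-dict representation of the stateful set are assumptions made for illustration, and only the replica counters from the three status dumps above are kept.

```python
# Hypothetical reconstruction of the check described in this issue; the real
# readiness-check script may be structured differently.
def is_ready_current(spec, status):
    """True only if updatedReplicas and currentReplicas both equal spec.replicas."""
    wanted = spec["replicas"]
    # A counter missing from the status is read as None, so it never matches.
    return (status.get("updatedReplicas") == wanted
            and status.get("currentReplicas") == wanted)


spec = {"replicas": 1}

# Replica counters taken from the three status dumps in this issue.
fresh_deploy = {"currentReplicas": 1, "updatedReplicas": 1, "readyReplicas": 1, "replicas": 1}
after_spec_change = {"currentReplicas": 1, "readyReplicas": 1, "replicas": 1}
after_pod_restart = {"updatedReplicas": 1, "readyReplicas": 1, "replicas": 1}

results = {
    "fresh_deploy": is_ready_current(spec, fresh_deploy),
    "after_spec_change": is_ready_current(spec, after_spec_change),
    "after_pod_restart": is_ready_current(spec, after_pod_restart),
}
print(results)
```

Under these assumptions only the fresh deployment passes the check, even though readyReplicas is 1 in all three states.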

      The logic should be fairly easy to fix by looking only at readyReplicas, since that is ultimately what we want: wait until the number of ready pods equals spec.replicas.
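
A minimal sketch of the proposed check, assuming the stateful set is available as plain dicts. The name `is_ready_proposed` is hypothetical, and treating a missing readyReplicas as 0 is an assumption of this sketch, not something confirmed by the readiness-check source.

```python
# Hypothetical sketch of the proposed fix; not the actual readiness-check code.
def is_ready_proposed(spec, status):
    """True when the number of ready pods equals spec.replicas."""
    wanted = spec["replicas"]
    # Assumption: an absent readyReplicas counter means no pod is ready yet.
    ready = status.get("readyReplicas") or 0
    return ready == wanted


spec = {"replicas": 1}

# Replica counters taken from the status dumps in this issue.
fresh_deploy = {"currentReplicas": 1, "updatedReplicas": 1, "readyReplicas": 1, "replicas": 1}
after_spec_change = {"currentReplicas": 1, "readyReplicas": 1, "replicas": 1}
after_pod_restart = {"updatedReplicas": 1, "readyReplicas": 1, "replicas": 1}
# Illustrative extra case (not from the issue): the pod exists but is not ready.
pod_not_ready = {"updatedReplicas": 1, "replicas": 1}

for name, status in [("fresh_deploy", fresh_deploy),
                     ("after_spec_change", after_spec_change),
                     ("after_pod_restart", after_pod_restart),
                     ("pod_not_ready", pod_not_ready)]:
    print(name, is_ready_proposed(spec, status))
```

With this check the result no longer depends on which revision created the pods, so the three states shown in this issue are all reported as ready, while a stateful set whose pod is not yet ready is not.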

       

       

            spremont spremont
            spremont spremont