  1. ONAP Operations Manager
  2. OOM-1128

AAF CS fails to start in OpenLab


    • Type: Bug
    • Resolution: Done
    • Priority: Medium
    • Beijing Release

      I noticed that AAF CS fails to start because the postStart hook relies on a hardcoded 30-second sleep, which is too short for Cassandra to start listening on port 9042 before the cqlsh commands run:

      lifecycle:
        postStart:
          exec:
            command:
            - /bin/sh
            - -c
            - >
              /bin/sleep 30;
              cd /data/;
              cqlsh -u root -p root -f keyspace.cql ;
              cqlsh -u root -p root -f init.cql ;
              cqlsh -u root -p root -f osaaf.cql ;
              cqlsh -u root -p root -f temp_identity.cql
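
      One way to avoid depending on a fixed delay is to poll until Cassandra accepts connections and only then load the .cql files. The snippet below is a minimal sketch of that idea, not the change that was actually merged for this issue. The helper name wait_for and its arguments (maximum attempts, delay in seconds, command) are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the OOM charts): run a command repeatedly
# until it succeeds or the attempt budget is exhausted.
# Usage: wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  # Output of the probed command is discarded; only its exit status matters.
  while ! "$@" >/dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}
```

      In the hook, the /bin/sleep 30 line could then be replaced by something like wait_for 60 5 cqlsh -u root -p root -e 'DESCRIBE KEYSPACES' ahead of the four cqlsh -f calls. The values 60 and 5 are assumptions and would need tuning for the target environment.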

       

       

      onap  fff-aaf-cm-667494469-cxd7g        0/1  PLACEHOLDER
      ^Croot@borislav-rancher:~/onap/master/oom/kubernetes# k describe po -n onap fff-aaf-cs-6577f9c5cf-xlczl 0/1 PostStartHookError
      error: there is no need to specify a resource type as a separate argument when passing arguments in resource/name form (e.g. 'kubectl get resource/<resource_name>' instead of 'kubectl get resource resource/<resource_name>'
      root@borislav-rancher:~/onap/master/oom/kubernetes# k describe po -n onap fff-aaf-cs-6577f9c5cf-xlczl
      Name:           fff-aaf-cs-6577f9c5cf-xlczl
      Namespace:      onap
      Node:           borislav-node-2/10.0.17.4
      Start Time:     Thu, 07 Jun 2018 07:57:22 +0000
      Labels:         app=aaf-cs
                      pod-template-hash=2133957179
                      release=fff
      Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"onap","name":"fff-aaf-cs-6577f9c5cf","uid":"69e38f32-6a28-11e8-9c36-021e6111a393"...
      Status:         Running
      IP:             10.42.220.138
      Created By:     ReplicaSet/fff-aaf-cs-6577f9c5cf
      Controlled By:  ReplicaSet/fff-aaf-cs-6577f9c5cf
      Containers:
        aaf-cs:
          Container ID:   docker://6ca2f7a1da02807c402073bf41120f58c13c50ab5b6df303d63677791f5f3b58
          Image:          nexus3.onap.org:10001/library/cassandra:3.11
          Image ID:       docker-pullable://nexus3.onap.org:10001/library/cassandra@sha256:7155cb41dd1a508c5ea67b1750e98381d495873637aea6dcb9a5b13e3b2fb133
          Ports:          7000/TCP, 7001/TCP, 9042/TCP, 9160/TCP
          State:          Waiting
            Reason:       PostStartHookError
          Last State:     Terminated
            Reason:       Error
            Exit Code:    143
            Started:      Thu, 07 Jun 2018 08:00:52 +0000
            Finished:     Thu, 07 Jun 2018 08:01:50 +0000
          Ready:          False
          Restart Count:  1
          Liveness:       tcp-socket :9042 delay=180s timeout=1s period=10s #success=1 #failure=3
          Readiness:      tcp-socket :9042 delay=180s timeout=1s period=10s #success=1 #failure=3
          Environment:    <none>
          Mounts:
            /data from aaf-cs-data (rw)
            /etc/localtime from localtime (ro)
            /var/run/secrets/kubernetes.io/serviceaccount from default-token-fzrzl (ro)
      Conditions:
        Type          Status
        Initialized   True
        Ready         False
        PodScheduled  True
      Volumes:
        localtime:
          Type:  HostPath (bare host directory volume)
          Path:  /etc/localtime
        aaf-cs-data:
          Type:        Secret (a volume populated by a Secret)
          SecretName:  fff-aaf-cs
          Optional:    false
        default-token-fzrzl:
          Type:        Secret (a volume populated by a Secret)
          SecretName:  default-token-fzrzl
          Optional:    false
      QoS Class:       BestEffort
      Node-Selectors:  <none>
      Tolerations:     node.alpha.kubernetes.io/notReady:NoExecute for 300s
                       node.alpha.kubernetes.io/unreachable:NoExecute for 300s
      Events:
      Type Reason Age From Message
      ---- ------ ---- ---- -------
      Normal SuccessfulMountVolume 5m kubelet, borislav-node-2 MountVolume.SetUp succeeded for volume "localtime"
      Normal Scheduled 5m default-scheduler Successfully assigned fff-aaf-cs-6577f9c5cf-xlczl to borislav-node-2
      Normal SuccessfulMountVolume 5m kubelet, borislav-node-2 MountVolume.SetUp succeeded for volume "aaf-cs-data"
      Normal SuccessfulMountVolume 5m kubelet, borislav-node-2 MountVolume.SetUp succeeded for volume "default-token-fzrzl"
      Warning FailedPostStartHook 56s (x2 over 3m) kubelet, borislav-node-2 Exec lifecycle hook ([/bin/sh -c /bin/sleep 30; cd /data/; cqlsh -u root -p root -f keyspace.cql ; cqlsh -u root -p root -f init.cql ; cqlsh -u root -p root -f osaaf.cql ; cqlsh -u root -p root -f temp_identity.cql
      ]) for Container "aaf-cs" in Pod "fff-aaf-cs-6577f9c5cf-xlczl_onap(69e5002c-6a28-11e8-9c36-021e6111a393)" failed - error: command '/bin/sh -c /bin/sleep 30; cd /data/; cqlsh -u root -p root -f keyspace.cql ; cqlsh -u root -p root -f init.cql ; cqlsh -u root -p root -f osaaf.cql ; cqlsh -u root -p root -f temp_identity.cql
      ' exited with 1: Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
      Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
      Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
      Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
      , message: "Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, \"Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused\")})\nConnection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, \"Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused\")})\nConnection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, \"Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused\")})\nConnection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, \"Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused\")})\n"
      Normal Killing 45s (x2 over 2m) kubelet, borislav-node-2 Killing container with id docker://aaf-cs:FailedPostStartHook
      Warning FailedSync 45s (x2 over 2m) kubelet, borislav-node-2 Error syncing pod
      Normal Pulled 24s (x3 over 4m) kubelet, borislav-node-2 Container image "nexus3.onap.org:10001/library/cassandra:3.11" already present on machine
      Normal Created 24s (x3 over 4m) kubelet, borislav-node-2 Created container
      Normal Started 24s (x3 over 4m) kubelet, borislav-node-2 Started container

            BorislavG Borislav Glozman