Application Authorization Framework / AAF-1078

AAF pods not starting in Master (Frankfurt)


      AAF pods are no longer starting properly in the latest CI gating and daily chains.

      It is not systematic and the behavior seems somewhat erratic (yesterday's daily master was OK), but we are hitting the issue more and more frequently. No such issue is reported on the daily El Alto runs, which supports the idea of a regression in Master.

      Moreover, it has side effects on the components that interact with AAF (OOF, DCAE, CLAMP...).

      ---------------------------------------------------------------------------------------------------------

      Log 1 (gating 100789-3)

      https://orange-opensource.gitlab.io/lfn/onap/xtesting-onap-view/results//100789-3/infrastructure-healthcheck/k8s/onap-k8s.log

      In the describe output of the pod we can see:

      ...

      Readiness: http-get https://:10443/v1/sms/quorum/status delay=10s timeout=1s period=30s #success=1 #failure=3

      .....

      Events:
        Type     Reason   Age                     From                              Message
        ----     ------   ----                    ----                              -------
        Normal   Pulling  37m (x20 over 113m)     kubelet, compute01-onap-gating-2  Pulling image "oomk8s/readiness-check:2.0.2"
        Warning  BackOff  2m47s (x483 over 112m)  kubelet, compute01-onap-gating-2  Back-off restarting failed container
      Error from server (BadRequest): container "aaf-sms" in pod "onap-aaf-aaf-sms-c94596d94-gpgwz" is waiting to start: PodInitializing
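To compare probe settings between failing and healthy pods, the Readiness line from the describe output can be parsed into numbers. The following is only a sketch: the function names are hypothetical, the regular expression is written against the single line quoted above, and the "failure window" is a rough estimate (failure threshold times period), not an exact Kubernetes timing guarantee.

```python
import re

# Matches the probe summary line as it appears in the describe output above.
PROBE_RE = re.compile(
    r"(?P<kind>Readiness|Liveness):\s+(?P<handler>\S+)\s+(?P<target>\S+)\s+"
    r"delay=(?P<delay>\d+)s\s+timeout=(?P<timeout>\d+)s\s+period=(?P<period>\d+)s\s+"
    r"#success=(?P<success>\d+)\s+#failure=(?P<failure>\d+)"
)


def parse_probe(line):
    """Return the probe settings found in one describe line as a dict."""
    match = PROBE_RE.search(line)
    if match is None:
        raise ValueError("no probe description found in: %r" % line)
    probe = match.groupdict()
    for key in ("delay", "timeout", "period", "success", "failure"):
        probe[key] = int(probe[key])
    return probe


def approx_failure_window(probe):
    """Rough estimate (seconds) of how long consecutive failures must last
    before the failure threshold is reached."""
    return probe["failure"] * probe["period"]
```

Applied to the line above, this gives a 1 s timeout, a 30 s period and a failure threshold of 3. Note that in this log the container is still in PodInitializing, so the probe values are context rather than the cause.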

       

      In the CLAMP pod logs we have:

      2020-01-31T00:27:31.735+0000 INIT [cadi] jdk.tls.client.protocols set from Default Protocols
      org.onap.aaf.cadi.CadiException: org.onap.aaf.cadi.LocatorException: No Entries found for 'https://aaf-locate.onap:8095/locate/onap.org.osaaf.aaf.cm:2.1'
      at org.onap.aaf.cadi.aaf.v2_0.AAFConHttp.rclient(AAFConHttp.java:135)
      at org.onap.aaf.cadi.aaf.v2_0.AAFCon.client(AAFCon.java:223)
      at org.onap.aaf.cadi.configure.Agent.readArtifact(Agent.java:542)
      at org.onap.aaf.cadi.configure.Agent.main(Agent.java:313)
      Caused by: org.onap.aaf.cadi.LocatorException: No Entries found for 'https://aaf-locate.onap:8095/locate/onap.org.osaaf.aaf.cm:2.1'
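To see which AAF services the locator fails to resolve across several client pods (CLAMP here, possibly OOF and DCAE too), the LocatorException messages can be counted per URL. This is a minimal sketch assuming the message format shown in the trace above; the function names are hypothetical, and the split into service name and version is an interpretation of the last URL segment, not something AAF documentation is quoted for here.

```python
import re
from collections import Counter

# Message format copied from the CLAMP trace above.
LOCATOR_RE = re.compile(
    r"LocatorException: No Entries found for '(?P<url>[^']+)'"
)


def missing_locator_entries(log_text):
    """Count how often each locate URL is reported as having no entries."""
    return Counter(m.group("url") for m in LOCATOR_RE.finditer(log_text))


def service_and_version(url):
    """Split the last URL segment, e.g. 'name:2.1', into (name, version)."""
    last_segment = url.rsplit("/", 1)[-1]
    service, _, version = last_segment.partition(":")
    return service, version
```

Feeding the saved pod logs to missing_locator_entries() gives one counter per pod, which makes it easier to tell whether only aaf-cm is missing from the locator or other AAF services as well.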

      ---------------------------------------------------------------------------------------------------------

      Log 2 (gating 100964-1)

      https://orange-opensource.gitlab.io/lfn/onap/xtesting-onap-view/results//100964-1/infrastructure-healthcheck/k8s/onap-k8s.log 

      Name: onap-aaf-aaf-cm-76bf68f7bf-96qg5

      ....

      Warning Unhealthy 6m19s (x357 over 112m) kubelet, compute08-onap-gating-3 Readiness probe failed: dial tcp 10.233.69.118:8150: connect: connection refused
      Warning BackOff 83s (x226 over 94m) kubelet, compute08-onap-gating-3 Back-off restarting failed container

      2020-01-31 00:05:30,966 WARN [init] 2020-01-31T00:05:30.965+0000 INIT [init] Cass User = cassandra
      2020-01-31 00:05:31,065 WARN [init] 2020-01-31T00:05:31.065+0000 INIT [init] cadi_keyfile points to /opt/app/osaaf/local/org.osaaf.aaf.keyfile
      2020-01-31 00:05:31,069 WARN [init] 2020-01-31T00:05:31.069+0000 INIT [init] Cass ResetExceptions = com.datastax.driver.core.exceptions.NoHostAvailableException:"no host was tried":"Connection has been closed"
      2020-01-31 00:05:31,267 WARN [init] 2020-01-31T00:05:31.267+0000 INIT [init] Service Latitude,Longitude = 38.000000,-72.000000
      2020-01-31 00:05:31,268 WARN [init] 2020-01-31T00:05:31.267+0000 INIT [init] Cass Clusters = 'aaf-cass.onap'

      2020-01-31 00:05:31,566 WARN [init] 2020-01-31T00:05:31.566+0000 INIT [init] Cassandra is using Default Policy, which is not DC aware
      2020-01-31 00:06:02,567 WARN [init] 2020-01-31T00:06:02.567+0000 INIT [init] Loading Certificate Authority Module: local
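Both symptoms above (connection refused on 8150 and NoHostAvailableException towards 'aaf-cass.onap') come down to whether a TCP port is reachable from inside the cluster. A small check like the one below, run from a debug pod in the same namespace, could help tell them apart. It is a sketch only: the function names are hypothetical, and port 9042 is Cassandra's default native transport port, which is an assumption about this deployment rather than something taken from the logs.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and DNS resolution failures.
        return False


def check_targets(targets, timeout=3.0):
    """Map each (host, port) pair to its reachability."""
    return {target: can_connect(target[0], target[1], timeout) for target in targets}
```

For example, check_targets([("aaf-cass.onap", 9042), ("10.233.69.118", 8150)]) would show whether Cassandra is reachable while aaf-cm itself is still refusing connections (the second pair is the pod IP and port from the readiness probe event above).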

       

      Note: I assigned this Jira to dmcbride because the AAF PTL role is not clear to me; as far as I understand, Jonathan is no longer involved.

      + sdesbure kopasiak Katel34 zwarico melliott jackl bdfreeman1421 vv770d ChrisC

            johnfraney
            mrichomme