Data Movement as a Platform / DMAAP-1010

[DR] DMaaP Data Router fails healthcheck

    • Type: Bug
    • Resolution: Done
    • Priority: Medium
    • Dublin Release
    • Casablanca Maintenance Release
    • Observed on Windriver lab Integration-OOM-Daily tenant

      After ONAP is deployed, the DMaaP Data Router fails its healthcheck: the dr-prov pod is in CrashLoopBackOff and the dr-node pod is stuck in Init:0/1.

      root@release-rancher:~# kubectl -n onap get pod | grep dmaap
      dev-dmaap-dbc-pg-0                                   1/1   Running            0     13h
      dev-dmaap-dbc-pg-1                                   1/1   Running            0     13h
      dev-dmaap-dbc-pgpool-7b748d5894-t97w5                1/1   Running            0     13h
      dev-dmaap-dbc-pgpool-7b748d5894-v4pkf                1/1   Running            0     13h
      dev-dmaap-dmaap-bus-controller-765495b674-h9nm8      1/1   Running            0     13h
      dev-dmaap-dmaap-dr-db-56b956df8d-4dbgp               1/1   Running            1     13h
      dev-dmaap-dmaap-dr-node-64864d5cc-fbp7g              0/1   Init:0/1           82    13h
      dev-dmaap-dmaap-dr-prov-69bd7c6665-n68h6             0/1   CrashLoopBackOff   166   13h
      dev-dmaap-message-router-647bbfc54d-7xj8f            1/1   Running            0     13h
      dev-dmaap-message-router-kafka-64465d9ff4-8kvv8      1/1   Running            0     13h
      dev-dmaap-message-router-zookeeper-59577dc877-2l5wc  1/1   Running            0     13h
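
For triage, the failing pods can be picked out of a listing like the one above with a small script. This is only a sketch: `PodStatus`, `parse_pods` and `unhealthy` are hypothetical helper names, not part of ONAP or kubectl, and the parser assumes the five whitespace-separated columns shown here (name, ready count, status, restarts, age).

```python
from typing import List, NamedTuple


class PodStatus(NamedTuple):
    name: str
    ready: int      # containers ready
    total: int      # containers expected
    status: str
    restarts: int


def parse_pods(listing: str) -> List[PodStatus]:
    """Parse `kubectl get pod` text output into PodStatus records."""
    pods = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (its READY column has no "/").
        if len(fields) < 5 or "/" not in fields[1]:
            continue
        ready, total = fields[1].split("/")
        pods.append(
            PodStatus(fields[0], int(ready), int(total), fields[2], int(fields[3]))
        )
    return pods


def unhealthy(pods: List[PodStatus]) -> List[PodStatus]:
    """Return pods that are not fully ready or not in the Running state."""
    return [p for p in pods if p.ready < p.total or p.status != "Running"]


# Four rows copied from the listing in this issue.
SAMPLE = """\
dev-dmaap-dmaap-dr-db-56b956df8d-4dbgp 1/1 Running 1 13h
dev-dmaap-dmaap-dr-node-64864d5cc-fbp7g 0/1 Init:0/1 82 13h
dev-dmaap-dmaap-dr-prov-69bd7c6665-n68h6 0/1 CrashLoopBackOff 166 13h
dev-dmaap-message-router-647bbfc54d-7xj8f 1/1 Running 0 13h
"""

if __name__ == "__main__":
    for pod in unhealthy(parse_pods(SAMPLE)):
        print(f"{pod.name}: {pod.status}, {pod.restarts} restarts")
```

On the sample rows this flags the dr-node and dr-prov pods. Pods of finished jobs (status Completed) would also be flagged, so a real check would probably need to exclude them.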

       

      Robot healthcheck output: 

      ------------------------------------------------------------------------------
      Basic DMAAP Data Router Health Check [ WARN ] Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fbe3169c490>: Failed to establish a new connection: [Errno 111] Connection refused',)': /internal/fetchProv
      [ WARN ] Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fbe3169cb10>: Failed to establish a new connection: [Errno 111] Connection refused',)': /internal/fetchProv
      [ WARN ] Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fbe3164d950>: Failed to establish a new connection: [Errno 111] Connection refused',)': /internal/fetchProv
      | FAIL |
      ConnectionError: HTTPConnectionPool(host='dmaap-dr-node.onap', port=8080): Max retries exceeded with url: /internal/fetchProv (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fbe316b3b50>: Failed to establish a new connection: [Errno 111] Connection refused',))
      ------------------------------------------------------------------------------
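
The connection-refused failure above can be reproduced outside Robot with a short probe, which helps separate "nothing is listening on the port" from "the service answered with an error". This is a sketch under assumptions: `is_reachable` is a hypothetical helper, the retry count and delay are arbitrary, and the host, port and path are taken from the log, so the URL only resolves from inside the cluster. It is not the logic the Robot healthcheck itself uses.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def is_reachable(
    url: str,
    retries: int = 3,
    delay: float = 1.0,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """Return True if the server behind `url` sends any HTTP response.

    Connection-level failures (refused, timeout, DNS) are retried up to
    `retries` times; `opener` is injectable so the function can be tested
    without network access.
    """
    for attempt in range(retries):
        try:
            with opener(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except OSError:
            # URLError, ConnectionRefusedError and socket timeouts land here.
            if attempt < retries - 1:
                time.sleep(delay)
    return False


if __name__ == "__main__":
    # Endpoint from the healthcheck log in this issue.
    url = "http://dmaap-dr-node.onap:8080/internal/fetchProv"
    print("reachable" if is_reachable(url) else "connection failed")
```

While the dr-node pod is still in its init phase, this probe would be expected to report a connection failure, matching the Robot output.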
      

       

      The issue is intermittent: it does not occur on every deployment, and redeploying DMaaP normally clears it.

       

       

            oneito
            xuyang11