  1. Policy Framework
  2. POLICY-878

pdp-d: feature-pooling disables policy-controllers preventing processing of onset events


    • Bug
    • Resolution: Done
    • Low
    • Casablanca Release
    • Beijing Release, Casablanca Release
    • None
    • SB02

      platania reported a situation in SB-02 (OOM deployments) that causes policy controllers to stop processing events: ONSET events were received, but none were processed by the PDP-D. After going through the logs and inspecting the source code, the culprit appears to be feature-pooling, which is new in Beijing. It seems that feature-pooling calls controller.stop() when there is an issue with the channel. This leaves the amsterdam controller stopped, and it never comes back to life, which explains what Marco was seeing. ONSETs were going to drools-pdp-0 and were discarded, and the other instance, drools-pdp-1, did not take over. His lab was stuck for many hours until policy was restarted. The implementation has to be reviewed to avoid these drastic side effects, because this does not seem to be an uncommon situation in a k8s environment.

      Code that stops the controller:

       

          @Override
          public CountDownLatch internalTopicFailed() {
              logger.error("communication failed for topic {}", topic);

              ..
              new Thread(() -> {
                  controller.stop();
                  latch.countDown();
              }).start();
              ..
          }
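
      One possible direction for the review requested above is to pair the stop with a scheduled restart attempt, so that a channel failure does not leave the controller down indefinitely. The following is only a minimal sketch under stated assumptions, not the fix adopted for this issue: RestartingFailureHandler, its nested Controller interface, the channelHealthy probe, and the retry interval are all hypothetical names and are not part of feature-pooling or the drools-pdp API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical handler: stops a controller on topic failure, then retries a restart. */
final class RestartingFailureHandler {

    /** Hypothetical stand-in for a policy controller; not the real drools-pdp interface. */
    interface Controller {
        void stop();
        void start();
        boolean isAlive();
    }

    private final Controller controller;
    private final BooleanSupplier channelHealthy; // assumed probe for the pooling channel
    private final ScheduledExecutorService scheduler;
    private final long retryMillis;

    RestartingFailureHandler(Controller controller, BooleanSupplier channelHealthy,
                             ScheduledExecutorService scheduler, long retryMillis) {
        this.controller = controller;
        this.channelHealthy = channelHealthy;
        this.scheduler = scheduler;
        this.retryMillis = retryMillis;
    }

    /** Stops the controller now; the returned latch opens once it has been restarted. */
    CountDownLatch onTopicFailed() {
        CountDownLatch restarted = new CountDownLatch(1);
        controller.stop();
        scheduler.schedule(() -> tryRestart(restarted), retryMillis, TimeUnit.MILLISECONDS);
        return restarted;
    }

    /** Restarts the controller if the channel looks healthy, otherwise schedules another probe. */
    private void tryRestart(CountDownLatch restarted) {
        if (channelHealthy.getAsBoolean()) {
            controller.start();
            restarted.countDown();
        } else {
            scheduler.schedule(() -> tryRestart(restarted), retryMillis, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean alive = new AtomicBoolean(true);
        AtomicInteger probes = new AtomicInteger();
        Controller controller = new Controller() {
            @Override public void stop() { alive.set(false); }
            @Override public void start() { alive.set(true); }
            @Override public boolean isAlive() { return alive.get(); }
        };
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // The simulated channel reports healthy from the third probe onward.
        RestartingFailureHandler handler = new RestartingFailureHandler(
                controller, () -> probes.incrementAndGet() >= 3, scheduler, 50L);

        CountDownLatch restarted = handler.onTopicFailed();
        System.out.println("alive right after failure: " + controller.isAlive());
        restarted.await(5, TimeUnit.SECONDS);
        System.out.println("alive after recovery: " + controller.isAlive());
        scheduler.shutdownNow();
    }
}
```

      Unlike the snippet above, where the latch is released as soon as the controller has stopped, the latch in this sketch is released only after a restart. Whether a restart is the right reaction to a pooling channel failure is a design decision for the review; the sketch only shows that a stop does not have to be permanent.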

       

      State of the amsterdam controller in drools-pdp-0 (note that "alive" is false):
      HTTP/1.1 200 OK
      Content-Length: 2247
      Content-Type: application/json
      Date: Thu, 31 May 2018 20:22:51 GMT
      Server: Jetty(9.3.20.v20170531)

      {
          "alive": false,
          "drools": {
              "alive": false,
              "artifactId": "policy-amsterdam-rules",
              "brained": true,
              "groupId": "org.onap.policy-engine.drools.amsterdam",
              "locked": false,
              "modelClassLoaderHash": 651347276,
              "recentSinkEvents": [],
              "recentSourceEvents": [],
              "sessionCoordinates": [
                  "org.onap.policy-engine.drools.amsterdam:policy-amsterdam-rules:0.4.0:closedloop-amsterdam"
              ],
              "sessions": [
                  "closedloop-amsterdam"
              ],
              "version": "0.4.0"
          },
          "locked": false,
          "name": "amsterdam",
          "topicSinks": [
              {
                  "alive": false,
                  "allowSelfSignedCerts": false,
                  "apiKey": "",
                  "apiSecret": "",
                  "locked": false,
                  "partitionKey": "ad654ed3-78f1-4fc1-bb41-5a554cbda1a8",
                  "recentEvents": [],
                  "servers": [
                      "message-router"
                  ],
                  "topic": "APPC-CL",
                  "topicCommInfrastructure": "UEB",
                  "useHttps": false
              },
              {
                  "alive": false,
                  "allowSelfSignedCerts": false,
                  "apiKey": "",
                  "apiSecret": "",
                  "locked": false,
                  "partitionKey": "375fe28b-d575-40ba-9812-e1e8591a6079",
                  "recentEvents": [],
                  "servers": [
                      "message-router"
                  ],
                  "topic": "APPC-LCM-READ",
                  "topicCommInfrastructure": "UEB",
                  "useHttps": false
              },
              {
                  "alive": false,
                  "allowSelfSignedCerts": false,
                  "apiKey": "",
                  "apiSecret": "",
                  "locked": false,
                  "partitionKey": "be167d60-09e1-40d3-a8e1-894955e293fa",
                  "recentEvents": [],
                  "servers": [
                      "message-router"
                  ],
                  "topic": "POLICY-CL-MGT",
                  "topicCommInfrastructure": "UEB",
                  "useHttps": false
              }
          ],
          "topicSources": [
              {
                  "alive": false,
                  "allowSelfSignedCerts": false,
                  "apiKey": "",
                  "apiSecret": "",
                  "consumerGroup": "dcae.policy.shared",
                  "consumerInstance": "dev-drools-1",
                  "fetchLimit": 100,
                  "fetchTimeout": 15000,
                  "locked": false,
                  "recentEvents": [],
                  "servers": [
                      "message-router"
                  ],
                  "topic": "unauthenticated.DCAE_CL_OUTPUT",
                  "topicCommInfrastructure": "UEB",
                  "useHttps": false
              },
              {
                  "alive": false,
                  "allowSelfSignedCerts": false,
                  "apiKey": "",
                  "apiSecret": "",
                  "consumerGroup": "c602f0c9-f6dd-48e0-8524-2d560dae8998",
                  "consumerInstance": "dev-drools-1",
                  "fetchLimit": 100,
                  "fetchTimeout": 15000,
                  "locked": false,
                  "recentEvents": [],
                  "servers": [
                      "message-router"
                  ],
                  "topic": "APPC-CL",
                  "topicCommInfrastructure": "UEB",
                  "useHttps": false
              },
              {
                  "alive": false,
                  "allowSelfSignedCerts": false,
                  "apiKey": "",
                  "apiSecret": "",
                  "consumerGroup": "acadf51f-1519-49cc-b923-551548bc8619",
                  "consumerInstance": "dev-drools-1",
                  "fetchLimit": 100,
                  "fetchTimeout": 15000,
                  "locked": false,
                  "recentEvents": [],
                  "servers": [
                      "message-router"
                  ],
                  "topic": "APPC-LCM-WRITE",
                  "topicCommInfrastructure": "UEB",
                  "useHttps": false
              }
          ]
      }
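
      A stopped controller like this one could be detected automatically by checking the "alive" flag in the status response. The sketch below is only an illustration: ControllerStatusCheck and firstAliveFlag are hypothetical names, and the regular expression is a shortcut that relies on the controller-level "alive" field appearing first in the body, as it does in the response above. A real check would more likely use a proper JSON parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that reads the first "alive" flag from a controller status body. */
final class ControllerStatusCheck {

    // Matches the first "alive": true|false pair. Assumption: the controller-level
    // flag precedes the nested drools/topic flags, as in the response shown above.
    private static final Pattern ALIVE = Pattern.compile("\"alive\"\\s*:\\s*(true|false)");

    /** Returns the first "alive" value found, or empty if the body has no such field. */
    static Optional<Boolean> firstAliveFlag(String json) {
        Matcher m = ALIVE.matcher(json);
        return m.find() ? Optional.of(Boolean.parseBoolean(m.group(1))) : Optional.empty();
    }

    public static void main(String[] args) {
        // Abbreviated body in the same shape as the status response above.
        String body = "{ \"alive\": false, \"drools\": { \"alive\": false }, \"name\": \"amsterdam\" }";
        boolean alive = firstAliveFlag(body).orElse(false);
        System.out.println(alive ? "controller is alive" : "controller is NOT alive");
    }
}
```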

       

      Logs (see debug.2018-05-30.0.log at 10.12.5.123):

       

      9991 [2018-05-30T16:17:18.523+00:00|INFO|CambriaConsumerImpl|UEB-source-POOLING] UEB GET /events/POOLING/a94d13ce-8e1f-4056-a7d2-b0c49195abfd/dev-drools-0?timeout=15000&limit=100&filter=%7B%22class%22%3A%22Or%22%2C%22filters%22%3A%5B%7B%22class%22%3A%22Equals%22%2C%22field%22%3A%22channel%22%2C%22value%22%3A%22_admin%22%7D%2C%7B%22class%22%3A%22And%22%2C%22filters%22%3A%5B%7B%22class%22%3A%22Equals%22%2C%22field%22%3A%22channel%22%2C%22value%22%3A%220d67cebb-0b80-41bf-b378-b7f4466c997e%22%7D%2C%7B%22class%22%3A%22Equals%22%2C%22field%22%3A%22timestampMs%22%2C%22value%22%3A%221527696929094%22%7D%5D%7D%5D%7D

        

         9992 [2018-05-30T16:17:18.523+00:00|INFO|HttpClient|UEB-source-POOLING] GET http://message-router:3904/events/POOLING/a94d13ce-8e1f-4056-a7d2-b0c49195abfd/dev-drools-0?timeout=15000&limit=100&filter=%7B%22class%22%3A%22Or%22%2C%22filters%22%3A%5B%7B%22class%22%3A%22Equals%22%2C%22field%22%3A%22channel%22%2C%22value%22%3A%22_admin%22%7D%2C%7B%22class%22%3A%22And%22%2C%22filters%22%3A%5B%7B%22class%22%3A%22Equals%22%2C%22field%22%3A%22channel%22%2C%22value%22%3A%220d67cebb-0b80-41bf-b378-b7f4466c997e%22%7D%2C%7B%22class%22%3A%22Equals%22%2C%22field%22%3A%22timestampMs%22%2C%22value%22%3A%221527696929094%22%7D%5D%7D%5D%7D (anonymous) ...

        

         10216 [2018-05-30T16:17:33.290+00:00|ERROR|PoolingManagerImpl|pool-9-thread-1] communication failed for topic POOLING

        

        10217 [2018-05-30T16:17:33.290+00:00|INFO|AggregatedPolicyController|Thread-51] AggregatedPolicyController [name=amsterdam, alive=true, locked=false, droolsController=NullDroolsController []]: stop

       

        10218 [2018-05-30T16:17:33.291+00:00|INFO|BusConsumer$CambriaConsumerWrapper|pool-9-thread-1] CambriaConsumerWrapper [fetchTimeout=15000]: setting DMAAP server-side filter: {"class":"Or","filters":[{"class":"Equals","field":"channel","value":"_admin"},{"class":"Equals","field":"channel","value":"0d67cebb-0b80-41bf-b378-b7f4466c997e"}]}

        10219 [2018-05-30T16:17:33.293+00:00|INFO|State|pool-9-thread-1] entered InactiveState for topic POOLING

       

        10220 [2018-05-30T16:17:33.293+00:00|INFO|BusConsumer$CambriaConsumerWrapper|Thread-51] CambriaConsumerWrapper [fetchTimeout=15000]: setting DMAAP server-side filter: {"class":"Or","filters":[{"class":"Equals","field":"channel","value":"_admin"},{"class":"Equals","field":"channel","value":"0d67cebb-0b80-41bf-b378-b7f4466c997e"}]}

       

        10221 [2018-05-30T16:17:33.296+00:00|INFO|State|Thread-51] entered IdleState for topic POOLING

       

        10222 [2018-05-30T16:17:33.296+00:00|INFO|DmaapManager|Thread-51] stop consuming from topic POOLING

       

        10223 [2018-05-30T16:17:33.296+00:00|INFO|TopicBase|Thread-51] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=a94d13ce-8e1f-4056-a7d2-b0c49195abfd, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=CambriaConsumerWrapper [fetchTimeout=15000], alive=true, locked=false, uebThread=Thread[UEB-source-POOLING,5,main], topicListeners=1, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=POOLING, #recentEvents=10, locked=false, #topicListeners=1]]]: unregistering org.onap.policy.drools.pooling.PoolingManagerImpl@43599640

       

        10224 [2018-05-30T16:17:33.296+00:00|INFO|InlineBusTopicSink|Thread-51] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=a94d13ce-8e1f-4056-a7d2-b0c49195abfd, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=CambriaConsumerWrapper [fetchTimeout=15000], alive=true, locked=false, uebThread=Thread[UEB-source-POOLING,5,main], topicListeners=0, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=POOLING, #recentEvents=10, locked=false, #topicListeners=0]]]: stopping

       

        10225 [2018-05-30T16:17:33.296+00:00|INFO|PoolingManagerImpl|Thread-51] publish Offline to _admin on topic POOLING

        10226 [2018-05-30T16:17:33.297+00:00|INFO|TopicBase|Thread-51] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=dcae.policy.shared, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=CambriaConsumerWrapper [fetchTimeout=15000], alive=true, locked=false, uebThread=Thread[UEB-source-unauthenticated.DCAE_CL_OUTPUT,5,main], topicListeners=1, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=unauthenticated.DCAE_CL_OUTPUT, #recentEvents=0, locked=false, #topicListeners=1]]]: unregistering AggregatedPolicyController [name=amsterdam, alive=false, locked=false, droolsController=NullDroolsController []]

        10227 [2018-05-30T16:17:33.297+00:00|INFO|InlineBusTopicSink|Thread-51] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=dcae.policy.shared, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=CambriaConsumerWrapper [fetchTimeout=15000], alive=true, locked=false, uebThread=Thread[UEB-source-unauthenticated.DCAE_CL_OUTPUT,5,main], topicListeners=0, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=unauthenticated.DCAE_CL_OUTPUT, #recentEvents=0, locked=false, #topicListeners=0]]]: stopping

        10228 [2018-05-30T16:17:33.297+00:00|INFO|TopicBase|Thread-51] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=ce3c4088-f788-45e3-80d9-fa251f064341, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=CambriaConsumerWrapper [fetchTimeout=15000], alive=true, locked=false, uebThread=Thread[UEB-source-APPC-CL,5,main], topicListeners=1, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=APPC-CL, #recentEvents=0, locked=false, #topicListeners=1]]]: unregistering AggregatedPolicyController [name=amsterdam, alive=false, locked=false, droolsController=NullDroolsController []]

        10229 [2018-05-30T16:17:33.297+00:00|INFO|InlineBusTopicSink|Thread-51] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=ce3c4088-f788-45e3-80d9-fa251f064341, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=CambriaConsumerWrapper [fetchTimeout=15000], alive=true, locked=false, uebThread=Thread[UEB-source-APPC-CL,5,main], topicListeners=0, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=APPC-CL, #recentEvents=0, locked=false, #topicListeners=0]]]: stopping

        10230 [2018-05-30T16:17:33.297+00:00|INFO|TopicBase|Thread-51] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=94d49b6c-6461-49e0-b479-4dd0483b124e, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=CambriaConsumerWrapper [fetchTimeout=15000], alive=true, locked=false, uebThread=Thread[UEB-source-APPC-LCM-WRITE,5,main], topicListeners=1, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=APPC-LCM-WRITE, #recentEvents=0, locked=false, #topicListeners=1]]]: unregistering AggregatedPolicyController [name=amsterdam, alive=false, locked=false, droolsController=NullDroolsController []]

        10231 [2018-05-30T16:17:33.297+00:00|INFO|InlineBusTopicSink|Thread-51] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=94d49b6c-6461-49e0-b479-4dd0483b124e, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=CambriaConsumerWrapper [fetchTimeout=15000], alive=true, locked=false, uebThread=Thread[UEB-source-APPC-LCM-WRITE,5,main], topicListeners=0, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=APPC-LCM-WRITE, #recentEvents=0, locked=false, #topicListeners=0]]]: stopping

       

          10232 [2018-05-30T16:17:33.672+00:00|INFO|HttpClient|UEB-source-POOLING]      --> HTTP/1.1 200 OK

        10233 [2018-05-30T16:17:33.672+00:00|INFO|InlineBusTopicSink|UEB-source-POOLING] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=a94d13ce-8e1f-4056-a7d2-b0c49195abfd, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=null, alive=false, locked=false, uebThread=Thread[UEB-source-POOLING,5,main], topicListeners=0, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=POOLING, #recentEvents=10, locked=false, #topicListeners=0]]]: exiting thread

       

          10235 [2018-05-30T16:17:34.309+00:00|WARN|HostSelector|pool-4-thread-1] All hosts were blacklisted; reverting to full set of hosts.

        10236 [2018-05-30T16:17:34.309+00:00|INFO|HttpClient|pool-4-thread-1] POST http://message-router:3904/events/POOLING (anonymous) ...

        10237 [2018-05-30T16:17:36.298+00:00|INFO|DmaapManager|Thread-51] stop publishing to topic POOLING

        10238 [2018-05-30T16:17:37.960+00:00|INFO|InlineBusTopicSink|UEB-source-unauthenticated.DCAE_CL_OUTPUT] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=dcae.policy.shared, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=null, alive=false, locked=false, uebThread=Thread[UEB-source-unauthenticated.DCAE_CL_OUTPUT,5,main], topicListeners=0, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=unauthenticated.DCAE_CL_OUTPUT, #recentEvents=0, locked=false, #topicListeners=0]]]: exiting thread

        10239 [2018-05-30T16:17:38.020+00:00|INFO|InlineBusTopicSink|UEB-source-APPC-CL] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=ce3c4088-f788-45e3-80d9-fa251f064341, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=null, alive=false, locked=false, uebThread=Thread[UEB-source-APPC-CL,5,main], topicListeners=0, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=APPC-CL, #recentEvents=0, locked=false, #topicListeners=0]]]: exiting thread

        10240 [2018-05-30T16:17:38.028+00:00|INFO|InlineBusTopicSink|UEB-source-APPC-LCM-WRITE] SingleThreadedUebTopicSource [getTopicCommInfrastructure()=UEB, toString()=SingleThreadedBusTopicSource [consumerGroup=94d49b6c-6461-49e0-b479-4dd0483b124e, consumerInstance=dev-drools-0, fetchTimeout=15000, fetchLimit=100, consumer=null, alive=false, locked=false, uebThread=Thread[UEB-source-APPC-LCM-WRITE,5,main], topicListeners=0, toString()=BusTopicBase [apiKey=, apiSecret=, useHttps=false, allowSelfSignedCerts=false, toString()=TopicBase servers=[message-router], topic=APPC-LCM-WRITE, #recentEvents=0, locked=false, #topicListeners=0]]]: exiting thread

        10241 [2018-05-30T16:17:43.820+00:00|INFO|HttpClient|UEB-source-PDPD-CONFIGURATION]   --> HTTP/1.1 200 OK

        10242 [2018-05-30T16:17:43.820+00:00|INFO|CambriaConsumerImpl|UEB-source-PDPD-CONFIGURATION] UEB GET /events/PDPD-CONFIGURATION/52035527-511e-4385-9ba6-dacf63d6b335/dev-drools-0?timeout=15000&limit=100

        10243 [2018-05-30T16:17:43.820+00:00|INFO|HttpClient|UEB-source-PDPD-CONFIGURATION] GET http://message-router:3904/events/PDPD-CONFIGURATION/52035527-511e-4385-9ba6-dacf63d6b335/dev-drools-0?timeout=15000&limit=100 (anonymous) ...
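
      To follow the stop sequence in a log this verbose, it can help to split each line into its fields and filter on the thread (Thread-51 above) or on the level. The sketch below assumes the "[timestamp|LEVEL|logger|thread] message" layout seen in this excerpt, optionally preceded by a file line number; PdpLogLine is a hypothetical helper, not part of the PDP-D code.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for log lines shaped like "[timestamp|LEVEL|logger|thread] message". */
final class PdpLogLine {

    // Optional leading file line number, then four pipe-separated header fields in brackets.
    private static final Pattern FORMAT = Pattern.compile(
            "^\\s*(?:\\d+\\s+)?\\[([^|\\]]+)\\|([^|\\]]+)\\|([^|\\]]+)\\|([^|\\]]+)\\]\\s*(.*)$");

    final String timestamp;
    final String level;
    final String logger;
    final String thread;
    final String message;

    private PdpLogLine(String timestamp, String level, String logger, String thread, String message) {
        this.timestamp = timestamp;
        this.level = level;
        this.logger = logger;
        this.thread = thread;
        this.message = message;
    }

    /** Parses one line; returns empty when the line does not match the assumed layout. */
    static Optional<PdpLogLine> parse(String line) {
        Matcher m = FORMAT.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new PdpLogLine(m.group(1), m.group(2), m.group(3), m.group(4), m.group(5)));
    }

    public static void main(String[] args) {
        String[] lines = {
            "10216 [2018-05-30T16:17:33.290+00:00|ERROR|PoolingManagerImpl|pool-9-thread-1] communication failed for topic POOLING",
            "10222 [2018-05-30T16:17:33.296+00:00|INFO|DmaapManager|Thread-51] stop consuming from topic POOLING",
            "10237 [2018-05-30T16:17:36.298+00:00|INFO|DmaapManager|Thread-51] stop publishing to topic POOLING"
        };
        // Print only what the thread that ran controller.stop() logged.
        for (String line : lines) {
            parse(line)
                .filter(entry -> entry.thread.equals("Thread-51"))
                .ifPresent(entry -> System.out.println(entry.timestamp + " " + entry.message));
        }
    }
}
```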

       

       

       

            jhh jhh
            jhh jhh