Configuration Persistence Service / CPS-1716

NCMP: Java Heap OutOfMemory errors and slow registration in case of 20k cmhandles

    • Type: Bug
    • Resolution: Done
    • Priority: High
    • Montreal Release
    • London Release

      Current behavior

      During testing with a large number of cmhandles, registering them in NCMP slowed down drastically and ran into memory usage problems.
      The first batches take around 1-2 seconds each to send, while the last batches take more than 2 minutes each.
      20000 cmhandles were sent in batches of 100 (200 x 100).
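
      The slowdown per batch can be made measurable with a small harness. The following is a minimal sketch only, not the tooling used for this report: the class name, the id format and the empty sender are hypothetical placeholders, and the real registration call has to be plugged in by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchTiming {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /** Passes each batch to the caller-supplied sender and records the elapsed milliseconds. */
    static <T> List<Long> timeBatches(List<List<T>> batches, Consumer<List<T>> sender) {
        List<Long> elapsedMillis = new ArrayList<>();
        for (List<T> batch : batches) {
            long start = System.nanoTime();
            sender.accept(batch);
            elapsedMillis.add((System.nanoTime() - start) / 1_000_000);
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 20_000; i++) {
            ids.add("cmhandle-" + i); // hypothetical id format, for illustration only
        }
        List<List<String>> batches = partition(ids, 100);
        // Placeholder sender that does nothing; replace it with the real registration request.
        List<Long> timings = timeBatches(batches, batch -> { });
        System.out.println(batches.size() + " batches, first took " + timings.get(0)
                + " ms, last took " + timings.get(timings.size() - 1) + " ms");
    }
}
```

      Comparing the first and last entries of the returned timings shows whether the per-batch time grows as more cmhandles are registered.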

      With our default Java heap settings, the NCMP logs are full of "OutOfMemoryError: Java heap space" errors; almost all of the background processes report it.
      Sending in the 20k cmhandles took more than 3.5 hours, compared with around 20 minutes previously.
      The default memory settings:
       - requested memory 2Gi
       - memory limit 3Gi
       - of which the Java heap is around 750 MB

      If we increase the Java heap to 1500 MB, there are significantly fewer out-of-memory errors, but NCMP still runs at 100% memory usage.
      Sending in the 20k cmhandles still took more than 3.5 hours, compared with around 20 minutes previously.
      Increased Java heap settings:
       - requested memory 2Gi
       - memory limit 3Gi
       - of which the Java heap is around 1500 MB
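
      To confirm which heap size the JVM actually received inside the container, a check along these lines could be run with the same JVM options as NCMP. It is a sketch under assumptions: the class and method names are made up for illustration, the report does not say how the heap size is set (for example via -Xmx or -XX:MaxRAMPercentage), and the MB figures from the report are treated as MiB for the percentage calculation.

```java
final class HeapSettingsCheck {

    static final long MIB = 1024L * 1024L;

    /** Converts a byte count to whole mebibytes, rounding down. */
    static long toMiB(long bytes) {
        return bytes / MIB;
    }

    /** Returns the heap size as a percentage of the container memory limit. */
    static double heapShareOfLimit(long heapMiB, long limitMiB) {
        return 100.0 * heapMiB / limitMiB;
    }

    public static void main(String[] args) {
        // Maximum heap the running JVM will attempt to use.
        long maxHeapMiB = toMiB(Runtime.getRuntime().maxMemory());
        System.out.println("Effective max heap: " + maxHeapMiB + " MiB");

        long limitMiB = 3 * 1024; // 3Gi memory limit taken from this report
        System.out.printf("750 MiB heap is %.1f%% of the limit%n", heapShareOfLimit(750, limitMiB));
        System.out.printf("1500 MiB heap is %.1f%% of the limit%n", heapShareOfLimit(1500, limitMiB));
    }
}
```

      If the reported value differs from the intended 750 MB or 1500 MB, the heap configuration itself should be checked before comparing versions.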
       
      Tried with and without the Hazelcast configuration; the result appears to be the same.

      This looks like a memory leak, or at least much higher memory usage than in the previous version we tested, where the default memory settings mentioned above were sufficient.

      Previously used version
      3.2.5
      https://gerrit.onap.org/r/gitweb?p=cps.git;a=commit;h=3bc22ed0ea833bdb649f393ec20c08dbb1bb7610

       

      Version where we found the fault
      3.3.1
      https://gerrit.onap.org/r/gitweb?p=cps.git;a=commit;h=8879947bcad66545beba83614d8a3a7327e1889f

       

      Expected behavior:
      No OutOfMemoryError when sending in a large number of cmhandles (20k)
      No memory leak, and heap usage no higher than before
      cmHandle registration performance similar to before

       

      Reproduction
      Configure the Java heap size to our settings (see above)
      Register 20000 cmhandles in NCMP in batches of 100
      Registration takes several hours to complete.
      Sending in the batches takes progressively longer.
      The NCMP logs fill up with OutOfMemoryError entries
      Profiling may be needed to see what is using so much memory
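
      As a first step before full profiling, heap usage can be sampled from inside the JVM with the standard java.lang.management API. This is only a sketch: the class name and the idea of sampling around a workload are assumptions, and it shows how much heap is in use, not which objects occupy it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

final class HeapSampler {

    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    /** Returns the number of heap bytes currently in use. */
    static long usedHeapBytes() {
        return MEMORY.getHeapMemoryUsage().getUsed();
    }

    /** Returns used heap as a percentage of the maximum, or -1 if the maximum is undefined. */
    static double usedHeapPercent() {
        MemoryUsage usage = MEMORY.getHeapMemoryUsage();
        long max = usage.getMax();
        return max <= 0 ? -1.0 : 100.0 * usage.getUsed() / max;
    }

    public static void main(String[] args) {
        System.out.printf("Before: %d bytes used (%.1f%% of max)%n", usedHeapBytes(), usedHeapPercent());
        byte[] payload = new byte[16 * 1024 * 1024]; // stand-in for real work, e.g. one registration batch
        System.out.printf("After:  %d bytes used (%.1f%% of max)%n", usedHeapBytes(), usedHeapPercent());
        System.out.println("Payload length: " + payload.length); // keeps the array reachable until here
    }
}
```

      Logging these values after each batch would show whether used heap keeps climbing. To see what is actually holding the memory, a heap dump is needed; on HotSpot-based JVMs the -XX:+HeapDumpOnOutOfMemoryError option writes one when the error occurs.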

       

      Test environment:
        The test was performed on a Kubernetes cluster without wiremocked elements.
        However, the problem seems to be independent of the DMI plugins, so it may be reproducible with wiremocked elements too.
       

      Collected logs:

      discovery.log
      The NCMP logs were already sent previously; they cannot be attached here because of their large size and sensitive content.

        1. discovery.log (107 kB)
        2. dump.png (64 kB)

            danielhanrahan Daniel Hanrahan
            csaba.eder csaba Eder