Logging analytics / LOG-376

Logstash fully saturates 8 cores with AAI deployed on one of the four 8-vCore VMs at 30 logs/sec - raise the ReplicaSet from 1 to 3 or use a DaemonSet


      20190109: update from the AAI team
      https://wiki.onap.org/display/DW/2019-01-17+AAI+Developers+Meeting+Open+Agenda
      "hector has discovered that the stress test jar (liveness probe?) in aai-cassandra is hammering the cpu/ram/hd on the vm that aai is on - this breaks the etcd cluster (not the latency/network issues we suspected that may cause pod rescheduling) "

      20181017: update - reopen or re-raise an AAI/Logstash-specific JIRA for Dublin (in LOG-707), as this is more of an AAI-to-Logstash issue

      Need to find out which container is responsible: it is the Logstash one (mine)

      12588 ubuntu 20 0 6397972 699192 22012 S 578.1 1.1 567:55.27 /usr/bin/java -Xmx500m -Xss2048k -Djffi.boot.library.path=/usr/share/logstash/vendor/jruby/lib/jni -Xbo+
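To read the process line above more easily, here is a minimal Python sketch that splits it into named fields and converts the CPU percentage into busy cores. It assumes the line follows top's default column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND) and that RES is reported in KiB; the helper names are hypothetical and the command is left truncated exactly as captured.

```python
# Assumed column order (top's default layout); not stated in the ticket itself.
TOP_FIELDS = ["pid", "user", "pr", "ni", "virt", "res", "shr",
              "state", "cpu", "mem", "time", "command"]


def parse_top_line(line):
    """Split one top process line into a dict keyed by TOP_FIELDS."""
    # Limit the split so the command (which contains spaces) stays in one piece.
    parts = line.split(None, len(TOP_FIELDS) - 1)
    return dict(zip(TOP_FIELDS, parts))


def cores_busy(cpu_percent):
    """top reports 100% per fully used core, so divide by 100."""
    return cpu_percent / 100.0


# The line captured in this ticket (command truncated in the original).
line = ("12588 ubuntu 20 0 6397972 699192 22012 S 578.1 1.1 567:55.27 "
        "/usr/bin/java -Xmx500m -Xss2048k "
        "-Djffi.boot.library.path=/usr/share/logstash/vendor/jruby/lib/jni -Xbo+")

row = parse_top_line(line)
cores = cores_busy(float(row["cpu"]))
res_mib = int(row["res"]) / 1024   # assumes RES is in KiB
share_of_vm = cores / 8            # the VM has 8 vCores per the ticket title

print(f"pid {row['pid']}: {cores:.2f} cores busy "
      f"({share_of_vm:.0%} of 8 vCores), RES {res_mib:.0f} MiB")
```

Under these assumptions the single sample corresponds to roughly 5.8 of the 8 vCores for the Logstash JVM alone; it is one snapshot, not an average.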

      http://jenkins.onap.info/job/oom-cd-master/2897/console
      Find out the reason for the saturation: is it excessive logging, for example the cluster heartbeats from all the DB clusters,
      or a misconfiguration of the resources section?

      It looks like logs are still being processed up to 4 min after they arrive in Logstash; seeing an average of 200-400 logs per 30 sec on

      http://18.188.238.244:30253/app/kibana#/discover?_g=(refreshInterval:('$$hashKey':'object:1198',display:'5%20seconds',pause:!f,section:1,value:5000),time:(from:now-15m,mode:quick,to:now))&_a=(columns:!(_source),index:'logstash-*',interval:auto,query:(query_string:(analyze_wildcard:!t,query:'*')),sort:!('@timestamp',desc))
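As a rough sanity check on the figures above, the following Python sketch converts "200-400 logs per 30 sec" into a per-second rate and estimates how many events could be in flight if each one took the full 4 minutes to appear. The backlog figure uses Little's law (items in system = arrival rate x time in system) and treats the 4 min worst case as if it were the typical delay, so it is an upper-bound assumption rather than a measurement; the function names are hypothetical.

```python
def per_second(count, window_sec):
    """Convert a count observed over a window into an events/sec rate."""
    return count / window_sec


def backlog_estimate(rate_per_sec, delay_sec):
    """Little's law: events in the pipeline = arrival rate * time in pipeline."""
    return rate_per_sec * delay_sec


WINDOW_SEC = 30          # Kibana observation window from the ticket
MAX_DELAY_SEC = 4 * 60   # "up to 4 min" processing delay from the ticket

low_rate = per_second(200, WINDOW_SEC)
high_rate = per_second(400, WINDOW_SEC)

low_backlog = backlog_estimate(low_rate, MAX_DELAY_SEC)
high_backlog = backlog_estimate(high_rate, MAX_DELAY_SEC)

print(f"rate: {low_rate:.1f}-{high_rate:.1f} logs/sec")
print(f"worst-case in-flight events: {low_backlog:.0f}-{high_backlog:.0f}")
# rate: 6.7-13.3 logs/sec
# worst-case in-flight events: 1600-3200
```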

        1. Screenshot 2018-05-08 01.34.16.png (400 kB)
        2. Screenshot 2018-05-08 19.04.25.png (1.20 MB)
        3. Screenshot 2018-05-15 15.27.26.png (1.30 MB)
        4. Screenshot 2018-05-16 18.08.35.png (141 kB)
        5. Screenshot 2018-05-16 18.15.09.png (936 kB)
        6. Screenshot 2018-05-16 18.33.08.png (499 kB)
        7. Screenshot 2018-05-16 18.33.20.png (139 kB)
        8. Screenshot 2018-05-16 18.34.03.png (140 kB)
        9. Screenshot 2018-05-16 18.35.24.png (138 kB)
        10. Screenshot 2018-05-18 08.07.18.png (431 kB)
        11. Screenshot 2018-05-28 01.26.27.png (120 kB)
        12. Screenshot 2018-05-29 07.39.02.png (643 kB)
        13. Screenshot 2018-05-29 07.39.13.png (758 kB)
        14. Screenshot 2018-05-30 13.26.34.png (1012 kB)
        15. Screenshot 2018-05-30 17.45.00.png (489 kB)
        16. Screenshot 2018-05-30 17.45.46.png (1.20 MB)
        17. Screenshot 2018-05-30 18.33.07.png (1.16 MB)
        18. Screenshot 2018-05-30 18.34.23.png (40 kB)
        19. Screenshot 2018-05-30 18.52.11.png (1.23 MB)
        20. Screenshot 2018-05-30 18.53.37.png (1.35 MB)
        21. values.yaml (3 kB)

            michaelobrien michaelobrien