PORTAL-476

portalDB: fatal error on Cassandra


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Medium
    • None
    • Beijing Release
    • Portal

      Orange ONAP Openlab users reported an issue on the portal: a 500 HTTP error was displayed.

      The cause of the error was easy to find:

      debian@control01-openlab:~$ kubectl get po -n onap |grep portal
      onap-portal-app-57c46f7d7f-rtnvt                 2/2       Running                0          11d
      onap-portal-cassandra-8669c8f94f-xgpx2           0/1       CreateContainerError   8          11d
      onap-portal-db-85f77bcff5-ggmqw                  1/1       Running                0          11d
      onap-portal-sdk-75946fd885-kkdsn                 2/2       Running                0          23h
      onap-portal-widget-6d94d9474f-lszck              1/1       Running                0          11d
      onap-portal-zookeeper-cd86c64d-6gjlr             1/1       Running                0          11d
      onap-sdnc-portal-8455b7b4cc-hlgfg                1/1       Running                0          11d
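
The manual check above could be scripted so that a pod in a state such as CreateContainerError is flagged automatically. The following is a minimal sketch, assuming the `kubectl get po` column order seen in the listing (name, ready count, status, restarts, age). The function name `unhealthy_pods` is hypothetical and is not part of any ONAP tooling.

```python
from typing import List, Tuple


def unhealthy_pods(listing: str) -> List[Tuple[str, str]]:
    """Return (name, status) for pods that are not Running with all containers ready."""
    bad = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "NAME":
            continue  # skip blank lines and the optional header row
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, _, total = ready.partition("/")
        if status != "Running" or ready_count != total:
            bad.append((name, status))
    return bad


# Sample input copied from the listing in this report (assumed column layout).
sample = """
onap-portal-app-57c46f7d7f-rtnvt                 2/2       Running                0          11d
onap-portal-cassandra-8669c8f94f-xgpx2           0/1       CreateContainerError   8          11d
onap-portal-db-85f77bcff5-ggmqw                  1/1       Running                0          11d
"""
print(unhealthy_pods(sample))
```

Note that this simple rule would also flag pods that finished normally (status Completed), so a real check would probably need an allow-list for such cases.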

       

      The logs of this container show the following:

      Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})

      Cassandra is unavailable - sleeping

      .....

      #
      # A fatal error has been detected by the Java Runtime Environment:
      #
      #  SIGBUS (0x7) at pc=0x00007f0d6f8fee7a, pid=1, tid=0x00007f0d7080d700
      #
      # JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-8u151-b12-1~deb9u1-b12)
      # Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
      # Problematic frame:
      # C  [libc.so.6+0x128e7a]
      #
      # Core dump written. Default location: //core or core.1
      #
      # An error report file with more information is saved as:
      # /tmp/hs_err_pid1.log
      Compiled method (nm)    7057 1173     n 0       sun.misc.Unsafe::copyMemory (native)
       total in heap  [0x00007f0d5db7a9d0,0x00007f0d5db7ad40] = 880
       relocation     [0x00007f0d5db7aaf8,0x00007f0d5db7ab40] = 72
       main code      [0x00007f0d5db7ab40,0x00007f0d5db7ad40] = 512
      #
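
The "Connection refused" message on port 9042 in the log above suggests that a plain TCP probe would be enough to notice when Cassandra is not listening. The following is a minimal sketch of such a probe with a bounded number of retries. The function names `port_open` and `wait_for_port` and the default retry values are assumptions made for illustration, and this is not the check the container itself runs. It only tests TCP reachability, not whether Cassandra can actually serve queries.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, attempts: int = 5, delay: float = 1.0) -> bool:
    """Probe host:port up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if port_open(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For the address in the log, the call would be `wait_for_port("127.0.0.1", 9042)`. A result of `False` after all attempts would correspond to the "Cassandra is unavailable" state reported here.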

      The error occurred 11 days after the installation.

      It had never been observed before and may be impossible to reproduce as such. However, since the error messages were explicit, it made sense to report it.

      A simple restart of the Docker container restored the system.

      However, the failure was detected by a human tester and not by the daily health checks.

       

       

       

       

            Assignee: Unassigned
            Reporter: mrichomme