Uploaded image for project: 'Portal'
  1. Portal
  2. PORTAL-299

Portal pods (DB) fail to come up intermittently on locked DB file

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Medium Medium
    • Casablanca Release
    • Casablanca Release
    • None
    • None
    • Portal Sprint 10

      Portal is coming up only every 4th deploy, sometimes for 4 builds and sometimes not for 4+ builds
      looks like a timing/resource availbility issue not reflected in the logs

      http://kibana.onap.info:5601/app/kibana#/dashboard/AWAtvpS63NTXK5mX2kuS?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))&_a=(description:'',filters:!(),options:(darkTheme:!f),panels:!((col:1,id:AWAts77k3NTXK5mX2kuM,panelIndex:1,row:1,size_x:8,size_y:3,type:visualization),(col:9,id:AWAtuTVI3NTXK5mX2kuP,panelIndex:2,row:1,size_x:4,size_y:3,type:visualization),(col:1,id:AWAtuBTY3NTXK5mX2kuO,panelIndex:3,row:7,size_x:6,size_y:3,type:visualization),(col:1,id:AWAttmqB3NTXK5mX2kuN,panelIndex:4,row:4,size_x:6,size_y:3,type:visualization),(col:7,id:AWAtvHtY3NTXK5mX2kuR,panelIndex:6,row:4,size_x:6,size_y:6,type:visualization)),query:(match_all:()),timeRestore:!f,title:'CD%20Health%20Check',uiState:(),viewMode:view)

      ubuntu@ip-172-31-0-133:~$ kubectl get pods --all-namespaces | grep portal
      onap          onap-portal-app-765f57f9b7-4xpq9                0/2       Init:0/1           0          9m
      onap          onap-portal-cassandra-fdd79fccc-cnn9l           1/1       Running            0          3h
      onap          onap-portal-db-7d9b8f4f4c-jvhf9                 0/1       CrashLoopBackOff   6          9m
      onap          onap-portal-sdk-6d4757d4fc-tjstd                0/2       Init:0/1           0          9m
      onap          onap-portal-widget-64c9d944f7-8q5lf             0/1       Init:0/1           0          9m
      onap          onap-portal-zookeeper-696c5f95cb-js5qp          1/1       Running            0          3h
      onap          onap-sdnc-portal-577694b6f6-p48c5               1/1       Running            0          3h
      
      
        Type     Reason                 Age                  From                                                  Message
        ----     ------                 ----                 ----                                                  -------
        Normal   Scheduled              37m                  default-scheduler                                     Successfully assigned onap-portal-db-7d9b8f4f4c-jvhf9 to ip-172-31-21-129.us-east-2.compute.internal
        Normal   SuccessfulMountVolume  37m                  kubelet, ip-172-31-21-129.us-east-2.compute.internal  MountVolume.SetUp succeeded for volume "onap-portal-db"
        Normal   SuccessfulMountVolume  37m                  kubelet, ip-172-31-21-129.us-east-2.compute.internal  MountVolume.SetUp succeeded for volume "localtime"
        Normal   SuccessfulMountVolume  37m                  kubelet, ip-172-31-21-129.us-east-2.compute.internal  MountVolume.SetUp succeeded for volume "default-token-gxfjv"
        Normal   Pulling                36m (x3 over 37m)    kubelet, ip-172-31-21-129.us-east-2.compute.internal  pulling image "nexus3.onap.org:10001/onap/portal-db:2.2.0"
        Normal   Pulled                 36m (x3 over 37m)    kubelet, ip-172-31-21-129.us-east-2.compute.internal  Successfully pulled image "nexus3.onap.org:10001/onap/portal-db:2.2.0"
        Normal   Created                36m (x3 over 37m)    kubelet, ip-172-31-21-129.us-east-2.compute.internal  Created container
        Normal   Started                36m (x3 over 37m)    kubelet, ip-172-31-21-129.us-east-2.compute.internal  Started container
        Warning  BackOff                12m (x113 over 36m)  kubelet, ip-172-31-21-129.us-east-2.compute.internal  Back-off restarting failed container
        Warning  FailedSync             2m (x157 over 36m)   kubelet, ip-172-31-21-129.us-east-2.compute.internal  Error syncing pod
      ubuntu@ip-172-31-0-133:~$ kubectl describe pod onap-portal-db-7d9b8f4f4c-jvhf9  -n onap
      
      
      
      ubuntu@ip-172-31-0-133:~$ kubectl logs -f onap-portal-db-7d9b8f4f4c-jvhf9   -n onap
      2018-06-08 17:37:29 140009038980992 [Note] mysqld (mysqld 10.2.15-MariaDB-10.2.15+maria~jessie) starting as process 1 ...
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Uses event mutexes
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Compressed tables use zlib 1.2.8
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Using Linux native AIO
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Number of pools: 1
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Using SSE2 crc32 instructions
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Initializing buffer pool, total size = 256M, instances = 1, chunk size = 128M
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Completed initialization of buffer pool
      2018-06-08 17:37:29 140008303728384 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Highest supported file format is Barracuda.
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Starting crash recovery from checkpoint LSN=3759052
      2018-06-08 17:37:29 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/fn_user.ibd' with space ID 137, since the redo log references ./ecomp_sdk/fn_user.ibd with space ID 133.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_folder.ibd' with space ID 134, since the redo log references ./ecomp_sdk/cr_folder.ibd with space ID 80.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/fn_user_pseudo_role.ibd' with space ID 136, since the redo log references ./ecomp_sdk/fn_user_pseudo_role.ibd with space ID 112.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/fn_user_pseudo_role.ibd' with space ID 136, since the redo log references ./ecomp_sdk/fn_user_pseudo_role.ibd with space ID 135.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/fn_user_pseudo_role.ibd' with space ID 136, since the redo log references ./ecomp_sdk/fn_user_pseudo_role.ibd with space ID 135.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/fn_user.ibd' with space ID 137, since the redo log references ./ecomp_sdk/fn_user.ibd with space ID 133.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_report_file_history.ibd' with space ID 139, since the redo log references ./ecomp_sdk/cr_report_file_history.ibd with space ID 90.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_report_file_history.ibd' with space ID 139, since the redo log references ./ecomp_sdk/cr_report_file_history.ibd with space ID 138.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_report_file_history.ibd' with space ID 139, since the redo log references ./ecomp_sdk/cr_report_file_history.ibd with space ID 138.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_hist_user_map.ibd' with space ID 141, since the redo log references ./ecomp_sdk/cr_hist_user_map.ibd with space ID 82.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_hist_user_map.ibd' with space ID 141, since the redo log references ./ecomp_sdk/cr_hist_user_map.ibd with space ID 140.
      2018-06-08 17:37:30 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_hist_user_map.ibd' with space ID 141, since the redo log references ./ecomp_sdk/cr_hist_user_map.ibd with space ID 140.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/fn_user.ibd' with space ID 137, since the redo log references ./ecomp_sdk/fn_user.ibd with space ID 133.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 134. Another data file called ./ecomp_sdk/cr_folder.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 134. Another data file called ./ecomp_sdk/cr_folder.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 134. Another data file called ./ecomp_sdk/cr_folder.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 134. Another data file called ./ecomp_sdk/cr_folder.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 134. Another data file called ./ecomp_sdk/cr_folder.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/fn_user_pseudo_role.ibd' with space ID 136, since the redo log references ./ecomp_sdk/fn_user_pseudo_role.ibd with space ID 135.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 136. Another data file called ./ecomp_sdk/fn_user_pseudo_role.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 136. Another data file called ./ecomp_sdk/fn_user_pseudo_role.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 136. Another data file called ./ecomp_sdk/fn_user_pseudo_role.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 136. Another data file called ./ecomp_sdk/fn_user_pseudo_role.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/fn_user_pseudo_role.ibd' with space ID 136, since the redo log references ./ecomp_sdk/fn_user_pseudo_role.ibd with space ID 135.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 136. Another data file called ./ecomp_sdk/fn_user_pseudo_role.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 137. Another data file called ./ecomp_sdk/fn_user.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 137. Another data file called ./ecomp_sdk/fn_user.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 137. Another data file called ./ecomp_sdk/fn_user.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 137. Another data file called ./ecomp_sdk/fn_user.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/fn_user.ibd' with space ID 137, since the redo log references ./ecomp_sdk/fn_user.ibd with space ID 133.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 137. Another data file called ./ecomp_sdk/fn_user.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_report_file_history.ibd' with space ID 139, since the redo log references ./ecomp_sdk/cr_report_file_history.ibd with space ID 138.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 139. Another data file called ./ecomp_sdk/cr_report_file_history.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 139. Another data file called ./ecomp_sdk/cr_report_file_history.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 139. Another data file called ./ecomp_sdk/cr_report_file_history.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 139. Another data file called ./ecomp_sdk/cr_report_file_history.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_report_file_history.ibd' with space ID 139, since the redo log references ./ecomp_sdk/cr_report_file_history.ibd with space ID 138.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 139. Another data file called ./ecomp_sdk/cr_report_file_history.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_hist_user_map.ibd' with space ID 141, since the redo log references ./ecomp_sdk/cr_hist_user_map.ibd with space ID 140.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 141. Another data file called ./ecomp_sdk/cr_hist_user_map.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 141. Another data file called ./ecomp_sdk/cr_hist_user_map.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 141. Another data file called ./ecomp_sdk/cr_hist_user_map.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 141. Another data file called ./ecomp_sdk/cr_hist_user_map.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/cr_hist_user_map.ibd' with space ID 141, since the redo log references ./ecomp_sdk/cr_hist_user_map.ibd with space ID 140.
      2018-06-08 17:37:32 140009038980992 [Note] InnoDB: Ignoring data file './ecomp_sdk/#sql-66_b.ibd' with space ID 141. Another data file called ./ecomp_sdk/cr_hist_user_map.ibd exists with the same space ID.
      2018-06-08 17:37:32 140009038980992 [ERROR] InnoDB: Unable to lock ./portal/fn_tab.ibd error: 122
      2018-06-08 17:37:32 0x7f566508a780  InnoDB: Assertion failure in file /home/buildbot/buildbot/build/mariadb-10.2.15/storage/innobase/fil/fil0fil.cc line 761
      InnoDB: Failing assertion: success
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: https://mariadb.com/kb/en/library/xtradbinnodb-recovery-modes/
      InnoDB: about forcing recovery.
      180608 17:37:32 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
      
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
      
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
      
      Server version: 10.2.15-MariaDB-10.2.15+maria~jessie
      key_buffer_size=134217728
      read_buffer_size=2097152
      max_used_connections=0
      max_threads=102
      thread_count=0
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 759911 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
      
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      mysqld(my_print_stacktrace+0x2e)[0x556364a2eb6e]
      mysqld(handle_fatal_signal+0x345)[0x5563644ad185]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f5664c6b890]
      /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f5662ff5067]
      /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f5662ff6448]
      mysqld(+0x42229c)[0x55636424529c]
      mysqld(+0xa2753e)[0x55636484a53e]
      mysqld(+0xa27a6f)[0x55636484aa6f]
      mysqld(+0xa30bed)[0x556364853bed]
      mysqld(+0x87d1e5)[0x5563646a01e5]
      mysqld(+0x87d2d2)[0x5563646a02d2]
      mysqld(+0x87dbc3)[0x5563646a0bc3]
      mysqld(+0x87e0f5)[0x5563646a10f5]
      mysqld(+0x87f77c)[0x5563646a277c]
      mysqld(+0x92d78a)[0x55636475078a]
      mysqld(+0x815d41)[0x556364638d41]
      mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x64)[0x5563644af4d4]
      mysqld(+0x4f2175)[0x556364315175]
      mysqld(_Z11plugin_initPiPPci+0x8aa)[0x55636431688a]
      mysqld(+0x44326e)[0x55636426626e]
      mysqld(_Z11mysqld_mainiPPc+0x512)[0x55636426b702]
      /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f5662fe1b45]
      mysqld(+0x43b36d)[0x55636425e36d]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      Fatal signal 11 while backtracing

            dr695h dr695h
            michaelobrien michaelobrien
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: