-
Bug
-
Resolution: Done
-
Medium
-
Jakarta Release, Kohn Release
The HB service starts the worker process (hb_worker_process), if the process "htbtworker" has finished.
Only, then the "hb_worker_process" is up, the HB pod supervision is showing a successful result and the deployment shows "healthy" (readiness check"
The "htbtworker" process tries periodically to connect to the Kafka Topic "unauthenticated.SEC_HEARTBEAT_OUTPUT", which usually not exists.
Therefor the "htbtworker" will not finish and the "hb_worker_process" is blocked and the pod readiness check fails.
In usual ONAP deployment scenarios the message-router is not running, when the HB service is started, which leads to an exception in the "htbtworker" and the "hb_worker_process" is launched:
2022-09-30 05:13:29,931 | __main__ | db_monitoring | <module> | 235 | INFO | DBM: DBM Process started 2022-09-30 05:13:29,932 | __main__ | db_monitoring | <module> | 240 | INFO | ('DBM:Parent process ID and json file name', '6', '../etc/config.json') 2022-09-30 05:13:48,984 | __main__ | htbtworker | process_msg | 79 | INFO | ('\n\nHBT:eventnameList values ', ['Heartbeat_vDNS', 'Heartbeat_vFW', 'Heartbeat_xx']) 2022-09-30 05:13:48,985 | __main__ | htbtworker | process_msg | 85 | INFO | HBT:Getting :http://message-router:3904/events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000 2022-09-30 05:13:49,015 | urllib3.connectionpool | connectionpool | _new_conn | 226 | DEBUG | Starting new HTTP connection (1): message-router:3904 2022-09-30 05:13:49,985 | __main__ | db_monitoring | db_monitoring | 161 | INFO | DBM: Active DB Monitoring Instance Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 159, in _new_conn conn = connection.create_connection( File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 84, in create_connection raise err File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 74, in create_connection sock.connect(sa) ConnectionRefusedError: [Errno 111] Connection refusedDuring handling of the above exception, another exception occurred: ... requests.exceptions.ConnectionError: HTTPConnectionPool(host='message-router', port=3904): Max retries exceeded with url: /events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000 (Caused by New ConnectionError('<urllib3.connection.HTTPConnection object at 0x7f737a282460>: Failed to establish a new connection: [Errno 111] Connection refused')) 2022-09-30 05:13:50,123 | __main__ | misshtbtd | hb_worker_process | 202 | INFO | MSHBT:Creaated Heartbeat worker process 2022-09-30 05:13:54,625 | __main__ | misshtbtd | main | 359 | INFO | ('MSHBT: hb_common values ', 6, 'RUNNING', 'dcae-heartbeat-', 1664514809) 2022-09-30 05:13:54,626 | __main__ | misshtbtd | main | 363 | INFO | ('MSHBD:pid,srcName,state,time,ctime,timeDiff is', 6, 'dcae-heartbeat-', 'RUNNING', 1664514809, 1664514835, 26) 2022-09-30 05:13:54,627 | __main__ | misshtbtd | main | 379 | INFO | ('MSHBD:config status is', 'RUNNING')
So only due to this exception finally the hb_worker_process is started:
2022-09-30 05:13:50,123 | __main__ | misshtbtd | hb_worker_process | 202 | INFO | MSHBT:Creaated Heartbeat worker process
In the "normal" case the "htbtworker" would run endlessly until the topic is created (e.g. in the ServiceMesh environment) and the "hb_worker_process" will never be started.
2022-09-28 10:49:08,399 | __main__ | htbtworker | <module> | 258 | INFO | HBT:HeartBeat thread Created 2022-09-28 10:49:08,399 | __main__ | htbtworker | <module> | 260 | INFO | ('HBT:The config file name passed is -%s', '../etc/config.json') 2022-09-28 10:49:09,011 | __main__ | misshtbtd | create_process | 327 | INFO | ('MSHBD:jobs list is', [<Process name='Process-2' pid=13 parent=6 started>, <Process name='Process-3' pid=15 parent=6 started >]) 2022-09-28 10:49:09,129 | __main__ | misshtbtd | create_update_hb_common | 121 | INFO | MSHBT:Updated hb_common DB with new values 2022-09-28 10:49:09,510 | __main__ | db_monitoring | <module> | 235 | INFO | DBM: DBM Process started 2022-09-28 10:49:09,511 | __main__ | db_monitoring | <module> | 240 | INFO | ('DBM:Parent process ID and json file name', '6', '../etc/config.json') 2022-09-28 10:49:28,461 | __main__ | htbtworker | process_msg | 79 | INFO | ('\n\nHBT:eventnameList values ', ['Heartbeat_vDNS', 'Heartbeat_vFW', 'Heartbeat_xx']) 2022-09-28 10:49:28,462 | __main__ | htbtworker | process_msg | 85 | INFO | HBT:Getting :http://message-router:3904/events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000 2022-09-28 10:49:28,485 | urllib3.connectionpool | connectionpool | _new_conn | 226 | DEBUG | Starting new HTTP connection (1): message-router:3904 2022-09-28 10:49:28,562 | urllib3.connectionpool | connectionpool | _make_request | 433 | DEBUG | http://message-router:3904 "GET /events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000 HTTP/ 1.1" 404 142 2022-09-28 10:49:28,563 | __main__ | htbtworker | process_msg | 100 | INFO | ('HBT:', '{"mrstatus":3001,"helpURL":"http://onap.readthedocs.io","message":"No such topic exists.-[unauthenticated.SEC_HEARTBE AT_OUTPUT]","status":404}') 2022-09-28 10:49:29,583 | __main__ | db_monitoring | db_monitoring | 161 | INFO | DBM: Active DB Monitoring Instance 2022-09-28 10:49:34,210 | __main__ | misshtbtd | main | 359 | INFO | ('MSHBT: hb_common values ', 6, 'RUNNING', 'dcae-heartbeat-', 1664362149) 2022-09-28 10:49:34,211 | __main__ | misshtbtd | main | 363 | INFO | ('MSHBD:pid,srcName,state,time,ctime,timeDiff is', 6, 'dcae-heartbeat-', 'RUNNING', 1664362149, 1664362174, 25) 2022-09-28 10:49:34,215 | __main__ | misshtbtd | main | 379 | INFO | ('MSHBD:config status is', 'RUNNING') 2022-09-28 10:49:34,339 | __main__ | misshtbtd | create_update_hb_common | 121 | INFO | MSHBT:Updated hb_common DB with new values 2022-09-28 10:49:48,624 | __main__ | htbtworker | process_msg | 79 | INFO | ('\n\nHBT:eventnameList values ', ['Heartbeat_vDNS', 'Heartbeat_vFW', 'Heartbeat_xx']) 2022-09-28 10:49:48,624 | __main__ | htbtworker | process_msg | 85 | INFO | HBT:Getting :http://message-router:3904/events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000 2022-09-28 10:49:48,637 | urllib3.connectionpool | connectionpool | _new_conn | 226 | DEBUG | Starting new HTTP connection (1): message-router:3904