Bug
Resolution: Done
Medium
Dublin Release
None
Software version: ONAP Dublin
For reasons not yet identified, DCAE’s RestConfClient component (RCC) exhausted the thread capacity of the k8s worker node it was running on. It created more than 28K threads in 10 days and left their resources in a state the OS could not reclaim. As a result, many pods co-located on the same node failed, because their functionality relied on periodically creating new threads. The OS could not start any new native thread: all resources were consumed.
Scaling RCC down to zero instances allowed the node’s OS to reclaim the thread resources, which in turn allowed the failing pods to recover.
Unfortunately, no useful logs were gathered. The root cause needs further investigation.
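Since no logs were captured, one way to gather evidence if this recurs is to sample per-process thread counts on the affected node. The following is only a minimal sketch, not part of RCC or DCAE: it assumes a Linux node where `/proc/<pid>/status` is readable and exposes the standard `Threads:` field, and the function names (`parse_thread_count`, `thread_counts`, `top_thread_users`) are hypothetical helpers made up for illustration.

```python
import os
from typing import Dict, List, Optional, Tuple


def parse_thread_count(status_text: str) -> Optional[int]:
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    return None


def thread_counts(proc_root: str = "/proc") -> Dict[int, int]:
    """Map PID -> thread count for every process visible under proc_root."""
    counts: Dict[int, int] = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                n = parse_thread_count(fh.read())
        except OSError:
            continue  # process exited or is not readable
        if n is not None:
            counts[int(entry)] = n
    return counts


def top_thread_users(counts: Dict[int, int], limit: int = 5) -> List[Tuple[int, int]]:
    """Return the `limit` processes with the most threads, largest first."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:limit]


if __name__ == "__main__":
    # Assumption: run directly on the worker node (or somewhere that can
    # see the node's processes), so that container processes are visible.
    for pid, n in top_thread_users(thread_counts()):
        print(f"pid={pid} threads={n}")
```

Running this periodically and recording the output would show whether one process's thread count grows steadily over days, which is the pattern described above.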