- Bug
- Resolution: Done
- High
- Guilin Release, Honolulu Release
- None
The PRH (PNF Registration Handler) microservice periodically gets its configuration from CBS (Config Binding Service).
However, it does not clean up its connections to CBS, so old sessions accumulate endlessly.
Eventually, the PRH/CBS pod exhausts all available file descriptors up to the system limit, and other pods on the same node then crash with 'connection refused', 'connection timed out', or 'too many open files' errors.
This issue occurs when ONAP has been running for a long time.
How long it takes depends on the settings: by default, the polling interval to CBS is 5 minutes and Ubuntu's file descriptor limit is 65535. If PRH and CBS are running on the same node, the issue can occur within 2 to 3 months.
cbs-netstat.txt.zip ... active connections as listed by the netstat command. It shows that about 25k sessions were kept open by PRH/CBS.
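To check for this kind of accumulation on a live node, the connections can be tallied by TCP state. The following is a minimal sketch, assuming Linux `netstat -ant` style output with the state in the sixth column; the function name, addresses, and ports are hypothetical and the sample lines are made up, not taken from cbs-netstat.txt.zip.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state in `netstat -ant`-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Made-up sample lines for illustration only.
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:41000          10.0.0.9:10000          ESTABLISHED
tcp        0      0 10.0.0.5:41002          10.0.0.9:10000          ESTABLISHED
tcp        0      0 0.0.0.0:8100            0.0.0.0:*               LISTEN
"""

print(count_tcp_states(sample))
```

A count of established connections that keeps growing between polls, rather than levelling off, would be consistent with the leak described above.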
Steps to reproduce
- To reproduce the issue faster, change the property updates-interval: 5m to 5s in bootstrap.yaml in prh-app-server-1.5.4.jar.
- Start the CBS and PRH pods on the same node (use nodeSelector, etc.).
- About 35 hours later, the node becomes unstable (due to exhausted system resources), and some pods on the same node crash and stop working with errors.
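As a rough cross-check of the timings in this report, the sketch below scales the 35-hour reproduction at a 5-second interval to the default 5-minute interval. It assumes that the node fails after the same number of polls regardless of the interval, which is a simplification and not something the report confirms; the names are hypothetical.

```python
SECONDS_PER_DAY = 86400

# Polls completed in the accelerated reproduction: 35 hours at a 5 s interval.
POLLS_TO_FAILURE = 35 * 3600 // 5


def days_until_failure(interval_seconds: float, polls: int = POLLS_TO_FAILURE) -> float:
    """Days needed to reach `polls` polls at the given polling interval."""
    return polls * interval_seconds / SECONDS_PER_DAY


print(POLLS_TO_FAILURE)             # 25200
print(days_until_failure(5 * 60))   # 87.5
```

Under these assumptions, the poll count is of the same order as the roughly 25k sessions seen in the netstat capture, and 87.5 days falls within the 2 to 3 months estimated for the default settings.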
- relates to
- DCAEGEN2-2868 CBS-Client supporting configMap - PRH integration
- Closed