Story
Resolution: Done
Medium
If DMaaP message-router has been running for a long time (typically a couple of days), the GET /events/{topicname}/{consumergroup}/{consumerid} API very frequently stops working and returns a 503 error response like the one below.
{"mrstatus":3002,"helpURL":"http://onap.readthedocs.io","message":"Server is temporarily unavailable or busy.Try again later, or try another server in the cluster.org.onap.dmaap.dmf.mr.backends.kafka.KafkaConsumerCache$KafkaConsumerCacheException: The cache service is unavailable.","status":503}
Because DMaaP is being deprecated, a workaround is suggested instead of fixing the issue in message-router itself.
The Helm chart has a livenessProbe that currently checks only the TCP connection to the API port of the message-router pod. It can be changed to use the httpGet mechanism of livenessProbe to fetch events from a harmless test topic. Then, if this functionality breaks, Kubernetes will restart the message-router pod.
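
A minimal sketch of what such a probe could look like in the container spec is shown here. It is an illustration under assumptions, not the actual chart change: the port 3904, the topic name `LIVENESS_TEST_TOPIC`, the consumer group `liveness` and consumer id `probe`, the `timeout` and `limit` query parameters, and all timing values are placeholders that would need to be checked against the deployed message-router and its chart values. The test topic is assumed to exist already and to be unused by real consumers.

```yaml
# Hypothetical livenessProbe for the message-router container.
# All names and numbers below are assumptions for illustration.
livenessProbe:
  httpGet:
    # GET /events/{topicname}/{consumergroup}/{consumerid} on a test topic.
    # "LIVENESS_TEST_TOPIC", "liveness" and "probe" are placeholder values.
    path: /events/LIVENESS_TEST_TOPIC/liveness/probe?timeout=1000&limit=1
    port: 3904            # assumed HTTP API port of message-router
    scheme: HTTP
  initialDelaySeconds: 60 # give Kafka and the consumer cache time to start
  periodSeconds: 30
  timeoutSeconds: 10      # should exceed the fetch timeout in the path
  failureThreshold: 3     # restart only after three consecutive failures
```

Kubernetes treats any HTTP status from 200 to 399 as success for an httpGet probe, so the 503 response shown above would count as a failure. With the values in this sketch, a restart would follow after roughly 90 seconds of consecutive failures.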