Type: Epic
Resolution: Unresolved
Priority: Medium
Service Assurance with Big Data analytics (needed generally, and also satisfies "Analytics at the source" as discussed in the Edge-Automation working group)
Purpose:
- Use Big Data analytics to ensure that the analysis is accurate.
- Use big data frameworks to enable machine learning and deep learning.
- Avoid sending large amounts of data to ONAP-Central for training by letting training happen near the data source (cloud-regions).
- Improve ONAP scale-out performance by distributing some functions, such as analytics, out of ONAP-Central.
- Let inferencing happen closer to the edges/cloud-regions for future closed-loop operations, thereby reducing closed-loop latency.
How would it be done?
- Use PNDA as a base
- Create/adapt Helm charts
- Ensure that no HEAT-based deployment is necessary.
- Use components needed for both normal analytics and ML-based analytics (latest stable Apache Spark release, HDFS, OpenTSDB, Kafka, Avro schemas, etc.); see the Avro sketch after this list.
- Use some PNDA-specific packages, the Deployment Manager being one example.
- Develop new (or enhance existing) software components
  - that allow distribution of analytics applications to various analytics instances
  - that allow onboarding of new analytics applications and of ML/DL models
  - that integrate with the CLAMP and DCAE frameworks
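To make the CollectD-to-Kafka path concrete, here is a minimal Python sketch of Avro-encoding a single collectd-style metric sample and publishing it. The schema fields, topic name, and broker address are illustrative assumptions, not part of this epic; the real record layout would come from the Avro schemas mentioned above.

```python
# Hedged sketch only: field names and topic are assumptions, not the PNDA schema.
import io
import time

from fastavro import parse_schema, schemaless_writer
from kafka import KafkaProducer  # kafka-python

# Hypothetical metric-event schema for the CollectD-kafka/avro service.
METRIC_SCHEMA = parse_schema({
    "type": "record",
    "name": "MetricEvent",
    "fields": [
        {"name": "host", "type": "string"},
        {"name": "plugin", "type": "string"},          # e.g. "memory"
        {"name": "type_instance", "type": "string"},   # e.g. "used"
        {"name": "value", "type": "double"},
        {"name": "timestamp", "type": "long"},         # epoch millis
    ],
})

def encode_metric(host: str, plugin: str, type_instance: str, value: float) -> bytes:
    """Serialize one metric sample to schemaless Avro bytes."""
    buf = io.BytesIO()
    schemaless_writer(buf, METRIC_SCHEMA, {
        "host": host,
        "plugin": plugin,
        "type_instance": type_instance,
        "value": value,
        "timestamp": int(time.time() * 1000),
    })
    return buf.getvalue()

producer = KafkaProducer(bootstrap_servers="kafka:9092")  # assumed address
producer.send("collectd-metrics", encode_metric("node1", "memory", "used", 7.5e9))
producer.flush()
```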
Dublin Scope (Activities)
- DCAE & OOM - Helm charts for analytics framework. Two packages
- Standard package (with all SW) and inferencing package (Minimal).
- Use case - Deploy Analytics framework in the cloud-regions that are based on K8S.
- DCAE - Update the PNDA deployment manager for Spark app and ML/DL model distribution
- DCAE - Analytics application management & dispatcher (to dispatch application images to various cloud-regions)
- DCAE - ML/DL Model Management & Dispatcher
- MultiCloud/K8S - Analytics application configuration profile support
- DCAE - Collection and Distribution Service - CollectD to Kafka (CollectD-kafka/avro)
- DCAE - Collection and Distribution Service - Node-exporter & cAdvisor to Kafka (stretch goal)
- DCAE - Changes required to integrate with CLAMP and DCAE (E2E configuration, E2E stitching, etc.) - Stretch goal (as per the DCAE team's recommendation, this is moving to a new epic story, https://jira.onap.org/browse/ONAPARC-372, as it requires more discussion on architecture)
- DCAE - ONAP TCA event dispatcher micro-service (ONAP-TCA-event-dispatcher)
- DCAE - Make the TCA application generic, or create a test TCA application (since it needs to run in cloud-regions that lack ONAP-specific components), so that it runs on any Spark-based framework (input via Kafka, configuration updates directly via Consul, output via Kafka): the TCA-spark application. See the Spark sketch after this list.
- DCAE - Test analytics application (Before integration with DCAE)
- Onboard analytics framework (standard package)
- Instantiate analytics framework in cloud-region1
- Make an analytics application CSAR with Helm charts to instantiate the CollectD-Kafka/Avro micro-service, the spark-k8s-operator to submit the TCA-spark application, and a Helm chart to bring up the ONAP-TCA-event-dispatcher.
- Onboard analytics application
- Upload TCA-spark image and get it synchronized with cloud-region1
- Add configuration profiles for each component of the analytics application (e.g., configure it to generate an alarm if memory consumption exceeds X GB).
- Do something on a compute node to use up all the memory.
- Check whether memory events arrive at the ONAP-TCA-event-dispatcher.
- DCAE - Test analytics application (after integration with DCAE) - Dependent on item 9 (stretch goal)
- Leverage DCAE infrastructure components
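As a rough illustration of the TCA-spark application described above, the following PySpark Structured Streaming sketch reads metric events from Kafka, applies a memory threshold (which, per the item above, would really be fetched from Consul rather than hard-coded), and publishes threshold-crossing events to a topic that the ONAP-TCA-event-dispatcher could consume. Topic names, the JSON wire format, and the metric key are illustrative assumptions.

```python
# Sketch under stated assumptions; submit with the spark-sql-kafka package.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType, LongType, StringType, StructType

spark = SparkSession.builder.appName("tca-spark-sketch").getOrCreate()

# Assumed event layout; the real job would decode the Avro records produced
# by the CollectD-kafka/avro micro-service instead of JSON.
event_schema = (StructType()
                .add("host", StringType())
                .add("metric", StringType())
                .add("value", DoubleType())
                .add("timestamp", LongType()))

MEMORY_THRESHOLD_GB = 8.0  # placeholder; fetched from Consul in practice

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "kafka:9092")  # assumed address
          .option("subscribe", "collectd-metrics")          # assumed topic
          .load()
          .select(F.from_json(F.col("value").cast("string"),
                              event_schema).alias("e"))
          .select("e.*"))

# Emit one alert record per threshold crossing.
alerts = (events
          .filter((F.col("metric") == "memory.used_gb") &
                  (F.col("value") > MEMORY_THRESHOLD_GB))
          .select(F.to_json(F.struct("host", "metric", "value",
                                     "timestamp")).alias("value")))

(alerts.writeStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "kafka:9092")
       .option("topic", "tca-alerts")                       # assumed topic
       .option("checkpointLocation", "/tmp/tca-checkpoint")
       .start()
       .awaitTermination())
```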
Note: There is some discussion about doing activities 1 to 8 in the PNDA project. To be decided; for now, the assumption is that these will be placed in ONAP repositories.
Assumptions:
- For Dublin, it is assumed that Analytics framework and applications are deployed where K8S is present.
- For Dublin, only the Spark+HDFS+OpenTSDB+Kafka framework is supported. Other frameworks, such as Flink, are for further study.
- For Dublin, Kubeflow with TensorFlow Serving is not used for training and inferencing; only Spark-based ML pipelines are supported (a minimal training sketch follows these assumptions).
- For Dublin, closed-loop actions are performed only at the ONAP-Central level.
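Since Dublin restricts training to Spark-based ML pipelines, a training job running in a cloud-region could be as small as the sketch below. The feature columns, the anomaly label, and the output path are invented placeholders; the fitted PipelineModel is the kind of artifact the ML/DL Model Management & Dispatcher would distribute to other cloud-regions for inferencing.

```python
# Minimal Spark ML pipeline sketch; data and column names are placeholders.
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("edge-training-sketch").getOrCreate()

# Tiny in-memory stand-in for historical metrics read from HDFS/OpenTSDB.
df = spark.createDataFrame(
    [(0.20, 0.10, 0), (0.90, 0.80, 1), (0.30, 0.20, 0), (0.95, 0.90, 1)],
    ["mem_util", "cpu_util", "anomaly"])

pipeline = Pipeline(stages=[
    VectorAssembler(inputCols=["mem_util", "cpu_util"], outputCol="features"),
    LogisticRegression(labelCol="anomaly", featuresCol="features"),
])

model = pipeline.fit(df)
model.write().overwrite().save("/tmp/tca-anomaly-model")  # placeholder path
```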