  1. Logging analytics
  2. LOG-300

CD: OOM framework for continuous E2E deploy validation of tagged commit/merge trigger docker snapshots


    • OOM CD framework for continuous E2E deploy validation of component commits

       Issue: we currently build docker images daily, not per gerrit merge (per developer change), so we cannot yet run CD on each commit

      TODO: look at out of the box systems like gitlab, gocd, bamboo...

      POC running now that can consume a tagged docker manifest for a commit under test
      http://jenkins.onap.info/job/oom-cd/
      against http://beijing.onap.info:8880/env/1a7/kubernetes/dashboard
      tracked on http://kibana.onap.info:5601/app/kibana#/dashboards?_g=()

      Manifests

      https://onap.readthedocs.io/en/latest/release/release-manifest.html 

      https://gerrit.onap.org/r/gitweb?p=integration.git;a=blob;f=version-manifest/src/main/resources/docker-manifest.csv;h=35e992adc0678ccd98328d33c9a5ffc88dbb4dfa;hb=refs/heads/master 

       

       existing flow (actual)

      • gerrit review commits - partial-CI runs - JobBuilder marks -1 (compile failure, sonar failure)
      • gerrit review commits - partial-CI runs - JobBuilder marks +1, committer +2 merges to master, (later daily docker merge build tagged docker blindly)
      • no way to know whether that commit degraded ONAP

       proposed flow (expected) - two phase JobBuilder +1 process

      • gerrit review commits - partial-CI runs - JobBuilder marks -1 (compile failure, sonar failure)
      • gerrit review commits - partial-CI runs - JobBuilder marks +1, (what we add below)
        kick in docker merge immediately, we tag the docker image, we adjust the manifest for this review
        run extra CDBuilder that runs CD deploy of OOM using above tagset manifest, healthcheck, vFW
        report failure - marks gerrit review as -1 - no tag set published for this failed commit
        or report pass - marks gerrit review as +1
        jenkins now retags "latest" to the docker image built above for the review (do not blindly tag "latest" to the last docker build whether it passes/fails CD)
        Tagset for that build becomes the latest stable master build of ONAP (as in an OOM deploy will run not from master tip but from a tagged set that omits the last breaking change)
        committer +2 merges to master, (docker merge and tagging should not be required as long as master has not moved during the 1 hour CD build)
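
      The pass/fail branch of the proposed flow can be sketched as below. This is only an illustration of the gating rule, not the JobBuilder or Jenkins implementation: `CdResult`, `GateDecision` and `gate_review` are hypothetical names, and a tag set is modelled as a plain image-to-tag dictionary.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CdResult:
    """Outcome of one CD run (hypothetical structure for illustration)."""
    deploy_ok: bool
    healthcheck_ok: bool
    vfw_ok: bool

    @property
    def passed(self) -> bool:
        # The review only passes if the deploy, healthcheck and vFW all pass.
        return self.deploy_ok and self.healthcheck_ok and self.vfw_ok


@dataclass
class GateDecision:
    gerrit_vote: int         # -1 or +1 on the gerrit review
    publish_tagset: bool     # whether the tag set for this review is published
    latest: Dict[str, str]   # image -> tag that "latest" points at afterwards


def gate_review(candidate: Dict[str, str],
                stable: Dict[str, str],
                result: CdResult) -> GateDecision:
    """Decide the vote and the 'latest' tag set for one review under test."""
    if not result.passed:
        # Failed CD: vote -1, publish nothing, "latest" stays on the stable set.
        return GateDecision(-1, False, dict(stable))
    # Passed CD: vote +1 and move "latest" to the images built for this review.
    promoted = dict(stable)
    promoted.update(candidate)
    return GateDecision(+1, True, promoted)
```

      With this rule a failing review never moves "latest", so an OOM deploy from the published tag set skips the breaking change.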

      (todo: developer can destabilize this if the submit order is wrong - ie: dcae will add an appc rest api, then checkin dcae first and appc second) 

      Note: Align with OOM-460 upcoming config retrofit

      Issues: (workarounds for both)

      • out of order commits on cross-project changes
      • repo drift during the CD build

      We also need a way to do a full deployment of OOM containing the change in a tagged docker image (from the CI build) - OOM will need either a script to retrofit all the values.yaml docker version tags or a way to pull in an automated manifest that has the entire ONAP tag set.

      The CD build with healthcheck and a minimal vFirewall run (as much as is automated) is then run on this single commit trigger, or on the set of triggers for the past hour - later the JobBuilder marks the gerrit reviews as passed/failed based on the CD run

      A CD POC running on AWS EC2 is currently up that does part of this (jenkins, ELK stack, CD server to run OOM) - but it needs to be finished and migrated into the LF infrastructure
      -------OOM-393-------, OOM-540

       Proposal

      1 - developer gerrit merge occurs
      2 - CI system saves git hash - INT-359
      3 - tag git based on when build starts (not when it finishes) - INT-189
      4 - build docker image per merge - not daily
      5 - insert artifact into docker image
      6 - tag docker image with TBD timestamp standard tag
      https://nexus3.onap.org/#browse/browse/components:docker.snapshot

      currently we have non-standard tags

      aai
        version 1.2.0-SNAPSHOT-STAGING-20180121T124253
      ccsdk
        0.2.0-SNAPSHOT-STAGING-180117-133202
        0.1.1-STAGING-180121-124346
      appc
        version 1.3.0-STAGING-20180105T120937

      7 - run CD system - after 1 hour collect results on running vFW for example

      8 - tag gerrit review -1 or  +1 via JobBuilder for CD results - INT-366
      9 - If success CD: mark tag set including new blind tag from earlier gerrit merge per component

      9b - tag latest only if successful

      10 - publish tag set in dynamic manifest
      https://gerrit.onap.org/r/gitweb?p=integration.git;a=blob;f=version-manifest/src/main/resources/docker-manifest.csv;h=35e992adc0678ccd98328d33c9a5ffc88dbb4dfa;hb=refs/heads/master 
      11 - retrofit manifest into all values.yaml files in OOM (replace v1.1.0 below - per component)
      https://git.onap.org/oom/tree/kubernetes/aai/values.yaml?h=amsterdam 
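
      Steps 10 and 11 could be scripted roughly as below. This is a minimal sketch under two assumptions that should be checked against the real files: that docker-manifest.csv has an `image,tag` header row, and that the values.yaml references images inline as `name:tag`. If a values.yaml keeps the version under a separate key, a different rewrite rule is needed. `load_manifest` and `retrofit_values` are hypothetical helper names.

```python
import csv
import io
import re
from typing import Dict


def load_manifest(csv_text: str) -> Dict[str, str]:
    """Parse manifest text into {image: tag}; assumes an 'image,tag' header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["image"].strip(): row["tag"].strip() for row in reader}


def retrofit_values(values_text: str, manifest: Dict[str, str]) -> str:
    """Replace the tag of every inline 'image:tag' reference found in the manifest."""
    for image, tag in manifest.items():
        # Match the exact image name (optionally behind a registry prefix ending
        # in '/'), followed by ':' and the old tag.
        pattern = re.compile(r"(?<![\w.\-])(" + re.escape(image) + r"):[\w.\-]+")
        # A function replacement avoids any escaping issues in the tag string.
        values_text = pattern.sub(lambda m, t=tag: m.group(1) + ":" + t, values_text)
    return values_text
```

      The rewrite works on the raw text rather than a parsed YAML tree so that comments and layout in values.yaml are left untouched.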

      Acceptance Criteria

      • jobbuilder can mark a gerrit review CD pass/fail - just like we do for compile CI jobs
      • kibana history is available for CD jobs
      • run on any branch or tag
      • publicly visible server results
      • mitigation procedure on not using unvalidated tags until they pass
      • mitigation procedure on high volume commit times - lack of resources (group commits - has its own issues)

       Issues to fix

      1 - tag timestamps are non-standard

      https://nexus3.onap.org/#browse/browse/components:docker.snapshot

      v2/onap/aai-resources/manifests/1.2.0-SNAPSHOT-STAGING-20171205T173445

      v2/onap/aai/esr-gui/manifests/1.0.1-SNAPSHOT-STAGING-171205-122425

      We can pull all the highest version tags in a query (after we fix above) and generate the tag set this way
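
      Until the tags are standardized, the two timestamp styles seen above (`20171205T173445` and `171205-122425`) can be normalized before picking the newest tag per image. The sketch below assumes the timestamp is always the trailing part of the tag and that the short form is `yymmdd-HHMMSS`; `tag_timestamp` and `latest_per_image` are hypothetical helpers, and the (image, tag) pairs would come from whatever nexus3 query is used.

```python
import re
from datetime import datetime
from typing import Dict, Iterable, Optional, Tuple

# Trailing timestamp in either style: yyyymmddTHHMMSS or yymmdd-HHMMSS.
_STAMP = re.compile(r"(?<!\d)(?:(\d{8})T(\d{6})|(\d{6})-(\d{6}))$")


def tag_timestamp(tag: str) -> Optional[datetime]:
    """Return the build timestamp encoded at the end of a tag, or None."""
    match = _STAMP.search(tag)
    if match is None:
        return None
    if match.group(1):
        return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")
    return datetime.strptime(match.group(3) + match.group(4), "%y%m%d%H%M%S")


def latest_per_image(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Build a tag set {image: newest tag}, skipping tags without a timestamp."""
    best: Dict[str, Tuple[datetime, str]] = {}
    for image, tag in pairs:
        stamp = tag_timestamp(tag)
        if stamp is None:
            continue
        if image not in best or stamp > best[image][0]:
            best[image] = (stamp, tag)
    return {image: tag for image, (_, tag) in best.items()}
```

      Note that this orders tags by build time only, not by version number, so it is a stand-in for the "highest version" query rather than a replacement for a standard tag format.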

       

      2 - need to update the manifest and use the plugin

      https://wiki.onap.org/display/DW/ONAP+Version+Manifest+Maven+Plugin 

       
      Notes
      http://blog.christianposta.com/deploy/blue-green-deployments-a-b-testing-and-canary-releases/
      https://aws.amazon.com/about-aws/whats-new/2016/08/netflix-oss-spinnaker-on-the-aws-cloud-quick-start-reference-deployment/
      https://medium.com/@gajus/the-missing-ci-cd-kubernetes-component-helm-package-manager-1fe002aac680

            michaelobrien michaelobrien
            michaelobrien michaelobrien