TSC-25

Task Force to implement CD (Continuous Deployment)


      Meeting Videos: https://wiki.onap.org/display/DW/CD+-+Continuous+Deployment
      20181026: LF Ticket 62287 is tracking this request

      Meets Thu 1230 EDT (GMT-4) https://zoom.us/j/7939937123

      Integration team: 2018
      gwu - full ONAP system, 13 VMs - freq: master once a day

      Logging/OOM Team: 201710
      michaelobrien - partial ONAP system, 1 VM - freq: hourly for 3 pods - 6 hours for a full ONAP deploy (256G VM)
      GITLAB OOM mirror for 2nd CD pipeline

      Orange Labs 201901
      Working E2E demo (Orange CD demo from Sylvain Desbureaux, 20190131): https://wiki.onap.org/display/DW/CD+-+Continuous+Deployment#CD-ContinuousDeployment-20190131-OrangeCDdemofromSylvainDesbureaux

      example on https://gerrit.onap.org/r/#/c/77660/

      • flow:
      Existing flow
      - gerrit commit on the oom repo for a particular component (such as so or aai), keyed by the Jira Issue-ID
      - the helm-verify jjb jenkins job runs and reports +1/-1
      - the review is merged
      - helm-verify runs again on master
      - no helm deploy
      Proposed flow 1
      - gerrit commit on the oom repo for a particular component (such as so or aai), keyed by the Jira Issue-ID
      - the helm-verify jjb jenkins job runs and reports +1/-1
      - the manual magic word "run-helm-deploy" triggers a helm-deploy jjb job that deploys robot and the particular pod to a 16-32G VM (preconfigured with rancher as a single node). How: jenkins runs a remote ssh shell on a server using a cached key; a cd.sh script will need to be written - see the 2 POCs below that are already running
      - reports +1/-1 depending on whether the healthcheck for that component passes within 20 min - parse the logs from jenkins
      - scripts to bring up k8s/helm/docker - see links in comments
        3 types of tests (docker image tag, kubernetes chart/job/config changes) - docker image tag changes require that the image is already in nexus3 - ideally only oom repo changes will be in phase 1
      Proposed flow 2 later
      - gerrit commit on the oom repo for a particular component (such as so or aai), keyed by the Jira Issue-ID
      - the helm-verify jjb jenkins job runs and reports +1/-1
      - the same helm-install jjb job is then triggered automatically and reports back +1/-1 after 20 min
      - depending on the number of VMs, jobs can be parallelized or batched (the +1/-1 is reported for the whole batch, even though only one or more of its commits are at fault)
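The proposed flows above could be driven by a small cd.sh. The following is only a sketch under stated assumptions: the function names are hypothetical, and the mapping from a Jira Issue-ID to a chart name (project key lowercased, so LOG-NNN becomes log) is an assumption rather than an existing ONAP convention. The helm arguments are the ones quoted elsewhere in this ticket, with robot enabled in addition to the component under test.

```shell
#!/bin/sh
# Hypothetical cd.sh building blocks; names and the Issue-ID mapping are
# assumptions for illustration, not an existing ONAP script.

# Map a Jira Issue-ID such as "LOG-123" to a chart name such as "log".
# Assumes the lowercased Jira project key matches the OOM chart name.
component_from_issue() {
  printf '%s\n' "$1" | sed 's/-.*$//' | tr '[:upper:]' '[:lower:]'
}

# Print (do not run) the helm command that deploys robot plus one component,
# with every other chart disabled through disable-allcharts.yaml.
helm_deploy_cmd() {
  printf 'helm install local/onap -n onap --namespace onap -f onap/resources/environments/disable-allcharts.yaml --set robot.enabled=true --set %s.enabled=true\n' "$1"
}

# Translate a healthcheck exit status into the Gerrit vote to report.
vote_for_status() {
  if [ "$1" -eq 0 ]; then echo "+1"; else echo "-1"; fi
}

# Retry a command until it succeeds or the time budget is used up.
# $1 = timeout in seconds, $2 = delay between attempts, rest = the command.
wait_until() {
  wu_timeout=$1; wu_delay=$2; shift 2
  wu_elapsed=0
  while ! "$@"; do
    wu_elapsed=$((wu_elapsed + wu_delay))
    [ "$wu_elapsed" -ge "$wu_timeout" ] && return 1
    sleep "$wu_delay"
  done
  return 0
}
```

A job could then run the printed helm command, call something like `wait_until 1200 60 ./ete-k8s.sh onap health` from oom/kubernetes/robot to allow the 20 minute budget, and pass the resulting exit status to vote_for_status.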

      Task Force: Michael O'Brien, Gildas, Christophe Closset, Jessica, Jeremy

      Linux Foundation Ticket: 62287

      Use labs if they have public non-VPN access - https://wiki.onap.org/display/DW/Physical+Labs

        Use cases

      (at a minimum, the CD deploy essentially runs)

      kubectl get pods --all-namespaces
      oom/kubernetes/robot$ sudo ./ete-k8s.sh onap health
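The pod listing above can be reduced to a single number for a jenkins job to act on. This is a minimal sketch, assuming the default `kubectl get pods --all-namespaces` column layout (NAMESPACE NAME READY STATUS RESTARTS AGE); count_unhealthy_pods is a hypothetical helper name.

```shell
# Count pods that are neither Completed nor Running with all containers ready.
# Reads the text output of `kubectl get pods --all-namespaces` on stdin and
# skips the header line. Assumes the default six-column layout.
count_unhealthy_pods() {
  awk 'NR > 1 {
         split($3, ready, "/")
         ok = ($4 == "Completed") || ($4 == "Running" && ready[1] == ready[2])
         if (!ok) bad++
       }
       END { print bad + 0 }'
}
```

For example, `kubectl get pods --all-namespaces | count_unhealthy_pods` prints 0 only when every pod is either fully ready or a completed job, which is a cheap check to make before starting the robot healthcheck.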

      Ultimate main goal: prevent merging code that has not been tested in a CD environment.

      Realistic goal: test merged code in a CD environment.

      20181015 notes: glanilis michaelobrien ChrisC

      • when to run: only after a successful helm verify (before code is merged)
        example: https://jenkins.onap.org/job/oom-master-merge-helm/289/ (had a +1)
      • Q: would the LF like a separate CDBuilder job to run afterwards, or should this be done as part of the current jobbuilder
        "Verified +1 ONAP Jobbuilder"
      • recommend a single VM deployment for 1 pod - not a full ONAP system yet
        for example, if the commit is under LOG-NNN,
        we run the command with --set log.enabled=true and the rest of onap disabled, like
        sudo helm install local/onap -n onap --namespace onap -f onap/resources/environments/disable-allcharts.yaml --set log.enabled=true

        example http://kibana.onap.info:5601/app/kibana#/dashboard/AWAtvpS63NTXK5mX2kuS

      • Timing should be <20 min for one pod to helm install and run healthcheck
      • Concurrency limit (as a result of multiple gerrit merges within a period) - ask for 4
      • LF: number of servers and capacity (13 x 16G for full ONAP, 1 x 16G for a particular component)
        (vCPU limits are not enforced yet, but in the future a component like pomba, with 11 containers using 10G RAM, will need for example 2 x 11 vCPUs) - current vCPU limit is 2 - a fraction of a vCPU (10% of a core per container) - full system between 32 and 64 cores
      • HD: need 40G per VM for the K8S system and 10G for /dockerdata_nfs - a full deploy is 90G on the master and 50G on each cluster VM
        Numbers: full (13 x 16G VMs) = 700G total HD, one component = 90G (single VM)
      • access (public not VPN) from jenkins jobs
      • pilot project
      • LF: work on JJB to continue past existing helm-verify
      • LF: make the outcome of the particular helm install visible - like the +1 we get in, for example, https://gerrit.onap.org/r/#/c/70486/
      • optional nice to have gwu kibana view like in http://onapci.org/grafana/d/8cGRqBOmz/daily-summary?orgId=1
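The disk numbers in the list above can be cross-checked with a little arithmetic. A minimal sketch, assuming a full deployment is one master VM plus twelve cluster VMs (the 13 VMs quoted above); total_disk_gb is a hypothetical helper, and its defaults are the 90G and 50G figures from the list.

```shell
# Total disk in G for a deployment of N VMs: one master plus (N - 1) cluster VMs.
# $1 = number of VMs, $2 = master disk in G (default 90),
# $3 = disk per cluster VM in G (default 50).
total_disk_gb() {
  td_vms=$1; td_master=${2:-90}; td_node=${3:-50}
  echo $(( td_master + (td_vms - 1) * td_node ))
}
```

Under that assumption, `total_disk_gb 13` gives 690, in line with the roughly 700G quoted for a full system, and `total_disk_gb 1` gives the 90G quoted for a single-VM component.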

      investigate the ansible-based Zuul CI: https://zuul-ci.org/
