Details
- Bug
- Status: Closed
- High
- Resolution: Done
- Casablanca Release
Description
Issue: since 20181217, nexus3 has been taking 80-100x longer to serve image downloads. The cause is likely routing, because Jenkins jobs on the local domain are OK.
Workaround: use a temporary proxy hosted on AWS or Azure.
- nexus3 public proxy: currently being populated on the server, ETA 35+ hours
- nexus3.onap.info:5000 (AWS-based private proxy, cert required)
- nexus3.onap.cloud:5000 (Azure-based public proxy, cert required, ETA 8 hours)
20181219 update (pre-TSC): https://wiki.onap.org/display/DW/TSC+2018-12-20+Meeting+Agenda
- nexus3.onap.org:10001 is experiencing a serious routing issue. This is not the older 3-hour slowdown: a full prepull now appears to take 35+ hours, an 80x slowdown on image downloads since 20181217.
- nexus3.onap.cloud:5000: alternate proxy (covering for the LF while they are on vacation), up since 20181218. It is taking 2 days to populate with Casablanca images (128G FS), ETA late Friday. After 20h of pulling, it holds 21 of the images listed in https://git.onap.org/integration/tree/version-manifest/src/main/resources/docker-manifest.csv?h=casablanca
- nexus4.onap.cloud:5000: alternate proxy for master, with a larger 200G FS
- Access instructions, cert, and installation steps are on Cloud Native Deployment#NexusProxy
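As a rough illustration of the workaround, the sketch below rewrites an image reference so that it points at one of the temporary proxies instead of nexus3.onap.org:10001. The helper names `to_proxy` and `pull_command` are hypothetical, and the sketch assumes the proxy serves images under the same repository paths as the origin registry, which this ticket does not confirm.

```python
from typing import List

# Registry hosts taken from this ticket.
ORIGIN_REGISTRY = "nexus3.onap.org:10001"
PROXY_REGISTRY = "nexus3.onap.cloud:5000"


def to_proxy(image: str,
             proxy: str = PROXY_REGISTRY,
             origin: str = ORIGIN_REGISTRY) -> str:
    """Swap the origin registry prefix for the proxy registry.

    Assumption: the proxy mirrors the origin's repository paths unchanged.
    Images that do not come from the origin registry are returned as-is.
    """
    prefix = origin + "/"
    if image.startswith(prefix):
        return proxy + "/" + image[len(prefix):]
    return image


def pull_command(image: str, proxy: str = PROXY_REGISTRY) -> List[str]:
    """Build (but do not run) the docker pull command for the proxied image."""
    return ["sudo", "docker", "pull", to_proxy(image, proxy)]
```

For example, `to_proxy("nexus3.onap.org:10001/onap/babel:1.3.2")` yields `nexus3.onap.cloud:5000/onap/babel:1.3.2`. The proxy certificate still has to be installed on the docker host as described on the wiki page.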
20181217
Started noticing this on the weekend: docker pulls are very slow, causing config job timeouts on OOM deployments.
Tested both in the Wind River lab and on AWS.
RCA
Q) Why does Jenkins have no issue with nexus3? A) They are on the same domain and do not go through a bell.ca exchange.
;; ANSWER SECTION:
jenkins.onap.org. 60 IN CNAME cloud.onap.org.
cloud.onap.org. 60 IN A 199.204.45.137

;; ANSWER SECTION:
nexus3.onap.org. 60 IN CNAME cloud.onap.org.
cloud.onap.org. 47 IN A 199.204.45.137
Run a traceroute and notice hop 16:
# this is from an AWS EC2 instance in us-east-2
ubuntu@ip{
traceroute to nexus3.onap.org (199.204.45.137), 30 hops max, 60 byte packets
16 tcore3-ashburnbk_hundredgige0-1{
Q) route
from AWS EC2 us-east-2
ubuntu@ip-172-31-10-98:~$ traceroute nexus3.onap.org
traceroute to nexus3.onap.org (199.204.45.137), 30 hops max, 60 byte packets
 1 ec2-52-15-0-0.us-east-2.compute.amazonaws.com (52.15.0.0) 13.311 ms ec2-52-15-0-4.us-east-2.compute.amazonaws.com (52.15.0.4) 11.291 ms ec2-52-15-0-8.us-east-2.compute.amazonaws.com (52.15.0.8) 13.231 ms
 2 100.64.2.8 (100.64.2.8) 18.561 ms 100.64.1.10 (100.64.1.10) 14.609 ms 100.64.2.12 (100.64.2.12) 18.531 ms
 3 100.66.3.4 (100.66.3.4) 10.228 ms 100.66.3.172 (100.66.3.172) 16.576 ms 100.66.3.136 (100.66.3.136) 21.578 ms
 4 100.66.7.133 (100.66.7.133) 21.433 ms 100.66.7.99 (100.66.7.99) 15.832 ms 100.66.7.1 (100.66.7.1) 16.592 ms
 5 100.66.4.117 (100.66.4.117) 14.519 ms 100.66.4.97 (100.66.4.97) 14.504 ms 100.66.4.209 (100.66.4.209) 16.951 ms
 6 100.65.9.193 (100.65.9.193) 0.241 ms 100.65.8.97 (100.65.8.97) 0.364 ms 100.65.9.65 (100.65.9.65) 0.199 ms
 7 52.95.1.1 (52.95.1.1) 2.031 ms 2.266 ms 52.95.3.131 (52.95.3.131) 1.341 ms
 8 52.95.1.110 (52.95.1.110) 1.690 ms 52.95.2.22 (52.95.2.22) 5.371 ms 52.95.1.166 (52.95.1.166) 8.017 ms
 9 52.95.1.179 (52.95.1.179) 1.122 ms 52.95.2.29 (52.95.2.29) 1.319 ms 52.95.1.147 (52.95.1.147) 1.060 ms
10 100.91.39.96 (100.91.39.96) 20.834 ms 100.91.39.112 (100.91.39.112) 15.177 ms 100.91.39.96 (100.91.39.96) 11.106 ms
11 54.239.45.239 (54.239.45.239) 11.895 ms 11.907 ms 54.239.45.247 (54.239.45.247) 11.567 ms
12 100.91.0.33 (100.91.0.33) 11.380 ms 100.91.38.231 (100.91.38.231) 19.709 ms 100.91.0.25 (100.91.0.25) 11.401 ms
13 54.239.108.116 (54.239.108.116) 26.534 ms 54.239.109.32 (54.239.109.32) 31.305 ms 54.239.108.216 (54.239.108.216) 30.614 ms
14 54.239.111.255 (54.239.111.255) 11.377 ms 54.239.111.247 (54.239.111.247) 11.406 ms 54.239.111.235 (54.239.111.235) 11.486 ms
15 52.95.216.141 (52.95.216.141) 11.262 ms 11.189 ms 11.197 ms
16 tcore3-ashburnbk_hundredgige0-1-0-0.net.bell.ca (64.230.125.182) 25.657 ms tcore4-ashburnbk_hundredgige0-1-0-0.net.bell.ca (64.230.125.184) 30.723 ms 30.673 ms
17 tcore3-montreal02_hundredgige1-5-0-0.net.bell.ca (64.230.79.107) 23.715 ms tcore4-montreal02_hundredgige1-5-0-0.net.bell.ca (64.230.79.111) 29.493 ms tcore3-montreal02_hundredgige1-5-0-0.net.bell.ca (64.230.79.107) 23.707 ms
18 dis53-montreal02_hundredgige0-2-0-0.net.bell.ca (64.230.91.19) 25.871 ms 25.883 ms 25.756 ms
19 204.101.4.238 (204.101.4.238) 28.213 ms 28.194 ms 28.182 ms
20 * * *
21 compute-199-204-45-137.ca-ymq-1.vexxhost.net (199.204.45.137) 34.124 ms 34.187 ms 34.418 ms
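To make the same check repeatable from other sites, a small parser can list which hops of a saved traceroute pass through a given carrier domain. This is a minimal sketch: the function name `hops_through` is hypothetical, and it assumes the standard Linux traceroute text layout shown in this ticket (hop number first, then `hostname (ip)` entries).

```python
import re
from typing import List

# A hop line starts with the hop number; the header line does not.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
# Each responding router is printed as "hostname (a.b.c.d)".
HOST = re.compile(r"([A-Za-z0-9_.-]+)\s+\((\d+\.\d+\.\d+\.\d+)\)")


def hops_through(traceroute_output: str, domain_suffix: str) -> List[int]:
    """Return the hop numbers where any router hostname ends with domain_suffix."""
    hops = []
    for line in traceroute_output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header or blank line
        hostnames = [host for host, _ip in HOST.findall(match.group(2))]
        if any(host.endswith(domain_suffix) for host in hostnames):
            hops.append(int(match.group(1)))
    return hops
```

Calling `hops_through(output, ".bell.ca")` on the AWS us-east-2 trace in this ticket would pick out hops 16, 17 and 18. A trace taken from inside the Jenkins domain could be compared the same way to confirm it avoids that exchange.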
A full pull of the 40G of ONAP images usually takes 25 min. Now just pulling the AAI containers takes 3 hours.
This causes some pods, such as OOF, to fail deployment while waiting for their docker images to load.
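The figures in this ticket can be cross-checked with simple arithmetic: a 25 min baseline slowed by 80x comes to about 33.3 hours, and by 100x to about 41.7 hours, which brackets the reported "35+ hours" for a full prepull. The sketch below shows the calculation. The function names are hypothetical, and it assumes the slowdown applies uniformly to the whole pull and that 1 GB = 1000 MB.

```python
def slowed_pull_hours(baseline_minutes: float, slowdown_factor: float) -> float:
    """Hours a pull takes if the baseline duration is multiplied by slowdown_factor."""
    return baseline_minutes * slowdown_factor / 60.0


def throughput_mb_per_s(size_gb: float, minutes: float) -> float:
    """Average throughput in MB/s for pulling size_gb in the given minutes (1 GB = 1000 MB)."""
    return size_gb * 1000.0 / (minutes * 60.0)


# Baseline from the ticket: 40G in 25 min.
BASELINE_MB_PER_S = throughput_mb_per_s(40, 25)   # about 26.7 MB/s
ETA_AT_80X = slowed_pull_hours(25, 80)            # about 33.3 hours
ETA_AT_100X = slowed_pull_hours(25, 100)          # about 41.7 hours
```

At 80x the effective rate drops from roughly 26.7 MB/s to about 0.33 MB/s, which is well below what deployment jobs waiting on image pulls were sized for.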
This is over the course of 4 hours:
ubuntu@ip-172-31-16-86:~$ sudo docker images
REPOSITORY   TAG   IMAGE ID   CREATED   SIZE
nexus3.onap.org:10001/onap/aaf/aaf_service   2.1.8   6eb295fed110   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_oauth   2.1.8   74dcdce76094   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_locate   2.1.8   2a4eaa6275ff   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_hello   2.1.8   495a01176053   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_gui   2.1.8   8caa6dc681f0   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_fs   2.1.8   3d663698534d   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_cm   2.1.8   0ba25c4ec3fb   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_agent   2.1.8   090b326a7f11   4 weeks ago   1.14 GB
nexus3.onap.org:10001/onap/aaf/aaf_config   2.1.8   6506ac785cb5   4 weeks ago   1.14 GB
nexus3.onap.org:10001/onap/aaf/aaf_cass   2.1.8   4b91e9b0b43f   4 weeks ago   323 MB

ubuntu@ip-172-31-16-86:~$ sudo docker images
REPOSITORY   TAG   IMAGE ID   CREATED   SIZE
nexus3.onap.org:10001/onap/aaf/aaf_service   2.1.8   6eb295fed110   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_oauth   2.1.8   74dcdce76094   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_locate   2.1.8   2a4eaa6275ff   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_hello   2.1.8   495a01176053   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_gui   2.1.8   8caa6dc681f0   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_fs   2.1.8   3d663698534d   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_cm   2.1.8   0ba25c4ec3fb   4 weeks ago   1.16 GB
nexus3.onap.org:10001/onap/aaf/aaf_agent   2.1.8   090b326a7f11   4 weeks ago   1.14 GB
nexus3.onap.org:10001/onap/aaf/aaf_config   2.1.8   6506ac785cb5   4 weeks ago   1.14 GB
nexus3.onap.org:10001/onap/aaf/aaf_cass   2.1.8   4b91e9b0b43f   4 weeks ago   323 MB
nexus3.onap.org:10001/onap/aaf/smsquorumclient   3.0.1   f8cf701eadc3   6 weeks ago   18.2 MB
nexus3.onap.org:10001/onap/aaf/sms   3.0.1   02363fccc6c7   6 weeks ago   35.4 MB
nexus3.onap.org:10001/onap/aaf/testcaservice   3.0.0   fc717d0b071c   2 months ago   1.17 GB
nexus3.onap.org:10001/onap/aaf/abrmd   3.0.0   00a91d2dc09d   2 months ago   1.15 GB
nexus3.onap.org:10001/onap/aaf/distcenter   3.0.0   d8d9137ef2d3   2 months ago   1.09 GB
nexus3.onap.org:10001/onap/aai-cacher   1.0.0   8ec3df246a35   2 months ago   466 MB

ubuntu@onap-oom-obrien-rancher-e0:~$ sudo docker pull nexus3.onap.org:10001/onap/babel:1.3.2
1.3.2: Pulling from onap/babel
027274c8e111: Downloading [=========> ] 12.13 MB/67.13 MB
d3f9339a1359: Download complete
872f75707cf4: Download complete
dd5eed9f50d5: Download complete
1e7eb83bf142: Downloading [============================================> ] 11.62 MB/13.08 MB
f3f38846c182: Downloading [===================> ] 10.76 MB/27.49 MB
ef51b9ba357f: Waiting
a50c1a87a76e: Waiting
Issue Links
- blocks
  - LOG-898 ONAP Deployment Resiliency changes to deploy without POD failures (Closed)
  - TSC-25 Task Force to implement CD (Continuous Deployment) (Closed)
  - LOG-355 Add Nexus proxy procedure for ONAP deployments - LF Nexus3 is timing out periodically (Closed)
  - LOG-375 Nexus3.onap.info public proxy for throughput (Closed)
  - LOG-905 docker_prepull.sh script for casablanca (Closed)
- is blocked by
  - OOM-1563 All Charts do no honor global image pull Policy (Closed)
  - TSC-86 Lock down docker image tag name source of truth - oom values.yaml or integration repo manifest - A: both but manifest is the source (Closed)
- relates to
  - TSC-58 Dublin Toolchain Improvement (Closed)
  - COMMON-27 ONAP Docker images and base images should be ONAP controlled (Open)
  - OOM-1543 Helm deploy/undeploy adjustments for triage of failed/unlisted deployments (Closed)
  - OOM-1563 All Charts do no honor global image pull Policy (Closed)
  - LOG-352 Docker prepull strategy for multi-node clusters (Closed)
  - LOG-355 Add Nexus proxy procedure for ONAP deployments - LF Nexus3 is timing out periodically (Closed)
  - LOG-371 K8S DevOps: add Nexus3 proxy helm chart (Closed)
  - LOG-375 Nexus3.onap.info public proxy for throughput (Closed)
  - OOM-1553 Patch Integration environments (Closed)