Jw/nexus update cloud config #2892

Merged
merged 4 commits into from
Nov 18, 2022

Conversation

JaimieWi
Contributor

Resolves #2877

What is being addressed

Nexus failed to deploy correctly. The runcmd script within the cloud-config.yaml for nexus was failing to run due to the packages not being installed.

The cloud-init logs of the Nexus VM showed a 470 status code for [IP: 51.132.212.186 80], which is the address behind the azure.archive.ubuntu.com URL.

This FQDN had already been added to the Nexus firewall rule, but requests to it were being denied on deployment of Nexus. Likely as the rule had not taken effect quick enough.

Err:1 http://azure.archive.ubuntu.com/ubuntu bionic InRelease
  470  status code 470 [IP: 51.132.212.186 80]
Err:2 http://azure.archive.ubuntu.com/ubuntu bionic-updates InRelease
  470  status code 470 [IP: 51.132.212.186 80]
Err:3 http://azure.archive.ubuntu.com/ubuntu bionic-backports InRelease
  470  status code 470 [IP: 51.132.212.186 80]
Err:4 http://azure.archive.ubuntu.com/ubuntu bionic-security InRelease
  470  status code 470 [IP: 51.132.212.186 80]
Get:5 https://download.docker.com/linux/ubuntu bionic InRelease [64.4 kB]
Get:6 https://packages.microsoft.com/repos/azure-cli bionic InRelease [1873 B]
Get:7 https://download.docker.com/linux/ubuntu bionic/stable amd64 Packages [29.6 kB]
Get:8 https://packages.microsoft.com/repos/azure-cli bionic/main all Packages [7065 B]
Reading package lists...
E: The repository 'http://azure.archive.ubuntu.com/ubuntu bionic InRelease' is no longer signed.
E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/dists/bionic/InRelease  470  status code 470 [IP: 51.132.212.186 80]

Therefore, I have added azure.archive.ubuntu.com and repo.almalinux.org to the Firewall shared service, specifically to the list of Target FQDNs within the nexus-bootstrap application rule collection. These were taken from the list of nexus_allowed_fqdns.
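
A quick way to confirm which hosts are being denied is to scan the apt output for the 470 status code. The following is a minimal Python sketch and not part of this PR: the function name hosts_with_status and the sample text are illustrative, and it assumes the log has the same shape as the excerpt above (an "Err:" line followed by a status line, or a single "Failed to fetch" line).

```python
import re
from urllib.parse import urlparse

# Patterns based on the apt output shown in this PR (assumed log shape).
ERR_LINE = re.compile(r"^Err:\d+\s+(\S+)")
FAILED_LINE = re.compile(r"Failed to fetch\s+(\S+)")
STATUS = re.compile(r"status code (\d{3})")


def hosts_with_status(apt_output: str, status: int = 470) -> list[str]:
    """Return the sorted, de-duplicated hosts whose fetch ended with `status`."""
    hosts = set()
    pending_url = None  # URL from an "Err:N <url> ..." line awaiting its status line
    for line in apt_output.splitlines():
        err = ERR_LINE.match(line.strip())
        if err:
            pending_url = err.group(1)
            continue
        failed = FAILED_LINE.search(line)
        url = failed.group(1) if failed else pending_url
        code = STATUS.search(line)
        if url and code and int(code.group(1)) == status:
            hosts.add(urlparse(url).hostname)
        pending_url = None
    return sorted(hosts)


# Illustrative sample, shortened from the log in this PR.
SAMPLE_LOG = """
Err:1 http://azure.archive.ubuntu.com/ubuntu bionic InRelease
  470  status code 470 [IP: 51.132.212.186 80]
Get:5 https://download.docker.com/linux/ubuntu bionic InRelease [64.4 kB]
E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/dists/bionic/InRelease  470  status code 470 [IP: 51.132.212.186 80]
"""

print(hosts_with_status(SAMPLE_LOG))  # ['azure.archive.ubuntu.com']
```

Any host this reports is a candidate for the Target FQDNs list, although the firewall logs remain the authoritative record of what was actually denied.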

How is this addressed

  • Within the Firewall shared service firewall.tf file, add "azure.archive.ubuntu.com", "repo.almalinux.org" to the nexus-bootstrap application rule. This was done to ensure access during the deployment of Nexus.

  • Add package_update: true to the cloud-config.yaml file to update packages on deployment. Documentation

  • I have not updated the CHANGELOG.md as I'm not sure it's needed. Happy to add that if it is.

@github-actions github-actions bot added the external PR from an external contributor label Nov 18, 2022
@github-actions

github-actions bot commented Nov 18, 2022

Unit Test Results

0 tests   0 ✔️  0s ⏱️
0 suites  0 💤
0 files    0

Results for commit 620101a.

♻️ This comment has been updated with latest results.

@tamirkamara
Collaborator

tamirkamara commented Nov 18, 2022

Likely as the rule had not taken effect quick enough

What does it mean? I don't think we've seen this before nor if it's even possible. Other rules we create just before needing them work fine...
Anyway, not sure I like opening more things on the core subnets for a single app.

@jjgriff93
Collaborator

@JaimieWi thank you for putting this together :) did you try deploying with only the azure archive package added - i.e. without repo.almalinux.org? Curious whether that's actually needed

@jjgriff93
Collaborator

Likely as the rule had not taken effect quick enough

What does it mean? I don't think we've seen this before nor if it's even possible. Other rules we create just before needing them work fine... Anyway, not sure I like opening more things on the core subnets for a single app.

We have seen this before unfortunately (PR #2811 was supposed to add a workaround but, as Jaimie has found, it seems it was missing these two packages) - essentially the pipeline has to run main first to generate the required terraform outputs, and then run the firewall step. Because the cloud-init bootstrapping can vary in the time it takes, sometimes when that's kicked off, the firewall step has a chance to complete, and sometimes it doesn't.
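
To illustrate the timing issue described above, here is a hedged Python sketch of a bootstrap step that waits until a host stops being denied before it continues. This is not what this PR does (the PR adds the FQDN to the nexus-bootstrap rule instead). The function wait_until_allowed and its parameters are hypothetical, and probe stands for any caller-supplied check that returns an HTTP status code.

```python
import time
from typing import Callable


def wait_until_allowed(
    probe: Callable[[], int],
    attempts: int = 10,
    delay_seconds: float = 30.0,
    denied_status: int = 470,  # status seen in the logs of this PR
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it stops returning `denied_status`.

    Returns True as soon as a probe returns any other status, or False
    if every attempt was denied.
    """
    for attempt in range(1, attempts + 1):
        if probe() != denied_status:
            return True
        if attempt < attempts:
            # Give the firewall step more time before probing again.
            sleep(delay_seconds)
    return False


# Example with a fake probe: denied twice, then allowed.
responses = iter([470, 470, 200])
print(wait_until_allowed(lambda: next(responses), delay_seconds=0.0))  # True
```

A wait like this only masks the ordering problem; changing the pipeline so the firewall step no longer depends on running main first addresses the cause.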

For getting Nexus running reliably in the short term I don't think we've got much choice but to add these extra repositories to the core subnet whitelist. It is indeed a single app; however, most TRE installations require it as it's needed for Guacamole URs. I opened up an issue for the original bug to modify the pipeline logic so we can get the terraform outputs required by the firewall step without having to run main first (#2816). This is a bigger change but will allow us to remove the application rule once it's in.

@marrobi thoughts here?

@JaimieWi
Contributor Author

@JaimieWi thank you for putting this together :) did you try deploying with only the azure archive package added - i.e. without repo.almalinux.org? Curious whether that's actually needed

@jjgriff93 Looking at the firewall logs there is nothing to say repo.almalinux.org is required on deployment, there is no record of it. So I will take it out!

I had added it as I had seen it next to azure.archive.ubuntu in the previous deployment error logs but it seems it's not needed.

@jjgriff93
Collaborator

@JaimieWi thank you for putting this together :) did you try deploying with only the azure archive package added - i.e. without repo.almalinux.org? Curious whether that's actually needed

@jjgriff93 Looking at the firewall logs there is nothing to say repo.almalinux.org is required on deployment, there is no record of it. So I will take it out!

I had added it as I had seen it next to azure.archive.ubuntu in the previous deployment error logs but it seems it's not needed.

Perfect, thanks for checking @JaimieWi - in that case I don't see any additional security concern as the azure.archive.ubuntu.com package is already whitelisted in the resource processor subnet

@jjgriff93
Collaborator

/test

@github-actions

🤖 pr-bot 🤖

⚠️ When using /test on external PRs, the SHA of the checked commit must be specified

(in response to this comment from @jjgriff93)

@jjgriff93
Collaborator

/test 620101a

@github-actions

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/3497296534 (with refid 799017d2)

(in response to this comment from @jjgriff93)

@marrobi
Member

marrobi commented Nov 18, 2022

I'm good with this. As @jjgriff93 says, this was discussed previously; it is more a case of a rule being missed, and we have a "better" solution tracked.

@jjgriff93 jjgriff93 merged commit 06ced09 into microsoft:main Nov 18, 2022
@JaimieWi JaimieWi deleted the jw/nexus_update_cloud-config branch November 22, 2022 08:52
Development

Successfully merging this pull request may close these issues.

Nexus - curl: (7) Failed to connect - error
4 participants