Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated init script to allow metadata override #82

Merged
merged 6 commits into from
Jun 13, 2023

Conversation

liamfallon
Copy link
Member

This PR:

  • Allows the initialization variables to be overridden on invocation without using the get metadata, for example by using
    cat gce_init.sh | sudo NEPHIO_DEBUG=true NEPHIO_RUN_E2E=false bash
  • Creates the docker use before management cluster invocation to allow the kpt fn apply commands in the
    gce_install_sandbox.sh script to run on the nephio user.

@nephio-prow nephio-prow bot requested review from aravind254 and radoslawc June 12, 2023 14:16
@liamfallon
Copy link
Member Author

Comment on lines 50 to 53

if ! getent group docker > /dev/null; then
addgroup docker
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docker is installed by the Ansible playbooks, so technically at this point the docker group shouldn't exist

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but if you don't add it before calling the gce_install_sandbox.sh script using the nephio user, you can't add the nephio user to the docker group. In a shell, while you can add a user to a group, you can't execute anything using that new group's rights without spawning a new shell.

I know this is not a perfect way of doing this and we should restructure these scripts afterwards, but it does work and the Docker install works fine even if the docker user already exists.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that line 57 adds $NEPHIO_USER into the docker group, correct me if I'm wrong, but is this not doing the same?

BTW, I was planning to move 56-58 into the gce_install_sandbox.sh script.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried this:
sudo usermod -aG docker2 ubuntu
usermod: group 'docker2' does not exist

(used non-existant docker2 group as a test)
The group has to exist before it can be given to a user.

I tried moving lines 56-58 into the gce_install_sandbox.sh script but because of the requirement to restart the shell, docker or kpt fn render commands later in the script would not run as $NEPHIO_USER because a new shell has to be started to pick up the new group privileges

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should have one phase in the install process to install all the prerequisites (docker kpt kubectl etc etc) and then in another phase install the more nephio-specific stuff we need but I guess that'll be post R1 :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's done here before executing the bootstrap Ansible role. So my point is that before those instructions has been executed, docker group shouldn't exist

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @electrocucaracha I have reverted this.

@@ -22,8 +22,7 @@ function deploy_kpt_pkg {
local temp=$(mktemp -d -t kpt-XXXX)
local localpkg="$temp/$name"
kpt pkg get --for-deployment "https://github.com/nephio-project/nephio-example-packages.git/$pkg" "$localpkg"
# sudo because docker
sudo kpt fn render "$localpkg"
kpt fn render "$localpkg"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to achieve that is using newgrp

Suggested change
kpt fn render "$localpkg"
newgrp docker <<EONG
kpt fn render "$localpkg"
EONG

Copy link
Member Author

@liamfallon liamfallon Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, newgrp spawns a new shell and then the render happens in that shell with the new privileges?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But doesn't newgrp have to be run as sudo?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @electrocucaracha , I am currently working on refactoring the scripts a bit so will take the docker group issue after this change if that's ok.
Plan is to add an intermediary script to setup the platform and isolate and sudo calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @electrocucaracha I have reverted this and left in the sudo.

- 16 cores
- 32 GB memory
- 200 GB disk size
- default user is "ubuntu" and "ubuntu" has sudo rights
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- default user is "ubuntu" and "ubuntu" has sudo rights
- default user is "ubuntu" and "ubuntu" needs sudo passwordless permissions

@liamfallon
Copy link
Member Author

Just added the hacks/resolve_secrets.sh script to destroy and re-apply the kpt packages for mgmt and mgmt-cluster, this script works around the issue with secrets not being created by the gce_install_sandbox.sh script.

@radoslawc
Copy link
Collaborator

/approve
/lgtm

@nephio-prow
Copy link
Contributor

nephio-prow bot commented Jun 13, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liamfallon, radoslawc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nephio-prow nephio-prow bot added the approved label Jun 13, 2023
@nephio-prow nephio-prow bot merged commit 5e0ca32 into nephio-project:main Jun 13, 2023
@liamfallon liamfallon deleted the init-non-gce branch June 13, 2023 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants