Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#739 - Fixed commands used landingzonev2 readme #740

Merged

Conversation

jacyang2010
Copy link
Collaborator

@jacyang2010 jacyang2010 commented Nov 27, 2023

@jacyang2010 jacyang2010 self-assigned this Nov 27, 2023
@jacyang2010 jacyang2010 marked this pull request as ready for review November 27, 2023 22:56
@jacyang2010 jacyang2010 added the documentation Improvements or additions to documentation label Nov 27, 2023
@jacyang2010 jacyang2010 linked an issue Nov 27, 2023 that may be closed by this pull request
@jacyang2010 jacyang2010 changed the title #739 - fixed commands used landingzonev2 readme #739 - Fix commands used landingzonev2 readme Nov 27, 2023
@obriensystems
Copy link
Collaborator

FYI, previous PR context where the delete work was spawned out of
#722

Alain, good point on the order of deletion, i rarely had a chance to delete the full lz lately - to sacrifice the cluster. normally i use kpt to recycle but in my last corruption of the lz running hub-env on top of the clz package had issues with remaining services - using kubectl describe was too late. I dont think i started with the config-controller ns

but i will retest on a clean org and update the docs and script with config-control last - good point

https://github.com/GoogleCloudPlatform/pubsec-declarative-toolkit/wiki/DevOps#delete-the-landing-zone

#593

Copy link
Contributor

@fmichaelobrien fmichaelobrien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jackson, for both of the sh fragments in the readme - they ideally should have before/after or cause/fix runtime testing on a running V2 LZ with at least the core-landing-zone package up.

see for example PR testing around a single line change to list the cc cluster in the this readme
#743

@jacyang2010 jacyang2010 force-pushed the gh739-fix-problematic-commands-from-landingzonev2-readme branch from 1ceecdb to 7320c4c Compare December 1, 2023 20:05
@jacyang2010 jacyang2010 changed the title #739 - Fix commands used landingzonev2 readme docs(core-landingzone-v2): fix commands used landingzonev2 readme Dec 4, 2023
@jacyang2010 jacyang2010 changed the title docs(core-landingzone-v2): fix commands used landingzonev2 readme docs(landing-zone-v2): fix commands used landingzonev2 readme Dec 4, 2023
@jacyang2010 jacyang2010 changed the title docs(landing-zone-v2): fix commands used landingzonev2 readme #739 - fixed commands used landingzonev2 readme Dec 4, 2023
@jacyang2010 jacyang2010 changed the title #739 - fixed commands used landingzonev2 readme #739 - Fixed commands used landingzonev2 readme Dec 4, 2023
@jacyang2010
Copy link
Collaborator Author

jacyang2010 commented Dec 4, 2023

The below shows the devtest results before and after fixing

Before Fixing

When ran the below command to clean up the RootSync resource, got the below error.

kubectl delete root-sync landing-zone -n config-management-system

The below screenshot shows the above root-sync resource is not even found.
image
Then, when ran the below command to clean up all gcp resource, it actually only removed all gcp resource from the current default namespace "config-control", as shown below.

kubectl delete gcp --all

The above command just removed all gcp resources from the default namespace config-control as shown below.
image
However, when ran the below command to show all gcp resources deployed to the current cluster, still see a lot of gcp resource not deleted yet.

kubectl get gcp -A
image

To conclude based on the above facts, the given command does not remove all gcp resources.

After Fixing

Run the below kpt command to destroy all deployed gcp resources of a given solution.
(If you have just made a deployment with solutions, you should still have the local bootstrapping folders for each of deployed solutions, otherwise, you should checkout those folders out from your git repository if you have pushed the bootstrapped solutions to a git repository.)

# kpt live destroy <solution_folder_path>
kpt live destroy core-landing-zone

You can see the above command is deleting a lot of gcp resources as shown below.
image
Once it completed, run the below command to get all gcp resources and you will see no any gcp resource found as shown below.

kubectl get gcp -A
image

To conclude based on the above devtest results, the above kpt approach works well as expected as for making a full cleanup.

@fmichaelobrien
Copy link
Contributor

Good point, I would stick to kpt - the readme just happens to have the lower kubectl delete - hence why I raised #593 Oct 22nd

kpt live destroy
is recommended over going lower in the stack with kubectl
The issue with kubectl is services are removed at the namespace level across packages. if you have core-landing-zone and hub-env deployed for example it would be better to go higher in the stack and kpt live destroy each in reverse sequence

This is what my issue id on deletion mentions to do in 593 - don't use kubectl delete
"Automation: deletion of the landing zone should include the 5 ns - policies, logging, networking, projects, hierarchy - or let the config controller handle deletion via kpt live destroy"
#593

kpt live destroy $REL_SUB_PACKAGE

Check the in-progress LZ automation script (I didn't have a problem with liens last time)
https://github.com/GoogleCloudPlatform/pubsec-declarative-toolkit/blob/gh446-hub/solutions/setup.sh#L488

if [[ "$REMOVE_LZ" != false ]]; then
  echo "deleting lz on ${CLUSTER} in region ${REGION}"
  #kubectl get gcp
  # stay in current dir
  # will take up to 15-45 min and may hang unless liens are removed
  # 3 problematic projects
  #gcloud config set project audit-prj-id-oldv1
  #AUDIT_LIEN=$(gcloud alpha resource-manager liens list)
  #gcloud alpha resource-manager liens delete $AUDIT_LIEN

  #gcloud config set project net-host-prj-prod-oldv1
  #PROD_LIEN=$(gcloud alpha resource-manager liens list)
  #gcloud alpha resource-manager liens delete $PROD_LIEN

  #gcloud config set project net-host-prj-nonprod-oldv1
  #NONPROD_LIEN=$(gcloud alpha resource-manager liens list)
  #gcloud alpha resource-manager liens delete $NONPROD_LIEN

  echo "moving to folder ../../../$KPT_FOLDER_NAME"
  cd ../../../kpt
  #cd $KPT_FOLDER_NAME

  REL_SUB_PACKAGE="core-landing-zone"
  echo "deleting REL_SUB_PACKAGE: $REL_SUB_PACKAGE"
  kpt live destroy $REL_SUB_PACKAGE
  # all packages delete
  #kubectl delete gcp --all

This is what the faq mentions
https://github.com/GoogleCloudPlatform/pubsec-declarative-toolkit/wiki/DevOps#scenarios

kpt live destroy core-landing-zone

I added 593 in a comment above 5 days ago
#740 (comment)

Copy link
Contributor

@fmichaelobrien fmichaelobrien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jackson, very nice reproduction and options for adjustment around either recommended kpt live destroy - or reverse aligned lower level kubectl delete - in sequence.

When the patch is finalized I will approve your changes

@jacyang2010 jacyang2010 force-pushed the gh739-fix-problematic-commands-from-landingzonev2-readme branch 2 times, most recently from 6bbbad9 to a28267e Compare December 5, 2023 20:42
@jacyang2010
Copy link
Collaborator Author

jacyang2010 commented Dec 5, 2023

Jackson, very nice reproduction and options for adjustment around either recommended kpt live destroy - or reverse aligned lower level kubectl delete - in sequence.

When the patch is finalized I will approve your changes

Resolved with only keeping the ktp approach as suggested by you, and please continue to review it. @fmichaelobrien

Copy link
Contributor

@fmichaelobrien fmichaelobrien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good - just the anthos keyword back and the pr is ready

@jacyang2010 jacyang2010 force-pushed the gh739-fix-problematic-commands-from-landingzonev2-readme branch from a28267e to 4f5bd58 Compare December 6, 2023 03:09
@jacyang2010
Copy link
Collaborator Author

All good - just the anthos keyword back and the pr is ready

Resolved.

Copy link
Contributor

@fmichaelobrien fmichaelobrien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@jacyang2010
Copy link
Collaborator Author

@stanimprover do you mind review and approve this PR?

Copy link
Collaborator

@stanimprover stanimprover left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I thought we would leave the "kubectl delete gcp -A" in its place incase of any underlying resources that need to be clean up. Overall looks good and I would add on the comment above the "gloud anthos config controller delete $CLUSTER --location $REGION", CLUSTER NAME without the [krmapihost-]. Thanks

@jacyang2010
Copy link
Collaborator Author

jacyang2010 commented Dec 7, 2023

Looks good. I thought we would leave the "kubectl delete gcp -A" in its place incase of any underlying resources that need to be clean up. Overall looks good and I would add on the comment above the "gloud anthos config controller delete $CLUSTER --location $REGION", CLUSTER NAME without the [krmapihost-]. Thanks

Hey @stanimprover ,

As for the suggested kubectl-based low level deletion approach, we have made a consent to no provide such a way, not even mention the command "kubectl delete gcp -A" is invalid as shown below.

jackson_yang@cloudshell:~/workspace/company-pbmm-landingzone (single-kcc-yjs06)$ kubectl delete gcp -A
error: resource(s) were provided, but no name was specified

As for the prompt about cluster name prefix, when you run the list command, you will have the below result.

jackson_yang@cloudshell:~/workspace/company-pbmm-landingzone (single-kcc-yjs06)$ gcloud anthos config controller list
NAME: single-kcc-clz-yjs06
LOCATION: northamerica-northeast1
STATE: RUNNING

You can see there is NO any ambiguity on cluster name.

@stanimprover
Copy link
Collaborator

looks good.

@obriensystems
Copy link
Collaborator

guys, lets check permissions tomorrow and verify that review +1s are avaulable, so far inly one +1 is in.

we should be able to fix this so you can review each others PRs

Copy link
Collaborator

@stanimprover stanimprover left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Copy link
Collaborator

@craigenator craigenator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@fmichaelobrien fmichaelobrien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@fmichaelobrien
Copy link
Contributor

Merge #738 or #739 in sequence - check merge conflict

@davelanglois-ssc
Copy link
Collaborator

lgtm....good job

@tackaberry tackaberry merged commit ec1d839 into main Jan 22, 2024
2 checks passed
@tackaberry tackaberry deleted the gh739-fix-problematic-commands-from-landingzonev2-readme branch January 22, 2024 19:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
delete documentation Improvements or additions to documentation scrum
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix problematic commands used in the landing zone v2 readme doc
8 participants