
Add helm support #135

Merged
merged 42 commits into apache:main on Sep 23, 2024

Conversation

MonkeyCanCode
Contributor

Description

This PR adds Helm support for deploying Polaris. It is based on work from https://github.com/projectnessie/nessie/tree/main/helm/nessie.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

This has been tested locally using the installation instructions under helm/polaris/README.md.

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • If adding new functionality, I have discussed my implementation with the community using the linked GitHub issue
  • I have signed and submitted the ICLA and if needed, the CCLA. See Contributing for details.

@MonkeyCanCode MonkeyCanCode requested a review from a team as a code owner August 10, 2024 01:55
@MonkeyCanCode
Contributor Author

MonkeyCanCode commented Aug 10, 2024

@snazy / @flyrain can you guys take a look at this PR for adding helm support?

This is based off work from https://github.com/projectnessie/nessie/tree/main/helm/nessie.

Here are a couple of items for which I would like further clarification and suggestions:

  1. Instead of using two k8s services as the current Nessie setup does, I propose using only one k8s service. Nessie uses a headless service for the management port. However, since the pods will be deployed with non-consistent naming (due to k8s deployment), a headless service may not provide significant benefits. There might be a specific purpose for it, so I’d appreciate feedback on this.

  2. The current bootstrapped Kind cluster does not support ingress, but this chart does add that support. To allow end-users to use ingress, I will need to update the kind-registry.sh script for the Kind cluster definition. Should I proceed with this update?

  3. Polaris supports a set of configuration parameters, which I have currently grouped under the polaris_config section in values.yaml. While this organization might look a bit messy, it makes it easier for end-users to copy and manage their local configurations. What do you think? Are there better ways to organize the configuration for the Polaris service?

  4. Currently, the image provides only one endpoint for health checks, and the image itself does not include a JSON parser. Therefore, I am using the same endpoint for both livenessProbe and readinessProbe. Ideally, we should split this endpoint into two distinct endpoints (or create a wrapper endpoint to parse the response) and use them for the respective probes accordingly. (A probe sketch follows this list.)

  5. If we decide to proceed with Helm support, we may need a separate page or section (e.g., Providing Secrets for Cassandra). Currently, this is not included in this pull request.

  6. As persistence.xml can only be loaded from a separate jar (if not using the built-in one from the resources dir), should we create an init container to pack the config jar onto a PV and then share the PV with the deployment?
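
To make item 4 concrete, here is a minimal sketch of what reusing a single health endpoint for both probes could look like in the deployment template. The `/healthcheck` path, the admin port number, and the timings are illustrative assumptions, not values defined by this PR:

```yaml
# Both probes hit the same endpoint, since the image currently exposes only one
# health-check endpoint and ships no JSON parser to distinguish the two cases.
livenessProbe:
  httpGet:
    path: /healthcheck   # hypothetical path
    port: 8182           # hypothetical admin port
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /healthcheck   # same endpoint reused until a dedicated readiness endpoint exists
    port: 8182
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
```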

@flyrain
Contributor

flyrain commented Aug 10, 2024

Thanks @MonkeyCanCode for working on it. I haven't had a chance to go through it yet, but we need the license header in each new file.


@MonkeyCanCode
Contributor Author

> license header

All good. Added license header.

It is more common to use a LICENSE file in a Helm chart repo. For now, I added the license header to each of the files in this PR. Let me know if we should revert and use a LICENSE file instead.

@MonkeyCanCode MonkeyCanCode marked this pull request as draft August 13, 2024 03:52
@MonkeyCanCode MonkeyCanCode marked this pull request as ready for review August 13, 2024 05:31
@MonkeyCanCode
Contributor Author

Made a couple more changes and addressed a couple of the places where I had questions. Here is the updated list of things I would like clarification/suggestions on (along with a couple of solutions I implemented):

  1. Instead of using two k8s services as the current Nessie setup does, I propose using only one k8s service. Nessie uses a headless service for the management port. However, since the pods will be deployed with non-consistent naming (due to k8s deployment), a headless service may not provide significant benefits. There might be a specific purpose for it, so I’d appreciate feedback on this.

  2. The current bootstrapped Kind cluster does not support ingress, but this chart does add that support. To allow end-users to use ingress, I will need to update the kind-registry.sh script for the Kind cluster definition. Should I proceed with this update?

  3. Polaris supports a set of configuration parameters, which I have currently grouped under the polaris_config section in values.yaml. While this organization might look a bit messy, it makes it easier for end-users to copy and manage their local configurations. What do you think? Are there better ways to organize the configuration for the Polaris service?

For this one, I ended up using `persistenceConfigSecret`, `polarisServerConfig`, and `extraEnv`. Here is the reasoning (a values.yaml sketch follows this list):
- `persistenceConfigSecret` is used for `persistence.xml`, which can contain sensitive info such as passwords. It is therefore not part of the Helm chart itself but depends on an externally created k8s secret.
- `polarisServerConfig` is used for `polaris-server.yml`, where no sensitive info is kept. An end-user can quickly update their server config as needed by editing `values.yaml` for their Helm chart.
- `extraEnv` is used for handling credentials such as AWS secrets. It can populate envs from external secrets (e.g. an end-user can create an external secret and reference it as envs through this); by doing so, `values.yaml` stays clean and easy to check in.
  4. Currently, the image provides only one endpoint for health checks, and the image itself does not include a JSON parser. Therefore, I am using the same endpoint for both livenessProbe and readinessProbe. Ideally, we should split this endpoint into two distinct endpoints (or create a wrapper endpoint to parse the response) and use them for the respective probes accordingly.

  5. If we decide to proceed with Helm support, we may need a separate page or section (e.g., Providing Secrets for Cassandra). Currently, this is not included in this pull request.

  6. License header for the Helm chart

The standard way for a Helm chart is to use a single file named `LICENSE` in the root dir of the chart. Let me know if we should still add the license header to all files.
  7. As persistence.xml can only be loaded from a separate jar (if not using the built-in one from the resources dir), should we create an init container to pack the config jar onto a PV and then share the PV with the deployment?
This part is taken care of by item 3 together with an init container. The init container takes care of the config jar creation. The only downside is that there is currently no safeguard for the bootstrap operation, and a rerun of bootstrap can wipe the backend. This will be taken care of by https://github.com/polaris-catalog/polaris/pull/59. For now, I added the following to ensure it only runs once (unless an end-user removes the Helm release and reinstalls it):
    "helm.sh/hook": post-install
    "helm.sh/hook-delete-policy": hook-succeeded

@flyrain
Contributor

flyrain commented Aug 13, 2024

The change looks good to me overall, but I don't have much experience with Helm. Would like others to take a look. cc @snazy @dimas-b @collado-mike @aihuaxu @RussellSpitzer

Contributor

@dimas-b dimas-b left a comment


LGTM overall 👍 I have not had a chance to try it in practice yet.. will try to do that soonish... hopefully tomorrow.

@dimas-b
Contributor

dimas-b commented Aug 15, 2024

@adutra : your feedback would be very welcome too, when you get a chance. thx!

Contributor

@adutra adutra left a comment


Hi, sorry for the late review as I was out on vacation :-)
Looks mostly good to me, just a few minor remarks.

selector:
{{- include "polaris.selectorLabels" . | nindent 4 }}
ports:
{{- range $portName, $portNumber := .Values.service.ports }}
Contributor


Could be good to separate the metrics port from the main API port into 2 distinct services. But that can be done later.

Contributor Author


This would definitely be beneficial, e.g. for network policies and traffic management. However, I am not seeing many Helm deployments separate metrics from their main API (good practice for sure, but most are not doing it). I will be more than happy to do it, but here are a couple of things I would like feedback on:

  1. Should we put ingress on top of the metrics endpoint? (I am assuming no here.)
  2. Should we support different service settings for different endpoints (e.g. sessionAffinity, service type, etc.)?

Contributor


> Should we put ingress on top of the metrics endpoint? (I am assuming no here.)

No, and that's the main reason why in Nessie we split the services in two. You typically want to expose the main port, but not the admin/mgmt one. So if you need to put an ingress in front of it, or turn the service into a LoadBalancer or NodePort type, the change would affect both ports and potentially expose the mgmt port as well.

> Should we support different service settings for different endpoints

Yes, you could have e.g. different session affinities, but more importantly, you can turn the mgmt service into a headless one.

Finally, you can also set publishNotReadyAddresses: true. This allows pods to be addressable even if they are not ready, which allows health checks to provide more insights as to why the pod is not healthy.

But I agree to look into this later on.
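
As a rough sketch of the split being discussed, the chart could render two Service objects, with the management one kept headless and cluster-internal; the names, ports, and labels below are illustrative only, not what the chart currently defines:

```yaml
# Main service: can be fronted by an ingress or switched to LoadBalancer/NodePort
# without ever exposing the management port.
apiVersion: v1
kind: Service
metadata:
  name: polaris
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/name: polaris
  ports:
    - name: polaris-service
      port: 8181
---
# Management service: headless, cluster-internal only, and addressable even when
# pods are not ready, so health checks can still be inspected.
apiVersion: v1
kind: Service
metadata:
  name: polaris-mgmt
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  selector:
    app.kubernetes.io/name: polaris
  ports:
    - name: polaris-mgmt
      port: 8182
```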

Contributor Author


Thanks. I can put those in the next PR after the current one is merged.


@MonkeyCanCode
Contributor Author

@adutra thanks for taking a look, and no worries. I have addressed most of the issues you listed above. I will run some final tests later tonight and then push out those changes. For the ones not yet addressed, I have left a comment for additional discussion/clarification.

@adutra all comments have been addressed and tested locally. Please take another look when you get a chance.

@MonkeyCanCode MonkeyCanCode requested a review from adutra August 29, 2024 01:43
Contributor

@adutra adutra left a comment


LGTM, thanks @MonkeyCanCode !

@dimas-b
Contributor

dimas-b commented Aug 29, 2024

Sorry about the delay from my side... I can have another look next week, but please feel free to merge based on @adutra 's review :)

Contributor

@dimas-b dimas-b left a comment


I tested this helm chart with "in memory" mode and it worked well.

I did not test the EclipseLink secret. TBH, I do not think it is very user-friendly to put the whole XML file with persistence config into a k8s secret, but I think it is fine to have it like this for a start.

Thanks for your work on this @MonkeyCanCode !

@MonkeyCanCode
Contributor Author

> I tested this helm chart with "in memory" mode and it worked well.
>
> I did not test the EclipseLink secret. TBH, I do not think it is very user-friendly to put the whole XML file with persistence config into a k8s secret, but I think it is fine to have it like this for a start.
>
> Thanks for your work on this @MonkeyCanCode !

Yeah, I agree that is not the best. We can improve it once we have a better handle on EclipseLink and have built tooling around it to support this.
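
For anyone trying this mode, here is a minimal sketch of the externally created Secret that the chart would reference via `persistenceConfigSecret`; the secret name and the persistence.xml contents are purely illustrative, not part of this PR:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: polaris-persistence-secret   # hypothetical name; must match persistenceConfigSecret
type: Opaque
stringData:
  persistence.xml: |
    <?xml version="1.0" encoding="UTF-8"?>
    <persistence xmlns="http://xmlns.jcp.org/xml/ns/persistence" version="2.2">
      <persistence-unit name="polaris" transaction-type="RESOURCE_LOCAL">
        <!-- JDBC URL, user, and password would go here, which is why the whole
             file lives in a Secret rather than in values.yaml. -->
      </persistence-unit>
    </persistence>
```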

@dimas-b
Contributor

dimas-b commented Sep 3, 2024

@snazy @RussellSpitzer : pinging for review/merge, please :)

@adutra
Contributor

adutra commented Sep 5, 2024

@snazy @RussellSpitzer : can you please review and merge 🙏 ?

@MonkeyCanCode
Contributor Author

Anything else needed on this PR?

@flyrain
Contributor

flyrain commented Sep 10, 2024

@MonkeyCanCode, thank you so much for your work on this. Apologies for the delay; we're currently limited by the number of committers. We'll resolve this issue shortly.

@MonkeyCanCode
Contributor Author

> @MonkeyCanCode, thank you so much for your work on this. Apologies for the delay; we're currently limited by the number of committers. We'll resolve this issue shortly.

All good, thanks for the confirmation. I was planning to roll out more enhancements for this, so I didn't want to add them on top of this PR to avoid increasing its scope.

RussellSpitzer previously approved these changes Sep 23, 2024
@RussellSpitzer RussellSpitzer enabled auto-merge (squash) September 23, 2024 20:48
@RussellSpitzer
Member

Thanks @MonkeyCanCode and @adutra, @dimas-b, @flyrain, @cgpoh and @mrcnc for your reviews!

@RussellSpitzer RussellSpitzer merged commit 3d0cf92 into apache:main Sep 23, 2024
3 checks passed
@MonkeyCanCode MonkeyCanCode deleted the helm branch December 12, 2024 04:57