-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SNOW-89] Update CI with service users #157
Conversation
I will squash before merging. |
run: | | ||
pip install schemachange==3.6.1 | ||
pip install numpy==1.26.4 | ||
pip install pandas==1.5.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll need to add these back - these version locks were introduced because of failures after merging to dev
: #65
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CI at that point in time used schemachange==3.6.1
, whereas I upgraded us to the most recent schemachange==4.0.1
, so I'm not sure that these exact dependency versions still work (they are sub-dependencies so it's not obvious from looking at schemachange's requirements.txt).
Do you know why pinning the versions was necessary in the first place? The ticket itself was sparse on details, but I think if schemachange has their requirements.txt in good form then we don't need to worry about pinning the dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know why pinning the versions was necessary in the first place?
Not the specifics - there was just some incompatibility between that version of schemachange and the latest versions of numPy/pandas at the time. Typical dependency hell stuff. It makes me think we should keep locking these dependencies so we don't run into this issue again.
edit: If we move forward with schemachange==4.0.1
and not lock anything else, we'll run the risk of job failures. Unless there's a particular reason for upgrading schemachange
, can we stick with the version and dependencies that work for now, and update schemachange
in a separate ticket so we have more time to experiment? We can also introduce a lock file of sorts for our dependencies in that PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaymedina You can see a successful run of schemachange within a test workflow that uses the new action here. It mirrors how we run schemachange in this PR, except it points at an "empty" analytics folder and my clone database so that it doesn't accidentally deploy anything to dev/prod.
Is there a specific change since 3.6.1 which you think we should be worried about? Downgrading at this point would be a big pain since connections.toml wasn't supported until v4.0.0. I would need to completely rewrite the way auth is done to work specifically with schemachange (v3.6.1 only supports passing connection parameters via the schemachange-config.yml file or command-line parameters). With support for connections.toml, we have a uniform way to configure authentication across all of our clients: schemachange, Snowflake CLI, and even the python connector if we ever want to use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Downgrading at this point would be a big pain since connections.toml wasn't supported until v4.0.0.
Ah, this explains the bump up. I still think we should eventually have a lock file to version lock some of these dependencies. Since that issue happened with numPy/Pandas, there's no guarantee it won't happen again unless we lock those versions. How about we lock the dependencies after this merges into main
and we're sure the current versions of everything work with no issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numpy and pandas are not dependencies of v4.0.1. Here's the pip freeze from the github runner:
asn1crypto==1.5.1
certifi==2025.1.31
cffi==1.17.1
charset-normalizer==3.4.1
cryptography==44.0.2
filelock==3.[17](https://github.com/Sage-Bionetworks/snowflake/actions/runs/13684265553/job/38263809921#step:5:18).0
idna==3.10
Jinja2==3.1.5
MarkupSafe==3.0.2
packaging==24.2
platformdirs==4.3.6
pycparser==2.22
PyJWT==2.10.1
pyOpenSSL==25.0.0
pytz==[20](https://github.com/Sage-Bionetworks/snowflake/actions/runs/13684265553/job/38263809921#step:5:21)25.1
PyYAML==6.0.2
requests==2.32.3
schemachange==4.0.1
snowflake-connector-python==3.14.0
sortedcontainers==2.4.0
structlog==[24](https://github.com/Sage-Bionetworks/snowflake/actions/runs/13684265553/job/38263809921#step:5:25).1.0
tomlkit==0.13.2
typing_extensions==4.12.2
urllib3==2.3.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm that's strange. Anyway, numPy and Pandas aren't my only concern, my concern is running into dependency issues in the future. What are your thoughts on this?
How about we lock the dependencies after this merges into main and we're sure the current versions of everything work with no issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think this is because pandas was removed as a dependency in v3.7.0.
my concern is running into dependency issues in the future. What are your thoughts on this?
From my understanding, dependencies fall into two general categories:
- Package (Python) dependencies
- System dependencies
The most common issue I've run into in the past is with system dependencies. What usually happens is that the VM's OS updates one of the system libraries, introducing a breaking change into a package dependency that wasn't there previously. Besides pinning the OS, e.g., using ubuntu-24.04 rather than ubuntu-latest, there's not much we can do to avoid these types of breaking changes.
A slightly less common issue is with broken package dependencies. This tends to happen when package maintainers incorrectly configure their requirements.txt so that it's possible to install dependency versions which aren't compatible with the package. A typical workaround for this would be to manually override the allegedly compatible package versions with versions which we know will work. I believe that this is what you did for #65.
There was an issue opened in the schemachange repo just before we began running into issues with the incompatible numpy dependency version. But had the package maintainers configured the requirements.txt correctly (e.g., by specifying a numpy version range) then this issue never would have happened. In other words, this was an upstream issue that we found a workaround for.
Proactively pinning the versions of all dependencies (and all their subdependencies and their sub-subdependencies and...?) is not, I think, the best way to go about package management in non-mission-critical situations. It's better to lean on the dependency management system that the Python developers have designed (e.g., PEP 508) and exercise good-faith-until-proven-otherwise that the maintainers of schemachange have thoroughly tested their package in the most common environments.
82bea53
to
20e815f
Compare
This is great, and I'm assuming you'd want this merged into |
.github/workflows/ci.yaml
Outdated
PRIVATE_KEY: ${{ secrets.ADMIN_SERVICE_PRIVATE_KEY }} | ||
ACCOUNT: ${{ secrets.SNOWSQL_ACCOUNT }} | ||
USER: ${{ vars.ADMIN_SERVICE_USER }} | ||
WAREHOUSE: ${{ secrets.SNOWSQL_WAREHOUSE }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why the warehouse name has to be a GitHub secret?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't, that's just the way Tom set it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably replace this with either nothing (remove it, now that we have a default warehouse configured as part of the configure-snowflake-cli action) or specify the warehouse explicitly. That would get rid of the unnecessary mystique.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the argument in d8cbffd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is excellent, thanks for the refactor! These are all my comments from an initial pass through. Most importantly I had comments on how this refactor will impact #125.
Not necessarily. I opened this PR into main, actually, since normally workflow changes only take effect after being merged into main. But after fact-checking myself I'm finding out that I probably only need to merge into the default branch, which currently is configured to be the dev branch. I may need to rebase.
Whichever is merged first, there will need to be a merge conflict resolution done for whichever is merged second. But the existing CI process will still work: the work you're doing for SNOW-90 doesn't need to use the new service accounts or Snowflake configuration action right away. |
This would add tech debt, but maybe that's not so bad for |
Here is the new ticket for refactoring CI/CD with this new action. |
d8cbffd
to
1c39eb3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 🔒
Just wanted to close the loop on one last thread, but this looks ready for merge.
1c39eb3
to
3354937
Compare
|
This PR contains a significant refactor, although a small number of actual changes to how our CI works.
.github/actions/configure-snowflake-cli/action.yml
config.toml
andconnections.toml
files to the~/.snowflake/
directory..github/workflows/ci.yaml
ADMIN_SERVICE
user.admin/integrations.sql