Template for packaging spinalcordtoolbox datasets in pip. #1
base: trunk
Conversation
I'm wondering if we should address all of the existing TODOs/comments before making any new repos with this. 🤔
yay!
So that last commit is definitely kind of sinful. But I think that we should be doing /something/ to validate changes to the template, lest we change it in ways that break other things.
On Jul 30, 2021, at 14:36, Joshua Newton ***@***.***> wrote:
+ perl -pi -e 's/\${dataset_name}/sample-dataset/' setup.py
IIUC, this is a find and replace?
That’s right :).
Just out of curiosity, why perl -pi -e? (I tried googling and found this SO answer, but I still don't entirely understand the purpose.)
Largely to avoid differences between the BSD/OSX sed and the GNU sed. It could just as well be `sed -i 's/[…]/'`, and I'm happy to change it to that, but that doesn't play well with the OSX sed, and I wanted to (lazily) test it out before pushing.
Taowa
Answer to TODO: No, steps cannot be parallelized. However, *jobs* can run in parallel. But, to do so would require uploading files as artifacts, then spinning up entirely new VM instances, then downloading the artifacts. And, at that point, the setup of these VMs would probably take longer and use more resources than just running the steps sequentially.
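For reference, splitting into parallel *jobs* would look roughly like the sketch below (job names are hypothetical; assumes the standard actions/upload-artifact and actions/download-artifact actions), which illustrates the extra artifact shuffling and second VM described above:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: python -m build --wheel --sdist
      - uses: actions/upload-artifact@v2
        with:
          name: dist
          path: dist/
  publish:
    needs: build          # each job spins up a fresh VM instance
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v2
        with:
          name: dist
          path: dist/
```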
The existing values were copied over from ivadomed. Here, in SCT, I've added a temporary new token (until we can sort out NeuroPoly's PyPI accounts).
I think this is meant to specify the target branch (i.e. `trunk`), not the branch used for the PR.
.github/workflows/release.yml
Outdated
```yaml
      - name: Build
        run: |
          python -m build --wheel --sdist
      - name: Publish to Github
        uses: softprops/action-gh-release@v1
        with:
          files: 'dist/*'
          fail_on_unmatched_files: true
          tag_name: 2.0.3 # ${{ github.event.inputs.version }} # in the workflow_dispatch case, make a new tag from the given input; in the published release case, this will be empty and will fall back to updating that release.
```
Currently, in our setup.py we've set the option `use_scm_version=True`, which will use the most recent git tag as the package version number.

There's a subtle bug here, though: for the `workflow_dispatch` case, we create a tag in the Publish step. But the Publish step happens after the Build step, so the chosen tag won't be present for `python -m build` to pick up.

I think this means we have to try creating a tag prior to the build step?
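A quick way to convince ourselves of the ordering problem in a throwaway repo (just a sketch, not the actual workflow): `git describe` only reports tags that already exist, so anything tag-based that runs during Build can't see a tag created later by Publish.

```shell
cd "$(mktemp -d)" && git init -q .
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "initial commit"

# What the Build step sees today: no tag exists yet.
git describe --tags 2>/dev/null || echo "no tag yet"

# The Publish step tags afterwards -- too late for the already-built wheel.
git tag 2.0.3
git describe --tags   # now prints: 2.0.3
```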
Specifically, this manifests in this PR as:

```
ERROR HTTPError: 400 Bad Request from https://test.pypi.org/legacy/
'0.1.dev1+g169fe48.d20220603' is an invalid value for Version. Error:
Can't use PEP 440 local versions. See
https://packaging.python.org/specifications/core-metadata for more
information.
```

The package is being built with `0.1.dev1`, when we want it to be built using the dummy example tag name I chose (`2.0.3`).
Ahh! Fascinating! `use_scm_version=True` does not do what I thought it did (CI run):

```
ERROR HTTPError: 400 Bad Request from https://test.pypi.org/legacy/
'2.0.4.dev0+g5929d76.d20220603' is an invalid value for Version. Error:
Can't use PEP 440 local versions. See
https://packaging.python.org/specifications/core-metadata for more
information.
```

Setting the tag `2.0.3` prior to release will actually auto-generate a package with version number `2.0.4.dev0`!

My best guess is that `use_scm_version=True` is intended for the workflow described in the README.md of the pypi-publish action:

> For example, you could implement a parallel workflow that pushes every commit to TestPyPI or your own index server, like devpi. For this, you'd need to (1) specify a custom repository_url value and (2) generate a unique version number for each upload so that they'd not create a conflict. The latter is possible if you use setuptools_scm package but you could also invent your own solution based on the distance to the latest tagged commit.

In other words, `setuptools_scm` is meant for rapidly generating unique version numbers during development (e.g. `dev0` -> `dev1` -> `dev2`), rather than generating the version number for stable releases.

This means that we should probably stop using `setuptools_scm` for building stable releases! (Though we could optionally continue to use it to automatically build test/dev releases, but that's not really what `workflow_dispatch` is for.)
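The rejection itself is mechanical: PEP 440 calls everything after a `+` a "local version" segment, and PyPI/TestPyPI refuse uploads that have one. A tiny sketch of the distinction (version strings taken from the error logs):

```shell
# Anything containing '+' is a PEP 440 local version, which the index rejects.
for v in '0.1.dev1+g169fe48.d20220603' '2.0.4.dev0+g5929d76.d20220603' '2.0.3'; do
  case "$v" in
    *+*) echo "$v -> rejected (PEP 440 local version)";;
    *)   echo "$v -> accepted";;
  esac
done
```

If we did keep setuptools_scm for dev uploads, its `local_scheme="no-local-version"` option is meant to strip that `+g…` suffix, though I haven't tested it here.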
Right! So, performing a find and replace on the `version="dev"` string (7394563) gets us our desired result:
That doesn't make sense to me. I've used it lots of times and it's always behaved. Take a look at https://github.com/ufoym/imbalanced-dataset-sampler/actions/runs/2371746877 to see for yourself.

If you want to do a test deploy, you should just tag it 'v1.0.3rc0' or 'v1.0.3rc1' -- that'll get tagged as a test build too -- particularly we should include this line
https://github.com/ufoym/imbalanced-dataset-sampler/actions/runs/2371746877

I'm on my phone atm. I'll need some time to look deeper into why setuptools-scm isn't working. But I really think we should use it; it really improves the workflow.
We just need to ask people to manually fill in the name.
Maybe we can write some awesome Actions script that edits setup.py to sync it with the repo's name on every commit. That would be cool, and in line with using git as the single source of truth, but maybe a bit Extra, y'know? I think doing this step manually is good enough, because how many times are we really going to be making a new dataset (vs. making a new release of that dataset)?
Oh, of course. I missed the fact that editing setup.py means that the git repo is no longer in a clean state. Thanks for catching that. 🤦
> We just need to ask people to manually fill in the name.
We already do! The `perl` edit step is only relevant for the data-template repo, and I only added it because otherwise the build on data-template (i.e. this PR) would fail due to (I think) an invalid repo URL in the metadata, hehe.

We could just straight-up remove the `perl` step and allow the builds to fail on data-template? Since the problem goes away the moment people actually use the template and then follow the README.
From @kousu:
Ah good. I was pretty sure we already did...
I tried out the template using PAM50, and all the steps went well: https://github.com/joshuacwnewton/data-PAM50/releases/tag/2022.6.6
I then tried installing the wheel and it behaved as expected.
So, as soon as I add back `setuptools_scm`, this (maybe possibly) is almost ready for merging?
Force-pushed from 9439d5c to 7394563.
* Stop repeating version number
* Use `||` to make fallback case more explicit
Force-pushed from a72262e to 804b510.
It's more straightforward to just have a single "correct" way of doing things, and the release UI has a lot of other nice usage benefits, too (e.g. checking if you're duplicating a tag name).
The `startsWith(github.ref, 'refs/tags/')` pattern is recommended by:
* https://github.com/softprops/action-gh-release
* https://github.com/pypa/gh-action-pypi-publish
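Concretely, the recommended condition gates the publish step on the triggering ref being a tag, along the lines of the fragment below (the step name and action version here are illustrative):

```yaml
      - name: Publish to PyPI
        # Only run when the triggering ref is a tag push (e.g. refs/tags/2.0.3)
        if: startsWith(github.ref, 'refs/tags/')
        uses: pypa/gh-action-pypi-publish@release/v1
```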
I just want to see how pip will use date-based versions! GitHub Actions, pls stop being so finicky.