Release CI once again failed to build windows executables #5285

Open
2 of 6 tasks
fingolfin opened this issue Dec 18, 2022 · 5 comments
Labels
os: windows Issues and PRs that are (at least partially) specific to Windows topic: ci Anything related to GitHub Actions, Codecov, AppVeyor, Coveralls, Travis, ...

@fingolfin
Member

fingolfin commented Dec 18, 2022

... and this time I only noticed after sending the release announcement (sigh).

This is essentially a variant of #5011

The failed job: https://github.com/gap-system/gap/actions/runs/3722840754/jobs/6314090227

Immediate cause reported at gap-packages/agt#15

This is really annoying, and hints at multiple problems:

  1. of course I should have noticed the missing windows binaries, I screwed up :-(
  2. but we know that humans (well, me, but I think I am not an exception) tend to miss such things, so really our automation should have noticed -- e.g. the scripts updating the website could have bailed out and refused the output based on the absence of those files
  3. the CI tests for the PackageDistro perhaps could have also discovered this problem, by checking for broken symlinks (or perhaps really any kind of symlinks, as those seem to be a repeated source of issues on Windows)
  4. we lack a good mechanism to "heal" such issues: after all, it is trivial to work around the issue (just delete the offending symlink). But in practice I don't see any good way to achieve this without releasing 4.12.3: I can't just insert this rm PATH invocation into the workflow. And I also can't just re-tag, as that would quite likely produce new tarballs with differing shasums.
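
The symlink check suggested in point 3 could look roughly like the following. This is only a sketch, not what PackageDistro or ReleaseTools actually do: the function name `find_symlinks` is made up for illustration, and it assumes the package has already been unpacked into a directory.

```python
import os
from pathlib import Path


def find_symlinks(root):
    """Return (all_symlinks, broken_symlinks) found under root, as sorted lists.

    Hypothetical helper: a CI check could fail if either list is non-empty,
    depending on whether all symlinks or only broken ones are forbidden.
    """
    all_links, broken = [], []
    # followlinks=False: symlinked directories are listed but not descended into
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                all_links.append(path)
                # exists() follows the link, so it is False for a dangling link
                if not path.exists():
                    broken.append(path)
    return sorted(all_links), sorted(broken)
```

A CI step could call this on the unpacked package directory and exit with an error, printing the offending paths, whenever the chosen list is non-empty.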

How to resolve this now?

I see these options:

  1. release GAP 4.12.3 with identical source code, with just an rm -f pkg/agt/doc/mathjax added (urgh, doesn't seem appealing)
  2. find a way to inject that rm into a re-run of the CI job (I see no way to do that, though I guess if we had tmate integration set up for that job, perhaps that would allow for it...)
  3. I could re-tag 4.12.2 after all, after first downloading all relevant tarballs; then, after all CI jobs have run, replace any of the new tarballs whose SHA256 changed with the previously downloaded ones
  4. someone (@ChrisJefferson perhaps) could perhaps build the GAP .exe "manually" and upload it to the release
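
For option 3, finding out which tarballs changed after a re-tag could be sketched as below. It assumes the original tarballs were saved into one local directory and the regenerated ones downloaded into another; the names `sha256_of` and `changed_tarballs` are hypothetical, not part of any existing release script.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_tarballs(old_dir, new_dir):
    """Return names of files present in both directories whose SHA256 differs."""
    changed = []
    for old in sorted(Path(old_dir).iterdir()):
        new = Path(new_dir) / old.name
        if old.is_file() and new.is_file() and sha256_of(old) != sha256_of(new):
            changed.append(old.name)
    return changed
```

The returned names are the files that would have to be replaced with the saved originals; files that exist on only one side are deliberately ignored here and would need a separate check.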

Steps to help avoid this kind of mistake in the future

  • teach the PackageDistro to reject package updates with broken symlinks (or perhaps even with any symlinks) -- done in Forbid symlinks in package tarballs PackageDistro#669
  • teach ReleaseTools to reject broken symlinks (or perhaps even any symlinks), see Reject broken symlinks or even all ReleaseTools#95
  • add a step in the dev/releases/README.md that explicitly reminds to check that all files are in the release (listing specifically what to look for and how many files there should be -- or just suggesting to "compare to the previous release") -- see PR Improve dev/releases/README.md #5287
  • teach the website update scripts to check for the presence of all tarballs and refuse to update if they are missing
  • ...
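
The presence check for the website update scripts could be as simple as the following sketch. The file name templates in the usage note are placeholders, not the real GAP asset names, and `missing_assets` / `require_complete_release` are hypothetical helpers; how the list of uploaded asset names is obtained is left open.

```python
def missing_assets(version, asset_names, expected_templates):
    """Return the expected file names that are absent from asset_names.

    expected_templates are format strings containing "{version}".
    """
    present = set(asset_names)
    expected = [t.format(version=version) for t in expected_templates]
    return [name for name in expected if name not in present]


def require_complete_release(version, asset_names, expected_templates):
    """Raise instead of letting a website update proceed with files missing."""
    missing = missing_assets(version, asset_names, expected_templates)
    if missing:
        raise RuntimeError("release %s is missing: %s" % (version, ", ".join(missing)))
```

For example, with placeholder templates `["gap-{version}.tar.gz", "gap-{version}.exe"]` and only the tarball uploaded, `require_complete_release` would raise and name the missing `.exe`.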

Steps to make recovery from such issues easier in the future

Well, hopefully this just won't happen again, by taking the steps above. But realistically, it will happen, just less often. Less often also means we'll have less experience dealing with these problems, so I think it makes sense to prepare for it.

  • add a section to dev/releases/README.md that discusses options for when something went wrong, including techniques for "healing" certain kinds of problems and warnings about things to watch out for -- e.g. for 4.12.0 I thought I was clever and downloaded a tarball, "fixed" it, then re-uploaded the result -- but I messed up the file access rights while doing so. Ouch.
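
To guard against the file access rights mishap described above, a repacked tarball could be compared against the original before re-uploading. A minimal sketch using Python's standard tarfile module; `mode_differences` is a hypothetical name, and it only looks at permission bits, not at ownership or timestamps.

```python
import tarfile


def mode_differences(original_tar, fixed_tar):
    """Map member name -> (old_mode, new_mode) for members whose mode changed.

    Only members present in both archives are compared.
    """
    def modes(path):
        with tarfile.open(path) as tf:
            return {member.name: member.mode for member in tf.getmembers()}

    before, after = modes(original_tar), modes(fixed_tar)
    return {
        name: (before[name], after[name])
        for name in before.keys() & after.keys()
        if before[name] != after[name]
    }
```

An empty result means the permission bits of all shared members survived the repacking.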
@fingolfin fingolfin changed the title Release Ci once again failed to build windows executables Release CI once again failed to build windows executables Dec 18, 2022
@fingolfin
Member Author

Ok one more idea: I can try to hack the release CI script in my fork of this repository and create a 4.12.2 tag there to produce the desired .exe files.

@fingolfin
Member Author

Also, another minor action to take against this kind of issue: let's add a step to dev/releases/README.md that suggests checking that all files are present.

Another could be to have a "check release" script or GH workflow that checks this (and also verifies shasums while at it).
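
The shasum part of such a "check release" script might look like this sketch. It assumes, purely for illustration, that every file in a local copy of the release has a sidecar file named `<file>.sha256` whose first whitespace-separated field is the hex digest; `verify_checksums` is a hypothetical name.

```python
import hashlib
from pathlib import Path


def verify_checksums(release_dir):
    """Return names of files that are missing or do not match their .sha256 sidecar."""
    bad = []
    for sidecar in sorted(Path(release_dir).glob("*.sha256")):
        target = sidecar.with_suffix("")  # "x.tar.gz.sha256" -> "x.tar.gz"
        fields = sidecar.read_text().split()
        expected = fields[0].lower() if fields else ""
        # read_bytes() loads the whole file into memory; fine for a sketch
        if not target.is_file() or hashlib.sha256(target.read_bytes()).hexdigest() != expected:
            bad.append(target.name)
    return bad
```

A workflow could fail whenever the returned list is non-empty.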

@ChrisJefferson
Contributor

ChrisJefferson commented Dec 19, 2022

I have added Windows 4.12.2 to the release. Obviously that doesn't fix the underlying problem, but it solves the short-term issue!

@fingolfin
Member Author

@ChrisJefferson thank you. That's only 64-bit .exe files, right? That's fine by me, though I guess it should have been announced that we are dropping 32-bit Windows support... Ah well, at this point, I say we just wait and see if someone complains.

@ChrisJefferson
Contributor

Yes, I don't think we can even build 32-bit any more without digging out an old copy of the 32-bit Cygwin setup.exe. They replaced the one on the website with an executable that just says "nope, no 32-bit any more".

@fingolfin fingolfin added this to the GAP 4.13.0 milestone Dec 29, 2022
@fingolfin fingolfin added os: windows Issues and PRs that are (at least partially) specific to Windows topic: ci Anything related to GitHub Actions, Codecov, AppVeyor, Coveralls, Travis, ... labels Mar 14, 2023
@fingolfin fingolfin modified the milestones: GAP 4.13.0, GAP 4.13.1 Mar 15, 2024