Package pinning mechanism for release streams #77

Closed
bgilbert opened this issue Nov 6, 2018 · 14 comments
Labels
kind/design releng Related to Fedora Release Engineering team/input


@bgilbert
Contributor

bgilbert commented Nov 6, 2018

The lifecycle of a testing/stable release will be:

  1. Snapshot fedora + updates into a testing release; let's say it's X.0
  2. Possibly backport fixes to testing X.0 and release testing X.1
  3. Promote testing X.1 to stable; let's say we call it X.2
  4. Possibly backport fixes to stable X.2 and release stable X.3
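
To make the numbering concrete, here is a minimal Python sketch of the testing/stable lifecycle listed above. The `Release` type and the `snapshot`, `backport`, and `promote` helpers are hypothetical names made up for illustration; nothing here reflects actual release tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    stream: str  # "testing" or "stable"
    major: str   # the "X" in X.0, X.1, ...
    minor: int

    def label(self) -> str:
        return f"{self.stream} {self.major}.{self.minor}"


def snapshot(major: str) -> Release:
    """Step 1: snapshot fedora + updates into a new testing release."""
    return Release("testing", major, 0)


def backport(rel: Release) -> Release:
    """Steps 2 and 4: a backport bumps the minor number within the stream."""
    return Release(rel.stream, rel.major, rel.minor + 1)


def promote(rel: Release) -> Release:
    """Step 3: promote a testing release to stable under the next number."""
    if rel.stream != "testing":
        raise ValueError("only testing releases can be promoted")
    return Release("stable", rel.major, rel.minor + 1)


history = [snapshot("X")]
history.append(backport(history[-1]))
history.append(promote(history[-1]))
history.append(backport(history[-1]))
labels = [r.label() for r in history]
print(labels)
```

The point of the sketch is only that every step produces a new, distinct release that may need to be rebuilt later, which is why several branches have to be maintainable at once.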

The lifecycle of a next release, during the part of the release cycle when it doesn't alias testing, will be:

  1. Snapshot fedora into a next release; let's say it's Y.0
  2. Possibly backport fixes to next Y.0 and release next Y.1

We should thus be able to manage multiple simultaneous release branches, in which we e.g.:

  • Create a manifest of package NEVRAs based on a Koji tag. This should probably be continuously maintained in Git master via a CI gate, and then branched for an X.0/Y.0 release.
  • Update the manifest for backports, which may not be in Koji.
  • Build release images from the current manifest for a branch. This requires that the underlying package builds have not been GC'd.
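
As a rough illustration of the manifest idea, the following Python sketch parses NEVRA strings, builds a manifest, branches it, and applies a backport override. The package versions, the `parse_nevra` and `build_manifest` helpers, and the one-entry-per-package-name layout are all assumptions for the example, not an existing format.

```python
from typing import Dict, Iterable, NamedTuple


class NEVRA(NamedTuple):
    name: str
    epoch: str
    version: str
    release: str
    arch: str


def parse_nevra(s: str) -> NEVRA:
    """Parse 'name-[epoch:]version-release.arch' into its parts."""
    rest, arch = s.rsplit(".", 1)
    name, epoch_version, release = rest.rsplit("-", 2)
    epoch, _, version = epoch_version.rpartition(":")
    return NEVRA(name, epoch or "0", version, release, arch)


def build_manifest(nevras: Iterable[str]) -> Dict[str, NEVRA]:
    """Map package name -> pinned NEVRA (assumes a single architecture)."""
    return {n.name: n for n in map(parse_nevra, nevras)}


# Hypothetical contents of a Koji tag at snapshot time.
master = build_manifest([
    "kernel-4.19.2-300.fc29.x86_64",
    "systemd-239-6.fc29.x86_64",
])

# Branching for X.0 copies the manifest so that master can keep moving.
branch_x = dict(master)

# A backport, which may not exist in Koji, overrides a single entry.
branch_x["systemd"] = parse_nevra("systemd-239-6.fc29.1.x86_64")

# Master later picks up a newer kernel; the branch is unaffected.
master["kernel"] = parse_nevra("kernel-4.19.3-300.fc29.x86_64")

print(branch_x["kernel"].version, master["kernel"].version)
```

Building release images from `branch_x` would still depend on every pinned build remaining downloadable, which is the GC caveat in the last bullet.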
@bgilbert
Contributor Author

bgilbert commented Nov 7, 2018

This could be done with a Koji tag rather than a manifest, assuming backports were also in Koji.

@dustymabe
Member

dustymabe commented Nov 8, 2018

Yeah I was mostly thinking a koji tag might give us everything we need. They support inheritance too so we can passively consume what is already in fedora until we need to override any one piece.
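
A toy Python model of what inheritance buys us, assuming a simplified lookup order (the tag itself first, then its parents in order). The tag names and builds are hypothetical, and this does not call the Koji API; real Koji inheritance has more knobs than this.

```python
from typing import Dict, List, Optional

# Hypothetical tag contents: tag name -> {package name: build NVR}.
TAGS: Dict[str, Dict[str, str]] = {
    "f29-updates": {
        "kernel": "kernel-4.19.2-300.fc29",
        "systemd": "systemd-239-6.fc29",
    },
    # Our tag holds only the pieces we want to override.
    "f29-coreos-testing": {
        "systemd": "systemd-239-6.fc29.1",
    },
}

# Hypothetical inheritance: child tag -> ordered list of parent tags.
PARENTS: Dict[str, List[str]] = {
    "f29-coreos-testing": ["f29-updates"],
    "f29-updates": [],
}


def resolve(tag: str, package: str) -> Optional[str]:
    """Return the build visible in `tag`, checking the tag before its parents."""
    build = TAGS.get(tag, {}).get(package)
    if build is not None:
        return build
    for parent in PARENTS.get(tag, []):
        build = resolve(parent, package)
        if build is not None:
            return build
    return None


print(resolve("f29-coreos-testing", "kernel"))
print(resolve("f29-coreos-testing", "systemd"))
```

Here the kernel is consumed passively from the parent while systemd is overridden in the child tag.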

We've explored this in the past with Atomic Host, and indeed we have a few extra tags set up for us today, though we haven't used them much up to this point. See https://pagure.io/releng/issue/7100 for a long discussion, which should help anyone trying to understand how it could work.

@bgilbert
Contributor Author

bgilbert commented Nov 8, 2018

If we went the koji route, I guess we'd either need to create two koji tags every two weeks (for the upcoming next and testing branches) or update the next/testing/stable tags in place as releases are promoted?

Either way, I'm concerned about the impact on reproducible composes, since the koji tags would be moving targets. AIUI there'd be no way to rebuild X.Y.1 after the tag had been updated for an out-of-cycle X.Y.2. Maybe that's true anyway, though, depending on how package builds are GC'd.

@cgwalters
Member

I think we should assume we control all aspects of the infrastructure in general - i.e. koji tags or whatever behave how we choose.

That said, I lean a bit towards doing an approach like this:
coreos/rpm-ostree#1670

Basically rather than koji tags, we have pinned packages defined in a git repository updated by a bot or humans, and commits are gated by PR testing.

In contrast to Koji tags for example, a package change can have an attached rationale (i.e. commit message). Git repositories have plenty of technology for doing "test this change then merge it". Koji tags...not so much.

Implementation wise the simplest is probably to have a rpm-md repo that has all the packages we plan to ship to start.

So the flow would be for e.g. new kernel:

  • Kernel maintainers do a build in Koji
  • That kernel package appears in our "fedora-coreos-all" repo.
  • Our bot submits a PR to our git repo to update the manifest-lock.yaml
  • PR is tested via CI, then either a human approves or it merges automatically (up to us)

This flow would mean packages that fail testing are still in the repo, but eh. We could do GC at some point.
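
The flow above could be sketched roughly as follows in Python. The dict stands in for the contents of the lock file, `repo` is assumed to already hold the newest build per package from the "fedora-coreos-all" repo (so no RPM version comparison is done here), and `propose_update` and `gate` are hypothetical names rather than existing bot code.

```python
from typing import Callable, Dict

Lock = Dict[str, str]  # package name -> pinned EVRA string


def propose_update(lock: Lock, repo: Dict[str, str]) -> Lock:
    """Return only the locked entries whose pinned build differs from the repo."""
    return {
        name: evra
        for name, evra in repo.items()
        if name in lock and lock[name] != evra
    }


def gate(lock: Lock, change: Lock, ci: Callable[[Lock], bool]) -> Lock:
    """Apply `change` only if CI passes on the candidate lock."""
    candidate = {**lock, **change}
    return candidate if ci(candidate) else lock


lock = {
    "kernel": "4.19.2-300.fc29.x86_64",
    "systemd": "239-6.fc29.x86_64",
}
repo = {
    "kernel": "4.19.3-300.fc29.x86_64",   # new build from the kernel maintainers
    "systemd": "239-6.fc29.x86_64",
    "vim-minimal": "8.1.450-1.fc29.x86_64",  # in the repo but not something we ship
}

change = propose_update(lock, repo)           # what the bot's PR would contain
merged = gate(lock, change, lambda l: True)   # CI passed
rejected = gate(lock, change, lambda l: False)  # CI failed, lock stays as-is
print(change)
```

Note that a rejected change leaves the new kernel sitting in the repo, which matches the "packages that fail testing are still in the repo" caveat.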

@dustymabe
Member

> Basically rather than koji tags, we have pinned packages defined in a git repository updated by a bot or humans, and commits are gated by PR testing.

yep. I think I've proposed something like this in the past to achieve reverting to a previous version of an rpm. This seems reasonable to me too; bots necessary, of course. A question here is: does the git repo contain a fully depsolved list of rpms or does it just contain a 1st order list and those are depsolved?

dustymabe added the releng (Related to Fedora Release Engineering team/input) label Dec 12, 2018
@jlebon
Member

jlebon commented Jan 21, 2019

One thought I've had on this topic is that this is also related to coreos/rpm-ostree#415. Unless we snapshot the whole Fedora repo, there's a chance that automatic updates will fail if there are packages layered.

And with a "graph" approach to updates, we can't necessarily rely on things sorting themselves out eventually as we've usually done. I.e. if the next target update requires a version of a layered pkg which is no longer in the updates repo, the client will just forever be stuck there.

I guess this also depends on what kinds of packages we expect people to use the layering escape hatch for, and whether those are prone to the base/split problem.

> Implementation wise the simplest is probably to have a rpm-md repo that has all the packages we plan to ship to start.

Another way to do this (and simultaneously resolve the above concern + coreos/rpm-ostree#415) is to see if releng is willing to keep old versions of packages in the repos as done in RHEL and CentOS, at least until that release is EOL. Then we don't need a separate rpm-md repo (well, maybe except for the FCOS-specific rebuilds for backports), just the lock file. This of course also requires that libsolv on the client-side is smart enough to pick older versions of layered pkgs that depsolve successfully (will have to look into that).
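
A toy Python illustration of the "pick an older version that still depsolves" idea, and of how a client gets stuck when old versions are pruned. Versions are plain integers and each candidate requires exactly one base version; both are simplifying assumptions, and this is not how libsolv actually works.

```python
from typing import List, Optional, Tuple

# (layered package version, base library version it requires); hypothetical.
Candidate = Tuple[int, int]


def pick(candidates: List[Candidate], base_version: int) -> Optional[Candidate]:
    """Return the newest candidate compatible with the base, or None."""
    compatible = [c for c in candidates if c[1] == base_version]
    return max(compatible, default=None)


# The target update in the graph still ships base library version 1.
base = 1

# A repo that keeps old versions around, as suggested above.
with_history = [(1, 1), (2, 2)]

# An updates repo where only the newest build survives.
newest_only = [(2, 2)]

print(pick(with_history, base))
print(pick(newest_only, base))
```

With history retained, the solver can fall back to the older layered package; with only the newest build, there is nothing installable and the update cannot proceed.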

@jlebon
Member

jlebon commented Jan 21, 2019

> A question here is: does the git repo contain a fully depsolved list of rpms or does it just contain a 1st order list and those are depsolved?

At least as implemented by coreos/rpm-ostree#1670, that lockfile should definitely include the fully depsolved list, just like Cargo.lock and Gopkg.lock work. Otherwise, deps might change from one build to another.
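
To illustrate why the lock should hold the fully depsolved set, here is a small Python sketch with a toy depsolver (a plain transitive closure over package names). The package names, versions, and the `depsolve` helper are hypothetical and much simpler than a real solver.

```python
from typing import Dict, List, Tuple

# name -> (version, names of packages it requires)
Repo = Dict[str, Tuple[str, List[str]]]


def depsolve(wanted: List[str], repo: Repo) -> Dict[str, str]:
    """Return {name: version} for `wanted` plus all transitive requirements."""
    resolved: Dict[str, str] = {}
    stack = list(wanted)
    while stack:
        name = stack.pop()
        if name in resolved:
            continue
        version, requires = repo[name]
        resolved[name] = version
        stack.extend(requires)
    return resolved


# The same first-order list resolved against two states of the repo.
first_order = ["app"]
monday: Repo = {"app": ("2.0", ["libdep"]), "libdep": ("1.0", [])}
tuesday: Repo = {"app": ("2.0", ["libdep"]), "libdep": ("1.1", [])}

lock = depsolve(first_order, monday)      # what a full lockfile would record
rebuilt = depsolve(first_order, tuesday)  # what an unpinned rebuild would get
print(lock, rebuilt)
```

The first-order package did not change between the two runs, but the dependency did, so only a lock covering the whole resolved set keeps rebuilds identical.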

@jlebon
Member

jlebon commented Jun 21, 2019

A more targeted ticket related to lockfile handling is in #205.

@jlebon
Member

jlebon commented Jul 10, 2019

OK, the crux of this ticket now lies in coreos/coreos-assembler#604 and coreos/rpm-ostree#1867. Once those are in, we can enable the promote-lockfiles bits in config-bot.

One major piece left afterwards is rewiring coreos-koji-tagger to target https://github.com/coreos/fedora-coreos-config instead of https://pagure.io/dusty/coreos-koji-data/.

@jlebon
Member

jlebon commented Jul 26, 2019

> Once those are in, we can enable the promote-lockfiles bits in config-bot.

I think this is actually dependent on getting coreos-koji-tagger fully set up first because otherwise, we'd essentially be exposing ourselves to race conditions with Bodhi updates clearing out packages from the updates repo.

So to summarize, I think to bootstrap ourselves into the lockfile workflow, we need to:

  1. adapt coreos-koji-tagger for the new lockfile format (that's coreos/fedora-coreos-releng-automation#16, "coreos-koji-tagger: update for new lockfile format")
  2. set up coreos-koji-tagger to listen for GitHub pushes to https://github.com/coreos/fedora-coreos-config (that's coreos/fedora-coreos-releng-automation#17, "coreos-koji-tagger: point at fedora-coreos-config repo")
  3. stand it up in the Fedora infra
  4. enable lockfile promotion in config-bot
  5. drop all repos except coreos-pool in the testing-devel manifest

The only thing I'm not sure about is 3. There are two infra tickets about this:
https://pagure.io/fedora-infrastructure/issue/7870
https://pagure.io/fedora-infrastructure/issue/7821

It seems like there are some unresolved issues there about getting it deployed in the Fedora infra with the right Koji keytab and permissions?

@dustymabe
Member

dustymabe commented Jul 26, 2019

> Once those are in, we can enable the promote-lockfiles bits in config-bot.
>
> I think this is actually dependent on getting coreos-koji-tagger fully set up first because otherwise, we'd essentially be exposing ourselves to race conditions with Bodhi updates clearing out packages from the updates repo.
>
> So to summarize, I think to bootstrap ourselves into the lockfile workflow, we need to:
>
>   1. adapt coreos-koji-tagger for the new lockfile format (that's coreos/fedora-coreos-releng-automation#16)

reviewed

>   2. set up coreos-koji-tagger to listen for GitHub pushes to https://github.com/coreos/fedora-coreos-config (that's coreos/fedora-coreos-releng-automation#17)

reviewed

>   3. stand it up in the Fedora infra
>   4. enable lockfile promotion in config-bot
>
> The only thing I'm not sure about is 3. There are two infra tickets about this:
> https://pagure.io/fedora-infrastructure/issue/7870

added a comment here

> https://pagure.io/fedora-infrastructure/issue/7821

I added a comment there. It mostly works but AFAIU currently tagging of the kernel won't work. We'll have to ask bgilbert to tag it manually until we get a policy rule in place to allow the bot to do it. Maybe we'll need to add code to coreos-koji-tagger so it doesn't attempt to tag the kernel for now.

> It seems like there are some unresolved issues there about getting it deployed in the Fedora infra with the right Koji keytab and permissions?

correct. Don't know if we need it 100% before then or if it's sufficient to wait til I'm back on the 5th.

EDIT: updated since I had my comments out of order for which ticket

@bgilbert
Contributor Author

> It mostly works but AFAIU currently tagging of the kernel won't work. We'll have to ask bgilbert to tag it manually until we get a policy rule in place to allow the bot to do it.

AIUI starting a kernel build is the restricted bit, and tagging it afterward is not special. @dustymabe, am I remembering that wrong?

@dustymabe
Member

dustymabe commented Jul 26, 2019

> > It mostly works but AFAIU currently tagging of the kernel won't work. We'll have to ask bgilbert to tag it manually until we get a policy rule in place to allow the bot to do it.
>
> AIUI starting a kernel build is the restricted bit, and tagging it afterward is not special. @dustymabe, am I remembering that wrong?

I think they are currently tied together (i.e. you can only tag if you could build). In https://pagure.io/fedora-infrastructure/issue/7821#comment-580062 I'm trying to convince them to make a policy where we can tag a kernel as long as it was built by someone who had the appropriate permissions.

We could do something halfway and just pull the kernel from the regular Fedora repos for now, but I'd really prefer not to do that if we don't have to. It would muddy our whole architecture of using coreos-pool.

@jlebon
Member

jlebon commented Jul 14, 2021

This is considered fixed now with lockfiles + coreos-koji-tagger + coreos-pool.

jlebon closed this as completed Jul 14, 2021

4 participants