Short-lived package branches for backported fixes #79
Comments
Thanks for opening this issue. I think we'll need to discuss this with some of the releng crowd to see if this is easily doable today with our existing toolset/policies or if we need to investigate changes. |
OK after brainstorming a bit some options that we'd like to explore:
This unfortunately might be surprising and unwanted by the real package maintainers.
i.e.
Not sure if it makes sense but maybe we could use modularity here. |
I prefer this, and we can probably look at adding branch ACLs in case you don't want other maintainers making changes to your branch.
I guess this is not useful if you need to make changes to platform related packages, since they cannot be modularized. |
I talked with @mattdm and he also said he preferred this option:
|
At this point we should determine a proper naming strategy for these branches that we create and decide how to use them. For example do we do one of:
In the first case we can get a bunch of branches that pollute the repo over time (because we can't delete branches). In the latter case the git history of the branch in the repo is ugly. |
It's hypothetically possible that we'll need up to five branches at once: one for |
I know you used the word |
Five simultaneous backports is a bit much. I wouldn't be surprised by three, though. Suppose The situation with the development branches is a bit different. Those wouldn't be single-cycle backports, but cases where there's a fix for the Fedora package that we haven't been able to merge back for some reason. Hopefully that doesn't happen much, if ever. |
Some assumptions I'm making:
With those assumptions I understand |
For both |
OK. Does that mean we agree we probably don't need another backport branch for
Agree on the But we generally won't want to promote the extra to testing mid-cycle part, but wouldn't we just be promoting the same backported rpm to both |
I think so, yeah.
Using the kernel as the example: we might have 4.20.10 in stable, 4.20.14 in testing, and 4.20.16 in testing-devel. When applying systemd and other userspace packages move more slowly, so we might frequently have systemd 239-12 in both |
Why couldn't 4.20.17 (including the fix) go straight to |
Because then it'd have less than two weeks of bake time before promoting to |
But in this example |
As stream promotion is currently defined, no. We could choose to change that, of course. But I'd tend to think a predictable cadence for |
so in this case we could have the following:
Is that accurate? Seems like if we have out of band updates we could elect to skip the next one depending on where it landed. |
That's accurate, yes. With CL we do, in fact, have some slop. We'll sometimes perform a scheduled release a day or two early or late to coincide with a security update. That's in part because releases still require a lot of manual work (which hopefully won't be true for FCOS) and in part to reduce the number of node reboots. We also might delay a release because of a late-breaking regression. For FCOS we likewise may not want to hold ourselves to a precise date & time for each scheduled release. But we can allow ourselves some flexibility without causing that to slip the schedule for future releases. |
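A minimal sketch of the "flexibility without slipping the schedule" idea described above: nominal release dates are derived from a fixed anchor and cadence, not from the date the previous release actually shipped. The function name, the anchor date, and the two-week cadence are assumptions for illustration only, not part of any FCOS tooling.

```python
from datetime import date, timedelta


def scheduled_releases(anchor, cadence, count):
    """Return `count` nominal release dates spaced `cadence` apart from `anchor`.

    Each date is computed from the anchor, so shipping one release a day
    early or late has no effect on the nominal dates that follow.
    """
    return [anchor + i * cadence for i in range(count)]


# Hypothetical anchor date and an assumed two-week cadence.
nominal = scheduled_releases(date(2019, 1, 8), timedelta(weeks=2), 3)

# Suppose the second release actually ships one day late, e.g. to coincide
# with a security update. The third nominal date is unchanged.
actual_second = nominal[1] + timedelta(days=1)
third_is_unaffected = nominal[2] == date(2019, 2, 5)
```

Computing from the anchor is what keeps a one-off delay from accumulating into later releases.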
Considering the discussion that followed #79 (comment) I'll propose we:
|
@dustymabe SGTM. Re naming, |
I agree that |
created fesco issue for discussion there: https://pagure.io/fesco/issue/2152 |
I'll be honest, this seems like you're doing it wrong. You guys are creating all kinds of weird workflows that not only don't make sense, but add confusion to packaging and pressure on the infrastructure. Your concept of multiple update streams for backports is more or less pointless in the context of Fedora CoreOS. You're providing nothing of value by doing weird specific backport things, and it presents a view that you are only in Fedora because you must, rather than actually wanting to be working within the community. |
I'll also point out that this really didn't come up before with Fedora Atomic Host for the years it has existed, and you guys are already envisioning things that they never needed for Fedora CoreOS (which still hasn't had a formal release and looks an awful lot like vaporware...). You seem to think Fedora CoreOS is going to be something different from FAH, and I really don't think that'll be the case. If anything, it'll likely be less than what FAH could do, since there will be no OpenShift nor Kubernetes for Fedora CoreOS (Kubernetes is tied to OpenShift versions in Fedora right now, and OpenShift upstream doesn't give a damn about Fedora, so we don't even have OpenShift 4.x in Rawhide). As it stands today, there's nothing you need this for. |
@Conan-Kudo For more context, see the design document section on release streams. The reason we need a stream promotion process is the same reason that CoreOS Container Linux needed one: in order for automatic updates to be viable, they must not break users' systems. CI can't catch everything, so we need a way to let users test a new release before it's pushed out to the majority of machines. We want them to do that in production, on say 1-2% of their fleet. But conversely, for any of this to be viable in production, we can't wait 2-4 weeks for security fixes to promote through the testing stream to stable. So, net result, we need to provide out-of-cycle security support for both testing and stable channels, and we may sometimes need to do that by backporting patches. (The kernel is the most obvious case where simply shipping the current version of a package might not be safe.) That basic approach worked pretty well for CoreOS Container Linux and its user community. We've adapted it to work better with Fedora workflows, and will continue to make adjustments as we gain more experience. But I agree that we haven't done a good job yet of explaining our thinking to the broader Fedora community. I'll be writing up a Fedora Magazine article within the next few weeks that should help explain the basic goals of the distro and why we're doing things the way we are. |
@bgilbert But you have the advantage of being able to produce OSTrees from content in And I disagree about out-of-cycle security support, especially on the testing stream. You might need to cherry-pick something from |
@Conan-Kudo The |
@bgilbert They should be on an overlapping two week cadence, so that the transition from However, I think two weeks is too much for the streams. A week should be sufficient, given that it is enough for updates to push through in regular Fedora. |
@Conan-Kudo The stream cadences aren't contractual and we might change them after we gain some experience. We'll see how things go. 😁 |
I'm not convinced that you need to do targeted backporting, and I'm also not sure this is particularly necessary for Fedora CoreOS. In fact, I suspect you're going to cause more problems with this model because you're twisting and turning things into ways that aren't really desirable or supportable by the wider community. I worry that what you're trying to do here is also going to further marginalize the community in favor of just being a "lab for RHEL". We already get plenty of people outside of Fedora foisting that moniker when it isn't true, let's not actually make it true. |
@Conan-Kudo. I understand your concerns. I have some concerns myself. The model we built for atomic host was a very passive process on top of existing Fedora processes and it worked pretty well. However, we have a new opportunity here with Fedora CoreOS to merge two communities (which means we need to take some of both) and bring users to Fedora that would have otherwise not been here. In order to get to automated upgrades we needed to make some changes, including automated testing, but also locking on a package set (more controlled than bodhi) and letting the community report issues on those locked package sets (testing stream) before we ship to stable. We're certainly not trying to hurt Fedora. This is an attempt to bridge two communities together. We'll certainly make some mistakes and we hope to learn from them. |
@dustymabe In that scenario, don't you only need the ability to compose repos to feed in as inputs for FCOS? Because Koji keeps every build forever, it shouldn't be an issue to do more flexible compositions. No need to do weird things in Dist-Git. |
Indeed we are doing that with the
Since we will have a stable stream that lags Fedora, we'll need to be able to backport fixes for security issues or major regressions we find to the version of the package that is in the stable stream (i.e. deliver just the fix). 99% of the time we will just be able to use the rpm that was already built by the maintainer in Fedora, because versions don't typically change in the middle of a release (with a few exceptions). In some cases the version in Fedora will have been bumped and we'll elect to do a small backport instead of taking the new rpm version (minimizing risk to the stable stream). In order to do this backport we'll need to apply a change to dist-git and do a build, then tag that build into our tag and update our lockfiles to include it. We don't want to stomp on existing branches of the package in dist-git, so we're requesting new branches for this purpose. |
I don't think this is worth it for Fedora CoreOS. At all. No matter how you present this, you're basically saying you want to do something completely different from what the maintainer(s) would do for that package. This should not fly in Fedora, and I'd say you should invest in doing controlled validation of pulling in updated packages instead. |
That's not my interpretation of what we're asking. My interpretation is that we're saying:
The other branch just makes it a designated sanctioned place to do it. |
This was discussed in the FESCO meeting today. During the discussion and in the ticket we determined that it is feasible to use tags to push commits into a repo. The approval by FESCO is:
So we use the following workflow:
There are a few remaining items left. I'll briefly mention them and enumerate them below that:
We should establish a pattern for naming of these tags and also a pattern for marking the resulting rpms. I suggest we use:
Once we have established this pattern we should create a document at https://docs.fedoraproject.org/en-US/fesco/ under Policy Documents and reference the pattern and also the fesco ticket where it was approved that we could do this. We should then socialize this decision and the document with devel@ list to raise awareness of the backports in case a maintainer sees them. In order to achieve our goals we'll also need write access to all of the repos where we'll want to do a backport. The easiest way to achieve this is by getting some of our team members to be proven packagers where you can push to any dist-git repo. Enumerating those:
|
We may also want to include a counter (e.g. coreos-backport-YYYYMMDD.$counter) in case we end up doing an additional backport-related fix on the same day after a tag has already been pushed. |
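A rough sketch of how the counter suggestion could work, assuming the coreos-backport-YYYYMMDD.$counter pattern quoted above. The helper name, the choice to start counting at 0, and passing the existing tags in as a plain list are all assumptions for illustration; nothing here is an agreed convention or existing tooling.

```python
import re
from datetime import date

# Prefix taken from the pattern suggested in the thread.
TAG_PREFIX = "coreos-backport"


def next_backport_tag(day, existing_tags):
    """Return the first unused tag name for `day`.

    Already-pushed tags are never reused; the counter is bumped instead.
    Starting the counter at 0 is an assumption, not a decided convention.
    """
    stamp = day.strftime("%Y%m%d")
    pattern = re.compile(rf"^{re.escape(TAG_PREFIX)}-{stamp}\.(\d+)$")
    used = []
    for tag in existing_tags:
        match = pattern.match(tag)
        if match:
            used.append(int(match.group(1)))
    counter = max(used, default=-1) + 1
    return f"{TAG_PREFIX}-{stamp}.{counter}"


# Hypothetical example: one tag was already pushed on the same day.
pushed = ["coreos-backport-20190501.0", "coreos-backport-20190430.3"]
second_tag = next_backport_tag(date(2019, 5, 1), pushed)
```

Picking a fresh counter value instead of moving an existing tag avoids rewriting a tag that others may already have fetched.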
Lightweight tags rather than annotated tags, then? That's not the typical Git workflow (though I could see an argument for it here). Annotated tags would also let us record e.g. the FCOS release for which the tag is intended. From
We shouldn't overwrite a tag once it has been pushed, because Git doesn't refresh local copies of tags during a fetch.
That feels clunky. Can we just use |
I'm +1 for annotated tags assuming there isn't any technical reason they won't work. My steps above were mostly for proof of concept documentation of how to push a commit+tag at the same time (i.e. the commit isn't already in the git repo).
That works too. I was thinking of a situation where we pushed the tag then realized we needed a slight modification before we ran the build.
works for me |
In practice we haven't needed this. I propose we close it and revisit in the future if the need arises. |
Yes, I agree. |
Fixes in out-of-cycle releases may take one of two paths:
See #72 for details. For path 1, it should be sufficient to update the package manifest (#77). For path 2, we'll also need a way to branch the package to backport the fix. Such branches will be short-lived; the maximum lifetime is the testing release cadence plus the stable release cadence, currently 4 weeks in total.
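The lifetime bound in the issue description is simple arithmetic, sketched here. The issue only gives the 4-week total; the even split into two weeks per stream is an assumption based on the two-week bake time mentioned in the comments, and the constant and function names are hypothetical.

```python
from datetime import timedelta

# Assumed per-stream cadences; only the 4-week total is stated in the issue.
TESTING_CADENCE = timedelta(weeks=2)
STABLE_CADENCE = timedelta(weeks=2)


def max_branch_lifetime(testing=TESTING_CADENCE, stable=STABLE_CADENCE):
    """Upper bound on how long a backport branch is needed.

    A fix must ride through one testing cycle and one stable cycle
    before the regular package update replaces the backport.
    """
    return testing + stable


lifetime = max_branch_lifetime()
lifetime_days = lifetime.days
```

If the stream cadences change, as the comments suggest they might, the bound changes with them.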