Talk to Fedora kernel team about FCOS stream design #80
Comments
In particular, we'll need to use official processes to build kernel images signed for Secure Boot. |
Does tracking RC releases mean that we will get kernels with debugging options enabled in next? |
I guess we could produce our own builds with debug options disabled. Which way is preferable? |
For me, it would be debug off. |
Reached out to the kernel team about scheduling some time to discuss this. Will try to meet with them next week. |
I'm not fully familiar with FCOS streams, it sounds like the proposal is to backport individual critical fixes instead of just doing a full stable update, is that correct? (Want to make sure I understand before giving a detailed response) |
TL;DR - yes. In some cases (we hope this would be infrequent) we'd like to be able to backport fixes directly rather than pick up the latest update from Fedora. The idea is that if we are a kernel version behind in our stable branch (worst case 4 weeks behind) and we need to rush out a fix, then we can elect to apply the small patch rather than jump to the next kernel version. Since we are trying to get people to set updates to automatic, jumping to the next kernel version without some soak time could be problematic. The longer version is in the design doc we've created. To be clear: we aren't necessarily asking the Fedora kernel team to perform the work here. We are asking that our Fedora tooling/infrastructure be set up so that this type of build/update can occur and someone can do that work. I believe the FCOS community would be comfortable applying/testing these patches and collaborating with the Fedora kernel team to make them available. |
While I get the theory behind it, I might recommend that this be followed in the case of a rebase, but perhaps less so on stable updates. So if a critical CVE happens and FCOS is on 4.19.5, but Fedora pushed the fix in 4.19.8, FCOS would just update to 4.19.8. If Fedora were to push the fix in 4.20.4, however, it would be backported to the last 4.19.x that Fedora shipped. Stable updates don't often have major regressions, and in the cases where they do, it is typically because of rushed CVE fixes anyway, so you would still get those. |
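The rule suggested in the comment above can be sketched in a few lines of Python. This is only an illustration of the decision, not existing Fedora or FCOS tooling: `KernelVersion`, `plan_fix`, and the version `4.19.12` (standing in for "the last 4.19.x that Fedora shipped") are hypothetical names and values introduced here.

```python
from typing import NamedTuple, Tuple


class KernelVersion(NamedTuple):
    """A plain major.minor.patch kernel version (hypothetical helper)."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "KernelVersion":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    @property
    def series(self) -> Tuple[int, int]:
        # Two versions share a stable series when major.minor match.
        return (self.major, self.minor)


def plan_fix(shipped: str, fixed_in: str, last_in_series: str) -> str:
    """Decide how to deliver a critical fix.

    shipped:        kernel version FCOS currently ships
    fixed_in:       Fedora kernel version that carries the fix
    last_in_series: last version of the shipped series that Fedora built
    """
    current = KernelVersion.parse(shipped)
    fix = KernelVersion.parse(fixed_in)
    if KernelVersion.parse(last_in_series).series != current.series:
        raise ValueError("last_in_series must belong to the shipped series")

    if fix.series == current.series:
        # Fix landed in a stable update of the same series: take the update.
        return f"update to {fixed_in}"
    # Fix landed only after a rebase: backport it onto the old series.
    return f"backport to {last_in_series}"


print(plan_fix("4.19.5", "4.19.8", "4.19.8"))   # update to 4.19.8
print(plan_fix("4.19.5", "4.20.4", "4.19.12"))  # backport to 4.19.12
```

The sketch only compares the stable series of the two versions; it does not model soak time or how critical the fix is, both of which the thread treats as judgment calls.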
@jmflinuxtx Our experience in Container Linux is that stable updates often do have significant regressions. We used to push stable updates directly to the CL Based on our experience with CL, we think that each user-impacting regression encourages users to stop trusting us and disable automatic updates. To reduce that risk, we'll encourage users to run a few percent of their FCOS nodes on the The support gap after a kernel rebase is an issue I had missed. We'd greatly appreciate the extra 4 weeks of support for the previous kernel if the kernel team is willing to provide it, though I hesitate to further increase your workload. |
You've probably heard this before, but just for the record: trying to selectively pick up commits is not recommended by upstream. It's difficult to know whether any given commit will turn out to fix a security issue, so best practice is to pick them all up. It may also be difficult to backport individual fixes if there's a diff between versions. That said, that message is usually aimed at people who never want to pick up stable updates. Choosing to only pick up known critical fixes vs. entire stable updates for a fixed time period could be an option as long as you realize the tradeoffs. If you're already holding back on stable updates (i.e. FCOS is a few stable versions behind Fedora), I don't think you've lost too much by choosing to take just the critical fixes. I'd only suggest doing this for the most critical of issues, though, and would advocate that "just take the full stable update" be the default option for most issues. I like the idea Justin proposed as well, and I think it would be a good supplement. Long term, the goal is to increase the confidence in stable updates to make everyone's life easier. |
Okay, so it sounds like there's rough consensus around:
Does that all sound good? Other pieces we haven't explicitly discussed:
Any thoughts on those? |
RC builds have debugging disabled, so as long as you only use those (and not the daily git snapshots between them) you'd get kernels without debug options on. There is a bit of a snag here, though: once a release happens, Rawhide moves on, so the day after 4.20 comes out, Rawhide will be 4.21-rc0. At the moment, stable updates before the rebase happens (4.20.1, maybe 4.20.2) aren't built in Koji; they're built in a Copr repository and aren't Secure Boot signed. I think Koji will happily produce real builds from any dist-git commit, though, so we could possibly just build the stabilization branch in Koji without stepping on the old stable kernel's toes. They obviously wouldn't end up in Bodhi, so there'd need to be some other Koji tag they got placed in, I guess. |
Thanks @jeremycline - 👍 I think you have helped answer some questions. I'll review where I currently think we are:
The remaining question we have is:
From the documentation, it looks like only certain people have the ACLs to get a kernel build signed appropriately. Since we are volunteering to do or aid in backports, could we get someone like @bgilbert initiated (is there a process here?) and blessed with appropriate ACLs? |
There is not a specific process in place; it is guarded enough that it is extremely rare to add anyone to the ACLs there. We would need to get a couple more people involved in that discussion, but it is a discussion we can have. |
Thanks @jmflinuxtx. Could you make introductions for us, or tell us who to talk to in order to start the discussion? |
Sent a follow-up email - will update next week. |
Followed up with @jmflinuxtx @vathpela and @nirik - we got @bgilbert ACLs for building/signing kernel packages. Will test it out once @jmflinuxtx has a candidate build. Thanks all. |
Just to chime in as another voice (perhaps an irrelevant one), but Mageia (one of the distributions I actively work in) has actually explicitly elected to switch away from LTS kernels because they tend to have more breakage than normal stable kernels. The unfortunate reality of LTS kernels is that certain people only come out of the woodwork to submit changes when an upcoming LTS release is announced, and those particular kernels have been worse than regular releases in recent years. I am not at all surprised by @bgilbert's experience, as it mirrors what has been the case for Mageia during the Mageia 5 and Mageia 6 release cycles, which is why Mageia 7 is switching to the Fedora policy of just shipping the latest stable releases and tracking those. I suspect that the Container Linux approach to handling kernel releases will matter a whole lot less with Fedora CoreOS simply because Container Linux followed a Mageia-like policy rather than a Fedora-like one. In my experience with my Fedora systems, it's been pretty rare to see such breakages with regular stable releases. Moreover, by updating frequently to new stable kernels as they arrive, the behavior changes and such are going to be more incremental and easier to adapt to anyway, which should alleviate a large number of issues. |
The FCOS stream design (#22, #72) has two elements that affect the kernel:
`next` stream will track the upcoming kernel from rc6 until the kernel reaches Bodhi `updates`. The idea is that, while every other major package bump in FCOS bakes in `next` for a substantial time before being promoted, new kernel versions would normally only bake for two weeks in `testing` because they flow directly into Fedora stable releases. The additional kernel coverage in `next` will provide a few weeks of additional baking time for catching regressions. This will require tooling support as part of Package pinning mechanism for release streams #77.
Ask the Fedora kernel team what they think of all this.