Major release and update cycle for Fedora CoreOS #22
As part of this, we should also discuss the stream structure.

Container Linux

On Container Linux, we have these channels:

Not every branch promotes to

We encourage users to run some production nodes on alpha and some on beta in order to help catch regressions. In particular, this means that serious fixes must be applied to all channels. The channel names are not especially descriptive and should not be carried forward without some thought; in particular,

Fedora CoreOS

We'll probably want to sync with the Fedora release cycle, even if it's not mandatory. If we branched from Rawhide independently from the rest of Fedora, we'd end up responsible for backporting fixes, which is too much work. There's still value in maintaining a multiple-stream structure, however. If we don't have something like a

Straw proposal
Out-of-cycle backports of important security fixes and bugfixes would also occur. If a fix is important enough to backport directly to

I've excluded

I'm also not sure of a reasonable cadence for |
Thanks for a well-thought-out reply, @bgilbert. I didn't get the notification and just now stumbled upon it (see the GH notifications issue I've been having). A few comments:
|
AIUI yes.
No. Users shouldn't have to think about Fedora versions or to manually update between them. Relatedly, though, it's worth considering how we assign version numbers to trees, as a friendly name for the user.
Is there value to continuing that? In the proposal above, FCOS won't ever be tracking Rawhide or attempting to keep it working.
That's the proposal. I don't know whether it's the right idea, and I'd love to hear opinions. |
So what would the ABI/API stability promise be? If I am using next/testing/stable, what breakage should I expect for each, how often, and how much effort will it take to fix? |
For API stability: With CL it's always been "Stable unless otherwise noted" which is something we should continue. Bugs aside, there should never be breaking changes without an announcement and deprecation window. Users should not have to worry about updates (but should run a bit of beta/whatever we call the prerelease channel in their clusters to detect upcoming bugs). For ABI stability: You shouldn't care. Run your stuff in containers. Running on the host directly is unsupported. We will break you and not care if you run on the host directly. |
Running in containers does not remove the ABI question. I remember an article that made the rounds about how containers were not as portable as we thought, because /proc is exposed and used for detecting SELinux, etc., which triggered non-standard code paths with an Ubuntu container on RHEL. So the conclusion was that we can't really abstract the kernel away. And even today, that's a point pushed by RH folks, e.g. http://crunchtools.com/portability-not-compatibility/ So if I am being told "there is no ABI promise", how can I be sure, as a user, that stuff will not break? And if that's not a problem, can someone tell RH why they claim otherwise and tell me I should care? |
Fedora CoreOS is a community software project. There is no ironclad way to "be sure that stuff will not break". As @ajeddeloh said, the best approach is running some of your nodes on preproduction streams and reporting bugs you encounter. The kernel's ABI stability rules are also not ironclad, but they generally work pretty well. If you need stronger ABI guarantees than the upstream kernel can provide, neither Container Linux nor Fedora CoreOS nor any of the other Fedora editions are likely to meet your needs. |
To @ajeddeloh's point, we'll need ways to carry downstream reversions of breaking changes in Fedora. Eventually there will be something like another cgroups v2 that we'll have to work around. |
I was thinking more of ABI in terms of the C/C++ ABI (don't link against our stuff and run it on the host, please), but the kernel ABI is a good point. I don't think there's anything we can really do there other than say "run a little preprod if you want to catch bugs before they're a problem." We're not going to ship ancient or out-of-date kernels. That being said, breakage from the kernel, systemd, etc. is 99% bugs, not intentional changes. |
But Fedora has a process to get the kernel tested before it is pushed, via the updates-testing system, and a policy of not breaking ABI in a stable release (as much as possible; that's not perfect, but people get annoyed when it happens). https://fedoraproject.org/wiki/Updates_Policy My understanding is that no such promise is on the table, and @bgilbert explicitly says in this thread that users shouldn't care about versions. That's a goal I agree with, but if we rely on Fedora releases as the basis and switch every 6 months, that is where things will break and where I think people would need to care about versions. So while Fedora does not promise much, as a user I know when I can expect breakage, and I can wait before upgrading because there is a period where I can use both. I have 6 months to do the upgrade, which is fine for me. My understanding of the proposal is that, as a user, I would have 2 weeks to fix anything I see broken on my side in testing before it gets to stable, which is a much shorter timeframe. As for the kernel ABI, while it is stable when the configuration does not change, we may also make configuration changes in the future. So no matter how ironclad the kernel devs are, our packagers might not be as strict. And API/ABI also covers things like the Docker API, switching from Docker to podman/buildah, etc. Admins should not rely on the binary API or run things on the host, for sure, but in the end they also need to interact with the host somehow, using an API. Again, if you tell me "this can be broken at any time", that is not very compelling to me as a user. |
If you look at how Container Linux has been managed, some things on the host have been removed, such as fleet with a year-long deprecation window. That's two Fedora release cycles. I am sure that if FCOS made any such changes (e.g. dropping As far as the rest of your comment - nothing is perfect, but the basic premise here is that value of containerization far outweighs concerns about corner case kernel ABIs affecting apps. |
Stable Fedora releases update to new kernels with new ABIs all the time. Kernel 4.17 will go EOL soon, so Fedora 28 will bump to 4.18.
You'd have two weeks to report regressions that reach |
Does Fedora have planned breaking changes that are announced anywhere? If so we should probably work with Fedora to either announce those explicitly for FCOS (since it's supposed to be "set and forget*") or ensure that the change won't impact users. |
Fedora does have a "changes" process where large changes are announced and the implications of changes are considered. I'm sure not everything that changes gets reported, but it's a good way to at least socialize changes that we would like to make or to also monitor them so we are aware of changes that will affect us. Here are links to all the fedora changes for the last 10 releases. |
Adding meeting label. |
OK, we discussed this in the meeting today and agreed to revisit it in a week. I had a few comments today. Benjamin's original proposal:
Benjamin mentioned that these are the refs that he expects people to be able to run in production, with

development refs

We voiced the need for some "development" refs that essentially follow packages from the updates-testing Fedora yum repos or are built nightly from the

need for backports

One concern I raised is that if

tangent

We could explore not considering Bodhi at all and just using our automated tests to gate packages going into FCOS, which would speed things up a bit. The above relationship between |
This ticket links into a whole lot of other things; there's deep questions here around how much we hook into/diverge from the current Fedora package process. The proposal at the top seems to basically aim to be "single stream" - I am broadly in favor of this, although I think practical realities are going to force us into at least having refs for each underlying major or so? |
I think "single stream" (or rather "triple stream") is an absolute necessity. Updates are automatic and invisible to users (hopefully). Having separate refs for streams based on different fedora releases breaks that model (not that we can't work around that, but more that it conceptually "feels" different if we do). |
@cgwalters Why do you think we'd need separate refs for each major? |
Backports

@dustymabe Do you have any sense of how many updates in

My use of the word "backport" may have made the proposal sound scarier than it is. The situation I had in mind is a significant kernel security fix. In that case our options are a) accept an entirely new stable kernel, including perhaps 100-200 unrelated patches, or b) cherry-pick the relevant patches to the stable kernel we're already shipping. I'm arguing for option (b). There's no real backporting work to do; the patch will almost always apply cleanly. Option (a) would entail pushing a new kernel directly to our

When needed, we can use the same process for other packages as well, e.g. curl or docker. The key point is that, once we have the tooling to support backports, we can choose on a case-by-case basis whether to backport or accept an update from upstream Fedora. I don't expect the backport load to be especially heavy: in Container Linux today, many low-grade security fixes only go into the alpha channel and relatively few are backported to stable. |
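The case-by-case choice described in the comment above (cherry-pick a fix versus accept a whole upstream update) could be sketched roughly as follows. This is only an illustration: the `Fix` type, the `choose_strategy` function, and the `max_unrelated` threshold are hypothetical and do not come from the thread or from any Fedora CoreOS tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    """A pending fix for a package already shipped in a stream (hypothetical model)."""
    package: str
    security_critical: bool
    # Number of unrelated patches that would ride along if we took the
    # full upstream update instead of cherry-picking just this fix.
    unrelated_patches_in_update: int


def choose_strategy(fix: Fix, max_unrelated: int = 20) -> str:
    """Return "backport" or "update".

    Assumed heuristic: cherry-pick when the fix is security critical and the
    full update would drag in many unrelated changes; otherwise take the
    regular upstream update.
    """
    if fix.security_critical and fix.unrelated_patches_in_update > max_unrelated:
        return "backport"
    return "update"


# A kernel security fix whose full update carries ~150 unrelated patches.
kernel_fix = Fix("kernel", security_critical=True, unrelated_patches_in_update=150)
# A low-grade fix that can simply wait for the normal update.
curl_fix = Fix("curl", security_critical=False, unrelated_patches_in_update=3)
```

In practice the decision would be made by people reviewing each fix; the threshold here only stands in for that judgment.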
Kernel releases

There's another complication not mentioned above. Most packages only receive major updates between Fedora releases. In the above proposal, the

Proposal: have |
That makes me feel better. Let's discuss this at the community meeting tomorrow and try to bring in the kernel team after that so we can come up with a plan. |
Dusty asked me to weigh in, but I generally don't have anything to add except I think what bgilbert is suggesting makes a lot of sense to me. I do think having a stream with updates-testing enabled is going to be important, even if it's just not widely publicized. Otherwise, it's going to be hard to actually test the relevant packages in place in CoreOS, making it hard to get them out of testing with reasonable certainty. I'm kind of thinking this should be "next", in fact. Some notable recent problems with dnf aside, it's usually the case that distro-sync handles the case of pulled updates fine — I've personally been running with updates-testing enabled for years. I think (right, @cgwalters?) that with rpm-ostree, going backwards (due to pulled updates) should be safe in almost all cases. |
Sorry I meant to follow up here. Basically I was more thinking for development purposes ("Let's test the new systemd in rawhide") or whatever. Exposing to users would be a distinct thing. One option is to have a separate ostree repo for this too. |
We discussed this more at the meeting Wednesday. Here's the current state:

Proposal

Production refs

Fedora CoreOS will have several refs for use on production machines. At any given time, each ref will be downstream of a particular Fedora branch, and will consist of a snapshot of Fedora packages plus occasionally a backported fix.

All of these refs will be unversioned, in the sense that their names will not include the current Fedora major version. The stream cadences are not contractual, but will initially have two weeks between releases. The stream maintenance policies are also not contractual and may evolve from those described above, but changes will preserve the use cases and intended stability of each stream. Users will be encouraged to run most of their production systems on

Development refs

There will also be some additional unversioned refs for the convenience of Fedora CoreOS developers. These will be public, but won't be exposed to users in the same way as production refs: they might be in a different repo, or in the same repo but not listed in the summary file. None of these are contractual; they might go away if we don't find them useful.

Out-of-cycle releases

Due to the promotion structure described above,

A fix can take one of two forms:

We'll need infrastructure for both approaches, and the ability to choose between them on a case-by-case basis. Option 1 is cleaner and easier, but may not always be safe. Option 2 is especially useful for the kernel, where we'll want to fix individual bugs without pushing an entire stable kernel update directly to the

If a fix is important enough for an out-of-cycle

In some cases it may make sense to apply a fix to

Deprecation

Because production refs are unversioned, users will seamlessly upgrade between Fedora major releases, so compatibility must be maintained. Removal of functionality will require explicitly announced deprecations, potentially with long deprecation windows. |
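To make the initial two-week cadence concrete, here is a minimal sketch of how release dates for a stream could be projected. The function names and the example start date are assumptions for illustration. The idea that a release spends one full cadence interval in testing before reaching stable is taken from the "two weeks" discussion earlier in the thread and, like the cadence itself, is not contractual.

```python
from datetime import date, timedelta

# Initial cadence from the proposal: two weeks between releases (not contractual).
CADENCE = timedelta(weeks=2)


def scheduled_releases(first: date, count: int) -> list[date]:
    """Project `count` regular release dates for one stream, starting at `first`."""
    return [first + i * CADENCE for i in range(count)]


def stable_promotion_date(testing_release: date) -> date:
    """Assumed: content released to testing reaches stable one cadence interval later."""
    return testing_release + CADENCE


# Hypothetical start date, chosen only for the example.
schedule = scheduled_releases(date(2019, 1, 1), 3)
```

Out-of-cycle releases would fall between these dates and are deliberately not modeled here, since the proposal leaves them to case-by-case judgment.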
I've updated #22 (comment) to reflect the comments from last week's meeting:
|
PR in #72. |
Nice, and close to merging. I think the only other thing was that we needed to set up a session with the kernel devs to socialize our backporting strategy documented above. Should we set up some time for that? |
For Fedora Atomic Host, we do an Atomic Host release along with each Fedora major release. After that, a new release follows every two weeks with updated content from Fedora updates. This helps users receive updated (including security fixes) and tested content every two weeks. For a major CVE fix, we make an exception and do an in-between release.
For FCOS, as far as I know, we will have our first official release around the Fedora 30 release, based on f30-tagged built packages (correct me if I am wrong). It would be nice to discuss and define how frequently we are going to do releases in between with updated content.