Create next DataFusion release (after 7.0) - 7.1 #2095
Comments
I think DataFusion could release a small version (e.g. 7.0.1) once a month and a large version (e.g. 8.0.0-alpha) when there are major changes. Pull requests land in DataFusion frequently, so a stable release plan is needed. |
I would be happy to help support more incremental releases of datafusion, but I probably don't have time to manage the whole thing. What I think would be needed is:
I am happy to do the mechanics of creating a branch and release artifacts, but I would need help from the community backporting / cherry-picking backwards compatible changes to it. |
I agree with you very much. I suggest posting a notice in the README to recruit volunteers to help manage the publishing. Tracking it only in this issue may not be visible enough. |
Maybe we could imitate the style of Apache Spark.
|
Arrow C++ (official) seems to have a major release every quarter, and as of 2022-02 it is at 7.0.0. I think DataFusion can have a release plan similar to Arrow (C++).
|
I agree with having three release tiers: major, minor, and bug fix. Another question is whether it is necessary to keep the same version across different modules, like Ballista and datafusion-data-access (the newly split one). |
Arrow C++ does major quarterly releases; I have not seen a minor release (e.g. 6.1.0) in the last year. Occasionally there are patch releases, but they are infrequent, typically one per major release. I agree the three-tier release scheme sounds ideal as well.
If we intended to conform to "semantic versioning" in the rust style, it is a challenge to release minor versions from
I do not think it is necessary to keep the same versions. I keep the versions of arrow-rs/arrow-flight/parquet in sync because it lowers the release overhead. |
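To make the "semantic versioning in the Rust style" point concrete, here is a minimal sketch of the compatibility rule Cargo applies to a default (caret) version requirement. The function names `parse` and `caret_compatible` are hypothetical, invented for illustration, and the sketch ignores pre-release tags such as `8.0.0-alpha`.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Split a plain 'MAJOR.MINOR.PATCH' string into integers.

    Assumption: no pre-release or build metadata (e.g. '-alpha') is present.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_compatible(base: str, candidate: str) -> bool:
    """Return True if `candidate` satisfies a caret requirement on `base`.

    Mirrors Cargo's default rule: the left-most non-zero component
    of `base` must not change, and `candidate` must not be older.
    """
    b, c = parse(base), parse(candidate)
    if c < b:
        return False
    if b[0] > 0:
        # 7.0.0 accepts 7.x.y but not 8.0.0
        return c[0] == b[0]
    if b[1] > 0:
        # 0.2.3 accepts 0.2.x but not 0.3.0
        return c[0] == 0 and c[1] == b[1]
    # 0.0.z only accepts exactly the same version
    return c == b


if __name__ == "__main__":
    for candidate in ("7.0.1", "7.1.0", "8.0.0"):
        print(candidate, caret_compatible("7.0.0", candidate))
```

Under this rule, a user depending on `7.0.0` would pick up a 7.0.1 or 7.1.0 release automatically but not 8.0.0, which is why anything published from a 7.x branch has to stay backwards compatible.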
The challenge I predict we will encounter is finding the time to manage the releases (i.e. reviewing PRs, deciding what to backport, backporting, writing release notes, and bumping versions). I don't think the work is "hard" per se, but it does take sustained time and effort. Maybe we could start with
|
Does anyone want to volunteer to manage such release(s)? |
I would love to |
Agreed that backporting patches to a stable branch is very time-consuming work, so we had better not commit to it until we see a strong need from our users or have a maintainer who can dedicate time to maintaining the stable branch. |
Ok, since @jychen7 has volunteered, let's give it a try for a release or two of I have created a 7.x maintenance branch. The next steps would be to decide on some content to backport (via cherry-pick) that is semantically compatible. To do so I suggest:
Sounds good, @jychen7? |
@alamb if I understand correctly, our next major release will be around 2022-05-14 (2nd weekend of May), and the next possible minor release will be around 2022-04-09 (2nd weekend of Apr). We ask contributors who want a minor/patch release to create a PR to As a volunteer, I would help to
PS: 1 may be automated in a GitHub workflow in the future if needed. |
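The two dates above can be reproduced with a small date calculation. This is only a sketch: it assumes "2nd weekend" means the second Saturday of the month, and `second_saturday` is a hypothetical helper, not part of any release tooling.

```python
from datetime import date, timedelta


def second_saturday(year: int, month: int) -> date:
    """Return the second Saturday of the given month.

    Assumption: "2nd weekend" is taken to mean the second Saturday.
    """
    first_of_month = date(year, month, 1)
    # date.weekday(): Monday == 0 ... Saturday == 5
    days_until_first_saturday = (5 - first_of_month.weekday()) % 7
    return first_of_month + timedelta(days=days_until_first_saturday + 7)


if __name__ == "__main__":
    print("minor:", second_saturday(2022, 4))
    print("major:", second_saturday(2022, 5))
```

A scheduled workflow could call a helper like this to decide whether a given day is a release day, if the schedule is ever automated.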
I think that would be reasonable
Yes, thank you
❤️ thank you so much! |
Related question: do you plan to release datafusion-cli as a crate as well? I see that the 7.0.0 datafusion-cli crate has been yanked (for reasons I'm not aware of). |
Hi @happysalada, I don't expect we'll release datafusion-cli to crates.io. The reason that the datafusion-cli crate was not published (to crates.io) for 7.0.0 is that it depends on For now, you can probably install datafusion-cli from source / GitHub if you want. As backstory, datafusion-cli is mostly a debugging / development tool for datafusion and ballista, and clients such as https://github.com/roapi/roapi or https://github.com/datafusion-contrib/datafusion-python are more appropriate for end users. If you wanted to make a datafusion-cli crate that was publishable (or break something similar out into https://github.com/datafusion-contrib) I think it could be useful. |
@happysalada shameless plug: I'm working on a more full-featured datafusion CLI client, https://github.com/datafusion-contrib/datafusion-tui, if you're interested. It's a new project that still has bugs, but I'm getting close to a 0.1 release. My plan is to publish to crates.io and Homebrew. |
I think the easiest way to package this for now is to build from source. Matthew, nice project! Watching out for releases, will package it for nixos as well when I play with it! |
A release candidate was created and voting has started here: https://lists.apache.org/thread/kvk7688gpfofrc46zso306rdnqxfdcdc |
Mailing list approval: https://lists.apache.org/thread/h75wtqcvc1w64x2p4tb9lc6nc2t2zhwm The release is available here: I have also released it to crates.io: |
Thanks @jychen7 for the assist getting this out 👍 |
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We released datafusion 7.0.0 about a month ago: https://crates.io/crates/datafusion/7.0.0
We should figure out when to release the next one.
Describe the solution you'd like
Plan out the next release(s) of DataFusion. Also figure out if we want to do a maintenance release (e.g. 7.0.1 / 7.1.0) or a release from master (8.0.0).
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Brought up by @silence-coding here: #2066 (comment)
See notes on 7.0 release #1587