From 8705c3e987146dcd76ffd6244ae09350c6216fc8 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Tue, 11 Jun 2024 07:41:50 -0600 Subject: [PATCH 1/9] Source release process proposal --- dev/release/README.md | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index b20f2d48e..aeddf287f 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -17,7 +17,7 @@ specific language governing permissions and limitations under the License. --> -# Comet Release Process +# Comet Source Release Process This documentation is for creating an official source release of Apache DataFusion Comet. @@ -31,14 +31,20 @@ Here is a brief overview of the steps involved in creating a release: This part of the process can be performed by any committer. -- Create and merge a PR to update the version number & update the changelog -- Push a release candidate tag (e.g. 0.1.0-rc1) to the Apache repository +Here are the steps, using the 0.1.0 release as an example: + +- Create a release branch from the latest commit in main (e.g. `git checkout -b release-0.1.0`) and push to the Apache repo +- Create and merge a PR against the release branch to update the Maven version from `0.1.0-SNAPSHOT` to `0.1.0` +- Generate a changelog for all changes since the previous release tag and the release branch and create a PR against the main branch to add this +- Cherry-pick the changelog PR into the release branch +- Tag the release branch with `0.1.0-rc1` and push to the Apache repo +- Create a PR against the main branch to update the Rust crate version to `0.2.0` and the Maven version to `0.2.0-SNAPSHOT` ## Publishing the Release Candidate This part of the process can mostly only be performed by a PMC member. 
-- Run the create-tarball script to create the source tarball and upload it to the dev subversion repository +- Run the create-tarball script on the release branch to create the source tarball and upload it to the dev subversion repository - Start an email voting thread - Once the vote passes, run the release-tarball script to move the tarball to the release subversion repository - Register the release with the [Apache Reporter Service](https://reporter.apache.org/addrelease.html?datafusion) using From 9862dc6bf3b842f184ab110d5a4a47d1fe9070c4 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Tue, 11 Jun 2024 07:48:23 -0600 Subject: [PATCH 2/9] small refinement --- dev/release/README.md | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index aeddf287f..090343950 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -35,33 +35,24 @@ Here are the steps, using the 0.1.0 release as an example: - Create a release branch from the latest commit in main (e.g. `git checkout -b release-0.1.0`) and push to the Apache repo - Create and merge a PR against the release branch to update the Maven version from `0.1.0-SNAPSHOT` to `0.1.0` -- Generate a changelog for all changes since the previous release tag and the release branch and create a PR against the main branch to add this +- Generate a changelog for all changes between the previous release tag and the release branch and create a PR against the main branch to add this - Cherry-pick the changelog PR into the release branch -- Tag the release branch with `0.1.0-rc1` and push to the Apache repo +- Tag the release branch with a release candidate tag (`0.1.0-rc1`) and push to the Apache repo - Create a PR against the main branch to update the Rust crate version to `0.2.0` and the Maven version to `0.2.0-SNAPSHOT` ## Publishing the Release Candidate This part of the process can mostly only be performed by a PMC member. 
-- Run the create-tarball script on the release branch to create the source tarball and upload it to the dev subversion repository +- Run the create-tarball script on the release candidate tag (`0.1.0-rc1`) to create the source tarball and upload it to the dev subversion repository - Start an email voting thread - Once the vote passes, run the release-tarball script to move the tarball to the release subversion repository - Register the release with the [Apache Reporter Service](https://reporter.apache.org/addrelease.html?datafusion) using a version such as `COMET-0.1.0` - Delete old release candidates and releases from the subversion repositories -- Push a release tag (e.g. 0.1.0) to the Apache repository +- Push a release tag (`0.1.0`) to the Apache repository - Reply to the vote thread to close the vote and announce the release -## Publishing JAR Files to Maven - -The process for publishing JAR files to Maven is not defined yet. - -## Publishing to crates.io - -We may choose to publish the `datafusion-comet` to crates.io so that other Rust projects can leverage the -Spark-compatible operators and expressions outside of Spark. - ## Verifying Release Candidates The vote email will link to this section of this document, so this is where we will need to provide instructions for @@ -83,6 +74,17 @@ Another way of verifying the release is to follow the [Comet Benchmarking Guide](https://datafusion.apache.org/comet/contributor-guide/benchmarking.html) and compare performance with the previous release. +## Publishing Binary Releases + +### Publishing JAR Files to Maven + +The process for publishing JAR files to Maven is not defined yet. + +### Publishing to crates.io + +We may choose to publish the `datafusion-comet` to crates.io so that other Rust projects can leverage the +Spark-compatible operators and expressions outside of Spark. + ## Post Release Activities Writing a blog post about the release is a great way to generate more interest in the project. 
We typically create a From 7ffe5b56eda6db7c2fc0170206c319aa2eb17fc5 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Wed, 12 Jun 2024 12:46:30 -0600 Subject: [PATCH 3/9] add more detail --- dev/release/README.md | 52 ++++++++++++++++++++++++++++++------------- 1 file changed, 37 insertions(+), 15 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index a8b58e46b..b7149a988 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -31,21 +31,25 @@ Here is a brief overview of the steps involved in creating a release: This part of the process can be performed by any committer. -Here are the steps, using the 0.1.0 release as an example: +Here are the steps, using the 0.1.0 release as an example. -- Create a release branch from the latest commit in main (e.g. `git checkout -b release-0.1.0`) and push to the Apache repo -- Create and merge a PR against the release branch to update the Maven version from `0.1.0-SNAPSHOT` to `0.1.0` -- Generate a changelog for all changes since the previous release tag and the release branch and create a PR against the main branch to add this -- Cherry-pick the changelog PR into the release branch -- Tag the release branch with `0.1.0-rc1` and push to the Apache repo -- Create a PR against the main branch to update the Rust crate version to `0.2.0` and the Maven version to `0.2.0-SNAPSHOT` +### Create Release Branch + +Create a release branch from the latest commit in main and push to the Apache repo: + +```shell +git fetch apache +git checkout main +git checkout -b branch-0.1 +git push apache branch-0.1 +``` + +Create and merge a PR against the release branch to update the Maven version from `0.1.0-SNAPSHOT` to `0.1.0` ### Generating the Change Log -We haven't yet defined how tagging and branching will work for the source releases. This project is more complex -than DataFusion core because it consists of a Maven project and a Cargo project. 
However, generating a change log -to cover changes between any two commits or tags can be performed by running the provided `generate-changelog.py` -script. +Generate a change log to cover changes between the previous release and the release branch HEAD by running +the provided `generate-changelog.py` script. It is recommended that you set up a virtual Python environment and then install the dependencies: @@ -55,15 +59,33 @@ source venv/bin/activate pip3 install -r requirements.txt ``` -To generate the changelog, set the `GITHUB_TOKEN` environment variable to a valid token and then run the script -providing two commit ids or tags followed by the version number of the release being created. The following -example generates a change log of all changes between the first commit and the current HEAD revision. +To generate the changelog, set the `GITHUB_TOKEN` environment variable to a valid token and then run the script +providing two commit ids or tags followed by the version number of the release being created. The following +example generates a change log of all changes between the previous version and the current release branch HEAD revision. ```shell export GITHUB_TOKEN= -python3 generate-changelog.py 52241f44315fd1b2fd6cd9031bb05f046fe3a5a3 HEAD 0.1.0 > ../changelog/0.1.0.md +python3 generate-changelog.py 52241f44315fd1b2fd6cd9031bb05f046fe3a5a3 branch-0.1 0.1.0 > ../changelog/0.1.0.md ``` +Create a PR against the _main_ branch to add this change log and once this is approved and merged, cherry-pick the +commit into the release branch. + +### Tag the Release Candidate + +Tag the release branch with `0.1.0-rc1` and push to the Apache repo + +```shell +git checkout branch-0.1 +git pull +git tag 0.1.0-rc1 +git push apache 0.1.0-rc1 +```` + +### Update Version in main + +Create a PR against the main branch to update the Rust crate version to `0.2.0` and the Maven version to `0.2.0-SNAPSHOT`. 
+ ## Publishing the Release Candidate This part of the process can mostly only be performed by a PMC member. From 712832674c4e2b815e9253e6d6068627aa9723ed Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Wed, 12 Jun 2024 12:49:07 -0600 Subject: [PATCH 4/9] fix --- dev/release/README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/dev/release/README.md b/dev/release/README.md index 3c93417eb..8c49c5228 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -40,6 +40,7 @@ Create a release branch from the latest commit in main and push to the Apache re ```shell git fetch apache git checkout main +git reset --hard apache/main git checkout -b branch-0.1 git push apache branch-0.1 ``` @@ -76,8 +77,9 @@ commit into the release branch. Tag the release branch with `0.1.0-rc1` and push to the Apache repo ```shell +git fetch apache git checkout branch-0.1 -git pull +git reset --hard apache/branch-0.1 git tag 0.1.0-rc1 git push apache 0.1.0-rc1 ```` From 5e89f647f704365f2993a75afe04610fe7be6b7d Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Wed, 12 Jun 2024 12:58:33 -0600 Subject: [PATCH 5/9] more detail --- dev/release/README.md | 85 ++++++++++++++++----- dev/release/create-tarball.sh | 2 +- dev/release/verifying-release-candidates.md | 17 +++++ 3 files changed, 82 insertions(+), 22 deletions(-) create mode 100644 dev/release/verifying-release-candidates.md diff --git a/dev/release/README.md b/dev/release/README.md index 8c49c5228..f0daf0e25 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -92,35 +92,78 @@ Create a PR against the main branch to update the Rust crate version to `0.2.0` This part of the process can mostly only be performed by a PMC member. 
-- Run the create-tarball script on the release candidate tag (`0.1.0-rc1`) to create the source tarball and upload it to the dev subversion repository -- Start an email voting thread -- Once the vote passes, run the release-tarball script to move the tarball to the release subversion repository -- Register the release with the [Apache Reporter Service](https://reporter.apache.org/addrelease.html?datafusion) using - a version such as `COMET-0.1.0` -- Delete old release candidates and releases from the subversion repositories -- Push a release tag (`0.1.0`) to the Apache repository -- Reply to the vote thread to close the vote and announce the release +### Create the Release Candidate Tarball -## Verifying Release Candidates +Run the create-tarball script on the release candidate tag (`0.1.0-rc1`) to create the source tarball and upload it to the dev subversion repository -The vote email will link to this section of this document, so this is where we will need to provide instructions for -verifying a release candidate. +```shell +GH_TOKEN= ./dev/release/create-tarball.sh 0.1.0 1 +``` + +### Start an Email Voting Thread + +Send the email that is generated in the previous step to `dev@datafusion.apache.org`. + +### Publish the Release Tarball -The `dev/release/verify-release-candidate.sh` is a script in this repository that can assist in the verification -process. It checks the hashes and runs the build. It does not run the test suite because this takes a long time -for this project and the test suites already run in CI before we create the release candidate, so running them -again is somewhat redundant. +Once the vote passes, run the release-tarball script to move the tarball to the release subversion repository. 
```shell -./dev/release/verify-release-candidate.sh 0.1.0 1 +./dev/release/release-tarball.sh 0.1.0 1 ``` -We hope that users will verify the release beyond running this script by testing the release candidate with their -existing Spark jobs and report any functional issues or performance regressions. +Push a release tag (`0.1.0`) to the Apache repository. + +```shell +git fetch apache +git checkout 0.1.0-rc1 +git tag 0.1.0 +git push apache 0.1.0 +``` + +Reply to the vote thread to close the vote and announce the release. + +## Post Release Admin + +Register the release with the [Apache Reporter Service](https://reporter.apache.org/addrelease.html?datafusion) using +a version such as `COMET-0.1.0`. + +### Delete old RCs and Releases + +See the ASF documentation on [when to archive](https://www.apache.org/legal/release-policy.html#when-to-archive) +for more information. + +#### Deleting old release candidates from `dev` svn + +Release candidates should be deleted once the release is published. + +Get a list of DataFusion Comet release candidates: -Another way of verifying the release is to follow the -[Comet Benchmarking Guide](https://datafusion.apache.org/comet/contributor-guide/benchmarking.html) and compare -performance with the previous release. +```shell +svn ls https://dist.apache.org/repos/dist/dev/datafusion | grep comet +``` + +Delete a release candidate: + +```shell +svn delete -m "delete old DataFusion Comet RC" https://dist.apache.org/repos/dist/dev/datafusion/apache-datafusion-comet-0.1.0-rc1/ +``` + +#### Deleting old releases from `release` svn + +Only the latest release should be available. Delete old releases after publishing the new release. 
+ +Get a list of DataFusion releases: + +```shell +svn ls https://dist.apache.org/repos/dist/release/datafusion | grep comet +``` + +Delete a release: + +```shell +svn delete -m "delete old DataFusion Comet release" https://dist.apache.org/repos/dist/release/datafusion-comet/datafusion-comet-0.0.0 +``` ## Publishing Binary Releases diff --git a/dev/release/create-tarball.sh b/dev/release/create-tarball.sh index 367dcae8e..1bec80051 100755 --- a/dev/release/create-tarball.sh +++ b/dev/release/create-tarball.sh @@ -95,7 +95,7 @@ on the release. The vote will be open for at least 72 hours. Only votes from PMC members are binding, but all members of the community are encouraged to test the release and vote with "(non-binding)". -The standard verification procedure is documented at https://github.com/apache/datafusion-comet/blob/main/dev/release/README.md#verifying-release-candidates. +The standard verification procedure is documented at https://github.com/apache/datafusion-comet/blob/main/dev/release/verifying-release-candidates.md [ ] +1 Release this as Apache DataFusion Comet ${version} [ ] +0 diff --git a/dev/release/verifying-release-candidates.md b/dev/release/verifying-release-candidates.md new file mode 100644 index 000000000..442efdfd8 --- /dev/null +++ b/dev/release/verifying-release-candidates.md @@ -0,0 +1,17 @@ +# Verifying DataFusion Comet Release Candidates + +The `dev/release/verify-release-candidate.sh` script in this repository can assist in the verification +process. It checks the hashes and runs the build. It does not run the test suite because this takes a long time +for this project and the test suites already run in CI before we create the release candidate, so running them +again is somewhat redundant. 
+ +```shell +./dev/release/verify-release-candidate.sh 0.1.0 1 +``` + +We hope that users will verify the release beyond running this script by testing the release candidate with their +existing Spark jobs and report any functional issues or performance regressions. + +Another way of verifying the release is to follow the +[Comet Benchmarking Guide](https://datafusion.apache.org/comet/contributor-guide/benchmarking.html) and compare +performance with the previous release. From cf1163f2f8a0aa77f08e4d715e98934de4c12850 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Wed, 12 Jun 2024 13:02:04 -0600 Subject: [PATCH 6/9] title --- dev/release/README.md | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index f0daf0e25..a80ad1b14 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -17,16 +17,10 @@ specific language governing permissions and limitations under the License. --> -# Comet Source Release Process +# Apache DataFusion Comet: Source Release Process This documentation is for creating an official source release of Apache DataFusion Comet. -The release process is based on the parent Apache DataFusion project, so please refer to the -[DataFusion Release Process](https://github.com/apache/datafusion/blob/main/dev/release/README.md) for detailed -instructions if you are not familiar with the release process here. - -Here is a brief overview of the steps involved in creating a release: - ## Creating the Release Candidate This part of the process can be performed by any committer. 
From 86985bb0d07be2077e8e34049037d98e66ad0137 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Wed, 12 Jun 2024 13:02:31 -0600 Subject: [PATCH 7/9] prettier --- dev/release/README.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index a80ad1b14..30322efec 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -43,7 +43,7 @@ Create and merge a PR against the release branch to update the Maven version fro ### Generate the Change Log -Generate a change log to cover changes between the previous release and the release branch HEAD by running +Generate a change log to cover changes between the previous release and the release branch HEAD by running the provided `generate-changelog.py` script. It is recommended that you set up a virtual Python environment and then install the dependencies: @@ -63,7 +63,7 @@ export GITHUB_TOKEN= python3 generate-changelog.py 52241f44315fd1b2fd6cd9031bb05f046fe3a5a3 branch-0.1 0.1.0 > ../changelog/0.1.0.md ``` -Create a PR against the _main_ branch to add this change log and once this is approved and merged, cherry-pick the +Create a PR against the _main_ branch to add this change log and once this is approved and merged, cherry-pick the commit into the release branch. ### Tag the Release Candidate @@ -76,10 +76,10 @@ git checkout branch-0.1 git reset --hard apache/branch-0.1 git tag 0.1.0-rc1 git push apache 0.1.0-rc1 -```` +``` ### Update Version in main - + Create a PR against the main branch to update the Rust crate version to `0.2.0` and the Maven version to `0.2.0-SNAPSHOT`. ## Publishing the Release Candidate @@ -110,9 +110,9 @@ Push a release tag (`0.1.0`) to the Apache repository. ```shell git fetch apache -git checkout 0.1.0-rc1 +git checkout 0.1.0-rc1 git tag 0.1.0 -git push apache 0.1.0 +git push apache 0.1.0 ``` Reply to the vote thread to close the vote and announce the release. 
From 28c8da578903525ad720b20258933a9a1eef11f2 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Wed, 12 Jun 2024 13:09:11 -0600 Subject: [PATCH 8/9] ASF header --- dev/release/verifying-release-candidates.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/dev/release/verifying-release-candidates.md b/dev/release/verifying-release-candidates.md index 442efdfd8..85cdf010e 100644 --- a/dev/release/verifying-release-candidates.md +++ b/dev/release/verifying-release-candidates.md @@ -1,3 +1,22 @@ + + # Verifying DataFusion Comet Release Candidates The `dev/release/verify-release-candidate.sh` script in this repository can assist in the verification From 4d54253a6bf1727007e256ef9175f3981d539d66 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Thu, 13 Jun 2024 12:20:12 -0600 Subject: [PATCH 9/9] address feedback --- dev/release/README.md | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index 30322efec..1abb359ef 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -29,7 +29,17 @@ Here are the steps, using the 0.1.0 release as an example. ### Create Release Branch -Create a release branch from the latest commit in main and push to the Apache repo: +This document assumes that GitHub remotes are set up as follows: + +```shell +$ git remote -v +apache git@github.com:apache/datafusion-comet.git (fetch) +apache git@github.com:apache/datafusion-comet.git (push) +origin git@github.com:yourgithubid/datafusion-comet.git (fetch) +origin git@github.com:yourgithubid/datafusion-comet.git (push) +``` + +Create a release branch from the latest commit in main and push to the `apache` repo: ```shell git fetch apache @@ -68,7 +78,7 @@ commit into the release branch. 
### Tag the Release Candidate -Tag the release branch with `0.1.0-rc1` and push to the Apache repo +Tag the release branch with `0.1.0-rc1` and push to the `apache` repo ```shell git fetch apache @@ -106,7 +116,7 @@ Once the vote passes, run the release-tarball script to move the tarball to the ./dev/release/release-tarball.sh 0.1.0 1 ``` -Push a release tag (`0.1.0`) to the Apache repository. +Push a release tag (`0.1.0`) to the `apache` repository. ```shell git fetch apache