diff --git a/dev/release/README.md b/dev/release/README.md
index 32735588ed8f..f772f1e42c1e 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -48,8 +48,8 @@ patch release:
 - Created a personal access token in GitHub for changelog automation script.
   - Github PAT should be created with `repo` access
 - Make sure your signing key is added to the following files in SVN:
-  - https://dist.apache.org/repos/dist/dev/arrow/KEYS
-  - https://dist.apache.org/repos/dist/release/arrow/KEYS
+  - https://dist.apache.org/repos/dist/dev/datafusion/KEYS
+  - https://dist.apache.org/repos/dist/release/datafusion/KEYS

 ### How to add signing key

@@ -58,8 +58,8 @@ See instructions at https://infra.apache.org/release-signing.html#generate for g
 Committers can add signing keys in Subversion client with their ASF account. e.g.:

 ```bash
-$ svn co https://dist.apache.org/repos/dist/dev/arrow
-$ cd arrow
+$ svn co https://dist.apache.org/repos/dist/dev/datafusion
+$ cd datafusion
 $ editor KEYS
 $ svn ci KEYS
 ```
@@ -128,7 +128,7 @@ release.

 See [#9697](https://github.com/apache/datafusion/pull/9697) for an example.

-Here are the commands that could be used to prepare the `5.1.0` release:
+Here are the commands that could be used to prepare the `38.0.0` release:

 ### Update Version

@@ -139,10 +139,10 @@ git fetch apache
 git checkout apache/main
 ```

-Update datafusion version in `datafusion/Cargo.toml` to `5.1.0`:
+Update datafusion version in `datafusion/Cargo.toml` to `38.0.0`:

 ```
-./dev/update_datafusion_versions.py 5.1.0
+./dev/update_datafusion_versions.py 38.0.0
 ```

 Lastly commit the version change:
@@ -167,7 +167,7 @@ Pick numbers in sequential order, with `0` for `rc0`, `1` for `rc1`, etc.

 While the official release artifacts are signed tarballs and zip files, we also tag the commit it was created for convenience and code archaeology.

-Using a string such as `5.1.0` as the ``, create and push the tag by running these commands:
+Using a string such as `38.0.0` as the ``, create and push the tag by running these commands:

 ```shell
 git fetch apache
@@ -181,29 +181,29 @@ git push apache
 Run `create-tarball.sh` with the `` tag and `` and you found in previous steps:

 ```shell
-GH_TOKEN= ./dev/release/create-tarball.sh 5.1.0 0
+GH_TOKEN= ./dev/release/create-tarball.sh 38.0.0 0
 ```

 The `create-tarball.sh` script

-1. creates and uploads all release candidate artifacts to the [arrow
-   dev](https://dist.apache.org/repos/dist/dev/arrow) location on the
+1. creates and uploads all release candidate artifacts to the [datafusion
+   dev](https://dist.apache.org/repos/dist/dev/datafusion) location on the
    apache distribution svn server

 2. provide you an email template to
-   send to dev@arrow.apache.org for release voting.
+   send to dev@datafusion.apache.org for release voting.

 ### Vote on Release Candidate artifacts

-Send the email output from the script to dev@arrow.apache.org. The email should look like
+Send the email output from the script to dev@datafusion.apache.org. The email should look like

 ```
-To: dev@arrow.apache.org
-Subject: [VOTE][DataFusion] Release Apache DataFusion 5.1.0 RC0
+To: dev@datafusion.apache.org
+Subject: [VOTE] Release Apache DataFusion 38.0.0 RC1

 Hi,

-I would like to propose a release of Apache DataFusion version 5.1.0.
+I would like to propose a release of Apache DataFusion version 38.0.0.

 This release candidate is based on commit: a5dd428f57e62db20a945e8b1895de91405958c4 [1]
 The proposed release artifacts and signatures are hosted at [2].
@@ -214,16 +214,16 @@ and vote on the release.

 The vote will be open for at least 72 hours.

-[ ] +1 Release this as Apache DataFusion 5.1.0
+[ ] +1 Release this as Apache DataFusion 38.0.0
 [ ] +0
-[ ] -1 Do not release this as Apache DataFusion 5.1.0 because...
+[ ] -1 Do not release this as Apache DataFusion 38.0.0 because...

 Here is my vote:

 +1

 [1]: https://github.com/apache/datafusion/tree/a5dd428f57e62db20a945e8b1895de91405958c4
-[2]: https://dist.apache.org/repos/dist/dev/arrow/apache-datafusion-5.1.0
+[2]: https://dist.apache.org/repos/dist/dev/datafusion/apache-datafusion-38.0.0
 [3]: https://github.com/apache/datafusion/blob/a5dd428f57e62db20a945e8b1895de91405958c4/CHANGELOG.md
 ```

@@ -234,7 +234,7 @@ For the release to become "official" it needs at least three PMC members to vote
 The `dev/release/verify-release-candidate.sh` is a script in this repository that can assist in the verification process. Run it like:

 ```
-./dev/release/verify-release-candidate.sh 5.1.0 0
+./dev/release/verify-release-candidate.sh 38.0.0 0
 ```

 #### If the release is not approved
@@ -249,11 +249,11 @@ NOTE: steps in this section can only be done by PMC members.
 ### After the release is approved

 Move artifacts to the release location in SVN, e.g.
-https://dist.apache.org/repos/dist/release/datafusion/datafusion-5.1.0/, using
+https://dist.apache.org/repos/dist/release/datafusion/datafusion-38.0.0/, using
 the `release-tarball.sh` script:

 ```shell
-./dev/release/release-tarball.sh 5.1.0 0
+./dev/release/release-tarball.sh 38.0.0 0
 ```

 Congratulations! The release is now official!
@@ -263,9 +263,9 @@ Congratulations! The release is now official!
 Tag the same release candidate commit with the final release tag

 ```
-git co apache/5.1.0-rc0
-git tag 5.1.0
-git push apache 5.1.0
+git co apache/38.0.0-rc0
+git tag 38.0.0
+git push apache 38.0.0
 ```

 ### Publish on Crates.io
@@ -300,7 +300,7 @@ of the following crates:
 Download and unpack the official release tarball

 Verify that the Cargo.toml in the tarball contains the correct version
-(e.g. `version = "5.1.0"`) and then publish the crates by running the script `release-crates.sh`
+(e.g. `version = "38.0.0"`) and then publish the crates by running the script `release-crates.sh`
 in a directory extracted from the source tarball that was voted on.
 Note that this script doesn't work if run in a Git repo.

@@ -413,10 +413,9 @@ https://crates.io/crates/datafusion-substrait/28.0.0

 ### Add the release to Apache Reporter

-Add the release to https://reporter.apache.org/addrelease.html?arrow with a version name prefixed with `RS-DATAFUSION-`,
-for example `RS-DATAFUSION-14.0.0`.
+Add the release to https://reporter.apache.org/addrelease.html?datafusion using the version number e.g. 38.0.0.

-The release information is used to generate a template for a board report (see example
+The release information is used to generate a template for a board report (see example from Apache Arrow project
 [here](https://github.com/apache/arrow/pull/14357)).

 ### Delete old RCs and Releases
@@ -431,13 +430,13 @@ Release candidates should be deleted once the release is published.
 Get a list of DataFusion release candidates:

 ```bash
-svn ls https://dist.apache.org/repos/dist/dev/arrow | grep datafusion
+svn ls https://dist.apache.org/repos/dist/dev/datafusion
 ```

 Delete a release candidate:

 ```bash
-svn delete -m "delete old DataFusion RC" https://dist.apache.org/repos/dist/dev/datafusion/apache-datafusion-7.1.0-rc1/
+svn delete -m "delete old DataFusion RC" https://dist.apache.org/repos/dist/dev/datafusion/apache-datafusion-38.0.0-rc1/
 ```

 #### Deleting old releases from `release` svn
@@ -447,35 +446,25 @@ Only the latest release should be available. Delete old releases after publishin
 Get a list of DataFusion releases:

 ```bash
-svn ls https://dist.apache.org/repos/dist/release/arrow | grep datafusion
+svn ls https://dist.apache.org/repos/dist/release/datafusion
 ```

 Delete a release:

 ```bash
-svn delete -m "delete old DataFusion release" https://dist.apache.org/repos/dist/release/datafusion/datafusion-7.0.0
+svn delete -m "delete old DataFusion release" https://dist.apache.org/repos/dist/release/datafusion/datafusion-37.0.0
 ```

-### Publish the User Guide to the Arrow Site
-
-- Run the `build.sh` in the `docs` directory from the release tarball.
-- Clone the [arrow-site](https://github.com/apache/arrow-site) repository
-- Checkout the `asf-site` branch
-- Copy content from `docs/build/html/*` to the `datafusion` directory in arrow-site
-- Create a PR against the `asf-site` branch ([example](https://github.com/apache/arrow-site/pull/237))
-- Once the PR is merged, the content will be published to https://datafusion.apache.org/ by GitHub Pages (this
-  can take some time).
-
 ### Optional: Write a blog post announcing the release

-We typically crowdsource release announcements by collaborating on a Google document, usually starting
+We typically crowd source release announcements by collaborating on a Google document, usually starting
 with a copy of the previous release announcement.

 Run the following commands to get the number of commits and number of unique contributors for inclusion in the blog post.

 ```bash
-git log --pretty=oneline 10.0.0..11.0.0 datafusion datafusion-cli datafusion-examples | wc -l
-git shortlog -sn 10.0.0..11.0.0 datafusion datafusion-cli datafusion-examples | wc -l
+git log --pretty=oneline 37.0.0..38.0.0 datafusion datafusion-cli datafusion-examples | wc -l
+git shortlog -sn 37.0.0..38.0.0 datafusion datafusion-cli datafusion-examples | wc -l
 ```

 Once there is consensus on the contents of the post, create a PR to add a blog post to the
diff --git a/dev/release/create-tarball.sh b/dev/release/create-tarball.sh
index e345773287cf..693d069a9323 100755
--- a/dev/release/create-tarball.sh
+++ b/dev/release/create-tarball.sh
@@ -21,9 +21,9 @@
 # Adapted from https://github.com/apache/arrow-rs/tree/master/dev/release/create-tarball.sh

 # This script creates a signed tarball in
-# dev/dist/apache-arrow-datafusion--.tar.gz and uploads it to
-# the "dev" area of the dist.apache.arrow repository and prepares an
-# email for sending to the dev@arrow.apache.org list for a formal
+# dev/dist/apache-datafusion--.tar.gz and uploads it to
+# the "dev" area of the dist.apache.datafusion repository and prepares an
+# email for sending to the dev@datafusion.apache.org list for a formal
 # vote.
 #
 # See release/README.md for full release instructions
@@ -65,21 +65,21 @@ tag="${version}-rc${rc}"
 echo "Attempting to create ${tarball} from tag ${tag}"
 release_hash=$(cd "${SOURCE_TOP_DIR}" && git rev-list --max-count=1 ${tag})

-release=apache-arrow-datafusion-${version}
+release=apache-datafusion-${version}
 distdir=${SOURCE_TOP_DIR}/dev/dist/${release}-rc${rc}
 tarname=${release}.tar.gz
 tarball=${distdir}/${tarname}
-url="https://dist.apache.org/repos/dist/dev/arrow/${release}-rc${rc}"
+url="https://dist.apache.org/repos/dist/dev/datafusion/${release}-rc${rc}"

 if [ -z "$release_hash" ]; then
     echo "Cannot continue: unknown git tag: ${tag}"
 fi

-echo "Draft email for dev@arrow.apache.org mailing list"
+echo "Draft email for dev@datafusion.apache.org mailing list"
 echo ""
 echo "---------------------------------------------------------"
 cat < ${tarball}.sha256
 (cd ${distdir} && shasum -a 512 ${tarname}) > ${tarball}.sha512

-echo "Uploading to apache dist/dev to ${url}"
-svn co --depth=empty https://dist.apache.org/repos/dist/dev/arrow ${SOURCE_TOP_DIR}/dev/dist
+echo "Uploading to datafusion dist/dev to ${url}"
+svn co --depth=empty https://dist.apache.org/repos/dist/dev/datafusion ${SOURCE_TOP_DIR}/dev/dist
 svn add ${distdir}
 svn ci -m "Apache DataFusion ${version} ${rc}" ${distdir}
diff --git a/dev/release/publish_homebrew.sh b/dev/release/publish_homebrew.sh
index 1cf7160d4284..20955953e85a 100644
--- a/dev/release/publish_homebrew.sh
+++ b/dev/release/publish_homebrew.sh
@@ -39,8 +39,8 @@ else # Fallback
   num_processing_units=1
 fi

-url="https://www.apache.org/dyn/closer.lua?path=arrow/arrow-datafusion-${version}/apache-arrow-datafusion-${version}.tar.gz"
-sha256="$(curl https://dist.apache.org/repos/dist/release/arrow/arrow-datafusion-${version}/apache-arrow-datafusion-${version}.tar.gz.sha256 | cut -d' ' -f1)"
+url="https://www.apache.org/dyn/closer.lua?path=datafusion/datafusion-${version}/apache-datafusion-${version}.tar.gz"
+sha256="$(curl https://dist.apache.org/repos/dist/release/datafusion/datafusion-${version}/apache-datafusion-${version}.tar.gz.sha256 | cut -d' ' -f1)"

 pushd "$(brew --repository homebrew/core)"

@@ -52,7 +52,7 @@ fi
 echo "Updating working copy"
 git fetch --all --prune --tags --force -j$num_processing_units

-branch=apache-arrow-datafusion-${version}
+branch=apache-datafusion-${version}
 echo "Creating branch: ${branch}"
 git branch -D ${branch} || :
 git checkout -b ${branch} origin/master
diff --git a/dev/release/rat_exclude_files.txt b/dev/release/rat_exclude_files.txt
index ce5635b6daf4..897a35172c9d 100644
--- a/dev/release/rat_exclude_files.txt
+++ b/dev/release/rat_exclude_files.txt
@@ -15,84 +15,8 @@ ci/etc/*.patch
 ci/vcpkg/*.patch
 CHANGELOG.md
 datafusion/CHANGELOG.md
-python/CHANGELOG.md
-conbench/benchmarks.json
-conbench/requirements.txt
-conbench/requirements-test.txt
-conbench/.flake8
-conbench/.isort.cfg
 dev/requirements*.txt
-dev/archery/MANIFEST.in
-dev/archery/requirements*.txt
-dev/archery/archery/tests/fixtures/*
-dev/archery/archery/crossbow/tests/fixtures/*
 dev/release/rat_exclude_files.txt
-dev/tasks/homebrew-formulae/apache-arrow.rb
-dev/tasks/linux-packages/apache-arrow-apt-source/debian/apache-arrow-apt-source.install
-dev/tasks/linux-packages/apache-arrow-apt-source/debian/compat
-dev/tasks/linux-packages/apache-arrow-apt-source/debian/control
-dev/tasks/linux-packages/apache-arrow-apt-source/debian/rules
-dev/tasks/linux-packages/apache-arrow-apt-source/debian/source/format
-dev/tasks/linux-packages/apache-arrow/debian/compat
-dev/tasks/linux-packages/apache-arrow/debian/control.in
-dev/tasks/linux-packages/apache-arrow/debian/gir1.2-arrow-1.0.install
-dev/tasks/linux-packages/apache-arrow/debian/gir1.2-arrow-cuda-1.0.install
-dev/tasks/linux-packages/apache-arrow/debian/gir1.2-arrow-dataset-1.0.install
-dev/tasks/linux-packages/apache-arrow/debian/gir1.2-gandiva-1.0.install
-dev/tasks/linux-packages/apache-arrow/debian/gir1.2-parquet-1.0.install
-dev/tasks/linux-packages/apache-arrow/debian/gir1.2-plasma-1.0.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-glib-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-glib-doc.doc-base
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-glib-doc.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-glib-doc.links
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-glib400.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-cuda-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-cuda-glib-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-cuda-glib400.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-cuda400.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-dataset-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-dataset-glib-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-dataset-glib-doc.doc-base
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-dataset-glib-doc.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-dataset-glib-doc.links
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-dataset-glib400.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-dataset400.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-flight-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-flight400.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-python-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-python-flight-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-python-flight400.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow-python400.install
-dev/tasks/linux-packages/apache-arrow/debian/libarrow400.install
-dev/tasks/linux-packages/apache-arrow/debian/libgandiva-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libgandiva-glib-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libgandiva-glib-doc.doc-base
-dev/tasks/linux-packages/apache-arrow/debian/libgandiva-glib-doc.install
-dev/tasks/linux-packages/apache-arrow/debian/libgandiva-glib-doc.links
-dev/tasks/linux-packages/apache-arrow/debian/libgandiva-glib400.install
-dev/tasks/linux-packages/apache-arrow/debian/libgandiva400.install
-dev/tasks/linux-packages/apache-arrow/debian/libparquet-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libparquet-glib-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libparquet-glib-doc.doc-base
-dev/tasks/linux-packages/apache-arrow/debian/libparquet-glib-doc.install
-dev/tasks/linux-packages/apache-arrow/debian/libparquet-glib-doc.links
-dev/tasks/linux-packages/apache-arrow/debian/libparquet-glib400.install
-dev/tasks/linux-packages/apache-arrow/debian/libparquet400.install
-dev/tasks/linux-packages/apache-arrow/debian/libplasma-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libplasma-glib-dev.install
-dev/tasks/linux-packages/apache-arrow/debian/libplasma-glib-doc.doc-base
-dev/tasks/linux-packages/apache-arrow/debian/libplasma-glib-doc.install
-dev/tasks/linux-packages/apache-arrow/debian/libplasma-glib-doc.links
-dev/tasks/linux-packages/apache-arrow/debian/libplasma-glib400.install
-dev/tasks/linux-packages/apache-arrow/debian/libplasma400.install
-dev/tasks/linux-packages/apache-arrow/debian/patches/series
-dev/tasks/linux-packages/apache-arrow/debian/plasma-store-server.install
-dev/tasks/linux-packages/apache-arrow/debian/rules
-dev/tasks/linux-packages/apache-arrow/debian/source/format
-dev/tasks/linux-packages/apache-arrow/debian/watch
-dev/tasks/requirements*.txt
-dev/tasks/conda-recipes/*
 pax_global_header
 MANIFEST.in
 __init__.pxd
@@ -109,8 +33,6 @@ requirements.txt
 .gitattributes
 rust-toolchain
 benchmarks/queries/q*.sql
-python/rust-toolchain
-python/requirements*.txt
 **/testdata/*
 benchmarks/queries/*
 benchmarks/expected-plans/*
diff --git a/dev/release/release-crates.sh b/dev/release/release-crates.sh
index 00ce77a86749..b9bda68b780b 100644
--- a/dev/release/release-crates.sh
+++ b/dev/release/release-crates.sh
@@ -21,7 +21,7 @@
 # This script publishes datafusion crates to crates.io.
 #
 # This script should only be run after the release has been approved
-# by the arrow PMC committee.
+# by the Apache DataFusion PMC committee.
 #
 # See release/README.md for full release instructions

diff --git a/dev/release/release-tarball.sh b/dev/release/release-tarball.sh
index 74a4bab3aecd..bd858d23a767 100755
--- a/dev/release/release-tarball.sh
+++ b/dev/release/release-tarball.sh
@@ -21,10 +21,10 @@
 # Adapted from https://github.com/apache/arrow-rs/tree/master/dev/release/release-tarball.sh

 # This script copies a tarball from the "dev" area of the
-# dist.apache.arrow repository to the "release" area
+# dist.apache.datafusion repository to the "release" area
 #
 # This script should only be run after the release has been approved
-# by the arrow PMC committee.
+# by the Apache DataFusion PMC committee.
 #
 # See release/README.md for full release instructions
 #
@@ -43,7 +43,7 @@ fi
 version=$1
 rc=$2

-tmp_dir=tmp-apache-arrow-datafusion-dist
+tmp_dir=tmp-apache-datafusion-dist

 echo "Recreate temporary directory: ${tmp_dir}"
 rm -rf ${tmp_dir}
@@ -52,14 +52,14 @@ mkdir -p ${tmp_dir}
 echo "Clone dev dist repository"
 svn \
   co \
-  https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-${version}-rc${rc} \
+  https://dist.apache.org/repos/dist/dev/datafusion/apache-datafusion-${version}-rc${rc} \
   ${tmp_dir}/dev

 echo "Clone release dist repository"
-svn co https://dist.apache.org/repos/dist/release/arrow ${tmp_dir}/release
+svn co https://dist.apache.org/repos/dist/release/datafusion ${tmp_dir}/release

 echo "Copy ${version}-rc${rc} to release working copy"
-release_version=arrow-datafusion-${version}
+release_version=datafusion-${version}
 mkdir -p ${tmp_dir}/release/${release_version}
 cp -r ${tmp_dir}/dev/* ${tmp_dir}/release/${release_version}/
 svn add ${tmp_dir}/release/${release_version}
@@ -71,4 +71,4 @@ echo "Clean up"
 rm -rf ${tmp_dir}

 echo "Success! The release is available here:"
-echo " https://dist.apache.org/repos/dist/release/arrow/${release_version}"
+echo " https://dist.apache.org/repos/dist/release/datafusion/${release_version}"
diff --git a/dev/release/verify-release-candidate.sh b/dev/release/verify-release-candidate.sh
index 45e984dec3a0..2c0bd216b3ac 100755
--- a/dev/release/verify-release-candidate.sh
+++ b/dev/release/verify-release-candidate.sh
@@ -33,7 +33,7 @@ set -o pipefail

 SOURCE_DIR="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)"
 ARROW_DIR="$(dirname $(dirname ${SOURCE_DIR}))"
-ARROW_DIST_URL='https://dist.apache.org/repos/dist/dev/arrow'
+ARROW_DIST_URL='https://dist.apache.org/repos/dist/dev/datafusion'

 download_dist_file() {
   curl \
@@ -45,7 +45,7 @@ download_dist_file() {
 }

 download_rc_file() {
-  download_dist_file apache-arrow-datafusion-${VERSION}-rc${RC_NUMBER}/$1
+  download_dist_file apache-datafusion-${VERSION}-rc${RC_NUMBER}/$1
 }

 import_gpg_keys() {
@@ -143,11 +143,11 @@ test_source_distribution() {

 TEST_SUCCESS=no

-setup_tempdir "arrow-${VERSION}"
+setup_tempdir "datafusion-${VERSION}"
 echo "Working in sandbox ${ARROW_TMPDIR}"
 cd ${ARROW_TMPDIR}

-dist_name="apache-arrow-datafusion-${VERSION}"
+dist_name="apache-datafusion-${VERSION}"
 import_gpg_keys
 fetch_archive ${dist_name}
 tar xf ${dist_name}.tar.gz
diff --git a/dev/update_arrow_deps.py b/dev/update_arrow_deps.py
index b685ad2738b1..268ded38f6e8 100755
--- a/dev/update_arrow_deps.py
+++ b/dev/update_arrow_deps.py
@@ -17,7 +17,7 @@
 # limitations under the License.
 #

-# Script that updates the arrow dependencies in datafusion and ballista, locally
+# Script that updates the arrow dependencies in datafusion locally
 #
 # installation:
 # pip install tomlkit requests
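
Reviewer note: as a rough illustration of the naming scheme this change moves to, the sketch below derives the release-candidate and final-release locations from a version and an RC number. The URL layout and tarball name are copied from `create-tarball.sh` and `release-tarball.sh` as changed in this diff; the function names (`rc_dev_url`, `release_url`, `tarball_name`) and the `DIST_BASE` variable are hypothetical and do not exist in the repository.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of the release scripts.
set -euo pipefail

# Root of the Apache distribution SVN server, as used throughout the diff.
DIST_BASE="https://dist.apache.org/repos/dist"

# Location of a release candidate in the "dev" area
# (mirrors the `url=` assignment in create-tarball.sh).
rc_dev_url() {
  local version="$1" rc="$2"
  echo "${DIST_BASE}/dev/datafusion/apache-datafusion-${version}-rc${rc}"
}

# Location of the final release in the "release" area
# (mirrors `release_version=datafusion-${version}` in release-tarball.sh).
release_url() {
  local version="$1"
  echo "${DIST_BASE}/release/datafusion/datafusion-${version}"
}

# Name of the source tarball found in either directory
# (mirrors `tarname=${release}.tar.gz` in create-tarball.sh).
tarball_name() {
  local version="$1"
  echo "apache-datafusion-${version}.tar.gz"
}

# Example: print the locations for 38.0.0 RC0.
rc_dev_url 38.0.0 0
release_url 38.0.0
tarball_name 38.0.0
```

Keeping the three name patterns in one place like this may make it easier to spot a script that still points at the old `arrow` paths.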