slfan1989
Contributor

@slfan1989 slfan1989 commented Oct 1, 2025

Description of PR

HADOOP-19711. Upgrade hadoop3 docker scripts to 3.4.2.

How was this patch tested?

Pushed to branch docker-hadoop-3.4.2:
https://github.com/slfan1989/hadoop/actions/runs/18157049664

The run built the image and tagged it as HADOOP-19711:
https://github.com/slfan1989/hadoop/pkgs/container/hadoop/

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@slfan1989
Contributor Author

@jojochuang @adoroszlai @ayushtkn Hadoop 3.4.2 has been released, and we are preparing a corresponding Docker image for Hadoop 3.4.2. I have created this PR to complete the Docker image release. Could you please review this PR? Thank you very much!

Contributor

@adoroszlai adoroszlai left a comment

Thanks @slfan1989 for the patch.

$ docker run -it --rm ghcr.io/slfan1989/hadoop:3.4.2 hadoop version
...
Hadoop 3.4.2
Source code repository https://github.com/apache/hadoop.git -r 84e8b89ee2ebe6923691205b9e171badde7a495c
Compiled by ahmarsu on 2025-08-20T10:30Z
Compiled on platform linux-x86_64
Compiled with protoc 3.23.4
From source with checksum fa94c67d4b4be021b9e9515c9b0f7b6
This command was run using /opt/hadoop/share/hadoop/common/hadoop-common-3.4.2.jar
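For a scripted version of this check, the version string can be pulled out of the `hadoop version` output. This is only a sketch: the helper name `hadoop_version_of` is hypothetical, and it assumes the version appears on a line of the form `Hadoop <version>`, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `hadoop version` output on stdin and print the
# version from the first line that looks like "Hadoop <version>".
hadoop_version_of() {
  # Strip carriage returns first, in case the output came from a TTY.
  tr -d '\r' | awk '$1 == "Hadoop" && NF == 2 { print $2; exit }'
}
```

Possible usage, dropping `-it` because the output is piped: `docker run --rm ghcr.io/slfan1989/hadoop:3.4.2 hadoop version | hadoop_version_of`.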

After this is merged, I suggest someone from Hadoop PMC upload the same image to Docker Hub, something like:

docker pull ghcr.io/apache/hadoop:3.4.2
docker tag ghcr.io/apache/hadoop:3.4.2 apache/hadoop:3.4.2
docker push apache/hadoop:3.4.2
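The three commands above could be wrapped in a small function so the version is typed only once. This is a sketch, not an established release procedure: the function name `mirror_image` and the `DOCKER` override variable are hypothetical, and it assumes the caller has already run `docker login` with an account that may push to `apache/hadoop`.

```shell
#!/bin/sh
# DOCKER defaults to the real CLI; set DOCKER=echo to print the commands
# instead of running them (dry run).
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: copy one tag from GHCR to Docker Hub.
mirror_image() {
  version="$1"
  src="ghcr.io/apache/hadoop:${version}"
  dst="apache/hadoop:${version}"
  # Stop at the first failing step so a failed pull is never pushed.
  "$DOCKER" pull "$src" &&
    "$DOCKER" tag "$src" "$dst" &&
    "$DOCKER" push "$dst"
}
```

Running `DOCKER=echo` followed by `mirror_image 3.4.2` prints the pull, tag, and push arguments without contacting any registry.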

Dockerfile Outdated

 FROM apache/hadoop-runner
-ARG HADOOP_URL=https://dlcdn.apache.org/hadoop/common/hadoop-3.4.1/hadoop-3.4.1.tar.gz
+ARG HADOOP_URL=https://dlcdn.apache.org/hadoop/common/hadoop-3.4.2/hadoop-3.4.2.tar
Contributor

BTW do you know why the tarball is published without .gz? It still seems to be gzipped:

$ file hadoop-3.4.2.tar
hadoop-3.4.2.tar: gzip compressed data, ...
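Where `file` is not installed (for example in a slim build image), the same check can be approximated by looking at the first two bytes, which are `1f 8b` for gzip data. A minimal sketch, with `is_gzip` as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the gzip magic
# bytes 1f 8b, regardless of what its extension says.
is_gzip() {
  magic="$(od -An -tx1 -N2 "$1" | tr -d ' \n')"
  [ "$magic" = "1f8b" ]
}
```

For example, `is_gzip hadoop-3.4.2.tar && tar -xzf hadoop-3.4.2.tar` would extract the archive only if it really is gzip-compressed.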

Contributor Author

Thank you very much for helping to review the code! I'm not sure why this package doesn't have the .gz extension.

@ahmarsuhail Could you please help take a look at this question? Thank you very much!

Contributor

Hey, sorry, I think I made a mistake while uploading the tar to the staging repo, and it then got copied incorrectly to the release directory. Can someone from the PMC please update the file name in the release directory?

It is gzipped, just missing the .gz extension. My apologies for the miss.

Contributor Author

@ahmarsuhail Thank you for the information—no need to apologize. I’ll try adding the .gz extension. Thanks again for your contribution to the hadoop-3.4.2 release.

Contributor Author

@slfan1989 slfan1989 Oct 4, 2025

I’ve already updated the dist repo, but the dlcdn hasn’t synchronized yet. It may take a few more hours.

....
[hadoop-3.4.2.tar.gz](https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-3.4.2/hadoop-3.4.2.tar.gz)
[hadoop-3.4.2.tar.gz.asc](https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-3.4.2/hadoop-3.4.2.tar.gz.asc)
[hadoop-3.4.2.tar.gz.sha512](https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-3.4.2/hadoop-3.4.2.tar.gz.sha512)
....
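Once the renamed tarball is downloaded, it can be checked against the published `.sha512` file. The sketch below makes several assumptions: `verify_sha512` is a hypothetical helper, `sha512sum` (GNU coreutils) is available, and the checksum file contains the digest as one unbroken run of 128 hex characters. If the digest is split across lines or groups, the function simply reports a mismatch.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-512 with the first 128-character
# hex digest found in the given checksum file.
verify_sha512() {
  file="$1"
  sumfile="$2"
  expected="$(grep -oE '[0-9a-fA-F]{128}' "$sumfile" | head -n 1 | tr 'A-F' 'a-f')"
  actual="$(sha512sum "$file" | cut -d ' ' -f 1)"
  # Fail if no digest was found, or if the digests differ.
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

Possible usage: `verify_sha512 hadoop-3.4.2.tar.gz hadoop-3.4.2.tar.gz.sha512 && echo OK`.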

@slfan1989
Contributor Author

@adoroszlai Thank you very much for the detailed explanation. However, I have never published a Docker image before, and pushing to Docker Hub presumably requires additional credentials. @jojochuang @ayushtkn, could you please take a look? Thank you very much!

@slfan1989 slfan1989 merged commit 0f4c272 into apache:docker-hadoop-3.4.2 Oct 6, 2025
1 check passed
@slfan1989
Contributor Author

@adoroszlai @ahmarsuhail Thank you very much for helping review the code!
