Contributor

@slfan1989 slfan1989 commented Aug 29, 2025

What changes were proposed in this pull request?

JIRA: HIVE-29168. Upgrade Hadoop Version to 3.4.2.

Why are the changes needed?

Hadoop 3.4.2 has been released. This JIRA updates the Hadoop dependency to version 3.4.2.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Compiled manually on a local machine and verified through CI.

@slfan1989
Contributor Author

@ayushtkn Could you please help review this PR? Thank you very much! Due to the Hadoop version upgrade, I have updated the relevant dependencies in the Hive project.

@ayushtkn
Member

ayushtkn commented Aug 30, 2025

There is an issue with building the Docker image now:

+ HADOOP_FILE_NAME=hadoop-3.4.2.tar.gz
+ HADOOP_URL=https://archive.apache.org/dist/hadoop/core/hadoop-3.4.2/hadoop-3.4.2.tar.gz
+ '[' '!' -f /Users/ayushsaxena/code/hive/packaging/src/docker/../../cache/hadoop-3.4.2.tar.gz ']'
+ echo 'Downloading Hadoop from https://archive.apache.org/dist/hadoop/core/hadoop-3.4.2/hadoop-3.4.2.tar.gz...'
Downloading Hadoop from https://archive.apache.org/dist/hadoop/core/hadoop-3.4.2/hadoop-3.4.2.tar.gz...
+ curl --fail -L https://archive.apache.org/dist/hadoop/core/hadoop-3.4.2/hadoop-3.4.2.tar.gz -o /Users/ayushsaxena/code/hive/packaging/src/docker/../../cache/hadoop-3.4.2.tar.gz.tmp
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0   196    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (22) The requested URL returned error: 404
+ echo 'Fail to download Hadoop, exiting....'
Fail to download Hadoop, exiting....
+ exit 1

This file doesn't exist: https://archive.apache.org/dist/hadoop/core/hadoop-3.4.2/hadoop-3.4.2.tar.gz

They only uploaded the lean tar, not the full tar.

The release page has the same problem: as of now, the tar.gz download link there is broken.
https://hadoop.apache.org/release/3.4.2.html
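
For anyone hitting the same 404, a minimal sketch of a probe-and-fall-back step for the download follows. It is not the script under `packaging/src/docker`; the function names are hypothetical, and the `-lean` file name is an assumption about how the lean tar is named, so check the archive listing before relying on it.

```shell
# Sketch only: choose the first Hadoop tarball URL that actually exists.
# ASSUMPTION: the lean tar is named hadoop-<version>-lean.tar.gz.
# All function names here are hypothetical, not part of the Hive build scripts.

HADOOP_BASE_URL="https://archive.apache.org/dist/hadoop/core"

# Print candidate URLs for a version: the full tar first, then the assumed lean tar.
hadoop_candidate_urls() {
  echo "${HADOOP_BASE_URL}/hadoop-$1/hadoop-$1.tar.gz"
  echo "${HADOOP_BASE_URL}/hadoop-$1/hadoop-$1-lean.tar.gz"
}

# Succeed if a HEAD request for the URL returns no HTTP error.
url_exists() {
  curl --fail --silent --head --location --output /dev/null "$1"
}

# Print the first candidate that the probe accepts; return 1 if none is accepted.
# The probe command (second argument) defaults to url_exists and can be
# replaced so that the selection logic can be exercised without network access.
pick_hadoop_url() {
  pick_version="$1"
  pick_probe="${2:-url_exists}"
  for pick_url in $(hadoop_candidate_urls "$pick_version"); do
    if "$pick_probe" "$pick_url"; then
      echo "$pick_url"
      return 0
    fi
  done
  return 1
}
```

A build script could then set `HADOOP_URL="$(pick_hadoop_url 3.4.2)"` and exit with the existing error message when the call fails. Whether the lean tar is an acceptable substitute for the image is a separate question that this sketch does not answer.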

@slfan1989
Contributor Author

@ayushtkn Thank you for the information. I attempted to download the full tar, but was unable to. I will @ Ahmar in this PR and send an email to explain the situation.

@slfan1989
Contributor Author

@ahmarsuhail Currently, there seem to be some issues with Hadoop 3.4.2, and the full tar package hasn't been uploaded. Could you please help check this? Thank you very much!

@ahmarsuhail

@slfan1989 We're unable to upload the full tar package: the AWS SDK alone is 550MB, the full tar exceeds 1GB, and neither server accepts a file that large. These were also not included in RC3.

The lean tars for both ARM and x86 are available. I noticed late on Friday that this breaks the download links as well, because they point to the full tar while we're only able to upload the lean tar.

Do you have any suggestions about what we can do here? We could rename the lean tar to hadoop-3.4.2.tar.gz, but I'm not sure that's the right call.

@ayushtkn / @slfan1989 maybe one of you can send an email to the mailing list about this issue so we can discuss it there?

@slfan1989
Contributor Author

@ahmarsuhail Thank you for the explanation! I will report this issue via email. INFRA can provide a temporary solution by increasing the directory size available for storing the tar package. I raised a similar issue with INFRA before, though I'm not sure whether it will help in this case. Here is the JIRA link I submitted: https://issues.apache.org/jira/projects/INFRA/issues/INFRA-25423.

@ahmarsuhail

Sounds good @slfan1989, thank you! I'll cut the ticket to INFRA first thing Monday.

@slfan1989
Contributor Author

@ahmarsuhail Thank you for your reply. I plan to send an email to the community to explain this issue, so that other members can refer to our solution if they encounter similar problems in the future.

cc: @ayushtkn
