HADOOP-19636. [JDK17] Remove EOL OS Support and Clean Up Dockerfile. #7822
base: trunk
(!) A patch to the testing environment has been detected.
🎊 +1 overall
This message was automatically generated.
@ayushtkn @GauthamBanasandra @Hexiaoqiao Could you please help review this PR? Thank you very much! cc: @pan3793
// This stage serves as a means of cross platform validation, which is
// really needed to ensure that any C++ related/platform change doesn't
// break the Hadoop build on Centos 7.
stage ('precommit-run Centos 7') {
there's leftover centos 7 stuff at line 86
+1
# Dockerfile for installing the necessary dependencies for building Hadoop.
# See BUILDING.txt.

FROM centos:8
I don't think we should remove CentOS 8. Instead, we should migrate it to Rocky Linux 8 (or another RHEL-like OS) in place, and then to 9 or 10.
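To make the "in place" idea concrete, here is a minimal sketch of swapping only the base image line of an existing Dockerfile, so that the change shows up as a small diff. The helper name `swap_base_image` is hypothetical, and `rockylinux:8` is an assumed image tag; a real migration would likely also need package and repository changes that this sketch does not attempt.

```shell
# Sketch only: swap_base_image is a hypothetical helper, not part of Hadoop.
# It rewrites a line of the form "FROM <old_image>" to "FROM <new_image>"
# and leaves every other line of the Dockerfile untouched.
swap_base_image() {
  local dockerfile="$1" old_image="$2" new_image="$3"
  # Write to a temporary file first so the original is replaced only on success.
  sed "s|^FROM ${old_image}\$|FROM ${new_image}|" "$dockerfile" > "$dockerfile.tmp" \
    && mv "$dockerfile.tmp" "$dockerfile"
}
```

Under these assumptions, running `swap_base_image <path to Dockerfile_centos_8> centos:8 rockylinux:8` followed by `git diff` would show a one-line change to the `FROM` instruction.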
From a personal perspective, I don't agree with your suggestion. I believe we should completely remove operating systems that have reached their End of Life (EOL). If we need to support CentOS 9 or Debian 12 in the future, it should be done by submitting a new PR for a thorough evaluation. Rather than maintaining multiple Dockerfiles, I prefer a more lightweight approach, such as providing support through documentation. As the number of supported operating systems increases, if we have to maintain Dockerfiles for each one, we could end up managing dozens, which is neither cost-effective nor sustainable.
"I believe we should completely remove operating systems that have reached their End of Life (EOL). If we need to support CentOS 9 or Debian 12 in the future, it should be done by submitting a new PR for a thorough evaluation."
I don't see much benefit in this proposal. Upgrading in place is straightforward, and it leaves a clear diff in the commit history that shows users what they need to change when planning an OS upgrade for their Hadoop cluster.
"Rather than maintaining multiple Dockerfiles, I prefer a more lightweight approach, such as providing support through documentation."
The documentation can easily become outdated (you can try "Building on macOS (without Docker)" in BUILDING.txt). As I replied here, I think the Dockerfile itself is the best documentation for setting up the build environment.
https://lists.apache.org/thread/2ypqcrnsth3jk21rpjvjv53tntz21ht8
The choice of operating system should be made by the user, and therefore, the resolution of compilation issues should also be handled by the user.
Take CentOS 7 as an example, which has multiple versions (such as 7.2, 7.3, 7.9, etc.). Different versions may have configuration or dependency differences (e.g., glibc, gcc versions), which can lead to compilation issues, such as with protobuf or native package compilation. For these issues, we should not add extra workarounds, as that would make the project redundant.
If we were to upgrade to CentOS 9, we would change the Dockerfile name from Dockerfile_centos_8 to Dockerfile_centos_9. Users comparing the diff would see that Dockerfile_centos_8 has been deleted and replaced with Dockerfile_centos_9, which contains entirely new content.
I understand that we cannot enumerate all Linux distributions and versions. I believe most enterprises use the Debian/RHEL families of Linux distributions to run Hadoop. Given the limited developer resources in the Hadoop community, how about keeping only two OS Dockerfiles and CI pipelines: the latest (or second-latest) version of Ubuntu (the default environment for building, testing, and releasing) and Rocky Linux (only to verify compilation)? They will serve as references for users who want to set up a build environment based on their preferred Linux distribution.
My point is that we should remove dependencies that have reached EOL, just as with some other modules. In this case, CentOS 8 has reached its EOL and its packages are no longer available on the mirror.centos.org site (https://www.centos.org/centos-linux-eol/). So +1 to Shilun's comments from my side. cc @pan3793 What do you think? Thanks.
@Hexiaoqiao If you agree to retain at least one RHEL-family OS Dockerfile for building Hadoop, I suggest keeping CentOS 8, because it still works for the Hadoop project build as of today (the mirror.centos.org site was replaced by vault.centos.org; see dev-support/docker/pkg-resolver/set-vault-as-baseurl-centos.sh). I plan to migrate it to Rocky Linux 8 soon.
https://endoflife.date/rocky-linux
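For readers unfamiliar with the vault redirect mentioned above, it can be sketched roughly as follows. This is not the content of set-vault-as-baseurl-centos.sh, which may differ. The function name `use_vault_baseurl` and its directory argument are hypothetical, and the sketch assumes the stock repo files contain a `mirrorlist=` line next to a commented-out `#baseurl=http://mirror.centos.org...` line.

```shell
# Sketch only: use_vault_baseurl is a hypothetical helper, not the script
# shipped in dev-support/docker/pkg-resolver.
# Assumption: each CentOS-*.repo file has "mirrorlist=" and
# "#baseurl=http://mirror.centos.org..." lines, as in the stock images.
use_vault_baseurl() {
  local repo_dir="${1:-/etc/yum.repos.d}"
  local f
  for f in "$repo_dir"/CentOS-*.repo; do
    # Skip the unexpanded glob when no repo files are present.
    [ -e "$f" ] || continue
    # Comment out mirrorlist lookups and enable a baseurl that points
    # at vault.centos.org instead of mirror.centos.org.
    sed -e 's|^mirrorlist=|#mirrorlist=|' \
        -e 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|' \
        "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
}
```

Inside an image build this would be called without arguments before any `yum` or `dnf` install step; passing a directory makes it possible to try the rewrite on copies of the repo files first.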
I'm supportive of a RHEL variant. There's also the option of an Amazon Linux container image, which uses yum.
# Dockerfile for installing the necessary dependencies for building Hadoop.
# See BUILDING.txt.

FROM debian:10
Same here: we should upgrade it to Debian 12 or 13 in place.
Thanks @slfan1989 for your work. +1 from my side.
// This stage serves as a means of cross platform validation, which is
// really needed to ensure that any C++ related/platform change doesn't
// break the Hadoop build on Centos 8.
stage ('precommit-run Centos 8') {
Instead of removing CentOS 8, I would suggest replacing it with another supported RHEL8 clone, like Rocky Linux.
I think we should replace CentOS 8 instead of dropping it outright.
Thank you for your feedback. Feel free to continue sharing your thoughts in this email thread. So far, I’ve received comments from @ayushtkn, @Hexiaoqiao, @cnauroth, and @pan3793. We are still in the discussion phase, and a final decision will be made based on the collective input. https://lists.apache.org/thread/2ypqcrnsth3jk21rpjvjv53tntz21ht8
@GauthamBanasandra Thank you, and I look forward to hearing your thoughts on this issue.
Can you please forward the last email to [email protected] @slfan1989 so that I can reply?
I’ve cc’d you on the email; please have a look when it’s convenient for you.
Description of PR
JIRA: HADOOP-19636. [JDK17] Remove EOL OS Support and Clean Up Dockerfile.
In the Apache Hadoop project, we have historically supported multiple Linux distributions, including CentOS 7, CentOS 8, and Debian 10, as part of our build and test environments. However, all three distributions have now reached End-of-Life (EOL) status and are no longer officially maintained or supported.
To ensure long-term maintainability and security, we propose to deprecate and clean up build support related to these EOL platforms. This includes:
Dockerfiles
Jenkins pipeline configurations
This cleanup will simplify our CI infrastructure and reduce the maintenance burden going forward.
This PR is intended to remove support for CentOS 7, CentOS 8, and Debian 10, all of which have reached EOL, in a single submission, including related Dockerfiles and Jenkins build configurations. The goal is to avoid multiple follow-up cleanup PRs, thereby improving efficiency and reducing review overhead.
How was this patch tested?
junit test.
For code changes:
LICENSE, LICENSE-binary, NOTICE-binary files?