-
Notifications
You must be signed in to change notification settings - Fork 7.3k
ZOOKEEPER-4849: Option to Provide Custom X509 Implementation of QuorumAuthServer and QuorumAuthLearner #2269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…mAuthServer and QuorumAuthLearner
@eolivelli @kezhuw @anmolnar could you please review? This change implements our paranoid security recommendation: each ZooKeeper quorum member must present a distinct mTLS certificate, validated against its CNAME. The patch adds the ability to configure authentication for both quorum servers and learners—very similar to the work tracked in https://issues.apache.org/jira/browse/ZOOKEEPER-2123 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the implementation of ZOOKEEPER-2123, we now have the option to implement a custom X509 AuthenticationProvider for client-server authentication. We also need a similar option for quorum authentication.
I think X509AuthenticationProvider
exists mainly for integration with ACL. Regarding ssl, it has no specials(it do have some customizations, say hostname verification, but that does not matter), it is just a normal cert verifier.
In our cloud environment, we could restrict quorum members to those with specific certificates, allowing only those to join the quorum.
Why are you ever issue certificates for these servers or can you revoke those you want to reject ? I think you could use a dedicated ca for zookeeper quorum, and never issue certificates to non zookeeper servers.
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumPeer.java
Outdated
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumPeer.java
Outdated
Show resolved
Hide resolved
@@ -390,6 +393,10 @@ public void parseProperties(Properties zkProp) throws IOException, ConfigExcepti | |||
multiAddressReachabilityCheckEnabled = parseBoolean(key, value); | |||
} else if (key.equals("oraclePath")) { | |||
oraclePath = value; | |||
} else if (key.equals(QuorumAuth.QUORUM_SSL_AUTHPROVIDER)) { | |||
sslAuthServerProvider = value; | |||
} else if (key.equals(QuorumAuth.QUORUM_SSL_LEARNER_AUTHPROVIDER)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No java system properties ? I found that most other ssl config options supports both, sslQuorumReloadCertFiles
is an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve always wondered why we need to pull the server-side settings from JVM system properties when they’re almost always defined in zoo.cfg. Supporting both adds needless complexity. It makes sense to allow client-side overrides via –D flags, but reading the server properties that way isn’t strictly necessary. In any event, I’ll go ahead and add the system-property support for completeness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but reading the server properties that way isn’t strictly necessary
I guess ZooKeeper server was designed to own/occupy jvm.
You may have to delete these tow options from QuorumPeerConfig
, otherwise it will not propagate them as system properties.
...er-server/src/test/java/org/apache/zookeeper/server/quorum/auth/MockSslQuorumAuthServer.java
Show resolved
Hide resolved
@kezhuw thank you for your review. About using a dedicated CA: we actually use a global certificate-management system called Athenz to issue certs for all our services, including multiple ZooKeeper quorums. Spinning up a completely separate CA per quorum would be painful—every time you add or remove a ZooKeeper node you’d have to provision or revoke a cert in that dedicated CA. Instead, you can continue using Athenz but tighten its issuance policies for your ZK quorums. For example:
This lets you keep a single, centrally managed CA (Athenz) while still ensuring only bona fide quorum members can join. |
…mAuthServer and QuorumAuthLearner - review comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config option does not look right to me, I left some inline comments.
I am open to export QuorumAuthServer
, but still have some concerns.
every time you add or remove a ZooKeeper node you’d have to provision or revoke a cert in that dedicated CA.
Dynamic Membership: When a new ZK server spins up, it simply presents its Athenz role and automatically gets a cert scoped to the quorum domain—no manual CA changes.
We could divide membership changes.
- Add a new member. This requires no change. CA simply signs a new cert for the new member. Existing quorum member could verify the newly signed cert with existing trust store.
- Revoke a old member. ZooKeeper currently have
ssl.quorum.crl
to support cert revocation andsslQuorumReloadCertFiles
to detect trust store changes in filesystem.
ssl.crl and ssl.quorum.crl : (Java system properties: zookeeper.ssl.crl and zookeeper.ssl.quorum.crl) New in 3.5.5: Specifies whether Certificate Revocation List is enabled in client and quorum TLS protocols. Default: false
sslQuorumReloadCertFiles : (No Java system property) New in 3.5.5, 3.6.0: Allows Quorum SSL keyStore and trustStore reloading when the certificates on the filesystem change without having to restart the ZK process. Default: false
The questions from my side:
- Is
ssl.quorum.crl
andsslQuorumReloadCertFiles
enough for revocation ? - If not, how
QuorumAuthLearner
would help in your case ?
Scoped Roles or Domains: Define an Athenz domain (e.g. zookeeper.quorum) and only allow services in that domain to get certs with a specific OU or SAN (ou=zookeeper-quorum).
- Does
ssl.quorum.hostnameVerification
already fulfill the requirement ? This require a match between zookeeper node hostname and cert san names. For example, it will reject node "node1.kafka.quorum" from joining quorum with cert san "node1.zookeeper.quorum". - How about a dedicated ca(say, "ca.zookeeper.quorum") for zookeeper domain so you don't have to filter out non zookeeper nodes ? This is what I suggested in previous comment. This way, you don't have to reject node "node1.kafka.quorum" from joining quorum with cert san "node1.kafka.quorum" as cert "node1.kafka.quorum" is not signed by "ca.zookeeper.quorum".
- Is it worth for zookeeper to have an option (say, "zookeeper.ssl.quorum.nameFilter") to filter certs on cn/san ? I would prefer dedicated ca.
@eolivelli @anmolnar @cnauroth Any thoughts ?
authLearner = getSslQuorumAuthLearner(); | ||
} catch (Exception e) { | ||
LOG.error(e.getMessage(), e); | ||
throw new SaslException(e.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a IOException
? I guess you may have to change some signatures.
@@ -390,6 +393,10 @@ public void parseProperties(Properties zkProp) throws IOException, ConfigExcepti | |||
multiAddressReachabilityCheckEnabled = parseBoolean(key, value); | |||
} else if (key.equals("oraclePath")) { | |||
oraclePath = value; | |||
} else if (key.equals(QuorumAuth.QUORUM_SSL_AUTHPROVIDER)) { | |||
sslAuthServerProvider = value; | |||
} else if (key.equals(QuorumAuth.QUORUM_SSL_LEARNER_AUTHPROVIDER)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but reading the server properties that way isn’t strictly necessary
I guess ZooKeeper server was designed to own/occupy jvm.
You may have to delete these tow options from QuorumPeerConfig
, otherwise it will not propagate them as system properties.
// Reflection helpers to access private methods | ||
|
||
private QuorumAuthServer invokeGetSslQuorumAuthServer(QuorumPeer peer) throws Exception { | ||
Method m = QuorumPeer.class.getDeclaredMethod("getSslQuorumAuthServer"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you should do here is reading the field authServer
for assertion, its getter is getQuorumAuthServer
.
… of QuorumAuthServer and QuorumAuthLearner