Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Authentication: Fix immediate update_connection_password to use username #3330

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

liorsve
Copy link
Contributor

@liorsve liorsve commented Mar 5, 2025

Issue link

This Pull Request is linked to issue: #3289

This PR fixes the update connection password to add the username as arg in update_connection_password(immediateAuth = True) (i.e. send "AUTH [username] [password]") in case username was configured in ServerCredentials during client creation.
Currently, only "AUTH [password]" is sent.

Additionally, it fixes the implementation of update_connection_password in CMD - previously, the updated password would be saved only in the multiplexed_connection struct, and then upon disconnection the authentication would still be with the old password. Now, the password is updated in the ReconnectingConnection struct which is used in reconnecitons.

NOTES-

  • Deleting the user with delete_acl_username_and_password is triggering a disconnect from clients authenticated with that username.
  • Unlike the previous tests, calling update_connection_password with non-immediate authentication when disconnected from server (for example after deleting the user or actively killing the connection) is supposed to succeed both for CME and CMD. That's because this is an internal operation and doesn't require an active connection. The difference is, when calling with immediate auth, CME can reconnect with that supplied password while CMD tries to reconnect with the initial password and thus won't succeed.

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one issue.
  • Commit message has a detailed description of what changed and why.
  • Tests are added or updated.
  • CHANGELOG.md and documentation files are updated.
  • Destination branch is correct - main or release
  • Create merge commit if merging release branch into main, squash otherwise.

@liorsve liorsve requested a review from a team as a code owner March 5, 2025 11:57
@liorsve liorsve force-pushed the add-username-auth branch from 225ff28 to d19562d Compare March 5, 2025 12:41
@liorsve liorsve added Rust core redis-rs/glide-core matter Core changes Used to label a PR as PR with significant changes that should trigger a full matrix tests. labels Mar 5, 2025
@Yury-Fridlyand Yury-Fridlyand added the node Node.js wrapper label Mar 5, 2025
Comment on lines 2286 to 2292
let username = core
.get_cluster_param(|params| params.username.clone())
.expect(MUTEX_WRITE_ERR);
Ok(Response::Single(match username {
Some(u) => Value::SimpleString(u),
None => Value::Nil,
}))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let username = core
.get_cluster_param(|params| params.username.clone())
.expect(MUTEX_WRITE_ERR);
Ok(Response::Single(match username {
Some(u) => Value::SimpleString(u),
None => Value::Nil,
}))
let username = match core
.get_cluster_param(|params| params.username.clone())
.expect(MUTEX_READ_ERR)
{
Some(username) => Value::SimpleString(username),
None => Value::Nil,
};
Ok(Response::Single(username))

Comment on lines 529 to 534
ClientWrapper::Cluster { ref mut client } => {
if let Ok(Value::SimpleString(username)) = client.get_username().await {
cmd.arg(username);
}
}
ClientWrapper::Standalone(_) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this logic into get_username function in glide_core's Client struct

@@ -71,6 +75,29 @@ describe("Auth tests", () => {
): { host: string; port: number }[] =>
addresses.map(([host, port]) => ({ host, port }));

async function setNewAclUsernameWithPassword(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the deleteAclUsernameAndPassword functions should handle the result of the requests - if the command fails you should catch it and panic, otherwise we won't know why the test is failing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an assertion everywhere besides delete_acl_username_and_password in the python tests - there the acl user is set only for the acl specific tests that I added, so asserting that a client was deleted fails on the non acl-user tests. So instead, in python I had the delete user function return the command's responds (1 if a user was deleted, 0 if not) and i'm asserting inside the relevant tests themselves that it's returning 1 when it should.

@liorsve liorsve changed the title Authentication: Fix CME immediate update_connection_password to use username Authentication: Fix immediate update_connection_password to use username Mar 10, 2025
@liorsve liorsve force-pushed the add-username-auth branch 7 times, most recently from eeecc18 to 6d91343 Compare March 10, 2025 16:17
@liorsve
Copy link
Contributor Author

liorsve commented Mar 10, 2025

Changes -

  • addressed previous comments about the CME changes.
  • added CMD fix in core + CMD tests.
  • In tests, I had to introduce some sleep wherever a server disconnection + password change was made, otherwise they were flaky. I encountered the same timing issue that was talked about here Remove outdated test for updating connection password in cluster mode #3311, where the client sometimes wouldn't have enough time to try and reconnect, probably because it takes some time for the new password to register in the server as well, and if the two are not well-synchronized in their timing it results in a connectionError. Introducing some sleep after the disconnection seems to have resolved it, so for now I re-added the test that was removed in this PR.

@barshaul @ikolomi @avifenesh

Copy link
Collaborator

@barshaul barshaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome job! added comments inline.
What about Java's tests?

@@ -119,7 +120,11 @@ async fn create_connection(
discover_az: bool,
connection_timeout: Duration,
) -> Result<ReconnectingConnection, (ReconnectingConnection, RedisError)> {
let client = &connection_backend.connection_info;
let client = {
let guard = connection_backend.connection_info.read().unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead unwrap use except with read lock err

@@ -363,4 +392,14 @@ impl ReconnectingConnection {
log_error("disconnect notifier", "BUG! Disconnect notifier is not set");
}
}

pub fn update_connection_password(&self, new_password: Option<String>) {
let mut client = self.inner.backend.connection_info.write().unwrap();
Copy link
Collaborator

@barshaul barshaul Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same regarding unwrap - —please ensure that all unwrap calls on locks are changed to expect. If I missed any instances where unwrap is used in a different context, keep in mind that we should avoid using unwrap in the code altogether

}

pub async fn get_username(&self) -> Option<String> {
self.get_primary_connection().get_username()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment explaining that all nodes in the client should have the same username and password thus we can get it from any of the nodes

Ok(Value::Okay)
}

pub async fn get_username(&self) -> Option<String> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add documentation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, why is it 'async'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason, I think a leftover from a prev version where the inner get_username function was async. Anyways made it sync

async fn get_username(&mut self) -> Option<String> {
match &mut self.internal_client {
ClientWrapper::Cluster { client } => {
if let Ok(Value::SimpleString(username)) = client.get_username().await {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone would change get_username to return a different Value (e.g. not SimpleString, but for example bytes), we will always return None and we won't know what's the issue.
Use match instead to make sure you panic if you get Ok with Value which isn't SimpleString

protocol,
clusterMode,
);
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing newline

request,
cluster_mode,
protocol=protocol,
request_timeout=5000,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it needed for the ACL tests? I think you can remove this and use the default

return await client.custom_command(
["ACL", "DELUSER", username], route=AllNodes()
)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else: raise error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean else if it's not an instance of glideClient/glideClusterClient? but It's a TGlideCLient

raise RuntimeError(f"Failed to set ACL user: {e}")


async def delete_acl_username_and_password(client: TGlideClient, username):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing typing for username and password across the file

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this function to python/python/tests/utils/utils.py

@@ -330,6 +364,37 @@ async def config_set_new_password(client: TGlideClient, password):
await client.config_set({"requirepass": password}, route=AllNodes())


async def set_new_acl_username_with_password(client: TGlideClient, username, password):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this function to python/python/tests/utils/utils.py

@liorsve liorsve force-pushed the add-username-auth branch 4 times, most recently from afd12bd to 44ae581 Compare March 20, 2025 11:56
Ok(Value::SimpleString(username)) => Some(username),
Ok(Value::Nil) => None,
Ok(other) => panic!("Expected SimpleString, got: {:?}", other),
Err(e) => panic!("Got error trying to get username: {:?}", e),
Copy link
Collaborator

@barshaul barshaul Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to panic if get_username fails, as it would crash the whole client. Instead, change this function (get_username in mod.rs) to return Result<Option<String>> and return the error if we get one.

let guard = connection_backend
.connection_info
.read()
.expect("READ_LOCK_ERR");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should define consts and use them:
const WRITE_LOCK_ERR: &str = "Failed to acquire the write lock";
const READ_LOCK_ERR: &str = "Failed to acquire the read lock";

impl ConnectionBackend {
/// Returns a read-only reference to the client's connection information
fn get_backend_client(&self) -> RwLockReadGuard<'_, redis::Client> {
self.connection_info.read().expect("READ_LOCK_ERR")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same regarding the error msg

.backend
.connection_info
.write()
.expect("WRITE_LOCK_ERR");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same reg. excpect err msg

.backend
.connection_info
.read()
.expect("READ_LOCK_ERR");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

let client = self
.inner
.backend
.connection_info
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't you use get_backend_client here?

@@ -604,18 +604,23 @@ impl StandaloneClient {
});
}

/// Update the password used to authenticate with the servers.
/// Update the password used to authenticate with the server.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Update the password used to authenticate with the server.
/// Update the password used to authenticate with the servers.

Thread.sleep(1000);

// Validate client reconnected succsessfuly
assertNotNull(testClient.info().get());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you deleted the ACL user and haven't updated the client yet, how does it work? shouldn't we get an error here?

Copy link
Collaborator

@barshaul barshaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments inline

liorsve and others added 15 commits March 23, 2025 12:13
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
Co-authored-by: Bar Shaul <[email protected]>
Signed-off-by: Lior Sventitzky <[email protected]>
@liorsve liorsve force-pushed the add-username-auth branch from e456154 to 9a07e5d Compare March 23, 2025 10:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Core changes Used to label a PR as PR with significant changes that should trigger a full matrix tests. node Node.js wrapper Rust core redis-rs/glide-core matter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants