Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added snake case feature and tests #97

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

cathulhu
Copy link
Collaborator

This is pull request for issue #94.

@@ -7,3 +7,4 @@ __pycache__
swagger.json
config/swagger.json
remove
.vscode
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

Copy link
Collaborator

@shreyb shreyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I just had the one request for a change, and one general comment. The latter can be worked on in another issue.

endpoint_lowercase = arguments[1].lower()

# check for underscore seperated endpoint and convert it
if "_" in arguments[1]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you don't need this if check here. str.replace() should just leave the string untouched if it doesn't have an _ in it:

>>> "blah_blah".replace("_","")
'blahblah'
>>> "blahBlahBlah".replace("_","")
'blahBlahBlah'

if "_" in arguments[1]:
endpoint_lowercase = endpoint_lowercase.replace("_", "")

for entry in endpoints:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that endpoints is a dict, it'd be nice if we can look up endpoint_lowercase in endpoints rather than iterating through until we find the matching entry (or not). But for now, this certainly works.

Copy link
Collaborator

@shreyb shreyb Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about it, especially since this isn't a time-pressing issue, the more I feel we should really try to get this right.

So to summarize, this code would be a lot more efficient if instead of iterating through a set of dict keys, normalizing each one, and comparing it to a normalized candidate key; we mutate the candidate key to the format we expect, and then just do a dict lookup.

That is to say, if the user gives something like -e get_User_info, we KNOW they mean getUserInfo, and if they don't, we should treat their input as invalid. We could mutate the get_User_info to getUserInfo, and just do a membership check (i.e. if mutated_key in endpoints...). That's a lot quicker from a performance standpoint, and I suspect it'll tighten up the code too.

Kevin suggested that we could also make it interactive at that point, printing to stderr ("Changing given argument 'get_User_info' to 'getUserInfo'. Proceed (Y/n)?") or something like that, and letting the user choose whether or not to proceed. Printing it to stderr vs stdout, or trying to detect if the command is coming through a TTY would address the concerns of those who want to pipe the stdout to jq or something like that, as long as they don't throw in a 2>&1 or something like that.

def test_snakecase_and_underscore_conversion():
test_endpoints = ["getUserInfo"]
correct_args = ["-e", "getUserInfo", "--username=johndoe"]
test_args_case_underscore = ["-e", "_Get_USeriNFo", "--username=johndoe"]
Copy link
Collaborator

@shreyb shreyb Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to support the leading underscore? I know the code as you have it will pass this test, but we should probably say in the documentation that we only support snake case where it makes sense (i.e. getUserInfo, get_user_info, and get_User_Info are all supported, but _get_User_Info_ is not supported, even if it actually will work).

if "_" in arguments[1]:
endpoint_lowercase = endpoint_lowercase.replace("_", "")

for entry in endpoints:
Copy link
Collaborator

@shreyb shreyb Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about it, especially since this isn't a time-pressing issue, the more I feel we should really try to get this right.

So to summarize, this code would be a lot more efficient if instead of iterating through a set of dict keys, normalizing each one, and comparing it to a normalized candidate key; we mutate the candidate key to the format we expect, and then just do a dict lookup.

That is to say, if the user gives something like -e get_User_info, we KNOW they mean getUserInfo, and if they don't, we should treat their input as invalid. We could mutate the get_User_info to getUserInfo, and just do a membership check (i.e. if mutated_key in endpoints...). That's a lot quicker from a performance standpoint, and I suspect it'll tighten up the code too.

Kevin suggested that we could also make it interactive at that point, printing to stderr ("Changing given argument 'get_User_info' to 'getUserInfo'. Proceed (Y/n)?") or something like that, and letting the user choose whether or not to proceed. Printing it to stderr vs stdout, or trying to detect if the command is coming through a TTY would address the concerns of those who want to pipe the stdout to jq or something like that, as long as they don't throw in a 2>&1 or something like that.

) -> List[str]:

# check to see if the arguments supplied are for an endpoint. IE a "-e" was supplied
if arguments[0] == "-e" or arguments[0] == "-E":
Copy link
Collaborator

@shreyb shreyb Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was chatting with @retzkek , and this PR came up. He pointed out that this logic works if the user gives us exactly ferry-cli -e ENDPOINT, but if they don't, it becomes fragile. To demonstrate this, go ahead and try running a malformed command like:

ferry-cli --foo -e get_user_info --username=nosuchuser

In that case, arguments[0] is actually equal to --foo, and so the intended transformation of get_user_info to getUserInfo won't happen, and the error you get is actually:

An error occurred while using the FERRY CLI: Error: 'get_user_info' is not a valid endpoint. Run 'ferry -l' for a full list of available endpoints.

which is the wrong error to print here.

There are a couple of possible ways to fix this that I can think of off the top of my head:

  1. Use a function-scoped argparse.ArgumentParser JUST in this function, that only has the argument -e defined. Then use ArgumentParser.parse_known_args() on a copy of arguments to make sure you save all the arguments and can return an argument list with just the -e value replaced.
  2. Iterate through the arguments list, and once you find your -e or -E (or --endpoint), do the value transformation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to throw another wrench into the works:

and once you find your -e or -E (or --endpoint)

What if that changes? In five years someone comes along and says "-e/-E/--endpoint" is a terrible name, it should be -p/--point-final 🇫🇷." How does someone changing the main parser (on line 109) know to change this too (on line 548)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, at some point we can't solve for every possible future situation - we have to make some assumptions.

But to safeguard against this, Lydia, you could add a test to make sure that calls both FerryCLI.get_arg_parser and this function on the same set of args, and make sure both return expected values. Maybe there's a more elegant way than that, but something that ensures that what you're checking for here is a valid FerryCLI argument.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point I hoped to get across (perhaps too subtly) was that you should first look for a way to solve this that doesn't even need to know or care what the actual flags are. Even better if you don't need to do some second-level parsing of arguments at all. A couple ideas:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too subtle!

But in all seriousness, I do like the idea of having a custom Action for the --endpoint argument that just does this transformation on the argument automatically (or leaves it alone). @cathulhu - have you worked with custom Actions?

If not, we have a couple of examples in the FerryCLI class (get_endpoint_params_action), but to see a more direct, less-wrapped up example, here's one from jobsub where we store the values of the -G flag there in the environment: https://github.com/fermitools/jobsub_lite/blob/master/lib/get_parser.py#L42.

@shreyb shreyb linked an issue Jan 24, 2025 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow for derived arguments to be given as snake case
3 participants