added snake case feature and tests #97

cathulhu · 2025-01-22T20:59:09Z

This is pull request for issue #94.

shreyb · 2025-01-23T16:04:33Z

.gitignore

@@ -7,3 +7,4 @@ __pycache__
 swagger.json
 config/swagger.json
 remove
+.vscode


shreyb

This looks good. I just had the one request for a change, and one general comment. The latter can be worked on in another issue.

shreyb · 2025-01-23T16:25:29Z

ferry_cli/__main__.py

+        endpoint_lowercase = arguments[1].lower()
+
+        # check for underscore seperated endpoint and convert it
+        if "_" in arguments[1]:


I believe you don't need this if check here. str.replace() should just leave the string untouched if it doesn't have an _ in it:

>>> "blah_blah".replace("_","") 'blahblah' >>> "blahBlahBlah".replace("_","") 'blahBlahBlah'

shreyb · 2025-01-23T16:34:55Z

ferry_cli/__main__.py

+        if "_" in arguments[1]:
+            endpoint_lowercase = endpoint_lowercase.replace("_", "")
+
+        for entry in endpoints:


Given that endpoints is a dict, it'd be nice if we can look up endpoint_lowercase in endpoints rather than iterating through until we find the matching entry (or not). But for now, this certainly works.

The more I think about it, especially since this isn't a time-pressing issue, the more I feel we should really try to get this right.

So to summarize, this code would be a lot more efficient if instead of iterating through a set of dict keys, normalizing each one, and comparing it to a normalized candidate key; we mutate the candidate key to the format we expect, and then just do a dict lookup.

That is to say, if the user gives something like -e get_User_info, we KNOW they mean getUserInfo, and if they don't, we should treat their input as invalid. We could mutate the get_User_info to getUserInfo, and just do a membership check (i.e. if mutated_key in endpoints...). That's a lot quicker from a performance standpoint, and I suspect it'll tighten up the code too.

Kevin suggested that we could also make it interactive at that point, printing to stderr ("Changing given argument 'get_User_info' to 'getUserInfo'. Proceed (Y/n)?") or something like that, and letting the user choose whether or not to proceed. Printing it to stderr vs stdout, or trying to detect if the command is coming through a TTY would address the concerns of those who want to pipe the stdout to jq or something like that, as long as they don't throw in a 2>&1 or something like that.

shreyb · 2025-01-23T16:39:27Z

tests/test_main.py

+def test_snakecase_and_underscore_conversion():
+    test_endpoints = ["getUserInfo"]
+    correct_args = ["-e", "getUserInfo", "--username=johndoe"]
+    test_args_case_underscore = ["-e", "_Get_USeriNFo", "--username=johndoe"]


Do we want to support the leading underscore? I know the code as you have it will pass this test, but we should probably say in the documentation that we only support snake case where it makes sense (i.e. getUserInfo, get_user_info, and get_User_Info are all supported, but _get_User_Info_ is not supported, even if it actually will work).

shreyb · 2025-01-23T21:07:36Z

ferry_cli/__main__.py

+        if "_" in arguments[1]:
+            endpoint_lowercase = endpoint_lowercase.replace("_", "")
+
+        for entry in endpoints:


The more I think about it, especially since this isn't a time-pressing issue, the more I feel we should really try to get this right.

So to summarize, this code would be a lot more efficient if instead of iterating through a set of dict keys, normalizing each one, and comparing it to a normalized candidate key; we mutate the candidate key to the format we expect, and then just do a dict lookup.

That is to say, if the user gives something like -e get_User_info, we KNOW they mean getUserInfo, and if they don't, we should treat their input as invalid. We could mutate the get_User_info to getUserInfo, and just do a membership check (i.e. if mutated_key in endpoints...). That's a lot quicker from a performance standpoint, and I suspect it'll tighten up the code too.

Kevin suggested that we could also make it interactive at that point, printing to stderr ("Changing given argument 'get_User_info' to 'getUserInfo'. Proceed (Y/n)?") or something like that, and letting the user choose whether or not to proceed. Printing it to stderr vs stdout, or trying to detect if the command is coming through a TTY would address the concerns of those who want to pipe the stdout to jq or something like that, as long as they don't throw in a 2>&1 or something like that.

shreyb · 2025-01-23T21:13:54Z

ferry_cli/__main__.py

+) -> List[str]:
+
+    # check to see if the arguments supplied are for an endpoint. IE a "-e" was supplied
+    if arguments[0] == "-e" or arguments[0] == "-E":


I was chatting with @retzkek , and this PR came up. He pointed out that this logic works if the user gives us exactly ferry-cli -e ENDPOINT, but if they don't, it becomes fragile. To demonstrate this, go ahead and try running a malformed command like:

ferry-cli --foo -e get_user_info --username=nosuchuser

In that case, arguments[0] is actually equal to --foo, and so the intended transformation of get_user_info to getUserInfo won't happen, and the error you get is actually:

An error occurred while using the FERRY CLI: Error: 'get_user_info' is not a valid endpoint. Run 'ferry -l' for a full list of available endpoints.

which is the wrong error to print here.

There are a couple of possible ways to fix this that I can think of off the top of my head:

Use a function-scoped argparse.ArgumentParser JUST in this function, that only has the argument -e defined. Then use ArgumentParser.parse_known_args() on a copy of arguments to make sure you save all the arguments and can return an argument list with just the -e value replaced.

Iterate through the arguments list, and once you find your -e or -E (or --endpoint), do the value transformation.

I'm going to throw another wrench into the works:

and once you find your -e or -E (or --endpoint)

What if that changes? In five years someone comes along and says "-e/-E/--endpoint" is a terrible name, it should be -p/--point-final 🇫🇷." How does someone changing the main parser (on line 109) know to change this too (on line 548)?

I mean, at some point we can't solve for every possible future situation - we have to make some assumptions.

But to safeguard against this, Lydia, you could add a test to make sure that calls both FerryCLI.get_arg_parser and this function on the same set of args, and make sure both return expected values. Maybe there's a more elegant way than that, but something that ensures that what you're checking for here is a valid FerryCLI argument.

The point I hoped to get across (perhaps too subtly) was that you should first look for a way to solve this that doesn't even need to know or care what the actual flags are. Even better if you don't need to do some second-level parsing of arguments at all. A couple ideas:

Use a custom action class for the argument. https://docs.python.org/3/library/argparse.html#action-classes

Maybe you can add the functionality to the FerryParser.

Too subtle!

But in all seriousness, I do like the idea of having a custom Action for the --endpoint argument that just does this transformation on the argument automatically (or leaves it alone). @cathulhu - have you worked with custom Actions?

If not, we have a couple of examples in the FerryCLI class (get_endpoint_params_action), but to see a more direct, less-wrapped up example, here's one from jobsub where we store the values of the -G flag there in the environment: https://github.com/fermitools/jobsub_lite/blob/master/lib/get_parser.py#L42.

cathulhu mentioned this pull request Jan 22, 2025

Allow for derived arguments to be given as snake case #94

Open

added snake case feature, tests, and updated pre-commit version

49f4a89

cathulhu force-pushed the lydia_snake_case branch from 8c8865a to 49f4a89 Compare January 23, 2025 00:37

shreyb reviewed Jan 23, 2025

View reviewed changes

.gitignore

@@ -7,3 +7,4 @@ __pycache__

swagger.json

config/swagger.json

remove

.vscode

Copy link

Collaborator

shreyb Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

shreyb requested changes Jan 23, 2025

View reviewed changes

shreyb linked an issue Jan 24, 2025 that may be closed by this pull request

Allow for derived arguments to be given as snake case #94

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

added snake case feature and tests #97

added snake case feature and tests #97

cathulhu commented Jan 22, 2025

shreyb Jan 23, 2025

shreyb left a comment

shreyb Jan 23, 2025

shreyb Jan 23, 2025

shreyb Jan 23, 2025 •

edited

Loading

shreyb Jan 23, 2025 •

edited

Loading

shreyb Jan 23, 2025 •

edited

Loading

shreyb Jan 23, 2025 •

edited

Loading

retzkek Jan 23, 2025

shreyb Jan 24, 2025

retzkek Jan 24, 2025

shreyb Jan 24, 2025

added snake case feature and tests #97

Are you sure you want to change the base?

added snake case feature and tests #97

Conversation

cathulhu commented Jan 22, 2025

shreyb Jan 23, 2025

Choose a reason for hiding this comment

shreyb left a comment

Choose a reason for hiding this comment

shreyb Jan 23, 2025

Choose a reason for hiding this comment

shreyb Jan 23, 2025

Choose a reason for hiding this comment

shreyb Jan 23, 2025 • edited Loading

Choose a reason for hiding this comment

shreyb Jan 23, 2025 • edited Loading

Choose a reason for hiding this comment

shreyb Jan 23, 2025 • edited Loading

Choose a reason for hiding this comment

shreyb Jan 23, 2025 • edited Loading

Choose a reason for hiding this comment

retzkek Jan 23, 2025

Choose a reason for hiding this comment

shreyb Jan 24, 2025

Choose a reason for hiding this comment

retzkek Jan 24, 2025

Choose a reason for hiding this comment

shreyb Jan 24, 2025

Choose a reason for hiding this comment

shreyb Jan 23, 2025 •

edited

Loading

shreyb Jan 23, 2025 •

edited

Loading

shreyb Jan 23, 2025 •

edited

Loading

shreyb Jan 23, 2025 •

edited

Loading