-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added snake case feature and tests #97
base: master
Are you sure you want to change the base?
Conversation
8c8865a
to
49f4a89
Compare
@@ -7,3 +7,4 @@ __pycache__ | |||
swagger.json | |||
config/swagger.json | |||
remove | |||
.vscode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. I just had the one request for a change, and one general comment. The latter can be worked on in another issue.
endpoint_lowercase = arguments[1].lower() | ||
|
||
# check for underscore seperated endpoint and convert it | ||
if "_" in arguments[1]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you don't need this if
check here. str.replace()
should just leave the string untouched if it doesn't have an _
in it:
>>> "blah_blah".replace("_","")
'blahblah'
>>> "blahBlahBlah".replace("_","")
'blahBlahBlah'
if "_" in arguments[1]: | ||
endpoint_lowercase = endpoint_lowercase.replace("_", "") | ||
|
||
for entry in endpoints: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that endpoints
is a dict, it'd be nice if we can look up endpoint_lowercase
in endpoints
rather than iterating through until we find the matching entry (or not). But for now, this certainly works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I think about it, especially since this isn't a time-pressing issue, the more I feel we should really try to get this right.
So to summarize, this code would be a lot more efficient if instead of iterating through a set of dict keys, normalizing each one, and comparing it to a normalized candidate key; we mutate the candidate key to the format we expect, and then just do a dict lookup.
That is to say, if the user gives something like -e get_User_info
, we KNOW they mean getUserInfo
, and if they don't, we should treat their input as invalid. We could mutate the get_User_info
to getUserInfo
, and just do a membership check (i.e. if mutated_key in endpoints...
). That's a lot quicker from a performance standpoint, and I suspect it'll tighten up the code too.
Kevin suggested that we could also make it interactive at that point, printing to stderr ("Changing given argument 'get_User_info' to 'getUserInfo'. Proceed (Y/n)?") or something like that, and letting the user choose whether or not to proceed. Printing it to stderr vs stdout, or trying to detect if the command is coming through a TTY would address the concerns of those who want to pipe the stdout to jq
or something like that, as long as they don't throw in a 2>&1
or something like that.
def test_snakecase_and_underscore_conversion(): | ||
test_endpoints = ["getUserInfo"] | ||
correct_args = ["-e", "getUserInfo", "--username=johndoe"] | ||
test_args_case_underscore = ["-e", "_Get_USeriNFo", "--username=johndoe"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to support the leading underscore? I know the code as you have it will pass this test, but we should probably say in the documentation that we only support snake case where it makes sense (i.e. getUserInfo, get_user_info, and get_User_Info are all supported, but _get_User_Info_
is not supported, even if it actually will work).
if "_" in arguments[1]: | ||
endpoint_lowercase = endpoint_lowercase.replace("_", "") | ||
|
||
for entry in endpoints: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I think about it, especially since this isn't a time-pressing issue, the more I feel we should really try to get this right.
So to summarize, this code would be a lot more efficient if instead of iterating through a set of dict keys, normalizing each one, and comparing it to a normalized candidate key; we mutate the candidate key to the format we expect, and then just do a dict lookup.
That is to say, if the user gives something like -e get_User_info
, we KNOW they mean getUserInfo
, and if they don't, we should treat their input as invalid. We could mutate the get_User_info
to getUserInfo
, and just do a membership check (i.e. if mutated_key in endpoints...
). That's a lot quicker from a performance standpoint, and I suspect it'll tighten up the code too.
Kevin suggested that we could also make it interactive at that point, printing to stderr ("Changing given argument 'get_User_info' to 'getUserInfo'. Proceed (Y/n)?") or something like that, and letting the user choose whether or not to proceed. Printing it to stderr vs stdout, or trying to detect if the command is coming through a TTY would address the concerns of those who want to pipe the stdout to jq
or something like that, as long as they don't throw in a 2>&1
or something like that.
) -> List[str]: | ||
|
||
# check to see if the arguments supplied are for an endpoint. IE a "-e" was supplied | ||
if arguments[0] == "-e" or arguments[0] == "-E": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was chatting with @retzkek , and this PR came up. He pointed out that this logic works if the user gives us exactly ferry-cli -e ENDPOINT
, but if they don't, it becomes fragile. To demonstrate this, go ahead and try running a malformed command like:
ferry-cli --foo -e get_user_info --username=nosuchuser
In that case, arguments[0]
is actually equal to --foo
, and so the intended transformation of get_user_info
to getUserInfo
won't happen, and the error you get is actually:
An error occurred while using the FERRY CLI: Error: 'get_user_info' is not a valid endpoint. Run 'ferry -l' for a full list of available endpoints.
which is the wrong error to print here.
There are a couple of possible ways to fix this that I can think of off the top of my head:
- Use a function-scoped
argparse.ArgumentParser
JUST in this function, that only has the argument-e
defined. Then useArgumentParser.parse_known_args()
on a copy ofarguments
to make sure you save all the arguments and can return an argument list with just the-e
value replaced. - Iterate through the
arguments
list, and once you find your-e
or-E
(or--endpoint
), do the value transformation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to throw another wrench into the works:
and once you find your -e or -E (or --endpoint)
What if that changes? In five years someone comes along and says "-e/-E/--endpoint" is a terrible name, it should be -p/--point-final 🇫🇷." How does someone changing the main parser (on line 109) know to change this too (on line 548)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, at some point we can't solve for every possible future situation - we have to make some assumptions.
But to safeguard against this, Lydia, you could add a test to make sure that calls both FerryCLI.get_arg_parser
and this function on the same set of args, and make sure both return expected values. Maybe there's a more elegant way than that, but something that ensures that what you're checking for here is a valid FerryCLI
argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point I hoped to get across (perhaps too subtly) was that you should first look for a way to solve this that doesn't even need to know or care what the actual flags are. Even better if you don't need to do some second-level parsing of arguments at all. A couple ideas:
- Use a custom
action
class for the argument. https://docs.python.org/3/library/argparse.html#action-classes - Maybe you can add the functionality to the
FerryParser
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too subtle!
But in all seriousness, I do like the idea of having a custom Action
for the --endpoint
argument that just does this transformation on the argument automatically (or leaves it alone). @cathulhu - have you worked with custom Actions
?
If not, we have a couple of examples in the FerryCLI
class (get_endpoint_params_action
), but to see a more direct, less-wrapped up example, here's one from jobsub where we store the values of the -G
flag there in the environment: https://github.com/fermitools/jobsub_lite/blob/master/lib/get_parser.py#L42.
This is pull request for issue #94.