Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add most functions to the Expr class so that they're chainable. #1064

Open
deanm0000 opened this issue Mar 14, 2025 · 5 comments
Open

Add most functions to the Expr class so that they're chainable. #1064

deanm0000 opened this issue Mar 14, 2025 · 5 comments
Labels
enhancement New feature or request

Comments

@deanm0000
Copy link

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Instead of doing

df.select(
    F.abs(
        col("id1")
        )
    .alias("id1")
    )

It would be nicer to do

df.select(
    col("id1")
    .abs()
    .alias("id1")
    )

Describe the solution you'd like
This is already partly there, for example, alias is already in the Expr class. It would be a bit tedious but easy to add under class Expr, for example:

def abs(self) -> datafusion.Expr:
    """Return the absolute value of a given number.

    Returns:
    --------
    Expr
        A new expression representing the absolute value of the input expression.
    """
    return F.abs(self)

Describe alternatives you've considered
if it weren't for the type hinter, monkey patching.

Additional context
There will still be functions that don't make sense to chain off of a call to col such as when since it doesn't return an Expr. But, even functions that take multiple inputs can have this for instance col("a").atan2("b"). Additionally, this is completely backwards compatible since I'm not proposing eliminating the functions module.

@deanm0000 deanm0000 added the enhancement New feature or request label Mar 14, 2025
@deanm0000 deanm0000 changed the title Add all functions to the Expr class so that they're chainable. Add most functions to the Expr class so that they're chainable. Mar 14, 2025
@deanm0000
Copy link
Author

Is this something a PR would be accepted for or no?

@timsaucer
Copy link
Contributor

This sounds like a great idea!

I don’t think we want it for every function, but definitely many of them - probably a majority.

Things like regr_count probably don’t benefit from this but abs absolutely does (pun intended).

Thank you for filing the issue!

@deanm0000
Copy link
Author

@timsaucer I put in a draft PR that does all the one input arg functions.

Is your reluctance to putting every function in due to clutter or that more funcs is (seemingly) more work? For the draft, I wrote a script to generate the funcs so filtering some out is actually more work. That said, I didn't type each of them manually so don't feel bad in chopping some of them.

As to the general point of clutter, I made a comment in the PR advocating the idea of categorizing functions and putting them in namespaces.

@timsaucer
Copy link
Contributor

timsaucer commented Mar 20, 2025

The reluctance is only because some of them I think doesn't improve ergonomics for our users. The 1 and 2 argument functions are absolutely helpful. But I can be persuaded otherwise.

@ion-elgreco
Copy link
Contributor

Actually a duplicate of #876

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants