-
Notifications
You must be signed in to change notification settings - Fork 2
Add groupby #222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add groupby #222
Conversation
src/sciline/data_graph.py
Outdated
| meant to be executed directly, but to be further processed via | ||
| :meth:`reduce`. | ||
| """ | ||
| return self._from_cyclebane(self._cbgraph.groupby(node)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this is working. Doesn't it create some sort of broken object that looks like a regular DataGraph (will the usual methods), but actually only certain types of reduce calls will work, because that is all the object returned from groupby in Cyclebane can do)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is a weird/broken object. To avoid this, we could either create a GroupbyGraph in sciline as well, or just combine the groupby+reduce in a single call so we never return the broken object.
I would favor the second option. Something like mapped_wf.groupby(key=RawData, reduce=merge_function)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't we need extra args, making it complicated? Adding GroubyGraph should not be difficult, right?
|
We should wait for #221 |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
| "import sciline\n", | ||
| "\n", | ||
| "_fake_filesytem = {\n", | ||
| " 'file102.txt': [1, 2, float('nan'), 3],\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you undo the reformat? This is hard to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, it was done by mistake, not easy to undo...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
git checkout -p main -- the/file.ipynb maybe?
src/sciline/data_graph.py
Outdated
| # We forward the constructor so this can be used by both DataGraph and Pipeline | ||
| self._graph_maker = graph_maker | ||
|
|
||
| def reduce(self, *, func: Callable[..., Any], **kwargs: Any) -> DataGraph: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is user-facing we should repeat all the args (for auto-complete) and docstrings, instead of just using kwargs.
Needs scipp/cyclebane#29