[DOC] Added more prominent examples of using aeon with sklearn #2705

Open
an04shu wants to merge 3 commits into base: main

@an04shu an04shu commented Mar 30, 2025

PR Description for Issue #1921

Pull Request Title:

[DOC] Added more prominent examples of using aeon with sklearn

Reference Issues/PRs

Fixes #1921

What does this implement/fix?

This PR enhances the sklearn_distances.ipynb notebook by:

  • Improving clarity and visibility of aeon’s integration with sklearn.
  • Adding structured markdown explanations for:
    • Data formatting
    • Regression using kNN with DTW distance
    • Classification using kNN with DTW distance
    • Clustering using k-Means with DTW distance
    • Cross-validation with sklearn models
  • Ensuring examples align with a "Getting Started for sklearn Users" approach.
  • Cleaning redundant steps to make the examples more concise.

Does your contribution introduce a new dependency?

No new dependencies were introduced.

Any other comments?

This aims to make aeon’s usage in sklearn workflows more accessible and well-documented. Let me know if any refinements are needed! 🚀


Here's the corrected checklist for your PR:

PR Checklist

For all contributions
For documentation updates
  • Examples are structured clearly for new users.
  • Markdown descriptions are concise and helpful.

@aeon-actions-bot added the documentation and examples labels Mar 30, 2025

@aeon-actions-bot (Contributor) commented

Thank you for contributing to aeon

I have added the following labels to this PR based on the title: [ documentation ].
I have added the following labels to this PR based on the changes made: [ examples ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

@baraline baraline (Member) commented Apr 2, 2025

Line #12. "Graph of neighbors for the first pattern in testing set with EDR distance on 2D"

Just caught that we are missing a space after "2D" for the print to be correctly formatted here. It would be nice to fix this if you could.


@baraline baraline (Member) commented Apr 2, 2025

I don't think this section conveys the right message. It kind of implies that it makes sense to use time series data with sklearn estimators by just flattening it, which can hardly be recommended for time series data in general.

I would either remove it or add an example of how to convert 3D to 2D using an aeon transformer estimator (e.g. shapelet, rocket, catch22, etc.).
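To make the concern about flattening concrete, here is a small NumPy-only sketch. It compares a squared Euclidean distance on flattened series with an elastic distance for two series that differ only by a time shift. The `dtw_distance` function below is a hand-written, unoptimised stand-in used for illustration; it is not aeon's implementation, and the toy data is an assumption of this sketch.

```python
import numpy as np


def dtw_distance(a, b):
    """Plain O(n*m) DTW with squared pointwise cost (illustrative only)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1]
            )
    return cost[n, m]


# Two univariate series: a bump, and the same bump shifted by 5 time steps.
bump = np.zeros(30)
bump[10:15] = 1.0
shifted = np.roll(bump, 5)
X_3D = np.stack([bump, shifted])[:, np.newaxis, :]  # (n_cases, n_channels, n_timepoints)

# Flattening turns every time point into an independent tabular feature.
X_2D_flat = X_3D.reshape(X_3D.shape[0], -1)

flat_sq_euclid = np.sum((X_2D_flat[0] - X_2D_flat[1]) ** 2)
elastic = dtw_distance(X_3D[0, 0], X_3D[1, 0])

print(flat_sq_euclid)  # 10.0: the shift makes 10 positions disagree
print(elastic)         # 0.0: warping aligns the two bumps exactly
```

A tabular estimator fitted on `X_2D_flat` only sees the first view, which is why a time-series-aware distance or an aeon transformer is usually a better bridge to sklearn than a plain reshape.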


@baraline baraline (Member) commented Apr 2, 2025

Line #3. knn_regressor.fit(X_train_2D_flat, y_train_3D)

If the classification case uses a distance-based approach, I would rather have the regression case use an aeon transformation to produce an X_train_2D for the regressor. Then you could show how to combine the two estimators inside a sklearn pipeline.


@baraline baraline (Member) commented Apr 2, 2025

Line #10. X_test_2D = X_test_3D.reshape(X_test_3D.shape[0], -1)

These 2D arrays are not used in this cell and should be removed.


@baraline baraline (Member) commented Apr 2, 2025

To add some diversity to the examples, I would rather have an example of how to define a pipeline using an aeon transformer estimator followed by a sklearn estimator, which can then be used in cross-validation or to fit/predict.


Labels
documentation: Improvements or additions to documentation
examples: Example notebook related

Successfully merging this pull request may close these issues.

[DOC] More prominent examples of using aeon with sklearn
2 participants