[ENH] Implementation of Extended Isolation Forest (EIF) anomaly detector #2679
Thank you for contributing to aeon. Please fill out the template and use the correct title format.
I've fixed the title format. Shall I proceed with adding EIF to the online API documentation?
Hi @Akhil-Jasson, I just saw your code, and it looks great! That said, I don't think using H2O is the best approach, since aeon doesn't rely on it. It might be better to have our own implementation instead. A few updates to consider:
1. Could you update the section "Does your contribution introduce a new dependency?" and mention H2O there?
2. The test cases seem to be missing. Could you add them?
3. Instead of importing the entire H2O module, it's better to import only what's needed to keep things lightweight.
New dependencies should be put in pyproject.toml; otherwise this won't be tested. There are still bits missing from the template.
…ile for the EIF implementation.
I've added the h2o dependency to pyproject.toml, but I'm encountering errors when running the test files. The test attempts to import aeon.anomaly_detection._eif but fails, indicating that the module doesn't exist yet. Is there a step I'm missing for adding new modules to aeon? What could be the possible issue?
Your import is incorrect.
H2O looks like a massive package. Do we actually want to include it as a dependency? The issue clearly states that we are looking for an implementation in aeon directly.
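For context on what a dependency-free implementation involves, here is a minimal pure-Python sketch of the EIF idea: trees that split on random hyperplanes, with anomaly scores derived from average path lengths. It is not the code in this PR and not an aeon API. All names (`eif_scores`, `build_tree`, `path_length`, `c_factor`) and defaults are hypothetical, and it assumes the usual convention that `extension_level` ranges from 0 (axis-parallel splits, as in a standard isolation forest) to `dim - 1` (fully extended).

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def offset(point, normal, intercept):
    """Signed offset of a point relative to the splitting hyperplane."""
    return sum((p - i) * n for p, i, n in zip(point, intercept, normal))


def build_tree(points, depth, max_depth, extension_level, rng):
    """Recursively split the points with random hyperplanes."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}  # leaf node
    dim = len(points[0])
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Zeroing (dim - 1 - extension_level) coordinates limits the slope freedom.
    for j in rng.sample(range(dim), dim - 1 - extension_level):
        normal[j] = 0.0
    # Random intercept inside the bounding box of the points in this node.
    intercept = [
        rng.uniform(min(p[j] for p in points), max(p[j] for p in points))
        for j in range(dim)
    ]
    left, right = [], []
    for p in points:
        (left if offset(p, normal, intercept) <= 0 else right).append(p)
    return {
        "normal": normal,
        "intercept": intercept,
        "left": build_tree(left, depth + 1, max_depth, extension_level, rng),
        "right": build_tree(right, depth + 1, max_depth, extension_level, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which the point reaches a leaf, plus the leaf-size adjustment."""
    if "size" in node:
        return depth + c_factor(node["size"])
    go_left = offset(point, node["normal"], node["intercept"]) <= 0
    return path_length(point, node["left" if go_left else "right"], depth + 1)


def eif_scores(points, n_trees=100, sample_size=64, extension_level=None, seed=0):
    """Anomaly scores in (0, 1]; higher means more anomalous."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    dim = len(points[0])
    if extension_level is None:
        extension_level = dim - 1  # fully extended by default (an assumption)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, extension_level, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    return [
        2.0 ** (-sum(path_length(p, t) for t in trees) / (n_trees * norm))
        for p in points
    ]


# Usage: a Gaussian cluster plus one far-away point appended at the end.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((8.0, 8.0))
scores = eif_scores(data)
```

A real estimator would presumably vectorise this with NumPy and follow aeon's base-class conventions; the sketch only shows that the algorithm itself does not need a heavy external package.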
I know this is not mentioned in the corresponding issue, but I think it makes sense to work with sliding windows in EIF as well. We can always get the original behavior back by setting the window-size to 1.
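The sliding-window idea can be sketched as follows. This is an illustration only: `sliding_windows` and its `stride` parameter are hypothetical names rather than existing aeon functions, and plain lists are used to keep the example self-contained. Each window becomes one point for the forest, so a window size of 1 reduces to scoring each time point on its own.

```python
def sliding_windows(series, window_size=1, stride=1):
    """Cut a univariate series into overlapping windows.

    Each returned window is one feature vector for the detector.
    """
    if window_size < 1 or window_size > len(series):
        raise ValueError("window_size must be between 1 and len(series)")
    last_start = len(series) - window_size
    return [
        list(series[start:start + window_size])
        for start in range(0, last_start + 1, stride)
    ]


series = [1.0, 2.0, 3.0, 4.0]
pairs = sliding_windows(series, window_size=2)
singles = sliding_windows(series, window_size=1)
```

With `window_size=2` the four-point series yields three two-dimensional points; with `window_size=1` it yields four one-dimensional points, which matches the point-wise behaviour.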
        self.contamination = contamination
        self.extension_level = extension_level
        self.random_state = random_state
        self.scaler = StandardScaler()
scaler is not a parameter, so cannot be instantiated here. Please read the sklearn estimator development docs on how to name and where to place parameters, fitted and non-fitted attributes, etc.
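As a rough illustration of the convention referred to here: in scikit-learn style estimators, `__init__` only stores its arguments unchanged, and anything derived from the data is created in `fit` with a trailing underscore. The class below is a hypothetical sketch, not the PR's estimator, and the attributes `n_features_` and `extension_level_` are assumed names chosen for the example.

```python
class SketchDetector:
    """Hypothetical detector showing where parameters and fitted state live."""

    def __init__(self, contamination=0.1, extension_level=None, random_state=None):
        # Constructor arguments are stored as-is: no validation, no derived objects.
        self.contamination = contamination
        self.extension_level = extension_level
        self.random_state = random_state

    def fit(self, X):
        # Fitted attributes are created here and end with an underscore.
        self.n_features_ = len(X[0])
        # Resolve the default without overwriting the constructor parameter.
        if self.extension_level is None:
            self.extension_level_ = self.n_features_ - 1
        else:
            self.extension_level_ = self.extension_level
        return self


detector = SketchDetector()
has_state_before_fit = hasattr(detector, "n_features_")
detector.fit([[1.0, 2.0], [3.0, 4.0]])
```

Under this layout a helper such as a scaler, if it were kept at all, would be built inside `fit` and stored under a name ending in an underscore.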
        # Fit the scaler
        self.scaler.fit(X)
        X_scaled = self.scaler.transform(X)
The paper does not explicitly mention that scaling the data provides better results. In anomaly detection, scaling might hide some types of anomalies. Why do you include it?
        return self

    def _predict(self, X) -> np.ndarray:
To make the usage of EIF similar to our other models, we want it to be usable as a semi-supervised (as implemented already) and an unsupervised algorithm. The current implementation of _predict does not allow that.
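The difference between the two usage modes can be sketched with a toy detector. This is an assumption-laden illustration, not aeon's base-class API: `MeanDistanceDetector` is a hypothetical stand-in for EIF that scores each value by its distance to the mean of the data it was fitted on.

```python
class MeanDistanceDetector:
    """Toy stand-in for EIF: score = absolute distance to the fitted mean."""

    def fit(self, X):
        self.mean_ = sum(X) / len(X)
        return self

    def predict(self, X):
        # Semi-supervised path: score new data against a previously fitted model.
        return [abs(x - self.mean_) for x in X]

    def fit_predict(self, X):
        # Unsupervised path: fit and score on the same, possibly contaminated, data.
        return self.fit(X).predict(X)


train = [1.0, 1.0, 1.0, 1.0]  # assumed anomaly-free reference data
test = [1.0, 1.0, 9.0, 1.0]

semi_supervised = MeanDistanceDetector().fit(train).predict(test)
unsupervised = MeanDistanceDetector().fit_predict(test)
```

In the semi-supervised call the model is learned from clean data only, so the scores are `[0.0, 0.0, 8.0, 0.0]`; in the unsupervised call the anomaly also shifts the fitted mean, giving `[2.0, 2.0, 6.0, 2.0]`. Supporting both means the prediction step has to work whether or not a separate fit on reference data happened first.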
This PR implements the Extended Isolation Forest (EIF) algorithm.
Reference Issues/PRs
Fixes #2113
What does this implement/fix? Explain your changes.
Does your contribution introduce a new dependency? If yes, which one?
Yes, it introduces H2O.ai as a new dependency.
Any other comments?
PR checklist
For all contributions
For new estimators and functions
__maintainer__ at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.
For developers with write access