-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement] Add float datatype support to rocAucScore #3074
Conversation
/intelci: run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not backward-compatible:
miniconda3/envs/build/lib/python3.11/site-packages/daal4py/__init__.py:54: in <module>
from daal4py._daal4py import *
E ImportError: /.../miniconda3/envs/build/lib/python3.11/site-packages/daal4py/_daal4py.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN4daal15data_management8internal11rocAucScoreERKNS_8services10interface19SharedPtrINS0_10interface112NumericTableEEES9_
This PR will be stopped until CI can track ABI changes, as this requirement is undocumented and not written in the oneDAL development rules, nor the checklist for PRs. |
@icfaust I haven't checked the code in detail, but doesn't the fastest way of calculating this metric involve computations of numbers up to "nrows^2" in a loop where they are added/subtracted to similar numbers? The fp32 data type only has integer-level precision up to 2^23 which is easy to exceed, after which the precision of the calculation might degrade. |
commit 35e0b62 Author: Faust, Ian <[email protected]> Date: Mon Mar 10 14:56:52 2025 +0100 renew commit 1c9299c Author: Ian Faust <[email protected]> Date: Mon Mar 10 07:24:21 2025 +0100 Update ci.yml commit c6a7b4a Author: Ian Faust <[email protected]> Date: Mon Mar 10 06:44:43 2025 +0100 Update ci.yml commit 18dac73 Author: Ian Faust <[email protected]> Date: Thu Mar 6 01:28:24 2025 +0100 Update ci.yml commit d84c663 Author: Ian Faust <[email protected]> Date: Thu Mar 6 00:53:40 2025 +0100 Update ci.yml commit 91a523d Author: Ian Faust <[email protected]> Date: Thu Mar 6 00:05:02 2025 +0100 Update ci.yml commit a1048df Author: Ian Faust <[email protected]> Date: Wed Mar 5 23:54:30 2025 +0100 Update ci.yml commit b6a84b8 Author: Ian Faust <[email protected]> Date: Wed Mar 5 23:18:26 2025 +0100 Update ci.yml commit d496c2f Author: Ian Faust <[email protected]> Date: Wed Mar 5 22:18:02 2025 +0100 Update ci.yml commit 272fdfe Author: Ian Faust <[email protected]> Date: Wed Mar 5 19:15:04 2025 +0100 Update ci.yml commit 05ee213 Author: Ian Faust <[email protected]> Date: Wed Mar 5 18:35:28 2025 +0100 Update ci.yml commit fd8bf73 Author: Faust, Ian <[email protected]> Date: Wed Mar 5 18:17:24 2025 +0100 remove false comment commit 71c19df Author: Faust, Ian <[email protected]> Date: Wed Mar 5 18:10:59 2025 +0100 first try
Seems logically sound to me, kept the core computation in double, but now will still interface with float data. |
/intelci: run |
template DAAL_EXPORT double rocAucScore<float>(const NumericTablePtr & truePrediction, const NumericTablePtr & testPrediction); | ||
template DAAL_EXPORT double rocAucScore<double>(const NumericTablePtr & truePrediction, const NumericTablePtr & testPrediction); | ||
|
||
// necessary for maintaining ABI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we mark this so it would be easy to identify and remove in 2026.x release?
Description
The algorithm rocAucScore is currently forced to do all computation in doubles. This does not follow the coding standards of the codebase where it should template both the cpu ISA as well as the datatype (float or double). In order to properly interface to this code outside of DAAL (i.e. oneDAL), it must first maintain a float version as well. While introducing a datatype template, a default value of double is set in order to guarantee compatibility to daal4py. These changes do not have any impact on the performance or operation of daal4py or any code which currently interfaces rocAucScore, therefore performance benchmarking is unnecessary.
This PR is a precursor to enabling a oneDAL algorithm of rocAucScore. This completes issue #2740
PR should start as a draft, then move to ready for review state after CI is passed and all applicable checkboxes are closed.
This approach ensures that reviewers don't spend extra time asking for regular requirements.
You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
For example, PR with docs update doesn't require checkboxes for performance while PR with any change in actual code should have checkboxes and justify how this code change is expected to affect performance (or justification should be self-evident).
Checklist to comply with before moving PR from draft:
PR completeness and readability
Testing
Performance