Feature: add new one-hot function supporting multiple dimensions (ranks) #2613
base: main
Conversation
add one hot test
modify format add tests
modify comments
Codecov Report

```
@@            Coverage Diff             @@
##             main    #2613      +/-   ##
==========================================
+ Coverage   81.86%   83.15%    +1.28%
==========================================
  Files         833      841        +8
  Lines      106450   108049     +1599
==========================================
+ Hits        87146    89848     +2702
+ Misses      19304    18201     -1103
```

☔ View full report in Codecov by Sentry.
I'll take some time to look at this later, but just a couple comments before reviewing the actual code.
- We don't need to have all of the ops comply with the ONNX spec.
- Introducing this as a new operation means that we now have multiple definitions for one-hot. One definition should take over, otherwise it makes everything cluttered.
- Regarding the motivation, do you actually need this one-hot definition? Or is it simply for ONNX conversion? If only the latter, then it can probably just live in the ONNX import code.
@laggui Thank you for your comments. The existing one-hot implementations only accept 1-D index tensors, so they did not cover my use case.

For the current float version of one-hot, I do not have a concrete use case in mind. In PyTorch, for example, the function is minimally designed yet still supports multiple dimensions, and this is something that needs improvement in our framework as well. Furthermore, another major framework, TensorFlow, not only supports multiple dimensions but also provides flexibility with parameters such as `on_value`, `off_value`, and `axis`.

Further use cases come from the ability to configure multiple dimensions. Regarding the concern about having multiple definitions, I share the same sentiment and agree that unification is necessary. My proposed new function is designed to support both int and float outputs, so it can replace the existing definitions. I look forward to your feedback and hope for your support in making this improvement.
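To make the rank > 1 motivation concrete, here is a hedged usage sketch of the proposed method (the `one_hot` signature is the one from this PR's diffs; shapes and type-inference details are assumptions):

```rust
use burn::tensor::{backend::Backend, Int, Tensor};

// Hedged sketch: one-hot on a rank-2 index tensor. The existing API only
// accepted 1-D indices; the proposed method maps a [2, 2] index tensor to
// a [2, 2, 3] one-hot tensor (output rank = input rank + 1).
fn example<B: Backend>(device: &B::Device) {
    let indices = Tensor::<B, 2, Int>::from_ints([[0, 2], [1, 0]], device);
    let one_hot: Tensor<B, 3, Int> = indices.one_hot::<3>(3);
    // Row [0, 2] becomes [[1, 0, 0], [0, 0, 1]], and so on.
}
```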
Ok, makes sense! Thanks for the detailed response.
I think it is especially useful for the rank > 1 use cases; the rest of the configurable stuff seems less relevant to me. But I understand that there could be value in supporting the broad spec.
See my comments below 🙂
Regarding the multiple definitions, I think I would deprecate the other definitions since this can do it all. Just make sure to adapt the existing tests.
@laggui I modified the code, please review it again (maybe after your Christmas vacation, enjoy!).
Hope you had a nice holiday break! Thanks for addressing my previous comments 🙂
I have some follow-up changes. Mostly form over content.
@laggui
Alright should be good to go after this round! 🙂
```diff
@@ -461,8 +461,8 @@ impl TensorCheck {
         check
     }

-pub(crate) fn one_hot_tensor<B: Backend>(
-    index_tensor: Tensor<B, 1, Int>,
+pub(crate) fn one_hot_tensor<B: Backend, const D: usize, K: Numeric<B>>(
```
Although the tensor checks were already added in a previous PR, I don't think we should be performing data validation here, since this could cause read synchronizations when using a GPU backend:

```rust
if index_tensor
    .clone()
    .greater_equal_elem(num_classes as i32)
    .any()
    .into_scalar() // reads the result back to the host, forcing a device sync
```

For that reason, we might want to remove this. But this can be done later in another PR.
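For illustration, a hedged sketch of one possible mitigation: gate the validation behind debug builds so release runs on GPU backends skip the host read (`validate_indices` is a hypothetical helper, not part of this PR):

```rust
use burn::tensor::{backend::Backend, Int, Tensor};

// Hypothetical helper: only validate the index range in debug builds, since
// `into_scalar()` forces a device-to-host read on GPU backends.
fn validate_indices<B: Backend, const D: usize>(
    index_tensor: &Tensor<B, D, Int>,
    num_classes: usize,
) {
    // `cfg!` is resolved at compile time; in release builds the condition is
    // `false` and the whole check is optimized away, so no sync occurs.
    if cfg!(debug_assertions)
        && index_tensor
            .clone()
            .greater_equal_elem(num_classes as i32)
            .any()
            .into_scalar()
    {
        panic!("one_hot: an index is out of range for {num_classes} classes");
    }
}
```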
since = "0.16.0", | ||
note = "`Tensor::one_hot(...)` will be removed in the future, please use the new `tensor.one_hot(...)` method instead" | ||
)] | ||
pub fn _one_hot(index: usize, num_classes: usize, device: &B::Device) -> Self { |
Ahh sorry, I think you were right to remove this initially. In the previous review I thought the implementations would be isolated between int and float, so they would not clash.
But in this case, I think we have no choice but to make this a breaking change. Anyway, the same result can easily be obtained with the new API.
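For illustration, a hedged sketch of how the deprecated call could be reproduced with the new API (the old call's exact output shape and the flatten step are assumptions):

```rust
use burn::tensor::{backend::Backend, Int, Tensor};

// Hedged sketch: the deprecated `Tensor::one_hot(index, num_classes, device)`
// built a single one-hot vector; roughly the same result can be obtained by
// one-hot encoding a one-element index tensor and flattening the result.
fn example<B: Backend>(device: &B::Device) {
    let indices = Tensor::<B, 1, Int>::from_ints([2], device);
    // [2] -> one-hot of rank 2 with shape [1, 4] -> flatten to shape [4].
    let one_hot: Tensor<B, 1, Int> = indices.one_hot::<2>(4).flatten(0, 1);
}
```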
````rust
/// // [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
/// }
/// ```
pub fn one_hot<const D2: usize>(self, num_classes: usize) -> Tensor<B, D2> {
````
Output type should be the same as the input:

```diff
-pub fn one_hot<const D2: usize>(self, num_classes: usize) -> Tensor<B, D2> {
+pub fn one_hot<const D2: usize>(self, num_classes: usize) -> Tensor<B, D2, K> {
```
```rust
pub fn one_hot_fill<K2: Numeric<B>, const D2: usize>(
    self,
    num_classes: usize,
    on_value: f32,
    off_value: f32,
    axis: i64,
) -> Tensor<B, D2, K2> {
```
Following my previous comment:

```diff
-pub fn one_hot_fill<K2: Numeric<B>, const D2: usize>(
+pub fn one_hot_fill<const D2: usize>(
     self,
     num_classes: usize,
     on_value: f32,
     off_value: f32,
     axis: i64,
-) -> Tensor<B, D2, K2> {
+) -> Tensor<B, D2, K> {
```
```rust
/// fn example<B: Backend>() {
///     let device = Default::default();
///     let indices: Tensor<B, 1> = Tensor::from_floats([0.0, 1.0, 2.0, 3.0], &device);
///     let one_hot: Tensor<B, 4> = indices.one_hot(4);
```
Shouldn't the output be a 2D tensor?

```diff
-///     let one_hot: Tensor<B, 4> = indices.one_hot(4);
+///     let one_hot: Tensor<B, 2> = indices.one_hot(4);
```
For this reason, I think we need to add this condition to the `TensorCheck`:

> Tensor of rank one greater than input tensor 'indices', i.e. rank(output) = rank(indices) + 1

In other words/code: `if D + 1 != D2`
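A minimal sketch of that condition as a standalone check (the function name and error message are hypothetical; the real code would register the error through the `TensorCheck` machinery):

```rust
// Hedged sketch of the suggested rank condition: per the ONNX spec, the
// one-hot output rank must be exactly one greater than the input rank.
fn check_one_hot_ranks<const D: usize, const D2: usize>() {
    if D + 1 != D2 {
        panic!(
            "one_hot: output rank ({}) must be input rank ({}) + 1",
            D2, D
        );
    }
}
```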
```rust
let one_hot_tensor: Tensor<TestBackend, 1, Float> =
    tensor.one_hot::<2>(10).flatten::<1>(0, 1);
let expected = TensorData::from([0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]);
```
Instead of using flatten on the output, I suggest we give the expected output as is:

```diff
-let one_hot_tensor: Tensor<TestBackend, 1, Float> =
-    tensor.one_hot::<2>(10).flatten::<1>(0, 1);
-let expected = TensorData::from([0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]);
+let one_hot_tensor = tensor.one_hot::<2>(10);
+let expected = TensorData::from([[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]);
```
```rust
let one_hot_tensor = index_tensor.one_hot(5);
let expected = TestTensorInt::eye(5, &device).into_data();
let tensor = TestTensorInt::<1>::from([0, 1, 4]);
let one_hot_tensor: Tensor<TestBackend, 2, Int> = tensor.one_hot(5).int();
```
For an int input, the output should already be int:

```diff
-let one_hot_tensor: Tensor<TestBackend, 2, Int> = tensor.one_hot(5).int();
+let one_hot_tensor: Tensor<TestBackend, 2, Int> = tensor.one_hot(5);
```
```rust
let index_tensor = TestTensorInt::<1>::arange(0..6, &device);
let one_hot_tensor = index_tensor.one_hot(5);
let tensor = TestTensorInt::<1>::from([5]);
let result: Tensor<TestBackend, 2, Int> = tensor.one_hot(5).int();
```
Same:

```diff
-let result: Tensor<TestBackend, 2, Int> = tensor.one_hot(5).int();
+let result: Tensor<TestBackend, 2, Int> = tensor.one_hot(5);
```
```rust
#[test]
fn one_hot_fill_with_negative_axis_and_indices() {
    let tensor = TestTensorInt::<2>::from([[0, 2], [1, -1]]);
```
If the output is expected to be float, the input should be float:

```diff
-let tensor = TestTensorInt::<2>::from([[0, 2], [1, -1]]);
+let tensor = TestTensor::<2>::from([[0, 2], [1, -1]]);
```
Target of This Pull Request
First, I attempted to implement a one-hot operation for ONNX. However, I realized that the existing one-hot function did not meet the requirements and, in fact, did not support multidimensional inputs at all. As I explored solutions, including the ONNX specification, PyTorch, and TensorFlow, I concluded that it was necessary to implement a new one-hot function. This led to the creation of this implementation, which I am now submitting as a pull request. (PyTorch also does not implement a complete one-hot function, though.)

Hope this will work for Burn and the community.
Checklist

- [x] The `run-checks all` script has been executed.

Related Issues/PRs

Indirectly related to ONNX issue #1714.
Changes
Newly implemented a one-hot method for numeric tensors. The reason it should belong to `Numeric` is that the return values are defined by `on_value` and `off_value`, not by the tensor itself, so the output can be either int or float.

This function comprehensively covers all aspects defined by ONNX, including `depth`, `on_value`, `off_value`, and `axis`, and complies with the one-hot operator specifications introduced in ONNX version 11 and later. By developing this, I believe it becomes possible to handle multidimensional one-hot encoding while also providing a concise and efficient implementation of the ONNX operator. For these reasons, I deemed it essential to create this function.

I considered removing and updating the existing one-hot method, but decided to take a more conservative approach: leaving the existing method as it is and adding a new one instead. A usage sketch follows.
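To make the spec coverage concrete, a hedged usage sketch of the proposed `one_hot_fill` (shapes, negative-axis handling, and type-inference details are assumptions):

```rust
use burn::tensor::{backend::Backend, Int, Tensor};

// Hedged sketch: ONNX-style one-hot with configurable on/off values and axis.
// With axis = -1 the class dimension is appended, so a [2, 2] index tensor
// becomes a [2, 2, 3] output with 5.0 at "on" positions and 0.0 elsewhere.
fn example<B: Backend>(device: &B::Device) {
    let indices = Tensor::<B, 2, Int>::from_ints([[0, 2], [1, 0]], device);
    let one_hot: Tensor<B, 3, Int> = indices.one_hot_fill(3, 5.0, 0.0, -1);
}
```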
Testing

Added tests in `crates/burn-tensor/src/tests/ops/one_hot.rs`; `run-checks all` passes.