Categorical embeddings dimensionality #129
Unanswered
paraschakis asked this question in Q&A
Replies: 0 comments
As far as I understand, n-hot encoding is used for categorical embeddings.
What if my dataset has many high-cardinality columns (i.e. columns with many distinct categories)?
If all of them are embedded using categorical_similarity_space, wouldn't the dimensionality of the data points explode once the spaces are concatenated? That would make searches inefficient. How can I deal with that?
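To make the concern concrete, here is a minimal sketch in plain NumPy. It does not use the library's API and only reflects my assumption that each column is n-hot encoded and the per-column vectors are concatenated. The column names, cardinalities, and the `n_hot` helper are made up for illustration.

```python
import numpy as np

# Hypothetical high-cardinality columns: name -> number of distinct categories.
cardinalities = {"brand": 5000, "city": 20000, "category": 800}


def n_hot(indices, cardinality):
    """Return a vector of length `cardinality` with 1.0 at each given index."""
    vec = np.zeros(cardinality, dtype=np.float32)
    vec[list(indices)] = 1.0
    return vec


# One data point: the active category indices in each column (made-up values).
row = {"brand": [42], "city": [1234], "category": [7, 99]}

# Encode each column separately, then concatenate, as assumed above.
parts = [n_hot(row[col], card) for col, card in cardinalities.items()]
point = np.concatenate(parts)

total_dim = point.shape[0]           # sum of all column cardinalities
nonzero = int(np.count_nonzero(point))  # number of active categories in the row
```

Under this assumption the vector length is the sum of the column cardinalities, while only a handful of entries are non-zero, which is the growth I am worried about.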