-
Notifications
You must be signed in to change notification settings - Fork 1
Open
Labels
enhancementNew feature or requestNew feature or request
Description
What similarity measure is this hash function for?
Hamming distance
Describe the hash function
This is a single-bit hash function for bit vectors (i.e. BitArrays in Julia). It is extremely straightforward to compute: we choose a random index i
into the vectors, and x[i]
becomes the hash. The collision probability for a pair of vectors is their Hamming distance divided by the length of the vectors.
References
- Wikipedia
- Reference paper:
Indyk, Piotr.; Motwani, Rajeev. (1998). "Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality.". Proceedings of 30th Symposium on Theory of Computing.
- Link to PDF through princeton.edu: https://www.cs.princeton.edu/courses/archive/spr04/cos598B/bib/IndykM-curse.pdf
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request