-
-
Notifications
You must be signed in to change notification settings - Fork 247
Open
Labels
Description
Describe the bug
Hash collision seems to happen whenever two dataframes have the same column names, regardless of the rows.
To Reproduce
from deepdiff import DeepHash
x = pd.DataFrame({'a': [1, 2, 3]})
y = pd.DataFrame({'a': [1, 2, 3, 4]})
a = DeepHash(x)[x]
b = DeepHash(y)[y]
assert a == b
Expected behavior
Collisions should be harder to find than this (unless this was designed into the library?)
OS, DeepDiff version and Python version (please complete the following information):
- OS: Ubuntu 22.04.2 LTS
- Python Version: 3.10.8
- DeepDiff Version: 6.3.0
timotheecour
Metadata
Metadata
Assignees
Labels
Projects
Milestone
Relationships
Development
Select code repository
Activity
seperman commentedon May 1, 2023
Hi @amakelov
While we have been supporting Numpy for many years, Pandas data-frames have never been covered. If you have time to add the Pandas support to DeepDiff, that would be great. Otherwise I will look into it when I have a chance.