Improve accuracy by adding a step when comparing titles #13

Open
mattbruv opened this issue Jun 26, 2021 · 1 comment
mattbruv commented Jun 26, 2021

I noticed that this thread, which should be an easy 100% match, was scored as only a 75% match:

https://i.imgur.com/SYNE3ou.png

The problem seems to be that even though the words are the same, the difference in letter case throws off the comparison. Perhaps converting both titles to the same case (all lowercase or all uppercase) before comparing them would fix this?
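A minimal sketch of that idea in Python follows. It assumes a character-based similarity ratio; `difflib.SequenceMatcher` stands in for whatever metric the project actually uses, and `title_similarity` is a hypothetical helper name, not something from the codebase.

```python
from difflib import SequenceMatcher


def title_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio that ignores letter case.

    Hypothetical helper: SequenceMatcher is an assumed stand-in for the
    project's real comparison metric.
    """
    # casefold() is a more aggressive lower() intended for caseless matching.
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


# Same words, different case: the raw comparison is penalized,
# the case-folded comparison is not.
raw = SequenceMatcher(None, "Hello World", "HELLO world").ratio()
folded = title_similarity("Hello World", "HELLO world")
```

With this change, titles that differ only in case compare as identical, while the raw comparison of the same pair stays below a perfect score.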

@Rekkonnect

Adding to this, normalizing the apostrophe characters could also be good; I've encountered a case where the original title used a different apostrophe character, and the parody used a plain '.

Perhaps also introduce a mechanism to rank similarity based on individual character differences, such as:

  • case variance (uppercase vs lowercase variants of the same word)
  • apostrophe existence (dont vs don't)
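One possible way to sketch both suggestions in Python: map apostrophe look-alikes to a plain `'`, score the titles with case and apostrophes ignored, then subtract a small penalty for each kind of superficial difference so exact matches still rank highest. Everything here is an assumption for illustration: the function names, the list of apostrophe variants, the penalty values, and the use of `difflib.SequenceMatcher` as the metric.

```python
from difflib import SequenceMatcher

# Assumed set of apostrophe look-alikes: right/left single quotation marks,
# modifier letter apostrophe, grave accent, acute accent.
APOSTROPHE_VARIANTS = "\u2019\u2018\u02bc\u0060\u00b4"
_TO_ASCII = str.maketrans({ch: "'" for ch in APOSTROPHE_VARIANTS})


def unify_apostrophes(title: str) -> str:
    """Replace every apostrophe variant with a plain ASCII apostrophe."""
    return title.translate(_TO_ASCII)


def _ratio(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()


def ranked_similarity(a: str, b: str,
                      case_penalty: float = 0.02,
                      apostrophe_penalty: float = 0.02) -> float:
    """Score two titles, lightly penalizing case and apostrophe differences.

    Hypothetical sketch; the penalty values are arbitrary placeholders.
    """
    a_u, b_u = unify_apostrophes(a), unify_apostrophes(b)
    a_bare, b_bare = a_u.replace("'", ""), b_u.replace("'", "")

    # Base score: case and apostrophes ignored entirely.
    base = _ratio(a_bare.casefold(), b_bare.casefold())

    penalty = 0.0
    # Case variance: the case-sensitive score is worse than the base score.
    if _ratio(a_bare, b_bare) < base:
        penalty += case_penalty
    # Apostrophe existence: keeping apostrophes lowers the score (dont vs don't).
    if _ratio(a_u.casefold(), b_u.casefold()) < base:
        penalty += apostrophe_penalty
    return max(0.0, base - penalty)


exact = ranked_similarity("Don't Stop", "Don\u2019t Stop")  # differs only in apostrophe character
case_only = ranked_similarity("Don't Stop", "don't stop")
both = ranked_similarity("Don't Stop", "dont stop")
```

Under these assumptions, titles that differ only in which apostrophe character they use score as identical, while case or missing-apostrophe differences lower the score slightly instead of dropping it to something like 75%.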
