Collaborator

@novusnota novusnota commented Nov 11, 2025

Closes #1132. From the issue:

Since the goal is to catch typos in PRs better, we can either manually specify common typos of 2 letter words (and letter omissions in 3 letter words), OR just ban 26*26 two-letter words and then add exceptions (it, be, am, an, etc.).

Well, I started specifying common typos, but then I discovered that the set of valid two-letter words is quite small in comparison, so I just banned 26*26*4 two-letter words instead, excluding the valid ones, of course.

The number comes from the following pseudo-formula: [a-z] twice * 4, where 4 is the number of case variants a two-letter word can have. For example, the word aa can be written as aa, aA, Aa, or AA. Since case sensitivity is turned on, each of those variants is considered a separate word.
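To make the arithmetic concrete, here is a minimal Python sketch of that enumeration. It is only an illustration, not the script used in this PR: the function names and the `ALLOWED` set are hypothetical, and it assumes an allowed word is exempt in all four of its case variants.

```python
from itertools import product
from string import ascii_lowercase

# Hypothetical allowlist for illustration; the real exceptions may differ.
ALLOWED = {"it", "be", "am", "an"}


def case_variants(word: str) -> list[str]:
    """Return every upper/lower-case spelling of `word`, e.g. aa, aA, Aa, AA."""
    choices = [(c.lower(), c.upper()) for c in word]
    return ["".join(chars) for chars in product(*choices)]


def banned_two_letter_words(allowed: set[str]) -> list[str]:
    """List all case variants of [a-z][a-z], skipping words in `allowed`."""
    banned = []
    for first, second in product(ascii_lowercase, repeat=2):
        base = first + second
        if base in allowed:
            # Assumption: an allowed word is exempt in every case variant.
            continue
        banned.extend(case_variants(base))
    return banned


print(case_variants("aa"))                  # ['aa', 'aA', 'Aa', 'AA']
print(len(banned_two_letter_words(set())))  # 2704, i.e. 26 * 26 * 4
```

With the four sample exceptions above, the list shrinks by 4 * 4 = 16 entries.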

However, I did not set the minimum checked word length to 2 characters in the config, for two reasons:

  1. Checks consult the dictionaries, while bans apply regardless of whether a word is allowed in a dictionary, i.e., independently of other checks.
  2. Further, the dictionaries' coverage of 2-character words is lacking, so any unbanned words would then have to be added to our resources/dictionaries/custom.txt file. Not nice.

Overall, I'm not sure if banning 2k+ words is a good idea. We might want to only restrict the common letter-omission typos of 3-letter words and be done with it. That said, the broader ban might be better in the long run, as people would become more conscious of spelling errors, and because such typos are tricky to spot with AI: it would assume "ar" to be "Augmented Reality" or Argentina's country code rather than "are" with a missed "e".
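For comparison, the narrower alternative (restricting only letter-omission typos of 3-letter words) could be sketched as follows. This is a hypothetical illustration: the function names are made up, and the sample allowlist is not the project's real one.

```python
def omission_typos(word: str) -> list[str]:
    """Return the distinct strings obtained by dropping exactly one letter."""
    typos = []
    for i in range(len(word)):
        candidate = word[:i] + word[i + 1:]
        if candidate not in typos:
            typos.append(candidate)
    return typos


def typos_to_ban(words: list[str], allowed: set[str]) -> list[str]:
    """Collect omission typos of `words`, minus valid two-letter words."""
    result = []
    for word in words:
        for typo in omission_typos(word):
            if typo not in allowed and typo not in result:
                result.append(typo)
    return result


print(omission_typos("are"))                 # ['re', 'ae', 'ar']
print(typos_to_ban(["are", "the"], {"he"}))  # ['re', 'ae', 'ar', 'te', 'th']
```

Note that some omissions are themselves valid words ("the" without its "t" is "he"), so this approach needs an exceptions list as well.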

@github-actions github-actions bot left a comment

No documentation issues detected.

Development

Successfully merging this pull request may close these issues.

[Spell] Consider further decreasing the minimum word length for CSpell checks from 3 to 2
