Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

☂️ Type-aware linter #3187

Open
Conaclos opened this issue Jun 12, 2024 · 32 comments
Open

☂️ Type-aware linter #3187

Conaclos opened this issue Jun 12, 2024 · 32 comments
Assignees
Labels
A-Linter Area: linter L-JavaScript Language: JavaScript and super languages

Comments

@Conaclos
Copy link
Member

Conaclos commented Jun 12, 2024

Description

This umbrella issue tracks the development of type-aware lint rules.
We first motivate our decision to implement our own type synthesizer, and then present the type-aware rules we intend to implement. In a second section, we present preliminary design material that indicates the high-level direction we want to take.

Motivation

Multiple rules from TypeScript ESLint requires type information. Moreover, several linter rules could be enhanced by using type information.

TypeScript ESLint uses the TypeScript Compiler API to get types. This architecture has the advantage of using the TypeScript Compiler. However, it has several drawbacks:

  • TSC is slow, and using the Compiler API makes the slowness even more noticeable.
    In fact, it is so slow that TypeScript ESLint provides presets with and without the rules that require type information.
  • Biome and TSC use their own AST, which makes interoperability difficult.

This is why we think it is important to implement our own type synthesiser. If we have a fast type synthesiser, then we could enhance many lint rules with a marginal performance overhead. Note that we are not trying to implement a full-fledged type system or a type checker like TypeScript. Many attempt, even the most promising failed.

We believe that the Biome type synthesiser doesn't need to be perfect or handle complex TypeScript types. Even a rudimentary type synthesiser could be valuable to many lint rules.

Type-aware lint rules

A first design and implementation of the Biome type synthesiser aims to implement a preliminary version of the following rules:

  • useAwaitThenable (await-thenable)

    Ensure that only thenable values are awaited. We could first target a rule that ensures that an awaited expression is a Promise. We could ignore values with an unknown type.

  • noFloatingPromises (no-floating-promises)

    Ensure that a promise is handled (returned, awaited, ...).

  • noForInArray (no-for-in-array)

    Ensure that for-in is not used on arrays.

  • noDuplicateLiteralEnumMembers (no-duplicate-enum-values)

    Ensure that every enum member initialized with a literal expression is unique. This doesn't necessarlly requires a type system. We need to compute literal expressions.

Funding

To support this effort, please consider sponsoring Biome within our Open Collective or GitHub sponsorship.

For companies wishing to speed up the development of this task, please consider hiring one of our contributors through our Enterprise Support.

@Conaclos Conaclos added A-Linter Area: linter L-JavaScript Language: JavaScript and super languages labels Jun 12, 2024
@Conaclos Conaclos pinned this issue Jun 14, 2024
@MrTact
Copy link

MrTact commented Jul 5, 2024

Maybe the approach is to identify the most critical rules, and implement type inference just for the required types? E.g. the first two rules appear to be promise-related, so figure out that a given identifier is a promise. This should still be built on a forward-looking type analyzer architecture, but by constraining the scope, you can make the effort much more manageable, since you don't have to achieve parity with tsc (a fool's errand anyway, which you point out). This also means that, post an admittedly non-trivial chunk of groundwork, additional rules can be developed (and shipped, ideally!) incrementally.

If the duplicate enum rule doesn't require a type checker, it should probably be peeled off into its own issue. Seems like it's bundled here just because it comes from typescript-eslint.

@henrik242

This comment has been minimized.

@Conaclos
Copy link
Member Author

I use tsc --noEmit && biome check . in my package.json. Can I drop tsc? What will I miss if I drop tsc?

Biome doesn't perform type checking and is not intended to do in the future.
Once this issue is resolved, Biome will be able to implement some rules from Typescript ESLint plugin that requires type information.

@scamden

This comment has been minimized.

@krzkaczor

This comment has been minimized.

@scamden

This comment has been minimized.

@krzkaczor

This comment has been minimized.

@trungutt

This comment was marked as off-topic.

@junaga
Copy link

junaga commented Aug 31, 2024

what about just shipping a default tsconfig.json, and then tell|ask|suggest Biome users in the docs to use it?

@testerez
Copy link

Note that we are not trying to implement a full-fledged type system or a type checker like TypeScript

Does it mean that types will be properly identified only when they are explicit? Or will Biome be able to follow complex type inferences?

@ericchase
Copy link

wanted to chime in about

i think this is an extremely important rule. just this past week, i've forgotten to await async functions dozens of times. i had to pull in eslint with just 2 rules related to this to check my code. tsc is so horrendously slow.

i'm not really sure why these few specific rules need type awareness while other rules don't, but i would love to see these rules added!

@awhitford
Copy link

I'd like to share that Astro Starlight cited the lack of a no-floating-promises equivalent was a show-stopper for adopting Biome over ESLint. 😞

@jpike88

This comment has been minimized.

@arendjr

This comment has been minimized.

@0x80

This comment has been minimized.

@dyc3

This comment has been minimized.

@misl-smlz

This comment has been minimized.

@0x80

This comment has been minimized.

@karlkeefer
Copy link

I just became a supporter of the project with this issue in mind. This would be a great addition to biome!

@gemue-parndt

This comment has been minimized.

@kaykdm
Copy link
Contributor

kaykdm commented Jan 17, 2025

For the no-floating-promises rule, in most cases, I think we can determine whether a given expression's identifier is a promise by checking the following:

  • It contains an async token.
  • It has a return type of Promise. (Consider recommending the useExplicitType rule alongside this to improve clarity.)

While there will inevitably be edge cases, I believe the following approach will handle the majority of scenarios effectively.


Detectable Cases

Functions

The following examples showcase functions that return promises:

async function returnsPromiseFunction() {
    return 'value';
}

const returnsPromiseArrowFunction = async () => {
    return 'value';
};

const returnsPromise = (): Promise<string> => {
    return Promise.resolve('value');
};

returnsPromiseFunction();
returnsPromiseArrowFunction();
returnsPromise();

Promises

The following examples involve promises directly:

Promise.all([]);
Promise.reject('value').catch();

const promise = new Promise((resolve, reject) => resolve('value'));  
promise;

Class or Object Methods

The following examples involve class or object methods returning promises:

class Api {
    async returnsPromise(): Promise<string> {
        return 'value';
    }

    someMethod() {
        this.returnsPromise();
    }
}

const api = new Api();
api.returnsPromise();

const obj = {
    async returnsPromise() {
        return 'value';
    }
};
obj.returnsPromise();

Cross-File Checking

Another aspect to consider is cross-file checking, such as identifying imported functions or classes. This can be addressed as follows:

  1. Identify the Import: If an identifier is imported, retrieve the source file path of the import.
  2. Parse the File: Provide the file path to the biome parser to generate the AST and traverse it to locate the exported module.
  3. Verify the Export: If the exported module matches the name of the original identifier, check whether it is a promise.

Edge Cases

While this approach addresses most scenarios, there are certain edge cases that need further consideration, such as:

  • Reassigned Promises: Instances where promises are reassigned to other variables.
  • Reexported Promises: Promises that are exported from one module and reexported in another.
  • Return Type Defined with union type: Scenarios where the type includes both promises and other types.
  • Return Type Defined with type or interface: Cases where return types are abstracted using type or interface.
type A = () => Promise<string>;
const a: A = () => {
    return Promise.resolve('value');
};
a();

@Conaclos @ematipico @arendjr
What do you think?
Do you think it is worth exploring? If so, I’d be happy to work on it.

@dyc3
Copy link
Contributor

dyc3 commented Jan 17, 2025

@kaykdm PRs are certainly welcome. For the multi file analysis, we are already discussing that in #3307, so I would hold off on that part for now. It would be a great way to explore what type checking in biome could look like. You should be able to use our semantic model to track variable reassignments and such.

@arendjr
Copy link
Contributor

arendjr commented Jan 17, 2025

I’m currently working on implementing multi-file analysis, which I think is an important prerequisite for getting type-enabled rules to work. As part of that, we’re also introducing data structures that track information from files that have been scanned in a project. In the next branch, we already have a first version of the ProjectLayout, which is important to be able to support monorepos. Right now I’m working on adding a DependencyGraph so that we can see which files import which others. The next step will be utilizing that to resolve imports to specific symbols inside files, and when we have that we can extend it with type information.

I don’t mind if you want to explore other parts of type inference already, but we’ll need to sync up at some point regarding the data model used, since it needs to be compatible with our multi-file approach. An important part of that is that we don’t want to lazily resolve imports and parse the files they point to, because such an approach makes it hard to cache and invalidate results. If you want you can look at how we construct and use the current ProjectLayout, because it should be somewhat similar to that, with entries keyed by path, and each values containing the symbols with their types for that file.

Let me know if there’s more you wish to discuss!

@kaykdm
Copy link
Contributor

kaykdm commented Jan 18, 2025

@dyc3 @arendjr
Thank you for the reply and for letting me know about the ongoing discussion on multi-file analysis. I’ll focus solely on the type-checking aspect and leave the multi-file analysis for now. My first PR will target handling simple async functions.

@arendjr
Copy link
Contributor

arendjr commented Jan 18, 2025

That would be a great start, thank you!

@ematipico
Copy link
Member

I know that also @togami2864 is working on a prototype for the type checker. Maybe we could open a task or something for coordination

@mehulkar

This comment has been minimized.

@arendjr
Copy link
Contributor

arendjr commented Jan 21, 2025

I see this issue is attracting quite some messages from people wishing to express how much they "need" this feature. Please consider that such messages may create noise and pressure for our devs, so we would like to reserve this issue for technical discussion only.

For companies that may wish to speed up the development of this work, you may be interested to hear we offer Enterprise Support now.

@kaykdm
Copy link
Contributor

kaykdm commented Jan 24, 2025

FYI: I created a task issue to track the remaining work for implementing noFloatingPromises. Please let me know if I'm missing anything.
#4956

@GermanJablo
Copy link

This is why we think it is important to implement our own type synthesiser. If we have a fast type synthesiser, then we could enhance many lint rules with a marginal performance overhead. Note that we are not trying to implement a full-fledged type system or a type checker like TypeScript. Many attempt, even the most promising failed.

We believe that the Biome type synthesiser doesn't need to be perfect or handle complex TypeScript types. Even a rudimentary type synthesiser could be valuable to many lint rules.

  1. Are you sure about this? I hope I'm wrong, but I have a hard time imagining how a type synthesizer that isn't fully tsc compliant could reliably warn about type-based rules.
  2. Is the long-term goal to cover 100% of typescript-eslint rules, or just a subset?

@arendjr
Copy link
Contributor

arendjr commented Jan 24, 2025

Are you sure about this? I hope I'm wrong, but I have a hard time imagining how a type synthesizer that isn't fully tsc compliant could reliably warn about type-based rules.

The proof will be in the pudding, of course, but yes, I too believe we will be able to achieve (significant) value even where our type inference is still incomplete. The reason for this is that most use cases only require a subset of TypeScript features in order to determine whether something is of a given type. As long as we reliably fall back to an internal "unknown" type (for which we probably can just use TypeScript's actual unknown type) in the presence of unsupported type features, we can avoid our lint rules from triggering false positives (diagnostics for code that shouldn't trigger them) while allowing some false negatives (code that should trigger a diagnostic but doesn't) to remain.

This puts a lot of weight on the word "reliably" in your question. Our solution wouldn't be 100% reliable as long as we're not fully tsc compliant, because false negatives would still exist. But as long as we avoid the false positives (which are really obnoxious in lint rules) even 80% reliability can already bring a lot of value to users. Of course which percentage of reliability you will actually get depends a lot on what your code looks like and which libraries you use, so it may very well be a "your mileage may vary" kinda situation. But the bet we're making is that we can offer significant value to many users, and we can probably improve our reliability over time.

Is the long-term goal to cover 100% of typescript-eslint rules, or just a subset?

I don't think we even support 100% of regular ESLint rules, so I don't see that as an explicit goal. But I do think that all the typescript-eslint rules should be theoretically implementable by us. It'll just be a matter of priority and complexity which ones we'll tackle and when.

@GermanJablo
Copy link

GermanJablo commented Jan 24, 2025

Our solution wouldn't be 100% reliable as long as we're not fully tsc compliant, because false negatives would still exist. But as long as we avoid the false positives (which are really obnoxious in lint rules) even 80% reliability can already bring a lot of value to users

Thank you for the detailed response!

Personally, I'll stick with 100% typescript reliability, even if it comes at the cost of a slightly slower linter. That said, I don’t mean to undervalue the thought process behind your decision. I truly appreciate the tremendous work you’re doing and fully intend to use Biome as a formatter. But I guess as a linter it's not for me.

EDIT: Another point I'd like to point out: As you noted, the stc project failed. But tsc keeps improving in performance and it's probably just a matter of time before someone rewrites it with improved performance. Biome already has a very ambitious scope, so I don't think it's good to embark on a type-checker that will take a long time, and won't get 100% compliant tsc. I just wanted to leave my feedback for the record in case you take it into consideration. Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Linter Area: linter L-JavaScript Language: JavaScript and super languages
Projects
None yet
Development

No branches or pull requests