Create a copy of TypeQuery specialized for bool #9604

vbarbaresi · 2020-10-17T09:26:38Z

Description

I tried implementing the optimization described in #7128.

I created class TypeQueryBool(SyntheticTypeVisitor[bool]).It's a specialized TypeQuery,
imlementating. all or any strategies.

Some code and comments are duplicated, so I'm not entirely satisfied with the implementation.
I'm opening the PR for some review about the implementation and the testing method.

Test Plan

I forked mypy_mypyc-wheels project and I built mypyc wheels with travis, to benchmark.

I used v0.800+dev.952ca239e6126aff52067a37869bd394eb08cf21 as the reference, and compared it with my exact changes added on top: https://github.com/vbarbaresi/mypy_mypyc-wheels/releases/tag/v0.800%2Bdev.f7129b21

I'm using macOS 10.15, python 3.8.6 and the following benchmark script (I know it's not ideal because it counts the startup time of the python process):

import sys
import pyperf

runner = pyperf.Runner()


runner.bench_command(
    'mypy_master',
    [
        '/tmp/venv_mypy_master/bin/mypy',
        '--config-file=mypy/mypy_self_check.ini',
        '/tmp/bench/mypy/mypy',
        '--no-incremental'
    ]
)


runner.bench_command(
    'mypy_typequerybool',
    [
        '/tmp/venv_mypy_typequerybool/bin/mypy',
        '--config-file=mypy/mypy_self_check.ini',
        '/tmp/bench/mypy/mypy',
        '--no-incremental'
    ]
)

I tried switching the order, running my branch first, then mypy master.
I wasn't able to get a 0.2 % accuracy, so I'm not entirely confident in the results.
I ran this bench more than 10 times and observed my branch performing very slightly better in a lot of runs

Typical run results:

Minimum:         10.9 sec
Median +- MAD:   11.0 sec +- 0.0 sec
Mean +- std dev: 11.0 sec +- 0.1 sec
Maximum:         11.1 sec

  0th percentile: 10.9 sec (-1% of the mean) -- minimum
  5th percentile: 11.0 sec (-1% of the mean)
 25th percentile: 11.0 sec (-0% of the mean) -- Q1
 50th percentile: 11.0 sec (+0% of the mean) -- median
 75th percentile: 11.1 sec (+0% of the mean) -- Q3
 95th percentile: 11.1 sec (+1% of the mean)
100th percentile: 11.1 sec (+1% of the mean) -- maximum

Number of outlier (out of 10.9 sec..11.2 sec): 0

mypy_master: Mean +- std dev: 11.0 sec +- 0.1 sec


Minimum:         10.9 sec
Median +- MAD:   11.0 sec +- 0.0 sec
Mean +- std dev: 11.0 sec +- 0.1 sec
Maximum:         11.1 sec

  0th percentile: 10.9 sec (-1% of the mean) -- minimum
  5th percentile: 10.9 sec (-1% of the mean)
 25th percentile: 10.9 sec (-0% of the mean) -- Q1
 50th percentile: 11.0 sec (+0% of the mean) -- median
 75th percentile: 11.0 sec (+0% of the mean) -- Q3
 95th percentile: 11.1 sec (+1% of the mean)
100th percentile: 11.1 sec (+1% of the mean) -- maximum

Number of outlier (out of 10.8 sec..11.1 sec): 0

mypy_typequerybool: Mean +- std dev: 11.0 sec +- 0.1 sec

On some other runs the mean was 10.9 sec for my branch, and still 11.0 on master (but always with +- 0.1 sec std dev)

msullivan

This looks solid. The one question is whether we think it helps enough to be worth the extra code. I had one suggestion for a potential perf improvement, so try that out.

Also, it looks like you accidentally included a typeshed submodule update.

msullivan · 2020-10-18T02:05:28Z

mypy/type_visitor.py

+        self.seen_aliases = set()  # type: Set[TypeAliasType]
+
+    def bool_strategy_empty(self) -> bool:
+        if self.strategy == self.STRATEGY_ALL:


Try accessing the constants as TypeQueryBool.STRATEGY_ALL instead. mypyc should directly emit a constant in that case, and it'll emit an object lookup for self. (That's probably a bug.) The object lookup should be cheap, but is still a memory access and an error check, so the constant might be better.

mypy/type_visitor.py

sobolevn

👍

sobolevn · 2020-10-18T09:05:50Z

mypy/type_visitor.py

+
+    # TODO: check that we don't have existing violations of this rule.
+    """
+    STRATEGY_ANY = 0  # type: Final[int]


Shouldn't it be just : Final?

from typing_extensions import Final class Test(object): a = 1 # type: Final b = 2 # type: Final[int] reveal_type(Test.a) # Revealed type is 'Literal[1]?' reveal_type(Test.b) # Revealed type is 'builtins.int'

The doc says that if we omit the type, mypy infers it
https://mypy.readthedocs.io/en/stable/final_attrs.html#syntax-variants
I don't know what conclusion to draw from your example, is it better to have a Literal or builtins.int?

I am not sure about the mypy team, but I personally prefer Literal, because it is more specific. And constants should be specific. 👍

I changed them to Final 👌

sobolevn · 2020-10-18T09:07:05Z

mypy/type_visitor.py

@@ -347,3 +349,121 @@ def query_types(self, types: Iterable[Type]) -> T:
                self.seen_aliases.add(t)
            res.append(t.accept(self))
        return self.strategy(res)
+
+
+class TypeQueryBool(SyntheticTypeVisitor[bool]):


Related: #9602

thanks for the headsup, I added the decorator for allow_interpreted_subclasses=True

msullivan · 2020-10-20T21:38:53Z

Does this change the perf impact at all?

vbarbaresi · 2020-10-21T18:36:55Z

I tried a few runs and I didn't see any perf impact, I have the exact same run times. The accuracy of my bench is probably not good enough to catch this kind of improvement

vbarbaresi · 2020-10-21T18:51:07Z

The one question is whether we think it helps enough to be worth the extra code.

I agree, it's not clear if it's worth it.

I want to give another shot at benchmarking this in the next few days.
I'll try another approach: running multiple iterations of mypy main entrypoint in a loop in the same Python process.
That could give more accurate results (and allow to run more iterations faster) than spawning a new process for each run.
Does that make sense? If you have other ideas, I'd be happy to try them as well

hauntsaninja · 2021-08-04T09:14:50Z

Thanks for investigating this! Seems like we weren't able to get this to a point where it was worth it, so closing. But perf work / investigation is always appreciated :-)

vbarbaresi added 6 commits October 11, 2020 14:57

Try specializing TypeQuery[bool] to remove any() and all() calls

cd17881

refactor

ab4f0c2

simplify the loop even more

6423a02

refactor TypeQueryBool strategy constants in the class

0a838bd

fix style

f7129b2

fix bad docstring on query_types

d264791

msullivan reviewed Oct 18, 2020

View reviewed changes

sobolevn reviewed Oct 18, 2020

View reviewed changes

vbarbaresi added 4 commits October 20, 2020 09:53

Merge remote-tracking branch 'origin/master' into typequerybool

da9dca0

remove useless comment about bool strategies

65d37f6

access strategy constants from class attributes instead of instances

900699d

add allow_interpreted_subclasses to TypeQueryBool

f6c5c8e

bool strategy constants: use Final instead of Final[int]

de55732

JukkaL mentioned this pull request Mar 10, 2021

Speed up TypeQuery #7128

Open

hauntsaninja closed this Aug 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Create a copy of TypeQuery specialized for bool #9604

Create a copy of TypeQuery specialized for bool #9604

Uh oh!

vbarbaresi commented Oct 17, 2020

Uh oh!

msullivan left a comment

Uh oh!

msullivan Oct 18, 2020

Uh oh!

Uh oh!

sobolevn left a comment

Uh oh!

sobolevn Oct 18, 2020

Uh oh!

vbarbaresi Oct 20, 2020

Uh oh!

sobolevn Oct 20, 2020

Uh oh!

vbarbaresi Oct 21, 2020

Uh oh!

sobolevn Oct 18, 2020

Uh oh!

vbarbaresi Oct 20, 2020

Uh oh!

msullivan commented Oct 20, 2020

Uh oh!

vbarbaresi commented Oct 21, 2020

Uh oh!

vbarbaresi commented Oct 21, 2020

Uh oh!

hauntsaninja commented Aug 4, 2021

Uh oh!

Uh oh!

Uh oh!

Create a copy of TypeQuery specialized for bool #9604

Create a copy of TypeQuery specialized for bool #9604

Uh oh!

Conversation

vbarbaresi commented Oct 17, 2020

Description

Test Plan

Uh oh!

msullivan left a comment

Choose a reason for hiding this comment

Uh oh!

msullivan Oct 18, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sobolevn left a comment

Choose a reason for hiding this comment

Uh oh!

sobolevn Oct 18, 2020

Choose a reason for hiding this comment

Uh oh!

vbarbaresi Oct 20, 2020

Choose a reason for hiding this comment

Uh oh!

sobolevn Oct 20, 2020

Choose a reason for hiding this comment

Uh oh!

vbarbaresi Oct 21, 2020

Choose a reason for hiding this comment

Uh oh!

sobolevn Oct 18, 2020

Choose a reason for hiding this comment

Uh oh!

vbarbaresi Oct 20, 2020

Choose a reason for hiding this comment

Uh oh!

msullivan commented Oct 20, 2020

Uh oh!

vbarbaresi commented Oct 21, 2020

Uh oh!

vbarbaresi commented Oct 21, 2020

Uh oh!

hauntsaninja commented Aug 4, 2021

Uh oh!

Uh oh!