The idea is simple: the new parser doesn't need the GIL, so we can parse
files in parallel. Because it is tricky to apply parallelization _only_
to parallelizable code, the most I see is a ~4-5x speed-up with 8
threads. Adding more threads doesn't make it visibly faster (I have 16
physical cores).
Some notes on implementation:
* I use the stdlib `ThreadPoolExecutor`; it seems to work OK.
* I refactored `parse_file()` a bit, so that we can parallelize (mostly)
just the actual parsing. I see measurable degradation if I try to
parallelize all of `parse_file()`.
* I do not always use `psutil` because it is an optional dependency. We may
want to actually make it a required dependency at some point.
* It looks like there is a weird mypyc bug that causes `ast_serialize`
to be `None` sometimes in some threads. I simply add an ugly workaround
for now.
* It looks like I need to apply `wrap_context()` more consistently now. A
bunch of tests used to pass accidentally before.
* I only implement parallelization in the coordinator process. The
workers counterpart can be done after
#21119 is merged (it will be
trivial).
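
For readers unfamiliar with the pattern, here is a minimal sketch of the general shape described above: only the parsing step is handed to a `ThreadPoolExecutor`, and the worker count is chosen with `psutil` when it is installed and with `os.cpu_count()` otherwise. This is not the code in this change. `parse_source()`, `parse_all()`, and `available_cores()` are hypothetical names, and the cap of 8 workers is an assumption based only on the measurement reported above. The stand-in parser is pure Python and holds the GIL, so the sketch shows the structure and will not reproduce the speed-up.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Optional

try:
    import psutil  # optional dependency, may be missing
except ImportError:
    psutil = None


def available_cores() -> int:
    """Prefer the physical core count from psutil, else fall back to os.cpu_count()."""
    if psutil is not None:
        physical = psutil.cpu_count(logical=False)  # may return None
        if physical:
            return physical
    return os.cpu_count() or 1


def parse_source(source: str) -> int:
    """Hypothetical stand-in for the GIL-free parser: counts non-empty lines."""
    return sum(1 for line in source.splitlines() if line.strip())


def parse_all(sources: Dict[str, str], max_workers: Optional[int] = None) -> Dict[str, int]:
    """Run only the parsing step in threads; everything else stays sequential."""
    if max_workers is None:
        # Assumed cap: no visible gain was reported beyond 8 threads.
        max_workers = min(8, available_cores())
    names = list(sources)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(parse_source, (sources[n] for n in names)))
    return dict(zip(names, results))
```

Because `pool.map` returns results in input order, any sequential post-processing can consume them deterministically, regardless of which thread finished first.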