parallelize getRootSummary computations in dep analysis downsweep

Fixes #20891.

I have only benchmarked a trivial -M run so far, where it reduced the total time from 2.5s to 1.1s.

The implementation just reuses the machinery from the upsweep part, creating one action per target. This results in many threads, most of which block until the semaphore releases a slot. I'd assume the overhead to be negligible, but if someone has a different opinion we could also batch the targets into bundles of NCPU.
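For readers unfamiliar with the pattern, here is a minimal sketch of "one action per target, gated by a semaphore", using only `base`. It is not the code in this MR: `mapConcurrentlyBounded`, `spawn`, and `collect` are hypothetical names, and a plain `QSem` stands in for whatever semaphore the real upsweep machinery uses.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
import Control.Exception (SomeException, bracket_, throwIO, try)

-- | Fork one thread for a single input (hypothetical helper).
-- The thread blocks in 'waitQSem' until a slot is free, runs the action,
-- and releases the slot even if the action throws.
spawn :: QSem -> (a -> IO b) -> a -> IO (MVar (Either SomeException b))
spawn sem f x = do
  var <- newEmptyMVar
  _ <- forkIO $ do
    r <- try (bracket_ (waitQSem sem) (signalQSem sem) (f x))
    putMVar var r
  pure var

-- | Wait for one result, rethrowing any exception from the worker thread.
collect :: MVar (Either SomeException b) -> IO b
collect var = takeMVar var >>= either throwIO pure

-- | Run one action per input with at most @slots@ running at once.
-- All threads are forked up front; results keep the input order.
mapConcurrentlyBounded :: Int -> (a -> IO b) -> [a] -> IO [b]
mapConcurrentlyBounded slots f xs = do
  sem <- newQSem slots
  vars <- mapM (spawn sem f) xs
  mapM collect vars
```

In this sketch, something like `mapConcurrentlyBounded ncpu summarise targets` (with `ncpu` and `summarise` as placeholders) forks a thread per target but lets only `ncpu` of them work at a time. The bundling alternative would instead fork roughly `ncpu` threads, each processing a chunk of the targets sequentially, trading fewer blocked threads for coarser load balancing.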

Note: Without !12607 (closed), benchmarking -M produces severely distorted results.

Edited by Torsten Schmits