Improve processing of indent in GHC's pretty implementation.
Motivation
Core/STG dumps usually contain copious amounts of indentation.
While this is expected, I recently ran into a case where GHC took about 15s to compile but, in practice, never finished once I enabled dumping, because of an unreasonably large nested expression (nested 10,000 levels deep) and the indent that comes with it.
Now, I don't think it's unreasonable for that to be the case, but GHC was writing out dumps at slightly over 4MB/s for content which is >>99% indent.
In GHCi I reach ~20MB/s using replicateM_ 10000000 (hPutStr h " ") >> hClose h, and that's clearly not an efficient implementation.
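For anyone who wants to reproduce that baseline outside of a one-liner, here is a minimal sketch of a timing helper. The name naiveThroughput and the output path are made up for illustration, and the numbers will differ between GHCi and compiled code, so treat any result as a rough indication only.

```haskell
import Control.Monad (replicateM_)
import GHC.Clock (getMonotonicTime)
import System.IO

-- Hypothetical helper: write n single spaces through hPutStr, one call
-- per space, and return the observed throughput in MB/s.
naiveThroughput :: FilePath -> Int -> IO Double
naiveThroughput path n = do
  start <- getMonotonicTime              -- seconds, as a Double
  h <- openFile path WriteMode
  replicateM_ n (hPutStr h " ")          -- same loop as the GHCi one-liner
  hClose h
  end <- getMonotonicTime
  -- n bytes written (one byte per space in an ASCII-compatible encoding)
  return (fromIntegral n / (end - start) / 1e6)
```

In GHCi this can be run as naiveThroughput "spaces.tmp" 10000000 to get a figure comparable to the one quoted above.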
Problem
As far as I can tell, indent is represented by RStr n ' '.
This is then eventually put out via put (RStr n c) next = hPutStr hdl (replicate n c) >> next.
This is oh so terrible:
- We go through the encoding path of hPutStr for each indent.
- We allocate the indent again each time.
- We allocate the indent as a String.
Proposal
Special-case the printing of large indent in dumps. It's not clear to me how best to achieve this.
So for now I'm just opening this ticket to document the issue.
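One possible direction, purely as a sketch and not a worked-out patch: keep a single preallocated block of spaces and write slices of it as raw bytes, so that an indent neither allocates a fresh String nor goes through the encoding path of hPutStr. Everything below is an assumption for illustration: TextDetails is a simplified stand-in for the real type, hPutSpaces and chunkSize are made-up names, and writing raw bytes is only correct if the handle uses an ASCII-compatible encoding such as UTF-8.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import System.IO

-- Simplified stand-in for the pretty printer's text details
-- (the real type has more constructors).
data TextDetails = Str String | RStr Int Char

-- Hypothetical size of the shared block of spaces.
chunkSize :: Int
chunkSize = 4096

-- Allocated once and shared by every indent that is written.
spaces :: B.ByteString
spaces = BC.replicate chunkSize ' '

-- Write n spaces as raw bytes. B.take is an O(1) slice of the shared
-- block, so no per-indent String is built. Assumes an ASCII-compatible
-- handle encoding, since the bytes bypass the handle's text encoder.
hPutSpaces :: Handle -> Int -> IO ()
hPutSpaces hdl n
  | n <= 0         = return ()
  | n <= chunkSize = B.hPut hdl (B.take n spaces)
  | otherwise      = B.hPut hdl spaces >> hPutSpaces hdl (n - chunkSize)

-- Same shape as the existing put, with runs of spaces special-cased.
put :: Handle -> TextDetails -> IO () -> IO ()
put hdl (RStr n ' ') next = hPutSpaces hdl n >> next
put hdl (RStr n c)   next = hPutStr hdl (replicate n c) >> next
put hdl (Str s)      next = hPutStr hdl s >> next
```

Whether this is acceptable for real dumps depends on how the dump handles are configured, in particular their encoding, which I have not checked.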