Fix fusion for GHC's utility functions
In a patchset I'm working on, I encountered a smallish "bytes allocated" regression in perf/compiler/T5631. After some sleuthing, I discovered that the regression is caused by using a zipWith3M where previously there was a zipWithM. Interestingly, it was not the zipped function that caused the problem, but the zipWith3M itself. Further sleuthing found that zipWithM is inlined into a combination of sequenceA and zipWith, both of which have been taught to work nicely with fusion.
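To make the contrast concrete, here is a minimal sketch of the two shapes involved. `zipWithM'` mirrors the way base's `zipWithM` is written (an inlined `sequenceA` over `zipWith`); `zipWith3MRec` is an assumption about what a hand-rolled recursive utility looks like, not a copy of the MonadUtils source. The primed and suffixed names are hypothetical, chosen to avoid clashing with the real functions.

```haskell
-- Same shape as base's zipWithM: build the list of actions with zipWith,
-- then run them with sequenceA. Once inlined, the optimiser sees zipWith and
-- sequenceA directly, and both take part in list fusion.
zipWithM' :: Applicative m => (a -> b -> m c) -> [a] -> [b] -> m [c]
{-# INLINE zipWithM' #-}
zipWithM' f xs ys = sequenceA (zipWith f xs ys)

-- Assumed shape of a hand-written three-list variant: explicit recursion,
-- no pragmas, so there is nothing for the fusion rules to rewrite.
zipWith3MRec :: Monad m => (a -> b -> c -> m d) -> [a] -> [b] -> [c] -> m [d]
zipWith3MRec f (x:xs) (y:ys) (z:zs) = do
  r  <- f x y z
  rs <- zipWith3MRec f xs ys zs
  return (r : rs)
zipWith3MRec _ _ _ _ = return []
```

Both compute the same results; the difference is only in what the optimiser can do with them at a call site.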
I thought of doing the same for GHC's zipWith3M, but I discovered that base's zipWith3 has no rules or inlining pragmas, suggesting that it does not fuse. This surprised me; I had assumed that every function in base as basic as zipWith3 would play nicely with fusion. (Perhaps it does fuse -- I didn't actually check.) I then looked at GHC's MonadUtils, which defines a bunch of functions, some of which must be called very heavily, and only one of which has any kind of performance-enhancing pragma. (That one is zipWithAndUnzipM, whose INLINABLE pragma was added only after much pain suffered by your dear author.)
So: this ticket is a request to sweep through MonadUtils and Util and perhaps ListSetOps, looking for performance enhancements via fusion, tighter definitions, and such. Also, get zipWith3 (and the other zipWiths) to fuse in base.
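As a rough illustration of what "getting zipWith3 to fuse" could mean, the sketch below writes a three-list zip in terms of `build`, which makes its result a candidate for foldr/build fusion with whatever consumes it. This is not how base defines `zipWith3`, and it does not fuse with its three input lists; `zipWith3Build` and `zipWith3M'` are hypothetical names. Whether this shape recovers the allocation lost in T5631 is untested.

```haskell
import GHC.Exts (build)

-- A zipWith3 whose output is produced through build, so a fusible consumer
-- (sequenceA, foldr, ...) can avoid materialising the intermediate list.
zipWith3Build :: (a -> b -> c -> d) -> [a] -> [b] -> [c] -> [d]
{-# INLINE zipWith3Build #-}
zipWith3Build f xs ys zs = build (\cons nil ->
  let go (p:ps) (q:qs) (r:rs) = f p q r `cons` go ps qs rs
      go _      _      _      = nil   -- stop at the shortest input
  in go xs ys zs)

-- zipWith3M in the same style as base's zipWithM: inline down to a
-- sequenceA over the zipped list of actions.
zipWith3M' :: Applicative m => (a -> b -> c -> m d) -> [a] -> [b] -> [c] -> m [d]
{-# INLINE zipWith3M' #-}
zipWith3M' f xs ys zs = sequenceA (zipWith3Build f xs ys zs)
```

A proper fix in base would more likely follow the existing `zipWith` approach (rewrite rules that switch to a fusible form and back again when no fusion happens), so that unfused call sites do not get slower.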
I figure this is all some fairly low-hanging fruit for performance improvements in GHC.
Trac metadata
| Trac field | Value |
|---|---|
| Version | 8.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |