String literals are wasting space
For D199 I looked into how string literals are compiled down by GHC.
On 64-bit OS X, a simple string "AAA" turns into assembly:
.const
.align 3
.align 0
c38E_str:
.byte 65
.byte 65
.byte 65
.byte 0
(And also something that invokes unpackCString#, but that isn't relevant here.)
(MkCore.mkStringExprFS -> CmmUtils.mkByteStringCLit -> compiler/nativeGen/X86/Ppr.pprSectionHeader.)
Note how this:
- Is 8-byte aligned (.align 3 requests 2^3 = 8 bytes).
- Is placed in a .const section.
I can't find any reason why string literals would need to be 8-byte aligned on OS X. Reading data that starts on an 8-byte boundary might give a small performance benefit, but I doubt it makes a meaningful difference for string literals. The assembly produced by both clang and gcc does not align string literals.
The trivial program:
main :: IO ()
main = return ()
has almost 5kB of space wasted on padding between the strings the Prelude brings in, when built with GHC HEAD.
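To illustrate where such padding comes from, here is a minimal sketch that lays out NUL-terminated strings back to back and counts the bytes inserted to reach each alignment boundary. The function names and the sample strings are made up for illustration and are not taken from GHC or from the actual Prelude data; the sketch also assumes one byte per character.

```haskell
module Main where

-- | Bytes of padding needed so that an object placed at offset 'off'
-- starts on a multiple of 'align' (which must be positive).
paddingAt :: Int -> Int -> Int
paddingAt align off = (align - off `mod` align) `mod` align

-- | Total padding inserted when laying out NUL-terminated strings one
-- after another, each starting on an 'align'-byte boundary.
-- Assumes one byte per character (plain ASCII literals).
totalPadding :: Int -> [String] -> Int
totalPadding align = snd . foldl step (0, 0)
  where
    step (off, wasted) s =
      let pad  = paddingAt align off
          size = length s + 1  -- payload plus the trailing NUL byte
      in (off + pad + size, wasted + pad)

main :: IO ()
main = do
  -- Hypothetical literals, chosen only to show the effect.
  let strs = ["AAA", "base", "Prelude", "GHC.Base"]
  putStrLn ("8-byte aligned: " ++ show (totalPadding 8 strs) ++ " bytes of padding")
  putStrLn ("unaligned:      " ++ show (totalPadding 1 strs) ++ " bytes of padding")
```

Short strings are hit hardest: a 4-byte literal such as "AAA" plus its NUL is followed by 4 bytes of padding before the next 8-byte-aligned literal can start.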
The fact that it is a .const section instead of a .cstring section (https://developer.apple.com/library/mac/documentation/DeveloperTools/Reference/Assembler/040-Assembler_Directives/asm_directives.html#//apple_ref/doc/uid/TP30000823-TPXREF127) means duplicate strings aren't shared by the assembler. GHC floats string literals out to the top level and uses CSE to eliminate duplicates, but that only works within a single module. Strings shared between different modules end up as duplicate strings in an executable.
The same program as above also has ~4kB of wasted space due to duplicate Prelude strings ("base" occurs 16 times!). Compared to the total binary size (4MB after stripping), removing this redundant data wouldn't be a big improvement (0.2%), but I still think it can be a worthwhile optimization.
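The cost of cross-module duplicates can be estimated with a sketch like the following, which takes the literals from several modules and counts the bytes that would disappear if each distinct string were stored only once. The module contents are hypothetical, and `duplicateBytes` is an illustrative helper, not something GHC provides.

```haskell
module Main where

import Data.List (group, sort)

-- | Bytes that could be saved if every distinct literal were stored once.
-- Each copy costs its length plus one byte for the trailing NUL
-- (assuming one byte per character).
duplicateBytes :: [String] -> Int
duplicateBytes lits =
  sum [ (length g - 1) * (length s + 1) | g@(s:_) <- group (sort lits) ]

main :: IO ()
main = do
  -- Hypothetical literals from three separately compiled modules.
  -- CSE within one module cannot see the copies in the other two.
  let moduleA = ["base", "Prelude"]
      moduleB = ["base", "GHC.Base"]
      moduleC = ["base"]
  print (duplicateBytes (moduleA ++ moduleB ++ moduleC))
```

Here "base" appears in all three modules, so two of its three 5-byte copies are redundant.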
I think this can be solved quite easily by creating a new section header for literal strings, which is unaligned and of type .cstring.
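As a rough sketch of that idea, the following models the choice of section header with a small data type and renders the directives for a literal. The `Section` type and both functions are hypothetical and do not correspond to GHC's actual native code generator types; only the directives themselves (.const, .align, .cstring, .byte) are real Mach-O assembler syntax.

```haskell
module Main where

-- | Hypothetical section kinds; these are NOT GHC's actual constructors.
data Section
  = ReadOnlyData  -- how string literals are emitted in the listing above
  | CString       -- proposed: unaligned section for NUL-terminated literals
  deriving (Eq, Show)

-- | Assembler directives emitted before the label (Mach-O syntax).
sectionHeader :: Section -> [String]
sectionHeader ReadOnlyData = [".const", ".align 3", ".align 0"]
sectionHeader CString      = [".cstring"]

-- | Render a literal as a header, a label, and one .byte per character,
-- followed by the terminating NUL. Assumes one byte per character.
renderLiteral :: Section -> String -> String -> String
renderLiteral sec label str =
  unlines (sectionHeader sec ++ [label ++ ":"] ++ body)
  where
    body = map (\c -> "\t.byte " ++ show (fromEnum c)) str ++ ["\t.byte 0"]

main :: IO ()
main = do
  putStr (renderLiteral ReadOnlyData "c38E_str" "AAA")
  putStrLn "-- versus --"
  putStr (renderLiteral CString "c38E_str" "AAA")
```

Only the header differs between the two renderings; the label and the bytes of the literal stay the same.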
Trac metadata
| Trac field | Value |
|---|---|
| Version | 7.8.2 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | low |
| Resolution | Unresolved |
| Component | Compiler (NCG) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |