String literals are wasting space
For D199 I looked into how string literals are compiled down by GHC.
On 64-bit OS X, a simple string "AAA" turns into assembly:
    .const
    .align 3
    .align 0
    c38E_str:
        .byte 65
        .byte 65
        .byte 65
        .byte 0
(And also something that invokes unpackCString#, but that isn't relevant here.)
Note how this:
- Is 8-byte aligned.
- Is placed in a .const section.
I can't find any reason why string literals would need to be 8-byte aligned on OS X. There might be a small performance benefit to reading data that starts on an 8-byte boundary, but I doubt it makes a meaningful difference for string literals. Neither clang nor gcc aligns string literals in the assembly it emits.
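To make the cost of the alignment concrete, here is a minimal sketch of the padding arithmetic. It assumes literals are NUL-terminated, one byte per character (ASCII only), and laid out back to back so that each one starts on the requested boundary. The function names are made up for illustration and are not part of GHC.

```haskell
-- Bytes occupied by a NUL-terminated literal.
-- Assumption: one byte per character (ASCII only).
literalSize :: String -> Int
literalSize s = length s + 1

-- Padding needed after a literal so that the next one starts on a
-- multiple of `alignment` bytes.
paddingAfter :: Int -> String -> Int
paddingAfter alignment s =
  (alignment - literalSize s `mod` alignment) `mod` alignment

-- Total padding for a list of literals laid out back to back.
totalPadding :: Int -> [String] -> Int
totalPadding alignment = sum . map (paddingAfter alignment)
```

Under these assumptions, `paddingAfter 8 "AAA"` is 4: the literal takes 4 bytes including the terminator, and another 4 bytes are skipped to reach the next 8-byte boundary. With an alignment of 1, the padding is always 0.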
The trivial program:
    main :: IO ()
    main = return ()
has almost 5kB of space wasted on padding between the strings that the Prelude brings in, when built with GHC HEAD.
The fact that it is a .const section instead of a .cstring section (https://developer.apple.com/library/mac/documentation/DeveloperTools/Reference/Assembler/040-Assembler_Directives/asm_directives.html#//apple_ref/doc/uid/TP30000823-TPXREF127) means duplicate strings aren't merged by the linker. GHC floats string literals out to the top level and uses CSE to eliminate duplicates, but that only works within a single module. Strings shared between different modules end up duplicated in the executable.
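The effect of per-module deduplication can be sketched as follows. This is only a model, not GHC's actual CSE pass: each module is represented as a plain list of its string literals, literals are assumed to be ASCII and NUL-terminated, and all names are hypothetical.

```haskell
import Data.List (nub)

-- Bytes occupied by a NUL-terminated literal.
-- Assumption: one byte per character (ASCII only).
literalSize :: String -> Int
literalSize s = length s + 1

-- Bytes emitted when duplicates are removed only within each module,
-- which is the situation described above.
emittedBytes :: [[String]] -> Int
emittedBytes = sum . map (sum . map literalSize . nub)

-- Bytes needed if identical literals were also merged across modules.
mergedBytes :: [[String]] -> Int
mergedBytes = sum . map literalSize . nub . concat

-- Bytes spent on copies that cross-module merging would remove.
duplicateBytes :: [[String]] -> Int
duplicateBytes modules = emittedBytes modules - mergedBytes modules
```

For example, `duplicateBytes [["base", "GHC.Base"], ["base", "GHC.Show"], ["base", "base"]]` is 10 in this model: the repeated "base" inside the third module is already removed per module, but the two extra copies in other modules (5 bytes each) remain.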
The same program as above also has ~4kB of wasted space due to duplicate Prelude strings ("base" occurs 16 times!). Compared to the total binary size (4MB after stripping), removing the padding and the duplicates together wouldn't be a big improvement (about 0.2%), but I still think it can be a worthwhile optimization.
I think this can be solved quite easily by creating a new section header for literal strings, which is unaligned and of type