Skip to content

Deriving Read instance from datatype with N fields leads to N^2 code size growth

The problem observed on GHC's code base when explored why exactly some object files are so huge.

Let's look at the stats of text code section sizes on today's HEAD (skipping GHCi libraries and stage1 binaries):

~/dev/git/ghc $ size `find -name *.o -type f | grep -v stage1 | grep -v HS` | sort -k1 -n -r | head -n20
   text    data     bss     dec     hex filename
1791068  145448       0 1936516  1d8c84 ./compiler/stage2/build/DynFlags.o
1558637   77464       0 1636101  18f705 ./compiler/stage2/build/PprC.o
1397966   23272       0 1421238  15afb6 ./compiler/stage2/build/PlatformConstants.o
1332048  373344       0 1705392  1a05b0 ./libraries/Cabal/Cabal/dist-boot/build/Distribution/PackageDescription.o
1331238  373352       0 1704590  1a028e ./bootstrapping/Distribution/PackageDescription.o
1271578  246240       0 1517818  1728fa ./libraries/template-haskell/dist-boot/build/Language/Haskell/TH/Syntax.o
1229696  311616       0 1541312  1784c0 ./libraries/Cabal/Cabal/dist-install/build/Distribution/PackageDescription.o
1215082  224288       0 1439370  15f68a ./libraries/template-haskell/dist-install/build/Language/Haskell/TH/Syntax.o
1071746  242664       0 1314410  140e6a ./compiler/stage2/build/HsExpr.o
1015090   40904       0 1055994  101cfa ./compiler/stage2/build/Llvm/PpLlvm.o
 970203  124944       0 1095147  10b5eb ./compiler/stage2/build/Parser.o
 849693   79760       0  929453   e2ead ./compiler/stage2/build/HsDecls.o
 833327   35440       0  868767   d419f ./compiler/stage2/build/X86/CodeGen.o
 819959  127192       0  947151   e73cf ./libraries/Cabal/Cabal/dist-boot/build/Distribution/Simple/Setup.o
 819685  125120       0  944805   e6aa5 ./bootstrapping/Distribution/Simple/Setup.o
 816927  124520       0  941447   e5d87 ./libraries/Cabal/Cabal/dist-install/build/Distribution/Simple/Setup.o
 785398   41536       0  826934   c9e36 ./compiler/stage2/build/CoreLint.o
 772550   42040       0  814590   c6dfe ./compiler/stage2/build/DriverPipeline.o
 766461   36280       0  802741   c3fb5 ./compiler/stage2/build/HscMain.o
 735605   23408       0  759013   b94e5 ./libraries/containers/dist-install/build/Data/Sequence.o

PlatformConstants.o is a very clear example of this problem. Trimmed down example of this file is:

module M (D) where
data D = D { a0
, a01 , a02 , a03 , a04 , a05 , a06 , a07 , a08 , a09 , a10
, a11 , a12 , a13 , a14 , a15 , a16 , a17 , a18 , a19 , a20
, a21 , a22 , a23 , a24 , a25 , a26 , a27 , a28 , a29 , a30
, a31 , a32 , a33 , a34 , a35 , a36 , a37 , a38 , a39 , a40
, a41 , a42 , a43 , a44 , a45 , a46 , a47 , a48 , a49 , a50
, a51 , a52 , a53 , a54 , a55 , a56 , a57 , a58 , a59 , a60
, a61 , a62 , a63 , a64 , a65 , a66 , a67 , a68 , a69 , a70
, a71 , a72 , a73 , a74 , a75 , a76 , a77 , a78 , a79 , a80
, a81 , a82 , a83 , a84 , a85 , a86 , a87 , a88 , a89 , a90
, a91 , a92 , a93 , a94 , a95 , a96 , a97 , a98 , a99

 :: Int } deriving Read

Results in 800KB file:

$ ghc-stage2 -c -O1 -fforce-recomp M.hs -o M.o
$ size M.o
   text    data     bss     dec     hex filename
 847039   16080       0  863119   d2b8f M.o

The size growth is quadratic:

# fields code-size
1    6263
21   61767
41  173623
61  347503
81  583367
101  881231
121 1241087
141 1662959
161 2146815
181 2692679
201 3300543
221 3970407
241 4702263
261 5496135
281 6351991
Trac metadata
Trac field Value
Version 7.10.2
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information