
ghc 8.8.3 uses 25-30% more time and memory to derive instances with -XDerivingVia

Summary

We have a large-ish (27K SLOC) "entities" package that consists primarily of domain types and ToJSON/FromJSON instances. These instances use Aeson's Generic functionality to enforce our API conventions. For example:

data Assignment = Assignment
  { assignmentTeacherId :: Maybe TeacherId
  , ...
  }
  deriving stock (Eq, Show, Generic)

instance ToJSON Assignment where
  toEncoding = genericToEncoding $ apiOptions $ Just "assignment"
  toJSON = genericToJSON $ apiOptions $ Just "assignment"

instance FromJSON Assignment where
  parseJSON = genericParseJSON $ apiOptions $ Just "assignment"

apiOptions $ Just "assignment" means:

  1. Drop the "assignment" prefix from each field selector
  2. For sum types, ensure the value of the tag field is snake_cased (for boring historical reasons)
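
As a rough illustration of these two rules (this is not the actual apiOptions, whose definition is not shown here), the string transformations could look like the sketch below. The helper names dropFieldPrefix and snakeCase are hypothetical, and lower-casing the first letter once the prefix is removed is an assumption. With aeson, functions like these would be plugged into the fieldLabelModifier and constructorTagModifier fields of Options.

```haskell
import Data.Char (isUpper, toLower)
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Hypothetical helper: drop a known prefix from a field selector name and
-- lower-case the first remaining character (the lower-casing is an assumption).
-- Names that do not start with the prefix are only lower-cased at the front.
dropFieldPrefix :: String -> String -> String
dropFieldPrefix prefix field =
  case fromMaybe field (stripPrefix prefix field) of
    (c : rest) -> toLower c : rest
    [] -> []

-- Hypothetical helper: turn a CamelCase constructor name into snake_case,
-- inserting an underscore before every upper-case letter except the first.
snakeCase :: String -> String
snakeCase = go True
  where
    go _ [] = []
    go atStart (c : rest)
      | isUpper c && not atStart = '_' : toLower c : go False rest
      | otherwise = toLower c : go False rest
```

For example, dropFieldPrefix "assignment" "assignmentTeacherId" yields "teacherId", and snakeCase "MultipleChoice" yields "multiple_choice".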

These hand-written instances have a couple of problems:

  1. It's easy to forget to implement both toJSON and toEncoding (the latter is typically 2-3X faster than toJSON)
  2. It's easy to use other options that stray from our API's conventions

Recently we attempted to use -XDerivingVia to avoid these problems:

data Assignment = Assignment
  { assignmentTeacherId :: Maybe TeacherId
  , ...
  }
  deriving stock (Eq, Show, Generic)

deriving via (ApiOptions ('Just "assignment") Assignment) instance ToJSON Assignment
deriving via (ApiOptions ('Just "assignment") Assignment) instance FromJSON Assignment

where ApiOptions is a newtype with the following ToJSON and FromJSON instances:

newtype ApiOptions (prefix :: Maybe Symbol) (value :: Type) = ApiOptions
  { unApiOptions :: value
  }

instance (GToEncoding Zero (Rep a), GToJSON Zero (Rep a), Generic a, ModifyOptions prefix) => ToJSON (ApiOptions prefix a) where
  toJSON = genericToJSON (modifyOptions @prefix defaultOptions) . unApiOptions
  toEncoding = genericToEncoding (modifyOptions @prefix defaultOptions) . unApiOptions

instance (GFromJSON Zero (Rep a), Generic a, ModifyOptions prefix) => FromJSON (ApiOptions prefix a) where
  parseJSON = fmap ApiOptions . genericParseJSON (modifyOptions @prefix defaultOptions)
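
The ModifyOptions class is not shown above. A minimal sketch of how such a class might reflect the type-level prefix into a value-level options modifier follows. Everything here is an assumption: Opts and defaultOpts are stand-ins for aeson's Options and defaultOptions (so that the sketch needs no packages beyond base), and the real class presumably also handles the tag snake-casing.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- Stand-in for aeson's Options, reduced to the one field this sketch needs.
newtype Opts = Opts {fieldLabelModifier :: String -> String}

-- Stand-in for aeson's defaultOptions: leaves field names untouched.
defaultOpts :: Opts
defaultOpts = Opts id

-- Assumed shape of the class: the prefix only appears at the type level,
-- so callers pick an instance with a type application.
class ModifyOptions (prefix :: Maybe Symbol) where
  modifyOptions :: Opts -> Opts

-- No prefix: leave the options unchanged.
instance ModifyOptions 'Nothing where
  modifyOptions = id

-- A prefix: strip it from each field name after any existing modifier runs.
instance KnownSymbol s => ModifyOptions ('Just s) where
  modifyOptions opts =
    opts {fieldLabelModifier = strip . fieldLabelModifier opts}
    where
      prefix = symbolVal (Proxy :: Proxy s)
      strip field = fromMaybe field (stripPrefix prefix field)
```

With these definitions, fieldLabelModifier (modifyOptions @('Just "assignment") defaultOpts) "assignmentTeacherId" evaluates to "TeacherId".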

After converting some 240 instances, ghc started running out of memory during continuous integration. On average, we're seeing about a 30% increase in memory usage according to -ddump-timings:

Allocation, sorted by difference

| Stage | -XNoDerivingVia Allocation | -XDerivingVia Allocation | Diff |
|---|---|---|---|
| Total | 1208967.387 MB | 1707926.976 MB | +498959.589 MB |
| Simplifier 3 | 173417.41 MB | 253605.102 MB | +80187.691 MB |
| Simplifier 6 | 163104.206 MB | 227728.294 MB | +64624.088 MB |
| Simplifier 7 | 126605.672 MB | 188849.772 MB | +62244.1 MB |
| Simplifier 4 | 154081.307 MB | 216046.1 MB | +61964.792 MB |
| Simplifier 5 | 134668.038 MB | 191559.688 MB | +56891.651 MB |
| Simplifier 2 | 127109.857 MB | 174883.096 MB | +47773.239 MB |
| Simplifier 1 | 37972.934 MB | 67402.067 MB | +29429.133 MB |
| Specialise 1 | 4783.927 MB | 21672.163 MB | +16888.236 MB |
| CodeGen 2 | 80904.293 MB | 95021.156 MB | +14116.863 MB |
| CodeGen 1 | 77103.084 MB | 90489.74 MB | +13386.655 MB |
| Float out 2 | 26748.774 MB | 39731.532 MB | +12982.758 MB |
| Demand analysis 1 | 29991.168 MB | 42497.224 MB | +12506.055 MB |
| Demand analysis 2 | 26703.106 MB | 38282.119 MB | +11579.013 MB |
| Renamer/typechecker 1 | 23239.374 MB | 33166.786 MB | +9927.412 MB |
| Float out 1 | 9385.235 MB | 12499.722 MB | +3114.487 MB |
| CoreTidy 1 | 4987.757 MB | 6012.39 MB | +1024.633 MB |
| Worker Wrapper binds 1 | 845.609 MB | 1158.52 MB | +312.911 MB |
| Desugar 1 | 5044.441 MB | 5061.204 MB | +16.763 MB |
| Simplify 2 | 366.118 MB | 369.361 MB | +3.243 MB |
| Float inwards 1 | 5.076 MB | 5.077 MB | +0.0 MB |
| Called arity analysis 1 | 5.419 MB | 5.42 MB | +0.0 MB |
| Exitification transformation 1 | 5.308 MB | 5.308 MB | +0.0 MB |
| Common sub-expression 1 | 5.101 MB | 5.101 MB | +0.0 MB |
| Float inwards 2 | 5.075 MB | 5.075 MB | +0.0 MB |
| CorePrep 2 | 116.124 MB | 116.109 MB | -0.015 MB |
| CorePrep 7 | 0.029 MB | 0.012 MB | -0.016 MB |
| CorePrep 6 | 0.044 MB | 0.026 MB | -0.019 MB |
| Simplify 4 | 1.937 MB | 1.899 MB | -0.039 MB |
| Simplify 5 | 0.073 MB | 0.034 MB | -0.039 MB |
| CorePrep 4 | 3.545 MB | 3.498 MB | -0.047 MB |
| ByteCodeGen 4 | 4.768 MB | 4.652 MB | -0.116 MB |
| ByteCodeGen 1 | 7.824 MB | 7.707 MB | -0.117 MB |
| ByteCodeGen 5 | 0.167 MB | 0.05 MB | -0.117 MB |
| CorePrep 1 | 5.339 MB | 5.063 MB | -0.276 MB |
| ByteCodeGen 2 | 903.206 MB | 902.928 MB | -0.278 MB |
| CorePrep 5 | 0.6 MB | 0.132 MB | -0.468 MB |
| CorePrep 3 | 3.628 MB | 3.069 MB | -0.558 MB |
| Simplify 3 | 1.319 MB | 0.262 MB | -1.057 MB |
| Parser 1 | 811.944 MB | 809.403 MB | -2.54 MB |
| ByteCodeGen 3 | 4.204 MB | 0.779 MB | -3.425 MB |
| Simplify 1 | 14.213 MB | 9.337 MB | -4.875 MB |
| Simplify 6 | 0.033 MB | N/A | N/A |
| ByteCodeGen 6 | 0.086 MB | N/A | N/A |
| CorePrep 8 | 0.016 MB | N/A | N/A |

Note: this table was built by parsing, aggregating, and comparing *.dump-timings files created by compiling our master branch as well as the branch that uses -XDerivingVia. I compiled both branches on my local machine using the following command (-M28G prevents the operating system from killing ghc at around 30 GB resident memory):

stack build \
  --ghc-options '-ddump-to-file -dshow-passes -ddump-timings -j +RTS -s -M28G -A128m -n4m -I0' \
  entities
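
The aggregation described in the note can be sketched roughly as follows. This is not the script used to build the table. It assumes each line of a *.dump-timings file has the shape `Phase [Module]: alloc=<bytes> time=<milliseconds>`, which should be checked against the files your GHC version actually writes. The names Timing, parseTiming, allocByPhase, and diffByPhase are hypothetical.

```haskell
import Data.List (nub, stripPrefix)
import Text.Read (readMaybe)

-- One entry of a *.dump-timings file (assumed line shape, see above).
data Timing = Timing
  { phase :: String
  , allocBytes :: Integer
  , timeMs :: Double
  }
  deriving (Eq, Show)

-- Parse a line such as "Simplifier [Entities.Assignment]: alloc=2000000 time=1.5".
-- Returns Nothing for lines that do not match the assumed shape.
parseTiming :: String -> Maybe Timing
parseTiming line =
  case break (== '[') line of
    (name, '[' : rest) ->
      case words (drop 1 (dropWhile (/= ':') rest)) of
        [a, t] ->
          Timing (trimEnd name)
            <$> (stripPrefix "alloc=" a >>= readMaybe)
            <*> (stripPrefix "time=" t >>= readMaybe)
        _ -> Nothing
    _ -> Nothing
  where
    trimEnd = reverse . dropWhile (== ' ') . reverse

-- Total allocation per phase across all modules, in first-seen order.
allocByPhase :: [Timing] -> [(String, Integer)]
allocByPhase ts =
  [ (p, sum [allocBytes t | t <- ts, phase t == p])
  | p <- nub (map phase ts)
  ]

-- Per-phase difference (second build minus first) for phases present in both.
diffByPhase :: [(String, Integer)] -> [(String, Integer)] -> [(String, Integer)]
diffByPhase before after =
  [(p, b - a) | (p, a) <- before, Just b <- [lookup p after]]
```

Running allocByPhase over the parsed lines of each build and passing the two results to diffByPhase gives the per-phase allocation difference in bytes; sorting and converting to MB are left out of the sketch.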

Steps to reproduce

Our entities package is too big (and too proprietary! 😅) to share, so I've built a minimal project here. This project exhibits roughly the same behavior on a much smaller scale - 10 related entities, each with ToJSON and FromJSON instances. You can build it with:

make no.deriving.via
# expands to:
#   export STACK_WORK=.stack-work-no-deriving-via
#   stack clean oom
#   stack build oom --ghc-options "-ddump-to-file -ddump-timings -dshow-passes -j +RTS -s"

and

make deriving.via
# expands to:
#   export STACK_WORK=.stack-work-deriving-via
#   stack clean oom
#   stack build oom --ghc-options "-DUSE_DERIVING_VIA -ddump-to-file -ddump-timings -dshow-passes -j +RTS -s"

You can observe with htop that the latter command uses more memory, or compare the *.dump-timings files in .stack-work-no-deriving-via and .stack-work-deriving-via. Note that the README.md includes some aggregate comparisons (see the make table rule in the Makefile), which I'll reproduce here:

Time, sorted by difference

| Stage | -XNoDerivingVia Time | -XDerivingVia Time | Diff |
|---|---|---|---|
| Total | 532.25s | 715.652s | +183.402s |
| Simplifier 3 | 89.825s | 121.334s | +31.509s |
| Simplifier 6 | 63.525s | 86.922s | +23.397s |
| Simplifier 7 | 81.041s | 102.338s | +21.297s |
| Simplifier 4 | 83.293s | 103.967s | +20.674s |
| Simplifier 5 | 75.801s | 92.826s | +17.025s |
| Simplifier 2 | 50.879s | 67.449s | +16.569s |
| Simplifier 1 | 10.92s | 26.697s | +15.776s |
| Float out 2 | 20.235s | 27.78s | +7.545s |
| Demand analysis 2 | 8.95s | 15.716s | +6.766s |
| CodeGen 1 | 7.509s | 13.68s | +6.172s |
| Demand analysis 1 | 10.729s | 16.558s | +5.828s |
| Renamer/typechecker 1 | 10.086s | 13.426s | +3.34s |
| CodeGen 2 | 10.22s | 13.238s | +3.018s |
| Specialise 1 | 1.933s | 4.633s | +2.7s |
| Float out 1 | 4.535s | 6.124s | +1.589s |
| CoreTidy 1 | 1.021s | 1.753s | +0.733s |
| Exitification transformation 1 | 0.0s | 0.0s | +0.0s |
| Float inwards 1 | 0.0s | 0.0s | +0.0s |
| Common sub-expression 1 | 0.0s | 0.0s | -0.0s |
| Float inwards 2 | 0.0s | 0.0s | -0.0s |
| Called arity analysis 1 | 0.0s | 0.0s | -0.0s |
| CorePrep 1 | 0.0s | 0.0s | -0.0s |
| CorePrep 2 | 0.0s | 0.0s | -0.0s |
| Desugar 1 | 0.29s | 0.269s | -0.021s |
| Parser 1 | 0.08s | 0.049s | -0.031s |
| Worker Wrapper binds 1 | 1.377s | 0.895s | -0.482s |

Allocation, sorted by difference

| Stage | -XNoDerivingVia Allocation | -XDerivingVia Allocation | Diff |
|---|---|---|---|
| Total | 59148.37 MB | 81868.75 MB | +22720.381 MB |
| Simplifier 3 | 10241.209 MB | 13633.87 MB | +3392.661 MB |
| Simplifier 7 | 8982.909 MB | 11950.616 MB | +2967.707 MB |
| Simplifier 4 | 8698.237 MB | 11112.988 MB | +2414.752 MB |
| Simplifier 5 | 7963.713 MB | 10345.508 MB | +2381.795 MB |
| Simplifier 6 | 7199.82 MB | 9320.079 MB | +2120.259 MB |
| Simplifier 2 | 5323.286 MB | 7422.371 MB | +2099.085 MB |
| Simplifier 1 | 1254.692 MB | 3028.437 MB | +1773.745 MB |
| CodeGen 2 | 1650.152 MB | 2769.932 MB | +1119.779 MB |
| CodeGen 1 | 1583.906 MB | 2624.815 MB | +1040.909 MB |
| Specialise 1 | 196.157 MB | 1009.708 MB | +813.551 MB |
| Renamer/typechecker 1 | 698.728 MB | 1330.049 MB | +631.321 MB |
| Float out 2 | 1612.035 MB | 2223.416 MB | +611.382 MB |
| Demand analysis 2 | 1486.519 MB | 2047.257 MB | +560.738 MB |
| Demand analysis 1 | 1643.384 MB | 2198.509 MB | +555.125 MB |
| Float out 1 | 342.7 MB | 505.041 MB | +162.341 MB |
| CoreTidy 1 | 168.548 MB | 229.185 MB | +60.637 MB |
| Worker Wrapper binds 1 | 43.757 MB | 57.143 MB | +13.386 MB |
| Desugar 1 | 22.449 MB | 25.76 MB | +3.312 MB |
| Float inwards 1 | 0.162 MB | 0.162 MB | 0.0 MB |
| Called arity analysis 1 | 0.176 MB | 0.176 MB | 0.0 MB |
| Exitification transformation 1 | 0.171 MB | 0.171 MB | 0.0 MB |
| Common sub-expression 1 | 0.163 MB | 0.163 MB | 0.0 MB |
| Float inwards 2 | 0.162 MB | 0.162 MB | 0.0 MB |
| CorePrep 1 | 0.142 MB | 0.142 MB | 0.0 MB |
| CorePrep 2 | 0.142 MB | 0.142 MB | 0.0 MB |
| Parser 1 | 35.051 MB | 32.948 MB | -2.103 MB |

The linked repo also has some heap profile images (-XNoDerivingVia and -XDerivingVia), though I'm not exactly sure what to make of them.

Expected behavior

I would expect compiling with -XDerivingVia to use roughly the same amount of memory as compiling the hand-written instances. The derived instances look like this:

data Assignment = Assignment
  { assignmentTeacherId :: Maybe TeacherId
  , ...
  }
  deriving stock (Eq, Show, Generic)

-- deriving via (ApiOptions ('Just "assignment") Assignment) instance ToJSON Assignment
instance ToJSON Assignment where
  toJSON = coerce
    @(ApiOptions ('Just "assignment") Assignment -> Value)
    @(Assignment -> Value)
    toJSON

  toEncoding = coerce
    @(ApiOptions ('Just "assignment") Assignment -> Encoding)
    @(Assignment -> Encoding)
    toEncoding

-- deriving via (ApiOptions ('Just "assignment") Assignment) instance FromJSON Assignment
instance FromJSON Assignment where
  parseJSON = coerce
    @(Value -> Parser (ApiOptions ('Just "assignment") Assignment))
    @(Value -> Parser Assignment)
    parseJSON

Is it possible that the extra Coercible constraints ghc has to generate and solve are the culprit?
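
For anyone who wants to poke at the coerce-based expansion without pulling in aeson, here is a small base-only sketch of the same pattern. The class Render, the newtype Shouted, and the types Color and Shape are all hypothetical stand-ins; the point is only that the `deriving via` clause and the hand-written coerce instance should behave the same.

```haskell
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeApplications #-}

import Data.Char (toUpper)
import Data.Coerce (coerce)

-- Hypothetical class standing in for ToJSON.
class Render a where
  render :: a -> String

-- Hypothetical newtype standing in for ApiOptions: it carries the
-- "convention" (here, upper-casing the Show output).
newtype Shouted a = Shouted a

instance Show a => Render (Shouted a) where
  render (Shouted x) = map toUpper (show x)

-- Instance obtained with DerivingVia.
data Color = Red | Green
  deriving stock (Show)

deriving via (Shouted Color) instance Render Color

-- The same kind of instance written by hand with coerce, mirroring the
-- expansion shown above.
data Shape = Circle | Square
  deriving stock (Show)

instance Render Shape where
  render =
    coerce
      @(Shouted Shape -> String)
      @(Shape -> String)
      render
```

Here render Red gives "RED" and render Circle gives "CIRCLE", so both routes produce instances that defer to the newtype's implementation.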

Environment

  • GHC version used: 8.8.3
  • Operating System: Ubuntu 18.04.4 LTS
  • Processor: Intel Core i7-8750H @ 2.2 GHz (6 cores, 12 threads)
  • Memory: 32GB
Edited by Matthew Pickering