$ git clone -b ghc-oom https://github.com/sjakobi/propellor.branchable.com
$ cd propellor.branchable.com
$ cabal build -w ghc-9.4.1 --allow-newer=hashable:ghc-bignum,base
<...>
[160 of 180] Compiling Propellor.Property.SiteSpecific.JoeySites ( src/Propellor/Property/SiteSpecific/JoeySites.hs, /home/simon/tmp/propellor-5.13/dist-newstyle/build/x86_64-linux/ghc-9.4.1/propellor-5.13/build/Propellor/Property/SiteSpecific/JoeySites.o, /home/simon/tmp/propellor-5.13/dist-newstyle/build/x86_64-linux/ghc-9.4.1/propellor-5.13/build/Propellor/Property/SiteSpecific/JoeySites.dyn_o )
Error: cabal: Failed to build propellor-5.13 (which is required by
exe:propellor-config from propellor-5.13 and exe:propellor from
propellor-5.13). The build process was killed (i.e. SIGKILL). The typical
reason for this is that there is not enough memory available (e.g. the OS
killed a process using lots of memory).
On my machine GHC takes more than 15GB of memory before being killed. With GHC 8.6.5 and 9.0.2 the memory usage is unremarkable and the build succeeds.
I have somewhat reduced the size of the package on my ghc-oom-reduced branch (link to the reduced problematic module). At this stage the renamer/typechecker phase takes around 5s, which is still noticeably slow. But the majority of the compile time is now used by the simplifier, which takes around 22s.
I don't have much insight to offer here, but one problem propellor has often had is that when the type checker does have an error to report, the error message is often so enormous that GHC runs out of memory trying to display it. So, if the new version of GHC is somehow finding a type error that previous versions do not, that seems like a possible reason.
Could you try to explicitly enable Haskell2010 instead of GHC2021? The code seems to use poly-kinded type families, so perhaps PolyKinds is the culprit?
The propellor library (and the reduced reproducer) already uses Haskell2010 as the default-language, with only TypeOperators in the default-extensions. Enabling LANGUAGE Haskell2010 explicitly in the problematic Propellor.Property.SiteSpecific.JoeySites module doesn't change anything about the size of the generated Core.
Several other modules use PolyKinds though, so if GHC's handling of PolyKinds changed between 9.0 and 9.2, that might be related to the perf regression.
--{-# OPTIONS_GHC -ddump-timings #-}
{-# LANGUAGE Haskell2010 #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}

module JoeySites (house) where

import Data.Kind (Type)
import Data.Type.Bool (type (&&), type (||), If, Not)

data family Sing (x :: k)

type MetaTypes = Sing

data TargetOS = OSDebian | OSBuntish | OSArchLinux | OSFreeBSD

data MetaType = Targeting TargetOS | WithInfo

type UnixLike = MetaTypes '[ 'Targeting 'OSDebian, 'Targeting 'OSBuntish, 'Targeting 'OSArchLinux, 'Targeting 'OSFreeBSD ]
type Linux = MetaTypes '[ 'Targeting 'OSDebian, 'Targeting 'OSBuntish, 'Targeting 'OSArchLinux ]
type DebianLike = MetaTypes '[ 'Targeting 'OSDebian, 'Targeting 'OSBuntish ]
type ArchLinux = MetaTypes '[ 'Targeting 'OSArchLinux ]
type HasInfo = MetaTypes '[ 'WithInfo ]

type (+) :: Type -> Type -> Type
type family a + b where
  MetaTypes a + MetaTypes b = MetaTypes (Concat a b)

type Concat :: [a] -> [a] -> [a]
type family Concat l1 l2 where
  Concat '[] bs = bs
  Concat (a ': as) bs = a ': (Concat as bs)

type IsTarget :: k -> Bool
type family IsTarget a where
  IsTarget ('Targeting a) = 'True
  IsTarget 'WithInfo = 'False

type Targets :: [k] -> [k]
type family Targets l where
  Targets '[] = '[]
  Targets (x ': xs) = If (IsTarget x) (x ': Targets xs) (Targets xs)

type NonTargets :: [k] -> [k]
type family NonTargets l where
  NonTargets '[] = '[]
  NonTargets (x ': xs) = If (IsTarget x) (NonTargets xs) (x ': NonTargets xs)

type Combine :: [a] -> [a] -> [a]
type family Combine l1 l2 where
  Combine (list1 :: [a]) (list2 :: [a]) =
    (Concat
      (NonTargets list1 `Union` NonTargets list2)
      (Targets list1 `Intersect` Targets list2))

type Elem :: t -> [t] -> Bool
type family Elem a list where
  Elem a '[] = 'False
  Elem a (a ': bs) = 'True
  Elem a (_ ': bs) = Elem a bs

type Union :: [a] -> [a] -> [a]
type family Union l1 l2 where
  Union '[] list2 = list2
  Union (a ': rest) list2 =
    If (Elem a list2 || Elem a rest)
      (Union rest list2)
      (a ': Union rest list2)

type Intersect :: [a] -> [a] -> [a]
type family Intersect l1 l2 where
  Intersect '[] list2 = '[]
  Intersect (a ': rest) list2 =
    If (Elem a list2 && Not (Elem a rest))
      (a ': Intersect rest list2)
      (Intersect rest list2)

data Property metatypes = Property

data Props metatypes = Props [()]

type family GetMetaTypes x where
  GetMetaTypes (Property (MetaTypes t)) = MetaTypes t

propertyList :: Props (MetaTypes metatypes) -> Property (MetaTypes metatypes)
propertyList = undefined

props :: Props UnixLike
props = Props []

(&) :: (MetaTypes y ~ GetMetaTypes p) => Props (MetaTypes x) -> p -> Props (MetaTypes (Combine x y))
(&) = undefined

house :: Property (HasInfo + DebianLike)
house = propertyList $ props
  & bad1
  & bad1
  & bad1
  & bad2
  & bad2
  & bad2
  & bad2
  & bad2
  & bad2
  & bad2
  & bad2
  & bad2
  & bad2
  & bad2
  & bad2
  & bad3
  where
    bad1 :: Property (DebianLike + ArchLinux)
    bad1 = undefined

    bad2 :: Property DebianLike
    bad2 = undefined

    bad3 :: Property (HasInfo + UnixLike)
    bad3 = undefined
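For anyone trying to follow what the type families in the reproducer compute, here is a rough value-level mirror of Combine, Union, Intersect and Elem, applied to the same chain as house. This is only a sketch for reading along: the lower-case names (combine, union', intersect', houseChain and so on) are made up for illustration, and it says nothing about how GHC actually reduces the families or where the memory goes.

```haskell
import Data.List (foldl')

data TargetOS = OSDebian | OSBuntish | OSArchLinux | OSFreeBSD
  deriving (Eq, Show)

data MetaType = Targeting TargetOS | WithInfo
  deriving (Eq, Show)

-- Value-level counterpart of IsTarget.
isTarget :: MetaType -> Bool
isTarget (Targeting _) = True
isTarget WithInfo      = False

-- Mirrors the Union family: drop an element if it occurs later on.
union' :: Eq a => [a] -> [a] -> [a]
union' [] ys = ys
union' (a : rest) ys
  | a `elem` ys || a `elem` rest = union' rest ys
  | otherwise                    = a : union' rest ys

-- Mirrors the Intersect family.
intersect' :: Eq a => [a] -> [a] -> [a]
intersect' [] _ = []
intersect' (a : rest) ys
  | a `elem` ys && not (a `elem` rest) = a : intersect' rest ys
  | otherwise                          = intersect' rest ys

-- Mirrors Combine: union the non-targets, intersect the targets.
combine :: [MetaType] -> [MetaType] -> [MetaType]
combine xs ys =
  union' (filter (not . isTarget) xs) (filter (not . isTarget) ys)
    ++ intersect' (filter isTarget xs) (filter isTarget ys)

unixLike, debianLike, archLinux, hasInfo :: [MetaType]
unixLike   = map Targeting [OSDebian, OSBuntish, OSArchLinux, OSFreeBSD]
debianLike = map Targeting [OSDebian, OSBuntish]
archLinux  = [Targeting OSArchLinux]
hasInfo    = [WithInfo]

-- The same sequence of properties that house chains with (&).
houseChain :: [[MetaType]]
houseChain =
  replicate 3 (debianLike ++ archLinux)   -- bad1
    ++ replicate 12 debianLike            -- bad2
    ++ [hasInfo ++ unixLike]              -- bad3

main :: IO ()
main = print (foldl' combine unixLike houseChain)
```

The fold ends up at the list corresponding to HasInfo + DebianLike, which is what the signature of house asks for. So the final type is tiny, and whatever blows up is presumably in the intermediate reductions or their evidence rather than in the result.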
I think the compile time regression is to do with increased residency.
Here is HEAD:
  23,034,675,928 bytes allocated in the heap
  14,777,558,520 bytes copied during GC
   2,094,319,728 bytes maximum residency (18 sample(s))
       8,337,296 bytes maximum slop
            4075 MiB total memory in use (0 MB lost due to fragmentation)

  INIT    time    0.002s  (  0.006s elapsed)
  MUT     time   28.491s  ( 28.440s elapsed)
  GC      time   52.440s  ( 52.626s elapsed)
  EXIT    time    0.001s  (  0.009s elapsed)
  Total   time   80.933s  ( 81.080s elapsed)

  Alloc rate    808,477,384 bytes per MUT second
Here is GHC 9.2:
*** Deleting temp dirs:
  27,408,935,912 bytes allocated in the heap
       1,304,456 bytes copied during GC
   2,305,006,472 bytes maximum residency (22 sample(s))
       9,139,320 bytes maximum slop
            4485 MiB total memory in use (0 MB lost due to fragmentation)

  INIT    time    0.001s  (  0.000s elapsed)
  MUT     time   17.672s  ( 17.620s elapsed)
  GC      time   18.024s  ( 18.029s elapsed)
  EXIT    time    0.001s  (  0.001s elapsed)
  Total   time   35.698s  ( 35.650s elapsed)
     109,873,640 bytes allocated in the heap
      44,683,152 bytes copied during GC
      10,008,712 bytes maximum residency (6 sample(s))
         128,888 bytes maximum slop
              24 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0        44 colls,     0 par    0.023s   0.023s     0.0005s    0.0023s
  Gen  1         6 colls,     0 par    0.081s   0.199s     0.0331s    0.1465s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.001s  (  0.004s elapsed)
  MUT     time    0.058s  (  0.080s elapsed)
  GC      time    0.104s  (  0.222s elapsed)
  EXIT    time    0.001s  (  0.004s elapsed)
  Total   time    0.165s  (  0.310s elapsed)

  Alloc rate    1,880,914,833 bytes per MUT second

  Productivity  35.5% of total user, 25.8% of total elapsed
@sjakobi, I do indeed hope that we can address this for 9.6. It would be great if someone could take @sheaf's minimized reproducer and try running a bisection between 9.0 and 9.2.
One thing that should be noted is that the source program was arguably always somewhat fragile in that it uses suboptimal type-level data structures for its use-case. Specifically, the existence of Elem and Union is surely a sign that a type-level set would be more appropriate. Given the under-defined nature of GHC's typechecker behavior, it's not terribly surprising (but still quite unfortunate) that this program manifested in a high compile-time heap residency. Refactoring the package to use a more appropriate type-level data structure (e.g. the structure provided by the type-level-sets package) may be a sufficient workaround.
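To make the set idea a bit more concrete, here is one possible shape for it. Note that this is not the type-level-sets API and not anything propellor does today: it is a hand-rolled, fixed-shape "bitset" with one promoted Bool per possible element, and the MetaSet type and its field order are made up for this sketch. With that representation, combining two sets needs no Elem, Union or Intersect recursion at all. Whether it actually avoids the residency problem is untested.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Type.Bool (type (&&), type (||))
import Data.Type.Equality ((:~:) (Refl))

-- Hypothetical fixed-shape set, one flag per possible element, in the
-- order: WithInfo, OSDebian, OSBuntish, OSArchLinux, OSFreeBSD.
data MetaSet = MetaSet Bool Bool Bool Bool Bool

type HasInfo    = 'MetaSet 'True  'False 'False 'False 'False
type DebianLike = 'MetaSet 'False 'True  'True  'False 'False
type ArchLinux  = 'MetaSet 'False 'False 'False 'True  'False
type UnixLike   = 'MetaSet 'False 'True  'True  'True  'True

-- Same intent as Combine in the reproducer: the non-target flag
-- (WithInfo) is unioned, the target flags are intersected. Each field
-- is a single (||) or (&&) step, with no recursion over lists.
type Combine :: MetaSet -> MetaSet -> MetaSet
type family Combine x y where
  Combine ('MetaSet i1 d1 b1 a1 f1) ('MetaSet i2 d2 b2 a2 f2) =
    'MetaSet (i1 || i2) (d1 && d2) (b1 && b2) (a1 && a2) (f1 && f2)

-- Compile-time check: Refl only typechecks if both sides reduce to
-- the same type.
combineExample :: Combine UnixLike DebianLike :~: DebianLike
combineExample = Refl

main :: IO ()
main = case combineExample of
  Refl -> putStrLn "Combine UnixLike DebianLike reduces to DebianLike"
```

The obvious cost is that adding a new TargetOS means widening MetaSet and touching every alias, which the list-based encoding does not require.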