When loading certain projects into ghci, the first call to main after a :l or a :r will take a few seconds until the main is actually executed.
This makes live reloading in IHP very slow. E.g. when a user changes some haskell code in an IHP project, the IHP dev server will stop the current app in ghci and trigger a :r. The IHP dev server watches for a message such as Ok, 199 modules loaded. and then triggers a :main. This call to :main now takes a couple of seconds to execute, even though ghci already told is that it has reloaded all necessary files.
The Main entered message appears as expected, but there's a long delay between when I enter main and when the message appears. The putStrLn "Main entered" is the first operation in that module:
Once devenv.sh is installed, run cachix use digitallyinduced to enable our binary cache for a faster build time.
Now you can reproduce it by running these commands:
# Clone the repogit clone https://github.com/digitallyinduced/ihp.git cd ihp# This will enter a shell with all dependenciesnix develop --impure# Run ghcighci:l Test/Main.hs# => Ok, 152 modules loaded.main# ... now it takes a few seconds until the app starts up
Expected behavior
Calling main should directly start the app, without a delay of a few seconds.
Environment
GHC version used: 9.4.4 + 9.6.2
Operating System: macOS (This problem might be mac specific, I've not seen similiar behaviour on linux computers)
System Architecture: x86_64
Project Size: This problem seems to only appear in projects with more than a few modules. On a fresh IHP project with 20 modules, the :main is typically instant.
Edited
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items
...
Show closed items
Linked items
0
Link issues together to show that they're related or that one is blocking others.
Learn more.
Output of ghci -RTS +Dl and then grep "internal_dlsym" optlog | sort | uniq -c | sort -r | less for non optimized build:
8480 internal_dlsym: found symbol 'ghczmprim_GHCziCString_unpackCStringzh_closure' in program 3952 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_ZC_con_info' in program 3293 internal_dlsym: found symbol 'textzm1zi2zi5zi0_DataziText_pack_closure' in shared object 2716 internal_dlsym: found symbol 'textzm1zi2zi5zi0_DataziText_zdfIsStringText_closure' in shared object 2565 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_ZMZN_closure' in program 1343 internal_dlsym: found symbol 'ghczmprim_GHCziTuple_Z2T_con_info' in program 974 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_Izh_con_info' in program 916 internal_dlsym: found symbol 'base_GHCziBase_zi_closure' in program 778 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_TrNameS_con_info' in program 765 internal_dlsym: found symbol 'base_GHCziBase_zlzg_closure' in program 689 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_KindRepFun_con_info' in program 678 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_Eqzh_con_info' in program 677 internal_dlsym: found symbol 'base_GHCziShow_showString_closure' in program 646 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_textTag_closure' in shared object 584 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_KindRepTyConApp_con_info' in program 534 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_TyCon_con_info' in program 523 internal_dlsym: found symbol 'base_GHCziBase_zgzgze_closure' in program 504 internal_dlsym: found symbol 'bytestringzm0zi11zi3zi1_DataziByteStringziInternal_zdfIsStringByteString_closure' in program 483 internal_dlsym: found symbol 'ghczmprim_GHCziClasses_zeze_closure' in program 473 internal_dlsym: found symbol 'base_GHCziBase_zdfMonadIO_closure' in program 374 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_preEscapedText_closure' in shared object 340 internal_dlsym: found symbol 'base_GHCziBase_pure_closure' in program 323 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_zdfAttributableMarkupM_closure' in shared object 323 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_attribute_closure' in shared object 322 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_preEscapedTextValue_closure' in shared object 299 internal_dlsym: found symbol 'base_GHCziShow_showsPrec_closure' in program 294 internal_dlsym: found symbol 'base_GHCziBase_zgzg_closure' in program 258 internal_dlsym: found symbol 'base_GHCziBase_zd_closure' in program 249 internal_dlsym: found symbol 'base_GHCziBase_zdfApplicativeIO_closure' in program 241 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_False_closure' in program 221 internal_dlsym: found symbol 'bytestringzm0zi11zi3zi1_DataziByteStringziInternal_zdfSemigroupByteString_closure' in program 216 internal_dlsym: found symbol 'ghczmprim_GHCziPrimopWrappers_dataToTagzh_closure' in program 211 internal_dlsym: found symbol 'ghczmprim_GHCziClasses_zaza_closure' in program 205 internal_dlsym: found symbol 'aesonzm2zi0zi3zi0zmJFcvkGjawJ023iFoa0xm17_DataziAesonziKey_fromString_closure' in shared object 203 internal_dlsym: found symbol 'textzm1zi2zi5zi0_DataziText_zdfSemigroupText_closure' in shared object 199 internal_dlsym: found symbol 'stringzmconversionszm0zi4zi0zi1zm4czzE9VwOQnG2ZZDLlaZZQmtf_DataziStringziConversions_cs_closure' in shared object 182 internal_dlsym: found symbol 'base_GHCziShow_showCommaSpace_closure' in program 181 internal_dlsym: found symbol 'base_GHCziBase_mconcat_closure' in program 173 internal_dlsym: found symbol 'base_DataziTypeableziInternal_mkTrCon_closure' in program 168 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_krepzdzt_closure' in program 161 internal_dlsym: found symbol 'base_GHCziMaybe_Just_con_info' in program
Same for -O2 build:
9964 internal_dlsym: found symbol 'ghczmprim_GHCziCString_unpackCStringzh_closure' in program 4835 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_ZC_con_info' in program 3607 internal_dlsym: found symbol 'textzm1zi2zi5zi0_DataziText_zdfIsStringText_closure' in shared object 3296 internal_dlsym: found symbol 'textzm1zi2zi5zi0_DataziText_pack_closure' in shared object 2758 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_ZMZN_closure' in program 1449 internal_dlsym: found symbol 'ghczmprim_GHCziTuple_Z2T_con_info' in program 1240 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_Izh_con_info' in program 1027 internal_dlsym: found symbol 'base_GHCziBase_zi_closure' in program 913 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_TrNameS_con_info' in program 834 internal_dlsym: found symbol 'base_GHCziBase_zlzg_closure' in program 804 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_KindRepFun_con_info' in program 738 internal_dlsym: found symbol 'base_GHCziShow_showString_closure' in program 715 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_Eqzh_con_info' in program 689 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_KindRepTyConApp_con_info' in program 654 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_textTag_closure' in shared object 631 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_TyCon_con_info' in program 591 internal_dlsym: found symbol 'base_GHCziBase_zgzgze_closure' in program 543 internal_dlsym: found symbol 'ghczmprim_GHCziClasses_zeze_closure' in program 514 internal_dlsym: found symbol 'bytestringzm0zi11zi3zi1_DataziByteStringziInternal_zdfIsStringByteString_closure' in program 488 internal_dlsym: found symbol 'base_GHCziBase_zdfMonadIO_closure' in program 388 internal_dlsym: found symbol 'base_GHCziBase_pure_closure' in program 374 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_preEscapedText_closure' in shared object 347 internal_dlsym: found symbol 'base_GHCziBase_zd_closure' in program 345 internal_dlsym: found symbol 'base_GHCziShow_showsPrec_closure' in program 333 internal_dlsym: found symbol 'base_GHCziBase_zgzg_closure' in program 327 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_zdfAttributableMarkupM_closure' in shared object 327 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_attribute_closure' in shared object 322 internal_dlsym: found symbol 'blazzezmmarkupzm0zi8zi2zi8zm5hcoVacOpJu7WU5dNWK1Al_TextziBlazzeziInternal_preEscapedTextValue_closure' in shared object 264 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_False_closure' in program 259 internal_dlsym: found symbol 'base_GHCziBase_zdfApplicativeIO_closure' in program 251 internal_dlsym: found symbol 'stringzmconversionszm0zi4zi0zi1zm4czzE9VwOQnG2ZZDLlaZZQmtf_DataziStringziConversions_cs_closure' in shared object 247 internal_dlsym: found symbol 'ghczmprim_GHCziPrimopWrappers_dataToTagzh_closure' in program 242 internal_dlsym: found symbol 'textzm1zi2zi5zi0_DataziText_zdfSemigroupText_closure' in shared object 226 internal_dlsym: found symbol 'ghczmprim_GHCziClasses_zaza_closure' in program 221 internal_dlsym: found symbol 'bytestringzm0zi11zi3zi1_DataziByteStringziInternal_zdfSemigroupByteString_closure' in program 205 internal_dlsym: found symbol 'aesonzm2zi0zi3zi0zmJFcvkGjawJ023iFoa0xm17_DataziAesonziKey_fromString_closure' in shared object 198 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_krepzdzt_closure' in program 192 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_False_con_info' in program 190 internal_dlsym: found symbol 'base_GHCziShow_showCommaSpace_closure' in program 190 internal_dlsym: found symbol 'base_GHCziMaybe_Just_con_info' in program 189 internal_dlsym: found symbol 'ghczmprim_GHCziClasses_zdfOrdInt_closure' in program
Ben Gamarichanged the descriptionCompare with previous version
@bgamari was fun hacking with you on this at ZuriHac! :) Did you get the reproduce steps working on the server? Anything else I can provide to help with this?
That is, we spent nearly 15 seconds linking (main itself takes less than 0.4 seconds). During this time, essentially all symbols are resolved using the system's dynamic linker, which appears to be horrendously slow.
However, there also appears to be quite a bit of repeated resolution happening:
$ cut --complement -d' ' -f 1 log | grep found | sort | uniq -c | sort -nr | head 18054 internal_dlsym: found symbol 'ghczmprim_GHCziCString_unpackCStringzh_closure' in program 13872 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_Izh_con_info' in program 6940 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_ZC_con_info' in program 6413 internal_dlsym: found symbol 'textzm2zi0zi2_DataziText_zdfIsStringText_closure' in shared object 5046 internal_dlsym: found symbol 'textzm2zi0zi2_DataziText_unpackCStringLenzh_closure' in shared object 4721 internal_dlsym: found symbol 'ghczmprim_GHCziTypes_ZMZN_con_info' in program 3393 internal_dlsym: found symbol 'ghczmprim_GHCziTuple_Z2T_con_info' in program 2001 internal_dlsym: found symbol 'ihpzmhsxzm1zi0zi1zminplace_IHPziHSXziQQ_textToStaticString_closure' in shared object 1832 internal_dlsym: found symbol 'base_GHCziStackziTypes_SrcLoc_con_info' in program 1813 internal_dlsym: found symbol 'base_GHCziStackziTypes_pushCallStack_closure' in program
where the first column gives the total amount of time spent resolving the symbol and the second gives the number of resolutions.
That is, it seems to have taken only 100ms to resolve ghczmprim_GHCziCString_unpackCStringzh_closure nearly 18000 times (around 5us/lookup). Meanwhile, it takes 1.8 seconds to resolve hspczmcrzm2zi11zi4zm948c241c_TestziHspecziCoreziSpecziMonad_zdfMonadSpecM_closure and hspczmcrzm2zi11zi4zm948c241c_TestziHspecziCoreziSpec_it_closure only ~750 times (2 ms/lookup). This seems crazy yet is quite reproducible across runs.
I can only guess that their hash-map implementation has some truly horrible collision behavior.
I can only guess that their hash-map implementation has some truly horrible collision behavior.
Looking a bit deeper into the statistics lends further support to this hypothesis:
Nearly all resolutions take less than 20 us, but there is a wide dispersion with a very clear multi-modal structure. Moreover, there is a small tail where a few symbols take an order of magnitude longer than the larger mode.
Moreover, looking at the individual lookup times confirms that these "bad" cases aren't simply due to a one-time cost (which might be due to, e.g., lots of relocations being lazily performed). Every lookup of a "bad" symbol is of similar (high) cost.
frommatplotlibimportpyplotasplimportnumpyasnpx=np.genfromtxt('hi')ts=x[:,2]bins=np.logspace(-6,-2,100)pl.hist(ts,log=True,bins=bins,density=False)pl.ylabel('number of occurrences')pl.xlabel('time per resolution (seconds)')pl.xscale('log')pl.savefig('hist.svg')
where column 1 is total resolution time, column 2 is number of resolutions, column 3 is time per resolution.
I was hoping we would find here that all "slow" symbols, e.g., share a prefix but this doesn't appear to be the case. There are symbols with high resolution cost from at least three different libraries.
Looking at the non-symbol-aggregated resolution-time histogram reveals the trimodal structure more clearer:
Moreover, looking at the resolution-time histograms of a few individual symbols confirms that the behavior truly is consistent for all resolutions of a given symbol:
At this point I am quite certain that the bug here is not in GHC. There is simply no reason why the resolution of one symbol should consistently take three orders of magnitude longer than another. Moreover, the stack traces that I have observed suggest that nearly all of this time is spent in libSystem.
This being Darwin, which a good number of our users rely upon, it is clear that the status quo is not tenable; 15 second link times are simply too long for interactive usage. On the other hand, this being an Apple platform, I have little confidence that this will be fixed by upstream.
I suspect the only way forward here is to hide the problem by caching symbol resolutions. Of course, this then means that we need to deal with invalidation, sizing, and all of the usual problems associated with caches. I suspect this will be a day or two of work.
I prototyped a very simple cache in !11010 (closed). This brought link time from 11 seconds down to less than 3. This is still very poor, but ultimately I suspect that given Darwin's linker implementation this is the best we can do without changing symbol names.
As it seems likely that Darwin's linker is using a trie for symbol lookup, I suspect prepending each symbol with a few bytes of hash content would also likely help.