
TH eats 50 GB memory when creating ADT with multiple constructors

When TH creates a data type with multiple constructors, GHC consumes huge amounts of memory in what appears to be a highly superlinear manner.

A common use case: in the Yesod web framework, localized strings are represented by constructors of a Messages data type, created by a TH splice. There is one constructor for each localized string on the site, possibly hundreds. The splice also creates a class instance for the data type whose method matches against all the constructors for each language for which localizations are provided; this may or may not play a role in the memory leak. This Trac ticket corresponds to this Yesod issue:

https://github.com/yesodweb/yesod/issues/1487
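To make the shape of such a splice concrete, here is a minimal sketch of a declaration splice that generates a data type with many nullary constructors plus a class instance with one clause per (language, constructor) pair. It is not the Yesod code: the class `RenderMsg`, its method `renderMsg`, the type `Msg`, and the constructor names `Msg1 … MsgN` are hypothetical names chosen for illustration, and the counts are deliberately tiny. It has not been checked whether scaling `n` and `langs` up in this sketch reproduces the memory blowup.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main (main) where

import Language.Haskell.TH

-- Hypothetical stand-in for a message-rendering class (not Yesod's API).
class RenderMsg a where
  renderMsg :: String -> a -> String   -- language code -> message -> text

-- Declaration splice: builds `data Msg = Msg1 | ... | MsgN` and an instance
-- whose method has one clause per (language, constructor) pair.
$(do
    let n         = 5 :: Int              -- the report uses about 500
        langs     = ["en", "de", "fr"]    -- the report uses about 30
        conName i = mkName ("Msg" ++ show i)
        dataDec   = dataD (cxt []) (mkName "Msg") [] Nothing
                      [normalC (conName i) [] | i <- [1 .. n]] []
        clauses   = [ clause [litP (stringL l), conP (conName i) []]
                             (normalB (litE (stringL (l ++ ":" ++ show i)))) []
                    | l <- langs, i <- [1 .. n] ]
        -- Catch-all clause for languages that are not listed above.
        fallback  = clause [wildP, wildP] (normalB (litE (stringL "?"))) []
        instDec   = instanceD (cxt [])
                      (appT (conT ''RenderMsg) (conT (mkName "Msg")))
                      [funD 'renderMsg (clauses ++ [fallback])]
    sequence [dataDec, instDec])

main :: IO ()
main = putStrLn (renderMsg "de" Msg2)   -- prints "de:2"
```

The splice is written inline, using only functions imported from `Language.Haskell.TH`, so the example fits in one module without running into the stage restriction. The lowercase combinators (`dataD`, `conP`, and so on) are used instead of the raw constructors because their signatures have been more stable across template-haskell versions.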

Here are two reproductions, and one NON-reproduction:

  1. A blank "hello world" Yesod web site, with 500 messages defined for about 30 languages. The single page displays the messages in the user's language. Compiling this program in GHC 8.2.2 (stackage lts-10.5) on Ubuntu 16.04 eats over 50 GB of memory.

https://github.com/ygale/yesod-bug1487

  2. @snoyberg has cut down this reproduction to avoid using any libraries not included with GHC. It is in the same repo, on the snoyberg-master branch.

NON-reproduction: The code in this gist, which is similar to what the TH in the above reproductions generates, is compiled by GHC without batting an eyelash. This indicates that the bug requires TH to reproduce. https://gist.github.com/92347aa93d226e31f977a0b62b443aa7

Edited by Ryan Scott