Skip to content

Pattern matching against sets of strings sharing a prefix blows up pattern checker

hvr stumbled upon this issue while attempting to bootstrap GHC with GHC HEAD. In so doing he found that GHC HEAD required more than 10 GB of memory while compiling genprimopcode (and never completed).

It appears that this blow-up is due to the new pattern checker. In particular it appears that the pattern checker is affected quite adversely by sets of patterns sharing a prefix. For instance, this example,

import System.Environment

main :: IO ()
main = do
  args <- getArgs
  print $ case head args of
                      "--primop-primop-info"  -> "turtle"
                      "--primop-tag" -> "asdf"
                      "--primop-list"  -> "casdhf"
                      "--primop-vector-uniques"  -> "this"
                      "--primop-vector-tys"  -> "is"
                      "--primop-vector-tys-exports"  -> "silly"
                      "--primop-vector-tycons"  -> "hmmm"
                      "--make-haskell-wrappers" -> "123512"
                      "--make-haskell-source"  -> "as;dg"
                      "--make-latex-doc" -> "adghiw"
                      _ -> error "Should not happen, known_args out of sync?"

As written GHC requires over ten gigabytes of heap and several minutes to compile the example. If one perform s/--primop-// to this example it takes 500ms to compile. Alternatively, if on replace the first - in each of the --primop strings with a unique character (thus breaking the shared prefixes) compilation time is a bit shy of a second.

Trac metadata
Trac field Value
Version 7.11
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information