Skip to content

Unnecessary code duplication from case analysis

Consider a case analysis like this,

predicate c =
       c == '-' || c == '_' || c == '.' || c == '~' || c == ':'
    || c == '@' || c == '&' || c == '=' || c == '+' || c == '$' || c == ','

f x | predicate x = do_something
f x = do_something_else
main = f 'a'

GHC in some cases appears to produce Core for this example by constructing a case analysis on x, replicating do_something in every branch (generated by GHC 7.10),

$wa_r3UQ =
  \ ww_s3TC w_s3Tz ->
    case ww_s3TC of _ {
      __DEFAULT -> (# w_s3Tz, () #);
      '$' -> hPutStr2 stdout lvl2_r3Uv True w_s3Tz;
      '&' -> hPutStr2 stdout lvl4_r3Ux True w_s3Tz;
      ...
      '_' -> hPutStr2 stdout lvl20_r3UN True w_s3Tz;
      '~' -> hPutStr2 stdout lvl22_r3UP True w_s3Tz
    }

In some cases where do_something is large this can lead to a substantial increase in code size.

There are really two issues here,

  1. That a branch-y case analysis is being used for a boolean condition
  2. That do_something is being inlined into each branch

This seems to be sub-optimal for instruction cache efficiency, the number of branches in generated machine code, and understandability of the resulting Core.

I would have expected something like,

f x =
    let p = pred x
    in case p of
          True  -> do_something
          False -> do_something_else
Edited by Ben Gamari
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information