Linearity

The solution is to distinguish call demands from product demands. Consider again:

let f = \ x. \ y. ... in
  ...(f 1 2)...(f 3 4)...

The demands placed on f by the first and second call get bothed together to yield SM(SM(T)). But this is incorrect. Consider:

let f = \ x. \ y. ... 
    frob = f 1 in
  ...(f 1 2)...(frob 2)...(frob 3)...

Here, the demands placed on f by the body of frob and by the call to f in the let-body get bothed together: S1(T) & S1(S1(T)) = SM(SM(T)). Note that this is the same as the demand placed on f above, yet we want to distinguish between the two situations, because in the first example, the inner lambda in f's rhs is only called once.

The solution is to treat call demands and product demands differently, and to define the both function for call demands to have the same behavior as lub. Then in the first example, f has demand SM(S1(T)) placed on it, and in the second, SM(T). This is what we want; now, if f has demand D(D(T) placed on it, that implies f is always called with two arguments.

Why does this make sense? Consider what it means if we see an example like:

let baz = lazyF p in
  case p of
    (a,b) -> strictF a b

(where lazyF is lazy in p, and strictF is strict in a and b). p is used both with demand L (in the call to lazyF and with demand S(SS) (in the call to strictF). This means it's perfectly same to strictly evaluate p, so when we both together the two demands, we should get S(SS). On the other hand, if a function is called once with one argument and once with two, we don't want to treat it as a function that's always called with two arguments; we're only interested in functions that are always called with n arguments for a given n. Hence, both should behave the same way as lub for call demands.

Ticky

(NB out-of-date, but maybe historically useful; cf Debugging/TickyTicky)

The following code inserts extra fields into closures when ticky is enabled (and so had to be commented out):

staticTickyHdr :: [CmmLit]
-- The ticky header words in a static closure
-- Was SET_STATIC_TICKY_HDR
staticTickyHdr = 
  | not opt_DoTickyProfiling = []
  | otherwise		     = [zeroCLit]

in compiler/codeGen/CgTicky.hs.

Other relevant functions: emitTickyCounter in compiler/codeGen/CgTicky.hs (called by closureCodeBody in compiler/codeGen/CgClosure.lhs).

Argh! I spent days tracking down this bug: idInfoLabelType in compiler/GHC/Cmm/CLabel.hs needs to return DataLabel for labels of type RednCount (i.e., labels for ticky counters.) By default, it was returning CodeLabel, which caused the ticky counter labels to get declared with the wrong type in the generated C, which caused C compiler errors.

Declarations for ticky counters

emitTickyCounter spits out C declarations that look like this:

static char c16O_str[] = "main:Main.$wrepeated{v r4}";

static char c16Q_str[] = "i";

StgWord Main_zdwrepeated_ct[] = {
0x0, 0x1U, 0x1U, 0x0, (W_)&c16O_str, (W_)&c16Q_str, 0x0, 0x0, 0x0
};

Here, Main_zdwrepeated_ct is actually an StgEntCounter (this type is declared in includes/StgTicky.h). The counters get used by printRegisteredCounterInfo in rts/Ticky.c, which prints out the ticky reports. The counter fields are accessed using offsets defined in includes/GHCConstants.h (oFFSET_StgEntCounter_*), which in turn get generated from includes/mkDerivedConstants.c (change it and then run make in includes/.

<s>Note that the first 3 fields of the counters are 16-bit ints and so the generated ticky-counter registration code has to reflect that (I fixed a bug where the first field was getting treated as a 32-bit int.)</s> I modified the StgEntCounter type so that all fields are StgWords, because it seems that the code generator can't cope with anything else anyway (i.e., in the declaration above, Main_zdwrepeated_ct[] is an array of StgWords, even though the C type declaration implies that some fields are halfwords.)

In emitBlackHoleCode in compiler/codeGen/CgClosure.lhs, "eager blackholing" was getting employed in the case where ticky was turned on; this was causing programs to <<loop>> when they wouldn't with ticky disabled, so I turned that off.

Strictness and let-floating

We run into the following problem in the transform nofib benchmark: suppose we have:

f x = 
  let foo = stuff in
     foo + x

where stuff doesn't depend on x. Demand analysis says that foo has a strict demand placed on it. Later, foo gets floated to the top level because it doesn't depend on x (in reality it's more complicated because in this case foo probably would have gotten floated out before demand analysis, but bear with me). foo still has a strict demand signature, which a top-level binding isn't allowed to have. Currently this manifests itself as an assertion failure in compiler/GHC/Core/Opt/Simplify/Env.hs.

There are two possible easy solutions: don't float out bindings for strict things, or "both" the demand for a binder with Lazy when its binding gets floated out. The question is, is it better to do the let-floating and lose the strictness into or to evaluate something strictly but lose sharing?

Coercions

When we run into an expression like (Cast e co) that we're placing demand d on, we analyze e to get dmd_ty, then check whether the depth of e is equal to the depth of dmd_ty or not. This is necessary because we might be casting a function to a non-function type. So, if d and dmd_ty have equal depth, we return dmd_ty as is; if d's arity is less, we drop the appropriate number of args from dmd_ty; if dmd_ty's arity is less, we add the appropriate number of dummy argument demands to it.

WARN: arity /= dmdTypeDepth rhs_dmd_ty && not (exprIsTrivial rhs)

This warning was happening for (at least) two reasons:

lambdas with a strict non-call demand placed on them were being handled wrong (see the first two examples in Commentary/Compiler/StrictnessAnalysis/Examples)
coercions were being handled wrong, resulting in a demand type with depth 0 being assigned to an rhs consisting of a cast from/to a function type

Explaining demand transformers

For those who, like me, are a little slow, this example might go in section 5.1 of the paper:

(a):

f = \ x -> 
      case x of
        ... -> \ y -> ...
        ... -> \ y -> ...

(b):

f = \ x ->
     \ y ->
        case x of
           ...

In both (a) and (b), f's rhs places a strict demand on x. So if we see:

(f x)

with a strict demand placed on it, it wouldn't be sound to look at f's demand signature and say that (f x) places a strict demand on x under f -- because we don't know whether f is like (a) or like (b). This is why when we see a partial application of f, we discard all of the argument information in f's demand type.

Nofib stuff

I've had weird problems with the time and sed commands under MSYS but I think it's just when running nofib. At some point I wrote down:

TIME needs to be time not /bin/time

and

MSYS sed does not work, use cygwin sed

but of *course* I no longer remember what I meant.

kirsten notes