Reorganise the work list, so that flattening goals are treated in the right order
Trac #9872 showed the importance of processing goals in depth-first order, so that we do not build up a huge pool of suspended function calls, each waiting for its children to fire. There is a detailed explanation in Note [The flattening work list] in TcFlatten.

The effect for Trac #9872 (slow1.hs) is dramatic. We go from "too long to wait" down to 28 Gbyte of allocation. GHC 7.8.3 did 116 Gbyte of allocation!
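To illustrate why the order matters, here is a minimal toy model. It is not the code in TcFlatten, and every name in it (Goal, Order, peakSuspended, balanced) is hypothetical. A goal with sub-goals is suspended until all of its children have fired, and the sketch measures the peak number of goals suspended at once under a depth-first (stack) and a breadth-first (queue) work list.

```haskell
import qualified Data.Map.Strict as M

-- A toy goal: a function application whose arguments are sub-goals.
-- (Hypothetical model, not GHC's representation.)
data Goal = Goal [Goal]

-- Work-list discipline: new sub-goals go to the front or to the back.
data Order = DepthFirst | BreadthFirst

-- A work item: the id of the suspended parent waiting on it (if any), and the goal.
type Item = (Maybe Int, Goal)

-- Suspended goals: id -> (parent id, number of children still outstanding).
type Pending = M.Map Int (Maybe Int, Int)

-- Peak number of goals suspended at the same time while solving the root goal.
peakSuspended :: Order -> Goal -> Int
peakSuspended order root = go [(Nothing, root)] M.empty 0 0
  where
    -- Arguments: work list, suspended goals, next fresh id, peak seen so far.
    go :: [Item] -> Pending -> Int -> Int -> Int
    go [] _ _ peak = peak
    go ((parent, Goal kids) : rest) pending next peak
      | null kids = go rest (fire parent pending) next peak  -- a leaf fires at once
      | otherwise =
          let pending' = M.insert next (parent, length kids) pending
              new      = [(Just next, k) | k <- kids]
              -- (++) is linear here; that is fine for a toy model.
              wl       = case order of
                           DepthFirst   -> new ++ rest
                           BreadthFirst -> rest ++ new
          in go wl pending' (next + 1) (max peak (M.size pending'))

    -- A child of the given parent has fired. Decrement the parent's count;
    -- when it reaches zero the parent fires too, which notifies its own parent.
    fire :: Maybe Int -> Pending -> Pending
    fire Nothing pending = pending
    fire (Just p) pending =
      case M.lookup p pending of
        Just (gp, 1) -> fire gp (M.delete p pending)
        Just (gp, n) -> M.insert p (gp, n - 1) pending
        Nothing      -> pending

-- A balanced binary tree of goals of the given depth.
balanced :: Int -> Goal
balanced 0 = Goal []
balanced d = Goal [balanced (d - 1), balanced (d - 1)]
```

In this model, `peakSuspended DepthFirst (balanced 10)` is 10, because only the ancestors of the current goal are ever suspended. `peakSuspended BreadthFirst (balanced 10)` is 1023, because every internal goal is suspended before the first leaf fires.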
Showing with 396 additions and 308 deletions