    StgCmmCon: Do not generate moves from unused fields to local variables · cd50d236
    Ömer Sinan Ağacan authored
    Say we have a record like this:
    
        data Rec = Rec
          { f1 :: Int
          , f2 :: Int
          , f3 :: Int
          , f4 :: Int
          , f5 :: Int
          }
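
    For orientation, the selector `f1` behaves roughly like a case expression
    that matches on all five fields but uses only the first. The sketch below
    is a hand-written equivalent, not the code GHC derives internally; the
    name `f1'` is hypothetical and exists only for this illustration.

```haskell
module Main where

data Rec = Rec
  { f1 :: Int
  , f2 :: Int
  , f3 :: Int
  , f4 :: Int
  , f5 :: Int
  }

-- Hand-written stand-in for the selector f1 (f1' is a hypothetical name).
-- The pattern mentions every field, but only the first one is returned.
f1' :: Rec -> Int
f1' r = case r of
  Rec x _ _ _ _ -> x

main :: IO ()
main = do
  let r = Rec 1 2 3 4 5
  -- The hand-written version agrees with the derived selector.
  print (f1' r == f1 r)
```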
    
    Before this patch, the code generated for `f1` looked like this:
    
        f1_entry()
            {offset
               ...
               cJT:
                   _sI6::P64 = R1;
                   _sI7::P64 = P64[_sI6::P64 + 7];
                   _sI8::P64 = P64[_sI6::P64 + 15];
                   _sI9::P64 = P64[_sI6::P64 + 23];
                   _sIa::P64 = P64[_sI6::P64 + 31];
                   _sIb::P64 = P64[_sI6::P64 + 39];
                   R1 = _sI7::P64 & (-8);
                   Sp = Sp + 8;
                   call (I64[R1])(R1) args: 8, res: 0, upd: 8;
            }
    
    Note how every field of the record is moved to a local variable, even though
    only `_sI7` is used afterwards. These moves make it to the final assembly:
    
        f1_info:
            ...
        _cJT:
            movq 7(%rbx),%rax
            movq 15(%rbx),%rcx
            movq 23(%rbx),%rcx
            movq 31(%rbx),%rcx
            movq 39(%rbx),%rbx
            movq %rax,%rbx
            andq $-8,%rbx
            addq $8,%rbp
            jmp *(%rbx)
    
    With this patch we stop generating these move instructions. Cmm becomes this:
    
        f1_entry()
            {offset
               ...
               cJT:
                   _sI6::P64 = R1;
                   _sI7::P64 = P64[_sI6::P64 + 7];
                   R1 = _sI7::P64 & (-8);
                   Sp = Sp + 8;
                   call (I64[R1])(R1) args: 8, res: 0, upd: 8;
            }
    
    Assembly becomes this:
    
        f1_info:
            ...
        _cJT:
            movq 7(%rbx),%rax
            movq %rax,%rbx
            andq $-8,%rbx
            addq $8,%rbp
            jmp *(%rbx)
    
    It turns out CmmSink already optimizes these moves away, but it's better to
    generate good code in the first place.
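
    The idea can be sketched with a small toy model: emit a load only for the
    fields that the body of the alternative actually uses. This is not the
    code in StgCmmCon; the types and the names `Load`, `allFieldLoads`,
    `usedFieldLoads` and `recFields` are assumptions made for illustration.
    The variable names and offsets are taken from the Cmm listings above.

```haskell
module Main where

-- Toy model only: none of these names come from GHC.
type Var = String

-- A field load: local variable := word at (closure pointer + byte offset).
data Load = Load { dest :: Var, offset :: Int }
  deriving (Eq, Show)

-- Behaviour before the patch, in this model: bind every constructor field.
allFieldLoads :: [(Var, Int)] -> [Load]
allFieldLoads fields = [Load v off | (v, off) <- fields]

-- Behaviour after the patch, in this model: bind only the fields that the
-- body of the case alternative refers to.
usedFieldLoads :: [Var] -> [(Var, Int)] -> [Load]
usedFieldLoads used fields =
  [Load v off | (v, off) <- fields, v `elem` used]

-- The five fields of Rec, with the offsets from the Cmm listing.
recFields :: [(Var, Int)]
recFields =
  [("_sI7", 7), ("_sI8", 15), ("_sI9", 23), ("_sIa", 31), ("_sIb", 39)]

main :: IO ()
main = do
  -- Loads emitted when every field is bound.
  mapM_ print (allFieldLoads recFields)
  -- Loads emitted when only _sI7 is used, as in the body of f1.
  mapM_ print (usedFieldLoads ["_sI7"] recFields)
```

    In this model the body of `f1` uses only `_sI7`, so a single load
    remains, matching the shape of the Cmm generated after the patch.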
    
    Reviewers: simonmar, simonpj, austin, bgamari
    
    Reviewed By: simonmar, simonpj
    
    Subscribers: rwbarton, thomie
    
    Differential Revision: https://phabricator.haskell.org/D2269