1. 17 Dec, 2019 13 commits
  2. 16 Dec, 2019 2 commits
  3. 12 Dec, 2019 5 commits
  4. 11 Dec, 2019 13 commits
  5. 09 Dec, 2019 1 commit
    • Fix comment typos · d46a72e1
      Gabor Greif authored
      The following is only necessary to compensate for the CI perf fluke
      that happened in 9897e8c8:
      -------------------------
      Metric Decrease:
          T5837
          T6048
          T9020
          T12425
          T12234
          T13035
          T12150
          Naperian
      -------------------------
  6. 07 Dec, 2019 3 commits
    • Split up coercionKind · 0a4ca9eb
      Simon Peyton Jones authored
      This patch implements the idea in #17515, splitting `coercionKind` into:
      
       * `coercion{Left,Right}Kind`, which compute the left/right side of the
          pair
       * `coercionKind`, which computes the pair of coercible types
      
      This reduces allocation, since we frequently need only one side of the
      pair. Specifically, we see the following improvements on x86-64
      Debian 9:
      
      | test     | new        | old           | relative chg. |
      | :------- | ---------: | ------------: | ------------: |
      | T5030    | 695537752  | 747641152.0   | -6.97%        |
      | T5321Fun | 449315744  | 474009040.0   | -5.21%        |
      | T9872a   | 2611071400 | 2645040952.0  | -1.28%        |
      | T9872c   | 2957097904 | 2994260264.0  | -1.24%        |
      | T12227   | 773435072  | 812367768.0   | -4.79%        |
      | T12545   | 3142687224 | 3215714752.0  | -2.27%        |
      | T14683   | 9392407664 | 9824775000.0  | -4.40%        |
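
      The idea can be illustrated on a toy coercion type. This is a minimal
      sketch, not GHC's actual `Coercion` definition: `Ty`, `Co`, `coKind`,
      `coLKind` and `coRKind` are hypothetical names chosen for the example.
      The pair-returning function builds a tuple at each node even when the
      caller wants only one side, while the one-sided functions return a
      type directly.

      ```haskell
      -- Toy model only; none of these names are GHC's.
      data Ty = TyCon String | FunTy Ty Ty
        deriving (Eq, Show)

      data Co
        = Refl Ty        -- t ~ t
        | Axiom Ty Ty    -- an assumed equality between two given types
        | Sym Co         -- swaps the two sides
        | Trans Co Co    -- left side of the first, right side of the second
        | FunCo Co Co    -- lifts two coercions over a function arrow
        deriving Show

      -- Computes both sides at once; every node builds a pair.
      coKind :: Co -> (Ty, Ty)
      coKind (Refl t)      = (t, t)
      coKind (Axiom l r)   = (l, r)
      coKind (Sym c)       = let (l, r) = coKind c in (r, l)
      coKind (Trans c1 c2) = (fst (coKind c1), snd (coKind c2))
      coKind (FunCo a b)   = let (la, ra) = coKind a
                                 (lb, rb) = coKind b
                             in (FunTy la lb, FunTy ra rb)

      -- Compute a single side without building intermediate pairs.
      coLKind, coRKind :: Co -> Ty
      coLKind (Refl t)     = t
      coLKind (Axiom l _)  = l
      coLKind (Sym c)      = coRKind c
      coLKind (Trans c1 _) = coLKind c1
      coLKind (FunCo a b)  = FunTy (coLKind a) (coLKind b)

      coRKind (Refl t)     = t
      coRKind (Axiom _ r)  = r
      coRKind (Sym c)      = coLKind c
      coRKind (Trans _ c2) = coRKind c2
      coRKind (FunCo a b)  = FunTy (coRKind a) (coRKind b)
      ```

      For any `c` in this model, `coLKind c` agrees with `fst (coKind c)` and
      `coRKind c` with `snd (coKind c)`.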
      
      Metric Decrease:
          T12545
          T9872a
          T14683
          T5030
          T12227
          T9872c
          T5321Fun
          T9872b
    • Work in progress on coercionLKind, coercionRKind · ee07421f
      Simon Peyton Jones authored
      This is a preliminary patch for #17515
    • Implement pointer tagging for big families (#14373) · 9897e8c8
      Gabor Greif authored
      Formerly we punted on these: evaluated constructors of a big family
      always got a tag of 1.
      
      We now cascade switches: we first check the pointer tag and, when it
      is MAX_PTR_TAG, fetch the precise tag from the info table and switch
      on that. The only technically tricky part is that the default case
      needs (logical) duplication. To do this we emit an extra label for it
      and branch to that from the second switch. This avoids duplicated
      codegen.
      
      Here's a simple example of the new code gen:
      
          data D = D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8
      
      On a 64-bit system previously all constructors would be tagged 1. With
      the new code gen D7 and D8 are tagged 7:
      
          [Lib.D7_con_entry() {
               ...
               {offset
                 c1eu: // global
                     R1 = R1 + 7;
                     call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
               }
           }]
      
          [Lib.D8_con_entry() {
               ...
               {offset
                 c1ez: // global
                     R1 = R1 + 7;
                     call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
               }
           }]
      
      When switching we now look at the info table only when the tag is 7. For
      example, if we derive Enum for the type above, the Cmm looks like this:
      
          c2Le:
              _s2Js::P64 = R1;
              _c2Lq::P64 = _s2Js::P64 & 7;
              switch [1 .. 7] _c2Lq::P64 {
                  case 1 : goto c2Lk;
                  case 2 : goto c2Ll;
                  case 3 : goto c2Lm;
                  case 4 : goto c2Ln;
                  case 5 : goto c2Lo;
                  case 6 : goto c2Lp;
                  case 7 : goto c2Lj;
              }
      
          // Read info table for tag
          c2Lj:
              _c2Lv::I64 = %MO_UU_Conv_W32_W64(I32[I64[_s2Js::P64 & (-8)] - 4]);
              if (_c2Lv::I64 != 6) goto c2Lu; else goto c2Lt;
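
      The tagging scheme and the two-level dispatch can be modelled in a few
      lines of Haskell. This is a sketch under stated assumptions, not the
      code generator itself: `Closure`, `TaggedPtr`, `pointerTag`, `mkTagged`
      and `conIndex` are hypothetical names, a 64-bit target is assumed
      (largest tag 7), and unevaluated closures (tag 0) are not modelled.

      ```haskell
      -- Assumption: 64-bit target, so the three low pointer bits hold tags 1..7.
      maxPtrTag :: Int
      maxPtrTag = 7

      -- Hypothetical stand-in for an evaluated constructor closure: only the
      -- 0-based constructor tag stored in its info table is modelled.
      newtype Closure = Closure { infoTableTag :: Int }

      -- A pointer together with the tag kept in its low bits.
      data TaggedPtr = TaggedPtr { ptrTag :: Int, target :: Closure }

      -- Pointer tag for the constructor with 1-based index conNo: in a big
      -- family, every constructor from index 7 upwards shares tag 7.
      pointerTag :: Int -> Int
      pointerTag conNo = min conNo maxPtrTag

      -- Build a tagged pointer to the constructor with 1-based index conNo.
      mkTagged :: Int -> TaggedPtr
      mkTagged conNo = TaggedPtr (pointerTag conNo) (Closure (conNo - 1))

      -- Recover the 1-based constructor index of a big family. The Bool
      -- records whether the info table had to be read (the second switch).
      conIndex :: TaggedPtr -> (Int, Bool)
      conIndex p
        | ptrTag p < maxPtrTag = (ptrTag p, False)
        | otherwise            = (infoTableTag (target p) + 1, True)
      ```

      For the eight-constructor type `D` above,
      `map (fst . conIndex . mkTagged) [1 .. 8]` gives back `[1 .. 8]`, and
      only indices 7 and 8 consult the info table.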
      
      Generated Cmm sizes do not change too much, but binaries are very
      slightly larger, due to the fact that the new instructions are longer in
      encoded form. E.g. previously entry code for D8 above would be
      
          00000000000001c0 <Lib_D8_con_info>:
           1c0:	48 ff c3             	inc    %rbx
           1c3:	ff 65 00             	jmpq   *0x0(%rbp)
      
      With this patch
      
          00000000000001d0 <Lib_D8_con_info>:
           1d0:	48 83 c3 07          	add    $0x7,%rbx
           1d4:	ff 65 00             	jmpq   *0x0(%rbp)
      
      This is one byte longer.
      
      Secondly, reading the info table directly and then switching is shorter
      
          _c1co:
                  movq -1(%rbx),%rax
                  movl -4(%rax),%eax
                  // Switch on info table tag
                  jmp *_n1d5(,%rax,8)
      
      than first switching on the pointer tag and then, for tag 7, doing
      another switch on the info table tag:
      
          // When tag is 7
          _c1ct:
                  andq $-8,%rbx
                  movq (%rbx),%rax
                  movl -4(%rax),%eax
                  // Switch on info table tag
                  ...
      
      Some changes of binary sizes in actual programs:
      
      - In NoFib the worst case is 0.1% increase in benchmark "parser" (see
        NoFib results below). All programs get slightly larger.
      
      - Stage 2 compiler size does not change.
      
      - In "containers" (the library) size of all object files increases
        0.0005%. Size of the test program "bitqueue-properties" increases
        0.03%.
      
      NoFib benchmarks kindly provided by Ömer (@osa1):
      
      NoFib Results
      =============
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs    Instrs     Reads    Writes
      --------------------------------------------------------------------------------
                   CS          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  CSD          +0.0%      0.0%      0.0%     +0.0%     +0.0%
                   FS          +0.0%      0.0%      0.0%     +0.0%      0.0%
                    S          +0.0%      0.0%     -0.0%      0.0%      0.0%
                   VS          +0.0%      0.0%     -0.0%     +0.0%     +0.0%
                  VSD          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
                  VSM          +0.0%      0.0%      0.0%      0.0%      0.0%
                 anna          +0.0%      0.0%     +0.1%     -0.9%     -0.0%
                 ansi          +0.0%      0.0%     -0.0%     +0.0%     +0.0%
                 atom          +0.0%      0.0%      0.0%      0.0%      0.0%
               awards          +0.0%      0.0%     -0.0%     +0.0%      0.0%
               banner          +0.0%      0.0%     -0.0%     +0.0%      0.0%
           bernouilli          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         binary-trees          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                boyer          +0.0%      0.0%     +0.0%      0.0%     -0.0%
               boyer2          +0.0%      0.0%     +0.0%      0.0%     -0.0%
                 bspt          +0.0%      0.0%     +0.0%     +0.0%      0.0%
            cacheprof          +0.0%      0.0%     +0.1%     -0.8%      0.0%
             calendar          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
             cichelli          +0.0%      0.0%     +0.0%      0.0%      0.0%
              circsim          +0.0%      0.0%     -0.0%     -0.1%     -0.0%
             clausify          +0.0%      0.0%     +0.0%     +0.0%      0.0%
        comp_lab_zift          +0.0%      0.0%     +0.0%      0.0%     -0.0%
             compress          +0.0%      0.0%     +0.0%     +0.0%      0.0%
            compress2          +0.0%      0.0%      0.0%      0.0%      0.0%
          constraints          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
         cryptarithm1          +0.0%      0.0%     +0.0%      0.0%      0.0%
         cryptarithm2          +0.0%      0.0%     +0.0%     -0.0%      0.0%
                  cse          +0.0%      0.0%     +0.0%     +0.0%      0.0%
         digits-of-e1          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
         digits-of-e2          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               dom-lt          +0.0%      0.0%     +0.0%     +0.0%      0.0%
                eliza          +0.0%      0.0%     -0.0%     +0.0%      0.0%
                event          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          exact-reals          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               exp3_8          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
               expert          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
       fannkuch-redux          +0.0%      0.0%     +0.0%      0.0%      0.0%
                fasta          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  fem          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  fft          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                 fft2          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
             fibheaps          +0.0%      0.0%     +0.0%     +0.0%      0.0%
                 fish          +0.0%      0.0%     +0.0%     +0.0%      0.0%
                fluid          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               fulsom          +0.0%      0.0%     +0.0%     -0.0%     +0.0%
               gamteb          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                  gcd          +0.0%      0.0%     +0.0%     +0.0%      0.0%
          gen_regexps          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               genfft          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                   gg          +0.0%      0.0%      0.0%     -0.0%      0.0%
                 grep          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               hidden          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                  hpg          +0.0%      0.0%     +0.0%     -0.1%     -0.0%
                  ida          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                infer          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
              integer          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
            integrate          +0.0%      0.0%      0.0%     +0.0%      0.0%
         k-nucleotide          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                kahan          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
              knights          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               lambda          +0.0%      0.0%     +1.2%     -6.1%     -0.0%
           last-piece          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                 lcss          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                 life          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                 lift          +0.0%      0.0%     +0.0%     +0.0%      0.0%
               linear          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            listcompr          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
             listcopy          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
             maillist          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               mandel          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
              mandel2          +0.0%      0.0%     +0.0%     +0.0%     -0.0%
                 mate          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
              minimax          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
              mkhprog          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
           multiplier          +0.0%      0.0%      0.0%     +0.0%     -0.0%
               n-body          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
             nucleic2          +0.0%      0.0%     +0.0%     +0.0%     -0.0%
                 para          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            paraffins          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               parser          +0.1%      0.0%     +0.4%     -1.7%     -0.0%
              parstof          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  pic          +0.0%      0.0%     +0.0%      0.0%     -0.0%
             pidigits          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                power          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               pretty          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               primes          +0.0%      0.0%     +0.0%      0.0%      0.0%
            primetest          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               prolog          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               puzzle          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               queens          +0.0%      0.0%      0.0%     +0.0%     +0.0%
              reptile          +0.0%      0.0%     +0.0%     +0.0%      0.0%
      reverse-complem          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
              rewrite          +0.0%      0.0%     +0.0%      0.0%     -0.0%
                 rfib          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  rsa          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  scc          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                sched          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  scs          +0.0%      0.0%     +0.0%     +0.0%      0.0%
               simple          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                solid          +0.0%      0.0%     +0.0%     +0.0%      0.0%
              sorting          +0.0%      0.0%     +0.0%     -0.0%      0.0%
        spectral-norm          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
               sphere          +0.0%      0.0%     +0.0%     -1.0%      0.0%
               symalg          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  tak          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            transform          +0.0%      0.0%     +0.4%     -1.3%     +0.0%
             treejoin          +0.0%      0.0%     +0.0%     -0.0%      0.0%
            typecheck          +0.0%      0.0%     -0.0%     +0.0%      0.0%
              veritas          +0.0%      0.0%     +0.0%     -0.1%     +0.0%
                 wang          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            wave4main          +0.0%      0.0%     +0.0%      0.0%     -0.0%
         wheel-sieve1          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         wheel-sieve2          +0.0%      0.0%     +0.0%     +0.0%      0.0%
                 x2n1          +0.0%      0.0%     +0.0%     +0.0%      0.0%
      --------------------------------------------------------------------------------
                  Min          +0.0%      0.0%     -0.0%     -6.1%     -0.0%
                  Max          +0.1%      0.0%     +1.2%     +0.0%     +0.0%
       Geometric Mean          +0.0%     -0.0%     +0.0%     -0.1%     -0.0%
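
      The summary rows aggregate the per-program changes. As a rough
      illustration, and assuming the "Geometric Mean" row is taken over the
      new/old ratios rather than over the printed percentages (an
      assumption, not something stated in this commit message), such a mean
      could be computed as below. `geoMeanPct` is a hypothetical helper name.

      ```haskell
      -- Geometric mean of relative changes given in percent, computed over
      -- the ratios new/old = 1 + pct/100 and converted back to a percentage.
      -- Assumes a non-empty list with every change above -100%.
      geoMeanPct :: [Double] -> Double
      geoMeanPct pcts = (exp (sum (map log ratios) / n) - 1) * 100
        where
          ratios = map (\p -> 1 + p / 100) pcts
          n      = fromIntegral (length pcts)
      ```

      Under this mean a +100% and a -50% change cancel out, whereas their
      arithmetic mean would be +25%.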
      
      NoFib GC Results
      ================
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs    Instrs     Reads    Writes
      --------------------------------------------------------------------------------
              circsim          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          constraints          +0.0%      0.0%     -0.0%      0.0%     -0.0%
             fibheaps          +0.0%      0.0%      0.0%     -0.0%     -0.0%
               fulsom          +0.0%      0.0%      0.0%     -0.6%     -0.0%
             gc_bench          +0.0%      0.0%      0.0%      0.0%     -0.0%
                 hash          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 lcss          +0.0%      0.0%      0.0%     -0.0%      0.0%
            mutstore1          +0.0%      0.0%      0.0%     -0.0%     -0.0%
            mutstore2          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                power          +0.0%      0.0%     -0.0%      0.0%     -0.0%
           spellcheck          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
      --------------------------------------------------------------------------------
                  Min          +0.0%      0.0%     -0.0%     -0.6%     -0.0%
                  Max          +0.0%      0.0%     +0.0%      0.0%      0.0%
       Geometric Mean          +0.0%     +0.0%     +0.0%     -0.1%     +0.0%
      
      Fixes #14373
      
      These performance regressions appear to be a fluke in CI. See the
      discussion in !1742 for details.
      
      Metric Increase:
          T6048
          T12234
          T12425
          Naperian
          T12150
          T5837
          T13035
  7. 05 Dec, 2019 3 commits