1. 24 Dec, 2019 2 commits
  2. 21 Dec, 2019 1 commit
  3. 16 Dec, 2019 2 commits
  4. 11 Dec, 2019 3 commits
  5. 10 Dec, 2019 5 commits
  6. 09 Dec, 2019 1 commit
  7. 06 Dec, 2019 9 commits
    • Ömer Sinan Ağacan's avatar
      Fix compacting GC bug when chaining tagged and non-tagged fields together · da04db1a
      Ömer Sinan Ağacan authored
      Currently compacting GC has the invariant that in a chain all fields are tagged
      the same. However this does not really hold: root pointers are not tagged, so
      when we thread a root we initialize a chain without a tag. When the pointed
      objects is evaluated and we have more pointers to it from the heap, we then add
      *tagged* fields to the chain (because pointers to it from the heap are tagged),
      ending up chaining fields with different tags (pointers from roots are NOT
      tagged, pointers from heap are). This breaks the invariant and as a result
      compacting GC turns tagged pointers into non-tagged.
      
      This later causes problem in the generated code where we do reads assuming that
      the pointer is aligned, e.g.
      
          0x7(%rax) -- assumes that pointer is tagged 1
      
      which causes misaligned reads. This caused #17088.
      
      We fix this using the "pointer tagging for large families" patch (#14373,
      !1742):
      
      - With the pointer tagging patch the GC can know what the tagged pointer to a
        CONSTR should be (previously we'd need to know the family size -- large
        families are always tagged 1, small families are tagged depending on the
        constructor).
      
      - Since we now know what the tags should be we no longer need to store the
        pointer tag in the info table pointers when forming chains in the compacting
        GC.
      
      As a result we no longer need to tag pointers in chains with 1/2 depending on
      whether the field points to an info table pointer, or to another field: an info
      table pointer is always tagged 0, everything else in the chain is tagged 1. The
      lost tags in pointers can be retrieved by looking at the info table.
      
      Finally, instead of using tag 1 for fields and tag 0 for info table pointers, we
      use two different tags for fields:
      
      - 1 for fields that have untagged pointers
      - 2 for fields that have tagged pointers
      
      When unchaining we then look at the pointer to a field, and depending on its tag
      we either leave a tagged pointer or an untagged pointer in the field.
      
      This allows chaining untagged and tagged fields together in compacting GC.
      
      Fixes #17088
      da04db1a
    • Gabor Greif's avatar
      Implement pointer tagging for big families (#14373) · b911c532
      Gabor Greif authored
      Formerly we punted on these and evaluated constructors always got a tag
      of 1.
      
      We now cascade switches because we have to check the tag first and when
      it is MAX_PTR_TAG then get the precise tag from the info table and
      switch on that. The only technically tricky part is that the default
      case needs (logical) duplication. To do this we emit an extra label for
      it and branch to that from the second switch. This avoids duplicated
      codegen.
      
      Here's a simple example of the new code gen:
      
          data D = D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8
      
      On a 64-bit system previously all constructors would be tagged 1. With
      the new code gen D7 and D8 are tagged 7:
      
          [Lib.D7_con_entry() {
               ...
               {offset
                 c1eu: // global
                     R1 = R1 + 7;
                     call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
               }
           }]
      
          [Lib.D8_con_entry() {
               ...
               {offset
                 c1ez: // global
                     R1 = R1 + 7;
                     call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
               }
           }]
      
      When switching we now look at the info table only when the tag is 7. For
      example, if we derive Enum for the type above, the Cmm looks like this:
      
          c2Le:
              _s2Js::P64 = R1;
              _c2Lq::P64 = _s2Js::P64 & 7;
              switch [1 .. 7] _c2Lq::P64 {
                  case 1 : goto c2Lk;
                  case 2 : goto c2Ll;
                  case 3 : goto c2Lm;
                  case 4 : goto c2Ln;
                  case 5 : goto c2Lo;
                  case 6 : goto c2Lp;
                  case 7 : goto c2Lj;
              }
      
          // Read info table for tag
          c2Lj:
              _c2Lv::I64 = %MO_UU_Conv_W32_W64(I32[I64[_s2Js::P64 & (-8)] - 4]);
              if (_c2Lv::I64 != 6) goto c2Lu; else goto c2Lt;
      
      Generated Cmm sizes do not change too much, but binaries are very
      slightly larger, due to the fact that the new instructions are longer in
      encoded form. E.g. previously entry code for D8 above would be
      
          00000000000001c0 <Lib_D8_con_info>:
           1c0:	48 ff c3             	inc    %rbx
           1c3:	ff 65 00             	jmpq   *0x0(%rbp)
      
      With this patch
      
          00000000000001d0 <Lib_D8_con_info>:
           1d0:	48 83 c3 07          	add    $0x7,%rbx
           1d4:	ff 65 00             	jmpq   *0x0(%rbp)
      
      This is one byte longer.
      
      Secondly, reading info table directly and then switching is shorter
      
          _c1co:
                  movq -1(%rbx),%rax
                  movl -4(%rax),%eax
                  // Switch on info table tag
                  jmp *_n1d5(,%rax,8)
      
      than doing the same switch, and then for the tag 7 doing another switch:
      
          // When tag is 7
          _c1ct:
                  andq $-8,%rbx
                  movq (%rbx),%rax
                  movl -4(%rax),%eax
                  // Switch on info table tag
                  ...
      
      Some changes of binary sizes in actual programs:
      
      - In NoFib the worst case is 0.1% increase in benchmark "parser" (see
        NoFib results below). All programs get slightly larger.
      
      - Stage 2 compiler size does not change.
      
      - In "containers" (the library) size of all object files increases
        0.0005%. Size of the test program "bitqueue-properties" increases
        0.03%.
      
      nofib benchmarks kindly provided by Ömer (@osa1):
      
      NoFib Results
      =============
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs    Instrs     Reads    Writes
      --------------------------------------------------------------------------------
                   CS          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  CSD          +0.0%      0.0%      0.0%     +0.0%     +0.0%
                   FS          +0.0%      0.0%      0.0%     +0.0%      0.0%
                    S          +0.0%      0.0%     -0.0%      0.0%      0.0%
                   VS          +0.0%      0.0%     -0.0%     +0.0%     +0.0%
                  VSD          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
                  VSM          +0.0%      0.0%      0.0%      0.0%      0.0%
                 anna          +0.0%      0.0%     +0.1%     -0.9%     -0.0%
                 ansi          +0.0%      0.0%     -0.0%     +0.0%     +0.0%
                 atom          +0.0%      0.0%      0.0%      0.0%      0.0%
               awards          +0.0%      0.0%     -0.0%     +0.0%      0.0%
               banner          +0.0%      0.0%     -0.0%     +0.0%      0.0%
           bernouilli          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         binary-trees          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                boyer          +0.0%      0.0%     +0.0%      0.0%     -0.0%
               boyer2          +0.0%      0.0%     +0.0%      0.0%     -0.0%
                 bspt          +0.0%      0.0%     +0.0%     +0.0%      0.0%
            cacheprof          +0.0%      0.0%     +0.1%     -0.8%      0.0%
             calendar          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
             cichelli          +0.0%      0.0%     +0.0%      0.0%      0.0%
              circsim          +0.0%      0.0%     -0.0%     -0.1%     -0.0%
             clausify          +0.0%      0.0%     +0.0%     +0.0%      0.0%
        comp_lab_zift          +0.0%      0.0%     +0.0%      0.0%     -0.0%
             compress          +0.0%      0.0%     +0.0%     +0.0%      0.0%
            compress2          +0.0%      0.0%      0.0%      0.0%      0.0%
          constraints          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
         cryptarithm1          +0.0%      0.0%     +0.0%      0.0%      0.0%
         cryptarithm2          +0.0%      0.0%     +0.0%     -0.0%      0.0%
                  cse          +0.0%      0.0%     +0.0%     +0.0%      0.0%
         digits-of-e1          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
         digits-of-e2          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               dom-lt          +0.0%      0.0%     +0.0%     +0.0%      0.0%
                eliza          +0.0%      0.0%     -0.0%     +0.0%      0.0%
                event          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          exact-reals          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               exp3_8          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
               expert          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
       fannkuch-redux          +0.0%      0.0%     +0.0%      0.0%      0.0%
                fasta          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  fem          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  fft          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                 fft2          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
             fibheaps          +0.0%      0.0%     +0.0%     +0.0%      0.0%
                 fish          +0.0%      0.0%     +0.0%     +0.0%      0.0%
                fluid          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               fulsom          +0.0%      0.0%     +0.0%     -0.0%     +0.0%
               gamteb          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                  gcd          +0.0%      0.0%     +0.0%     +0.0%      0.0%
          gen_regexps          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               genfft          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                   gg          +0.0%      0.0%      0.0%     -0.0%      0.0%
                 grep          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               hidden          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                  hpg          +0.0%      0.0%     +0.0%     -0.1%     -0.0%
                  ida          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                infer          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
              integer          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
            integrate          +0.0%      0.0%      0.0%     +0.0%      0.0%
         k-nucleotide          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                kahan          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
              knights          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               lambda          +0.0%      0.0%     +1.2%     -6.1%     -0.0%
           last-piece          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                 lcss          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                 life          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                 lift          +0.0%      0.0%     +0.0%     +0.0%      0.0%
               linear          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            listcompr          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
             listcopy          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
             maillist          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               mandel          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
              mandel2          +0.0%      0.0%     +0.0%     +0.0%     -0.0%
                 mate          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
              minimax          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
              mkhprog          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
           multiplier          +0.0%      0.0%      0.0%     +0.0%     -0.0%
               n-body          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
             nucleic2          +0.0%      0.0%     +0.0%     +0.0%     -0.0%
                 para          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            paraffins          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               parser          +0.1%      0.0%     +0.4%     -1.7%     -0.0%
              parstof          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                  pic          +0.0%      0.0%     +0.0%      0.0%     -0.0%
             pidigits          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                power          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
               pretty          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               primes          +0.0%      0.0%     +0.0%      0.0%      0.0%
            primetest          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               prolog          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               puzzle          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
               queens          +0.0%      0.0%      0.0%     +0.0%     +0.0%
              reptile          +0.0%      0.0%     +0.0%     +0.0%      0.0%
      reverse-complem          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
              rewrite          +0.0%      0.0%     +0.0%      0.0%     -0.0%
                 rfib          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  rsa          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  scc          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                sched          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  scs          +0.0%      0.0%     +0.0%     +0.0%      0.0%
               simple          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                solid          +0.0%      0.0%     +0.0%     +0.0%      0.0%
              sorting          +0.0%      0.0%     +0.0%     -0.0%      0.0%
        spectral-norm          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
               sphere          +0.0%      0.0%     +0.0%     -1.0%      0.0%
               symalg          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
                  tak          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            transform          +0.0%      0.0%     +0.4%     -1.3%     +0.0%
             treejoin          +0.0%      0.0%     +0.0%     -0.0%      0.0%
            typecheck          +0.0%      0.0%     -0.0%     +0.0%      0.0%
              veritas          +0.0%      0.0%     +0.0%     -0.1%     +0.0%
                 wang          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            wave4main          +0.0%      0.0%     +0.0%      0.0%     -0.0%
         wheel-sieve1          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         wheel-sieve2          +0.0%      0.0%     +0.0%     +0.0%      0.0%
                 x2n1          +0.0%      0.0%     +0.0%     +0.0%      0.0%
      --------------------------------------------------------------------------------
                  Min          +0.0%      0.0%     -0.0%     -6.1%     -0.0%
                  Max          +0.1%      0.0%     +1.2%     +0.0%     +0.0%
       Geometric Mean          +0.0%     -0.0%     +0.0%     -0.1%     -0.0%
      
      NoFib GC Results
      ================
      
      --------------------------------------------------------------------------------
              Program           Size    Allocs    Instrs     Reads    Writes
      --------------------------------------------------------------------------------
              circsim          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          constraints          +0.0%      0.0%     -0.0%      0.0%     -0.0%
             fibheaps          +0.0%      0.0%      0.0%     -0.0%     -0.0%
               fulsom          +0.0%      0.0%      0.0%     -0.6%     -0.0%
             gc_bench          +0.0%      0.0%      0.0%      0.0%     -0.0%
                 hash          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
                 lcss          +0.0%      0.0%      0.0%     -0.0%      0.0%
            mutstore1          +0.0%      0.0%      0.0%     -0.0%     -0.0%
            mutstore2          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
                power          +0.0%      0.0%     -0.0%      0.0%     -0.0%
           spellcheck          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
      --------------------------------------------------------------------------------
                  Min          +0.0%      0.0%     -0.0%     -0.6%     -0.0%
                  Max          +0.0%      0.0%     +0.0%      0.0%      0.0%
       Geometric Mean          +0.0%     +0.0%     +0.0%     -0.1%     +0.0%
      
      Fixes #14373
      
      These performance regressions appear to be a fluke in CI. See the
      discussion in !1742 for details.
      
      Metric Increase:
          T6048
          T12234
          T12425
          Naperian
          T12150
          T5837
          T13035
      b911c532
    • Ben Gamari's avatar
      gitlab-ci: Add Debian 10 builds · 99e1816a
      Ben Gamari authored
      (cherry picked from commit c0c77bda)
      99e1816a
    • Ben Gamari's avatar
      13dab275
    • Ben Gamari's avatar
      gitlab-ci: Set LANG on CentOS 7 · 113bf129
      Ben Gamari authored
      It otherwise seems to default to ascii
      
      (cherry picked from commit b6f7e407)
      113bf129
    • Ben Gamari's avatar
      gitlab-ci: pxz is unavailable on CentOS 7 · 61820b24
      Ben Gamari authored
      Fall back to xz
      
      (cherry picked from commit 8565f808)
      61820b24
    • Ben Gamari's avatar
      Release notes fixes · 599a5916
      Ben Gamari authored
      599a5916
    • Ben Gamari's avatar
      gitlab-ci: Always build source tarball · a41bd7ac
      Ben Gamari authored
      a41bd7ac
    • Ben Gamari's avatar
      gitlab-ci: Add release-x86_64-linux-deb9 job · 1a70568b
      Ben Gamari authored
      1a70568b
  8. 23 Nov, 2019 1 commit
    • vdukhovni's avatar
      On FreeBSD 12 sys/sysctl.h requires sys/types.h · 273d802b
      vdukhovni authored
      Else build fails with:
      
          In file included from ExecutablePath.hsc:42:
          /usr/include/sys/sysctl.h:1062:25: error: unknown type name 'u_int'; did you mean 'int'?
           int sysctl(const int *, u_int, void *, size_t *, const void *, size_t);
      			     ^~~~~
      			     int
          compiling libraries/base/dist-install/build/System/Environment/ExecutablePath_hsc_make.c failed (exit code 1)
      
      Perhaps also also other FreeBSD releases, but additional include
      will no harm even if not needed.
      273d802b
  9. 22 Nov, 2019 2 commits
  10. 21 Nov, 2019 11 commits
  11. 19 Nov, 2019 3 commits
    • Ben Gamari's avatar
      nonmoving: Drop redundant write barrier on stack underflow · 098d5017
      Ben Gamari authored
      Previously we would push stack-carried return values to the new stack on
      a stack overflow. While the precise reasoning for this barrier is
      unfortunately lost to history, in hindsight I suspect it was prompted by
      a missing barrier elsewhere (that has been since fixed).
      
      Moreover, there the redundant barrier is actively harmful: the stack may
      contain non-pointer values; blindly pushing these to the mark queue will
      result in a crash. This is precisely what happened in the `stack003`
      test. However, because of a (now fixed) deficiency in the test this
      crash did not trigger on amd64.
      098d5017
    • Ben Gamari's avatar
      testsuite: Increase width of stack003 test · a7571a74
      Ben Gamari authored
      Previously the returned tuple seemed to fit in registers on amd64. This
      meant that non-moving collector bug would cause the test to fail on i386
      yet not amd64.
      a7571a74
    • Ben Gamari's avatar
      nonmoving: Fix handling on large object marking on 32-bit · eb7b233a
      Ben Gamari authored
      Previously we would reset the pointer pointing to the object to be
      marked to the beginning of the block when marking a large object. This
      did no harm on 64-bit but on 32-bit it broke, e.g. `arr020`, since we
      align pinned ByteArray allocations such that the payload is 8
      byte-aligned. This means that the object might not begin at the
      beginning of the block.,
      eb7b233a