- 11 Dec, 2019 1 commit
-
-
Micha Wiedenmann authored
-
- 09 Dec, 2019 1 commit
-
-
Gabor Greif authored
The below is only necessary to fix the CI perf fluke that happened in 9897e8c8: ------------------------- Metric Decrease: T5837 T6048 T9020 T12425 T12234 T13035 T12150 Naperian -------------------------
-
- 07 Dec, 2019 3 commits
-
-
Simon Peyton Jones authored
This patch implements the idea in #17515, splitting `coercionKind` into: * `coercion{Left,Right}Kind`, which computes the left/right side of the pair * `coercionKind`, which computes the pair of coercible types This is reduces allocation since we frequently only need only one side of the pair. Specifically, we see the following improvements on x86-64 Debian 9: | test | new | old | relative chg. | | :------- | ---------: | ------------: | ------------: | | T5030 | 695537752 | 747641152.0 | -6.97% | | T5321Fun | 449315744 | 474009040.0 | -5.21% | | T9872a | 2611071400 | 2645040952.0 | -1.28% | | T9872c | 2957097904 | 2994260264.0 | -1.24% | | T12227 | 773435072 | 812367768.0 | -4.79% | | T12545 | 3142687224 | 3215714752.0 | -2.27% | | T14683 | 9392407664 | 9824775000.0 | -4.40% | Metric Decrease: T12545 T9872a T14683 T5030 T12227 T9872c T5321Fun T9872b
-
Simon Peyton Jones authored
This is a preliminary patch for #17515
-
Gabor Greif authored
Formerly we punted on these and evaluated constructors always got a tag of 1. We now cascade switches because we have to check the tag first and when it is MAX_PTR_TAG then get the precise tag from the info table and switch on that. The only technically tricky part is that the default case needs (logical) duplication. To do this we emit an extra label for it and branch to that from the second switch. This avoids duplicated codegen. Here's a simple example of the new code gen: data D = D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 On a 64-bit system previously all constructors would be tagged 1. With the new code gen D7 and D8 are tagged 7: [Lib.D7_con_entry() { ... {offset c1eu: // global R1 = R1 + 7; call (P64[Sp])(R1) args: 8, res: 0, upd: 8; } }] [Lib.D8_con_entry() { ... {offset c1ez: // global R1 = R1 + 7; call (P64[Sp])(R1) args: 8, res: 0, upd: 8; } }] When switching we now look at the info table only when the tag is 7. For example, if we derive Enum for the type above, the Cmm looks like this: c2Le: _s2Js::P64 = R1; _c2Lq::P64 = _s2Js::P64 & 7; switch [1 .. 7] _c2Lq::P64 { case 1 : goto c2Lk; case 2 : goto c2Ll; case 3 : goto c2Lm; case 4 : goto c2Ln; case 5 : goto c2Lo; case 6 : goto c2Lp; case 7 : goto c2Lj; } // Read info table for tag c2Lj: _c2Lv::I64 = %MO_UU_Conv_W32_W64(I32[I64[_s2Js::P64 & (-8)] - 4]); if (_c2Lv::I64 != 6) goto c2Lu; else goto c2Lt; Generated Cmm sizes do not change too much, but binaries are very slightly larger, due to the fact that the new instructions are longer in encoded form. E.g. previously entry code for D8 above would be 00000000000001c0 <Lib_D8_con_info>: 1c0: 48 ff c3 inc %rbx 1c3: ff 65 00 jmpq *0x0(%rbp) With this patch 00000000000001d0 <Lib_D8_con_info>: 1d0: 48 83 c3 07 add $0x7,%rbx 1d4: ff 65 00 jmpq *0x0(%rbp) This is one byte longer. Secondly, reading info table directly and then switching is shorter _c1co: movq -1(%rbx),%rax movl -4(%rax),%eax // Switch on info table tag jmp *_n1d5(,%rax,8) than doing the same switch, and then for the tag 7 doing another switch: // When tag is 7 _c1ct: andq $-8,%rbx movq (%rbx),%rax movl -4(%rax),%eax // Switch on info table tag ... Some changes of binary sizes in actual programs: - In NoFib the worst case is 0.1% increase in benchmark "parser" (see NoFib results below). All programs get slightly larger. - Stage 2 compiler size does not change. - In "containers" (the library) size of all object files increases 0.0005%. Size of the test program "bitqueue-properties" increases 0.03%. nofib benchmarks kindly provided by Ömer (@osa1): NoFib Results ============= -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- CS +0.0% 0.0% -0.0% -0.0% -0.0% CSD +0.0% 0.0% 0.0% +0.0% +0.0% FS +0.0% 0.0% 0.0% +0.0% 0.0% S +0.0% 0.0% -0.0% 0.0% 0.0% VS +0.0% 0.0% -0.0% +0.0% +0.0% VSD +0.0% 0.0% -0.0% +0.0% -0.0% VSM +0.0% 0.0% 0.0% 0.0% 0.0% anna +0.0% 0.0% +0.1% -0.9% -0.0% ansi +0.0% 0.0% -0.0% +0.0% +0.0% atom +0.0% 0.0% 0.0% 0.0% 0.0% awards +0.0% 0.0% -0.0% +0.0% 0.0% banner +0.0% 0.0% -0.0% +0.0% 0.0% bernouilli +0.0% 0.0% +0.0% +0.0% +0.0% binary-trees +0.0% 0.0% -0.0% -0.0% -0.0% boyer +0.0% 0.0% +0.0% 0.0% -0.0% boyer2 +0.0% 0.0% +0.0% 0.0% -0.0% bspt +0.0% 0.0% +0.0% +0.0% 0.0% cacheprof +0.0% 0.0% +0.1% -0.8% 0.0% calendar +0.0% 0.0% -0.0% +0.0% -0.0% cichelli +0.0% 0.0% +0.0% 0.0% 0.0% circsim +0.0% 0.0% -0.0% -0.1% -0.0% clausify +0.0% 0.0% +0.0% +0.0% 0.0% comp_lab_zift +0.0% 0.0% +0.0% 0.0% -0.0% compress +0.0% 0.0% +0.0% +0.0% 0.0% compress2 +0.0% 0.0% 0.0% 0.0% 0.0% constraints +0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm1 +0.0% 0.0% +0.0% 0.0% 0.0% cryptarithm2 +0.0% 0.0% +0.0% -0.0% 0.0% cse +0.0% 0.0% +0.0% +0.0% 0.0% digits-of-e1 +0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e2 +0.0% 0.0% +0.0% -0.0% -0.0% dom-lt +0.0% 0.0% +0.0% +0.0% 0.0% eliza +0.0% 0.0% -0.0% +0.0% 0.0% event +0.0% 0.0% -0.0% -0.0% -0.0% exact-reals +0.0% 0.0% +0.0% +0.0% +0.0% exp3_8 +0.0% 0.0% -0.0% -0.0% -0.0% expert +0.0% 0.0% +0.0% +0.0% +0.0% fannkuch-redux +0.0% 0.0% +0.0% 0.0% 0.0% fasta +0.0% 0.0% -0.0% -0.0% -0.0% fem +0.0% 0.0% +0.0% +0.0% +0.0% fft +0.0% 0.0% +0.0% -0.0% -0.0% fft2 +0.0% 0.0% +0.0% +0.0% +0.0% fibheaps +0.0% 0.0% +0.0% +0.0% 0.0% fish +0.0% 0.0% +0.0% +0.0% 0.0% fluid +0.0% 0.0% +0.0% +0.0% +0.0% fulsom +0.0% 0.0% +0.0% -0.0% +0.0% gamteb +0.0% 0.0% +0.0% -0.0% -0.0% gcd +0.0% 0.0% +0.0% +0.0% 0.0% gen_regexps +0.0% 0.0% +0.0% -0.0% -0.0% genfft +0.0% 0.0% -0.0% -0.0% -0.0% gg +0.0% 0.0% 0.0% -0.0% 0.0% grep +0.0% 0.0% +0.0% +0.0% +0.0% hidden +0.0% 0.0% +0.0% -0.0% -0.0% hpg +0.0% 0.0% +0.0% -0.1% -0.0% ida +0.0% 0.0% +0.0% -0.0% -0.0% infer +0.0% 0.0% -0.0% -0.0% -0.0% integer +0.0% 0.0% -0.0% -0.0% -0.0% integrate +0.0% 0.0% 0.0% +0.0% 0.0% k-nucleotide +0.0% 0.0% -0.0% -0.0% -0.0% kahan +0.0% 0.0% -0.0% -0.0% -0.0% knights +0.0% 0.0% +0.0% -0.0% -0.0% lambda +0.0% 0.0% +1.2% -6.1% -0.0% last-piece +0.0% 0.0% +0.0% -0.0% -0.0% lcss +0.0% 0.0% +0.0% -0.0% -0.0% life +0.0% 0.0% +0.0% -0.0% -0.0% lift +0.0% 0.0% +0.0% +0.0% 0.0% linear +0.0% 0.0% +0.0% +0.0% +0.0% listcompr +0.0% 0.0% -0.0% -0.0% -0.0% listcopy +0.0% 0.0% -0.0% -0.0% -0.0% maillist +0.0% 0.0% +0.0% -0.0% -0.0% mandel +0.0% 0.0% +0.0% +0.0% +0.0% mandel2 +0.0% 0.0% +0.0% +0.0% -0.0% mate +0.0% 0.0% +0.0% +0.0% +0.0% minimax +0.0% 0.0% -0.0% +0.0% -0.0% mkhprog +0.0% 0.0% +0.0% +0.0% +0.0% multiplier +0.0% 0.0% 0.0% +0.0% -0.0% n-body +0.0% 0.0% +0.0% -0.0% -0.0% nucleic2 +0.0% 0.0% +0.0% +0.0% -0.0% para +0.0% 0.0% +0.0% +0.0% +0.0% paraffins +0.0% 0.0% +0.0% +0.0% +0.0% parser +0.1% 0.0% +0.4% -1.7% -0.0% parstof +0.0% 0.0% -0.0% -0.0% -0.0% pic +0.0% 0.0% +0.0% 0.0% -0.0% pidigits +0.0% 0.0% -0.0% -0.0% -0.0% power +0.0% 0.0% +0.0% -0.0% -0.0% pretty +0.0% 0.0% +0.0% +0.0% +0.0% primes +0.0% 0.0% +0.0% 0.0% 0.0% primetest +0.0% 0.0% +0.0% +0.0% +0.0% prolog +0.0% 0.0% +0.0% +0.0% +0.0% puzzle +0.0% 0.0% +0.0% +0.0% +0.0% queens +0.0% 0.0% 0.0% +0.0% +0.0% reptile +0.0% 0.0% +0.0% +0.0% 0.0% reverse-complem +0.0% 0.0% -0.0% -0.0% -0.0% rewrite +0.0% 0.0% +0.0% 0.0% -0.0% rfib +0.0% 0.0% +0.0% +0.0% +0.0% rsa +0.0% 0.0% +0.0% +0.0% +0.0% scc +0.0% 0.0% +0.0% +0.0% +0.0% sched +0.0% 0.0% +0.0% +0.0% +0.0% scs +0.0% 0.0% +0.0% +0.0% 0.0% simple +0.0% 0.0% +0.0% +0.0% +0.0% solid +0.0% 0.0% +0.0% +0.0% 0.0% sorting +0.0% 0.0% +0.0% -0.0% 0.0% spectral-norm +0.0% 0.0% -0.0% -0.0% -0.0% sphere +0.0% 0.0% +0.0% -1.0% 0.0% symalg +0.0% 0.0% +0.0% +0.0% +0.0% tak +0.0% 0.0% +0.0% +0.0% +0.0% transform +0.0% 0.0% +0.4% -1.3% +0.0% treejoin +0.0% 0.0% +0.0% -0.0% 0.0% typecheck +0.0% 0.0% -0.0% +0.0% 0.0% veritas +0.0% 0.0% +0.0% -0.1% +0.0% wang +0.0% 0.0% +0.0% +0.0% +0.0% wave4main +0.0% 0.0% +0.0% 0.0% -0.0% wheel-sieve1 +0.0% 0.0% +0.0% +0.0% +0.0% wheel-sieve2 +0.0% 0.0% +0.0% +0.0% 0.0% x2n1 +0.0% 0.0% +0.0% +0.0% 0.0% -------------------------------------------------------------------------------- Min +0.0% 0.0% -0.0% -6.1% -0.0% Max +0.1% 0.0% +1.2% +0.0% +0.0% Geometric Mean +0.0% -0.0% +0.0% -0.1% -0.0% NoFib GC Results ================ -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- circsim +0.0% 0.0% -0.0% -0.0% -0.0% constraints +0.0% 0.0% -0.0% 0.0% -0.0% fibheaps +0.0% 0.0% 0.0% -0.0% -0.0% fulsom +0.0% 0.0% 0.0% -0.6% -0.0% gc_bench +0.0% 0.0% 0.0% 0.0% -0.0% hash +0.0% 0.0% -0.0% -0.0% -0.0% lcss +0.0% 0.0% 0.0% -0.0% 0.0% mutstore1 +0.0% 0.0% 0.0% -0.0% -0.0% mutstore2 +0.0% 0.0% +0.0% -0.0% -0.0% power +0.0% 0.0% -0.0% 0.0% -0.0% spellcheck +0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min +0.0% 0.0% -0.0% -0.6% -0.0% Max +0.0% 0.0% +0.0% 0.0% 0.0% Geometric Mean +0.0% +0.0% +0.0% -0.1% +0.0% Fixes #14373 These performance regressions appear to be a fluke in CI. See the discussion in !1742 for details. Metric Increase: T6048 T12234 T12425 Naperian T12150 T5837 T13035
-
- 05 Dec, 2019 14 commits
-
-
Matthias Braun authored
-
Ben Gamari authored
Previously we used allocBlockOnNode_sync in nonmovingSweepMutLists despite the fact that we aren't in the GC and therefore the allocation spinlock isn't in use. This meant that sweep would end up spinning until the next minor GC, when the SM lock was moved away from the SM_MUTEX to the spinlock. This isn't a correctness issue but it sure isn't good for performance. Found thanks for Ward. Fixes #17539.
-
Vladislav Zavialov authored
-
Vladislav Zavialov authored
Before this patch, GHC always printed the * kind unparenthesized. This led to two issues: 1. Sometimes GHC printed invalid or incorrect code. For example, GHC would print: type F @* x = x when it meant to print: type F @(*) x = x In the former case, instead of a kind application we were getting a type operator (@*). 2. Sometimes GHC printed kinds that were correct but hard to read. Should Either * Int be read as Either (*) Int or as (*) Either Int ? This depends on whether -XStarIsType is enabled, but it would be easier if we didn't have to check for the flag when reading the code. We can solve both problems by assigning (*) a different precedence. Note that Haskell98 kinds are not affected: ((* -> *) -> *) -> * does NOT become (((*) -> (*)) -> (*)) -> (*) The parentheses are added when (*) is used in a function argument position: F * * * becomes F (*) (*) (*) F A * B becomes F A (*) B Proxy * becomes Proxy (*) a * -> * becomes a (*) -> *
-
Ben Gamari authored
-
Ben Gamari authored
Along with some refactoring.
-
Ben Gamari authored
-
Ben Gamari authored
The LLVM backend does not guarantee any particular semantics for division by zero, making this test unreliable across platforms.
-
Ben Gamari authored
Previously we would clear the bitmaps of segments which we are going to sweep during the preparatory pause. However, this is unnecessary: the existence of the mark epoch ensures that the sweep will correctly identify non-reachable objects, even if we do not clear the bitmap. We now defer clearing the bitmap to sweep, which happens concurrently with mutation.
-
Bodigrim authored
-
Bodigrim authored
-
Ben Gamari authored
We use two tricks, as suggested here [1]: * Use microtype to try to reduce the incidence of underfull boxes * Bump up \hbadness to eliminate the warnings
-
Vladislav Zavialov authored
-
Vladislav Zavialov authored
This patch implements GHC Proposal #176: https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0176-scc-parsing.rst Before the change: 1 / 2 / 2 = 0.25 1 / {-# SCC "name" #-} 2 / 2 = 1.0 After the change: 1 / 2 / 2 = 0.25 1 / {-# SCC "name" #-} 2 / 2 = parse error
-
- 04 Dec, 2019 4 commits
-
-
Ben Gamari authored
Previously we would use the `.int` assembler directive to generate 32-bit words in the note section. However, `.int` is note guaranteed to produce 4-bytes; in fact, on some platforms (e.g. AArch64) it produces 8-bytes. Use the `.4bytes` directive to avoid this. Moreover, we used the `.align` directive, which is quite platform dependent. On AArch64 it appears to not even be idempotent (despite what the documentation claims). `.balign` is consequentially preferred as it offers consistent behavior across platforms.
-
Ben Gamari authored
This does two things: * Eliminate all uses of Unique.deriveUnique, which was quite easy to mis-use and extremely subtle. * Rename the previous "derived unique" notion to "local unique". This is possible because the only places where `uniqAway` can be safely used are those where local uniqueness (with respect to some InScopeSet) is sufficient. * Rework the implementation of VarEnv.uniqAway, as discussed in #17462. This should make the operation significantly more efficient than its previous iterative implementation.. Metric Decrease: T9872c T12227 T9233 T14683 T5030 T12545 hie002 Metric Increase: T9961
-
Ben Gamari authored
This relied on deriveUnique, which was far too subtle to be safely applied. Thankfully the instance doesn't appear to be used so let's just drop it.
-
Ben Gamari authored
-
- 03 Dec, 2019 6 commits
-
-
Sylvain Henry authored
narrowN (x .&. m) m .&. (2^N-1) = 2^N-1 ==> narrowN x e.g. narrow16 (x .&. 0x12FFFF) ==> narrow16 x
-
Sylvain Henry authored
-
Ben Gamari authored
In #17424 Simon PJ noted that there is a potentially unsafe occurrence of unsafeCoerce#, coercing from an unlifted to lifted type. However, nowhere in the compiler do we assume that a BCO# is not a thunk. Moreover, in the case of a CAF the result returned by `createBCO` *will* be a thunk (as noted in [Updatable CAF BCOs]). Consequently it seems better to rather make BCO# a lifted type and rename it to BCO.
-
Athas authored
-
Sylvain Henry authored
-
Sylvain Henry authored
-
- 02 Dec, 2019 11 commits
-
-
Sergei Trofimovich authored
Be more precise at detecting cross-compilation case. Before the change configuration $ ./configure --host=x86_64-pc-linux-gnu --target=x86_64-gentoo-linux-musl was not considered a cross-target. Even though libcs are different (`glibc` vs. `musl`). Without this patch build fails as: ``` "inplace/bin/ghc-cabal" check libraries/integer-gmp "inplace/bin/ghc-cabal" configure libraries/integer-gmp dist-install \ --with-ghc="/home/slyfox/dev/git/ghc/inplace/bin/ghc-stage1" \ --with-ghc-pkg="/home/slyfox/dev/git/ghc/inplace/bin/ghc-pkg" \ --disable-library-for-ghci --enable-library-vanilla --enable-library-for-ghci \ --enable-library-profiling --enable-shared --with-hscolour="/usr/bin/HsColour" \ --configure-option=CFLAGS="-Wall \ -Werror=unused-but-set-variable -Wno-error=inline \ -iquote /home/slyfox/dev/git/ghc/libraries/integer-gmp" \ --configure-option=LDFLAGS=" " --configure-option=CPPFLAGS=" \ " --gcc-options="-Wall -Werror=unused-but-set-variable -Wno-error=inline -iquote /home/slyfox/dev/git/ghc/libraries/integer-gmp \ " --with-gcc="x86_64-gentoo-linux-musl-gcc" --with-ld="x86_64-gentoo-linux-musl-ld.gold" --with-ar="x86_64-gentoo-linux-musl-ar" \ --with-alex="/usr/bin/alex" --with-happy="/usr/bin/happy" Configuring integer-gmp-1.0.2.0... configure: WARNING: unrecognized options: --with-compiler checking build system type... x86_64-pc-linux-gnu checking host system type... x86_64-pc-linux-gnu checking target system type... x86_64-pc-linux-gnu checking for gcc... /usr/lib/ccache/bin/x86_64-gentoo-linux-musl-gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... configure: error: in `/home/slyfox/dev/git/ghc/libraries/integer-gmp/dist-install/build': configure: error: cannot run C compiled programs. If you meant to cross compile, use `--host'. See `config.log' for more details make[1]: *** [libraries/integer-gmp/ghc.mk:5: libraries/integer-gmp/dist-install/package-data.mk] Error 1 make: *** [Makefile:126: all] Error 2 ``` Note: here `ghc-stage1` is assumed to target `musl` target but is passed `glibc` toolchain. It happens because initial ./configure phase did not detect host/target as different. Signed-off-by:
Sergei Trofimovich <slyfox@gentoo.org>
-
Ben Gamari authored
(cherry picked from commit 67b5de88ef923971f1980335137e3c7193213abd)
-
Ben Gamari authored
(cherry picked from commit cbedb3c4a90649f474cb716842ba53afc5a642ca)
-
Ben Gamari authored
(cherry picked from commit d092d859)
-
Ben Gamari authored
Apparently variable interpolation in the `artifacts.paths` key of `gitlab-ci.yml` doesn't work on Windows as it does on WIndows. (cherry picked from commit 100cc756)
-
Ben Gamari authored
-
Ben Gamari authored
-
Ben Gamari authored
It otherwise seems to default to ascii
-
Ben Gamari authored
Fall back to xz
-
Ben Gamari authored
Installs pxz on Centos7 (cherry picked from commit 86960e69)
-
Ben Gamari authored
Previously we were including <sys/unistd.h> which is available on glibc but not musl. (cherry picked from commit e44b695c)
-