Commits · 1d2bb9eb828f8daac552a9494746b6d6af60b515 · Naïm Favier / GHC


Dec 12, 2019
- Revert "rts: Drop redundant flags for libffi" · 7179b968
  Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
  
  This seems to have regressed builds using `--with-system-libffi` (#17520). This reverts commit 3ce18700.
  7179b968
Dec 11, 2019

rts: Specialize hashing at call site rather than in struct. · f80c4a66

Crazycolorz5 authored 6 years ago and

Marge Bot committed 5 years ago

Separate word and string hash tables on the type level, and do not store
the hashing function.  Thus when a different hash function is desired it
is provided upon accessing the table. In the worst case this is the same as
before the change, and in the majority of cases it is better. Also mark the
functions for aggressive inlining to improve performance.  {F1686506}

Reviewers: bgamari, erikd, simonmar

Subscribers: rwbarton, thomie, carter

GHC Trac Issues: #13165

Differential Revision: https://phabricator.haskell.org/D4889
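
The idea in this commit message can be illustrated with a minimal sketch. This is not the RTS code: `Table`, `WordTable`, `StrTable`, `find_slot` and the hash functions below are hypothetical names, and the fixed-size open-addressing table is an assumption made to keep the example short. The point is only that the table stores no function pointer, the hash and equality functions are passed by the caller, and thin typed wrappers let the compiler inline them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N_BUCKETS 64 /* hypothetical fixed capacity; a real table would grow */

typedef struct {
    uintptr_t   key;
    const void *data;
    int         used;
} Entry;

/* Note: no hash-function pointer is stored in the table itself. */
typedef struct { Entry slots[N_BUCKETS]; } Table;

/* Distinct wrapper types keep word-keyed and string-keyed tables apart. */
typedef struct { Table t; } WordTable;
typedef struct { Table t; } StrTable;

typedef size_t (*HashFn)(uintptr_t key);
typedef int    (*EqFn)(uintptr_t a, uintptr_t b);

static inline size_t hash_word(uintptr_t k) { return (size_t)(k * 2654435761u); }
static inline int    eq_word(uintptr_t a, uintptr_t b) { return a == b; }

static inline size_t hash_str(uintptr_t k) {
    size_t h = 5381;
    for (const unsigned char *s = (const unsigned char *)k; *s != 0; s++) {
        h = h * 33 + *s;
    }
    return h;
}
static inline int eq_str(uintptr_t a, uintptr_t b) {
    return strcmp((const char *)a, (const char *)b) == 0;
}

/* Generic core: the caller supplies hash and eq on every access.
 * Being static inline, it can be specialized at each call site. */
static inline Entry *find_slot(Table *t, uintptr_t key, HashFn hash, EqFn eq) {
    size_t i = hash(key) % N_BUCKETS;
    for (size_t n = 0; n < N_BUCKETS; n++) {
        Entry *e = &t->slots[i];
        if (!e->used || eq(e->key, key)) return e;
        i = (i + 1) % N_BUCKETS; /* linear probing */
    }
    return NULL; /* table full */
}

static inline void insert_with(Table *t, uintptr_t key, const void *data,
                               HashFn hash, EqFn eq) {
    Entry *e = find_slot(t, key, hash, eq);
    assert(e != NULL);
    e->key = key;
    e->data = data;
    e->used = 1;
}

static inline const void *lookup_with(Table *t, uintptr_t key,
                                      HashFn hash, EqFn eq) {
    Entry *e = find_slot(t, key, hash, eq);
    return (e != NULL && e->used) ? e->data : NULL;
}

/* Typed front ends: the hash function is fixed here, at the call site. */
void word_insert(WordTable *w, uintptr_t key, const void *data) {
    insert_with(&w->t, key, data, hash_word, eq_word);
}
const void *word_lookup(WordTable *w, uintptr_t key) {
    return lookup_with(&w->t, key, hash_word, eq_word);
}
void str_insert(StrTable *s, const char *key, const void *data) {
    insert_with(&s->t, (uintptr_t)key, data, hash_str, eq_str);
}
const void *str_lookup(StrTable *s, const char *key) {
    return lookup_with(&s->t, (uintptr_t)key, hash_str, eq_str);
}
```

Because `WordTable` and `StrTable` are different types, passing a string table to `word_lookup` is a compile-time error rather than a silent use of the wrong hash.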

f80c4a66

rts: Add a long form flag to enable the non-moving GC · 843ceb38
Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
```
The old flag, `-xn`, was quite cryptic. Here we add `--nonmoving-gc` in
addition.
```
843ceb38

Dec 09, 2019

Fix comment typos · d46a72e1

Gabor Greif authored 5 years ago

The below is only necessary to fix the CI perf fluke that
happened in 9897e8c8:
-------------------------
Metric Decrease:
    T5837
    T6048
    T9020
    T12425
    T12234
    T13035
    T12150
    Naperian
-------------------------

d46a72e1

Dec 05, 2019

rts/NonMovingSweep: Fix locking of new mutable list allocation · a7a4efbf

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

Previously we used allocBlockOnNode_sync in nonmovingSweepMutLists
despite the fact that we aren't in the GC and therefore the allocation
spinlock isn't in use. This meant that sweep would end up spinning until
the next minor GC, when the SM lock was moved away from the SM_MUTEX to
the spinlock. This isn't a correctness issue but it sure isn't good for
performance.

Found thanks to Ward.

Fixes #17539.

a7a4efbf

nonmoving: Clear segment bitmaps during sweep · 69001f54

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

Previously we would clear the bitmaps of segments which we are going to
sweep during the preparatory pause. However, this is unnecessary: the
existence of the mark epoch ensures that the sweep will correctly
identify non-reachable objects, even if we do not clear the bitmap.

We now defer clearing the bitmap to sweep, which happens concurrently
with mutation.
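
A rough sketch of why an epoch makes the early clearing unnecessary, under stated assumptions: each bitmap entry is taken to hold the epoch at which the block was last marked, and the epoch alternates between two non-zero values. `Segment`, `mark_epoch`, `begin_collection` and `sweep_segment` are hypothetical names chosen for illustration, not the RTS definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCKS_PER_SEGMENT 8 /* hypothetical, kept small for illustration */

typedef struct {
    /* 0 = never marked; otherwise the epoch in which the block was marked */
    uint8_t bitmap[BLOCKS_PER_SEGMENT];
    int     allocated[BLOCKS_PER_SEGMENT];
} Segment;

static uint8_t mark_epoch = 1;

/* Flip the epoch at the start of a collection; it is never 0. */
void begin_collection(void) {
    mark_epoch = (mark_epoch == 1) ? 2 : 1;
}

void mark_block(Segment *seg, size_t i) {
    assert(i < BLOCKS_PER_SEGMENT);
    seg->bitmap[i] = mark_epoch;
}

/* A block is live only if it was marked in the current epoch, so marks
 * left over from the previous cycle are harmless. They are cleared here,
 * during sweep, instead of in the preparatory pause.
 * Returns the number of blocks freed. */
size_t sweep_segment(Segment *seg) {
    size_t freed = 0;
    for (size_t i = 0; i < BLOCKS_PER_SEGMENT; i++) {
        if (seg->bitmap[i] == mark_epoch) continue; /* reachable this cycle */
        if (seg->allocated[i]) {
            seg->allocated[i] = 0;
            freed++;
        }
        seg->bitmap[i] = 0; /* deferred clearing of stale or empty entries */
    }
    return freed;
}
```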

69001f54

Dec 02, 2019
- Fix more typos · 717f3236
  Brian Wignall authored 5 years ago and Marge Bot committed 5 years ago
  
  717f3236
Nov 28, 2019
- Fix typos, using Wikipedia list of common typos · 3748ba3a
  Brian Wignall authored 5 years ago and Marge Bot committed 5 years ago
  
  3748ba3a
Nov 24, 2019
- Fix typos · 7b4c7b75
  Brian Wignall authored 5 years ago
  
  7b4c7b75
Nov 23, 2019

rts: Expose interface for configuring EventLogWriters · e43e6ece

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

This exposes a set of interfaces from the GHC API for configuring
EventLogWriters. These can be used by consumers like
[ghc-eventlog-socket](https://github.com/bgamari/ghc-eventlog-socket).

e43e6ece

Nov 20, 2019

Use pointer equality in Eq/Ord for ThreadId · d1f3c637

roland authored 5 years ago and

Marge Bot committed 5 years ago

Changes (==) to use only pointer equality. This is safe because two
threads are the same iff they have the same id.

Changes `compare` to check pointer equality first and fall back on ids
only in case of inequality.

See discussion in #16761.
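
The comparison strategy described above, sketched in C rather than Haskell. The `Thread` struct and the function names are hypothetical stand-ins; the sketch only assumes that each thread object carries a unique id.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a thread object with a unique id. */
typedef struct { uint64_t id; } Thread;

/* Equality: two references denote the same thread iff they point to the
 * same object, so the id never needs to be read. */
int thread_eq(const Thread *a, const Thread *b) {
    return a == b;
}

/* Ordering: pointer equality is the fast path; ids are only consulted
 * when the pointers differ. Returns -1, 0 or 1. */
int thread_compare(const Thread *a, const Thread *b) {
    if (a == b) return 0;
    assert(a->id != b->id); /* distinct threads never share an id */
    return (a->id < b->id) ? -1 : 1;
}
```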

d1f3c637

Changing Thread IDs from 32 bits to 64 bits. · e57b7cc6
roland authored 5 years ago and Marge Bot committed 5 years ago

e57b7cc6

Nov 19, 2019

nonmoving: Drop redundant write barrier on stack underflow · 098d5017

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

Previously we would push stack-carried return values to the new stack on
a stack overflow. While the precise reasoning for this barrier is
unfortunately lost to history, in hindsight I suspect it was prompted by
a missing barrier elsewhere (that has been since fixed).

Moreover, the redundant barrier is actively harmful: the stack may
contain non-pointer values; blindly pushing these to the mark queue will
result in a crash. This is precisely what happened in the `stack003`
test. However, because of a (now fixed) deficiency in the test this
crash did not trigger on amd64.

098d5017

nonmoving: Fix handling of large object marking on 32-bit · eb7b233a

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

Previously we would reset the pointer pointing to the object to be
marked to the beginning of the block when marking a large object. This
did no harm on 64-bit but on 32-bit it broke, e.g. `arr020`, since we
align pinned ByteArray allocations such that the payload is 8
byte-aligned. This means that the object might not begin at the
beginning of the block.

eb7b233a

nonmoving: Rework mark queue representation · 097f8072
Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
```
The previous representation needlessly limited the array length to
16-bits on 32-bit platforms.
```
097f8072

nonmoving: Fix incorrect masking in mark queue type test · deed8e31

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

We were using TAG_BITS instead of TAG_MASK. This happened to work on
64-bit platforms where TAG_BITS==3 since we only use tag values 0 and
3. However, this broke on 32-bit platforms where TAG_BITS==2.
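
The arithmetic behind this bug can be checked with a small sketch. The function names are hypothetical, and the number of tag bits is passed as a parameter so that both platforms can be compared in one program; the mask is taken to be `(1 << TAG_BITS) - 1`.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low tag_bits bits of a word. */
static inline uintptr_t tag_mask(unsigned tag_bits) {
    return ((uintptr_t)1 << tag_bits) - 1;
}

/* Buggy variant: uses the bit count itself as the mask. */
uintptr_t entry_tag_buggy(uintptr_t entry, unsigned tag_bits) {
    return entry & tag_bits;
}

/* Fixed variant: masks with (1 << tag_bits) - 1. */
uintptr_t entry_tag_fixed(uintptr_t entry, unsigned tag_bits) {
    return entry & tag_mask(tag_bits);
}
```

With three tag bits the buggy mask is `0b011`, which still recovers the tag values 0 and 3. With two tag bits it is `0b010`, so a tag of 3 is read back as 2.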

deed8e31

nonmoving: Use correct info table pointer accessor · c819c0e4

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

Previously we used INFO_PTR_TO_STRUCT instead of
THUNK_INFO_PTR_TO_STRUCT when looking at a thunk. These two happen to be
equivalent on 64-bit architectures due to alignment considerations;
however, they are different on 32-bit platforms. This led to #17487.

To fix this we also employ a small optimization: there is only one thunk
of type WHITEHOLE (namely stg_WHITEHOLE_info). Consequently, we can just
use a plain pointer comparison instead of testing against info->type.
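
A simplified sketch of the optimization, with hypothetical types and names (`InfoTable`, `Closure`, `whitehole_info`) in place of the RTS definitions. When exactly one info table has a given type, comparing the info pointer against its address gives the same answer as reading the type field, without dereferencing the info table at all.

```c
#include <assert.h>

typedef enum { KIND_THUNK, KIND_WHITEHOLE } Kind; /* hypothetical */

typedef struct { Kind type; } InfoTable;
typedef struct { const InfoTable *info; } Closure;

/* Assumption of the sketch: this is the only info table of this kind. */
static const InfoTable whitehole_info = { KIND_WHITEHOLE };
static const InfoTable thunk_info     = { KIND_THUNK };

/* Reads the info table; needs the right accessor for the closure kind. */
int is_whitehole_by_type(const Closure *c) {
    return c->info->type == KIND_WHITEHOLE;
}

/* Plain pointer comparison; never looks inside the info table. */
int is_whitehole_by_ptr(const Closure *c) {
    return c->info == &whitehole_info;
}
```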

c819c0e4

rts: Add missing include of SymbolExtras.h · 0418c38d
Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
```
This broke the Windows build.
```
0418c38d
Properly account for libdw paths in make build system · 2b27cc16
Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
```
Should finally fix #17255.
```
2b27cc16

Enable USE_PTHREAD_FOR_ITIMER also on FreeBSD · ec8a463d

vdukhovni authored 5 years ago and

Marge Bot committed 5 years ago

If using a pthread instead of a timer signal is more reliable, and
has no known drawbacks, then FreeBSD is also capable of supporting
this mode of operation (tested on FreeBSD 12 with GHC 8.8.1, but
no reason why it would not also work on FreeBSD 11 or GHC 8.6).

Proposed by Kevin Zhang in:

    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241849

ec8a463d

Nov 08, 2019

rts/nonmoving: Catch failure of createOSThread · 6e4656cc
Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago

6e4656cc
rts/NonMoving: Fix various Windows build issues · 23994738
Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
```
The Windows build seems to be stricter about not providing threading
primitives in the non-threaded RTS.
```
23994738
rts: Remove undesirable inline specifier · 0d141d28
Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
```
I have no idea why I marked this as inline originally but clearly it
shouldn't be inlined.
```
0d141d28

rts: Ensure that Rts.h is always included first · ae431cf4

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

In general this is the convention that we use in the RTS. On Windows
things actually fail if we break it. For instance, you see things like:

   includes\stg\Types.h:26:9: error:
     warning: #warning "Mismatch between __USE_MINGW_ANSI_STDIO
     definitions. If using Rts.h make sure it is the first header
     included." [-Wcpp]

ae431cf4

rts: Fix m32 allocator build on Windows · b1c158c9
Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
```
An inconsistency in the name of m32_allocator_flush caused the build to
fail with a missing prototype error.
```
b1c158c9

Nov 06, 2019
- configure: Add --with-libdw-{includes,libraries} flags · ce9e2a1a
  Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
  
  Fixing #17255.
  ce9e2a1a
- rts: Drop redundant flags for libffi · 3ce18700
  Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
  
  These are now handled in the cabal file's include-dirs field.
  3ce18700
Nov 05, 2019
- rts: Add missing const in HEAP_ALLOCED_GC · d57059f7
  Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
  
  This was previously unnoticed as this code-path is hit on very few platforms (e.g. OpenBSD).
  d57059f7
Nov 04, 2019

rts/linker: Ensure that code isn't writable · 120f2e53

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

For many years the linker would simply map all of its memory with
PROT_READ|PROT_WRITE|PROT_EXEC. However operating systems have been
becoming increasingly reluctant to accept this practice (e.g. #17353
and #12657) and for good reason: writable code is ripe for exploitation.

Consequently mmapForLinker now maps its memory with
PROT_READ|PROT_WRITE.  After the linker has finished filling/relocating
the mapping it must then call mmapForLinkerMarkExecutable on the
sections of the mapping which contain executable code.

Moreover, to make all of this possible it was necessary to redesign the
m32 allocator. First, we gave (in an earlier commit) each ObjectCode its
own m32_allocator. This was necessary since code loading and symbol
resolution/relocation are currently interleaved, meaning that it is not
possible to enforce W^X when symbols from different objects reside in
the same page.

We then redesigned the m32 allocator to take advantage of the fact that
all of the pages allocated with the allocator die at the same time
(namely, when the owning ObjectCode is unloaded). This makes a number of
things simpler (e.g. no more page reference counting; the interface
provided by the allocator for freeing is simpler). See
Note [M32 Allocator] for details.
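
The two-step protocol described above (map writable, then flip to executable once loading is done) can be sketched with plain POSIX calls. `map_writable` and `mark_executable` are hypothetical helpers written for this example, not the RTS functions named in the message, and the sketch assumes a platform that provides `MAP_ANONYMOUS`.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Map size bytes readable and writable, but not executable.
 * Returns NULL on failure. */
void *map_writable(size_t size) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Once the contents are final (loaded and relocated), drop write
 * permission and add execute permission, so the pages are never
 * writable and executable at the same time. Returns 0 on success. */
int mark_executable(void *start, size_t size) {
    return mprotect(start, size, PROT_READ | PROT_EXEC);
}
```

`mprotect` works on whole pages, which is why data that is still being written cannot share a page with code that has already been made executable.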

120f2e53

Nov 02, 2019
- Add +RTS --disable-delayed-os-memory-return. Fixes #17411. · 9980fb58
  Niklas Hambüchen authored 5 years ago and Marge Bot committed 5 years ago
  
  Sets `MiscFlags.disableDelayedOsMemoryReturn`. See the added `Note [MADV_FREE and MADV_DONTNEED]` for details.
  9980fb58
Nov 01, 2019

rts: Make m32 allocator per-ObjectCode · c6759080

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

MacOS Catalina is finally going to force our hand in forbidding writable
executable mappings. Unfortunately, this is quite incompatible with the
current global m32 allocator, which mixes symbols from various objects
in a single page. The problem here is that some of these symbols may not
yet be resolved (e.g. had relocations performed) as this happens lazily
(and therefore we can't yet make the section read-only and therefore
executable).

The easiest way around this is to simply create one m32 allocator per
ObjectCode. This may slightly increase fragmentation for short-running
programs but I suspect will actually improve fragmentation for programs
doing lots of loading/unloading since we can always free all of the
pages allocated to an object when it is unloaded (although this ability
will only be implemented in a later patch).

c6759080

mmap: Factor out protection flags · 70b62c97
Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago

70b62c97

Oct 30, 2019
- rts: More aarch64 header fixes · 93ff9197
  Ben Gamari authored 5 years ago and Marge Bot committed 5 years ago
  
  93ff9197
- Interpreter: initialize arity fields of AP_NOUPDs · 01ef3e1f
  Ömer Sinan Ağacan authored 5 years ago and Marge Bot committed 5 years ago
  
  AP_NOUPD entry code doesn't use the arity field, but not initializing this field confuses printers/debuggers, and also makes testing harder as the field's value changes randomly.
  01ef3e1f
Oct 26, 2019

rts: Fix ARM linker includes · 417f59d4

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

 * Prefer #pragma once over guard macros
 * Drop redundant #includes
 * Fix order to ensure that necessary macros are defined when we
   condition on them

417f59d4

Implement shrinkSmallMutableArray# and resizeSmallMutableArray#. · 8916e64e

Andrew Martin authored 5 years ago and

Marge Bot committed 5 years ago

This is a part of GHC Proposal #25: "Offer more array resizing primitives".
Resources related to the proposal:

  - Discussion: https://github.com/ghc-proposals/ghc-proposals/pull/121
  - Proposal: https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0025-resize-boxed.rst

Only shrinkSmallMutableArray# is implemented as a primop since a
library-space implementation of resizeSmallMutableArray# (in GHC.Exts)
is no less efficient than a primop would be. This may be replaced by
a primop in the future if someone devises a strategy for growing
arrays in-place. The library-space implementation always copies the
array when growing it.

This commit also tweaks the documentation of the deprecated
sizeofMutableByteArray#, removing the mention of concurrency. That
primop is unsound even in single-threaded applications. Additionally,
the non-negativity assertion on the existing shrinkMutableByteArray#
primop has been removed since this predicate is trivially always true.
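
The shrink-in-place and copy-on-grow behaviour described above, sketched in C as an analogy. `SmallArr`, `arr_new`, `arr_shrink` and `arr_resize` are hypothetical, and the fill argument for newly added slots is an assumption of the sketch. Since C has no garbage collector, the sketch frees the old array when growing; in Haskell the old array would simply become garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t size;    /* number of live elements */
    void  *elems[]; /* boxed elements, stored inline after the header */
} SmallArr;

SmallArr *arr_new(size_t n, void *fill) {
    SmallArr *a = malloc(sizeof(SmallArr) + n * sizeof(void *));
    assert(a != NULL);
    a->size = n;
    for (size_t i = 0; i < n; i++) a->elems[i] = fill;
    return a;
}

/* Shrinking only rewrites the size header; no elements move. */
void arr_shrink(SmallArr *a, size_t n) {
    assert(n <= a->size);
    a->size = n;
}

/* Resize built on top of shrink: shrink in place when possible,
 * otherwise allocate a larger array and copy the old elements over. */
SmallArr *arr_resize(SmallArr *a, size_t n, void *fill) {
    if (n <= a->size) {
        arr_shrink(a, n);
        return a;
    }
    SmallArr *b = arr_new(n, fill);
    memcpy(b->elems, a->elems, a->size * sizeof(void *));
    free(a); /* manual release; stands in for garbage collection */
    return b;
}
```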

8916e64e

Oct 25, 2019

configure: Drop GccLT46 · 519f5162

Ben Gamari authored 5 years ago and

Marge Bot committed 5 years ago

GCC 4.6 was released 7 years ago. I think we can finally assume that
it's available. This is a simplification prompted by #15742.

519f5162

Oct 23, 2019

Full abort on validate failure merging `orElse`. · 1f40e68a

ryates@cs.rochester.edu authored 5 years ago and

Marge Bot committed 5 years ago

Previously partial roll back of a branch of an `orElse` was attempted
if validation failure was observed.  Validation here, however, does
not account for what part of the transaction observed inconsistent
state.  This commit fixes this by fully aborting and restarting the
transaction.

1f40e68a

eventlog: Dump cost centre stack on each sample · 17987a4b

Matthew Pickering authored 5 years ago and

Marge Bot committed 5 years ago

With this change it is possible to reconstruct the timing portion of a
`.prof` file after the fact. By logging the stacks at each time point
a more precise execution trace of the program can be observed rather
than all identical cost centres being identified in the report.

There are two new events:

1. `EVENT_PROF_BEGIN` - emitted at the start of profiling to communicate
the tick interval
2. `EVENT_PROF_SAMPLE_COST_CENTRE` - emitted on each tick to communicate the
current call stack.

Fixes #17322

17987a4b

Refactor Compact.c: · b521e8b6

Ömer Sinan Ağacan authored 5 years ago and

Marge Bot committed 5 years ago

- Remove forward declarations
- Introduce UNTAG_PTR and GET_PTR_TAG for dealing with pointer tags
  without having to cast arguments to StgClosure*
- Remove dead code
- Use W_ instead of StgWord
- Use P_ instead of StgPtr

b521e8b6