- Apr 05, 2024
-
-
For many, many years `GHCForeignImportPrim` has suffered from the rather restrictive limitation of not allowing any non-trivial types in arguments or results. This limitation was justified by the code generator allegedly barfing in the presence of such types. However, this restriction appears to originate well before the NCG rewrite, and the new NCG does not appear to have any trouble with such types (see the added `T24598` test). Lift this restriction. Fixes #24598.
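As a rough illustration of what this enables, here is a sketch of a `foreign import prim` declaration using a richer unlifted type than the old check was comfortable with. It is illustrative only: the Cmm symbol `stg_swapPairzh` and the `UPair` type are made up, the exact set of newly accepted types is not spelled out here, and the declaration would need a matching Cmm definition to link.

    {-# LANGUAGE GHCForeignImportPrim, UnliftedFFITypes, UnliftedDatatypes,
                 StandaloneKindSignatures, MagicHash #-}
    module PrimImportSketch where

    import GHC.Exts

    -- A user-defined boxed but unlifted type: the kind of "non-trivial"
    -- argument/result type the old restriction ruled out of caution.
    type UPair :: UnliftedType
    data UPair = UPair Int# Int#

    -- Hypothetical Cmm entry point; a declaration along these lines is the
    -- sort of thing admitted now that the restriction is lifted.
    foreign import prim "stg_swapPairzh"
      swapPair# :: UPair -> UPair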
-
- Apr 04, 2024
-
-
-
Before this patch:

    data ArgPat p
      = InvisPat (LHsType p)
      | VisPat (LPat p)

With this patch:

    data Pat p
      = ...
      | InvisPat (LHsType p)
      ...

And the same transformation in TH land. The rest of the changes are just updating code to handle the new AST and writing tests to check whether it is possible to create invalid states using TH.

Metric Increase: MultiLayerModulesTH_OneShot
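For context, here is a sketch of the kind of source syntax that gives rise to an `InvisPat` (illustrative only; it assumes a GHC with `TypeAbstractions` support for invisible patterns in function equations, and the function is made up):

    {-# LANGUAGE TypeAbstractions #-}
    module InvisPatSketch where

    -- The `@a` on the left-hand side is an invisible pattern binding a type
    -- variable; after this patch it is represented as an ordinary constructor
    -- of Pat rather than via a separate ArgPat wrapper.
    konst :: forall a b. a -> b -> a
    konst @a x _ = (x :: a)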
-
`Array` contains three additional `Word`s that we do not need in `FlatBag`. Move `FlatBag` to `SmallArray`. Extend the API of `SmallArray` with `sizeofSmallArray` and add common traversal functions, such as `mapSmallArray` and `foldMapSmallArray`. Additionally, allow users to force the elements of a `SmallArray` via `rnfSmallArray`.
-
Linked lists are notoriously memory-inefficient when all we do is traverse a structure. As 'UnlinkedBCO' has been identified as a data structure that impacts the overall memory usage of GHCi sessions, we avoid linked lists and prefer a flattened structure for storage. We introduce a new memory-efficient representation of sequential elements that has special support for the cases:

* Empty
* Singleton
* Tuple Elements

This improves sharing in the 'Empty' case and avoids the overhead of 'Array' until its constant overhead is justified.
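A minimal sketch of the representation idea (not GHC's actual definition: here `SmallArray` and `sizeofSmallArray` from the primitive package stand in for GHC's internal wrapper around `SmallArray#`, and the constructor names are made up):

    module FlatBagSketch where

    import Data.Primitive.SmallArray (SmallArray, sizeofSmallArray)

    -- Small shapes get dedicated constructors, so they pay no array overhead;
    -- the shared Empty constructor costs no allocation per use.
    data FlatBag a
      = EmptyFlatBag
      | UnitFlatBag !a
      | TupleFlatBag !a !a
      | FlatBag !(SmallArray a)

    sizeFlatBag :: FlatBag a -> Int
    sizeFlatBag EmptyFlatBag       = 0
    sizeFlatBag (UnitFlatBag _)    = 1
    sizeFlatBag (TupleFlatBag _ _) = 2
    sizeFlatBag (FlatBag arr)      = sizeofSmallArray arr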
-
-
-
- Apr 03, 2024
-
-
This fixes #24582, a small but long-standing bug
-
-
This MR started as: allow the simplifier to do more in one pass, arising from places I could see the simplifier taking two iterations where one would do. But it turned into a larger project, because these changes unexpectedly made inlining blow up, especially for join points in deeply-nested cases. The main changes are below. There are also many new or rewritten Notes.

Avoiding simplifying repeatedly
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
See Note [Avoiding simplifying repeatedly].

* The SimplEnv now has a seInlineDepth field, which says how deep in unfoldings we are. See Note [Inline depth] in Simplify.Env. Currently used only for the next point: avoiding repeatedly simplifying coercions.

* Avoid repeatedly simplifying coercions. See Note [Avoid re-simplifying coercions] in Simplify.Iteration. As you'll see from the Note, this makes use of the seInlineDepth.

* Allow Simplify.Iteration.simplAuxBind to inline used-once things. This is another part of Note [Post-inline for single-use things], and is really good for reducing simplifier iterations in situations like

      case K e of { K x -> blah }

  where x is used once in blah. (A small source-level sketch of this shape follows this entry.)

* Make GHC.Core.SimpleOpt.exprIsConApp_maybe do some simple case elimination. See Note [Case elim in exprIsConApp_maybe].

* Improve the case-merge transformation:
  - Move the main code to GHC.Core.Utils.mergeCaseAlts, to join filterAlts and friends. See Note [Merge Nested Cases] in GHC.Core.Utils.
  - Add a new case for tagToEnum#; see wrinkle (MC3).
  - Add a new case to look through join points: see wrinkle (MC4).

postInlineUnconditionally
~~~~~~~~~~~~~~~~~~~~~~~~~
* Allow Simplify.Utils.postInlineUnconditionally to inline variables that are used exactly once. See Note [Post-inline for single-use things].

* Do not postInlineUnconditionally join points, ever. Doing so does not reduce allocation, which is the main point, and with join points that are used a lot it can bloat code. See point (1) of Note [Duplicating join points] in GHC.Core.Opt.Simplify.Iteration.

* Do not postInlineUnconditionally a strict (demanded) binding. It will not allocate a thunk (it'll turn into a case instead), so again the main point of inlining it doesn't hold. Better to check per call site.

* Improve occurrence analysis for bottoming function calls, to help postInlineUnconditionally. See Note [Bottoming function calls] in GHC.Core.Opt.OccurAnal.

Inlining generally
~~~~~~~~~~~~~~~~~~
* In GHC.Core.Opt.Simplify.Utils.interestingCallContext, use RhsCtxt NonRecursive (not BoringCtxt) for a plain-seq case. See Note [Seq is boring]. Also, per wrinkle (SB1), inline in that `seq` context only for INLINE functions (UnfWhen guidance).

* In GHC.Core.Opt.Simplify.Utils.interestingArg,
  - return ValueArg for OtherCon [c1,c2,...], but
  - return NonTrivArg for OtherCon [].
  This makes a function a little less likely to inline if all we know is that the argument is evaluated, but nothing else.

* isConLikeUnfolding is no longer true for OtherCon {}. This propagates to exprIsConLike. Con-like-ness has /positive/ information.

Join points
~~~~~~~~~~~
* Be very careful about inlining join points. See these two long Notes:
      Note [Duplicating join points] in GHC.Core.Opt.Simplify.Iteration
      Note [Inlining join points] in GHC.Core.Opt.Simplify.Inline

* When making join points, don't do so if the join point is so small it will immediately be inlined; check uncondInlineJoin.

* In GHC.Core.Opt.Simplify.Inline.tryUnfolding, improve the inlining heuristics for join points. In general we /do not/ want to inline join points /even if they are small/. See Note [Duplicating join points] in GHC.Core.Opt.Simplify.Iteration. But sometimes we do: see Note [Inlining join points] in GHC.Core.Opt.Simplify.Inline; and the new isBetterUnfoldingThan function.

* Do not add an unfolding to a join point at birth. This is a tricky one and has a long Note [Do not add unfoldings to join points at birth]. It shows up in two places:
  - In mkDupableAlt, do not add an inlining.
  - (Trickier) In simplLetUnfolding, don't add an unfolding for a fresh join point.
  I am not fully satisfied with this, but it works and is well documented.

* In GHC.Core.Unfold.sizeExpr, make jumps small, so that we don't penalise having a non-inlined join point.

Performance changes
~~~~~~~~~~~~~~~~~~~
* Binary sizes fall by around 2.6%, according to nofib.

* Compile times improve slightly. Here are the figures over 1%. I investigated the biggest difference, in T18304. It's a very small module, just a few hundred nodes. The large percentage difference is due to a single function that didn't quite inline before, and does now, making code size a bit bigger. I decided the gains outweighed the losses.

  Metrics: compile_time/bytes allocated (changes over +/- 1%)
  ------------------------------------------------
      CoOpt_Singletons(normal)               -9.2% GOOD
      LargeRecord(normal)                   -23.5% GOOD
      MultiComponentModulesRecomp(normal)    +1.2%
      MultiLayerModulesTH_OneShot(normal)    +4.1% BAD
      PmSeriesS(normal)                      -3.8%
      PmSeriesV(normal)                      -1.5%
      T11195(normal)                         -1.3%
      T12227(normal)                        -20.4% GOOD
      T12545(normal)                         -3.2%
      T12707(normal)                         -2.1% GOOD
      T13253(normal)                         -1.2%
      T13253-spj(normal)                     +8.1% BAD
      T13386(normal)                         -3.1% GOOD
      T14766(normal)                         -2.6% GOOD
      T15164(normal)                         -1.4%
      T15304(normal)                         +1.2%
      T15630(normal)                         -8.2%
      T15630a(normal)                          NEW
      T15703(normal)                        -14.7% GOOD
      T16577(normal)                         -2.3% GOOD
      T17516(normal)                        -39.7% GOOD
      T18140(normal)                         +1.2%
      T18223(normal)                        -17.1% GOOD
      T18282(normal)                         -5.0% GOOD
      T18304(normal)                        +10.8% BAD
      T18923(normal)                         -2.9% GOOD
      T1969(normal)                          +1.0%
      T19695(normal)                         -1.5%
      T20049(normal)                        -12.7% GOOD
      T21839c(normal)                        -4.1% GOOD
      T3064(normal)                          -1.5%
      T3294(normal)                          +1.2% BAD
      T4801(normal)                          +1.2%
      T5030(normal)                         -15.2% GOOD
      T5321Fun(normal)                       -2.2% GOOD
      T6048(optasm)                         -16.8% GOOD
      T783(normal)                           -1.2%
      T8095(normal)                          -6.0% GOOD
      T9630(normal)                          -4.7% GOOD
      T9961(normal)                          +1.9% BAD
      WWRec(normal)                          -1.4%
      info_table_map_perf(normal)            -1.3%
      parsing001(normal)                     +1.5%

      geo. mean                              -2.0%
      minimum                               -39.7%
      maximum                               +10.8%

* Runtimes generally improve. In the testsuite, perf/should_run gives:

  Metrics: runtime/bytes allocated
  ------------------------------------------
      Conversions(normal)                    -0.3%
      T13536a(optasm)                       -41.7% GOOD
      T4830(normal)                          -0.1%
      haddock.Cabal(normal)                  -0.1%
      haddock.base(normal)                   -0.1%
      haddock.compiler(normal)               -0.1%

      geo. mean                              -0.8%
      minimum                               -41.7%
      maximum                                +0.0%

* For runtime, nofib is a better test. The news is mostly good. Here are the numbers more than +/- 0.1%:

      # bytes allocated
      ==========================++==========
      imaginary/digits-of-e1    ||  -14.40%
      imaginary/digits-of-e2    ||   -4.41%
      imaginary/paraffins       ||   -0.17%
      imaginary/rfib            ||   -0.15%
      imaginary/wheel-sieve2    ||   -0.10%
      real/compress             ||   -0.47%
      real/fluid                ||   -0.10%
      real/fulsom               ||   +0.14%
      real/gamteb               ||   -1.47%
      real/gg                   ||   -0.20%
      real/infer                ||   +0.24%
      real/pic                  ||   -0.23%
      real/prolog               ||   -0.36%
      real/scs                  ||   -0.46%
      real/smallpt              ||   +4.03%
      shootout/k-nucleotide     ||  -20.23%
      shootout/n-body           ||   -0.42%
      shootout/spectral-norm    ||   -0.13%
      spectral/boyer2           ||   -3.80%
      spectral/constraints      ||   -0.27%
      spectral/hartel/ida       ||   -0.82%
      spectral/mate             ||  -20.34%
      spectral/para             ||   +0.46%
      spectral/rewrite          ||   +1.30%
      spectral/sphere           ||   -0.14%
      ==========================++==========
      geom mean                 ||   -0.59%

  real/smallpt has a huge nest of local definitions, and I could not pin down a reason for a regression. But there are three big wins!

Metric Decrease:
    CoOpt_Singletons
    LargeRecord
    T12227
    T12707
    T13386
    T13536a
    T14766
    T15703
    T16577
    T17516
    T18223
    T18282
    T18923
    T21839c
    T20049
    T5321Fun
    T5030
    T6048
    T8095
    T9630
    T783
Metric Increase:
    MultiLayerModulesTH_OneShot
    T13253-spj
    T18304
    T18698a
    T9961
    T3294
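To make the simplAuxBind point above concrete, here is a minimal source-level sketch of the "used once" shape (illustrative only: the interesting transformation happens on Core, and the datatype and function here are made up for the example):

    module SingleUseSketch where

    data K = K Int

    -- Case-of-known-constructor binds x to e; because x occurs exactly once,
    -- the simplifier can now substitute it away in the same pass instead of
    -- needing a further iteration.
    f :: Int -> Int
    f e = case K e of
            K x -> x + 1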
-
Pure refactoring
-
This is a pure refactor
-
Ensure that WorkWrap preserves lambda binders, in the case of join points. Sadly I have forgotten why I made this change (it was while I was doing a lot of meddling in the Simplifier), but:

* it does no harm,
* it is slightly more efficient, and
* presumably it made something better!

Anyway, I have kept it in a separate commit.
-
When exploring compile-time regressions after meddling with the Simplifier, I discovered that GHC.HsToCore.Pmc.Solver.Types.trvVarInfo was very delicately balanced. It's a small, heavily used, overloaded function and it's important that it inlines. By a fluke it was inlining before, but at various times in my journey it stopped doing so. So I just added an INLINE pragma to it; no sense in depending on a delicately balanced fluke.
-
Eliminate a redundant case at birth. This sometimes reduces Simplifier iterations. See Note [Case elim in exprIsConApp_maybe].
-
-
See Note [Eta expanding through CallStacks] in GHC.Core.Opt.Arity. This is a one-line change that fixes an inconsistency:

    -      || isCallStackPredTy ty
    +      || isCallStackPredTy ty || isCallStackTy ty
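For context, here is a sketch of the two shapes involved, as I read the two predicate names (illustrative code, not from the patch): isCallStackPredTy matches the constraint form, an implicit-parameter predicate carrying a CallStack, while the added isCallStackTy check matches the bare CallStack type.

    module CallStackShapes where

    import GHC.Stack (CallStack, HasCallStack)

    -- Constraint form: HasCallStack is an implicit-parameter predicate
    -- whose payload is a CallStack.
    f :: HasCallStack => Int -> Int
    f = (+ 1)

    -- Bare-type form: a CallStack passed as an ordinary value argument.
    g :: CallStack -> Int -> Int
    g _ = (+ 1)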
-
See the new Note [Floating join point bindings].

* Completely get rid of the complicated join_ceiling nonsense, which I have never understood.
* Do not float join points at all, except perhaps to top level.
* Some refactoring around wantToFloat, to treat Rec and NonRec more uniformly.
-
* Make `mkSymCo` and `mkInstCo` smarter. Fixes #23642.
* Fix the return role of `SelCo` in the coercion optimiser. Fixes #23617.
* Make the coercion optimiser's `opt_trans_rule` work better for newtypes. Fixes #23619.
-
I can't think of any good reason that anything in this MR should have changed the number of allocations, up or down. (Yes, this is an empty commit.)

Metric Decrease: T12227
-
All the changes are in fact not changes at all. Previously, the IoSubSystem data type was defined in GHC.RTS.Flags and exported from both GHC.RTS.Flags and GHC.IO.SubSystem. Now, the data type is defined in GHC.IO.SubSystem and still exported from both modules. Therefore, the same exports and same instances are still available from both modules. But the base-exports records only the defining module, and so it looks like a change when it is fully compatible. Related: we do add a deprecation to the export of the type via GHC.RTS.Flags, telling people to use the export from GHC.IO.SubSystem. Also the sort order for some unrelated Show instances changed. No idea why. The same changes apply in the other versions, with a few more changes due to sort order weirdness.
-
Some GCC versions don't know about some warnings, and they complain that we're ignoring unknown warnings. So we try to ignore the warning based on the GCC version.
-
Previously it was unclear that they did not work, because the code path was shared with other I/O managers (in particular select()). Following the code carefully shows that what actually happens is that the calling thread would block forever: the thread is put into the blocked queue, but no other action is scheduled that will ever result in it getting unblocked. It's better to just fail loudly in case anyone accidentally calls it; it is also less confusing code.
-
Document the extra +RTS --info output in the user guide
-
Using the new tracer class. Note: The unconditional definition of showIOManager should be compatible with the debugTrace change in 7c7d1f66.

Co-authored-by: Pi Delport <pi@well-typed.com>
-
It's just to make sure an exception CAF is a GC root.
-
Provide an opaque (forward) definition in Capability.h (since the cap contains a *CapIOManager) and then only provide a full definition in a new file IOManagerInternals.h. This new file is only supposed to be included by the IOManager implementation, not by its users. So that means IOManager.c and individual I/O manager implementations. posix/Signals.c still needs direct access, but that should be eliminated. Anything that needs direct access either needs to be clearly part of an I/O manager (e.g. the select() one) or go via a proper API.
-
We need to select the I/O manager to use during startup before the per-cap I/O manager initialisation.
-
Used in setNumCapabilities. It only does anything for MIO on Posix. Previously it always invoked Haskell code, but that code only did anything on non-Windows (and non-JS), and only in the threaded RTS. That currently effectively means the MIO I/O manager on Posix. So now it only invokes the Haskell code for the MIO Posix case.
-
When the GC scavenges a TSO it needs to scavenge the tso->blocked_info, but the blocked_info is a big union and what lives there depends on the tso->why_blocked, which for I/O-related reasons is in principle the responsibility of the I/O manager and not the GC. So the right thing to do is for the GC to ask the I/O manager to scavenge the blocked_info if it encounters any I/O-related why_blocked reasons. So we add scavengeTSOIOManager in IOManager.{h,c} in the usual style. Now as it happens, right now, there is no special scavenging to do, so the implementation of scavengeTSOIOManager is a fancy no-op. That's because the select I/O manager uses only the fd and target members, which are not GC pointers, and the win32-legacy I/O manager _ought_ to be using GC-managed heap objects for the StgAsyncIOResult but is actually using the C heap, so again no GC pointers. If the win32-legacy manager were doing this more sensibly, then scavengeTSOIOManager would be the right place to do the GC magic. Future I/O managers will need GC heap objects in the tso->blocked_info and will make use of this functionality.
-
Use the standard #include {Begin,End}Private.h style rather than RTS_PRIVATE on individual decls. And conditionally build the code for the select I/O manager based on the new CPP IOMGR_ENABLED_SELECT rather than on THREADED_RTS.
-
These are now just called from IOManager.c and are the per-I/O manager backend impls (whereas previously awaitEvent was the entry point). Follow the new naming convention in the IOManager.{h,c} of awaitCompletedTimeoutsOrIO, with the I/O manager's name as a suffix: so awaitCompletedTimeoutsOrIO{Select,Win32}.
-
and have the scheduler use it. Previously the scheduler called awaitEvent directly, and awaitEvent was implemented directly in the RTS I/O managers (select, win32). This relies on the old scheme where there's a single active I/O manager for each platform and RTS way. We want to move that to go via an API in IOManager.{h,c}, which can then call out to the active I/O manager. Also take the opportunity to split awaitEvent in two. The existing awaitEvent has a bool wait parameter, to say if the call should be blocking or non-blocking. We split this into two separate functions: pollCompletedTimeoutsOrIO and awaitCompletedTimeoutsOrIO. We split them for a few reasons: they have different post-conditions (specifically, the await version is supposed to guarantee that there are runnable threads when it completes), and it is also anticipated that in future I/O managers the implementations of the two cases will be simpler if they are separated.
-
rather than directly operating on the I/O manager's data structures. Specifically, when throwing an async exception to a thread that is blocked waiting for I/O or waiting for a timer, we want to cancel that I/O wait or cancel the timer. Currently this is done directly in removeFromQueues() in RaiseAsync.c. We want it to go via proper APIs, both for modularity and to let us support multiple I/O managers. So add sync{IO,Delay}Cancel, which is the cancellation for the corresponding sync{IO,Delay}. The implementations of these use the usual "switch (iomgr_type)" style.
-
It makes sense now for it to be separate from the scheduler class of tracers. Enabled with +RTS -Do. Document the -Do debug flag in the user guide.
-
We have lots of functions with conditional implementations for different I/O managers. Some functions, for some I/O managers, naturally have implementations that do nothing or barf. When only one such I/O manager is enabled, the whole function will have an implementation that does nothing or barfs. This then results in warnings from gcc that parameters are unused, or that the function should be marked with attribute noreturn (since barf does not return). The USED_IF_THREADS trick for fine-grained warning suppression is fine for just two cases, but an equivalent here would need USED_IF_THE_ONLY_ENABLED_IOMGR_IS_X_OR_Y, which would have combinatorial blowup. So we take a coarse-grained approach and simply disable these two warnings for the whole file, using a GCC pragma with its handy push/pop support:

    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wsuggest-attribute=noreturn"
    #pragma GCC diagnostic ignored "-Wunused-parameter"
    ...
    #pragma GCC diagnostic pop
-
The implementation is eventually going to need to use more private things, which would drag unwanted includes into IOManager.h, so it's better to move the impl out of the header file and into the .c file, at the slight cost of it no longer being inline. At the same time, change to the "switch (iomgr_type)" style.
-
No longer defined in IOManager.h, just a private function in IOManager.c, since it is no longer called from cmm code, just from syncDelay. It ought to be moved further into the select() I/O manager impl, rather than living in IOManager.c. On the other hand, appendToIOBlockedQueue is still called from cmm code in the win32-legacy I/O manager primops async{Read,Write}#, and it is also used by the select() I/O manager. Update the CPP and comments to reflect this.
-
Moves it into IOManager.c, where we can follow the new pattern of switching on the selected I/O manager. Uses a new IOManager API: syncDelay, following the naming convention of sync* for thread-synchronous I/O and timer/delay operations. As part of porting from cmm to C, we maintain the rule that the why_blocked gets accessed using load-acquire and store-release atomic memory operations. There was one exception to this rule: in the delay# primop cmm code on posix (not win32), the why_blocked was being updated using a relaxed store, not a store release. I've no idea why. In this conversion I'm playing it safe and using store release consistently.
-
Moves it into IOManager.c, where we can follow the new pattern of switching on the selected I/O manager.
-