  1. May 24, 2024
    • cmm: add word <-> double/float bitcast · bdcc0f37
      jeffrey young authored and Marge Bot committed
      - closes: #25331
      
      This is the last step in the project plan described in #25331. This
      commit:
      
      - adds bitcast operands for x86_64, LLVM, aarch64
      - For PPC and i386 we resort to using the cmm implementations
      - renames conversion MachOps from Conv to Round|Truncate
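To illustrate what a word <-> double/float bitcast means (reinterpreting the bit pattern rather than converting the numeric value), here is a minimal C sketch. The helper names are hypothetical and the sketch assumes IEEE 754 representations; it is not the Cmm MachOp or any backend code from this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper names; these are not the MachOps added by the commit. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Reinterpret the 64 bits of a double as an unsigned word. */
static inline uint64_t double_to_word(double d)
{
    uint64_t w;
    memcpy(&w, &d, sizeof w);   /* well-defined, unlike a pointer cast */
    return w;
}

/* Reinterpret a 64-bit word as a double. */
static inline double word_to_double(uint64_t w)
{
    double d;
    memcpy(&d, &w, sizeof d);
    return d;
}

/* The same pair for 32-bit floats. */
static inline uint32_t float_to_word32(float f)
{
    uint32_t w;
    memcpy(&w, &f, sizeof w);
    return w;
}

static inline float word32_to_float(uint32_t w)
{
    float f;
    memcpy(&f, &w, sizeof f);
    return f;
}
```

Note the contrast with a numeric conversion: `(uint64_t)1.0` yields 1, whereas a bitcast of `1.0` yields the IEEE 754 encoding of that value.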
      bdcc0f37
  2. May 23, 2024
  3. May 22, 2024
  4. May 17, 2024
    • rts: fix I/O manager compilation errors for win32 target · 710665bd
      Cheng Shao authored and Marge Bot committed
      This patch fixes I/O manager compilation errors for win32 target
      discovered when cross-compiling to win32 using recent clang:
      
      ```
      rts/win32/ThrIOManager.c:117:7: error:
           error: call to undeclared function 'is_io_mng_native_p'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
            117 |   if (is_io_mng_native_p ()) {
                |       ^
          |
      117 |   if (is_io_mng_native_p ()) {
          |       ^
      
      1 error generated.
      `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1)
      
      rts/fs.c:143:28: error:
           error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
            143 | int setErrNoFromWin32Error () {
                |                            ^
                |                             void
          |
      143 | int setErrNoFromWin32Error () {
          |                            ^
      
      1 error generated.
      `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1)
      
      rts/win32/ConsoleHandler.c:227:9: error:
           error: call to undeclared function 'interruptIOManagerEvent'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
            227 |         interruptIOManagerEvent ();
                |         ^
          |
      227 |         interruptIOManagerEvent ();
          |         ^
      
      rts/win32/ConsoleHandler.c:227:9: error:
           note: did you mean 'getIOManagerEvent'?
          |
      227 |         interruptIOManagerEvent ();
          |         ^
      
      rts/include/rts/IOInterface.h:27:10: error:
           note: 'getIOManagerEvent' declared here
             27 | void *   getIOManagerEvent  (void);
                |          ^
         |
      27 | void *   getIOManagerEvent  (void);
         |          ^
      
      1 error generated.
      `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1)
      
      rts/win32/ConsoleHandler.c:196:9: error:
           error: call to undeclared function 'setThreadLabel'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
            196 |         setThreadLabel(cap, t, "signal handler thread");
                |         ^
          |
      196 |         setThreadLabel(cap, t, "signal handler thread");
          |         ^
      
      rts/win32/ConsoleHandler.c:196:9: error:
           note: did you mean 'postThreadLabel'?
          |
      196 |         setThreadLabel(cap, t, "signal handler thread");
          |         ^
      
      rts/eventlog/EventLog.h:118:6: error:
           note: 'postThreadLabel' declared here
            118 | void postThreadLabel(Capability    *cap,
                |      ^
          |
      118 | void postThreadLabel(Capability    *cap,
          |      ^
      
      1 error generated.
      `x86_64-w64-mingw32-clang' failed in phase `C Compiler'. (Exit code: 1)
      ```
      710665bd
    • rts: Allocate non-moving segments with megablocks · b38dcf39
      Teo Camarasu authored and Marge Bot committed
      Non-moving segments are 8 blocks long and need to be aligned.
      Previously we serviced allocations by grabbing 15 blocks, finding
      an aligned 8 block group in it and returning the rest.
      This led to high levels of fragmentation, as de-allocating a segment
      caused an 8 block gap to form, and this could not be reused for allocation.
      
      This patch introduces a segment allocator based around using entire
      megablocks to service segment allocations in bulk.
      
      When there are no free segments, we grab an entire megablock and fill it
      with aligned segments. As the megablock is free, we can easily guarantee
      alignment. Any unused segments are placed on a free list.
      
      It only makes sense to free segments in bulk when all of the segments in
      a megablock are freeable. After sweeping, we grab the free list, sort it,
      find all groups of segments that together cover a whole megablock, and
      free them.
      This introduces a period of time when free segments are not available to
      the mutator, but the risk that this would lead to excessive allocation
      is low. Right after sweep, we should have an abundance of partially full
      segments, and this pruning step is relatively quick.
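The pruning step described above (sort the free list, then drop every group that covers a whole megablock) can be sketched roughly as follows. This is a simplified illustration, not the RTS implementation: segments are represented by plain indices, `SEGS_PER_MBLOCK` and `prune_free_segments` are hypothetical, and the list is assumed to contain no duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical constant: number of segments carved out of one megablock.
   The real value depends on the RTS block and megablock sizes. */
#define SEGS_PER_MBLOCK 4

static int cmp_size(const void *a, const void *b)
{
    size_t x = *(const size_t *)a, y = *(const size_t *)b;
    return (x > y) - (x < y);
}

/* free_segs holds indices of free segments; segment i is assumed to live in
   megablock i / SEGS_PER_MBLOCK. The list is sorted, every group that covers
   a whole megablock is dropped, and the remaining entries are compacted to
   the front of the array. Returns the number of retained segments and stores
   the number of dropped ones in *pruned. */
size_t prune_free_segments(size_t *free_segs, size_t n, size_t *pruned)
{
    assert(free_segs != NULL && pruned != NULL);
    qsort(free_segs, n, sizeof *free_segs, cmp_size);

    size_t kept = 0, i = 0;
    *pruned = 0;
    while (i < n) {
        size_t mblock = free_segs[i] / SEGS_PER_MBLOCK;
        size_t j = i;
        /* Find the end of the run of segments in the same megablock. */
        while (j < n && free_segs[j] / SEGS_PER_MBLOCK == mblock) {
            j++;
        }
        if (j - i == SEGS_PER_MBLOCK) {
            *pruned += j - i;   /* whole megablock is free: release it */
        } else {
            for (size_t k = i; k < j; k++) {
                free_segs[kept++] = free_segs[k];   /* keep on the free list */
            }
        }
        i = j;
    }
    return kept;
}
```

Sorting first means each megablock's segments are adjacent, so a single linear scan is enough to decide which megablocks are entirely free.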
      
      In implementing this we drop the logic that kept NONMOVING_MAX_FREE
      segments on the free list.
      
      We also introduce an eventlog event to log the amount of pruned/retained
      free segments.
      
      See Note [Segment allocation strategy]
      
      Resolves #24150
      
      -------------------------
      Metric Decrease:
          T13253
          T19695
      -------------------------
      b38dcf39
    • rts: do not prefetch mark_closure bdescr in non-moving gc when ASSERTS_ENABLED · 886ab43a
      Cheng Shao authored and Marge Bot committed
      This commit fixes a small oversight in !12148: the prefetch logic
      in non-moving GC may trap in debug RTS because it calls Bdescr() for
      mark_closure, which may be a static one. It's fine in non-debug RTS
      because even if invalid bdescr addresses are prefetched, they will not
      cause segfaults, so this commit implements the most straightforward
      fix: don't prefetch mark_closure bdescr when assertions are enabled.
      886ab43a
  5. May 10, 2024
    • IPE: Eliminate dependency on Read · ab840ce6
      Ben Gamari authored and Marge Bot committed
      Instead of encoding the closure type as decimal string we now simply
      represent it as an integer, eliminating the need for `Read` in
      `GHC.Internal.InfoProv.Types.peekInfoProv`.
      
      Closes #24504.
      
      -------------------------
      Metric Decrease:
          T24602_perf_size
          size_hello_artifact
      -------------------------
      ab840ce6
  6. May 02, 2024
    • GHCi: support inlining breakpoints (#24712) · b85b1199
      Sylvain Henry authored and Marge Bot committed
      When a breakpoint is inlined, its context may change (e.g. tyvars in
      scope). We must take this into account and not use the breakpoint tick
      index as its sole identifier. Each instance of a breakpoint (even with
      the same tick index) now gets a different "info" index.
      
      We also need to distinguish modules:
      - tick module: module with the break array (tick counters, status, etc.)
      - info module: module having the CgBreakInfo (info at occurrence site)
      b85b1199
    • STM: Be more optimistic when validating in-flight transactions. · 917ef81b
      Andreas Klebinger authored and Marge Bot committed
      * Don't lock tvars when performing non-committal validation.
      * If we encounter a locked tvar don't consider it a failure.
      
      This means in-flight validation will only fail if committing at the
      moment of validation is *guaranteed* to fail.
      
      This prevents in-flight validation from failing spuriously if it happens in
      parallel on multiple threads or in parallel with a thread committing.
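A rough sketch of the non-committal validation rule described above: take no locks, skip locked tvars, and fail only on an observed mismatch. The types and the function name are hypothetical simplifications, and the sketch ignores the atomic memory accesses a real concurrent implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types; the RTS structures differ. */
typedef struct {
    const void *current_value;
    bool locked;               /* held by some committing transaction */
} ToyTVar;

typedef struct {
    ToyTVar *tvar;
    const void *expected_value;   /* value seen when the tvar was first read */
} ToyTRecEntry;

/* Non-committal validation: acquires no locks and reports failure only when
   a commit attempted right now could not succeed. */
bool validate_in_flight(const ToyTRecEntry *entries, size_t n)
{
    assert(entries != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const ToyTVar *tv = entries[i].tvar;
        if (tv->locked) {
            continue;   /* a locked tvar is not treated as a failure */
        }
        if (tv->current_value != entries[i].expected_value) {
            return false;   /* value changed: a commit would fail */
        }
    }
    return true;
}
```

Skipping locked tvars trades some precision for fewer spurious aborts: the transaction may still fail at commit time, but it is not restarted merely because another thread happened to hold a lock during validation.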
      917ef81b
    • STM: Remove (unused) coarse grained locking. · ac9c5f84
      Andreas Klebinger authored and Marge Bot committed
      The STM code had a coarse grained locking mode guarded by #defines that was unused.
      This commit removes the code.
      ac9c5f84
  7. Apr 21, 2024
  8. Apr 17, 2024
  9. Apr 12, 2024
  10. Apr 10, 2024
    • rts: Make addDLL a wrapper around loadNativeObj · dcfaa190
      Rodrigo Mesquita authored and Marge Bot committed
      Rewrite the implementation of `addDLL` as a wrapper around the more
      principled `loadNativeObj` rts linker function. The latter should be
      preferred while the former is preserved for backwards compatibility.
      
      `loadNativeObj` was previously only available on ELF platforms, so this
      commit further refactors the rts linker to transform loadNativeObj_ELF
      into loadNativeObj_POSIX, which is available in ELF and MachO platforms.
      
      The refactor made it possible to remove the `dl_mutex` mutex in favour
      of always using `linker_mutex` (rather than a combination of both).
      
      Lastly, we implement `loadNativeObj` for Windows too.
      dcfaa190
    • linker: Avoid linear search when looking up Haskell symbols via dlsym · e008a19a
      Alexis King authored and Marge Bot committed
      
      See the primary Note [Looking up symbols in the relevant objects] for a
      more in-depth explanation.
      
      When dynamically loading a Haskell symbol (typical when running a splice or
      GHCi expression), before this commit we would search for the symbol in
      all dynamic libraries that were loaded. However, this could be very
      inefficient when too many packages are loaded (which can happen if there are
      many package dependencies) because the time to look up the symbol would be
      linear in the number of packages loaded.
      
      This commit drastically improves symbol loading performance by
      introducing a mapping from units to the handles of corresponding loaded
      dlls. These handles are returned by dlopen when we load a dll, and can
      then be used to look up in a specific dynamic library.
      
      Looking up a given Name is now much more precise because we can
      look up its unit in the mapping and look up the symbol solely in the
      handles of the dynamic libraries loaded for that unit.
      
      In one measurement, the wait time before the expression was executed
      went from about 38 seconds down to about 2 seconds.
      
      This commit also includes Note [Symbols may not be found in pkgs_loaded],
      explaining the fallback to the old behaviour in case no dll can be found
      in the unit mapping for a given Name.
      
      Fixes #23415
      
      Co-authored-by: Rodrigo Mesquita <(@alt-romes)>
      e008a19a
    • rts: free error message before returning · dd530bb7
      Rodrigo Mesquita authored and Marge Bot committed
      Fixes a memory leak in rts/linker/PEi386.c
      dd530bb7
  11. Apr 03, 2024
    • Conditionally ignore some GCC warnings · 83a74d20
      Duncan Coutts authored and Marge Bot committed
      Some GCC versions don't know about some warnings, and they complain
      that we're ignoring unknown warnings. So we try to ignore the warning
      based on the GCC version.
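A minimal sketch of version-conditional warning suppression, under assumptions: the commit does not say which warnings are affected, so the warning name (`-Wcast-function-type`) and the version threshold (GCC 8) are illustrative choices, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Only mention the warning on compilers that know it; older GCC versions
   would otherwise complain about an unknown warning option in the pragma.
   The warning and the threshold below are illustrative assumptions. */
#if defined(__GNUC__) && (__GNUC__ >= 8)
#pragma GCC diagnostic ignored "-Wcast-function-type"
#endif

typedef void (*callback_fn)(void *);

static int add_one(int x)
{
    return x + 1;
}

int call_through_callback_type(int x)
{
    /* The kind of cast between incompatible function types that
       -Wcast-function-type reports. */
    callback_fn stored = (callback_fn)add_one;

    /* Converting back to the original type before calling is well-defined. */
    int (*f)(int) = (int (*)(int))stored;
    assert(f != NULL);
    return f(x);
}
```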
      83a74d20
    • waitRead# / waitWrite# do not work for win32-legacy I/O manager · 8023bad4
      Duncan Coutts authored and Marge Bot committed
      Previously it was unclear that they did not work because the code path
      was shared with other I/O managers (in particular select()).
      
      Following the code carefully shows that what actually happens is that
      the calling thread would block forever: the thread will be put into the
      blocked queue, but no other action is scheduled that will ever result in
      it getting unblocked.
      
      It's better to just fail loudly in case anyone accidentally calls it;
      it also makes the code less confusing.
      8023bad4
    • Include the default I/O manager in the +RTS --info output · c7d3e3a3
      Duncan Coutts authored and Marge Bot committed
      Document the extra +RTS --info output in the user guide
      c7d3e3a3
    • Add tracing for the main I/O manager actions · 9c51473b
      Duncan Coutts authored and Marge Bot committed
      
      Using the new tracer class.
      
      Note: The unconditional definition of showIOManager should be
      compatible with the debugTrace change in 7c7d1f66.
      
      Co-authored-by: Pi Delport <pi@well-typed.com>
      9c51473b
    • The select() I/O manager does have some global initialisation · 877a2a80
      Duncan Coutts authored and Marge Bot committed
      It's just to make sure an exception CAF is a GC root.
      877a2a80
    • Make struct CapIOManager be fully opaque · aaa294d0
      Duncan Coutts authored and Marge Bot committed
      Provide an opaque (forward) definition in Capability.h (since the cap
      contains a *CapIOManager) and then only provide a full definition in
      a new file IOManagerInternals.h. This new file is only supposed to be
      included by the IOManager implementation, not by its users. So that
      means IOManager.c and individual I/O manager implementations.
      
      The posix/Signals.c still needs direct access, but that should be
      eliminated. Anything that needs direct access either needs to be clearly
      part of an I/O manager (e.g. the select() one) or go via a proper API.
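The opaque-struct pattern described above can be sketched in a single file as follows. The comments mark which part would live in a public header and which in an internals header. Apart from the `CapIOManager` name, everything here (the fields, the `Toy`/`toy_` names) is a hypothetical simplification, not the RTS definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* --- Part users would see (a public header) --- */

/* Opaque forward declaration: users can hold a pointer but cannot look
   inside the struct or depend on its layout. */
typedef struct CapIOManager_ CapIOManager;

typedef struct {
    int no;
    CapIOManager *iomgr;   /* a pointer to an incomplete type is allowed */
} ToyCapability;

void toy_init_iomgr(ToyCapability *cap);
void toy_free_iomgr(ToyCapability *cap);
int  toy_pending_count(const ToyCapability *cap);

/* --- Part only the implementation would see (an internals header) --- */

struct CapIOManager_ {
    int pending;   /* hypothetical field */
};

void toy_init_iomgr(ToyCapability *cap)
{
    cap->iomgr = calloc(1, sizeof *cap->iomgr);
    assert(cap->iomgr != NULL);
}

void toy_free_iomgr(ToyCapability *cap)
{
    free(cap->iomgr);
    cap->iomgr = NULL;
}

int toy_pending_count(const ToyCapability *cap)
{
    return cap->iomgr->pending;
}
```

Because users only ever see the forward declaration, the struct's fields can change without touching or recompiling code outside the I/O manager.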
      aaa294d0
    • Select an I/O manager early in RTS startup · 3be6d591
      Duncan Coutts authored and Marge Bot committed
      We need to select the I/O manager to use during startup before the
      per-cap I/O manager initialisation.
      3be6d591
    • Add I/O manager API notifyIOManagerCapabilitiesChanged · 94a87d21
      Duncan Coutts authored and Marge Bot committed
      Used in setNumCapabilities.
      
      It only does anything for MIO on Posix.
      
      Previously it always invoked Haskell code, but that code only did
      anything on non-Windows (and non-JS), and only threaded. That currently
      effectively means the MIO I/O manager on Posix.
      
      So now it only invokes it for the MIO Posix case.
      94a87d21
    • Add an IOManager API for scavenging TSO blocked_info · 4161f516
      Duncan Coutts authored and Marge Bot committed
      When the GC scavenges a TSO it needs to scavenge the tso->blocked_info
      but the blocked_info is a big union and what lives there depends on the
      tso->why_blocked, which for I/O-related reasons is something that in
      principle is the responsibility of the I/O manager and not the GC. So
      the right thing to do is for the GC to ask the I/O manager to scavenge
      the blocked_info if it encounters any I/O-related why_blocked reasons.
      
      So we add scavengeTSOIOManager in IOManager.{h,c} with the usual style.
      
      Now as it happens, right now, there is no special scavenging to do, so
      the implementation of scavengeTSOIOManager is a fancy no-op. That's
      because the select I/O manager uses only the fd and target members,
      which are not GC pointers, and the win32-legacy I/O manager _ought_ to
      be using GC-managed heap objects for the StgAsyncIOResult but it is
      actually using the C heap, so again no GC pointers. If the win32-legacy
      were doing this more sensibly, then scavengeTSOIOManager would be the
      right place to do the GC magic.
      
      Future I/O managers will need GC heap objects in the tso->blocked_info
      and will make use of this functionality.
      4161f516
    • Tidy up a couple things in Select.{h,c} · d30c6bc6
      Duncan Coutts authored and Marge Bot committed
      Use the standard #include {Begin,End}Private.h style rather than
      RTS_PRIVATE on individual decls.
      
      And conditionally build the code for the select I/O manager based on
      the new CPP IOMGR_ENABLED_SELECT rather than on THREADED_RTS.
      d30c6bc6
    • Rename awaitEvent in select and win32 I/O managers · 5ad4b30f
      Duncan Coutts authored and Marge Bot committed
      These are now just called from IOManager.c and are the per-I/O manager
      backend impls (whereas previously awaitEvent was the entry point).
      
      Follow the new naming convention in the IOManager.{h,c} of
      awaitCompletedTimeoutsOrIO, with the I/O manager's name as a suffix:
      so awaitCompletedTimeoutsOrIO{Select,Win32}.
      5ad4b30f
    • Move awaitEvent into a proper IOManager API · 4f9e9c4e
      Duncan Coutts authored and Marge Bot committed
      and have the scheduler use it.
      
      Previously the scheduler calls awaitEvent directly, and awaitEvent is
      implemented directly in the RTS I/O managers (select, win32). This
      relies on the old scheme where there's a single active I/O manager for
      each platform and RTS way.
      
      We want to move that to go via an API in IOManager.{h,c} which can then
      call out to the active I/O manager.
      
      Also take the opportunity to split awaitEvent into two. The existing
      awaitEvent has a bool wait parameter, to say if the call should be
      blocking or non-blocking. We split this into two separate functions:
      pollCompletedTimeoutsOrIO and awaitCompletedTimeoutsOrIO. We split them
      for a few reasons: they have different post-conditions (specifically the
      await version is supposed to guarantee that there are threads runnable
      when it completes). Secondly, it is also anticipated that in future I/O
      managers the implementations of the two cases will be simpler if they
      are separated.
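The split into a polling and an awaiting entry point, with the different post-conditions mentioned above, might look roughly like this toy model. The state struct and function names are hypothetical, and "blocking" is simulated by moving one in-flight operation to the completed state.

```c
#include <assert.h>

/* Toy scheduler state; hypothetical, for illustration only. */
typedef struct {
    int runnable;       /* threads ready to run */
    int completed_io;   /* operations finished but not yet processed */
    int in_flight;      /* operations still outstanding */
} ToySched;

/* Non-blocking variant: process whatever has already completed.
   Post-condition: none about runnable threads; there may still be zero. */
void toy_poll_completed(ToySched *s)
{
    s->runnable += s->completed_io;
    s->completed_io = 0;
}

/* Blocking variant. Post-condition: at least one thread is runnable. */
void toy_await_completed(ToySched *s)
{
    /* Without any work that can complete, waiting would never end. */
    assert(s->runnable > 0 || s->completed_io > 0 || s->in_flight > 0);

    while (s->runnable == 0) {
        if (s->completed_io == 0) {
            /* Stand-in for blocking in the OS until something completes. */
            s->in_flight--;
            s->completed_io++;
        }
        toy_poll_completed(s);
    }
    assert(s->runnable > 0);
}
```

With a single function and a `bool wait` parameter, the post-condition would depend on a runtime argument; two functions let each state its own guarantee.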
      4f9e9c4e
    • Have the throwTo impl go via (new) IOManager APIs · f0c1f862
      Duncan Coutts authored and Marge Bot committed
      rather than directly operating on the IO manager's data structures.
      
      Specifically, when throwing an async exception to a thread that is
      blocked waiting for I/O or waiting for a timer, then we want to cancel
      that I/O waiting or cancel the timer. Currently this is done directly in
      removeFromQueues() in RaiseAsync.c. We want it to go via proper APIs
      both for modularity but also to let us support multiple I/O managers.
      
      So add sync{IO,Delay}Cancel, which is the cancellation for the
      corresponding sync{IO,Delay}. The implementations of these use the usual
      "switch (iomgr_type)" style.
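A sketch of what the "switch (iomgr_type)" style could look like for a cancellation function. The enum, the toy thread struct and the function name are hypothetical; `IOMGR_ENABLED_SELECT` is defined locally here, on the assumption that in a real build it would come from the build configuration.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: normally set by the build system, defined here for the sketch. */
#define IOMGR_ENABLED_SELECT 1

/* Hypothetical enumeration of available I/O managers. */
typedef enum { TOY_IOMGR_SELECT, TOY_IOMGR_WIN32_LEGACY } ToyIOManagerType;

static ToyIOManagerType iomgr_type = TOY_IOMGR_SELECT;

typedef struct {
    int blocked_on_io;   /* 1 while the thread waits for I/O */
} ToyThread;

/* Cancel a thread's pending synchronous I/O by dispatching on the I/O
   manager selected at startup. */
void toy_sync_io_cancel(ToyThread *t)
{
    assert(t != NULL);
    switch (iomgr_type) {
#if defined(IOMGR_ENABLED_SELECT)
    case TOY_IOMGR_SELECT:
        /* Stand-in for removing the thread from the blocked queue. */
        t->blocked_on_io = 0;
        break;
#endif
#if defined(IOMGR_ENABLED_WIN32_LEGACY)
    case TOY_IOMGR_WIN32_LEGACY:
        t->blocked_on_io = 0;
        break;
#endif
    default:
        /* An I/O manager that is not compiled in must never be selected. */
        abort();
    }
}
```

Callers such as the async-exception code then need only this one entry point and never touch a particular I/O manager's queues directly.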
      f0c1f862
    • Add a new trace class for the iomanager · b48805b9
      Duncan Coutts authored and Marge Bot committed
      It makes sense now for it to be separate from the scheduler class of
      tracers.
      
      Enabled with +RTS -Do. Document the -Do debug flag in the user guide.
      b48805b9
    • Take a simpler approach to gcc warnings in IOManager.c · f70b8108
      Duncan Coutts authored and Marge Bot committed
      We have lots of functions with conditional implementations for
      different I/O managers. Some functions, for some I/O managers,
      naturally have implementations that do nothing or barf. When only one
      such I/O manager is enabled then the whole function will
      have an implementation that does nothing or barfs. This then results in
      warnings from gcc that parameters are unused, or that the function
      should be marked with attribute noreturn (since barf does not return).
      The USED_IF_THREADS trick for fine-grained warning suppression is fine
      for just two cases, but an equivalent here would need
      USED_IF_THE_ONLY_ENABLED_IOMGR_IS_X_OR_Y which would have combinatorial
      blowup. So we take a coarse grained approach and simply disable these
      two warnings for the whole file.
      
      So we use a GCC pragma, with its handy push/pop support:
      
       #pragma GCC diagnostic push
       #pragma GCC diagnostic ignored "-Wsuggest-attribute=noreturn"
       #pragma GCC diagnostic ignored "-Wunused-parameter"
      
      ...
      
       #pragma GCC diagnostic pop
      f70b8108
    • Move anyPendingTimeoutsOrIO impl from .h to .c · 60ce9910
      Duncan Coutts authored and Marge Bot committed
      The implementation is eventually going to need to use more private
      things, which will drag in unwanted includes into IOManager.h, so it's
      better to move the impl out of the header file and into the .c file, at
      the slight cost of it no longer being inline.
      
      At the same time, change to the "switch (iomgr_type)" style.
      60ce9910
    • insertIntoSleepingQueue is no longer public · e93058e0
      Duncan Coutts authored and Marge Bot committed
      No longer defined in IOManager.h, just a private function in
      IOManager.c. Since it is no longer called from cmm code, just from
      syncDelay. It ought to get moved further into the select() I/O manager
      impl, rather than living in IOManager.c.
      
      On the other hand appendToIOBlockedQueue is still called from cmm code
      in the win32-legacy I/O manager primops async{Read,Write}#, and it is
      also used by the select() I/O manager. Update the CPP and comments to
      reflect this.
      e93058e0
    • Move most of the delay# impl from cmm to C · 457705a8
      Duncan Coutts authored and Marge Bot committed
      Moves it into the IOManager.c where we can follow the new pattern of
      switching on the selected I/O manager.
      
      Uses a new IOManager API: syncDelay, following the naming convention of
      sync* for thread-synchronous I/O & timer/delay operations.
      
      As part of porting from cmm to C, we maintain the rule that the
      why_blocked gets accessed using load acquire and store release atomic
      memory operations. There was one exception to this rule: in the delay#
      primop cmm code on posix (not win32), the why_blocked was being updated
      using a store relaxed, not a store release. I've no idea why. In this
      conversion I'm playing it safe here and using store release consistently.
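The load-acquire / store-release rule for why_blocked can be illustrated with C11 atomics. This is a sketch under assumptions: the struct, the constants and the function names are hypothetical stand-ins, and the RTS uses its own macros rather than `<stdatomic.h>` directly.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical values and a simplified thread state object. */
enum { TOY_NOT_BLOCKED = 0, TOY_BLOCKED_ON_DELAY = 1 };

typedef struct {
    _Atomic unsigned why_blocked;
    long wake_time;   /* plain field, published by the release store below */
} ToyTSO;

/* Read why_blocked with acquire semantics. */
unsigned toy_why_blocked(ToyTSO *tso)
{
    return atomic_load_explicit(&tso->why_blocked, memory_order_acquire);
}

/* Mark the thread as blocked on a delay. */
void toy_block_on_delay(ToyTSO *tso, long wake_time)
{
    assert(toy_why_blocked(tso) == TOY_NOT_BLOCKED);

    /* Fill in the blocking details first ... */
    tso->wake_time = wake_time;

    /* ... then publish with a release store, so any thread that observes
       TOY_BLOCKED_ON_DELAY through an acquire load also sees wake_time. */
    atomic_store_explicit(&tso->why_blocked, TOY_BLOCKED_ON_DELAY,
                          memory_order_release);
}
```

A relaxed store in place of the release store would not order the `wake_time` write before the `why_blocked` update, which is why using release consistently is the safe choice.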
      457705a8