This project is mirrored from https://gitlab.haskell.org/ghc/ghc.git.
Pull mirroring failed .
Repository mirroring has been paused due to too many failed attempts, and can be resumed by a project maintainer.
Last successful update .
Repository mirroring has been paused due to too many failed attempts, and can be resumed by a project maintainer.
Last successful update .
- 13 Sep, 2018 1 commit
-
-
Simon Marlow authored
This reverts commit bf10456e.
-
- 17 Jun, 2018 1 commit
-
-
Ömer Sinan Ağacan authored
It seems like we currently support string literals in Cmm, so we can use __LINE__ CPP macro in assertion macros. This improves error messages that previously looked like ASSERTION FAILED: file (null), line 1302 (null) part now shows the actual file name. Also inline some single-use string literals in PrimOps.cmm. Reviewers: bgamari, simonmar, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4862
-
- 05 Jun, 2018 1 commit
-
-
Ömer Sinan Ağacan authored
SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY SMALL_MUT_ARR_PTRS_FROZEN -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN MUT_ARR_PTRS_FROZEN0 -> MUT_ARR_PTRS_FROZEN_DIRTY MUT_ARR_PTRS_FROZEN -> MUT_ARR_PTRS_FROZEN_CLEAN Naming is now consistent with other CLEAR/DIRTY objects (MVAR, MUT_VAR, MUT_ARR_PTRS). (alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR) Removed a few comments in Scav.c about FROZEN0 being on the mut_list because it's now clear from the closure type. Reviewers: bgamari, simonmar, erikd Reviewed By: simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4784
-
- 02 Jun, 2018 1 commit
-
-
Ben Gamari authored
This feature has some very serious correctness issues (#14310), introduces a great deal of complexity, and hasn't seen wide usage. Consequently we are removing it, as proposed in Proposal #77 [1]. This is heavily based on a patch from fryguybob. Updates stm submodule. [1] https://github.com/ghc-proposals/ghc-proposals/pull/77 Test Plan: Validate Reviewers: erikd, simonmar, hvr Reviewed By: simonmar Subscribers: rwbarton, thomie, carter GHC Trac Issues: #14310 Differential Revision: https://phabricator.haskell.org/D4760
-
- 28 May, 2018 1 commit
-
-
Ömer Sinan Ağacan authored
Make it clear that max_live_bytes is updated after a major GC whereas live_bytes is updated after all GCs (including minor collections) and considers data in uncollected generations as live. Reviewers: bgamari, simonmar, hvr Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4734
-
- 23 May, 2018 1 commit
-
-
Ben Gamari authored
Unfortunately, this optimisation is infeasible on MachO platforms (e.g. Darwin) due to an object format limitation. Specifically, linking fails with errors of the form: error: unsupported relocation with subtraction expression, symbol '_integerzmgmp_GHCziIntegerziType_quotInteger_closure' can not be undefined in a subtraction expression Apparently MachO does not permit relocations' subtraction expressions to refer to undefined symbols. As far as I can tell this means that it is essentially impossible to express an offset between symbols living in different compilation units. This means that we lively can't use this optimisation on MachO platforms. Test Plan: Validate on Darwin Reviewers: simonmar, erikd Subscribers: rwbarton, thomie, carter, angerman GHC Trac Issues: #15169 Differential Revision: https://phabricator.haskell.org/D4715
-
- 20 May, 2018 1 commit
-
-
patrickdoc authored
This pulls parts of Joachim Breitner's ghc-heap-view library inside GHC. The bits added are the C hooks into the RTS and a basic Haskell wrapper to these C hooks. The main reason for these to be added to GHC proper is that the code needs to be kept in sync with the closure types defined by the RTS. It is expected that the version of HeapView shipped with GHC will always work with that version of GHC and that extra functionality can be layered on top with a library like ghc-heap-view distributed via Hackage. Test Plan: validate Reviewers: simonmar, hvr, nomeata, austin, Phyx, bgamari, erikd Reviewed By: bgamari Subscribers: carter, patrickdoc, tmcgilchrist, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3055
-
- 17 May, 2018 1 commit
-
-
niteria authored
Summary: See the new note. This should fix cb5c2fe8 enough to unbreak Windows and OS X builds. Test Plan: manual testing with patched gdb Reviewers: bgamari, simonmar, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4694
-
- 16 May, 2018 4 commits
-
-
Ben Gamari authored
-
Simon Marlow authored
Summary: The idea here is to save a little code size and some work in the GC, by collapsing FUN_STATIC closures and their SRTs. This is (4) in a series; see D4632 for more details. There's a tradeoff here: more complexity in the compiler in exchange for a modest code size reduction (probably around 0.5%). Results: * GHC binary itself (statically linked) is 1% smaller * -0.2% binary sizes in nofib (-0.5% module sizes) Full nofib results comparing D4634 with this: P177 (ignore runtimes, these aren't stable on my laptop) Test Plan: validate, nofib Reviewers: bgamari, niteria, simonpj, erikd Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4637
-
Simon Marlow authored
Summary: An info table with an SRT normally looks like this: StgWord64 srt_offset StgClosureInfo layout StgWord32 layout StgWord32 has_srt But we only need 32 bits for srt_offset on x86_64, because the small memory model requires that code segments are at most 2GB. So we can optimise this to StgClosureInfo layout StgWord32 layout StgWord32 srt_offset saving a word. We can tell whether the info table has an SRT or not, because zero is not a valid srt_offset, so zero still indicates that there's no SRT. Test Plan: * validate * For results, see D4632. Reviewers: bgamari, niteria, osa1, erikd Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4634
-
Simon Marlow authored
Summary: - Previously we would hvae a single big table of pointers per module, with a set of bitmaps to reference entries within it. The new representation is identical to a static constructor, which is much simpler for the GC to traverse, and we get to remove the complicated bitmap-traversal code from the GC. - Rewrite all the code to generate SRTs in CmmBuildInfoTables, and document it much better (see Note [SRTs]). This has been something I've wanted to do since we moved to the new code generator, I finally had the opportunity to finish it while on a transatlantic flight recently :) There are a series of 4 diffs: 1. D4632 (this one), which does the bulk of the changes 2. D4633 which adds support for smaller `CmmLabelDiffOff` constants 3. D4634 which takes advantage of D4632 and D4633 to save a word in info tables that have an SRT on x86_64. This is where most of the binary size improvement comes from. 4. D4637 which makes a further optimisation to merge some SRTs with static FUN closures. This adds some complexity and the benefits are fairly modest, so it's not clear yet whether we should do this. Results (after (3), on x86_64) - GHC itself (staticaly linked) is 5.2% smaller - -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176 - I measured the overhead of traversing all the static objects in a major GC in GHC itself by doing `replicateM_ 1000 performGC` as the first thing in `Main.main`. The new version was 5-10% faster, but the results did vary quite a bit. - I'm not sure if there's a compile-time difference, the results are too unreliable. Test Plan: validate Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1 Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4632
-
- 13 May, 2018 1 commit
-
-
Michal Terepeta authored
GCC 8 now generates warnings for incompatible function pointer casts [-Werror=cast-function-type]. Apparently there are a few of those in rts code, which makes `./validate` unhappy (since we compile with `-Werror`) This commit tries to fix these issues by changing the functions to have the correct type (and, if necessary, moving the casts into those functions). For instance, hash/comparison function are declared (`Hash.h`) to take `StgWord` but we want to use `StgWord64[2]` in `StaticPtrTable.c`. Instead of casting the function pointers, we can cast the `StgWord` parameter to `StgWord*`. I think this should be ok since `StgWord` should be the same size as a pointer. Signed-off-by:
Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, erikd, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4673
-
- 12 May, 2018 1 commit
-
- 11 May, 2018 1 commit
-
-
niteria authored
See the new note. Test Plan: manual testing with patched gdb Reviewers: bgamari, simonmar, erikd Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4666
-
- 10 May, 2018 1 commit
-
-
Ömer Sinan Ağacan authored
-
- 25 Apr, 2018 1 commit
-
-
Ömer Sinan Ağacan authored
Test Plan: Passes validate Reviewers: simonmar, bgamari, erikd Subscribers: thomie, carter GHC Trac Issues: #10296 Differential Revision: https://phabricator.haskell.org/D4627
-
- 16 Apr, 2018 1 commit
-
-
Ben Gamari authored
-
- 05 Apr, 2018 1 commit
-
-
Ömer Sinan Ağacan authored
Reviewers: bgamari, simonmar, erikd Reviewed By: bgamari, simonmar Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4539
-
- 30 Mar, 2018 1 commit
-
-
Ömer Sinan Ağacan authored
[skip ci]
-
- 26 Mar, 2018 1 commit
-
-
Douglas Wilson authored
There should be no change in the output of the '+RTS -s' (summary) report, or the 'RTS -t' (one-line) report. All data shown in the summary report is now shown in the machine readable report. All data in RTSStats is now shown in the machine readable report. init times are added to RTSStats and added to GHC.Stats. Example of the new output: ``` [("bytes allocated", "375016384") ,("num_GCs", "113") ,("average_bytes_used", "148348") ,("max_bytes_used", "206552") ,("num_byte_usage_samples", "2") ,("peak_megabytes_allocated", "6") ,("init_cpu_seconds", "0.001642") ,("init_wall_seconds", "0.001027") ,("mut_cpu_seconds", "3.020166") ,("mut_wall_seconds", "0.757244") ,("GC_cpu_seconds", "0.037750") ,("GC_wall_seconds", "0.009569") ,("exit_cpu_seconds", "0.000890") ,("exit_wall_seconds", "0.002551") ,("total_cpu_seconds", "3.060452") ,("total_wall_seconds", "0.770395") ,("major_gcs", "2") ,("allocated_bytes", "375016384") ,("max_live_bytes", "206552") ,("max_large_objects_bytes", "159344") ,("max_compact_bytes", "0") ,("max_slop_bytes", "59688") ,("max_mem_in_use_bytes", "6291456") ,("cumulative_live_bytes", "296696") ,("copied_bytes", "541024") ,("par_copied_bytes", "493976") ,("cumulative_par_max_copied_bytes", "104104") ,("cumulative_par_balanced_copied_bytes", "274456") ,("fragmentation_bytes", "2112") ,("alloc_rate", "124170795") ,("productivity_cpu_percent", "0.986838") ,("productivity_wall_percent", "0.982935") ,("bound_task_count", "1") ,("sparks_count", "5836258") ,("sparks_converted", "237") ,("sparks_overflowed", "1990408") ,("sparks_dud ", "0") ,("sparks_gcd", "3455553") ,("sparks_fizzled", "390060") ,("work_balance", "0.555606") ,("n_capabilities", "4") ,("task_count", "10") ,("peak_worker_count", "9") ,("worker_count", "9") ,("gc_alloc_block_sync_spin", "162") ,("gc_alloc_block_sync_yield", "0") ,("gc_alloc_block_sync_spin", "162") ,("gc_spin_spin", "18840855") ,("gc_spin_yield", "10355") ,("mut_spin_spin", "70331392") ,("mut_spin_yield", "61700") ,("waitForGcThreads_spin", "241") ,("waitForGcThreads_yield", "2797") ,("whitehole_gc_spin", "0") ,("whitehole_lockClosure_spin", "0") ,("whitehole_lockClosure_yield", "0") ,("whitehole_executeMessage_spin", "0") ,("whitehole_threadPaused_spin", "0") ,("any_work", "1667") ,("no_work", "1662") ,("scav_find_work", "1026") ,("gen_0_collections", "111") ,("gen_0_par_collections", "111") ,("gen_0_cpu_seconds", "0.036126") ,("gen_0_wall_seconds", "0.036126") ,("gen_0_max_pause_seconds", "0.036126") ,("gen_0_avg_pause_seconds", "0.000081") ,("gen_0_sync_spin", "21") ,("gen_0_sync_yield", "0") ,("gen_1_collections", "2") ,("gen_1_par_collections", "1") ,("gen_1_cpu_seconds", "0.001624") ,("gen_1_wall_seconds", "0.001624") ,("gen_1_max_pause_seconds", "0.001624") ,("gen_1_avg_pause_seconds", "0.000272") ,("gen_1_sync_spin", "3") ,("gen_1_sync_yield", "0") ] ``` Test Plan: Ensure that one-line and summary reports are unchanged. Reviewers: erikd, simonmar, hvr Subscribers: duog, carter, thomie, rwbarton GHC Trac Issues: #14660 Differential Revision: https://phabricator.haskell.org/D4529
-
- 20 Mar, 2018 1 commit
-
-
Ben Gamari authored
This reverts commit 2d4bda2e.
-
- 19 Mar, 2018 3 commits
-
-
Douglas Wilson authored
There should be no change in the output of the '+RTS -s' (summary) report, or the 'RTS -t' (one-line) report. All data shown in the summary report is now shown in the machine readable report. All data in RTSStats is now shown in the machine readable report. init times are added to RTSStats and added to GHC.Stats. Example of the new output: ``` [("bytes allocated", "375016384") ,("num_GCs", "113") ,("average_bytes_used", "148348") ,("max_bytes_used", "206552") ,("num_byte_usage_samples", "2") ,("peak_megabytes_allocated", "6") ,("init_cpu_seconds", "0.001642") ,("init_wall_seconds", "0.001027") ,("mut_cpu_seconds", "3.020166") ,("mut_wall_seconds", "0.757244") ,("GC_cpu_seconds", "0.037750") ,("GC_wall_seconds", "0.009569") ,("exit_cpu_seconds", "0.000890") ,("exit_wall_seconds", "0.002551") ,("total_cpu_seconds", "3.060452") ,("total_wall_seconds", "0.770395") ,("major_gcs", "2") ,("allocated_bytes", "375016384") ,("max_live_bytes", "206552") ,("max_large_objects_bytes", "159344") ,("max_compact_bytes", "0") ,("max_slop_bytes", "59688") ,("max_mem_in_use_bytes", "6291456") ,("cumulative_live_bytes", "296696") ,("copied_bytes", "541024") ,("par_copied_bytes", "493976") ,("cumulative_par_max_copied_bytes", "104104") ,("cumulative_par_balanced_copied_bytes", "274456") ,("fragmentation_bytes", "2112") ,("alloc_rate", "124170795") ,("productivity_cpu_percent", "0.986838") ,("productivity_wall_percent", "0.982935") ,("bound_task_count", "1") ,("sparks_count", "5836258") ,("sparks_converted", "237") ,("sparks_overflowed", "1990408") ,("sparks_dud ", "0") ,("sparks_gcd", "3455553") ,("sparks_fizzled", "390060") ,("work_balance", "0.555606") ,("n_capabilities", "4") ,("task_count", "10") ,("peak_worker_count", "9") ,("worker_count", "9") ,("gc_alloc_block_sync_spin", "162") ,("gc_alloc_block_sync_yield", "0") ,("gc_alloc_block_sync_spin", "162") ,("gc_spin_spin", "18840855") ,("gc_spin_yield", "10355") ,("mut_spin_spin", "70331392") ,("mut_spin_yield", "61700") ,("waitForGcThreads_spin", "241") ,("waitForGcThreads_yield", "2797") ,("whitehole_gc_spin", "0") ,("whitehole_lockClosure_spin", "0") ,("whitehole_lockClosure_yield", "0") ,("whitehole_executeMessage_spin", "0") ,("whitehole_threadPaused_spin", "0") ,("any_work", "1667") ,("no_work", "1662") ,("scav_find_work", "1026") ,("gen_0_collections", "111") ,("gen_0_par_collections", "111") ,("gen_0_cpu_seconds", "0.036126") ,("gen_0_wall_seconds", "0.036126") ,("gen_0_max_pause_seconds", "0.036126") ,("gen_0_avg_pause_seconds", "0.000081") ,("gen_0_sync_spin", "21") ,("gen_0_sync_yield", "0") ,("gen_1_collections", "2") ,("gen_1_par_collections", "1") ,("gen_1_cpu_seconds", "0.001624") ,("gen_1_wall_seconds", "0.001624") ,("gen_1_max_pause_seconds", "0.001624") ,("gen_1_avg_pause_seconds", "0.000272") ,("gen_1_sync_spin", "3") ,("gen_1_sync_yield", "0") ] ``` Test Plan: Ensure that one-line and summary reports are unchanged. Reviewers: bgamari, erikd, simonmar, hvr Reviewed By: simonmar Subscribers: rwbarton, thomie, carter GHC Trac Issues: #14660 Differential Revision: https://phabricator.haskell.org/D4303
-
Ben Gamari authored
Summary: get/setAllocationCounter didn't take into account allocations in the current block. This was known at the time, but it turns out to be important to have more accuracy when using these in a fine-grained way. Test Plan: New unit test to test incrementally larger allocaitons. Before I got results like this: ``` +0 +0 +0 +0 +0 +4096 +0 +0 +0 +0 +0 +4064 +0 +0 +4088 +4056 +0 +0 +0 +4088 +4096 +4056 +4096 ``` Notice how the results aren't always monotonically increasing. After this patch: ``` +344 +416 +488 +560 +632 +704 +776 +848 +920 +992 +1064 +1136 +1208 +1280 +1352 +1424 +1496 +1568 +1640 +1712 +1784 +1856 +1928 +2000 +2072 +2144 ``` Reviewers: hvr, erikd, simonmar, jrtc27, trommler Reviewed By: simonmar Subscribers: trommler, jrtc27, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4363
-
Douglas Wilson authored
The existing internal counters: * gc_alloc_block_sync * whitehole_spin * gen[g].sync * gen[1].sync are now not shown in the -s report unless --internal-counters is also passed. If --internal-counters is passed we now show the counters above, reformatted, as well as several other counters. In particular, we now count the yieldThread() calls that SpinLocks do as well as their spins. The added counters are: * gc_spin (spin and yield) * mut_spin (spin and yield) * whitehole_threadPaused (spin only) * whitehole_executeMessage (spin only) * whitehole_lockClosure (spin only) * waitForGcThreadsd (spin and yield) As well as the following, which are not SpinLock-like things: * any_work * do_work * scav_find_work See the Note for descriptions of what these counters are. We add busy_wait_nops in these loops along with the counter increment where it was absent. Old internal counters output: ``` gc_alloc_block_sync: 0 whitehole_gc_spin: 0 gen[0].sync: 0 gen[1].sync: 0 ``` New internal counters output: ``` Internal Counters: Spins Yields gc_alloc_block_sync 323 0 gc_spin 9016713 752 mut_spin 57360944 47716 whitehole_gc 0 n/a whitehole_threadPaused 0 n/a whitehole_executeMessage 0 n/a whitehole_lockClosure 0 0 waitForGcThreads 2 415 gen[0].sync 6 0 gen[1].sync 1 0 any_work 2017 no_work 2014 scav_find_work 1004 ``` Test Plan: ./validate Check it builds with #define PROF_SPIN removed from includes/rts/Config.h Reviewers: bgamari, erikd, simonmar, hvr Reviewed By: simonmar Subscribers: rwbarton, thomie, carter GHC Trac Issues: #3553, #9221 Differential Revision: https://phabricator.haskell.org/D4302
-
- 09 Mar, 2018 1 commit
-
-
Sergei Trofimovich authored
Unreg build failed as: $ ./configure --enable-unregisterised $ make HC [stage 1] libraries/ghc-prim/dist-install/build/GHC/PrimopWrappers.o ghc_1.hc: In function 'ghczmprim_GHCziPrimopWrappers_pdep8zh_entry': ghc_1.hc:1810:9: error: error: implicit declaration of function 'hs_pdep8'; did you mean 'hs_ctz8'? [-Werror=implicit-function-declaration] _c3jz = hs_pdep8(*Sp, Sp[1]); ^~~~~~~~ hs_ctz8 | 1810 | _c3jz = hs_pdep8(*Sp, Sp[1]); | ^ Signed-off-by:
Sergei Trofimovich <slyfox@gentoo.org>
-
- 08 Feb, 2018 1 commit
-
-
Douglas Wilson authored
Summary: See definition of PRINTF above the change Reviewers: bgamari, erikd, simonmar, Phyx Reviewed By: Phyx Subscribers: Phyx, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4392
-
- 06 Feb, 2018 1 commit
-
-
Ben Gamari authored
Test Plan: Validate Reviewers: erikd, simonmar Reviewed By: simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4374
-
- 01 Feb, 2018 1 commit
-
-
Andreas Klebinger authored
This prevents the register being picked up as a scratch register. Otherwise the allocator would be free to use it before a call. This fixes #14619. Test Plan: ci, repro case on #14619 Reviewers: bgamari, Phyx, erikd, simonmar, RyanGlScott, simonpj Reviewed By: Phyx, RyanGlScott, simonpj Subscribers: simonpj, RyanGlScott, Phyx, rwbarton, thomie, carter GHC Trac Issues: #14619 Differential Revision: https://phabricator.haskell.org/D4348
-
- 26 Jan, 2018 1 commit
-
-
Tamar Christina authored
On Windows we use the function `win32AllocStack` to do stack allocations in 4k blocks and insert a stack check afterwards to ensure the allocation returned a valid block. The problem is this function does something that by C semantics is pointless. The stack allocated value can never escape the function, and the stack isn't used so the compiler just optimizes away the entire function body. After considering a bunch of other possibilities I think the simplest fix is to just disable optimizations for the function. Alternatively inline assembly is an option but the stack check function doesn't have a very portable name as it relies on e.g. `libgcc`. Thanks to Sergey Vinokurov for helping diagnose and test. Test Plan: ./validate Reviewers: bgamari, erikd, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie, carter GHC Trac Issues: #14669 Differential Revision: https://phabricator.haskell.org/D4343
-
- 18 Jan, 2018 1 commit
-
-
Ben Gamari authored
This reverts commit a1a689dd.
-
- 08 Jan, 2018 1 commit
-
-
Simon Marlow authored
Summary: get/setAllocationCounter didn't take into account allocations in the current block. This was known at the time, but it turns out to be important to have more accuracy when using these in a fine-grained way. Test Plan: New unit test to test incrementally larger allocaitons. Before I got results like this: ``` +0 +0 +0 +0 +0 +4096 +0 +0 +0 +0 +0 +4064 +0 +0 +4088 +4056 +0 +0 +0 +4088 +4096 +4056 +4096 ``` Notice how the results aren't always monotonically increasing. After this patch: ``` +344 +416 +488 +560 +632 +704 +776 +848 +920 +992 +1064 +1136 +1208 +1280 +1352 +1424 +1496 +1568 +1640 +1712 +1784 +1856 +1928 +2000 +2072 +2144 ``` Reviewers: niteria, bgamari, hvr, erikd Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4288
-
- 23 Nov, 2017 1 commit
-
-
takano-akio authored
Test Plan: ./validate Reviewers: bgamari, erikd, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie GHC Trac Issues: #14513 Differential Revision: https://phabricator.haskell.org/D4226
-
- 22 Nov, 2017 1 commit
-
-
Ben Gamari authored
Reviewers: erikd, simonmar Reviewed By: simonmar Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D4194
-
- 16 Nov, 2017 1 commit
-
-
Simon Marlow authored
Summary: GC sync is the time between a GC being intiated and all the mutator threads finally stopping so that the GC can start. Problems that cause the GC sync to be delayed are hard to find and can cause dramatic slowdowns for heavily parallel programs. The new flag --long-gc-sync=<time> helps by emitting a warning and calling a user-overridable hook when the GC sync time exceeds the specified threshold. A debugger can be used to set a breakpoint when this happens and inspect the stacks of threads to find the culprit. Test Plan: ``` $ ./inplace/bin/ghc-stage2 +RTS --long-gc-sync=0.0000001 -S Alloc Copied Live GC GC TOT TOT Page Flts bytes bytes bytes user elap user elap 1135856 51144 153736 0.000 0.000 0.002 0.002 0 0 (Gen: 0) 1034760 94704 188752 0.000 0.000 0.002 0.002 0 0 (Gen: 0) 1038888 134832 228888 0.009 0.009 0.011 0.011 0 0 (Gen: 1) 1025288 90128 235184 0.000 0.000 0.012 0.012 0 0 (Gen: 0) 1049088 130080 333984 0.000 0.000 0.013 0.013 0 0 (Gen: 0) Warning: waited 0us for GC sync 1034424 73360 331976 0.000 0.000 0.013 0.013 0 0 (Gen: 0) ``` Also tested on a real production problem. Reviewers: niteria, bgamari, erikd Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D4193
-
- 08 Nov, 2017 1 commit
-
-
Herbert Valerio Riedel authored
These removes left-overs from e3ba26f8 where I implemented `compareByteArray#` as an out-of-line primop, which got optimised into an inline primop shortly afterwards (as per 76735615).
-
- 30 Oct, 2017 1 commit
-
-
Michal Terepeta authored
This is another step for fixing #13825 and is based on D38 by Simon Marlow. The change allows storing multiple constructor fields within the same word. This currently applies only to `Float`s, e.g., ``` data Foo = Foo {-# UNPACK #-} !Float {-# UNPACK #-} !Float ``` on 64-bit arch, will now store both fields within the same constructor word. For `WordX/IntX` we'll need to introduce new primop types. Main changes: - We now use sizes in bytes when we compute the offsets for constructor fields in `StgCmmLayout` and introduce padding if necessary (word-sized fields are still word-aligned) - `ByteCodeGen` had to be updated to correctly construct the data types. This required some new bytecode instructions to allow pushing things that are not full words onto the stack (and updating `Interpreter.c`). Note that we only use the packed stuff when constructing data types (i.e., for `PACK`), in all other cases the behavior should not change. - `RtClosureInspect` was changed to handle the new layout when extracting subterms. This seems to be used by things like `:print`. I've also added a test for this. - I deviated slightly from Simon's approach and use `PrimRep` instead of `ArgRep` for computing the size of fields. This seemed more natural and in the future we'll probably want to introduce new primitive types (e.g., `Int8#`) and `PrimRep` seems like a better place to do that (where we already have `Int64Rep` for example). `ArgRep` on the other hand seems to be more focused on calling functions. Signed-off-by:
Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar, austin, hvr, goldfire, erikd Reviewed By: bgamari Subscribers: maoe, rwbarton, thomie GHC Trac Issues: #13825 Differential Revision: https://phabricator.haskell.org/D3809
-
- 22 Oct, 2017 1 commit
-
-
Tamar Christina authored
Summary: This patch adds the ability to generate stack traces on crashes for Windows. When running in the interpreter this attempts to use symbol information from the interpreter and information we know about the loaded object files to resolve addresses to symbols. When running compiled it doesn't have this information and then defaults to using symbol information from PDB files. Which for now means only files compiled with ICC or MSVC will show traces compiled. But I have a future patch that may address this shortcoming. Also since I don't know how to walk a pure haskell stack, I can for now only show the last entry. I'm hoping to figure out how Apply.cmm works to be able to walk the stalk and give more entries for pure haskell code. In GHCi ``` $ echo main | inplace/bin/ghc-stage2.exe --interactive ./testsuite/tests/rts/derefnull.hs GHCi, version 8.3.20170830: http://www.haskell.org/ghc/ :? for help Ok, 1 module loaded. Prelude Main> Access violation in generated code when reading 0x0 Attempting to reconstruct a stack trace... Frame Code address * 0x77cde10 0xc370229 E:\..\base\dist-install\build\HSbase-4.10.0.0.o+0x190031 (base_ForeignziStorable_zdfStorableInt4_info+0x3f) ``` and compiled ``` Access violation in generated code when reading 0x0 Attempting to reconstruct a stack trace... Frame Code address * 0xf0dbd0 0x40bb01 E:\..\rts\derefnull.run\derefnull.exe+0xbb01 ``` Test Plan: ./validate Reviewers: austin, hvr, bgamari, erikd, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3913
-
- 16 Oct, 2017 1 commit
-
-
Herbert Valerio Riedel authored
The new primop compareByteArrays# :: ByteArray# -> Int# {- offset -} -> ByteArray# -> Int# {- offset -} -> Int# {- length -} -> Int# allows to compare the subrange of the first `ByteArray#` to the (same-length) subrange of the second `ByteArray#` and returns a value less than, equal to, or greater than zero if the range is found, respectively, to be byte-wise lexicographically less than, to match, or be greater than the second range. Under the hood, the new primop is implemented in terms of the standard ISO C `memcmp(3)` function. It is currently an out-of-line primop but work is underway to optimise this into an inline primop for a future follow-up Differential (see D4091). This primop has applications in packages like `text`, `text-short`, `bytestring`, `text-containers`, `primitive`, etc. which currently have to incur the overhead of an ordinary FFI call to directly or indirectly invoke `memcmp(3)` as well has having to deal with some `unsafePerformIO`-variant. While at it, this also improves the documentation for the existing `copyByteArray#` primitive which has a non-trivial type-signature that significantly benefits from a more explicit description of its arguments. Reviewed By: bgamari Differential Revision: https://phabricator.haskell.org/D4090
-
- 03 Oct, 2017 1 commit
-
-
Tamar Christina authored
It's often hard to debug things like segfaults on Windows, mostly because gdb isn't always of use and users don't know how to effectively use it. This patch provides a way to create a crash drump by passing `+RTS --generate-crash-dumps` as an option. If any unhandled exception is triggered a dump is made that contains enough information to be able to diagnose things successfully. Currently the created dumps are a bit big because I include all registers, code and threads information. This looks like ``` $ testsuite/tests/rts/derefnull.run/derefnull.exe +RTS --generate-crash-dumps Access violation in generated code when reading 0000000000000000 Crash dump created. Dump written to: E:\msys64\tmp\ghc-20170901-220250-11216-16628.dmp ``` Test Plan: ./validate Reviewers: austin, hvr, bgamari, erikd, simonmar Reviewed By: bgamari, simonmar Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3912
-