- 23 Sep, 2015 1 commit
-
-
Simon Marlow authored
Summary: This allows the code generator to give hints to later code generation steps about which branch is most likely to be taken. Right now it is only taken into account in one place: a special case in CmmContFlowOpt that swapped branches over to maximise the chance of fallthrough, which is now disabled when there is a likelihood setting. Test Plan: validate Reviewers: austin, simonpj, bgamari, ezyang, tibbe Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1273
-
- 30 Mar, 2015 1 commit
-
-
Joachim Breitner authored
This re-implements the code generation for case expressions at the Stg → Cmm level, both for data type cases as well as for integral literal cases. (Cases on float are still treated as before). The goal is to allow for fancier strategies in implementing them, for a cleaner separation of the strategy from the gritty details of Cmm, and to run this later than the Common Block Optimization, allowing for one way to attack #10124. The new module CmmSwitch contains a number of notes explaining this changes. For example, it creates larger consecutive jump tables than the previous code, if possible. nofib shows little significant overall improvement of runtime. The rather large wobbling comes from changes in the code block order (see #8082, not much we can do about it). But the decrease in code size alone makes this worthwhile. ``` Program Size Allocs Runtime Elapsed TotalMem Min -1.8% 0.0% -6.1% -6.1% -2.9% Max -0.7% +0.0% +5.6% +5.7% +7.8% Geometric Mean -1.4% -0.0% -0.3% -0.3% +0.0% ``` Compilation time increases slightly: ``` -1 s.d. ----- -2.0% +1 s.d. ----- +2.5% Average ----- +0.3% ``` The test case T783 regresses a lot, but it is the only one exhibiting any regression. The cause is the changed order of branches in an if-then-else tree, which makes the hoople data flow analysis traverse the blocks in a suboptimal order. Reverting that gets rid of this regression, but has a consistent, if only very small (+0.2%), negative effect on runtime. So I conclude that this test is an extreme outlier and no reason to change the code. Differential Revision: https://phabricator.haskell.org/D720
-
- 23 Dec, 2014 1 commit
-
-
Simon Peyton Jones authored
The purpose of silent superclass parameters was to solve the awkward problem of superclass dictinaries being bound to bottom. See THE PROBLEM in Note [Recursive superclasses] in TcInstDcls Although the silent-superclass idea worked, * It had non-local consequences, and had effects even in Haddock, where we had to discard silent parameters before displaying instance declarations * It had unexpected peformance costs, shown up by Trac #3064 and its test case. In monad-transformer code, when constructing a Monad dictionary you had to pass an Applicative dictionary; and to construct that you neede a Functor dictionary. Yet these extra dictionaries were often never used. (All this got much worse when we added Applicative as a superclass of Monad.) Test T3064 compiled *far* faster after silent superclasses were eliminated. * It introduced new bugs. For example SilentParametersOverlapping, T5051, and T7862, all failed to compile because of instance overlap directly because of the silent-superclass trick. So this patch takes a new approach, which I worked out with Dimitrios in the closing hours before Christmas. It is described in detail in THE PROBLEM in Note [Recursive superclasses] in TcInstDcls. Seems to work great! Quite a bit of knock-on effect * The main implementation work is in tcSuperClasses in TcInstDcls Everything else is fall-out * IdInfo.DFunId no longer needs its n-silent argument * Ditto IDFunId in IfaceSyn * Hence interface file format changes * Now that DFunIds do not have silent superclass parameters, printing out instance declarations is simpler. There is tiny knock-on effect in Haddock, so that submodule is updated * I realised that when computing the "size of a dictionary type" in TcValidity.sizePred, we should be rather conservative about type functions, which can arbitrarily increase the size of a type. Hence the new datatype TypeSize, which has a TSBig constructor for "arbitrarily big". * instDFunType moves from TcSMonad to Inst * Interestingly, CmmNode and CmmExpr both now need a non-silent (Ord r) in a couple of instance declarations. These were previously silent but must now be explicit. * Quite a bit of wibbling in error messages
-
- 19 Dec, 2014 1 commit
-
-
Peter Wortmann authored
- Make abbrev offset absolute on Non-Mac systems - Add another termination byte at the end of the abbrev section (readelf complains) - Scope combination was wrong for the simpler cases - Shouldn't have a "global/" in front of all scopes
-
- 16 Dec, 2014 3 commits
-
-
Peter Wortmann authored
Unwind information allows the debugger to discover more information about a program state, by allowing it to "reconstruct" other states of the program. In practice, this means that we explain to the debugger how to unravel stack frames, which comes down mostly to explaining how to find their Sp and Ip register values. * We declare yet another new constructor for CmmNode - and this time there's actually little choice, as unwind information can and will change mid-block. We don't actually make use of these capabilities, and back-end support would be tricky (generate new labels?), but it feels like the right way to do it. * Even though we only use it for Sp so far, we allow CmmUnwind to specify unwind information for any register. This is pretty cheap and could come in useful in future. * We allow full CmmExpr expressions for specifying unwind values. The advantage here is that we don't have to make up new syntax, and can e.g. use the WDS macro directly. On the other hand, the back-end will now have to simplify the expression until it can sensibly be converted into DWARF byte code - a process which might fail, yielding NCG panics. On the other hand, when you're writing Cmm by hand you really ought to know what you're doing. (From Phabricator D169)
-
Peter Wortmann authored
This patch solves the scoping problem of CmmTick nodes: If we just put CmmTicks into blocks we have no idea what exactly they are meant to cover. Here we introduce tick scopes, which allow us to create sub-scopes and merged scopes easily. Notes: * Given that the code often passes Cmm around "head-less", we have to make sure that its intended scope does not get lost. To keep the amount of passing-around to a minimum we define a CmmAGraphScoped type synonym here that just bundles the scope with a portion of Cmm to be assembled later. * We introduce new scopes at somewhat random places, aligning with getCode calls. This works surprisingly well, but we might have to add new scopes into the mix later on if we find things too be too coarse-grained. (From Phabricator D169)
-
Peter Wortmann authored
This patch adds CmmTick nodes to Cmm code. This is relatively straight-forward, but also not very useful, as many blocks will simply end up with no annotations whatosever. Notes: * We use this design over, say, putting ticks into the entry node of all blocks, as it seems to work better alongside existing optimisations. Now granted, the reason for this is that currently GHC's main Cmm optimisations seem to mainly reorganize and merge code, so this might change in the future. * We have the Cmm parser generate a few source notes as well. This is relatively easy to do - worst part is that it complicates the CmmParse implementation a bit. (From Phabricator D169)
-
- 15 May, 2014 1 commit
-
-
Herbert Valerio Riedel authored
In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been reorganized, while following the convention, to - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before any `{-# OPTIONS_GHC #-}`-lines. - Moreover, if the list of language extensions fit into a single `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each individual language extension. In both cases, try to keep the enumeration alphabetically ordered. (The latter layout is preferable as it's more diff-friendly) While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
-
- 18 Oct, 2013 1 commit
-
-
Simon Peyton Jones authored
The only substantive change here is to change "==" into ">=" in the Note [Always false stack check] code. This is semantically correct, but won't have any practical impact.
-
- 04 Sep, 2013 1 commit
-
-
Jan Stolarek authored
And update comments
-
- 03 Sep, 2013 1 commit
-
-
Jan Stolarek authored
-
- 24 Jul, 2013 1 commit
-
-
Simon Marlow authored
We weren't properly tracking the number of stack arguments in the continuation of a foreign call. It happened to work when the continuation was not a join point, but when it was a join point we were using the wrong amount of stack fixup.
-
- 14 Apr, 2013 1 commit
-
-
ian@well-typed.com authored
-
- 06 Apr, 2013 1 commit
-
-
Gabor Greif authored
-
- 05 Mar, 2013 1 commit
-
-
Simon Marlow authored
-
- 12 Nov, 2012 1 commit
-
-
Simon Marlow authored
This removes the OldCmm data type and the CmmCvt pass that converts new Cmm to OldCmm. The backends (NCGs, LLVM and C) have all been converted to consume new Cmm. The main difference between the two data types is that conditional branches in new Cmm have both true/false successors, whereas in OldCmm the false case was a fallthrough. To generate slightly better code we occasionally need to invert a conditional to ensure that the branch-not-taken becomes a fallthrough; this was previously done in CmmCvt, and it is now done in CmmContFlowOpt. We could go further and use the Hoopl Block representation for native code, which would mean that we could use Hoopl's postorderDfs and analyses for native code, but for now I've left it as is, using the old ListGraph representation for native code.
-
- 30 Oct, 2012 1 commit
-
-
gmainlan@microsoft.com authored
We would like to calculate register liveness for global registers as well as local registers, so this patch generalizes the existing infrastructure to set the stage.
-
- 08 Oct, 2012 1 commit
-
-
Simon Marlow authored
The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example: foo ( gcptr a, bits32 b ) { if (b > 0) { // we can make tail calls passing arguments: jump stg_ap_0_fast(a); } return (x,y); } More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y. The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g. jump %ENTRY_CODE(Sp(0)) [R1]; Again, more details in Note [Syntax of .cmm files]. I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm. Some other changes in this batch: - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments. - CmmSink now does constant-folding (should fix #7219) - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet. - RET_DYN frames are removed from the RTS, lots of code goes away - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
-
- 31 Aug, 2012 1 commit
-
-
Simon Marlow authored
This caused the CAF analysis to occasionally miss a CAF sometimes, resulting in a very hard to diagnose crash.
-
- 06 Aug, 2012 1 commit
-
-
Simon Marlow authored
See Note [foreign calls clobber GlobalRegs]
-
- 20 Jul, 2012 1 commit
-
-
Ian Lynagh authored
-
- 09 Jul, 2012 1 commit
-
-
Simon Marlow authored
This gives the register allocator access to R1.., F1.., D1.. etc. for the new code generator, and is a cheap way to eliminate all the extra "x = R1" assignments that we get from copyIn.
-
- 05 Jul, 2012 2 commits
-
-
Simon Marlow authored
-
Simon Marlow authored
-
- 15 Mar, 2012 2 commits
-
-
Simon Marlow authored
-
Simon Marlow authored
-
- 13 Feb, 2012 1 commit
-
-
Simon Marlow authored
-
- 08 Feb, 2012 1 commit
-
-
Simon Marlow authored
Also: - improvements to code generation: push slow-call continuations on the stack instead of generating explicit continuations - remove unused CmmInfo wrapper type (replace with CmmInfoTable) - squash Area and AreaId together, remove now-unused RegSlot - comment out old unused stack-allocation code that no longer compiles after removal of RegSlot
-
- 03 Feb, 2012 1 commit
-
-
Simon Marlow authored
-
- 30 Jan, 2012 1 commit
-
-
Simon Marlow authored
-
- 25 Jan, 2012 1 commit
-
-
Simon Marlow authored
-
- 23 Jan, 2012 1 commit
-
-
Simon Marlow authored
-
- 18 Jan, 2012 1 commit
-
-
Simon Marlow authored
-
- 19 Dec, 2011 1 commit
-
-
Simon Marlow authored
-
- 04 Nov, 2011 1 commit
-
-
Ian Lynagh authored
We only use it for "compiler" sources, i.e. not for libraries. Many modules have a -fno-warn-tabs kludge for now.
-
- 25 Aug, 2011 1 commit
-
-
Simon Marlow authored
-
- 19 Aug, 2011 1 commit
-
- 28 Jun, 2011 1 commit
-
-
Simon Marlow authored
-
- 17 Jun, 2011 1 commit
-
-
Edward Z. Yang authored
* Rewrote cmmMachOpFold to cmmMachOpFoldM, which returns Nothing if no folding took place. * Wrote some generic mapping functions which take functions of form 'a -> Maybe a' and are smart about sharing. * Split up optimizations from PIC and PPC work in the native codegen, so they'll be easier to turn off later (they are not currently being turned off, however.) * Whitespace fixes! ToDo: Turn off MachOp folding when new codegenerator is being used. Signed-off-by:
Edward Z. Yang <ezyang@mit.edu>
-
- 13 Jun, 2011 1 commit
-
-
Edward Z. Yang authored
Signed-off-by:
Edward Z. Yang <ezyang@mit.edu>
-