- 05 Nov, 2012 1 commit
-
-
Simon Marlow authored
-
- 30 Oct, 2012 2 commits
-
-
gmainlan@microsoft.com authored
On x86-64 F and D registers are both drawn from SSE registers, so there is no reason not to draw them from the same pool of available SSE registers. This means that whereas previously a function could only receive two Double arguments in registers even if it did not have any Float arguments, now it can receive up to 6 arguments that are any mix of Float and Double in registers. This patch breaks the LLVM back end. The next patch will fix this breakage.
-
Simon Marlow authored
Fixes this, when building unregisterised: rts/dist/build/AutoApply.hc:87:1: error: ‘stg_ap_v_entry’ undeclared (first use in this function)
-
- 19 Oct, 2012 1 commit
-
-
Simon Marlow authored
Except for CgUtils.fixStgRegisters that is used in the NCG and LLVM backends, and should probably be moved somewhere else.
-
- 16 Oct, 2012 1 commit
-
-
ian@well-typed.com authored
Mostly d -> g (matching DynFlag -> GeneralFlag). Also renamed if* to when*, matching the Haskell if/when names
-
- 08 Oct, 2012 2 commits
-
-
Simon Marlow authored
-
Simon Marlow authored
The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example: foo ( gcptr a, bits32 b ) { if (b > 0) { // we can make tail calls passing arguments: jump stg_ap_0_fast(a); } return (x,y); } More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y. The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g. jump %ENTRY_CODE(Sp(0)) [R1]; Again, more details in Note [Syntax of .cmm files]. I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm. Some other changes in this batch: - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments. - CmmSink now does constant-folding (should fix #7219) - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet. - RET_DYN frames are removed from the RTS, lots of code goes away - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
-
- 18 Sep, 2012 2 commits
-
-
ian@well-typed.com authored
StgWord is a newtyped Word64, as it needed to be something that has a UArray instance.
-
ian@well-typed.com authored
It's now a newtyped Integer. Perhaps a newtyped Word32 would make more sense, though.
-
- 16 Sep, 2012 1 commit
-
-
ian@well-typed.com authored
-
- 12 Sep, 2012 3 commits
-
-
ian@well-typed.com authored
-
ian@well-typed.com authored
-
ian@well-typed.com authored
I've switched to passing DynFlags rather than Platform, as (a) it's simpler to not have to extract targetPlatform in so many places, and (b) it may be useful to have DynFlags around in future.
-
- 28 Aug, 2012 2 commits
-
-
ian@well-typed.com authored
-
ian@well-typed.com authored
-
- 06 Aug, 2012 1 commit
-
-
ian@well-typed.com authored
This is a bit odd by itself, but it's a stepping stone on the way to putting "target unregisterised" into the settings file.
-
- 30 Jul, 2012 1 commit
-
-
Simon Marlow authored
Proc-point splitting is only required by backends that do not support having proc-points within a code block (that is, everything except the native backend, i.e. LLVM and C). Not doing proc-point splitting saves some compilation time, and might produce slightly better code in some cases.
-
- 24 Jul, 2012 1 commit
-
-
Ian Lynagh authored
All the flags that 'ways' imply are now dynamic
-
- 13 Jun, 2012 1 commit
-
-
Ian Lynagh authored
We can now get the Platform from the DynFlags inside an SDoc, so we no longer need to pass the Platform in.
-
- 12 Jun, 2012 1 commit
-
-
Ian Lynagh authored
-
- 23 Feb, 2012 1 commit
-
-
Ian Lynagh authored
No special-casing in any NCGs yet
-
- 08 Feb, 2012 1 commit
-
-
Simon Marlow authored
Also: - improvements to code generation: push slow-call continuations on the stack instead of generating explicit continuations - remove unused CmmInfo wrapper type (replace with CmmInfoTable) - squash Area and AreaId together, remove now-unused RegSlot - comment out old unused stack-allocation code that no longer compiles after removal of RegSlot
-
- 27 Jan, 2012 1 commit
-
-
Simon Marlow authored
-
- 10 Jan, 2012 1 commit
-
-
dterei authored
We now carry around with CmmJump statements a list of the STG registers that are live at that jump site. This is used by the LLVM backend so it can avoid unnesecarily passing around dead registers, improving perfromance. This gives us the framework to finally fix trac #4308.
-
- 06 Jan, 2012 2 commits
- 19 Dec, 2011 1 commit
-
-
Ian Lynagh authored
We no longer have many separate, clashing getDynFlags functions I've given each GhcMonad its own HasDynFlags instance, rather than using UndecidableInstances to make a GhcMonad m => HasDynFlags m instance.
-
- 29 Nov, 2011 2 commits
-
-
Simon Marlow authored
This means that both time and heap profiling work for parallel programs. Main internal changes: - CCCS is no longer a global variable; it is now another pseudo-register in the StgRegTable struct. Thus every Capability has its own CCCS. - There is a new built-in CCS called "IDLE", which records ticks for Capabilities in the idle state. If you profile a single-threaded program with +RTS -N2, you'll see about 50% of time in "IDLE". - There is appropriate locking in rts/Profiling.c to protect the shared cost-centre-stack data structures. This patch does enough to get it working, I have cut one big corner: the cost-centre-stack data structure is still shared amongst all Capabilities, which means that multiple Capabilities will race when updating the "allocations" and "entries" fields of a CCS. Not only does this give unpredictable results, but it runs very slowly due to cache line bouncing. It is strongly recommended that you use -fno-prof-count-entries to disable the "entries" count when profiling parallel programs. (I shall add a note to this effect to the docs).
-
Simon Marlow authored
This field was doing nothing. I think it originally appeared in a very old incarnation of the new code generator.
-
- 02 Nov, 2011 1 commit
-
-
Simon Marlow authored
User visible changes ==================== Profilng -------- Flags renamed (the old ones are still accepted for now): OLD NEW --------- ------------ -auto-all -fprof-auto -auto -fprof-exported -caf-all -fprof-cafs New flags: -fprof-auto Annotates all bindings (not just top-level ones) with SCCs -fprof-top Annotates just top-level bindings with SCCs -fprof-exported Annotates just exported bindings with SCCs -fprof-no-count-entries Do not maintain entry counts when profiling (can make profiled code go faster; useful with heap profiling where entry counts are not used) Cost-centre stacks have a new semantics, which should in most cases result in more useful and intuitive profiles. If you find this not to be the case, please let me know. This is the area where I have been experimenting most, and the current solution is probably not the final version, however it does address all the outstanding bugs and seems to be better than GHC 7.2. Stack traces ------------ +RTS -xc now gives more information. If the exception originates from a CAF (as is common, because GHC tends to lift exceptions out to the top-level), then the RTS walks up the stack and reports the stack in the enclosing update frame(s). Result: +RTS -xc is much more useful now - but you still have to compile for profiling to get it. I've played around a little with adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem quite accurately. I plan to add more facilities for stack tracing (e.g. in GHCi) in the future. Coverage (HPC) -------------- * derived instances are now coloured yellow if they weren't used * likewise record field names * entry counts are more accurate (hpc --fun-entry-count) * tab width is now correct (markup was previously off in source with tabs) Internal changes ================ In Core, the Note constructor has been replaced by Tick (Tickish b) (Expr b) which is used to represent all the kinds of source annotation we support: profiling SCCs, HPC ticks, and GHCi breakpoints. Depending on the properties of the Tickish, different transformations apply to Tick. See CoreUtils.mkTick for details. Tickets ======= This commit closes the following tickets, test cases to follow: - Close #2552: not a bug, but the behaviour is now more intuitive (test is T2552) - Close #680 (test is T680) - Close #1531 (test is result001) - Close #949 (test is T949) - Close #2466: test case has bitrotted (doesn't compile against current version of vector-space package)
-
- 25 Aug, 2011 3 commits
-
-
Simon Peyton Jones authored
CmmTop -> CmmDecl CmmPgm -> CmmGroup
-
Simon Marlow authored
-
Simon Marlow authored
-
- 28 Jul, 2011 1 commit
-
-
batterseapower authored
Put the info CLabel in CmmInfoTable rather than a localness flag, tidy up some info<->entry conversions
-
- 15 Jul, 2011 1 commit
-
-
Ian Lynagh authored
There's now a variant of the Outputable class that knows what platform we're targetting: class PlatformOutputable a where pprPlatform :: Platform -> a -> SDoc pprPlatformPrec :: Platform -> Rational -> a -> SDoc and various instances have had to be converted to use that class, and we pass Platform around accordingly.
-
- 13 Jul, 2011 1 commit
-
-
Ian Lynagh authored
They've been deprecated since GHC 6.12.
-
- 07 Jul, 2011 1 commit
-
-
batterseapower authored
This is safe because GHC never generates a fast call to a data constructor worker: if the call is seen statically it will be eta-expanded and the allocation of the data will be inlined. We still need to export the _closure in case the constructor is used in an unapplied fashion.
-
- 05 Jul, 2011 2 commits
-
-
batterseapower authored
-
batterseapower authored
I observed that the [CmmStatics] within CmmData uses the list in a very stylised way. The first item in the list is almost invariably a CmmDataLabel. Many parts of the compiler pattern match on this list and fail if this is not true. This patch makes the invariant explicit by introducing a structured type CmmStatics that holds the label and the list of remaining [CmmStatic]. There is one wrinkle: the x86 backend sometimes wants to output an alignment directive just before the label. However, this can be easily fixed up by parameterising the native codegen over the type of CmmStatics (though the GenCmmTop parameterisation) and using a pair (Alignment, CmmStatics) there instead. As a result, I think we will be able to remove CmmAlign and CmmDataLabel from the CmmStatic data type, thus nuking a lot of code and failing pattern matches. This change will come as part of my next patch.
-
- 09 Jun, 2011 1 commit
-
-
Ian Lynagh authored
The "Unhelpful" cases are now in a separate type. This allows us to improve various things, e.g.: * Most of the panic's in SrcLoc are now gone * The Lexer now works with RealSrcSpans rather than SrcSpans, i.e. it knows that it has real locations and thus can assume that the line number etc really exists * Some of the more suspicious cases are no longer necessary, e.g. we no longer need this case in advanceSrcLoc: advanceSrcLoc loc _ = loc -- Better than nothing More improvements can probably be made, e.g. tick locations can probably use RealSrcSpans too.
-