- 29 Mar, 2014 1 commit
-
-
tibbe authored
These array types are smaller than Array# and MutableArray# and are faster when the array size is small, as they don't have the overhead of a card table. Having no card table reduces the closure size by 2 words in the typical small array case and leads to less work when updating or GCing the array. This reduces both the runtime and memory allocation by 8.8% on my insert benchmark for the HashMap type in the unordered-containers package, which makes use of lots of small arrays. With tuned GC settings (i.e. `+RTS -A6M`) the runtime reduction is 15%. Fixes #8923.
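The 2-word saving can be checked with a small size model. This is only a sketch: it assumes the types in question are GHC's SmallArray# and SmallMutableArray#, a one-word closure header, 8-byte words, and a card table with one byte per 128 elements rounded up to whole words. All function and constant names are made up for illustration and are not GHC APIs.

```haskell
-- Hypothetical closure-size model, in machine words.
-- Every constant below is an assumption, not read from GHC itself.

-- | Words in a closure header (assumed: info pointer only, no profiling).
headerWords :: Int
headerWords = 1

-- | Bytes per machine word (assumed 64-bit target).
wordBytes :: Int
wordBytes = 8

-- | Elements covered by one card-table byte (assumed 128).
elemsPerCard :: Int
elemsPerCard = 128

-- | Words taken by the card table of an n-element array:
-- one byte per card, rounded up to whole words.
cardTableWords :: Int -> Int
cardTableWords n = (cards + wordBytes - 1) `div` wordBytes
  where
    cards = (n + elemsPerCard - 1) `div` elemsPerCard

-- | Array#/MutableArray#: header, element count, size field,
-- payload, then the card table.
arrayWords :: Int -> Int
arrayWords n = headerWords + 2 + n + cardTableWords n

-- | Small array variant: header, element count, payload; no card table.
smallArrayWords :: Int -> Int
smallArrayWords n = headerWords + 1 + n
```

Under these assumptions, `arrayWords 16` is 20 and `smallArrayWords 16` is 18: one word saved for the size field and one for the card table.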
-
- 22 Mar, 2014 1 commit
-
-
tibbe authored
The inline allocation version is 69% faster than the out-of-line version, when cloning an array of 16 unit elements on a 64-bit machine. Comparing the new and the old primop implementations isn't straightforward. The old version had a missing heap check that I discovered during the development of the new version. Comparing the old and the new version would require fixing the old version, which in turn means reimplementing the equivalent of MAYBE_CG in StgCmmPrim. The inline allocation threshold is configurable via -fmax-inline-alloc-size which gives the maximum array size, in bytes, to allocate inline. The size does not include the closure header size. Allowing the same primop to be either inline or out-of-line has some implications for how we lay out heap checks. We always place a heap check around out-of-line primops, as they may allocate outside of our knowledge. However, for the inline primops we only allow allocation via the standard means (i.e. virtHp). Since the clone primops might be either inline or out-of-line the heap check layout code now consults shouldInlinePrimOp to know whether a primop will be inlined.
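The decision described above can be sketched roughly as follows. This is an illustrative model, not the code in StgCmmPrim: the names `Alloc`, `cloneAlloc` and `needsOwnHeapCheck` are hypothetical, an 8-byte word is assumed, and the threshold is passed in as a plain argument standing in for the value of -fmax-inline-alloc-size.

```haskell
-- Hypothetical model of the inline/out-of-line choice for a clone primop.
-- None of these names are taken from GHC.

-- | How a clone call site gets compiled.
data Alloc = Inline | OutOfLine
  deriving (Eq, Show)

-- | Bytes per machine word (assumed 64-bit target).
bytesPerWord :: Int
bytesPerWord = 8

-- | Choose the allocation strategy.
--   First argument: the inline threshold in bytes (payload only,
--   closure header not counted).
--   Second argument: the element count, if it is a compile-time constant.
cloneAlloc :: Int -> Maybe Int -> Alloc
cloneAlloc maxInlineBytes (Just n)
  | n * bytesPerWord <= maxInlineBytes = Inline
cloneAlloc _ _ = OutOfLine

-- | Out-of-line primops may allocate behind the code generator's back,
--   so they get their own heap check; inline ones allocate through the
--   usual heap-pointer bookkeeping and share the enclosing check.
needsOwnHeapCheck :: Alloc -> Bool
needsOwnHeapCheck OutOfLine = True
needsOwnHeapCheck Inline    = False
```

With an example threshold of 128 bytes, `cloneAlloc 128 (Just 16)` gives `Inline`, while `cloneAlloc 128 (Just 17)` and `cloneAlloc 128 Nothing` both give `OutOfLine`.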
-
- 17 Mar, 2014 1 commit
-
-
Simon Peyton Jones authored
We don't yet understand WHY commit ad15c2, which is to do with CmmSink, causes seg-faults on Windows, but it certainly seems to. So reverting it is a stop-gap, but we need to un-block the 7.8 release. Many thanks to awson for identifying the offending commit.
-
- 13 Mar, 2014 1 commit
-
- 11 Mar, 2014 2 commits
-
-
Simon Marlow authored
* Move array representation knowledge into SMRep
* Separate out low-level heap-object allocation so that we can reuse it from doNewArrayOp
* Remove card-table initialisation; we can safely ignore the card table for newly allocated arrays.
-
Simon Marlow authored
I'd like to be able to pack together non-pointer fields that are less than a word in size, and this is a necessary prerequisite.
-
- 03 Feb, 2014 2 commits
-
-
Jan Stolarek authored
The end of the Cmm pipeline used to be split into two alternative flows, depending on whether we did proc-point splitting or not. There was a lot of code duplication between these two branches, but it wasn't really necessary, as the differences can easily be enclosed within an if-then-else. I observed no impact of this change on compilation performance.
-
Jan Stolarek authored
-
- 02 Feb, 2014 2 commits
-
-
Jan Stolarek authored
-
Jan Stolarek authored
-
- 01 Feb, 2014 3 commits
-
-
Jan Stolarek authored
* The CmmRewriteAssignments module was replaced by CmmSink a long time ago. That module is now available on the https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Hoopl/Examples wiki page.
* The removeDeadAssignments function was not used, and it was also moved to the above page.
* I also nuked some commented-out debugging code that had not been used for 1.5 years.
-
Jan Stolarek authored
It turns out that one of the cases in the optimization pass was a special case of another. I removed that special case since it has no impact on compilation time, and the resulting Cmm is identical.
-
Jan Stolarek authored
-
- 26 Jan, 2014 1 commit
-
-
Gabor Greif authored
-
- 16 Jan, 2014 4 commits
-
-
Simon Marlow authored
By using the constant-folder to reduce it to an integer.
-
Simon Marlow authored
We occasionally need to reserve some temporary memory in a primop for passing to a foreign function. We've been using the stack for this, but when we moved to high-level Cmm it became quite fragile because primops are in high-level Cmm and the stack is supposed to be under the control of the Cmm pipeline.

So this change puts things on a firmer footing by adding a new Cmm construct 'reserve', e.g. in decodeFloat_Int#:

    reserve 2 = tmp {

        mp_tmp1  = tmp + WDS(1);
        mp_tmp_w = tmp;

        /* Perform the operation */
        ccall __decodeFloat_Int(mp_tmp1 "ptr", mp_tmp_w "ptr", arg);

        r1 = W_[mp_tmp1];
        r2 = W_[mp_tmp_w];
    }

reserve is described in CmmParse.y. Unfortunately the argument to reserve must be a compile-time constant. We might have to extend the parser to allow expressions with arithmetic operators if this is too restrictive.

Note also that the return instruction for the procedure must be outside the scope of the reserved stack area, so we have to extract the values from the reserved area before we close the scope. This means some more local variables (r1, r2 in the example above). The generated code is more or less identical to what we had before though.
-
Gabor Greif authored
-
Simon Marlow authored
-
- 10 Jan, 2014 1 commit
-
-
Simon Marlow authored
-
- 28 Nov, 2013 3 commits
-
-
Simon Marlow authored
-
parcs authored
-
Simon Marlow authored
-
- 22 Nov, 2013 3 commits
-
-
Simon Peyton Jones authored
This bug only shows up when you are using proc-point splitting. What was happening was:
* We generate a proc-point for the stack check
* And an info table
* We eliminate the stack check because it's redundant
* And the dangling info table caused a panic in CmmBuildInfoTables.bundle
-
Simon Peyton Jones authored
-
Simon Peyton Jones authored
-
- 03 Nov, 2013 1 commit
-
-
Herbert Valerio Riedel authored
Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
-
- 26 Oct, 2013 1 commit
-
-
Austin Seipp authored
This reverts commit 2f5db98e.
-
- 25 Oct, 2013 2 commits
-
-
Simon Marlow authored
Inlining global registers and constants made code slightly larger in some cases. I finally got around to looking into why, and discovered one reason: we weren't discarding dead code in some cases. This patch fixes it.
-
Simon Marlow authored
-
- 24 Oct, 2013 1 commit
-
-
Jan Stolarek authored
Fixes #8456
-
- 18 Oct, 2013 4 commits
-
-
Simon Peyton Jones authored
-
Jan Stolarek authored
Fixes #8456. The previous version of the control flow optimisations did not update the list of block predecessors, leading to unnecessary duplication of blocks in some cases. See Trac and comments in the code for more details.
-
Simon Peyton Jones authored
The only substantive change here is to change "==" into ">=" in the Note [Always false stack check] code. This is semantically correct, but won't have any practical impact.
-
Simon Peyton Jones authored
-
- 17 Oct, 2013 1 commit
-
-
Jan Stolarek authored
Fix a bug introduced in 94125c97. See Note [Always false stack check]
-
- 16 Oct, 2013 3 commits
-
-
Jan Stolarek authored
I am removing old loopification code that has been commented out for a long, long time. We now have loopification implemented in the code generator (see Note [Self-recursive tail calls]) so we won't need to resurrect this old code.
-
Jan Stolarek authored
-
Jan Stolarek authored
When compiling a function we can determine how much stack space it will use. We therefore need to perform only a single stack check at the beginning of a function to see if we have enough stack space. Instead of referring directly to Sp, as we used to do in the past, the code generator uses (old + 0) in the stack check. The stack layout phase turns (old + 0) into Sp. The idea here is that, while we need to perform only one stack check for each function, we could in theory place more stack checks later in the function. They would be redundant, but not incorrect (in the sense that they should not change program behaviour). We need to make sure, however, that a stack check inserted after incrementing the stack pointer checks for a correspondingly smaller amount of stack space. This would not be the case if the code generator produced direct references to Sp. By referencing (old + 0) we make sure that we always check for the correct amount of stack: when converting (old + 0) to Sp the stack layout phase takes into account changes already made to the stack pointer. The idea for this change came from observations made while debugging #8275.
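The effect of checking against (old + 0) rather than Sp can be illustrated with a toy model. This is a sketch under stated assumptions, not GHC's stack layout code: the stack is assumed to grow towards lower addresses, addresses are plain integers, and the names `Addr`, `layout`, `eval` and `checkFails` are hypothetical.

```haskell
-- Hypothetical model of a stack check written against (old + 0).
-- The stack is assumed to grow towards lower addresses.

-- | An address as seen before and after the stack layout phase.
data Addr
  = OldPlus Int  -- (old + n): relative to Sp on function entry
  | SpPlus Int   -- Sp + n: relative to the current stack pointer
  deriving (Eq, Show)

-- | Stack layout: express an address in terms of the current Sp, given
--   how many bytes the function has already allocated at this point.
layout :: Int -> Addr -> Addr
layout allocated (OldPlus n) = SpPlus (n + allocated)
layout _         addr        = addr

-- | Concrete value of an Sp-relative address for a given current Sp.
eval :: Int -> Addr -> Int
eval sp (SpPlus n)  = sp + n
eval _  (OldPlus _) = error "eval: run layout first"

-- | The check fails if 'needed' bytes below 'base' would cross the limit.
checkFails :: Int -> Int -> Int -> Bool
checkFails spLim needed base = base - needed < spLim
```

Take an entry Sp of 1000, a limit of 900 and a function needing 96 bytes. At entry, `checkFails 900 96 (eval 1000 (layout 0 (OldPlus 0)))` is `False`. After 64 bytes have been allocated (Sp is 936), a check written against (old + 0) still resolves to the entry value, so `checkFails 900 96 (eval 936 (layout 64 (OldPlus 0)))` is also `False`, whereas a check written directly against Sp, `checkFails 900 96 (eval 936 (SpPlus 0))`, is `True` and would fail spuriously.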
-
- 12 Oct, 2013 2 commits
-
-
Krzysztof Gogolewski authored
-
rwbarton authored
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
-