Commits · c30cea53dc328a326fc676e7824396518fd3ebda · Reinier Maas / GHC

Jul 07, 2023

driver: Fix -S with .cmm files · 76983a0d

Matthew Pickering authored 1 year ago and

Marge Bot committed 1 year ago

There was an oversight in the driver which assumed that you would always
produce a `.o` file when compiling a .cmm file.

Fixes #23610

76983a0d

Jun 13, 2023

Add a test Way for running ghci with Core optimizations · ec01f0ec

Matthew Pickering authored 2 years ago and

Marge Bot committed 1 year ago


Tracking ticket: #23059

This runs compile_and_run tests with optimised code in the bytecode
interpreter.

Changed submodules: hpc, process

Co-authored-by: Torsten Schmits <git@tryp.io>

ec01f0ec

Apr 24, 2023

rts: always build 64-bit atomic ops · 87095f6a

Cheng Shao authored 1 year ago and

Marge Bot committed 1 year ago

This patch does a few things:

- Always build 64-bit atomic ops in rts/ghc-prim, even on 32-bit
  platforms
- Remove legacy "64bit" cabal flag of rts package
- Fix hs_xchg64 function prototype for 32-bit platforms
- Fix AtomicFetch test for wasm32

87095f6a

Apr 02, 2023
- cmm: implement parsing of MO_AtomicRMW from hand-written CMM files · 43ebd5dc
  Bodigrim authored 1 year ago and Marge Bot committed 1 year ago
  
  Fixes #23206
  43ebd5dc
Feb 28, 2023
- Testsuite: replace some js_skip with req_cmm · 239202a2
  Sylvain Henry authored 2 years ago and Marge Bot committed 2 years ago
  
  req_cmm is more informative than js_skip
  239202a2
Feb 01, 2023

compiler: properly handle non-word-sized CmmSwitch scrutinees in the wasm NCG · f0eefa3c

Cheng Shao authored 2 years ago and

Marge Bot committed 2 years ago

Currently, the wasm NCG has an implicit assumption: all CmmSwitch
scrutinees are 32-bit integers. This is not always true; #22864 is one
counter-example with a 64-bit scrutinee. This patch fixes the logic by
explicitly converting the scrutinee to a word that can be used as a
br_table operand. Fixes #22871. Also includes a regression test.

f0eefa3c

Jan 11, 2023

Misc cleanup · 083f7015

Krzysztof Gogolewski authored 2 years ago and

Marge Bot committed 2 years ago

- Remove unused mkWildEvBinder
- Use typeTypeOrConstraint - more symmetric and asserts that
  the type is Type or Constraint
- Fix escape sequences in Python; they raise a deprecation warning
  with -Wdefault

083f7015

Nov 29, 2022

Add Javascript backend · cc25d52e

Sylvain Henry authored 3 years ago


Add JS backend adapted from the GHCJS project by Luite Stegeman.

Some features haven't been ported or implemented yet. Tests for these
features have been disabled with an associated gitlab ticket.

Bump array submodule

Work funded by IOG.

Co-authored-by: Jeffrey Young <jeffrey.young@iohk.io>
Co-authored-by: Luite Stegeman <stegeman@gmail.com>
Co-authored-by: Josh Meredith <joshmeredith2008@gmail.com>

cc25d52e

Apr 27, 2022
- Give Cmm files fake ModuleNames which include full filepath · 5a7f0dee
  Matthew Pickering authored 2 years ago and Marge Bot committed 2 years ago
  
  This fixes the initialisation functions when using -prof or -finfo-table-map. Fixes #21370
  5a7f0dee
Apr 01, 2022

Minor cleanup · 8334ff9e

Krzysztof Gogolewski authored 3 years ago and

Matthew Pickering committed 2 years ago

- Remove unused functions exprToCoercion_maybe, applyTypeToArg,
  typeMonoPrimRep_maybe, runtimeRepMonoPrimRep_maybe.
- Replace orValid with a simpler check
- Use splitAtList in applyTysX
- Remove calls to extra_clean in the testsuite; it does not do anything.

Metric Decrease:
    T18223

8334ff9e

Jan 19, 2022

Fix T20638 on big-endian architectures · 95e7964b

Peter Trommler authored 3 years ago and

Marge Bot committed 3 years ago

The test reads a 16 bit value from an array of 8 bit values. Naturally,
that leads to different values read on big-endian architectures than
on little-endian. In this case the value read is 0x8081 on big-endian
and 0x8180 on little-endian. This patch changes the argument of the `and`
machop to mask bit 7 which is the only bit different. The test still checks
that bit 15 is zero, which was the original issue in #20638.
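The difference in the value read can be reproduced outside of Cmm. A minimal Python sketch, assuming the two array elements involved are 0x80 and 0x81 (inferred from the values quoted above, not taken from the test source):

```python
# Two consecutive 8-bit array elements (assumed values, chosen to match
# the 16-bit results quoted in the commit message).
raw = bytes([0x80, 0x81])

# The same two bytes read as one 16-bit value under each byte order.
big = int.from_bytes(raw, "big")
little = int.from_bytes(raw, "little")

print(hex(big))     # 0x8081
print(hex(little))  # 0x8180
```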

Fixes #20906.

95e7964b

Dec 28, 2021

Multiple Home Units · fd42ab5f

Matthew Pickering authored 3 years ago


Multiple home units allows you to load different packages which may depend on
each other into one GHC session. This will allow both GHCi and HLS to support
multi component projects more naturally.

Public Interface
~~~~~~~~~~~~~~~~

In order to specify multiple units, the -unit @⟨filename⟩ flag
is given multiple times with a response file containing the arguments for each unit.
The response file contains a newline separated list of arguments.

```
ghc -unit @unitLibCore -unit @unitLib
```

where the `unitLibCore` response file contains the normal arguments that cabal would pass to `--make` mode.

```
-this-unit-id lib-core-0.1.0.0
-i
-isrc
LibCore.Utils
LibCore.Types
```

The response file for lib can specify a dependency on lib-core, so that modules in lib can use modules from lib-core.

```
-this-unit-id lib-0.1.0.0
-package-id lib-core-0.1.0.0
-i
-isrc
Lib.Parse
Lib.Render
```

Then when the compiler starts in --make mode it will compile both units lib and lib-core.

There is also very basic support for multiple home units in GHCi. At the
moment you can start a GHCi session with multiple units, but only the
:reload command is supported. Most commands in GHCi assume a single home
unit, so it is additional work to work out how to modify the interface to
support multiple loaded home units.

Options used when working with Multiple Home Units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are a few extra flags which have been introduced specifically for
working with multiple home units. The flags allow a home unit to pretend
it’s more like an installed package, for example, specifying the package
name, module visibility and reexported modules.

-working-dir ⟨dir⟩

    It is common to assume that a package is compiled in the directory
    where its cabal file resides. Thus, all paths used in the compiler
    are assumed to be relative to this directory. When there are
    multiple home units the compiler is often not operating in the
    standard directory and instead where the cabal.project file is
    located. In this case the -working-dir option can be passed which
    specifies the path from the current directory to the directory the
    unit assumes to be its root, normally the directory which contains
    the cabal file.

    When the flag is passed, any relative paths used by the compiler are
    offset by the working directory. Notably this includes -i and
    -I⟨dir⟩ flags.

-this-package-name ⟨name⟩

    This flag papers over the awkward interaction of PackageImports
    and multiple home units. When using PackageImports you can specify
    the name of the package in an import to disambiguate between modules
    which appear in multiple packages with the same name.

    This flag allows a home unit to be given a package name so that you
    can also disambiguate between multiple home units which provide
    modules with the same name.

-hidden-module ⟨module name⟩

    This flag can be supplied multiple times in order to specify which
    modules in a home unit should not be visible outside of the unit it
    belongs to.

    The main use of this flag is to be able to recreate the difference
    between an exposed and hidden module for installed packages.

-reexported-module ⟨module name⟩

    This flag can be supplied multiple times in order to specify which
    modules are not defined in a unit but should be reexported. The
    effect is that other units will see this module as if it was defined
    in this unit.

    The use of this flag is to be able to replicate the reexported
    modules feature of packages with multiple home units.
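The path offsetting described for -working-dir above can be modelled in a few lines. This is only a sketch of the idea, not GHC's implementation: the function name `offset_by_working_dir` is hypothetical, and leaving absolute paths untouched and normalising the result are assumptions.

```python
import posixpath

def offset_by_working_dir(working_dir, path):
    """Offset a relative path by working_dir; leave absolute paths alone.

    Hypothetical helper: models how a relative -i or -I argument would be
    interpreted once -working-dir is given.
    """
    if posixpath.isabs(path):
        return path
    return posixpath.normpath(posixpath.join(working_dir, path))

print(offset_by_working_dir("packages/lib-core", "src"))           # packages/lib-core/src
print(offset_by_working_dir("packages/lib-core", "/usr/include"))  # /usr/include
```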

Offsetting Paths in Template Haskell splices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When using Template Haskell to embed files into your program,
traditionally the paths have been interpreted relative to the directory
where the .cabal file resides. This causes problems for multiple home
units as we are compiling many different libraries at once which have
.cabal files in different directories.

For this purpose we have introduced a way to query the value of the
-working-dir flag to the Template Haskell API. By using this function we
can implement a makeRelativeToProject function which offsets a path
which is relative to the original project root by the value of
-working-dir.

```
import Language.Haskell.TH.Syntax ( makeRelativeToProject )

foo = $(makeRelativeToProject "./relative/path" >>= embedFile)
```

> If you write a relative path in a Template Haskell splice you should use the makeRelativeToProject function so that your library works correctly with multiple home units.

A similar function already exists in the file-embed library. The
function in template-haskell implements this function in a more robust
manner by honouring the -working-dir flag rather than searching the file
system.

Closure Property for Home Units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For tools or libraries using the API there is one very important closure
property which must be adhered to:

> Any dependency which is not a home unit must not (transitively) depend
  on a home unit.

For example, if you have three packages p, q and r, then if p depends on
q which depends on r then it is illegal to load both p and r as home
units but not q, because q is a dependency of the home unit p which
depends on another home unit r.
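For API users who need to check this themselves, the property can be sketched as a small graph traversal. The Python below is an illustration only: `closure_violations` and the dictionary encoding of the dependency graph are hypothetical and unrelated to the GHC API.

```python
def closure_violations(deps, home_units):
    """Return non-home units that are dependencies of a home unit and
    (transitively) depend on a home unit themselves."""
    def reachable(start):
        # All units transitively depended on by `start`.
        seen, stack = set(), list(deps.get(start, ()))
        while stack:
            unit = stack.pop()
            if unit not in seen:
                seen.add(unit)
                stack.extend(deps.get(unit, ()))
        return seen

    violations = set()
    for home in home_units:
        for dep in reachable(home):
            if dep not in home_units and reachable(dep) & home_units:
                violations.add(dep)
    return sorted(violations)

# p depends on q, which depends on r.
deps = {"p": ["q"], "q": ["r"], "r": []}
print(closure_violations(deps, {"p", "r"}))       # ['q']
print(closure_violations(deps, {"p", "q", "r"}))  # []
```

Loading p and r without q is reported as a violation, while loading all three satisfies the property.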

If you are using GHC by the command line then this property is checked,
but if you are using the API then you need to check this property
yourself. If you get it wrong you will probably get some very confusing
errors about overlapping instances.

Limitations of Multiple Home Units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are a few limitations of the initial implementation which will be smoothed out on user demand.

    * Package thinning/renaming syntax is not supported
    * More complicated reexports/renaming are not yet supported.
    * It’s more common to run into existing linker bugs when loading a
      large number of packages in a session (for example #20674, #20689)
    * Backpack is not yet supported when using multiple home units.
    * Dependency chasing can be quite slow with a large number of
      modules and packages.
    * Loading wired-in packages as home units is currently not supported
      (this only really affects GHC developers attempting to load
      template-haskell).
    * Barely any normal GHCi features are supported; it would be good to
      support enough for ghcid to work correctly.

Despite these limitations, the implementation works already for nearly
all packages. It has been tested on large dependency closures,
including the whole of head.hackage which is a total of 4784 modules
from 452 packages.

Internal Changes
~~~~~~~~~~~~~~~~

* The biggest change is that the HomePackageTable is replaced with the
  HomeUnitGraph. The HomeUnitGraph is a map from UnitId to HomeUnitEnv,
  which contains information specific to each home unit.
* The HomeUnitEnv contains:
    - A unit state, each home unit can have different package db flags
    - A set of dynflags, each home unit can have different flags
    - A HomePackageTable
* LinkNode: A new node type is added to the ModuleGraph, this is used to
  place the linking step into the build plan so linking can proceed in
  parallel with other packages being built.
* New invariant: Dependencies of a ModuleGraphNode can be completely
  determined by looking at the value of the node. In order to achieve
  this, downsweep now performs a more complete job of downsweeping and
  then the dependencies are recorded forever in the node rather than
  being computed again from the ModSummary.
* Some transitive module calculations are rewritten to use the
  ModuleGraph which is more efficient.
* There is always an active home unit, which simplifies modifying a lot
  of the existing API code which is unit agnostic (for example, in the
  driver).

The road may be bumpy for a little while after this change but the
basics are well-tested.

This causes one small metric increase, which we accept, and also includes a
submodule update to haddock which removes ExtendedModSummary.

Closes #10827

-------------------------
Metric Increase:
    MultiLayerModules
-------------------------

Co-authored-by: Fendor <power.walross@gmail.com>

fd42ab5f

Dec 07, 2021
- generalize GHC.Cmm.Dataflow to work over any node type · cc2bf8e9
  Norman Ramsey authored 3 years ago and Marge Bot committed 3 years ago
  
  See #20725. The commit includes source-code changes and a test case.
  cc2bf8e9
Dec 02, 2021
- testsuite: Specify expected word-size of machop tests · 44c08863
  Ben Gamari authored 3 years ago and Marge Bot committed 3 years ago
  
  These generally expect a particular word size.
  44c08863
- CmmToC: Always cast arguments as unsigned · 0aeaa8f3
  Ben Gamari authored 3 years ago and Marge Bot committed 3 years ago
  
  As noted in Note [When in doubt, cast arguments as unsigned], we must ensure that arguments have the correct signedness since some operations (e.g. `%`) have different semantics depending upon signedness.
  0aeaa8f3
- testsuite: Add testcases for various machop issues · 2f6565cf
  Ben Gamari authored 3 years ago and Marge Bot committed 3 years ago
  
  These were found by the test-primops testsuite.
  2f6565cf
Nov 23, 2021

Don't include types in test output · 1ed2aa90
Andreas Klebinger authored 3 years ago and Marge Bot committed 3 years ago

1ed2aa90

CmmSink: Be more aggressive in removing no-op assignments. · 680ef2c8

Andreas Klebinger authored 3 years ago and

Marge Bot committed 3 years ago

No-op assignments like R1 = R1 are not only wasteful. They can also
inhibit other optimizations like inlining assignments that read from
R1.

We now check for assignments being a no-op before and after we
simplify the RHS in Cmm sink which should eliminate most of these
no-ops.
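The "check before and after simplifying" idea can be shown on a toy representation. This Python sketch is not the CmmSink code: the tuple encoding of expressions and the one-rule simplifier are stand-ins invented for the example.

```python
def simplify(expr):
    """Tiny stand-in simplifier: rewrites ("add", e, 0) to e."""
    if isinstance(expr, tuple) and expr[0] == "add":
        lhs, rhs = simplify(expr[1]), simplify(expr[2])
        return lhs if rhs == 0 else ("add", lhs, rhs)
    return expr

def is_noop(reg, rhs):
    """An assignment is a no-op when the RHS is just the assigned register."""
    return rhs == reg

def drop_noops(assignments):
    kept = []
    for reg, rhs in assignments:
        if is_noop(reg, rhs):      # check before simplifying the RHS
            continue
        rhs = simplify(rhs)
        if is_noop(reg, rhs):      # check again after simplifying
            continue
        kept.append((reg, rhs))
    return kept

# R1 = R1 is caught by the first check, R1 = R1 + 0 only by the second.
block = [("R1", "R1"), ("R1", ("add", "R1", 0)), ("R2", ("add", "R1", 4))]
print(drop_noops(block))  # [('R2', ('add', 'R1', 4))]
```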

680ef2c8

Nov 06, 2021
- Make Word64 use Word64# on every architecture · 2800eee2
  Sylvain Henry authored 3 years ago and Marge Bot committed 3 years ago
  
  2800eee2
Aug 17, 2021

Test non-native switch C-- with two's complement · b784a51e

John Ericson authored 3 years ago


We don't want regressions like e8f7734d to recur.

Co-Authored-By: Sylvain Henry <hsyl20@gmail.com>

b784a51e

Jul 24, 2021
- testsuite: Add test for #20142 · a31aa271
  Ben Gamari authored 3 years ago and Marge Bot committed 3 years ago
  
  a31aa271
Mar 03, 2021

Fix array and cleanup conversion primops (#19026) · d8dc0f96

Sylvain Henry authored 4 years ago and

Marge Bot committed 4 years ago

The first change makes the array ones use the proper fixed-size types,
which also means that just like before, they can be used without
explicit conversions with the boxed sized types. (Before, it was Int# /
Word# on both sides, now it is fixed sized on both sides).

For the second change, don't use "extend" or "narrow" in some of the
user-facing primops names for conversions.

  - Names like `narrowInt32#` are misleading when `Int` is 32-bits.

  - Names like `extendInt64#` are flat-out wrong when `Int` is
    32-bits.

  - `narrow{Int,Word}<N>#` however map a type to itself, and so don't
    suffer from this problem. They are left as-is.

These changes are batched together because Alex happened to use the array
ops. We can only use released versions of Alex at this time, sadly, and
I don't want to have to have a release that won't work for the final GHC
9.2. So by combining these we get all the changes for Alex done at once.

Bump hackage state in a few places, and also make that workflow slightly
easier for the future.

Bump minimum Alex version

Bump Cabal, array, bytestring, containers, text, and binary submodules

d8dc0f96

Nov 04, 2020

NCG: Fix 64bit int comparisons on 32bit x86 · bb100805

Andreas Klebinger authored 4 years ago and

Marge Bot committed 4 years ago

We now compare these by doing 64bit subtraction and
checking the resulting flags.

We used to do this differently but the old approach was
broken when the high bits compared equal and the comparison
was one of >= or <=.

The new approach should be both correct and faster.
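As a rough model of comparing via subtraction and flags, the following Python sketch emulates a 32-bit subtract followed by a subtract-with-borrow on the high halves, then derives the signed "less than" condition from the sign and overflow flags. It illustrates the general technique only; the helper names are hypothetical and this is not the instruction sequence the NCG emits.

```python
MASK32 = 0xFFFFFFFF

def halves(x):
    """Split a signed 64-bit integer into unsigned (low, high) 32-bit halves."""
    u = x & 0xFFFFFFFFFFFFFFFF
    return u & MASK32, u >> 32

def sub32(a, b, borrow_in=0):
    """Model a 32-bit subtract: returns (sign flag, overflow flag, borrow out)."""
    raw = a - b - borrow_in
    result = raw & MASK32
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    overflow = sign_a != sign_b and sign_r != sign_a
    return bool(sign_r), overflow, int(raw < 0)

def less_than(x, y):
    """Signed 64-bit x < y using only 32-bit subtractions and flags."""
    x_lo, x_hi = halves(x)
    y_lo, y_hi = halves(y)
    _, _, borrow = sub32(x_lo, y_lo)               # low halves: only the borrow matters
    sign, overflow, _ = sub32(x_hi, y_hi, borrow)  # high halves, including the borrow
    return sign != overflow                        # signed "less" condition

def greater_equal(x, y):
    return not less_than(x, y)

def less_equal(x, y):
    return greater_equal(y, x)

# High halves equal, low halves differ: the case the message calls out.
print(greater_equal(0x100000005, 0x100000007))  # False
print(less_equal(0x100000005, 0x100000007))     # True
```

Because the borrow from the low halves feeds into the high subtraction, the case where the high halves are equal needs no special handling.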

bb100805

Jun 24, 2020
- Add tests for #17920 · cad62ef1
  Sylvain Henry authored 4 years ago and Marge Bot committed 4 years ago
  
  Metric Decrease: T12150 T12234
  cad62ef1
May 15, 2020

GHC.Cmm.Opt: Handle MO_XX_Conv · 568d7279

Ben Gamari authored 4 years ago and

Marge Bot committed 4 years ago

This MachOp was introduced by 2c959a18
but a wildcard match in cmmMachOpFoldM hid the fact that it wasn't
handled. Ideally we would eliminate the match but this appears to be a
larger task.

Fixes #18141.

568d7279

Feb 14, 2020
- testsuite/T16930: Don't rely on gnu grep specific --include · 8ef7a15a
  Ben Gamari authored 5 years ago
  
  In BSD grep this flag only affects directory recursion.
  8ef7a15a
Jan 25, 2020
- Module hierarchy: Cmm (cf #13009) · 6e2d9ee2
  Sylvain Henry authored 5 years ago and Marge Bot committed 5 years ago
  
  6e2d9ee2
Jan 16, 2020

Handle TagToEnum in the same big case as the other primops · 22c0bdc3

John Ericson authored 5 years ago and

Marge Bot committed 5 years ago

Before, it was a panic because it was handled above. But there must have
been an error in my reasoning (another caller?) because #17442 reported
the panic was hit.

But, rather than figuring out what happened, I can just make it
impossible by construction. By adding just a bit more bureaucracy in the
return types, I can handle TagToEnum in the same case as all the others,
so the big case is now total, and the panic is removed.

Fixes #17442

22c0bdc3

Sep 05, 2019
- Make the C-- O and C types constructors with DataKinds · f96d57b8
  John Ericson authored 5 years ago and Marge Bot committed 5 years ago
  
  This tightens up the kinds a bit. I use type synonyms to avoid adding promotion ticks everywhere.
  f96d57b8
Jul 26, 2019

Change behaviour of -ddump-cmm-verbose to dump each Cmm pass output to a... · aae0457f

Alex D authored 5 years ago and

Marge Bot committed 5 years ago

Change behaviour of -ddump-cmm-verbose to dump each Cmm pass output to a separate file and add -ddump-cmm-verbose-by-proc to keep old behaviour (#16930)

aae0457f

Jan 30, 2019
- testsuite: Use makefile_test · 513a449c
  Ben Gamari authored 6 years ago
  
  This eliminates most uses of run_command in the testsuite in favor of the more structured makefile_test.
  513a449c
- Revert "Batch merge" · 172a5933
  Ben Gamari authored 6 years ago
  
  This reverts commit 76c8fd67.
  172a5933
- Batch merge · 76c8fd67
  Ben Gamari authored 6 years ago
  
  76c8fd67
Nov 17, 2018

NCG: New code layout algorithm. · 912fd2b6

Andreas Klebinger authored 6 years ago

Summary:
This patch implements a new code layout algorithm.
It has been tested for x86 and is disabled on other platforms.

Performance varies slightly by CPU/machine but in general seems to be better
by around 2%.
Nofib shows only small differences of about +/- ~0.5% overall, depending on
flags/machine. Performance in other benchmarks improved significantly.

Other benchmarks include at least the benchmarks of: aeson, vector, megaparsec, attoparsec,
containers, text and xeno.

While the magnitude of gains differed, three different CPUs were tested, with
all getting faster, although to differing degrees. I tested: Sandy Bridge (Xeon), Haswell,
Skylake.

Library benchmark results summarized:
* containers: ~1.5% faster
* aeson: ~2% faster
* megaparsec: ~2-5% faster
* xml library benchmarks: 0.2%-1.1% faster
* vector-benchmarks: 1-4% faster
* text: 5.5% faster

On average GHC compile times go down, as the speedup from compiling GHC with
the new layout outweighs the overhead introduced by the new layout algorithm.

Things this patch does:

* Move code responsible for block layout into its own module.
* Move the NcgImpl Class into the NCGMonad module.
* Extract a control flow graph from the input cmm.
* Update this cfg to keep it in sync with changes during
  asm codegen. This has been tested on x64 but should work on x86.
  Other platforms still use the old code layout.
* Assign weights to the edges in the CFG based on type and limited static
  analysis which are then used for block layout.
* Once we have the final code layout eliminate some redundant jumps.

In particular, turn a sequence of:
    jne .foo
    jmp .bar
  .foo:
into
    je .bar
  .foo:
    ..
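The jump rewrite above can be sketched as a peephole pass over an instruction list. This Python model is an illustration under assumptions: instructions are encoded as hypothetical `(opcode, target)` pairs, and only a handful of condition codes are covered.

```python
# Each condition code mapped to its negation (a small illustrative subset).
INVERSE = {"je": "jne", "jne": "je", "jl": "jge", "jge": "jl", "jg": "jle", "jle": "jg"}

def invert_jumps(code):
    """Rewrite `jcc A; jmp B; A:` into `j!cc B; A:`."""
    out, i = [], 0
    while i < len(code):
        window = code[i:i + 3]
        if (len(window) == 3
                and window[0][0] in INVERSE
                and window[1][0] == "jmp"
                and window[2] == ("label", window[0][1])):
            # The conditional jump only skips the unconditional one, so
            # negate the condition and jump straight to the far target.
            out.append((INVERSE[window[0][0]], window[1][1]))
            out.append(window[2])
            i += 3
        else:
            out.append(code[i])
            i += 1
    return out

before = [("jne", ".foo"), ("jmp", ".bar"), ("label", ".foo")]
print(invert_jumps(before))  # [('je', '.bar'), ('label', '.foo')]
```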

Test Plan: ci

Reviewers: bgamari, jmct, jrtc27, simonmar, simonpj, RyanGlScott

Reviewed By: RyanGlScott

Subscribers: RyanGlScott, trommler, jmct, carter, thomie, rwbarton

GHC Trac Issues: #15124

Differential Revision: https://phabricator.haskell.org/D4726

912fd2b6

Jun 07, 2018

Check if both branches of a Cmm if have the same target. · efea32cf

Andreas Klebinger authored 6 years ago

This happens for some reason or other and makes it into the final
binary. I've added the check to ContFlowOpt as that seems
like a logical place for this.
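A toy version of such a check might look like the following. It assumes the natural rewrite (turning the conditional into an unconditional jump when both arms agree), which the message does not spell out; the tuple encoding of control-flow nodes is invented for the example.

```python
def simplify_branch(node):
    """Replace a conditional branch whose arms share a target with a plain jump.

    Nodes are hypothetical tuples: ("if", condition, true_target, false_target)
    or ("goto", target).
    """
    if node[0] == "if" and node[2] == node[3]:
        return ("goto", node[2])
    return node

print(simplify_branch(("if", "x == 0", "L1", "L1")))  # ('goto', 'L1')
print(simplify_branch(("if", "x == 0", "L1", "L2")))  # ('if', 'x == 0', 'L1', 'L2')
```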

In a regular nofib run there were 30 occurrences of this pattern.

Test Plan: ci

Reviewers: bgamari, simonmar, dfeuer, jrtc27, tdammers

Reviewed By: bgamari, simonmar

Subscribers: tdammers, dfeuer, rwbarton, thomie, carter

GHC Trac Issues: #15188

Differential Revision: https://phabricator.haskell.org/D4740

efea32cf

Mar 19, 2018

Hoopl: improve postorder calculation · bbcea13a

Michal Terepeta authored 7 years ago and

Ben Gamari committed 7 years ago


- Fix the naming and comments to indicate that we are calculating
  *reverse* postorder (and not the standard postorder).

- Rewrite the calculation to avoid CPS code. I found it fairly
  difficult to understand and the new one seems faster (according to
  nofib, decreases compiler allocations by 0.2%)

- Remove `LabelsPtr`, which seems unnecessary and could be *really*
  confusing. For instance, previously:
  `postorder_dfs_from <block with label X>`
  and
  `postorder_dfs_from <label X>`
  would actually mean quite different things (and give different
  results).

- Change the `Dataflow` module to always use entry of the graph for
  reverse postorder calculation. This should be the only change in
  behavior of this commit.

  Previously, if the caller provided initial facts for some of the
  labels, we would use those labels for our postorder calculation.
  However, I don't think that's correct in general - if the initial
  facts did not contain the entry of the graph, we would never analyze
  the blocks reachable from the entry but unreachable from the labels
  provided with the initial facts. It seems that the only analysis that
  used this was proc-point analysis, which I think would always include
  the entry block (so I don't think there's any bug due to this).
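For readers unfamiliar with the term, reverse postorder from the graph entry can be computed with an explicit stack and no continuations. The Python below is a generic sketch of that calculation, not a port of the Hoopl code; the dictionary-of-successors graph encoding is an assumption of the example.

```python
def reverse_postorder(successors, entry):
    """Reverse postorder of the blocks reachable from entry, without recursion."""
    visited, postorder = {entry}, []
    stack = [(entry, iter(successors.get(entry, ())))]
    while stack:
        label, children = stack[-1]
        for child in children:
            if child not in visited:
                visited.add(child)
                stack.append((child, iter(successors.get(child, ()))))
                break
        else:
            postorder.append(label)  # all successors done: emit the block
            stack.pop()
    return postorder[::-1]

# "dead" is not reachable from the entry, so it is never visited.
cfg = {"entry": ["a", "b"], "a": ["exit"], "b": ["exit"], "exit": [], "dead": ["a"]}
print(reverse_postorder(cfg, "entry"))  # ['entry', 'b', 'a', 'exit']
```

Starting from the entry means every block reachable from it is ordered, which is the behaviour change the last item describes.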

Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>

Test Plan: ./validate

Reviewers: bgamari, simonmar

Reviewed By: simonmar

Subscribers: rwbarton, thomie, carter

Differential Revision: https://phabricator.haskell.org/D4464

bbcea13a