-
Finley McIlwaine authoredFinley McIlwaine authored
.. _extending-ghc:
Extending and using GHC as a Library
====================================
GHC exposes its internal APIs to users through the built-in ghc package.
It allows you to write programs that leverage GHC's entire compilation
driver, in order to analyze or compile Haskell code programmatically.
Furthermore, GHC gives users the ability to load compiler plugins during
compilation - modules which are allowed to view and change GHC's
internal intermediate representation, Core. Plugins are suitable for
things like experimental optimizations or analysis, and offer a lower
barrier of entry to compiler development for many common cases.
Furthermore, GHC offers a lightweight annotation mechanism that you can
use to annotate your source code with metadata, which you can later
inspect with either the compiler API or a compiler plugin.
.. _annotation-pragmas:
Source annotations
------------------
Annotations are small pragmas that allow you to attach data to
identifiers in source code, which are persisted when compiled. These
pieces of data can then inspected and utilized when using GHC as a
library or writing a compiler plugin.
.. _ann-pragma:
Annotating values
~~~~~~~~~~~~~~~~~
.. index::
single: ANN pragma
single: pragma; ANN
single: source annotations
Any expression that has both ``Typeable`` and ``Data`` instances may be
attached to a top-level value binding using an ``ANN`` pragma. In
particular, this means you can use ``ANN`` to annotate data constructors
(e.g. ``Just``) as well as normal values (e.g. ``take``). By way of
example, to annotate the function ``foo`` with the annotation
``Just "Hello"`` you would do this:
::
{-# ANN foo (Just "Hello") #-}
foo = ...
A number of restrictions apply to use of annotations:
- The binder being annotated must be at the top level (i.e. no nested
binders)
- The binder being annotated must be declared in the current module
- The expression you are annotating with must have a type with
``Typeable`` and ``Data`` instances
- The :ref:`Template Haskell staging restrictions ` apply to the
expression being annotated with, so for example you cannot run a
function from the module being compiled.
To be precise, the annotation ``{-# ANN x e #-}`` is well staged if
and only if ``$(e)`` would be (disregarding the usual type
restrictions of the splice syntax, and the usual restriction on
splicing inside a splice - ``$([|1|])`` is fine as an annotation,
albeit redundant).
If you feel strongly that any of these restrictions are too onerous,
:ghc-wiki:`please give the GHC team a shout `.
However, apart from these restrictions, many things are allowed,
including expressions which are not fully evaluated! Annotation
expressions will be evaluated by the compiler just like Template Haskell
splices are. So, this annotation is fine:
::
{-# ANN f SillyAnnotation { foo = (id 10) + $([| 20 |]), bar = 'f } #-}
f = ...
.. _typeann-pragma:
Annotating types
~~~~~~~~~~~~~~~~
.. index::
single: ANN pragma; on types
You can annotate types with the ``ANN`` pragma by using the ``type``
keyword. For example:
::
{-# ANN type Foo (Just "A `Maybe String' annotation") #-}
data Foo = ...
.. _modann-pragma:
Annotating modules
~~~~~~~~~~~~~~~~~~
.. index::
single: ANN pragma; on modules
You can annotate modules with the ``ANN`` pragma by using the ``module``
keyword. For example:
::
{-# ANN module (Just "A `Maybe String' annotation") #-}
.. _ghc-as-a-library:
Using GHC as a Library
----------------------
The ``ghc`` package exposes most of GHC's frontend to users, and thus
allows you to write programs that leverage it. This library is actually
the same library used by GHC's internal, frontend compilation driver,
and thus allows you to write tools that programmatically compile source
code and inspect it. Such functionality is useful in order to write
things like IDE or refactoring tools. As a simple example, here's a
program which compiles a module, much like ghc itself does by default
when invoked:
::
import GHC
import GHC.Paths ( libdir )
import GHC.Driver.Session ( defaultFatalMessager, defaultFlushOut )
main =
defaultErrorHandler defaultFatalMessager defaultFlushOut $ do
runGhc (Just libdir) $ do
dflags <- getSessionDynFlags
setSessionDynFlags dflags
target <- guessTarget "test_main.hs" Nothing
setTargets [target]
load LoadAllTargets
The argument to ``runGhc`` is a bit tricky. GHC needs this to find its
libraries, so the argument must refer to the directory that is printed
by ``ghc --print-libdir`` for the same version of GHC that the program
is being compiled with. Above we therefore use the ``ghc-paths`` package
which provides this for us.
Compiling it results in:
.. code-block:: none
$ cat test_main.hs
main = putStrLn "hi"
$ ghc -package ghc simple_ghc_api.hs
[1 of 1] Compiling Main ( simple_ghc_api.hs, simple_ghc_api.o )
Linking simple_ghc_api ...
$ ./simple_ghc_api
$ ./test_main
hi
$
For more information on using the API, as well as more samples and
references, please see `this Haskell.org wiki
page `__.
.. _compiler-plugins:
Compiler Plugins
----------------
GHC has the ability to load compiler plugins at compile time. The
feature is similar to the one provided by
`GCC `__, and allows users to write
plugins that can adjust the behaviour of the constraint solver, inspect
and modify the compilation pipeline, as well as transform and inspect
GHC's intermediate language, Core. Plugins are suitable for experimental
analysis or optimization, and require no changes to GHC's source code to
use.
Plugins cannot optimize/inspect C-\\-, nor can they implement things like
parser/front-end modifications like GCC, apart from limited changes to
the constraint solver. If you feel strongly that any of these
restrictions are too onerous,
:ghc-wiki:`please give the GHC team a shout `.
Plugins do not work with ``-fexternal-interpreter``. If you need to run plugins
with ``-fexternal-interpreter`` let GHC developers know in :ghc-ticket:`14335`.
.. _using-compiler-plugins:
Using compiler plugins
~~~~~~~~~~~~~~~~~~~~~~
Plugins can be added on the command line with the :ghc-flag:`-fplugin=⟨module⟩`
option where ⟨module⟩ is a module in a registered package that exports the
plugin. Plugins are loaded in order, with command-line and Cabal flags preceding
those in OPTIONS pragmas which are processed in file order. Arguments can be
passed to the plugins with the :ghc-flag:`-fplugin-opt=⟨module⟩:⟨args⟩`
option. The list of enabled plugins can be reset with the
:ghc-flag:`-fclear-plugins` option.
.. ghc-flag:: -fplugin=⟨module⟩
:shortdesc: Load a plugin exported by a given module
:type: dynamic
:category: plugins
Load the plugin in the given module. The module must be a member of a
package registered in GHC's package database.
.. ghc-flag:: -fplugin-opt=⟨module⟩:⟨args⟩
:shortdesc: Give arguments to a plugin module; module must be specified with
:ghc-flag:`-fplugin=⟨module⟩`
:type: dynamic
:category: plugins
Give arguments to a plugin module; module must be specified with
:ghc-flag:`-fplugin=⟨module⟩`. The order of plugin pragmas matter but the
order of arg pragmas does not. The same set of arguments go to all plugins
from the same module.
::
-- Two Echo plugins will both get args A and B.
{-# OPTIONS -fplugin Echo -fplugin-opt Echo:A #-}
{-# OPTIONS -fplugin Echo -fplugin-opt Echo:B #-}
-- While order of the plugins matters, arg order does not.
{-# OPTIONS -fplugin-opt Echo2:B #-}
{-# OPTIONS -fplugin Echo1 #-}
{-# OPTIONS -fplugin-opt Echo1:A #-}
{-# OPTIONS -fplugin Echo2 #-}
If you want to use the same plugin with different arguments then rexport the
same plugin from different lightweight modules.
::
-- Echo1 and Echo2 as lightweight modules re-exporting Echo.plugin.
module Echo1 (plugin) where import Echo (plugin)
module Echo2 (plugin) where import Echo (plugin)
-- Echo1 gets arg A while Echo2 gets arg B.
{-# OPTIONS -fplugin Echo1 -fplugin-opt Echo1:A #-}
{-# OPTIONS -fplugin Echo2 -fplugin-opt Echo2:B #-}
.. ghc-flag:: -fplugin-trustworthy
:shortdesc: Trust the used plugins and no longer mark the compiled module
as unsafe
:type: dynamic
:category: plugins
By default, when a module is compiled with plugins, it will be marked as
unsafe. With this flag passed, all plugins are treated as trustworthy
and the safety inference will no longer be affected.
.. ghc-flag:: -fclear-plugins
:shortdesc: Clear the list of active plugins
:type: dynamic
:category: plugins
Clear the list of plugins previously specified with
:ghc-flag:`-fplugin <-fplugin=⟨module⟩>`. This is useful in GHCi where
simply removing the :ghc-flag:`-fplugin <-fplugin=⟨module⟩>` options from
the command line is not possible. Instead ``:set -fclear-plugins`` can be
used.
As an example, in order to load the plugin exported by ``Foo.Plugin`` in
the package ``foo-ghc-plugin``, and give it the parameter "baz", we
would invoke GHC like this:
.. code-block:: none
$ ghc -fplugin Foo.Plugin -fplugin-opt Foo.Plugin:baz Test.hs
[1 of 1] Compiling Main ( Test.hs, Test.o )
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
Loading package foo-ghc-plugin-0.1 ... linking ... done.
...
Linking Test ...
$
Plugins can be also be loaded from libraries directly. It allows plugins to be
loaded in cross-compilers (as a workaround for :ghc-ticket:`14335`).
.. ghc-flag:: -fplugin-library=⟨file-path⟩;⟨unit-id⟩;⟨module⟩;⟨args⟩
:shortdesc: Load a pre-compiled static plugin from an external library
:type: dynamic
:category: plugins
Arguments are specified in a list form, so a plugin specified to
:ghc-flag:`-fplugin-library=⟨file-path⟩;⟨unit-id⟩;⟨module⟩;⟨args⟩` will look
like ``'path/to/plugin;package-123;Plugin.Module;["Argument","List"]'``.
Alternatively, core plugins can be specified with Template Haskell.
::
addCorePlugin "Foo.Plugin"
This inserts the plugin as a core-to-core pass. Unlike `-fplugin=(module)`,
the plugin module can't reside in the same package as the module calling
:th-ref:`Language.Haskell.TH.Syntax.addCorePlugin`. This way, the
implementation can expect the plugin to be built by the time
it is needed.
Plugin modules live in a separate namespace from
the user import namespace. By default, these two namespaces are
the same; however, there are a few command line options which
control specifically plugin packages:
.. ghc-flag:: -plugin-package ⟨pkg⟩
:shortdesc: Expose ⟨pkg⟩ for plugins
:type: dynamic
:category: plugins
This option causes the installed package ⟨pkg⟩ to be exposed for plugins,
such as :ghc-flag:`-fplugin=⟨module⟩`. The package ⟨pkg⟩ can be specified
in full with its version number (e.g. ``network-1.0``) or the version
number can be omitted if there is only one version of the package
installed. If there are multiple versions of ⟨pkg⟩ installed and
:ghc-flag:`-hide-all-plugin-packages` was not specified, then all other
versions will become hidden. :ghc-flag:`-plugin-package ⟨pkg⟩` supports
thinning and renaming described in :ref:`package-thinning-and-renaming`.
Unlike :ghc-flag:`-package ⟨pkg⟩`, this option does NOT cause package ⟨pkg⟩
to be linked into the resulting executable or shared object.
.. ghc-flag:: -plugin-package-id ⟨pkg-id⟩
:shortdesc: Expose ⟨pkg-id⟩ for plugins
:type: dynamic
:category: plugins
Exposes a package in the plugin namespace like :ghc-flag:`-plugin-package
⟨pkg⟩`, but the package is named by its installed package ID rather than by
name. This is a more robust way to name packages, and can be used to
select packages that would otherwise be shadowed. Cabal passes
:ghc-flag:`-plugin-package-id ⟨pkg-id⟩` flags to GHC.
:ghc-flag:`-plugin-package-id ⟨pkg-id⟩` supports thinning and renaming
described in :ref:`package-thinning-and-renaming`.
.. ghc-flag:: -hide-all-plugin-packages
:shortdesc: Hide all packages for plugins by default
:type: dynamic
:category: plugins
By default, all exposed packages in the normal, source import namespace are
also available for plugins. This causes those packages to be hidden by
default. If you use this flag, then any packages with plugins you require
need to be explicitly exposed using :ghc-flag:`-plugin-package ⟨pkg⟩`
options.
At the moment, the only way to specify a dependency on a plugin
in Cabal is to put it in ``build-depends`` (which uses the conventional
:ghc-flag:`-package-id ⟨unit-id⟩` flag); however, in the future there
will be a separate field for specifying plugin dependencies specifically.
.. _writing-compiler-plugins:
Writing compiler plugins
~~~~~~~~~~~~~~~~~~~~~~~~
Plugins are modules that export at least a single identifier,
``plugin``, of type ``GHC.Plugins.Plugin``. All plugins should
``import GHC.Plugins`` as it defines the interface to the compilation
pipeline.
A ``Plugin`` effectively holds a function which installs a compilation
pass into the compiler pipeline. By default there is the empty plugin
which does nothing, ``GHC.Plugins.defaultPlugin``, which you should
override with record syntax to specify your installation function. Since
the exact fields of the ``Plugin`` type are open to change, this is the
best way to ensure your plugins will continue to work in the future with
minimal interface impact.
``Plugin`` exports a field, ``installCoreToDos`` which is a function of
type ``[CommandLineOption] -> [CoreToDo] -> CoreM [CoreToDo]``. A
``CommandLineOption`` is effectively just ``String``, and a ``CoreToDo``
is basically a function of type ``Core -> Core``. A ``CoreToDo`` gives
your pass a name and runs it over every compiled module when you invoke
GHC.
As a quick example, here is a simple plugin that just does nothing and
just returns the original compilation pipeline, unmodified, and says
'Hello':
::
module DoNothing.Plugin (plugin) where
import GHC.Plugins
plugin :: Plugin
plugin = defaultPlugin {
installCoreToDos = install
}
install :: [CommandLineOption] -> [CoreToDo] -> CoreM [CoreToDo]
install _ todo = do
putMsgS "Hello!"
return todo
Provided you compiled this plugin and registered it in a package (with
cabal for instance,) you can then use it by just specifying
``-fplugin=DoNothing.Plugin`` on the command line, and during the
compilation you should see GHC say 'Hello'.
Running multiple plugins is also supported, by passing
multiple ``-fplugin=...`` options. GHC will load the plugins
in the order in which they are specified on the command line
and, when appropriate, compose their effects in the same
order. That is, if we had two Core plugins, ``Plugin1`` and
``Plugin2``, each defining an ``install`` function like
the one above, then GHC would first run ``Plugin1.install``
on the default ``[CoreToDo]``, take the result and feed it to
``Plugin2.install``. ``-fplugin=Plugin1 -fplugin=Plugin2``
will update the Core pipeline by applying
``Plugin1.install opts1 >=> Plugin2.install opts2`` (where
``opts1`` and ``opts2`` are the options passed to each plugin
using ``-fplugin-opt=...``). This is not specific to Core
plugins but holds for all the types of plugins that can be
composed or sequenced in some way: the first plugin to appear
on the GHC command line will always act first.
.. _core-plugins-in-more-detail:
Core plugins in more detail
~~~~~~~~~~~~~~~~~~~~~~~~~~~
``CoreToDo`` is effectively a data type that describes all the kinds of
optimization passes GHC does on Core. There are passes for
simplification, CSE, etc. There is a specific case for
plugins, ``CoreDoPluginPass :: String -> PluginPass -> CoreToDo`` which
should be what you always use when inserting your own pass into the
pipeline. The first parameter is the name of the plugin, and the second
is the pass you wish to insert.
``CoreM`` is a monad that all of the Core optimizations live and operate
inside of.
A plugin's installation function (``install`` in the above example)
takes a list of ``CoreToDo``\ s and returns a list of ``CoreToDo``.
Before GHC begins compiling modules, it enumerates all the needed
plugins you tell it to load, and runs all of their installation
functions, initially on a list of passes that GHC specifies itself.
After doing this for every plugin, the final list of passes is given to
the optimizer, and are run by simply going over the list in order.
You should be careful with your installation function, because the list
of passes you give back isn't questioned or double checked by GHC at the
time of this writing. An installation function like the following:
::
install :: [CommandLineOption] -> [CoreToDo] -> CoreM [CoreToDo]
install _ _ = return []
is certainly valid, but also certainly not what anyone really wants.
.. _manipulating-bindings:
Manipulating bindings
^^^^^^^^^^^^^^^^^^^^^
In the last section we saw that besides a name, a ``CoreDoPluginPass``
takes a pass of type ``PluginPass``. A ``PluginPass`` is a synonym for
``(ModGuts -> CoreM ModGuts)``. ``ModGuts`` is a type that represents
the one module being compiled by GHC at any given time.
A ``ModGuts`` holds all of the module's top level bindings which we can
examine. These bindings are of type ``CoreBind`` and effectively
represent the binding of a name to body of code. Top-level module
bindings are part of a ``ModGuts`` in the field ``mg_binds``.
Implementing a pass that manipulates the top level bindings merely needs
to iterate over this field, and return a new ``ModGuts`` with an updated
``mg_binds`` field. Because this is such a common case, there is a
function provided named ``bindsOnlyPass`` which lifts a function of type
``([CoreBind] -> CoreM [CoreBind])`` to type
``(ModGuts -> CoreM ModGuts)``.
Continuing with our example from the last section, we can write a simple
plugin that just prints out the name of all the non-recursive bindings
in a module it compiles:
::
module SayNames.Plugin (plugin) where
import GHC.Plugins
plugin :: Plugin
plugin = defaultPlugin {
installCoreToDos = install
}
install :: [CommandLineOption] -> [CoreToDo] -> CoreM [CoreToDo]
install _ todo = do
return (CoreDoPluginPass "Say name" pass : todo)
pass :: ModGuts -> CoreM ModGuts
pass guts = do dflags <- getDynFlags
bindsOnlyPass (mapM (printBind dflags)) guts
where printBind :: DynFlags -> CoreBind -> CoreM CoreBind
printBind dflags bndr@(NonRec b _) = do
putMsgS $ "Non-recursive binding named " ++ showSDoc dflags (ppr b)
return bndr
printBind _ bndr = return bndr
.. _late-plugins:
Late Plugins
^^^^^^^^^^^^
If the ``CoreProgram`` of a module is modified in a normal core plugin, the
modified bindings can end up in unfoldings the interface file for the module.
This may be undesireable, as the plugin could make changes which affect inlining
or optimization.
Late plugins can be used to avoid introducing such changes into the interface
file. Late plugins are a bit different from typical core plugins:
1. They do not run in the ``CoreM`` monad. Instead, they are explicitly passed
the ``HscEnv`` and they run in ``IO``.
2. They are given ``CgGuts`` instead of ``ModGuts``. ``CgGuts`` are a restricted
form of ``ModGuts`` intended for code generation. The ``CoreProgram`` held in
the ``CgGuts`` given to a late plugin will already be fully optimized.
3. They must maintain a ``CostCentreState`` and track any cost centres they
introduce by adding them to the ``cg_ccs`` field of ``CgGuts``. This is
because the automatic collection of cost centres happens before the late
plugin stage. If a late plugin does not introduce any cost centres, it may
simply return the given cost centre state.
Here is a very simply example of a late plugin that changes the value of a
binding in a module. If it finds a non-recursive top-level binding named
``testBinding`` with type ``Int``, it will change its value to the ``Int``
expression ``111111``.
::
plugin :: Plugin
plugin = defaultPlugin { latePlugin = lateP }
lateP :: LatePlugin
lateP _ _ (cg_guts, cc_state) = do
binds' <- editCoreBinding (cg_binds cg_guts)
return (cg_guts { cg_binds = binds' }, cc_state)
editCoreBinding :: CoreProgram -> IO CoreProgram
editCoreBinding pgm = pure . go
where
go :: [CoreBind] -> [CoreBind]
go (b@(NonRec v e) : bs)
| occNameString (getOccName v) == "testBinding" && exprType e `eqType` intTy =
NonRec v (mkUncheckedIntExpr 111111) : bs
go (b:bs) = b : go bs
go [] = []
Since this is a late plugin, the changed binding value will not end up in the
interface file.
.. _getting-annotations:
Using Annotations
^^^^^^^^^^^^^^^^^
Previously we discussed annotation pragmas (:ref:`annotation-pragmas`),
which we mentioned could be used to give compiler plugins extra guidance
or information. Annotations for a module can be retrieved by a plugin,
but you must go through the modules ``ModGuts`` in order to get it.
Because annotations can be arbitrary instances of ``Data`` and
``Typeable``, you need to give a type annotation specifying the proper
type of data to retrieve from the interface file, and you need to make
sure the annotation type used by your users is the same one your plugin
uses. For this reason, we advise distributing annotations as part of the
package which also provides compiler plugins if possible.
To get the annotations of a single binder, you can use
``getAnnotations`` and specify the proper type. Here's an example that
will print out the name of any top-level non-recursive binding with the
``SomeAnn`` annotation:
::
{-# LANGUAGE DeriveDataTypeable #-}
module SayAnnNames.Plugin (plugin, SomeAnn(..)) where
import GHC.Plugins
import Control.Monad (unless)
import Data.Data
data SomeAnn = SomeAnn deriving Data
plugin :: Plugin
plugin = defaultPlugin {
installCoreToDos = install
}
install :: [CommandLineOption] -> [CoreToDo] -> CoreM [CoreToDo]
install _ todo = do
return (CoreDoPluginPass "Say name" pass : todo)
pass :: ModGuts -> CoreM ModGuts
pass g = do
dflags <- getDynFlags
mapM_ (printAnn dflags g) (mg_binds g) >> return g
where printAnn :: DynFlags -> ModGuts -> CoreBind -> CoreM CoreBind
printAnn dflags guts bndr@(NonRec b _) = do
anns <- annotationsOn guts b :: CoreM [SomeAnn]
unless (null anns) $ putMsgS $ "Annotated binding found: " ++ showSDoc dflags (ppr b)
return bndr
printAnn _ _ bndr = return bndr
annotationsOn :: Data a => ModGuts -> CoreBndr -> CoreM [a]
annotationsOn guts bndr = do
(_, anns) <- getAnnotations deserializeWithData guts
return $ lookupWithDefaultUFM_Directly anns [] (varUnique bndr)
Please see the GHC API documentation for more about how to use internal
APIs, etc.
.. _typechecker-plugins:
Typechecker plugins
~~~~~~~~~~~~~~~~~~~
In addition to Core plugins, GHC has experimental support for
typechecker plugins, which allow the behaviour of the constraint solver
to be modified. For example, they make it possible to interface the
compiler to an SMT solver, in order to support a richer theory of
type-level arithmetic expressions than the theory built into GHC (see
:ref:`typelit-tyfuns`).
The ``Plugin`` type has a field ``tcPlugin`` of type
``[CommandLineOption] -> Maybe TcPlugin``, where the ``TcPlugin`` type
is defined thus:
::
data TcPlugin = forall s . TcPlugin
{ tcPluginInit :: TcPluginM s
, tcPluginSolve :: s -> TcPluginSolver
, tcPluginRewrite :: s -> UniqFM TyCon TcPluginRewriter
, tcPluginStop :: s -> TcPluginM ()
}
type TcPluginSolver = EvBindsVar -> [Ct] -> [Ct] -> TcPluginM TcPluginSolveResult
type TcPluginRewriter = RewriteEnv -> [Ct] -> [Type] -> TcPluginM TcPluginRewriteResult
data TcPluginSolveResult
= TcPluginSolveResult
{ tcPluginInsolubleCts :: [Ct]
, tcPluginSolvedCts :: [(EvTerm, Ct)]
, tcPluginNewCts :: [Ct]
}
data TcPluginRewriteResult
= TcPluginNoRewrite
| TcPluginRewriteTo
{ tcPluginRewriteTo :: Reduction
, tcRewriterNewWanteds :: [Ct]
}
(The details of this representation are subject to change as we gain
more experience writing typechecker plugins. It should not be assumed to
be stable between GHC releases.)
The basic idea is as follows:
- When type checking a module, GHC calls ``tcPluginInit`` once before
constraint solving starts. This allows the plugin to look things up
in the context, initialise mutable state or open a connection to an
external process (e.g. an external SMT solver). The plugin can return
a result of any type it likes, and the result will be passed to the
other fields of the ``TcPlugin`` record.
- During constraint solving, GHC repeatedly calls ``tcPluginSolve``.
This function is provided with the current set of constraints, and
should return a ``TcPluginSolveResult`` that indicates whether a
contradiction was found or progress was made. If the plugin solver
makes progress, GHC will re-start the constraint solving pipeline,
looping until a fixed point is reached.
- When rewriting type family applications, GHC calls ``tcPluginRewriter``.
The plugin supplies a collection of type families which it is interested
in rewriting. For each of those, the rewriter is provided with the
the arguments to that type family, as well as the current collection of
Given constraints. The plugin can then specify a rewriting for this
type family application, if desired.
- Finally, GHC calls ``tcPluginStop`` after constraint solving is
finished, allowing the plugin to dispose of any resources it has
allocated (e.g. terminating the SMT solver process).
Plugin code runs in the ``TcPluginM`` monad, which provides a restricted
interface to GHC API functionality that is relevant for typechecker
plugins, including ``IO`` and reading the environment. If you need
functionality that is not exposed in the ``TcPluginM`` module, you can
use ``unsafeTcPluginTcM :: TcM a -> TcPluginM a``, but are encouraged to
contact the GHC team to suggest additions to the interface. Note that
``TcPluginM`` can perform arbitrary IO via
``tcPluginIO :: IO a -> TcPluginM a``, although some care must be taken
with side effects (particularly in ``tcPluginSolve``). In general, it is
up to the plugin author to make sure that any IO they do is safe.
.. _constraint-solving-with-plugins:
Constraint solving with plugins
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The key component of a typechecker plugin is a function of type
``TcPluginSolver``, like this:
::
solve :: EvBindsVar -> [Ct] -> [Ct] -> TcPluginM TcPluginSolveResult
solve binds givens wanteds = ...
This function will be invoked in two different ways:
1. after simplification of Given constraints, where the plugin gets the
opportunity to rewrite givens,
2. after GHC has attempted to solve Wanted constraints.
The two ways can be distinguished by checking the Wanted constraints: in the
first case (and the first case only), the plugin will be passed an empty list
of Wanted constraints.
The plugin can then respond with:
* solved constraints, which will be removed from the inert set,
* new constraints, which will be added to the work list,
* insoluble constraints, which will be reported as errors.
The plugin must respond with constraints of the same flavour,
i.e. in (1) it should return only Givens, and for (2) it should return only
Wanteds; all other constraints will be ignored.
If the plugin cannot make any progress, it should return
``TcPluginSolveResult [] [] []``. Otherwise, if there were any new constraints,
the main constraint solver will be re-invoked to simplify them, then the
plugin will be invoked again. The plugin is responsible for making sure
that this process eventually terminates.
Plugins are provided with all available constraints (including
equalities and typeclass constraints), but it is easy for them to
discard those that are not relevant to their domain, because they need
return only those constraints for which they have made progress (either
by solving or contradicting them).
Constraints that have been solved by the plugin must be provided with
evidence in the form of an ``EvTerm`` of the type of the constraint.
This evidence is ignored for Given constraints, which GHC
"solves" simply by discarding them; typically this is used when they are
uninformative (e.g. reflexive equations). For Wanted constraints, the
evidence will form part of the Core term that is generated after
typechecking, and can be checked by ``-dcore-lint``. It is possible for
the plugin to create equality axioms for use in evidence terms, but GHC
does not check their consistency, and inconsistent axiom sets may lead
to segfaults or other runtime misbehaviour.
Evidence is required also when creating new Given constraints, which are
usually implied by old ones. It is not uncommon that the evidence of a new
Given constraint contains a removed constraint: the new one has replaced the
removed one.
.. _type-family-rewriting-with-plugins:
Type family rewriting with plugins
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Typechecker plugins can also directly rewrite type family applications,
by supplying the ``tcPluginRewrite`` field of the ``TcPlugin`` record.
::
tcPluginRewrite :: s -> UniqFM TyCon TcPluginRewriter
That is, the plugin registers a map, from a type family's ``TyCon`` to its
associated rewriting function: ::
type TcPluginRewriter = [Ct] -> [Type] -> TcPluginM TcPluginRewriteResult
This rewriting function is supplied with the Given constraints from the current
context, and the type family arguments.
Note that the type family application is guaranteed to be exactly saturated.
This function should then return a possible rewriting of the type family
application, by means of the following datatype: ::
data TcPluginRewriteResult
= TcPluginNoRewrite
| TcPluginRewriteTo
{ tcPluginRewriteTo :: Reduction
, tcRewriterNewWanteds :: [Ct]
}
That is, the rewriter can specify a rewriting of the type family application --
in which case it can also emit new Wanted constraints -- or it can do nothing.
To specify a rewriting, the plugin must provide a ``Reduction``, which is
defined as follows: ::
data Reduction = Reduction Coercion !Type
That is, on top of specifying what type the type-family application rewrites to,
the plugin must also supply a coercion which witnesses this rewriting: ::
co :: F orig_arg_1 ... orig_arg_n ~ rewritten_ty
Note in particular that the LHS type of the coercion should be the original
type-family application, while its RHS type is the type that the plugin wants
to rewrite the type-family application to.
.. _source-plugins:
Source plugins
~~~~~~~~~~~~~~
In addition to core and type checker plugins, you can install plugins that can
access different representations of the source code. The main purpose of these
plugins is to make it easier to implement development tools.
There are several different access points that you can use for defining plugins
that access the representations. All these fields receive the list of
``CommandLineOption`` strings that are passed to the compiler using the
:ghc-flag:`-fplugin-opt=⟨module⟩:⟨args⟩` flags.
::
plugin :: Plugin
plugin = defaultPlugin {
parsedResultAction = parsed
, typeCheckResultAction = typechecked
, spliceRunAction = spliceRun
, interfaceLoadAction = interfaceLoad
, renamedResultAction = renamed
}
Parsed representation
^^^^^^^^^^^^^^^^^^^^^
When you want to define a plugin that uses the syntax tree of the source code,
you would like to override the ``parsedResultAction`` field. This access point
enables you to get access to information about the lexical tokens and comments
in the source code as well as the original syntax tree of the compiled module.
::
parsed :: [CommandLineOption] -> ModSummary
-> ParsedResult -> Hsc ParsedResult
The ``ModSummary`` contains useful
meta-information about the compiled module. The ``ParsedResult`` contains a
``HsParsedModule``, which contains the lexical and syntactical information we
mentioned before. The result that you return will change the result of the
parsing. If you don't want to change the result, just return the
``ParsedResult`` that you received as the argument.
If the parser encounters any errors that prevent an AST from being constructed,
the plugin will not be run, but other kinds of errors, as well as warnings,
will be given to the plugin via the ``PsMessages`` value of the
``ParsedResult``. This allows you to modify, remove, and add warnings or errors
before they are displayed to the user, although in most cases, you will likely
want to return the messages unmodified. The parsing pass will fail if the
``Messages PsError`` collection inside the return ``ParsedResult`` is not empty
after all parsing plugins have been run.
Type checked representation
^^^^^^^^^^^^^^^^^^^^^^^^^^^
When you want to define a plugin that needs semantic information about the
source code, use the ``typeCheckResultAction`` field. For example, if your
plugin have to decide if two names are referencing the same definition or it has
to check the type of a function it is using semantic information. In this case
you need to access the renamed or type checked version of the syntax tree with
``typeCheckResultAction`` or ``renamedResultAction``.
::
typechecked :: [CommandLineOption] -> ModSummary -> TcGblEnv -> TcM TcGblEnv
renamed :: [CommandLineOption] -> TcGblEnv -> HsGroup GhcRn -> TcM (TcGblEnv, HsGroup GhcRn)
By overriding the ``renamedResultAction`` field we can modify each ``HsGroup``
after it has been renamed. A source file is separated into groups depending on
the location of template haskell splices so the contents of these groups may
not be intuitive. In order to save the entire renamed AST for inspection
at the end of typechecking you can set ``renamedResultAction`` to ``keepRenamedSource``
which is provided by the ``Plugins`` module.
This is important because some parts of the renamed
syntax tree (for example, imports) are not found in the typechecked one.
Evaluated code
^^^^^^^^^^^^^^
When the compiler type checks the source code, :ref:`template-haskell` Splices
and :ref:`th-quasiquotation` will be replaced by the syntax tree fragments
generated from them. However for tools that operate on the source code the
code generator is usually more interesting than the generated code. For this
reason we included ``spliceRunAction``. This field is invoked on each expression
before they are evaluated. The input is type checked, so semantic information is
available for these syntax tree fragments. If you return a different expression
you can change the code that is generated.
::
spliceRun :: [CommandLineOption] -> LHsExpr GhcTc -> TcM (LHsExpr GhcTc)
However take care that the generated definitions are still in the input of
``typeCheckResultAction``. If your don't take care to filter the typechecked
input, the behavior of your tool might be inconsistent.
Interface files
^^^^^^^^^^^^^^^
Sometimes when you are writing a tool, knowing the source code is not enough,
you also have to know details about the modules that you import. In this case we
suggest using the ``interfaceLoadAction``. This will be called each time when
the code of an already compiled module is loaded. It will be invoked for modules
from installed packages and even modules that are installed with GHC. It will
NOT be invoked with your own modules.
::
interfaceLoad :: forall lcl . [CommandLineOption] -> ModIface
-> IfM lcl ModIface
In the ``ModIface`` datatype you can find lots of useful information, including
the exported definitions and type class instances.
The ``ModIface`` datatype also contains facilities for extending it with extra
data, stored in a ``Map`` of serialised fields, indexed by field names and using
GHC's internal ``Binary`` class. The interface to work with these fields is:
::
readIfaceField :: Binary a => FieldName -> ModIface -> IO (Maybe a)
writeIfaceField :: Binary a => FieldName -> a -> ModIface -> IO ModIface
deleteIfaceField :: FieldName -> ModIface -> ModIface
The ``FieldName`` is open-ended, but typically it should contain the producing
package name, along with the actual field name. Then, the version number can either
be attached to the serialised data for that field, or in cases where multiple versions
of a field could exist in the same interface file, included in the field name.
Depending on if the field version advances with the package version, or independently,
the version can be attached to either the package name or the field name. Examples of
each case:
::
package/field
ghc-n.n.n/core
package/field-n
To read an interface file from an external tool without linking to GHC, the format
is described at `Extensible Interface Files `_.
Source plugin example
^^^^^^^^^^^^^^^^^^^^^