safe_haskell.rst



.. _safe-haskell:

Safe Haskell

============

.. index::

   single: safe haskell

Safe Haskell is an extension to the Haskell language that is implemented

in GHC as of version 7.2. It allows for unsafe code to be securely

included in a trusted code base by restricting the features of GHC

Haskell the code is allowed to use. Put simply, it makes the types of

programs trustable.

While a primary use case of Safe Haskell is running untrusted code, Safe

Haskell doesn't provide this directly. Instead, Safe Haskell provides

strict type safety. Without Safe Haskell, GHC allows many exceptions to

the type system which can subvert any abstractions. By providing strict

type safety, Safe Haskell enables developers to build their own library

level sandbox mechanisms to run untrusted code.

While Safe Haskell is an extension, it actually runs in the background

for every compilation with GHC. It does this to track the type

violations of modules to infer their safety, even when they aren't

explicitly using Safe Haskell. Please refer to section

:ref:`safe-inference` for more details of this.

The design of Safe Haskell covers the following aspects:

- A :ref:`safe language ` dialect of Haskell that provides

  stricter guarantees about the code. It allows types and module boundaries to

  be trusted.

- A *safe import* extension that specifies that the module being imported must

  be trusted.

- A definition of *trust* (or safety) and how it operates, along with ways of

  defining and changing the trust of modules and packages.

Safe Haskell, however, *does not offer* compilation safety. During

compilation time it is possible for arbitrary processes to be launched,

using for example the :ref:`custom pre-processor ` flag.

This can be manipulated to either compromise a user's system at

compilation time, or to modify the source code just before compilation

to try to alter Safe Haskell flags. This is discussed further in section

:ref:`safe-compilation`.

.. _safe-use-cases:

Uses of Safe Haskell

--------------------

.. index::

   single: safe haskell uses

Safe Haskell has been designed with two use cases in mind:

-  Enforcing strict type safety at compile time

-  Compiling and executing untrusted code

Strict type-safety (good style)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Haskell offers a powerful type system and separation of pure and effectual

functions through the ``IO`` monad. However, there are several loop holes in the

type system, the most obvious being the ``unsafePerformIO :: IO a -> a``

function. The safe language dialect of Safe Haskell disallows the use of such

functions. This can be useful restriction as it makes Haskell code easier to

analyse and reason about. It also codifies the existing culture in the Haskell

community of trying to avoid unsafe functions unless absolutely necessary. As

such, using the safe language (through the ``-XSafe`` flag) can be thought of as

a way of enforcing good style, similar to the function of ``-Wall``.

Building secure systems (restricted IO Monads)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index::

   single: secure haskell

Systems such as information flow control security, capability based

security systems and DSLs for working with encrypted data.. etc can be

built in the Haskell language as a library. However they require

guarantees about the properties of Haskell that aren't true in general

due to the presence of functions like ``unsafePerformIO``. Safe Haskell

gives users enough guarantees about the type system to allow them to

build such secure systems.

As an example, let's define an interface for a plugin system where the

plugin authors are untrusted, possibly malicious third-parties. We do

this by restricting the plugin interface to pure functions or to a

restricted ``IO`` monad that we have defined. The restricted ``IO``

monad will only allow a safe subset of ``IO`` actions to be executed. We

define the plugin interface so that it requires the plugin module,

``Danger``, to export a single computation, ``Danger.runMe``, of type

``RIO ()``, where ``RIO`` is a monad defined as follows:

::

    -- While we use `Safe', the `Trustworthy' pragma would also be

    -- fine. We simply want to ensure that:

    -- 1) The module exports an interface that untrusted code can't

    --    abuse.

    -- 2) Untrusted code can import this module.

    --

    {-# LANGUAGE Safe #-}

    module RIO (RIO(), runRIO, rioReadFile, rioWriteFile) where

    -- Notice that symbol UnsafeRIO is not exported from this module!

    newtype RIO a = UnsafeRIO { runRIO :: IO a }

    instance Monad RIO where

        return = UnsafeRIO . return

        (UnsafeRIO m) >>= k = UnsafeRIO $ m >>= runRIO . k

    -- Returns True iff access is allowed to file name

    pathOK :: FilePath -> IO Bool

    pathOK file = {- Implement some policy based on file name -}

    rioReadFile :: FilePath -> RIO String

    rioReadFile file = UnsafeRIO $ do

        ok <- pathOK file

        if ok then readFile file else return ""

    rioWriteFile :: FilePath -> String -> RIO ()

    rioWriteFile file contents = UnsafeRIO $ do

        ok <- pathOK file

        if ok then writeFile file contents else return ()

We then compile the ``Danger`` plugin using the new Safe Haskell

``-XSafe`` flag:

::

    {-# LANGUAGE Safe #-}

    module Danger ( runMe ) where

    runMe :: RIO ()

    runMe = ...

Before going into the Safe Haskell details, let's point out some of the

reasons this security mechanism would fail without Safe Haskell:

- The design attempts to restrict the operations that ``Danger`` can perform by

  using types, specifically the ``RIO`` type wrapper around ``IO`` . The author

  of ``Danger`` can subvert this though by simply writing arbitrary ``IO``

  actions and using ``unsafePerformIO :: IO a -> a`` to execute them as pure

  functions.

- The design also relies on ``Danger`` not being able to access the

  ``UnsafeRIO`` constructor. Unfortunately Template Haskell can be used to

  subvert module boundaries and so could be used to gain access to this

  constructor.

- There is no way to place restrictions on the modules that ``Danger`` can

  import. This gives the author of ``Danger`` a very large attack surface,

  essentially any package currently installed on the system. Should any of

  these packages have a vulnerability, then the ``Danger`` module can exploit

  it.

Safe Haskell prevents all these attacks. This is done by compiling the

RIO module with the :extension:`Safe` or :extension:`Trustworthy` flag and compiling

``Danger`` with the :extension:`Safe` flag. We explain each below.

The use of :extension:`Safe` to compile ``Danger`` restricts the features of

Haskell that can be used to a `safe subset <#safe-language>`__. This

includes disallowing ``unsafePerformIO``, Template Haskell, pure FFI

functions, RULES and restricting the operation of Overlapping Instances.

The :extension:`Safe` flag also restricts the modules can be imported by

``Danger`` to only those that are considered trusted. Trusted modules

are those compiled with :extension:`Safe`, where GHC provides a mechanical

guarantee that the code is safe. Or those modules compiled with

:extension:`Trustworthy`, where the module author claims that the module is

Safe.

This is why the RIO module is compiled with :extension:`Safe` or

:extension:`Trustworthy`>, to allow the ``Danger`` module to import it. The

:extension:`Trustworthy` flag doesn't place any restrictions on the module like

:extension:`Safe` does (expect to restrict overlapping instances to `safe

overlapping instances <#safe-overlapping-instances>`__). Instead the

module author claims that while code may use unsafe features internally,

it only exposes an API that can used in a safe manner.

However, the unrestricted use of :extension:`Trustworthy` is a problem as an

arbitrary module can use it to mark themselves as trusted, yet

:extension:`Trustworthy` doesn't offer any guarantees about the module, unlike

:extension:`Safe`. To control the use of trustworthy modules it is recommended

to use the :ghc-flag:`-fpackage-trust` flag. This flag adds an extra requirement

to the trust check for trustworthy modules. It requires that for a

trustworthy modules to be considered trusted, and allowed to be used in

:extension:`Safe` compiled code, the client C compiling the code must tell GHC

that they trust the package the trustworthy module resides in. This is

essentially a way of for C to say, while this package contains

trustworthy modules that can be used by untrusted modules compiled with

:extension:`Safe`, I trust the author(s) of this package and trust the modules

only expose a safe API. The trust of a package can be changed at any

time, so if a vulnerability found in a package, C can declare that

package untrusted so that any future compilation against that package

would fail. For a more detailed overview of this mechanism see

:ref:`safe-trust`.

In the example, ``Danger`` can import module ``RIO`` because ``RIO`` is

compiled with :extension:`Safe`. Thus, ``Danger`` can make use of the

``rioReadFile`` and ``rioWriteFile`` functions to access permitted file

names. The main application then imports both ``RIO`` and ``Danger``. To

run the plugin, it calls ``RIO.runRIO Danger.runMe`` within the ``IO``

monad. The application is safe in the knowledge that the only ``IO`` to

ensue will be to files whose paths were approved by the ``pathOK`` test.

The Safe Haskell checks can be disabled for a module by passing the

:ghc-flag:`-fno-safe-haskell` flag. This is useful in particular when compiling

with source plugins as running a plugin marks the module as unsafe and can then

cause downstream modules to fail the safety checks.

.. _safe-language:

Safe Language

-------------

.. index::

   single: safe language

The Safe Haskell *safe language* (enabled by ``-XSafe``) guarantees the

following properties:

- *Referential transparency* — The types can be trusted. Any pure function, is

  guaranteed to be pure. Evaluating them is deterministic and won't cause any

  side effects. Functions in the ``IO`` monad are still allowed and behave as

  usual. So, for example, the ``unsafePerformIO :: IO a -> a`` function is

  disallowed in the safe language to enforce this property.

- *Module boundary control* — Only symbols that are publicly available through

  other module export lists can be accessed in the safe language. Values using

  data constructors not exported by the defining module, cannot be examined or

  created. As such, if a module ``M`` establishes some invariants through

  careful use of its export list, then code written in the safe language that

  imports ``M`` is guaranteed to respect those invariants.

- *Semantic consistency* — For any module that imports a module written in the

  safe language, expressions that compile both with and without the safe import

  have the same meaning in both cases. That is, importing a module written in

  the safe language cannot change the meaning of existing code that isn't

  dependent on that module. So, for example, there are some restrictions placed

  on the use of :ref:`OverlappingInstances `, as these can

  violate this property.

- *Strict subset* — The safe language is strictly a subset of Haskell as

  implemented by GHC. Any expression that compiles in the safe language has the

  same meaning as it does when compiled in normal Haskell.

These four properties guarantee that in the safe language you can trust

the types, can trust that module export lists are respected, and can

trust that code that successfully compiles has the same meaning as it

normally would.

To achieve these properties, in the safe language dialect we disable

completely the following features:

- :extension:`TemplateHaskell` — Can be used to gain access to constructors and

  abstract data types that weren't exported by a module, subverting module

  boundaries.

Furthermore, we restrict the following features:

- :extension:`ForeignFunctionInterface` — Foreign import declarations that

  import a function with a non-``IO`` type are disallowed.

- ``RULES`` — Rewrite rules defined in a module M compiled with

  :extension:`Safe` are dropped. Rules defined in Trustworthy modules that

  ``M`` imports are still valid and will fire as usual.

- :extension:`OverlappingInstances` — There is no restriction on the creation

  of overlapping instances, but we do restrict their use at a particular call

  site. This is a detailed restriction, please refer to :ref:`Safe Overlapping

  Instances ` for details.

- :extension:`GeneralisedNewtypeDeriving` — GND is not allowed in the safe

  language. This is due to the ability of it to violate module boundaries when

  module authors forget to put nominal role annotations on their types as

  appropriate. For this reason, the ``Data.Coerce`` module is also considered

  unsafe. We are hoping to find a better solution here in the future.

- ``GHC.Generics`` — Hand crafted instances of the ``Generic`` type class are

  not allowed in Safe Haskell. Such instances aren't strictly unsafe, but

  there is an important invariant that a ``Generic`` instance should adhere to

  the structure of the data type for which the instance is defined, and

  allowing manually implemented ``Generic`` instances would break that

  invariant. Derived instances (through the :extension:`DeriveGeneric`

  extension) are still allowed. Note that the only allowed

  :ref:`deriving strategy ` for deriving ``Generic`` under

  Safe Haskell is ``stock``, as another strategy (e.g., ``anyclass``) would

  produce an instance that violates the invariant.

  Refer to the

  :ref:`generic programming ` section for more details.

.. _safe-overlapping-instances:

Safe Overlapping Instances

~~~~~~~~~~~~~~~~~~~~~~~~~~

Due to the semantic consistency guarantee of Safe Haskell, we must

restrict the function of overlapping instances. We don't restrict their

ability to be defined, as this is a global property and not something we

can determine by looking at a single module. Instead, when a module

calls a function belonging to a type-class, we check that the instance

resolution done is considered 'safe'. This check is enforced for modules

compiled with both ``-XSafe`` and ``-XTrustworthy``.

More specifically, consider the following modules:

::

            {-# LANGUAGE Safe #-}

            module Class (TC(..)) where

              class TC a where { op :: a -> String }

            {-# LANGUAGE Safe #-}

            module Dangerous (TC(..)) where

              import Class

              instance

                {-# OVERLAPS #-}

                TC [Int] where { op _ = "[Int]" }

            {-# LANGUAGE Safe #-}

            module TCB_Runner where

              import Class

              import Dangerous