safe_haskell.xml 25.5 KB
Newer Older
1
2
3
<?xml version="1.0" encoding="iso-8859-1"?>
<sect1 id="safe-haskell">
  <title>Safe Haskell</title>
4
5
6
7
8
9
10
11
12
13
14
15

  <para>
  Safe Haskell is an extension to the Haskell language that is implemented in
  GHC as of version 7.2. It allows for unsafe code to be securely included in a
  trusted code base by restricting the features of GHC Haskell the code is
  allowed to use. Put simply, it makes the types of programs trustable.  Safe
  Haskell itself is aimed to be as minimal as possible while still providing
  strong enough guarantees about compiled Haskell code for more advance secure
  systems to be built on top of it. These include techniques such as
  information flow control security or encrypted computations.
  </para>

16
  The design of Safe Haskell covers the following aspects:
17

18
  <itemizedlist>
19
20
21
    <listitem>A <link linkend="safe-language">safe language</link> dialect of
      Haskell that provides guarantees about the code. It allows types and
      module boundaries to be trusted.
22
23
    </listitem>
    <listitem>A new <emphasis>safe import</emphasis> extension that specifies
24
      that the module being imported must be trusted.
25
26
27
28
29
30
31
    </listitem>
    <listitem>A definition of <emphasis>trust</emphasis> (or safety) and how it
      operates, along with ways of defining and changing the trust of modules
      and packages.
    </listitem>
  </itemizedlist>

32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
  <sect2 id="safe-use-cases">
    <title>Uses of Safe Haskell</title>

    Safe Haskell has been designed with two use cases in mind:

    <itemizedlist>
      <listitem>Enforcing strict type safety at compile time</listitem>
      <listitem>Compiling and executing untrusted code</listitem>
    </itemizedlist>

    <sect3>
      <title>Strict type-safety (good style)</title>

      Haskell offers a powerful type system and separation of pure and
      effectual functions through the <literal>IO</literal> monad. There are
      several loop holes in the type system though, the most obvious offender
      being the <literal>unsafePerformIO :: IO a -> a</literal> function. The
      safe language dialect of Safe Haskell disallows the use of such
      functions. This can be useful for a variety of purposes as it makes
      Haskell code easier to analyze and reason about. It also codifies an
      existing culture in the Haskell community of trying to avoid using such
      unsafe functions unless absolutely necessary. As such using the safe
      language (through the <option>-XSafe</option> flag) can be though of as a
      way of enforcing good style, similar to the function of
      <option>-Wall</option>.
    </sect3>

    <sect3>
      <title>Building secure systems (restricted IO Monads)</title>

      <para>
      Safe Haskell is designed to give users enough guarantees about the safety
      properties of compiled code so that secure systems can be built using
      Haskell. A lot of work has been done with Haskell, building such systems
      as information flow control security, capability based security, DSLs for
      working with encrypted data... etc. These systems all rely on properties
      of the Haskell language that aren't true in the general case where uses
      of functions like <literal>unsafePerformIO</literal> are allowed.
      </para>

      <para>
      As an example lets define an interface for a plugin system where the
      plugin authors are untrusted, possibly malicious third-parties. We do
      this by restricting the plugin interface to pure functions or to a
      restricted <literal>IO</literal> monad that we have defined that only
      allows a safe subset of <literal>IO</literal> actions to be executed. We
      define the plugin interface here so that it requires the plugin module,
      <literal>Danger</literal>, to export a single computation,
      <literal>Danger.runMe</literal>, of type <literal>RIO ()</literal>, where
      <literal>RIO</literal> is a new monad defined as follows:
      </para>

      <programlisting>
        -- Either of the following Safe Haskell pragmas would do
        {-# LANGUAGE Trustworthy #-}
        {-# LANGUAGE Safe #-}

        module RIO (RIO(), runRIO, rioReadFile, rioWriteFile) where

        -- Notice that symbol UnsafeRIO is not exported from this module!
        newtype RIO a = UnsafeRIO { runRIO :: IO a }

        instance Monad RIO where
            return = UnsafeRIO . return
            (UnsafeRIO m) >>= k = UnsafeRIO $ m >>= runRIO . k

        -- Returns True iff access is allowed to file name
        pathOK :: FilePath -> IO Bool
        pathOK file = {- Implement some policy based on file name -}

        rioReadFile :: FilePath -> RIO String
        rioReadFile file = UnsafeRIO $ do
          ok &lt;- pathOK file
          if ok then readFile file else return ""

        rioWriteFile :: FilePath -> String -> RIO ()
        rioWriteFile file contents = UnsafeRIO $ do
          ok &lt;- pathOK file
          if ok then writeFile file contents else return ()
      </programlisting>

      We compile Danger using the new Safe Haskell <option>-XSafe</option> flag:

      <programlisting>
        {-# LANGUAGE Safe #-}
        module Danger ( runMe ) where

        runMe :: RIO ()
        runMe = ...
      </programlisting>

      Before going into the Safe Haskell details, lets point out some of
      the reasons this design would fail without Safe Haskell:

      <itemizedlist>
        <listitem>The design attempts to restrict the operations that Danger
          can perform by using types, specifically the <literal>RIO</literal>
          type wrapper around <literal>IO</literal>. The author of Danger can
          subvert this though by simply writing arbitrary
          <literal>IO</literal> actions and using <literal>unsafePerformIO ::
          IO a -> a</literal> to execute them as pure functions.
        </listitem>
        <listitem>The design also relies on the Danger module not being able
          to access the <literal>UnsafeRIO</literal> constructor.
          Unfortunately Template Haskell can be used to subvert module
          boundaries and so could be used gain access to this constructor.
        </listitem>
        <listitem>There is no way to place restrictions on the modules that
          the untrusted Danger module can import. This gives the author of
          Danger a very large attack surface, essentially any package
          currently installed on the system. Should any of these packages
          have a vulnerability then the Danger module can exploit this. The
          only way to stop this would be to patch or remove packages with
          known vulnerabilities even if they should only be used by
          trusted code such as the RIO module.
        </listitem>
      </itemizedlist>

      <para>
      To stop these attacks Safe Haskell can be used. This is done by compiling
      the RIO module with the <option>-XTrustworthy</option> flag and compiling
      the Danger module with the <option>-XSafe</option> flag.
      </para>

      <para>
      The use of the <option>-XSafe</option> flag to compile the Danger module
      restricts the features of Haskell that can be used to a
      <link linkend="safe-language">safe subset</link>. This includes
      disallowing <literal>unsafePerfromIO</literal>, Template Haskell, pure
      FFI functions, Generalized Newtype Deriving, RULES and restricting the
      operation of Overlapping Instances. The <option>-XSafe</option> flag also
      restricts the modules can be imported by Danger to only those that are
      considered trusted.  Trusted modules are those compiled with
      <option>-XSafe</option>, where GHC provides a mechanical guarantee that
      the code is safe. Or those modules compiled with
      <option>-XTrustworthy</option>, where the module author claims that the
      module is Safe.
      </para>

      <para>
      This is why the RIO module is compiled with
      <option>-XTrustworthy</option>, to allow the Danger module to import it.
      The <option>-XTrustworthy</option> flag doesn't place any restrictions on
      the module like <option>-XSafe</option> does. Instead the module author
      claims that while code may use unsafe features internally, it only
      exposes an API that can used in a safe manner. There is an issue here as
      <option>-XTrustworthy</option> may be used by an arbitrary module and
      module author. Because of this for trustworthy modules to be considered
      trusted, and so allowed to be used in <option>-XSafe</option> compiled
      code, the client C compiling the code must tell GHC that they trust the
      package the trustworthy module resides in. This is essentially a way of
      for C to say, while this package contains trustworthy modules that can be
      used by untrusted modules compiled with <option>-XSafe </option>, I trust
      the author(s) of this package and trust the modules only expose a safe
      API. The trust of a package can be changed at any time, so if a
      vulnerability found in a package, C can declare that package untrusted so
      that any future compilation against that package would fail.  For a more
      detailed overview of this mechanism see <xref linkend="safe-trust"/>.
      </para>

      <para>
      So Danger can import module RIO because RIO is marked trustworthy. Thus,
      Danger can make use of the rioReadFile and rioWriteFile functions to
      access permitted file names. The main application then imports both RIO
      and Danger. To run the plugin, it calls RIO.runRIO Danger.runMe within
      the IO monad. The application is safe in the knowledge that the only IO
      to ensue will be to files whose paths were approved by the pathOK test.
      </para>
    </sect3>
  </sect2>

  <sect2 id="safe-language">
    <title>Safe Language</title>

    The Safe Haskell <emphasis>safe language</emphasis> guarantees the
207
    following properties:
208

209
    <itemizedlist>
210
211
212
213
214
215
216
217
      <listitem><emphasis>Referential transparency</emphasis> &mdash; Functions
        in the safe language are deterministic, evaluating them will not
        cause any side effects. Functions in the <literal>IO</literal> monad
        are still allowed and behave as usual. Any pure function though, as
        according to its type, is guaranteed to indeed be pure. This property
        allows a user of the safe language to trust the types. This means,
        for example, that the <literal>unsafePerformIO :: IO a -> a</literal>
        function is disallowed in the safe language.
218
      </listitem>
219
220
221
222
223
224
225
226
227
228
229
230
231
      <listitem><emphasis>Module boundary control</emphasis> &mdash; Haskell
        code compiled using the safe language is guaranteed to only access
        symbols that are publicly available to it through other modules export
        lists. An important part of this is that safe compiled code is not
        able to examine or create data values using data constructors
        that it cannot import. If a module M establishes some invariants
        through careful use of its export list then code compiled using the
        safe language that imports M is guaranteed to respect those invariants.
        Because of this, <emphasis><link linkend="template-haskell">Template
        Haskell</link></emphasis> and <emphasis>
        <link linkend="newtype-deriving">GeneralizedNewtypeDeriving</link>
        </emphasis> are both disabled in the safe language as they can be used
        to violate this property.
232
      </listitem>
233
234
235
236
237
238
239
240
241
242
243
      <listitem><emphasis>Semantic consistency</emphasis> &mdash; The safe
        language is strictly a subset of Haskell as implemented by GHC. Any
        expression that compiles in the safe language has the same meaning as
        it does when compiled in normal Haskell. In addition, in any module
        that imports a safe language module, expressions that compile both
        with and without the safe import have the same meaning in both cases.
        That is, importing a module using the safe language cannot change the
        meaning of existing code that isn't dependent on that module. So for
        example, there are some restrictions placed on the <emphasis>
        <link linkend="instance-overlap">Overlapping Instances</link>
        </emphasis> extension as it can violate this property.
244
245
      </listitem>
    </itemizedlist>
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295

    <para>
    These three properties guarantee that you can trust the types in the safe
    language, can trust that module export lists are respected in the safe
    language and can trust that code that successfully compiles using the safe
    language has the same meaning as it normally would.
    </para>

    Lets now look at the details of the safe language. In the safe language
    dialect (enabled by <option>-XSafe</option>) we disable completely the
    following features:

    <itemizedlist>
      <listitem><emphasis>GeneralizedNewtypeDeriving</emphasis> &mdash; It can
        be used to violate constructor access control, by allowing untrusted
        code to manipulate protected data types in ways the data type author
        did not intend. For example can be used to break invariants of data
        structures.</listitem>
      <listitem><emphasis>TemplateHaskell</emphasis> &mdash; Is particularly
        dangerous, as it can cause side effects even at compilation time and
        can be used to access abstract data types. It is very easy to break
        module boundaries with TH.</listitem>
   </itemizedlist>

    In the safe language dialect we restrict the following features:
    <itemizedlist>
      <listitem><emphasis>ForeignFunctionInterface</emphasis> &mdash; This is
        mostly safe, but foreign import declarations that import a function
        with a non-IO type are disallowed. All FFI imports must reside in the
        IO Monad.</listitem>
      <listitem><emphasis>RULES</emphasis> &mdash; As they can change the
        behaviour of trusted code in unanticipated ways, violating semantic
        consistency they are restricted in function. Specifically any RULES
        defined in a module M compiled with <option>-XSafe</option> are
        dropped. RULES defined in trustworthy modules that M imports are still
        valid and will fire as usual.</listitem>
      <listitem><emphasis>OverlappingInstances</emphasis> &mdash; This
        extension can be used to violate semantic consistency, because
        malicious code could redefine a type instance (by containing a more
        specific instance definition) in a way that changes the behaviour of
        code importing the untrusted module. The extension is not disabled for
        a module M compiled with <option>-XSafe</option> but restricted. While M
        can define overlapping instance declarations, they can only overlap
        other instance declaration defined in M. If in a module N that imports
        M, at a call site that uses a type-class function there is a choice of
        which instance to use (i.e. an overlap) and the most specific instances
        is from M, then all the other choices must also be from M. If not, a
        compilation error will occur. A simple way to think of this is a
        <emphasis>same origin policy</emphasis> for overlapping instances
        defined in Safe compiled modules.</listitem>
296
297
298
299
300
301
	  <listitem><emphasis>Data.Typeable</emphasis> &mdash; We restrict Typeable
		  instances to only derived ones (offered by GHC through the 
		  <link linkend="deriving-typeable"><option>-XDeriveDataTypeable</option>
		  </link> extension). Hand crafted instances of the Typeable type class
		  are not allowed in Safe Haskell as this can easily be abused to 
		  unsafely coerce between types.</listitem>
302
    </itemizedlist>
303
304
305
306
307
308
309
310
311
312
313
  </sect2>

  <sect2 id="safe-imports">
    <title>Safe Imports</title>

    Safe Haskell enables a small extension to the usual import syntax of
    Haskell, adding a <emphasis>safe</emphasis> keyword:
    <programlisting>
      impdecl -> import [safe] [qualified] modid [as modid] [impspec]
    </programlisting>

314
315
316
317
318
319
320
    When used, the module being imported with the safe keyword must be a
    trusted module, otherwise a compilation error will occur. The safe import
    extension is enabled by either of the <option>-XSafe</option>,
    <option>-XTrustworthy</option>, or <option>-XSafeImports</option>
    flags and corresponding PRAGMA's. When the <option>-XSafe</option> flag
    is used, the safe keyword is allowed but meaningless, every import
    is required to be safe regardless.
321
322
323
324
325
326
  </sect2>

  <sect2 id="safe-trust">
    <title>Trust</title>

    The Safe Haskell extension introduces the following two new language flags:
327

328
    <itemizedlist>
329
330
331
332
333
334
335
336
337
338
339
340
341
      <listitem><emphasis>-XSafe</emphasis> &mdash; Enables the safe language
        dialect, asking GHC to guarantee trust. The safe language dialect
        requires that all imports be trusted or a compilation error will
        occur.</listitem>
      <listitem><emphasis>-XTrustworthy</emphasis> &mdash; Means that while
        this module may invoke unsafe functions internally, the module's
        author claims that it exports an API that can't be used in an unsafe
        way.  This doesn't enable the safe language or place any restrictions
        on the allowed Haskell code.  The trust guarantee is provided by the
        module author, not GHC. An import statement with the safe keyword
        results in a compilation error if the imported module is not trusted.
        An import statement without the keyword behaves as usual and can
        import any module whether trusted or not.</listitem>
342
343
    </itemizedlist>

344
    <para>
345
    Whether or not a module is trusted depends on a notion of trust for
346
    packages, which is determined by the client C invoking GHC (i.e. you). A
347
348
349
350
351
    package <emphasis>P</emphasis> is trusted when either C's package database
    records that <emphasis>P</emphasis> is trusted (and no command-line
    arguments override this), or C's command-line flags say to trust it
    regardless of what is recorded in the package database. In either case, C
    is the only authority on package trust. It is up to the client to decide
352
353
    which <link linkend="safe-package-trust">packages they trust</link>.
    </para>
354

355
    So a <emphasis>module M in a package P is trusted by a client C</emphasis>
356
    if and only if:
357

358
359
360
    <itemizedlist>
      <listitem>Both of these hold:
        <itemizedlist>
361
362
          <listitem> The module was compiled with <option>-XSafe</option>
            </listitem>
363
364
365
366
367
          <listitem> All of M's direct imports are trusted by C</listitem>
        </itemizedlist>
      </listitem>
      <listitem><emphasis>OR</emphasis> all of these hold:
        <itemizedlist>
368
369
          <listitem>The module was compiled with <option>-XTrustworthy</option>
            </listitem>
370
371
372
373
374
375
376
          <listitem>All of M's direct safe imports are trusted by C</listitem>
          <listitem>Package P is trusted by C</listitem>
        </itemizedlist>
      </listitem>
    </itemizedlist>

    For the first trust definition the trust guarantee is provided by GHC
377
    through the restrictions imposed by the safe language. For the second
378
379
380
381
    definition of trust, the guarantee is provided initially by the
    module author. The client C then establishes that they trust the
    module author by indicating they trust the package the module resides
    in. This trust chain is required as GHC provides no guarantee for
382
    <literal>-XTrustworthy</literal> compiled modules.
383
384

    <sect3 id="safe-trust-example">
385
      <title>Example</title>
386

387
388
389
390
391
392
393
394
395
396
397
398
399
      <programlisting>
        Package Wuggle:
           {-# LANGUAGE Safe #-}
           module Buggle where
             import Prelude
             f x = ...blah...

        Package P:
           {-# LANGUAGE Trustworthy #-}
           module M where
             import System.IO.Unsafe
             import safe Buggle
      </programlisting>
400

401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
      <para>
      Suppose a client C decides to trust package P. Then does C trust module
      M? To decide, GHC must check M's imports &mdash; M imports
      System.IO.Unsafe. M was compiled with <option>-XTrustworthy</option>, so
      P's author takes responsibility for that import. C trusts P's author, so
      C trusts M to only use its unsafe imports in a safe and consistent
      manner with respect to the API M exposes. M also has a safe import of
      Buggle, so for this import P's author takes no responsibility for the
      safety, so GHC must check whether Buggle is trusted by C. Is it? Well,
      it is compiled with <option>-XSafe</option>, so the code in Buggle
      itself is machine-checked to be OK, but again under the assumption that
      all of Buggle's imports are trusted by C. Prelude comes from base, which
      C trusts, and is compiled with <option>-XTrustworthy</option> (While
      Prelude is typically imported implicitly, it still obeys the same rules
      outlined here). So Buggle is considered trusted.
      </para>

      <para>
      Notice that C didn't need to trust package Wuggle; the machine checking
      is enough. C only needs to trust packages that contain
      <option>-XTrustworthy</option> modules.
      </para>
423
424
425
426
427
    </sect3>

    <sect3 id="safe-package-trust">
      <title>Package Trust</title>

428
429
430
      Safe Haskell gives packages a new Boolean property, that of trust.
      Several new options are available at the GHC command-line to specify the
      trust property of packages:
431
432

      <itemizedlist>
433
434
435
436
437
        <listitem><emphasis>-trust P</emphasis> &mdash; Exposes package P if it
          was hidden and considers it a trusted package regardless of the
          package database.</listitem>
        <listitem><emphasis>-distrust P</emphasis> &mdash; Exposes package P if
          it was hidden and considers it an untrusted package regardless of the
438
          package database.</listitem>
439
440
441
        <listitem><emphasis>-distrust-all-packages</emphasis> &mdash; Considers
          all packages distrusted unless they are explicitly set to be trusted
          by subsequent command-line options.</listitem>
442
443
      </itemizedlist>

444
445
      To set a package's trust property in the package database please refer to
      <xref linkend="packages"/>.
446
447
    </sect3>

448
449
    <sect3 id="safe-no-trust">
      <title>Safe Imports without Trust</title>
450
451
452
453
454
455
456
457
458
459
460

      If you are writing a module and want to import a module from an untrusted
      author, then you would use the following syntax:

      <programlisting>
        import safe Untrusted.Module
      </programlisting>

      As the safe import keyword is a feature of Safe Haskell and not Haskell98
      this would fail though unless you enabled Safe imports through on the of
      the Safe Haskell language flags. Three flags enable safe imports,
461
462
463
464
465
      <option>-XSafe, -XTrustworthy</option> and
      <option>-XSafeImports</option>. However <option>-XSafe</option> and
      <option>-XTrustworthy</option> do more then just enable the keyword which
      may be undesirable. Using the <option>-XSafeImports</option> language
      flag allows you to enable safe imports and nothing more.
466
    </sect3>
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
  </sect2>

  <sect2 id="safe-flag-summary">
    <title>Safe Haskell Flag Summary</title>

    In summary, Safe Haskell consists of the following language flags:

    <variablelist>
      <varlistentry>
        <term>-XSafe</term>
        <listitem>To be trusted, all of the module's direct imports must be
          trusted, but the module itself need not reside in a trusted
          package, because the compiler vouches for its trustworthiness. The
          "safe" keyword is allowed but meaningless in import statements,
          every import is required to be safe regardless.
          <itemizedlist>
            <listitem><emphasis>Module Trusted</emphasis> &mdash; Yes</listitem>
            <listitem><emphasis>Haskell Language</emphasis> &mdash; Restricted to Safe
              Language</listitem>
            <listitem><emphasis>Imported Modules</emphasis> &mdash; All forced to be
              safe imports, all must be trusted.</listitem>
          </itemizedlist>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term>-XTrustworthy</term>
        <listitem>This establishes that the module is trusted, but the
          guarantee is provided by the module's author. A client of this
          module then specifies that they trust the module author by
          specifying they trust the package containing the module.
          <option>-XTrustworthy</option> has no effect on the accepted range
          of Haskell programs or their semantics, except that they allow the
          safe import keyword.
          <itemizedlist>
            <listitem><emphasis>Module Trusted</emphasis> &mdash; Yes but only if the
              package the module resides in is also trusted.</listitem>
            <listitem><emphasis>Haskell Language</emphasis> &mdash; Unrestricted
            </listitem>
            <listitem><emphasis>Imported Modules</emphasis> &mdash; Under control of
              module author which ones must be trusted.</listitem>
          </itemizedlist>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term>-XSafeImport</term>
        <listitem>Enable the Safe Import extension so that a module can
          require a dependency to be trusted without asserting any trust
          about itself.
          <itemizedlist>
            <listitem><emphasis>Module Trusted</emphasis> &mdash; No</listitem>
            <listitem><emphasis>Haskell Language</emphasis> &mdash;
              Unrestricted</listitem>
            <listitem><emphasis>Imported Modules</emphasis> &mdash; Under control of
              module author which ones must be trusted.</listitem>
          </itemizedlist>
        </listitem>
      </varlistentry>

    </variablelist>
528

529
530
531
532
533
534
535
536
537
  </sect2>

</sect1>

<!-- Emacs stuff:
     ;;; Local Variables: ***
     ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") ***
     ;;; End: ***
 -->