This page presents a module breakdown of the safety of the Base package.
Green: Made safe with no modifications
Blue: Made trustworthy with no modifications
Yellow: Split out some unsafe functions to Module.Unsafe, made Module trustworthy
Red: Left unsafe
Most blue squares are blue because they import GHC.Base, which is currently unsafe. Others also import unsafePerformIO operations.
For splitting modules that contain both safe and unsafe symbols, I've moved the entire definition to a new module called, say, GHC.Arr.Imp, then added two new modules, GHC.Arr.Safe and GHC.Arr.Unsafe. GHC.Arr now imports the Safe and Unsafe modules and either exports just the Safe API or exports both Safe and Unsafe, depending on a CPP flag. This allows us to choose at compile time whether the base package is safe by default or not. I could have used a simpler approach, such as defining the entire module in GHC.Arr.Unsafe with no Imp module, but I preferred the Safe and Unsafe modules having disjoint APIs rather than Safe being a subset.
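The Imp / Safe / Unsafe layout described above can be sketched roughly as follows. This is only an illustration, not the actual base code: the Demo.Arr module names, the list-backed Arr type, its functions, and the DEMO_SAFE_DEFAULT macro are all hypothetical stand-ins, and the four modules would live in four separate files.

```haskell
-- File: Demo/Arr/Imp.hs (hypothetical; holds every definition)
module Demo.Arr.Imp (Arr, fromList, index, unsafeIndex) where

newtype Arr a = Arr [a]

fromList :: [a] -> Arr a
fromList = Arr

-- Checked lookup: returns Nothing when the index is out of range.
index :: Arr a -> Int -> Maybe a
index (Arr xs) i
  | i < 0     = Nothing
  | otherwise = case drop i xs of
      (x:_) -> Just x
      []    -> Nothing

-- Unchecked lookup: a stand-in for a genuinely unsafe primitive.
unsafeIndex :: Arr a -> Int -> a
unsafeIndex (Arr xs) i = xs !! i

-- File: Demo/Arr/Safe.hs (hypothetical; the safe half of the API)
{-# LANGUAGE Trustworthy #-}
module Demo.Arr.Safe (Arr, fromList, index) where
import Demo.Arr.Imp (Arr, fromList, index)

-- File: Demo/Arr/Unsafe.hs (hypothetical; disjoint from Safe)
{-# LANGUAGE Unsafe #-}
module Demo.Arr.Unsafe (unsafeIndex) where
import Demo.Arr.Imp (unsafeIndex)

-- File: Demo/Arr.hs (hypothetical facade; the CPP macro picks the exports)
{-# LANGUAGE CPP #-}
module Demo.Arr
  ( module Demo.Arr.Safe
#ifndef DEMO_SAFE_DEFAULT
  , module Demo.Arr.Unsafe
#endif
  ) where

import Demo.Arr.Safe
#ifndef DEMO_SAFE_DEFAULT
import Demo.Arr.Unsafe
#endif
```

Building with -DDEMO_SAFE_DEFAULT would make the facade export only the safe API. Without the flag it exports both halves, matching the existing interface.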
Keep in mind that anything in the IO monad is basically 'safe'. So Ptr and ForeignPtr are very dangerous, but as long as we only allow use of these in the IO monad, it's not really in the domain of Safe Haskell to guarantee any safety.
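As a small illustration of that point, the standard Foreign API only lets a pointer be read or written through IO actions, so pure code can pass a Ptr around but cannot dereference it. A minimal sketch using alloca, poke and peek from base:

```haskell
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (peek, poke)

main :: IO ()
main = do
  -- alloca hands us a temporary Ptr Int; poke and peek both live in IO,
  -- so the dereference cannot happen in pure code.
  v <- alloca $ \p -> do
    poke p (42 :: Int)
    peek p
  print v  -- prints 42
```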
For the low level primitives (Int#, Addr#, ByteArray#) I've taken a fairly heavy handed approach and kept them unsafe. It gets tricky to keep track of which operations are available at these low levels, and of whether GHC will catch exceptions generated using them (e.g. division by zero).
Below is the breakdown for just the GHC modules in base:
*I tried to split Weak into Unsafe and Safe modules and have GHC.Weak just expose the Safe API (i.e. this would make it a yellow box like the others). However, I wasn't able to figure out how to move the definition of Weak. Many of the GHC modules are wired in and require changes to compiler/prelude/PrelNames.hs. For all other modules I was able to update their builtin location fine, but for Weak I continually got link errors when trying to build libRts.a if I tried to move the definition of GHC.Weak around.
These are notes on specific modules and why they are the colour they are, etc.
GHC.Base and GHC.Prim:
Leaving unsafe. Had a go at making safe versions but it gets
pretty ugly and complex quickly. See Base Module for
a more detailed discussion.
Is it safe to expose ThreadId's constructors?
For the moment I've hidden both
GHC.Conc.IO and GHC.Conc.IO.Windows:
Made a safe version that doesn't contain the asyncReadBA and asyncWriteBA functions.
Perhaps these can be left in and GHC.Conc.IO just made trustworthy since their
result is in the IO monad but they take a 'MutableByteArray# RealWorld' as a second
Made trustworthy... Not sure of this though
Left unsafe and didn't make safe / unsafe split
Mostly seems fine, only worry is access to Ptr constructor.
Also re-exports GHC.Prim
made safe/unsafe split
Exposes Ptr constructor
Cast operations from FunPtr to Ptr seem dangerous as well, so they are removed from the safe version.
Made ForeignPtr type abstract
Has an 'unsafeForeignPtrToPtr' function, also excluded
The whole module seems a little dangerous (e.g. castForeignPtr). As long as pointers can
only be dereferenced in the IO monad we should be OK though.
(Foreign.ForeignPtr - as above)
(Foreign.Ptr - as above)
GHC.IO.Encoding.CodePage.Table: Exports raw Addr# arrays. Also pretty specific code, so it doesn't seem that useful outside of the base package.
Keeping unsafe with no safe version, as it is a deprecated module.
Made safe version due to access to IORef constructor
keeping unsafe and no safe version.
unpackCString#, among others, seems quite unsafe.
Export STRep type and runSTRep in safe version. Is this OK?
*Made a Safe version but I had to leave GHC.Weak alone. When I tried to move GHC.Weak to GHC.Weak.Imp I would constantly get link errors when linking the libRts library. I changed the values in compiler/prelude/PrelNames.hs for GHC.Weak but this didn't seem to work. So there is GHC.Weak.Safe and GHC.Weak.Unsafe but no GHC.Weak.Imp and GHC.Weak has to be unsafe.
Left unmodified and made trustworthy
'uncheckedShiftRL64' is a little scary sounding but seems fine.
Data.Data, Data.Dynamic and Data.Typeable
Left unsafe due to the whole Typeable issue.
Was left unsafe. It can leak information to the console without detection.
The root of the base package, and so of Haskell, is GHC.Base and GHC.Prim. Both contain a lot of code, and a lot of it is unsafe. Some of it is obviously unsafe, some less so. For example:
Addr# and Array# types are basically C style pointers, so there are no bounds checks. They can be used to access arbitrary memory, cause buffer overflows, etc.
divInt :: Int -> Int -> Int seems perfectly safe but division by zero throws an uncatchable exception that crashes the program. (Is this intentional or a bug?)
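For contrast with the low level division described above, the ordinary Prelude `div` on Int checks for a zero divisor and raises the DivideByZero exception, which can be caught like any other. The sketch below shows only that checked path. The `safeDiv` helper is a name made up for this example, and the unchecked primitive is deliberately not called here.

```haskell
import Control.Exception (ArithException (..), evaluate, try)

-- Hypothetical helper: force the division inside IO so that any
-- arithmetic exception is raised where 'try' can observe it.
safeDiv :: Int -> Int -> IO (Either ArithException Int)
safeDiv x y = try (evaluate (x `div` y))

main :: IO ()
main = do
  r <- safeDiv 1 0
  case r of
    Left DivideByZero -> putStrLn "caught: divide by zero"
    Left e            -> putStrLn ("caught: " ++ show e)
    Right v           -> print v
```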
It is also quite difficult to split this up since 1) GHC.Prim is defined inside of GHC not in any module text file, 2) GHC.Base is defined in a text file but extended by GHC (so GHC.Base exports Bool but Bool isn't defined in the actual GHC.Base text file).
This is potentially another argument for symbol level safety, since it would make handling Base and Prim easier.
This does mean a lot of stuff is trustworthy though since
they import Base. I'd be happy to deal with the complexity
of making Safe versions but it seemed like the ongoing
maintenance work wouldn't be worth the benefits.
The best solution might be to leave Base and Prim alone and make Base.Safe and Prim.Safe, both extended on demand (e.g. we just add safe symbols to them as needed to get modules that use Base and Prim in a safe way to work in -XSafe). A fine grained total split of Base and Prim is doable but seems like it might be a maintenance problem.
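One way to picture the on-demand approach is a thin trustworthy module that re-exports a hand-picked list of symbols, with entries added as safe client modules need them. The module name Demo.Base.Safe and the particular export list are assumptions for illustration only. The symbols themselves are ones GHC.Base does export.

```haskell
{-# LANGUAGE Trustworthy #-}
-- Hypothetical module: a growing allow-list over GHC.Base.
module Demo.Base.Safe
  ( map
  , foldr
  , (.)
  , ($)
  , id
  , const
  , otherwise
  -- add further safe symbols here as clients need them
  ) where

import Prelude ()  -- avoid clashing with the Prelude's copies
import GHC.Base (map, foldr, (.), ($), id, const, otherwise)
```

The maintenance cost is then proportional to the symbols actually requested, rather than to the full contents of Base and Prim.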
I feel we could enable all of this except make Typeable
abstract so that instances can't be defined. (Could also
still allow deriving of these instances). My understanding
is that all of this dynamic stuff works fine as long as
the typeOf method basically doesn't lie and pretend two
types are the same. The original SYB paper on Typeable from
memory basically said this and said that allowing programmers
to define their own instances of typeOf was really an implementation
artifact and that it should be left up to the compiler.
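To make the reliance on an honest typeOf concrete, here is a minimal sketch using `cast` from Data.Typeable. `cast` compares the type representations of the source and target types. With truthful representations it succeeds only when the types really match. An instance that lied about its representation could make the second cast below succeed, which would amount to an unchecked coercion.

```haskell
import Data.Typeable (cast)

main :: IO ()
main = do
  -- Same type on both sides: the representations match.
  print (cast (3 :: Int) :: Maybe Int)   -- prints Just 3
  -- Different types: the representations differ, so the cast is refused.
  print (cast (3 :: Int) :: Maybe Bool)  -- prints Nothing
```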