Skip to content

`BufferCodec`s have unnecessary allocations

(WIP code in !9948 (closed))

The GHC.IO.Encoding.Types module contains:

type CodeBuffer from to = Buffer from -> Buffer to -> IO (CodeProgress, Buffer from, Buffer to)

data BufferCodec from to state = BufferCodec {
  encode :: CodeBuffer from to,
  recover :: Buffer from -> Buffer to -> IO (Buffer from, Buffer to),
  ...
}

Given a simple test program to benchmark this:

{-# LANGUAGE MagicHash #-}
{-# OPTIONS_GHC -dno-typeable-binds -O2 #-}

module Main (main) where

import System.IO
import Data.Bits
import GHC.Int
import GHC.Exts
import System.Environment

main :: IO ()
main = do
  [n] <- getArgs
  withFile "/dev/null" WriteMode (loop (read n))

loop :: Int -> Handle -> IO ()
loop 0  !_ = pure ()
loop !n !h = do
  hPutChar h $! dummy_char n
  loop (n-1) h

-- unsafe efficient version of `chr`
my_chr :: Int -> Char
my_chr (I# i) = C# (chr# i)

-- return either a or b
dummy_char :: Int -> Char
dummy_char !i = my_chr ((i .&. 1) + 97)

We can extract a ticky profile of this to give us allocation statistics about UTF8 encoding:

    Entries       Alloc     Alloc'd  Non-void Arguments   STG Name
--------------------------------------------------------------------
 100001    14400144           0   8 PMEiwiiS              GHC.IO.Encoding.UTF8.$wutf8_encode{v rh} (fun)
 100001     9600096           0   2 SS                    GHC.IO.Encoding.UTF8.mkUTF1{v rn} (fun)
 100000     6400000           0   2 cS                    $l$wact1_g5R1{v} (GHC.IO.Handle.Text) (fun)
 100002     4800096           0   4 LM>P                  GHC.IO.Handle.Internals.$wdo_operation{v rR} (fun)
 100002     4000080           0   4 LMP>                  GHC.IO.Handle.Internals.$wwantWritableHandle'{v r1N} (fun)
1700034     2400048           0   1 S                     sat_s5F0{v} (GHC.IO.Handle.Internals) (fun) in s5DP
 100000     2400000           0   4 >>SS                  GHC.IO.Handle.Internals.$wstreamEncode{v r1c} (fun)
4100082     1600032           0   0                       io{v s5DP} (GHC.IO.Handle.Internals) (fun) in r1N
 100000     1600000           0   2 Mc                    GHC.IO.Handle.Text.$whPutChar{v rL} (fun)

We can also extract heap statistics when running the program:

./HandlePerf 100 +RTS -s
      140,952 bytes allocated in the heap
./HandlePerf 1000 +RTS -s
      796,440 bytes allocated in the heap
./HandlePerf 10000 +RTS -s
     7,348,944 bytes allocated in the heap
./HandlePerf 100000 +RTS -s
    72,871,608 bytes allocated in the heap

A lot of these allocations are due to the use of boxed tuples in recover and decode - since every call to these will allocate a tuple for the result. We could instead define these as:

type CodeBuffer from to = Buffer from -> Buffer to -> IO (CodeProgress, Buffer from, Buffer to)
type CodeBuffer# from to = Buffer from -> Buffer to -> State# RealWorld -> (# State# RealWorld, CodeProgress, Buffer from, Buffer to #)

data BufferCodec from to state = BufferCodec {
  encode# :: CodeBuffer# from to,
  recover# :: Buffer from -> Buffer to -> State# RealWorld -> (# State# RealWorld, Buffer from, Buffer to #),
  ...
}

{-# INLINE encode #-}
encode :: BufferCodec from to state -> CodeBuffer from to
encode codec from to = IO $ \s -> case encode# codec from to s of
  (# s', progress, from', to' #) -> (# s', (progress, from', to') #)

{-# INLINE recover #-}
recover :: BufferCoder from to state -> Buffer from -> Buffer to -> (Buffer from, Buffer to)
recover codec from to = IO $ \s -> case recover# codec from to s of
  (# s', from', to' #) -> (# s', (from', to') #)

Then, if we propagate these changes to the encoders/decoders, such as GHC.IO.Encoding.UTF8.utf8_encode, we get the following allocation numbers:

     Entries       Alloc     Alloc'd  Non-void Arguments   STG Name
--------------------------------------------------------------------
     100001    14400144           0   8 PMEiwiiS           GHC.IO.Encoding.UTF8.$wutf8_encode{v r2mD} (fun)
     100001     6400064           0   2 SS                 GHC.IO.Encoding.UTF8.utf8_encode{v rC3} (fun)
     100000     6400000           0   2 cS                 $l$wact1_g5QY{v} (GHC.IO.Handle.Text) (fun)
     100002     4800096           0   4 LM>P               GHC.IO.Handle.Internals.$wdo_operation{v r4Zo} (fun)
     100002     4000080           0   4 LMP>               GHC.IO.Handle.Internals.$wwantWritableHandle'{v r4Zy} (fun)
    1700034     2400048           0   1 S                  sat_s5Fh{v} (GHC.IO.Handle.Internals) (fun) in s5E6
     100000     2400000           0   4 >>SS               GHC.IO.Handle.Internals.$wstreamEncode{v r4Yn} (fun)
    4100082     1600032           0   0                    io{v s5E6} (GHC.IO.Handle.Internals) (fun) in r4Zy
     100000     1600000           0   2 Mc                 GHC.IO.Handle.Text.$whPutChar{v r51K} (fun)
./HandlePerf 100 +RTS -s
         106,792 bytes allocated in the heap
./HandlePerf 1000 +RTS -s
         459,896 bytes allocated in the heap
./HandlePerf 10000 +RTS -s
       3,988,416 bytes allocated in the heap
./HandlePerf 100000 +RTS -s
      39,271,096 bytes allocated in the heap
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information