Data.Text.length allocates one closure per character

In #11284 (closed) I noticed that map Data.Text.length applied to a list of Text values would result in code which would allocate one closure per character due to lack of lambda lifting.

It turns out it's even worse than this.

Consider this example (inspired by #11284 (closed)),

module Hi2 (hello) where

import qualified Data.Text as T

hello :: Int
hello = T.length hi

hi :: T.Text
hi = T.pack "hello"

When compiled with GHC 7.10.3 with -O, the following simplified Core is produced,

hello :: Int
hello =
  case unpackCString# "hello"#
  of _ { Text dt_a33D dt1_a33E dt2_a33F ->
  let {
    a_a33C :: Int#
    a_a33C = +# dt1_a33E dt2_a33F } in
  letrec {
    $wloop_length_s3bt :: Int# -> Int# -> Int#
    $wloop_length_s3bt =
      \ (ww_s3bk :: Int#) (ww_s3bo :: Int#) ->
        case tagToEnum# @ Bool (>=# ww_s3bo a_a33C) of _ {
          False ->
            case indexWord16Array# dt_a33D ww_s3bo of r#_a35n { __DEFAULT ->
            case tagToEnum# @ Bool (geWord# r#_a35n (__word 55296)) of _ {
              False -> $wloop_length_s3bt (+# ww_s3bk 1) (+# ww_s3bo 1);
              True ->
                case tagToEnum# @ Bool (leWord# r#_a35n (__word 56319)) of _ {
                  False -> $wloop_length_s3bt (+# ww_s3bk 1) (+# ww_s3bo 1);
                  True -> $wloop_length_s3bt (+# ww_s3bk 1) (+# ww_s3bo 2)
                }
            }
            };
          True -> ww_s3bk
        }; } in
  case $wloop_length_s3bt 0 dt1_a33E of ww_s3bs { __DEFAULT ->
  I# ww_s3bs
  }
  }

Even this simple example produces a length function which allocates.
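To make the shape of the Core above easier to see, here is a rough source-level model of the loop, written by hand. It is not the text library's implementation: the Utf16 type, fromUnits, and all function names are hypothetical stand-ins introduced for illustration. lengthClosure keeps the array and the end index as free variables of a local loop, much as $wloop_length refers to dt and a in the Core. lengthLifted passes them as explicit arguments to a top-level function, which is roughly what lambda lifting would produce. Whether GHC actually allocates for the local form depends on the compiler version and optimisation flags.

```haskell
import Data.Array.Unboxed (UArray, listArray, (!))
import Data.Word (Word16)

-- Hypothetical stand-in for Text's payload: an array of UTF-16 code
-- units plus an offset and a length, mirroring the three fields that
-- the Core pattern-matches on.
data Utf16 = Utf16 (UArray Int Word16) !Int !Int

-- Build a value from a list of code units (assumed helper for testing).
fromUnits :: [Word16] -> Utf16
fromUnits ws = Utf16 (listArray (0, length ws - 1) ws) 0 (length ws)

-- A code unit in 0xD800..0xDBFF (55296..56319) starts a surrogate pair.
isHighSurrogate :: Word16 -> Bool
isHighSurrogate w = w >= 0xD800 && w <= 0xDBFF

-- Closure-style loop: 'go' mentions 'arr' and 'end' as free variables.
lengthClosure :: Utf16 -> Int
lengthClosure (Utf16 arr off len) = go 0 off
  where
    end = off + len
    go n i
      | i >= end                  = n
      | isHighSurrogate (arr ! i) = go (n + 1) (i + 2)
      | otherwise                 = go (n + 1) (i + 1)

-- Lambda-lifted loop: a top-level function with no free variables;
-- everything the loop needs is passed as an argument.
loopLifted :: UArray Int Word16 -> Int -> Int -> Int -> Int
loopLifted arr end n i
  | i >= end                  = n
  | isHighSurrogate (arr ! i) = loopLifted arr end (n + 1) (i + 2)
  | otherwise                 = loopLifted arr end (n + 1) (i + 1)

lengthLifted :: Utf16 -> Int
lengthLifted (Utf16 arr off len) = loopLifted arr (off + len) 0 off
```

Loading this in GHCi and evaluating lengthClosure (fromUnits [104,101,108,108,111]) and the lengthLifted equivalent should give the same count for both; comparing the two under -ddump-simpl is one way to check how a given GHC treats each form.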

Trac metadata
Trac field Value
Version 7.10.3
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture