Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
GHC
GHC
  • Project overview
    • Project overview
    • Details
    • Activity
    • Releases
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 4,273
    • Issues 4,273
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
    • Iterations
  • Merge Requests 413
    • Merge Requests 413
  • Requirements
    • Requirements
    • List
  • CI / CD
    • CI / CD
    • Pipelines
    • Jobs
    • Schedules
  • Security & Compliance
    • Security & Compliance
    • Dependency List
    • License Compliance
  • Operations
    • Operations
    • Incidents
    • Environments
  • Analytics
    • Analytics
    • CI / CD
    • Code Review
    • Insights
    • Issue
    • Repository
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Collapse sidebar
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
  • Glasgow Haskell Compiler
  • GHCGHC
  • Issues
  • #11284

Closed
Open
Opened Dec 24, 2015 by Ben Gamari@bgamari🐢Maintainer

Lambda-lifting fails in simple Text example

Consider the example (which uses Text; I'm working on finding a more minimal example),

{-# LANGUAGE BangPatterns #-}
module T11284 where

import Data.Char (isSpace)
import Data.List (foldl')
import GHC.Exts (build)
import qualified  Data.Text as T
import qualified  Data.Text.Array as A

longestWord :: T.Text -> Int
longestWord t = foldl' max 0 $ map T.length $ fusedWords t

fusedWords :: T.Text -> [T.Text]
fusedWords t0 = build $ \cons nil ->
  let go !t
        | T.null t  = nil
        | otherwise = let (w, rest) = T.span (not . isSpace) t
                      in cons w (go $ T.dropWhile isSpace rest)
  in go t0

-- For reference
data Text = Text
    {-# UNPACK #-} !A.Array          -- payload (Word16 elements)
    {-# UNPACK #-} !Int              -- offset (units of Word16, not Char)
    {-# UNPACK #-} !Int              -- length (units of Word16, not Char)

longestWord here produces the simplified Core,

Ticket.$wgo :: GHC.Prim.ByteArray# -> GHC.Prim.Int# -> GHC.Prim.Int# -> [T.Text]
Ticket.$wgo = ...

-- > $wgo1 xs n = foldl' (\a b -> max a $ T.length b) n xs
Ticket.$wgo1 :: [T.Text] -> GHC.Prim.Int# -> GHC.Prim.Int#
Ticket.$wgo1 =
  \ (w_s4GJ :: [T.Text]) (ww_s4GN :: GHC.Prim.Int#) ->
    case w_s4GJ of _ {
      [] -> ww_s4GN;
      : y_a4vC ys_a4vD ->
        case y_a4vC
        of _ { Data.Text.Internal.Text dt_a4jP dt1_a4jQ dt2_a4jR ->
        let {
          a_a4jO :: GHC.Prim.Int#
          a_a4jO = GHC.Prim.+# dt1_a4jQ dt2_a4jR } in
        letrec {
          -- For the love of all that is good, why must you allocate?
          --
          -- This loop is essentially `T.length`, the first argument being
          -- the length accumulator and the second being an index into the
          -- ByteArray#
          $wloop_length_s4GI :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
          $wloop_length_s4GI =
            \ (ww1_s4Gz :: GHC.Prim.Int#) (ww2_s4GD :: GHC.Prim.Int#) ->
              -- Have we reached the end of the Text?
              case GHC.Prim.tagToEnum# @ Bool (GHC.Prim.>=# ww2_s4GD a_a4jO)
              of _ {
                False -> {
                  ...
                  -- in this body there are few cases analyses which
                  -- classify the code-points we encounter. The branches
                  -- are recursive calls of the form
                  $wloop_length_s4GI (GHC.Prim.+# ww1_s4Gz 1) (GHC.Prim.+# ww2_s4GD 1)
                  ...
                True -> ww1_s4Gz
              }; } in
        case $wloop_length_s4GI 0 dt1_a4jQ of ww1_s4GH { __DEFAULT ->
        case GHC.Prim.tagToEnum# @ Bool (GHC.Prim.<=# ww_s4GN ww1_s4GH)
        of _ {
          False -> Ticket.$wgo1 ys_a4vD ww_s4GN;
          True -> Ticket.$wgo1 ys_a4vD ww1_s4GH
        }
        }
        }
    }

longestWord :: T.Text -> Int
longestWord =
  \ (w_s4GT :: T.Text) ->
    case w_s4GT
    of _ { Data.Text.Internal.Text ww1_s4GW ww2_s4GX ww3_s4GY ->
    case Ticket.$wgo1 (Ticket.$wgo ww1_s4GW ww2_s4GX ww3_s4GY) 0
    of ww4_s4H2 { __DEFAULT ->
    GHC.Types.I# ww4_s4H2
    }
    }

Notice $wloop_length_s4GI: It should be a nice tight loop counting Unicode characters in the array dt_a4jP until it arrives at its end. However, GHC fails to lambda-lift this closure, thereby turning it into an allocating operation! Oh no!

Edited Mar 10, 2019 by Sebastian Graf
Assignee
Assign to
None
Milestone
None
Assign milestone
Time tracking
None
Due date
None
Reference: ghc/ghc#11284