Make lines stricter to fix space leak
I propose changing the implementation of lines to be stricter.
The current implementation causes (in GHC) a space leak, so it is better to sacrifice a little laziness to fix that leak.
Period of discussion: Three weeks, until 15^th^ October (because of ICFP).
Current implementation:
lines :: String -> [String]
lines "" = []
lines s = let (l, s') = break (== '\n') s
in l : case s' of
[] -> []
(_:s'') -> lines s''
The relevant part of the core showing the leak (-O or -O2, similar for -O0 but without inlining break
):
let {
ds1_sjM :: ([GHC.Types.Char], [GHC.Types.Char])
LclId
[Str: DmdType]
ds1_sjM =
case GHC.List.$wbreak @ GHC.Types.Char lvl1_rjW wild_B1
of _ { (# ww1_aj3, ww2_aj4 #) ->
(ww1_aj3, ww2_aj4)
} } in
GHC.Types.:
@ GHC.Base.String
(case ds1_sjM of _ { (l_aif, _) -> l_aif })
(case ds1_sjM of _ { (_, s'_aih) ->
case s'_aih of _ {
[] -> GHC.Types.[] @ GHC.Base.String;
: _ s''_adj -> Lines.lines s''_adj
}
})
ds1_sjM
keeps a reference to the first line until the second is demanded, preventing it from being GCed.
Current strictness properties:
lines _|_ = _|_
lines (_|_:_|_) = _|_ : _|_
Proposed implementation:
lines :: String -> [String]
lines "" = []
lines s = case break (== '\n') s of
(l, s') -> l : case s' of
[] -> []
(_:s'') -> lines s''
Due to the pattern matching on break
's result, the pair is never considered a single entity, allowing the first line to be GCed before the second is demanded.
Strictness properties of the proposed implementation:
lines _|_ = _|_
lines (_|_:_|_) = _|_
Generally, a _|_
as the first Char
of a line would produce a _|_
instead of the _|_ : _|_
which the current implementation produces. _|_
in any other place would be treated identically.
Trac metadata
Trac field | Value |
---|---|
Version | 6.12.3 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | libraries/base |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |