add "true group by" function in Data.List
Motivation
It is about the groupBy function in Data.List https://hackage.haskell.org/package/base-4.14.1.0/docs/Data-List.html#v:groupBy
In my own words: this function puts things in groups as long as they are the same, and creates a new group when a difference occurs.
This is not the same as grouping all things together which are equal (as by predicate in first argument). This is described in the documentation of https://hackage.haskell.org/package/base-4.14.1.0/docs/Data-List.html#v:group
Where the example yields: ["M","i","ss","i","ss","i","pp","i"]
instead of "true grouping" ["M","iiii","ssss","pp"]
.
Always when i used groupBy i had to sort first to solve this problem. But some data it's not easy to sort/hash but easy to compare in which case the groupBy function can not be used for "true grouping", thus there is no grouping function available in this use case.
I think the "true group by" is more frequently required than "groupBy" as it exists now. This is just from my experience.
For new users it's confusing when "groupBy does not group". Already had multiple bugs because of this.
Proposal
This feature request has two parts which can be accepted / rejected individually.
- Add documentation to groupBy function that more explicitly explains the behavior of "group until difference occurs" (probably need better wording). Also add big O notation for this function.
- Add function for "true group by" (needs better name)
trueGroupBy :: (a -> a -> Bool) -> [a] -> [[a]]
trueGroupBy f [] = []
trueGroupBy f (x:xs) = let (yes, no) = partition (f x) xs
in (x : yes) : trueGroupBy f no
This function is O(n^2) and should point in documentation to consider usage of groupBy instead since that one should be faster.