Avoiding head and tail in GHC codebase

added needs triage label

mentioned in merge request !8998 (closed)

mentioned in merge request !9002 (closed)

You shan't be alone... this has been on my mind for a while.

Awesome, yay!

mentioned in merge request !9006 (closed)

mentioned in merge request !9037 (closed)

added Pnormal Ttask labels

removed needs triage label

mentioned in merge request !9073 (closed)

mentioned in merge request !9078 (closed)

mentioned in commit ba417041

mentioned in commit 6a2eec98

mentioned in merge request !9083 (closed)

mentioned in merge request !9086 (closed)

mentioned in merge request !9090 (closed)

mentioned in merge request !9092 (closed)

mentioned in merge request !9091

It's not really a secret that I don't care if people use head or not. Personally I don't think the use of head is an actual issue in GHC. A quick search only turned up 2 tickets in the last 20 years that can be traced back to the use of head/tail.

That being said some of the patches removing uses of head have been very good and did make the code clearer which was nice to see.

Either way, I think if we want to exile the use of head from ghc on principle or enforce use of NonEmptyit should be because most contributors agree. While most removals of head were good (and generally Improved code clarity) I can't say that I loved most of the introductions of NonEmpty the code base. Between random INLINE pragmas, more verbose code and potential performance pitfalls I'm highly skeptical of this part of the initiative.

If GHC contributors agree that using NonEmpty is the way to go that's fine. But I think it's worth actually having this discussion. My personal stance is that: We should not introduce code patterns to GHC which are likely to degrade performance in order to rule out a rare class of errors.

But I'm aso not convinced that we have to choose between a slower compiler or the type safety of NonEmpty.

The problem I saw in a few places with the recent MR's is that NonEmpty being a separate data constructor requires us often to either be polymorphic on the input or handle the base case separately.

E.g if we have a function someUtility :: [SomeTy] -> SomeResult and we have uses both at NonEmpty and [a] (where the list is genuinly allowed to be empty) now we have a number of choices.

Use [] and require call sites having NonEmpty to convert to a list.
Provide a list and NonEmpty version of the function.
Use a overloaded definition someUtility :: Foldable f => f SomeTy -> SomeResult

I think both option 1. and 3. are acceptable. Option 1 is slightly more annoying to write code around in some cases and could come at a slight perf cost. Option 3 is something that was used in !9092 (diffs)

This requires us to either use INLINE, INLINEABLE or SPECIALIZE to ensure specialization.

INLINE is usually the wrong choice, as it will cause code bloat by inlining into places where it makes no sense to do so. It also does not INLINEABLE is better, can cause dupliate specializations and therefore code and compile time bloat so I don't think that's great.

SPECIALIZE is the best of this bunch. But it will result in at least two specializations for each such functions. It also doesn't guarantee the same level of performance! In particular if we overload e.g. foldr we go from one call to FoldableList_foldr <args> to FoldableNonEmpty_foldr which will apply the function once and then call out to FoldableList_foldr for the rest of the list. Even if the folds get inlined we still end up with more code which is bad for various reasons.

Orthogonal to that there is often overhead associated from convertion between lists and NonEmpty and so on.

So what can we do without degrading performance.

What I suggest is that instead of using NonEmpty we should use a newtype around data structures which describes them as NonEmpty instead. This means we get zero runtime overhead for conversions between lists and non-empty lists as they are represented by the same object at runtime. But we can still get the benefits of expressing the fact that a List (or with this approach any other data structure!) is not empty in the type system. And we get to reuse the safe operations for Lists at no cost.

So something along the lines of:

newtype NE a = UnsafeNE { unNE :: a }

safeHead :: NE [a] -> a
...

This would improve code-reuse, should sidestep most of the perf issues arising from the use of NonEmpty. It also allows us to expand the use to non-list types if we chose to do so. I think I would vastly prefer such an approach to the use of NonEmpty all over the code base. But I'm interested in what others think. In particular those proposing these changes.

I can't speak for others here, but my major motivation for this work (both NonEmpty and avoiding panic more broadly) has been my experience trying to add new features of GHC: some parts of the codebase feel like a minefield — modifying anything, it seems, is likely to cause a panic. (I think it particularly affected my efforts to clean up some of the comprehension/do machinery, and to implement "Type C" injectivity.)

A NE newtype would be much better than no static check at all. However, my preference is nonetheless for NonEmpty, unless and until we have strong reason to believe NonEmpty is hurting performance, for the following reasons:

The non-emptiness of NonEmpty is simple and obvious. A newtype wrapper module would need to be audited (not necessarily formally, but nonetheless).
Broadly, i would rather use the base library where possible in GHC, to ease the cognitive burden on contributors — most of all newer ones like myself!
It's not clear to me NonEmpty is really hurting performance much. Indeed, having one constructor, it is plausible to me in some cases it might slightly help (tho i have no evidence of this in any particular case).

Furthermore, for the Language.Haskell.Syntax subtree in particular, i would strongly argue for NonEmpty rather than a GHC-defined newtype wrapper, since we want to minimize (and ultimately abolish) the dependency on the GHC API. While this would be possible with a newtype wrapper (it would merely need to be defined in that package, or some other package split off of GHC), i believe it would also increase API friction.

PS: CI is currently badly set up to track regressions in performance that does not result in additional allocation (often the case for failed specialization) and code size. Which makes me further a bit uneasy about the influx of these changes adding NonEmpty.

I think code size metrics (potentially in CI) would be a very good idea! (I am not sure what to do about failed specialization.)

For the code size metrics, see #22256 (closed).

If GHC contributors agree that using NonEmpty is the way to go that's fine. But I think it's worth actually having this discussion.

I agree that it's a good debate to have.

I must say that I have been assuming that there is no perf hit to adopting these changes. If there was a significant perf hit, I'd indeed consider a newtype solution.

And indeed, all the compile time metrics I've seen have been 0.0% geom mean, with some small up-and-down. So is there evidence that we are taking a perf hit?

And indeed, all the compile time metrics I've seen have been 0.0% geom mean, with some small up-and-down. So is there evidence that we are taking a perf hit?

As I said above we currently don't track instructions executed or code size. I can believe that there is no huge allocation difference. But things like failed specialization or code bloat wouldn't show up in our current perf tests.

NE also has no fusion rules defined for it as far as I can tell opening up for more perf issues down the road.

PS: At least the patches I looked at so far which have landed have rarely added NonEmpty methods. I only started seeing patches using NE recently and that's when I commented.

Re: fusion: good point, see #22265.

I am not sure of which cases we must beware — many NonEmpty functions are defined in terms of their [] counterparts. I guess in (most) such cases an INLINE pragma should be enough, yes?

I guess in (most) such cases an INLINE pragma should be enough, yes?

I would expect fusion to just break if it involves NonEmpty in any form today, even with INLINE pragmas.

Generally fusion requires code of the form consumer (. producer/consumer)* . producer. If we insert a function working on NonEmpty anywhere in between I don't see how it could still work since none of them are good producers/consumers.

We can ensure that there are no issues with specialisation by using monomorphic functions, e. g., NE.map instead of fmap. I believe this is already mandated by Simon's style guide.

With regards to fusion, I have certain doubts that it ever happens outside of especially tight loops. If list is a part of a data structure, it's unlikely to be able to fuse, and in this case replacing it with a NonEmpty should not harm. Even if lists are passed as function arguments, producer / consumer fusion requires either {-# RULES#-}, or excessive inlining, which I do not recall seeing much in the "business logic" of GHC. My knowledge is of course very limited here, so happy to be corrected.

While other optimisations can be invisible for allocation counters, list fusion should definitely have an impact on them. Since we do not observe regressions in allocations (and in fact witnessed small improvements), it seems unlikely to me that the changes in question impact any existing list fusion in a material way.

We can ensure that there are no issues with specialisation by using monomorphic functions, e. g., NE.map instead of fmap. I believe this is already mandated by Simon's style guide.

In practice it seems we have added a few polymorphic functions since. Which hopefully will be fine as long as the right SPECIALIZE pragmas are in place. But it is a burden in MR's and I've reviewed MR's since I lasted posted here where these were missing.

With regards to fusion

Stuff like map f . filter sum . map or map . map is not at all uncommon in GHC and I would expect them to perform worse when using NonEmpty. It's not so much that I fear the MR's regression currently tested cases as these should be caught by CI.

But a scenario where we change a type to NonEmpty, and later on someone writes something like NE.map fst . NE.filter foo . NE.map bar seems reasonably where the author likely expects this to fuse as it would with lists which it currently will not.

At a very basic level I'm concerned about this ending up being a trade off between easier invariant enforcement and worse compiler performance but now but especially down the road.

It's simply easier to optimize code that's just using lists both for the compiler and for the developer.

Ideally we would have easier invariant enforcement with NonEmpty (a-la Haskell!) and still have great performance.

"Down the road", don't we want more NonEmpty users other than GHC? Should everyone slightly concerned about performance avoid NonEmpty altogether?

But a scenario where we change a type to NonEmpty, and later on someone writes something like NE.map fst . NE.filter foo . NE.map bar seems reasonably where the author likely expects this to fuse as it would with lists which it currently will not

Why will it not? Is it possible to have it fuse?

I guess I'm on the invariant enforcement side of the board:

I think that's what we ultimately want to have (think "if everything was equally fast which would I do")
And we can then make it performant

Also, I haven't understood if it does end up being less performant at all. As for a potential expression being less performant: Haskell developers have to pay attention to such performance pit falls. This particular case doesn't stand out that much to warrant not doing the best thing just to avoid someone potentially walking into it.

mentioned in issue #22256 (closed)

mentioned in merge request !9198 (closed)

mentioned in merge request !9230 (closed)

mentioned in merge request !9369 (closed)

changed the description

mentioned in issue #22610 (closed)

mentioned in merge request !9566 (closed)

mentioned in merge request !9789 (closed)

The bottomline is that @strake (thanks!) and me cleaned up every head and tail outside of compiler/ and fixed many usages inside compiler/. Further progress requires changes to core data structures, such as groups of recursive bindings:

data Bind b = NonRec b (Expr b)
            | Rec [(b, (Expr b))]

Such changes should rather be driven by core GHC developers and not by drive-by contributors like me, because of potential performance consequences, present and future. Since I don't feel comfortable to pursue it forwards and GHC team is busy with more important matters, closing.

closed

mentioned in merge request !13984

Avoiding head and tail in GHC codebase

Child items 0

Activity

So what can we do without degrading performance.