Tweaks to the existing record system
This page is used to discuss minor tweaks to the existing record system if it is decided that it will be left in basically unchanged. radical or brand new record systems should be discussed elsewhere.
Punning
Reallow punning, Foo {x,y,z} would be interpreted as Foo {x = x, y = y, z = z} in both declaration and pattern matching contexts.
Update
Update syntax should not bottom out when fields are undefined, e.g.
data Foo = Foo { x :: String, y :: Int } | Bar { x :: String }
foo = Bar { x = "hello }
baz = foo { y = 3 }
should not result in an error, but rather pass foo through unchanged. update should update the record if it exists, but pass the type through otherwise. This would make the update syntax actually useful.
Label-based pattern-matching
The function:
f val { x = "foo" } = 4
should match if passed a Foo or a Bar with x being equal to "foo" and val would be bound to its argument (like an @ pattern)
g _ { y = 3 } = 4
would match only the Bar constructor since it is the only one with a y field.
This would mitigate the problems caused by accessors being partial functions since you can use a simple case statement to get the effect of an accesor that returns its result in a Maybe.
Note from Simon. I hate that the above defn of 'f' has just one argument (val {x="foo")),
whereas it looks as if it has two. (This is a problem with existing Haskell.) It looks
like 'f' has an argument 'val' and another arguement that is a free-standing record,
something we really want in the end anyhow. Not sure how to fix this. val@{x="foo")
?
First-class syntax
First class update and setting syntax (more advanced, needs better syntax). A syntax for updating and setting fields should be allowed. Some possibilites are
foo { x = }
would be equivalent to (\v -> foo { x = v })
foo { x \ }
would be equivalent to
(\f -> foo { x = case foo of _ {x} -> foo { x = f x }; _ -> foo })
Polymorphic record update
Given a record like:
data Foo a = Foo { bar :: a }
it would be nice to be able to update it like:
f = Foo { bar = 'a' }
g = f { bar = False }
Note the change in the type of the stored field. At the moment, such a record update must be written using the data constructor, not the update syntax.
SLPJ: That isn't true. Haskell 98 already supports what you suggest here, and the code above compiles with GHC without flags.
However I would like to argue for removing this feature. It causes a quite unreasonable amount of
pain in the type checker, especially once we start thinking about GADTs and existentials. And I think the benefit is modest at best. I don't think programmers generally expect record update to be a type-changing operation.
'Open' statement
Having the ability to 'open' a record bringing all its values into scope would be useful for techniques such as first class modules when combined with PolymorphicComponents. a proposal is
data Record = Record { foo :: Int, bar :: String }
f :: Record -> Int
f x = ... where
open x
ans = ...
will desugar to
f x = ... where
Record { foo = foo } = x
Record { bar = bar } = x
ans = ...
open x
would be allowed at the top level, in a let binding, or in a where binding.
Abstraction
It is often useful to limit the ability of users to fill in or access parts of a data type arbitrarily to maintain invariants, instituting the following rule would let you enforce that to some degree:
- Positional notation for pattern matching and constructor creation for a constructor may not be used unless all of its field labels are in scope.
This would insure that by not exporting a field label, it cannot be gotten around by using positional notation. This fix would also require the polymorphic setting ability mentioned above and would partially mitigate the need for ReadonlyConstructors
Polymorphic Record Update take II
(The following was discussed briefly on the Haskell' list.) Consider the following data type:
data T a
= C1 { f1 :: a }
| C2 { f1 :: a, f2 :: Int }
| C3 { f2 :: Int }
deriving Show
Suppose we want to update the field f1
only in such a way that
its type changes. We cannot use the record update syntax, as not
all constructors have a field f1
. So we write a utility function.
However, we would prefer to do as little as possible when it
comes to values constructed by constructors NOT having a field
f2
. One might naively try this:
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x = x
But of course, this does not type check as the type of x
is
different on the LHS and RHS. We can get around that by reconstructing
the value on the RHS:
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x@(C3 {f2 = n}) = C3 {f2 = n}
However, this is bad, because we have to change the code if further
constructors are added, even when they do not have a field f1
,
and we also have to change the code if further fields are added
to constructors not having the field f1
. This is tedious,
error prone, and really defeats one of the main reasons for using
records in the first place. For example:
data T a
= C1 { f1 :: a }
| C2 { f1 :: a, f2 :: Int }
| C3 { f2 :: Int, f3 :: Char }
| C4 { f2 :: Int }
deriving Show
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x@(C3 {f2 = n, f3 = c}) = C3 {f2 = n, f3 = c}
foo x@(C4 {f2 = n}) = C4 {f2 = n}
One might think it would be possible to do better if we're furtunate
enough to have a field that is common to *all* constructors not having
a field f1
, as is the case for f2
in this case:
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x = x {f2 = f2 x}
But this does not type check, and it would not apply anyway if there is no such common field.
What we really need is a function that reconstructs a value of type T a
at type T b
for all values constructed by a constructor that does not have
a field f1
:
coerce_no_f1 :: T a -> T b
coerce_no_f1 x@(C3 {f2 = n, f3 = c}) = C3 {f2 = n, f3 = c}
coerce_no_f1 x@(C4 {f2 = n}) = C4 {f2 = n}
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x = coerce_no_f1 x
But we'd rather not have to write such functions by hand, just as
we'd rather not write update functions by hand. Maybe the record
update syntax could be extended so that the function that gets
generated behind the scenes only includes constructors that
does NOT mention a particular field. For example, the field
name(s) that must not occur could be prefixed by ~
which suggests
negation in some settings. It does not have this connotation in Haskell,
but at least ~
is already a special symbol. We could then write:
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x = x {~f1}
Now the code for foo
only has to be changed if new constructors
having a field f1
are added.
Of course, it should be possible to combine this with the normal record update syntax. E.g.
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x = x {~f1, f2 = f2 x + 1}
GHC Extension
Note that in GHC you can work around it like this:
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo C3{..} = C3{..}
To enable this extension, use -XRecordWildCards
.
Comment
(Comment from Niklas Broberg)
This suggestion would go a long way to alleviate the burden of boiler-plate coding. It is a conservative extension, and it is intuitive at that. Indeed I believe I have written code with the suggested update mechanism many times without thinking on the type mismatch (and been beaten on my fingers by the compiler of course). :-)
Is this really necessary? Adding ~
seems less intuitive to me than
just writing
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x = x
or
foo x = x {f2 = f2 x + 1}
for the last example. From an implementor's point of view, if we
expect the proper coercions to be inferred by the type checker it
would still have to check that there are indeed no more fields than
other than f1
that mention the parameter a
, and also that there
are no more constructors that mention f1
. Wouldn't it be just as
simple to assert that for all the fields that mention a
, none of
these appear in any of the remaining constructors?
On the other hand pattern matching would certainly be more expressive
if ~
is added, so perhaps adding it has merit of its own. If we
write
foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x = x {~f1}
there could still be more constructors in T a that do mention the f1
field, but there is no matching clause for them in the definition of
foo
. But I would see that as a second separate proposal, e.g. a
Proposal for Negation in Record Pattern Matching. Sure it would fit
very well with the Polymorphic record update discussed here, but I
would think they should be treated separately.
Meta-Proposal
Due to a lack of experience with alternative record systems, the consensus seems to be that we should stick with the current system, perhaps with a few of the minor tweaks mentioned above. (Which ones is a question still open for discussion.)
However, the main reason for there being little use of alternative candidates, would seem to be that they are not compatible with current Haskell. Thus, it would be useful to have some mechanism for experimental records to be tried out in real Haskell implementations before the next language committee (Haskell-double-prime) starts its work. Then there might be a possibility of one of them being accepted.
A concrete suggestion is that we separate out everything from the Report to do with named-field records into something like a self-contained addendum. Whilst still an official part of the language standard, it might also be marked as a possibility for future removal. This would make it clear what parts of the language could be changed (or re-used without conflict) in an alternative records system. The re-use part is especially important, since taking some of the same syntax to mean something different is pretty-much essential for useability.