I've changed get_shndx_table
to accept ObjectCode
and moved the check into it. Is that better?
aadaa-fgtaa (67f6ecea) at 20 Jun 20:38
Optimise ELF linker (#23464)
I've tried moving shndx_table
from ObjectCode
to ObjectCodeFormatInfo
but unfortunately that doesn't work — ObjectCodeFormatInfo
is initialized by ocInit_ELF
only after ObjectCode
has been validated by ocVerifyImage_ELF
, but ocVerifyImage_ELF
uses shndx_table
itself. This also means that all my previous patches were incorrect — ocVerifyImage_ELF
could access shndx_table
before it was initialized. Maybe that was the cause of the i386 CI failure, through I'm not sure how it never caused segfaults then.
I've ended up turning shndx_table
into a cache instead of precomputed value, to make sure I haven't missed some other control flow paths where shndx_table
would be used before initialization. The only catch here is that we cannot really use NULL to signal uninitialized shndx_table
because NULL is a valid return value of get_shndx_table
. So instead of that I use the address of an otherwise-unused global variable as uninitialized marker.
Overall, except for that weird segfault on aarch64 CI, which I hope isn't related to my changes (if I understand correctly it segfaulted in cabal-configure for _build/stage0/libraries/transformers/setup-config (?)), I think it is ready for review.
aadaa-fgtaa (76a0bdf0) at 11 Jun 20:31
Optimise ELF linker (#23464)
aadaa-fgtaa (14a37e3d) at 11 Jun 19:56
Optimise ELF linker (#23464)
aadaa-fgtaa (9afde73b) at 11 Jun 12:15
Optimise ELF linker (#23464)
aadaa-fgtaa (c2761072) at 11 Jun 12:14
Optimise ELF linker (#23464)
@bgamari, thanks for the review. I'm still a bit worried about that CI failure, but I also have no idea how these changes could cause it. The only thing I would still like to change here is to move new shndx_table
field from generic struct ObjectCode
to ELF-specific struct ObjectCodeFormatInfo
, which seems to be more appropriate place for it. Can I still push this change after MR was assigned to marge-bot?
aadaa-fgtaa (2e77270b) at 05 Jun 18:02
Optimise ELF linker (#23464)
aadaa-fgtaa (3d84af1c) at 05 Jun 17:37
Optimise ELF linker (#23464)
This is an attempt at optimizing runtime linker to speed up the reproducer of #23464. According to perf record
, about a half of its runtime is spent in checkProddableBlock
, around 25% in ocResolve_ELF
and ~10% in ocInit_ELF
. I tried to address their inefficiencies, but due to lack of detailed knowledge about linker I'm not sure whether these changes are correct.
ocInit_ELF
's bottleneck is repeated O(n) append to single linked lists relTable
, relaTable
and symbolTables
. This MR caches their last nodes in ObjectCodeFormatInfo
to allow O(1) append.ocResolve_ELF
spends most of its time running do_Elf_Rela_relocations
, which in turns mostly computes get_shndx_table(ehdrC)
. This MR hoists this call out of loop by passing its result to doElf_Rela_relocations
as an argument.checkProddableBlock
, which is accountable for half of reproducer's runtime, is traversing a large linked list ensuring that given block is within some block in the list, and panics otherwise. From the comments in rts/Linker.c
it appears that this is just a sanity check, yet it is currently enabled in non-debug RTS. This MR disables the checks when using non-debug RTS.On my machine, these changes speed up the first run of the reproducer from #23464 by more than 10 times, from ~9 to ~0.7 seconds.
Questions:
checkProddableBlock
checking only ghc's sanity or the correctness of the object file? Can this check be avoided in non-debug RTS? If that's not the case I can try to turn proddables
into a more suitable data structure.shndx_table
always returns the same value for a given Elf_Shdr
, or modifications to Elf_Shdr
can change its result?aadaa-fgtaa (c32b5262) at 03 Jun 13:34
ELF Linker: run checkProddableBlock
only with debug rts (#23464)
... and 2 more commits
aadaa-fgtaa (e86421a3) at 03 Jun 12:44
ELF Linker: run checkProddableBlock
only with debug rts (#23464)
... and 2 more commits
Classes with default signatures involving type synonym to rank-N type cannot be derived using non-standalone anyclass deriving.
{-# LANGUAGE RankNTypes, DerivingStrategies, DeriveAnyClass, DefaultSignatures #-}
module Bug where
import Data.Proxy
type Method a = forall b . Eq b => Proxy (a, b)
class Class a where
method :: Method a
default method :: Method a
method = Proxy
data Data
deriving anyclass Class
fails with
src/Bug.hs:17:21: error:
• Could not deduce (Eq b0)
arising from the 'deriving' clause of a data type declaration
from the context: Eq b
bound by the deriving clause for ‘Class Data’
at src/Bug.hs:17:21-25
The type variable ‘b0’ is ambiguous
These potential instances exist:
instance forall k (s :: k). Eq (Proxy s) -- Defined in ‘Data.Proxy’
instance Eq Ordering -- Defined in ‘GHC.Classes’
instance Eq Integer -- Defined in ‘GHC.Num.Integer’
...plus 22 others
...plus four instances involving out-of-scope types
(use -fprint-potential-instances to see them all)
Possible fix:
use a standalone 'deriving instance' declaration,
so you can specify the instance context yourself
• When deriving the instance for (Class Data)
|
17 | deriving anyclass Class
Using standalone deriving or instance Class Data where {}
, removing default signature for method
or expanding Method
synonym by hand solves the problem.
But the most interesting thing is that compiling the original program with -fdefer-type-errors
leads to a warning about the type error, but then -ddump-ds
output does not contain any typeError
invocation, and the relevant part of core looks as expected:
$dmmethod :: forall a. Class a => Method a
$dmmethod = \ (@a_ais) _ (@b_a1rX) _ -> Proxy @(*) @(a_ais, b_a1rX)
Rec {
$fClassData :: Class Data
$fClassData = $cmethod_a1s4 `cast` <Co:3>
$cmethod_a1s4 :: Method Data
$cmethod_a1s4 = \ (@b_a1s7) ($dEq_a1s8 :: Eq b_a1s7) -> $dmmethod @Data $fClassData @b_a1s7 $dEq_a1s8
end Rec }
The example above compiles without an error, as it does when using standalone deriving instead.
Type family equations with wrong names are accepted when the type family has a standalone kind signature.
{-# LANGUAGE TypeFamilies, StandaloneKindSignatures #-}
module Bug where
data Bar
type Foo :: *
type family Foo where
Bar = ()
Bar
used in equation is obviously wrong, still ghc compiles the program without an error. -ddump-types
reports correct axiom:
COERCION AXIOMS
axiom Bug.D:R:Foo :: Foo = ()
I'd expect ghc to reject the program with the same error message it gives if Foo
had no SAKS
Bug.hs:7:3: error:
• Mismatched type name in type family instance.
Expected: Foo
Actual: Maybe
• In the type family declaration for ‘Foo’
|
7 | Maybe = ()
Optional:
Constraints mentioning type variable bound by data constructor are solved even when they trigger overlapping instances.
{-# LANGUAGE FlexibleInstances, ExistentialQuantification #-}
module Bug where
import Data.Proxy
class Foo x where
x :: Proxy x -> String
instance Foo Int where
x _ = "Int"
instance Foo a where
x _ = "-"
data SomeProxy = forall x . SomeProxy (Proxy x)
foo :: SomeProxy -> String
foo (SomeProxy p) = x p
This code compiles without an error, although Foo Int
and Foo a
are obviously overlapping. Encoding existential type via universal quantification fails as expected
bar :: ((forall x . Proxy x -> String) -> String) -> String
bar f = f x
• Overlapping instances for Foo x arising from a use of ‘x’
Matching instances:
instance Foo a -- Defined at Bug.hs:15:10
instance Foo Int -- Defined at Bug.hs:12:10
(The choice depends on the instantiation of ‘x’
To pick the first instance above, use IncoherentInstances
when compiling the other instance declarations)
Compiling foo
gives the same error as bar
.
[I'm not sure if this is a bug or documentation issue. I'd prefer the former so I applied "bug" template]
With QuantifiedConstraints
, UndecidableInstances
and UndecidableSuperClasses
it is possible to
produce bottom dictionary, which isn't mentioned in docs as far as I can see.
{-# LANGUAGE UndecidableInstances, UndecidableSuperClasses, ConstraintKinds, FlexibleInstances,
GADTs, QuantifiedConstraints, TypeApplications, ScopedTypeVariables #-}
module QCLoop where
class c => Hold c
instance c => Hold c
data Dict c = c => Dict
anythingDict :: Dict c
anythingDict = go
where
go :: (Hold c => c) => Dict c
go = Dict
Produces bottom dictionary
Rec {
$dHold_rxi :: forall {c :: Constraint}. Hold c
$dHold_rxi = $dHold_rxi
end Rec }
anythingDict :: forall (c :: Constraint). Dict c
anythingDict = \ (@c) -> Dict ($dHold_rxi `cast` <Co:2>)
Either the program is rejected with "could not deduce c" or "too many iterations", or documentation for
UndecidableInstances
and/or UndecidableSuperClasses
mentions that they may result in
non-termination not only in typechecker but also at runtime.
I would really prefer the former as the ability to "prove" anything with such code scares me a bit.