- 12 Jan, 2011 1 commit
-
-
simonpj@microsoft.com authored
This patch embodies many, many changes to the contraint solver, which make it simpler, more robust, and more beautiful. But it has taken me ages to get right. The forcing issue was some obscure programs involving recursive dictionaries, but these eventually led to a massive refactoring sweep. Main changes are: * No more "frozen errors" in the monad. Instead "insoluble constraints" are now part of the WantedConstraints type. * The WantedConstraint type is a product of bags, instead of (as before) a bag of sums. This eliminates a good deal of tagging and untagging. * This same WantedConstraints data type is used - As the way that constraints are gathered - As a field of an implication constraint - As both argument and result of solveWanted - As the argument to reportUnsolved * We do not generate any evidence for Derived constraints. They are purely there to allow "impovement" by unifying unification variables. * In consequence, nothing is ever *rewritten* by a Derived constraint. This removes, by construction, all the horrible potential recursive-dictionary loops that were making us tear our hair out. No more isGoodRecEv search either. Hurrah! * We add the superclass Derived constraints during canonicalisation, after checking for duplicates. So fewer superclass constraints are generated than before. * Skolem tc-tyvars no longer carry SkolemInfo. Instead, the SkolemInfo lives in the GivenLoc of the Implication, where it can be tidied, zonked, and substituted nicely. This alone is a major improvement. * Tidying is improved, so that we tend to get t1, t2, t3, rather than t1, t11, t111, etc Moreover, unification variables are always printed with a digit (thus a0, a1, etc), so that plain 'a' is available for a skolem arising from a type signature etc. In this way, (a) We quietly say which variables are unification variables, for those who know and care (b) Types tend to get printed as the user expects. If he writes f :: a -> a f = ...blah... then types involving 'a' get printed with 'a', rather than some tidied variant. * There are significant improvements in error messages, notably in the "Cannot deduce X from Y" messages.
-
- 30 Aug, 2010 1 commit
-
-
benl@ouroborus.net authored
-
- 30 Mar, 2010 1 commit
-
-
waern authored
The instances (and deriving declarations) have been taken from the ghc-syb package.
-
- 29 Oct, 2009 1 commit
-
-
simonpj@microsoft.com authored
This patch has been a long time in gestation and has, as a result, accumulated some extra bits and bobs that are only loosely related. I separated the bits that are easy to split off, but the rest comes as one big patch, I'm afraid. Note that: * It comes together with a patch to the 'base' library * Interface file formats change slightly, so you need to recompile all libraries The patch is mainly giant tidy-up, driven in part by the particular stresses of the Data Parallel Haskell project. I don't expect a big performance win for random programs. Still, here are the nofib results, relative to the state of affairs without the patch Program Size Allocs Runtime Elapsed -------------------------------------------------------------------------------- Min -12.7% -14.5% -17.5% -17.8% Max +4.7% +10.9% +9.1% +8.4% Geometric Mean +0.9% -0.1% -5.6% -7.3% The +10.9% allocation outlier is rewrite, which happens to have a very delicate optimisation opportunity involving an interaction of CSE and inlining (see nofib/Simon-nofib-notes). The fact that the 'before' case found the optimisation is somewhat accidental. Runtimes seem to go down, but I never kno wwhether to really trust this number. Binary sizes wobble a bit, but nothing drastic. The Main Ideas are as follows. InlineRules ~~~~~~~~~~~ When you say {-# INLINE f #-} f x = <rhs> you intend that calls (f e) are replaced by <rhs>[e/x] So we should capture (\x.<rhs>) in the Unfolding of 'f', and never meddle with it. Meanwhile, we can optimise <rhs> to our heart's content, leaving the original unfolding intact in Unfolding of 'f'. So the representation of an Unfolding has changed quite a bit (see CoreSyn). An INLINE pragma gives rise to an InlineRule unfolding. Moreover, it's only used when 'f' is applied to the specified number of arguments; that is, the number of argument on the LHS of the '=' sign in the original source definition. For example, (.) is now defined in the libraries like this {-# INLINE (.) #-} (.) f g = \x -> f (g x) so that it'll inline when applied to two arguments. If 'x' appeared on the left, thus (.) f g x = f (g x) it'd only inline when applied to three arguments. This slightly-experimental change was requested by Roman, but it seems to make sense. Other associated changes * Moving the deck chairs in DsBinds, which processes the INLINE pragmas * In the old system an INLINE pragma made the RHS look like (Note InlineMe <rhs>) The Note switched off optimisation in <rhs>. But it was quite fragile in corner cases. The new system is more robust, I believe. In any case, the InlineMe note has disappeared * The workerInfo of an Id has also been combined into its Unfolding, so it's no longer a separate field of the IdInfo. * Many changes in CoreUnfold, esp in callSiteInline, which is the critical function that decides which function to inline. Lots of comments added! * exprIsConApp_maybe has moved to CoreUnfold, since it's so strongly associated with "does this expression unfold to a constructor application". It can now do some limited beta reduction too, which Roman found was an important. Instance declarations ~~~~~~~~~~~~~~~~~~~~~ It's always been tricky to get the dfuns generated from instance declarations to work out well. This is particularly important in the Data Parallel Haskell project, and I'm now on my fourth attempt, more or less. There is a detailed description in TcInstDcls, particularly in Note [How instance declarations are translated]. Roughly speaking we now generate a top-level helper function for every method definition in an instance declaration, so that the dfun takes a particularly stylised form: dfun a d1 d2 = MkD (op1 a d1 d2) (op2 a d1 d2) ...etc... In fact, it's *so* stylised that we never need to unfold a dfun. Instead ClassOps have a special rewrite rule that allows us to short-cut dictionary selection. Suppose dfun :: Ord a -> Ord [a] d :: Ord a Then compare (dfun a d) --> compare_list a d in one rewrite, without first inlining the 'compare' selector and the body of the dfun. To support this a) ClassOps have a BuiltInRule (see MkId.dictSelRule) b) DFuns have a special form of unfolding (CoreSyn.DFunUnfolding) which is exploited in CoreUnfold.exprIsConApp_maybe Implmenting all this required a root-and-branch rework of TcInstDcls and bits of TcClassDcl. Default methods ~~~~~~~~~~~~~~~ If you give an INLINE pragma to a default method, it should be just as if you'd written out that code in each instance declaration, including the INLINE pragma. I think that it now *is* so. As a result, library code can be simpler; less duplication. The CONLIKE pragma ~~~~~~~~~~~~~~~~~~ In the DPH project, Roman found cases where he had p n k = let x = replicate n k in ...(f x)...(g x).... {-# RULE f (replicate x) = f_rep x #-} Normally the RULE would not fire, because doing so involves (in effect) duplicating the redex (replicate n k). A new experimental modifier to the INLINE pragma, {-# INLINE CONLIKE replicate #-}, allows you to tell GHC to be prepared to duplicate a call of this function if it allows a RULE to fire. See Note [CONLIKE pragma] in BasicTypes Join points ~~~~~~~~~~~ See Note [Case binders and join points] in Simplify Other refactoring ~~~~~~~~~~~~~~~~~ * I moved endPass from CoreLint to CoreMonad, with associated jigglings * Better pretty-printing of Core * The top-level RULES (ones that are not rules for locally-defined things) are now substituted on every simplifier iteration. I'm not sure how we got away without doing this before. This entails a bit more plumbing in SimplCore. * The necessary stuff to serialise and deserialise the new info across interface files. * Something about bottoming floats in SetLevels Note [Bottoming floats] * substUnfolding has moved from SimplEnv to CoreSubs, where it belongs -------------------------------------------------------------------------------- Program Size Allocs Runtime Elapsed -------------------------------------------------------------------------------- anna +2.4% -0.5% 0.16 0.17 ansi +2.6% -0.1% 0.00 0.00 atom -3.8% -0.0% -1.0% -2.5% awards +3.0% +0.7% 0.00 0.00 banner +3.3% -0.0% 0.00 0.00 bernouilli +2.7% +0.0% -4.6% -6.9% boyer +2.6% +0.0% 0.06 0.07 boyer2 +4.4% +0.2% 0.01 0.01 bspt +3.2% +9.6% 0.02 0.02 cacheprof +1.4% -1.0% -12.2% -13.6% calendar +2.7% -1.7% 0.00 0.00 cichelli +3.7% -0.0% 0.13 0.14 circsim +3.3% +0.0% -2.3% -9.9% clausify +2.7% +0.0% 0.05 0.06 comp_lab_zift +2.6% -0.3% -7.2% -7.9% compress +3.3% +0.0% -8.5% -9.6% compress2 +3.6% +0.0% -15.1% -17.8% constraints +2.7% -0.6% -10.0% -10.7% cryptarithm1 +4.5% +0.0% -4.7% -5.7% cryptarithm2 +4.3% -14.5% 0.02 0.02 cse +4.4% -0.0% 0.00 0.00 eliza +2.8% -0.1% 0.00 0.00 event +2.6% -0.0% -4.9% -4.4% exp3_8 +2.8% +0.0% -4.5% -9.5% expert +2.7% +0.3% 0.00 0.00 fem -2.0% +0.6% 0.04 0.04 fft -6.0% +1.8% 0.05 0.06 fft2 -4.8% +2.7% 0.13 0.14 fibheaps +2.6% -0.6% 0.05 0.05 fish +4.1% +0.0% 0.03 0.04 fluid -2.1% -0.2% 0.01 0.01 fulsom -4.8% +9.2% +9.1% +8.4% gamteb -7.1% -1.3% 0.10 0.11 gcd +2.7% +0.0% 0.05 0.05 gen_regexps +3.9% -0.0% 0.00 0.00 genfft +2.7% -0.1% 0.05 0.06 gg -2.7% -0.1% 0.02 0.02 grep +3.2% -0.0% 0.00 0.00 hidden -0.5% +0.0% -11.9% -13.3% hpg -3.0% -1.8% +0.0% -2.4% ida +2.6% -1.2% 0.17 -9.0% infer +1.7% -0.8% 0.08 0.09 integer +2.5% -0.0% -2.6% -2.2% integrate -5.0% +0.0% -1.3% -2.9% knights +4.3% -1.5% 0.01 0.01 lcss +2.5% -0.1% -7.5% -9.4% life +4.2% +0.0% -3.1% -3.3% lift +2.4% -3.2% 0.00 0.00 listcompr +4.0% -1.6% 0.16 0.17 listcopy +4.0% -1.4% 0.17 0.18 maillist +4.1% +0.1% 0.09 0.14 mandel +2.9% +0.0% 0.11 0.12 mandel2 +4.7% +0.0% 0.01 0.01 minimax +3.8% -0.0% 0.00 0.00 mkhprog +3.2% -4.2% 0.00 0.00 multiplier +2.5% -0.4% +0.7% -1.3% nucleic2 -9.3% +0.0% 0.10 0.10 para +2.9% +0.1% -0.7% -1.2% paraffins -10.4% +0.0% 0.20 -1.9% parser +3.1% -0.0% 0.05 0.05 parstof +1.9% -0.0% 0.00 0.01 pic -2.8% -0.8% 0.01 0.02 power +2.1% +0.1% -8.5% -9.0% pretty -12.7% +0.1% 0.00 0.00 primes +2.8% +0.0% 0.11 0.11 primetest +2.5% -0.0% -2.1% -3.1% prolog +3.2% -7.2% 0.00 0.00 puzzle +4.1% +0.0% -3.5% -8.0% queens +2.8% +0.0% 0.03 0.03 reptile +2.2% -2.2% 0.02 0.02 rewrite +3.1% +10.9% 0.03 0.03 rfib -5.2% +0.2% 0.03 0.03 rsa +2.6% +0.0% 0.05 0.06 scc +4.6% +0.4% 0.00 0.00 sched +2.7% +0.1% 0.03 0.03 scs -2.6% -0.9% -9.6% -11.6% simple -4.0% +0.4% -14.6% -14.9% solid -5.6% -0.6% -9.3% -14.3% sorting +3.8% +0.0% 0.00 0.00 sphere -3.6% +8.5% 0.15 0.16 symalg -1.3% +0.2% 0.03 0.03 tak +2.7% +0.0% 0.02 0.02 transform +2.0% -2.9% -8.0% -8.8% treejoin +3.1% +0.0% -17.5% -17.8% typecheck +2.9% -0.3% -4.6% -6.6% veritas +3.9% -0.3% 0.00 0.00 wang -6.2% +0.0% 0.18 -9.8% wave4main -10.3% +2.6% -2.1% -2.3% wheel-sieve1 +2.7% -0.0% +0.3% -0.6% wheel-sieve2 +2.7% +0.0% -3.7% -7.5% x2n1 -4.1% +0.1% 0.03 0.04 -------------------------------------------------------------------------------- Min -12.7% -14.5% -17.5% -17.8% Max +4.7% +10.9% +9.1% +8.4% Geometric Mean +0.9% -0.1% -5.6% -7.3%
-
- 21 Aug, 2009 1 commit
-
-
Simon Marlow authored
For: FastStrings, Names, and Bin values. This makes .hi files smaller on 64-bit platforms, while also making the format a bit more robust.
-
- 28 May, 2009 1 commit
-
-
simonpj@microsoft.com authored
In Tempate Haskell -ddump-splices, the "after" expression is populated with RdrNames, many of which are Orig things. We used to print these fully-qualified, but that's a bit heavy. This patch refactors the code a bit so that the same print-unqualified mechanism we use for Names also works for RdrNames. Lots of comments too, because it took me a while to figure out how it all worked again.
-
- 11 Aug, 2008 1 commit
-
-
simonpj@microsoft.com authored
-
- 07 Aug, 2008 1 commit
-
-
batterseapower authored
-
- 04 Aug, 2008 1 commit
-
-
simonpj@microsoft.com authored
-
- 28 May, 2008 1 commit
-
-
Simon Marlow authored
This is a much more robust way to do recompilation checking. The idea is to create a fingerprint of the ABI of an interface, and track dependencies by recording the fingerprints of ABIs that a module depends on. If any of those ABIs have changed, then we need to recompile. In bug #1372 we weren't recording dependencies on package modules, this patch fixes that by recording fingerprints of package modules that we depend on. Within a package there is still fine-grained recompilation avoidance as before. We currently use MD5 for fingerprints, being a good compromise between efficiency and security. We're not worried about attackers, but we are worried about accidental collisions. All the MD5 sums do make interface files a bit bigger, but compile times on the whole are about the same as before. Recompilation avoidance should be a bit more accurate than in 6.8.2 due to fixing #1959, especially when using -O.
-
- 12 Apr, 2008 1 commit
-
-
Ian Lynagh authored
-
- 26 Jan, 2008 1 commit
-
-
twanvl authored
-
- 17 Jan, 2008 1 commit
-
-
Isaac Dupree authored
re-recording to avoid new conflicts was too hard, so I just put it all in one big patch :-( (besides, some of the changes depended on each other.) Here are what the component patches were: Fri Dec 28 11:02:55 EST 2007 Isaac Dupree <id@isaac.cedarswampstudios.org> * document BreakArray better Fri Dec 28 11:39:22 EST 2007 Isaac Dupree <id@isaac.cedarswampstudios.org> * properly ifdef BreakArray for GHCI Fri Jan 4 13:50:41 EST 2008 Isaac Dupree <id@isaac.cedarswampstudios.org> * change ifs on __GLASGOW_HASKELL__ to account for... (#1405) for it not being defined. I assume it being undefined implies a compiler with relatively modern libraries but without most unportable glasgow extensions. Fri Jan 4 14:21:21 EST 2008 Isaac Dupree <id@isaac.cedarswampstudios.org> * MyEither-->EitherString to allow Haskell98 instance Fri Jan 4 16:13:29 EST 2008 Isaac Dupree <id@isaac.cedarswampstudios.org> * re-portabilize Pretty, and corresponding changes Fri Jan 4 17:19:55 EST 2008 Isaac Dupree <id@isaac.cedarswampstudios.org> * Augment FastTypes to be much more complete Fri Jan 4 20:14:19 EST 2008 Isaac Dupree <id@isaac.cedarswampstudios.org> * use FastFunctions, cleanup FastString slightly Fri Jan 4 21:00:22 EST 2008 Isaac Dupree <id@isaac.cedarswampstudios.org> * Massive de-"#", mostly Int# --> FastInt (#1405) Fri Jan 4 21:02:49 EST 2008 Isaac Dupree <id@isaac.cedarswampstudios.org> * miscellaneous unnecessary-extension-removal Sat Jan 5 19:30:13 EST 2008 Isaac Dupree <id@isaac.cedarswampstudios.org> * add FastFunctions
-
- 14 Nov, 2007 1 commit
-
-
mnislaih authored
-
- 10 Sep, 2007 1 commit
-
-
Simon Marlow authored
This fixes the long-standing bug that prevents some code with mutally-recursive modules from being compiled with --make and -O, including GHC itself. See the comments for details. There are some additional cleanups that were forced/enabled by this patch: I removed importedSrcLoc/importedSrcSpan: it wasn't adding any useful information, since a Name already contains its defining Module. In fact when re-typechecking an interface file we were wrongly replacing the interesting SrcSpans in the Names with boring importedSrcSpans, which meant that location information could degrade after reloading modules. Also, recreating all these Names was a waste of space/time.
-
- 06 Sep, 2007 1 commit
-
-
Simon Marlow authored
This turned out to be a black hole, however we believe we now have a plan that does the right thing and shouldn't need to change again. Error messages will only ever refer to a name in an unambiguous way, falling back to <package>:<module>.<name> if no unambiguous shorter variant can be found. See HscTypes.mkPrintUnqualified for the details. Earlier hacks to work around this problem have been removed (TcSimplify).
-
- 04 Sep, 2007 1 commit
-
-
Ian Lynagh authored
-
- 03 Sep, 2007 1 commit
-
-
Ian Lynagh authored
Older GHCs can't parse OPTIONS_GHC. This also changes the URL referenced for the -w options from WorkingConventions#Warnings to CodingStyle#Warnings for the compiler modules.
-
- 01 Sep, 2007 1 commit
-
-
Ian Lynagh authored
-
- 11 May, 2007 1 commit
-
-
Simon Marlow authored
This has been a long-standing ToDo.
-
- 25 Apr, 2007 1 commit
-
-
simonpj@microsoft.com authored
-
- 29 Nov, 2006 1 commit
-
-
andy@galois.com authored
This changes the internal representation of TickBoxes, from Note (TickBox "module" n) <expr> into case tick<module,n> of _ -> <expr> tick has type :: #State #World, when the module and tick numbe are stored inside IdInfo. Binary tick boxes change from Note (BinaryTickBox "module" t f) <expr> into btick<module,t,f> <expr> btick has type :: Bool -> Bool, with the module and tick number stored inside IdInfo.
-
- 11 Oct, 2006 2 commits
-
-
Simon Marlow authored
This patch is a start on removing import lists and generally tidying up the top of each module. In addition to removing import lists: - Change DATA.IOREF -> Data.IORef etc. - Change List -> Data.List etc. - Remove $Id$ - Update copyrights - Re-order imports to put non-GHC imports last - Remove some unused and duplicate imports
-
Simon Marlow authored
This large commit combines several interrelated changes: - IfaceSyn now contains actual Names rather than the special IfaceExtName type. The binary interface file contains a symbol table of Names, where each entry is a (package, ModuleName, OccName) triple. Names in the IfaceSyn point to entries in the symbol table. This reduces the size of interface files, which should hopefully improve performance (not measured yet). The toIfaceXXX functions now do not need to pass around a function from Name -> IfaceExtName, which makes that code simpler. - Names now do not point directly to their parents, and the nameParent operation has gone away. It turned out to be hard to keep this information consistent in practice, and the parent info was only valid in some Names. Instead we made the following changes: * ImportAvails contains a new field imp_parent :: NameEnv AvailInfo which gives the family info for any Name in scope, and is used by the renamer when renaming export lists, amongst other things. This info is thrown away after renaming. * The mi_ver_fn field of ModIface now maps to (OccName,Version) instead of just Version, where the OccName is the parent name. This mapping is used when constructing the usage info for dependent modules. There may be entries in mi_ver_fn for things that are not in scope, whereas imp_parent only deals with in-scope things. * The md_exports field of ModDetails now contains [AvailInfo] rather than NameSet. This gives us family info for the exported names of a module. Also: - ifaceDeclSubBinders moved to IfaceSyn (seems like the right place for it). - heavily refactored renaming of import/export lists. - Unfortunately external core is now broken, as it relied on IfaceSyn. It requires some attention.
-
- 18 Sep, 2006 1 commit
-
-
chak@cse.unsw.edu.au. authored
Tue Sep 12 16:57:32 EDT 2006 Manuel M T Chakravarty <chak@cse.unsw.edu.au> * Type tags in import/export lists - To write something like GMapKey(type GMap, empty, lookup, insert) - Requires -findexed-types
-
- 25 Jul, 2006 2 commits
-
-
Simon Marlow authored
I measured that this makes the comiler allocate a bit more, but it might also make it faster and reduce residency. The extra allocation is probably just because we're not inlining enough somewhere, so I think this change is a step in the right direction.
-
Simon Marlow authored
This patch pushes through one fundamental change: a module is now identified by the pair of its package and module name, whereas previously it was identified by its module name alone. This means that now a program can contain multiple modules with the same name, as long as they belong to different packages. This is a language change - the Haskell report says nothing about packages, but it is now necessary to understand packages in order to understand GHC's module system. For example, a type T from module M in package P is different from a type T from module M in package Q. Previously this wasn't an issue because there could only be a single module M in the program. The "module restriction" on combining packages has therefore been lifted, and a program can contain multiple versions of the same package. Note that none of the proposed syntax changes have yet been implemented, but the architecture is geared towards supporting import declarations qualified by package name, and that is probably the next step. It is now necessary to specify the package name when compiling a package, using the -package-name flag (which has been un-deprecated). Fortunately Cabal still uses -package-name. Certain packages are "wired in". Currently the wired-in packages are: base, haskell98, template-haskell and rts, and are always referred to by these versionless names. Other packages are referred to with full package IDs (eg. "network-1.0"). This is because the compiler needs to refer to entities in the wired-in packages, and we didn't want to bake the version of these packages into the comiler. It's conceivable that someone might want to upgrade the base package independently of GHC. Internal changes: - There are two module-related types: ModuleName just a FastString, the name of a module Module a pair of a PackageId and ModuleName A mapping from ModuleName can be a UniqFM, but a mapping from Module must be a FiniteMap (we provide it as ModuleEnv). - The "HomeModules" type that was passed around the compiler is now gone, replaced in most cases by the current package name which is contained in DynFlags. We can tell whether a Module comes from the current package by comparing its package name against the current package. - While I was here, I changed PrintUnqual to be a little more useful: it now returns the ModuleName that the identifier should be qualified with according to the current scope, rather than its original module. Also, PrintUnqual tells whether to qualify module names with package names (currently unused). Docs to follow.
-
- 07 Apr, 2006 1 commit
-
-
Simon Marlow authored
Most of the other users of the fptools build system have migrated to Cabal, and with the move to darcs we can now flatten the source tree without losing history, so here goes. The main change is that the ghc/ subdir is gone, and most of what it contained is now at the top level. The build system now makes no pretense at being multi-project, it is just the GHC build system. No doubt this will break many things, and there will be a period of instability while we fix the dependencies. A straightforward build should work, but I haven't yet fixed binary/source distributions. Changes to the Building Guide will follow, too.
-
- 25 Jan, 2006 1 commit
-
-
simonpj@microsoft.com authored
This very large commit adds impredicativity to GHC, plus numerous other small things. *** WARNING: I have compiled all the libraries, and *** a stage-2 compiler, and everything seems *** fine. But don't grab this patch if you *** can't tolerate a hiccup if something is *** broken. The big picture is this: a) GHC handles impredicative polymorphism, as described in the "Boxy types: type inference for higher-rank types and impredicativity" paper b) GHC handles GADTs in the new simplified (and very sligtly less epxrssive) way described in the "Simple unification-based type inference for GADTs" paper But there are lots of smaller changes, and since it was pre-Darcs they are not individually recorded. Some things to watch out for: c) The story on lexically-scoped type variables has changed, as per my email. I append the story below for completeness, but I am still not happy with it, and it may change again. In particular, the new story does not allow a pattern-bound scoped type variable to be wobbly, so (\(x::[a]) -> ...) is usually rejected. This is more restrictive than before, and we might loosen up again. d) A consequence of adding impredicativity is that GHC is a bit less gung ho about converting automatically between (ty1 -> forall a. ty2) and (forall a. ty1 -> ty2) In particular, you may need to eta-expand some functions to make typechecking work again. Furthermore, functions are now invariant in their argument types, rather than being contravariant. Again, the main consequence is that you may occasionally need to eta-expand function arguments when using higher-rank polymorphism. Please test, and let me know of any hiccups Scoped type variables in GHC ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ January 2006 0) Terminology. A *pattern binding* is of the form pat = rhs A *function binding* is of the form f pat1 .. patn = rhs A binding of the formm var = rhs is treated as a (degenerate) *function binding*. A *declaration type signature* is a separate type signature for a let-bound or where-bound variable: f :: Int -> Int A *pattern type signature* is a signature in a pattern: \(x::a) -> x f (x::a) = x A *result type signature* is a signature on the result of a function definition: f :: forall a. [a] -> a head (x:xs) :: a = x The form x :: a = rhs is treated as a (degnerate) function binding with a result type signature, not as a pattern binding. 1) The main invariants: A) A lexically-scoped type variable always names a (rigid) type variable (not an arbitrary type). THIS IS A CHANGE. Previously, a scoped type variable named an arbitrary *type*. B) A type signature always describes a rigid type (since its free (scoped) type variables name rigid type variables). This is also a change, a consequence of (A). C) Distinct lexically-scoped type variables name distinct rigid type variables. This choice is open; 2) Scoping 2(a) If a declaration type signature has an explicit forall, those type variables are brought into scope in the right hand side of the corresponding binding (plus, for function bindings, the patterns on the LHS). f :: forall a. a -> [a] f (x::a) = [x :: a, x] Both occurences of 'a' in the second line are bound by the 'forall a' in the first line A declaration type signature *without* an explicit top-level forall is implicitly quantified over all the type variables that are mentioned in the type but not already in scope. GHC's current rule is that this implicit quantification does *not* bring into scope any new scoped type variables. f :: a -> a f x = ...('a' is not in scope here)... This gives compatibility with Haskell 98 2(b) A pattern type signature implicitly brings into scope any type variables mentioned in the type that are not already into scope. These are called *pattern-bound type variables*. g :: a -> a -> [a] g (x::a) (y::a) = [y :: a, x] The pattern type signature (x::a) brings 'a' into scope. The 'a' in the pattern (y::a) is bound, as is the occurrence on the RHS. A pattern type siganture is the only way you can bring existentials into scope. data T where MkT :: forall a. a -> (a->Int) -> T f x = case x of MkT (x::a) f -> f (x::a) 2a) QUESTION class C a where op :: forall b. b->a->a instance C (T p q) where op = <rhs> Clearly p,q are in scope in <rhs>, but is 'b'? Not at the moment. Nor can you add a type signature for op in the instance decl. You'd have to say this: instance C (T p q) where op = let op' :: forall b. ... op' = <rhs> in op' 3) A pattern-bound type variable is allowed only if the pattern's expected type is rigid. Otherwise we don't know exactly *which* skolem the scoped type variable should be bound to, and that means we can't do GADT refinement. This is invariant (A), and it is a big change from the current situation. f (x::a) = x -- NO; pattern type is wobbly g1 :: b -> b g1 (x::b) = x -- YES, because the pattern type is rigid g2 :: b -> b g2 (x::c) = x -- YES, same reason h :: forall b. b -> b h (x::b) = x -- YES, but the inner b is bound k :: forall b. b -> b k (x::c) = x -- NO, it can't be both b and c 3a) You cannot give different names for the same type variable in the same scope (Invariant (C)): f1 :: p -> p -> p -- NO; because 'a' and 'b' would be f1 (x::a) (y::b) = (x::a) -- bound to the same type variable f2 :: p -> p -> p -- OK; 'a' is bound to the type variable f2 (x::a) (y::a) = (x::a) -- over which f2 is quantified -- NB: 'p' is not lexically scoped f3 :: forall p. p -> p -> p -- NO: 'p' is now scoped, and is bound to f3 (x::a) (y::a) = (x::a) -- to the same type varialble as 'a' f4 :: forall p. p -> p -> p -- OK: 'p' is now scoped, and its occurences f4 (x::p) (y::p) = (x::p) -- in the patterns are bound by the forall 3b) You can give a different name to the same type variable in different disjoint scopes, just as you can (if you want) give diferent names to the same value parameter g :: a -> Bool -> Maybe a g (x::p) True = Just x :: Maybe p g (y::q) False = Nothing :: Maybe q 3c) Scoped type variables respect alpha renaming. For example, function f2 from (3a) above could also be written: f2' :: p -> p -> p f2' (x::b) (y::b) = x::b where the scoped type variable is called 'b' instead of 'a'. 4) Result type signatures obey the same rules as pattern types signatures. In particular, they can bind a type variable only if the result type is rigid f x :: a = x -- NO g :: b -> b g x :: b = x -- YES; binds b in rhs 5) A *pattern type signature* in a *pattern binding* cannot bind a scoped type variable (x::a, y) = ... -- Legal only if 'a' is already in scope Reason: in type checking, the "expected type" of the LHS pattern is always wobbly, so we can't bind a rigid type variable. (The exception would be for an existential type variable, but existentials are not allowed in pattern bindings either.) Even this is illegal f :: forall a. a -> a f x = let ((y::b)::a, z) = ... in Here it looks as if 'b' might get a rigid binding; but you can't bind it to the same skolem as a. 6) Explicitly-forall'd type variables in the *declaration type signature(s)* for a *pattern binding* do not scope AT ALL. x :: forall a. a->a -- NO; the forall a does Just (x::a->a) = Just id -- not scope at all y :: forall a. a->a Just y = Just (id :: a->a) -- NO; same reason THIS IS A CHANGE, but one I bet that very few people will notice. Here's why: strange :: forall b. (b->b,b->b) strange = (id,id) x1 :: forall a. a->a y1 :: forall b. b->b (x1,y1) = strange This is legal Haskell 98 (modulo the forall). If both 'a' and 'b' both scoped over the RHS, they'd get unified and so cannot stand for distinct type variables. One could *imagine* allowing this: x2 :: forall a. a->a y2 :: forall a. a->a (x2,y2) = strange using the very same type variable 'a' in both signatures, so that a single 'a' scopes over the RHS. That seems defensible, but odd, because though there are two type signatures, they introduce just *one* scoped type variable, a. 7) Possible extension. We might consider allowing \(x :: [ _ ]) -> <expr> where "_" is a wild card, to mean "x has type list of something", without naming the something.
-
- 06 Jan, 2006 1 commit
-
-
simonmar authored
Add support for UTF-8 source files GHC finally has support for full Unicode in source files. Source files are now assumed to be UTF-8 encoded, and the full range of Unicode characters can be used, with classifications recognised using the implementation from Data.Char. This incedentally means that only the stage2 compiler will recognise Unicode in source files, because I was too lazy to port the unicode classifier code into libcompat. Additionally, the following synonyms for keywords are now recognised: forall symbol (U+2200) forall right arrow (U+2192) -> left arrow (U+2190) <- horizontal ellipsis (U+22EF) .. there are probably more things we could add here. This will break some source files if Latin-1 characters are being used. In most cases this should result in a UTF-8 decoding error. Later on if we want to support more encodings (perhaps with a pragma to specify the encoding), I plan to do it by recoding into UTF-8 before parsing. Internally, there were some pretty big changes: - FastStrings are now stored in UTF-8 - Z-encoding has been moved right to the back end. Previously we used to Z-encode every identifier on the way in for simplicity, and only decode when we needed to show something to the user. Instead, we now keep every string in its UTF-8 encoding, and Z-encode right before printing it out. To avoid Z-encoding the same string multiple times, the Z-encoding is cached inside the FastString the first time it is requested. This speeds up the compiler - I've measured some definite improvement in parsing at least, and I expect compilations overall to be faster too. It also cleans up a lot of cruft from the OccName interface. Z-encoding is nicely hidden inside the Outputable instance for Names & OccNames now. - StringBuffers are UTF-8 too, and are now represented as ForeignPtrs. - I've put together some test cases, not by any means exhaustive, but there are some interesting UTF-8 decoding error cases that aren't obvious. Also, take a look at unicode001.hs for a demo.
-
- 25 Jul, 2005 1 commit
-
-
simonpj authored
Comments
-
- 28 Apr, 2005 1 commit
-
-
simonpj authored
This big commit does several things at once (aeroplane hacking) which change the format of interface files. So you'll need to recompile your libraries! 1. The "stupid theta" of a newtype declaration ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Retain the "stupid theta" in a newtype declaration. For some reason this was being discarded, and putting it back in meant changing TyCon and IfaceSyn slightly. 2. Overlap flags travel with the instance ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Arrange that the ability to support overlap and incoherence is a property of the *instance declaration* rather than the module that imports the instance decl. This allows a library writer to define overlapping instance decls without the library client having to know. The implementation is that in an Instance we store the overlap flag, and preseve that across interface files 3. Nuke the "instnce pool" and "rule pool" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A major tidy-up and simplification of the way that instances and rules are sucked in from interface files. Up till now an instance decl has been held in a "pool" until its "gates" (a set of Names) are in play, when the instance is typechecked and added to the InstEnv in the ExternalPackageState. This is complicated and error-prone; it's easy to suck in too few (and miss an instance) or too many (and thereby be forced to suck in its type constructors, etc). Now, as we load an instance from an interface files, we put it straight in the InstEnv... but the Instance we put in the InstEnv has some Names (the "rough-match" names) that can be used on lookup to say "this Instance can't match". The detailed dfun is only read lazily, and the rough-match thing meansn it is'nt poked on until it has a chance of being needed. This simply continues the successful idea for Ids, whereby they are loaded straightaway into the TypeEnv, but their TyThing is a lazy thunk, not poked on until the thing is looked up. Just the same idea applies to Rules. On the way, I made CoreRule and Instance into full-blown records with lots of info, with the same kind of key status as TyCon or DataCon or Class. And got rid of IdCoreRule altogether. It's all much more solid and uniform, but it meant touching a *lot* of modules. 4. Allow instance decls in hs-boot files ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Allowing instance decls in hs-boot files is jolly useful, becuase in a big mutually-recursive bunch of data types, you want to give the instances with the data type declarations. To achieve this * The hs-boot file makes a provisional name for the dict-fun, something like $fx9. * When checking the "mother module", we check that the instance declarations line up (by type) and generate bindings for the boot dfuns, such as $fx9 = $f2 where $f2 is the dfun generated by the mother module * In doing this I decided that it's cleaner to have DFunIds get their final External Name at birth. To do that they need a stable OccName, so I have an integer-valued dfun-name-supply in the TcM monad. That keeps it simple. This feature is hardly tested yet. 5. Tidy up tidying, and Iface file generation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ main/TidyPgm now has two entry points: simpleTidyPgm is for hi-boot files, when typechecking only (not yet implemented), and potentially when compiling without -O. It ignores the bindings, and generates a nice small TypeEnv. optTidyPgm is the normal case: compiling with -O. It generates a TypeEnv rich in IdInfo MkIface.mkIface now only generates a ModIface. A separate procedure, MkIface.writeIfaceFile, writes the file out to disk.
-
- 08 Mar, 2005 1 commit
-
-
simonmar authored
Fix something that's been bugging me for a while: by default, -ddump-* output doesn't include uniques when it outputs internal names, but in most cases you need them because the output hasn't been tidied, so you end up doing -dppr-debug which is overkill. Now, -ddump-* prints uniques for internal names by default. This shouldn't affect anything else.
-
- 25 Feb, 2005 1 commit
-
-
simonpj authored
--------------------------------------------- Type signatures are no longer instantiated with skolem constants --------------------------------------------- Merge to STABLE Consider p :: a q :: b (p,q,r) = (r,r,p) Here, 'a' and 'b' end up being the same, because they are both bound to the type for 'r', which is just a meta type variable. So 'a' and 'b' can't be skolems. Sigh. This commit goes back to an earlier way of doing things, by arranging that type signatures get instantiated with *meta* type variables; then at the end we must check that they have not been unified with types, nor with each other. This is a real bore. I had to do quite a bit of related fiddling around to make error messages come out right. Improved one or two. Also a small unrelated fix to make :i (:+) print with parens in ghci. Sorry this got mixed up in the same commit.
-
- 26 Nov, 2004 1 commit
-
-
simonmar authored
Further integration with the new package story. GHC now supports pretty much everything in the package proposal. - GHC now works in terms of PackageIds (<pkg>-<version>) rather than just package names. You can still specify package names without versions on the command line, as long as the name is unambiguous. - GHC understands hidden/exposed modules in a package, and will refuse to import a hidden module. Also, the hidden/eposed status of packages is taken into account. - I had to remove the old package syntax from ghc-pkg, backwards compatibility isn't really practical. - All the package.conf.in files have been rewritten in the new syntax, and contain a complete list of modules in the package. I've set all the versions to 1.0 for now - please check your package(s) and fix the version number & other info appropriately. - New options: -hide-package P sets the expose flag on package P to False -ignore-package P unregisters P for this compilation For comparison, -package P sets the expose flag on package P to True, and also causes P to be linked in eagerly. -package-name is no longer officially supported. Unofficially, it's a synonym for -ignore-package, which has more or less the same effect as -package-name used to. Note that a package may be hidden and yet still be linked into the program, by virtue of being a dependency of some other package. To completely remove a package from the compiler's internal database, use -ignore-package. The compiler will complain if any two packages in the transitive closure of exposed packages contain the same module. You *must* use -ignore-package P when compiling modules for package P, if package P (or an older version of P) is already registered. The compiler will helpfully complain if you don't. The fptools build system does this. - Note: the Cabal library won't work yet. It still thinks GHC uses the old package config syntax. Internal changes/cleanups: - The ModuleName type has gone away. Modules are now just (a newtype of) FastStrings, and don't contain any package information. All the package-related knowledge is in DynFlags, which is passed down to where it is needed. - DynFlags manipulation has been cleaned up somewhat: there are no global variables holding DynFlags any more, instead the DynFlags are passed around properly. - There are a few less global variables in GHC. Lots more are scheduled for removal. - -i is now a dynamic flag, as are all the package-related flags (but using them in {-# OPTIONS #-} is Officially Not Recommended). - make -j now appears to work under fptools/libraries/. Probably wouldn't take much to get it working for a whole build.
-
- 30 Sep, 2004 1 commit
-
-
simonpj authored
------------------------------------ Add Generalised Algebraic Data Types ------------------------------------ This rather big commit adds support for GADTs. For example, data Term a where Lit :: Int -> Term Int App :: Term (a->b) -> Term a -> Term b If :: Term Bool -> Term a -> Term a ..etc.. eval :: Term a -> a eval (Lit i) = i eval (App a b) = eval a (eval b) eval (If p q r) | eval p = eval q | otherwise = eval r Lots and lots of of related changes throughout the compiler to make this fit nicely. One important change, only loosely related to GADTs, is that skolem constants in the typechecker are genuinely immutable and constant, so we often get better error messages from the type checker. See TcType.TcTyVarDetails. There's a new module types/Unify.lhs, which has purely-functional unification and matching for Type. This is used both in the typechecker (for type refinement of GADTs) and in Core Lint (also for type refinement).
-
- 31 Aug, 2004 1 commit
-
-
simonmar authored
Try to fix up the previous commit. I'm not sure if this is quite right, but it should be closer.
-
- 27 Aug, 2004 1 commit
-
-
simonpj authored
Oops: always qualify when printing code
-
- 26 Aug, 2004 1 commit
-
-
simonpj authored
------------------------------- Print built-in sytax right ------------------------------- Built-in syntax, like (:) and [], is not "in scope" via the GlobalRdrEnv in the usual way. When we print it out, we should also print it in unqualified form, even though it's not in the environment. I've finally bitten the (not very big) bullet, and added to Name the information about whether or not a name is one of these built-in ones. That entailed changing the calls to mkWiredInName, but those are exactly the places where you have to decide whether it's built-in or not, which is fine. Built-in syntax => It's a syntactic form, not "in scope" (e.g. []) Wired-in thing => The thing (Id, TyCon) is fully known to the compiler, not read from an interface file. E.g. Bool, True, Int, Float, and many others All built-in syntax is for wired-in things.
-
- 16 Aug, 2004 1 commit
-
-
simonpj authored
------------------------------- Add instance information to :i Get rid of the DeclPool ------------------------------- 1. Add instance information to :info command. GHCi now prints out which instances a type or class belongs to, when you use :i 2. Tidy up printing of unqualified names in user output. Previously Outputable.PrintUnqualified was type PrintUnqualified = Name -> Bool but it's now type PrintUnqualified = ModuleName -> OccName -> Bool This turns out to be tidier even for Names, and it's now also usable when printing IfaceSyn stuff in GHCi, eliminating a grevious hack. 3. On the way to doing this, Simon M had the great idea that we could get rid of the DeclPool holding pen, which held declarations read from interface files but not yet type-checked. We do this by eagerly populating the TypeEnv with thunks what, when poked, do the type checking. This is just a logical continuation of lazy import mechanism we've now had for some while. The InstPool and RulePool still exist, but I plan to get rid of them in the same way. The new scheme does mean that more rules get sucked in than before, because previously the TypeEnv was used to mean "this thing was needed" and hence to control which rules were sucked in. But now the TypeEnv is populated more eagerly => more rules get sucked in. However this problem will go away when I get rid of the Inst and Rule pools. I should have kept these changes separate, but I didn't. Change (1) affects mainly TcRnDriver, HscMain, CompMan, InteractiveUI whereas change (3) is more wide ranging.
-