Projects that should be within reach of a good undergraduate
Implement overlap and exhaustiveness checking for pattern matching. GHC's current overlap and exhaustiveness checker is old and inadequate. Furthermore, it takes no account of GADTs and type families. See #595 (closed) and #2395 (closed). There's an excellent selection of background material:
Improve parallel profiling tools. Starting with ThreadScope, incorporate performance-counter events, visualise more runtime events, and include source-code information in the profile.
Implement some low-level C-- optimisations. During 2011 we expect to have the new C-- code generation route in place, and that will open up new opportunities for doing classic compiler-course optimisations on the imperative C-- code. There is more than routine stuff here, because we can use our generic dataflow framework to do the heavy lifting. Here are some particular ideas for optimisations we'd like to implement.
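To make the GADT point in the pattern-matching project above concrete, here is a minimal sketch. The `Expr` type and the `evalInt`/`evalBool` functions are hypothetical names invented for this illustration, not code from GHC. The match in `evalInt` is exhaustive once the type index is taken into account, but a checker that ignores GADT type refinement would report `BoolE` as a missing case.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- A small typed expression language (hypothetical, for illustration only).
-- Each constructor fixes the type index of the expression it builds.
data Expr a where
  IntE  :: Int  -> Expr Int
  BoolE :: Bool -> Expr Bool
  Add   :: Expr Int -> Expr Int -> Expr Int
  If    :: Expr Bool -> Expr a -> Expr a -> Expr a

-- Only IntE, Add and If can build an Expr Int, so these three equations
-- cover every well-typed argument. A checker that takes no account of the
-- type index would also demand a BoolE equation here.
evalInt :: Expr Int -> Int
evalInt (IntE n)   = n
evalInt (Add x y)  = evalInt x + evalInt y
evalInt (If c t e) = if evalBool c then evalInt t else evalInt e

-- Likewise, only BoolE and If can build an Expr Bool.
evalBool :: Expr Bool -> Bool
evalBool (BoolE b)  = b
evalBool (If c t e) = if evalBool c then evalBool t else evalBool e

main :: IO ()
main = print (evalInt (If (BoolE True) (Add (IntE 1) (IntE 2)) (IntE 0)))
```

Adding an equation `evalInt (BoolE _) = ...` would be rejected by the type checker as inaccessible, so a good exhaustiveness checker has to reason about the same type equalities that the type checker does.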
More ambitious or less-well-defined projects (PhD students / Interns)
Projects aimed at making GHC into a user-extensible plug-in platform, and less of a monolithic compiler.
Allow much finer and more modular control over the way in which rewrite rules and inlining directives are ordered. See this email thread
Allow unboxed tuples as function arguments. Currently unboxed tuples are second class; fixing this would be a nice simplification.
Extend kinds beyond * and k1->k2. With GADTs etc. we clearly want to have kinds like Nat, so that advanced hackery at the type level can be done in a typed language; currently it's all effectively untyped. A neat approach would be to re-use any data type declaration as a kind declaration.
Extensible constraint domains. Andrew Kennedy shows how to incorporate dimensional analysis into an ML-like type system. Maybe we could do an extensible version of this, so that it wasn't restricted to dimensions. Integer arithmetic is another obvious domain.
Incremental or concurrent GC, for reducing pause-times. Perhaps via implementing mark-region GC in the old generation.
A package API tool.
Fixed package ABIs for binary-upgradable packages.
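As an illustration of the "kinds like Nat" idea in the list above, the following sketch shows what re-using a data type declaration as a kind declaration could look like. It assumes a GHC that provides the `DataKinds` extension, which takes roughly this approach; `Vec`, `vhead` and `vtoList` are hypothetical names chosen for the example.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
module Main where

-- An ordinary data declaration. With DataKinds (assumed available) it is
-- also usable as a kind, with 'Z and 'S as type-level constructors.
data Nat = Z | S Nat

-- A list indexed by its length. The index is required to have kind Nat,
-- so an ill-kinded type such as Vec Bool Int is rejected.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- A total head function: the type rules out the empty vector.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

-- Forget the length index and return a plain list.
vtoList :: Vec n a -> [a]
vtoList VNil         = []
vtoList (VCons x xs) = x : vtoList xs

main :: IO ()
main = do
  let v = VCons 1 (VCons 2 VNil) :: Vec ('S ('S 'Z)) Int
  print (vhead v)
  print (vtoList v)
```

Without a kind such as `Nat`, the index of `Vec` would have to range over all types of kind `*`, which is the "effectively untyped" situation the project description refers to.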
Projects for people who want a decent-sized hacking project, with less research content.