Closed
Created Apr 11, 2022 by Joachim Breitner@nomeataDeveloper

Use unicode-data (or its implementation) for Data.Char

The unicode-data library reimplements most of Data.Char in a better way. By better, I mean

  • reportedly up to 5× faster
  • only pure Haskell code, no FFI

Both seem obviously desirable, even for base. Faster is better, and I assume pure Haskell makes things like targeting JS or WebAssembly easier. Also, it’s more reputable for us :-)

As with our current implementation, there is a tool that generates some of the code based on the Unicode standard.
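To make the "generated, pure Haskell" idea concrete, here is a minimal sketch of what such generated code could look like. It is not taken from unicode-data and does not claim to match its internal representation; the names `whiteSpaceRanges` and `isWhiteSpace` are hypothetical, and the table (the code points with the Unicode White_Space property) is written out by hand here to stand in for generator output.

```haskell
module Main (main) where

import Data.Char (ord)

-- Hypothetical generator output: inclusive code point ranges for the
-- Unicode White_Space property, sorted in ascending order.
whiteSpaceRanges :: [(Int, Int)]
whiteSpaceRanges =
  [ (0x0009, 0x000D), (0x0020, 0x0020), (0x0085, 0x0085)
  , (0x00A0, 0x00A0), (0x1680, 0x1680), (0x2000, 0x200A)
  , (0x2028, 0x2029), (0x202F, 0x202F), (0x205F, 0x205F)
  , (0x3000, 0x3000)
  ]

-- Pure lookup with no FFI call: scan the sorted ranges and stop as soon
-- as the code point falls before the start of the next range.
isWhiteSpace :: Char -> Bool
isWhiteSpace c = go whiteSpaceRanges
  where
    cp = ord c
    go [] = False
    go ((lo, hi) : rest)
      | cp < lo   = False
      | cp <= hi  = True
      | otherwise = go rest

main :: IO ()
main = print (map isWhiteSpace "a \t\x3000")
```

A list of ranges is only the simplest thing that works. A real generator would presumably emit a more compact and faster structure, but the point stands that the lookup needs nothing beyond ordinary Haskell. Note also that this property is not the same predicate as `Data.Char.isSpace`.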

It would be great to benefit from Adithya Kumar's work here, and use his implementation in base.

Unfortunately, it does not seem to be easily possible for base to simply depend on unicode-data, as that package itself uses too much stuff that’s in base (and not in, say, ghc-prim).

So the question is: Should we

  • copy that code (including generator) into base, use it to provide Data.Char, and maintain it in parallel with the stand-alone upstream library, or
  • copy that code (including generator) into base, use it to provide Data.Char and the more specialized modules provided by unicode-data that are of interest to users with more precise Unicode needs (e.g. Unicode.Char.General), so that that package (if that’s in the interest of its maintainers) can be deprecated and everyone will find it all in base?

(This discourse thread may be relevant.)

Edited Apr 11, 2022 by Ben Gamari