    [project @ 2000-08-07 23:37:19 by qrczak] · 4b172698
    Marcin 'Qrczak' Kowalczyk authored
    Char, Char#, and StgChar now hold 31-bit values (stored in 32 bits).
    "foo"# is still an array of bytes.
    
    CharRep represents 32 bits (on a 64-bit arch too). There is also
    Int8Rep, used in those places where bytes were originally meant.
    readCharArray, indexCharOffAddr etc. still use bytes. Storable and
    {I,M}Array use wide Chars.
    
    In the future, perhaps all sized integers should be primitive types.
    Then some uses of the indexing primops scattered through the code
    could be changed to the then-available Int8 ones, and the Char variants
    of the primops could be made wide (other uses that handle text should
    use a conversion that will be provided later).
    
    I/O and _ccall_ arguments assume ISO-8859-1. UTF-8 is used internally,
    for string literals only.
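
To illustrate the difference between the two encodings mentioned above, here is a minimal sketch. The names latin1Encode and utf8Encode are hypothetical and do not come from this commit. The sketch implements standard UTF-8 (1 to 4 bytes); whether the compiler's internal literal encoding matches it byte for byte is an assumption.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper: ISO-8859-1 maps code points 0..255 to the byte
-- of the same value; anything above that range is not representable.
latin1Encode :: Char -> Maybe Word8
latin1Encode c
  | n <= 0xFF = Just (fromIntegral n)
  | otherwise = Nothing
  where
    n = ord c

-- Hypothetical helper: standard UTF-8 encoding of a single character.
utf8Encode :: Char -> [Word8]
utf8Encode c
  | n < 0x80    = [byte n]
  | n < 0x800   = [byte (0xC0 .|. (n `shiftR` 6)), cont n]
  | n < 0x10000 = [byte (0xE0 .|. (n `shiftR` 12)), cont (n `shiftR` 6), cont n]
  | otherwise   = [ byte (0xF0 .|. (n `shiftR` 18))
                  , cont (n `shiftR` 12)
                  , cont (n `shiftR` 6)
                  , cont n ]
  where
    n = ord c

    byte :: Int -> Word8
    byte = fromIntegral

    -- Continuation byte: 10xxxxxx carrying the low six bits of x.
    cont :: Int -> Word8
    cont x = byte (0x80 .|. (x .&. 0x3F))
```

A character such as U+00E9 is a single byte under ISO-8859-1 but two bytes under UTF-8, which is why a wide Char that crosses an I/O or _ccall_ boundary needs an explicit conversion step.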
    
    Z-encoding is ready for Unicode identifiers.
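
As a rough illustration of how a Z-encoding can cover Unicode identifiers, the sketch below escapes any character that is not an ASCII letter or digit as 'z', the code point in hex, and a terminating 'U'. It is modeled on the scheme found in later GHC sources; that this commit uses exactly this form is an assumption. The names zEncode and zEncodeChar are hypothetical, and the mnemonic codes that the real encoding gives to common ASCII symbols are omitted here.

```haskell
import Data.Char (isAlphaNum, isAscii, isDigit, ord)
import Numeric (showHex)

-- Hypothetical, simplified Z-encoder for a whole identifier.
zEncode :: String -> String
zEncode = concatMap zEncodeChar

zEncodeChar :: Char -> String
zEncodeChar 'z' = "zz"   -- 'z' is the escape character, so it is doubled
zEncodeChar 'Z' = "ZZ"   -- likewise for 'Z'
zEncodeChar c
  | isAscii c && isAlphaNum c = [c]
  | otherwise = 'z' : pad (showHex (ord c) "U")
  where
    -- The hex part must start with a digit so that a numeric escape
    -- cannot be confused with a letter-based escape such as "zz".
    pad hex@(d:_) | isDigit d = hex
    pad hex = '0' : hex
```

With this sketch, U+03BB encodes as z3bbU and U+00E9 as z0e9U, so the result contains only characters that an assembler or linker accepts in symbol names.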
    
    Ranges of intlike and charlike closures are more easily configurable.
    
    I've probably broken nativeGen/MachCode.lhs:chrCode for Alpha but I
    don't know the Alpha assembler to fix it (what is zapnot?). Generally
    I'm not sure if I've done the NCG changes right.
    
    This commit breaks binary compatibility (of course).
    
    TODO:
    * is* and to{Lower,Upper} in Char (in progress).
    * Libraries for text conversion (in design / experiments),
      to be plugged into I/O and a higher-level foreign library.
    * PackedString.
    * StringBuffer and accepting source in encodings other than ISO-8859-1.