Commit 4b172698 authored 24 years ago by Marcin 'Qrczak' Kowalczyk

[project @ 2000-08-07 23:37:19 by qrczak]

Now Char, Char#, StgChar have 31 bits (physically 32).
"foo"# is still an array of bytes.

CharRep represents 32 bits (on a 64-bit arch too). There is also
Int8Rep, used in those places where bytes were originally meant.
readCharArray, indexCharOffAddr etc. still use bytes. Storable and
{I,M}Array use wide Chars.

In the future, perhaps all sized integers should be primitive types. Then
some usages of indexing primops scattered through the code could
be changed to then-available Int8 ones, and then Char variants of
primops could be made wide (other usages that handle text should use
conversion that will be provided later).

I/O and _ccall_ arguments assume ISO-8859-1. UTF-8 is internally used
for string literals (only).
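
As an illustration of the two encodings mentioned above, here is a minimal
sketch, not the code from this commit. It assumes the original 1-to-6-byte
UTF-8 layout, which is what makes every 31-bit value representable. The
names encodeCodePoint, encodeString and toLatin1 are hypothetical.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Encode one code point (0 .. 0x7FFFFFFF) as UTF-8 bytes.
-- Assumption: the original 1-to-6-byte scheme, so all 31-bit values fit.
encodeCodePoint :: Int -> [Word8]
encodeCodePoint c
  | c < 0           = error "encodeCodePoint: negative code point"
  | c < 0x80        = [fromIntegral c]
  | c < 0x800       = multi 0xC0 1
  | c < 0x10000     = multi 0xE0 2
  | c < 0x200000    = multi 0xF0 3
  | c < 0x4000000   = multi 0xF8 4
  | c <= 0x7FFFFFFF = multi 0xFC 5
  | otherwise       = error "encodeCodePoint: more than 31 bits"
  where
    -- The lead byte carries the tag and the top bits; each of the n
    -- continuation bytes carries the next 6 bits, most significant first.
    multi tag n = fromIntegral (tag .|. (c `shiftR` (6 * n)))
                : [ cont k | k <- [n - 1, n - 2 .. 0] ]
    cont k = fromIntegral (0x80 .|. ((c `shiftR` (6 * k)) .&. 0x3F))

-- | Encode a whole string, as a string literal would be stored.
encodeString :: String -> [Word8]
encodeString = concatMap (encodeCodePoint . ord)

-- | ISO-8859-1 view used at an I/O boundary: only code points below 256
-- have a byte representation; anything wider is rejected here.
toLatin1 :: Char -> Maybe Word8
toLatin1 ch
  | ord ch < 256 = Just (fromIntegral (ord ch))
  | otherwise    = Nothing
```

For example, encodeString "\233" yields [195,169], while toLatin1 '\233'
yields Just 233 and toLatin1 '\955' yields Nothing.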

Z-encoding is ready for Unicode identifiers.
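
The exact escape scheme is not spelled out in this message. The sketch below
assumes the form described in later GHC documentation, where a character
without a dedicated code is written as 'z', its code point in hexadecimal,
and a closing 'U', with a leading '0' added when the hex digits would
otherwise start with a letter. It is simplified: the dedicated codes for
common ASCII symbols are omitted, and zEncodeChar and zEncode are
hypothetical names.

```haskell
import Data.Char (isAlphaNum, isAscii, isDigit, ord)
import Numeric (showHex)

-- | Z-encode a single character (simplified sketch).
zEncodeChar :: Char -> String
zEncodeChar 'z' = "zz"            -- the escape characters are doubled
zEncodeChar 'Z' = "ZZ"
zEncodeChar c
  | isAscii c && isAlphaNum c = [c]          -- plain ASCII passes through
  | otherwise                 = 'z' : padded -- generic Unicode escape
  where
    hex = showHex (ord c) "U"
    -- Assumption: the escape must begin with a digit so that it cannot
    -- be mistaken for a letter-coded escape.
    padded = case hex of
      (d : _) | isDigit d -> hex
      _                   -> '0' : hex

-- | Z-encode a whole identifier.
zEncode :: String -> String
zEncode = concatMap zEncodeChar
```

For example, zEncode "\955z" yields "z3bbUzz" and zEncode "\233" yields
"z0e9U".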

Ranges of intlike and charlike closures are more easily configurable.

I've probably broken nativeGen/MachCode.lhs:chrCode for Alpha but I
don't know the Alpha assembler to fix it (what is zapnot?). Generally
I'm not sure if I've done the NCG changes right.

This commit breaks binary compatibility (of course).

TODO:
* is* and to{Lower,Upper} in Char (in progress).
* Libraries for text conversion (in design / experiments),
  to be plugged into I/O and a higher-level foreign library.
* PackedString.
* StringBuffer and accepting source in encodings other than ISO-8859-1.

parent 514da0a6

Showing with 132 additions and 161 deletions
