| ... | ... | @@ -116,19 +116,25 @@ A disadvantage of using encodings is that some byte-strings may not be legal enc |
|
|
|
- Introduce a pragma `{-# ENCODING e #-}` with a range of possible
|
|
|
|
values of the encoding `e`. If the pragma is present, it must be
|
|
|
|
at the beginning of the file. If it is not present, the file is
|
|
|
|
encoded in Latin-1. Note that even if the pragma is present, some
|
|
|
|
encoded in ASCII. Note that even if the pragma is present, some
|
|
|
|
heuristic may be needed even to get as far as interpreting the
|
|
|
|
encoding declaration, as in
|
|
|
|
[XML](http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-guessing).
|
|
|
|
The fact that the first three characters must be `{-#` will be
|
|
|
|
useful here.
|
|
|
|
- A literal string may contain any literal character representable in the
|
|
|
|
source encoding. In addition, escapes are provided to permit the specification of
|
|
|
|
*any* Unicode character (which may or may not be otherwise
|
|
|
|
representable in the source encoding).
|
|
|
|
- An identifier may contain any Unicode alphanumeric or symbol
|
|
|
|
characters from a defined range. Thus, a source text may not be
|
|
|
|
representable in certain other encodings (especially in ASCII).
|
|
|
|
useful here. Haskell compilers must support at least the encodings
|
|
|
|
|
|
|
|
|
|
|
|
ASCII, LATIN1, and UTF8.
|
|
|
|
|
|
|
|
|
|
|
|
- A literal string may contain any literal character representable in the
|
|
|
|
source encoding. In addition, escapes are provided to permit the specification of
|
|
|
|
*any* Unicode character (which may or may not be otherwise
|
|
|
|
representable in the source encoding).
|
|
|
|
- An identifier may contain any Unicode alphanumeric or symbol
|
|
|
|
characters from a defined range. Thus, a source text may not be
|
|
|
|
representable in certain other encodings (especially in ASCII).
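
The pragma rule above can be illustrated with a minimal sketch. It assumes the file is in an ASCII-compatible encoding (true of ASCII, LATIN1, and UTF8), so the leading octets can be read directly as characters; other encodings would need an XML-style guessing step first. The names `Encoding`, `detectEncoding`, and `pragmaBody` are hypothetical and not part of the proposal, and treating an unsupported encoding name as an error (`Nothing`) is an assumption of this sketch.

```haskell
module Main where

import Data.Char (chr, toUpper)
import Data.List (stripPrefix)
import Data.Word (Word8)

-- | The three encodings every compiler must support under the proposal.
data Encoding = ASCII | LATIN1 | UTF8
  deriving (Eq, Show)

-- | Hypothetical helper: inspect the leading octets of a source file.
-- 'Just e' is the encoding to use; 'Nothing' means an ENCODING pragma
-- is present but malformed or names an encoding this sketch does not know.
detectEncoding :: [Word8] -> Maybe Encoding
detectEncoding octets =
  case stripPrefix "{-#" prefix of
    -- No pragma at the very beginning of the file: default to ASCII.
    Nothing -> Just ASCII
    Just rest ->
      case words (pragmaBody rest) of
        ("ENCODING" : args) ->
          case args of
            [name] -> lookup (map toUpper name) known
            _      -> Nothing
        -- Some other pragma comes first: the default still applies.
        _ -> Just ASCII
  where
    -- Only a bounded prefix is examined; 256 octets is an arbitrary choice.
    prefix = map (chr . fromIntegral) (take 256 octets)
    known  = [("ASCII", ASCII), ("LATIN1", LATIN1), ("UTF8", UTF8)]

-- | Text of the pragma up to (not including) the closing "#-}".
pragmaBody :: String -> String
pragmaBody ('#' : '-' : '}' : _) = ""
pragmaBody (c : cs)              = c : pragmaBody cs
pragmaBody []                    = ""

-- | Convenience for experiments: view an ASCII string as octets.
toOctets :: String -> [Word8]
toOctets = map (fromIntegral . fromEnum)

main :: IO ()
main = do
  print (detectEncoding (toOctets "{-# ENCODING UTF8 #-}\nmodule M where"))
  print (detectEncoding (toOctets "module M where"))
```

Because the check only looks for the literal prefix `{-#`, a file that starts with anything else (including whitespace) falls back to the default, which matches the requirement that the pragma sit at the beginning of the file.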
|
|
|
|
|
|
|
|
- **I/O.**
|
|
|
|
All raw I/O is in terms of octets, i.e. `Word8`.
|
|
|
|
- **Conversions.**
|
| ... | ... | |