1. 22 Sep, 2014 1 commit
  2. 28 Sep, 2013 1 commit
  3. 19 May, 2013 1 commit
  4. 08 May, 2013 1 commit
    • Support for Windows DBCS and new SBCS with MultiByteToWideChar · 2216b897
      batterseapower authored
      Because MultiByteToWideChar/WideCharToMultiByte have a rather unhelpful
      interface, we have to use a lot of binary searching tricks to get them
      to match the iconv-like interface that GHC requires.
      
      Even though the resulting encodings are slow, it does at least mean that we
      now support all of Windows' code pages. What's more, since these codecs are
      basically only used for console output there probably won't be a huge volume
      of text to deal with in the common case, so speed is less of a worry.
      
      Note that we will still use GHC's faster table-based custom codec for supported
      SBCSs.
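The commit message above mentions binary searching to adapt MultiByteToWideChar to an iconv-like interface. One way such a search could look is sketched below in pure Haskell. It is not the code from the commit: `unitsNeeded` is a hypothetical stand-in for a converter that only reports how much output a whole chunk needs, and the toy double-byte rule it uses is an assumption made for illustration.

```haskell
import Data.Word (Word8)

-- Hypothetical stand-in for an all-or-nothing conversion API: it only
-- reports how many output units a whole input chunk would need.
-- Toy DBCS rule (an assumption): a byte >= 0x80 is a lead byte that
-- combines with the following byte into one output unit; every other
-- byte produces one unit on its own.
unitsNeeded :: [Word8] -> Int
unitsNeeded [] = 0
unitsNeeded (b : rest)
  | b >= 0x80 = 1 + unitsNeeded (drop 1 rest)
  | otherwise = 1 + unitsNeeded rest

-- Largest prefix length n such that the first n input bytes fit into
-- `capacity` output units. Relies on unitsNeeded never decreasing as
-- the prefix grows, and on capacity being non-negative.
largestFittingPrefix :: Int -> [Word8] -> Int
largestFittingPrefix capacity input = go 0 (length input)
  where
    -- Invariant: the prefix of length lo fits; no length above hi fits.
    go lo hi
      | lo >= hi = lo
      | unitsNeeded (take mid input) <= capacity = go mid hi
      | otherwise = go lo (mid - 1)
      where
        mid = (lo + hi + 1) `div` 2
```

An iconv-style wrapper would convert exactly that prefix and report the remaining bytes as unconsumed input, at the cost of roughly log2(n) trial conversions per call.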
  5. 16 May, 2012 1 commit
  6. 26 Oct, 2011 1 commit
  7. 18 Jun, 2011 1 commit
  8. 14 May, 2011 1 commit
    • Big patch to improve Unicode support in GHC · dc58b739
      batterseapower authored
      Validated on OS X and Windows, this patch series fixes #5061, #1414, #3309,
      #3308, #3307, #4006 and #4855.
      
      The major changes are:
      
       1) Make Foreign.C.String.*CString use the locale encoding
      
          This change follows the FFI specification in Haskell 98, which
          has never actually been implemented before.
      
          The functions exported from Foreign.C.String are partially-applied
          versions of those from GHC.Foreign, which allows the user to supply
          their own TextEncoding.
      
          We also introduce foreignEncoding as the name of the text encoding
          that follows the FFI appendix in that it transliterates encoding
          errors.
      
       2) Make mkTextEncoding always try the native-Haskell decoders in
          preference to those from iconv, even on non-Windows

          This is better for compatibility, and those are the decoders you
          get for the utf* and latin1* predefined TextEncodings anyway.
      
       3) Implement surrogate-byte error handling mode for TextEncoding
      
          This implements PEP383-like behaviour so that we are able to
          roundtrip byte strings through Strings without loss of information.
      
          The withFilePath function now uses this encoding to get to/from CStrings,
          so any code that uses that will get the right PEP383 behaviour automatically.
      
       4) Implement three other coding failure modes: ignore, throw error, transliterate
      
          These mimic the behaviour of the GNU Iconv extensions.
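As a rough illustration of the surrogate-escape mode and the encoding-parameterised functions described above, the sketch below decodes a byte string containing an invalid UTF-8 byte and encodes it back. It uses `mkTextEncoding` with the `//ROUNDTRIP` suffix and the `GHC.Foreign` functions that take an explicit TextEncoding; the helper name `roundTripBytes` is made up for this example.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray, withArray0)
import Foreign.Ptr (Ptr, castPtr)
import qualified GHC.Foreign as GF
import System.IO (mkTextEncoding)

-- Decode a byte string with "UTF-8//ROUNDTRIP", then encode the
-- resulting String back to bytes with the same encoding.
-- The input must not contain a 0 byte, since it is passed as a
-- NUL-terminated C string.
roundTripBytes :: [Word8] -> IO (String, [Word8])
roundTripBytes bytes = do
  enc <- mkTextEncoding "UTF-8//ROUNDTRIP"
  -- Undecodable bytes become escape characters instead of errors.
  decoded <- withArray0 0 bytes $ \p ->
    GF.peekCString enc (castPtr p)
  -- Encoding turns those escape characters back into the original bytes.
  reencoded <- GF.withCStringLen enc decoded $ \(p, n) ->
    peekArray n (castPtr p :: Ptr Word8)
  return (decoded, reencoded)
```

For example, `roundTripBytes [0x61, 0xFF, 0x62]` should return the original three bytes as its second component, even though 0xFF is never valid in UTF-8. Swapping the suffix for `//IGNORE` or `//TRANSLIT` selects the other failure modes named in item 4.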
  9. 31 Jan, 2011 2 commits
  10. 28 Jan, 2011 1 commit
  11. 14 Sep, 2010 1 commit
  12. 13 Sep, 2010 1 commit
  13. 13 Sep, 2009 1 commit
    • On Windows, use the console code page for text file encoding/decoding. · b63b596e
      judah authored
      We keep all of the code page tables in the module
      GHC.IO.Encoding.CodePage.Table.  That file was generated automatically
      by running codepages/MakeTable.hs; more details are in the comments at the
      start of that script.
      
      Storing the lookup tables adds about 40KB to each statically linked executable;
      this only increases the size of a "hello world" program by about 7%.
      
      Currently we do not support double-byte encodings (Chinese/Japanese/Korean), since
      including those codepages would increase the table size to 400KB.  It will be
      straightforward to implement them once the work on library DLLs is finished.
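To give a feel for what a table-driven single-byte codec does, here is a minimal sketch. It is not the generated GHC.IO.Encoding.CodePage.Table module: the one-entry table and the names `highBytes`, `decodeByte` and `decodeBytes` are illustrative assumptions, and mapping unlisted bytes to U+FFFD is a choice made for this example only.

```haskell
import Data.Char (chr)
import Data.Maybe (fromMaybe)
import Data.Word (Word8)

-- Illustrative excerpt only: in Windows-1252 the byte 0x80 is U+20AC.
-- A real table would list every byte of the code page.
highBytes :: [(Word8, Char)]
highBytes = [(0x80, '\x20AC')]

-- Bytes below 0x80 are treated as ASCII; others are looked up in the table.
decodeByte :: Word8 -> Maybe Char
decodeByte b
  | b < 0x80 = Just (chr (fromIntegral b))
  | otherwise = lookup b highBytes

-- Decode a whole byte string, substituting U+FFFD for unmapped bytes.
decodeBytes :: [Word8] -> String
decodeBytes = map (fromMaybe '\xFFFD' . decodeByte)
```

A double-byte code page needs a far larger table, because a lead byte selects a further table of trail bytes; this is the size concern the commit message raises.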