Glasgow Haskell Compiler / Packages / utf8-string
Commit d444bbbd
authored 16 years ago by Sigbjorn Finne
added isUTF8Encoded predicate + utf8Encode to avoid repeated encodings
parent 6a8467a8
Changes: 1 changed file
Codec/Binary/UTF8/String.hs (1 addition, 12 deletions)
@@ -24,13 +24,9 @@ import Data.Char (chr,ord)
default(Int)

-- | Encode a string using 'encode' and store the result in a 'String'.
encodeString :: String -> String
encodeString xs = map (toEnum . fromEnum) (encode xs)

-- | Decode a string using 'decode' using a 'String' as input.
-- | This is not safe but it is necessary if UTF-8 encoded text
-- | has been loaded into a 'String' prior to being decoded.
decodeString :: String -> String
decodeString xs = decode (map (toEnum . fromEnum) xs)
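The `map (toEnum . fromEnum)` idiom in this hunk carries each UTF-8 byte in its own `Char`, so the "encoded" result is still a `String`. The sketch below illustrates that packing under stated assumptions: `encodeSketch` is a hypothetical stand-in for the library's `encode` (whose body is not part of this diff), and `encodeStringSketch` and `unpackBytes` are illustrative names, not library functions.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical stand-in for the library's 'encode' (assumption: standard
-- UTF-8 for code points up to U+10FFFF, no surrogate handling).
encodeSketch :: String -> [Word8]
encodeSketch = concatMap (map fromIntegral . go . ord)
  where
    go :: Int -> [Int]
    go n
      | n < 0x80    = [n]
      | n < 0x800   = [ 0xc0 .|. (n `shiftR` 6)
                      , 0x80 .|. (n .&. 0x3f) ]
      | n < 0x10000 = [ 0xe0 .|. (n `shiftR` 12)
                      , 0x80 .|. ((n `shiftR` 6) .&. 0x3f)
                      , 0x80 .|. (n .&. 0x3f) ]
      | otherwise   = [ 0xf0 .|. (n `shiftR` 18)
                      , 0x80 .|. ((n `shiftR` 12) .&. 0x3f)
                      , 0x80 .|. ((n `shiftR` 6) .&. 0x3f)
                      , 0x80 .|. (n .&. 0x3f) ]

-- Same shape as 'encodeString' in the hunk: every byte becomes one Char
-- in the range '\0'..'\255'.
encodeStringSketch :: String -> String
encodeStringSketch xs = map (toEnum . fromEnum) (encodeSketch xs)

-- The reverse packing used by 'decodeString': Chars back to bytes.
-- 'toEnum' at type Word8 raises an error for any Char above '\255',
-- which is one plausible reason the source comment calls this "not safe".
unpackBytes :: String -> [Word8]
unpackBytes = map (toEnum . fromEnum)
```

For example, `encodeStringSketch "\233"` (U+00E9) gives the two-`Char` string `"\195\169"`, and `unpackBytes` turns that back into the bytes `[195,169]`.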
...
...
@@ -66,20 +62,13 @@ decode [ ] = ""
 decode (c:cs)
   | c < 0x80  = chr (fromEnum c) : decode cs
   | c < 0xc0  = replacement_character : decode cs
-  | c < 0xe0  = multi1
+  | c < 0xe0  = multi_byte 1 0x1f 0x80
   | c < 0xf0  = multi_byte 2 0xf  0x800
   | c < 0xf8  = multi_byte 3 0x7  0x10000
   | c < 0xfc  = multi_byte 4 0x3  0x200000
   | c < 0xfe  = multi_byte 5 0x1  0x4000000
   | otherwise = replacement_character : decode cs
   where
-    multi1 = case cs of
-      c1 : ds | c1 .&. 0xc0 == 0x80 ->
-        let d = ((fromEnum c .&. 0x1f) `shiftL` 6) .|.  fromEnum (c1 .&. 0x3f)
-        in if d >= 0x000080 then toEnum d : decode ds
-                            else replacement_character : decode ds
-      _ -> replacement_character : decode cs
-
     multi_byte :: Int -> Word8 -> Int -> [Char]
     multi_byte i mask overlong = aux i cs (fromEnum (c .&. mask))
       where
...
...
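One reading of this hunk is that the dedicated two-byte path `multi1` is folded into the general `multi_byte` helper, called with one continuation byte, mask `0x1f`, and overlong threshold `0x80`. The sketch below shows how such a unified helper could behave. The body of `aux` is collapsed in the diff, so the `aux` here is a hypothetical reconstruction, and `decodeSketch`, `multiByte`, and `replacementChar` are illustrative names. As a further assumption, the sketch rejects lead bytes from `0xf8` upward outright instead of parsing the 5- and 6-byte forms.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

replacementChar :: Char
replacementChar = '\xfffd'

-- Decoder sketch in which two-byte sequences share the generic helper
-- with three- and four-byte sequences.
decodeSketch :: [Word8] -> String
decodeSketch [] = ""
decodeSketch (c:cs)
  | c < 0x80  = chr (fromEnum c) : decodeSketch cs
  | c < 0xc0  = replacementChar : decodeSketch cs   -- stray continuation byte
  | c < 0xe0  = multiByte 1 0x1f 0x80
  | c < 0xf0  = multiByte 2 0x0f 0x800
  | c < 0xf8  = multiByte 3 0x07 0x10000
  | otherwise = replacementChar : decodeSketch cs
  where
    -- n: continuation bytes expected; mask: payload bits of the lead byte;
    -- overlong: smallest code point that legitimately needs this length.
    multiByte :: Int -> Word8 -> Int -> String
    multiByte n mask overlong = aux n cs (fromEnum (c .&. mask))
      where
        -- All continuation bytes consumed: validate the code point.
        aux 0 rest acc
          | acc >= overlong && acc <= 0x10ffff && not (isSurrogate acc)
                      = chr acc : decodeSketch rest
          | otherwise = replacementChar : decodeSketch rest
        -- Valid continuation byte: shift in its six payload bits.
        aux k (r:rest) acc
          | r .&. 0xc0 == 0x80
                      = aux (k - 1) rest ((acc `shiftL` 6) .|. fromEnum (r .&. 0x3f))
        -- Missing or malformed continuation byte: resume at that byte.
        aux _ rest _  = replacementChar : decodeSketch rest

    isSurrogate :: Int -> Bool
    isSurrogate x = x >= 0xd800 && x <= 0xdfff
```

With these definitions, `decodeSketch [0xc3, 0xa9]` yields `"\233"`, the overlong pair `[0xc0, 0x80]` yields a single replacement character, and the truncated input `[0xc3, 0x41]` yields a replacement character followed by `'A'`. For two-byte input this matches the case analysis in the `multi1` definition shown in the hunk.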