module UTF8 : sig .. end
The module for UTF-8 encoded Unicode strings.
type t = string
exception Malformed_code
val validate : t -> unit
validate s succeeds if s is valid UTF-8; otherwise it raises Malformed_code.
Other functions assume that strings are valid UTF-8, so it is prudent
to validate strings that come from untrusted origins.
val get : t -> int -> UChar.uchar
get s n returns the n-th Unicode character of s.
The call requires O(n) time.
val init : int -> (int -> UChar.uchar) -> t
init len f returns a new string which contains len Unicode characters.
The i-th Unicode character is initialized by f i.
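The behaviour of validate, init and get described above can be sketched with the OCaml standard library alone. This is a minimal illustration, not Camomile's implementation: it assumes OCaml 4.14 or later, uses the standard `Uchar.t` in place of `UChar.uchar`, and the three definitions are hypothetical stand-ins that only mirror the documented contracts.

```ocaml
(* Hypothetical stand-ins built on the OCaml >= 4.14 standard library.
   They mirror the documented behaviour, not Camomile's own code. *)
exception Malformed_code

(* Succeeds on valid UTF-8, raises Malformed_code otherwise. *)
let validate (s : string) : unit =
  if not (String.is_valid_utf_8 s) then raise Malformed_code

(* Build a string of [len] characters, the i-th one given by [f i]. *)
let init (len : int) (f : int -> Uchar.t) : string =
  let buf = Buffer.create len in
  for i = 0 to len - 1 do
    Buffer.add_utf_8_uchar buf (f i)
  done;
  Buffer.contents buf

(* n-th character: decode from the start of the string, hence O(n) time.
   Assumes [s] is valid UTF-8 and that it has more than [n] characters. *)
let get (s : string) (n : int) : Uchar.t =
  let rec go i n =
    let d = String.get_utf_8_uchar s i in
    if n = 0 then Uchar.utf_decode_uchar d
    else go (i + Uchar.utf_decode_length d) (n - 1)
  in
  go 0 n

let () =
  (* Three Greek letters, U+03B1 .. U+03B3, two bytes each in UTF-8. *)
  let s = init 3 (fun i -> Uchar.of_int (0x3B1 + i)) in
  validate s;
  assert (String.length s = 6);
  assert (Uchar.to_int (get s 2) = 0x3B3);
  (* A lone 0xFF byte is never valid UTF-8. *)
  assert (try validate "\xff"; false with Malformed_code -> true)
```

Validating once at the boundary where untrusted data enters, as the sketch does before calling get, keeps the remaining functions free of per-call checks.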
val length : t -> int
length s returns the number of Unicode characters contained in s.
type index = int
The position of the first character is 0.
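Because t is a plain string, the character count returned by length generally differs from the byte count returned by String.length. The sketch below illustrates the difference using only the standard library (OCaml 4.14 or later is assumed); the `length` defined here is a hypothetical stand-in, not Camomile's code.

```ocaml
(* Hypothetical stand-in for length: count characters by decoding them
   one at a time. Assumes [s] is valid UTF-8. *)
let length (s : string) : int =
  let rec go i acc =
    if i >= String.length s then acc
    else
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (acc + 1)
  in
  go 0 0

let () =
  (* "a" (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes) *)
  let s = "a\xc3\xa9\xe2\x82\xac" in
  assert (String.length s = 6);  (* bytes *)
  assert (length s = 3)          (* Unicode characters *)
```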
val nth : t -> int -> index
nth s n returns the position of the n-th Unicode character.
The call requires O(n) time.
val last : t -> index
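The O(n) cost of nth comes from the variable width of UTF-8 characters: the position of the n-th character is only known after the n characters before it have been skipped. A minimal sketch follows, assuming that positions are byte offsets (an assumption, since this page only says that index is an int) and OCaml 4.14 or later; `nth` here is a hypothetical stand-in.

```ocaml
(* Hypothetical stand-in for nth, assuming positions are byte offsets.
   Assumes [s] is valid UTF-8 and has more than [n] characters. *)
let nth (s : string) (n : int) : int =
  let rec go i n =
    if n = 0 then i
    else
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (n - 1)
  in
  go 0 n

let () =
  (* "a" (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes) *)
  let s = "a\xc3\xa9\xe2\x82\xac" in
  assert (nth s 0 = 0);
  assert (nth s 1 = 1);
  assert (nth s 2 = 3)
```

When a string is traversed sequentially, stepping from one position to the next avoids repeating this scan for every character.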
val look : t -> index -> UChar.uchar
look s i returns the Unicode character at the location i in the string s.
val out_of_range : t -> index -> bool
out_of_range s i tests whether i is outside of s, that is, it returns true if i is not a position inside of s.
val compare_index : t -> index -> index -> int
compare_index s i1 i2 returns
a value < 0 if i1 is the position located before i2,
0 if i1 and i2 point to the same location,
a value > 0 if i1 is the position located after i2.
val next : t -> index -> index
next s i returns the position of the head of the Unicode character
located immediately after i.
If i is inside of s, the function always succeeds.
If i is inside of s and there is no Unicode character after i,
a position outside of s is returned.
If i is not inside of s, the behaviour is unspecified.
val prev : t -> index -> index
prev s i returns the position of the head of the Unicode character
located immediately before i.
If i is inside of s, the function always succeeds.
If i is inside of s and there is no Unicode character before i,
a position outside of s is returned.
If i is not inside of s, the behaviour is unspecified.
val move : t -> index -> int -> index
move s i n returns the position of the n-th Unicode character after i if n >= 0,
or the position of the |n|-th Unicode character before i if n < 0.
If there is no such character, the result is unspecified.
val iter : (UChar.uchar -> unit) -> t -> unit
iter f s applies f to all Unicode characters in s.
The order of application is the same as the order
of the Unicode characters in s.
val compare : t -> t -> int
compare s1 s2 returns
a positive integer if s1 > s2,
0 if s1 = s2,
a negative integer if s1 < s2.
module Buf : sig .. end
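The positional functions combine into a simple traversal idiom: start at position 0, read a character with look, advance with next, and stop once out_of_range reports that the position has left the string. The sketch below expresses iter in those terms. It is an illustration under stated assumptions, not Camomile's implementation: positions are taken to be byte offsets, OCaml 4.14 or later is required, and all four definitions are hypothetical stand-ins that use the standard `Uchar.t`.

```ocaml
(* Hypothetical stand-ins, assuming positions are byte offsets into a
   valid UTF-8 string. *)

(* True when [i] is not a position inside [s]. *)
let out_of_range (s : string) (i : int) : bool =
  i < 0 || i >= String.length s

(* The character whose encoding starts at position [i]. *)
let look (s : string) (i : int) : Uchar.t =
  Uchar.utf_decode_uchar (String.get_utf_8_uchar s i)

(* The position of the character following the one at [i]. After the
   last character this is String.length s, which is outside of [s]. *)
let next (s : string) (i : int) : int =
  i + Uchar.utf_decode_length (String.get_utf_8_uchar s i)

(* iter written in terms of the positional functions: each step is O(1),
   so the whole traversal is linear in the length of [s]. *)
let iter (f : Uchar.t -> unit) (s : string) : unit =
  let rec go i =
    if not (out_of_range s i) then begin
      f (look s i);
      go (next s i)
    end
  in
  go 0

let () =
  let acc = ref [] in
  (* "a" (U+0061), U+00E9, U+20AC *)
  iter (fun u -> acc := Uchar.to_int u :: !acc) "a\xc3\xa9\xe2\x82\xac";
  assert (List.rev !acc = [0x61; 0xE9; 0x20AC])
```

Calling get s k for each k in turn would visit the same characters, but each call rescans from the start of the string, which makes the loop quadratic.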