Reference number of working document:   ISO/IEC JTC1/SC22/WG20 N553 
Date:   1997-12-21 
Reference number of document:   ISO/IEC FCD 14652 
Committee identification:   ISO/IEC JTC1/SC22 
Secretariat:  ANSI 
Information technology - Specifications for Cultural Conventions 
Technologies de l'information - Spécifications des conventions culturelles 
 Contents 
 
1 SCOPE
2 NORMATIVE REFERENCES
3 TERMS, DEFINITIONS AND NOTATIONS
4 FDCC-set
4.1 FDCC-set definition
4.2 LC_CTYPE
4.3 LC_COLLATE
4.4 LC_MONETARY
4.5 LC_NUMERIC
4.6 LC_TIME
4.7 LC_MESSAGES
4.8 LC_PAPER
4.9 LC_NAME
4.10 LC_ADDRESS
4.11 LC_TELEPHONE
4.12 LC_MEASUREMENT
4.13 LC_VERSIONS
5 CHARMAP
6 REPERTOIREMAP
7 CONFORMANCE
Annex A (informative) DIFFERENCES FROM POSIX
Annex B (informative) RATIONALE
Annex C (informative) INDEX
BIBLIOGRAPHY
 FOREWORD 
 
ISO (the International Organization for Standardization) and 
IEC (the International Electrotechnical Commission) form the 
specialized system for worldwide standardization. National 
bodies that are members of ISO or IEC participate in the 
development of International Standards through technical 
committees established by the respective organization to deal 
with particular fields of technical activity. ISO and IEC 
technical committees collaborate in fields of mutual interest. 
Other international organizations, governmental and non- 
governmental, in liaison with ISO and IEC, also take part in 
the work. 
 
International Standards are drafted in accordance with the 
rules given in the ISO/IEC Directives, Part 3. 
 
In the field of information technology, ISO and IEC have 
established a joint technical committee, ISO/IEC JTC 1. Draft 
International Standards adopted by the joint technical 
committee are circulated to national bodies for voting. 
Publication as an International Standard requires approval by 
at least 75 % of the national bodies casting a vote. 
 
International Standard ISO/IEC 14652 was prepared by Joint 
Technical Committee ISO/IEC JTC 1, "Information Technology", 
subcommittee 22, "Programming languages, their environments 
and system software interfaces". 
 
The Standard uses text from ISO/IEC 9945-2:1993 "Information 
Technology - Portable Operating System Interface (POSIX) - 
Part 2: Shell and Utilities", primarily clauses 2.4 and 2.5. 
The major differences from this text are listed in annex A. 
 
The annexes A, B and C are for information only. 
 Introduction 
 
This International Standard defines a general mechanism to 
specify cultural conventions, and it defines formats for a 
number of specific cultural conventions in the areas of 
character classification and conversion, sorting, number 
formatting, monetary formatting, date formatting, message 
display, paper formats, addressing of persons, postal address 
formatting, telephone number handling, measurement handling, 
and a way to specify how much is covered and the status of it. 
 
 
There are a number of benefits coming from this standard: 
 
Rigid specification     Using this International Standard, a user can
                        rigidly specify a number of the cultural
                        conventions that apply to the information
                        technology environment of the user.
Cultural adaptability   An application may use the specifications as
                        data to its APIs, and thus the same
                        application may accommodate different users in
                        a culturally acceptable way to each of the
                        users, without change of the binary
                        application.
Internationalization    An application developer can remove cultural
                        dependencies from an application, using the
                        localized data given by the customer. In this
                        way the application developer is relieved from
                        getting the different information to support
                        all the cultural environments for the expected
                        customers of the product. The application
                        developer is thus ensured of culturally correct
                        behaviour as specified by the customer, and
                        possibly more markets may be reached as
                        customers can provide the data themselves for
                        markets that were not targeted.
Uniform behaviour       A user may use his/her cultural convention
                        specifications with a number of applications,
                        and thus enjoy consistent and correct behaviour
                        on these issues from all of the applications.
 
The specification format is very general, independent of 
platforms and specific encoding, and targeted to be useable 
from a wide range of programming languages. 
 
This International Standard defines the format to be used for 
the International String Ordering standard, ISO/IEC 14651. 
This International Standard is backwards compatible with the 
ISO/IEC 9945-2:1993 POSIX shell and utilities standard, and it 
has enhanced functionality in a number of areas such as 
ISO/IEC 10646 support, more classification of characters, 
transliteration, dual currency support, enhanced date and time 
formatting, paper handling, personal name writing, postal 
address formatting, telephone number handling, measurement 
system handling, and management of categories. There is 
enhanced support for character sets including ISO 2022 
handling and an enhanced method to separate the specification 
of cultural conventions from an actual encoding via a 
description of the character repertoire employed. A standard 
set of values for all the categories has been defined covering 
the repertoire of ISO/IEC 10646.

Information technology - Specifications for cultural conventions 
 
1   SCOPE 
 
This Standard specifies a description format for the 
specification of cultural conventions, a description format 
for character sets, and a description format for binding 
character names to ISO/IEC 10646, plus a set of default values 
for some of these items. The specification is upward 
compatible with POSIX locale specifications - a locale 
conformant to POSIX specifications will also be conformant to 
the specifications in this Standard, while the reverse 
condition will not hold. The descriptions are intended to be 
coded in text files to be used via Application Programming 
Interfaces. 
 
 
2   NORMATIVE REFERENCES 
 
The following normative documents contain provisions which, 
through reference in this text, constitute provisions of this 
International Standard. For dated references, subsequent 
amendments to, or revisions of, any of these publications do 
not apply. However, parties to agreements based on this 
International Standard are encouraged to investigate the 
possibility of applying the most recent editions of the 
normative documents indicated below. For undated references, 
the latest edition of the normative document referred to 
applies. Members of ISO and IEC maintain registers of 
currently valid International Standards. 
 
ISO/IEC 2022, "Information technology - Character code 
structure and extension techniques". 
 
ISO 4217, "Codes for the representation of currencies and 
funds". 
 
ISO 8601, "Data elements and interchange formats - Information 
interchange - Representation of dates and times". 
 
ISO/IEC 9945-2:1993, "Information technology - Portable 
Operating System Interface (POSIX) Part 2: Shell and 
Utilities". 
 
ISO/IEC 10646:1997, "Information technology - Universal 
Multiple-Octet Coded Character Set (UCS), including Cor.1 and 
AMD 1-9". 
 
ISO/IEC 14651, "Information technology - International string 
ordering - Method for comparing character strings and 
description of a default tailorable ordering". 
 
3   TERMS, DEFINITIONS AND NOTATIONS 
 
3.1   Terms and definitions 
 
For the purposes of this International Standard, the terms and 
definitions given in the following apply. 
 
3.1.1 byte: An individually addressable unit of data storage 
that is equal to or larger than an octet, used to store a 
character or a portion of a character. 
 
A byte is composed of a contiguous sequence of bits, the 
number of which is application defined. The least significant 
bit is called the low-order bit; the most significant bit is 
called the high-order bit. 
 
3.1.2 character: A member of a set of elements used for the 
organization, control or representation of data. 
 
3.1.3 coded character: A sequence of one or more bytes 
representing a single character. 
 
3.1.4 text file: A file that contains characters organized 
into one or more lines. 
 
3.1.5 cultural convention: A data item for computer use that 
may vary dependent on language, territory, or other cultural 
circumstances. 
 
3.1.6 FDCC-set: A Set of Formal Definitions of Cultural 
Conventions. The definition of the subset of a user's 
information technology environment that depends on language 
and cultural conventions. Note: the FDCC-set is a superset of 
the "locale" term in C and POSIX. 
 
3.1.7 charmap: A definition of a mapping between symbolic 
character names and the encoding for a coded character set. 
 
3.1.8 repertoiremap: A definition of a mapping between 
symbolic character names and characters for the repertoire of 
characters used in a FDCC-set, further described in clause 6. 
 
3.1.9 character class: A named set of characters sharing an 
attribute associated with the name of the class. 
 
3.1.10 printable character: One of the characters included in 
the "print" character classification of the LC_CTYPE category 
in the current FDCC-set. 
 
3.1.11 white space: A sequence of one or more characters that 
belong to the "space" class as defined via the LC_CTYPE 
category in the current FDCC-set. 
 
3.1.12 collation: The logical ordering of strings according to 
defined precedence rules. 
 
3.1.13 collating element: The smallest entity used to 
determine the logical ordering of strings. 
 
See collating sequence. A collating element shall consist of 
either a single character, or two or more characters collating 
as a single entity. The value of the LC_COLLATE category in 
the current FDCC-set determines the current set of collating 
elements. 
 
3.1.14 multicharacter collating element: A sequence of two or 
more characters that collate as an entity. 
 
For example, in some languages two characters are sorted as 
one letter; this is the case for Danish and Norwegian "aa". 
 
3.1.15 collating sequence: The relative order of collating 
elements as determined by the setting of the LC_COLLATE 
category in the current FDCC-set. 
 
3.1.16 equivalence class: A set of collating elements with the 
same primary collation weight. 
 
Elements in an equivalence class are typically elements that 
naturally group together, such as all accented letters based 
on the same letter. 
 
The collation order of elements within an equivalence class is 
determined by the weights assigned on any subsequent levels 
after the primary weight. 
 
3.1.17 affirmative response: A string conforming to the 
definition of LC_MESSAGES category keyword "yesexpr". 
 
3.1.18 negative response: A string conforming to the 
definition of LC_MESSAGES category keyword "noexpr". 
 
3.2   Notations 
 
The following notations and common conventions for 
specifications apply to this standard: 
 
3.2.1   Format of syntax descriptions 
 
In this standard the syntax descriptions for statements are 
specified in the following way: 
 
The format is given in a format string enclosed in double 
quotes, followed by a number of parameters, separated by a 
comma. The format of each parameter is given by an escape 
sequence as follows: 
 
 %s      specifies a string 
 %d      specifies a decimal integer 
 %c      specifies a character 
 %o      specifies an octal integer 
 %x      specifies a hexadecimal integer 
 
All other characters in the format string except 
 
 %%      specifies a single % 
 \n      specifies an end-of-line 
 
represent themselves. 
 
The notation "..." is used to specify that repetition of the 
previous specification is optional, and this is done in both 
the format string and in the parameter list. 
 
 
3.2.2   Continuation of lines 
 
A line in a specification can be continued by placing an 
escape character as the last visible graphic character on the 
line; this continuation character shall be discarded from the 
input. Comment lines shall not be continued on a subsequent 
line using an escaped <newline>. 
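
Note: the following Python sketch is informative only. It illustrates one possible way to apply the continuation rule above, assuming the default escape character (backslash) and default comment character (number-sign) unless others are passed in. The function name and parameters are hypothetical, and an escaped escape character at the end of a line is not treated specially.

```python
def join_continued_lines(lines, escape_char="\\", comment_char="#"):
    """Join lines whose last visible character is the escape character.

    Comment lines (comment character in the first position) are never
    continued, even if they end with the escape character.
    """
    result = []
    pending = ""  # text accumulated from earlier continued lines
    for raw in lines:
        line = raw.rstrip()  # trailing white space is not a visible character
        is_comment = not pending and line.startswith(comment_char)
        if not is_comment and line.endswith(escape_char):
            pending += line[:-1]  # discard the continuation character
        else:
            result.append(pending + line)
            pending = ""
    if pending:
        result.append(pending)
    return result


source = ["upper <A>;<B>;\\", "      <C>", "# comment \\", "lower <a>"]
joined = join_continued_lines(source)
```

Under these assumptions, the first two input lines are joined into one statement, while the comment line is left as it is.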
 
3.2.3   Ellipses 
 
A series of characters in a specification can be represented 
by three adjacent periods representing an absolute ellipsis 
symbol ("..."), or the symbols "...." or ".." representing 
respectively the symbolic decimal ellipsis symbol and the 
symbolic hexadecimal ellipsis symbol. The ellipsis 
specification shall be interpreted as meaning that all values 
between the values preceding and following it represent valid 
characters. 
 
The absolute ellipsis specification is only valid within a 
single encoded character set. An ellipsis shall be interpreted 
as including in the list all characters with an encoded value 
higher than the encoded value of the character preceding the 
ellipsis and lower than the encoded value of the character 
following the ellipsis. The absolute ellipsis specification is 
deprecated, as this is only relevant to FDCC-sets not using 
symbolic characters. 
 
The symbolic ellipsis specifications are only valid between 
symbolic character names. They shall be interpreted as all the 
symbolic names that can be generated by incrementing the 
first symbolic name decimally or hexadecimally (corresponding 
to "...." or ".." respectively), for as long as the generated 
symbolic character name is less than or equal to the second 
symbolic character name. 
 
Examples: 
 
The use of the hexadecimal symbolic ellipsis in 
<U01AC>..<U01B2> generates the symbolic character names 
<U01AC>, <U01AD>, <U01AE>, <U01AF>, <U01B0>, <U01B1>, and 
<U01B2> in that sequence. 
 
The use of the decimal symbolic ellipsis in <j0148>....<j0153> 
generates the symbolic character names <j0148>, <j0149>, 
<j0150>, <j0151>, <j0152>, and <j0153> in that sequence. 
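
Note: the following Python sketch is informative only. It shows how the hexadecimal ("..") and decimal ("....") symbolic ellipses might be expanded. It assumes that both names share the same prefix and that the numeric part is the trailing run of digits inside the angle brackets. The function name is hypothetical.

```python
import re


def expand_symbolic_ellipsis(first, last, hexadecimal):
    """Expand a symbolic ellipsis between two symbolic character names.

    hexadecimal=True corresponds to "..", hexadecimal=False to "....".
    """
    digits = "0-9A-Fa-f" if hexadecimal else "0-9"
    # Shortest prefix followed by the longest trailing run of digits.
    pattern = re.compile(rf"^<(.*?)([{digits}]+)>$")
    m_first, m_last = pattern.match(first), pattern.match(last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError("names must share a prefix and end in digits")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # keep leading zeros
    base = 16 if hexadecimal else 10
    start = int(m_first.group(2), base)
    stop = int(m_last.group(2), base)
    fmt = "X" if hexadecimal else "d"
    return [f"<{prefix}{n:0{width}{fmt}}>" for n in range(start, stop + 1)]


hex_names = expand_symbolic_ellipsis("<U01AC>", "<U01B2>", hexadecimal=True)
dec_names = expand_symbolic_ellipsis("<j0148>", "<j0153>", hexadecimal=False)
```

With the endpoints from the examples above, the two calls yield the seven and six symbolic character names listed there.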
 
 
4   FDCC-set 
 
A FDCC-set is the definition of the subset of a user's 
information technology environment that depends on language 
and cultural conventions. It is made up from one or more 
categories.  Each category is identified by its name and 
controls specific aspects of the behaviour of components of 
the system. This standard defines the following categories: 
 
 LC_CTYPE            Character classification, case conversion
                     and code transformation.
 LC_COLLATE          Collation order.
 LC_TIME             Date and time formats.
 LC_NUMERIC          Numeric, non-monetary formatting.
 LC_MONETARY         Monetary formatting.
 LC_MESSAGES         Formats of informative and diagnostic
                     messages and interactive responses.
 LC_PAPER            Paper format.
 LC_NAME             Format of writing personal names.
 LC_ADDRESS          Format of postal addresses.
 LC_TELEPHONE        Format for telephone numbers, and other
                     telephone information.
 LC_MEASUREMENT      Information on measurement system.
 LC_VERSIONS         Versions and status of categories.
 
In future editions of this standard further categories may be 
added. Other category names beginning with the three characters 
"LC_" are intended for future standardization, except for 
category names beginning with the five characters "LC_X_", whose 
use is application-defined. An implementation should thus use 
category names beginning with the five characters "LC_X_" to 
avoid clashes with future standardized categories. 
 
This standard also defines an FDCC-set named "i18n" with 
values for each of the above categories. 
 
4.1   FDCC-set Definition 
 
FDCC-sets are described with the format presented in this 
subclause.  For the purposes of this standard, the text is 
referred to as the FDCC-set definition text or FDCC-set source 
text. 
 
The FDCC-set definition text shall contain one or more FDCC- 
set category source definitions, and shall not contain more 
than one definition for the same FDCC-set category. If the 
text contains source definitions for more than one category, 
application-defined categories, if present, shall appear after 
the categories defined by this clause. A category source 
definition shall contain either the definition of a category 
or a copy directive.  In the event that some of the 
information for a FDCC-set category, as specified in this 
standard, is missing from the FDCC-set source definition, the 
behaviour of that category, if it is referenced, is 
unspecified. A FDCC-set category is the normal way of 
specifying a single FDCC. 
 
A category source definition shall consist of a category 
header, a category body, and a category trailer. A category 
header shall consist of the character string naming the 
category, beginning with the characters "LC_". The category 
trailer shall consist of the string "END", followed by one or 
more "blank"s and the string used in the corresponding 
category header. 
 
The category body shall consist of one or more lines of text. 
Each line shall contain an identifier, optionally followed by 
one or more operands. Identifiers shall be either keywords, 
identifying a particular FDCC, or collating elements, or 
script symbols, or transliteration statements. In addition to 
the keywords defined in this standard, the source can contain 
application-defined keywords. Each keyword within a category 
shall have a unique name (i.e., two categories can have a 
commonly-named keyword); no keyword shall start with the 
characters "LC_". Identifiers shall be separated from the 
operands by one or more "blank"s. 
 
Operands shall be characters, collating elements, script 
symbols, or strings of characters. Strings shall be enclosed 
in double-quotes. Literal double-quotes within strings shall 
be preceded by the <escape character>, described below. When a 
keyword is followed by more than one operand, the operands 
shall be separated by semicolons; "blank"s shall be allowed 
before and/or after a semicolon. 
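
Note: the following Python sketch is informative only. It illustrates the structure described above: a category header, body lines made of an identifier followed by semicolon-separated operands, and an "END" trailer. The function names are hypothetical, the comment and escape characters are passed in as parameters, and line continuation is assumed to have been resolved beforehand. Semantic checks, such as keyword uniqueness, are left out.

```python
def split_operands(text, escape_char="\\"):
    """Split an operand list on semicolons outside double-quoted strings."""
    operands, current = [], ""
    in_string = escaped = False
    for ch in text:
        if escaped:  # the character after the escape is taken literally
            current += ch
            escaped = False
        elif ch == escape_char:
            current += ch
            escaped = True
        elif ch == '"':
            current += ch
            in_string = not in_string
        elif ch == ";" and not in_string:
            operands.append(current.strip())  # blanks around ';' are allowed
            current = ""
        else:
            current += ch
    if current.strip():
        operands.append(current.strip())
    return operands


def parse_category(lines, comment_char="#", escape_char="\\"):
    """Return (category name, [(identifier, [operands]), ...])."""
    # Ignore blank lines and lines with the comment character in column 1.
    kept = [ln.strip() for ln in lines
            if ln.strip() and not ln.startswith(comment_char)]
    if len(kept) < 2:
        raise ValueError("a category needs a header and a trailer")
    header, trailer = kept[0], kept[-1].split()
    if not header.startswith("LC_"):
        raise ValueError("category header must begin with LC_")
    if trailer != ["END", header]:
        raise ValueError("trailer must be END followed by the category name")
    statements = []
    for line in kept[1:-1]:
        parts = line.split(None, 1)  # identifier, then optional operands
        operands = split_operands(parts[1], escape_char) if len(parts) > 1 else []
        statements.append((parts[0], operands))
    return header, statements


example = ["LC_CTYPE", "% hypothetical example", "upper <A>;<B> ; <C>",
           'class "special1";<x>', "END LC_CTYPE"]
name, statements = parse_category(example, comment_char="%")
```

Statements are kept as an ordered list rather than a dictionary, because some identifiers may legitimately occur on more than one line of a category body.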
 
4.1.1   Character representation 
 
Individual characters, characters in strings, and collating 
elements shall be represented using symbolic names, UCS 
notation or characters themselves, or as octal, hexadecimal, 
or decimal constants as defined below. When constant notation 
is used, the resultant FDCC-set definitions need not be 
portable between systems. 
 
(0)   The left angle bracket (<) is a reserved symbol, denoting 
 the start of a symbolic name; when used to represent 
 itself it shall be preceded by the escape character. 
 
(1)   A character can be represented via a symbolic name, 
 enclosed within angle brackets (< and >). The symbolic 
 name, including the angle brackets, shall exactly match a 
 symbolic name defined in a charmap or a repertoiremap to 
 be used, and shall be replaced by a character value 
 determined from the value associated with the symbolic 
 name in the charmap or a value associated via a 
 repertoiremap. Repertoiremaps have predefined symbolic 
 names for UCS characters, see clause 6. Use of the escape 
 character or a right angle bracket within a symbolic name 
 shall be invalid unless the character is preceded by the 
 escape character. 
 
 Example: <c>;<c-cedilla> "<M><a><y>" 
 
The items (2), (3), (4) and (5) are deprecated and are 
retained for compatibility with the POSIX standard. FDCC-sets 
should be specified in a coded character set independent way, 
using symbolic names. To make actual use of the FDCC-set, it 
shall be used together with charmaps and/or repertoiremaps, so 
that the symbolic character names can be resolved into the 
actual character encoding used. 
 
(2)   A character can be represented by the character itself, 
 in which case the value of the character is application- 
 defined. Within a string, the double-quote character, the 
 escape character, and the right angle bracket character 
 shall be escaped (preceded by the escape character) to be 
 interpreted as the character itself. Outside strings, the 
 characters 
 
 , ; < > escape_char 
 
 shall be escaped to be interpreted as the character itself. 
 
 Example: c  "May" 
 
(3)   A character can be represented as an octal constant. An 
 octal constant shall be specified as the escape character 
 followed by two or more octal digits. Each constant shall 
 represent a byte value. 
 
 Example: \143; \347; "\115" 
 
(4)   A character can be represented as a hexadecimal constant. 
 A hexadecimal constant shall be specified as the escape 
 character followed by an x followed by two or more 
 hexadecimal digits. Each constant shall represent a byte 
 value. 
 
 Example: \x63;\xe7; 
 
(5)   A character can be represented as a decimal constant. A 
 decimal constant shall be specified as the escape 
 character followed by a d followed by two or more decimal 
 digits. Each constant shall represent a byte value. 
 
 Example: \d99; \d231; 
 
(6)   Multibyte characters can be represented by concatenated 
 constants specified in byte order with the last constant 
 specifying the least significant byte of the character. 
 Concatenated constants can include a mix of the above 
 character representations. 
 
 Example: \143\xe7; "\115\xe7\d171" 
 
Only characters existing in the character set for which the 
FDCC-set definition is created shall be specified, whether 
using symbolic names, the characters themselves, or octal, 
decimal, or hexadecimal constants. If a charmap is present, 
only characters defined in the charmap can be specified using 
octal, decimal, or hexadecimal constants. Symbolic names not 
present in the charmap can be specified and shall be ignored, 
as specified under item (1) above. 
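
Note: the following Python sketch is informative only. It decodes the octal, hexadecimal, and decimal constant notations of items (3) to (6) into byte values. It assumes 8-bit bytes, although this standard allows a byte to be larger than an octet, and it accepts only a sequence of concatenated constants. The function name is hypothetical.

```python
import re


def decode_constants(text, escape_char="\\"):
    """Decode concatenated octal, hexadecimal, and decimal constants."""
    esc = re.escape(escape_char)
    # escape + 'x' + hex digits, escape + 'd' + decimal digits,
    # or escape + octal digits; two or more digits in every case.
    token = re.compile(
        rf"{esc}(?:x([0-9A-Fa-f]{{2,}})|d([0-9]{{2,}})|([0-7]{{2,}}))"
    )
    out = bytearray()
    pos = 0
    while pos < len(text):
        match = token.match(text, pos)
        if match is None:
            raise ValueError(f"not a constant at position {pos}")
        hex_digits, dec_digits, oct_digits = match.groups()
        if hex_digits is not None:
            value = int(hex_digits, 16)
        elif dec_digits is not None:
            value = int(dec_digits, 10)
        else:
            value = int(oct_digits, 8)
        if value > 255:  # assumption: one constant is one 8-bit byte
            raise ValueError(f"value {value} does not fit in an 8-bit byte")
        out.append(value)
        pos = match.end()
    return bytes(out)


mixed = decode_constants("\\115\\xe7\\d171")
```

The three notations are interchangeable for a given byte: under these assumptions the constants \143, \x63, and \d99 from the examples above all decode to the same value.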
 
4.1.2   Pre-category statements 
 
In a FDCC-set the following statements can precede category 
specifications, and they apply to all categories in the 
specified FDCC-set. 
 
4.1.2.1   comment_char 
 
The following line in a FDCC-set modifies the comment 
character. It shall have the following format, starting in 
column 1: 
 
 "comment_char %c\n", <comment character> 
 
The comment character shall default to the number-sign (#). 
All examples in this standard use "%" as the <comment char>, 
except where otherwise noted. Blank lines and lines containing 
the <comment char> in the first position, and the remainder of 
a line with a <comment char> occurring where a syntactic 
semicolon may occur, shall be ignored. 
 
4.1.2.2   escape_char 
 
The following line in a FDCC-set modifies the escape character 
to be used in the text. It shall have the following format, 
starting in column 1: 
 
 "escape_char %c\n", <escape character> 
 
The escape character shall default to backslash "\". All 
examples in this standard use "/" as the escape character, 
except where otherwise noted. 
 
4.1.2.3   repertoiremap 
 
The following line in a FDCC-set specifies the name of a 
repertoiremap used to define the symbolic character names in 
the FDCC-set. There may be at most one "repertoiremap" line. 
It shall have the following format, starting in column 1: 
 
 "repertoiremap %s\n", <repertoiremap> 
 
4.1.2.4   charmap 
 
The following line in a FDCC-set specifies the name of a 
charmap which may be used with the FDCC-set. It shall have the 
following format, starting in column 1: 
 
 "charmap %s\n", <charmap> 
 
There may be more than one charmap specification in a FDCC- 
set. For the actual use of a FDCC-set, at most one charmap may 
be in use, and this may be different from any charmap 
specified with the "charmap" line. The "charmap" keyword is 
intended to provide information on which charmaps are supposed 
to be used with the FDCC-set, but other charmaps may also be 
applicable. 
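
Note: the following Python sketch is informative only. It collects the four pre-category statements described in 4.1.2 from the lines that precede the first category header, applying the defaults stated above. The function name, the returned dictionary layout, and the map names in the usage example are hypothetical.

```python
def read_pre_category_statements(lines):
    """Collect comment_char, escape_char, repertoiremap and charmap lines."""
    settings = {
        "comment_char": "#",    # default: number-sign
        "escape_char": "\\",    # default: backslash
        "repertoiremap": None,  # at most one repertoiremap line
        "charmaps": [],         # any number of charmap lines
    }
    for line in lines:
        if line.startswith("LC_"):
            break  # first category header: pre-category part is over
        if not line or line[0].isspace():
            continue  # statements start in column 1
        parts = line.split()
        if len(parts) != 2:
            continue
        keyword, value = parts
        if keyword in ("comment_char", "escape_char"):
            if len(value) != 1:
                raise ValueError(f"{keyword} takes a single character")
            settings[keyword] = value
        elif keyword == "repertoiremap":
            if settings["repertoiremap"] is not None:
                raise ValueError("at most one repertoiremap line is allowed")
            settings["repertoiremap"] = value
        elif keyword == "charmap":
            settings["charmaps"].append(value)
    return settings


settings = read_pre_category_statements([
    "escape_char /",
    "comment_char %",
    "repertoiremap example_repertoire",
    "charmap example_charmap_1",
    "charmap example_charmap_2",
    "LC_CTYPE",
])
```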
 
 
4.2   LC_CTYPE 
 
The LC_CTYPE category defines character classification, case 
conversion, character transformation, and other character 
attribute mappings. Ellipses and symbolic ellipses as 
defined in clause 3.2.3 may be used to specify a list of 
characters. Support for the portable character set is 
required. 
 
Example: \x30;...;\x39; includes in the character class all 
characters with encoded values between the endpoints. 
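
Note: the following Python sketch is informative only. It expands an absolute ellipsis in an operand list whose constants have already been converted to encoded values, following the rule in 3.2.3 that the ellipsis stands for all values higher than the preceding operand and lower than the following one. The function name and the list representation are hypothetical.

```python
def expand_absolute_ellipsis(operands):
    """Expand "..." markers in a list of encoded values (integers)."""
    result = []
    for index, item in enumerate(operands):
        if item != "...":
            result.append(item)
            continue
        if index == 0 or index == len(operands) - 1:
            raise ValueError("an ellipsis needs a value on both sides")
        before, after = operands[index - 1], operands[index + 1]
        if not isinstance(before, int) or not isinstance(after, int):
            raise ValueError("an ellipsis needs a value on both sides")
        # Values strictly between the two endpoints; the endpoints
        # themselves are ordinary operands and are added separately.
        result.extend(range(before + 1, after))
    return result


digits = expand_absolute_ellipsis([0x30, "...", 0x39])
```

In an ASCII-compatible coded character set, which is an assumption here, the resulting ten values are the digits 0 through 9.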
 
4.2.1   Basic keywords 
 
The following keywords shall be defined. In the descriptions, 
the term "automatically included" means that it shall not be 
an error to either include the referenced characters or to 
omit them; the interpreting system shall provide them if 
missing and accept them silently if present. 
 
copy    Specify the name of an existing FDCC-set to be used as 
 the source for the definition of this category. If this 
 keyword is specified, no other keyword shall be 
 specified. 
upper   Define characters to be classified as uppercase 
 letters. No character specified for the keywords cntrl, 
 digit, punct, or space shall be specified. The 
 uppercase letters A through Z of the portable character 
 set, shall automatically belong to this class, with 
 application-defined character values. The keyword may 
 be omitted. 
lower   Define characters to be classified as lowercase 
 letters. No character specified for the keywords cntrl, 
 digit, punct, or space shall be specified. The 
 lowercase letters a through z of the portable character 
 set, shall automatically belong to this class, with 
 application-defined character values. The keyword may be 
 omitted. 
alpha   Define characters to be classified as letters or other 
 characters used in words of natural languages such as 
 syllabic or ideographic characters. No character 
 specified for the keywords cntrl, digit, punct, or 
 space shall be specified. In addition, characters 
 classified as either upper or lower shall automatically 
 belong to this class. The keyword may be omitted. 
digit   Define the characters to be classified as numeric 
 digits. Digits corresponding to the values 0, 1, 2, 3, 
 4, 5, 6, 7, 8, and 9 can be specified in groups of 10 
 digits, and in ascending order of the values they 
 represent. The digits of the portable character set are 
 automatically included. If this keyword is not 
 specified, the digits 0 through 9 of the portable 
 character set shall automatically belong to this class, 
 with application-defined character values. The keyword 
 may be omitted. 
outdigit    Define the characters to be classified as numeric 
 digits for output. Digits corresponding to the 
 values 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 can be 
 specified, and in ascending order of the values they 
 represent. If this keyword is not specified, the 
 digits 0 through 9 of the portable character set 
 shall automatically belong to this class, with 
 application-defined character values. The keyword 
 may be omitted. 
space   Define characters to be classified as white-space 
 characters, used to find syntactic boundaries. No 
 character specified for the keywords upper, lower, 
 alpha, digit, graph, or xdigit shall be specified. If 
 this keyword is not specified, the characters <space>, 
 <form-feed>, <newline>, <carriage-return>, <tab>, and 
 <vertical-tab>, shall automatically belong to this 
 class, with application-defined character values. Any 
 characters included in the class blank shall be 
 automatically included. The keyword may be omitted. 
cntrl   Define characters to be classified as control 
 characters. No character specified for the keywords 
 upper, lower, alpha, digit, punct, graph, print, or 
 xdigit shall be specified. The keyword shall be 
 specified. 
punct   Define characters to be classified as punctuation 
 characters. No character specified for the keywords 
 upper, lower, alpha, digit, cntrl, xdigit, or as the 
 <space> character shall be specified. The keyword shall 
 be specified. 
graph   Define characters to be classified as printable 
 characters, not including the <space> character. If 
 this keyword is not specified, characters specified for 
 the keywords upper, lower, alpha, digit, xdigit, and 
 punct shall belong to this character class. No 
 character specified for the keyword cntrl shall be 
 specified. 
print   Define characters to be classified as printable 
 characters, including the <space> character. If this 
 keyword is not provided, characters specified for the 
 keywords upper, lower, alpha, digit, xdigit, punct, 
 graph, and the <space> character shall belong to this 
 character class. No character specified for the keyword 
 cntrl shall be specified. 
xdigit      Define the characters to be classified as 
 hexadecimal digits. Only the characters defined for 
 the class digit shall be specified, in ascending 
 sequence by numerical value, followed by one or more 
 sets of six characters representing the hexadecimal 
 digits 10 through 15, with each set in ascending 
 order (for example A, B, C, D, E, F, a, b, c, d, e, 
 f). If this keyword is not specified, the digits 0 
 through 9, the uppercase letters A through F, and 
 the lowercase letters a through f, shall 
 automatically belong to this class, with 
 application-defined character values. 
blank   Define characters to be classified as "blank" 
 characters. If this keyword is unspecified, the 
 characters <space> and <tab>, with application-defined 
 character values, shall belong to this character class. 
toupper     Define the mapping of lowercase letters to uppercase 
 letters. The operand shall consist of character 
 pairs, separated by semicolons. The characters in 
 each character pair shall be separated by a comma 
 and the pair enclosed by parentheses. The first 
 character in each pair shall be the lowercase 
 letter, the second the corresponding uppercase 
 letter. Only characters specified for the keywords 
 lower and upper shall be specified. If this keyword 
 is not specified, the lowercase letters a through z, 
 and their corresponding uppercase letters A through 
 Z, shall automatically be included, with 
 application-defined character values. 
tolower     Define the mapping of uppercase letters to lowercase 
 letters. The operand shall consist of character 
 pairs, separated by semicolons. The characters in 
 each character pair are separated by a comma and the 
 pair enclosed by parentheses. The first character in 
 each pair shall be the uppercase letter, the second 
 the corresponding lowercase letter. Only characters 
 specified for the keywords lower and upper shall be 
 specified. If this keyword is specified, the 
 uppercase letters A through Z, and their 
 corresponding lowercase letters, shall be specified. If this 
 keyword is not specified, the mapping shall be the 
 reverse mapping of the one specified for toupper. 
class   Define characters to be classified as characters in the 
 class defined with the first operand, which is a 
 string. The string shall only contain letters, digits 
 and <hyphen-minus> and <underline> from the portable 
 character set. The following operands are characters. 
 This keyword is optional. The keyword can only be 
 specified once per named class. Defined classes are: 
 left_to_right     Left-to-right directionality, for 
 example Latin letters. 
 right_to_left     Right-to-left directionality, for 
 example Hebrew letters. 
 num_terminator    Numeric terminator required for 
 determining the end of a number. 
 num_separator     Number separator characters that can 
 separate numbers written with any of 
 the characters in the digit class. 
 segment_separator       Segment separator characters 
 that delimit segments, normally 
 part of a line, with specific 
 directionality. 
 block_separator         Block separator characters that 
 delimit larger blocks of text 
 with a specific directionality. 
 direction_control       Direction control characters, 
 such as the characters listed in 
 ISO/IEC 10646-1:1993 annex 
 D.1.3. 
 sym_swap_layout         Symmetrical swap layout 
 characters, such as the 
 characters listed in ISO/IEC 
 10646-1:1993 annex D.2.2 
 char_shape_selector     Character shaping selector 
 characters, such as the 
 characters listed in ISO/IEC 
 10646-1:1993 annex D.2.3 
 num_shape_selector      Numeric shaping selector 
 characters, such as the 
 characters listed in ISO/IEC 
 10646-1:1993 annex D.2.4 
 non_spacing       Characters to form composite graphic 
 symbols, such as characters listed in 
 ISO/IEC 10646-1:1993 annex B.1. 
 non_spacing_level3      Characters to form composite 
 graphic symbols, that may also 
 be represented by other 
 characters, such as characters 
 listed in ISO/IEC 10646-1:1993 
 annex B.2. 
 normal_connect    Characters that connect both to the 
 left and to the right 
 r_connect         Characters that connect only to their 
 right. 
 no_connect        Characters that do not connect and 
 cannot be overridden. 
 no_connect-space        Characters that may be 
 overridden, but do not connect. 
 vowel_connect     Connectable vowels. 
 special1          Characters that need special 
 handling. 
 special2          Characters that need special 
 handling. 
 special3          Characters that need special 
handling. 
 The class names "upper", "lower", "alpha", "digit", 
 "space", "cntrl", "punct", "graph", "print", "xdigit", 
 and "blank" are taken to mean the classes defined by 
 the respective keywords. 
map     Define the mapping of characters. The first operand is 
 a string, defining the name of the mapping. The string 
 shall only contain letters, digits, <hyphen-minus> 
 and <underline> from the portable character set. The 
 following operands shall consist of character pairs, 
 separated by semicolons. The characters in each 
 character pair shall be separated by a comma and the 
 pair enclosed by parentheses. The first character in 
 each pair shall be the character to map from, the 
 second the corresponding character to map to. This 
 keyword is optional. The keyword can only be specified 
 once per named mapping. Defined mappings are: 
 tosymmetric       Characters to be switched for 
 each other in bidirectional text, for 
 example characters listed in ISO/IEC 
 10646-1 Annex C. For each pair, the 
 mapping from the second character to 
 the first character is also defined. 
 The mapping names "toupper" and "tolower" are taken to 
 mean the mappings defined by the respective keywords. 
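
The character-pair operand syntax used by toupper, tolower and map, and the rule that an unspecified tolower is the reverse of toupper, can be illustrated with the following informative sketch. It is not part of the specification. The function names are hypothetical, symbolic names are assumed not to contain semicolons, and the choice to let the first pair win when two source characters share one target is an assumption, since the text does not address that case.

```python
import re

# One "(<from>,<to>)" pair; symbolic names are taken as "<...>" tokens.
PAIR_RE = re.compile(r"\(\s*(<[^<>]+>)\s*,\s*(<[^<>]+>)\s*\)")


def parse_pairs(operand):
    """Parse '(<a>,<b>);(<c>,<d>)' into a list of (from, to) tuples."""
    pairs = []
    for item in operand.split(";"):
        item = item.strip()
        if not item:
            continue
        match = PAIR_RE.fullmatch(item)
        if match is None:
            raise ValueError(f"malformed character pair: {item!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs


def reverse_mapping(pairs):
    """Build the reverse mapping (to -> from).

    Assumption: when several sources map to the same target,
    the first pair listed is kept.
    """
    reverse = {}
    for source, target in pairs:
        reverse.setdefault(target, source)
    return reverse


# LATIN SMALL LETTER S and LATIN SMALL LETTER LONG S both map to <U0053>.
toupper = parse_pairs("(<U0073>,<U0053>);(<U017F>,<U0053>)")
tolower = reverse_mapping(toupper)
```

Under these assumptions, the derived tolower maps <U0053> back to <U0073> only.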
 
Table 1 shows the allowed character class combinations. 
 
 
Table 1: Valid Character Class Combinations 
 
Class   upper  lower  alpha  digit  space  cntrl  punct  graph  print  xdigit blank
 
upper          +      A      x      x      x      x      A      A      +      x
lower   +             A      x      x      x      x      A      A      +      x
alpha   +      +             x      x      x      x      A      A      +      x
digit   x      x      x             x      x      x      A      A      A      x
space   x      x      x      x             +      *      *      *      x      +
cntrl   x      x      x      x      +             x      x      x      x      +
punct   x      x      x      x      +      x             A      A      x      +
graph   +      +      +      +      +      x      +             A      +      +
print   +      +      +      +      +      x      +      +             +      +
xdigit  +      +      +      +      x      x      x      A      A             x
blank   x      x      x      x      A      +      *      *      *      x
 
NOTES: 
Note 1: Explanation of codes: 
A Automatically included; see text 
+ Permitted 
x Mutually exclusive 
* See note 2 
 
Note 2: The <space> character, which is part of the space and 
blank classes, cannot belong to punct or graph, but shall 
automatically belong to the print class. Other space or 
blank characters can be classified as punct, graph, and/or 
print. 
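
As an informative illustration, the "x" (mutually exclusive) entries of Table 1 can be held in a lookup structure and used to check a proposed set of classes for one character. The sketch below is not part of the specification. The names EXCLUSIVE and conflicts are hypothetical, and the "A" (automatic inclusion) and "*" (note 2) entries are deliberately not modelled.

```python
# Mutually exclusive classes, transcribed from the "x" entries of Table 1.
EXCLUSIVE = {
    "upper":  {"digit", "space", "cntrl", "punct", "blank"},
    "lower":  {"digit", "space", "cntrl", "punct", "blank"},
    "alpha":  {"digit", "space", "cntrl", "punct", "blank"},
    "digit":  {"upper", "lower", "alpha", "space", "cntrl", "punct", "blank"},
    "space":  {"upper", "lower", "alpha", "digit", "xdigit"},
    "cntrl":  {"upper", "lower", "alpha", "digit", "punct", "graph",
               "print", "xdigit"},
    "punct":  {"upper", "lower", "alpha", "digit", "cntrl", "xdigit"},
    "graph":  {"cntrl"},
    "print":  {"cntrl"},
    "xdigit": {"space", "cntrl", "punct", "blank"},
    "blank":  {"upper", "lower", "alpha", "digit", "xdigit"},
}


def conflicts(classes):
    """Return the pairs of classes in `classes` that Table 1 marks 'x'."""
    names = sorted(set(classes))
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if b in EXCLUSIVE[a]
    ]


# A letter may be upper, alpha, graph and print at the same time.
ok = conflicts({"upper", "alpha", "graph", "print"})
# A character cannot be both alpha and digit.
bad = conflicts({"alpha", "digit"})
```

A utility handling an FDCC-set could run such a check for every character after all class keywords have been read.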
 
4.2.2   Character string transliteration 
 
The following keywords may be used to transliterate strings. 
The transliteration may for example be from the Cyrillic 
script to the Latin script. Transliteration is often language 
dependent, and the language to be transliterated to is 
identified with the FDCC-set, which may also be used to 
identify a specific language to be transliterated from. 
Transliteration of an incoming character string to a character 
string in an FDCC-set can be specified with the following 
keywords and transliteration statements. 
 
translit_start         The "translit_start" keyword is followed by 
 one or more transliteration statements 
 assigning character transliteration values 
 to transliterating elements, and include 
 statements copying transliteration 
 specifications from other FDCC-sets. 
translit_end           The end of the transliteration statements. 
include           The name of the FDCC-set in text form to 
 transliterate from, and the repertoiremap for 
 the FDCC-set to be used for the definition of 
 the transliteration statements. Other 
 transliteration statements may follow to 
 replace specification of the copied FDCC-set. 
 This keyword is optional. 
default_missing        Defines one or more characters to be used 
 if no transliteration statement can be 
 applied to an input 
 <transliteration-source>. 
 
4.2.2.1   Transliteration statements 
 
The "translit_start" keyword may be followed by 
transliteration statements. The syntax for a transliteration 
statement is: 
 
 "%s %s;%s;...;%s\n", <transliteration-source>, 
 <transliteration-string>, <transliteration-string>, ... 
 
Each <transliteration-source> shall consist of one or more 
characters (in any of the forms defined in 4.1.1). The longest 
<transliteration-source>, in terms of number of characters, 
that matches the input string is the one selected for 
transliteration. 
 
The order in which the <transliteration-string>s are defined 
determines the precedence of the transliterations. The first 
<transliteration-string> that satisfies the transliteration 
(for example, by consisting only of characters that are in the 
coded character set being transformed into and by having the 
desired string length) is chosen. Note: For this match in the 
list of <transliteration-string>s, a repertoire describing 
which characters may be present in the resulting transformed 
string is expected to be available to the transliteration API. 
 
It is an error to give more than one transliteration statement 
for a given <transliteration-source>, unless the utility 
handling the FDCC-set specifically allows it, in which case a 
warning is given and the last transliteration statement is 
used. 
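
The selection rules above can be illustrated with the following informative sketch, which is not part of the specification. It works on plain characters rather than symbolic names. The function name transliterate and its parameters are hypothetical. Two behaviours are assumptions not stated in the text: an input character with no applicable statement is copied unchanged when it is already in the target repertoire, and when the matching statement has no <transliteration-string> that fits the repertoire, a single input character is treated as having no statement.

```python
def transliterate(text, table, repertoire, default_missing="?"):
    """Transliterate `text` using `table`.

    table:       maps a source string to its candidate replacement
                 strings, listed in order of precedence.
    repertoire:  set of characters available in the target coded
                 character set.
    """
    max_len = max((len(source) for source in table), default=0)
    out = []
    i = 0
    while i < len(text):
        replacement, consumed = None, 1
        # Try the longest possible source first.
        for n in range(min(max_len, len(text) - i), 0, -1):
            candidates = table.get(text[i:i + n])
            if candidates is None:
                continue
            # The first candidate made only of target characters is chosen.
            fit = next(
                (c for c in candidates if all(ch in repertoire for ch in c)),
                None,
            )
            if fit is not None:
                replacement, consumed = fit, n
            break  # the longest matching source decides
        if replacement is None:
            # Assumption: pass through characters the target already has.
            ch = text[i]
            replacement = ch if ch in repertoire else default_missing
        out.append(replacement)
        i += consumed
    return "".join(out)


ASCII = set(map(chr, range(0x20, 0x7F)))
TABLE = {"\u00e6": ["\u00e4", "\u03b5", "ae", "e"]}  # LATIN SMALL LETTER AE

ascii_result = transliterate("C\u00e6sar \u2192", TABLE, ASCII)
latin1_result = transliterate("C\u00e6sar", TABLE, ASCII | {"\u00e4"})
```

With an ASCII-only repertoire the first two candidates are skipped and "ae" is chosen; when the repertoire also contains LATIN SMALL LETTER A WITH DIAERESIS, that first candidate is chosen instead.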
 
4.2.2.2   "include" keyword 
 
The "include" keyword specifies a set of transliteration 
statements in text form to be included in the current 
transliteration. 
 
The syntax of the "include" statement is: 
 
 "include %s;%s\n", <FDCC-set>, <repertoiremap> 
 
<FDCC-set> is a string identifying the FDCC-set to be included 
from. 
 
<repertoiremap> is a string identifying the repertoiremap used 
in the FDCC-set being included, and is used to map character 
specifications from the specified FDCC-set into the current 
FDCC-set. 
 
4.2.2.3   Example of use of transliteration 
 
 translit_start 
 include "de_DE";"de_repmap" 
 default_missing <?> 
 <ae>    <a:>;<e*>;<a><e>;"<e>" 
 <s>     <s*>;<s=> 
 <K><O>  <KO> 
 translit_end 
 
The "translit_start" keyword introduces the transliteration 
section in the LC_CTYPE category. 
 
The "include" keyword specifies that the FDCC-set "de_DE" is 
copied and that the repertoiremap "de_repmap" is used to 
define the symbolic character names in the FDCC-set "de_DE". 
 
The "default_missing" keyword introduces the character 
sequence "<?>" as the string to transform into for input 
characters that cannot be transformed into other strings, 
because no transliteration statement is applicable to the 
character. 
 
The next 3 lines are transliteration statements. 
 
The first transliteration statement defines a number of 
transliterations for the LATIN LETTER AE, including into LATIN 
LETTER A WITH DIAERESIS, GREEK LETTER EPSILON, the two Latin 
letters A and E, and finally the LATIN LETTER E. 
 
The second transliteration statement defines transliteration 
of the LATIN LETTER S into GREEK LETTER SIGMA, and CYRILLIC 
LETTER ES. 
 
The third transliteration statement transliterates the two 
Latin letters K and O into the Japanese Hiragana character KO. 
 
The transliteration section is terminated by the 
"translit_end" keyword in the above example. 
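
As an informative illustration, the transliteration statements of the example can be split into their <transliteration-source> and <transliteration-string> parts as sketched below. This is not part of the specification. The function name parse_statement is hypothetical, and the sketch assumes that symbolic names contain neither semicolons nor angle brackets and that surrounding double quotes carry no further meaning.

```python
import re

# A symbolic character name such as <ae> or <U0041>.
SYMBOL_RE = re.compile(r"<[^<>]+>")


def parse_statement(line):
    """Split one transliteration statement into symbolic names.

    Returns (source, candidates): `source` is the list of symbolic
    names forming the <transliteration-source>, and `candidates` is
    one list of symbolic names per <transliteration-string>, kept in
    the order given (which is the order of precedence).
    """
    source, strings = line.split(None, 1)
    candidates = [
        SYMBOL_RE.findall(part.strip().strip('"'))
        for part in strings.split(";")
    ]
    return SYMBOL_RE.findall(source), candidates


first = parse_statement('<ae>    <a:>;<e*>;<a><e>;"<e>"')
third = parse_statement("<K><O>  <KO>")
```

The first statement yields one source character with four candidates, and the third yields a two-character source with a single candidate.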
 
4.2.3   "i18n" LC_CTYPE category 
 
The "i18n" FDCC-set for the LC_CTYPE category is defined as follows: 
 
 LC_CTYPE 
 % The following is the 14652 i18n fdcc-set LC_CTYPE category. 
 % It covers ISO/IEC 10646-1 including Cor.1 and AMD 1 thru 9 
 upper / 
 <U0041>..<U005A>;<U00C0>..<U00D6>;<U00D8>..<U00DE>;<U0100>;/ 
 <U0102>;<U0104>;<U0106>;<U0108>;<U010A>;<U010C>;<U010E>;<U0110>;/ 
 <U0112>;<U0114>;<U0116>;<U0118>;<U011A>;<U011C>;<U011E>;<U0120>;/ 
 <U0122>;<U0124>;<U0126>;<U0128>;<U012A>;<U012C>;<U012E>;<U0130>;/ 
 <U0132>;<U0134>;<U0136>;<U0139>;<U013B>;<U013D>;<U013F>;<U0141>;/ 
 <U0143>;<U0145>;<U0147>;<U014A>;<U014C>;<U014E>;<U0150>;<U0152>;/ 
 <U0154>;<U0156>;<U0158>;<U015A>;<U015C>;<U015E>;<U0160>;<U0162>;/ 
 <U0164>;<U0166>;<U0168>;<U016A>;<U016C>;<U016E>;<U0170>;<U0172>;/ 
 <U0174>;<U0176>;<U0178>;<U0179>;<U017B>;<U017D>;<U0181>;<U0182>;/ 
 <U0184>;<U0186>;<U0187>;<U0189>..<U018B>;<U018E>..<U0191>;/ 
 
<U0193>;<U0194>;<U0196>..<U0198>;<U019C>;<U019D>;<U019F>;<U01A0>;<U01A2>;/ 
 <U01A4>;<U01A7>;<U01A9>;<U01AC>;<U01AE>;<U01AF>;<U01B1>..<U01B3>;/ 
 <U01B5>;<U01B7>;<U01B8>;<U01BC>;<U01C4>;<U01C5>;<U01C7>;<U01C8>;/ 
 
<U01CA>;<U01CB>;<U01CD>;<U01CF>;<U01D1>;<U01D3>;<U01D5>;<U01D7>;<U01D9>;/ 
 <U01DB>;<U01DE>;<U01E0>;<U01E2>;<U01E4>;<U01E6>;<U01E8>;<U01EA>;/ 
 <U01EC>;<U01EE>;<U01F1>;<U01F2>;<U01F4>;<U01FA>;<U01FC>;<U01FE>;/ 
 <U0200>;<U0202>;<U0204>;<U0206>;<U0208>;<U020A>;<U020C>;<U020E>;/ 
 <U0210>;<U0212>;<U0214>;<U0216>;<U0262>;<U026A>;<U0274>;<U0276>;/ 
 <U0280>;<U0281>;<U028F>;<U0299>;<U029B>;<U029C>;<U029F>;<U0386>;/ 
 
 <U0388>..<U038A>;<U038C>;<U038E>;<U038F>;<U0391>..<U03A1>;<U03A3>..<U03AB>;/
 
<U0401>..<U040C>;<U040E>..<U042F>;<U0460>;<U0462>;<U0464>;<U0466>;<U0468>;/ 
 <U046A>;<U046C>;<U046E>;<U0470>;<U0472>;<U0474>;<U0476>;<U0478>;/ 
 <U047A>;<U047C>;<U047E>;<U0480>;<U0490>;<U0492>;<U0494>;<U0496>;/ 
 <U0498>;<U049A>;<U049C>;<U049E>;<U04A0>;<U04A2>;<U04A4>;<U04A6>;/ 
 <U04A8>;<U04AA>;<U04AC>;<U04AE>;<U04B0>;<U04B2>;<U04B4>;<U04B6>;/ 
 <U04B8>;<U04BA>;<U04BC>;<U04BE>;<U04C1>;<U04C3>;<U04C7>;<U04CB>;/ 
 <U04D0>;<U04D2>;<U04D4>;<U04D6>;<U04D8>;<U04DA>;<U04DC>;<U04DE>;/ 
 <U04E0>;<U04E2>;<U04E4>;<U04E6>;<U04E8>;<U04EA>;<U04EE>;<U04F0>;/ 
 <U04F2>;<U04F4>;<U04F8>;<U0531>..<U0556>;<U1E00>;<U1E02>;<U1E04>;/ 
 <U1E06>;<U1E08>;<U1E0A>;<U1E0C>;<U1E0E>;<U1E10>;<U1E12>;<U1E14>;/ 
 <U1E16>;<U1E18>;<U1E1A>;<U1E1C>;<U1E1E>;<U1E20>;<U1E22>;<U1E24>;/ 
 <U1E26>;<U1E28>;<U1E2A>;<U1E2C>;<U1E2E>;<U1E30>;<U1E32>;<U1E34>;/ 
 <U1E36>;<U1E38>;<U1E3A>;<U1E3C>;<U1E3E>;<U1E40>;<U1E42>;<U1E44>;/ 
 <U1E46>;<U1E48>;<U1E4A>;<U1E4C>;<U1E4E>;<U1E50>;<U1E52>;<U1E54>;/ 
 <U1E56>;<U1E58>;<U1E5A>;<U1E5C>;<U1E5E>;<U1E60>;<U1E62>;<U1E64>;/ 
 <U1E66>;<U1E68>;<U1E6A>;<U1E6C>;<U1E6E>;<U1E70>;<U1E72>;<U1E74>;/ 
 <U1E76>;<U1E78>;<U1E7A>;<U1E7C>;<U1E7E>;<U1E80>;<U1E82>;<U1E84>;/ 
 <U1E86>;<U1E88>;<U1E8A>;<U1E8C>;<U1E8E>;<U1E90>;<U1E92>;<U1E94>;/ 
 <U1EA0>;<U1EA2>;<U1EA4>;<U1EA6>;<U1EA8>;<U1EAA>;<U1EAC>;<U1EAE>;/ 
 <U1EB0>;<U1EB2>;<U1EB4>;<U1EB6>;<U1EB8>;<U1EBA>;<U1EBC>;<U1EBE>;/ 
 <U1EC0>;<U1EC2>;<U1EC4>;<U1EC6>;<U1EC8>;<U1ECA>;<U1ECC>;<U1ECE>;/ 
 <U1ED0>;<U1ED2>;<U1ED4>;<U1ED6>;<U1ED8>;<U1EDA>;<U1EDC>;<U1EDE>;/ 
 <U1EE0>;<U1EE2>;<U1EE4>;<U1EE6>;<U1EE8>;<U1EEA>;<U1EEC>;<U1EEE>;/ 
 <U1EF0>;<U1EF2>;<U1EF4>;<U1EF6>;<U1EF8>;<U1F08>..<U1F0F>;/ 
 
 <U1F18>..<U1F1D>;<U1F28>..<U1F2F>;<U1F38>..<U1F3F>;<U1F48>..<U1F4D>;<U1F59>;/
 <U1F5B>;<U1F5D>;<U1F5F>;<U1F68>..<U1F6F>;<U1F88>..<U1F8F>;/ 
 <U1F98>..<U1F9F>;<U1FA8>..<U1FAF>;<U1FB8>..<U1FBC>;<U1FC8>..<U1FCC>;/ 
 <U1FD8>..<U1FDB>;<U1FE8>..<U1FEC>;<U1FF8>..<U1FFC>;<UFF21>..<UFF3A> 
 % 
 lower / 
 <U0061>..<U007A>;<U00DF>..<U00F6>;<U00F8>..<U00FF>;<U0101>;/ 
 <U0103>;<U0105>;<U0107>;<U0109>;<U010B>;<U010D>;<U010F>;<U0111>;/ 
 <U0113>;<U0115>;<U0117>;<U0119>;<U011B>;<U011D>;<U011F>;<U0121>;/ 
 <U0123>;<U0125>;<U0127>;<U0129>;<U012B>;<U012D>;<U012F>;<U0131>;/ 
 <U0133>;<U0135>;<U0137>;<U0138>;<U013A>;<U013C>;<U013E>;<U0140>;/ 
 <U0142>;<U0144>;<U0146>;<U0148>;<U0149>;<U014B>;<U014D>;<U014F>;/ 
 <U0151>;<U0153>;<U0155>;<U0157>;<U0159>;<U015B>;<U015D>;<U015F>;/ 
 <U0161>;<U0163>;<U0165>;<U0167>;<U0169>;<U016B>;<U016D>;<U016F>;/ 
 <U0171>;<U0173>;<U0175>;<U0177>;<U017A>;<U017C>;<U017E>..<U0180>;/ 
 <U0183>;<U0185>;<U0188>;<U018C>;<U018D>;<U0192>;<U0195>;/ 
 <U0199>..<U019B>;<U019E>;<U01A1>;<U01A3>;<U01A5>;<U01A8>;<U01AB>;<U01AD>;/ 
 <U01B0>;<U01B4>;<U01B6>;<U01B9>;<U01BA>;<U01BD>;<U01C5>;<U01C6>;/ 
 
<U01C8>;<U01C9>;<U01CB>;<U01CC>;<U01CE>;<U01D0>;<U01D2>;<U01D4>;<U01D6>;/ 
 <U01D8>;<U01DA>;<U01DC>;<U01DD>;<U01DF>;<U01E1>;<U01E3>;<U01E5>;/ 
 <U01E7>;<U01E9>;<U01EB>;<U01ED>;<U01EF>;<U01F0>;<U01F2>;<U01F3>;/ 
 <U01F5>;<U01FB>;<U01FD>;<U01FF>;<U0201>;<U0203>;<U0205>;<U0207>;/ 
 <U0209>;<U020B>;<U020D>;<U020F>;<U0211>;<U0213>;<U0215>;<U0217>;/ 
 
 <U0250>..<U0293>;<U0299>..<U02A0>;<U02A3>..<U02A8>;<U0390>;<U03AC>..<U03CE>;/
 
<U0430>..<U044F>;<U0451>..<U045C>;<U045E>;<U045F>;<U0461>;<U0463>;<U0465>;/ 
 <U0467>;<U0469>;<U046B>;<U046D>;<U046F>;<U0471>;<U0473>;<U0475>;/ 
 <U0477>;<U0479>;<U047B>;<U047D>;<U047F>;<U0481>;<U0491>;<U0493>;/ 
 <U0495>;<U0497>;<U0499>;<U049B>;<U049D>;<U049F>;<U04A1>;<U04A3>;/ 
 <U04A5>;<U04A7>;<U04A9>;<U04AB>;<U04AD>;<U04AF>;<U04B1>;<U04B3>;/ 
 <U04B5>;<U04B7>;<U04B9>;<U04BB>;<U04BD>;<U04BF>;<U04C2>;<U04C4>;/ 
 <U04C8>;<U04CC>;<U04D1>;<U04D3>;<U04D5>;<U04D7>;<U04D9>;<U04DB>;/ 
 <U04DD>;<U04DF>;<U04E1>;<U04E3>;<U04E5>;<U04E7>;<U04E9>;<U04EB>;/ 
 <U04EF>;<U04F1>;<U04F3>;<U04F5>;<U04F9>;<U0561>..<U0586>;<U1E01>;/ 
 <U1E03>;<U1E05>;<U1E07>;<U1E09>;<U1E0B>;<U1E0D>;<U1E0F>;<U1E11>;/ 
 <U1E13>;<U1E15>;<U1E17>;<U1E19>;<U1E1B>;<U1E1D>;<U1E1F>;<U1E21>;/ 
 <U1E23>;<U1E25>;<U1E27>;<U1E29>;<U1E2B>;<U1E2D>;<U1E2F>;<U1E31>;/ 
 <U1E33>;<U1E35>;<U1E37>;<U1E39>;<U1E3B>;<U1E3D>;<U1E3F>;<U1E41>;/ 
 <U1E43>;<U1E45>;<U1E47>;<U1E49>;<U1E4B>;<U1E4D>;<U1E4F>;<U1E51>;/ 
 <U1E53>;<U1E55>;<U1E57>;<U1E59>;<U1E5B>;<U1E5D>;<U1E5F>;<U1E61>;/ 
 <U1E63>;<U1E65>;<U1E67>;<U1E69>;<U1E6B>;<U1E6D>;<U1E6F>;<U1E71>;/ 
 <U1E73>;<U1E75>;<U1E77>;<U1E79>;<U1E7B>;<U1E7D>;<U1E7F>;<U1E81>;/ 
 <U1E83>;<U1E85>;<U1E87>;<U1E89>;<U1E8B>;<U1E8D>;<U1E8F>;<U1E91>;/ 
 <U1E93>;<U1E95>..<U1E9B>;<U1EA1>;<U1EA3>;<U1EA5>;<U1EA7>;<U1EA9>;/ 
 <U1EAB>;<U1EAD>;<U1EAF>;<U1EB1>;<U1EB3>;<U1EB5>;<U1EB7>;<U1EB9>;/ 
 <U1EBB>;<U1EBD>;<U1EBF>;<U1EC1>;<U1EC3>;<U1EC5>;<U1EC7>;<U1EC9>;/ 
 <U1ECB>;<U1ECD>;<U1ECF>;<U1ED1>;<U1ED3>;<U1ED5>;<U1ED7>;<U1ED9>;/ 
 <U1EDB>;<U1EDD>;<U1EDF>;<U1EE1>;<U1EE3>;<U1EE5>;<U1EE7>;<U1EE9>;/ 
 <U1EEB>;<U1EED>;<U1EEF>;<U1EF1>;<U1EF3>;<U1EF5>;<U1EF7>;<U1EF9>;/ 
 <U1F00>..<U1F07>;<U1F10>..<U1F15>;<U1F20>..<U1F27>;<U1F30>..<U1F37>;/ 
 <U1F40>..<U1F45>;<U1F50>..<U1F57>;<U1F60>..<U1F67>;<U1F70>..<U1F7D>;/ 
 <U1F80>..<U1F87>;<U1F90>..<U1F97>;<U1FA0>..<U1FA7>;<U1FB0>..<U1FB4>;/ 
 <U1FB6>;<U1FB7>;<U1FC2>..<U1FC4>;<U1FC6>;<U1FC7>;<U1FD0>..<U1FD3>;/ 
 
<U1FD6>;<U1FD7>;<U1FE0>..<U1FE7>;<U1FF2>..<U1FF4>;<U1FF6>;<U1FF7>;<U207F>;/ 
 <U2129>;<UFB00>..<UFB06>;<UFF41>..<UFF5A> 
 % 
 alpha / 
 <U0041>..<U005A>;<U0061>..<U007A>;<U00AA>;<U00BA>;<U00C0>..<U00D6>;/ 
 <U00D8>..<U00F6>;<U00F8>..<U01F5>;<U01FA>..<U0217>;<U0250>..<U02A8>;/ 
 <U1E00>..<U1E9B>;<U1EA0>..<U1EF9>;<U207F>;/ 
 <U0386>;<U0388>..<U038A>;<U038C>;<U038E>..<U03A1>;<U03A3>..<U03CE>;/ 
 <U03D0>..<U03D6>;<U03DA>;<U03DC>;<U03DE>;<U03E0>;<U03E2>..<U03F3>;/ 
 <U1F00>..<U1F15>;<U1F18>..<U1F1D>;<U1F20>..<U1F45>;<U1F48>..<U1F4D>;/ 
 <U1F50>..<U1F57>;<U1F59>;<U1F5B>;<U1F5D>;<U1F5F>..<U1F7D>;/ 
 <U1F80>..<U1FB4>;<U1FB6>..<U1FBC>;<U1FC2>..<U1FC4>;<U1FC6>..<U1FCC>;/ 
 <U1FD0>..<U1FD3>;<U1FD6>..<U1FDB>;<U1FE0>..<U1FEC>;<U1FF2>..<U1FF4>;/ 
 <U1FF6>..<U1FFC>;/ 
 <U0401>..<U040C>;<U040E>..<U044F>;<U0451>..<U045C>;<U045E>..<U0481>;/ 
 <U0490>..<U04C4>;<U04C7>..<U04C8>;<U04CB>..<U04CC>;<U04D0>..<U04EB>;/ 
 <U04EE>..<U04F5>;<U04F8>..<U04F9>;/ 
 <U0531>..<U0556>;<U0561>..<U0587>;/ 
 <U05B0>..<U05B9>;<U05BB>..<U05BD>;<U05BF>;<U05C1>..<U05C2>;/ 
 <U05D0>..<U05EA>;<U05F0>..<U05F2>;/ 
 <U0621>..<U063A>;<U0640>..<U0652>;<U0670>..<U06B7>;<U06BA>..<U06BE>;/ 
 <U06C0>..<U06CE>;<U06D0>..<U06DC>;<U06E5>..<U06E8>;<U06EA>..<U06ED>;/ 
 <U0901>..<U0903>;<U0905>..<U0939>;<U093E>..<U094D>;<U0950>..<U0952>;/ 
 <U0958>..<U0963>;<U0981>..<U0983>;<U0985>..<U098C>;<U098F>..<U0990>;/ 
 <U0993>..<U09A8>;<U09AA>..<U09B0>;<U09B2>;<U09B6>..<U09B9>;/ 
 <U09BE>..<U09C4>;<U09C7>..<U09C8>;<U09CB>..<U09CD>;<U09DC>..<U09DD>;/ 
 <U09DF>..<U09E3>;<U09F0>..<U09F1>;/ 
 <U0A02>;<U0A05>..<U0A0A>;<U0A0F>..<U0A10>;<U0A13>..<U0A28>;/ 
 <U0A2A>..<U0A30>;<U0A32>..<U0A33>;<U0A35>..<U0A36>;<U0A38>..<U0A39>;/ 
 <U0A3E>..<U0A42>;<U0A47>..<U0A48>;<U0A4B>..<U0A4D>;<U0A59>..<U0A5C>;/ 
 <U0A5E>;<U0A74>;/ 
 <U0A81>..<U0A83>;<U0A85>..<U0A8B>;<U0A8D>;<U0A8F>..<U0A91>;/ 
 <U0A93>..<U0AA8>;<U0AAA>..<U0AB0>;<U0AB2>..<U0AB3>;<U0AB5>..<U0AB9>;/ 
 <U0ABD>..<U0AC5>;<U0AC7>..<U0AC9>;<U0ACB>..<U0ACD>;<U0AD0>;<U0AE0>;/ 
 <U0B01>..<U0B03>;<U0B05>..<U0B0C>;<U0B0F>..<U0B10>;<U0B13>..<U0B28>;/ 
 <U0B2A>..<U0B30>;<U0B32>..<U0B33>;<U0B36>..<U0B39>;<U0B3E>..<U0B43>;/ 
 <U0B47>..<U0B48>;<U0B4B>..<U0B4D>;<U0B5C>..<U0B5D>;<U0B5F>..<U0B61>;/ 
 <U0B82>..<U0B83>;<U0B85>..<U0B8A>;<U0B8E>..<U0B90>;<U0B92>..<U0B95>;/ 
 <U0B99>..<U0B9A>;<U0B9C>;<U0B9E>..<U0B9F>;<U0BA3>..<U0BA4>;/ 
 <U0BA8>..<U0BAA>;<U0BAE>..<U0BB5>;<U0BB7>..<U0BB9>;<U0BBE>..<U0BC2>;/ 
 <U0BC6>..<U0BC8>;<U0BCA>..<U0BCD>;/ 
 <U0C01>..<U0C03>;<U0C05>..<U0C0C>;<U0C0E>..<U0C10>;<U0C12>..<U0C28>;/ 
 <U0C2A>..<U0C33>;<U0C35>..<U0C39>;<U0C3E>..<U0C44>;<U0C46>..<U0C48>;/ 
 <U0C4A>..<U0C4D>;<U0C60>..<U0C61>;/ 
 <U0C82>..<U0C83>;<U0C85>..<U0C8C>;<U0C8E>..<U0C90>;<U0C92>..<U0CA8>;/ 
 <U0CAA>..<U0CB3>;<U0CB5>..<U0CB9>;<U0CBE>..<U0CC4>;<U0CC6>..<U0CC8>;/ 
 <U0CCA>..<U0CCD>;<U0CDE>;<U0CE0>..<U0CE1>;/ 
 <U0D02>..<U0D03>;<U0D05>..<U0D0C>;<U0D0E>..<U0D10>;<U0D12>..<U0D28>;/ 
 <U0D2A>..<U0D39>;<U0D3E>..<U0D43>;<U0D46>..<U0D48>;<U0D4A>..<U0D4D>;/ 
 <U0D60>..<U0D61>;/ 
 <U0E01>..<U0E3A>;<U0E40>..<U0E5B>;/ 
 <U0E81>..<U0E82>;<U0E84>;<U0E87>..<U0E88>;<U0E8A>;<U0E8D>;/ 
 <U0E94>..<U0E97>;<U0E99>..<U0E9F>;<U0EA1>..<U0EA3>;<U0EA5>;<U0EA7>;/ 
 <U0EAA>..<U0EAB>;<U0EAD>..<U0EAE>;<U0EB0>..<U0EB9>;<U0EBB>..<U0EBD>;/ 
 <U0EC0>..<U0EC4>;<U0EC6>;<U0EC8>..<U0ECD>;<U0EDC>..<U0EDD>;/ 
 <U0F00>;<U0F18>..<U0F19>;<U0F35>;<U0F37>;<U0F39>;<U0F3E>..<U0F47>;/ 
 <U0F49>..<U0F69>;/ 
 <U0F71>..<U0F84>;<U0F86>..<U0F8B>;<U0F90>..<U0F95>;<U0F97>;/ 
 <U0F99>..<U0FAD>;<U0FB1>..<U0FB7>;<U0FB9>;/ 
 <U10A0>..<U10C5>;<U10D0>..<U10F6>;/ 
 <U3041>..<U3093>;<U309B>..<U309C>;/ 
 <U30A1>..<U30F6>;<U30FB>..<U30FC>;/ 
 <U3105>..<U312C>;/ 
 <U4E01>..<U4E02>;<U4E04>..<U4E08>;<U4E0A>..<U4E5C>;<U4E5E>..<U4E8B>;/ 
 <U4E8D>..<U4E93>;<U4E95>..<U516A>;<U516C>;<U516E>..<U56DA>;/ 
 <U56DC>..<U9FA5>;/ 
 <UAC00>..<UD7A3>;/ 
 <U00B5>;<U00B7>;<U02B0>..<U02B8>;<U02BB>;<U02BD>..<U02C1>;/ 
 <U02D0>..<U02D1>;<U02E0>..<U02E4>;<U037A>;<U0559>;<U093D>;<U0B3D>;/ 
 <U1FBE>;<U203F>..<U2040>;<U2102>;<U2107>;<U210A>..<U2113>;<U2115>;/ 
 <U2118>..<U211D>;<U2124>;<U2126>;<U2128>;<U212A>..<U2131>;/ 
 <U2133>..<U2138>;<U2160>..<U2182>;<U3005>..<U3006>;<U3021>..<U3029> 
 % 
 digit / 
 <U0030>..<U0039>;<U0660>..<U0669>;<U06F0>..<U06F9>;<U0966>..<U096F>;/ 
 <U09E6>..<U09EF>;<U0A66>..<U0A6F>;<U0AE6>..<U0AEF>;<U0B66>..<U0B6F>;/ 
 
<0>;<U0BE7>..<U0BEF>;<U0C66>..<U0C6F>;<U0CE6>..<U0CEF>;<U0D66>..<U0D6F>;/ 
 
 <U0E50>..<U0E59>;<U0ED0>..<U0ED9>;<U0F20>..<U0F29>;<U0F33>;<U0F2A>..<U0F32>;/
 <U3007>;<U4E00>;<U4E8C>;<U4E09>;<U56DB>;<U4E94>;/ 
 <U516D>;<U4E03>;<U516B>;<U4E5D> 
 % 
 outdigit <U0030>..<U0039> 
 % 
 space   <U0009>;<U000A>..<U000D>;<U0020>;<U2000>..<U2006>;/ 
 <U2008>..<U200B>;<U3000> 
 % 
 cntrl   <U0000>..<U001F>;<U007F>..<U009F> 
 % 
 punct / 
 <U0021>..<U002F>;<U003A>..<U0040>;<U005B>..<U0060>;/ 
 
 <U007B>..<U007E>;<U00A0>..<U00BF>;<U00D7>;<U00F7>;<U02C7>;<U02D8>..<U02DD>;/
 <U037E>;<U0482>;<U055A>..<U055F>;<U0589>;<U05BE>;<U05C0>;<U05C3>;/ 
 <U05F3>;<U05F4>;<U060C>;<U061B>;<U061F>;<U0640>;<U064B>..<U0652>;/ 
 <U066A>..<U066D>;<U06D4>;<U06DD>..<U06E1>;<U06E9>..<U06EC>;<U10FB>;/ 
 <U2010>..<U2029>;<U2030>..<U2046>;<U20A0>..<U20AA>;<U2100>..<U210B>;/ 
 <U210D>..<U2110>;<U2112>..<U211B>;<U211D>..<U2127>;<U212A>..<U212C>;/ 
 
 <U212E>..<U2138>;<U2200>..<U22F1>;<U2300>;<U2302>..<U237A>;<U2400>..<U2424>;/
 <U2440>..<U244A>;<U2580>..<U2595>;<U25A0>..<U25EF>;<U2600>..<U2613>;/ 
 <U261A>..<U266F>;<U2701>..<U2704>;<U2706>..<U2709>;<U270C>..<U2727>;/ 
 <U2729>..<U274B>;<U274D>;<U274F>..<U2752>;<U2756>;<U2758>..<U275E>;/ 
 
<U2761>..<U2767>;<U3000>..<U3020>;<U3030>;<U3036>;<U3037>;<U303F>;<U3164>;/ 
 <U3190>..<U319F>;<U3200>..<U321C>;<U3220>..<U3243>;<U3260>..<U327B>;/ 
 <U327F>..<U32B0>;<U32C0>..<U32CB>;<U32D0>..<U32FE>;<U3300>..<U3376>;/ 
 <U337B>..<U33DD>;<U33E0>..<U33FE>;<UFD3E>;<UFD3F>;<UFE49>..<UFE52>;/ 
 
 <UFE54>..<UFE66>;<UFE68>..<UFE6B>;<UFEFF>;<UFF01>..<UFF0F>;<UFF1A>..<UFF20>;/
 <UFF3B>..<UFF40>;<UFF5B>..<UFF5E>;<UFF61>..<UFF65>;<UFF70>;<UFF9E>..<UFFA0>;/
 <UFFE0>..<UFFE6>;<UFFE8>..<UFFEE>;<UFFFD> 
 % 
 graph / 
 <U0021>..<U007E>;<U00A0>..<U01F5>;<U01FA>..<U0217>;/ 
 <U0250>..<U02A8>;<U02B0>..<U02DE>;<U02E0>..<U02E9>;<U0300>..<U0345>;/ 
 
<U0360>;<U0361>;<U0374>;<U0375>;<U037A>;<U037E>;<U0384>..<U038A>;<U038C>;/ 
 
 <U038E>..<U03A1>;<U03A3>..<U03CE>;<U03D0>..<U03D6>;<U03DA>;<U03DC>;<U03DE>;/
 <U03E0>;<U03E2>..<U03F3>;<U0401>..<U040C>;<U040E>..<U044F>;/ 
 <U0451>..<U045C>;<U045E>..<U0486>;<U0490>..<U04C4>;<U04C7>;<U04C8>;/ 
 <U04CB>;<U04CC>;<U04D0>..<U04EB>;<U04EE>..<U04F5>;<U04F8>;<U04F9>;/ 
 <U0531>..<U0556>;<U0559>..<U055F>;<U0561>..<U0587>;<U0589>;/ 
 <U0591>..<U05A1>;<U05A3>..<U05AF>;<U05B0>..<U05B9>;/ 
 
 <U05BB>..<U05C4>;<U05D0>..<U05EA>;<U05F0>..<U05F4>;<U060C>;<U061B>;<U061F>;/
 <U0621>..<U063A>;<U0640>..<U0652>;<U0660>..<U066D>;<U0670>..<U06B7>;/ 
 <U06BA>..<U06BE>;<U06C0>..<U06CE>;<U06D0>..<U06ED>;<U06F0>..<U06F9>;/ 
 <U0901>..<U0903>;<U0905>..<U0939>;<U093C>..<U094D>;<U0950>..<U0954>;/ 
 <U0958>..<U0970>;<U0981>..<U0983>;<U0985>..<U098C>;<U098F>;<U0990>;/ 
 <U0993>..<U09A8>;<U09AA>..<U09B0>;<U09B2>;<U09B6>..<U09B9>;<U09BC>;/ 
 
<U09BE>..<U09C4>;<U09C7>;<U09C8>;<U09CB>..<U09CD>;<U09D7>;<U09DC>;<U09DD>;/ 
 
 <U09DF>..<U09E3>;<U09E6>..<U09FA>;<U0A02>;<U0A05>..<U0A0A>;<U0A0F>;<U0A10>;/
 <U0A13>..<U0A28>;<U0A2A>..<U0A30>;<U0A32>;<U0A33>;<U0A35>;<U0A36>;/ 
 
<U0A38>;<U0A39>;<U0A3C>;<U0A3E>..<U0A42>;<U0A47>;<U0A48>;<U0A4B>..<U0A4D>;/ 
 
 <U0A59>..<U0A5C>;<U0A5E>;<U0A66>..<U0A74>;<U0A81>..<U0A83>;<U0A85>..<U0A8B>;/
 <U0A8D>;<U0A8F>..<U0A91>;<U0A93>..<U0AA8>;<U0AAA>..<U0AB0>;/ 
 <U0AB2>;<U0AB3>;<U0AB5>..<U0AB9>;<U0ABC>..<U0AC5>;<U0AC7>..<U0AC9>;/ 
 <U0ACB>..<U0ACD>;<U0AD0>;<U0AE0>;<U0AE6>..<U0AEF>;<U0B01>..<U0B03>;/ 
 <U0B05>..<U0B0C>;<U0B0F>;<U0B10>;<U0B13>..<U0B28>;<U0B2A>..<U0B30>;/ 
 <U0B32>;<U0B33>;<U0B36>..<U0B39>;<U0B3C>..<U0B43>;<U0B47>;<U0B48>;/ 
 <U0B4B>..<U0B4D>;<U0B56>;<U0B57>;<U0B5C>;<U0B5D>;<U0B5F>..<U0B61>;/ 
 <U0B66>..<U0B70>;<U0B82>;<U0B83>;<U0B85>..<U0B8A>;<U0B8E>..<U0B90>;/ 
 
<U0B92>..<U0B95>;<U0B99>;<U0B9A>;<U0B9C>;<U0B9E>;<U0B9F>;<U0BA3>;<U0BA4>;/ 
 <U0BA8>..<U0BAA>;<U0BAE>..<U0BB5>;<U0BB7>..<U0BB9>;<U0BBE>..<U0BC2>;/ 
 
 <U0BC6>..<U0BC8>;<U0BCA>..<U0BCD>;<U0BD7>;<U0BE7>..<U0BF2>;<U0C01>..<U0C03>;/
 <U0C05>..<U0C0C>;<U0C0E>..<U0C10>;<U0C12>..<U0C28>;<U0C2A>..<U0C33>;/ 
 <U0C35>..<U0C39>;<U0C3E>..<U0C44>;<U0C46>..<U0C48>;<U0C4A>..<U0C4D>;/ 
 <U0C55>;<U0C56>;<U0C60>;<U0C61>;<U0C66>..<U0C6F>;<U0C82>;<U0C83>;/ 
 <U0C85>..<U0C8C>;<U0C8E>..<U0C90>;<U0C92>..<U0CA8>;<U0CAA>..<U0CB3>;/ 
 <U0CB5>..<U0CB9>;<U0CBE>..<U0CC4>;<U0CC6>..<U0CC8>;<U0CCA>..<U0CCD>;/ 
 
<U0CD5>;<U0CD6>;<U0CDE>;<U0CE0>;<U0CE1>;<U0CE6>..<U0CEF>;<U0D02>;<U0D03>;/ 
 <U0D05>..<U0D0C>;<U0D0E>..<U0D10>;<U0D12>..<U0D28>;<U0D2A>..<U0D39>;/ 
 
 <U0D3E>..<U0D43>;<U0D46>..<U0D48>;<U0D4A>..<U0D4D>;<U0D57>;<U0D60>;<U0D61>;/
 <U0D66>..<U0D6F>;<U0E01>..<U0E3A>;<U0E3F>..<U0E5B>;<U0E81>;<U0E82>;<U0E84>;/
 <U0E87>;<U0E88>;<U0E8A>;<U0E8D>;<U0E94>..<U0E97>;<U0E99>..<U0E9F>;/ 
 <U0EA1>..<U0EA3>;<U0EA5>;<U0EA7>;<U0EAA>;<U0EAB>;<U0EAD>..<U0EB9>;/ 
 
 <U0EBB>..<U0EBD>;<U0EC0>..<U0EC4>;<U0EC6>;<U0EC8>..<U0ECD>;<U0ED0>..<U0ED9>;/
 <U0EDC>;<U0EDD>;/ 
 <U0F00>..<U0F47>;<U0F49>..<U0F69>;<U0F71>..<U0F7F>;/ 
 <U10A0>..<U10C5>;<U10D0>..<U10F6>;<U10FB>;<U1100>..<U1159>;/ 
 <U115F>..<U11A2>;<U11A8>..<U11F9>;<U1E00>..<U1E9B>;<U1EA0>..<U1EF9>;/ 
 <U1F00>..<U1F15>;<U1F18>..<U1F1D>;<U1F20>..<U1F45>;<U1F48>..<U1F4D>;/ 
 
 <U1F50>..<U1F57>;<U1F59>;<U1F5B>;<U1F5D>;<U1F5F>..<U1F7D>;<U1F80>..<U1FB4>;/
 <U1FB6>..<U1FC4>;<U1FC6>..<U1FD3>;<U1FD6>..<U1FDB>;<U1FDD>..<U1FEF>;/ 
 <U1FF2>..<U1FF4>;<U1FF6>..<U1FFE>;<U2000>..<U202E>;<U2030>..<U2046>;/ 
 <U206A>..<U2070>;<U2074>..<U208E>;<U20A0>..<U20AB>;<U20D0>..<U20E1>;/ 
 
 <U2100>..<U2138>;<U2153>..<U2182>;<U2190>..<U21EA>;<U2200>..<U22F1>;<U2300>;/
 <U2302>..<U237A>;<U2400>..<U2424>;<U2440>..<U244A>;<U2460>..<U24EA>;/ 
 <U2500>..<U2595>;<U25A0>..<U25EF>;<U2600>..<U2613>;<U261A>..<U266F>;/ 
 
 <U2701>..<U2704>;<U2706>..<U2709>;<U270C>..<U2727>;<U2729>..<U274B>;<U274D>;/
 <U274F>..<U2752>;<U2756>;<U2758>..<U275E>;<U2761>..<U2767>;<U2776>..<U2794>;/
 <U2798>..<U27AF>;<U27B1>..<U27BE>;<U3000>..<U3037>;<U303F>;<U3041>..<U3094>;/
 <U3099>..<U309E>;<U30A1>..<U30FE>;<U3105>..<U312C>;<U3131>..<U318E>;/ 
 <U3190>..<U319F>;<U3200>..<U321C>;<U3220>..<U3243>;<U3260>..<U327B>;/ 
 <U327F>..<U32B0>;<U32C0>..<U32CB>;<U32D0>..<U32FE>;<U3300>..<U3376>;/ 
 <U337B>..<U33DD>;<U33E0>..<U33FE>;<UFB00>..<UFB06>;<UFB13>..<UFB17>;/ 
 
<UFB1E>..<UFB36>;<UFB38>..<UFB3C>;<UFB3E>;<UFB40>;<UFB41>;<UFB43>;<UFB44>;/ 
 <UFB46>..<UFBB1>;<UFBD3>..<UFD3F>;<UFD50>..<UFD8F>;<UFD92>..<UFDC7>;/ 
 <UFDF0>..<UFDFB>;<UFE20>..<UFE23>;<UFE30>..<UFE44>;<UFE49>..<UFE52>;/ 
 
 <UFE54>..<UFE66>;<UFE68>..<UFE6B>;<UFE70>..<UFE72>;<UFE74>;<UFE76>..<UFEFC>;/
 <UFEFF>;<UFF01>..<UFF5E>;<UFF61>..<UFFBE>;<UFFC2>..<UFFC7>;/ 
 <UFFCA>..<UFFCF>;<UFFD2>..<UFFD7>;<UFFDA>..<UFFDC>;<UFFE0>..<UFFE6>;/ 
 <UFFE8>..<UFFEE>;<UFFFD> 
 % 
 % "print" is by default "graph", and the <space> character 
 % 
 xdigit  <U0030>..<U0039>;<U0041>..<U0046>;<U0061>..<U0066> 
 % 
 blank   <U0009>;<U0020>;<U2000>..<U2006>;<U2008>..<U200B>;<U3000> 
 % 
 toupper / 
 
(<U0061>,<U0041>);(<U0062>,<U0042>);(<U0063>,<U0043>);(<U0064>,<U0044>);/ 
 
(<U0065>,<U0045>);(<U0066>,<U0046>);(<U0067>,<U0047>);(<U0068>,<U0048>);/ 
 
(<U0069>,<U0049>);(<U006A>,<U004A>);(<U006B>,<U004B>);(<U006C>,<U004C>);/ 
 
(<U006D>,<U004D>);(<U006E>,<U004E>);(<U006F>,<U004F>);(<U0070>,<U0050>);/ 
 
(<U0071>,<U0051>);(<U0072>,<U0052>);(<U0073>,<U0053>);(<U0074>,<U0054>);/ 
 
(<U0075>,<U0055>);(<U0076>,<U0056>);(<U0077>,<U0057>);(<U0078>,<U0058>);/ 
 
(<U0079>,<U0059>);(<U007A>,<U005A>);(<U00E0>,<U00C0>);(<U00E1>,<U00C1>);/ 
 
(<U00E2>,<U00C2>);(<U00E3>,<U00C3>);(<U00E4>,<U00C4>);(<U00E5>,<U00C5>);/ 
 
(<U00E6>,<U00C6>);(<U00E7>,<U00C7>);(<U00E8>,<U00C8>);(<U00E9>,<U00C9>);/ 
 
(<U00EA>,<U00CA>);(<U00EB>,<U00CB>);(<U00EC>,<U00CC>);(<U00ED>,<U00CD>);/ 
 
(<U00EE>,<U00CE>);(<U00EF>,<U00CF>);(<U00F0>,<U00D0>);(<U00F1>,<U00D1>);/ 
 
(<U00F2>,<U00D2>);(<U00F3>,<U00D3>);(<U00F4>,<U00D4>);(<U00F5>,<U00D5>);/ 
 (<U00F6>,<U00D6>);(<U00F8>,<U00D8>);(<U00F9>,<U00D9>);(<U00FA>,<U00DA>);/ 
 
(<U00FB>,<U00DB>);(<U00FC>,<U00DC>);(<U00FD>,<U00DD>);(<U00FE>,<U00DE>);/ 
 
(<U00FF>,<U0178>);(<U0101>,<U0100>);(<U0103>,<U0102>);(<U0105>,<U0104>);/ 
 
(<U0107>,<U0106>);(<U0109>,<U0108>);(<U010B>,<U010A>);(<U010D>,<U010C>);/ 
 
(<U010F>,<U010E>);(<U0111>,<U0110>);(<U0113>,<U0112>);(<U0115>,<U0114>);/ 
 
(<U0117>,<U0116>);(<U0119>,<U0118>);(<U011B>,<U011A>);(<U011D>,<U011C>);/ 
 
(<U011F>,<U011E>);(<U0121>,<U0120>);(<U0123>,<U0122>);(<U0125>,<U0124>);/ 
 
(<U0127>,<U0126>);(<U0129>,<U0128>);(<U012B>,<U012A>);(<U012D>,<U012C>);/ 
 
(<U012F>,<U012E>);(<U0133>,<U0132>);(<U0135>,<U0134>);(<U0137>,<U0136>);/ 
 
(<U013A>,<U0139>);(<U013C>,<U013B>);(<U013E>,<U013D>);(<U0140>,<U013F>);/ 
 
(<U0142>,<U0141>);(<U0144>,<U0143>);(<U0146>,<U0145>);(<U0148>,<U0147>);/ 
 
(<U014B>,<U014A>);(<U014D>,<U014C>);(<U014F>,<U014E>);(<U0151>,<U0150>);/ 
 
(<U0153>,<U0152>);(<U0155>,<U0154>);(<U0157>,<U0156>);(<U0159>,<U0158>);/ 
 
(<U015B>,<U015A>);(<U015D>,<U015C>);(<U015F>,<U015E>);(<U0161>,<U0160>);/ 
 
(<U0163>,<U0162>);(<U0165>,<U0164>);(<U0167>,<U0166>);(<U0169>,<U0168>);/ 
 
(<U016B>,<U016A>);(<U016D>,<U016C>);(<U016F>,<U016E>);(<U0171>,<U0170>);/ 
 
(<U0173>,<U0172>);(<U0175>,<U0174>);(<U0177>,<U0176>);(<U017A>,<U0179>);/ 
 
(<U017C>,<U017B>);(<U017E>,<U017D>);(<U017F>,<U0053>);(<U0183>,<U0182>);/ 
 
(<U0185>,<U0184>);(<U0188>,<U0187>);(<U018C>,<U018B>);(<U0192>,<U0191>);/ 
 
(<U0199>,<U0198>);(<U01A1>,<U01A0>);(<U01A3>,<U01A2>);(<U01A5>,<U01A4>);/ 
 
(<U01A8>,<U01A7>);(<U01AD>,<U01AC>);(<U01B0>,<U01AF>);(<U01B4>,<U01B3>);/ 
 
(<U01B6>,<U01B5>);(<U01B9>,<U01B8>);(<U01BD>,<U01BC>);(<U01C5>,<U01C4>);/ 
 
 (<U01C6>,<U01C4>);(<U01C8>,<U01C7>);(<U01C9>,<U01C7>);/
 (<U01CB>,<U01CA>);(<U01CC>,<U01CA>);/
 
(<U01CE>,<U01CD>);(<U01D0>,<U01CF>);(<U01D2>,<U01D1>);(<U01D4>,<U01D3>);/ 
 
(<U01D6>,<U01D5>);(<U01D8>,<U01D7>);(<U01DA>,<U01D9>);(<U01DC>,<U01DB>);/ 
 
(<U01DD>,<U018E>);(<U01DF>,<U01DE>);(<U01E1>,<U01E0>);(<U01E3>,<U01E2>);/ 
 
(<U01E5>,<U01E4>);(<U01E7>,<U01E6>);(<U01E9>,<U01E8>);(<U01EB>,<U01EA>);/ 
 
(<U01ED>,<U01EC>);(<U01EF>,<U01EE>);(<U01F2>,<U01F1>);(<U01F3>,<U01F1>);/ 
 
 (<U01F5>,<U01F4>);(<U01FB>,<U01FA>);(<U01FD>,<U01FC>);/
 
(<U01FF>,<U01FE>);(<U0201>,<U0200>);(<U0203>,<U0202>);(<U0205>,<U0204>);/ 
 
(<U0207>,<U0206>);(<U0209>,<U0208>);(<U020B>,<U020A>);(<U020D>,<U020C>);/ 
 
(<U020F>,<U020E>);(<U0211>,<U0210>);(<U0213>,<U0212>);(<U0215>,<U0214>);/ 
 
(<U0217>,<U0216>);(<U0253>,<U0181>);(<U0254>,<U0186>);(<U0256>,<U0189>);/ 
 
(<U0257>,<U018A>);(<U0258>,<U018E>);(<U0259>,<U018F>);(<U025B>,<U0190>);/ 
 
(<U0260>,<U0193>);(<U0263>,<U0194>);(<U0268>,<U0197>);(<U0269>,<U0196>);/ 
 
(<U026F>,<U019C>);(<U0272>,<U019D>);(<U0283>,<U01A9>);(<U0288>,<U01AE>);/ 
 
(<U028A>,<U01B1>);(<U028B>,<U01B2>);(<U0292>,<U01B7>);(<U03AC>,<U0386>);/ 
 
(<U03AD>,<U0388>);(<U03AE>,<U0389>);(<U03AF>,<U038A>);(<U03B1>,<U0391>);/ 
 
(<U03B2>,<U0392>);(<U03B3>,<U0393>);(<U03B4>,<U0394>);(<U03B5>,<U0395>);/ 
 
(<U03B6>,<U0396>);(<U03B7>,<U0397>);(<U03B8>,<U0398>);(<U03B9>,<U0399>);/ 
 
(<U03BA>,<U039A>);(<U03BB>,<U039B>);(<U03BC>,<U039C>);(<U03BD>,<U039D>);/ 
 
(<U03BE>,<U039E>);(<U03BF>,<U039F>);(<U03C0>,<U03A0>);(<U03C1>,<U03A1>);/ 
 
(<U03C2>,<U03A3>);(<U03C3>,<U03A3>);(<U03C4>,<U03A4>);(<U03C5>,<U03A5>);/ 
 
(<U03C6>,<U03A6>);(<U03C7>,<U03A7>);(<U03C8>,<U03A8>);(<U03C9>,<U03A9>);/ 
 
(<U03CA>,<U03AA>);(<U03CB>,<U03AB>);(<U03CC>,<U038C>);(<U03CD>,<U038E>);/ 
 
(<U03CE>,<U038F>);(<U0430>,<U0410>);(<U0431>,<U0411>);(<U0432>,<U0412>);/ 
 
(<U0433>,<U0413>);(<U0434>,<U0414>);(<U0435>,<U0415>);(<U0436>,<U0416>);/ 
 
(<U0437>,<U0417>);(<U0438>,<U0418>);(<U0439>,<U0419>);(<U043A>,<U041A>);/ 
 
(<U043B>,<U041B>);(<U043C>,<U041C>);(<U043D>,<U041D>);(<U043E>,<U041E>);/ 
 
(<U043F>,<U041F>);(<U0440>,<U0420>);(<U0441>,<U0421>);(<U0442>,<U0422>);/ 
 
(<U0443>,<U0423>);(<U0444>,<U0424>);(<U0445>,<U0425>);(<U0446>,<U0426>);/ 
 
(<U0447>,<U0427>);(<U0448>,<U0428>);(<U0449>,<U0429>);(<U044A>,<U042A>);/ 
 
(<U044B>,<U042B>);(<U044C>,<U042C>);(<U044D>,<U042D>);(<U044E>,<U042E>);/ 
 
(<U044F>,<U042F>);(<U0451>,<U0401>);(<U0452>,<U0402>);(<U0453>,<U0403>);/ 
 
(<U0454>,<U0404>);(<U0455>,<U0405>);(<U0456>,<U0406>);(<U0457>,<U0407>);/ 
 
(<U0458>,<U0408>);(<U0459>,<U0409>);(<U045A>,<U040A>);(<U045B>,<U040B>);/ 
 
(<U045C>,<U040C>);(<U045E>,<U040E>);(<U045F>,<U040F>);(<U0461>,<U0460>);/ 
 
(<U0463>,<U0462>);(<U0465>,<U0464>);(<U0467>,<U0466>);(<U0469>,<U0468>);/ 
 
(<U046B>,<U046A>);(<U046D>,<U046C>);(<U046F>,<U046E>);(<U0471>,<U0470>);/ 
 
(<U0473>,<U0472>);(<U0475>,<U0474>);(<U0477>,<U0476>);(<U0479>,<U0478>);/ 
 
(<U047B>,<U047A>);(<U047D>,<U047C>);(<U047F>,<U047E>);(<U0481>,<U0480>);/ 
 
(<U0491>,<U0490>);(<U0493>,<U0492>);(<U0495>,<U0494>);(<U0497>,<U0496>);/ 
 
(<U0499>,<U0498>);(<U049B>,<U049A>);(<U049D>,<U049C>);(<U049F>,<U049E>);/ 
 
(<U04A1>,<U04A0>);(<U04A3>,<U04A2>);(<U04A5>,<U04A4>);(<U04A7>,<U04A6>);/ 
 
(<U04A9>,<U04A8>);(<U04AB>,<U04AA>);(<U04AD>,<U04AC>);(<U04AF>,<U04AE>);/ 
 
(<U04B1>,<U04B0>);(<U04B3>,<U04B2>);(<U04B5>,<U04B4>);(<U04B7>,<U04B6>);/ 
 
(<U04B9>,<U04B8>);(<U04BB>,<U04BA>);(<U04BD>,<U04BC>);(<U04BF>,<U04BE>);/ 
 
(<U04C2>,<U04C1>);(<U04C4>,<U04C3>);(<U04C8>,<U04C7>);(<U04CC>,<U04CB>);/ 
 
(<U04D1>,<U04D0>);(<U04D3>,<U04D2>);(<U04D5>,<U04D4>);(<U04D7>,<U04D6>);/ 
 
(<U04D9>,<U04D8>);(<U04DB>,<U04DA>);(<U04DD>,<U04DC>);(<U04DF>,<U04DE>);/ 
 
(<U04E1>,<U04E0>);(<U04E3>,<U04E2>);(<U04E5>,<U04E4>);(<U04E7>,<U04E6>);/ 
 
(<U04E9>,<U04E8>);(<U04EB>,<U04EA>);(<U04EF>,<U04EE>);(<U04F1>,<U04F0>);/ 
 
(<U04F3>,<U04F2>);(<U04F5>,<U04F4>);(<U04F9>,<U04F8>);(<U0561>,<U0531>);/ 
 
(<U0562>,<U0532>);(<U0563>,<U0533>);(<U0564>,<U0534>);(<U0565>,<U0535>);/ 
 
(<U0566>,<U0536>);(<U0567>,<U0537>);(<U0568>,<U0538>);(<U0569>,<U0539>);/ 
 
(<U056A>,<U053A>);(<U056B>,<U053B>);(<U056C>,<U053C>);(<U056D>,<U053D>);/ 
 
(<U056E>,<U053E>);(<U056F>,<U053F>);(<U0570>,<U0540>);(<U0571>,<U0541>);/ 
 
(<U0572>,<U0542>);(<U0573>,<U0543>);(<U0574>,<U0544>);(<U0575>,<U0545>);/ 
 (<U0576>,<U0546>);(<U0577>,<U0547>);(<U0578>,<U0548>);(<U0579>,<U0549>);/ 
 
(<U057A>,<U054A>);(<U057B>,<U054B>);(<U057C>,<U054C>);(<U057D>,<U054D>);/ 
 
(<U057E>,<U054E>);(<U057F>,<U054F>);(<U0580>,<U0550>);(<U0581>,<U0551>);/ 
 
(<U0582>,<U0552>);(<U0583>,<U0553>);(<U0584>,<U0554>);(<U0585>,<U0555>);/ 
 
(<U0586>,<U0556>);(<U1E01>,<U1E00>);(<U1E03>,<U1E02>);(<U1E05>,<U1E04>);/ 
 
(<U1E07>,<U1E06>);(<U1E09>,<U1E08>);(<U1E0B>,<U1E0A>);(<U1E0D>,<U1E0C>);/ 
 
(<U1E0F>,<U1E0E>);(<U1E11>,<U1E10>);(<U1E13>,<U1E12>);(<U1E15>,<U1E14>);/ 
 
(<U1E17>,<U1E16>);(<U1E19>,<U1E18>);(<U1E1B>,<U1E1A>);(<U1E1D>,<U1E1C>);/ 
 
(<U1E1F>,<U1E1E>);(<U1E21>,<U1E20>);(<U1E23>,<U1E22>);(<U1E25>,<U1E24>);/ 
 
(<U1E27>,<U1E26>);(<U1E29>,<U1E28>);(<U1E2B>,<U1E2A>);(<U1E2D>,<U1E2C>);/ 
 
(<U1E2F>,<U1E2E>);(<U1E31>,<U1E30>);(<U1E33>,<U1E32>);(<U1E35>,<U1E34>);/ 
 
(<U1E37>,<U1E36>);(<U1E39>,<U1E38>);(<U1E3B>,<U1E3A>);(<U1E3D>,<U1E3C>);/ 
 
(<U1E3F>,<U1E3E>);(<U1E41>,<U1E40>);(<U1E43>,<U1E42>);(<U1E45>,<U1E44>);/ 
 
(<U1E47>,<U1E46>);(<U1E49>,<U1E48>);(<U1E4B>,<U1E4A>);(<U1E4D>,<U1E4C>);/ 
 
(<U1E4F>,<U1E4E>);(<U1E51>,<U1E50>);(<U1E53>,<U1E52>);(<U1E55>,<U1E54>);/ 
 
(<U1E57>,<U1E56>);(<U1E59>,<U1E58>);(<U1E5B>,<U1E5A>);(<U1E5D>,<U1E5C>);/ 
 
(<U1E5F>,<U1E5E>);(<U1E61>,<U1E60>);(<U1E63>,<U1E62>);(<U1E65>,<U1E64>);/ 
 
(<U1E67>,<U1E66>);(<U1E69>,<U1E68>);(<U1E6B>,<U1E6A>);(<U1E6D>,<U1E6C>);/ 
 
(<U1E6F>,<U1E6E>);(<U1E71>,<U1E70>);(<U1E73>,<U1E72>);(<U1E75>,<U1E74>);/ 
 
(<U1E77>,<U1E76>);(<U1E79>,<U1E78>);(<U1E7B>,<U1E7A>);(<U1E7D>,<U1E7C>);/ 
 
(<U1E7F>,<U1E7E>);(<U1E81>,<U1E80>);(<U1E83>,<U1E82>);(<U1E85>,<U1E84>);/ 
 
(<U1E87>,<U1E86>);(<U1E89>,<U1E88>);(<U1E8B>,<U1E8A>);(<U1E8D>,<U1E8C>);/ 
 
(<U1E8F>,<U1E8E>);(<U1E91>,<U1E90>);(<U1E93>,<U1E92>);(<U1E95>,<U1E94>);/ 
 
(<U1EA1>,<U1EA0>);(<U1EA3>,<U1EA2>);(<U1EA5>,<U1EA4>);(<U1EA7>,<U1EA6>);/ 
 
(<U1EA9>,<U1EA8>);(<U1EAB>,<U1EAA>);(<U1EAD>,<U1EAC>);(<U1EAF>,<U1EAE>);/ 
 
(<U1EB1>,<U1EB0>);(<U1EB3>,<U1EB2>);(<U1EB5>,<U1EB4>);(<U1EB7>,<U1EB6>);/ 
 
(<U1EB9>,<U1EB8>);(<U1EBB>,<U1EBA>);(<U1EBD>,<U1EBC>);(<U1EBF>,<U1EBE>);/ 
 
(<U1EC1>,<U1EC0>);(<U1EC3>,<U1EC2>);(<U1EC5>,<U1EC4>);(<U1EC7>,<U1EC6>);/ 
 
(<U1EC9>,<U1EC8>);(<U1ECB>,<U1ECA>);(<U1ECD>,<U1ECC>);(<U1ECF>,<U1ECE>);/ 
 
(<U1ED1>,<U1ED0>);(<U1ED3>,<U1ED2>);(<U1ED5>,<U1ED4>);(<U1ED7>,<U1ED6>);/ 
 
(<U1ED9>,<U1ED8>);(<U1EDB>,<U1EDA>);(<U1EDD>,<U1EDC>);(<U1EDF>,<U1EDE>);/ 
 
(<U1EE1>,<U1EE0>);(<U1EE3>,<U1EE2>);(<U1EE5>,<U1EE4>);(<U1EE7>,<U1EE6>);/ 
 
(<U1EE9>,<U1EE8>);(<U1EEB>,<U1EEA>);(<U1EED>,<U1EEC>);(<U1EEF>,<U1EEE>);/ 
 
(<U1EF1>,<U1EF0>);(<U1EF3>,<U1EF2>);(<U1EF5>,<U1EF4>);(<U1EF7>,<U1EF6>);/ 
 
(<U1EF9>,<U1EF8>);(<U1F00>,<U1F08>);(<U1F01>,<U1F09>);(<U1F02>,<U1F0A>);/ 
 
(<U1F03>,<U1F0B>);(<U1F04>,<U1F0C>);(<U1F05>,<U1F0D>);(<U1F06>,<U1F0E>);/ 
 
(<U1F07>,<U1F0F>);(<U1F10>,<U1F18>);(<U1F11>,<U1F19>);(<U1F12>,<U1F1A>);/ 
 
(<U1F13>,<U1F1B>);(<U1F14>,<U1F1C>);(<U1F15>,<U1F1D>);(<U1F20>,<U1F28>);/ 
 
(<U1F21>,<U1F29>);(<U1F22>,<U1F2A>);(<U1F23>,<U1F2B>);(<U1F24>,<U1F2C>);/ 
 
(<U1F25>,<U1F2D>);(<U1F26>,<U1F2E>);(<U1F27>,<U1F2F>);(<U1F30>,<U1F38>);/ 
 
(<U1F31>,<U1F39>);(<U1F32>,<U1F3A>);(<U1F33>,<U1F3B>);(<U1F34>,<U1F3C>);/ 
 
(<U1F35>,<U1F3D>);(<U1F36>,<U1F3E>);(<U1F37>,<U1F3F>);(<U1F40>,<U1F48>);/ 
 
(<U1F41>,<U1F49>);(<U1F42>,<U1F4A>);(<U1F43>,<U1F4B>);(<U1F44>,<U1F4C>);/ 
 
(<U1F45>,<U1F4D>);(<U1F51>,<U1F59>);(<U1F53>,<U1F5B>);(<U1F55>,<U1F5D>);/ 
 
(<U1F57>,<U1F5F>);(<U1F60>,<U1F68>);(<U1F61>,<U1F69>);(<U1F62>,<U1F6A>);/ 
 
(<U1F63>,<U1F6B>);(<U1F64>,<U1F6C>);(<U1F65>,<U1F6D>);(<U1F66>,<U1F6E>);/ 
 
(<U1F67>,<U1F6F>);(<U1F70>,<U1FBA>);(<U1F71>,<U1FBB>);(<U1F72>,<U1FC8>);/ 
 
(<U1F73>,<U1FC9>);(<U1F74>,<U1FCA>);(<U1F75>,<U1FCB>);(<U1F76>,<U1FDA>);/ 
 
(<U1F77>,<U1FDB>);(<U1F78>,<U1FF8>);(<U1F79>,<U1FF9>);(<U1F7A>,<U1FEA>);/ 
 
(<U1F7B>,<U1FEB>);(<U1F7C>,<U1FFA>);(<U1F7D>,<U1FFB>);(<U1F80>,<U1F88>);/ 
 
(<U1F81>,<U1F89>);(<U1F82>,<U1F8A>);(<U1F83>,<U1F8B>);(<U1F84>,<U1F8C>);/ 
 
(<U1F85>,<U1F8D>);(<U1F86>,<U1F8E>);(<U1F87>,<U1F8F>);(<U1F90>,<U1F98>);/ 
 
(<U1F91>,<U1F99>);(<U1F92>,<U1F9A>);(<U1F93>,<U1F9B>);(<U1F94>,<U1F9C>);/ 
 
(<U1F95>,<U1F9D>);(<U1F96>,<U1F9E>);(<U1F97>,<U1F9F>);(<U1FA0>,<U1FA8>);/ 
 
(<U1FA1>,<U1FA9>);(<U1FA2>,<U1FAA>);(<U1FA3>,<U1FAB>);(<U1FA4>,<U1FAC>);/ 
 
(<U1FA5>,<U1FAD>);(<U1FA6>,<U1FAE>);(<U1FA7>,<U1FAF>);(<U1FB0>,<U1FB8>);/ 
 
(<U1FB1>,<U1FB9>);(<U1FB3>,<U1FBC>);(<U1FC3>,<U1FCC>);(<U1FD0>,<U1FD8>);/ 
 
(<U1FD1>,<U1FD9>);(<U1FE0>,<U1FE8>);(<U1FE1>,<U1FE9>);(<U1FE5>,<U1FEC>);/ 
 
(<U1FF3>,<U1FFC>);(<UFF41>,<UFF21>);(<UFF42>,<UFF22>);(<UFF43>,<UFF23>);/ 
 
(<UFF44>,<UFF24>);(<UFF45>,<UFF25>);(<UFF46>,<UFF26>);(<UFF47>,<UFF27>);/ 
 
(<UFF48>,<UFF28>);(<UFF49>,<UFF29>);(<UFF4A>,<UFF2A>);(<UFF4B>,<UFF2B>);/ 
 
(<UFF4C>,<UFF2C>);(<UFF4D>,<UFF2D>);(<UFF4E>,<UFF2E>);(<UFF4F>,<UFF2F>);/ 
 
(<UFF50>,<UFF30>);(<UFF51>,<UFF31>);(<UFF52>,<UFF32>);(<UFF53>,<UFF33>);/ 
 
(<UFF54>,<UFF34>);(<UFF55>,<UFF35>);(<UFF56>,<UFF36>);(<UFF57>,<UFF37>);/ 
 (<UFF58>,<UFF38>);(<UFF59>,<UFF39>);(<UFF5A>,<UFF3A>) 
 tolower / 
 
(<U0041>,<U0061>);(<U0042>,<U0062>);(<U0043>,<U0063>);(<U0044>,<U0064>);/ 
 
(<U0045>,<U0065>);(<U0046>,<U0066>);(<U0047>,<U0067>);(<U0048>,<U0068>);/ 
 
(<U0049>,<U0069>);(<U004A>,<U006A>);(<U004B>,<U006B>);(<U004C>,<U006C>);/ 
 
(<U004D>,<U006D>);(<U004E>,<U006E>);(<U004F>,<U006F>);(<U0050>,<U0070>);/ 
 
(<U0051>,<U0071>);(<U0052>,<U0072>);(<U0053>,<U0073>);(<U0054>,<U0074>);/ 
 
(<U0055>,<U0075>);(<U0056>,<U0076>);(<U0057>,<U0077>);(<U0058>,<U0078>);/ 
 
(<U0059>,<U0079>);(<U005A>,<U007A>);(<U00C0>,<U00E0>);(<U00C1>,<U00E1>);/ 
 
(<U00C2>,<U00E2>);(<U00C3>,<U00E3>);(<U00C4>,<U00E4>);(<U00C5>,<U00E5>);/ 
 
(<U00C6>,<U00E6>);(<U00C7>,<U00E7>);(<U00C8>,<U00E8>);(<U00C9>,<U00E9>);/ 
 
(<U00CA>,<U00EA>);(<U00CB>,<U00EB>);(<U00CC>,<U00EC>);(<U00CD>,<U00ED>);/ 
 
(<U00CE>,<U00EE>);(<U00CF>,<U00EF>);(<U00D0>,<U00F0>);(<U00D1>,<U00F1>);/ 
 (<U00D2>,<U00F2>);(<U00D3>,<U00F3>);(<U00D4>,<U00F4>);(<U00D5>,<U00F5>);/ 
 
(<U00D6>,<U00F6>);(<U00D8>,<U00F8>);(<U00D9>,<U00F9>);(<U00DA>,<U00FA>);/ 
 
(<U00DB>,<U00FB>);(<U00DC>,<U00FC>);(<U00DD>,<U00FD>);(<U00DE>,<U00FE>);/ 
 
(<U0178>,<U00FF>);(<U0100>,<U0101>);(<U0102>,<U0103>);(<U0104>,<U0105>);/ 
 
(<U0106>,<U0107>);(<U0108>,<U0109>);(<U010A>,<U010B>);(<U010C>,<U010D>);/ 
 
(<U010E>,<U010F>);(<U0110>,<U0111>);(<U0112>,<U0113>);(<U0114>,<U0115>);/ 
 
(<U0116>,<U0117>);(<U0118>,<U0119>);(<U011A>,<U011B>);(<U011C>,<U011D>);/ 
 
(<U011E>,<U011F>);(<U0120>,<U0121>);(<U0122>,<U0123>);(<U0124>,<U0125>);/ 
 
(<U0126>,<U0127>);(<U0128>,<U0129>);(<U012A>,<U012B>);(<U012C>,<U012D>);/ 
 
(<U012E>,<U012F>);(<U0132>,<U0133>);(<U0134>,<U0135>);(<U0136>,<U0137>);/ 
 
(<U0139>,<U013A>);(<U013B>,<U013C>);(<U013D>,<U013E>);(<U013F>,<U0140>);/ 
 
(<U0141>,<U0142>);(<U0143>,<U0144>);(<U0145>,<U0146>);(<U0147>,<U0148>);/ 
 
(<U014A>,<U014B>);(<U014C>,<U014D>);(<U014E>,<U014F>);(<U0150>,<U0151>);/ 
 
(<U0152>,<U0153>);(<U0154>,<U0155>);(<U0156>,<U0157>);(<U0158>,<U0159>);/ 
 
(<U015A>,<U015B>);(<U015C>,<U015D>);(<U015E>,<U015F>);(<U0160>,<U0161>);/ 
 
(<U0162>,<U0163>);(<U0164>,<U0165>);(<U0166>,<U0167>);(<U0168>,<U0169>);/ 
 
(<U016A>,<U016B>);(<U016C>,<U016D>);(<U016E>,<U016F>);(<U0170>,<U0171>);/ 
 
(<U0172>,<U0173>);(<U0174>,<U0175>);(<U0176>,<U0177>);(<U0179>,<U017A>);/ 
 
(<U017B>,<U017C>);(<U017D>,<U017E>);(<U0182>,<U0183>);(<U0184>,<U0185>);/ 
 
(<U0187>,<U0188>);(<U0189>,<U0256>);(<U018B>,<U018C>);(<U018E>,<U01DD>);/
 
(<U0191>,<U0192>);(<U0198>,<U0199>);(<U01A0>,<U01A1>);(<U01A2>,<U01A3>);/ 
 
(<U01A4>,<U01A5>);(<U01A7>,<U01A8>);(<U01AC>,<U01AD>);(<U01AF>,<U01B0>);/ 
 
(<U01B3>,<U01B4>);(<U01B5>,<U01B6>);(<U01B8>,<U01B9>);(<U01BC>,<U01BD>);/ 
 
(<U01C6>,<U01C4>);(<U01C6>,<U01C5>);(<U01C4>,<U01C6>);(<U01C9>,<U01C7>);/ 
 
(<U01C9>,<U01C8>);(<U01C7>,<U01C9>);(<U01CC>,<U01CA>);(<U01CC>,<U01CB>);/ 
 
(<U01CA>,<U01CC>);(<U01CD>,<U01CE>);(<U01CF>,<U01D0>);(<U01D1>,<U01D2>);/ 
 
(<U01D3>,<U01D4>);(<U01D5>,<U01D6>);(<U01D7>,<U01D8>);(<U01D9>,<U01DA>);/ 
 
(<U01DB>,<U01DC>);(<U01DE>,<U01DF>);(<U01E0>,<U01E1>);(<U01E2>,<U01E3>);/ 
 
(<U01E4>,<U01E5>);(<U01E6>,<U01E7>);(<U01E8>,<U01E9>);(<U01EA>,<U01EB>);/ 
 
(<U01EC>,<U01ED>);(<U01EE>,<U01EF>);(<U01F3>,<U01F1>);(<U01F3>,<U01F2>);/ 
 
(<U01F1>,<U01F3>);(<U01F4>,<U01F5>);(<U01FA>,<U01FB>);(<U01FC>,<U01FD>);/ 
 
(<U01FE>,<U01FF>);(<U0200>,<U0201>);(<U0202>,<U0203>);(<U0204>,<U0205>);/ 
 
(<U0206>,<U0207>);(<U0208>,<U0209>);(<U020A>,<U020B>);(<U020C>,<U020D>);/ 
 
(<U020E>,<U020F>);(<U0210>,<U0211>);(<U0212>,<U0213>);(<U0214>,<U0215>);/ 
 
(<U0216>,<U0217>);(<U0181>,<U0253>);(<U0186>,<U0254>);(<U018A>,<U0257>);/ 
 
(<U018E>,<U0258>);(<U018F>,<U0259>);(<U0190>,<U025B>);(<U0193>,<U0260>);/ 
 
(<U0194>,<U0263>);(<U0197>,<U0268>);(<U0196>,<U0269>);(<U019C>,<U026F>);/ 
 
(<U019D>,<U0272>);(<U01A9>,<U0283>);(<U01AE>,<U0288>);(<U01B1>,<U028A>);/ 
 
(<U01B2>,<U028B>);(<U01B7>,<U0292>);(<U0386>,<U03AC>);(<U0388>,<U03AD>);/ 
 
(<U0389>,<U03AE>);(<U038A>,<U03AF>);(<U0391>,<U03B1>);(<U0392>,<U03B2>);/ 
 
(<U0393>,<U03B3>);(<U0394>,<U03B4>);(<U0395>,<U03B5>);(<U0396>,<U03B6>);/ 
 
(<U0397>,<U03B7>);(<U0398>,<U03B8>);(<U0399>,<U03B9>);(<U039A>,<U03BA>);/ 
 
(<U039B>,<U03BB>);(<U039C>,<U03BC>);(<U039D>,<U03BD>);(<U039E>,<U03BE>);/ 
 
(<U039F>,<U03BF>);(<U03A0>,<U03C0>);(<U03A1>,<U03C1>);(<U03A3>,<U03C3>);/ 
 
(<U03A4>,<U03C4>);(<U03A5>,<U03C5>);(<U03A6>,<U03C6>);(<U03A7>,<U03C7>);/ 
 
(<U03A8>,<U03C8>);(<U03A9>,<U03C9>);(<U03AA>,<U03CA>);(<U03AB>,<U03CB>);/ 
 
(<U038C>,<U03CC>);(<U038E>,<U03CD>);(<U038F>,<U03CE>);(<U0410>,<U0430>);/ 
 
(<U0411>,<U0431>);(<U0412>,<U0432>);(<U0413>,<U0433>);(<U0414>,<U0434>);/ 
 
(<U0415>,<U0435>);(<U0416>,<U0436>);(<U0417>,<U0437>);(<U0418>,<U0438>);/ 
 
(<U0419>,<U0439>);(<U041A>,<U043A>);(<U041B>,<U043B>);(<U041C>,<U043C>);/ 
 
(<U041D>,<U043D>);(<U041E>,<U043E>);(<U041F>,<U043F>);(<U0420>,<U0440>);/ 
 
(<U0421>,<U0441>);(<U0422>,<U0442>);(<U0423>,<U0443>);(<U0424>,<U0444>);/ 
 
(<U0425>,<U0445>);(<U0426>,<U0446>);(<U0427>,<U0447>);(<U0428>,<U0448>);/ 
 
(<U0429>,<U0449>);(<U042A>,<U044A>);(<U042B>,<U044B>);(<U042C>,<U044C>);/ 
 
(<U042D>,<U044D>);(<U042E>,<U044E>);(<U042F>,<U044F>);(<U0401>,<U0451>);/ 
 
(<U0402>,<U0452>);(<U0403>,<U0453>);(<U0404>,<U0454>);(<U0405>,<U0455>);/ 
 
(<U0406>,<U0456>);(<U0407>,<U0457>);(<U0408>,<U0458>);(<U0409>,<U0459>);/ 
 
(<U040A>,<U045A>);(<U040B>,<U045B>);(<U040C>,<U045C>);(<U040E>,<U045E>);/ 
 
(<U040F>,<U045F>);(<U0460>,<U0461>);(<U0462>,<U0463>);(<U0464>,<U0465>);/ 
 
(<U0466>,<U0467>);(<U0468>,<U0469>);(<U046A>,<U046B>);(<U046C>,<U046D>);/ 
 
(<U046E>,<U046F>);(<U0470>,<U0471>);(<U0472>,<U0473>);(<U0474>,<U0475>);/ 
 
(<U0476>,<U0477>);(<U0478>,<U0479>);(<U047A>,<U047B>);(<U047C>,<U047D>);/ 
 
(<U047E>,<U047F>);(<U0480>,<U0481>);(<U0490>,<U0491>);(<U0492>,<U0493>);/ 
 
(<U0494>,<U0495>);(<U0496>,<U0497>);(<U0498>,<U0499>);(<U049A>,<U049B>);/ 
 
(<U049C>,<U049D>);(<U049E>,<U049F>);(<U04A0>,<U04A1>);(<U04A2>,<U04A3>);/ 
 
(<U04A4>,<U04A5>);(<U04A6>,<U04A7>);(<U04A8>,<U04A9>);(<U04AA>,<U04AB>);/ 
 
(<U04AC>,<U04AD>);(<U04AE>,<U04AF>);(<U04B0>,<U04B1>);(<U04B2>,<U04B3>);/ 
 
(<U04B4>,<U04B5>);(<U04B6>,<U04B7>);(<U04B8>,<U04B9>);(<U04BA>,<U04BB>);/ 
 
(<U04BC>,<U04BD>);(<U04BE>,<U04BF>);(<U04C1>,<U04C2>);(<U04C3>,<U04C4>);/ 
 
(<U04C7>,<U04C8>);(<U04CB>,<U04CC>);(<U04D0>,<U04D1>);(<U04D2>,<U04D3>);/ 
 
(<U04D4>,<U04D5>);(<U04D6>,<U04D7>);(<U04D8>,<U04D9>);(<U04DA>,<U04DB>);/ 
 
(<U04DC>,<U04DD>);(<U04DE>,<U04DF>);(<U04E0>,<U04E1>);(<U04E2>,<U04E3>);/ 
 
(<U04E4>,<U04E5>);(<U04E6>,<U04E7>);(<U04E8>,<U04E9>);(<U04EA>,<U04EB>);/ 
 
(<U04EE>,<U04EF>);(<U04F0>,<U04F1>);(<U04F2>,<U04F3>);(<U04F4>,<U04F5>);/ 
 
(<U04F8>,<U04F9>);(<U0531>,<U0561>);(<U0532>,<U0562>);(<U0533>,<U0563>);/ 
 
(<U0534>,<U0564>);(<U0535>,<U0565>);(<U0536>,<U0566>);(<U0537>,<U0567>);/ 
 
(<U0538>,<U0568>);(<U0539>,<U0569>);(<U053A>,<U056A>);(<U053B>,<U056B>);/ 
 
(<U053C>,<U056C>);(<U053D>,<U056D>);(<U053E>,<U056E>);(<U053F>,<U056F>);/ 
 
(<U0540>,<U0570>);(<U0541>,<U0571>);(<U0542>,<U0572>);(<U0543>,<U0573>);/ 
 (<U0544>,<U0574>);(<U0545>,<U0575>);(<U0546>,<U0576>);(<U0547>,<U0577>);/ 
 
(<U0548>,<U0578>);(<U0549>,<U0579>);(<U054A>,<U057A>);(<U054B>,<U057B>);/ 
 
(<U054C>,<U057C>);(<U054D>,<U057D>);(<U054E>,<U057E>);(<U054F>,<U057F>);/ 
 
(<U0550>,<U0580>);(<U0551>,<U0581>);(<U0552>,<U0582>);(<U0553>,<U0583>);/ 
 
(<U0554>,<U0584>);(<U0555>,<U0585>);(<U0556>,<U0586>);(<U1E00>,<U1E01>);/ 
 
(<U1E02>,<U1E03>);(<U1E04>,<U1E05>);(<U1E06>,<U1E07>);(<U1E08>,<U1E09>);/ 
 
(<U1E0A>,<U1E0B>);(<U1E0C>,<U1E0D>);(<U1E0E>,<U1E0F>);(<U1E10>,<U1E11>);/ 
 
(<U1E12>,<U1E13>);(<U1E14>,<U1E15>);(<U1E16>,<U1E17>);(<U1E18>,<U1E19>);/ 
 
(<U1E1A>,<U1E1B>);(<U1E1C>,<U1E1D>);(<U1E1E>,<U1E1F>);(<U1E20>,<U1E21>);/ 
 
(<U1E22>,<U1E23>);(<U1E24>,<U1E25>);(<U1E26>,<U1E27>);(<U1E28>,<U1E29>);/ 
 
(<U1E2A>,<U1E2B>);(<U1E2C>,<U1E2D>);(<U1E2E>,<U1E2F>);(<U1E30>,<U1E31>);/ 
 
(<U1E32>,<U1E33>);(<U1E34>,<U1E35>);(<U1E36>,<U1E37>);(<U1E38>,<U1E39>);/ 
 
(<U1E3A>,<U1E3B>);(<U1E3C>,<U1E3D>);(<U1E3E>,<U1E3F>);(<U1E40>,<U1E41>);/ 
 
(<U1E42>,<U1E43>);(<U1E44>,<U1E45>);(<U1E46>,<U1E47>);(<U1E48>,<U1E49>);/ 
 
(<U1E4A>,<U1E4B>);(<U1E4C>,<U1E4D>);(<U1E4E>,<U1E4F>);(<U1E50>,<U1E51>);/ 
 
(<U1E52>,<U1E53>);(<U1E54>,<U1E55>);(<U1E56>,<U1E57>);(<U1E58>,<U1E59>);/ 
 
(<U1E5A>,<U1E5B>);(<U1E5C>,<U1E5D>);(<U1E5E>,<U1E5F>);(<U1E60>,<U1E61>);/ 
 
(<U1E62>,<U1E63>);(<U1E64>,<U1E65>);(<U1E66>,<U1E67>);(<U1E68>,<U1E69>);/ 
 
(<U1E6A>,<U1E6B>);(<U1E6C>,<U1E6D>);(<U1E6E>,<U1E6F>);(<U1E70>,<U1E71>);/ 
 
(<U1E72>,<U1E73>);(<U1E74>,<U1E75>);(<U1E76>,<U1E77>);(<U1E78>,<U1E79>);/ 
 
(<U1E7A>,<U1E7B>);(<U1E7C>,<U1E7D>);(<U1E7E>,<U1E7F>);(<U1E80>,<U1E81>);/ 
 
(<U1E82>,<U1E83>);(<U1E84>,<U1E85>);(<U1E86>,<U1E87>);(<U1E88>,<U1E89>);/ 
 
(<U1E8A>,<U1E8B>);(<U1E8C>,<U1E8D>);(<U1E8E>,<U1E8F>);(<U1E90>,<U1E91>);/ 
 
(<U1E92>,<U1E93>);(<U1E94>,<U1E95>);(<U1EA0>,<U1EA1>);(<U1EA2>,<U1EA3>);/ 
 
(<U1EA4>,<U1EA5>);(<U1EA6>,<U1EA7>);(<U1EA8>,<U1EA9>);(<U1EAA>,<U1EAB>);/ 
 
(<U1EAC>,<U1EAD>);(<U1EAE>,<U1EAF>);(<U1EB0>,<U1EB1>);(<U1EB2>,<U1EB3>);/ 
 
(<U1EB4>,<U1EB5>);(<U1EB6>,<U1EB7>);(<U1EB8>,<U1EB9>);(<U1EBA>,<U1EBB>);/ 
 
(<U1EBC>,<U1EBD>);(<U1EBE>,<U1EBF>);(<U1EC0>,<U1EC1>);(<U1EC2>,<U1EC3>);/ 
 
(<U1EC4>,<U1EC5>);(<U1EC6>,<U1EC7>);(<U1EC8>,<U1EC9>);(<U1ECA>,<U1ECB>);/ 
 
(<U1ECC>,<U1ECD>);(<U1ECE>,<U1ECF>);(<U1ED0>,<U1ED1>);(<U1ED2>,<U1ED3>);/ 
 
(<U1ED4>,<U1ED5>);(<U1ED6>,<U1ED7>);(<U1ED8>,<U1ED9>);(<U1EDA>,<U1EDB>);/ 
 
(<U1EDC>,<U1EDD>);(<U1EDE>,<U1EDF>);(<U1EE0>,<U1EE1>);(<U1EE2>,<U1EE3>);/ 
 
(<U1EE4>,<U1EE5>);(<U1EE6>,<U1EE7>);(<U1EE8>,<U1EE9>);(<U1EEA>,<U1EEB>);/ 
 
(<U1EEC>,<U1EED>);(<U1EEE>,<U1EEF>);(<U1EF0>,<U1EF1>);(<U1EF2>,<U1EF3>);/ 
 
(<U1EF4>,<U1EF5>);(<U1EF6>,<U1EF7>);(<U1EF8>,<U1EF9>);(<U1F08>,<U1F00>);/ 
 
(<U1F09>,<U1F01>);(<U1F0A>,<U1F02>);(<U1F0B>,<U1F03>);(<U1F0C>,<U1F04>);/ 
 
(<U1F0D>,<U1F05>);(<U1F0E>,<U1F06>);(<U1F0F>,<U1F07>);(<U1F18>,<U1F10>);/ 
 
(<U1F19>,<U1F11>);(<U1F1A>,<U1F12>);(<U1F1B>,<U1F13>);(<U1F1C>,<U1F14>);/ 
 
(<U1F1D>,<U1F15>);(<U1F28>,<U1F20>);(<U1F29>,<U1F21>);(<U1F2A>,<U1F22>);/ 
 
(<U1F2B>,<U1F23>);(<U1F2C>,<U1F24>);(<U1F2D>,<U1F25>);(<U1F2E>,<U1F26>);/ 
 
(<U1F2F>,<U1F27>);(<U1F38>,<U1F30>);(<U1F39>,<U1F31>);(<U1F3A>,<U1F32>);/ 
 
(<U1F3B>,<U1F33>);(<U1F3C>,<U1F34>);(<U1F3D>,<U1F35>);(<U1F3E>,<U1F36>);/ 
 
(<U1F3F>,<U1F37>);(<U1F48>,<U1F40>);(<U1F49>,<U1F41>);(<U1F4A>,<U1F42>);/ 
 
(<U1F4B>,<U1F43>);(<U1F4C>,<U1F44>);(<U1F4D>,<U1F45>);(<U1F59>,<U1F51>);/ 
 
(<U1F5B>,<U1F53>);(<U1F5D>,<U1F55>);(<U1F5F>,<U1F57>);(<U1F68>,<U1F60>);/ 
 
(<U1F69>,<U1F61>);(<U1F6A>,<U1F62>);(<U1F6B>,<U1F63>);(<U1F6C>,<U1F64>);/ 
 
(<U1F6D>,<U1F65>);(<U1F6E>,<U1F66>);(<U1F6F>,<U1F67>);(<U1FBA>,<U1F70>);/ 
 
(<U1FBB>,<U1F71>);(<U1FC8>,<U1F72>);(<U1FC9>,<U1F73>);(<U1FCA>,<U1F74>);/ 
 
(<U1FCB>,<U1F75>);(<U1FDA>,<U1F76>);(<U1FDB>,<U1F77>);(<U1FF8>,<U1F78>);/ 
 
(<U1FF9>,<U1F79>);(<U1FEA>,<U1F7A>);(<U1FEB>,<U1F7B>);(<U1FFA>,<U1F7C>);/ 
 
(<U1FFB>,<U1F7D>);(<U1F88>,<U1F80>);(<U1F89>,<U1F81>);(<U1F8A>,<U1F82>);/ 
 
(<U1F8B>,<U1F83>);(<U1F8C>,<U1F84>);(<U1F8D>,<U1F85>);(<U1F8E>,<U1F86>);/ 
 
(<U1F8F>,<U1F87>);(<U1F98>,<U1F90>);(<U1F99>,<U1F91>);(<U1F9A>,<U1F92>);/ 
 
(<U1F9B>,<U1F93>);(<U1F9C>,<U1F94>);(<U1F9D>,<U1F95>);(<U1F9E>,<U1F96>);/ 
 
(<U1F9F>,<U1F97>);(<U1FA8>,<U1FA0>);(<U1FA9>,<U1FA1>);(<U1FAA>,<U1FA2>);/ 
 
(<U1FAB>,<U1FA3>);(<U1FAC>,<U1FA4>);(<U1FAD>,<U1FA5>);(<U1FAE>,<U1FA6>);/ 
 
(<U1FAF>,<U1FA7>);(<U1FB8>,<U1FB0>);(<U1FB9>,<U1FB1>);(<U1FBC>,<U1FB3>);/ 
 
(<U1FCC>,<U1FC3>);(<U1FD8>,<U1FD0>);(<U1FD9>,<U1FD1>);(<U1FE8>,<U1FE0>);/ 
 
(<U1FE9>,<U1FE1>);(<U1FEC>,<U1FE5>);(<U1FFC>,<U1FF3>);(<UFF21>,<UFF41>);/ 
 
(<UFF22>,<UFF42>);(<UFF23>,<UFF43>);(<UFF24>,<UFF44>);(<UFF25>,<UFF45>);/ 
 
(<UFF26>,<UFF46>);(<UFF27>,<UFF47>);(<UFF28>,<UFF48>);(<UFF29>,<UFF49>);/ 
 
(<UFF2A>,<UFF4A>);(<UFF2B>,<UFF4B>);(<UFF2C>,<UFF4C>);(<UFF2D>,<UFF4D>);/ 
 
(<UFF2E>,<UFF4E>);(<UFF2F>,<UFF4F>);(<UFF30>,<UFF50>);(<UFF31>,<UFF51>);/ 
 
(<UFF32>,<UFF52>);(<UFF33>,<UFF53>);(<UFF34>,<UFF54>);(<UFF35>,<UFF55>);/ 
 
(<UFF36>,<UFF56>);(<UFF37>,<UFF57>);(<UFF38>,<UFF58>);(<UFF39>,<UFF59>);/ 
 (<UFF3A>,<UFF5A>) 
 % 
 right_to_left / 
 <U0591>..<U05A1>;<U05A3>..<U05AF>;<U05B0>..<U05B9>;/ 
 
<U05BB>..<U05C4>;<U05D0>..<U05EA>;<U05F0>..<U05F4>;<U060C>;<U061B>;<U061F>;/
 <U0621>..<U063A>;<U0640>..<U0652>;<U066D>;<U0670>..<U06B7>;/ 
 <U06BA>..<U06BE>;<U06C0>..<U06CE>;<U06D0>..<U06ED>;<U06F0>..<U06F9>;/ 
 <U200F> 
 % 
 class          "num_terminator";<:>;<space> 
 class          "num_separator";<:>;<space> 
 class          "direction_control";<U200E>;<U200F>;<U202A>..<U202E> 
 class          "sym_swap_layout";<U206A>;<U206B> 
 class          "char_shape_selector";<U206C>;<U206D> 
 class          "num_shape_selector";<U206E>;<U206F> 
 class          "non_spacing"; / 
 <U0300>..<U036F>; <U20D0>..<U20FF>; <UFE20>..<UFE2F>;/ 
 <U0483>..<U0486>;<U0591>..<U05A1>;<U05A3>..<U05B9>;/ 
 
<U05BB>..<U05BD>;<U05BF>;<U05C1>;<U05C2>;<U05C4>;<U064B>..<U0652>;<U0670>;/ 
 
<U06D7>..<U06E4>;<U06E7>;<U06E8>;<U06EA>..<U06ED>;<U0901>..<U0903>;<U093C>;/
 
<U093E>..<U094D>;<U0951>..<U0954>;<U0962>;<U0963>;<U0981>..<U0983>;<U09BC>;/
 
<U09BE>..<U09C4>;<U09C7>;<U09C8>;<U09CB>..<U09CD>;<U09D7>;<U09E2>;<U09E3>;/ 
 <U0A02>;<U0A3C>;<U0A3E>..<U0A42>;<U0A47>;<U0A48>;<U0A4B>..<U0A4D>;/ 
 
<U0A70>;<U0A71>;<U0A81>..<U0A83>;<U0ABC>;<U0ABE>..<U0AC5>;<U0AC7>..<U0AC9>;/
 
<U0ACB>..<U0ACD>;<U0B01>..<U0B03>;<U0B3C>;<U0B3E>..<U0B43>;<U0B47>;<U0B48>;/
 <U0B4B>..<U0B4D>;<U0B56>;<U0B57>;<U0B82>;<U0B83>;<U0BBE>..<U0BC2>;/ 
 
<U0BC6>..<U0BC8>;<U0BCA>..<U0BCD>;<U0BD7>;<U0C01>..<U0C03>;<U0C3E>..<U0C44>;/
 <U0C46>..<U0C48>;<U0C4A>..<U0C4D>;<U0C55>;<U0C56>;<U0C82>;<U0C83>;/ 
 <U0CBE>..<U0CC4>;<U0CC6>..<U0CC8>;<U0CCA>..<U0CCD>;<U0CD5>;<U0CD6>;/ 
 
<U0D02>;<U0D03>;<U0D3E>..<U0D43>;<U0D46>..<U0D48>;<U0D4A>..<U0D4D>;<U0D57>;/
 <U0E31>;<U0E34>..<U0E3A>;<U0E47>..<U0E4E>;<U0EB1>;<U0EB4>..<U0EB9>;/ 
 
<U0EBB>;<U0EBC>;<U0EC8>..<U0ECD>;<U0F18>;<U0F19>;<U0F35>;<U0F37>;<U0F39>;/ 
 
<U0F3E>;<U0F3F>;<U0F71>..<U0F84>;<U0F86>..<U0F89>;<U0F8B>;<U0F90>..<U0F95>;/
 <U0F97>;<U0F99>..<U0FAD>;<U0FB1>..<U0FB7>;<U0FB9>;<U302A>..<U302F>;/ 
 <U3099>;<U309A>;<UFB1E> 
 % 
 class          "non_spacing_level3";      / 
 <U0300>..<U036F>;<U20D0>..<U20FF>;<U1100>..<U11FF>;<UFE20>..<UFE2F>;/ 
 <U0483>..<U0486>;<U0591>..<U05A1>;<U05A3>..<U05AE>;<U05C4>;/ 
 <U05AF>;<U093C>;<U0953>;<U0954>;<U09BC>;<U09D7>;<U0A3C>;/ 
 
<U0A70>;<U0A71>;<U0ABC>;<U0B3C>;<U0B56>;<U0B57>;<U0BD7>;<U0C55>;<U0C56>;/ 
 <U0CD5>;<U0CD6>;<U0D57>;<U0F39>;<U302A>..<U302F>;<U3099>;<U309A> 
 % 
 map "tosymmetric"; / 
 (<U0028>,<U0029>); (<U003C>,<U003E>); (<U005B>,<U005D>); 
(<U007B>,<U007D>); 
 (<U2045>,<U2046>); (<U207D>,<U207E>); (<U208D>,<U208E>); 
(<U2201>,<U2202>); 
 (<U2203>,<U2204>); (<U2208>,<U2209>); (<U220A>,<U220B>); 
(<U220C>,<U220D>); 
 (<U2211>,<U2215>); (<U2216>,<U221A>); (<U221B>,<U221C>); 
(<U221D>,<U221F>); 
 (<U2220>,<U2221>); (<U2222>,<U2224>); (<U2226>,<U222B>); 
(<U222C>,<U222D>); 
 (<U222E>,<U222F>); (<U2230>,<U2231>); (<U2232>,<U2233>); 
(<U2239>,<U223B>); 
 (<U223C>,<U223D>); (<U223E>,<U223F>); (<U2240>,<U2241>); 
(<U2242>,<U2243>); 
 (<U2244>,<U2245>); (<U2246>,<U2247>); (<U2248>,<U2249>); 
(<U224A>,<U224B>); 
 (<U224C>,<U2252>); (<U2253>,<U2254>); (<U2255>,<U225F>); 
(<U2260>,<U2262>); 
 (<U2264>,<U2265>); (<U2266>,<U2267>); (<U2268>,<U2269>); 
(<U226A>,<U226B>); 
 (<U226E>,<U226F>); (<U2270>,<U2271>); (<U2272>,<U2273>); 
(<U2274>,<U2275>); 
 (<U2276>,<U2277>); (<U2278>,<U2279>); (<U227A>,<U227B>); 
(<U227C>,<U227D>); 
 (<U227E>,<U227F>); (<U2280>,<U2281>); (<U2282>,<U2283>); 
(<U2284>,<U2285>); 
 (<U2286>,<U2287>); (<U2288>,<U2289>); (<U228A>,<U228B>); 
(<U228C>,<U228F>); 
 (<U2290>,<U2291>); (<U2292>,<U2298>); (<U22A2>,<U22A3>); 
(<U22A6>,<U22A7>); 
 (<U22A8>,<U22A9>); (<U22AA>,<U22AB>); (<U22AC>,<U22AD>); 
(<U22AE>,<U22AF>); 
 (<U22B0>,<U22B1>); (<U22B2>,<U22B3>); (<U22B4>,<U22B5>); 
(<U22B6>,<U22B7>); 
 (<U22B8>,<U22BE>); (<U22BF>,<U22C9>); (<U22CA>,<U22CB>); 
(<U22CC>,<U22CD>); 
 (<U22D0>,<U22D1>); (<U22D6>,<U22D7>); (<U22D8>,<U22D9>); 
(<U22DA>,<U22DB>); 
 (<U22DC>,<U22DD>); (<U22DE>,<U22DF>); (<U22E0>,<U22E1>); 
(<U22E2>,<U22E3>); 
 (<U22E4>,<U22E5>); (<U22E6>,<U22E7>); (<U22E8>,<U22E9>); 
(<U22EA>,<U22EB>); 
 (<U22EC>,<U22ED>); (<U22F0>,<U22F1>); (<U2308>,<U2309>); 
(<U230A>,<U230B>); 
 (<U2320>,<U2321>); (<U2329>,<U232A>); (<U3008>,<U3009>); 
(<U300A>,<U300B>); 
 (<U300C>,<U300D>); (<U300E>,<U300F>); (<U3010>,<U3011>); 
(<U3014>,<U3015>); 
 (<U3016>,<U3017>); (<U3018>,<U3019>); (<U301A>,<U301B>) 
 
 END LC_CTYPE 
 
 
4.3   LC_COLLATE 
 
A collation sequence definition defines the relative order 
between collating elements (characters and multicharacter 
collating elements) in the FDCC-set. This order is expressed 
in terms of collation values; i.e., by assigning each element 
one or more collation values (also known as collation 
weights). This does not imply that applications shall assign 
such values, but that ordering of strings using the resultant 
collation definition in the FDCC-set shall behave as if such 
assignment is done and used in the collation process. The 
collation sequence definition shall be used by regular 
expressions, pattern matching, and sorting. The following 
capabilities are provided: 
 
(1)   Multicharacter collating elements. Specification of 
 multicharacter collating elements (i.e., sequences of two 
 or more characters to be collated as an entity). 
(2)   User-defined ordering of collating elements. Each 
 collating element shall be assigned a collation value 
 defining its order in the character (or basic) collation 
 sequence. This ordering is used by regular expressions 
 and pattern matching and, unless collation weights are 
 explicitly specified, also as the collation weight to be 
 used in sorting. 
(3)   Multiple weights and equivalence classes. Collating 
 elements can be assigned one or more (up to the limit 
 (COLL_WEIGHTS_MAX)) collating weights for use in sorting. 
 The first weight is hereafter referred to as the primary 
 weight. 
(4)   One-to-many mapping. A single character is mapped into a
 string of collating elements.
(5)   Many-to-Many substitution. A string of one or more 
 characters is substituted by another string (or an empty 
 string, i.e., the character or characters shall be 
 ignored for collation purposes). 
(6)   Equivalence class definition. Two or more collating 
 elements have the same collation value (primary weight). 
(7)   Ordering by weights. When two strings are compared to 
 determine their relative order, the two strings are first 
 broken up into a series of collating elements, and each 
 successive pair of elements is compared according to the
 relative primary weights for the elements. If equal, and 
 more than one weight has been assigned, then the pairs of 
 collating elements are recompared according to the 
 relative subsequent weights, until either a pair of 
 collating elements compare unequal or the weights are 
 exhausted. 
(8)   Per script ordering rules. Some cultures order some 
 scripts in a different direction than other scripts, for 
 example in French cultures the Latin script is ordered 
 backwards on the level handling accents, while the 
 Cyrillic script may be ordered forwards. 
(9)   Easy reordering of characters. ISO/IEC 14651 has a
 template for collation specification that, with just a few
 modifications, can be made culturally correct for a specific
 culture. The "reorder-after" keyword gives a convenient way
 to modify an FDCC-set template.
(10)  Easy reordering of scripts. The template in ISO/IEC 14651
 gives an ordering of the scripts that may not be
 acceptable in certain cultures. The keyword
 "reorder-script-after" gives a convenient way to modify
 the order of scripts in an FDCC-set template.
 
The following keywords shall be defined in a collation 
sequence definition. Some of them are described in detail in 
the following subclauses. 
 
copy            Specify the name of an existing FDCC-set to be
 used as the source for the definition of this
 category. If this keyword is specified, only the
 "reorder-after", "reorder-end",
 "reorder-script-after" and "reorder-script-end"
 keywords may also be specified. The FDCC-set shall
 be copied in source form.
coll_weight_max          Define as a decimal number the number of
 collation levels that an interpreting
 system needs to support. This value is
 elsewhere referred to as the COLL_WEIGHTS_MAX
 limit. The minimum value is 7.
script          Define a script symbol representing a set of 
 collation order statements. This keyword is optional. 
collating-element        Define a collating-element symbol 
 representing a multicharacter collating 
 element. This keyword is optional. 
collating-symbol         Define a collating symbol for use in 
 collation order statements. This keyword 
 is optional. 
order_start     Define collation rules. This statement is 
 followed by one or more collation order 
 statements, assigning character collation values 
 and collation weights to collating elements. 
order_end       Specify the end of the collation-order 
 statements. 
reorder-after     Redefine collating rules. Specify after which
 collating element the redefinition of
 collation order shall take place. This
 statement is followed by one or more collation
 order statements, reassigning character
 collation values and collation weights to
 collating elements.
reorder-end     Specify the end of the "reorder-after" collating 
 order statements. 
reorder-script-after     Redefine the order of scripts. This 
 statement is followed by one or more 
 script symbols, reassigning character 
 collation values and collation weights to 
 collating elements. 
reorder-script-end       Specify the end of the "reorder-script-after"
 script order statements.
 
Toggling keywords: 
 
define          defines a toggle 
undef           undefines a toggle 
ifdef           tests a toggle, and if defined uses the following 
 statements 
ifndef          tests a toggle, and if undefined uses the 
 following statements 
else            uses the following statements if no preceding 
 toggling statements have been used 
elif            tests a toggle, and uses the following statements 
 if no preceding toggling statements have been 
 used, and the toggle is defined 
endif           terminates set of toggling statements 
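
The toggling keywords can be pictured as a small conditional
preprocessor. The following Python sketch is illustrative only:
the function name apply_toggles and the assumed statement form
(a keyword followed by a toggle name on one line) are
assumptions and are not defined by this specification.

```python
def apply_toggles(lines):
    """Return the statements that remain after evaluating toggling keywords."""
    defined = set()
    # One [enclosing_active, branch_used, active] entry per open ifdef/ifndef.
    stack = []
    out = []
    for line in lines:
        words = line.split()
        key = words[0] if words else ""
        active = stack[-1][2] if stack else True
        if key in ("ifdef", "ifndef"):
            test = (words[1] in defined) == (key == "ifdef")
            use = active and test
            stack.append([active, use, use])
        elif key == "elif":
            frame = stack[-1]
            # Used only if no preceding branch was used and the toggle is defined.
            use = frame[0] and not frame[1] and words[1] in defined
            frame[1] = frame[1] or use
            frame[2] = use
        elif key == "else":
            frame = stack[-1]
            # Used only if no preceding branch was used.
            frame[2] = frame[0] and not frame[1]
            frame[1] = True
        elif key == "endif":
            stack.pop()
        elif not active:
            continue
        elif key == "define":
            defined.add(words[1])
        elif key == "undef":
            defined.discard(words[1])
        else:
            out.append(line)
    return out
```

For example, with a hypothetical toggle UPPER_FIRST defined,
the input "define UPPER_FIRST", "ifdef UPPER_FIRST", "A",
"else", "a", "endif" keeps only the statement "A".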
 
4.3.1   Collation statements 
 
The "order_start" and "reorder-after" keywords shall be
followed by collating statements. The syntax for the collating
statements is

 "%s %s;%s;...;%s\n",<collating-element>,<weight>,<weight>,...
 
Each collating-element shall consist of either a character (in 
any of the forms defined in 4.1.1), a <collating-element>, a 
<collating-symbol>, an ellipsis, or the special symbol 
UNDEFINED. The order in which collating elements are specified 
determines the character collation sequence, such that each 
collating element shall compare less than the elements 
following it. The NUL character shall compare lower than any 
other character. 
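
As an illustration of the statement syntax, the following
Python sketch splits a collating statement into its collating
element and its weights, and records the position of each
element in the character collation sequence. The function
names and the symbols <NONE> and <SMALL> used in the usage
note are hypothetical; the sketch also assumes that weights
contain no embedded semicolons or white space.

```python
def parse_collating_statement(line):
    """Split one collating statement into (collating_element, [weights])."""
    fields = line.split(None, 1)
    if not fields:
        raise ValueError("empty collating statement")
    element = fields[0]
    # Weights are optional; when present they are separated by semicolons.
    weights = fields[1].strip().split(";") if len(fields) > 1 else []
    return element, weights


def collation_positions(statements):
    """Map each collating element to its position in the specified order."""
    return {parse_collating_statement(s)[0]: i for i, s in enumerate(statements)}
```

Given the statement "<a> <a>;<NONE>;<SMALL>", the first
function returns the element <a> with three weights; an
element that appears earlier in the list receives a lower
position and therefore compares lower.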
 
A <collating-element> shall be used to specify multicharacter 
collating elements, and indicates that the character sequence 
specified via the <collating-element> is to be collated as a 
unit and in the relative order specified by its place. 
 
A <collating-symbol> shall be used to define a position in the 
relative order for use in weights. 
 
The ellipsis symbol ("...") specifies that a sequence of 
characters shall collate according to their encoded character 
values. It shall be interpreted as indicating that all 
characters with a coded character set value higher than the 
value of the character in the preceding line, and lower than 
the coded character set value for the character in the 
following line, in the current coded character set, shall be 
placed in the character collation order between the previous 
and the following character in ascending order according to 
their coded character set values. An initial ellipsis shall be 
interpreted as if the preceding line specified the NUL 
character, and a trailing ellipsis as if the following line 
specified the highest coded character set value in the current 
coded character set. An ellipsis shall be treated as invalid 
if the preceding or following lines do not specify characters 
in the current coded character set. The use of the ellipsis 
symbol ties the definition to a specific coded character set 
and may preclude the definition from being portable between 
applications. Symbolic ellipses may be used in the same way as
the ellipsis symbol, but they generate symbolic character names
and thus have a better chance of being portable between
applications.
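
The interpretation of the ellipsis symbol ("...") can be
sketched in Python as follows. The sketch is illustrative
only: characters are represented by their integer coded
character set values, the function name expand_ellipsis is
hypothetical, and the default highest value of 0xFF merely
assumes an 8-bit coded character set.

```python
def expand_ellipsis(order, highest=0xFF):
    """Replace each "..." by the coded values lying between its neighbours."""
    items = list(order)
    if items and items[0] == "...":
        items.insert(0, 0)         # initial ellipsis: as if NUL preceded it
    if items and items[-1] == "...":
        items.append(highest)      # trailing ellipsis: as if the highest value followed
    result = []
    for i, item in enumerate(items):
        if item == "...":
            low, high = items[i - 1], items[i + 1]
            # An ellipsis is invalid unless both neighbours are coded characters.
            if not (isinstance(low, int) and isinstance(high, int)):
                raise ValueError("ellipsis must be bounded by coded characters")
            result.extend(range(low + 1, high))
        else:
            result.append(item)
    return result
```

Under these assumptions, the order 0x41, "...", 0x45 expands
to the five values 0x41 through 0x45 in ascending order.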
 
The symbolic ellipses (".." or "....") specify a sequence of
collating statements. They shall be interpreted as indicating
that all characters with symbolic names higher than the
symbolic name of the character in the preceding line, and
lower than the symbolic name of the character in the following
line, shall be placed in the character collation order between
the previous and the following character in ascending order.
 
The symbol UNDEFINED shall be interpreted as including all 
coded character set values not specified explicitly or via the 
ellipsis or one of the symbolic ellipsis symbols. Such
characters shall be inserted in the character collation order 
at the point indicated by the symbol, and in ascending order 
according to their coded character set values. If no UNDEFINED 
symbol is specified, and the current coded character set 
contains characters not specified in this clause, the utility 
shall issue a warning message and place such characters at the 
end of the character collation order. 
 
The optional operands for each collation-element shall be used 
to define the primary, secondary, or subsequent weights for 
the collating element. The first operand specifies the 
relative primary weight, the second the relative secondary 
weight, and so on. Two or more collation-elements can be 
assigned the same weight; they belong to the same equivalence 
class if they have the same primary weight. Collation shall 
behave as if, for each weight level, IGNOREd elements are 
removed. Then each successive pair of elements shall be 
compared according to the relative weights for the elements. 
If the two strings compare equal, the process shall be 
repeated for the next weight level, up to the limit
"COLL_WEIGHTS_MAX".
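
The level-by-level comparison described above can be sketched
informally as follows. This is a minimal Python sketch that
assumes forward scanning at every level; the names compare,
weights and IGNORE are illustrative and are not defined by
this specification.

```python
IGNORE = None  # illustrative marker for the IGNORE weight


def compare(a, b, weights, levels):
    """Compare two sequences of collating elements level by level.

    weights maps each element to a tuple with one entry per level; an
    entry is an integer relative weight or IGNORE.
    Returns -1, 0 or 1.
    """
    for level in range(levels):
        # Behave as if IGNOREd elements were removed at this level.
        wa = [weights[e][level] for e in a if weights[e][level] is not IGNORE]
        wb = [weights[e][level] for e in b if weights[e][level] is not IGNORE]
        if wa != wb:
            return -1 if wa < wb else 1
        # Equal at this level: repeat with the next weight level.
    return 0
```

With weights such as {"a": (1, 1), "A": (1, 2)}, the strings
"a" and "A" compare equal at the first level and are only
separated at the second.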
 
Weights shall be expressed as characters (in any of the forms 
specified here), <collating-symbol>s, <collating-element>s, an 
ellipsis, or the special symbol IGNORE. A single character, a 
<collating-symbol>, or a <collating-element> shall represent 
the relative order in the character collating sequence of the 
character or symbol, rather than the character or characters 
themselves. 
 
One-to-many mapping is indicated by specifying two or more 
concatenated characters or symbolic names. Thus, if the 
character <ss> is given the string <s><s> as a weight, 
comparisons shall be performed as if all occurrences of the 
character <ss> are replaced by <s><s>. If it is desirable to 
define <ss> and <s><s> as an equivalence class, then a 
collating-element must be defined for the string "ss", as in 
the example below. 
 
All characters specified via an ellipsis shall by default be 
assigned unique weights, equal to the relative order of 
characters. Characters specified via an explicit or implicit 
UNDEFINED special symbol shall by default be assigned the same 
primary weight (i.e., belong to the same equivalence class). 
An ellipsis symbol as a weight shall be interpreted to mean 
that each character in the sequence shall have unique weights, 
equal to the relative order of their character in the 
character collation sequence. Secondary and subsequent weights 
have unique values. The use of the ellipsis as a weight shall 
be treated as an error if the collating element is neither an 
ellipsis nor the special symbol UNDEFINED. 
 
The special keyword IGNORE as a weight shall indicate that 
when strings are compared using the weights at the level where 
IGNORE is specified, the collating element shall be ignored; 
i.e., as if the string did not contain the collating element. 
In regular expressions and pattern matching, all characters 
that are IGNOREd in their primary weight form an equivalence 
class. 
 
A <comment character> occurring where the delimiter ";" may 
occur, terminates the collating statement. 
 
An empty operand shall be interpreted as the collating-element 
itself. 
 
For example, the collation statement 
 
 <a>    <a>;<a> 
 
is equivalent to
 
 <a> 
 
An ellipsis (absolute or symbolic) can be used as an operand 
if the collating-element was an ellipsis, and shall be 
interpreted as the value of each character defined by the 
ellipsis. 
 
 Example: 
 
 collating-element <ch> from <c><h> 
 collating-element <Ch> from <C><h> 
 order_start    forward;backward 
 UNDEFINED      IGNORE;IGNORE 
 <LOW> 
 <space>        <LOW>;<space> 
 ...            <LOW>; 
 <a>            <a>;<a> 
 <a'>           <a>;<a'> 
 <A>            <a>;<A> 
 <A'>           <a>;<A'> 
 <ch>           <ch>;<ch> 
 <Ch>           <ch>;<Ch> 
 <s>            <s>;<s> 
 <ss>           <s><s>;<ss><ss> 
 order_end 
 
 
This example is interpreted as follows: 
 
(1)             The UNDEFINED means that all characters not 
 specified in this definition (explicitly or via 
 the ellipsis) shall be ignored. 
(2)             <LOW> defines the first collating weight, and 
 thus the lowest weight in this example. 
(3)             All characters between <space> and <a> shall have 
 the same primary equivalence class <LOW> and 
 individual secondary weights based on their 
 ordinal encoded values. 
(4)             All characters based on the upper or lowercase 
 character "a" belong to the same primary 
 equivalence class. 
(5)             The multicharacter collating element <c><h> is 
 represented by the collating symbol <ch> and 
 belongs to the same primary equivalence class as 
 the multicharacter collating element <C><h>. 
(6)             The <ss> collating element has two weights on the 
 primary level, and it is in the same primary 
 equivalence class as two consecutive <s>-es; on 
 the secondary level the collating element has two 
 weights of the equivalence class <ss>. 
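
Points (4) to (6) can be illustrated by a small Python sketch
restricted to the primary level. The numeric weights, the
names PRIMARY, MULTI, elements and primary_key, and the use of
the characters U+00E1, U+00C1 and U+00DF for <a'>, <A'> and
<ss> are assumptions of the sketch; the <LOW> range of point
(3) is left out for brevity.

```python
# Illustrative primary weights; the numbers are arbitrary and only
# their relative order matters.
PRIMARY = {
    "a": [1], "\u00e1": [1], "A": [1], "\u00c1": [1],
    "ch": [2], "Ch": [2],
    "s": [3],
    "\u00df": [3, 3],  # one-to-many: <ss> weighs as <s><s>
}
MULTI = ("ch", "Ch")  # multicharacter collating elements


def elements(text):
    """Split text into collating elements, preferring multicharacter ones."""
    out, i = [], 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in MULTI:
            out.append(pair)
            i += 2
        else:
            out.append(text[i])
            i += 1
    return out


def primary_key(text):
    """Concatenate primary weights; unlisted characters are IGNOREd."""
    key = []
    for element in elements(text):
        key.extend(PRIMARY.get(element, []))
    return key
```

Under these assumptions primary_key gives the same result for
U+00DF and for "ss", and the same result for "ch" and "Ch".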
 
4.3.2   "copy" keyword 
 
This keyword specifies the name of an existing FDCC-set to be 
used as the source for the definition of this category. The 
syntax is 
 
 "copy %s\n", <FDCC-set-name> 
 
The <FDCC-set-name> shall consist of one or more characters 
(in any of the forms defined in 4.1.1). If this keyword is
specified, only the "reorder-after", "reorder-end",
"reorder-scripts-after" and "reorder-scripts-end" keywords may
also be specified. The FDCC-set shall be copied in source form.
 
4.3.3   "col_weight_max" keyword 
 
This keyword defines, as a decimal number, the number of
collation levels that an interpreting system needs to support.
This value is elsewhere referred to as the COLL_WEIGHTS_MAX
limit. The minimum value is 7. The syntax is
 
 "col_weight_max %d\n", <value> 
 
4.3.4   "script" keyword 
 
This keyword shall be used to define symbols for use in
script-related statements, such as the "order_start" and
"reorder-scripts-after" keywords and script-reordering
statements. The syntax is
 
 "script %s\n", <script-symbol> 
 
The <script-symbol> shall be a symbolic name, enclosed between 
angle brackets (< and >), and shall not duplicate any symbolic 
name in the current charmap (if any), or any other symbolic 
name defined in this collation definition. A <script-symbol> 
defined via this keyword is only defined with the LC_COLLATE 
category. 
 
 Example: 
 script <LATIN> 
 script <ARABIC> 
 
4.3.5   "collating-element" keyword 
 
In addition to the collating elements in the character set, 
the collating-element keyword shall be used to define 
multicharacter collating elements. The syntax is

 "collating-element %s from %s\n", <collating-symbol>, <string>
 
The <collating-symbol> operand shall be a symbolic name, 
enclosed between angle brackets (< and >), and shall not 
duplicate any symbolic name in the current charmap or 
repertoiremap file (if any), or any other symbolic name 
defined in this collation definition. The string operand shall 
be a string of two or more characters that shall collate as an 
entity. A <collating-element> defined via this keyword is only 
defined with the LC_COLLATE category. 
 
 Example with ISO/IEC 6937: 
 collating-element <ch> from <c><h> 
 collating-element <e-acute> from <acute><e> 
 collating-element <aa> from <a><a> 
 
4.3.6   "collating-symbol" keyword 
 
This keyword shall be used to define symbols for use in 
collation sequence statements; e.g., between the order_start 
and the order_end keywords. The syntax is 
 
 "collating-symbol %s\n", <collating-symbol> 
 
The <collating-symbol> shall be a symbolic name, enclosed 
between angle brackets (< and >), and shall not duplicate any 
symbolic name in the current charmap (if any), or any other 
symbolic name defined in this collation definition. A 
<collating-symbol> defined via this keyword is only defined 
with the LC_COLLATE category. 
 
 Example: 
 collating-symbol <CAPITAL> 
 collating-symbol <HIGH> 
 
4.3.7   "symbol-equivalence" keyword 
 
This keyword shall be used to define symbols for use in 
collation sequence statements; and assign the same weight as 
another defined symbol. The syntax is 
 
 "symbol-equivalence %s %s\n", <collating-symbol-1>, 
<collating-symbol-2> 
 
The <collating-symbol-1> and <collating-symbol-2> shall be 
symbolic names, enclosed between angle brackets (< and >). 
<collating-symbol-1> shall not duplicate any symbolic name in 
the current charmap (if any), or any other symbolic name 
defined in this collation definition. <collating-symbol-2> is 
defined elsewhere in the LC_COLLATE category as a collating- 
symbol. The use of <collating-symbol-1> shall be equivalent to
using <collating-symbol-2> in the LC_COLLATE category. A
<collating-symbol-1> defined via this keyword is only defined 
with the LC_COLLATE category. 
 
 Example:
 collating-symbol <CAP> 
 symbol-equivalence <CAPITAL> <CAP> 
 
4.3.8   "order_start" keyword 
 
The "order_start" keyword shall precede collation order 
entries and also defines the number of weights for this 
collation sequence definition, the collation script name and 
other collation rules. 
 
The syntax of the "order_start" keyword has two forms: 
 
 "order_start %s;%s;...;%s\n", <sort-rules>, <sort-rules> ... 
and 
 "order_start %s;%s;...;%s\n", <script-symbol>, <sort-rules>, 
<sort-rules> ... 
 
The operands to the order_start keyword are optional. If
present, the operands define rules to be applied when strings
are compared. The first operand may be a <script-symbol>
surrounded by "<" and ">"; in that case the set of collating
statements following the "order_start" keyword, until the
"order_end" keyword or another "order_start" keyword is
encountered, is identified with this <script-symbol>. The
number of remaining operands defines how many weights each
element is assigned; if no operands are present, one forward
operand is assumed. If present, the first of these operands
defines rules to be applied when comparing strings using the
first (primary) weight; the second when comparing strings
using the second weight, and so on. Operands shall be
separated by semicolons (;). Each operand shall consist of one
or more collation directives, separated by commas (,). If the
number of operands exceeds the COLL_WEIGHTS_MAX limit, the
utility shall issue a warning message. The following
directives shall be supported:
 
forward         Specifies that the direction of scanning a 
 substring in this script at a given point in a 
 string is done towards the logical end of the 
 string for this weight level. 
backward        Specifies that the direction of scanning a 
 substring in this script at a given point in a 
 string is done towards the logical beginning of 
 the string for this weight level. 
position        Specifies that comparison operations for the 
 weight level will consider the relative position 
 of non-IGNOREd elements in the strings. The 
 string containing a non-IGNOREd element after the 
 fewest IGNOREd collating elements from the start 
 of the compare shall collate first. If both 
 strings contain a non-IGNOREd character in the 
 same relative position, the collating values 
 assigned to the elements shall determine the 
 ordering. In case of equality, subsequent non- 
 IGNOREd characters shall be considered in the 
 same manner. 
 
The directives forward and backward are mutually exclusive. 
 
 Examples: 
 order_start forward;backward 
 order_start <CYRILLIC>;forward;forward 
 
If no operands are specified, a single forward operand shall 
be assumed. 
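
The effect of the forward and backward directives can be
sketched informally as below. The Python sketch is a
simplification: it treats the whole string as belonging to one
script, ignores the position directive, and uses made-up
weights and helper names (TABLE, level_key, sort_key).

```python
# Illustrative (primary, secondary) weights; the values are arbitrary.
TABLE = {
    "c": (1, 0), "o": (2, 0), "\u00f4": (2, 1),  # o with circumflex
    "t": (3, 0), "e": (4, 0), "\u00e9": (4, 1),  # e with acute
}


def level_key(weights, directive):
    """Arrange one level's weights in the order they are scanned."""
    return weights[::-1] if directive == "backward" else weights


def sort_key(word, rules=("forward",)):
    """Build one list of weights per level named in rules."""
    return [level_key([TABLE[ch][level] for ch in word], rule)
            for level, rule in enumerate(rules)]


WORDS = ["cot\u00e9", "c\u00f4te", "cote", "c\u00f4t\u00e9"]
forward_order = sorted(WORDS, key=lambda w: sort_key(w, ("forward", "forward")))
backward_order = sorted(WORDS, key=lambda w: sort_key(w, ("forward", "backward")))
```

All four words share the same primary weights, so the second
level decides. Scanning it backward lets the last accent in a
word take precedence over earlier ones, which changes the
relative order of the two middle words.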
 
 
4.3.9   "order_end" keyword 
 
The collating order entries shall be terminated with an 
order_end keyword. 
 
4.3.10   "reorder-after" keyword 
 
The "reorder-after" keyword shall be used to specify a 
modification to a copied collation specification of an 
existing FDCC-set. There can be more than one "reorder-after" 
statement in a collating specification. The syntax shall be: 
 
 "reorder-after %s\n",<collating-symbol> 
 
The <collating-symbol> operand shall be a symbolic name, 
enclosed between angle brackets, and shall be present in the 
source FDCC-set copied via the "copy" keyword. 
The "reorder-after" statement is followed by one or more 
collation statements as described in the "Collating Order" 
clause (4.3.5), with the exception that the ellipsis symbol 
(...) shall not be used. 
 
Each collation statement reassigns character collation values 
and collation weights to collating elements existing in the 
copied collation specification, by removing the collating 
statement from the copied specification, and inserting the 
collating element in the collating sequence with the new 
collation weights after the preceding collating element of the 
"reorder-after" specification, the first collating element in 
the collation sequence being the <collating-symbol> specified 
on the "reorder-after" statement. 
 
A "reorder-after" specification is terminated by another 
"reorder-after" specification or the "reorder-end" statement. 
 
4.3.10.1   Example of "reorder-after" 
 
 reorder-after <y8> 
 <U:>       <Y>;<U:>;<CAPITAL> 
 <u:>       <Y>;<U:>;<SMALL> 
 reorder-after <z8> 
 <AE>       <AE>;<NONE>;<CAPITAL> 
 <ae>       <AE>;<NONE>;<SMALL> 
 <A:>       <AE>;<DIAERESIS>;<CAPITAL> 
 <a:>       <AE>;<DIAERESIS>;<SMALL> 
 <O/>       <O/>;<NONE>;<CAPITAL> 
 <o/>       <O/>;<NONE>;<SMALL> 
 <AA>       <AA>;<NONE>;<CAPITAL> 
 <aa>       <AA>;<NONE>;<SMALL> 
 reorder-end 
 
The example is interpreted as follows (using the "i18nrep" 
repertoiremap): 
 
1.  The collating element <U:> is removed from the copied 
 collating sequence and inserted after <y8> in the collating 
 sequence with the new weights. The collating element <u:> 
 is removed from the copied collating sequence and inserted 
 in the resulting collation sequence after <U:> with the new 
 weights. 
 
2.  The second "reorder-after" statement terminates the first
 list of reordering collation identifier entries, and
 initiates a second list, rearranging the order and weights
 for the <AE>, <ae>, <A:>, <a:>, <O/>, <o/>, <AA>, and <aa>
 collating elements after the <z8> collating symbol in the
 copied specification.
 
3.  The "reorder-end" statement terminates the second list of 
 reordering entries. 
 
4.  Thus for the original sequence 
 
 ... ( U u Ü ü ) V v W w X x Y y Z z

 this example reordering gives

 ... U u V v W w X x ( Y y Ü ü ) Z z ( Æ æ Ä ä ) Ø ø Å å
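
The removal and reinsertion performed by "reorder-after" can
be sketched informally in Python as below. Only the order of
collating elements is modelled, not the new weights; the
function name reorder_after and the sample sequence in the
usage note are assumptions made for this sketch.

```python
def reorder_after(order, anchor, moved):
    """Move the elements in moved to just after anchor.

    order is the copied collating sequence as a list of symbolic
    names; moved lists the collating elements of the statements that
    follow the "reorder-after" line, in the order they are given.
    """
    if anchor not in order:
        raise ValueError("anchor is not in the copied collating sequence")
    moved_set = set(moved)
    # Remove the reordered elements from the copied specification.
    remaining = [e for e in order if e not in moved_set]
    # Insert them after the anchor, keeping their listed order.
    at = remaining.index(anchor) + 1
    return remaining[:at] + list(moved) + remaining[at:]
```

For a hypothetical copied sequence ["<U>", "<u>", "<U:>",
"<u:>", "<V>", "<y8>", "<Z>"], calling reorder_after with the
anchor "<y8>" and the elements ["<U:>", "<u:>"] moves the two
elements to directly after "<y8>" and leaves the rest
untouched.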
 
4.3.11   "reorder-end" keyword 
 
The "reorder-end" keyword shall specify the end of a list of 
collating statements, initiated by the "reorder-after" 
keyword. 
 
4.3.12   "reorder-scripts-after" keyword 
 
The "reorder-scripts-after" keyword shall be used to specify a 
modification to a copied collation specification of an 
existing FDCC-set. The "reorder-scripts-after" statement is 
followed by one or more statements consisting of script 
reordering statements. 
 
4.3.12.1   script reordering statements 
 
The script reordering statements rearrange the set of
collating entries and change the sorting rules for the set of
collating entries identified by a script symbol in a preceding
"order_start" statement. Each script reorder statement has the
syntax:

 "%s %s;...;%s\n", <script-symbol>, <sort-rules>,
<sort-rules> ...
 
The <script-symbol> identifies the set of collating entries, 
and shall be defined via a "script" keyword. 
 
The <sort-rules> are as described for the "order_start" 
keyword. Specified <sort-rules> replace the specification for 
the ordering of the script given on the "order_start" 
statement identified by the <script-symbol>. The <sort-rules> 
are optional and <sort-rules> not to be changed may be given 
by empty specifications. 
 
The order of the script reordering statements rearranges the 
assignment of collation entries for the sets of collation 
entries identified by the <script-symbols> to the order that 
the <script-symbols> occur after the "reorder-scripts-after" 
statement. 
 
The script reordering statements are terminated by a
"reorder-scripts-end" statement.
 
4.3.12.2   Example of script reordering 
 
 copy "i18n" 
 reorder-scripts-after <DIGITS> 
 <ARABIC> 
 <LATIN> forward;backward;forward;forward,position 
 reorder-scripts-end 
 
This example is interpreted as follows: The LC_COLLATE 
category of the "i18n" FDCC-set is copied. Then a reordering 
of all collating statements for the scripts <ARABIC> and 
<LATIN> is done, leaving the rest of the scripts as they were 
in the "i18n" FDCC-set. The <ARABIC> script is placed 
immediately after the <DIGITS> script, and the <LATIN> script 
immediately following the <ARABIC> script. The ordering rules
for the <ARABIC> script are kept as they were in the "i18n"
FDCC-set, while the <LATIN> script gets new ordering rules as
indicated. The "reorder-scripts-end" keyword terminates the
script reordering statements.
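
The script reordering can be sketched informally in Python as
below. The function name reorder_scripts, the data layout and
the starting order of scripts in the usage note are
assumptions of the sketch; replacing only some levels by means
of empty specifications is not modelled.

```python
def reorder_scripts(order, rules, anchor, statements):
    """Apply script reordering statements to a copied specification.

    order is the list of script symbols in their copied order, rules
    maps each symbol to its list of sort rules, and statements is a
    list of (symbol, new_rules) pairs where new_rules may be empty.
    """
    moved = [symbol for symbol, _ in statements]
    remaining = [s for s in order if s not in moved]
    at = remaining.index(anchor) + 1
    new_order = remaining[:at] + moved + remaining[at:]
    new_rules = dict(rules)
    for symbol, replacement in statements:
        if replacement:  # no rules given: keep the copied ones
            new_rules[symbol] = list(replacement)
    return new_order, new_rules
```

Starting from a hypothetical order <DIGITS>, <LATIN>, <GREEK>,
<ARABIC>, the two statements of the example place <ARABIC> and
then <LATIN> directly after <DIGITS>, and only <LATIN>
receives new sort rules.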
 
4.3.13   "reorder-scripts-end" keyword 
 
The "reorder-scripts-end" keyword shall specify the end of a
list of script symbols, initiated by the
"reorder-scripts-after" keyword.
 
4.3.14   Toggling keyword statements 
 
The toggling keywords "define" and "undef" shall set and
unset a toggle, respectively. Toggles that are not defined are
regarded as unset. The toggle is a string of characters, in
any form as described in clause 4.1.1. The keywords "ifdef",
"ifndef", "elif", "else", and "endif" control the inclusion
of LC_COLLATE keywords and statements, as described in the
following, and they may be nested. The toggling keywords are
modelled after the preprocessor in the C standard.
 
4.3.14.1   "define" keyword 
 
This keyword shall be used to set a toggle, for use with other 
toggling keywords. The same toggle may occur in more than one
"define" statement. The syntax is
 
 "define %s\n", <toggle> 
 
4.3.14.2   "undef" keyword 
 
This keyword shall be used to unset a toggle, for use with 
other toggling keywords. The same toggle may occur in more
than one "undef" statement. The syntax is
 
 "undef %s\n", <toggle> 
 
4.3.14.3   "ifdef" keyword 
 
This keyword shall be used to control the inclusion of the 
following LC_COLLATE statements, up to a corresponding "elif", 
"else" or "endif" keyword. If the toggle is set, the 
statements are used, otherwise they are ignored. The syntax is 
 
 "ifdef %s\n", <toggle> 
 
4.3.14.4   "ifndef" keyword 
 
This keyword shall be used to control the inclusion of the 
following LC_COLLATE statements, up to a corresponding "elif", 
"else" or "endif" keyword. If the toggle is unset, the 
statements are used, otherwise they are ignored. The syntax is 
 
 "ifndef %s\n", <toggle> 
 
4.3.14.5   "elif" keyword 
 
This keyword shall be used to control the inclusion of the 
following LC_COLLATE statements, up to a corresponding "elif", 
"else" or "endif" keyword. The keyword shall be preceded by a 
corresponding "ifdef", "ifndef", or "elif" statement and the 
statement that these keyword statements control. If the
statements controlled by the preceding "ifdef", "ifndef" or
"elif" statements have not been used, and if the toggle is
set, the statements are used, otherwise they are ignored. The
syntax is
 
 "elif %s\n", <toggle> 
 
4.3.14.6   "else" keyword 
 
This keyword shall be used to control the inclusion of the 
following LC_COLLATE statements, up to a corresponding "endif" 
keyword. The keyword shall be preceded by a corresponding 
"ifdef", "ifndef", or "elif" statement and the statement that 
these keyword statements control. If none of the preceding
blocks of statements was used, the statements are used,
otherwise they are ignored. The syntax is
 
 "else\n" 
 
4.3.14.7   "endif" keyword 
 
This keyword shall be used to terminate the control of the 
inclusion of the preceding LC_COLLATE statements. The keyword 
shall be preceded by a corresponding "ifdef", "ifndef", "elif" 
or "else" statement. The syntax is 
 
 "endif\n" 
 
4.3.14.8   Toggling example 
 
Here is an example to show the workings of the toggling 
statements: 
 
The "gensort" FDCC-set may be defined as: 
 
 LC_COLLATE 
 ifdef BACKWARD 
 order_start <LATIN>;forward;backward;forward;forward,position 
 else 
 order_start <LATIN>;forward;forward;forward;forward,position 
 endif 
 .... 
 END LC_COLLATE 
 
Then the following LC_COLLATE category specification can use 
the "gensort" specification to create a new LC_COLLATE 
category: 
 
 LC_COLLATE 
 define BACKWARD 
 copy "gensort" 
 END LC_COLLATE 
 
The example is explained as follows: The LC_COLLATE category
in the "gensort" FDCC-set uses the toggle BACKWARD, and as
BACKWARD is not set, the second "order_start" statement (all
"forward") is used.
 
In the second LC_COLLATE category, the BACKWARD toggle is set 
before copying the first LC_COLLATE category, and thus the 
first "order_start" statement with 2nd level "backward" is 
used. 
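
The behaviour of the toggling keywords can be sketched
informally in Python as below. The function name preprocess
and the idea of passing the toggles set before a "copy" as a
parameter are assumptions of this sketch, and error handling
for unbalanced keywords is omitted.

```python
def preprocess(lines, toggles=None):
    """Return the statements left after applying the toggling keywords.

    Each stack frame is [parent_active, branch_taken, active_now].
    """
    toggles = set(toggles or ())
    stack, out = [], []
    for raw in lines:
        words = raw.split()
        if not words:
            continue
        key = words[0]
        arg = words[1] if len(words) > 1 else None
        enclosing = stack[-1][2] if stack else True
        if key in ("ifdef", "ifndef"):
            wanted = (arg in toggles) == (key == "ifdef")
            use = enclosing and wanted
            stack.append([enclosing, use, use])
        elif key == "elif":
            frame = stack[-1]
            use = frame[0] and not frame[1] and arg in toggles
            frame[1] = frame[1] or use
            frame[2] = use
        elif key == "else":
            frame = stack[-1]
            frame[2] = frame[0] and not frame[1]
            frame[1] = True
        elif key == "endif":
            stack.pop()
        elif enclosing:
            if key == "define":
                toggles.add(arg)
            elif key == "undef":
                toggles.discard(arg)
            else:
                out.append(raw.strip())  # an ordinary statement
    return out
```

Applied to the statements of the "gensort" example, an empty
set of toggles keeps the all-forward "order_start" statement,
while passing the toggle BACKWARD keeps the one with
"backward" at the second level.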
 
4.3.15   "i18n" LC_COLLATE category 
 
The "i18n" LC_COLLATE category is defined as the tailorable 
template in ISO/IEC 14651. 
 
4.4   LC_MONETARY 
 
The LC_MONETARY category defines the rules and symbols that 
shall be used to format monetary numeric information. The 
operands are strings. For some keywords, the strings can 
contain only integers. Keywords that are not provided, string 
values set to the empty string "", or integer keywords set to 
-1, shall be used to indicate that the value is unspecified, 
and then no default is taken. The following keywords shall be 
defined: 
 
copy                 Specify the name of an existing FDCC-set to 
 be used as the source for the definition of 
 this category. If this keyword is specified, 
 no other keyword shall be specified. 
int_curr_symbol      The international currency symbol. The 
 operand shall be a four character string, 
 with the first three characters containing 
 the alphabetic international currency symbol 
 in accordance with those specified in ISO 
 4217 (Codes for the representation of 
 currencies and funds). The fourth character 
 shall be the character used to separate the 
 international currency symbol from the 
 monetary quantity. The keyword shall be 
 specified, unless the "copy" keyword is 
 used. 
currency_symbol      The string that shall be used as the local 
 currency symbol. 
mon_decimal_point       The operand is a string containing the 
 symbol that shall be used as the decimal 
 delimiter in monetary formatted 
 quantities. In contexts where other 
 standards limit the mon_decimal_point to a 
 single byte, the result of specifying a 
 multibyte operand is unspecified. The 
 keyword shall be specified, unless the 
 "copy" keyword is used. 
mon_thousands_sep       The operand is a string containing the 
 symbol that shall be used as a separator 
 for groups of digits to the left of the 
 decimal delimiter in formatted monetary 
 quantities. In contexts where other
 standards limit the mon_thousands_sep to a
 single byte, the result of specifying a 
 multibyte operand is unspecified. The 
 keyword shall be specified, unless the 
 "copy" keyword is used. 
mon_grouping         Define the size of each group of digits in 
 formatted monetary quantities. The operand 
 is a sequence of integers separated by 
 semicolons. Each integer specifies the 
 number of digits in each group, with the 
 initial integer defining the size of the 
 group immediately preceding the decimal 
 delimiter, and the following integers 
 defining the preceding groups. If the last 
 integer is not -1, then the size of the 
 previous group (if any) shall be repeatedly 
 used for the remainder of the digits. If the 
 last integer is -1, then no further grouping 
 shall be performed. The keyword shall be 
 specified, unless the "copy" keyword is 
 used. 
positive_sign        A string that shall be used to indicate a 
 nonnegative-valued formatted monetary 
 quantity. The keyword shall be specified, 
 unless the "copy" keyword is used. 
negative_sign        A string that shall be used to indicate a 
 negative-valued formatted monetary quantity. 
 The keyword shall be specified, unless the 
 "copy" keyword is used. 
int_frac_digits      An integer representing the number of 
 fractional digits (those to the right of the 
 decimal delimiter) to be written in a 
 formatted monetary quantity using 
 int_curr_symbol. The keyword shall be 
 specified, unless the "copy" keyword is 
 used. 
frac_digits          An integer representing the number of 
 fractional digits (those to the right of the 
 decimal delimiter) to be written in a 
 formatted monetary quantity using 
 currency_symbol. The keyword shall be 
 specified, unless the "copy" keyword is 
 used. 
p_cs_precedes        An integer set to 1 if the currency_symbol 
 precedes the value for a nonnegative 
 formatted monetary quantity, and set to 0 if 
 the symbol succeeds the value. The keyword 
 shall be specified, unless the "copy" 
 keyword is used. 
p_sep_by_space       An integer set to 0 if no space separates 
 the currency_symbol from the value for a 
 nonnegative formatted monetary quantity, set 
 to 1 if a space separates the symbol from 
 the value, and set to 2 if a space separates 
 the symbol and the sign string, if adjacent. 
 The keyword shall be specified, unless the 
 "copy" keyword is used. 
n_cs_precedes        An integer set to 1 if the currency_symbol 
 precedes the value for a negative formatted 
 monetary quantity, and set to 0 if the 
 symbol succeeds the value. The keyword shall 
 be specified, unless the "copy" keyword is 
 used. 
n_sep_by_space       An integer set to 0 if no space separates 
 the currency_symbol from the value for a 
 negative formatted monetary quantity, set to 
 1 if a space separates the symbol from the 
 value, and set to 2 if a space separates the 
 symbol and the sign string, if adjacent. The 
 keyword shall be specified, unless the 
 "copy" keyword is used. 
int_p_cs_precedes       An integer set to 1 if the int_curr_symbol 
 precedes the value for a nonnegative 
 formatted monetary quantity, and set to 0 
 if the symbol succeeds the value. If not 
 specified, the value of p_cs_precedes is 
 taken. 
int_p_sep_by_space      An integer set to 0 if no space separates 
 the int_curr_symbol from the value for a 
 nonnegative formatted monetary quantity, 
 set to 1 if a space separates the symbol 
 from the value, and set to 2 if a space 
 separates the symbol and the sign string, 
 if adjacent. If not specified, the value 
 of p_sep_by_space is taken. 
int_n_cs_precedes       An integer set to 1 if the int_curr_symbol 
 precedes the value for a negative 
 formatted monetary quantity, and set to 0 
 if the symbol succeeds the value. If not 
 specified, the value of n_cs_precedes is 
 taken. 
int_n_sep_by_space      An integer set to 0 if no space separates 
 the int_curr_symbol from the value for a 
 negative formatted monetary quantity, set 
 to 1 if a space separates the symbol from 
 the value, and set to 2 if a space 
 separates the symbol and the sign string, 
 if adjacent. If not specified, the value 
 of n_sep_by_space is taken. 
p_sign_posn          An integer set to a value indicating the 
 positioning of the positive_sign for a 
 nonnegative formatted monetary quantity 
 using the currency_symbol. The following 
 integer values shall be defined: 
 
 0  Parentheses enclose the quantity and the 
 currency_symbol. 
 1  The sign string precedes the quantity and 
 the currency_symbol. 
 2  The sign string succeeds the quantity and 
 the currency_symbol. 
 3  The sign string immediately precedes the 
 currency_symbol. 
 4  The sign string immediately succeeds the 
 currency_symbol. 
 The keyword shall be specified, unless the 
 "copy" keyword is used. 
 
n_sign_posn          An integer set to a value indicating the 
 positioning of the negative_sign for a 
 negative formatted monetary quantity using 
 the currency_symbol. The following integer 
 values shall be defined: 
 
 0  Parentheses enclose the quantity and the
 currency_symbol.
 1  The sign string precedes the quantity and 
 the currency_symbol. 
 2  The sign string succeeds the quantity and 
 the currency_symbol. 
 3  The sign string immediately precedes the 
 currency_symbol. 
 4  The sign string immediately succeeds the 
 currency_symbol. 
 The keyword shall be specified, unless the 
 "copy" keyword is used. 
 
int_p_sign_posn      An integer set to a value indicating the 
 positioning of the positive_sign for a 
 nonnegative formatted international monetary 
 quantity. The following integer values shall 
 be defined: 
 
 0  Parentheses enclose the quantity and the 
 int_curr_symbol. 
 1  The sign string precedes the quantity and 
 the int_curr_symbol. 
 2  The sign string succeeds the quantity and 
 the int_curr_symbol. 
 3  The sign string immediately precedes the 
 int_curr_symbol. 
 4  The sign string immediately succeeds the
 int_curr_symbol.
 If no int_p_sign_posn is present the value 
 of the p_sign_posn is taken. 
 
int_n_sign_posn      An integer set to a value indicating the 
 positioning of the negative_sign for a 
 negative formatted international monetary 
 quantity. The following integer values shall 
 be defined: 
 
 0  Parentheses enclose the quantity and the 
 int_curr_symbol. 
 1  The sign string precedes the quantity and 
 the int_curr_symbol. 
 2  The sign string succeeds the quantity and 
 the int_curr_symbol. 
 3  The sign string immediately precedes the 
 int_curr_symbol. 
 4  The sign string immediately succeeds the 
 int_curr_symbol. 
 If no int_n_sign_posn is present the value 
 of the n_sign_posn is taken. 
duo_int_curr_symbol     The second international currency symbol. 
 The operand shall be a four character 
 string, with the first three characters 
 containing the alphabetic international 
 currency symbol in accordance with those 
 specified in ISO 4217 (Codes for the 
 representation of currencies and funds). 
 The fourth character shall be the character
 used to separate the international
 currency symbol from the monetary 
 quantity. The keyword is optional. 
duo_currency_symbol     The string that shall be used as the 
 second local currency symbol. 
duo_int_frac_digits     An integer representing the number of 
 fractional digits (those to the right of 
 the decimal delimiter) to be written in a 
 formatted monetary quantity using 
 duo_int_curr_symbol. The keyword is 
 optional. 
duo_frac_digits         An integer representing the number of 
 fractional digits (those to the right of 
 the decimal delimiter) to be written in a 
 formatted monetary quantity using 
 duo_currency_symbol. The keyword is 
 optional. 
duo_p_cs_precedes       An integer set to 1 if the 
 duo_currency_symbol precedes the value for 
 a nonnegative formatted monetary quantity, 
 and set to 0 if the symbol succeeds the 
 value. The keyword is optional. 
duo_p_sep_by_space      An integer set to 0 if no space separates 
 the duo_currency_symbol from the value for 
 a nonnegative formatted monetary quantity, 
 set to 1 if a space separates the symbol 
 from the value, and set to 2 if a space 
 separates the symbol and the sign string, 
 if adjacent. The keyword is optional. 
duo_n_cs_precedes       An integer set to 1 if the 
 duo_currency_symbol precedes the value for 
 a negative formatted monetary quantity, 
 and set to 0 if the symbol succeeds the 
 value. The keyword is optional. 
duo_n_sep_by_space      An integer set to 0 if no space separates 
 the duo_currency_symbol from the value for 
 a negative formatted monetary quantity, 
 set to 1 if a space separates the symbol 
 from the value, and set to 2 if a space 
 separates the symbol and the sign string, 
 if adjacent. The keyword is optional. 
duo_int_p_cs_precedes       An integer set to 1 if the 
 duo_int_curr_symbol precedes the value 
 for a nonnegative formatted monetary 
 quantity, and set to 0 if the symbol 
 succeeds the value. If not specified, 
 the value of duo_p_cs_precedes is 
 taken. 
duo_int_p_sep_by_space      An integer set to 0 if no space 
 separates the duo_int_curr_symbol from 
 the value for a nonnegative formatted 
 monetary quantity, set to 1 if a space 
 separates the symbol from the value, 
 and set to 2 if a space separates the 
 symbol and the sign string, if 
 adjacent. If not specified, the value 
 of duo_p_sep_by_space is taken. 
duo_int_n_cs_precedes       An integer set to 1 if the 
 duo_int_curr_symbol precedes the value 
 for a negative formatted monetary 
 quantity, and set to 0 if the symbol 
 succeeds the value. If not specified, 
 the value of duo_n_cs_precedes is 
 taken. 
duo_int_n_sep_by_space      An integer set to 0 if no space 
 separates the duo_int_curr_symbol from 
 the value for a negative formatted 
 monetary quantity, set to 1 if a space 
 separates the symbol from the value, 
 and set to 2 if a space separates the 
 symbol and the sign string, if 
 adjacent. If not specified, the value 
 of duo_n_sep_by_space is taken. 
duo_p_sign_posn      An integer set to a value indicating the 
 positioning of the positive_sign for a 
 nonnegative formatted monetary quantity 
 using the duo_currency_symbol. The following 
 integer values shall be defined: 
 
 0  Parentheses enclose the quantity and the 
 duo_currency_symbol. 
 1  The sign string precedes the quantity and 
 the duo_currency_symbol. 
 2  The sign string succeeds the quantity and 
 the duo_currency_symbol. 
 3  The sign string immediately precedes the 
 duo_currency_symbol. 
 4  The sign string immediately succeeds the 
 duo_currency_symbol. 
 The keyword is optional. 
 
duo_n_sign_posn      An integer set to a value indicating the 
 positioning of the negative_sign for a 
 negative formatted monetary quantity using 
 the duo_currency_symbol. The following 
 integer values shall be defined: 
 
 0  Parentheses enclose the quantity and the 
 duo_currency_symbol. 
 1  The sign string precedes the quantity and 
 the duo_currency_symbol. 
 2  The sign string succeeds the quantity and 
 the duo_currency_symbol. 
 3  The sign string immediately precedes the 
 duo_currency_symbol. 
 4  The sign string immediately succeeds the 
 duo_currency_symbol. 
 The keyword is optional. 
 
duo_int_p_sign_posn     An integer set to a value indicating the 
 positioning of the positive_sign for a 
 nonnegative formatted second international 
 monetary quantity. The following integer 
 values shall be defined: 
 
 0  Parentheses enclose the quantity and the 
 duo_int_curr_symbol. 
 1  The sign string precedes the quantity and 
 the duo_int_curr_symbol. 
 2  The sign string succeeds the quantity and 
 the duo_int_curr_symbol. 
 3  The sign string immediately precedes the 
 duo_int_curr_symbol. 
 4  The sign string immediately succeeds the 
 duo_int_curr_symbol. 
 If no duo_int_p_sign_posn is present the 
 value of the duo_p_sign_posn is taken. 
 
duo_int_n_sign_posn     An integer set to a value indicating the 
 positioning of the negative_sign for a 
 negative formatted second international 
 monetary quantity. The following integer 
 values shall be defined: 
 
 0  Parentheses enclose the quantity and the 
 duo_int_curr_symbol. 
 1  The sign string precedes the quantity and 
 the duo_int_curr_symbol. 
 2  The sign string succeeds the quantity and 
 the duo_int_curr_symbol. 
 3  The sign string immediately precedes the 
 duo_int_curr_symbol. 
 4  The sign string immediately succeeds the 
 duo_int_curr_symbol. 
 If no duo_int_n_sign_posn is present the 
 value of the duo_n_sign_posn is taken. 
uno_valid_from       An integer representing a Gregorian date in 
 the form YYYYMMDD, specifying the beginning 
 date (inclusive) of the validity of the 
 first currency. If not specified, it is 
 taken to be the beginning of time. 
uno_valid_to         An integer representing a Gregorian date in 
 the form YYYYMMDD, specifying the end date 
 (inclusive) of the validity of the first 
 currency. If not specified, it is taken to 
 be the end of time. 
duo_valid_from       An integer representing a Gregorian date in 
 the form YYYYMMDD, specifying the beginning 
 date (inclusive) of the validity of the 
 second currency. If not specified, it is 
 taken to be the beginning of time. 
duo_valid_to         An integer representing a Gregorian date in 
 the form YYYYMMDD, specifying the end date 
 (inclusive) of the validity of the second 
 currency. If not specified, it is taken to 
 be the end of time. 
 
conversion_rate      Two integers separated by a <semicolon> 
 specifying the fixed conversion rate between 
 the first and second currencies; the first 
 integer is for multiplying the first 
 currency, and the second for dividing this 
 result to get the amount in the second 
 currency. 
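
As an informal illustration of the validity and conversion_rate
keywords, the following Python sketch checks whether a currency
is valid on a given day and converts an amount from the first
currency to the second. The function names and the sample
operand in the usage note are assumptions made for this
example. Rounding of the result is not addressed by the keyword
description and is left to the caller.

```python
from datetime import date
from fractions import Fraction


def parse_yyyymmdd(value):
    """Turn an integer of the form YYYYMMDD into a date."""
    return date(value // 10000, value // 100 % 100, value % 100)


def is_valid_on(day, valid_from=None, valid_to=None):
    """Both bounds are inclusive; None means the keyword is not specified."""
    if valid_from is not None and day < parse_yyyymmdd(valid_from):
        return False
    if valid_to is not None and day > parse_yyyymmdd(valid_to):
        return False
    return True


def convert_first_to_second(amount, conversion_rate):
    """Multiply by the first integer, then divide by the second."""
    multiplier, divisor = (int(part) for part in conversion_rate.split(";"))
    return Fraction(amount) * multiplier / divisor
```

With the hypothetical operand "1;2", the call
convert_first_to_second(200, "1;2") yields 100.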
 
The "i18n" FDCC-set is defined as follows for the LC_MONETARY 
category. 
 
 LC_MONETARY 
 % This is the 14652 i18n fdcc-set definition for 
 % the LC_MONETARY category. 
 % 
 int_curr_symbol     "" 
 currency_symbol     "" 
 mon_decimal_point   "" 
 mon_thousands_sep   "" 
 mon_grouping        -1 
 positive_sign       "" 
 negative_sign       "" 
 int_frac_digits     -1 
 frac_digits         -1 
 p_cs_precedes       -1 
 p_sep_by_space      -1 
 n_cs_precedes       -1 
 n_sep_by_space      -1 
 p_sign_posn         -1 
 n_sign_posn         -1 
 % 
 END LC_MONETARY 
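
The sign position values 0 to 4 defined for keywords such as
duo_p_sign_posn and duo_n_sign_posn can be sketched in Python as
follows. This is a simplified illustration only: the function
name place_sign is an assumption, the sample symbol "EUR" is
arbitrary, and the spacing controlled by the sep_by_space
keywords is deliberately left out.

```python
def place_sign(quantity, symbol, sign, sign_posn, cs_precedes):
    """Combine quantity, currency symbol and sign string (no spacing)."""
    if sign_posn in (0, 1, 2):
        # The sign (or the parentheses) wraps quantity and symbol together.
        body = symbol + quantity if cs_precedes else quantity + symbol
        if sign_posn == 0:
            return "(" + body + ")"
        return sign + body if sign_posn == 1 else body + sign
    if sign_posn == 3:
        signed_symbol = sign + symbol      # sign immediately precedes symbol
    elif sign_posn == 4:
        signed_symbol = symbol + sign      # sign immediately succeeds symbol
    else:
        raise ValueError("sign position must be 0..4")
    return signed_symbol + quantity if cs_precedes else quantity + signed_symbol
```

For example, place_sign("1.25", "EUR", "-", 3, 0) gives
"1.25-EUR", and the same call with sign position 0 and the
symbol preceding gives "(EUR1.25)".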
 
 
4.5   LC_NUMERIC 
 
The LC_NUMERIC category defines the rules and symbols that 
shall be used to format nonmonetary numeric information. The 
operands are strings. For some keywords, the strings can 
contain only integers. Keywords that are not provided, string 
values set to the empty string (""), or integer keywords set 
to -1, shall be used to indicate that the value is 
unspecified. The following keywords shall be defined: 
 
copy         Specify the name of an existing FDCC-set to be used 
 as the source for the definition of this category. 
 If this keyword is specified, no other keyword 
 shall be specified. 
decimal_point    The operand is a string containing the symbol 
 that shall be used as the decimal delimiter in 
 numeric, nonmonetary formatted quantities. This 
 keyword cannot be omitted and cannot be set to 
 the empty string. In contexts where other 
 standards limit the decimal point to a single 
 byte, the result of specifying a multibyte 
 operand is unspecified. 
thousands_sep    The operand is a string containing the symbol 
 that shall be used as a separator for groups of 
 digits to the left of the decimal delimiter in 
 numeric, nonmonetary formatted quantities. In 
 contexts where other standards limit the 
 thousands_sep to a single byte, the result of 
 specifying a multibyte operand is unspecified. 
grouping     Define the size of each group of digits in 
 formatted nonmonetary quantities. The operand is a 
 sequence of integers separated by semicolons. Each 
 integer specifies the number of digits in each 
 group, with the initial integer defining the size 
 of the group immediately preceding the decimal 
 delimiter, and the following integers defining the 
 preceding groups. If the last integer is not -1, 
 then the size of the previous group (if any) shall 
 be repeatedly used for the remainder of the digits. 
 If the last integer is -1, then no further grouping 
 shall be performed. 
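
The grouping rule above can be sketched in Python as follows.
This is an informal illustration under stated assumptions: the
function name group_digits is hypothetical, the operand is
passed as the semicolon-separated string, and the digits are
those to the left of the decimal delimiter.

```python
def group_digits(digits, grouping, separator):
    """Insert separator into digits according to a grouping operand."""
    sizes = [int(item) for item in grouping.split(";")]
    groups = []
    rest = digits
    index = 0
    while rest:
        # After the operand is exhausted, its last integer is reused.
        size = sizes[min(index, len(sizes) - 1)]
        if size == -1:
            groups.append(rest)        # no further grouping
            break
        if size < 1:
            raise ValueError("group sizes must be positive or -1")
        groups.append(rest[-size:])    # take the rightmost group
        rest = rest[:-size]
        index += 1
    return separator.join(reversed(groups))
```

For example, group_digits("1234567", "3;2", ",") gives
"12,34,567", while the operand "3;-1" gives "1234,567".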
 
The "i18n" FDCC-set is defined as follows for the LC_NUMERIC 
category: 
 
 LC_NUMERIC 
 % This is the 14652 i18n fdcc-set definition for 
 % the LC_NUMERIC category. 
 % 
 decimal_point   "" 
 thousands_sep   "" 
 grouping        -1 
 % 
 END LC_NUMERIC 
 
 
4.6   LC_TIME 
 
The following keywords shall be defined: 
 
copy         Specify the name of an existing FDCC-set to be used 
 as the source for the definition of this category. 
 If this keyword is specified, no other keyword 
 shall be specified. 
abday        Define the abbreviated weekday names for calendar 
 systems with weeks of constant length, to be 
 referenced by the %a field descriptor. The length 
 of the week and a Gregorian date for the first 
 weekday is defined by the "week" keyword. The 
 operand shall consist of semicolon-separated 
 strings. The first string shall be the abbreviated 
 name of the day corresponding to the first day of 
 the week (default Sunday), the second the 
 abbreviated name of the day corresponding to the 
 second day of the week (default Monday), and so on. 
day          Define the full weekday names for calendar systems 
 with weeks of constant length, to be referenced by 
 the %A field descriptor. The length of the week and 
 a Gregorian date for the first weekday is defined 
 by the "week" keyword. The operand shall consist of 
 semicolon-separated strings. The first string shall 
 be the full name of the day corresponding to the 
 first day of the week (default Sunday), the second 
 the full name of the day corresponding to the 
 second day of the week (default Monday), and so on. 
week         Shall be used to define the number of days in a 
 week, which day is the first weekday (the first 
 weekday has the value 1), and which week is to be 
 considered the first in a year. The first operand 
 is an integer specifying the number of days in the 
 week. The second operand is an integer specifying 
 the Gregorian date of a first weekday, in the format 
 YYYYMMDD with a leading <hyphen-minus> for dates 
 before Christ. The third operand is an integer 
 specifying the weekday number to be contained in 
 the first week of the year. If the keyword is not 
 specified the values are taken as 7, 19971130 (a 
 Sunday), and 7 (Saturday), respectively. ISO 8601 
 conforming applications should use the values 7, 
 19971201 (a Monday), and 4 (Thursday), 
 respectively. 
abmon        Define the abbreviated month names, to be 
 referenced by the %b field descriptor. The operand 
 shall consist of twelve or thirteen semicolon- 
 separated strings. The first string shall be the 
 abbreviated name of the first month of the year 
 (January), the second the abbreviated name of the 
 second month, and so on. 
mon          Define the full month names, to be referenced by 
 the %B field descriptor. The operand shall consist 
 of twelve or thirteen semicolon-separated strings. 
 The first string shall be the full name of the 
 first month of the year (January), the second the 
 full name of the second month, and so on. 
d_t_fmt      Define the appropriate date and time 
 representation, to be referenced by the %c field 
 descriptor. The operand shall consist of a string, 
 and can contain any combination of characters and 
 field descriptors. In addition, the string can 
 contain escape sequences defined in Table 2. 
d_fmt        Define the appropriate date representation, to be 
 referenced by the %x field descriptor. The operand 
 shall consist of a string, and can contain any 
 combination of characters and field descriptors. In 
 addition, the string can contain escape sequences 
 defined in Table 2. 
t_fmt        Define the appropriate time representation, to be 
 referenced by the %X field descriptor. The operand 
 shall consist of a string, and can contain any 
 combination of characters and field descriptors. In 
 addition, the string can contain escape sequences 
 defined in Table 2. 
am_pm        Define the appropriate representation of the ante 
 meridiem and post meridiem strings, to be 
 referenced by the %p field descriptor. The operand 
 shall consist of two strings, separated by a 
 semicolon. The first string shall represent the 
 antemeridiem designation, the last string the 
 postmeridiem designation. The keyword is optional. 
 If unspecified, the %p field descriptor shall refer 
 to the empty string. 
t_fmt_ampm   Define the appropriate time representation in the 
 12-hour clock format with am_pm, to be referenced 
 by the %r field descriptor. The operand shall 
 consist of a string and can contain any combination 
 of characters and field descriptors. If the string 
 is empty, the 12-hour format is not supported in 
 the FDCC-set. 
era          Shall be used to define alternate Eras, 
 corresponding to the %E field descriptor modifier. 
 The format of the operand is unspecified, but shall 
 support the definition of the %EC and %Ey field 
 descriptors, and may also define the era_year 
 format (%EY). 
era_year     Shall be used to define the format of the year in 
 alternate Era format, corresponding to the %EY 
 field descriptor. 
era_d_fmt    Shall be used to define the format of the date in 
 alternate Era notation, corresponding to the %Ex 
 field descriptor. 
alt_digits   Shall be used to define alternate symbols for 
 digits, corresponding to the %O field descriptor 
 modifier. The operand shall consist of semicolon- 
 separated strings. The first string shall be the 
 alternate symbol corresponding with zero, the 
 second string the symbol corresponding with one, 
 and so on. Up to 100 alternate symbol strings can 
 be specified. The %O modifier indicates that the 
 string corresponding to the value specified via the 
 field descriptor shall be used instead of the 
 value. 
first_weekday    Shall be used to define the first day to be 
 displayed, for example in a calendar display 
 utility. The operand is an integer specifying 
 the day number (1 = first) according to the 
 information specified with the "day" keyword. 
 The keyword may be omitted, and then the value 1 
 is taken, corresponding to Sunday for a week 
 beginning Sunday, or to Monday for a week 
 beginning Monday. 
first_workday    Shall be used to define the first workday as an 
 integer according to the day numbering specified 
 with the "week" keyword. 
cal_direction    Shall be used to define the direction of the 
 display of dates, for example in a calendar 
 display utility. The operand is an integer, and 
 the following values are defined: 
 1  left-right from top 
 2  top-down from left 
 3  right-left from top 
 The keyword may be omitted, and then the value 1 is 
 taken. 
timezone     Shall be used to define a set of timezones, each 
 defined by a string. In the following the 
 characters <, >, [ and ] are used as 
 metacharacters. Only characters with a visible 
 glyph from the portable character set may be used, 
 except in the <std> and <dst> fields. The format of 
 the string is: 
 
 <std><offset><dst>[<offset>][,<rule>[,<rule>...]] 
 
 where 
 
 <std> and <dst>          Indicates no less than 
 three, nor more than 10 
 characters that are the 
 designation for the 
 standard <std> or summer 
 <dst> time zone. Only <std> 
 is required; if <dst> is 
 missing, then summer time 
 does not apply in this 
 category. Upper- and 
 lowercase letters are 
 explicitly allowed. Any 
 characters except a leading 
 colon <:> or digits, the 
 comma <,>, the minus <->, 
 the plus <+>, and the null 
 character are permitted to 
 appear in these fields, but 
 their meaning is 
 unspecified. 
 <offset>              Indicates the value one must add 
 to the local time to arrive at 
 the Coordinated Universal Time. 
 The <offset> has the form: 
 
 hh[:mm[:ss]] 
 
 The minutes (mm) and seconds (ss) are 
 optional. The hour (hh) shall be 
 required and may be a single digit. The 
 <offset> following <std> shall be 
 required. If no <offset> follows <dst>, 
 summer time is assumed to be one hour 
 ahead of standard time. One or more 
 digits may be used; the value is always 
 interpreted as a decimal number. The 
 hour shall be between zero and 24, and 
 the minutes (and seconds) - if 
 present - shall be between zero and 59. 
 If preceded by a "-", the time zone 
 shall be east of the Prime Meridian; 
 otherwise it shall be west of the Prime 
 Meridian (which may be indicated by an 
 optional preceding "+"). 
 <rule>                Indicates when to change to and 
 back from summer time. The <rule> 
 has the form: 
 
 <date>[/<time>/<year>],<date>[/<time>/<year>] 
 
 where the first <date> describes when 
 the change from standard time to summer 
 time occurs, and the second <date> 
 describes when the change back happens. 
 Each <time> field describes when, in 
 current local time, the change to the 
 other time is made. The first <year> 
 field defines the beginning of the 
 validity of this rule, and the second 
 <year> field defines the end of the 
 validity of the rule. A number of rules 
 may be given. 
 
 The format of <date> shall be one of 
 the following: 
 
 J<n>   The Julian day <n> (1 <= n 
 <= 365). Leap years shall 
 not be counted. That is, in 
 all years - including leap 
 years - February 28 is day 
 59 and March 1 is day 60. 
 It is impossible to 
 explicitly refer to the 
 occasional February 29. 
 <n>    The zero-based Julian day 
 (0 <= n <= 365). Leap years 
 shall be counted and it is 
 possible to refer to 
 February 29. 
 M<m>.<n>.<d> 
 the <d>th day (0 <= d <= 7) 
 of week <n> of month <m> (1 
 <= n <= 5, 1 <= m <= 12, 
 where week 5 means "the 
 last <d> day in month <m>" 
 which may occur in either 
 the fourth or fifth week). 
 Week 1 is the first week in 
 which the <d>th day occurs. 
 Day zero and day seven are 
 both Sunday. 
 
 The <time> has the same format as 
 <offset> except that no leading sign 
 ("-" or "+") shall be allowed. The 
 default, if <time> is not given, shall 
 be "02:00:00". 
 
 The <year> has the format YYYY. 
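
Two parts of the timezone string lend themselves to a short
illustration: the <offset> field and the M<m>.<n>.<d> form of
<date>. The Python sketch below is informal and rests on stated
assumptions: the function names are hypothetical, the result of
parse_offset is expressed in seconds, and one or two digits are
accepted for each of the hh, mm and ss fields.

```python
import calendar
import re
from datetime import date


def parse_offset(text):
    """Return the seconds to add to local time to reach UTC.

    A leading "-" (east of the Prime Meridian) gives a negative result.
    """
    match = re.fullmatch(
        r"([+-]?)([0-9]{1,2})(?::([0-9]{1,2})(?::([0-9]{1,2}))?)?", text)
    if match is None:
        raise ValueError("not of the form hh[:mm[:ss]]: %r" % text)
    sign, hh, mm, ss = match.groups()
    hours, minutes, seconds = int(hh), int(mm or 0), int(ss or 0)
    if hours > 24 or minutes > 59 or seconds > 59:
        raise ValueError("field out of range: %r" % text)
    total = hours * 3600 + minutes * 60 + seconds
    return -total if sign == "-" else total


def m_rule_date(year, month, week, weekday):
    """Resolve M<m>.<n>.<d> for one year; weekday 0 or 7 is Sunday."""
    target = weekday % 7                                    # 7 folds onto 0
    first_of_month = date(year, month, 1).isoweekday() % 7  # Sunday -> 0
    day = 1 + (target - first_of_month) % 7                 # first occurrence
    day += 7 * (week - 1)
    while day > calendar.monthrange(year, month)[1]:        # week 5 = "last"
        day -= 7
    return date(year, month, day)
```

For example, parse_offset("-1:30") gives -5400, and
m_rule_date(2024, 3, 5, 0), the last Sunday of March 2024,
gives 2024-03-31.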
 
4.6.1   Date Field Descriptors 
 
The LC_TIME category defines the interpretation of a number of 
field descriptors. The field descriptors are also available in 
the definitions with the following LC_TIME keywords: d_t_fmt, 
d_fmt, t_fmt, t_fmt_ampm, era, and era_d_fmt. 
A field descriptor may not be used with the LC_TIME keywords 
defining it. 
 
Table 2: Escape sequences for the date field 
 
%a           FDCC-set's abbreviated weekday name. 
%A           FDCC-set's full weekday name. 
%b           FDCC-set's abbreviated month name. 
%B           FDCC-set's full month name. 
%c           FDCC-set's appropriate date and time 
 representation. 
%C           Century (a year divided by 100 and truncated to 
 integer) as decimal number (00-99). 
%d           Day of the month as a decimal number (01-31). 
%D           Date in the format mm/dd/yy. 
%e           Day of the month as a decimal number (1-31 in a 
 two-digit field with leading <space> fill). 
%f           Weekday as a decimal number (1(Monday)-7). 
%F           Date in the format YYYY-MM-DD (ISO 8601 format). 
%h           A synonym for %b. 
%H           Hour (24-hour clock) as a decimal number (00-23). 
%I           Hour (12-hour clock) as a decimal number (01-12). 
%j           Day of the year as a decimal number (001-366). 
%m           Month as a decimal number (01-13). 
%M           Minute as a decimal number (00-59). 
%n           A <newline> character. 
%p           FDCC-set's equivalent of either AM or PM. 
%r           12-hour clock time (01-12) using the AM/PM 
 notation. 
%S           Seconds as a decimal number (00-61). 
%t           A <tab> character. 
%T           24-hour clock time in the format HH:MM:SS. 
%u           Week number of the year as a decimal number with 
 two digits and leading zero, according to "week" 
 keyword. 
%U           Week number of the year (Sunday as the first day of 
 the week) as a decimal number (00-53). 
%w           Weekday as a decimal number (0(Sunday)-6). 
%W           Week number of the year (Monday as the first day of 
 the week) as a decimal number (00-53). 
%x           FDCC-set's appropriate date representation. 
%X           FDCC-set's appropriate time representation. 
%y           Year (offset from %C) as a decimal number (00-99). 
%Y           Year with century as a decimal number. 
%Z           Time-zone name, or no characters if no time zone is 
 determinable. 
%%           A <percent-sign> character. 
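
Weekday numbers that follow the "week" keyword can be derived
by counting days from the Gregorian date given as its second
operand. The Python sketch below is an informal illustration;
the function name weekday_number is an assumption, and dates
before Christ (a leading <hyphen-minus>) are not handled
because Python's date type does not represent them.

```python
from datetime import date


def weekday_number(day, days_in_week, first_weekday_date):
    """Return 1 for the first weekday, 2 for the second, and so on.

    first_weekday_date is the YYYYMMDD integer from the "week" keyword.
    """
    reference = date(first_weekday_date // 10000,
                     first_weekday_date // 100 % 100,
                     first_weekday_date % 100)
    # Python's % always yields a non-negative result for a positive
    # divisor, so days before the reference date are handled as well.
    return (day - reference).days % days_in_week + 1
```

With the ISO 8601 values (7 and 19971201), 4 December 1997 is
weekday 4 (Thursday). With the default values (7 and 19971130),
6 December 1997 is weekday 7 (Saturday).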
 
4.6.2   Modified Field Descriptors 
 
Some field descriptors can be modified by the E and O modifier 
characters to indicate a different format or specification as 
specified in the LC_TIME FDCC-set description. If the 
corresponding keyword (see era, era_year, era_d_fmt, and 
alt_digits) is not specified for the current FDCC-set, the 
unmodified field descriptor value shall be used. 
 
%Ec          FDCC-set's alternate date and time representation. 
%EC          The name of the base year (period) in the FDCC- 
 set's alternate representation. 
%Ex          FDCC-set's alternate date representation. 
%Ey          Offset from %EC (year only) in the FDCC-set's 
 alternate representation. 
%EY          Full alternate year representation. 
%Od          Day of month using the FDCC-set's alternate numeric 
 symbols. 
%Oe          Day of month using the FDCC-set's alternate numeric 
 symbols. 
%Of          Weekday as a decimal number according to alt_day (1 
 is first day). 
%OH          Hour (24-hour clock) using the FDCC-set's alternate 
 numeric symbols. 
%OI          Hour (12-hour clock) using the FDCC-set's alternate 
 numeric symbols. 
%Om          Month using the FDCC-set's alternate numeric 
 symbols. 
%OM          Minutes using the FDCC-set's alternate numeric 
 symbols. 
%OS          Seconds using the FDCC-set's alternate numeric 
 symbols. 
%OU          Week number of the year (Sunday as the first day of 
 the week) using the FDCC-set's alternate numeric 
 symbols. 
%Ow          Weekday as number in the FDCC-set's alternate 
 representation (Sunday=0). 
%OW          Week number of the year (Monday as the first day of 
 the week) using the FDCC-set's alternate numeric 
 symbols. 
%Oy          Year (offset from %C) in alternate representation. 
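
The effect of the O modifier with alt_digits can be sketched in
Python as follows. This is an informal illustration: the
function name and the sample symbols are assumptions, and
falling back to the plain decimal value when no alternate
symbol exists for a value is a choice made for this example.

```python
def apply_alt_digits(value, alt_digits):
    """Return the alternate symbol for value, or the plain decimal value."""
    if 0 <= value < len(alt_digits):
        return alt_digits[value]
    return str(value)


# Hypothetical alt_digits operand: symbols for zero, one, two, three.
SAMPLE_ALT_DIGITS = ["nulla", "I", "II", "III"]
```

With the sample operand, apply_alt_digits(2, SAMPLE_ALT_DIGITS)
gives "II", and a value of 10, which has no alternate symbol
here, gives "10".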
 
4.6.3   "i18n" LC_TIME category 
 
The "i18n" LC_TIME category is (following ISO 8601): 
 
 LC_TIME 
 % This is the ISO/IEC 14652 "i18n" definition for 
 % the LC_TIME category. 
 % 
 % Weekday and week numbering according to ISO 8601 
 abday   "<1>";"<2>";"<3>";"<4>";"<5>";"<6>";"<7>" 
 day     "<1>";"<2>";"<3>";"<4>";"<5>";"<6>";"<7>" 
 week    7;19971201;4 
 abmon   "<0><1>";"<0><2>";"<0><3>";"<0><4>";"<0><5>";"<0><6>";/ 
 "<0><7>";"<0><8>";"<0><9>";"<1><0>";"<1><1>";"<1><2>" 
 mon     "<0><1>";"<0><2>";"<0><3>";"<0><4>";"<0><5>";"<0><6>";/ 
 "<0><7>";"<0><8>";"<0><9>";"<1><0>";"<1><1>";"<1><2>" 
 am_pm   "";"" 
 % Date formats following ISO 8601 
 % Appropriate date and time representation (%c) 
 %       "%a %F %T" 
 d_t_fmt "<%><a><SP><%><F><SP><%><T>" 
 % 
 % Appropriate date representation (%x)   "%F" 
 d_fmt   "<%><F>" 
 % 
 % Appropriate time representation (%X)   "%T" 
 t_fmt   "<%><T>" 
 t_fmt_ampm "" 
 % 
 END LC_TIME 
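
The operands in the listing above are written with symbolic
character names, and the comments give the intended plain form
(for example "%a %F %T"). The Python sketch below shows one way
such an operand might be decoded for inspection. It is only an
approximation: it assumes that a one-character name stands for
that character, that <SP> denotes a space, and that <Uxxxx>
denotes the character with that hexadecimal code. A real
implementation would resolve names through the charmap or
repertoiremap.

```python
import re

NAMED = {"SP": " "}   # assumed meaning of the <SP> symbolic name


def decode_symbolic(operand):
    """Replace each <name> in operand by the character it is assumed to denote."""
    def replace(match):
        name = match.group(1)
        if name in NAMED:
            return NAMED[name]
        if re.fullmatch(r"U[0-9A-Fa-f]{4,8}", name):
            return chr(int(name[1:], 16))
        if len(name) == 1:
            return name
        raise ValueError("unknown symbolic name: <%s>" % name)
    return re.sub(r"<([^<>]+)>", replace, operand)
```

For example, decode_symbolic("<%><a><SP><%><F><SP><%><T>")
gives "%a %F %T".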
 
 
4.7   LC_MESSAGES 
 
The LC_MESSAGES category shall define the format and values 
for affirmative and negative responses. The operands shall be 
strings or extended regular expressions; see ISO/IEC 9945-2 
clause 2.8.4. The following keywords shall be defined: 
 
copy         Specify the name of an existing FDCC-set to be used 
 as the source for the definition of this category. 
 If this keyword is specified, no other keyword 
 shall be specified. 
yesexpr      The operand shall consist of an extended regular 
 expression that describes the acceptable 
 affirmative response to a question expecting an 
 affirmative or negative response. 
noexpr       The operand shall consist of an extended regular 
 expression that describes the acceptable negative 
 response to a question expecting an affirmative or 
 negative response. 
 
The "i18n" LC_MESSAGES category is: 
 
 LC_MESSAGES 
 % This is the ISO/IEC 14652 "i18n" definition for 
 % the LC_MESSAGES category. 
 % 
 yesexpr "<U005B><+><1><U005D>" 
 noexpr  "<U005B><-><0><U005D>" 
 END LC_MESSAGES 
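
In plain characters the "i18n" operands above read "[+1]" and
"[-0]". The Python sketch below shows how a response might be
classified with them. It is an informal illustration: the
function name is an assumption, and Python's re module is used
as a stand-in for the extended regular expressions of ISO/IEC
9945-2, which it matches for these two simple bracket
expressions.

```python
import re

YESEXPR = "[+1]"   # plain form of "<U005B><+><1><U005D>"
NOEXPR = "[-0]"    # plain form of "<U005B><-><0><U005D>"


def classify_response(text):
    """Return "yes", "no", or None if the response matches neither."""
    if re.match(YESEXPR, text):
        return "yes"
    if re.match(NOEXPR, text):
        return "no"
    return None
```

With these operands, the responses "+" and "1" are affirmative,
"-" and "0" are negative, and any other first character is
neither.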
 
4.8   LC_PAPER 
 
The LC_PAPER category defines the paper size. The following 
keywords shall be defined: 
 
copy         Specify the name of an existing FDCC-set to be used 
 as the source for the definition of this category. 
 If this keyword is specified, no other keyword 
 shall be specified. 
height       Shall be used to specify the height of the paper. 
 The operand is an integer and the value is the 
 height measured in millimetres. 
width        Shall be used to specify the width of the paper. 
 The operand is an integer and the value is the 
 width measured in millimetres. 
 
The "i18n" LC_PAPER category is: 
 
 LC_PAPER 
 % This is the ISO/IEC 14652 "i18n" definition for 
 % the LC_PAPER category. 
 % 
 height   297 
 width    210 
 END LC_PAPER 
 
4.9   LC_NAME 
 
The LC_NAME category defines formats to be used in addressing 
a person, e.g. in a postal address or in a letter. The 
following keywords shall be defined: 
 
copy         Specify the name of an existing FDCC-set to be used 
 as the source for the definition of this category. 
 If this keyword is specified, no other keyword 
 shall be specified. 
name_fmt     Define the appropriate representation of a person's 
 name and title. The operand shall consist of a 
 string, and can contain any combination of 
 characters and field descriptors. In addition, the 
 string can contain escape sequences defined below. 
name_gen     The operand is a string defining a salutation valid 
 for all persons, for example the Japanese "-san" 
 salutation. 
name_mr      The operand is a string defining a salutation valid 
 for males. 
name_mrs     The operand is a string defining a salutation valid 
 for married females. 
name_miss    The operand is a string defining a salutation valid 
 for unmarried females. 
name_ms      The operand is a string defining a salutation valid 
 for all females. 
 
The LC_NAME category defines the interpretation of a number of 
escape sequences. The escape sequences are also available in 
the definitions with the following LC_NAME keywords: 
"name_fmt". 
 
Escape sequences for the "name_fmt" keyword: 
 
%f           Family names. 
%F           Family names in uppercase. 
%g           First given name. 
%G           First given initial. 
%l           First given name with Latin letters. 
%o           Other shorter name, e.g. "Bill". 
%m           Middle names. 
%M           Middle initial. 
%p           Profession. 
%s           Salutation, such as "Mr.". 
%S           Salutation, using the FDCC-set's conventions, with 
 1 for the name_gen, 2 for name_mr, 3 for name_mrs, 4 
 for name_miss, 5 for name_ms. 
%t           If the preceding escape sequence resulted in an 
 empty string, then the empty string, else a <space>. 
 
Each escape sequence may have an <R> after the <%> to specify 
that the information is taken from a Romanized version string 
of the entity. 
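
The behaviour of the %t escape sequence is easiest to see in a
small example. The Python sketch below expands a "name_fmt"
operand written in plain characters. It is an informal
illustration: the function name and the sample person are
assumptions, missing fields are treated as empty strings, and
the <R> modifier is not handled.

```python
def expand_name_fmt(fmt, fields):
    """Expand %x escape sequences using fields, a dict keyed by letter."""
    out = []
    previous = ""            # result of the most recent non-%t escape
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            key = fmt[i + 1]
            if key == "t":
                # A space only if the preceding escape produced text.
                out.append(" " if previous else "")
            else:
                previous = fields.get(key, "")
                out.append(previous)
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)
```

With the format "%p%t%g%t%m%t%f" and only a given name "Jane"
and a family name "Doe", the result is "Jane Doe": the empty
profession and middle names leave no stray spaces.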
 
The "i18n" LC_NAME category is: 
 
 LC_NAME 
 % This is the ISO/IEC 14652 "i18n" definition for 
 % the LC_NAME category. 
 % 
 name_fmt    "<%><p><%><t><%><g><%><t><%><m><%><t><%><f>" 
 END LC_NAME 
 
4.10   LC_ADDRESS 
 
The LC_ADDRESS category defines formats to be used in 
addressing a person, e.g. in a postal address or in a letter, 
and other items of geographic nature. All keywords are 
optional. The following keywords shall be defined: 
 
copy         Specify the name of an existing FDCC-set to be used 
 as the source for the definition of this category. 
 If this keyword is specified, no other keyword 
 shall be specified. 
postal_fmt   Define the appropriate representation of a postal 
 address such as street and city. The proper 
 formatting of a person's name and title is done 
 with the "name_fmt" keyword of the LC_NAME 
 category. The operand shall consist of a string, 
 and can contain any combination of characters and 
 field descriptors. In addition, the string can 
 contain escape sequences defined below. 
country_name     The operand is a string with the name of the 
 country in the language of the FDCC-set. 
country_post     The operand is a string with the abbreviation of 
 the country, used for postal addresses, 
 according to CEPT-MAILCODE. 
country_ab2      The operand is a string with the two-letter 
 abbreviation of the country, according to ISO 
 3166. 
country_ab3      The operand is a string with the three-letter 
 abbreviation of the country, according to ISO 
 3166. 
country_num      The operand is an integer with the three-digit 
 number of the country, according to ISO 3166. 
country_car      The operand is a string with the abbreviation of 
 the country, used for motor vehicles and 
 traffic, according to the Genève convention 
 1949:68. 
country_isbn     The operand is a string with the abbreviation of 
 the country, used for book numbering (ISBN), 
 according to ISO 2108. 
lang_name    The operand is a string with the name of the 
 language in the language of the FDCC-set. 
lang_ab      The operand is a string with the two-letter 
 abbreviation of the language, according to ISO 639. 
lang_term    The operand is a string with the three-letter 
 abbreviation of the language for terminology use, 
 according to ISO 639-2. 
lang_lib     The operand is a string with the three-letter 
 abbreviation of the language for library use, 
 according to ISO 639-2. If not specified, the value 
 of the "lang_term" keyword is taken. 
 
The LC_ADDRESS category defines the interpretation of a number 
of escape sequences. The escape sequences are also available 
in the definitions with the following LC_ADDRESS keywords: 
"postal_fmt". 
 
Escape sequences for the "postal_fmt" keyword: 
 
%a           C/O address. 
%f           Firm name. 
%d           Department name. 
%b           Building name. 
%s           Street name. 
%h           House number or designation. 
%N           If any graphical characters have been specified 
 then an end of line is made. 
%t           If the preceding escape sequence resulted in an 
 empty string, then the empty string, else a <space>. 
%r           Room number, door designation. 
%e           Floor number. 
%C           Country designation. 
%z           Zip number, postal code. 
%T           Town, city. 
%c           Country. 
 
Each escape sequence may have an <R> after the <%> to specify 
that the information is taken from a Romanized version string 
of the entity. 
 
The "i18n" LC_ADDRESS category is: 
 
 LC_ADDRESS 
 % This is the ISO/IEC 14652 "i18n" definition for 
 % the LC_ADDRESS category. 
 % 
 postal_fmt    "<%><a><%><N><%><f><%><N><%><d><%><N><%><b><%><N>/ 
 <%><s><SP><%><h><SP><%><e><SP><%><r><%><N>/ 
 <%><C><-><%><z><SP><%><T><%><N><%><c><%><N>" 
 END LC_ADDRESS 
 
 
4.11   LC_TELEPHONE 
 
The LC_TELEPHONE category defines formats to be used with 
telephone services. All keywords are optional. The following 
keywords shall be defined: 
 
copy         Specify the name of an existing FDCC-set to be used 
 as the source for the definition of this category. 
 If this keyword is specified, no other keyword 
 shall be specified. 
tel_int_fmt      Define the appropriate representation of a 
 telephone number for international use. The 
 operand shall consist of a string, and can 
 contain any combination of characters and field 
 descriptors. In addition, the string can contain 
 escape sequences defined below. 
tel_dom_fmt      Define the appropriate representation of a 
 telephone number for domestic use. The operand 
 shall consist of a string, and can contain any 
 combination of characters and field descriptors. 
 In addition, the string can contain escape 
 sequences defined below. 
int_select   The operand is a string with the digits used to 
 call international telephone numbers. 
int_prefix   The operand is a string with the prefix used from 
 other countries to call the area. 
 
The LC_TELEPHONE category defines the interpretation of a 
number of escape sequences. The escape sequences are also 
available in the definitions with the following LC_TELEPHONE 
keywords: "tel_int_fmt" and "tel_dom_fmt". 
 
%a           Area code without prefix (prefix is often <0>). 
%A           Area code including prefix (prefix is often <0>). 
%l           Local number. 
%c           Country code. 
 
The "i18n" LC_TELEPHONE category is: 
 
 LC_TELEPHONE 
 % This is the ISO/IEC 14652 "i18n" definition for 
 % the LC_TELEPHONE category. 
 % 
 tel_int_fmt    "<+><%><c><SP><%><a><SP><%><l>" 
 END LC_TELEPHONE 
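
In plain characters the "i18n" tel_int_fmt operand reads
"+%c %a %l". The Python sketch below expands such a format. It
is an informal illustration: the function name and the sample
number are assumptions, and missing parts are treated as empty
strings.

```python
import re


def expand_tel_fmt(fmt, parts):
    """Expand %a, %A, %l and %c using parts, a dict keyed by letter."""
    return re.sub(r"%([aAlc])",
                  lambda match: parts.get(match.group(1), ""),
                  fmt)
```

For example, with country code "45", area code "33" and local
number "123456", the format "+%c %a %l" expands to
"+45 33 123456".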
 
 
4.12   LC_MEASUREMENT 
 
The LC_MEASUREMENT category defines which measurement system 
is in use. All keywords are optional. The following keywords 
shall be defined: 
 
copy         Specify the name of an existing FDCC-set to be used 
 as the source for the definition of this category. 
 If this keyword is specified, no other keyword 
 shall be specified. 
measurement      Shall be used to define the measurement system 
 in use. The operand is an integer. The following 
 values are defined: 
 1 ISO 1000 
 2 U.S.A. measurement 
 3 other 
 
The "i18n" LC_MEASUREMENT category is: 
 
 LC_MEASUREMENT 
 % This is the ISO/IEC 14652 "i18n" definition for 
 % the LC_MEASUREMENT category. 
 % 
 measurement    1 
 END LC_MEASUREMENT 
 
4.13   LC_VERSIONS - Specification method of FDCC-sets 
 
The LC_VERSIONS category defines which specification methods 
have been used. All keywords are mandatory unless 
otherwise noted, and the operands are strings. The following 
keywords shall be defined: 
 
title        Title of the FDCC-set 
source       Organization name of provider of the source 
address      Organization postal address 
contact      Name of contact person 
email        Electronic mail address of the organization, or 
 contact person 
tel          Telephone number for the organization, in 
 international format. 
fax          Fax number for the organization, in international 
 format. 
language     Natural language, as specified in ISO 639 
territory    Territory, as two-letter form of ISO 3166 
audience     If not for general use, an indication of the 
 intended user audience. This keyword is optional. 
application      If for use of a special application, a 
 description of the application. This keyword is 
 optional. 
abbreviation     Short name for provider of the source. This 
 keyword is optional. 
revision     Revision number consisting of digits and zero or 
 more full stops ("."). 
date         Revision date in the format according to this 
 example: "1995-02-05" meaning the 5th of February, 
 1995. 
 
If any of the above information is non-existent, it must be 
stated in each case; the corresponding string is then the 
empty string. If required information is not present in ISO 
639 or ISO 3166, the relevant Maintenance Authority should be 
approached to get the needed item registered. 
 
category     Shall be used to define that a category is present 
 and what specification the category is claiming 
 conformance to. The first operand is a string that 
 describes the specification that the category is 
 claiming conformance to, and the following values 
 shall be defined: 
 i18n:1998 
 posix:1993 
 The second operand is a string with the category 
 name, where the category names of clause 4 shall be 
 defined. More than one "category" keyword may be 
 given, but only one per category name. 
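
The rule of one "category" keyword per category name can be checked mechanically. The Python sketch below is only an illustration: the function name is hypothetical, the operand syntax (two operands separated by a semicolon, without quotation marks) is taken from the "i18n" example in this subclause, and treating the two listed specification values as the only accepted ones is an assumption.

```python
# Specification values listed for the first operand (assumed exhaustive here).
KNOWN_SPECS = {"i18n:1998", "posix:1993"}


def parse_category_lines(lines):
    """Return {category_name: specification} from 'category' keyword lines."""
    result = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) != 2 or parts[0] != "category":
            continue  # not a "category" keyword line
        spec, _, name = parts[1].strip().partition(";")
        if spec not in KNOWN_SPECS:
            raise ValueError(f"unknown specification: {spec!r}")
        if not name:
            raise ValueError("missing category name")
        if name in result:
            raise ValueError(f"category {name!r} given more than once")
        result[name] = spec
    return result


print(parse_category_lines([
    " category  i18n:1998;LC_CTYPE",
    " category  i18n:1998;LC_TIME",
]))
```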
 
The "i18n" LC_VERSIONS category is: 
 
 LC_VERSIONS 
 % This is the ISO/IEC 14652 "i18n" definition for 
 % the LC_VERSIONS category. 
 % 
 title      "ISO/IEC 14652 i18n FDCC-set" 
 source     "ISO/IEC JTC1/SC22/WG20 - internationalization" 
 address    "C/o Keld Simonsen, Skt. Jorgens Alle 8, DK-1615 Kobenhavn V" 
 contact    "Keld Simonsen" 
 email      "keld@dkuug.dk" 
 tel        "+45 3122-6543" 
 fax        "+45 3325-6543" 
 language   "" 
 territory      "ISO" 
 revision   "1.0" 
 date       "1997-12-20" 
 % 
 category  i18n:1998;LC_VERSIONS 
 category  i18n:1998;LC_CTYPE 
 category  i18n:1998;LC_COLLATE 
 category  i18n:1998;LC_TIME 
 category  i18n:1998;LC_NUMERIC 
 category  i18n:1998;LC_MONETARY 
 category  i18n:1998;LC_MESSAGES 
 category  i18n:1998;LC_PAPER 
 category  i18n:1998;LC_NAME 
 category  i18n:1998;LC_ADDRESS 
 category  i18n:1998;LC_TELEPHONE 
 category  i18n:1998;LC_MEASUREMENT 
 
 END LC_VERSIONS 
 
 
5.  CHARMAP 
 
A character set description may exist for each coded character 
set supported by an application.  This text is referred to 
elsewhere in this standard as a charmap. 
 
A conforming charmap to be used with a FDCC-set shall support 
the portable character set specified in Table 3.  The table 
defines the characters in the portable character set and the 
corresponding symbolic character names used to identify each 
character in a character description text. 
 
 
Table 3: portable character set 
 
Symbolic name         Glyph           UCS        UCS name 
 
<NUL>                                 <U0000>    NULL (NUL) 
<alert>                               <U0007>    BELL (BEL) 
<backspace>                           <U0008>    BACKSPACE (BS) 
<tab>                                 <U0009>    CHARACTER TABULATION (HT) 
<carriage-return>                     <U000D>    CARRIAGE RETURN (CR) 
<newline>                             <U000A>    LINE FEED (LF) 
<vertical-tab>                        <U000B>    LINE TABULATION (VT) 
<form-feed>                           <U000C>    FORM FEED (FF) 
<space>                               <U0020>    SPACE 
<exclamation-mark>    !               <U0021>    EXCLAMATION MARK 
<quotation-mark>      "               <U0022>    QUOTATION MARK 
<number-sign>         #               <U0023>    NUMBER SIGN 
<dollar-sign>         $               <U0024>    DOLLAR SIGN 
<percent-sign>        %               <U0025>    PERCENT SIGN 
<ampersand>           &               <U0026>    AMPERSAND 
<apostrophe>          '               <U0027>    APOSTROPHE 
<left-parenthesis>    (               <U0028>    LEFT PARENTHESIS 
<right-parenthesis>   )               <U0029>    RIGHT PARENTHESIS 
<asterisk>            *               <U002A>    ASTERISK 
<plus-sign>           +               <U002B>    PLUS SIGN 
<comma>               ,               <U002C>    COMMA 
<hyphen-minus>        -               <U002D>    HYPHEN-MINUS 
<hyphen>              -               <U002D>    HYPHEN-MINUS 
<full-stop>           .               <U002E>    FULL STOP 
<period>              .               <U002E>    FULL STOP 
<slash>               /               <U002F>    SOLIDUS 
<solidus>             /               <U002F>    SOLIDUS 
<zero>                0               <U0030>    DIGIT ZERO 
<one>                 1               <U0031>    DIGIT ONE 
<two>                 2               <U0032>    DIGIT TWO 
<three>               3               <U0033>    DIGIT THREE 
<four>                4               <U0034>    DIGIT FOUR 
<five>                5               <U0035>    DIGIT FIVE 
<six>                 6               <U0036>    DIGIT SIX 
<seven>               7               <U0037>    DIGIT SEVEN 
<eight>               8               <U0038>    DIGIT EIGHT 
<nine>                9               <U0039>    DIGIT NINE 
<colon>               :               <U003A>    COLON 
<semicolon>           ;               <U003B>    SEMICOLON 
<less-than-sign>      <               <U003C>    LESS-THAN SIGN 
<equals-sign>         =               <U003D>    EQUALS SIGN 
<greater-than-sign>   >               <U003E>    GREATER-THAN SIGN 
<question-mark>       ?               <U003F>    QUESTION MARK 
<commercial-at>       @               <U0040>    COMMERCIAL AT 
<A>                   A               <U0041>    LATIN CAPITAL LETTER A 
<B>                   B               <U0042>    LATIN CAPITAL LETTER B 
<C>                   C               <U0043>    LATIN CAPITAL LETTER C 
<D>                   D               <U0044>    LATIN CAPITAL LETTER D 
<E>                   E               <U0045>    LATIN CAPITAL LETTER E 
<F>                   F               <U0046>    LATIN CAPITAL LETTER F 
<G>                   G               <U0047>    LATIN CAPITAL LETTER G 
<H>                   H               <U0048>    LATIN CAPITAL LETTER H 
<I>                   I               <U0049>    LATIN CAPITAL LETTER I 
<J>                   J               <U004A>    LATIN CAPITAL LETTER J 
<K>                   K               <U004B>    LATIN CAPITAL LETTER K 
<L>                   L               <U004C>    LATIN CAPITAL LETTER L 
<M>                   M               <U004D>    LATIN CAPITAL LETTER M 
<N>                   N               <U004E>    LATIN CAPITAL LETTER N 
<O>                   O               <U004F>    LATIN CAPITAL LETTER O 
<P>                   P               <U0050>    LATIN CAPITAL LETTER P 
<Q>                   Q               <U0051>    LATIN CAPITAL LETTER Q 
<R>                   R               <U0052>    LATIN CAPITAL LETTER R 
<S>                   S               <U0053>    LATIN CAPITAL LETTER S 
<T>                   T               <U0054>    LATIN CAPITAL LETTER T 
<U>                   U               <U0055>    LATIN CAPITAL LETTER U 
<V>                   V               <U0056>    LATIN CAPITAL LETTER V 
<W>                   W               <U0057>    LATIN CAPITAL LETTER W 
<X>                   X               <U0058>    LATIN CAPITAL LETTER X 
<Y>                   Y               <U0059>    LATIN CAPITAL LETTER Y 
<Z>                   Z               <U005A>    LATIN CAPITAL LETTER Z 
<left-square-bracket> [               <U005B>    LEFT SQUARE BRACKET 
<backslash>           \               <U005C>    REVERSE SOLIDUS 
<reverse-solidus>     \               <U005C>    REVERSE SOLIDUS 
<right-square-bracket> ]              <U005D>    RIGHT SQUARE BRACKET 
<circumflex-accent>   ^               <U005E>    CIRCUMFLEX ACCENT 
<circumflex>          ^               <U005E>    CIRCUMFLEX ACCENT 
<low-line>            _               <U005F>    LOW LINE 
<underscore>          _               <U005F>    LOW LINE 
<grave-accent>        `               <U0060>    GRAVE ACCENT 
<a>                   a               <U0061>    LATIN SMALL LETTER A 
<b>                   b               <U0062>    LATIN SMALL LETTER B 
<c>                   c               <U0063>    LATIN SMALL LETTER C 
<d>                   d               <U0064>    LATIN SMALL LETTER D 
<e>                   e               <U0065>    LATIN SMALL LETTER E 
<f>                   f               <U0066>    LATIN SMALL LETTER F 
<g>                   g               <U0067>    LATIN SMALL LETTER G 
<h>                   h               <U0068>    LATIN SMALL LETTER H 
<i>                   i               <U0069>    LATIN SMALL LETTER I 
<j>                   j               <U006A>    LATIN SMALL LETTER J 
<k>                   k               <U006B>    LATIN SMALL LETTER K 
<l>                   l               <U006C>    LATIN SMALL LETTER L 
<m>                   m               <U006D>    LATIN SMALL LETTER M 
<n>                   n               <U006E>    LATIN SMALL LETTER N 
<o>                   o               <U006F>    LATIN SMALL LETTER O 
<p>                   p               <U0070>    LATIN SMALL LETTER P 
<q>                   q               <U0071>    LATIN SMALL LETTER Q 
<r>                   r               <U0072>    LATIN SMALL LETTER R 
<s>                   s               <U0073>    LATIN SMALL LETTER S 
<t>                   t               <U0074>    LATIN SMALL LETTER T 
<u>                   u               <U0075>    LATIN SMALL LETTER U 
<v>                   v               <U0076>    LATIN SMALL LETTER V 
<w>                   w               <U0077>    LATIN SMALL LETTER W 
<x>                   x               <U0078>    LATIN SMALL LETTER X 
<y>                   y               <U0079>    LATIN SMALL LETTER Y 
<z>                   z               <U007A>    LATIN SMALL LETTER Z 
<left-brace>          {               <U007B>    LEFT CURLY BRACKET 
<left-curly-bracket>  {               <U007B>    LEFT CURLY BRACKET 
<vertical-line>       |               <U007C>    VERTICAL LINE 
<right-brace>         }               <U007D>    RIGHT CURLY BRACKET 
<right-curly-bracket> }               <U007D>    RIGHT CURLY BRACKET 
<tilde>               ~               <U007E>    TILDE 
 
This standard places only the following requirements on the 
encoded values of the characters in the portable character 
set: 
 
 (1)  The encoded values associated with each member of the 
portable character set shall be invariant across all FDCC-sets 
supported by the application. 
 
 (2)  The encoded values associated with the digits '0' to 
'9' shall be such that the value of each character after '0' 
shall be one greater than the value of the previous character. 
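
The two requirements can be expressed as simple checks. In the Python sketch below, an encoding is modelled as a mapping from characters to integer values; this representation and the function names are assumptions chosen for the example.

```python
def is_invariant(encodings, characters):
    """Requirement (1): every character has the same value in all encodings.

    'encodings' is a list of dicts mapping character -> encoded value
    (a hypothetical representation used only for this sketch).
    """
    return all(
        len({enc[ch] for enc in encodings}) == 1 for ch in characters
    )


def digits_are_consecutive(encoding):
    """Requirement (2): each digit after '0' is one greater than the previous."""
    values = [encoding[d] for d in "0123456789"]
    return all(b - a == 1 for a, b in zip(values, values[1:]))


ascii_like = {ch: ord(ch) for ch in "0123456789AZaz"}
print(digits_are_consecutive(ascii_like))          # True
print(is_invariant([ascii_like, dict(ascii_like)], "0123456789"))  # True
```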
 
Conforming charmaps shall specify certain character and 
character set attributes, as defined in 5.1. 
 
5.1   Character Set Description Text 
 
The character set description text (charmap) describes the 
mapping between symbolic character names and actual encoding 
of a coded character set. It is used to bind the symbolic 
character names in a FDCC-set to an actual encoding, so an 
application can process data in this encoding. 
 
The following declarations can precede the character 
definitions.  Each shall consist of the symbol shown in the 
following list, starting in column 1, including the 
surrounding brackets, followed by one or more "blank"s, 
followed by the value to be assigned to the symbol.  If any of 
the declarations are included, they shall be specified in the 
order shown in the following list: 
 
<code_set_name>      The name of the coded character set for 
 which the character set description text is 
 defined. The characters of the name shall be 
 taken from the set of characters with 
 visible glyphs defined in Table 3. 
 
<mb_cur_max>     The maximum number of bytes in a multibyte 
 character.  This shall default to 1. 
 
<mb_cur_min>     An unsigned positive integer value that shall 
 define the minimum number of bytes in a 
 character for the encoded character set. The 
 value shall be less than or equal to "mb_cur_max". If 
 not specified, the minimum number shall be equal 
 to "mb_cur_max". 
 
<escape_char>        The escape character used to indicate that 
 the characters following shall be 
 interpreted in a special way, as defined 
 later in this subclause. This shall default 
 to backslash (\). The character slash (/) is 
 used in all the following text and examples, 
 unless otherwise noted. 
 
<comment_char>       The character that when placed in column 1 
 of a charmap line, is used to indicate that 
 the line shall be ignored. The default 
 character shall be the number sign (#). The 
 character percent-sign (%) is used in all 
 the following text and examples, unless 
 otherwise noted. 
 
<repertoiremap>      The name of the repertoiremap used to define 
 the symbolic character names in the charmap. 
 The characters of the name shall be taken 
 from the set of characters with visible 
 glyphs defined in Table 3. 
 
<escseq>         Defines the escape sequences for ISO 2022 
 shifting for the coded character set defined by 
 the charmap. The semicolon-separated operands 
 are all strings with characters taken from the 
 set of characters with visible glyphs defined in 
 Table 3. The first operand defines the g-set or 
 c-set to be defined, and the following values 
 are defined: c0, c1, g0, g1, g2, g3. The second 
 operand defines which range of characters in the 
 charmap is affected, and the values defined 
 are: c0, c1, g0, g1. The third operand is the 
 escape sequence that is defined. 
 
<addset>         The name of the charmap to be added to the 
 current coded character set and to be selected by 
 the escape sequences defined by <escseq> of the 
 added charmap. 
 
<include>        Include the encoding of another charmap in the 
 current charmap. The semicolon-separated 
 operands are all strings with characters taken 
 from the set of characters with visible glyphs 
 defined in Table 3. The first operand defines 
 the g-set or c-set to be defined in the current 
 charmap, and the following values are defined: 
 c0, c1, g0, g1, g2, g3. The second operand 
 defines which range of characters in the 
 referenced charmap is included, and the values 
 defined are: c0, c1, g0, g1. The third operand 
 is the name of another charmap. 
 
The character set mapping definitions shall be all the lines 
immediately following an identifier line containing the string 
CHARMAP starting in column 1, and preceding a trailer line 
containing the string END CHARMAP starting in column 1.  Empty 
lines and lines containing a <comment_char> in the first 
column shall be ignored.  Each noncomment line of the 
character set mapping definition (i.e., between the CHARMAP 
and END CHARMAP lines of the text) shall be in one of the 
following formats. 
 
 "%s %s %s\n", <symbolic-name>,<encoding>,<comments> 
 
 "%s...%s %s %s\n", <symbolic-name>,<symbolic-name>,<encoding>,<comments> 
 
 "%s....%s %s %s\n", <symbolic-name>,<symbolic-name>,<encoding>,<comments> 
 
 "%s..%s %s %s\n", <symbolic-name>,<symbolic-name>,<encoding>,<comments> 
 
In the first format, the line of the character set mapping 
definition shall start with the symbolic name, immediately 
preceded by a <less-than> character and immediately followed 
by a <greater-than> character.  Symbolic names shall only 
contain characters from the set shown with a visible glyph in 
Table 3.  The <greater-than> character or the escape character 
can be included as part of the symbolic name by specifying it 
twice; for example, the sequence "<\\>>>" represents the 
symbolic name "\>". 
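
The doubling rule can be illustrated with a small reader for a symbolic name. This Python sketch is an informal reading of the paragraph above, not a normative algorithm; the function name is hypothetical and the default escape character is the backslash, as in the example.

```python
def parse_symbolic_name(text, escape_char="\\"):
    """Read a '<...>' symbolic name at the start of text.

    Returns (name, remaining_text). A doubled '>' or a doubled escape
    character inside the brackets stands for one such character.
    """
    if not text.startswith("<"):
        raise ValueError("symbolic name must start with '<'")
    name = []
    i = 1
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in (">", escape_char) and nxt == ch:
            name.append(ch)  # doubled character stands for itself
            i += 2
        elif ch == ">":
            return "".join(name), text[i + 1:]  # closing bracket
        else:
            name.append(ch)
            i += 1
    raise ValueError("unterminated symbolic name")


# The raw string below is the six characters < \ \ > > >
print(parse_symbolic_name(r"<\\>>>"))
```

Applied to the sequence from the text, the sketch yields the two-character name consisting of a backslash followed by a greater-than sign.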
 
The same symbolic name may occur several times, with different 
values. The first value is the one used when generating an 
encoding, while the other values are accepted in decoding. 
Symbolic names may be included to identify values that can 
overlap with each other or with the values of the symbolic 
names shown in Table 3.  It is possible to specify symbolic 
names for which no encoding exists in the encoded character 
set, by not specifying a value. 
 
In the second and third format (symbolic decimal ellipsis), 
the line in the character set mapping defines a range of one 
or more symbolic names. The difference between the second and 
the third format is the number of dots in the ellipsis: the 
second has 3 dots, the third has 4 dots. In these forms the 
symbolic names shall consist of zero or more nonnumeric 
characters from the set shown with visible glyphs in Table 3, 
followed by an integer formed by one or more decimal digits. 
The characters preceding the integer shall be identical in the 
two symbolic names, and the integer formed by the digits in 
the second symbolic name shall be identical to or greater than 
the integer formed by the digits in the first name. This shall 
be interpreted as a series of symbolic names formed from the 
common part and each of the integers in decimal format between 
the first and the second integer, inclusive, and with a length 
of the symbolic names generated that is equal to the length of 
the first (and also the second) symbolic name. As an example, 
<j0101>...<j0104> is interpreted as the symbolic names 
<j0101>, <j0102>, <j0103>, and <j0104>, in that order. 
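
The expansion of a symbolic decimal ellipsis can be sketched in Python as follows. The function name is hypothetical, and the names are passed without their surrounding angle brackets; both choices are assumptions made for the example.

```python
import re

# Zero or more nonnumeric characters followed by one or more decimal digits.
_DECIMAL_NAME = re.compile(r"([^0-9]*)([0-9]+)")


def expand_decimal_range(first, last):
    """Expand e.g. ('j0101', 'j0104') into the list of names in the range."""
    m1 = _DECIMAL_NAME.fullmatch(first)
    m2 = _DECIMAL_NAME.fullmatch(last)
    if not m1 or not m2:
        raise ValueError("names must end in decimal digits")
    prefix, lo_digits = m1.groups()
    prefix2, hi_digits = m2.groups()
    if prefix != prefix2 or len(lo_digits) != len(hi_digits):
        raise ValueError("names must share prefix and length")
    lo, hi = int(lo_digits), int(hi_digits)
    if hi < lo:
        raise ValueError("second integer must not be smaller than the first")
    width = len(lo_digits)  # generated names keep the original length
    return [f"{prefix}{n:0{width}d}" for n in range(lo, hi + 1)]


print(expand_decimal_range("j0101", "j0104"))
# ['j0101', 'j0102', 'j0103', 'j0104']
```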
 
In the fourth format (symbolic hexadecimal ellipsis, with two 
dots), the line in the character set mapping defines a range 
of one or more symbolic names. In this form the symbolic names 
shall consist of zero or more nonnumeric characters from the 
set shown with visible glyphs in Table 3, followed by an 
integer formed by one or more hexadecimal digits, using 
uppercase letters only for the range "A" to "F".  The 
characters preceding the hexadecimal integer shall be 
identical in the two symbolic names, and the integer formed by 
the hexadecimal digits in the second symbolic name shall be 
identical to or greater than the integer formed by the 
hexadecimal digits in the first name. This shall be 
interpreted as a series of symbolic names formed from the 
common part and each of the integers in hexadecimal format 
using uppercase letters only between the first and the second 
integer, inclusive, and with a length of the symbolic names 
generated that is equal to the length of the first (and also 
the second) symbolic name. As an example, <U010E>..<U0111> is 
interpreted as the symbolic names <U010E>, <U010F>, <U0110>, 
and <U0111>, in that order. 
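
A corresponding sketch for the symbolic hexadecimal ellipsis is given below. The text does not say how to separate the common part from the hexadecimal integer when the common part itself ends in the letters "A" to "F"; this example assumes that the longest trailing run of uppercase hexadecimal digits forms the integer. The function name is hypothetical and names are passed without angle brackets.

```python
import re

# Shortest possible prefix, then the longest trailing run of uppercase
# hexadecimal digits (an assumption, see the accompanying text).
_HEX_NAME = re.compile(r"(.*?)([0-9A-F]+)")


def expand_hex_range(first, last):
    """Expand e.g. ('U010E', 'U0111') into the list of names in the range."""
    m1 = _HEX_NAME.fullmatch(first)
    m2 = _HEX_NAME.fullmatch(last)
    if not m1 or not m2:
        raise ValueError("names must end in uppercase hexadecimal digits")
    prefix, lo_digits = m1.groups()
    prefix2, hi_digits = m2.groups()
    if prefix != prefix2 or len(lo_digits) != len(hi_digits):
        raise ValueError("names must share prefix and length")
    lo, hi = int(lo_digits, 16), int(hi_digits, 16)
    if hi < lo:
        raise ValueError("second integer must not be smaller than the first")
    width = len(lo_digits)
    return [f"{prefix}{n:0{width}X}" for n in range(lo, hi + 1)]


print(expand_hex_range("U010E", "U0111"))
# ['U010E', 'U010F', 'U0110', 'U0111']
```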
 
The encoding part shall be expressed as one (for single-byte 
values) or more concatenated decimal, octal or hexadecimal 
constants. Decimal constants shall be represented by two or 
three decimal digits, preceded by the escape character and the 
lowercase letter "d"; for example /d05, /d97, or /d143. 
Hexadecimal constants shall be represented by two hexadecimal 
digits, preceded by the escape character and the lowercase 
letter "x"; for example /x05, /x61, or /x8f.  Octal constants 
shall be represented by two or three octal digits, preceded by 
the escape character; for example /05, /141, or /217. In a 
charmap, each constant should represent an 8-bit byte for 
portability reasons. Applications supporting other byte sizes 
may allow constants to represent values larger than those that 
can be represented in 8-bit bytes, and may allow additional 
digits in constants. When constants are concatenated for 
multibyte character values, they may be of different types, 
and interpreted in byte order from the first to the last with 
the least significant byte of the multibyte character 
specified by the last byte. The manner in which these 
constants are represented in the character stored in the 
system is application defined. Omitting bytes from a multibyte 
character produces undefined results. 
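
The constant syntax described above can be read with a short Python sketch such as the following. It assumes 8-bit bytes and the slash as escape character, as used in the examples of this clause; the function name is hypothetical.

```python
import re


def parse_encoding(text, escape_char="/"):
    """Convert concatenated constants such as '/d129/d254' into bytes."""
    esc = re.escape(escape_char)
    token = re.compile(
        esc + r"(?:d([0-9]{2,3})"   # decimal: two or three digits after 'd'
        r"|x([0-9A-Fa-f]{2})"       # hexadecimal: two digits after 'x'
        r"|([0-7]{2,3}))"           # octal: two or three digits
    )
    out = bytearray()
    pos = 0
    while pos < len(text):
        m = token.match(text, pos)
        if not m:
            raise ValueError(f"bad constant at position {pos}")
        dec, hexa, octal = m.groups()
        if dec is not None:
            value = int(dec, 10)
        elif hexa is not None:
            value = int(hexa, 16)
        else:
            value = int(octal, 8)
        if value > 255:
            raise ValueError("constant does not fit in an 8-bit byte")
        out.append(value)
        pos = m.end()
    if not out:
        raise ValueError("empty encoding")
    return bytes(out)  # first constant is the first (most significant) byte


print(parse_encoding("/d129/d254"))  # b'\x81\xfe'
print(parse_encoding("/x61"), parse_encoding("/141"))  # b'a' b'a'
```

Constants of different types may be mixed, so "/x8f/217" yields two bytes with the same value.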
 
In lines defining ranges of symbolic names, the encoded value 
is the value for the first symbolic name in the range (the 
symbolic name preceding the ellipsis). Subsequent symbolic 
names defined by the range shall have encoding values in 
increasing order. For example the line 
 
 <j0101>...<j0104>         /d129/d254 
 
shall be interpreted as 
 
 <j0101>     /d129/d254 
 <j0102>     /d129/d255 
 <j0103>     /d130/d000 
 <j0104>     /d130/d001 
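
The example above is consistent with treating the byte sequence as one integer, most significant byte first, and adding one for each subsequent name. The Python sketch below follows that reading, which is an interpretation of the example rather than a rule stated in the text; the function name is hypothetical.

```python
def range_encodings(first_encoding, count):
    """Return 'count' encodings in increasing order, starting at first_encoding.

    The bytes are treated as a big-endian integer, so a carry out of the
    last byte increments the preceding byte (as in /d129/d255 -> /d130/d000).
    """
    width = len(first_encoding)
    start = int.from_bytes(first_encoding, "big")
    # to_bytes raises OverflowError if a value no longer fits in 'width' bytes.
    return [(start + i).to_bytes(width, "big") for i in range(count)]


for enc in range_encodings(bytes([129, 254]), 4):
    print("".join(f"/d{b:03d}" for b in enc))
# /d129/d254
# /d129/d255
# /d130/d000
# /d130/d001
```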
 
The comments parameter is optional. 
 
 
6   REPERTOIREMAP 
 
FDCC-set and Charmap sources may be specified in a coded 
character set independent way, using symbolic character names. 
The relation between the symbolic character names and 
characters may be specified via a Repertoiremap, which defines the 
repertoire of characters defined for a FDCC-set, and the 
symbolic character names and corresponding abstract character 
(by a reference to ISO/IEC 10646). 
 
The repertoire mapping is defined by specifying the symbolic 
character name and the ISO/IEC 10646 code position in 
hexadecimal form (with a preceding 'U') and optionally the 
long ISO/IEC 10646 character name in the following format: 
 
 "%s %s %s\n",<symbolic-name>,<10646-codepoint>,<comments> 
 
The symbolic character name and the ISO/IEC 10646 code 
position are each surrounded by angle brackets <>, and the 
fields shall be separated by one or more spaces or tabs on a 
line. If a right angle bracket or an escape character is used 
within a symbolic name, it shall be preceded by the escape 
character. 
 
The escape character can be redefined from the default reverse 
solidus (\) with the first line of the Repertoiremap 
containing the string "escape_char" followed by one or more 
spaces or tabs and then the escape character. 
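
A reader for repertoiremap lines can be sketched in Python as follows. The function names are hypothetical, code positions are returned as integers, and the handling of empty lines is an assumption made for this example.

```python
import re

_CODEPOINT = re.compile(r"[ \t]+<U([0-9A-F]{8}|[0-9A-F]{4})>[ \t]*(.*)")


def _read_name(line, escape):
    """Read '<name>' where the escape character quotes the next character."""
    if not line.startswith("<"):
        raise ValueError("line must start with a symbolic name")
    name = []
    i = 1
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            name.append(line[i + 1])  # escaped '>' or escape character
            i += 2
        elif ch == ">":
            return "".join(name), line[i + 1:]
        else:
            name.append(ch)
            i += 1
    raise ValueError("unterminated symbolic name")


def parse_repertoiremap(lines):
    """Return a list of (symbolic_name, code_position, comment) tuples."""
    escape = "\\"  # default escape character is the reverse solidus
    entries = []
    for number, raw in enumerate(lines):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skipping empty lines is an assumption of this sketch
        if number == 0 and line.startswith("escape_char"):
            escape = line.split()[1]
            continue
        name, rest = _read_name(line, escape)
        m = _CODEPOINT.match(rest)
        if not m:
            raise ValueError(f"bad code position in line {number + 1}")
        entries.append((name, int(m.group(1), 16), m.group(2).strip()))
    return entries


print(parse_repertoiremap([
    "escape_char /",
    "</>>    <U003E>      GREATER-THAN SIGN",
    "<//>    <U002F>      SOLIDUS",
]))
```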
 
Several symbolic character names can refer to the same 
abstract character, and are then used as synonyms in FDCC-sets 
and charmaps. The symbolic names <U0000>..<UFFFF> and 
<U00000000>..<U7FFFFFFF> (no lowercase letters) are 
predefined and refer to the corresponding code points of 
ISO/IEC 10646 with the same short identifier. 
 
The "i18nrep" repertoiremap is defined to accommodate prior 
art. The contents of the "i18nrep" repertoiremap are as 
follows: 
 
escape_char / 
<NUL>                <U0000>       NULL (NUL) 
<SOH>                <U0001>       START OF HEADING (SOH) 
<STX>                <U0002>       START OF TEXT (STX) 
<ETX>                <U0003>       END OF TEXT (ETX) 
<EOT>                <U0004>       END OF TRANSMISSION (EOT) 
<ENQ>                <U0005>       ENQUIRY (ENQ) 
<ACK>                <U0006>       ACKNOWLEDGE (ACK) 
<alert>              <U0007>       BELL (BEL) 
<BEL>                <U0007>       BELL (BEL) 
<backspace>          <U0008>       BACKSPACE (BS) 
<tab>                <U0009>       CHARACTER TABULATION (HT) 
<newline>            <U000A>       LINE FEED (LF) 
<vertical-tab>       <U000B>       LINE TABULATION (VT) 
<form-feed>          <U000C>       FORM FEED (FF) 
<carriage-return>    <U000D>       CARRIAGE RETURN (CR) 
<DLE>                <U0010>       DATALINK ESCAPE (DLE) 
<DC1>                <U0011>       DEVICE CONTROL ONE (DC1) 
<DC2>                <U0012>       DEVICE CONTROL TWO (DC2) 
<DC3>                <U0013>       DEVICE CONTROL THREE (DC3) 
<DC4>                <U0014>       DEVICE CONTROL FOUR (DC4) 
<NAK>                <U0015>       NEGATIVE ACKNOWLEDGE (NAK) 
<SYN>                <U0016>       SYNCHRONOUS IDLE (SYN) 
<ETB>                <U0017>       END OF TRANSMISSION BLOCK (ETB) 
<CAN>                <U0018>       CANCEL (CAN) 
<SUB>                <U001A>       SUBSTITUTE (SUB) 
<ESC>                <U001B>       ESCAPE (ESC) 
<IS4>                <U001C>       FILE SEPARATOR (IS4) 
<IS3>                <U001D>       GROUP SEPARATOR (IS3) 
<intro>              <U001D>       GROUP SEPARATOR (IS3) 
<IS2>                <U001E>       RECORD SEPARATOR (IS2) 
<IS1>                <U001F>       UNIT SEPARATOR (IS1) 
<DEL>                <U007F>       DELETE (DEL) 
<space>              <U0020>       SPACE 
<exclamation-mark>   <U0021>       EXCLAMATION MARK 
<quotation-mark>     <U0022>       QUOTATION MARK 
<number-sign>        <U0023>       NUMBER SIGN 
<dollar-sign>        <U0024>       DOLLAR SIGN 
<percent-sign>       <U0025>       PERCENT SIGN 
<ampersand>          <U0026>       AMPERSAND 
<apostrophe>         <U0027>       APOSTROPHE 
<left-parenthesis>   <U0028>       LEFT PARENTHESIS 
<right-parenthesis>  <U0029>       RIGHT PARENTHESIS 
<asterisk>           <U002A>       ASTERISK 
<plus-sign>          <U002B>       PLUS SIGN 
<comma>              <U002C>       COMMA 
<hyphen>             <U002D>       HYPHEN-MINUS 
<hyphen-minus>       <U002D>       HYPHEN-MINUS 
<period>             <U002E>       FULL STOP 
<full-stop>          <U002E>       FULL STOP 
<slash>              <U002F>       SOLIDUS 
<solidus>            <U002F>       SOLIDUS 
<zero>               <U0030>       DIGIT ZERO 
<one>                <U0031>       DIGIT ONE 
<two>                <U0032>       DIGIT TWO 
<three>              <U0033>       DIGIT THREE 
<four>               <U0034>       DIGIT FOUR 
<five>               <U0035>       DIGIT FIVE 
<six>                <U0036>       DIGIT SIX 
<seven>              <U0037>       DIGIT SEVEN 
<eight>              <U0038>       DIGIT EIGHT 
<nine>               <U0039>       DIGIT NINE 
<colon>              <U003A>       COLON 
<semicolon>          <U003B>       SEMICOLON 
<less-than-sign>     <U003C>       LESS-THAN SIGN 
<equals-sign>        <U003D>       EQUALS SIGN 
<greater-than-sign>  <U003E>       GREATER-THAN SIGN 
<question-mark>      <U003F>       QUESTION MARK 
<commercial-at>      <U0040>       COMMERCIAL AT 
<left-square-bracket>       <U005B>       LEFT SQUARE BRACKET 
<backslash>          <U005C>       REVERSE SOLIDUS 
<reverse-solidus>    <U005C>       REVERSE SOLIDUS 
<right-square-bracket>      <U005D>       RIGHT SQUARE BRACKET 
<circumflex>         <U005E>       CIRCUMFLEX ACCENT 
<circumflex-accent>  <U005E>       CIRCUMFLEX ACCENT 
<underscore>         <U005F>       LOW LINE 
<low-line>           <U005F>       LOW LINE 
<grave-accent>       <U0060>       GRAVE ACCENT 
<left-brace>         <U007B>       LEFT CURLY BRACKET 
<left-curly-bracket> <U007B>       LEFT CURLY BRACKET 
<vertical-line>      <U007C>       VERTICAL LINE 
<right-brace>        <U007D>       RIGHT CURLY BRACKET 
<right-curly-bracket>       <U007D>       RIGHT CURLY BRACKET 
<tilde>              <U007E>       TILDE 
<NU>    <U0000>      NULL (NUL) 
<SH>    <U0001>      START OF HEADING (SOH) 
<SX>    <U0002>      START OF TEXT (STX) 
<EX>    <U0003>      END OF TEXT (ETX) 
<ET>    <U0004>      END OF TRANSMISSION (EOT) 
<EQ>    <U0005>      ENQUIRY (ENQ) 
<AK>    <U0006>      ACKNOWLEDGE (ACK) 
<BL>    <U0007>      BELL (BEL) 
<BS>    <U0008>      BACKSPACE (BS) 
<HT>    <U0009>      CHARACTER TABULATION (HT) 
<LF>    <U000A>      LINE FEED (LF) 
<VT>    <U000B>      LINE TABULATION (VT) 
<FF>    <U000C>      FORM FEED (FF) 
<CR>    <U000D>      CARRIAGE RETURN (CR) 
<SO>    <U000E>      SHIFT OUT (SO) 
<SI>    <U000F>      SHIFT IN (SI) 
<DL>    <U0010>      DATALINK ESCAPE (DLE) 
<D1>    <U0011>      DEVICE CONTROL ONE (DC1) 
<D2>    <U0012>      DEVICE CONTROL TWO (DC2) 
<D3>    <U0013>      DEVICE CONTROL THREE (DC3) 
<D4>    <U0014>      DEVICE CONTROL FOUR (DC4) 
<NK>    <U0015>      NEGATIVE ACKNOWLEDGE (NAK) 
<SY>    <U0016>      SYNCHRONOUS IDLE (SYN) 
<EB>    <U0017>      END OF TRANSMISSION BLOCK (ETB) 
<CN>    <U0018>      CANCEL (CAN) 
<EM>    <U0019>      END OF MEDIUM (EM) 
<SB>    <U001A>      SUBSTITUTE (SUB) 
<EC>    <U001B>      ESCAPE (ESC) 
<FS>    <U001C>      FILE SEPARATOR (IS4) 
<GS>    <U001D>      GROUP SEPARATOR (IS3) 
<RS>    <U001E>      RECORD SEPARATOR (IS2) 
<US>    <U001F>      UNIT SEPARATOR (IS1) 
<DT>    <U007F>      DELETE (DEL) 
<PA>    <U0080>      PADDING CHARACTER (PAD) 
<HO>    <U0081>      HIGH OCTET PRESET (HOP) 
<BH>    <U0082>      BREAK PERMITTED HERE (BPH) 
<NH>    <U0083>      NO BREAK HERE (NBH) 
<IN>    <U0084>      INDEX (IND) 
<NL>    <U0085>      NEXT LINE (NEL) 
<SA>    <U0086>      START OF SELECTED AREA (SSA) 
<ES>    <U0087>      END OF SELECTED AREA (ESA) 
<HS>    <U0088>      CHARACTER TABULATION SET (HTS) 
<HJ>    <U0089>      CHARACTER TABULATION WITH JUSTIFICATION (HTJ) 
<VS>    <U008A>      LINE TABULATION SET (VTS) 
<PD>    <U008B>      PARTIAL LINE FORWARD (PLD) 
<PU>    <U008C>      PARTIAL LINE BACKWARD (PLU) 
<RI>    <U008D>      REVERSE LINE FEED (RI) 
<S2>    <U008E>      SINGLE-SHIFT TWO (SS2) 
<S3>    <U008F>      SINGLE-SHIFT THREE (SS3) 
<DC>    <U0090>      DEVICE CONTROL STRING (DCS) 
<P1>    <U0091>      PRIVATE USE ONE (PU1) 
<P2>    <U0092>      PRIVATE USE TWO (PU2) 
<TS>    <U0093>      SET TRANSMIT STATE (STS) 
<CC>    <U0094>      CANCEL CHARACTER (CCH) 
<MW>    <U0095>      MESSAGE WAITING (MW) 
<SG>    <U0096>      START OF GUARDED AREA (SPA) 
<EG>    <U0097>      END OF GUARDED AREA (EPA) 
<SS>    <U0098>      START OF STRING (SOS) 
<GC>    <U0099>      SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI) 
<SC>    <U009A>      SINGLE CHARACTER INTRODUCER (SCI) 
<CI>    <U009B>      CONTROL SEQUENCE INTRODUCER (CSI) 
<ST>    <U009C>      STRING TERMINATOR (ST) 
<OC>    <U009D>      OPERATING SYSTEM COMMAND (OSC) 
<PM>    <U009E>      PRIVACY MESSAGE (PM) 
<AC>    <U009F>      APPLICATION PROGRAM COMMAND (APC) 
<SP>    <U0020>      SPACE 
<!>     <U0021>      EXCLAMATION MARK 
<">     <U0022>      QUOTATION MARK 
<Nb>    <U0023>      NUMBER SIGN 
<DO>    <U0024>      DOLLAR SIGN 
<%>     <U0025>      PERCENT SIGN 
<&>     <U0026>      AMPERSAND 
<'>     <U0027>      APOSTROPHE 
<(>     <U0028>      LEFT PARENTHESIS 
<)>     <U0029>      RIGHT PARENTHESIS 
<*>     <U002A>      ASTERISK 
<+>     <U002B>      PLUS SIGN 
<,>     <U002C>      COMMA 
<->     <U002D>      HYPHEN-MINUS 
<.>     <U002E>      FULL STOP 
<//>    <U002F>      SOLIDUS 
<0>     <U0030>      DIGIT ZERO 
<1>     <U0031>      DIGIT ONE 
<2>     <U0032>      DIGIT TWO 
<3>     <U0033>      DIGIT THREE 
<4>     <U0034>      DIGIT FOUR 
<5>     <U0035>      DIGIT FIVE 
<6>     <U0036>      DIGIT SIX 
<7>     <U0037>      DIGIT SEVEN 
<8>     <U0038>      DIGIT EIGHT 
<9>     <U0039>      DIGIT NINE 
<:>     <U003A>      COLON 
<;>     <U003B>      SEMICOLON 
<<>     <U003C>      LESS-THAN SIGN 
<=>     <U003D>      EQUALS SIGN 
</>>    <U003E>      GREATER-THAN SIGN 
<?>     <U003F>      QUESTION MARK 
<At>    <U0040>      COMMERCIAL AT 
<A>     <U0041>      LATIN CAPITAL LETTER A 
<B>     <U0042>      LATIN CAPITAL LETTER B 
<C>     <U0043>      LATIN CAPITAL LETTER C 
<D>     <U0044>      LATIN CAPITAL LETTER D 
<E>     <U0045>      LATIN CAPITAL LETTER E 
<F>     <U0046>      LATIN CAPITAL LETTER F 
<G>     <U0047>      LATIN CAPITAL LETTER G 
<H>     <U0048>      LATIN CAPITAL LETTER H 
<I>     <U0049>      LATIN CAPITAL LETTER I 
<J>     <U004A>      LATIN CAPITAL LETTER J 
<K>     <U004B>      LATIN CAPITAL LETTER K 
<L>     <U004C>      LATIN CAPITAL LETTER L 
<M>     <U004D>      LATIN CAPITAL LETTER M 
<N>     <U004E>      LATIN CAPITAL LETTER N 
<O>     <U004F>      LATIN CAPITAL LETTER O 
<P>     <U0050>      LATIN CAPITAL LETTER P 
<Q>     <U0051>      LATIN CAPITAL LETTER Q 
<R>     <U0052>      LATIN CAPITAL LETTER R 
<S>     <U0053>      LATIN CAPITAL LETTER S 
<T>     <U0054>      LATIN CAPITAL LETTER T 
<U>     <U0055>      LATIN CAPITAL LETTER U 
<V>     <U0056>      LATIN CAPITAL LETTER V 
<W>     <U0057>      LATIN CAPITAL LETTER W 
<X>     <U0058>      LATIN CAPITAL LETTER X 
<Y>     <U0059>      LATIN CAPITAL LETTER Y 
<Z>     <U005A>      LATIN CAPITAL LETTER Z 
<<(>    <U005B>      LEFT SQUARE BRACKET 
<////>  <U005C>      REVERSE SOLIDUS 
<)/>>   <U005D>      RIGHT SQUARE BRACKET 
<'/>>   <U005E>      CIRCUMFLEX ACCENT 
<_>     <U005F>      LOW LINE 
<'!>    <U0060>      GRAVE ACCENT 
<a>     <U0061>      LATIN SMALL LETTER A 
<b>     <U0062>      LATIN SMALL LETTER B 
<c>     <U0063>      LATIN SMALL LETTER C 
<d>     <U0064>      LATIN SMALL LETTER D 
<e>     <U0065>      LATIN SMALL LETTER E 
<f>     <U0066>      LATIN SMALL LETTER F 
<g>     <U0067>      LATIN SMALL LETTER G 
<h>     <U0068>      LATIN SMALL LETTER H 
<i>     <U0069>      LATIN SMALL LETTER I 
<j>     <U006A>      LATIN SMALL LETTER J 
<k>     <U006B>      LATIN SMALL LETTER K 
<l>     <U006C>      LATIN SMALL LETTER L 
<m>     <U006D>      LATIN SMALL LETTER M 
<n>     <U006E>      LATIN SMALL LETTER N 
<o>     <U006F>      LATIN SMALL LETTER O 
<p>     <U0070>      LATIN SMALL LETTER P 
<q>     <U0071>      LATIN SMALL LETTER Q 
<r>     <U0072>      LATIN SMALL LETTER R 
<s>     <U0073>      LATIN SMALL LETTER S 
<t>     <U0074>      LATIN SMALL LETTER T 
<u>     <U0075>      LATIN SMALL LETTER U 
<v>     <U0076>      LATIN SMALL LETTER V 
<w>     <U0077>      LATIN SMALL LETTER W 
<x>     <U0078>      LATIN SMALL LETTER X 
<y>     <U0079>      LATIN SMALL LETTER Y 
<z>     <U007A>      LATIN SMALL LETTER Z 
<(!>    <U007B>      LEFT CURLY BRACKET 
<!!>    <U007C>      VERTICAL LINE 
<!)>    <U007D>      RIGHT CURLY BRACKET 
<'?>    <U007E>      TILDE 
<NS>    <U00A0>      NO-BREAK SPACE 
<!I>    <U00A1>      INVERTED EXCLAMATION MARK 
<Ct>    <U00A2>      CENT SIGN 
<Pd>    <U00A3>      POUND SIGN 
<Cu>    <U00A4>      CURRENCY SIGN 
<Ye>    <U00A5>      YEN SIGN 
<BB>    <U00A6>      BROKEN BAR 
<SE>    <U00A7>      SECTION SIGN 
<':>    <U00A8>      DIAERESIS 
<Co>    <U00A9>      COPYRIGHT SIGN 
<-a>    <U00AA>      FEMININE ORDINAL INDICATOR 
<<<>    <U00AB>      LEFT-POINTING DOUBLE ANGLE QUOTATION MARK 
<NO>    <U00AC>      NOT SIGN 
<-->    <U00AD>      SOFT HYPHEN 
<Rg>    <U00AE>      REGISTERED SIGN 
<'m>    <U00AF>      MACRON 
<DG>    <U00B0>      DEGREE SIGN 
<+->    <U00B1>      PLUS-MINUS SIGN 
<2S>    <U00B2>      SUPERSCRIPT TWO 
<3S>    <U00B3>      SUPERSCRIPT THREE 
<''>    <U00B4>      ACUTE ACCENT 
<My>    <U00B5>      MICRO SIGN 
<PI>    <U00B6>      PILCROW SIGN 
<.M>    <U00B7>      MIDDLE DOT 
<',>    <U00B8>      CEDILLA 
<1S>    <U00B9>      SUPERSCRIPT ONE 
<-o>    <U00BA>      MASCULINE ORDINAL INDICATOR 
</>/>>  <U00BB>      RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK 
<14>    <U00BC>      VULGAR FRACTION ONE QUARTER 
<12>    <U00BD>      VULGAR FRACTION ONE HALF 
<34>    <U00BE>      VULGAR FRACTION THREE QUARTERS 
<?I>    <U00BF>      INVERTED QUESTION MARK 
<A!>    <U00C0>      LATIN CAPITAL LETTER A WITH GRAVE 
<A'>    <U00C1>      LATIN CAPITAL LETTER A WITH ACUTE 
<A/>>   <U00C2>      LATIN CAPITAL LETTER A WITH CIRCUMFLEX 
<A?>    <U00C3>      LATIN CAPITAL LETTER A WITH TILDE 
<A:>    <U00C4>      LATIN CAPITAL LETTER A WITH DIAERESIS 
<AA>    <U00C5>      LATIN CAPITAL LETTER A WITH RING ABOVE 
<AE>    <U00C6>      LATIN CAPITAL LETTER AE (ash) 
<C,>    <U00C7>      LATIN CAPITAL LETTER C WITH CEDILLA 
<E!>    <U00C8>      LATIN CAPITAL LETTER E WITH GRAVE 
<E'>    <U00C9>      LATIN CAPITAL LETTER E WITH ACUTE 
<E/>>   <U00CA>      LATIN CAPITAL LETTER E WITH CIRCUMFLEX 
<E:>    <U00CB>      LATIN CAPITAL LETTER E WITH DIAERESIS 
<I!>    <U00CC>      LATIN CAPITAL LETTER I WITH GRAVE 
<I'>    <U00CD>      LATIN CAPITAL LETTER I WITH ACUTE 
<I/>>   <U00CE>      LATIN CAPITAL LETTER I WITH CIRCUMFLEX 
<I:>    <U00CF>      LATIN CAPITAL LETTER I WITH DIAERESIS 
<D->    <U00D0>      LATIN CAPITAL LETTER ETH (Icelandic) 
<N?>    <U00D1>      LATIN CAPITAL LETTER N WITH TILDE 
<O!>    <U00D2>      LATIN CAPITAL LETTER O WITH GRAVE 
<O'>    <U00D3>      LATIN CAPITAL LETTER O WITH ACUTE 
<O/>>   <U00D4>      LATIN CAPITAL LETTER O WITH CIRCUMFLEX 
<O?>    <U00D5>      LATIN CAPITAL LETTER O WITH TILDE 
<O:>    <U00D6>      LATIN CAPITAL LETTER O WITH DIAERESIS 
<*X>    <U00D7>      MULTIPLICATION SIGN 
<O//>   <U00D8>      LATIN CAPITAL LETTER O WITH STROKE 
<U!>    <U00D9>      LATIN CAPITAL LETTER U WITH GRAVE 
<U'>    <U00DA>      LATIN CAPITAL LETTER U WITH ACUTE 
<U/>>   <U00DB>      LATIN CAPITAL LETTER U WITH CIRCUMFLEX 
<U:>    <U00DC>      LATIN CAPITAL LETTER U WITH DIAERESIS 
<Y'>    <U00DD>      LATIN CAPITAL LETTER Y WITH ACUTE 
<TH>    <U00DE>      LATIN CAPITAL LETTER THORN (Icelandic) 
<ss>    <U00DF>      LATIN SMALL LETTER SHARP S (German) 
<a!>    <U00E0>      LATIN SMALL LETTER A WITH GRAVE 
<a'>    <U00E1>      LATIN SMALL LETTER A WITH ACUTE 
<a/>>   <U00E2>      LATIN SMALL LETTER A WITH CIRCUMFLEX 
<a?>    <U00E3>      LATIN SMALL LETTER A WITH TILDE 
<a:>    <U00E4>      LATIN SMALL LETTER A WITH DIAERESIS 
<aa>    <U00E5>      LATIN SMALL LETTER A WITH RING ABOVE 
<ae>    <U00E6>      LATIN SMALL LETTER AE (ash) 
<c,>    <U00E7>      LATIN SMALL LETTER C WITH CEDILLA 
<e!>    <U00E8>      LATIN SMALL LETTER E WITH GRAVE 
<e'>    <U00E9>      LATIN SMALL LETTER E WITH ACUTE 
<e/>>   <U00EA>      LATIN SMALL LETTER E WITH CIRCUMFLEX 
<e:>    <U00EB>      LATIN SMALL LETTER E WITH DIAERESIS 
<i!>    <U00EC>      LATIN SMALL LETTER I WITH GRAVE 
<i'>    <U00ED>      LATIN SMALL LETTER I WITH ACUTE 
<i/>>   <U00EE>      LATIN SMALL LETTER I WITH CIRCUMFLEX 
<i:>    <U00EF>      LATIN SMALL LETTER I WITH DIAERESIS 
<d->    <U00F0>      LATIN SMALL LETTER ETH (Icelandic) 
<n?>    <U00F1>      LATIN SMALL LETTER N WITH TILDE 
<o!>    <U00F2>      LATIN SMALL LETTER O WITH GRAVE 
<o'>    <U00F3>      LATIN SMALL LETTER O WITH ACUTE 
<o/>>   <U00F4>      LATIN SMALL LETTER O WITH CIRCUMFLEX 
<o?>    <U00F5>      LATIN SMALL LETTER O WITH TILDE 
<o:>    <U00F6>      LATIN SMALL LETTER O WITH DIAERESIS 
<-:>    <U00F7>      DIVISION SIGN 
<o//>   <U00F8>      LATIN SMALL LETTER O WITH STROKE 
<u!>    <U00F9>      LATIN SMALL LETTER U WITH GRAVE 
<u'>    <U00FA>      LATIN SMALL LETTER U WITH ACUTE 
<u/>>   <U00FB>      LATIN SMALL LETTER U WITH CIRCUMFLEX 
<u:>    <U00FC>      LATIN SMALL LETTER U WITH DIAERESIS 
<y'>    <U00FD>      LATIN SMALL LETTER Y WITH ACUTE 
<th>    <U00FE>      LATIN SMALL LETTER THORN (Icelandic) 
<y:>    <U00FF>      LATIN SMALL LETTER Y WITH DIAERESIS 
<A->    <U0100>      LATIN CAPITAL LETTER A WITH MACRON 
<a->    <U0101>      LATIN SMALL LETTER A WITH MACRON 
<A(>    <U0102>      LATIN CAPITAL LETTER A WITH BREVE 
<a(>    <U0103>      LATIN SMALL LETTER A WITH BREVE 
<A;>    <U0104>      LATIN CAPITAL LETTER A WITH OGONEK 
<a;>    <U0105>      LATIN SMALL LETTER A WITH OGONEK 
<C'>    <U0106>      LATIN CAPITAL LETTER C WITH ACUTE 
<c'>    <U0107>      LATIN SMALL LETTER C WITH ACUTE 
<C/>>   <U0108>      LATIN CAPITAL LETTER C WITH CIRCUMFLEX 
<c/>>   <U0109>      LATIN SMALL LETTER C WITH CIRCUMFLEX 
<C.>    <U010A>      LATIN CAPITAL LETTER C WITH DOT ABOVE 
<c.>    <U010B>      LATIN SMALL LETTER C WITH DOT ABOVE 
<C<>    <U010C>      LATIN CAPITAL LETTER C WITH CARON 
<c<>    <U010D>      LATIN SMALL LETTER C WITH CARON 
<D<>    <U010E>      LATIN CAPITAL LETTER D WITH CARON 
<d<>    <U010F>      LATIN SMALL LETTER D WITH CARON 
<D//>   <U0110>      LATIN CAPITAL LETTER D WITH STROKE 
<d//>   <U0111>      LATIN SMALL LETTER D WITH STROKE 
<E->    <U0112>      LATIN CAPITAL LETTER E WITH MACRON 
<e->    <U0113>      LATIN SMALL LETTER E WITH MACRON 
<E(>    <U0114>      LATIN CAPITAL LETTER E WITH BREVE 
<e(>    <U0115>      LATIN SMALL LETTER E WITH BREVE 
<E.>    <U0116>      LATIN CAPITAL LETTER E WITH DOT ABOVE 
<e.>    <U0117>      LATIN SMALL LETTER E WITH DOT ABOVE 
<E;>    <U0118>      LATIN CAPITAL LETTER E WITH OGONEK 
<e;>    <U0119>      LATIN SMALL LETTER E WITH OGONEK 
<E<>    <U011A>      LATIN CAPITAL LETTER E WITH CARON 
<e<>    <U011B>      LATIN SMALL LETTER E WITH CARON 
<G/>>   <U011C>      LATIN CAPITAL LETTER G WITH CIRCUMFLEX 
<g/>>   <U011D>      LATIN SMALL LETTER G WITH CIRCUMFLEX 
<G(>    <U011E>      LATIN CAPITAL LETTER G WITH BREVE 
<g(>    <U011F>      LATIN SMALL LETTER G WITH BREVE 
<G.>    <U0120>      LATIN CAPITAL LETTER G WITH DOT ABOVE 
<g.>    <U0121>      LATIN SMALL LETTER G WITH DOT ABOVE 
<G,>    <U0122>      LATIN CAPITAL LETTER G WITH CEDILLA 
<g,>    <U0123>      LATIN SMALL LETTER G WITH CEDILLA 
<H/>>   <U0124>      LATIN CAPITAL LETTER H WITH CIRCUMFLEX 
<h/>>   <U0125>      LATIN SMALL LETTER H WITH CIRCUMFLEX 
<H//>   <U0126>      LATIN CAPITAL LETTER H WITH STROKE 
<h//>   <U0127>      LATIN SMALL LETTER H WITH STROKE 
<I?>    <U0128>      LATIN CAPITAL LETTER I WITH TILDE 
<i?>    <U0129>      LATIN SMALL LETTER I WITH TILDE 
<I->    <U012A>      LATIN CAPITAL LETTER I WITH MACRON 
<i->    <U012B>      LATIN SMALL LETTER I WITH MACRON 
<I(>    <U012C>      LATIN CAPITAL LETTER I WITH BREVE 
<i(>    <U012D>      LATIN SMALL LETTER I WITH BREVE 
<I;>    <U012E>      LATIN CAPITAL LETTER I WITH OGONEK 
<i;>    <U012F>      LATIN SMALL LETTER I WITH OGONEK 
<I.>    <U0130>      LATIN CAPITAL LETTER I WITH DOT ABOVE 
<i.>    <U0131>      LATIN SMALL LETTER DOTLESS I 
<IJ>    <U0132>      LATIN CAPITAL LIGATURE IJ 
<ij>    <U0133>      LATIN SMALL LIGATURE IJ 
<J/>>   <U0134>      LATIN CAPITAL LETTER J WITH CIRCUMFLEX 
<j/>>   <U0135>      LATIN SMALL LETTER J WITH CIRCUMFLEX 
<K,>    <U0136>      LATIN CAPITAL LETTER K WITH CEDILLA 
<k,>    <U0137>      LATIN SMALL LETTER K WITH CEDILLA 
<kk>    <U0138>      LATIN SMALL LETTER KRA (Greenlandic) 
<L'>    <U0139>      LATIN CAPITAL LETTER L WITH ACUTE 
<l'>    <U013A>      LATIN SMALL LETTER L WITH ACUTE 
<L,>    <U013B>      LATIN CAPITAL LETTER L WITH CEDILLA 
<l,>    <U013C>      LATIN SMALL LETTER L WITH CEDILLA 
<L<>    <U013D>      LATIN CAPITAL LETTER L WITH CARON 
<l<>    <U013E>      LATIN SMALL LETTER L WITH CARON 
<L.>    <U013F>      LATIN CAPITAL LETTER L WITH MIDDLE DOT 
<l.>    <U0140>      LATIN SMALL LETTER L WITH MIDDLE DOT 
<L//>   <U0141>      LATIN CAPITAL LETTER L WITH STROKE 
<l//>   <U0142>      LATIN SMALL LETTER L WITH STROKE 
<N'>    <U0143>      LATIN CAPITAL LETTER N WITH ACUTE 
<n'>    <U0144>      LATIN SMALL LETTER N WITH ACUTE 
<N,>    <U0145>      LATIN CAPITAL LETTER N WITH CEDILLA 
<n,>    <U0146>      LATIN SMALL LETTER N WITH CEDILLA 
<N<>    <U0147>      LATIN CAPITAL LETTER N WITH CARON 
<n<>    <U0148>      LATIN SMALL LETTER N WITH CARON 
<'n>    <U0149>      LATIN SMALL LETTER N PRECEDED BY APOSTROPHE 
<NG>    <U014A>      LATIN CAPITAL LETTER ENG (Sami) 
<ng>    <U014B>      LATIN SMALL LETTER ENG (Sami) 
<O->    <U014C>      LATIN CAPITAL LETTER O WITH MACRON 
<o->    <U014D>      LATIN SMALL LETTER O WITH MACRON 
<O(>    <U014E>      LATIN CAPITAL LETTER O WITH BREVE 
<o(>    <U014F>      LATIN SMALL LETTER O WITH BREVE 
<O">    <U0150>      LATIN CAPITAL LETTER O WITH DOUBLE ACUTE 
<o">    <U0151>      LATIN SMALL LETTER O WITH DOUBLE ACUTE 
<OE>    <U0152>      LATIN CAPITAL LIGATURE OE 
<oe>    <U0153>      LATIN SMALL LIGATURE OE 
<R'>    <U0154>      LATIN CAPITAL LETTER R WITH ACUTE 
<r'>    <U0155>      LATIN SMALL LETTER R WITH ACUTE 
<R,>    <U0156>      LATIN CAPITAL LETTER R WITH CEDILLA 
<r,>    <U0157>      LATIN SMALL LETTER R WITH CEDILLA 
<R<>    <U0158>      LATIN CAPITAL LETTER R WITH CARON 
<r<>    <U0159>      LATIN SMALL LETTER R WITH CARON 
<S'>    <U015A>      LATIN CAPITAL LETTER S WITH ACUTE 
<s'>    <U015B>      LATIN SMALL LETTER S WITH ACUTE 
<S/>>   <U015C>      LATIN CAPITAL LETTER S WITH CIRCUMFLEX 
<s/>>   <U015D>      LATIN SMALL LETTER S WITH CIRCUMFLEX 
<S,>    <U015E>      LATIN CAPITAL LETTER S WITH CEDILLA 
<s,>    <U015F>      LATIN SMALL LETTER S WITH CEDILLA 
<S<>    <U0160>      LATIN CAPITAL LETTER S WITH CARON 
<s<>    <U0161>      LATIN SMALL LETTER S WITH CARON 
<T,>    <U0162>      LATIN CAPITAL LETTER T WITH CEDILLA 
<t,>    <U0163>      LATIN SMALL LETTER T WITH CEDILLA 
<T<>    <U0164>      LATIN CAPITAL LETTER T WITH CARON 
<t<>    <U0165>      LATIN SMALL LETTER T WITH CARON 
<T//>   <U0166>      LATIN CAPITAL LETTER T WITH STROKE 
<t//>   <U0167>      LATIN SMALL LETTER T WITH STROKE 
<U?>    <U0168>      LATIN CAPITAL LETTER U WITH TILDE 
<u?>    <U0169>      LATIN SMALL LETTER U WITH TILDE 
<U->    <U016A>      LATIN CAPITAL LETTER U WITH MACRON 
<u->    <U016B>      LATIN SMALL LETTER U WITH MACRON 
<U(>    <U016C>      LATIN CAPITAL LETTER U WITH BREVE 
<u(>    <U016D>      LATIN SMALL LETTER U WITH BREVE 
<U0>    <U016E>      LATIN CAPITAL LETTER U WITH RING ABOVE 
<u0>    <U016F>      LATIN SMALL LETTER U WITH RING ABOVE 
<U">    <U0170>      LATIN CAPITAL LETTER U WITH DOUBLE ACUTE 
<u">    <U0171>      LATIN SMALL LETTER U WITH DOUBLE ACUTE 
<U;>    <U0172>      LATIN CAPITAL LETTER U WITH OGONEK 
<u;>    <U0173>      LATIN SMALL LETTER U WITH OGONEK 
<W/>>   <U0174>      LATIN CAPITAL LETTER W WITH CIRCUMFLEX 
<w/>>   <U0175>      LATIN SMALL LETTER W WITH CIRCUMFLEX 
<Y/>>   <U0176>      LATIN CAPITAL LETTER Y WITH CIRCUMFLEX 
<y/>>   <U0177>      LATIN SMALL LETTER Y WITH CIRCUMFLEX 
<Y:>    <U0178>      LATIN CAPITAL LETTER Y WITH DIAERESIS 
<Z'>    <U0179>      LATIN CAPITAL LETTER Z WITH ACUTE 
<z'>    <U017A>      LATIN SMALL LETTER Z WITH ACUTE 
<Z.>    <U017B>      LATIN CAPITAL LETTER Z WITH DOT ABOVE 
<z.>    <U017C>      LATIN SMALL LETTER Z WITH DOT ABOVE 
<Z<>    <U017D>      LATIN CAPITAL LETTER Z WITH CARON 
<z<>    <U017E>      LATIN SMALL LETTER Z WITH CARON 
<s1>    <U017F>      LATIN SMALL LETTER LONG S 
<b//>   <U0180>      LATIN SMALL LETTER B WITH STROKE 
<B2>    <U0181>      LATIN CAPITAL LETTER B WITH HOOK 
<C2>    <U0187>      LATIN CAPITAL LETTER C WITH HOOK 
<c2>    <U0188>      LATIN SMALL LETTER C WITH HOOK 
<F2>    <U0191>      LATIN CAPITAL LETTER F WITH HOOK 
<f2>    <U0192>      LATIN SMALL LETTER F WITH HOOK 
<K2>    <U0198>      LATIN CAPITAL LETTER K WITH HOOK 
<k2>    <U0199>      LATIN SMALL LETTER K WITH HOOK 
<O9>    <U01A0>      LATIN CAPITAL LETTER O WITH HORN 
<o9>    <U01A1>      LATIN SMALL LETTER O WITH HORN 
<OI>    <U01A2>      LATIN CAPITAL LETTER OI 
<oi>    <U01A3>      LATIN SMALL LETTER OI 
<yr>    <U01A6>      LATIN LETTER YR 
<U9>    <U01AF>      LATIN CAPITAL LETTER U WITH HORN 
<u9>    <U01B0>      LATIN SMALL LETTER U WITH HORN 
<Z//>   <U01B5>      LATIN CAPITAL LETTER Z WITH STROKE 
<z//>   <U01B6>      LATIN SMALL LETTER Z WITH STROKE 
<ED>    <U01B7>      LATIN CAPITAL LETTER EZH 
<DZ<>   <U01C4>      LATIN CAPITAL LETTER DZ WITH CARON 
<Dz<>   <U01C5>      LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON 
<dz<>   <U01C6>      LATIN SMALL LETTER DZ WITH CARON 
<LJ3>   <U01C7>      LATIN CAPITAL LETTER LJ 
<Lj3>   <U01C8>      LATIN CAPITAL LETTER L WITH SMALL LETTER J 
<lj3>   <U01C9>      LATIN SMALL LETTER LJ 
<NJ3>   <U01CA>      LATIN CAPITAL LETTER NJ 
<Nj3>   <U01CB>      LATIN CAPITAL LETTER N WITH SMALL LETTER J 
<nj3>   <U01CC>      LATIN SMALL LETTER NJ 
<A<>    <U01CD>      LATIN CAPITAL LETTER A WITH CARON 
<a<>    <U01CE>      LATIN SMALL LETTER A WITH CARON 
<I<>    <U01CF>      LATIN CAPITAL LETTER I WITH CARON 
<i<>    <U01D0>      LATIN SMALL LETTER I WITH CARON 
<O<>    <U01D1>      LATIN CAPITAL LETTER O WITH CARON 
<o<>    <U01D2>      LATIN SMALL LETTER O WITH CARON 
<U<>    <U01D3>      LATIN CAPITAL LETTER U WITH CARON 
<u<>    <U01D4>      LATIN SMALL LETTER U WITH CARON 
<U:->   <U01D5>      LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON 
<u:->   <U01D6>      LATIN SMALL LETTER U WITH DIAERESIS AND MACRON 
<U:'>   <U01D7>      LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE 
<u:'>   <U01D8>      LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE 
<U:<>   <U01D9>      LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON 
<u:<>   <U01DA>      LATIN SMALL LETTER U WITH DIAERESIS AND CARON 
<U:!>   <U01DB>      LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE 
<u:!>   <U01DC>      LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE 
<e1>    <U01DD>      LATIN SMALL LETTER TURNED E 
<A1>    <U01DE>      LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON 
<a1>    <U01DF>      LATIN SMALL LETTER A WITH DIAERESIS AND MACRON 
<A7>    <U01E0>      LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON 
<a7>    <U01E1>      LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON 
<A3>    <U01E2>      LATIN CAPITAL LETTER AE WITH MACRON (ash) 
<a3>    <U01E3>      LATIN SMALL LETTER AE WITH MACRON (ash) 
<G//>   <U01E4>      LATIN CAPITAL LETTER G WITH STROKE 
<g//>   <U01E5>      LATIN SMALL LETTER G WITH STROKE 
<G<>    <U01E6>      LATIN CAPITAL LETTER G WITH CARON 
<g<>    <U01E7>      LATIN SMALL LETTER G WITH CARON 
<K<>    <U01E8>      LATIN CAPITAL LETTER K WITH CARON 
<k<>    <U01E9>      LATIN SMALL LETTER K WITH CARON 
<O;>    <U01EA>      LATIN CAPITAL LETTER O WITH OGONEK 
<o;>    <U01EB>      LATIN SMALL LETTER O WITH OGONEK 
<O1>    <U01EC>      LATIN CAPITAL LETTER O WITH OGONEK AND MACRON 
<o1>    <U01ED>      LATIN SMALL LETTER O WITH OGONEK AND MACRON 
<EZ>    <U01EE>      LATIN CAPITAL LETTER EZH WITH CARON 
<ez>    <U01EF>      LATIN SMALL LETTER EZH WITH CARON 
<j<>    <U01F0>      LATIN SMALL LETTER J WITH CARON 
<DZ3>   <U01F1>      LATIN CAPITAL LETTER DZ 
<Dz3>   <U01F2>      LATIN CAPITAL LETTER D WITH SMALL LETTER Z 
<dz3>   <U01F3>      LATIN SMALL LETTER DZ 
<G'>    <U01F4>      LATIN CAPITAL LETTER G WITH ACUTE 
<g'>    <U01F5>      LATIN SMALL LETTER G WITH ACUTE 
<AA'>   <U01FA>      LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE 
<aa'>   <U01FB>      LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE 
<AE'>   <U01FC>      LATIN CAPITAL LETTER AE WITH ACUTE (ash) 
<ae'>   <U01FD>      LATIN SMALL LETTER AE WITH ACUTE (ash) 
<O//'>  <U01FE>      LATIN CAPITAL LETTER O WITH STROKE AND ACUTE 
<o//'>  <U01FF>      LATIN SMALL LETTER O WITH STROKE AND ACUTE 
<A!!>   <U0200>      LATIN CAPITAL LETTER A WITH DOUBLE GRAVE 
<a!!>   <U0201>      LATIN SMALL LETTER A WITH DOUBLE GRAVE 
<A)>    <U0202>      LATIN CAPITAL LETTER A WITH INVERTED BREVE 
<a)>    <U0203>      LATIN SMALL LETTER A WITH INVERTED BREVE 
<E!!>   <U0204>      LATIN CAPITAL LETTER E WITH DOUBLE GRAVE 
<e!!>   <U0205>      LATIN SMALL LETTER E WITH DOUBLE GRAVE 
<E)>    <U0206>      LATIN CAPITAL LETTER E WITH INVERTED BREVE 
<e)>    <U0207>      LATIN SMALL LETTER E WITH INVERTED BREVE 
<I!!>   <U0208>      LATIN CAPITAL LETTER I WITH DOUBLE GRAVE 
<i!!>   <U0209>      LATIN SMALL LETTER I WITH DOUBLE GRAVE 
<I)>    <U020A>      LATIN CAPITAL LETTER I WITH INVERTED BREVE 
<i)>    <U020B>      LATIN SMALL LETTER I WITH INVERTED BREVE 
<O!!>   <U020C>      LATIN CAPITAL LETTER O WITH DOUBLE GRAVE 
<o!!>   <U020D>      LATIN SMALL LETTER O WITH DOUBLE GRAVE 
<O)>    <U020E>      LATIN CAPITAL LETTER O WITH INVERTED BREVE 
<o)>    <U020F>      LATIN SMALL LETTER O WITH INVERTED BREVE 
<R!!>   <U0210>      LATIN CAPITAL LETTER R WITH DOUBLE GRAVE 
<r!!>   <U0211>      LATIN SMALL LETTER R WITH DOUBLE GRAVE 
<R)>    <U0212>      LATIN CAPITAL LETTER R WITH INVERTED BREVE 
<r)>    <U0213>      LATIN SMALL LETTER R WITH INVERTED BREVE 
<U!!>   <U0214>      LATIN CAPITAL LETTER U WITH DOUBLE GRAVE 
<u!!>   <U0215>      LATIN SMALL LETTER U WITH DOUBLE GRAVE 
<U)>    <U0216>      LATIN CAPITAL LETTER U WITH INVERTED BREVE 
<u)>    <U0217>      LATIN SMALL LETTER U WITH INVERTED BREVE 
<r1>    <U027C>      LATIN SMALL LETTER R WITH LONG LEG 
<ed>    <U0292>      LATIN SMALL LETTER EZH 
<;S>    <U02BB>      MODIFIER LETTER TURNED COMMA 
<1/>>   <U02C6>      MODIFIER LETTER CIRCUMFLEX ACCENT 
<'<>    <U02C7>      CARON (Mandarin Chinese third tone) 
<1->    <U02C9>      MODIFIER LETTER MACRON (Mandarin Chinese first tone) 
<1!>    <U02CB>      MODIFIER LETTER GRAVE ACCENT (Mandarin Chinese fourth tone) 
<'(>    <U02D8>      BREVE 
<'.>    <U02D9>      DOT ABOVE (Mandarin Chinese light tone) 
<'0>    <U02DA>      RING ABOVE 
<';>    <U02DB>      OGONEK 
<1?>    <U02DC>      SMALL TILDE 
<'">    <U02DD>      DOUBLE ACUTE ACCENT 
<'G>    <U0374>      GREEK NUMERAL SIGN (Dexia keraia) 
<,G>    <U0375>      GREEK LOWER NUMERAL SIGN (Aristeri keraia) 
<j3>    <U037A>      GREEK YPOGEGRAMMENI 
<?%>    <U037E>      GREEK QUESTION MARK (Erotimatiko) 
<'*>    <U0384>      GREEK TONOS 
<'%>    <U0385>      GREEK DIALYTIKA TONOS 
<A%>    <U0386>      GREEK CAPITAL LETTER ALPHA WITH TONOS 
<.*>    <U0387>      GREEK ANO TELEIA 
<E%>    <U0388>      GREEK CAPITAL LETTER EPSILON WITH TONOS 
<Y%>    <U0389>      GREEK CAPITAL LETTER ETA WITH TONOS 
<I%>    <U038A>      GREEK CAPITAL LETTER IOTA WITH TONOS 
<O%>    <U038C>      GREEK CAPITAL LETTER OMICRON WITH TONOS 
<U%>    <U038E>      GREEK CAPITAL LETTER UPSILON WITH TONOS 
<W%>    <U038F>      GREEK CAPITAL LETTER OMEGA WITH TONOS 
<i3>    <U0390>      GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS 
<A*>    <U0391>      GREEK CAPITAL LETTER ALPHA 
<B*>    <U0392>      GREEK CAPITAL LETTER BETA 
<G*>    <U0393>      GREEK CAPITAL LETTER GAMMA 
<D*>    <U0394>      GREEK CAPITAL LETTER DELTA 
<E*>    <U0395>      GREEK CAPITAL LETTER EPSILON 
<Z*>    <U0396>      GREEK CAPITAL LETTER ZETA 
<Y*>    <U0397>      GREEK CAPITAL LETTER ETA 
<H*>    <U0398>      GREEK CAPITAL LETTER THETA 
<I*>    <U0399>      GREEK CAPITAL LETTER IOTA 
<K*>    <U039A>      GREEK CAPITAL LETTER KAPPA 
<L*>    <U039B>      GREEK CAPITAL LETTER LAMDA 
<M*>    <U039C>      GREEK CAPITAL LETTER MU 
<N*>    <U039D>      GREEK CAPITAL LETTER NU 
<C*>    <U039E>      GREEK CAPITAL LETTER XI 
<O*>    <U039F>      GREEK CAPITAL LETTER OMICRON 
<P*>    <U03A0>      GREEK CAPITAL LETTER PI 
<R*>    <U03A1>      GREEK CAPITAL LETTER RHO 
<S*>    <U03A3>      GREEK CAPITAL LETTER SIGMA 
<T*>    <U03A4>      GREEK CAPITAL LETTER TAU 
<U*>    <U03A5>      GREEK CAPITAL LETTER UPSILON 
<F*>    <U03A6>      GREEK CAPITAL LETTER PHI 
<X*>    <U03A7>      GREEK CAPITAL LETTER CHI 
<Q*>    <U03A8>      GREEK CAPITAL LETTER PSI 
<W*>    <U03A9>      GREEK CAPITAL LETTER OMEGA 
<J*>    <U03AA>      GREEK CAPITAL LETTER IOTA WITH DIALYTIKA 
<V*>    <U03AB>      GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA 
<a%>    <U03AC>      GREEK SMALL LETTER ALPHA WITH TONOS 
<e%>    <U03AD>      GREEK SMALL LETTER EPSILON WITH TONOS 
<y%>    <U03AE>      GREEK SMALL LETTER ETA WITH TONOS 
<i%>    <U03AF>      GREEK SMALL LETTER IOTA WITH TONOS 
<u3>    <U03B0>      GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS 
<a*>    <U03B1>      GREEK SMALL LETTER ALPHA 
<b*>    <U03B2>      GREEK SMALL LETTER BETA 
<g*>    <U03B3>      GREEK SMALL LETTER GAMMA 
<d*>    <U03B4>      GREEK SMALL LETTER DELTA 
<e*>    <U03B5>      GREEK SMALL LETTER EPSILON 
<z*>    <U03B6>      GREEK SMALL LETTER ZETA 
<y*>    <U03B7>      GREEK SMALL LETTER ETA 
<h*>    <U03B8>      GREEK SMALL LETTER THETA 
<i*>    <U03B9>      GREEK SMALL LETTER IOTA 
<k*>    <U03BA>      GREEK SMALL LETTER KAPPA 
<l*>    <U03BB>      GREEK SMALL LETTER LAMDA 
<m*>    <U03BC>      GREEK SMALL LETTER MU 
<n*>    <U03BD>      GREEK SMALL LETTER NU 
<c*>    <U03BE>      GREEK SMALL LETTER XI 
<o*>    <U03BF>      GREEK SMALL LETTER OMICRON 
<p*>    <U03C0>      GREEK SMALL LETTER PI 
<r*>    <U03C1>      GREEK SMALL LETTER RHO 
<*s>    <U03C2>      GREEK SMALL LETTER FINAL SIGMA 
<s*>    <U03C3>      GREEK SMALL LETTER SIGMA 
<t*>    <U03C4>      GREEK SMALL LETTER TAU 
<u*>    <U03C5>      GREEK SMALL LETTER UPSILON 
<f*>    <U03C6>      GREEK SMALL LETTER PHI 
<x*>    <U03C7>      GREEK SMALL LETTER CHI 
<q*>    <U03C8>      GREEK SMALL LETTER PSI 
<w*>    <U03C9>      GREEK SMALL LETTER OMEGA 
<j*>    <U03CA>      GREEK SMALL LETTER IOTA WITH DIALYTIKA 
<v*>    <U03CB>      GREEK SMALL LETTER UPSILON WITH DIALYTIKA 
<o%>    <U03CC>      GREEK SMALL LETTER OMICRON WITH TONOS 
<u%>    <U03CD>      GREEK SMALL LETTER UPSILON WITH TONOS 
<w%>    <U03CE>      GREEK SMALL LETTER OMEGA WITH TONOS 
<b3>    <U03D0>      GREEK BETA SYMBOL 
<T3>    <U03DA>      GREEK LETTER STIGMA 
<M3>    <U03DC>      GREEK LETTER DIGAMMA 
<K3>    <U03DE>      GREEK LETTER KOPPA 
<P3>    <U03E0>      GREEK LETTER SAMPI 
<IO>    <U0401>      CYRILLIC CAPITAL LETTER IO 
<D%>    <U0402>      CYRILLIC CAPITAL LETTER DJE (Serbocroatian) 
<G%>    <U0403>      CYRILLIC CAPITAL LETTER GJE 
<IE>    <U0404>      CYRILLIC CAPITAL LETTER UKRAINIAN IE 
<DS>    <U0405>      CYRILLIC CAPITAL LETTER DZE 
<II>    <U0406>      CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I 
<YI>    <U0407>      CYRILLIC CAPITAL LETTER YI (Ukrainian) 
<J%>    <U0408>      CYRILLIC CAPITAL LETTER JE 
<LJ>    <U0409>      CYRILLIC CAPITAL LETTER LJE 
<NJ>    <U040A>      CYRILLIC CAPITAL LETTER NJE 
<Ts>    <U040B>      CYRILLIC CAPITAL LETTER TSHE (Serbocroatian) 
<KJ>    <U040C>      CYRILLIC CAPITAL LETTER KJE 
<V%>    <U040E>      CYRILLIC CAPITAL LETTER SHORT U (Byelorussian) 
<DZ>    <U040F>      CYRILLIC CAPITAL LETTER DZHE 
<A=>    <U0410>      CYRILLIC CAPITAL LETTER A 
<B=>    <U0411>      CYRILLIC CAPITAL LETTER BE 
<V=>    <U0412>      CYRILLIC CAPITAL LETTER VE 
<G=>    <U0413>      CYRILLIC CAPITAL LETTER GHE 
<D=>    <U0414>      CYRILLIC CAPITAL LETTER DE 
<E=>    <U0415>      CYRILLIC CAPITAL LETTER IE 
<Z%>    <U0416>      CYRILLIC CAPITAL LETTER ZHE 
<Z=>    <U0417>      CYRILLIC CAPITAL LETTER ZE 
<I=>    <U0418>      CYRILLIC CAPITAL LETTER I 
<J=>    <U0419>      CYRILLIC CAPITAL LETTER SHORT I 
<K=>    <U041A>      CYRILLIC CAPITAL LETTER KA 
<L=>    <U041B>      CYRILLIC CAPITAL LETTER EL 
<M=>    <U041C>      CYRILLIC CAPITAL LETTER EM 
<N=>    <U041D>      CYRILLIC CAPITAL LETTER EN 
<O=>    <U041E>      CYRILLIC CAPITAL LETTER O 
<P=>    <U041F>      CYRILLIC CAPITAL LETTER PE 
<R=>    <U0420>      CYRILLIC CAPITAL LETTER ER 
<S=>    <U0421>      CYRILLIC CAPITAL LETTER ES 
<T=>    <U0422>      CYRILLIC CAPITAL LETTER TE 
<U=>    <U0423>      CYRILLIC CAPITAL LETTER U 
<F=>    <U0424>      CYRILLIC CAPITAL LETTER EF 
<H=>    <U0425>      CYRILLIC CAPITAL LETTER HA 
<C=>    <U0426>      CYRILLIC CAPITAL LETTER TSE 
<C%>    <U0427>      CYRILLIC CAPITAL LETTER CHE 
<S%>    <U0428>      CYRILLIC CAPITAL LETTER SHA 
<Sc>    <U0429>      CYRILLIC CAPITAL LETTER SHCHA 
<=">    <U042A>      CYRILLIC CAPITAL LETTER HARD SIGN 
<Y=>    <U042B>      CYRILLIC CAPITAL LETTER YERU 
<%">    <U042C>      CYRILLIC CAPITAL LETTER SOFT SIGN 
<JE>    <U042D>      CYRILLIC CAPITAL LETTER E 
<JU>    <U042E>      CYRILLIC CAPITAL LETTER YU 
<JA>    <U042F>      CYRILLIC CAPITAL LETTER YA 
<a=>    <U0430>      CYRILLIC SMALL LETTER A 
<b=>    <U0431>      CYRILLIC SMALL LETTER BE 
<v=>    <U0432>      CYRILLIC SMALL LETTER VE 
<g=>    <U0433>      CYRILLIC SMALL LETTER GHE 
<d=>    <U0434>      CYRILLIC SMALL LETTER DE 
<e=>    <U0435>      CYRILLIC SMALL LETTER IE 
<z%>    <U0436>      CYRILLIC SMALL LETTER ZHE 
<z=>    <U0437>      CYRILLIC SMALL LETTER ZE 
<i=>    <U0438>      CYRILLIC SMALL LETTER I 
<j=>    <U0439>      CYRILLIC SMALL LETTER SHORT I 
<k=>    <U043A>      CYRILLIC SMALL LETTER KA 
<l=>    <U043B>      CYRILLIC SMALL LETTER EL 
<m=>    <U043C>      CYRILLIC SMALL LETTER EM 
<n=>    <U043D>      CYRILLIC SMALL LETTER EN 
<o=>    <U043E>      CYRILLIC SMALL LETTER O 
<p=>    <U043F>      CYRILLIC SMALL LETTER PE 
<r=>    <U0440>      CYRILLIC SMALL LETTER ER 
<s=>    <U0441>      CYRILLIC SMALL LETTER ES 
<t=>    <U0442>      CYRILLIC SMALL LETTER TE 
<u=>    <U0443>      CYRILLIC SMALL LETTER U 
<f=>    <U0444>      CYRILLIC SMALL LETTER EF 
<h=>    <U0445>      CYRILLIC SMALL LETTER HA 
<c=>    <U0446>      CYRILLIC SMALL LETTER TSE 
<c%>    <U0447>      CYRILLIC SMALL LETTER CHE 
<s%>    <U0448>      CYRILLIC SMALL LETTER SHA 
<sc>    <U0449>      CYRILLIC SMALL LETTER SHCHA 
<='>    <U044A>      CYRILLIC SMALL LETTER HARD SIGN 
<y=>    <U044B>      CYRILLIC SMALL LETTER YERU 
<%'>    <U044C>      CYRILLIC SMALL LETTER SOFT SIGN 
<je>    <U044D>      CYRILLIC SMALL LETTER E 
<ju>    <U044E>      CYRILLIC SMALL LETTER YU 
<ja>    <U044F>      CYRILLIC SMALL LETTER YA 
<io>    <U0451>      CYRILLIC SMALL LETTER IO 
<d%>    <U0452>      CYRILLIC SMALL LETTER DJE (Serbocroatian) 
<g%>    <U0453>      CYRILLIC SMALL LETTER GJE 
<ie>    <U0454>      CYRILLIC SMALL LETTER UKRAINIAN IE 
<ds>    <U0455>      CYRILLIC SMALL LETTER DZE 
<ii>    <U0456>      CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I 
<yi>    <U0457>      CYRILLIC SMALL LETTER YI (Ukrainian) 
<j%>    <U0458>      CYRILLIC SMALL LETTER JE 
<lj>    <U0459>      CYRILLIC SMALL LETTER LJE 
<nj>    <U045A>      CYRILLIC SMALL LETTER NJE 
<ts>    <U045B>      CYRILLIC SMALL LETTER TSHE (Serbocroatian) 
<kj>    <U045C>      CYRILLIC SMALL LETTER KJE 
<v%>    <U045E>      CYRILLIC SMALL LETTER SHORT U (Byelorussian) 
<dz>    <U045F>      CYRILLIC SMALL LETTER DZHE 
<Y3>    <U0462>      CYRILLIC CAPITAL LETTER YAT 
<y3>    <U0463>      CYRILLIC SMALL LETTER YAT 
<O3>    <U046A>      CYRILLIC CAPITAL LETTER BIG YUS 
<o3>    <U046B>      CYRILLIC SMALL LETTER BIG YUS 
<F3>    <U0472>      CYRILLIC CAPITAL LETTER FITA 
<f3>    <U0473>      CYRILLIC SMALL LETTER FITA 
<V3>    <U0474>      CYRILLIC CAPITAL LETTER IZHITSA 
<v3>    <U0475>      CYRILLIC SMALL LETTER IZHITSA 
<C3>    <U0480>      CYRILLIC CAPITAL LETTER KOPPA 
<c3>    <U0481>      CYRILLIC SMALL LETTER KOPPA 
<G3>    <U0490>      CYRILLIC CAPITAL LETTER GHE WITH UPTURN 
<g3>    <U0491>      CYRILLIC SMALL LETTER GHE WITH UPTURN 
<A+>    <U05D0>      HEBREW LETTER ALEF 
<B+>    <U05D1>      HEBREW LETTER BET 
<G+>    <U05D2>      HEBREW LETTER GIMEL 
<D+>    <U05D3>      HEBREW LETTER DALET 
<H+>    <U05D4>      HEBREW LETTER HE 
<W+>    <U05D5>      HEBREW LETTER VAV 
<Z+>    <U05D6>      HEBREW LETTER ZAYIN 
<X+>    <U05D7>      HEBREW LETTER HET 
<Tj>    <U05D8>      HEBREW LETTER TET 
<J+>    <U05D9>      HEBREW LETTER YOD 
<K%>    <U05DA>      HEBREW LETTER FINAL KAF 
<K+>    <U05DB>      HEBREW LETTER KAF 
<L+>    <U05DC>      HEBREW LETTER LAMED 
<M%>    <U05DD>      HEBREW LETTER FINAL MEM 
<M+>    <U05DE>      HEBREW LETTER MEM 
<N%>    <U05DF>      HEBREW LETTER FINAL NUN 
<N+>    <U05E0>      HEBREW LETTER NUN 
<S+>    <U05E1>      HEBREW LETTER SAMEKH 
<E+>    <U05E2>      HEBREW LETTER AYIN 
<P%>    <U05E3>      HEBREW LETTER FINAL PE 
<P+>    <U05E4>      HEBREW LETTER PE 
<Zj>    <U05E5>      HEBREW LETTER FINAL TSADI 
<ZJ>    <U05E6>      HEBREW LETTER TSADI 
<Q+>    <U05E7>      HEBREW LETTER QOF 
<R+>    <U05E8>      HEBREW LETTER RESH 
<Sh>    <U05E9>      HEBREW LETTER SHIN 
<T+>    <U05EA>      HEBREW LETTER TAV 
<,+>    <U060C>      ARABIC COMMA 
<;+>    <U061B>      ARABIC SEMICOLON 
<?+>    <U061F>      ARABIC QUESTION MARK 
<H'>    <U0621>      ARABIC LETTER HAMZA 
<aM>    <U0622>      ARABIC LETTER ALEF WITH MADDA ABOVE 
<aH>    <U0623>      ARABIC LETTER ALEF WITH HAMZA ABOVE 
<wH>    <U0624>      ARABIC LETTER WAW WITH HAMZA ABOVE 
<ah>    <U0625>      ARABIC LETTER ALEF WITH HAMZA BELOW 
<yH>    <U0626>      ARABIC LETTER YEH WITH HAMZA ABOVE 
<a+>    <U0627>      ARABIC LETTER ALEF 
<b+>    <U0628>      ARABIC LETTER BEH 
<tm>    <U0629>      ARABIC LETTER TEH MARBUTA 
<t+>    <U062A>      ARABIC LETTER TEH 
<tk>    <U062B>      ARABIC LETTER THEH 
<g+>    <U062C>      ARABIC LETTER JEEM 
<hk>    <U062D>      ARABIC LETTER HAH 
<x+>    <U062E>      ARABIC LETTER KHAH 
<d+>    <U062F>      ARABIC LETTER DAL 
<dk>    <U0630>      ARABIC LETTER THAL 
<r+>    <U0631>      ARABIC LETTER REH 
<z+>    <U0632>      ARABIC LETTER ZAIN 
<s+>    <U0633>      ARABIC LETTER SEEN 
<sn>    <U0634>      ARABIC LETTER SHEEN 
<c+>    <U0635>      ARABIC LETTER SAD 
<dd>    <U0636>      ARABIC LETTER DAD 
<tj>    <U0637>      ARABIC LETTER TAH 
<zH>    <U0638>      ARABIC LETTER ZAH 
<e+>    <U0639>      ARABIC LETTER AIN 
<i+>    <U063A>      ARABIC LETTER GHAIN 
<++>    <U0640>      ARABIC TATWEEL 
<f+>    <U0641>      ARABIC LETTER FEH 
<q+>    <U0642>      ARABIC LETTER QAF 
<k+>    <U0643>      ARABIC LETTER KAF 
<l+>    <U0644>      ARABIC LETTER LAM 
<m+>    <U0645>      ARABIC LETTER MEEM 
<n+>    <U0646>      ARABIC LETTER NOON 
<h+>    <U0647>      ARABIC LETTER HEH 
<w+>    <U0648>      ARABIC LETTER WAW 
<j+>    <U0649>      ARABIC LETTER ALEF MAKSURA 
<y+>    <U064A>      ARABIC LETTER YEH 
<:+>    <U064B>      ARABIC FATHATAN 
<"+>    <U064C>      ARABIC DAMMATAN 
<=+>    <U064D>      ARABIC KASRATAN 
<//+>   <U064E>      ARABIC FATHA 
<'+>    <U064F>      ARABIC DAMMA 
<1+>    <U0650>      ARABIC KASRA 
<3+>    <U0651>      ARABIC SHADDA 
<0+>    <U0652>      ARABIC SUKUN 
<0a>    <U0660>      ARABIC-INDIC DIGIT ZERO 
<1a>    <U0661>      ARABIC-INDIC DIGIT ONE 
<2a>    <U0662>      ARABIC-INDIC DIGIT TWO 
<3a>    <U0663>      ARABIC-INDIC DIGIT THREE 
<4a>    <U0664>      ARABIC-INDIC DIGIT FOUR 
<5a>    <U0665>      ARABIC-INDIC DIGIT FIVE 
<6a>    <U0666>      ARABIC-INDIC DIGIT SIX 
<7a>    <U0667>      ARABIC-INDIC DIGIT SEVEN 
<8a>    <U0668>      ARABIC-INDIC DIGIT EIGHT 
<9a>    <U0669>      ARABIC-INDIC DIGIT NINE 
<aS>    <U0670>      ARABIC LETTER SUPERSCRIPT ALEF 
<p+>    <U067E>      ARABIC LETTER PEH 
<hH>    <U0681>      ARABIC LETTER HAH WITH HAMZA ABOVE 
<tc>    <U0686>      ARABIC LETTER TCHEH 
<zj>    <U0698>      ARABIC LETTER JEH 
<v+>    <U06A4>      ARABIC LETTER VEH 
<gf>    <U06AF>      ARABIC LETTER GAF 
<A-0>   <U1E00>      LATIN CAPITAL LETTER A WITH RING BELOW 
<a-0>   <U1E01>      LATIN SMALL LETTER A WITH RING BELOW 
<B.>    <U1E02>      LATIN CAPITAL LETTER B WITH DOT ABOVE 
<b.>    <U1E03>      LATIN SMALL LETTER B WITH DOT ABOVE 
<B-.>   <U1E04>      LATIN CAPITAL LETTER B WITH DOT BELOW 
<b-.>   <U1E05>      LATIN SMALL LETTER B WITH DOT BELOW 
<B_>    <U1E06>      LATIN CAPITAL LETTER B WITH LINE BELOW 
<b_>    <U1E07>      LATIN SMALL LETTER B WITH LINE BELOW 
<C,'>   <U1E08>      LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE 
<c,'>   <U1E09>      LATIN SMALL LETTER C WITH CEDILLA AND ACUTE 
<D.>    <U1E0A>      LATIN CAPITAL LETTER D WITH DOT ABOVE 
<d.>    <U1E0B>      LATIN SMALL LETTER D WITH DOT ABOVE 
<D-.>   <U1E0C>      LATIN CAPITAL LETTER D WITH DOT BELOW 
<d-.>   <U1E0D>      LATIN SMALL LETTER D WITH DOT BELOW 
<D_>    <U1E0E>      LATIN CAPITAL LETTER D WITH LINE BELOW 
<d_>    <U1E0F>      LATIN SMALL LETTER D WITH LINE BELOW 
<D,>    <U1E10>      LATIN CAPITAL LETTER D WITH CEDILLA 
<d,>    <U1E11>      LATIN SMALL LETTER D WITH CEDILLA 
<D-/>>  <U1E12>      LATIN CAPITAL LETTER D WITH CIRCUMFLEX BELOW 
<d-/>>  <U1E13>      LATIN SMALL LETTER D WITH CIRCUMFLEX BELOW 
<E-!>   <U1E14>      LATIN CAPITAL LETTER E WITH MACRON AND GRAVE 
<e-!>   <U1E15>      LATIN SMALL LETTER E WITH MACRON AND GRAVE 
<E-'>   <U1E16>      LATIN CAPITAL LETTER E WITH MACRON AND ACUTE 
<e-'>   <U1E17>      LATIN SMALL LETTER E WITH MACRON AND ACUTE 
<E-/>>  <U1E18>      LATIN CAPITAL LETTER E WITH CIRCUMFLEX BELOW 
<e-/>>  <U1E19>      LATIN SMALL LETTER E WITH CIRCUMFLEX BELOW 
<E-?>   <U1E1A>      LATIN CAPITAL LETTER E WITH TILDE BELOW 
<e-?>   <U1E1B>      LATIN SMALL LETTER E WITH TILDE BELOW 
<E,(>   <U1E1C>      LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE 
<e,(>   <U1E1D>      LATIN SMALL LETTER E WITH CEDILLA AND BREVE 
<F.>    <U1E1E>      LATIN CAPITAL LETTER F WITH DOT ABOVE 
<f.>    <U1E1F>      LATIN SMALL LETTER F WITH DOT ABOVE 
<G->    <U1E20>      LATIN CAPITAL LETTER G WITH MACRON 
<g->    <U1E21>      LATIN SMALL LETTER G WITH MACRON 
<H.>    <U1E22>      LATIN CAPITAL LETTER H WITH DOT ABOVE 
<h.>    <U1E23>      LATIN SMALL LETTER H WITH DOT ABOVE 
<H-.>   <U1E24>      LATIN CAPITAL LETTER H WITH DOT BELOW 
<h-.>   <U1E25>      LATIN SMALL LETTER H WITH DOT BELOW 
<H:>    <U1E26>      LATIN CAPITAL LETTER H WITH DIAERESIS 
<h:>    <U1E27>      LATIN SMALL LETTER H WITH DIAERESIS 
<H,>    <U1E28>      LATIN CAPITAL LETTER H WITH CEDILLA 
<h,>    <U1E29>      LATIN SMALL LETTER H WITH CEDILLA 
<H-(>   <U1E2A>      LATIN CAPITAL LETTER H WITH BREVE BELOW 
<h-(>   <U1E2B>      LATIN SMALL LETTER H WITH BREVE BELOW 
<I-?>   <U1E2C>      LATIN CAPITAL LETTER I WITH TILDE BELOW 
<i-?>   <U1E2D>      LATIN SMALL LETTER I WITH TILDE BELOW 
<I:'>   <U1E2E>      LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE 
<i:'>   <U1E2F>      LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE 
<K'>    <U1E30>      LATIN CAPITAL LETTER K WITH ACUTE 
<k'>    <U1E31>      LATIN SMALL LETTER K WITH ACUTE 
<K-.>   <U1E32>      LATIN CAPITAL LETTER K WITH DOT BELOW 
<k-.>   <U1E33>      LATIN SMALL LETTER K WITH DOT BELOW 
<K_>    <U1E34>      LATIN CAPITAL LETTER K WITH LINE BELOW 
<k_>    <U1E35>      LATIN SMALL LETTER K WITH LINE BELOW 
<L-.>   <U1E36>      LATIN CAPITAL LETTER L WITH DOT BELOW 
<l-.>   <U1E37>      LATIN SMALL LETTER L WITH DOT BELOW 
<L--.>  <U1E38>      LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON 
<l--.>  <U1E39>      LATIN SMALL LETTER L WITH DOT BELOW AND MACRON 
<L_>    <U1E3A>      LATIN CAPITAL LETTER L WITH LINE BELOW 
<l_>    <U1E3B>      LATIN SMALL LETTER L WITH LINE BELOW 
<L-/>>  <U1E3C>      LATIN CAPITAL LETTER L WITH CIRCUMFLEX BELOW 
<l-/>>  <U1E3D>      LATIN SMALL LETTER L WITH CIRCUMFLEX BELOW 
<M'>    <U1E3E>      LATIN CAPITAL LETTER M WITH ACUTE 
<m'>    <U1E3F>      LATIN SMALL LETTER M WITH ACUTE 
<M.>    <U1E40>      LATIN CAPITAL LETTER M WITH DOT ABOVE 
<m.>    <U1E41>      LATIN SMALL LETTER M WITH DOT ABOVE 
<M-.>   <U1E42>      LATIN CAPITAL LETTER M WITH DOT BELOW 
<m-.>   <U1E43>      LATIN SMALL LETTER M WITH DOT BELOW 
<N.>    <U1E44>      LATIN CAPITAL LETTER N WITH DOT ABOVE 
<n.>    <U1E45>      LATIN SMALL LETTER N WITH DOT ABOVE 
<N-.>   <U1E46>      LATIN CAPITAL LETTER N WITH DOT BELOW 
<n-.>   <U1E47>      LATIN SMALL LETTER N WITH DOT BELOW 
<N_>    <U1E48>      LATIN CAPITAL LETTER N WITH LINE BELOW 
<n_>    <U1E49>      LATIN SMALL LETTER N WITH LINE BELOW 
<N-/>>  <U1E4A>      LATIN CAPITAL LETTER N WITH CIRCUMFLEX BELOW 
<n-/>>  <U1E4B>      LATIN SMALL LETTER N WITH CIRCUMFLEX BELOW 
<O?'>   <U1E4C>      LATIN CAPITAL LETTER O WITH TILDE AND ACUTE 
<o?'>   <U1E4D>      LATIN SMALL LETTER O WITH TILDE AND ACUTE 
<O?:>   <U1E4E>      LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS 
<o?:>   <U1E4F>      LATIN SMALL LETTER O WITH TILDE AND DIAERESIS 
<O-!>   <U1E50>      LATIN CAPITAL LETTER O WITH MACRON AND GRAVE 
<o-!>   <U1E51>      LATIN SMALL LETTER O WITH MACRON AND GRAVE 
<O-'>   <U1E52>      LATIN CAPITAL LETTER O WITH MACRON AND ACUTE 
<o-'>   <U1E53>      LATIN SMALL LETTER O WITH MACRON AND ACUTE 
<P'>    <U1E54>      LATIN CAPITAL LETTER P WITH ACUTE 
<p'>    <U1E55>      LATIN SMALL LETTER P WITH ACUTE 
<P.>    <U1E56>      LATIN CAPITAL LETTER P WITH DOT ABOVE 
<p.>    <U1E57>      LATIN SMALL LETTER P WITH DOT ABOVE 
<R.>    <U1E58>      LATIN CAPITAL LETTER R WITH DOT ABOVE 
<r.>    <U1E59>      LATIN SMALL LETTER R WITH DOT ABOVE 
<R-.>   <U1E5A>      LATIN CAPITAL LETTER R WITH DOT BELOW 
<r-.>   <U1E5B>      LATIN SMALL LETTER R WITH DOT BELOW 
<R--.>  <U1E5C>      LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON 
<r--.>  <U1E5D>      LATIN SMALL LETTER R WITH DOT BELOW AND MACRON 
<R_>    <U1E5E>      LATIN CAPITAL LETTER R WITH LINE BELOW 
<r_>    <U1E5F>      LATIN SMALL LETTER R WITH LINE BELOW 
<S.>    <U1E60>      LATIN CAPITAL LETTER S WITH DOT ABOVE 
<s.>    <U1E61>      LATIN SMALL LETTER S WITH DOT ABOVE 
<S-.>   <U1E62>      LATIN CAPITAL LETTER S WITH DOT BELOW 
<s-.>   <U1E63>      LATIN SMALL LETTER S WITH DOT BELOW 
<S'.>   <U1E64>      LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE 
<s'.>   <U1E65>      LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE 
<S<.>   <U1E66>      LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE 
<s<.>   <U1E67>      LATIN SMALL LETTER S WITH CARON AND DOT ABOVE 
<S.-.>  <U1E68>      LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE 
<s.-.>  <U1E69>      LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE 
<T.>    <U1E6A>      LATIN CAPITAL LETTER T WITH DOT ABOVE 
<t.>    <U1E6B>      LATIN SMALL LETTER T WITH DOT ABOVE 
<T-.>   <U1E6C>      LATIN CAPITAL LETTER T WITH DOT BELOW 
<t-.>   <U1E6D>      LATIN SMALL LETTER T WITH DOT BELOW 
<T_>    <U1E6E>      LATIN CAPITAL LETTER T WITH LINE BELOW 
<t_>    <U1E6F>      LATIN SMALL LETTER T WITH LINE BELOW 
<T-/>>  <U1E70>      LATIN CAPITAL LETTER T WITH CIRCUMFLEX BELOW 
<t-/>>  <U1E71>      LATIN SMALL LETTER T WITH CIRCUMFLEX BELOW 
<U--:>  <U1E72>      LATIN CAPITAL LETTER U WITH DIAERESIS BELOW 
<u--:>  <U1E73>      LATIN SMALL LETTER U WITH DIAERESIS BELOW 
<U-?>   <U1E74>      LATIN CAPITAL LETTER U WITH TILDE BELOW 
<u-?>   <U1E75>      LATIN SMALL LETTER U WITH TILDE BELOW 
<U-/>>  <U1E76>      LATIN CAPITAL LETTER U WITH CIRCUMFLEX BELOW 
<u-/>>  <U1E77>      LATIN SMALL LETTER U WITH CIRCUMFLEX BELOW 
<U?'>   <U1E78>      LATIN CAPITAL LETTER U WITH TILDE AND ACUTE 
<u?'>   <U1E79>      LATIN SMALL LETTER U WITH TILDE AND ACUTE 
<U-:>   <U1E7A>      LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS 
<u-:>   <U1E7B>      LATIN SMALL LETTER U WITH MACRON AND DIAERESIS 
<V?>    <U1E7C>      LATIN CAPITAL LETTER V WITH TILDE 
<v?>    <U1E7D>      LATIN SMALL LETTER V WITH TILDE 
<V-.>   <U1E7E>      LATIN CAPITAL LETTER V WITH DOT BELOW 
<v-.>   <U1E7F>      LATIN SMALL LETTER V WITH DOT BELOW 
<W!>    <U1E80>      LATIN CAPITAL LETTER W WITH GRAVE 
<w!>    <U1E81>      LATIN SMALL LETTER W WITH GRAVE 
<W'>    <U1E82>      LATIN CAPITAL LETTER W WITH ACUTE 
<w'>    <U1E83>      LATIN SMALL LETTER W WITH ACUTE 
<W:>    <U1E84>      LATIN CAPITAL LETTER W WITH DIAERESIS 
<w:>    <U1E85>      LATIN SMALL LETTER W WITH DIAERESIS 
<W.>    <U1E86>      LATIN CAPITAL LETTER W WITH DOT ABOVE 
<w.>    <U1E87>      LATIN SMALL LETTER W WITH DOT ABOVE 
<W-.>   <U1E88>      LATIN CAPITAL LETTER W WITH DOT BELOW 
<w-.>   <U1E89>      LATIN SMALL LETTER W WITH DOT BELOW 
<X.>    <U1E8A>      LATIN CAPITAL LETTER X WITH DOT ABOVE 
<x.>    <U1E8B>      LATIN SMALL LETTER X WITH DOT ABOVE 
<X:>    <U1E8C>      LATIN CAPITAL LETTER X WITH DIAERESIS 
<x:>    <U1E8D>      LATIN SMALL LETTER X WITH DIAERESIS 
<Y.>    <U1E8E>      LATIN CAPITAL LETTER Y WITH DOT ABOVE 
<y.>    <U1E8F>      LATIN SMALL LETTER Y WITH DOT ABOVE 
<Z/>>   <U1E90>      LATIN CAPITAL LETTER Z WITH CIRCUMFLEX 
<z/>>   <U1E91>      LATIN SMALL LETTER Z WITH CIRCUMFLEX 
<Z-.>   <U1E92>      LATIN CAPITAL LETTER Z WITH DOT BELOW 
<z-.>   <U1E93>      LATIN SMALL LETTER Z WITH DOT BELOW 
<Z_>    <U1E94>      LATIN CAPITAL LETTER Z WITH LINE BELOW 
<z_>    <U1E95>      LATIN SMALL LETTER Z WITH LINE BELOW 
<A-.>   <U1EA0>      LATIN CAPITAL LETTER A WITH DOT BELOW 
<a-.>   <U1EA1>      LATIN SMALL LETTER A WITH DOT BELOW 
<A2>    <U1EA2>      LATIN CAPITAL LETTER A WITH HOOK ABOVE 
<a2>    <U1EA3>      LATIN SMALL LETTER A WITH HOOK ABOVE 
<A/>'>  <U1EA4>      LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE 
<a/>'>  <U1EA5>      LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE 
<A/>!>  <U1EA6>      LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE 
<a/>!>  <U1EA7>      LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE 
<A/>2>  <U1EA8>      LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE 
<a/>2>  <U1EA9>      LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE 
<A/>?>  <U1EAA>      LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE 
<a/>?>  <U1EAB>      LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE 
<A/>-.> <U1EAC>      LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW 
<a/>-.> <U1EAD>      LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW 
<A('>   <U1EAE>      LATIN CAPITAL LETTER A WITH BREVE AND ACUTE 
<a('>   <U1EAF>      LATIN SMALL LETTER A WITH BREVE AND ACUTE 
<A(!>   <U1EB0>      LATIN CAPITAL LETTER A WITH BREVE AND GRAVE 
<a(!>   <U1EB1>      LATIN SMALL LETTER A WITH BREVE AND GRAVE 
<A(2>   <U1EB2>      LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE 
<a(2>   <U1EB3>      LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE 
<A(?>   <U1EB4>      LATIN CAPITAL LETTER A WITH BREVE AND TILDE 
<a(?>   <U1EB5>      LATIN SMALL LETTER A WITH BREVE AND TILDE 
<A(-.>  <U1EB6>      LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW 
<a(-.>  <U1EB7>      LATIN SMALL LETTER A WITH BREVE AND DOT BELOW 
<E-.>   <U1EB8>      LATIN CAPITAL LETTER E WITH DOT BELOW 
<e-.>   <U1EB9>      LATIN SMALL LETTER E WITH DOT BELOW 
<E2>    <U1EBA>      LATIN CAPITAL LETTER E WITH HOOK ABOVE 
<e2>    <U1EBB>      LATIN SMALL LETTER E WITH HOOK ABOVE 
<E?>    <U1EBC>      LATIN CAPITAL LETTER E WITH TILDE 
<e?>    <U1EBD>      LATIN SMALL LETTER E WITH TILDE 
<E/>'>  <U1EBE>      LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE 
<e/>'>  <U1EBF>      LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE 
<E/>!>  <U1EC0>      LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE 
<e/>!>  <U1EC1>      LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE 
<E/>2>  <U1EC2>      LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE 
<e/>2>  <U1EC3>      LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE 
<E/>?>  <U1EC4>      LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE 
<e/>?>  <U1EC5>      LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE 
<E/>-.> <U1EC6>      LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW 
<e/>-.> <U1EC7>      LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW 
<I2>    <U1EC8>      LATIN CAPITAL LETTER I WITH HOOK ABOVE 
<i2>    <U1EC9>      LATIN SMALL LETTER I WITH HOOK ABOVE 
<I-.>   <U1ECA>      LATIN CAPITAL LETTER I WITH DOT BELOW 
<i-.>   <U1ECB>      LATIN SMALL LETTER I WITH DOT BELOW 
<O-.>   <U1ECC>      LATIN CAPITAL LETTER O WITH DOT BELOW 
<o-.>   <U1ECD>      LATIN SMALL LETTER O WITH DOT BELOW 
<O2>    <U1ECE>      LATIN CAPITAL LETTER O WITH HOOK ABOVE 
<o2>    <U1ECF>      LATIN SMALL LETTER O WITH HOOK ABOVE 
<O/>'>  <U1ED0>      LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE 
<o/>'>  <U1ED1>      LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE 
<O/>!>  <U1ED2>      LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE 
<o/>!>  <U1ED3>      LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE 
<O/>2>  <U1ED4>      LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE 
<o/>2>  <U1ED5>      LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE 
<O/>?>  <U1ED6>      LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE 
<o/>?>  <U1ED7>      LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE 
<O/>-.> <U1ED8>      LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW 
<o/>-.> <U1ED9>      LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW 
<O9'>   <U1EDA>      LATIN CAPITAL LETTER O WITH HORN AND ACUTE 
<o9'>   <U1EDB>      LATIN SMALL LETTER O WITH HORN AND ACUTE 
<O9!>   <U1EDC>      LATIN CAPITAL LETTER O WITH HORN AND GRAVE 
<o9!>   <U1EDD>      LATIN SMALL LETTER O WITH HORN AND GRAVE 
<O92>   <U1EDE>      LATIN CAPITAL LETTER O WITH HORN AND HOOK ABOVE 
<o92>   <U1EDF>      LATIN SMALL LETTER O WITH HORN AND HOOK ABOVE 
<O9?>   <U1EE0>      LATIN CAPITAL LETTER O WITH HORN AND TILDE 
<o9?>   <U1EE1>      LATIN SMALL LETTER O WITH HORN AND TILDE 
<O9-.>  <U1EE2>      LATIN CAPITAL LETTER O WITH HORN AND DOT BELOW 
<o9-.>  <U1EE3>      LATIN SMALL LETTER O WITH HORN AND DOT BELOW 
<U-.>   <U1EE4>      LATIN CAPITAL LETTER U WITH DOT BELOW 
<u-.>   <U1EE5>      LATIN SMALL LETTER U WITH DOT BELOW 
<U2>    <U1EE6>      LATIN CAPITAL LETTER U WITH HOOK ABOVE 
<u2>    <U1EE7>      LATIN SMALL LETTER U WITH HOOK ABOVE 
<U9'>   <U1EE8>      LATIN CAPITAL LETTER U WITH HORN AND ACUTE 
<u9'>   <U1EE9>      LATIN SMALL LETTER U WITH HORN AND ACUTE 
<U9!>   <U1EEA>      LATIN CAPITAL LETTER U WITH HORN AND GRAVE 
<u9!>   <U1EEB>      LATIN SMALL LETTER U WITH HORN AND GRAVE 
<U92>   <U1EEC>      LATIN CAPITAL LETTER U WITH HORN AND HOOK ABOVE 
<u92>   <U1EED>      LATIN SMALL LETTER U WITH HORN AND HOOK ABOVE 
<U9?>   <U1EEE>      LATIN CAPITAL LETTER U WITH HORN AND TILDE 
<u9?>   <U1EEF>      LATIN SMALL LETTER U WITH HORN AND TILDE 
<U9-.>  <U1EF0>      LATIN CAPITAL LETTER U WITH HORN AND DOT BELOW 
<u9-.>  <U1EF1>      LATIN SMALL LETTER U WITH HORN AND DOT BELOW 
<Y!>    <U1EF2>      LATIN CAPITAL LETTER Y WITH GRAVE 
<y!>    <U1EF3>      LATIN SMALL LETTER Y WITH GRAVE 
<Y-.>   <U1EF4>      LATIN CAPITAL LETTER Y WITH DOT BELOW 
<y-.>   <U1EF5>      LATIN SMALL LETTER Y WITH DOT BELOW 
<Y2>    <U1EF6>      LATIN CAPITAL LETTER Y WITH HOOK ABOVE 
<y2>    <U1EF7>      LATIN SMALL LETTER Y WITH HOOK ABOVE 
<Y?>    <U1EF8>      LATIN CAPITAL LETTER Y WITH TILDE 
<y?>    <U1EF9>      LATIN SMALL LETTER Y WITH TILDE 
<a*,>   <U1F00>      GREEK SMALL LETTER ALPHA WITH PSILI 
<a*;>   <U1F01>      GREEK SMALL LETTER ALPHA WITH DASIA 
<a*,!>  <U1F02>      GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA 
<a*;!>  <U1F03>      GREEK SMALL LETTER ALPHA WITH DASIA AND VARIA 
<a*,'>  <U1F04>      GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA 
<a*;'>  <U1F05>      GREEK SMALL LETTER ALPHA WITH DASIA AND OXIA 
<a*,?>  <U1F06>      GREEK SMALL LETTER ALPHA WITH PSILI AND PERISPOMENI 
<a*;?>  <U1F07>      GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI 
<A*,>   <U1F08>      GREEK CAPITAL LETTER ALPHA WITH PSILI 
<A*;>   <U1F09>      GREEK CAPITAL LETTER ALPHA WITH DASIA 
<A*,!>  <U1F0A>      GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA 
<A*;!>  <U1F0B>      GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA 
<A*,'>  <U1F0C>      GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA 
<A*;'>  <U1F0D>      GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA 
<A*,?>  <U1F0E>      GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI 
<A*;?>  <U1F0F>      GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI 
<e*,>   <U1F10>      GREEK SMALL LETTER EPSILON WITH PSILI 
<e*;>   <U1F11>      GREEK SMALL LETTER EPSILON WITH DASIA 
<e*,!>  <U1F12>      GREEK SMALL LETTER EPSILON WITH PSILI AND VARIA 
<e*;!>  <U1F13>      GREEK SMALL LETTER EPSILON WITH DASIA AND VARIA 
<e*,'>  <U1F14>      GREEK SMALL LETTER EPSILON WITH PSILI AND OXIA 
<e*;'>  <U1F15>      GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA 
<E*,>   <U1F18>      GREEK CAPITAL LETTER EPSILON WITH PSILI 
<E*;>   <U1F19>      GREEK CAPITAL LETTER EPSILON WITH DASIA 
<E*,!>  <U1F1A>      GREEK CAPITAL LETTER EPSILON WITH PSILI AND VARIA 
<E*;!>  <U1F1B>      GREEK CAPITAL LETTER EPSILON WITH DASIA AND VARIA 
<E*,'>  <U1F1C>      GREEK CAPITAL LETTER EPSILON WITH PSILI AND OXIA 
<E*;'>  <U1F1D>      GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA 
<y*,>   <U1F20>      GREEK SMALL LETTER ETA WITH PSILI 
<y*;>   <U1F21>      GREEK SMALL LETTER ETA WITH DASIA 
<y*,!>  <U1F22>      GREEK SMALL LETTER ETA WITH PSILI AND VARIA 
<y*;!>  <U1F23>      GREEK SMALL LETTER ETA WITH DASIA AND VARIA 
<y*,'>  <U1F24>      GREEK SMALL LETTER ETA WITH PSILI AND OXIA 
<y*;'>  <U1F25>      GREEK SMALL LETTER ETA WITH DASIA AND OXIA 
<y*,?>  <U1F26>      GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI 
<y*;?>  <U1F27>      GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI 
<Y*,>   <U1F28>      GREEK CAPITAL LETTER ETA WITH PSILI 
<Y*;>   <U1F29>      GREEK CAPITAL LETTER ETA WITH DASIA 
<Y*,!>  <U1F2A>      GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA 
<Y*;!>  <U1F2B>      GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA 
<Y*,'>  <U1F2C>      GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA 
<Y*;'>  <U1F2D>      GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA 
<Y*,?>  <U1F2E>      GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI 
<Y*;?>  <U1F2F>      GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI 
<i*,>   <U1F30>      GREEK SMALL LETTER IOTA WITH PSILI 
<i*;>   <U1F31>      GREEK SMALL LETTER IOTA WITH DASIA 
<i*,!>  <U1F32>      GREEK SMALL LETTER IOTA WITH PSILI AND VARIA 
<i*;!>  <U1F33>      GREEK SMALL LETTER IOTA WITH DASIA AND VARIA 
<i*,'>  <U1F34>      GREEK SMALL LETTER IOTA WITH PSILI AND OXIA 
<i*;'>  <U1F35>      GREEK SMALL LETTER IOTA WITH DASIA AND OXIA 
<i*,?>  <U1F36>      GREEK SMALL LETTER IOTA WITH PSILI AND PERISPOMENI 
<i*;?>  <U1F37>      GREEK SMALL LETTER IOTA WITH DASIA AND PERISPOMENI 
<I*,>   <U1F38>      GREEK CAPITAL LETTER IOTA WITH PSILI 
<I*;>   <U1F39>      GREEK CAPITAL LETTER IOTA WITH DASIA 
<I*,!>  <U1F3A>      GREEK CAPITAL LETTER IOTA WITH PSILI AND VARIA 
<I*;!>  <U1F3B>      GREEK CAPITAL LETTER IOTA WITH DASIA AND VARIA 
<I*,'>  <U1F3C>      GREEK CAPITAL LETTER IOTA WITH PSILI AND OXIA 
<I*;'>  <U1F3D>      GREEK CAPITAL LETTER IOTA WITH DASIA AND OXIA 
<I*,?>  <U1F3E>      GREEK CAPITAL LETTER IOTA WITH PSILI AND PERISPOMENI 
<I*;?>  <U1F3F>      GREEK CAPITAL LETTER IOTA WITH DASIA AND PERISPOMENI 
<o*,>   <U1F40>      GREEK SMALL LETTER OMICRON WITH PSILI 
<o*;>   <U1F41>      GREEK SMALL LETTER OMICRON WITH DASIA 
<o*,!>  <U1F42>      GREEK SMALL LETTER OMICRON WITH PSILI AND VARIA 
<o*;!>  <U1F43>      GREEK SMALL LETTER OMICRON WITH DASIA AND VARIA 
<o*,'>  <U1F44>      GREEK SMALL LETTER OMICRON WITH PSILI AND OXIA 
<o*;'>  <U1F45>      GREEK SMALL LETTER OMICRON WITH DASIA AND OXIA 
<O*,>   <U1F48>      GREEK CAPITAL LETTER OMICRON WITH PSILI 
<O*;>   <U1F49>      GREEK CAPITAL LETTER OMICRON WITH DASIA 
<O*,!>  <U1F4A>      GREEK CAPITAL LETTER OMICRON WITH PSILI AND VARIA 
<O*;!>  <U1F4B>      GREEK CAPITAL LETTER OMICRON WITH DASIA AND VARIA 
<O*,'>  <U1F4C>      GREEK CAPITAL LETTER OMICRON WITH PSILI AND OXIA 
<O*;'>  <U1F4D>      GREEK CAPITAL LETTER OMICRON WITH DASIA AND OXIA 
<u*,>   <U1F50>      GREEK SMALL LETTER UPSILON WITH PSILI 
<u*;>   <U1F51>      GREEK SMALL LETTER UPSILON WITH DASIA 
<u*,!>  <U1F52>      GREEK SMALL LETTER UPSILON WITH PSILI AND VARIA 
<u*;!>  <U1F53>      GREEK SMALL LETTER UPSILON WITH DASIA AND VARIA 
<u*,'>  <U1F54>      GREEK SMALL LETTER UPSILON WITH PSILI AND OXIA 
<u*;'>  <U1F55>      GREEK SMALL LETTER UPSILON WITH DASIA AND OXIA 
<u*,?>  <U1F56>      GREEK SMALL LETTER UPSILON WITH PSILI AND PERISPOMENI 
<u*;?>  <U1F57>      GREEK SMALL LETTER UPSILON WITH DASIA AND PERISPOMENI 
<U*;>   <U1F59>      GREEK CAPITAL LETTER UPSILON WITH DASIA 
<U*;!>  <U1F5B>      GREEK CAPITAL LETTER UPSILON WITH DASIA AND VARIA 
<U*;'>  <U1F5D>      GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXIA 
<U*;?>  <U1F5F>      GREEK CAPITAL LETTER UPSILON WITH DASIA AND PERISPOMENI
<w*,>   <U1F60>      GREEK SMALL LETTER OMEGA WITH PSILI 
<w*;>   <U1F61>      GREEK SMALL LETTER OMEGA WITH DASIA 
<w*,!>  <U1F62>      GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA 
<w*;!>  <U1F63>      GREEK SMALL LETTER OMEGA WITH DASIA AND VARIA 
<w*,'>  <U1F64>      GREEK SMALL LETTER OMEGA WITH PSILI AND OXIA 
<w*;'>  <U1F65>      GREEK SMALL LETTER OMEGA WITH DASIA AND OXIA 
<w*,?>  <U1F66>      GREEK SMALL LETTER OMEGA WITH PSILI AND PERISPOMENI 
<w*;?>  <U1F67>      GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI 
<W*,>   <U1F68>      GREEK CAPITAL LETTER OMEGA WITH PSILI 
<W*;>   <U1F69>      GREEK CAPITAL LETTER OMEGA WITH DASIA 
<W*,!>  <U1F6A>      GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA 
<W*;!>  <U1F6B>      GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA 
<W*,'>  <U1F6C>      GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA 
<W*;'>  <U1F6D>      GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA 
<W*,?>  <U1F6E>      GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI 
<W*;?>  <U1F6F>      GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI 
<a*!>   <U1F70>      GREEK SMALL LETTER ALPHA WITH VARIA 
<a*'>   <U1F71>      GREEK SMALL LETTER ALPHA WITH OXIA 
<e*!>   <U1F72>      GREEK SMALL LETTER EPSILON WITH VARIA 
<e*'>   <U1F73>      GREEK SMALL LETTER EPSILON WITH OXIA 
<y*!>   <U1F74>      GREEK SMALL LETTER ETA WITH VARIA 
<y*'>   <U1F75>      GREEK SMALL LETTER ETA WITH OXIA 
<i*!>   <U1F76>      GREEK SMALL LETTER IOTA WITH VARIA 
<i*'>   <U1F77>      GREEK SMALL LETTER IOTA WITH OXIA 
<o*!>   <U1F78>      GREEK SMALL LETTER OMICRON WITH VARIA 
<o*'>   <U1F79>      GREEK SMALL LETTER OMICRON WITH OXIA 
<u*!>   <U1F7A>      GREEK SMALL LETTER UPSILON WITH VARIA 
<u*'>   <U1F7B>      GREEK SMALL LETTER UPSILON WITH OXIA 
<w*!>   <U1F7C>      GREEK SMALL LETTER OMEGA WITH VARIA 
<w*'>   <U1F7D>      GREEK SMALL LETTER OMEGA WITH OXIA 
<a*,j>  <U1F80>      GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI 
<a*;j>  <U1F81>      GREEK SMALL LETTER ALPHA WITH DASIA AND YPOGEGRAMMENI 
<a*,!j> <U1F82>      GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA AND YPOGEGRAMMENI
<a*;!j> <U1F83>      GREEK SMALL LETTER ALPHA WITH DASIA AND VARIA AND YPOGEGRAMMENI
<a*,'j> <U1F84>      GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA AND YPOGEGRAMMENI
<a*;'j> <U1F85>      GREEK SMALL LETTER ALPHA WITH DASIA AND OXIA AND YPOGEGRAMMENI
<a*,?j> <U1F86>      GREEK SMALL LETTER ALPHA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
<a*;?j> <U1F87>      GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
<A*,J>  <U1F88>      GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI
<A*;J>  <U1F89>      GREEK CAPITAL LETTER ALPHA WITH DASIA AND PROSGEGRAMMENI
<A*,!J> <U1F8A>      GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA AND PROSGEGRAMMENI
<A*;!J> <U1F8B>      GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA AND PROSGEGRAMMENI
<A*,'J> <U1F8C>      GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA AND PROSGEGRAMMENI
<A*;'J> <U1F8D>      GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA AND PROSGEGRAMMENI
<A*,?J> <U1F8E>      GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
<A*;?J> <U1F8F>      GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
<y*,j>  <U1F90>      GREEK SMALL LETTER ETA WITH PSILI AND YPOGEGRAMMENI 
<y*;j>  <U1F91>      GREEK SMALL LETTER ETA WITH DASIA AND YPOGEGRAMMENI 
<y*,!j> <U1F92>      GREEK SMALL LETTER ETA WITH PSILI AND VARIA AND YPOGEGRAMMENI
<y*;!j> <U1F93>      GREEK SMALL LETTER ETA WITH DASIA AND VARIA AND YPOGEGRAMMENI
<y*,'j> <U1F94>      GREEK SMALL LETTER ETA WITH PSILI AND OXIA AND YPOGEGRAMMENI
<y*;'j> <U1F95>      GREEK SMALL LETTER ETA WITH DASIA AND OXIA AND YPOGEGRAMMENI
<y*,?j> <U1F96>      GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
<y*;?j> <U1F97>      GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
<Y*,J>  <U1F98>      GREEK CAPITAL LETTER ETA WITH PSILI AND PROSGEGRAMMENI
<Y*;J>  <U1F99>      GREEK CAPITAL LETTER ETA WITH DASIA AND PROSGEGRAMMENI
<Y*,!J> <U1F9A>      GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA AND PROSGEGRAMMENI
<Y*;!J> <U1F9B>      GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA AND PROSGEGRAMMENI
<Y*,'J> <U1F9C>      GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA AND PROSGEGRAMMENI
<Y*;'J> <U1F9D>      GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA AND PROSGEGRAMMENI
<Y*,?J> <U1F9E>      GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
<Y*;?J> <U1F9F>      GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
<w*,j>  <U1FA0>      GREEK SMALL LETTER OMEGA WITH PSILI AND YPOGEGRAMMENI 
<w*;j>  <U1FA1>      GREEK SMALL LETTER OMEGA WITH DASIA AND YPOGEGRAMMENI 
<w*,!j> <U1FA2>      GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA AND YPOGEGRAMMENI
<w*;!j> <U1FA3>      GREEK SMALL LETTER OMEGA WITH DASIA AND VARIA AND YPOGEGRAMMENI
<w*,'j> <U1FA4>      GREEK SMALL LETTER OMEGA WITH PSILI AND OXIA AND YPOGEGRAMMENI
<w*;'j> <U1FA5>      GREEK SMALL LETTER OMEGA WITH DASIA AND OXIA AND YPOGEGRAMMENI
<w*,?j> <U1FA6>      GREEK SMALL LETTER OMEGA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
<w*;?j> <U1FA7>      GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
<W*,J>  <U1FA8>      GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI
<W*;J>  <U1FA9>      GREEK CAPITAL LETTER OMEGA WITH DASIA AND PROSGEGRAMMENI
<W*,!J> <U1FAA>      GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA AND PROSGEGRAMMENI
<W*;!J> <U1FAB>      GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA AND PROSGEGRAMMENI
<W*,'J> <U1FAC>      GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA AND PROSGEGRAMMENI
<W*;'J> <U1FAD>      GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA AND PROSGEGRAMMENI
<W*,?J> <U1FAE>      GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
<W*;?J> <U1FAF>      GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
<a*(>   <U1FB0>      GREEK SMALL LETTER ALPHA WITH VRACHY 
<a*->   <U1FB1>      GREEK SMALL LETTER ALPHA WITH MACRON 
<a*!j>  <U1FB2>      GREEK SMALL LETTER ALPHA WITH VARIA AND YPOGEGRAMMENI 
<a*j>   <U1FB3>      GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI 
<a*'j>  <U1FB4>      GREEK SMALL LETTER ALPHA WITH OXIA AND YPOGEGRAMMENI 
<a*?>   <U1FB6>      GREEK SMALL LETTER ALPHA WITH PERISPOMENI 
<a*?j>  <U1FB7>      GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI
<A*(>   <U1FB8>      GREEK CAPITAL LETTER ALPHA WITH VRACHY 
<A*->   <U1FB9>      GREEK CAPITAL LETTER ALPHA WITH MACRON 
<A*!>   <U1FBA>      GREEK CAPITAL LETTER ALPHA WITH VARIA 
<A*'>   <U1FBB>      GREEK CAPITAL LETTER ALPHA WITH OXIA 
<A*J>   <U1FBC>      GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI 
<)*>    <U1FBD>      GREEK KORONIS 
<J3>    <U1FBE>      GREEK PROSGEGRAMMENI 
<,,>    <U1FBF>      GREEK PSILI 
<?*>    <U1FC0>      GREEK PERISPOMENI 
<?:>    <U1FC1>      GREEK DIALYTIKA AND PERISPOMENI 
<y*!j>  <U1FC2>      GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI 
<y*j>   <U1FC3>      GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI 
<y*'j>  <U1FC4>      GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI 
<y*?>   <U1FC6>      GREEK SMALL LETTER ETA WITH PERISPOMENI 
<y*?j>  <U1FC7>      GREEK SMALL LETTER ETA WITH PERISPOMENI AND YPOGEGRAMMENI
<E*!>   <U1FC8>      GREEK CAPITAL LETTER EPSILON WITH VARIA
<E*'>   <U1FC9>      GREEK CAPITAL LETTER EPSILON WITH OXIA 
<Y*!>   <U1FCA>      GREEK CAPITAL LETTER ETA WITH VARIA 
<Y*'>   <U1FCB>      GREEK CAPITAL LETTER ETA WITH OXIA 
<Y*J>   <U1FCC>      GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI 
<,!>    <U1FCD>      GREEK PSILI AND VARIA 
<,'>    <U1FCE>      GREEK PSILI AND OXIA 
<?,>    <U1FCF>      GREEK PSILI AND PERISPOMENI 
<i*(>   <U1FD0>      GREEK SMALL LETTER IOTA WITH VRACHY 
<i*->   <U1FD1>      GREEK SMALL LETTER IOTA WITH MACRON 
<i*:!>  <U1FD2>      GREEK SMALL LETTER IOTA WITH DIALYTIKA AND VARIA 
<i*:'>  <U1FD3>      GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA 
<i*?>   <U1FD6>      GREEK SMALL LETTER IOTA WITH PERISPOMENI 
<i*:?>  <U1FD7>      GREEK SMALL LETTER IOTA WITH DIALYTIKA AND PERISPOMENI 
<I*(>   <U1FD8>      GREEK CAPITAL LETTER IOTA WITH VRACHY 
<I*->   <U1FD9>      GREEK CAPITAL LETTER IOTA WITH MACRON 
<I*!>   <U1FDA>      GREEK CAPITAL LETTER IOTA WITH VARIA 
<I*'>   <U1FDB>      GREEK CAPITAL LETTER IOTA WITH OXIA 
<;!>    <U1FDD>      GREEK DASIA AND VARIA 
<;'>    <U1FDE>      GREEK DASIA AND OXIA 
<?;>    <U1FDF>      GREEK DASIA AND PERISPOMENI 
<u*(>   <U1FE0>      GREEK SMALL LETTER UPSILON WITH VRACHY 
<u*->   <U1FE1>      GREEK SMALL LETTER UPSILON WITH MACRON 
<u*:!>  <U1FE2>      GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND VARIA 
<u*:'>  <U1FE3>      GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA 
<r*,>   <U1FE4>      GREEK SMALL LETTER RHO WITH PSILI 
<r*;>   <U1FE5>      GREEK SMALL LETTER RHO WITH DASIA 
<u*?>   <U1FE6>      GREEK SMALL LETTER UPSILON WITH PERISPOMENI 
<u*:?>  <U1FE7>      GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND PERISPOMENI 
<U*(>   <U1FE8>      GREEK CAPITAL LETTER UPSILON WITH VRACHY 
<U*->   <U1FE9>      GREEK CAPITAL LETTER UPSILON WITH MACRON 
<U*!>   <U1FEA>      GREEK CAPITAL LETTER UPSILON WITH VARIA 
<U*'>   <U1FEB>      GREEK CAPITAL LETTER UPSILON WITH OXIA 
<R*;>   <U1FEC>      GREEK CAPITAL LETTER RHO WITH DASIA 
<!:>    <U1FED>      GREEK DIALYTIKA AND VARIA 
<:'>    <U1FEE>      GREEK DIALYTIKA AND OXIA 
<!*>    <U1FEF>      GREEK VARIA 
<w*!j>  <U1FF2>      GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI 
<w*j>   <U1FF3>      GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI 
<w*'j>  <U1FF4>      GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI 
<w*?>   <U1FF6>      GREEK SMALL LETTER OMEGA WITH PERISPOMENI 
<w*?j>  <U1FF7>      GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI 
<O*!>   <U1FF8>      GREEK CAPITAL LETTER OMICRON WITH VARIA 
<O*'>   <U1FF9>      GREEK CAPITAL LETTER OMICRON WITH OXIA 
<W*!>   <U1FFA>      GREEK CAPITAL LETTER OMEGA WITH VARIA 
<W*'>   <U1FFB>      GREEK CAPITAL LETTER OMEGA WITH OXIA 
<W*J>   <U1FFC>      GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI 
<//*>   <U1FFD>      GREEK OXIA 
<;;>    <U1FFE>      GREEK DASIA 
<1N>    <U2002>      EN SPACE 
<1M>    <U2003>      EM SPACE 
<3M>    <U2004>      THREE-PER-EM SPACE 
<4M>    <U2005>      FOUR-PER-EM SPACE 
<6M>    <U2006>      SIX-PER-EM SPACE 
<LR>    <U200E>      LEFT-TO-RIGHT MARK 
<RL>    <U200F>      RIGHT-TO-LEFT MARK 
<1T>    <U2009>      THIN SPACE 
<1H>    <U200A>      HAIR SPACE 
<-1>    <U2010>      HYPHEN 
<-N>    <U2013>      EN DASH 
<-M>    <U2014>      EM DASH 
<-3>    <U2015>      HORIZONTAL BAR 
<!2>    <U2016>      DOUBLE VERTICAL LINE 
<=2>    <U2017>      DOUBLE LOW LINE 
<'6>    <U2018>      LEFT SINGLE QUOTATION MARK 
<'9>    <U2019>      RIGHT SINGLE QUOTATION MARK 
<.9>    <U201A>      SINGLE LOW-9 QUOTATION MARK 
<9'>    <U201B>      SINGLE HIGH-REVERSED-9 QUOTATION MARK 
<"6>    <U201C>      LEFT DOUBLE QUOTATION MARK 
<"9>    <U201D>      RIGHT DOUBLE QUOTATION MARK 
<:9>    <U201E>      DOUBLE LOW-9 QUOTATION MARK 
<9">    <U201F>      DOUBLE HIGH-REVERSED-9 QUOTATION MARK 
<//->   <U2020>      DAGGER 
<//=>   <U2021>      DOUBLE DAGGER 
<sb>    <U2022>      BULLET 
<3b>    <U2023>      TRIANGULAR BULLET 
<..>    <U2025>      TWO DOT LEADER 
<.3>    <U2026>      HORIZONTAL ELLIPSIS 
<.->    <U2027>      HYPHENATION POINT 
<linesep> <U2028>      LINE SEPARATOR 
<parsep> <U2029>      PARAGRAPH SEPARATOR 
<%0>    <U2030>      PER MILLE SIGN 
<1'>    <U2032>      PRIME 
<2'>    <U2033>      DOUBLE PRIME 
<3'>    <U2034>      TRIPLE PRIME 
<1">    <U2035>      REVERSED PRIME 
<2">    <U2036>      REVERSED DOUBLE PRIME 
<3">    <U2037>      REVERSED TRIPLE PRIME 
<Ca>    <U2038>      CARET 
<<1>    <U2039>      SINGLE LEFT-POINTING ANGLE QUOTATION MARK 
</>1>   <U203A>      SINGLE RIGHT-POINTING ANGLE QUOTATION MARK 
<:X>    <U203B>      REFERENCE MARK 
<!*2>   <U203C>      DOUBLE EXCLAMATION MARK 
<'->    <U203E>      OVERLINE 
<-b>    <U2043>      HYPHEN BULLET 
<//f>   <U2044>      FRACTION SLASH 
<0S>    <U2070>      SUPERSCRIPT ZERO 
<4S>    <U2074>      SUPERSCRIPT FOUR 
<5S>    <U2075>      SUPERSCRIPT FIVE 
<6S>    <U2076>      SUPERSCRIPT SIX 
<7S>    <U2077>      SUPERSCRIPT SEVEN 
<8S>    <U2078>      SUPERSCRIPT EIGHT 
<9S>    <U2079>      SUPERSCRIPT NINE 
<+S>    <U207A>      SUPERSCRIPT PLUS SIGN 
<-S>    <U207B>      SUPERSCRIPT MINUS 
<=S>    <U207C>      SUPERSCRIPT EQUALS SIGN 
<(S>    <U207D>      SUPERSCRIPT LEFT PARENTHESIS 
<)S>    <U207E>      SUPERSCRIPT RIGHT PARENTHESIS 
<nS>    <U207F>      SUPERSCRIPT LATIN SMALL LETTER N 
<0s>    <U2080>      SUBSCRIPT ZERO 
<1s>    <U2081>      SUBSCRIPT ONE 
<2s>    <U2082>      SUBSCRIPT TWO 
<3s>    <U2083>      SUBSCRIPT THREE 
<4s>    <U2084>      SUBSCRIPT FOUR 
<5s>    <U2085>      SUBSCRIPT FIVE 
<6s>    <U2086>      SUBSCRIPT SIX 
<7s>    <U2087>      SUBSCRIPT SEVEN 
<8s>    <U2088>      SUBSCRIPT EIGHT 
<9s>    <U2089>      SUBSCRIPT NINE 
<+s>    <U208A>      SUBSCRIPT PLUS SIGN 
<-s>    <U208B>      SUBSCRIPT MINUS 
<=s>    <U208C>      SUBSCRIPT EQUALS SIGN 
<(s>    <U208D>      SUBSCRIPT LEFT PARENTHESIS 
<)s>    <U208E>      SUBSCRIPT RIGHT PARENTHESIS 
<Ff>    <U20A3>      FRENCH FRANC SIGN 
<Li>    <U20A4>      LIRA SIGN 
<Pt>    <U20A7>      PESETA SIGN 
<W=>    <U20A9>      WON SIGN 
<"7>    <U20D1>      COMBINING RIGHT HARPOON ABOVE 
<oC>    <U2103>      DEGREE CELSIUS 
<co>    <U2105>      CARE OF 
<oF>    <U2109>      DEGREE FAHRENHEIT 
<N0>    <U2116>      NUMERO SIGN 
<PO>    <U2117>      SOUND RECORDING COPYRIGHT 
<Rx>    <U211E>      PRESCRIPTION TAKE 
<SM>    <U2120>      SERVICE MARK 
<TM>    <U2122>      TRADE MARK SIGN 
<Om>    <U2126>      OHM SIGN 
<AO>    <U212B>      ANGSTROM SIGN 
<Est>   <U212E>      ESTIMATED SYMBOL 
<13>    <U2153>      VULGAR FRACTION ONE THIRD 
<23>    <U2154>      VULGAR FRACTION TWO THIRDS 
<15>    <U2155>      VULGAR FRACTION ONE FIFTH 
<25>    <U2156>      VULGAR FRACTION TWO FIFTHS 
<35>    <U2157>      VULGAR FRACTION THREE FIFTHS 
<45>    <U2158>      VULGAR FRACTION FOUR FIFTHS 
<16>    <U2159>      VULGAR FRACTION ONE SIXTH 
<56>    <U215A>      VULGAR FRACTION FIVE SIXTHS 
<18>    <U215B>      VULGAR FRACTION ONE EIGHTH 
<38>    <U215C>      VULGAR FRACTION THREE EIGHTHS 
<58>    <U215D>      VULGAR FRACTION FIVE EIGHTHS 
<78>    <U215E>      VULGAR FRACTION SEVEN EIGHTHS 
<1R>    <U2160>      ROMAN NUMERAL ONE 
<2R>    <U2161>      ROMAN NUMERAL TWO 
<3R>    <U2162>      ROMAN NUMERAL THREE 
<4R>    <U2163>      ROMAN NUMERAL FOUR 
<5R>    <U2164>      ROMAN NUMERAL FIVE 
<6R>    <U2165>      ROMAN NUMERAL SIX 
<7R>    <U2166>      ROMAN NUMERAL SEVEN 
<8R>    <U2167>      ROMAN NUMERAL EIGHT 
<9R>    <U2168>      ROMAN NUMERAL NINE 
<aR>    <U2169>      ROMAN NUMERAL TEN 
<bR>    <U216A>      ROMAN NUMERAL ELEVEN 
<cR>    <U216B>      ROMAN NUMERAL TWELVE 
<50R>   <U216C>      ROMAN NUMERAL FIFTY 
<100R>  <U216D>      ROMAN NUMERAL ONE HUNDRED 
<500R>  <U216E>      ROMAN NUMERAL FIVE HUNDRED 
<1000R> <U216F>      ROMAN NUMERAL ONE THOUSAND 
<1r>    <U2170>      SMALL ROMAN NUMERAL ONE 
<2r>    <U2171>      SMALL ROMAN NUMERAL TWO 
<3r>    <U2172>      SMALL ROMAN NUMERAL THREE 
<4r>    <U2173>      SMALL ROMAN NUMERAL FOUR 
<5r>    <U2174>      SMALL ROMAN NUMERAL FIVE 
<6r>    <U2175>      SMALL ROMAN NUMERAL SIX 
<7r>    <U2176>      SMALL ROMAN NUMERAL SEVEN 
<8r>    <U2177>      SMALL ROMAN NUMERAL EIGHT 
<9r>    <U2178>      SMALL ROMAN NUMERAL NINE 
<ar>    <U2179>      SMALL ROMAN NUMERAL TEN 
<br>    <U217A>      SMALL ROMAN NUMERAL ELEVEN 
<cr>    <U217B>      SMALL ROMAN NUMERAL TWELVE 
<50r>   <U217C>      SMALL ROMAN NUMERAL FIFTY 
<100r>  <U217D>      SMALL ROMAN NUMERAL ONE HUNDRED 
<500r>  <U217E>      SMALL ROMAN NUMERAL FIVE HUNDRED 
<1000r> <U217F>      SMALL ROMAN NUMERAL ONE THOUSAND 
<1000RCD> <U2180>      ROMAN NUMERAL ONE THOUSAND C D 
<5000R> <U2181>      ROMAN NUMERAL FIVE THOUSAND 
<10000R> <U2182>      ROMAN NUMERAL TEN THOUSAND 
<<->    <U2190>      LEFTWARDS ARROW 
<-!>    <U2191>      UPWARDS ARROW 
<-/>>   <U2192>      RIGHTWARDS ARROW 
<-v>    <U2193>      DOWNWARDS ARROW 
<</>>   <U2194>      LEFT RIGHT ARROW 
<UD>    <U2195>      UP DOWN ARROW 
<<!!>   <U2196>      NORTH WEST ARROW 
</////>> <U2197>      NORTH EAST ARROW 
<!!/>>  <U2198>      SOUTH EAST ARROW 
<<////> <U2199>      SOUTH WEST ARROW 
<UD->   <U21A8>      UP DOWN ARROW WITH BASE 
</>V>   <U21C0>      RIGHTWARDS HARPOON WITH BARB UPWARDS 
<<=>    <U21D0>      LEFTWARDS DOUBLE ARROW 
<=/>>   <U21D2>      RIGHTWARDS DOUBLE ARROW 
<==>    <U21D4>      LEFT RIGHT DOUBLE ARROW 
<FA>    <U2200>      FOR ALL 
<dP>    <U2202>      PARTIAL DIFFERENTIAL 
<TE>    <U2203>      THERE EXISTS 
<//0>   <U2205>      EMPTY SET 
<DE>    <U2206>      INCREMENT 
<NB>    <U2207>      NABLA 
<(->    <U2208>      ELEMENT OF 
<-)>    <U220B>      CONTAINS AS MEMBER 
<FP>    <U220E>      END OF PROOF 
<*P>    <U220F>      N-ARY PRODUCT 
<+Z>    <U2211>      N-ARY SUMMATION 
<-2>    <U2212>      MINUS SIGN 
<-+>    <U2213>      MINUS-OR-PLUS SIGN 
<.+>    <U2214>      DOT PLUS 
<*->    <U2217>      ASTERISK OPERATOR 
<Ob>    <U2218>      RING OPERATOR 
<Sb>    <U2219>      BULLET OPERATOR 
<RT>    <U221A>      SQUARE ROOT 
<0(>    <U221D>      PROPORTIONAL TO 
<00>    <U221E>      INFINITY 
<-L>    <U221F>      RIGHT ANGLE 
<-V>    <U2220>      ANGLE 
<PP>    <U2225>      PARALLEL TO 
<AN>    <U2227>      LOGICAL AND 
<OR>    <U2228>      LOGICAL OR 
<(U>    <U2229>      INTERSECTION 
<)U>    <U222A>      UNION 
<In>    <U222B>      INTEGRAL 
<DI>    <U222C>      DOUBLE INTEGRAL 
<Io>    <U222E>      CONTOUR INTEGRAL 
<.:>    <U2234>      THEREFORE 
<:.>    <U2235>      BECAUSE 
<:R>    <U2236>      RATIO 
<::>    <U2237>      PROPORTION 
<?1>    <U223C>      TILDE OPERATOR 
<CG>    <U223E>      INVERTED LAZY S 
<?->    <U2243>      ASYMPTOTICALLY EQUAL TO 
<?=>    <U2245>      APPROXIMATELY EQUAL TO 
<?2>    <U2248>      ALMOST EQUAL TO 
<=?>    <U224C>      ALL EQUAL TO 
<HI>    <U2253>      IMAGE OF OR APPROXIMATELY EQUAL TO 
<!=>    <U2260>      NOT EQUAL TO 
<=3>    <U2261>      IDENTICAL TO 
<=<>    <U2264>      LESS-THAN OR EQUAL TO 
</>=>   <U2265>      GREATER-THAN OR EQUAL TO 
<<*>    <U226A>      MUCH LESS-THAN 
<*/>>   <U226B>      MUCH GREATER-THAN 
<!<>    <U226E>      NOT LESS-THAN 
<!/>>   <U226F>      NOT GREATER-THAN 
<(C>    <U2282>      SUBSET OF 
<)C>    <U2283>      SUPERSET OF 
<(_>    <U2286>      SUBSET OF OR EQUAL TO 
<)_>    <U2287>      SUPERSET OF OR EQUAL TO 
<0.>    <U2299>      CIRCLED DOT OPERATOR 
<02>    <U229A>      CIRCLED RING OPERATOR 
<-T>    <U22A5>      UP TACK 
<.P>    <U22C5>      DOT OPERATOR 
<:3>    <U22EE>      VERTICAL ELLIPSIS 
<Eh>    <U2302>      HOUSE 
<<7>    <U2308>      LEFT CEILING 
</>7>   <U2309>      RIGHT CEILING 
<7<>    <U230A>      LEFT FLOOR 
<7/>>   <U230B>      RIGHT FLOOR 
<NI>    <U2310>      REVERSED NOT SIGN 
<(A>    <U2312>      ARC 
<TR>    <U2315>      TELEPHONE RECORDER 
<88>    <U2318>      PLACE OF INTEREST SIGN 
<Iu>    <U2320>      TOP HALF INTEGRAL 
<Il>    <U2321>      BOTTOM HALF INTEGRAL 
<<//>   <U2329>      LEFT-POINTING ANGLE BRACKET 
<///>>  <U232A>      RIGHT-POINTING ANGLE BRACKET 
<Vs>    <U2423>      OPEN BOX 
<1h>    <U2440>      OCR HOOK 
<3h>    <U2441>      OCR CHAIR 
<2h>    <U2442>      OCR FORK 
<4h>    <U2443>      OCR INVERTED FORK 
<1j>    <U2446>      OCR BRANCH BANK IDENTIFICATION 
<2j>    <U2447>      OCR AMOUNT OF CHECK 
<3j>    <U2448>      OCR DASH 
<4j>    <U2449>      OCR CUSTOMER ACCOUNT NUMBER 
<1-o>   <U2460>      CIRCLED DIGIT ONE 
<2-o>   <U2461>      CIRCLED DIGIT TWO 
<3-o>   <U2462>      CIRCLED DIGIT THREE 
<4-o>   <U2463>      CIRCLED DIGIT FOUR 
<5-o>   <U2464>      CIRCLED DIGIT FIVE 
<6-o>   <U2465>      CIRCLED DIGIT SIX 
<7-o>   <U2466>      CIRCLED DIGIT SEVEN 
<8-o>   <U2467>      CIRCLED DIGIT EIGHT 
<9-o>   <U2468>      CIRCLED DIGIT NINE 
<10-o>  <U2469>      CIRCLED NUMBER TEN 
<11-o>  <U246A>      CIRCLED NUMBER ELEVEN 
<12-o>  <U246B>      CIRCLED NUMBER TWELVE 
<13-o>  <U246C>      CIRCLED NUMBER THIRTEEN 
<14-o>  <U246D>      CIRCLED NUMBER FOURTEEN 
<15-o>  <U246E>      CIRCLED NUMBER FIFTEEN 
<16-o>  <U246F>      CIRCLED NUMBER SIXTEEN 
<17-o>  <U2470>      CIRCLED NUMBER SEVENTEEN 
<18-o>  <U2471>      CIRCLED NUMBER EIGHTEEN 
<19-o>  <U2472>      CIRCLED NUMBER NINETEEN 
<20-o>  <U2473>      CIRCLED NUMBER TWENTY 
<(1)>   <U2474>      PARENTHESIZED DIGIT ONE 
<(2)>   <U2475>      PARENTHESIZED DIGIT TWO 
<(3)>   <U2476>      PARENTHESIZED DIGIT THREE 
<(4)>   <U2477>      PARENTHESIZED DIGIT FOUR 
<(5)>   <U2478>      PARENTHESIZED DIGIT FIVE 
<(6)>   <U2479>      PARENTHESIZED DIGIT SIX 
<(7)>   <U247A>      PARENTHESIZED DIGIT SEVEN 
<(8)>   <U247B>      PARENTHESIZED DIGIT EIGHT 
<(9)>   <U247C>      PARENTHESIZED DIGIT NINE 
<(10)>  <U247D>      PARENTHESIZED NUMBER TEN 
<(11)>  <U247E>      PARENTHESIZED NUMBER ELEVEN 
<(12)>  <U247F>      PARENTHESIZED NUMBER TWELVE 
<(13)>  <U2480>      PARENTHESIZED NUMBER THIRTEEN 
<(14)>  <U2481>      PARENTHESIZED NUMBER FOURTEEN 
<(15)>  <U2482>      PARENTHESIZED NUMBER FIFTEEN 
<(16)>  <U2483>      PARENTHESIZED NUMBER SIXTEEN 
<(17)>  <U2484>      PARENTHESIZED NUMBER SEVENTEEN 
<(18)>  <U2485>      PARENTHESIZED NUMBER EIGHTEEN 
<(19)>  <U2486>      PARENTHESIZED NUMBER NINETEEN 
<(20)>  <U2487>      PARENTHESIZED NUMBER TWENTY 
<1.>    <U2488>      DIGIT ONE FULL STOP 
<2.>    <U2489>      DIGIT TWO FULL STOP 
<3.>    <U248A>      DIGIT THREE FULL STOP 
<4.>    <U248B>      DIGIT FOUR FULL STOP 
<5.>    <U248C>      DIGIT FIVE FULL STOP 
<6.>    <U248D>      DIGIT SIX FULL STOP 
<7.>    <U248E>      DIGIT SEVEN FULL STOP 
<8.>    <U248F>      DIGIT EIGHT FULL STOP 
<9.>    <U2490>      DIGIT NINE FULL STOP 
<10.>   <U2491>      NUMBER TEN FULL STOP 
<11.>   <U2492>      NUMBER ELEVEN FULL STOP 
<12.>   <U2493>      NUMBER TWELVE FULL STOP 
<13.>   <U2494>      NUMBER THIRTEEN FULL STOP 
<14.>   <U2495>      NUMBER FOURTEEN FULL STOP 
<15.>   <U2496>      NUMBER FIFTEEN FULL STOP 
<16.>   <U2497>      NUMBER SIXTEEN FULL STOP 
<17.>   <U2498>      NUMBER SEVENTEEN FULL STOP 
<18.>   <U2499>      NUMBER EIGHTEEN FULL STOP 
<19.>   <U249A>      NUMBER NINETEEN FULL STOP 
<20.>   <U249B>      NUMBER TWENTY FULL STOP 
<(a)>   <U249C>      PARENTHESIZED LATIN SMALL LETTER A 
<(b)>   <U249D>      PARENTHESIZED LATIN SMALL LETTER B 
<(c)>   <U249E>      PARENTHESIZED LATIN SMALL LETTER C 
<(d)>   <U249F>      PARENTHESIZED LATIN SMALL LETTER D 
<(e)>   <U24A0>      PARENTHESIZED LATIN SMALL LETTER E 
<(f)>   <U24A1>      PARENTHESIZED LATIN SMALL LETTER F 
<(g)>   <U24A2>      PARENTHESIZED LATIN SMALL LETTER G 
<(h)>   <U24A3>      PARENTHESIZED LATIN SMALL LETTER H 
<(i)>   <U24A4>      PARENTHESIZED LATIN SMALL LETTER I 
<(j)>   <U24A5>      PARENTHESIZED LATIN SMALL LETTER J 
<(k)>   <U24A6>      PARENTHESIZED LATIN SMALL LETTER K 
<(l)>   <U24A7>      PARENTHESIZED LATIN SMALL LETTER L 
<(m)>   <U24A8>      PARENTHESIZED LATIN SMALL LETTER M 
<(n)>   <U24A9>      PARENTHESIZED LATIN SMALL LETTER N 
<(o)>   <U24AA>      PARENTHESIZED LATIN SMALL LETTER O 
<(p)>   <U24AB>      PARENTHESIZED LATIN SMALL LETTER P 
<(q)>   <U24AC>      PARENTHESIZED LATIN SMALL LETTER Q 
<(r)>   <U24AD>      PARENTHESIZED LATIN SMALL LETTER R 
<(s)>   <U24AE>      PARENTHESIZED LATIN SMALL LETTER S 
<(t)>   <U24AF>      PARENTHESIZED LATIN SMALL LETTER T 
<(u)>   <U24B0>      PARENTHESIZED LATIN SMALL LETTER U 
<(v)>   <U24B1>      PARENTHESIZED LATIN SMALL LETTER V 
<(w)>   <U24B2>      PARENTHESIZED LATIN SMALL LETTER W 
<(x)>   <U24B3>      PARENTHESIZED LATIN SMALL LETTER X 
<(y)>   <U24B4>      PARENTHESIZED LATIN SMALL LETTER Y 
<(z)>   <U24B5>      PARENTHESIZED LATIN SMALL LETTER Z 
<A-o>   <U24B6>      CIRCLED LATIN CAPITAL LETTER A 
<B-o>   <U24B7>      CIRCLED LATIN CAPITAL LETTER B 
<C-o>   <U24B8>      CIRCLED LATIN CAPITAL LETTER C 
<D-o>   <U24B9>      CIRCLED LATIN CAPITAL LETTER D 
<E-o>   <U24BA>      CIRCLED LATIN CAPITAL LETTER E 
<F-o>   <U24BB>      CIRCLED LATIN CAPITAL LETTER F 
<G-o>   <U24BC>      CIRCLED LATIN CAPITAL LETTER G 
<H-o>   <U24BD>      CIRCLED LATIN CAPITAL LETTER H 
<I-o>   <U24BE>      CIRCLED LATIN CAPITAL LETTER I 
<J-o>   <U24BF>      CIRCLED LATIN CAPITAL LETTER J 
<K-o>   <U24C0>      CIRCLED LATIN CAPITAL LETTER K 
<L-o>   <U24C1>      CIRCLED LATIN CAPITAL LETTER L 
<M-o>   <U24C2>      CIRCLED LATIN CAPITAL LETTER M 
<N-o>   <U24C3>      CIRCLED LATIN CAPITAL LETTER N 
<O-o>   <U24C4>      CIRCLED LATIN CAPITAL LETTER O 
<P-o>   <U24C5>      CIRCLED LATIN CAPITAL LETTER P 
<Q-o>   <U24C6>      CIRCLED LATIN CAPITAL LETTER Q 
<R-o>   <U24C7>      CIRCLED LATIN CAPITAL LETTER R 
<S-o>   <U24C8>      CIRCLED LATIN CAPITAL LETTER S 
<T-o>   <U24C9>      CIRCLED LATIN CAPITAL LETTER T 
<U-o>   <U24CA>      CIRCLED LATIN CAPITAL LETTER U 
<V-o>   <U24CB>      CIRCLED LATIN CAPITAL LETTER V 
<W-o>   <U24CC>      CIRCLED LATIN CAPITAL LETTER W 
<X-o>   <U24CD>      CIRCLED LATIN CAPITAL LETTER X 
<Y-o>   <U24CE>      CIRCLED LATIN CAPITAL LETTER Y 
<Z-o>   <U24CF>      CIRCLED LATIN CAPITAL LETTER Z 
<a-o>   <U24D0>      CIRCLED LATIN SMALL LETTER A 
<b-o>   <U24D1>      CIRCLED LATIN SMALL LETTER B 
<c-o>   <U24D2>      CIRCLED LATIN SMALL LETTER C 
<d-o>   <U24D3>      CIRCLED LATIN SMALL LETTER D 
<e-o>   <U24D4>      CIRCLED LATIN SMALL LETTER E 
<f-o>   <U24D5>      CIRCLED LATIN SMALL LETTER F 
<g-o>   <U24D6>      CIRCLED LATIN SMALL LETTER G 
<h-o>   <U24D7>      CIRCLED LATIN SMALL LETTER H 
<i-o>   <U24D8>      CIRCLED LATIN SMALL LETTER I 
<j-o>   <U24D9>      CIRCLED LATIN SMALL LETTER J 
<k-o>   <U24DA>      CIRCLED LATIN SMALL LETTER K 
<l-o>   <U24DB>      CIRCLED LATIN SMALL LETTER L 
<m-o>   <U24DC>      CIRCLED LATIN SMALL LETTER M 
<n-o>   <U24DD>      CIRCLED LATIN SMALL LETTER N 
<o-o>   <U24DE>      CIRCLED LATIN SMALL LETTER O 
<p-o>   <U24DF>      CIRCLED LATIN SMALL LETTER P 
<q-o>   <U24E0>      CIRCLED LATIN SMALL LETTER Q 
<r-o>   <U24E1>      CIRCLED LATIN SMALL LETTER R 
<s-o>   <U24E2>      CIRCLED LATIN SMALL LETTER S 
<t-o>   <U24E3>      CIRCLED LATIN SMALL LETTER T 
<u-o>   <U24E4>      CIRCLED LATIN SMALL LETTER U 
<v-o>   <U24E5>      CIRCLED LATIN SMALL LETTER V 
<w-o>   <U24E6>      CIRCLED LATIN SMALL LETTER W 
<x-o>   <U24E7>      CIRCLED LATIN SMALL LETTER X 
<y-o>   <U24E8>      CIRCLED LATIN SMALL LETTER Y 
<z-o>   <U24E9>      CIRCLED LATIN SMALL LETTER Z 
<0-o>   <U24EA>      CIRCLED DIGIT ZERO 
<hh>    <U2500>      BOX DRAWINGS LIGHT HORIZONTAL 
<HH->   <U2501>      BOX DRAWINGS HEAVY HORIZONTAL 
<vv>    <U2502>      BOX DRAWINGS LIGHT VERTICAL 
<VV->   <U2503>      BOX DRAWINGS HEAVY VERTICAL 
<3->    <U2504>      BOX DRAWINGS LIGHT TRIPLE DASH HORIZONTAL 
<3_>    <U2505>      BOX DRAWINGS HEAVY TRIPLE DASH HORIZONTAL 
<3!>    <U2506>      BOX DRAWINGS LIGHT TRIPLE DASH VERTICAL 
<3//>   <U2507>      BOX DRAWINGS HEAVY TRIPLE DASH VERTICAL 
<4->    <U2508>      BOX DRAWINGS LIGHT QUADRUPLE DASH HORIZONTAL 
<4_>    <U2509>      BOX DRAWINGS HEAVY QUADRUPLE DASH HORIZONTAL 
<4!>    <U250A>      BOX DRAWINGS LIGHT QUADRUPLE DASH VERTICAL 
<4//>   <U250B>      BOX DRAWINGS HEAVY QUADRUPLE DASH VERTICAL 
<dr>    <U250C>      BOX DRAWINGS LIGHT DOWN AND RIGHT 
<dR->   <U250D>      BOX DRAWINGS DOWN LIGHT AND RIGHT HEAVY 
<Dr->   <U250E>      BOX DRAWINGS DOWN HEAVY AND RIGHT LIGHT 
<DR->   <U250F>      BOX DRAWINGS HEAVY DOWN AND RIGHT 
<dl>    <U2510>      BOX DRAWINGS LIGHT DOWN AND LEFT 
<dL->   <U2511>      BOX DRAWINGS DOWN LIGHT AND LEFT HEAVY 
<Dl->   <U2512>      BOX DRAWINGS DOWN HEAVY AND LEFT LIGHT 
<LD->   <U2513>      BOX DRAWINGS HEAVY DOWN AND LEFT 
<ur>    <U2514>      BOX DRAWINGS LIGHT UP AND RIGHT 
<uR->   <U2515>      BOX DRAWINGS UP LIGHT AND RIGHT HEAVY 
<Ur->   <U2516>      BOX DRAWINGS UP HEAVY AND RIGHT LIGHT 
<UR->   <U2517>      BOX DRAWINGS HEAVY UP AND RIGHT 
<ul>    <U2518>      BOX DRAWINGS LIGHT UP AND LEFT 
<uL->   <U2519>      BOX DRAWINGS UP LIGHT AND LEFT HEAVY 
<Ul->   <U251A>      BOX DRAWINGS UP HEAVY AND LEFT LIGHT 
<UL->   <U251B>      BOX DRAWINGS HEAVY UP AND LEFT 
<vr>    <U251C>      BOX DRAWINGS LIGHT VERTICAL AND RIGHT 
<vR->   <U251D>      BOX DRAWINGS VERTICAL LIGHT AND RIGHT HEAVY 
<Udr>   <U251E>      BOX DRAWINGS UP HEAVY AND RIGHT DOWN LIGHT 
<uDr>   <U251F>      BOX DRAWINGS DOWN HEAVY AND RIGHT UP LIGHT 
<Vr->   <U2520>      BOX DRAWINGS VERTICAL HEAVY AND RIGHT LIGHT 
<UdR>   <U2521>      BOX DRAWINGS DOWN LIGHT AND RIGHT UP HEAVY 
<uDR>   <U2522>      BOX DRAWINGS UP LIGHT AND RIGHT DOWN HEAVY 
<VR->   <U2523>      BOX DRAWINGS HEAVY VERTICAL AND RIGHT 
<vl>    <U2524>      BOX DRAWINGS LIGHT VERTICAL AND LEFT 
<vL->   <U2525>      BOX DRAWINGS VERTICAL LIGHT AND LEFT HEAVY 
<Udl>   <U2526>      BOX DRAWINGS UP HEAVY AND LEFT DOWN LIGHT 
<uDl>   <U2527>      BOX DRAWINGS DOWN HEAVY AND LEFT UP LIGHT 
<Vl->   <U2528>      BOX DRAWINGS VERTICAL HEAVY AND LEFT LIGHT 
<UdL>   <U2529>      BOX DRAWINGS DOWN LIGHT AND LEFT UP HEAVY 
<uDL>   <U252A>      BOX DRAWINGS UP LIGHT AND LEFT DOWN HEAVY 
<VL->   <U252B>      BOX DRAWINGS HEAVY VERTICAL AND LEFT 
<dh>    <U252C>      BOX DRAWINGS LIGHT DOWN AND HORIZONTAL 
<dLr>   <U252D>      BOX DRAWINGS LEFT HEAVY AND RIGHT DOWN LIGHT 
<dlR>   <U252E>      BOX DRAWINGS RIGHT HEAVY AND LEFT DOWN LIGHT 
<dH->   <U252F>      BOX DRAWINGS DOWN LIGHT AND HORIZONTAL HEAVY 
<Dh->   <U2530>      BOX DRAWINGS DOWN HEAVY AND HORIZONTAL LIGHT 
<DLr>   <U2531>      BOX DRAWINGS RIGHT LIGHT AND LEFT DOWN HEAVY 
<DlR>   <U2532>      BOX DRAWINGS LEFT LIGHT AND RIGHT DOWN HEAVY 
<DH->   <U2533>      BOX DRAWINGS HEAVY DOWN AND HORIZONTAL 
<uh>    <U2534>      BOX DRAWINGS LIGHT UP AND HORIZONTAL 
<uLr>   <U2535>      BOX DRAWINGS LEFT HEAVY AND RIGHT UP LIGHT 
<ulR>   <U2536>      BOX DRAWINGS RIGHT HEAVY AND LEFT UP LIGHT 
<uH->   <U2537>      BOX DRAWINGS UP LIGHT AND HORIZONTAL HEAVY 
<Uh->   <U2538>      BOX DRAWINGS UP HEAVY AND HORIZONTAL LIGHT 
<ULr>   <U2539>      BOX DRAWINGS RIGHT LIGHT AND LEFT UP HEAVY 
<UlR>   <U253A>      BOX DRAWINGS LEFT LIGHT AND RIGHT UP HEAVY 
<UH->   <U253B>      BOX DRAWINGS HEAVY UP AND HORIZONTAL 
<vh>    <U253C>      BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL 
<vLr>   <U253D>      BOX DRAWINGS LEFT HEAVY AND RIGHT VERTICAL LIGHT 
<vlR>   <U253E>      BOX DRAWINGS RIGHT HEAVY AND LEFT VERTICAL LIGHT 
<vH->   <U253F>      BOX DRAWINGS VERTICAL LIGHT AND HORIZONTAL HEAVY 
<Udh>   <U2540>      BOX DRAWINGS UP HEAVY AND DOWN HORIZONTAL LIGHT 
<uDh>   <U2541>      BOX DRAWINGS DOWN HEAVY AND UP HORIZONTAL LIGHT 
<Vh->   <U2542>      BOX DRAWINGS VERTICAL HEAVY AND HORIZONTAL LIGHT 
<UdLr>  <U2543>      BOX DRAWINGS LEFT UP HEAVY AND RIGHT DOWN LIGHT 
<UdlR>  <U2544>      BOX DRAWINGS RIGHT UP HEAVY AND LEFT DOWN LIGHT 
<uDLr>  <U2545>      BOX DRAWINGS LEFT DOWN HEAVY AND RIGHT UP LIGHT 
<uDlR>  <U2546>      BOX DRAWINGS RIGHT DOWN HEAVY AND LEFT UP LIGHT 
<UdH>   <U2547>      BOX DRAWINGS DOWN LIGHT AND UP HORIZONTAL HEAVY 
<uDH>   <U2548>      BOX DRAWINGS UP LIGHT AND DOWN HORIZONTAL HEAVY 
<VLr>   <U2549>      BOX DRAWINGS RIGHT LIGHT AND LEFT VERTICAL HEAVY 
<VlR>   <U254A>      BOX DRAWINGS LEFT LIGHT AND RIGHT VERTICAL HEAVY 
<VH->   <U254B>      BOX DRAWINGS HEAVY VERTICAL AND HORIZONTAL 
<HH>    <U2550>      BOX DRAWINGS DOUBLE HORIZONTAL 
<VV>    <U2551>      BOX DRAWINGS DOUBLE VERTICAL 
<dR>    <U2552>      BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE 
<Dr>    <U2553>      BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE 
<DR>    <U2554>      BOX DRAWINGS DOUBLE DOWN AND RIGHT 
<dL>    <U2555>      BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE 
<Dl>    <U2556>      BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE 
<LD>    <U2557>      BOX DRAWINGS DOUBLE DOWN AND LEFT 
<uR>    <U2558>      BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE 
<Ur>    <U2559>      BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE 
<UR>    <U255A>      BOX DRAWINGS DOUBLE UP AND RIGHT 
<uL>    <U255B>      BOX DRAWINGS UP SINGLE AND LEFT DOUBLE 
<Ul>    <U255C>      BOX DRAWINGS UP DOUBLE AND LEFT SINGLE 
<UL>    <U255D>      BOX DRAWINGS DOUBLE UP AND LEFT 
<vR>    <U255E>      BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE 
<Vr>    <U255F>      BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE 
<VR>    <U2560>      BOX DRAWINGS DOUBLE VERTICAL AND RIGHT 
<vL>    <U2561>      BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE 
<Vl>    <U2562>      BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE 
<VL>    <U2563>      BOX DRAWINGS DOUBLE VERTICAL AND LEFT 
<dH>    <U2564>      BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE 
<Dh>    <U2565>      BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE 
<DH>    <U2566>      BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL 
<uH>    <U2567>      BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE 
<Uh>    <U2568>      BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE 
<UH>    <U2569>      BOX DRAWINGS DOUBLE UP AND HORIZONTAL 
<vH>    <U256A>      BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE 
<Vh>    <U256B>      BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE 
<VH>    <U256C>      BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL 
<FD>    <U2571>      BOX DRAWINGS LIGHT DIAGONAL UPPER RIGHT TO LOWER LEFT 
<BD>    <U2572>      BOX DRAWINGS LIGHT DIAGONAL UPPER LEFT TO LOWER RIGHT 
<TB>    <U2580>      UPPER HALF BLOCK 
<LB>    <U2584>      LOWER HALF BLOCK 
<FB>    <U2588>      FULL BLOCK 
<lB>    <U258C>      LEFT HALF BLOCK 
<RB>    <U2590>      RIGHT HALF BLOCK 
<.S>    <U2591>      LIGHT SHADE 
<:S>    <U2592>      MEDIUM SHADE 
<?S>    <U2593>      DARK SHADE 
<fS>    <U25A0>      BLACK SQUARE 
<OS>    <U25A1>      WHITE SQUARE 
<RO>    <U25A2>      WHITE SQUARE WITH ROUNDED CORNERS 
<Rr>    <U25A3>      WHITE SQUARE CONTAINING BLACK SMALL SQUARE 
<RF>    <U25A4>      SQUARE WITH HORIZONTAL FILL 
<RY>    <U25A5>      SQUARE WITH VERTICAL FILL 
<RH>    <U25A6>      SQUARE WITH ORTHOGONAL CROSSHATCH FILL 
<RZ>    <U25A7>      SQUARE WITH UPPER LEFT TO LOWER RIGHT FILL 
<RK>    <U25A8>      SQUARE WITH UPPER RIGHT TO LOWER LEFT FILL 
<RX>    <U25A9>      SQUARE WITH DIAGONAL CROSSHATCH FILL 
<sB>    <U25AA>      BLACK SMALL SQUARE 
<SR>    <U25AC>      BLACK RECTANGLE 
<Or>    <U25AD>      WHITE RECTANGLE 
<UT>    <U25B2>      BLACK UP-POINTING TRIANGLE 
<uT>    <U25B3>      WHITE UP-POINTING TRIANGLE 
<Tr>    <U25B7>      WHITE RIGHT-POINTING TRIANGLE 
<PR>    <U25BA>      BLACK RIGHT-POINTING POINTER 
<Dt>    <U25BC>      BLACK DOWN-POINTING TRIANGLE 
<dT>    <U25BD>      WHITE DOWN-POINTING TRIANGLE 
<Tl>    <U25C1>      WHITE LEFT-POINTING TRIANGLE 
<PL>    <U25C4>      BLACK LEFT-POINTING POINTER 
<Db>    <U25C6>      BLACK DIAMOND 
<Dw>    <U25C7>      WHITE DIAMOND 
<LZ>    <U25CA>      LOZENGE 
<0m>    <U25CB>      WHITE CIRCLE 
<0o>    <U25CE>      BULLSEYE 
<0M>    <U25CF>      BLACK CIRCLE 
<0L>    <U25D0>      CIRCLE WITH LEFT HALF BLACK 
<0R>    <U25D1>      CIRCLE WITH RIGHT HALF BLACK 
<Sn>    <U25D8>      INVERSE BULLET 
<Ic>    <U25D9>      INVERSE WHITE CIRCLE 
<Fd>    <U25E2>      BLACK LOWER RIGHT TRIANGLE 
<Bd>    <U25E3>      BLACK LOWER LEFT TRIANGLE 
<Ci>    <U25EF>      LARGE CIRCLE 
<*2>    <U2605>      BLACK STAR 
<*1>    <U2606>      WHITE STAR 
<TEL>   <U260E>      BLACK TELEPHONE 
<tel>   <U260F>      WHITE TELEPHONE 
<<H>    <U261C>      WHITE LEFT POINTING INDEX 
</>H>   <U261E>      WHITE RIGHT POINTING INDEX 
<0u>    <U263A>      WHITE SMILING FACE 
<0U>    <U263B>      BLACK SMILING FACE 
<SU>    <U263C>      WHITE SUN WITH RAYS 
<Fm>    <U2640>      FEMALE SIGN 
<Ml>    <U2642>      MALE SIGN 
<cS>    <U2660>      BLACK SPADE SUIT 
<cH>    <U2661>      WHITE HEART SUIT 
<cD>    <U2662>      WHITE DIAMOND SUIT 
<cC>    <U2663>      BLACK CLUB SUIT 
<cS->   <U2664>      WHITE SPADE SUIT 
<cH->   <U2665>      BLACK HEART SUIT 
<cD->   <U2666>      BLACK DIAMOND SUIT 
<cC->   <U2667>      WHITE CLUB SUIT 
<Md>    <U2669>      QUARTER NOTE 
<M8>    <U266A>      EIGHTH NOTE 
<M2>    <U266B>      BEAMED EIGHTH NOTES 
<M16>   <U266C>      BEAMED SIXTEENTH NOTES 
<Mb>    <U266D>      MUSIC FLAT SIGN 
<Mx>    <U266E>      MUSIC NATURAL SIGN 
<MX>    <U266F>      MUSIC SHARP SIGN 
<OK>    <U2713>      CHECK MARK 
<XX>    <U2717>      BALLOT X 
<-X>    <U2720>      MALTESE CROSS 
<IS>    <U3000>      IDEOGRAPHIC SPACE 
<,_>    <U3001>      IDEOGRAPHIC COMMA 
<._>    <U3002>      IDEOGRAPHIC FULL STOP 
<+">    <U3003>      DITTO MARK 
<JIS>   <U3004>      JAPANESE INDUSTRIAL STANDARD SYMBOL 
<*_>    <U3005>      IDEOGRAPHIC ITERATION MARK 
<;_>    <U3006>      IDEOGRAPHIC CLOSING MARK 
<0_>    <U3007>      IDEOGRAPHIC NUMBER ZERO 
<<+>    <U300A>      LEFT DOUBLE ANGLE BRACKET 
</>+>   <U300B>      RIGHT DOUBLE ANGLE BRACKET 
<<'>    <U300C>      LEFT CORNER BRACKET 
</>'>   <U300D>      RIGHT CORNER BRACKET 
<<">    <U300E>      LEFT WHITE CORNER BRACKET 
</>">   <U300F>      RIGHT WHITE CORNER BRACKET 
<(">    <U3010>      LEFT BLACK LENTICULAR BRACKET 
<)">    <U3011>      RIGHT BLACK LENTICULAR BRACKET 
<=T>    <U3012>      POSTAL MARK 
<=_>    <U3013>      GETA MARK 
<('>    <U3014>      LEFT TORTOISE SHELL BRACKET 
<)'>    <U3015>      RIGHT TORTOISE SHELL BRACKET 
<(I>    <U3016>      LEFT WHITE LENTICULAR BRACKET 
<)I>    <U3017>      RIGHT WHITE LENTICULAR BRACKET 
<-?>    <U301C>      WAVE DASH 
<=T:)>  <U3020>      POSTAL MARK FACE 
<A5>    <U3041>      HIRAGANA LETTER SMALL A 
<a5>    <U3042>      HIRAGANA LETTER A 
<I5>    <U3043>      HIRAGANA LETTER SMALL I 
<i5>    <U3044>      HIRAGANA LETTER I 
<U5>    <U3045>      HIRAGANA LETTER SMALL U 
<u5>    <U3046>      HIRAGANA LETTER U 
<E5>    <U3047>      HIRAGANA LETTER SMALL E 
<e5>    <U3048>      HIRAGANA LETTER E 
<O5>    <U3049>      HIRAGANA LETTER SMALL O 
<o5>    <U304A>      HIRAGANA LETTER O 
<ka>    <U304B>      HIRAGANA LETTER KA 
<ga>    <U304C>      HIRAGANA LETTER GA 
<ki>    <U304D>      HIRAGANA LETTER KI 
<gi>    <U304E>      HIRAGANA LETTER GI 
<ku>    <U304F>      HIRAGANA LETTER KU 
<gu>    <U3050>      HIRAGANA LETTER GU 
<ke>    <U3051>      HIRAGANA LETTER KE 
<ge>    <U3052>      HIRAGANA LETTER GE 
<ko>    <U3053>      HIRAGANA LETTER KO 
<go>    <U3054>      HIRAGANA LETTER GO 
<sa>    <U3055>      HIRAGANA LETTER SA 
<za>    <U3056>      HIRAGANA LETTER ZA 
<si>    <U3057>      HIRAGANA LETTER SI 
<zi>    <U3058>      HIRAGANA LETTER ZI 
<su>    <U3059>      HIRAGANA LETTER SU 
<zu>    <U305A>      HIRAGANA LETTER ZU 
<se>    <U305B>      HIRAGANA LETTER SE 
<ze>    <U305C>      HIRAGANA LETTER ZE 
<so>    <U305D>      HIRAGANA LETTER SO 
<zo>    <U305E>      HIRAGANA LETTER ZO 
<ta>    <U305F>      HIRAGANA LETTER TA 
<da>    <U3060>      HIRAGANA LETTER DA 
<ti>    <U3061>      HIRAGANA LETTER TI 
<di>    <U3062>      HIRAGANA LETTER DI 
<tU>    <U3063>      HIRAGANA LETTER SMALL TU 
<tu>    <U3064>      HIRAGANA LETTER TU 
<du>    <U3065>      HIRAGANA LETTER DU 
<te>    <U3066>      HIRAGANA LETTER TE 
<de>    <U3067>      HIRAGANA LETTER DE 
<to>    <U3068>      HIRAGANA LETTER TO 
<do>    <U3069>      HIRAGANA LETTER DO 
<na>    <U306A>      HIRAGANA LETTER NA 
<ni>    <U306B>      HIRAGANA LETTER NI 
<nu>    <U306C>      HIRAGANA LETTER NU 
<ne>    <U306D>      HIRAGANA LETTER NE 
<no>    <U306E>      HIRAGANA LETTER NO 
<ha>    <U306F>      HIRAGANA LETTER HA 
<ba>    <U3070>      HIRAGANA LETTER BA 
<pa>    <U3071>      HIRAGANA LETTER PA 
<hi>    <U3072>      HIRAGANA LETTER HI 
<bi>    <U3073>      HIRAGANA LETTER BI 
<pi>    <U3074>      HIRAGANA LETTER PI 
<hu>    <U3075>      HIRAGANA LETTER HU 
<bu>    <U3076>      HIRAGANA LETTER BU 
<pu>    <U3077>      HIRAGANA LETTER PU 
<he>    <U3078>      HIRAGANA LETTER HE 
<be>    <U3079>      HIRAGANA LETTER BE 
<pe>    <U307A>      HIRAGANA LETTER PE 
<ho>    <U307B>      HIRAGANA LETTER HO 
<bo>    <U307C>      HIRAGANA LETTER BO 
<po>    <U307D>      HIRAGANA LETTER PO 
<ma>    <U307E>      HIRAGANA LETTER MA 
<mi>    <U307F>      HIRAGANA LETTER MI 
<mu>    <U3080>      HIRAGANA LETTER MU 
<me>    <U3081>      HIRAGANA LETTER ME 
<mo>    <U3082>      HIRAGANA LETTER MO 
<yA>    <U3083>      HIRAGANA LETTER SMALL YA 
<ya>    <U3084>      HIRAGANA LETTER YA 
<yU>    <U3085>      HIRAGANA LETTER SMALL YU 
<yu>    <U3086>      HIRAGANA LETTER YU 
<yO>    <U3087>      HIRAGANA LETTER SMALL YO 
<yo>    <U3088>      HIRAGANA LETTER YO 
<ra>    <U3089>      HIRAGANA LETTER RA 
<ri>    <U308A>      HIRAGANA LETTER RI 
<ru>    <U308B>      HIRAGANA LETTER RU 
<re>    <U308C>      HIRAGANA LETTER RE 
<ro>    <U308D>      HIRAGANA LETTER RO 
<wA>    <U308E>      HIRAGANA LETTER SMALL WA 
<wa>    <U308F>      HIRAGANA LETTER WA 
<wi>    <U3090>      HIRAGANA LETTER WI 
<we>    <U3091>      HIRAGANA LETTER WE 
<wo>    <U3092>      HIRAGANA LETTER WO 
<n5>    <U3093>      HIRAGANA LETTER N 
<vu>    <U3094>      HIRAGANA LETTER VU 
<"5>    <U309B>      KATAKANA-HIRAGANA VOICED SOUND MARK 
<05>    <U309C>      KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK 
<*5>    <U309D>      HIRAGANA ITERATION MARK 
<+5>    <U309E>      HIRAGANA VOICED ITERATION MARK 
<a6>    <U30A1>      KATAKANA LETTER SMALL A 
<A6>    <U30A2>      KATAKANA LETTER A 
<i6>    <U30A3>      KATAKANA LETTER SMALL I 
<I6>    <U30A4>      KATAKANA LETTER I 
<u6>    <U30A5>      KATAKANA LETTER SMALL U 
<U6>    <U30A6>      KATAKANA LETTER U 
<e6>    <U30A7>      KATAKANA LETTER SMALL E 
<E6>    <U30A8>      KATAKANA LETTER E 
<o6>    <U30A9>      KATAKANA LETTER SMALL O 
<O6>    <U30AA>      KATAKANA LETTER O 
<Ka>    <U30AB>      KATAKANA LETTER KA 
<Ga>    <U30AC>      KATAKANA LETTER GA 
<Ki>    <U30AD>      KATAKANA LETTER KI 
<Gi>    <U30AE>      KATAKANA LETTER GI 
<Ku>    <U30AF>      KATAKANA LETTER KU 
<Gu>    <U30B0>      KATAKANA LETTER GU 
<Ke>    <U30B1>      KATAKANA LETTER KE 
<Ge>    <U30B2>      KATAKANA LETTER GE 
<Ko>    <U30B3>      KATAKANA LETTER KO 
<Go>    <U30B4>      KATAKANA LETTER GO 
<Sa>    <U30B5>      KATAKANA LETTER SA 
<Za>    <U30B6>      KATAKANA LETTER ZA 
<Si>    <U30B7>      KATAKANA LETTER SI 
<Zi>    <U30B8>      KATAKANA LETTER ZI 
<Su>    <U30B9>      KATAKANA LETTER SU 
<Zu>    <U30BA>      KATAKANA LETTER ZU 
<Se>    <U30BB>      KATAKANA LETTER SE 
<Ze>    <U30BC>      KATAKANA LETTER ZE 
<So>    <U30BD>      KATAKANA LETTER SO 
<Zo>    <U30BE>      KATAKANA LETTER ZO 
<Ta>    <U30BF>      KATAKANA LETTER TA 
<Da>    <U30C0>      KATAKANA LETTER DA 
<Ti>    <U30C1>      KATAKANA LETTER TI 
<Di>    <U30C2>      KATAKANA LETTER DI 
<TU>    <U30C3>      KATAKANA LETTER SMALL TU 
<Tu>    <U30C4>      KATAKANA LETTER TU 
<Du>    <U30C5>      KATAKANA LETTER DU 
<Te>    <U30C6>      KATAKANA LETTER TE 
<De>    <U30C7>      KATAKANA LETTER DE 
<To>    <U30C8>      KATAKANA LETTER TO 
<Do>    <U30C9>      KATAKANA LETTER DO 
<Na>    <U30CA>      KATAKANA LETTER NA 
<Ni>    <U30CB>      KATAKANA LETTER NI 
<Nu>    <U30CC>      KATAKANA LETTER NU 
<Ne>    <U30CD>      KATAKANA LETTER NE 
<No>    <U30CE>      KATAKANA LETTER NO 
<Ha>    <U30CF>      KATAKANA LETTER HA 
<Ba>    <U30D0>      KATAKANA LETTER BA 
<Pa>    <U30D1>      KATAKANA LETTER PA 
<Hi>    <U30D2>      KATAKANA LETTER HI 
<Bi>    <U30D3>      KATAKANA LETTER BI 
<Pi>    <U30D4>      KATAKANA LETTER PI 
<Hu>    <U30D5>      KATAKANA LETTER HU 
<Bu>    <U30D6>      KATAKANA LETTER BU 
<Pu>    <U30D7>      KATAKANA LETTER PU 
<He>    <U30D8>      KATAKANA LETTER HE 
<Be>    <U30D9>      KATAKANA LETTER BE 
<Pe>    <U30DA>      KATAKANA LETTER PE 
<Ho>    <U30DB>      KATAKANA LETTER HO 
<Bo>    <U30DC>      KATAKANA LETTER BO 
<Po>    <U30DD>      KATAKANA LETTER PO 
<Ma>    <U30DE>      KATAKANA LETTER MA 
<Mi>    <U30DF>      KATAKANA LETTER MI 
<Mu>    <U30E0>      KATAKANA LETTER MU 
<Me>    <U30E1>      KATAKANA LETTER ME 
<Mo>    <U30E2>      KATAKANA LETTER MO 
<YA>    <U30E3>      KATAKANA LETTER SMALL YA 
<Ya>    <U30E4>      KATAKANA LETTER YA 
<YU>    <U30E5>      KATAKANA LETTER SMALL YU 
<Yu>    <U30E6>      KATAKANA LETTER YU 
<YO>    <U30E7>      KATAKANA LETTER SMALL YO 
<Yo>    <U30E8>      KATAKANA LETTER YO 
<Ra>    <U30E9>      KATAKANA LETTER RA 
<Ri>    <U30EA>      KATAKANA LETTER RI 
<Ru>    <U30EB>      KATAKANA LETTER RU 
<Re>    <U30EC>      KATAKANA LETTER RE 
<Ro>    <U30ED>      KATAKANA LETTER RO 
<WA>    <U30EE>      KATAKANA LETTER SMALL WA 
<Wa>    <U30EF>      KATAKANA LETTER WA 
<Wi>    <U30F0>      KATAKANA LETTER WI 
<We>    <U30F1>      KATAKANA LETTER WE 
<Wo>    <U30F2>      KATAKANA LETTER WO 
<N6>    <U30F3>      KATAKANA LETTER N 
<Vu>    <U30F4>      KATAKANA LETTER VU 
<KA>    <U30F5>      KATAKANA LETTER SMALL KA 
<KE>    <U30F6>      KATAKANA LETTER SMALL KE 
<Va>    <U30F7>      KATAKANA LETTER VA 
<Vi>    <U30F8>      KATAKANA LETTER VI 
<Ve>    <U30F9>      KATAKANA LETTER VE 
<Vo>    <U30FA>      KATAKANA LETTER VO 
<.6>    <U30FB>      KATAKANA MIDDLE DOT 
<-6>    <U30FC>      KATAKANA-HIRAGANA PROLONGED SOUND MARK 
<*6>    <U30FD>      KATAKANA ITERATION MARK 
<+6>    <U30FE>      KATAKANA VOICED ITERATION MARK 
<b4>    <U3105>      BOPOMOFO LETTER B 
<p4>    <U3106>      BOPOMOFO LETTER P 
<m4>    <U3107>      BOPOMOFO LETTER M 
<f4>    <U3108>      BOPOMOFO LETTER F 
<d4>    <U3109>      BOPOMOFO LETTER D 
<t4>    <U310A>      BOPOMOFO LETTER T 
<n4>    <U310B>      BOPOMOFO LETTER N 
<l4>    <U310C>      BOPOMOFO LETTER L 
<g4>    <U310D>      BOPOMOFO LETTER G 
<k4>    <U310E>      BOPOMOFO LETTER K 
<h4>    <U310F>      BOPOMOFO LETTER H 
<j4>    <U3110>      BOPOMOFO LETTER J 
<q4>    <U3111>      BOPOMOFO LETTER Q 
<x4>    <U3112>      BOPOMOFO LETTER X 
<zh>    <U3113>      BOPOMOFO LETTER ZH 
<ch>    <U3114>      BOPOMOFO LETTER CH 
<sh>    <U3115>      BOPOMOFO LETTER SH 
<r4>    <U3116>      BOPOMOFO LETTER R 
<z4>    <U3117>      BOPOMOFO LETTER Z 
<c4>    <U3118>      BOPOMOFO LETTER C 
<s4>    <U3119>      BOPOMOFO LETTER S 
<a4>    <U311A>      BOPOMOFO LETTER A 
<o4>    <U311B>      BOPOMOFO LETTER O 
<e4>    <U311C>      BOPOMOFO LETTER E 
<eh4>   <U311D>      BOPOMOFO LETTER EH 
<ai>    <U311E>      BOPOMOFO LETTER AI 
<ei>    <U311F>      BOPOMOFO LETTER EI 
<au>    <U3120>      BOPOMOFO LETTER AU 
<ou>    <U3121>      BOPOMOFO LETTER OU 
<an>    <U3122>      BOPOMOFO LETTER AN 
<en>    <U3123>      BOPOMOFO LETTER EN 
<aN>    <U3124>      BOPOMOFO LETTER ANG 
<eN>    <U3125>      BOPOMOFO LETTER ENG 
<er>    <U3126>      BOPOMOFO LETTER ER 
<i4>    <U3127>      BOPOMOFO LETTER I 
<u4>    <U3128>      BOPOMOFO LETTER U 
<iu>    <U3129>      BOPOMOFO LETTER IU 
<v4>    <U312A>      BOPOMOFO LETTER V 
<nG>    <U312B>      BOPOMOFO LETTER NG 
<gn>    <U312C>      BOPOMOFO LETTER GN 
<(JU)>  <U321C>      PARENTHESIZED HANGUL CIEUC U 
<1c>    <U3220>      PARENTHESIZED IDEOGRAPH ONE 
<2c>    <U3221>      PARENTHESIZED IDEOGRAPH TWO 
<3c>    <U3222>      PARENTHESIZED IDEOGRAPH THREE 
<4c>    <U3223>      PARENTHESIZED IDEOGRAPH FOUR 
<5c>    <U3224>      PARENTHESIZED IDEOGRAPH FIVE 
<6c>    <U3225>      PARENTHESIZED IDEOGRAPH SIX 
<7c>    <U3226>      PARENTHESIZED IDEOGRAPH SEVEN 
<8c>    <U3227>      PARENTHESIZED IDEOGRAPH EIGHT 
<9c>    <U3228>      PARENTHESIZED IDEOGRAPH NINE 
<10c>   <U3229>      PARENTHESIZED IDEOGRAPH TEN 
<KSC>   <U327F>      KOREAN STANDARD SYMBOL 
<am>    <U33C2>      SQUARE AM 
<pm>    <U33D8>      SQUARE PM 
<ff>    <UFB00>      LATIN SMALL LIGATURE FF 
<fi>    <UFB01>      LATIN SMALL LIGATURE FI 
<fl>    <UFB02>      LATIN SMALL LIGATURE FL 
<ffi>   <UFB03>      LATIN SMALL LIGATURE FFI 
<ffl>   <UFB04>      LATIN SMALL LIGATURE FFL 
<St>    <UFB05>      LATIN SMALL LIGATURE LONG S T 
<st>    <UFB06>      LATIN SMALL LIGATURE ST 
<3+;>   <UFE7D>      ARABIC SHADDA MEDIAL FORM 
<aM.>   <UFE82>      ARABIC LETTER ALEF WITH MADDA ABOVE FINAL FORM 
<aH.>   <UFE84>      ARABIC LETTER ALEF WITH HAMZA ABOVE FINAL FORM 
<ah.>   <UFE88>      ARABIC LETTER ALEF WITH HAMZA BELOW FINAL FORM 
<a+->   <UFE8D>      ARABIC LETTER ALEF ISOLATED FORM 
<a+.>   <UFE8E>      ARABIC LETTER ALEF FINAL FORM 
<b+->   <UFE8F>      ARABIC LETTER BEH ISOLATED FORM 
<b+.>   <UFE90>      ARABIC LETTER BEH FINAL FORM 
<b+,>   <UFE91>      ARABIC LETTER BEH INITIAL FORM 
<b+;>   <UFE92>      ARABIC LETTER BEH MEDIAL FORM 
<tm->   <UFE93>      ARABIC LETTER TEH MARBUTA ISOLATED FORM 
<tm.>   <UFE94>      ARABIC LETTER TEH MARBUTA FINAL FORM 
<t+->   <UFE95>      ARABIC LETTER TEH ISOLATED FORM 
<t+.>   <UFE96>      ARABIC LETTER TEH FINAL FORM 
<t+,>   <UFE97>      ARABIC LETTER TEH INITIAL FORM 
<t+;>   <UFE98>      ARABIC LETTER TEH MEDIAL FORM 
<tk->   <UFE99>      ARABIC LETTER THEH ISOLATED FORM 
<tk.>   <UFE9A>      ARABIC LETTER THEH FINAL FORM 
<tk,>   <UFE9B>      ARABIC LETTER THEH INITIAL FORM 
<tk;>   <UFE9C>      ARABIC LETTER THEH MEDIAL FORM 
<g+->   <UFE9D>      ARABIC LETTER JEEM ISOLATED FORM 
<g+.>   <UFE9E>      ARABIC LETTER JEEM FINAL FORM 
<g+,>   <UFE9F>      ARABIC LETTER JEEM INITIAL FORM 
<g+;>   <UFEA0>      ARABIC LETTER JEEM MEDIAL FORM 
<hk->   <UFEA1>      ARABIC LETTER HAH ISOLATED FORM 
<hk.>   <UFEA2>      ARABIC LETTER HAH FINAL FORM 
<hk,>   <UFEA3>      ARABIC LETTER HAH INITIAL FORM 
<hk;>   <UFEA4>      ARABIC LETTER HAH MEDIAL FORM 
<x+->   <UFEA5>      ARABIC LETTER KHAH ISOLATED FORM 
<x+.>   <UFEA6>      ARABIC LETTER KHAH FINAL FORM 
<x+,>   <UFEA7>      ARABIC LETTER KHAH INITIAL FORM 
<x+;>   <UFEA8>      ARABIC LETTER KHAH MEDIAL FORM 
<d+->   <UFEA9>      ARABIC LETTER DAL ISOLATED FORM 
<d+.>   <UFEAA>      ARABIC LETTER DAL FINAL FORM 
<dk->   <UFEAB>      ARABIC LETTER THAL ISOLATED FORM 
<dk.>   <UFEAC>      ARABIC LETTER THAL FINAL FORM 
<r+->   <UFEAD>      ARABIC LETTER REH ISOLATED FORM 
<r+.>   <UFEAE>      ARABIC LETTER REH FINAL FORM 
<z+->   <UFEAF>      ARABIC LETTER ZAIN ISOLATED FORM 
<z+.>   <UFEB0>      ARABIC LETTER ZAIN FINAL FORM 
<s+->   <UFEB1>      ARABIC LETTER SEEN ISOLATED FORM 
<s+.>   <UFEB2>      ARABIC LETTER SEEN FINAL FORM 
<s+,>   <UFEB3>      ARABIC LETTER SEEN INITIAL FORM 
<s+;>   <UFEB4>      ARABIC LETTER SEEN MEDIAL FORM 
<sn->   <UFEB5>      ARABIC LETTER SHEEN ISOLATED FORM 
<sn.>   <UFEB6>      ARABIC LETTER SHEEN FINAL FORM 
<sn,>   <UFEB7>      ARABIC LETTER SHEEN INITIAL FORM 
<sn;>   <UFEB8>      ARABIC LETTER SHEEN MEDIAL FORM 
<c+->   <UFEB9>      ARABIC LETTER SAD ISOLATED FORM 
<c+.>   <UFEBA>      ARABIC LETTER SAD FINAL FORM 
<c+,>   <UFEBB>      ARABIC LETTER SAD INITIAL FORM 
<c+;>   <UFEBC>      ARABIC LETTER SAD MEDIAL FORM 
<dd->   <UFEBD>      ARABIC LETTER DAD ISOLATED FORM 
<dd.>   <UFEBE>      ARABIC LETTER DAD FINAL FORM 
<dd,>   <UFEBF>      ARABIC LETTER DAD INITIAL FORM 
<dd;>   <UFEC0>      ARABIC LETTER DAD MEDIAL FORM 
<tj->   <UFEC1>      ARABIC LETTER TAH ISOLATED FORM 
<tj.>   <UFEC2>      ARABIC LETTER TAH FINAL FORM 
<tj,>   <UFEC3>      ARABIC LETTER TAH INITIAL FORM 
<tj;>   <UFEC4>      ARABIC LETTER TAH MEDIAL FORM 
<zH->   <UFEC5>      ARABIC LETTER ZAH ISOLATED FORM 
<zH.>   <UFEC6>      ARABIC LETTER ZAH FINAL FORM 
<zH,>   <UFEC7>      ARABIC LETTER ZAH INITIAL FORM 
<zH;>   <UFEC8>      ARABIC LETTER ZAH MEDIAL FORM 
<e+->   <UFEC9>      ARABIC LETTER AIN ISOLATED FORM 
<e+.>   <UFECA>      ARABIC LETTER AIN FINAL FORM 
<e+,>   <UFECB>      ARABIC LETTER AIN INITIAL FORM 
<e+;>   <UFECC>      ARABIC LETTER AIN MEDIAL FORM 
<i+->   <UFECD>      ARABIC LETTER GHAIN ISOLATED FORM 
<i+.>   <UFECE>      ARABIC LETTER GHAIN FINAL FORM 
<i+,>   <UFECF>      ARABIC LETTER GHAIN INITIAL FORM 
<i+;>   <UFED0>      ARABIC LETTER GHAIN MEDIAL FORM 
<f+->   <UFED1>      ARABIC LETTER FEH ISOLATED FORM 
<f+.>   <UFED2>      ARABIC LETTER FEH FINAL FORM 
<f+,>   <UFED3>      ARABIC LETTER FEH INITIAL FORM 
<f+;>   <UFED4>      ARABIC LETTER FEH MEDIAL FORM 
<q+->   <UFED5>      ARABIC LETTER QAF ISOLATED FORM 
<q+.>   <UFED6>      ARABIC LETTER QAF FINAL FORM 
<q+,>   <UFED7>      ARABIC LETTER QAF INITIAL FORM 
<q+;>   <UFED8>      ARABIC LETTER QAF MEDIAL FORM 
<k+->   <UFED9>      ARABIC LETTER KAF ISOLATED FORM 
<k+.>   <UFEDA>      ARABIC LETTER KAF FINAL FORM 
<k+,>   <UFEDB>      ARABIC LETTER KAF INITIAL FORM 
<k+;>   <UFEDC>      ARABIC LETTER KAF MEDIAL FORM 
<l+->   <UFEDD>      ARABIC LETTER LAM ISOLATED FORM 
<l+.>   <UFEDE>      ARABIC LETTER LAM FINAL FORM 
<l+,>   <UFEDF>      ARABIC LETTER LAM INITIAL FORM 
<l+;>   <UFEE0>      ARABIC LETTER LAM MEDIAL FORM 
<m+->   <UFEE1>      ARABIC LETTER MEEM ISOLATED FORM 
<m+.>   <UFEE2>      ARABIC LETTER MEEM FINAL FORM 
<m+,>   <UFEE3>      ARABIC LETTER MEEM INITIAL FORM 
<m+;>   <UFEE4>      ARABIC LETTER MEEM MEDIAL FORM 
<n+->   <UFEE5>      ARABIC LETTER NOON ISOLATED FORM 
<n+.>   <UFEE6>      ARABIC LETTER NOON FINAL FORM 
<n+,>   <UFEE7>      ARABIC LETTER NOON INITIAL FORM 
<n+;>   <UFEE8>      ARABIC LETTER NOON MEDIAL FORM 
<h+->   <UFEE9>      ARABIC LETTER HEH ISOLATED FORM 
<h+.>   <UFEEA>      ARABIC LETTER HEH FINAL FORM 
<h+,>   <UFEEB>      ARABIC LETTER HEH INITIAL FORM 
<h+;>   <UFEEC>      ARABIC LETTER HEH MEDIAL FORM 
<w+->   <UFEED>      ARABIC LETTER WAW ISOLATED FORM 
<w+.>   <UFEEE>      ARABIC LETTER WAW FINAL FORM 
<j+->   <UFEEF>      ARABIC LETTER ALEF MAKSURA ISOLATED FORM 
<j+.>   <UFEF0>      ARABIC LETTER ALEF MAKSURA FINAL FORM 
<y+->   <UFEF1>      ARABIC LETTER YEH ISOLATED FORM 
<y+.>   <UFEF2>      ARABIC LETTER YEH FINAL FORM 
<y+,>   <UFEF3>      ARABIC LETTER YEH INITIAL FORM 
<y+;>   <UFEF4>      ARABIC LETTER YEH MEDIAL FORM 
<lM->   <UFEF5>      ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE ISOLATED FORM 
<lM.>   <UFEF6>      ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE FINAL FORM 
<lH->   <UFEF7>      ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE ISOLATED FORM 
<lH.>   <UFEF8>      ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE FINAL FORM 
<lh->   <UFEF9>      ARABIC LIGATURE LAM WITH ALEF WITH HAMZA BELOW ISOLATED FORM 
<lh.>   <UFEFA>      ARABIC LIGATURE LAM WITH ALEF WITH HAMZA BELOW FINAL FORM 
<la->   <UFEFB>      ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM 
<la.>   <UFEFC>      ARABIC LIGATURE LAM WITH ALEF FINAL FORM 
<"3>    <UE000>      NON-SPACING UMLAUT <ISO-IR-53_C9/> (not a real character) 
<"1>    <UE001>      NON-SPACING DIAERESIS WITH ACCENT <ISO-IR-70_C0/> (not a real character) 
<"!>    <UE002>      NON-SPACING GRAVE ACCENT <ISO-IR-103_C1/> (not a real character) 
<"'>    <UE003>      NON-SPACING ACUTE ACCENT <ISO-IR-103_C2/> (not a real character) 
<"/>>   <UE004>      NON-SPACING CIRCUMFLEX ACCENT <ISO-IR-103_C3/> (not a real character) 
<"?>    <UE005>      NON-SPACING TILDE <ISO-IR-103_C4/> (not a real character) 
<"->    <UE006>      NON-SPACING MACRON <ISO-IR-103_C5/> (not a real character) 
<"(>    <UE007>      NON-SPACING BREVE <ISO-IR-103_C6/> (not a real character) 
<".>    <UE008>      NON-SPACING DOT ABOVE <ISO-IR-103_C7/> (not a real character) 
<":>    <UE009>      NON-SPACING DIAERESIS <ISO-IR-103_C8/> (not a real character) 
<"0>    <UE00A>      NON-SPACING RING ABOVE <ISO-IR-103_CA/> (not a real character) 
<",>    <UE00B>      NON-SPACING CEDILLA <ISO-IR-103_CB/> (not a real character) 
<"_>    <UE00C>      NON-SPACING LOW LINE <ISO-IR-103_CC/> (not a real character) 
<"">    <UE00D>      NON-SPACING DOUBLE ACUTE ACCENT <ISO-IR-103_CD/> (not a real character) 
<";>    <UE00E>      NON-SPACING OGONEK <ISO-IR-103_CE/> (not a real character) 
<"<>    <UE00F>      NON-SPACING CARON <ISO-IR-103_CF/> (not a real character) 
<"=>    <UE010>      NON-SPACING DOUBLE LOW LINE <ISO-IR-38_D9/> (not a real character) 
<"//>   <UE011>      NON-SPACING LONG SOLIDUS OVERLAY <ISO-IR-128_C9/> (not a real character) 
<"p>    <UE012>      GREEK NON-SPACING PSILI PNEUMATA <ISO-IR-55_25/> (not a real character) 
<"d>    <UE013>      GREEK NON-SPACING DASIA PNEUMATA <ISO-IR-55_26/> (not a real character) 
<"i>    <UE014>      GREEK NON-SPACING IOTA BELOW <ISO-IR-55_27/> (not a real character) 
<+_>    <UE015>      IDEOGRAPHIC DITTO MARK <ISO-IR-87_2138/> 
<a+:>   <UE016>      ARABIC LETTER ALEF FINAL FORM COMPATIBILITY <IBM868_90/> 
<Tel>   <UE017>      TEL COMPATIBILITY SIGN <ISO-IR-149_2265/> 
<UA>    <UE018>      Unit space A <ISO-IR-8-1_40/> 
<UB>    <UE019>      Unit space B <ISO-IR-8-1_60/> 
<t3>    <UE01A>      GREEK SMALL LETTER STIGMA <ISO-IR-55_47/> 
<m3>    <UE01B>      GREEK SMALL LETTER DIGAMMA <ISO-IR-55_48/> 
<k3>    <UE01C>      GREEK SMALL LETTER KOPPA <ISO-IR-55_54/> 
<p3>    <UE01D>      GREEK SMALL LETTER SAMPI <ISO-IR-55_5E/> 
<Mc>    <UE01E>      APPLE LOGO (Macintosh_F0) 
<Fl>    <UE01F>      HUNGARIAN FLORINTH (CWI_9F) 
<Ss>    <UE020>      LATIN CAPITAL LIGATURE SS (German) (CORK_FF) 
<Ch>    <UE021>      LATIN SMALL LIGATURE CH (Slovak) (KOI-8_CS2_C7) 
<CH>    <UE022>      LATIN CAPITAL LIGATURE CH (Slovak) (KOI-8_CS2_E7) 
<//c>   <UE024>      JOIN THIS LINE WITH NEXT LINE (Mnemonic) 
<H->    <U0023>      NUMBER SIGN 
<!S>    <U0024>      DOLLAR SIGN 
<@>     <U0040>      COMMERCIAL AT 
<Oa>    <U0040>      COMMERCIAL AT 
<!C>    <U00A2>      CENT SIGN 
<L->    <U00A3>      POUND SIGN 
<Xo>    <U00A4>      CURRENCY SIGN 
<Y->    <U00A5>      YEN SIGN 
<!B>    <U00A6>      BROKEN BAR 
<So>    <U00A7>      SECTION SIGN 
<OC>    <U00A9>      COPYRIGHT SIGN 
<7!>    <U00AC>      NOT SIGN 
<OR>    <U00AE>      REGISTERED SIGN 
<9I>    <U00B6>      PILCROW SIGN 
<_->    <U2500>      BOX DRAWINGS LIGHT HORIZONTAL 
<_=>    <U2501>      BOX DRAWINGS HEAVY HORIZONTAL 
<_!>    <U2502>      BOX DRAWINGS LIGHT VERTICAL 
<_V/>>  <U250C>      BOX DRAWINGS LIGHT DOWN AND RIGHT 
<_V<>   <U2510>      BOX DRAWINGS LIGHT DOWN AND LEFT 
<_A/>>  <U2514>      BOX DRAWINGS LIGHT UP AND RIGHT 
<_A<>   <U2518>      BOX DRAWINGS LIGHT UP AND LEFT 
<_!/>>  <U251C>      BOX DRAWINGS LIGHT VERTICAL AND RIGHT 
<_!<>   <U2524>      BOX DRAWINGS LIGHT VERTICAL AND LEFT 
<_V->   <U252C>      BOX DRAWINGS LIGHT DOWN AND HORIZONTAL 
<_-A>   <U2534>      BOX DRAWINGS LIGHT UP AND HORIZONTAL 
<_!->   <U253C>      BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL 
<_/>//> <U2571>      BOX DRAWINGS LIGHT DIAGONAL UPPER RIGHT TO LOWER LEFT 
<_<\>   <U2572>      BOX DRAWINGS LIGHT DIAGONAL UPPER LEFT TO LOWER RIGHT 
<_./>//>        <U25E2>     BLACK LOWER RIGHT TRIANGLE 
<_.<\>  <U25E3>      BLACK LOWER LEFT TRIANGLE 
<_d!>   <U266A>      EIGHTH NOTE 
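 
Each entry in the table above has the same layout: a symbolic 
name in angle brackets, a <Uxxxx> value, and the ISO/IEC 10646 
character name. The following Python sketch shows one way such 
a line could be read. It assumes that "/" is the escape 
character inside symbolic names (as entries such as <"/>> and 
<"//> suggest) and that each entry sits on a single line; the 
function name parse_entry is hypothetical and not part of this 
standard. 

```python
def parse_entry(line, escape="/"):
    """Split one repertoiremap line into (symbol, code point, name).

    Assumption: `escape` is the escape character, so "/>" stands for a
    literal ">" and "//" for a literal "/" inside the symbolic name.
    """
    if not line.startswith("<"):
        raise ValueError("entry must start with '<'")
    symbol = []
    i = 1
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            symbol.append(line[i + 1])  # escaped character, taken literally
            i += 2
        elif ch == ">":
            break  # an unescaped '>' closes the symbolic name
        else:
            symbol.append(ch)
            i += 1
    else:
        raise ValueError("unterminated symbolic name")
    rest = line[i + 1:].split(None, 1)
    ucs = rest[0]
    if not (ucs.startswith("<U") and ucs.endswith(">")):
        raise ValueError("expected a <Uxxxx> value")
    code_point = int(ucs[2:-1], 16)
    name = rest[1].strip() if len(rest) > 1 else ""
    return "".join(symbol), code_point, name
```

Handling the escape before testing for ">" is what lets a name 
such as <"/>> be read as the two characters '"' and '>' rather 
than ending early. 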
 
 
7   CONFORMANCE 
 
7.1 FDCC-set 
 
A FDCC-set is conforming to this standard if it meets the 
requirements in clause 4. 
 
7.2 FDCC-set category 
 
Conformance can be claimed for an individual category against 
any of clauses 4.2 through 4.12. In that case the requirements 
of clause 4.1 shall also be met, and an LC_VERSIONS category as 
described in clause 4.13 shall be specified. 
 
7.3 Charmap 
 
A charmap is conforming to this standard if it meets the 
requirements in clause 5. 
 
7.4 Repertoiremap 
 
A repertoiremap is conforming to this standard if it meets the 
requirements in clause 6. 
 Annex A 
(informative) 
 
Differences from the ISO/IEC 9945-2 standard 
 
This standard is based on the locale and charmap 
specifications in the ISO/IEC 9945-2 standard, and it is 
intended to be backwards compatible, so that what is conformant 
to that standard is also conformant to this standard. 
 
A number of enhancements have been made and a number of 
restrictions have been lifted in comparison to the POSIX 
standard: 
 
A.1   Restrictions removed 
 
1. Dependence on the specific meaning of the character NUL as 
the terminator of a string (from the C standard) has been 
removed, to cater for programming languages other than C. 
 
A.2   Enhancements 
 
1. A description of a "repertoiremap" definition was added to 
facilitate descriptions of FDCC-sets without charmaps, and 
also to provide binding from a FDCC-set using one set of 
character names to charmaps using another naming set. 
 
2. The specific POSIX locale has been replaced with the "i18n" 
FDCC-set, defined on the repertoire of ISO/IEC 10646. 
 
3. Transliteration support has been added in the LC_CTYPE 
category. 
 
4. Terminology has been aligned with ISO/IEC TR 11017; in 
particular, the POSIX term "locale" has been changed to 
"FDCC-set". 
 
5. A date escape format "%F" has been added for ISO 8601 
dates, and another date escape format "%f" has been added for 
weekday number with Monday being the first day of the week. 
 
6. The following keywords have been added to LC_MONETARY to 
accommodate differences between local and international formats: 
 int_p_cs_precedes 
 int_p_sep_by_space 
 int_n_cs_precedes 
 int_n_sep_by_space 
 
7. Script symbols have been added via the "script" keyword in 
the LC_COLLATE category. 
 
8. The "order_start" keyword now takes an optional 
script-symbol identifier. 
 
9. The keywords "reorder-scripts-after" and 
"reorder-scripts-end" have been introduced to reorder scripts. 
 
10. Symbolic ellipses (both decimal and hexadecimal) have been 
introduced as a general notation. 
 
11. The "print" CTYPE class automatically includes all "graph" 
characters. 
 
12. The <Uxxxx> and <Uxxxxxxxx> notations have been introduced 
as predefined symbolic character names, together with a number 
of symbolic character names derived from POSIX. 
 
13. The toggling commands define, undef, ifdef, ifndef, elif, 
else, and endif have been introduced for the FDCC-set category 
LC_COLLATE, in the style of the C preprocessor. 
 
14. The new categories LC_VERSIONS, LC_PAPER, LC_NAME, 
LC_ADDRESS, LC_TELEPHONE, and LC_MEASUREMENT have been 
introduced. 
 
15. The LC_CTYPE category now supports bidirectionality via 
the new keywords class and map, which correspond to the C 
standard library functions iswctype() and towctrans() 
respectively. 
 
16. The digits keyword now supports digits for multiple 
scripts. 
 
17. The LC_MONETARY category provides support for dual 
currencies, such as the Euro in some European countries. 
 
18. The LC_TIME category has a number of enhancements to cater 
for alternate calendars, and timezone information may be given. 
 
19. The charmap specification has been enhanced to support ISO 
2022. 
 Annex B 
(informative) 
 
Rationale 
 
 
B.1   FDCC-set Rationale 
 
The description of FDCC-sets is based on work performed in the 
UniForum Technical Committee Subcommittee on 
Internationalisation and on POSIX. Wherever appropriate, 
keywords were taken from the C Standard or the POSIX-2 
standard. The C and POSIX term "locale" has been changed into 
the term "FDCC-set" from ISO/IEC TR 11017 to align with that 
specification. 
 
The POSIX utility "localedef" compiles locale sources into 
object files. The "object" definitions need not be portable, 
as long as "source" definitions are.  Strictly speaking, 
"source" definitions are portable only between applications 
using the same character set(s). Such "source" definitions 
can, if they use symbolic names only, easily be ported between 
systems using different code sets as long as the characters in 
the portable character set (ISO 646) have common values 
between the code sets; this is frequently the case in 
historical applications. Of course, this requires that the 
symbolic names used for characters outside the portable 
character set are identical between character sets. 
 
To avoid confusion between an octal constant and a 
backreference, the octal, hexadecimal, and decimal constants 
must contain at least two digits. As single-digit constants 
are relatively rare, this should not impose any significant 
hardship. Each of the constants includes "two or more" digits 
to account for systems in which the byte size is larger than 
eight bits. For example, an ISO/IEC 10646 system that has 
defined 16-bit bytes may require six octal, four hexadecimal, 
and five decimal digits, for some coded characters. 
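 
The digit counts in this example follow directly from the 
largest value a byte of the given width can hold. A small 
Python sketch of the arithmetic is given here for illustration 
only; the function name digits_needed is hypothetical. 

```python
def digits_needed(bits, base):
    """Count the base-`base` digits of the largest `bits`-bit value."""
    largest = (1 << bits) - 1  # e.g. 65535 for a 16-bit byte
    count = 0
    while largest > 0:
        largest //= base
        count += 1
    return max(count, 1)


# Digits for 16-bit bytes in octal, hexadecimal and decimal notation.
sixteen_bit = [digits_needed(16, base) for base in (8, 16, 10)]
```

For 16-bit bytes the list holds 6, 4 and 5, matching the six 
octal, four hexadecimal and five decimal digits mentioned 
above. 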
 
As an international (ISO/IEC) standard this standard should 
follow the ISO/IEC guidelines, including the ISO/IEC TR 10176. 
This TR has a rule that characters outside the invariant part 
of ISO/IEC 646 should not be used in portable specifications. 
The backslash and the number-sign character are not in the 
invariant part. As far as general usage of these symbols is 
concerned, they are covered by the "grandfather clause", but for 
newly defined interfaces, ISO has requested that specifications 
provide alternate representations, and this standard then follows 
POSIX for backward compatibility. Consequently, while the 
default escape character remains the backslash, and the 
default comment character is the number-sign, applications are 
required to recognize alternative representations, identified 
in the applicable source text via the "escape_char" and 
"comment_char" keywords. 
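 
As an illustration of how a reader of source text might honour 
these keywords, the following Python sketch scans the lines for 
"escape_char" and "comment_char" declarations and then drops 
comment lines. It is a simplification under stated assumptions: 
the declarations are applied to the whole text regardless of 
where they appear, a comment is any line with the comment 
character in the first column, and the function names are 
hypothetical. 

```python
def read_directives(lines):
    """Return (escape_char, comment_char) declared in the source text.

    The defaults are the backslash and the number sign; an
    "escape_char" or "comment_char" line replaces the respective default.
    """
    escape, comment = "\\", "#"
    for line in lines:
        fields = line.split()
        if len(fields) == 2 and fields[0] == "escape_char":
            escape = fields[1]
        elif len(fields) == 2 and fields[0] == "comment_char":
            comment = fields[1]
    return escape, comment


def drop_comment_lines(lines):
    """Remove lines whose first column holds the active comment character."""
    _, comment = read_directives(lines)
    return [line for line in lines if not line.startswith(comment)]
```

With "comment_char %" declared, a line starting with "%" is 
removed, while a line starting with the number sign is kept as 
ordinary text. 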
 
 
B.1.1   LC_CTYPE Rationale 
 
The LC_CTYPE category is primarily used to define the 
encoding-independent aspects of a character set, such as 
character classification. In addition, certain 
encoding-dependent characteristics are also defined for an 
application via the LC_CTYPE category. This standard does not 
mandate that the encoding used in the FDCC-set is the same as 
the one used by the application, because an application may 
decide that it is advantageous to define FDCC-sets in a 
system-wide encoding rather than having multiple, logically 
identical FDCC-sets in different encodings, and to convert from 
the application encoding to the system-wide encoding on usage. 
Other 
applications could require encoding-dependent FDCC-sets. In 
either case, the LC_CTYPE attributes that are directly 
dependent on the encoding, such as mb_cur_max and the display 
width of characters, are not user-specifiable in a locale 
source, and are consequently not defined as keywords. 
 
As the LC_CTYPE character classes are based on the C Standard 
character-class definition, the category does not support 
multicharacter elements. For instance, the German character 
<sharp-s> is traditionally classified as a lowercase letter. 
There is no corresponding uppercase letter; in proper 
capitalization of German text the <sharp-s> will be replaced 
by SS; i.e., by two characters. This kind of conversion is 
outside the scope of the toupper and tolower keywords. Where 
this standard specifies that only certain characters can be 
specified, as for the keywords digit and xdigit, the specified 
characters must be from the portable character set, as shown. 
As an example, only the Arabic digits 0 through 9 are 
acceptable as digits. 
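 
The <sharp-s> case can be illustrated with a short Python 
sketch. The table TOUPPER and the function map_upper are 
hypothetical stand-ins for a strictly one-to-one mapping of the 
kind the toupper keyword expresses; Python's own str.upper() is 
used only as a contrast, since it performs the multicharacter 
conversion that falls outside that scope. 

```python
import string

# A one-to-one case map: every key maps to exactly one character.
# (Hypothetical excerpt covering only the basic Latin letters.)
TOUPPER = {letter: letter.upper() for letter in string.ascii_lowercase}


def map_upper(text, table=TOUPPER):
    """Apply a strictly character-to-character uppercase mapping."""
    return "".join(table.get(ch, ch) for ch in text)


sharp_s = "\u00df"              # LATIN SMALL LETTER SHARP S
word = "stra" + sharp_s + "e"

one_to_one = map_upper(word)    # sharp s has no single uppercase partner
full_mapping = word.upper()     # Python replaces sharp s by "SS"
```

The one-to-one result keeps the string length and leaves the 
sharp s untouched, whereas the full mapping yields a string one 
character longer. 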
 
The character classes digit, xdigit, lower, upper, and space 
have a set of automatically included characters. These only 
need to be specified if the character values (i.e., encoding) 
differ from the application default values. The definition of 
character class digit allows that alternate digits (e.g., 
Hindi or Ideographic) can be specified here. The definition of 
character class xdigit requires that the characters included 
in character class digit are included here also, and allows 
for different symbols for the hexadecimal digits 10 through 
15. 
 
B.1.2   LC_COLLATE Rationale 
 
The LC_COLLATE category governs the collation order in the 
FDCC-set, and may thus be useful for the processing of the 
APIs in the ISO/IEC 14651 string ordering and comparison 
standard, the C Standard strxfrm() and strcoll() functions, as 
well as a number of POSIX-2 utilities. 
 
The rules governing collation depend to some extent on the 
intended use. At least five different levels of increasingly 
complex collation rules can be distinguished: 
 
(1)     Byte/machine code order. This is the historical 
 collation order in the UNIX system and many proprietary 
 operating systems. Collation is here done character by 
 character, without any regard to context. The primary 
 virtue is that it usually is quite fast, and also 
 completely deterministic; it works well when the native 
 machine collation sequence matches the user 
 expectations. 
(2)     Character order. On this level, collation is also done 
 character by character, without regard to context. The 
 order between characters is, however, not determined by 
 the code values, but on the user's expectations of the 
 correct order between characters. In addition, such a 
 (simple) collation order can specify that certain 
 characters collate equal (e.g., upper and lowercase 
 letters). 
(3)     String ordering. On this level, entire strings are 
 compared based on relatively straightforward rules. At 
 this level, several "passes" may be required to 
 determine the order between two strings. Characters may 
 be ignored in some passes, but not in others; the 
 strings may be compared in different directions; and 
 simple string substitutions may be made before strings 
 are compared. This level is best described as 
 "dictionary" ordering; it is based on the spelling, not 
 the pronunciation, or meaning, of the words. 
(4)     Text search ordering. This is a further refinement of 
 the previous level, best described as "telephone book 
 ordering"; some common homonyms (words spelled 
 differently but with the same pronunciation) are 
 collated together; numbers are collated as if spelled 
 with words, and so on. 
(5)     Semantic level ordering. Words and strings are collated 
 based on their meaning; entire words (such as "the") 
 are eliminated, and the ordering is not deterministic. 
 This may require special software, and is highly 
 dependent on the intended use. 
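 
The difference between the first two levels can be shown with 
a short Python sketch. The table RANK is a hypothetical 
character order for the basic Latin letters in which the 
uppercase and lowercase forms of a letter collate equal; it is 
not taken from any FDCC-set in this standard. 

```python
import string

# Level 2 table: each letter gets a rank from the user's alphabet, and
# the uppercase and lowercase forms share that rank ("collate equal").
RANK = {}
for position, letter in enumerate(string.ascii_lowercase):
    RANK[letter] = position
    RANK[letter.upper()] = position


def level1_key(word):
    """Byte/machine code order: compare raw code values."""
    return [ord(ch) for ch in word]


def level2_key(word):
    """Character order: compare table ranks, character by character."""
    return [RANK[ch] for ch in word]


words = ["Zebra", "apple", "Banana"]
by_code = sorted(words, key=level1_key)
by_character = sorted(words, key=level2_key)
```

Sorting by code value places "Banana" and "Zebra" before 
"apple" because uppercase letters have lower code values in 
ISO/IEC 646, while the level 2 table gives the order a user 
would expect: "apple", "Banana", "Zebra". 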
 
While the historical collation order formally is at level 1, 
for the English language it corresponds roughly to elements at 
level 2. The user expects to see the output from the "ls" 
utility sorted very much as it would be in a dictionary. While 
telephone book ordering would be an optimal goal for standard 
collation, this was ruled out as the order would be language 
dependent. Furthermore, a requirement was that the order must 
be determined solely from the text string and the collation 
rules; no external information (e.g., "pronunciation 
dictionaries") could be required. 
 
As a result, the goal for the collation support is at level 3. 
This also matches the requirements for the Canadian collation 
order standard, as well as other, known collation requirements 
for alphabetic scripts. It specifically rules out collation 
based on pronunciation rules, or based on semantic analysis of 
the text. The syntax for the LC_COLLATE category source is the 
result of a cooperative effort between representatives for 
many countries and organizations working with international 
issues, such as UniForum, X/Open, and ISO, and it meets the 
requirements for level 3, and has been verified to produce the 
correct result with examples based on Canadian and Danish 
collation order. 
 
The directives that can be specified in an operand to the 
order_start keyword are based on the requirements specified in 
several proposed standards and in customary use. The following 
is a rephrasing of rules defined for "lexical ordering in 
English and French" by the Canadian Standards Association 
(text in brackets is rephrased): 
 
(1)     Once special characters (punctuation) have been removed 
 from original strings, the ordering is determined by 
 scanning forward (left to right) [disregarding case and 
 diacriticals]. 
(2)     In case of equivalence, special characters are once 
 again removed from original strings and the ordering is 
 determined scanning backward (starting from the 
 rightmost character of the string and back), character 
 by character, (disregarding case but considering 
 diacriticals). 
(3)     In case of repeated equivalence, special characters are 
 removed again from original strings and the ordering is 
 determined scanning forward, character by character, 
 (considering both case and diacriticals). 
(4)     If there is still an ordering equivalence after rules 
 (1) through (3) have been applied, then only special 
 characters and the position they occupy in the string 
 are considered to determine ordering. The string that 
 has a special character in the lowest position comes 
 first. If two strings have a special character in the 
 same position, the character [with the lowest collation 
 value] comes first. In case of equality, the other 
 special characters are considered until there is a 
 difference or all special characters have been 
 exhausted. 
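 
The four rules above can be sketched as a multi-part sort key 
in Python. This is an illustration under stated assumptions, 
not the Canadian standard's algorithm: special characters are 
taken to be all non-alphanumeric characters, diacriticals are 
ranked by the code value of their combining mark, lowercase is 
placed before uppercase, and a string without special 
characters is placed before one that has them. The function 
name canadian_key is hypothetical. 

```python
import unicodedata


def canadian_key(text):
    """Build a four-part sort key following rules (1) to (4) above."""
    text = unicodedata.normalize("NFC", text)
    bases, accents, cases, specials = [], [], [], []
    for position, ch in enumerate(text):
        if not ch.isalnum():
            # Special character: set aside for rule (4).
            specials.append((position, ord(ch)))
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        bases.append(decomposed[0].lower())
        accents.append(tuple(ord(mark) for mark in decomposed[1:]))
        cases.append(1 if ch.isupper() else 0)
    return (
        bases,          # rule (1): forward, ignoring case and diacriticals
        accents[::-1],  # rule (2): backward, diacriticals only
        cases,          # rule (3): forward; only case can still differ
        specials,       # rule (4): positions and values of specials
    )


words = ["c\u00f4t\u00e9", "cot\u00e9", "c\u00f4te", "cote"]
ordered = sorted(words, key=canadian_key)
```

Because the diacriticals are scanned backward, the accent on 
the final letter is decisive, and the four French words sort 
as cote, c\u00f4te, cot\u00e9, c\u00f4t\u00e9 (written here 
with Python escapes for o-circumflex and e-acute). 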
 
It is estimated that the standard covers the requirements for 
all European languages, and no particular problems are 
anticipated for Cyrillic or Middle Eastern scripts. 
 
The Far East (particularly Japanese/Chinese) collations are 
often based on contextual information and pronunciation rules 
(the same ideograph can have different meanings and different 
pronunciations). Such collation, in general, falls outside the 
desired goal of the standard. There are, however, several 
other collation rules (stroke/radical, or "most common 
pronunciation") which can be supported with the mechanism 
described here. 
 
Previous drafts contained a substitute statement, which 
performed a regular expression style replacement before string 
compares. It has been withdrawn based on balloter objections 
that it was not required for the types of ordering this 
standard is aimed at. 
 
The character (and collating element) order is defined by the 
order in which characters and elements are specified between 
the order_start and order_end keywords. This character order 
is used in range expressions in regular expressions. Weights 
assigned to the characters and elements define the collation 
sequence; in the absence of weights, the character order is 
also the collation sequence. 
 
The position keyword was introduced to provide the capability 
to consider, in a compare, the relative position of non- 
IGNOREd characters. As an example, consider the two strings 
"o-ring" and "or-ing". Assuming the hyphen is IGNOREd on the 
first pass, the two strings will compare equal, and the 
position of the hyphen is immaterial. On second pass, all 
characters except the hyphen are IGNOREd, and in the normal 
case the two strings would again compare equal. By taking 
position into account, the first collates before the second. 
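
The "o-ring" example can be sketched in Python as a two-pass key. This is a simplified illustration, not the mechanism defined by this standard: the function name two_pass_key and the IGNORED set are hypothetical, and plain code point order stands in for real collation weights.

```python
IGNORED = {"-"}  # characters IGNOREd on the first pass (assumption)

def two_pass_key(s):
    # First pass: compare only the characters that are not IGNOREd.
    first = [c for c in s if c not in IGNORED]
    # Second pass: compare only the IGNOREd characters, each one
    # tagged with the position it occupies in the string.
    second = [(i, c) for i, c in enumerate(s) if c in IGNORED]
    return (first, second)

a = two_pass_key("o-ring")
b = two_pass_key("or-ing")
print(a[0] == b[0])   # the first pass compares equal
print(a < b)          # position decides on the second pass
```

Without the position tag the second pass would see a single hyphen in each string and the two strings would again compare equal.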
 
B.1.2.1   "reorder-after" rationale 
 
Much work has been done on FDCC-sets, making them quite 
general. The POSIX-2 standard introduced a "copy" command for 
all categories of the POSIX locale. This is useful for many 
purposes and it ensures that two FDCC-sets are equivalent for 
this category. A further step in building on previous FDCC-set 
work is defined in this standard. 
 
Collating sequences often vary a bit from country to country, 
and from language to language, but generally much of the 
collating sequence is the same. For example the Danish 
sequence is for the most part the same as the German or 
English collation, but for about a dozen letters it differs. 
The same can be said for Swedish or Hungarian: generally the 
Latin collating sequence is the same, but a few characters are 
different. 
 
This standard defines a FDCC-set defined on the character 
repertoire of the ISO/IEC 10646 standard, in a character set 
independent way. The intention is that some of the information 
from this FDCC-set will be acceptable in many cultures, and 
that it can serve as the basis for modifications in other 
cultures, to obtain a culturally acceptable specification. 
Using the "reorder-after" construct will also help improve the 
overview of what the changes really are for implementers and 
other users. 
 
An example of the use of the "reorder-after" construct is the 
following. A default international ordering for the Latin 
alphabet may be adequate for Danish, with the exception of the 
collation rules for about a dozen letters, such as <AE>, <O//>, 
<AA>, <A:>, <O:> and <U:> in their capital and small forms. 
By applying the "reorder-after" construct, the Danish 
specification can be made more easily by copying and 
reordering the existing international specification, rather 
than specifying collation parameters for all Latin letters 
(with or without diacritics). There is no obligation for 
Denmark to take this approach, but the "reorder-after" 
construct provides the mechanism for doing so if it is deemed 
desirable. 
 
B.1.2.2   awk script for "reorder-after" construct 
 
A script has been written in the "awk" language defined in the 
POSIX standard ISO/IEC 9945-2 to implement the "reorder-after" 
construct: 
 
BEGIN { comment = "%"; back[0] = follow[0] = 0 }
/LC_COLLATE/ { coll = 1 }
/END LC_COLLATE/ {
    coll = 0
    # Emit the collation lines by walking the linked list.
    for (lnr = 1; lnr; lnr = follow[lnr]) print cont[lnr]
}

{
    if (coll == 0) print $0;
    else {
        if ($1 == "copy") {
            # Load the LC_COLLATE body of the copied FDCC-set.
            file = $2
            # Stop at end of file or on a read error.
            while ((getline < file) > 0)
                if ($1 == "LC_COLLATE") copy_lc = 1
                else if ($1 == "END" && $2 == "LC_COLLATE") copy_lc = 0
                else if (copy_lc) {
                    lnr++
                    follow[lnr-1] = lnr; back[lnr] = lnr-1
                    cont[lnr] = $0; symb[$1] = lnr
                }
            close(file)
        }
        else if ($1 == "reorder-after") { ra = 1; after = symb[$2] }
        else if ($1 == "reorder-end") ra = 0
        else {
            # Link the new line into the list after the current line.
            lnr++
            if (ra) follow[lnr] = follow[after]
            if (ra) back[follow[after]] = lnr
            follow[after] = lnr; back[lnr] = after
            cont[lnr] = $0
            if (ra && $1 != comment && $1 != "") {
                # Unlink the symbol's old line so it appears only once.
                old = symb[$1]
                follow[back[old]] = follow[old]
                back[follow[old]] = back[old]
                symb[$1] = lnr
            }
            after = lnr
        }
    }
}
 
B.1.2.3   Sample FDCC-set specification for Danish 
 
escape_char / 
comment_char % 
repertoiremap "i18nrep" 
charset "ISO_8859-1:1987" 
% Distribution and use is free, also 
% for commercial purposes. 
 
LC_VERSIONS 
title         "Danish language FDCC-set for Denmark" 
source        "Danish Standards Association" 
address       "Kollegievej 6, DK-2920 Charlottenlund, Danmark" 
contact       "Keld Simonsen" 
email         "Keld.Simonsen@dkuug.dk" 
tel           "+45 - 3996-6101" 
fax           "+45 - 3996-6202" 
language      "da" 
territory     "DK" 
revision      "4.2" 
date          "1997-12-22" 
 
category      i18n:1998;LC_VERSIONS 
category      i18n:1998;LC_CTYPE 
category      i18n:1998;LC_COLLATE 
category      i18n:1998;LC_TIME 
category      posix:1993;LC_NUMERIC 
category      i18n:1998;LC_MONETARY 
category      posix:1993;LC_MESSAGES 
category      i18n:1998;LC_PAPER 
category      i18n:1998;LC_NAME 
category      i18n:1998;LC_ADDRESS 
category      i18n:1998;LC_TELEPHONE 
category      i18n:1998;LC_MEASUREMENT 
 
END LC_VERSIONS 
 
LC_CTYPE 
copy "i18n" 
END LC_CTYPE 
 
LC_COLLATE 
% The ordering algorithm is in accordance 
% with Danish Standard DS 377 (1980) 
% and the Danish Orthography Dictionary 
% (Retskrivningsordbogen, 2. udgave, 1996). 
% It is also in accordance with 
% Greenlandic orthography. 
 
collating-element <A-A> from "<A><A>" 
collating-element <A-a> from "<A><a>" 
collating-element <a-A> from "<a><A>" 
collating-element <a-a> from "<a><a>" 
copy i18n 
reorder-after <CAPITAL> 
<CAPITAL> 
<CAPITAL-SMALL> 
<SMALL-CAPITAL> 
<SMALL> 
reorder-after <q8> 
<kk>    <Q>;<SPECIAL>;<SMALL>;IGNORE 
reorder-after <t8> 
<TH>    "<T><H>";"<TH><TH>";"<CAPITAL><CAPITAL>";IGNORE 
<th>    "<T><H>";"<TH><TH>";"<SMALL><SMALL>";IGNORE 
reorder-after <y8> 
% <U:> and <U"> are treated as <Y> in Danish 
<U:>    <Y>;<U:>;<CAPITAL>;IGNORE 
<u:>    <Y>;<U:>;<SMALL>;IGNORE 
<U">    <Y>;<U">;<CAPITAL>;IGNORE 
<u">    <Y>;<U">;<SMALL>;IGNORE 
reorder-after <z8> 
% <AE> is a separate letter in Danish 
<AE>    <AE>;<NONE>;<CAPITAL>;IGNORE 
<ae>    <AE>;<NONE>;<SMALL>;IGNORE 
<AE'>   <AE>;<ACUTE>;<CAPITAL>;IGNORE 
<ae'>   <AE>;<ACUTE>;<SMALL>;IGNORE 
<A3>    <AE>;<MACRON>;<CAPITAL>;IGNORE 
<a3>    <AE>;<MACRON>;<SMALL>;IGNORE 
<A:>    <AE>;<SPECIAL>;<CAPITAL>;IGNORE 
<a:>    <AE>;<SPECIAL>;<SMALL>;IGNORE 
% <O//> is a separate letter in Danish 
<O//>   <O//>;<NONE>;<CAPITAL>;IGNORE 
<o//>   <O//>;<NONE>;<SMALL>;IGNORE 
<O//'>  <O//>;<ACUTE>;<CAPITAL>;IGNORE 
<o//'>  <O//>;<ACUTE>;<SMALL>;IGNORE 
<O:>    <O//>;<DIAERESIS>;<CAPITAL>;IGNORE 
<o:>    <O//>;<DIAERESIS>;<SMALL>;IGNORE 
<O">    <O//>;<DOUBLE-ACUTE>;<CAPITAL>;IGNORE 
<o">    <O//>;<DOUBLE-ACUTE>;<SMALL>;IGNORE 
% <AA> is a separate letter in Danish 
<AA>    <AA>;<NONE>;<CAPITAL>;IGNORE 
<aa>    <AA>;<NONE>;<SMALL>;IGNORE 
<A-A>   <AA>;<A-A>;<CAPITAL>;IGNORE 
<A-a>   <AA>;<A-A>;<CAPITAL-SMALL>;IGNORE 
<a-A>   <AA>;<A-A>;<SMALL-CAPITAL>;IGNORE 
<a-a>   <AA>;<A-A>;<SMALL>;IGNORE 
<AA'>   <AA>;<AA'>;<CAPITAL>;IGNORE 
<aa'>   <AA>;<AA'>;<SMALL>;IGNORE 
reorder-end 
END LC_COLLATE 
 
LC_MONETARY 
int_curr_symbol         "<D><K><K><SP>" 
currency_symbol         "<k><r>" 
mon_decimal_point       "<,>" 
mon_thousands_sep       "<.>" 
mon_grouping            3;3 
positive_sign           "" 
negative_sign           "<->" 
int_frac_digits         2 
frac_digits             2 
p_cs_precedes           1 
p_sep_by_space          2 
n_cs_precedes           1 
n_sep_by_space          2 
p_sign_posn             4 
n_sign_posn             4 
END LC_MONETARY 
 
LC_NUMERIC 
decimal_point           "<,>" 
thousands_sep           "<.>" 
grouping                3;3 
END LC_NUMERIC 
 
LC_TIME 
abday       "<m><a><n>";/ 
 "<t><i><r>";"<o><n><s>";/ 
 "<t><o><r>";"<f><r><e>";/ 
 "<l><o//><r>";"<s><o//><n>" 
day         "<m><a><n><d><a><g>";/ 
 "<t><i><r><s><d><a><g>";/ 
 "<o><n><s><d><a><g>";/ 
 "<t><o><r><s><d><a><g>";/ 
 "<f><r><e><d><a><g>";/ 
 "<l><o//><r><d><a><g>";/ 
 "<s><o//><n><d><a><g>" 
week        7;19971201;4 
abmon       "<j><a><n>";"<f><e><b>";/ 
 "<m><a><r>";"<a><p><r>";/ 
 "<m><a><j>";"<j><u><n>";/ 
 "<j><u><l>";"<a><u><g>";/ 
 "<s><e><p>";"<o><k><t>";/ 
 "<n><o><v>";"<d><e><c>" 
mon         "<j><a><n><u><a><r>";/ 
 "<f><e><b><r><u><a><r>";/ 
 "<m><a><r><t><s>";/ 
 "<a><p><r><i><l>";/ 
 "<m><a><j>";/ 
 "<j><u><n><i>";/ 
 "<j><u><l><i>";/ 
 "<a><u><g><u><s><t>";/ 
 "<s><e><p><t><e><m><b><e><r>";/ 
 "<o><k><t><o><b><e><r>";/ 
 "<n><o><v><e><m><b><e><r>";/ 
 "<d><e><c><e><m><b><e><r>" 
d_t_fmt     "<%><a><SP><%><F><SP><%><T><SP><%><Z>" 
d_fmt       "<%><O><d><.><SP><%><B><SP><%><Y>" 
alt_digits  "<0><.>;<1><.>;<2><.>;<3><.>;<4><.>;/ 
 <5><.>;<6><.>;<7><.>;<8><.>;<9><.>;/ 
 <1><0><.>;<1><1><.>;<1><2><.>;<1><3><.>;<1><4><.>;/ 
 <1><5><.>;<1><6><.>;<1><7><.>;<1><8><.>;<1><9><.>;/ 
 <2><0><.>;<2><1><.>;<2><2><.>;<2><3><.>;<2><4><.>;/ 
 <2><5><.>;<2><6><.>;<2><7><.>;<2><8><.>;<2><9><.>;/ 
 <3><0><.>;<3><1><.>" 
t_fmt       "<%><T>" 
am_pm       "";"" 
t_fmt_ampm  "" 
timezone    "<C><E><T><-><1><C><E><T><SP><D><S><T><,><M><3><.><5><.><0>/ 
 <,><M><1><0><.><5><.><0>" 
END LC_TIME 
 
LC_MESSAGES 
yesexpr     "<<(><1><J><j><Y><y><)/>><.><*>" 
noexpr      "<<(><0><N><n><)/>><.><*>" 
END LC_MESSAGES 
 
LC_PAPER 
copy "i18n" 
END LC_PAPER 
 
LC_NAME 
name_fmt    "<%><p><%><t><%><g><%><t><%><m><%><t><%><f>" 
name_gen    "" 
name_mr     "<h><r>" 
name_mrs    "<f><r><u>" 
name_miss   "<f><r><o//><k><e><n>" 
name_ms     "<f><r>" 
END LC_NAME 
 
LC_ADDRESS 
country_name       "<D><a><n><m><a><r><k>" 
country_post       "<D><K>" 
country_ab2        "<D><K>" 
country_ab3        "<D><N><K>" 
country_num        208 
country_car        "<D><K>" 
country_isbn       "<8><7>" 
lang_ab            "<d><a>" 
lang_term          "<d><a><n>" 
postal_fmt   "<%><a><%><N><%><f><%><N><%><d><%><N><%><b><%><N><%>/ 
 <%><s><SP><%><h><SP><%><e><SP><%><r><%><N>/ 
 <%><C><-><%><z><SP><%><T><%><N><%><c><%><N>" 
END LC_ADDRESS 
 
LC_TELEPHONE 
tel_int_fmt    "<+><%><c><SP><%><a><SP><%><l>" 
tel_dom_fmt    "<%><l>" 
int_select     "<0><0>" 
int_prefix     "<4><5>" 
END LC_TELEPHONE 
 
LC_MEASUREMENT 
copy "i18n" 
END LC_MEASUREMENT 
 
B.1.3   LC_MONETARY Rationale. 
 
The currency symbol does not appear in LC_MONETARY because it 
is not defined in the C Standard's C locale.  The C Standard 
limits the size of decimal points and thousands delimiters to 
single-byte values. In FDCC-sets based on multibyte coded 
character sets this cannot be enforced, obviously; this 
standard does not prohibit such characters, but makes the 
behaviour unspecified (in the text "In contexts where other 
standards . . . "). 
 
The grouping specification is based on, but not identical to, 
the C Standard. The "-1" signals that no further grouping 
shall be performed (the equivalent of CHAR_MAX in the C 
Standard). 
 
The FDCC-set definition is an extension of the C Standard 
localeconv() specification. In particular, rules on how 
currency_symbol is treated are extended to also cover 
int_curr_symbol, and p_sep_by_space and n_sep_by_space have 
been augmented with the value 2, which places a space between 
the sign and the symbol (if they are adjacent; otherwise it 
should be treated as a 0). The following table shows the 
result of various combinations: 
 
                                         p_sep_by_space 
                                       2          1          0 
 
p_cs_precedes = 1   p_sign_posn = 0   ($ 1.25)   ($ 1.25)   ($1.25) 
                    p_sign_posn = 1   + $1.25    +$ 1.25    +$1.25 
                    p_sign_posn = 2   $1.25 +    $ 1.25+    $1.25+ 
                    p_sign_posn = 3   + $1.25    +$ 1.25    +$1.25 
                    p_sign_posn = 4   $ +1.25    $+ 1.25    $+1.25 
 
p_cs_precedes = 0   p_sign_posn = 0   (1.25 $)   (1.25 $)   (1.25$) 
                    p_sign_posn = 1   +1.25 $    +1.25 $    +1.25$ 
                    p_sign_posn = 2   1.25$ +    1.25 $+    1.25$+ 
                    p_sign_posn = 3   1.25+ $    1.25 +$    1.25+$ 
                    p_sign_posn = 4   1.25$ +    1.25 $+    1.25$+ 
 
 
 
The following is an example of the interpretation of the 
mon_grouping keyword. Assuming that the value to be formatted 
is 123456789 and the mon_thousands_sep is "'", then the 
following table shows the result. The third column shows the 
equivalent C Standard string that would be used to 
accommodate this grouping. It is the responsibility of the 
utility to perform mappings of the formats in this clause to 
those used by language bindings such as the C Standard. 
 
 
 mon_grouping  Formatted Value  C String 
 3;-1          123456'789       "\3\177" 
 3             123'456'789      "\3" 
 3;2;-1        1234'56'789      "\3\2\177" 
 3;2           12'34'56'789     "\3\2" 
 -1            123456789        "\177" 
 
In these examples, the octal value of (CHAR_MAX) is 177. 
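
The interpretation of mon_grouping shown in the table can be sketched in Python as follows. This is an illustrative sketch only: the function names apply_grouping and to_c_string are hypothetical, and CHAR_MAX is assumed to be 127 (octal 177), as in the examples above.

```python
def apply_grouping(digits, grouping, sep):
    """Insert sep into a digit string according to a mon_grouping list.

    Group sizes are applied from the right. The last size is repeated
    for the remaining digits unless the list ends with -1, which stops
    all further grouping.
    """
    groups = []
    rest = digits
    size = None
    index = 0
    while rest:
        if index < len(grouping):
            size = grouping[index]
            index += 1
        # -1 (or a missing size) means: no further grouping.
        if size is None or size <= 0 or len(rest) <= size:
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
    groups.append(rest)
    return sep.join(reversed(groups))

def to_c_string(grouping, char_max=127):
    # Map -1 to CHAR_MAX and write every value as an octal escape.
    return "".join("\\%o" % (char_max if g == -1 else g) for g in grouping)

for g in ([3, -1], [3], [3, 2, -1], [3, 2], [-1]):
    print(g, apply_grouping("123456789", g, "'"), to_c_string(g))
```

Keeping the mapping to the C string in a separate function reflects the statement above that this mapping is the responsibility of the utility rather than of the FDCC-set.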
 
The dual currency support is specified such that a FDCC-set 
can be used without change during the transition period in a 
static environment. For example, where the Euro is being 
introduced in a number of European countries, there is no need 
to change the FDCC-set when shifting from one currency to two 
concurrent currencies, nor when changing to the Euro as the 
only currency. The same application call can also be made 
valid for countries with a single currency and countries with 
dual currencies. The specifications can likewise be used 
without changing the FDCC-set on an installation when 
converting from one national currency to another, for example 
when removing some zeroes to form a new currency. 
 
The following example illustrates the support for dual 
currencies; the example is for the Euro in Germany. 
 
LC_MONETARY 
int_curr_symbol         "<D><E><M><SP>" 
currency_symbol         "<D><M>" 
mon_decimal_point       "<,>" 
mon_thousands_sep       "<.>" 
mon_grouping            3;3 
positive_sign           "" 
negative_sign           "<->" 
int_frac_digits         2 
frac_digits             2 
p_cs_precedes           1 
p_sep_by_space          2 
n_cs_precedes           1 
n_sep_by_space          2 
p_sign_posn             4 
n_sign_posn             4 
duo_int_curr_symbol         "<E><U><R><SP>" 
duo_currency_symbol         "<E><U><R>" 
duo_mon_decimal_point       "<,>" 
duo_mon_thousands_sep       "<.>" 
duo_mon_grouping            3;3 
duo_positive_sign           "" 
duo_negative_sign           "<->" 
duo_int_frac_digits         2 
duo_frac_digits             2 
duo_p_cs_precedes           1 
duo_p_sep_by_space          2 
duo_n_cs_precedes           1 
duo_n_sep_by_space          2 
duo_p_sign_posn             4 
duo_n_sign_posn             4 
uno_valid_to                20020630 
duo_valid_from              19990101 
conversion_rate             195;100 
END LC_MONETARY 
 
B.1.4   LC_NUMERIC Rationale. 
 
See the rationale for LC_MONETARY (B.1.3) for a description of 
the behaviour of grouping. 
 
B.1.5   LC_TIME Rationale. 
 
The LC_TIME descriptions of abday, day, and abmon imply a 
Gregorian style calendar (7-day weeks, 12-month years, leap 
years, etc.). Other calendars can be supported, for example 
calendars with a fixed week length. 
 
In some FDCC-sets the field descriptors for weekday and month 
names will be given with an initial small letter. Programs 
using these fields may need to adjust the capitalization if 
the output is going to be used at the beginning of a sentence. 
 
The field descriptors corresponding to the optional keywords 
consist of a modifier followed by a traditional field 
descriptor (for instance %Ex). If the optional keywords are 
not supported by the application or are unspecified for the 
current FDCC-set, these field descriptors shall be treated as 
the traditional field descriptor. For instance, assume the 
following keywords: 
 
 alt_digits "0th";"1st";"2nd";"3rd";"4th";"5th";"6th";"7th";"8th";"9th";"10th" 
 d_fmt "The %Od day of %B in %Y" 
 
 
On 7/4/1776, the %x field descriptor would result in "The 4th 
day of July in 1776," while 7/14/1789 would come out as "The 
14 day of July in 1789." Note that the above example is for 
illustrative purposes only; the %O modifier is primarily 
intended to provide for Kanji or Hindi digits in 
date formats. While it is clear that an alternate year format 
is required, there is no consensus on the format or the 
requirements. As a result, while these keywords are reserved, 
the details are left unspecified. It is expected that National 
Standards Bodies will provide specifications. 
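
The fallback behaviour of %Od in the example above can be sketched in Python. This is an illustration only: the names ALT_DIGITS, format_od and format_date are hypothetical, and the d_fmt string is hard-coded rather than parsed.

```python
ALT_DIGITS = ["0th", "1st", "2nd", "3rd", "4th", "5th",
              "6th", "7th", "8th", "9th", "10th"]

def format_od(day, alt_digits):
    # Use the alternate symbol when one is defined for this value;
    # otherwise fall back to the traditional %d representation.
    if 0 <= day < len(alt_digits):
        return alt_digits[day]
    return str(day)

def format_date(day, month_name, year, alt_digits):
    # Corresponds to d_fmt "The %Od day of %B in %Y".
    return "The %s day of %s in %d" % (
        format_od(day, alt_digits), month_name, year)

print(format_date(4, "July", 1776, ALT_DIGITS))
print(format_date(14, "July", 1789, ALT_DIGITS))
```

The second call shows why "14 day" appears in the text: alt_digits defines symbols only up to 10, so the plain digits are used.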
 
 
B.1.6   LC_MESSAGES Rationale. 
 
The LC_MESSAGES category is described in clause 4 as affecting 
the language used by utilities for their output. The mechanism 
used by the application to accomplish this, other than the 
responses shown here in the FDCC-set definition, is not 
specified by this version of this standard. The 
internationalization working group is developing an interface 
that would allow applications (and, presumably some of the 
standard utilities) to access messages from various message 
catalogs, tailored to a user's LC_MESSAGES value. 
 
 
B.1.7   LC_PAPER Rationale. 
 
The LC_PAPER category gives information to prepare output on a 
printer. Only the physical measurements of height and width 
are available, as this is the information most often 
available in various document handling applications. 
 
 
B.1.8   LC_NAME Rationale. 
 
The LC_NAME category gives information to prepare a text for 
addressing a person, for example as a part of a postal address 
on an envelope, or as a salutation line in a letter. The 
information is intended to be given to an API that has the 
various naming information as parameters and yields a 
formatted string as the return value. 
 
 
B.1.9   LC_ADDRESS Rationale. 
 
The LC_ADDRESS category gives information to prepare a text 
for writing an address, for example as a part of a postal 
address on an envelope. The information is intended to be 
given to an API that has the various address information as 
parameters and yields a formatted string as the return value. 
 
 
B.1.10   LC_TELEPHONE Rationale. 
 
The LC_TELEPHONE category gives information to prepare a text 
for writing a telephone number. The information is intended to 
be given to an API that has the various information on a 
telephone number as parameters and yields a formatted string 
as the return value. Both an international and a domestic 
format are available. 
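
An API of the kind described could be sketched in Python as follows. This is a hypothetical illustration, not an interface defined by this standard: the function name format_telephone is invented, the field descriptors %c, %a and %l are assumed to stand for country code, area code and local number, and the sample values are made up.

```python
def format_telephone(fmt, country="", area="", local=""):
    """Expand a tel_int_fmt or tel_dom_fmt style string.

    Assumed descriptors: %c country code, %a area code, %l local
    number. Unknown descriptors are copied to the output unchanged.
    """
    fields = {"c": country, "a": area, "l": local}
    out = []
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            key = fmt[i + 1]
            out.append(fields.get(key, "%" + key))
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)

# International and domestic formats as in the Danish sample,
# written here with plain characters instead of symbolic names.
print(format_telephone("+%c %a %l", country="45", area="39", local="966101"))
print(format_telephone("%l", country="45", area="39", local="966101"))
```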
 
 
B.1.11   LC_MEASUREMENT Rationale. 
 
The LC_MEASUREMENT category gives a simple indication of 
whether the ISO measurement system or another system is the 
one applied. It may be enhanced in future editions of this 
standard. 
 
 
B.1.12   LC_VERSIONS Rationale. 
 
The LC_VERSIONS category gives meta-information on the 
FDCC-set, such as who created it, and what is the level of 
conformance for each of the categories. 
 
 
B.2   Character Set Rationale. 
 
This standard poses no requirement that multiple character 
sets or code sets be supported, leaving this as a marketing 
differentiation for implementors.  Although multiple charmaps 
are supported, it is the responsibility of the application to 
provide the file(s); if only one is provided, only that one 
will be accessible. 
 
The character set description text provides the capability to 
describe character set attributes (such as collation order or 
character classes) independent of character set encoding, and 
using only the characters in the portable character set.  This 
makes it possible to create "generic" FDCC-set source texts 
for all code sets that share the portable character set (such 
as the ISO/IEC 8859 family or IBM Extended ASCII). 
 
Applications are free to describe more than one code set in a 
character set description text.  For example, if an 
application defines ISO/IEC 8859-1 as the primary code set, 
and ISO/IEC 8859-2 as an alternate set, with each character 
from the alternate code set preceded in data by a shift code, 
a character set description text could contain a complete 
description of the primary set and those characters from the 
secondary that are not identical, the encoding of the latter 
including the shift code. 
 
Applications are free to choose their own symbolic names, as 
long as the names identified by this standard are also 
defined; this provides support for already existing "character 
names". 
 
The charmap was introduced to resolve problems with the 
portability of, especially, FDCC-set sources.  While the 
portable character set (in Table 3) is a constant across all 
FDCC-sets for a particular application, this is not true for 
the extended character set. However, the particular coded 
character set used for an application does not necessarily 
imply different characteristics or collation: on the contrary, 
these attributes should in many cases be identical, regardless 
of codeset.  The charmap provides the capability to define a 
common FDCC-set definition for multiple codesets (the same 
FDCC-set source can be used for codesets with different 
extended characters; the ability in the charmap to define 
"empty" names allows for characters missing in certain 
codesets). 
 
In addition, some implementors have expressed an interest in 
using the charmap to define certain other characteristics of 
codesets, such as the <mb_cur_max> value for the particular 
codeset.  (Note that <mb_cur_max> has to be equal to or lower 
than the C Standard {MB_LEN_MAX}, which is the application 
limit).  Such extensions are not described here; but may be 
added in a later revision of this standard. 
 
The <escape_char> declaration was added at the request of the 
international community to ease the creation of portable 
charmaps on terminals not implementing the default backslash 
escape.  (This approach was adopted because this is a new 
interface invented by POSIX-2. Historical interfaces, such as 
the shell command language and awk, have not been modified to 
accommodate this type of terminal.) 
 
The octal number notation was selected to match those of POSIX 
"awk" and "tr" utilities and is consistent with that used by 
the POSIX localedef utility. 
 
The charmap capability implements a facility available in some 
X/Open compatible applications.  Its prime virtue is to 
support "generic" collation sequence source definitions.  An 
implementor or an applications developer can produce a 
template definition that can be used to produce several 
codeset-dependent "compiled" FDCC-set definitions.  The 
facility also removes any dependency in many source 
definitions on characters outside the character set defined in 
this clause. 
 
The charmap allows specification of more than one encoding of 
a character. This allows for coded character sets that can 
encode an item in more than one way, for example as a fully 
composed character and as a base character plus a combining 
character. Both encodings can be recognized, but only the 
first specified encoding of the character may be output. In 
this way a character stream may be normalized. 
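
The normalization effect can be sketched in Python. The sketch is an assumption-laden illustration, not a charmap processor: the table SAMPLE_CHARMAP and the function normalize are hypothetical, the encodings shown are UTF-8 byte sequences chosen for the example, and the longest matching encoding is assumed to win on input.

```python
# Symbolic name -> list of encodings; the first one is used on output.
SAMPLE_CHARMAP = {
    "<c>": [b"c"],
    "<a>": [b"a"],
    "<f>": [b"f"],
    "<e>": [b"e"],
    # Fully composed form first, base plus combining acute second.
    "<e'>": [b"\xc3\xa9", b"e\xcc\x81"],
}

def normalize(data, charmap):
    # Recognize any listed encoding, but emit only the first one.
    reverse = {enc: name for name, encs in charmap.items() for enc in encs}
    longest = max(len(enc) for enc in reverse)
    out = bytearray()
    i = 0
    while i < len(data):
        # Try the longest candidate first so that a base character
        # followed by a combining character is seen as one item.
        for n in range(min(longest, len(data) - i), 0, -1):
            name = reverse.get(data[i:i + n])
            if name is not None:
                out += charmap[name][0]
                i += n
                break
        else:
            # Bytes not described by the charmap pass through unchanged.
            out.append(data[i])
            i += 1
    return bytes(out)

print(normalize(b"cafe\xcc\x81", SAMPLE_CHARMAP))
```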
 
The ISO 2022 support introduced makes it possible to refer to 
other definitions via charmaps, so the full encoding does not 
have to be replicated. It supports shifting with G0, G1, G2 
and G3 sets, and also general shifting of coded character sets 
via escape sequences. 
 
B.3   Repertoiremap Rationale. 
 
The repertoiremap was introduced to make FDCC-sets independent 
of the availability of charmaps. With the repertoiremap it is 
possible to use a FDCC-set encoded with one set of symbolic 
character names, together with charmaps with other symbolic 
character naming schemes, provided there are repertoiremaps 
available for both naming schemes. 
 
Repertoiremaps are also useful to describe repertoires of 
characters, to be used for example for transliteration. 
 
Annex C (informative) 
 
Index 
 
abbreviation                  4.13 
abday                          4.6 
abmon                          4.6 
absolute ellipses            3.2.3 
address                       4.13 
addresses                     4.10 
addset                         5.1 
affirmative response        3.1.17 
alpha                        4.2.1 
alt_digits                     4.6 
am_pm                          4.6 
application                   4.13 
audience                      4.13 
blank                        4.2.1 
block_separator              4.2.1 
byte                         3.1.1 
cal_direction                  4.6 
category                      4.13 
category names                 4.1 
category trailer               4.1 
category header                4.1 
category body                  4.1 
char_shape_selector          4.2.1 
character                    3.1.2 
character, graphic           4.2.1 
character, special           4.2.1 
character representation     4.1.1 
character, native digit      4.2.1 
character, hexadecimal digit 4.2.1 
character, multibyte         4.1.1 
character, decimal constant  4.1.1 
character, hexadecimal 
constant                     4.1.1 
character, space             4.2.1 
character, octal constant    4.1.1 
character, control           4.2.1 
character, blank             4.2.1 
character, digit             4.2.1 
character, punctuation       4.2.1 
character, printable        3.1.10 
character class              3.1.9 
character, coded             3.1.3 
Character set rationale        B.2 
charmap text                   5.1 
charmap          5, 4.1.2.4, 3.1.7 
charmap rationale              B.2 
class                        4.2.1 
cntrl                        4.2.1 
code_set_name                  5.1 
coded character              3.1.3 
col_weight_max          4.3, 4.3.3 
collating-element              4.3 
collating statements         4.3.1 
collating-symbol             4.3.6 
collating element           3.1.13 
collating sequence          3.1.15 
collating-element            4.3.5 
collating-symbol               4.3 
collation                   3.1.12 
comment_char          4.1.2.1, 5.1 
conformance                      7 
contact                       4.13 
control characters           4.2.1 
conversion_rate                4.4 
copy  4.2.1, 4.3.2, 4.4, 4.5, 4.6, 
   4.7, 4.8, 4.9, 4.10, 4.11, 4.12 
country_ab2                   4.10 
country_ab3                   4.10 
country_car                   4.10 
country_isbn                  4.10 
country_name                  4.10 
country_num                   4.10 
country_post                  4.10 
cultural convention          3.1.5 
currency_symbol                4.4 
d_fmt                          4.6 
d_t_fmt                        4.6 
date field descriptors       4.6.1 
date                          4.13 
day                            4.6 
decimal_point                  4.5 
default_missing              4.2.2 
define               4.3.14.1, 4.3 
definitions                    3.1 
digit                        4.2.1 
direction_control            4.2.1 
duo_currency_symbol            4.4 
duo_frac_digits                4.4 
duo_int_curr_symbol            4.4 
duo_int_frac_digits            4.4 
duo_int_n_cs_precedes          4.4 
duo_int_n_sep_by_space         4.4 
duo_int_n_sign_posn            4.4 
duo_int_p_cs_precedes          4.4 
duo_int_p_sep_by_space         4.4 
duo_int_p_sign_posn            4.4 
duo_n_cs_precedes              4.4 
duo_n_sep_by_space             4.4 
duo_n_sign_posn                4.4 
duo_p_cs_precedes              4.4 
duo_p_sep_by_space             4.4 
duo_p_sign_posn                4.4 
duo_valid_from                 4.4 
duo_valid_to                   4.4 
elif                 4.3.14.6, 4.3 
ellipses                     3.2.3 
ellipses, absolute             5.1 
ellipses, symbolic             5.1 
else                 4.3, 4.3.14.5 
email                         4.13 
endif                          4.3 
endif                     4.3.14.7 
equivalence class           3.1.16 
era                            4.6 
era_d_fmt                      4.6 
era_year                       4.6 
escape_char        4.1.2.2, 5.1, 6 
esqseq                         5.1 
euro                         B.1.3 
fax                           4.13 
FDCC-set, definition           4.1 
FDCC-set                        4f 
FDCC-set                     3.1.6 
FDCC-set rationale             B.1 
first_weekday                  4.6 
first_workday                  4.6 
frac_digits                    4.4 
graph                        4.2.1 
graphic characters           4.2.1 
grouping                       4.5 
height                         4.8 
ifdef                     4.3.14.3 
ifdef                          4.3 
ifndef                         4.3 
ifndef                    4.3.14.4 
include                      4.2.2 
include                        5.1 
include                    4.2.2.2 
int_curr_symbol                4.4 
int_frac_digits                4.4 
int_n_cs_precedes              4.4 
int_n_sep_by_space             4.4 
int_n_sign_posn                4.4 
int_p_cs_precedes              4.4 
int_p_sep_by_space             4.4 
int_p_sign_posn                4.4 
int_prefix                    4.11 
int_select                    4.11 
keywords                       4.1 
lang_ab                       4.10 
lang_lib                      4.10 
lang_name                     4.10 
lang_term                     4.10 
language                      4.13 
LC_ADDRESS                    4.10 
LC_ADDRESS rationale         B.1.9 
LC_COLLATE                     4.3 
LC_COLLATE rationale         B.1.2 
LC_CTYPE                       4.2 
LC_CTYPE rationale           B.1.1 
LC_MEASUREMENT                4.12 
LC_MEASUREMENT rationale    B.1.11 
LC_MESSAGES                    4.7 
LC_MESSAGES rationale        B.1.6 
LC_MONETARY                    4.4 
LC_MONETARY rationale        B.1.3 
LC_NAME                        4.9 
LC_NAME rationale            B.1.8 
LC_NUMERIC                                                              4.5 
LC_NUMERIC rationale                                                  B.1.4 
LC_PAPER                                                                4.8 
LC_PAPER rationale                                                    B.1.7 
LC_TELEPHONE                                                           4.11 
LC_TELEPHONE rationale                                               B.1.10 
LC_TIME                                                                 4.6 
LC_TIME rationale                                                     B.1.5 
LC_VERSIONS                                                            4.13 
LC_VERSIONS rationale                                                B.1.12 
LC_X                                                                      4 
left_to_right                                                         4.2.1 
line continuation                                                     3.2.2 
lower                                                                 4.2.1 
map                                                                   4.2.1 
mb_cur_max                                                              5.1 
mb_cur_min                                                              5.1 
measurement                                                            4.12 
messages                                                                4.7 
modified date field 
descriptors                                                           4.6.2 
mon                                                                     4.6 
mon_decimal_point                                                       4.4 
mon_grouping                                                            4.4 
mon_thousands_sep                                                       4.4 
monetary                                                                4.4 
multicharacter collating 
element                                                              3.1.14 
n_cs_precedes                                                           4.4 
n_sep_by_space                                                          4.4 
n_sign_posn                                                             4.4 
name formatting                                                         4.9 
name_fmt                                                                4.9 
name_gen                                                                4.9 
name_miss                                                               4.9 
name_mr                                                                 4.9 
name_mrs                                                                4.9 
name_ms                                                                 4.9 
negative response                                                    3.1.18 
negative_sign                                                           4.4 
no_connect-space                                                      4.2.1 
no_connect                                                            4.2.1 
noexpr                                                                  4.7 
non_spacing                                                           4.2.1 
non_spacing_level3                                                    4.2.1 
normal_connect                                                        4.2.1 
notations                                                               3.2 
num_separator                                                         4.2.1 
num_shape_selector                                                    4.2.1 
num_terminator                                                        4.2.1 
numeric                                                                 4.5 
operands                                                                4.1 
order_end                                                        4.3.9, 4.3 
order_start                                                      4.3, 4.3.8 
outdigit                                                              4.2.1 
p_cs_precedes                                                           4.4 
p_sep_by_space                                                          4.4 
p_sign_posn                                                             4.4 
paper format                                                            4.8 
portable character set           5 
positive_sign                  4.4 
POSIX                            1 
POSIX differences                A 
POSIX conformance             4.13 
postal addresses              4.10 
postal_fmt                    4.10 
pre-category statements      4.1.2 
print                        4.2.1 
printable character         3.1.10 
punct                        4.2.1 
punctuation characters       4.2.1 
r_connect                    4.2.1 
references                       2 
reorder-script-end          4.3.13 
reorder-script-after        4.3.12 
reorder-script-after           4.3 
reorder-after                  4.3 
reorder-end                    4.3 
reorder-script-end             4.3 
reorder-after               4.3.10 
reorder-end                 4.3.11 
reorder-after rationale    B.1.2.1 
repertoire rationale           B.3 
repertoire                       6 
repertoiremap 6, 3.1.8, 5.1, 4.1.2.3 
revision                      4.13 
right_to_left                4.2.1 
scope                            1 
script                  4.3, 4.3.4 
segment_separator            4.2.1 
source                        4.13 
space                        4.2.1 
special characters           4.2.1 
special1                     4.2.1 
special2                     4.2.1 
special3                     4.2.1 
sym_swap_layout              4.2.1 
symbol-equivalence             4.3 
symbol-equivalence           4.3.7 
symbolic ellipses            3.2.3 
symbolic name                4.1.1 
syntax format                3.2.1 
t_fmt                          4.6 
t_fmt_ampm                     4.6 
tel                           4.13 
tel_dom_fmt                   4.11 
tel_int_fmt                   4.11 
telephone numbers             4.11 
territory                     4.13 
text file                    3.1.4 
thousands_sep                  4.5 
timezone                       4.6 
title                         4.13 
toggling keywords           4.3.14 
tolower                      4.2.1 
tosymmetric                  4.2.1 
toupper                      4.2.1 
translit_end                 4.2.2 
translit_start               4.2.2 
transliteration                                                       4.2.2 
transliteration statements                                          4.2.2.1 
undef                                                         4.3, 4.3.14.2 
uno_valid_from                                                          4.4 
uno_valid_to                                                            4.4 
upper                                                                 4.2.1 
visible glyph portable 
characters                                                                5 
vowel_connect                                                         4.2.1 
week                                                                    4.6 
white space                                                          3.1.11 
width                                                                   4.8 
xdigit                                                                4.2.1 
yesexpr                                                                 4.7 
 BIBLIOGRAPHY 
 
The following specifications are considered relevant to this 
standard, in addition to the normative references. 
 
ISO 639, "Code for the representation of names of languages" 
 
ISO 646, "Information technology - ISO 7-bit coded character 
set for information interchange" 
 
ISO 3166, "Code for the representation of names of countries" 
 
ISO/IEC 8824, "Information technology - Open Systems 
Interconnection - Specification of Abstract Syntax Notation One 
(ASN.1)" 
 
ISO/IEC 8825, "Information technology - Open Systems 
Interconnection - Specification of Basic Encoding Rules for 
Abstract Syntax Notation One (ASN.1)" 
 
ISO/IEC 9899, "Information technology - Programming Language 
C" 
 
The Unicode Consortium: "The Unicode Standard, Version 2.0", 
Addison Wesley Developers Press, July 1996. 
ISBN 0-201-48345-9. 
 
IBM: "National Language Design Guide Volume 2 - National 
Language Support Reference Manual", IBM SE09-8002-03, August 
1994. 
 
STR: "Nordic Cultural Requirements on Information Technology 
(Summary report)", STR TS3, Libris, Reykjavík, Iceland 1992. 
ISBN 9979-9004-3-1. 
