Unicode Free Operator
Revision as of 17:47, 26 May 2010 by Ted
This page describes the Unicode (5.0) free operator to be implemented in the ISE compiler. We highlight the relationship between the special characters mentioned in the Unicode standard and the printable characters specified in the ECMA standard.
Legend
<accepted> marks characters that are allowed as part of a free operator. <denied> marks characters that are not allowed as part of a free operator. <undecided> marks characters that are not yet decided.
Space Characters
<denied>
Code Position  Name
0020           SPACE
00A0           NO-BREAK SPACE
2000           EN QUAD
2001           EM QUAD
2002           EN SPACE
2003           EM SPACE
2004           THREE-PER-EM SPACE
2005           FOUR-PER-EM SPACE
2006           SIX-PER-EM SPACE
2007           FIGURE SPACE
2008           PUNCTUATION SPACE
2009           THIN SPACE
200A           HAIR SPACE
3000           IDEOGRAPHIC SPACE
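The rule for space characters can be sketched as a simple membership test. The following Python snippet is only an illustration, not the ISE compiler's implementation: the names `DENIED_SPACES` and `contains_denied_space` are hypothetical, and the set holds exactly the code positions listed above.

```python
# Code positions copied from the list above; all are marked <denied>.
# 0x2000..0x200A covers EN QUAD through HAIR SPACE.
DENIED_SPACES = frozenset(
    [0x0020, 0x00A0, 0x3000] + list(range(0x2000, 0x200B))
)


def contains_denied_space(operator: str) -> bool:
    """Return True if any character of `operator` is a denied space character."""
    return any(ord(ch) in DENIED_SPACES for ch in operator)
```

A candidate free operator such as `"a\u2009b"` (containing THIN SPACE) would be rejected by this check, while one made only of symbols would pass it.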
Currency symbols
<accepted> The Unicode standard emphasizes that currency symbols in ISO/IEC 10646 do not necessarily identify the currency of a country.
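One possible way to recognize currency symbols is through the Unicode general category `Sc` (Symbol, currency). This page does not say that the compiler uses this criterion, so the sketch below is an assumption, and `is_currency_symbol` is a hypothetical helper name.

```python
import unicodedata


def is_currency_symbol(ch: str) -> bool:
    """Return True if `ch` has Unicode general category Sc (assumed criterion)."""
    return unicodedata.category(ch) == "Sc"
```

Note that the result depends on the Unicode database bundled with the Python version in use, which may be newer than Unicode 5.0.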
Alternate format characters
General format characters
Zero-width boundary indicators
COMBINING GRAPHEME JOINER (034F) -- Used to indicate that adjacent characters belong to the same grapheme cluster.
SOFT HYPHEN (00AD) -- A format character that indicates a preferred intra-word line-break opportunity.
ZERO WIDTH SPACE (200B) -- This character behaves like a SPACE in that it indicates a word boundary, but unlike SPACE it has no presentational width.
WORD JOINER (2060) and ZERO WIDTH NO-BREAK SPACE (FEFF) -- These characters behave like a NO-BREAK SPACE in that they indicate the absence of word boundaries, but unlike NO-BREAK SPACE they have no presentational width.
ZERO WIDTH NON-JOINER (200C) -- This character indicates that the adjacent characters are not joined together in cursive connection even when they would normally join together as cursive letter forms.
ZERO WIDTH JOINER (200D) -- This character indicates that the adjacent characters are represented with joining forms in cursive connection even when they would not normally join together as cursive letter forms.
Format separators
LINE SEPARATOR (2028)
PARAGRAPH SEPARATOR (2029)
Bidirectional text formatting
LEFT-TO-RIGHT MARK (200E) -- In bidirectional formatting, this character acts like a left-to-right character (such as LATIN SMALL LETTER A).
RIGHT-TO-LEFT MARK (200F) -- In bidirectional formatting, this character acts like a right-to-left character (such as ARABIC LETTER NOON).
LEFT-TO-RIGHT EMBEDDING (202A) -- This character is used to indicate the start of a left-to-right implicit embedding.
RIGHT-TO-LEFT EMBEDDING (202B) -- This character is used to indicate the start of a right-to-left implicit embedding.
LEFT-TO-RIGHT OVERRIDE (202D) -- This character is used to indicate the start of a left-to-right explicit embedding.
RIGHT-TO-LEFT OVERRIDE (202E) -- This character is used to indicate the start of a right-to-left explicit embedding.
POP DIRECTIONAL FORMATTING (202C) -- This character is used to indicate the termination of an implicit or explicit directional embedding initiated by the above characters.
Other boundary indicators
NARROW NO-BREAK SPACE (202F) -- This character is a non-breaking space. It is similar to 00A0 NO-BREAK SPACE, except that it is rendered with a narrower width.
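Because most of the format characters above are invisible when rendered, it can help to locate them explicitly in a candidate operator. The following Python sketch only detects the code points named in this section. It does not decide whether they are accepted or denied, since the page does not state that. `FORMAT_CODE_POINTS` and `find_format_characters` are hypothetical names.

```python
import unicodedata

# Code points named in this section (illustrative list, not a compiler table).
FORMAT_CODE_POINTS = frozenset([
    # Zero-width boundary indicators
    0x034F, 0x00AD, 0x200B, 0x2060, 0xFEFF, 0x200C, 0x200D,
    # Format separators
    0x2028, 0x2029,
    # Bidirectional text formatting
    0x200E, 0x200F, 0x202A, 0x202B, 0x202C, 0x202D, 0x202E,
    # Other boundary indicators
    0x202F,
])


def find_format_characters(text):
    """Return (index, Unicode name) for each listed character found in `text`."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ord(ch) in FORMAT_CODE_POINTS
    ]
```

For example, calling `find_format_characters("a\u200db")` reports the ZERO WIDTH JOINER at index 1, which would otherwise be easy to miss in source text.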