Difference between revisions of "Talk:Vision2 and Unicode"
(added howto)
Colin-adams
Revision as of 22:09, 28 April 2006
Howto
The basic ideas are the following:
- Do the Unicode work first, as a separate library that takes STRING_GENERAL instances as arguments.
  - The implementation should be as compact as possible for the tables, and of course very efficient. This is actually quite difficult work, since it requires some reading of the Unicode standard.
    - I would say it requires reading the text of the entire book once, plus the standard annexes, and then re-reading (maybe several times) the sections relevant to the particular task you are engaged in. At least, I found this approach necessary when I was implementing geuc, and UTF-16 and UTF-32 serialization, in order to get a good grasp of all the issues involved. (Colin Adams)
- Implement helper classes that take a RAW_FILE as argument, to read text files in a particular Unicode encoding.
  - It looks like there is some kind of standard out there that uses a BOM (byte order mark).
- Update the editor to accept STRING_GENERAL instances as arguments, thus enabling users to enter Unicode characters.
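
To make the BOM point concrete, here is a minimal sketch of how a file reader might sniff the encoding from the first bytes of a file. It is written in Python rather than Eiffel purely for illustration; `detect_bom` and `decode_text` are hypothetical names, not part of any existing library, and falling back to UTF-8 when no BOM is found is an assumption, not something the standard requires.

```python
import codecs

# Longer BOMs are listed first: the UTF-32 LE BOM (FF FE 00 00) begins
# with the same two bytes as the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(raw: bytes):
    """Return (encoding, bom_length); encoding is None when no BOM is found."""
    for bom, encoding in BOMS:
        if raw.startswith(bom):
            return encoding, len(bom)
    return None, 0


def decode_text(raw: bytes, default: str = "utf-8") -> str:
    """Decode raw bytes, skipping the BOM if there is one.

    The default encoding used when no BOM is present is an assumption
    made for this sketch.
    """
    encoding, skip = detect_bom(raw)
    return raw[skip:].decode(encoding or default)
```

One caveat: a UTF-16 LE file whose first character after the BOM is U+0000 is indistinguishable from UTF-32 LE by this check, so a real helper class would probably want to let the caller override the detected encoding.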
 

