Internationalization/code parser

Revision as of 10:31, 4 May 2006 by Trosim (Talk | contribs) (Reading and parsing: - MO file structure)

Summary

This part of the project should achieve the following:

  • reading and parsing MO files, which contain the strings and their translations
  • organizing the object collection incrementally, so that the whole file is not loaded unless it is needed
  • providing a simple interface to the localization class, so that strings can be printed without much effort
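
As a rough illustration of the last two goals, the sketch below defers all loading until the first lookup and falls back to the original string when no translation is found. The names `LazyCatalog`, `loader` and `gettext` are hypothetical and are not part of the project's code; "incremental" is read here in its simplest sense (nothing is loaded until a string is actually requested).

```python
from typing import Callable, Dict, Optional


class LazyCatalog:
    """Hypothetical catalog that defers loading until the first lookup."""

    def __init__(self, loader: Callable[[], Dict[str, str]]) -> None:
        # 'loader' is an assumed callable returning {original: translation}.
        self._loader = loader
        self._strings: Optional[Dict[str, str]] = None

    def gettext(self, msgid: str) -> str:
        # Load on first use only; later calls reuse the cached mapping.
        if self._strings is None:
            self._strings = self._loader()
        # Fall back to the original string when no translation exists.
        return self._strings.get(msgid, msgid)
```

A finer-grained variant could load individual strings on demand instead of the whole mapping; that depends on the file layout described below.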

Reading and parsing

MO file structure

The layout below is reproduced from the gettext manual.

          byte
               +------------------------------------------+
            0  | magic number = 0x950412de                |
               |                                          |
            4  | file format revision = 0                 |
               |                                          |
            8  | number of strings                        |  == N
               |                                          |
           12  | offset of table with original strings    |  == O
               |                                          |
           16  | offset of table with translation strings |  == T
               |                                          |
           20  | size of hashing table                    |  == S
               |                                          |
           24  | offset of hashing table                  |  == H
               |                                          |
               .                                          .
               .    (possibly more entries later)         .
               .                                          .
               |                                          |
            O  | length & offset 0th string  ----------------.
        O + 8  | length & offset 1st string  ------------------.
                ...                                    ...   | |
O + ((N-1)*8)  | length & offset (N-1)th string           |  | |
               |                                          |  | |
            T  | length & offset 0th translation  ---------------.
        T + 8  | length & offset 1st translation  -----------------.
                ...                                    ...   | | | |
T + ((N-1)*8)  | length & offset (N-1)th translation      |  | | | |
               |                                          |  | | | |
            H  | start hash table                         |  | | | |
                ...                                    ...   | | | |
    H + S * 4  | end hash table                           |  | | | |
               |                                          |  | | | |
               | NUL terminated 0th string  <----------------' | | |
               |                                          |    | | |
               | NUL terminated 1st string  <------------------' | |
               |                                          |      | |
                ...                                    ...       | |
               |                                          |      | |
               | NUL terminated 0th translation  <---------------' |
               |                                          |        |
               | NUL terminated 1st translation  <-----------------'
               |                                          |
                ...                                    ...
               |                                          |
               +------------------------------------------+
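
The layout above could be read along the following lines. This is only a sketch, not the project's parser: `MoHeader`, `read_header` and `read_entry` are hypothetical names, and the hash table is ignored. It relies on two details from the gettext manual: the magic number is stored in the byte order of the generating machine (so it also reveals the byte order of the file), and each stored length excludes the trailing NUL. Because every string is reached through a seek, a single entry can be fetched without loading the whole file.

```python
import io
import struct
from typing import BinaryIO, NamedTuple

MAGIC = 0x950412DE


class MoHeader(NamedTuple):
    byte_order: str   # struct prefix: "<" little-endian, ">" big-endian
    revision: int     # file format revision
    count: int        # N, number of strings
    orig_table: int   # O, offset of table with original strings
    trans_table: int  # T, offset of table with translation strings
    hash_size: int    # S, size of hashing table
    hash_offset: int  # H, offset of hashing table


def read_header(f: BinaryIO) -> MoHeader:
    """Read the seven 32-bit header fields at the start of the file."""
    f.seek(0)
    raw = f.read(28)
    if len(raw) < 28:
        raise ValueError("file too short for an MO header")
    # Try both byte orders; the one that yields the magic number is correct.
    for order in ("<", ">"):
        if struct.unpack(order + "I", raw[:4])[0] == MAGIC:
            break
    else:
        raise ValueError("bad magic number")
    fields = struct.unpack(order + "6I", raw[4:28])
    return MoHeader(order, *fields)


def read_entry(f: BinaryIO, header: MoHeader, table: int, index: int) -> bytes:
    """Return the index-th string of the table starting at offset 'table'."""
    if not 0 <= index < header.count:
        raise IndexError(index)
    # Each table entry is a (length, offset) pair of 32-bit integers.
    f.seek(table + index * 8)
    length, offset = struct.unpack(header.byte_order + "2I", f.read(8))
    f.seek(offset)
    return f.read(length)  # the trailing NUL is not included in 'length'
```

For example, with an MO file opened in binary mode as `f`, `header = read_header(f)` followed by `read_entry(f, header, header.trans_table, 0)` would return the 0th translation as raw bytes; decoding them is left to the caller.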