Internationalization/mo parser

Revision as of 07:18, 7 May 2006 by Carlo (Talk | contribs) (moved from code parser)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Summary

That's what this part of the project should achieve:

  • reading and parsing of MO files containing the strings and their translations
  • organize the object collection in an incremental way: don't load the whole file if it's not needed
  • give a simple interface to the localization class, so that the strings can be printed out without too much efforts

Reading and parsing

Parser structure

I'll propose the class structure of the parser, later.

MO file structure

As reported from the gettext manual.

          byte
               +------------------------------------------+
            0  | magic number = 0x950412de                |
               |                                          |
            4  | file format revision = 0                 |
               |                                          |
            8  | number of strings                        |  == N
               |                                          |
           12  | offset of table with original strings    |  == O
               |                                          |
           16  | offset of table with translation strings |  == T
               |                                          |
           20  | size of hashing table                    |  == S
               |                                          |
           24  | offset of hashing table                  |  == H
               |                                          |
               .                                          .
               .    (possibly more entries later)         .
               .                                          .
               |                                          |
            O  | length & offset 0th string  ----------------.
        O + 8  | length & offset 1st string  ------------------.
                ...                                    ...   | |
O + ((N-1)*8)  | length & offset (N-1)th string           |  | |
               |                                          |  | |
            T  | length & offset 0th translation  ---------------.
        T + 8  | length & offset 1st translation  -----------------.
                ...                                    ...   | | | |
T + ((N-1)*8)  | length & offset (N-1)th translation      |  | | | |
               |                                          |  | | | |
            H  | start hash table                         |  | | | |
                ...                                    ...   | | | |
    H + S * 4  | end hash table                           |  | | | |
               |                                          |  | | | |
               | NUL terminated 0th string  <----------------' | | |
               |                                          |    | | |
               | NUL terminated 1st string  <------------------' | |
               |                                          |      | |
                ...                                    ...       | |
               |                                          |      | |
               | NUL terminated 0th translation  <---------------' |
               |                                          |        |
               | NUL terminated 1st translation  <-----------------'
               |                                          |
                ...                                    ...
               |                                          |
               +------------------------------------------+