I just sent this to a mailing list that I am on, but I figured I’d post it publicly too since there is some useful information here. However, it is probably very dry if you are not a programmer.
Internationalization is a requirement for XBLA (and probably any other console deployment). Well, I have finally started on that process.
Someone mentioned GNU gettext() earlier, and after reading up on the file format it seemed simple and like the right thing. There is a text file format (suffix “.po”) that you can type stuff into, and then if you want there are tools that convert that into a binary format (suffix “.mo”) that you can just load as one chunk of memory and use (it even has a precomputed hash table inside the file). I didn’t want to link to a big library, and the format seemed simple enough to implement, so I just wrote a reader for the binary file format. This did not take long; it was 166 lines of code, 20 of which are just the hashpjw function, which I pasted in after looking up the license terms.
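For anyone curious what "load it as one chunk of memory" looks like, here is a minimal sketch of reading the .mo header and pulling a string out of one of its tables. The layout (seven 32-bit header words, then tables of length/offset pairs) follows the GNU gettext manual; the names MoHeader, mo_read_header and mo_string_at are made up for illustration and are not the 166-line reader described above. It also assumes the file was written in the host's byte order.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define MO_MAGIC 0x950412deu

/* The seven 32-bit words at the start of a GNU .mo file. */
typedef struct {
    uint32_t magic;        /* MO_MAGIC when the byte order matches the host */
    uint32_t revision;     /* file format revision */
    uint32_t num_strings;  /* number of message pairs */
    uint32_t orig_table;   /* file offset of the original-string table */
    uint32_t trans_table;  /* file offset of the translation table */
    uint32_t hash_size;    /* number of hash table slots (0 if absent) */
    uint32_t hash_table;   /* file offset of the hash table */
} MoHeader;

/* Copy the header out of the loaded file. Returns 0 on success, -1 if the
   buffer is too small or the magic number is not in host byte order. */
int mo_read_header(const unsigned char *data, size_t size, MoHeader *out)
{
    if (size < sizeof(MoHeader)) return -1;
    memcpy(out, data, sizeof(MoHeader));
    return out->magic == MO_MAGIC ? 0 : -1;
}

/* Each table entry is a (length, offset) pair of 32-bit words.
   Returns a pointer to string number `index` in the table that starts at
   `table_offset`, or NULL if anything falls outside the buffer. */
const char *mo_string_at(const unsigned char *data, size_t size,
                         uint32_t table_offset, uint32_t index,
                         uint32_t *length_out)
{
    uint32_t entry[2];
    size_t pos = (size_t)table_offset + (size_t)index * 8;

    if (pos + 8 > size) return NULL;
    memcpy(entry, data + pos, 8);

    /* entry[0] is the length, entry[1] the offset; +1 for the trailing NUL. */
    if ((size_t)entry[1] + entry[0] + 1 > size) return NULL;

    if (length_out) *length_out = entry[0];
    return (const char *)(data + entry[1]);
}
```

Strings in the file are NUL-terminated, so the returned pointer can be used directly as a C string without copying.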
Unfortunately it didn’t work, and it wasn’t obvious why not; it seemed like the hash function was wrong or something. So then I spent a few hours trying to compile gettext() so I could run through the hello_world program and see how it goes. Big mistake. The problem with gettext() is that there is a massive amount of code there to do, as near as I can tell, almost nothing. After a couple of hours I was not even able to get it to compile under Cygwin, using the native Cygwin packages and the new auto-installer-and-updater that Cygwin has. I did get it to partially compile, and that took 15 minutes (!!) of constant compiling, for a program that basically just does string lookups in a hash table. What the fuck, people. So I said fuck that, looked at the source some more, and within 5 minutes saw the bug (which had to do with the documentation not being very clear, though I am disappointed I didn’t figure it out as soon as it started happening).
So all in all, the programming part of this was a very quick task, if you don’t count the hours I spent trying to deal with crappy open source. (The RAD Game Tools model of packing everything into one file is definitely, definitely the way to go if you ever want anyone to use your code).
My 166 lines of code assume that the hash table is present in the .mo file, but they don’t care about the file’s endianness. If anyone wants the code, just ask.
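To make that a bit more concrete, here is a rough sketch of a hash-table lookup that works with either byte order. It is not the actual 166 lines; MoFile, mo_open, mo_word and mo_lookup are hypothetical names. The probing scheme (start at hash mod size, step by 1 + hash mod (size - 2), slots hold string index + 1 with 0 meaning empty) is my reading of the gettext source and manual, and the sketch ignores the system-dependent strings that newer minor revisions of the format allow.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define MO_MAGIC         0x950412deu
#define MO_MAGIC_SWAPPED 0xde120495u   /* same magic, opposite byte order */

typedef struct {
    const unsigned char *data;  /* the whole .mo file in memory */
    size_t size;
    int swap;                   /* nonzero if file and host byte order differ */
} MoFile;

static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0xff00u) | ((v << 8) & 0xff0000u) | (v << 24);
}

/* Read the 32-bit word at byte offset `pos`, or 0 if it is out of range. */
static uint32_t mo_word(const MoFile *mo, size_t pos)
{
    uint32_t v;
    if (pos + 4 > mo->size) return 0;
    memcpy(&v, mo->data + pos, 4);
    return mo->swap ? swap32(v) : v;
}

/* The magic number tells us whether words need swapping. Returns 0 if OK. */
int mo_open(MoFile *mo, const unsigned char *data, size_t size)
{
    uint32_t magic;
    if (size < 28) return -1;
    memcpy(&magic, data, 4);
    if (magic != MO_MAGIC && magic != MO_MAGIC_SWAPPED) return -1;
    mo->data = data;
    mo->size = size;
    mo->swap = (magic == MO_MAGIC_SWAPPED);
    return 0;
}

/* P.J. Weinberger's string hash (hashpjw), computed on 32-bit words. */
uint32_t hashpjw(const char *s)
{
    uint32_t h = 0, g;
    while (*s) {
        h = (h << 4) + (unsigned char)*s++;
        g = h & 0xf0000000u;
        if (g) { h ^= g >> 24; h ^= g; }
    }
    return h;
}

/* Look up `key` through the file's hash table. Returns the translation,
   or NULL if the key is absent or the file has no usable hash table. */
const char *mo_lookup(const MoFile *mo, const char *key)
{
    uint32_t num   = mo_word(mo, 8);
    uint32_t orig  = mo_word(mo, 12), trans = mo_word(mo, 16);
    uint32_t hsize = mo_word(mo, 20), htab  = mo_word(mo, 24);
    size_t keylen = strlen(key);
    uint32_t h, idx, step, tries;

    if (hsize <= 2) return NULL;
    h = hashpjw(key);
    idx = h % hsize;
    step = 1 + h % (hsize - 2);

    for (tries = 0; tries < hsize; tries++) {
        uint32_t slot = mo_word(mo, (size_t)htab + (size_t)idx * 4);
        uint32_t len, off;
        if (slot == 0 || slot > num) return NULL;   /* empty slot: not found */

        len = mo_word(mo, (size_t)orig + (size_t)(slot - 1) * 8);
        off = mo_word(mo, (size_t)orig + (size_t)(slot - 1) * 8 + 4);
        if (len == keylen && (size_t)off + len <= mo->size &&
            memcmp(mo->data + off, key, len) == 0) {
            uint32_t toff = mo_word(mo, (size_t)trans + (size_t)(slot - 1) * 8 + 4);
            return toff < mo->size ? (const char *)(mo->data + toff) : NULL;
        }
        /* Advance by `step`, wrapping around the table. */
        idx = (idx >= hsize - step) ? idx - (hsize - step) : idx + step;
    }
    return NULL;
}
```

Only the 32-bit words get swapped; the string bytes themselves are the same in either byte order, which is why this costs almost nothing.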
I am currently using a GUI tool called poedit to compile the .po files into .mo, but I think it just calls the command-line tool ‘msgfmt’.
So my strings are now all externalized and happy in a standard format that any contracted translation company can deal with.
gettext(), and programs that people have built around it, tend to provide tools for scanning through your program and extracting all the strings to build the initial .po file, but this did not seem very useful to me, even supposing I had the confidence that I could figure out how to make it run in a reasonable amount of time. The majority of strings in the Braid source are not text displayed to the end-user (they’re stuff like asset IDs), so such a generated file would have a lot of junk in it. But also, I think it’s important to use a label for the string lookup that is separate from the actual English string, because that helps emphasize / document that there is an external data file that you need to change. Otherwise you go and fix a typo or change some punctuation in the English text in the program, and then all your translations mysteriously break. Which would be lame.
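Here is a tiny illustration of the label idea. The labels, the Entry table and the tr() function are all invented for this example (they are not from the Braid source), and the hard-coded table just stands in for whatever catalog is loaded from the .mo file.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a loaded catalog. In a .po file the first pair would read:
 *     msgid "menu_resume"
 *     msgstr "Resume Game"
 * English is just another translation, keyed by the same label. */
typedef struct {
    const char *label;  /* stable key used in the program source */
    const char *text;   /* display text; lives in the data file */
} Entry;

static const Entry english[] = {
    { "menu_resume", "Resume Game" },
    { "menu_quit",   "Quit" },
};

/* Returns the display text for `label`. Falls back to the label itself,
   so a missing entry shows up on screen instead of silently going blank. */
const char *tr(const char *label)
{
    size_t i;
    for (i = 0; i < sizeof english / sizeof english[0]; i++) {
        if (strcmp(english[i].label, label) == 0) return english[i].text;
    }
    return label;
}
```

With this arrangement, changing "Resume Game" to "Resume" touches only the data file; the key "menu_resume" that every translation hangs off of stays the same.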
I haven’t done the rendering part of this yet, though. That’s next. Right now my plan is to use freetype2 to generate font glyphs into a texture on-demand after startup. Does anyone have experience using freetype at all?