Silme 0.5 released

Projects need releases. It’s important. A release is like a birthday for a project: it gets a milestone to mark its progress.
On the other hand, we have developers. They need unlimited time and no deadlines. When one meets the other, we have an interesting arm-wrestling match between the two, but ultimately one has to obey the Oath of the Bazaar, if you know what I mean.


So, here we are. Silme has been asking for a release for long enough, and I have postponed it over and over, so it’s time to make the cut. Today, I’m proud to announce the very first official release of Silme – a Python l10n library. Silme was announced a long time ago, and since then it has been continuously developed in a small but quite interesting project structure, with support from Adrian Kalla, Stefan Plewako, Ricardo Palomares, Staś Małolepszy and management guidance from Seth Bindernagel.

It’s very, very hard to explain the Silme concept to those who have never tried to work on localization development.

Let me try: It’s like a DOM API for localization.

Works? Probably not… Well. Let me try the descriptive way. Silme is a toolset for a developer who wants to work on localization tools. It can read localization files, it can write them, it can modify them, it can search through them, it can process them, merge, split, localize and help you get some statistics out of the localization files. It probably can juggle them, although support for this is rather experimental.

Standing on the shoulders…

Silme is definitely not the first library of its kind, and it was created based on lessons from the TranslationToolkit masters, who for years have led the development of the most widely used l10n dev library in the world. TT is not the only one. In fact, anyone who created a tool related to localization had to create a library for it, but it was probably limited to on-demand needs and did not get enough love and respect from its masters.

The reason is that the number of people working on localization tools is… let’s put it this way: the number would not qualify for the big number competition. Not even for a decent number, in fact. And those who do work on the tools are extremely focused on getting the output here and now, so everything that has to happen on the way from a PO or DTD file to the moment when you see a window with a textbox asking you to translate the string is less important than the actual UI and workflow. Thanks to this approach we at least have some tools that we can offer to localizers to make their lives happier.

The one particular example where the situation is different is the world of gettext, with its great tools like KBabel (now Lokalize), Pootle, PoEdit and a swarm of helpful little scripts around the Translation Toolkit project.

Format equality

And now, Silme tries to replicate this success on the cross-format level, giving some sense of self-respect to non-gettext formats like DTD and Properties. In fact, Silme is intended to be format-neutral, with support for as many formats as possible: Gettext, DTD, XLIFF, TS/QM (Nokia Qt) and others. It can also read entities directly from SQLite or MySQL, and we’re experimenting with grabbing entities directly from an HTML file. We hope that this will make it easier for developers around the world to craft their own tools, and for new projects to focus on the actual tool without having to reinvent the way to parse entities or compare them.

More than that. We hope that new localization formats and languages will have an easier start by being able to reuse everything that Silme has to offer and tie it to their vision and potential. Yes, L20n may be that language.
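To make the format-neutral idea a bit more concrete, here is a minimal sketch in plain Python. It is not Silme’s API: the helper names (`parse_properties`, `parse_dtd`, `compare`) are hypothetical, and the two regular expressions only cover the simplest one-line entries. The point is just that once entities from different formats land in the same structure, comparing them no longer depends on the format they came from.

```python
import re

# Hypothetical helpers for illustration only; this is NOT Silme's API.
# Both parsers handle only simple, single-line entries.
PROPERTIES_RE = re.compile(
    r"^[ \t]*([^#!\s=:][^=:\s]*)[ \t]*[=:][ \t]*(.*)$", re.MULTILINE
)
DTD_RE = re.compile(r'<!ENTITY\s+([\w.\-]+)\s+"([^"]*)"\s*>')


def parse_properties(text):
    """Return {id: value} for simple key = value lines, skipping comments."""
    return {m.group(1): m.group(2).strip() for m in PROPERTIES_RE.finditer(text)}


def parse_dtd(text):
    """Return {id: value} for simple <!ENTITY id "value"> declarations."""
    return {m.group(1): m.group(2) for m in DTD_RE.finditer(text)}


def compare(reference, localized):
    """List ids missing from, and obsolete in, the localized entity set."""
    return {
        "missing": sorted(set(reference) - set(localized)),
        "obsolete": sorted(set(localized) - set(reference)),
    }


reference = parse_properties("# menu\nfile.open = Open\nfile.save = Save\n")
localized = parse_dtd('<!ENTITY file.open "Ouvrir">\n<!ENTITY file.quit "Quitter">')
report = compare(reference, localized)
# report == {"missing": ["file.save"], "obsolete": ["file.quit"]}
```

A real library has to deal with comments, escapes, multi-line values and much more, which is exactly the work a tool author would rather not redo.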

Some numbers


Silme 0.5 is the result of countless hours of testing, coding, debugging, making mistakes and fixing them, analyzing performance, and introducing bottlenecks only to spend another couple of days trying to remove them. Adrian, in particular, spent a significant amount of time sending me testcases and letting me know that I broke everything with my shiny new API, but all in all what we have here is a pretty neat piece of code, if you ask me.

What you see on the right is a cute result of the hg activity script run on all 385 commits to our HG repo. Not sure if it’s self-explanatory, so let me give you a hint: 35 commits per day is a big number. (And our module owners are not in the race; they have some godlike powers for sure.)

In terms of LOC, the stats are as follows:

Language          files     blank   comment      code    scale   3rd gen. equiv
Silme library        49       580       413      3784 x   4.20 =       15892.80
With scripts etc.    78       982       529      5574 x   4.20 =       23410.80
(data from CLOC)
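For anyone wondering where the last column comes from: it is simply the code line count multiplied by the scale factor shown in the table. A tiny sketch that reproduces both rows (the function name is made up for this example):

```python
def third_gen_equivalent(code_lines, scale=4.20):
    """Multiply the code line count by the scale factor from the table."""
    return round(code_lines * scale, 2)


library_only = third_gen_equivalent(3784)   # "Silme library" row
with_scripts = third_gen_equivalent(5574)   # "With scripts etc." row
```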


Ha! Of course not! Silme is just at the beginning of its journey, and although we can already see the first projects investigating potential use of it, we have a long way ahead until we can mark a stable 1.0 release and claim success.

In particular, on the Wiki page you can read about our plans for the next release – Silme 0.7. You can also find a basic tutorial (work in progress) and a few example code pieces that you can use to figure out some of the nuts and bolts used in here.

As this post gets to its closing, and your excitement about the potential uncovered by the tales of python libraries, localization developers and mysterious land of happy i18n, l10n, l12y and other abbreviations where all strings fit the space, all words have perfect translations, declensions and countless plural forms are well handled and…

Umm… where was I… ah! Yeah. So… it’s time for me to close this gathering and let you surf ahead on your Internet waves. And if you happen to find this project interesting, don’t hesitate to stop by our newborn IRC channel – #silme on and cheer the brave men and women who work there.

From now on, you can expect short incremental updates on the way to Silme 0.7.
