For several years now, the Localization team at Mozilla has been working on a modern localization framework. It is based on the following principles and architectural choices, which we consider fundamental for the next generation of multilingual UIs.
- Principle 1: Localizers should be in control of translations. The localization framework should be grammar-agnostic, whether it’s about grammatical cases, genders, or tenses. Localizers should be able to use the entire expressive power of their language to author translations which create the best experience for the users.
- Principle 2: Language fallback should be robust and graceful. When a translation is missing or broken, the user should be presented with a translation into the next best language, given their preferences. There might be more than one fallback language.
- Principle 3: Translations should be isolated and asymmetric if needed. The source language of the application should not define the structure of the translations (e.g. the lack of pluralization in English should not make it impossible for other languages to use plurals in a given message).
- Principle 4: The framework should embrace the Web. Localization should react to changes in the runtime environment (e.g. resizing of the app’s window, a change of orientation, an increase in the number of unread messages) and should add as little overhead for developers as possible.
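To make Principle 2 more concrete, here is a minimal sketch of a fallback chain in Python. It is not the l10n.js implementation; the function name `resolve_message`, the dictionary layout, and the sample strings are all hypothetical and chosen only for illustration.

```python
from typing import Mapping, Optional, Sequence, Tuple

def resolve_message(
    message_id: str,
    preferred_locales: Sequence[str],
    resources: Mapping[str, Mapping[str, str]],
) -> Optional[Tuple[str, str]]:
    """Return (locale, translation) from the first preferred locale
    that has a usable string, or None if no locale has one."""
    for locale in preferred_locales:
        translation = resources.get(locale, {}).get(message_id)
        if translation:  # skip missing or empty entries
            return locale, translation
    return None

# Hypothetical resources: Polish is incomplete, German and English are not.
RESOURCES = {
    "pl": {"hello": "Cześć"},
    "de": {"hello": "Hallo", "bye": "Tschüss"},
    "en-US": {"hello": "Hello", "bye": "Goodbye"},
}

# The user prefers Polish, then German, then English.
USER_LOCALES = ["pl", "de", "en-US"]

hello = resolve_message("hello", USER_LOCALES, RESOURCES)
bye = resolve_message("bye", USER_LOCALES, RESOURCES)
```

In this sketch `hello` resolves from Polish, while `bye` falls back to German rather than jumping straight to the source language, which is the behavior the principle asks for when there is more than one fallback language.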
Exactly one year ago, on April 8th 2014, Stas landed the initial rewrite of l10n.js – the localization framework used in Firefox OS. This set us on the path to enable the vision of a modern localization framework driven by the design principles outlined above.
Since then, we have had a dedicated two-person team working full time on advancing this vision and learning how to improve upon it in the process.
The full year of work has resulted in many important features being developed for the platform, including:
- Language packs: Small packages that decouple language resources from the application, allowing us to extend language coverage dynamically when users request it, even after the device has already been released on the market.
- Pseudo-locales: Programmatically built language resources that emulate different languages, allowing developers to test their applications for multilingual problems before localizers have had time to provide translations.
- Security: While not traditionally a big concern in localization, having an open runtime ecosystem of localizations requires us to make sure that translations cannot accidentally or maliciously impact our code and break it.
- Error reporting: We’ve made major advancements to help developers and localizers find potential errors early. We reject malformed strings, report missing strings and duplicates, and recover from exceptions in our code.
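As an illustration of the pseudo-locale idea, the following Python sketch builds an "accented" variant of an English string programmatically. It is only a sketch under stated assumptions: the look-alike table, the `{{ name }}` placeholder syntax, and the function name `pseudolocalize` are hypothetical and are not taken from the framework.

```python
import re

# Hypothetical look-alike table: each ASCII letter maps to an accented form.
ACCENTED = str.maketrans({
    "a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú",
    "A": "Á", "E": "É", "I": "Í", "O": "Ó", "U": "Ú",
    "c": "ç", "n": "ñ", "y": "ý",
})

# Assumed placeholder syntax: {{ name }}. Placeholders must stay untouched,
# otherwise the runtime could no longer substitute their values.
PLACEHOLDER = re.compile(r"(\{\{.*?\}\})")

def pseudolocalize(source: str) -> str:
    """Accent the translatable text of a string, leaving placeholders as-is."""
    parts = PLACEHOLDER.split(source)
    # With one capturing group, odd indices hold the matched placeholders.
    return "".join(
        part if index % 2 else part.translate(ACCENTED)
        for index, part in enumerate(parts)
    )

sample = pseudolocalize("You have {{ count }} unread messages")
```

A string that still shows up unaccented in the UI after such a transformation is most likely hardcoded rather than localizable, which is the kind of problem a developer can catch this way before any real translation exists.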
A couple of weeks ago we finalized the work scheduled for Firefox OS 2.2 and began development for the next major release. The clean and reliable API has given us a good base to start implementing the remaining components of the vision behind L20n in this cycle.
For the current cycle we have scheduled:
- DOM Overlays: The ability for a localizer to use HTML syntax in their translations and also to provide whole localized DOM fragments, which are merged with developer-provided skeletons via a secure algorithm. This increases the system’s security and empowers localizers to provide better translations.
- L20n format: One of the last remaining pieces of the puzzle is the new file format, designed to store localization data such as multivariant strings, entities with values and attributes, and variant selectors. This will allow us to start introducing new features to the system that are impossible with the current data storage formats.
- Lightweight l10n contexts: Together with the whole platform, we want to make heavier use of multiple small localization contexts to replace the single-context-per-app approach. This will improve performance and isolation, resulting in easier maintenance.
- API 3.0: Our current API still contains remnants of the old, synchronous API that we’d like to remove. Together with lightweight contexts, and on the path to WebAPI, we want to organize our events, methods and objects to fit the design of other W3C APIs.
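The security side of DOM Overlays can be illustrated with a small allowlist filter. The Python sketch below is not the overlay algorithm itself, which this post does not describe; it only shows the general idea that markup coming from a translation is kept when it is on a short list of safe elements and attributes and dropped otherwise. The allowlists and the names `OverlaySanitizer` and `sanitize_translation` are assumptions made for this example.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists: purely textual elements and harmless attributes.
ALLOWED_ELEMENTS = {"em", "strong", "b", "i", "sup", "sub"}
ALLOWED_ATTRIBUTES = {"title"}

class OverlaySanitizer(HTMLParser):
    """Rebuild translated markup, keeping only allowlisted tags and attributes."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_ELEMENTS:
            kept = "".join(
                f' {name}="{escape(value or "")}"'
                for name, value in attrs
                if name in ALLOWED_ATTRIBUTES
            )
            self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_ELEMENTS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always re-escaped, so it can never turn into markup.
        self.out.append(escape(data))

def sanitize_translation(markup: str) -> str:
    parser = OverlaySanitizer()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)

cleaned = sanitize_translation(
    'Click <em onclick="x()">here</em><script>alert(1)</script>'
)
```

In this sketch the `em` element survives without its `onclick` handler and the `script` tags are dropped, leaving their content as inert text. A real overlay step would additionally merge the result with the developer-provided skeleton, which is outside the scope of this example.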
With our ever-growing understanding of the environment and of how the web stack matures, we are also getting close to extracting the core of our framework to offer it for standardization. That’s an exciting opportunity to fulfill the vision of both Firefox OS and L20n and bring the modern localization framework to the whole web, making it more multilingual and global.
3 replies on “One year with the Firefox OS L10n framework”
You should really take a look at FormatJS ( http://formatjs.io/ ) by Yahoo, presented during React Conf ( https://www.youtube.com/watch?v=Sla-DkvmIHY ), and in particular at the standard ICU Message syntax ( http://userguide.icu-project.org/formatparse/messages and http://formatjs.io/guides/message-syntax/ ), which is much cleaner than the L20n format (which is really cargo culting the old DTD format in a bad way).
We did look at FormatJS, and we spent a significant amount of time learning and discussing ICU Message syntax… with ICU. Unfortunately, the syntax is not capable of storing the data that we are using in L20n. It also doesn’t match the file format principles that we defined for the L20n project.
The notion of “cleanness” is, of course, a subjective matter of taste and habit, so it may be hard to hold a conversation about your personal preference. But I can assure you that we did not use DTD as an inspiration, and that we are open to your feedback on improvements to the syntax, as long as they do not regress the feature set that we need to have.
Please write to mozilla.tools.l10n with your ideas and we can continue the discussion about the format there 🙂
Thanks for your reply. Great if you already considered it. You probably have way more experience in the subject than I do. But it does not seem to me that the ICU format conflicts with the file format principles that you link to. Perhaps you meant some feature requirements. I found only one page mentioning ICU and L20n, but the example mentioned there seems to be implementable using the ICU syntax. “Cleanness” sure is subjective, but also look out for NIH syndrome.
My last parenthetical was not helpful, I’m sorry.