Categories
main mozilla tech

JavaScript Internationalization in 2020

2020 is shaping up to be an amazing year for JavaScript Internationalization API.

After many years of careful design we’re seeing a lot of the work now coming close to completion with a number of high profile APIs on track for inclusion in ECMAScript 2020 standard!

What’s coming up?

Let’s cut to the chase. Here’s the list of APIs coming to the JavaScript environment of your choice:

Categories
main mozilla tech

The New Localization System for Firefox is in!

After nearly 3 years of work, 13 Firefox releases, 6 milestones and a lot of bits flipped, I’m happy to announce that the project of integrating the Fluent Localization System into Firefox is now completed!

It means that we consider Fluent to be well integrated into Gecko and ready to be used as the primary localization system for Firefox!

Below is a story of how that happened.

3 years of history

At Mozilla All-Hands in December 2016 my team at the time (L10n Drivers) presented a proposal for a new localization system for Firefox and Gecko – Fluent (code name at the time – “L20n“).

The proposal was sound, but at the time the organization was crystallizing vision for what later became known as Firefox Quantum and couldn’t afford pulling additional people in to make the required transition or risk the stability of Firefox during the push for Quantum.

Instead, we developed a plan to spend the Quantum release cycle bringing Fluent to 1.0, modernizing the Internationalization stack in Gecko, getting everything ready in place, and then, once the Quantum release completes, we’ll be ready to just land Fluent into Firefox!

Original schema of the proposed system integration into Gecko

We divided the work between two main engineers on the project – Staś Małolepszy took the lead of Fluent itself, while I became responsible for integrating it into Firefox.

My initial task was to refactor all of the locale management and higher-level internationalization integration (date/time formatting, number formatting, plural rules etc.) to unify around a common Unicode-backed model, all while avoiding any disruptions for the Quantum project, and by all means avoid any regressions.

I documented the first half of 2017 progress in a blog post “Multilingual Gecko in 2017” which became a series of reports on the progress of in our internationalization module, and ended up with a summary about the whole rearchitecture which ended up with a rewrite of 90% of code in intl::locale component.

Around May 2017, we had ICU enabled in all builds, all the required APIs including unified mozilla::intl::LocaleService, and the time has come to plan how we’re going to integrate Fluent into Gecko.

Planning

Measuring

Before we began, we wanted to understand what the success means, and how we’re going to measure the progress.

Stating that we aim at making Fluent a full replacement for the previous localization systems in Firefox (DTD and .properties) may be overwhelming. The path from landing the new API in Gecko, to having all of our UI migrated would likely take years and many engineers, and without a good way to measure our progress, we’d be unable to evaluate it.

Original draft of a per-component dashboard

Together with Axel, Staś and Francesco, we spent a couple days in Berlin going back and forth on what should we measure. After brainstorming through ideas such as fluent-per-component, fluent-per-XUL-widget and so on, we eventually settled on the simplest one – percentage of localization messages that use Fluent.

Original draft of a global percentage view

We knew we could answer more questions with more detailed breakdowns, but each additional metric required additional work to receive it and keep it up to date. With limited resources, we slowly gave up on aiming for detail, and focused on the big picture.

Getting the raw percentage of strings in Fluent to start with, and then adding more details, allowed us to get the measurements up quickly and have them available independently of further additions. Big picture first.

Staś took ownership over the measuring dashboard, wrote the code and the UI and soon after we had https://www.arewefluentyet.com running!

AreWeFluentYet.com as of January 12th 2020

Later, with the help from Eric Pang, we were able to improve the design and I added two more specific milestones: Main UI, and Startup Path.

The dashboard is immensely useful, both for monitoring the progress, and evangelizing the effort, and today if you visit any Mozilla office around the World, you’ll see it cycle through on the screens in common areas!

Target Component

To begin, we needed to get agreement with the Firefox Product Team on the intended change to their codebase, and select a target for the initial migration to validate the new technology.

We had a call with the Firefox Product Lead who advised that we start with migrating Preferences UI as a non-startup, self-contained, but sufficiently large piece of UI.

It felt like the right scale. Not starting with the startup path limited the risk of breaking peoples Nightly builds, and the UI itself is complex enough to test Fluent against large chunks of text, giving our team and the Firefox engineers time to verify that the API works as expected.

We knew the main target will be Preferences now, but we couldn’t yet just start migrating all of it. We needed smaller steps to validate the whole ecosystem is ready for Fluent, and we needed to plan separate steps to enable Fluent everywhere.

I split the whole project into 6 phases, each one gradually building on top of the previous ones.

Outline of the phases used by the stakeholders to track progress
Categories
main mozilla tech

Multilingual Gecko – 2017-2018 – Rearchitecture

Between 2017 and 2018 we refactored a major component of the Gecko Platform – the intl/locale module. The main motivator was the vision of Multilingual Gecko which I documented in a blog post.

Firefox 65 brought the first major user-facing change that results from that refactor in form of Locale Selection. It’s a good time to look back at the scale of changes. This post is about the refactor of the whole module which enabled many of the changes that we were able to land in 2019 to Firefox.

All of that work led to the following architecture:

Intl module in Gecko as of Firefox 72
Yellow – C++, Red – JavaScript, Green – Rust

Evolution vs Revolution

It’s very rare in software engineering that projects of scale (a Gecko module with close to 500 000 lines of code) go through such a major transition. We’ve done it a couple times, with Stylo, WebRender etc., but each case is profound, unique and rare.

There are good reasons not to touch an old code, and there are good reasons against rewriting such a large pile of code.

There’s not a lot of accrued organizational knowledge about how to handle such transitions, what common pitfalls await, and how to handle them.

That makes it even more unique to realize how smooth this change was. We started 2017 with a vision of putting a modern localization system into Firefox, and a platform that required a substantial refactor to get there.

We spent 2017 moving thousands of lines of 20 years old code around, and replacing them with a modern external library – ICU. We designed a new unified locale management and regional preferences modules, together with new internationalization wrappers finishing the year by landing Fluent into Gecko.

To give you a rough idea on the scope of changes – only 10% of ./intl/locale directory remained the same between Firefox 51 and 65!

In 2018, we started building on top of it, witnessing a reduced amount of complex work on the lower level, and much higher velocity of higher level changes with three overarching themes:

  1. Migrate Firefox to Fluent
  2. Improve Firefox locale management and multilingual use cases
  3. Upstream all of what we can to be part of the Web Platform

We then ended up Q1 2019 with a Fluent 1.0 release, close to 50% of DTD strings migrated to Fluent, with all major new JavaScript ECMA402 APIs based on our Gecko work and what’s even more important, with proven runtime ability to select locales from the Preferences UI.

What’s amazing to me, is that we avoided any major architectural regression in this transition. We didn’t have to revisit, remodel, revert or otherwise rethink in any substantial way any of the new APIs! All of the work, as you can see above, was put into incremental updates, deprecation of old code, standardization and integration of the new. Everything we designed to handle Firefox UI that has been proposed for standardization has been accepted, in roughly that form, by our partners from ECMA TC39 committee, and no major assumption ended up being wrong.

I believe that the reason for that success is the slow pace we took (a year of time to refactor, a year to stabilize), support from the whole organization, good test coverage and some luck.

On December 26th 2016 I filed a bug titled “Unify locale selection across components and languages“. Inside it, you can find a long list of user scenarios which we dared to tackle and which, at the time, felt completely impossible to provide.

Two years later, we had the technical means to address the majority of the scenarios listed in that bug.

Three new centralized components played a crucial role in enabling that:

LocaleService

LocaleService was the first new API. At the time, Firefox locale management was decentralized. Multiple components would try to negotiate languages for their own needs – UI chrome API, character detection, layout etc.

They would sometimes consult nsILocaleService / nsILocale which were shaped after the POSIX model and allowed to retrieve the locale based on environmental, per-OS, variables.

There was also a set of user preferences such as general.useragent.locale and intl.locale.matchOS which some of the components took into account, and others ignored.

Finally, when a locale was selected, the UI would use OS specific calls to perform internationalization operations which depended on what locale data and what intl APIs were available in the platform.

The result was that our UI could easily become inconsistent, jumping between POSIX variables, user settings, and OS settings, leading to platform-specific bugs and users stuck in a weird mid-air state with their UI being half in one locale, and half in the other.

The role of the new LocaleService was to unify that selection, provide a singleton service which will manage locale selection, negotiation and interaction with platform-specific settings.

LocaleService was written in C++ (I didn’t know Rust at the time), and quickly started replacing all hooks around the platform. It brought four major concepts:

  • All APIs accepted and returned fallback lists, rather than a single locale
  • All APIs manage their state through runtime negotiation
  • Unify all language tags around Unicode Locale Identifier standard
  • Events were fired to notify users on locale changes

The result, combined with the introduction of IPC for LocaleService, led us in the direction of cleaner system that maintains its state and can be reasoned about.

In the process, we kept extending LocaleService to provide the lists of locales that we should have in Gecko, both getters and setters, while maintaining the single source of truth and event driven model for reacting to runtime locale changes.

That allowed us to make our codebase ready for, first locale selection, and then runtime locale switching, that Fluent was about to make possible.

Today LocaleService is very stable, with just 8 bugs open, no feature changes since September 2018, up-to-date documentation and a ~93% test coverage.

OSPreferences

With the centralization of internationalization API around our custom ICU/CLDR instance, we needed a new layer to handle interactions with the host environment. This layer has been carved out of the old nsILocaleService to facilitate learning user preferences set in Windows, MacOS, Gnome and Android.

It has been a fun ride with incremental improvements to handle OS-specific customizations and feed them into LocaleService and mozIntl, but we were able to accomplish the vast majority of the goals with minimum friction and now have a sane model to reason about and extend as we need.

Today OSPreferences is very stable, with just 3 bugs open, no feature changes since September 2018, up-to-date documentation and a ~93% test coverage.

mozIntl

With LocaleService and OSPreferences in place we had all the foundation we needed to negotiate a different set of locales and customize many of the intl formatters, but we didn’t have a way to separate what we expose to the Web from what we use internally.

We needed a wrapper that would allow us to use the JS Intl API, but with some customizations and extensions that are either not yet part of the web standard, or, due to fingerprinting, cannot be exposed to the Web.

We developed `mozIntl` to close that gap. It exposes some bits that are internal only, or too early for external adoption, but ready for the internal one.

mozIntl is pretty stable now, with a few open bugs, 96% test coverage and a lot of its functionality is now in the pipeline to become part of ECMA402. In the future we hope this will make mozIntl thinner.

2019

In 2019, we were able to focus on the migration of Firefox to Fluent. That required us to resolve a couple more roadblocks, like startup performance and caching, and allowed us to start looking into the flexibility that Fluent gives to revisit our distribution channels, build system models and enable Gecko based applications to handle multilingual input.

Gecko is, as of today, a powerful software platform with a great internationalization component leading the way in advancing Web Standards and fulfilling the Mozilla mission by making the Web more accessible and multilingual. It’s a great moment to celebrate this achievement and commit to maintain that status.

Categories
main mozilla tech

Pseudolocalization in Firefox

One of the core projects we did over 2017 was a major overhaul of the Localization and Internationalization layers in Gecko, and all throughout the first half of 2018 we were introducing Fluent into Firefox.

All of that work was “behind the scenes” and laid the foundation to enable us to bring higher level improvements in the future.

Today, I’m happy to announce that the first of those high-level features has just landed in Firefox Nightly!

Pseudolocalization?

Pseudolocalization is a technology allowing for testing the localizability of software UI. It allows developers to check how the UI they are working on will look like when translated, without having to wait for translations to become available.

It shortens the Test-Driven Development cycle and lowers the burden of creating localizable UI.

Here’s a demo of how it works:

How to turn it on?

At the moment, we don’t have any UI for this feature. You need to create a new preference called intl.l10n.pseudo and set its value to accented for a left-to-right, ~30% longer strategy, or bidi for a right-to-left strategy. (more documentation).

If you test the bidi strategy you also will likely want to switch another preference – intl.uidirection – to 1. This is because right now the directionality of text and layout are not connected. We will improve that in the future.

We’ll be looking into ways to expose this functionality in the UI, and if you have any ideas or suggestions for what you’d like to see, let’s talk!

Nitty-gritty details

Although the feature may seem simple to add, and the actual patch that adds it was less than 100 lines long, it took many years of prototyping and years of development to build the foundation layers to allow for it.

Many of the design principles of Project Fluent combined with the vision shaped by the L10n Drivers Team at Mozilla allowed for dynamic runtime locale switching and declarative UI localization bindings.

Thanks to all of that work, we don’t have to require special builds or increase the bundle size for this feature to work. It comes practically for free and we can extend and fine tune pseudolocalization strategies on fly.

Kudos

If that feature looks cool, in the esoteric way localization and internationalization can, please, make sure to high-five the people who put a lot of work to get this done: Staś Małolepszy, Axel Hecht, Francesco Lodolo, Jeff Beatty and Dave Townsend.

More features are coming! Stay tuned.