Monday, August 22, 2011

Data: Internationalization standard - Does 11/10/11 mean November 10th or October 11th?

As the world market for software expands, providing users with a localized experience becomes more and more crucial. To address that issue, the Unicode Consortium runs an open source project to create a standard that can guide software developers in their internationalization efforts. This project is called the Common Locale Data Repository, or CLDR.

The data maintained by CLDR covers approximately 500 locales. It describes how to display dates, including the spellings of months and days of the week, as well as numeric formatting, measurement units, collation, names, characters, etc. It attempts to be as comprehensive as possible, while offering the data in a standard way that can be used for software implementation. The scope and ambition of the project are impressive, as is the depth of information that it contains.
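
To make the question in the title concrete, here is a minimal Python sketch of how the same string can resolve to two different dates depending on the locale's field order. The `SHORT_DATE_ORDER` table and the `parse_short_date` function are hypothetical names written for this illustration. The field orders are typed in by hand rather than read from CLDR, and treating two-digit years as 20xx is an assumption.

```python
from datetime import date

# Hand-written field orders for illustration only (an assumption);
# a real application would read the date patterns from CLDR data.
SHORT_DATE_ORDER = {
    "en_US": ("month", "day", "year"),
    "en_GB": ("day", "month", "year"),
}


def parse_short_date(text, locale):
    """Interpret a slash-separated numeric date using the locale's field order."""
    parts = [int(p) for p in text.split("/")]
    if len(parts) != 3:
        raise ValueError("expected three numeric fields: %r" % text)
    fields = dict(zip(SHORT_DATE_ORDER[locale], parts))
    year = fields["year"]
    if year < 100:
        year += 2000  # assumption: two-digit years belong to the 2000s
    return date(year, fields["month"], fields["day"])


print(parse_short_date("11/10/11", "en_US"))  # 2011-11-10 (November 10th)
print(parse_short_date("11/10/11", "en_GB"))  # 2011-10-11 (October 11th)
```

The point is only that the string by itself is ambiguous; the locale data is what settles it.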

All of the data can be downloaded as a zip file from here. The zip file contains individual XML files for all the available languages and regions. The data can also be viewed online here.
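
As a rough idea of what working with one of those XML files might look like, the sketch below pulls a short date pattern out of a small inline fragment using Python's standard library. The `SAMPLE_LDML` string is a simplified, hand-written fragment modeled on the layout of the locale files, not a copy of a real one, and `short_date_pattern` is a hypothetical helper name. Real files are much larger and may inherit values from parent locales, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written fragment (an assumption) that imitates the
# structure of a CLDR locale file; it is not copied from the real data.
SAMPLE_LDML = """<ldml>
  <dates>
    <calendars>
      <calendar type="gregorian">
        <dateFormats>
          <dateFormatLength type="short">
            <dateFormat><pattern>M/d/yy</pattern></dateFormat>
          </dateFormatLength>
        </dateFormats>
      </calendar>
    </calendars>
  </dates>
</ldml>"""


def short_date_pattern(ldml_text):
    """Return the short Gregorian date pattern, or None if it is absent."""
    root = ET.fromstring(ldml_text)
    node = root.find(
        "./dates/calendars/calendar[@type='gregorian']"
        "/dateFormats/dateFormatLength[@type='short']/dateFormat/pattern"
    )
    return node.text if node is not None else None


print(short_date_pattern(SAMPLE_LDML))  # M/d/yy
```

A pattern such as `M/d/yy` puts the month first, so under it 11/10/11 would read as November 10th.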