Application ICU Localization


c++ localization

I have been working on a cross-platform C++ application and decided to provide localization via Boost.Locale with the ICU backend.

If you are new to Boost, it may seem rather large, but like any other library it can be configured and trimmed down to what you need. The same is true for ICU.

Prepare libraries:

Download Boost
Download ICU

Build ICU:

cd [source path of ICU] && ./configure && make && sudo make install

Build Boost without ICU:

cd [source path of Boost] && ./bootstrap.sh && ./b2

Build Boost with ICU explicitly enabled:

cd [source path of Boost] && ./bootstrap.sh && ./b2 --with-locale boost.locale.std=off boost.locale.posix=off boost.locale.iconv=off boost.locale.icu=on -sICU_PATH=[path to local ICU installation]

Check that b2 reports that ICU is available. If not, verify that [path to local ICU installation] is set up correctly.

At this point you should have multiple ICU libraries and one Boost.Locale library, all of which need to be linked into your application. Depending on which features you use, you may not need every ICU library.

Translation using po and mo:

Boost.Locale conforms to the GNU Gettext localization model and provides the boost::locale::translate function for string translation. Wrapping a string in boost::locale::translate marks it as localizable: the xgettext command extracts the marked strings into a ‘po’ file, and the msgfmt command compiles that file into binary ‘mo’ form.

std::cout << boost::locale::translate( "Hello World" ) << std::endl;

The above code shows a localized string which will be translated at runtime. Once all strings in the source file have been wrapped this way, they can be extracted for localization with the xgettext command:

xgettext --keyword=translate:1,1t --output='en_US.po' source_file.cpp

This will extract all basic messages and produce a ‘po’ file from which translations can be managed. Read the xgettext man page and the Boost.Locale translation page for a full overview of how extraction can be customized.

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2021-04-09 18:00+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: source_file.cpp:22
msgid "Hello World"
msgstr "Hϵllo Word, Ͳranslatϵd UTF-8"

The above ‘po’ file contains the ‘Hello World’ msgid with its corresponding translation string which will be used at runtime whenever the msgid is seen by boost::locale::translate.
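To make the msgid to msgstr relationship concrete, here is a minimal sketch of the lookup model in plain C++. It is not how Boost.Locale or gettext are implemented (they read the binary ‘mo’ format); the helper names quoted_part, parse_po and lookup are made up for this illustration, and the parser assumes single-line entries without escape sequences.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: returns the text between the first and last double
// quote on a line, e.g. `msgid "Hello World"` -> `Hello World`.
// Escape sequences and multi-line strings are deliberately not handled.
std::string quoted_part(const std::string& line)
{
  const auto first = line.find('"');
  const auto last = line.rfind('"');
  if (first == std::string::npos || last <= first)
    return "";
  return line.substr(first + 1, last - first - 1);
}

// Hypothetical helper: builds a msgid -> msgstr table from 'po' text.
std::map<std::string, std::string> parse_po(const std::string& text)
{
  std::map<std::string, std::string> catalog;
  std::istringstream in(text);
  std::string line, current_id;
  bool have_id = false;
  while (std::getline(in, line)) {
    if (line.rfind("msgid ", 0) == 0) {          // line starts with "msgid "
      current_id = quoted_part(line);
      have_id = true;
    } else if (have_id && line.rfind("msgstr ", 0) == 0) {
      if (!current_id.empty())                   // empty msgid is the header entry
        catalog[current_id] = quoted_part(line);
      have_id = false;
    }
  }
  return catalog;
}

// Gettext-style lookup: an unknown or untranslated msgid is returned unchanged.
std::string lookup(const std::map<std::string, std::string>& catalog,
                   const std::string& msgid)
{
  const auto it = catalog.find(msgid);
  return (it != catalog.end() && !it->second.empty()) ? it->second : msgid;
}
```

The fallback in lookup is the important part of the model: a string that has no translation is shown as its msgid, so an incomplete ‘po’ file degrades to the source language instead of failing.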

When you are happy with all the translation strings, the ‘po’ file needs to be converted into the binary ‘mo’ format which is used for runtime string lookup. The ‘mo’ file is generated with the msgfmt command:

msgfmt --verbose --check-format --output-file=en_US.mo en_US.po

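Now you have a binary ‘mo’ localization file for your application. The ‘mo’ file must be placed in a directory structure such as ‘en_US/LC_MESSAGES/hello.mo’, where ‘en_US’ is the language/country and ‘hello’ is the message domain. The ‘en_US.mo’ file generated above therefore has to be copied or renamed to ‘hello.mo’ inside that directory.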
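The directory layout can be sketched as a small path-building function. This is only an illustration: candidate_catalogs is a made-up helper, and the language-only fallback (‘en’ after ‘en_US’) reflects my understanding of how the catalog search behaves, so check the Boost.Locale documentation before relying on it.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: lists the catalog files worth checking for a locale
// such as "en_US" under one search path, most specific first.
std::vector<std::string> candidate_catalogs(const std::string& search_path,
                                            const std::string& locale_id,
                                            const std::string& domain)
{
  std::vector<std::string> ids{locale_id};
  const auto underscore = locale_id.find('_');
  if (underscore != std::string::npos)
    ids.push_back(locale_id.substr(0, underscore)); // assumed language-only fallback

  std::vector<std::string> result;
  for (const auto& id : ids)
    result.push_back(search_path + "/" + id + "/LC_MESSAGES/" + domain + ".mo");
  return result;
}
```

For the search path ‘.’, the locale ‘en_US’ and the domain ‘hello’, this yields ‘./en_US/LC_MESSAGES/hello.mo’ followed by ‘./en/LC_MESSAGES/hello.mo’.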

Application localization:

The following application loads ‘mo’ files from the message search path ‘.’ and uses a message domain called ‘hello’.

#include <boost/locale.hpp>
#include <iostream>
#include <ctime>

using namespace std;
using namespace boost::locale;

int main()
{
  generator gen;

  // Specify location of dictionaries
  gen.add_messages_path( "." );
  gen.add_messages_domain( "hello/utf-8" ); // explicit utf-8 domain encoding

  {
    std::locale loc( gen( "en_US.UTF-8" ) );
    
    locale::global( loc );
    cout.imbue( locale() );
    
    cout << translate( "Hello World" ) << endl;
    
    cout << std::use_facet< boost::locale::info >( loc ).name() << endl;
    cout << std::use_facet< boost::locale::info >( loc ).language() << endl;
    cout << std::use_facet< boost::locale::info >( loc ).country() << endl;
    cout << std::use_facet< boost::locale::info >( loc ).variant() << endl;
    cout << std::use_facet< boost::locale::info >( loc ).encoding() << endl;
    
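    // Print the current date in the locale's format (endl rather than ends,
    // which would write a '\0' character instead of a newline)
    cout << boost::locale::as::date << time(0) << endl;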
  }
    
  return 0;
}

Output from application:

Hϵllo Word, Ͳranslatϵd UTF-8
en_US.UTF-8
en
US

utf-8
Apr 9, 2021
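
The language, country, variant and encoding lines in the output follow from the layout of the locale name, language[_COUNTRY][.encoding][@variant]. The sketch below splits a name along those separators in plain C++. LocaleParts and split_locale_name are made-up names for this illustration and are not part of Boost.Locale; unlike the info facet, which printed ‘utf-8’ above, the sketch does not normalize the case of the encoding.

```cpp
#include <string>

// Hypothetical value type holding the pieces of a POSIX-style locale name.
struct LocaleParts {
  std::string language, country, encoding, variant;
};

// Hypothetical helper: splits "language[_COUNTRY][.encoding][@variant]".
LocaleParts split_locale_name(std::string name)
{
  LocaleParts parts;
  // Removes everything from the first `sep` onwards and returns what
  // followed the separator (empty if the separator is absent).
  auto cut_suffix = [&name](char sep) {
    std::string suffix;
    const auto pos = name.find(sep);
    if (pos != std::string::npos) {
      suffix = name.substr(pos + 1);
      name.erase(pos);
    }
    return suffix;
  };
  parts.variant = cut_suffix('@');   // must be cut first: it comes last in the name
  parts.encoding = cut_suffix('.');
  parts.country = cut_suffix('_');
  parts.language = name;             // whatever remains is the language code
  return parts;
}
```

For ‘en_US.UTF-8’ this gives the language ‘en’, the country ‘US’, the encoding ‘UTF-8’ and an empty variant, which is why the variant line in the output above is blank.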