In computing, gettext is an internationalization and localization (i18n and l10n) system commonly used for writing multilingual programs on Unix-like computer operating systems. One of the main benefits of gettext is that it separates programming from translating. The most commonly used implementation of gettext is GNU gettext, released by the GNU Project in 1995. The runtime library is libintl. gettext provides an option to use different strings for any number of plural forms of nouns, but this feature has no support for grammatical gender. The main filename extensions used by this system are .POT (Portable Object Template), .PO (Portable Object) and .MO (Machine Object).


History

Initially, POSIX provided no means of localizing messages. Two proposals were raised in the late 1980s: the 1988 Uniforum gettext and the 1989 X/Open catgets (XPG-3 § 5). Sun Microsystems implemented the first gettext in 1993. The Unix and POSIX developers never really agreed on which interface to use (the other option being the X/Open catgets), so many C libraries, including glibc, implemented both. Whether gettext should be part of POSIX was still a point of debate in the Austin Group, even though its old rival catgets had already fallen out of use. Concerns cited included its dependence on the system-set locale (a global variable subject to multithreading problems) and its support for newer C-language extensions involving wide strings.

The GNU Project decided that the message-as-key approach of gettext is simpler and friendlier. (Most other systems, including catgets, require the developer to come up with "key" names for every string.) They released GNU gettext, a free software implementation of the system, in 1995. Gettext, GNU or not, has since been ported to many programming languages. The simplicity of po and widespread editor support even led to its adoption in non-program contexts for text documents or as an intermediate between other localization formats, with converters like po4a (po for anything) and Translate Toolkit emerging to provide such a bridge.


Operation


Programming

The basic interface of gettext is the gettext() function, which accepts a string that the user will see in the original language, usually English. To save typing time and reduce code clutter, this function is commonly aliased to _:

    printf(gettext("My name is %s.\n"), my_name);
    printf(_("My name is %s.\n"), my_name); // same, but shorter

gettext() then uses the supplied strings as keys for looking up translations, and returns the original string when no translation is available. This is in contrast to POSIX catgets(), AmigaOS GetString(), or Microsoft Windows LoadString(), where a programmatic ID (often an integer) is used.

To handle the case where the same original-language text can have different meanings, gettext has functions like pgettext() that accept an additional "context" string.

xgettext is run on the sources to produce a .pot (Portable Object Template) file, which contains a list of all the translatable strings extracted from the sources. Comments starting with /// are used to give translators hints, although other prefixes are also configurable to further limit the scope. One such common prefix is TRANSLATORS:.

For example, an input file with a comment might look like:

    /// TRANSLATORS: %s contains the user's name as specified in Preferences
    printf(_("My name is %s.\n"), my_name);

xgettext is run using the command:

    xgettext -c /

The resultant .pot file looks like this with the comment (note that xgettext recognizes the string as a C-language printf format string):

    #. TRANSLATORS: %s contains the user's name as specified in Preferences
    #, c-format
    #: src/name.c:36
    msgid "My name is %s.\n"
    msgstr ""

In POSIX shell script, gettext provides a gettext.sh library that one can include, which offers many of the same functions that gettext provides in other languages. GNU bash also has a simplified construct $"msgid" for the simple gettext function, although it depends on the C library to provide a gettext() function.
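The message-as-key lookup and the context variant described above can be sketched in C as follows. This is a minimal illustration and not a complete program setup: it assumes a C library that ships libintl.h (glibc, or GNU libintl), and the text domain "example-app", the catalog directory, and the helper names init_i18n, format_name, lookup_with_context and C_ are all hypothetical. The C_ macro only imitates the approach taken by GNU's pgettext() macro, which joins context and msgid with the byte "\004" to form the lookup key.

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

/* Conventional short alias for gettext(). */
#define _(msgid) gettext(msgid)

/* Call once at startup. The domain name and directory are placeholders. */
void init_i18n(void)
{
    setlocale(LC_ALL, "");                              /* honour the user's locale */
    bindtextdomain("example-app", "/usr/share/locale"); /* where .mo files live */
    textdomain("example-app");                          /* default domain for gettext() */
}

/* The English string is the lookup key; without a translation,
 * gettext() hands the same string back. */
int format_name(char *buf, size_t size, const char *my_name)
{
    return snprintf(buf, size, _("My name is %s.\n"), my_name);
}

/* Context lookup in the style of pgettext(): the key is
 * "context\004msgid". If gettext() returns the key pointer unchanged,
 * there is no translation, so fall back to the bare msgid. */
const char *lookup_with_context(const char *ctx_id, const char *msgid)
{
    const char *translation = gettext(ctx_id);
    return translation == ctx_id ? msgid : translation;
}

/* Hypothetical convenience macro; both arguments must be string literals. */
#define C_(ctx, msgid) lookup_with_context(ctx "\004" msgid, msgid)
```

With no catalog installed, format_name() produces the English sentence and C_("menu", "Open") yields "Open", which is the fallback behaviour the text describes.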


Translating

The translator derives a .po (Portable Object) file from the template using the msginit program, then fills out the translations. msginit initializes the translations so, for instance, for a French language translation, the command to run would be:

    msginit --locale=fr --input=name.pot

This will create fr.po. The translator then edits the resultant file, either by hand or with a translation tool like Poedit, or Emacs with its editing mode for .po files. An edited entry will look like:

    #: src/name.c:36
    msgid "My name is %s.\n"
    msgstr "Je m'appelle %s.\n"

Finally, the .po files are compiled with msgfmt into binary .mo (Machine Object) files. GNU gettext may use its own file name extension .gmo on systems with another gettext implementation. These are now ready for distribution with the software package.

GNU msgfmt can also perform some checks relevant to the format string used by the programming language. It also allows for outputting to language-specific formats other than MO; the X/Open equivalent of msgfmt is gencat.

In later phases of the developmental workflow, msgmerge can be used to "update" an old translation to a newer template. There is also msgunfmt for reverse-compiling .mo files, and many other utilities for batch processing.
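To give a feel for the kind of format-string check mentioned above, here is a simplified sketch in C. It is not msgfmt's actual algorithm: the function names are hypothetical, positional arguments such as %1$s are not handled, and length modifiers are skipped and not compared. It only verifies that msgid and msgstr contain the same sequence of printf conversion characters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Find the next printf directive in s. Stores its conversion character
 * in *conv and returns a pointer just past it, or stores '\0' and
 * returns NULL when no directive is left. "%%" is a literal percent. */
const char *next_conversion(const char *s, char *conv)
{
    while ((s = strchr(s, '%')) != NULL) {
        s++;                                   /* skip the '%' itself */
        if (*s == '%') { s++; continue; }      /* literal percent sign */
        s += strspn(s, "-+ #0");               /* flags */
        s += strspn(s, "0123456789*");         /* field width */
        if (*s == '.') {                       /* precision */
            s++;
            s += strspn(s, "0123456789*");
        }
        s += strspn(s, "hlLqjzt");             /* length modifiers (ignored) */
        if (*s == '\0')
            break;                             /* truncated directive */
        *conv = *s++;
        return s;
    }
    *conv = '\0';
    return NULL;
}

/* True when both strings use the same conversions in the same order. */
bool format_directives_match(const char *msgid, const char *msgstr)
{
    char a, b;
    for (;;) {
        msgid = next_conversion(msgid, &a);
        msgstr = next_conversion(msgstr, &b);
        if (a != b)
            return false;   /* different conversion, or one side ran out */
        if (a == '\0')
            return true;    /* both exhausted together */
    }
}
```

Under this sketch the French entry shown earlier passes, while a translation that replaced %s with %d, or dropped the directive entirely, would be rejected.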


Running

The user, on Unix-type systems, sets the environment variable LC_MESSAGES, and the program will display strings in the selected language, if there is an .mo file for it. Users on GNU variants can also use the environment variable LANGUAGE instead. Its main difference from the Unix variable is that it supports multiple languages, separated with a colon, for fallback.
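The colon-separated fallback of LANGUAGE can be pictured with the following C sketch. It is only an illustration of the idea, not GNU gettext's real lookup (which, among other things, also considers territory and codeset variants): pick_language and demo_has_catalog are hypothetical names, and demo_has_catalog merely pretends that French and German catalogs are installed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "is there an .mo file for this language?". */
bool demo_has_catalog(const char *lang)
{
    return strcmp(lang, "fr") == 0 || strcmp(lang, "de") == 0;
}

/* Walk a LANGUAGE-style list such as "sl:fr:de" from left to right and
 * copy the first entry that has a catalog into out. Returns false, with
 * out set to the empty string, when no entry matches. */
bool pick_language(const char *list, bool (*has_catalog)(const char *),
                   char *out, size_t outsz)
{
    while (list != NULL && *list != '\0') {
        size_t len = strcspn(list, ":");     /* length of the current entry */
        if (len > 0 && len < outsz) {
            memcpy(out, list, len);
            out[len] = '\0';
            if (has_catalog(out))
                return true;
        }
        list += len;
        if (*list == ':')
            list++;                          /* step over the separator */
    }
    if (outsz > 0)
        out[0] = '\0';
    return false;
}
```

For example, with LANGUAGE set to "sl:fr:de" and only the two pretend catalogs available, the sketch selects "fr": Slovene is tried first, then French is used as the fallback.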


Plural form

The ngettext() interface accounts for the count of a noun in the string. As with the convention of gettext(), it is often aliased to N_ in practical use. Consider the code sample:

    // parameters: english singular, english plural, integer count
    printf(ngettext("%d translated message", "%d translated messages", n), n);

A header in the "" (empty string) entry of the PO file stores some metadata, one of which is the plural form that the language uses, usually specified using a C-style ternary operator. Suppose we want to translate for the Slovene language:

    msgid ""
    msgstr ""
    "..."
    "Language: sl\n"
    "Plural-Forms: nplurals=4; plural=(n%100==1 ? 1 : n%100==2 ? 2 : n%100==3 || n%100==4 ? 3 : 0);\n"

Since now there are four plural forms, the final po would look like:

    #: src/msgfmt.c:876
    #, c-format
    msgid "%d translated message"
    msgid_plural "%d translated messages"
    msgstr[0] "%d prevedenih sporočil"
    msgstr[1] "%d prevedeno sporočilo"
    msgstr[2] "%d prevedeni sporočili"
    msgstr[3] "%d prevedena sporočila"

Reference plural rules for languages are provided by the Unicode consortium. msginit also prefills the appropriate rule when creating a file for one specific language.
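How a plural rule selects one of the translated strings can be sketched in C. The snippet below hard-codes the Slovene rule from the example, read as n%100==1 ? 1 : n%100==2 ? 2 : n%100==3 || n%100==4 ? 3 : 0, together with the four forms from the sample entry. In a real program ngettext() evaluates the rule from the catalog header at run time; the function and array names here are hypothetical.

```c
#include <stddef.h>

/* The Slovene Plural-Forms expression, written out as plain C.
 * The result is the index of the msgstr[] entry to use. */
unsigned slovene_plural_index(unsigned long n)
{
    return n % 100 == 1 ? 1
         : n % 100 == 2 ? 2
         : n % 100 == 3 || n % 100 == 4 ? 3
         : 0;
}

/* The four translated forms, in msgstr[0]..msgstr[3] order. */
const char *const slovene_forms[4] = {
    "%d prevedenih sporočil",
    "%d prevedeno sporočilo",
    "%d prevedeni sporočili",
    "%d prevedena sporočila",
};

/* Pick the format string that matches the count n. */
const char *slovene_form(unsigned long n)
{
    return slovene_forms[slovene_plural_index(n)];
}
```

Because the rule works on n modulo 100, a count of 101 selects the same form as 1, and 5 through 100 all share index 0.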


Implementations

In addition to C, gettext has the following implementations: C# for both ASP.NET and WPF, Perl, PHP, Python, R, Scala, and Node.js. GNU gettext has native support for Objective-C, but there is no support for the Swift programming language yet. A commonly used gettext implementation on these Cocoa platforms is POLocalizedString. The Microsoft Outlook for iOS team also provides a LocalizedStringsKit library with a gettext-like API.


See also

* gtranslator
* Poedit
* Translate Toolkit
* Virtaal
* Weblate


External links

* Official GNU gettext site: https://www.gnu.org/software/gettext/gettext.html