
LaTeX/Internationalization

LaTeX has to be configured and used appropriately when it is used to write
documents in languages other than English. This has to address three main areas:

1. LaTeX needs to know how to hyphenate the language(s) to be used.

2. The user needs to use language-specific typographic rules. In French, for
   example, there is a mandatory space before each colon character (:).

3. The input of special characters, especially for languages using an input
   system (Arabic, Chinese, Japanese, Korean).

It is convenient to be able to insert language-specific special characters
directly from the keyboard instead of using cumbersome coding (for example, by
typing ä instead of \"{a}). This can be done by configuring the input encoding
properly. We will not tackle this issue here: see the Special Characters chapter.

Some languages require special fonts with the proper font encoding set. See Font
encoding.

Some of the methods described in this chapter may be useful when dealing with
non-English author names in bibliographies.

Here is a collection of suggestions about writing a LaTeX document in a language
other than English. If you have experience in a language not listed below,
please add some notes about it.

1 Prerequisites

Most non-English languages need special characters to be input very often. For
convenient writing you will need to set the input encoding and the font encoding
properly.

The following configuration is optimal for many languages (most Latin
languages): load inputenc with the utf8 option and fontenc with the T1 option.
Make sure your document is saved using the UTF-8 encoding.

For more details check Font encoding and Special Characters.

2 Babel

The babel package by Johannes Braams and Javier Bezos will take care of
everything (with XeTeX and LuaTeX you should consider polyglossia). You can load
it in your preamble with \usepackage[language]{babel}, providing as the optional
argument the name of the language you want to use (usually its English name, but
not always).

You should place it soon after the \documentclass command, so that all the other
packages you load afterwards will know the language you are using. Babel will
automatically activate the appropriate hyphenation rules for the language you
choose. If your LaTeX format does not support hyphenation in the language of
your choice, babel will still work but will disable hyphenation, which has quite
a negative effect on the appearance of the typeset document. Babel also
specifies new commands for some languages, which simplify the input of special
characters. See the sections about languages below for more information.

If you call babel with multiple languages, as in
\usepackage[languageA,languageB]{babel}, then the last language in the option
list will be active (i.e. languageB), and you can use the command
\selectlanguage{languageA} to change the active language. You can also add short
pieces of text in another language using the command
\foreignlanguage{languageB}{Text in another language}.

Babel also offers various environments for entering larger pieces of text in
another language, such as otherlanguage.

The starred version of this environment typesets the main text according to the
rules of the other language, but keeps the language-specific strings for
ancillary things like figures in the main language of the document. The
environment hyphenrules switches only the hyphenation patterns used; it can also
be used to disallow hyphenation by using the language name 'nohyphenation' (but
note selectlanguage* is preferred).

The babel manual provides much more information on these and many other options.

3 Multilingual versions

It is possible in LaTeX to typeset the content of one document in several
languages and to choose upon compilation which language to output. This might be
convenient to keep a consistent sectioning and formatting across the different
languages. It is also useful if you make use of multiple proper nouns and other
untranslated content. Using
the commands above in multilingual documents can be cumbersome, and therefore
babel provides a way to define shorter names. With \babeltags{de = german}, for
example, you can write \textde{...} for a short piece of German text, or use the
de environment for a longer one.

3.1 Alternative choice using iflang

The current language can also be tested by using the iflang package by Heiko
Oberdiek (the built-in feature from the babel package is not reliable). Here
comes a simple example:

\IfLanguageName{ngerman}{Hallo}{Hello}

This makes it easy to distinguish between two languages without the need to
define your own commands. The babel language is changed by setting

\selectlanguage{english}

4 Specific languages

4.1 Arabic script

For languages which use the Arabic script, including Arabic, Persian, Urdu,
Pashto, Kurdish, Uyghur, etc., add the following code to your preamble:

You can input text in either romanized characters or native Arabic script
encodings. Use any of the following commands and environments to enter text:

See the ArabTeX Wikipedia article for further details.

You may also use the Arabi package within Babel to typeset Arabic and Persian.

You may also copy and paste from PDF files produced with Arabi thanks to the
support of the cmap package. You may use Arabi with LyX, or with tex4ht to
produce HTML.

See Arabi page on CTAN.

4.2 Armenian

The Armenian script uses its own characters, which will require you to install a
text editor that supports Unicode and will allow you to enter UTF-8 text, such
as Texmaker or WinEdt. These text editors should then be configured to compile
using XeLaTeX.

Once the text editor is set up to compile with XeLaTeX, the fontspec package can
be used to write in Armenian with a font such as Sylfaen or DejaVu Serif.

The Sylfaen font lacks italic and bold, but DejaVu Serif supports them.

See Armenian Wikibooks for further details, especially on how to configure the
Unicode supporting text editors to compile with XeLaTeX.

4.3 Cyrillic script

Version 3.7h of babel includes support for the T2* encodings and for typesetting
Bulgarian, Russian and Ukrainian texts using Cyrillic letters[1]. Support for
Cyrillic is based on standard LaTeX mechanisms plus the fontenc and inputenc
packages. AMS-LaTeX packages should be loaded before fontenc and babel. If you
are going to use Cyrillic in math mode, you also need to load the mathtext
package before fontenc:

Generally, babel will automatically choose the default font encoding; for the
above three languages this is T2A. However, documents are not restricted to a
single font encoding. For multilingual documents using Cyrillic and Latin-based
languages it makes sense to include the Latin font encoding explicitly. Babel
will take care of switching to the appropriate font encoding when a different
language is selected within the document.

On modern operating systems it is beneficial to use Unicode (utf8 or utf8x)
instead of KOI8-RU (koi8-ru) as an input encoding for Cyrillic text.

In addition to enabling hyphenation, translating automatically generated text
strings, and activating some language-specific typographic rules (like
\frenchspacing), babel provides some commands allowing typesetting according to
the standards of the Bulgarian, Russian, or Ukrainian languages.

For all three languages, language-specific punctuation is provided: the Cyrillic
dash for the text (it is a little narrower than the Latin dash and surrounded by
tiny spaces), a dash for direct speech, quotes, and commands to facilitate
hyphenation:

The Russian and Ukrainian options of babel define the commands \Asbuk and \asbuk,
which act like \Alph and \alph (commands for turning counters into letters, e.g.
a, b, c...), but produce capital and small letters of the Russian or Ukrainian
alphabets (whichever is the active language of the document).

The Bulgarian option of babel provides the commands \enumBul and \enumLat
(\enumEng), which make \Alph and \alph produce letters of either the Bulgarian or
the Latin (English) alphabet. The default behaviour of \Alph and \alph for the
Bulgarian language option is to produce letters from the Bulgarian alphabet.

See the Bulgarian translation of The Not So Short Introduction to LaTeX[2] for a
method to type Cyrillic letters directly from the keyboard using a different
distribution.

4.4 Chinese

One possible Chinese support is made available thanks to the CJK package
collection. If you are using a package
manager or a portage tree, the CJK collection is usually in a separate package
because of its size (mainly due to fonts).

Make sure your document is saved using the UTF-8 character encoding. See Special
Characters for more details.

Put the parts where you want to write Chinese characters in a CJK environment.

The last argument specifies the font. It must fit the desired language, since
fonts are different for Chinese, Japanese and Korean. Possible choices for
Chinese include:

gbsn (simplified Chinese)

gkai (simplified Chinese)

bsmi (traditional Chinese)

bkai (traditional Chinese)

4.5 Czech

Czech is fine using \usepackage[czech]{babel}.

UTF-8 allows you to have Czech quotation marks directly in your text. Otherwise,
there are the macros \clqq and \crqq to produce the left and right quote. You
can also place quoted text inside \uv.

4.6 Finnish

Finnish language hyphenation is enabled with \usepackage[finnish]{babel}.

This will also automatically change the document language (section names, etc.)
to Finnish.

4.7 French

You can load French language support with the following command:

There are multiple options for typesetting French documents, depending on the
flavor of French: french, frenchb, and francais for Parisian French, and acadian
and canadien for new-world French. If you do not know or do not really care, we
would recommend using frenchb.

However, as of version 3.0 of babel-french, it is advised to choose the language
as a global option with the following command[3]:

All enable French hyphenation, if you have configured your LaTeX system
accordingly. All of these also change all automatic text into French: \chapter
prints Chapitre, \today prints the current date in French and so on. A set of
new commands also becomes available, which allows you to write French input
files more easily. Check out the following table for inspiration:

You may want to typeset guillemets and other French characters directly if your
keyboard has them. Running Xorg (*BSD and GNU/Linux), you may want to use the
oss variant which features some nice shortcuts, like

You will need the T1 font encoding for guillemets to print properly.

For the degree character you will get an error like

! Package inputenc Error: Unicode char \u8:° not set up for use with LaTeX.

The textcomp package will fix it for you.

The great advantage of Babel for French is that it will handle some elements of
French typography for you, especially non-breaking spaces before all two-part
punctuation marks. So now you can write:

The non-breaking space before the euro symbol is still necessary because
currency symbols and other units are not supported in general (that's not
specific to French).

You can use the numprint package along with Babel. It will let you print numbers
the French way.

You will also notice that the layout of lists changes when switching to the
French language. This is customizable using the \frenchbsetup command. For more
information on what the frenchb option of babel does and how you can customize
its behavior, run LaTeX on the file frenchb.dtx and read the produced file
frenchb.pdf or frenchb.dvi. You can get the PDF version on CTAN.

4.8 German

You can load German language support using either one of the two following
commands.

For traditional (old) German orthography use \usepackage[german]{babel}, or for
reform (new) German orthography use \usepackage[ngerman]{babel}.

This enables German hyphenation, if you have configured your LaTeX system
accordingly. It also changes all automatic text into German, e.g. Chapter
becomes Kapitel. A set of new commands also becomes available, which allows you
to write German input files more quickly even when you don't use the inputenc
package. Check out the table below for inspiration. With inputenc, all this
becomes moot, but your text also is locked in a particular encoding world.

In German books you sometimes find French quotation marks (guillemets). German
typesetters, however, use them differently. A quote in a German book would look
like »this«. In the German-speaking part of Switzerland, typesetters use
«guillemets» the same way the French do.

A major problem arises from the use of commands like \flq: if you use the OT1
font encoding (which is the default) the guillemets will look like the math
symbol "≪", which turns a typesetter's stomach. T1 encoded fonts, on the other
hand, do contain the required symbols. So if you are using this type of quote,
make sure you use the T1 encoding.
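
The German setup described in this section can be sketched as a complete
document. This is only an illustration under stated assumptions: it presumes
pdfLaTeX and a UTF-8 source file, and the article class and the sample sentences
are made up for the example. It combines the ngerman option with the T1 font
encoding so that guillemets come out as real glyphs.

```latex
% Minimal sketch; assumes pdfLaTeX and a UTF-8 encoded source file.
\documentclass{article}
\usepackage[T1]{fontenc}     % T1 fonts contain real guillemet glyphs
\usepackage[utf8]{inputenc}  % already the default in recent LaTeX releases
\usepackage[ngerman]{babel}  % reformed orthography; german selects the old one

\begin{document}
\section{Anführungszeichen}

% \glqq/\grqq and \frqq/\flqq are quote commands from babel's German support.
Deutsche Anführungszeichen: \glqq so\grqq{} oder mit Guillemets \frqq so\flqq.

% Babel also translates automatic text, so \today prints a German date.
Heute ist der \today.
\end{document}
```

Swapping ngerman for german should be the only change needed for the traditional
orthography; the rest of the preamble stays the same.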

Decimal numbers usually have to be written like 0{,}5 (not just 0,5). Packages
like ziffer enable input like 0,5. Alternatively, one can use the \num command
from the siunitx package and (globally) set the decimal marker using
\sisetup{locale = DE}.

4.9 Greek

This is the preamble you need to write in the Greek language. Note the
particular input encoding.

This preamble enables hyphenation and changes all automatic text to Greek. A set
of new commands also becomes available, which allows you to write Greek input
files more easily. In order to temporarily switch to English and vice versa, one
can use the commands \textlatin{english text} and \textgreek{greek text}, which
both take one argument that is then typeset using the requested font encoding.
Otherwise you can use the command \selectlanguage{...} described in a previous
section. Use \euro for the Euro symbol.

4.10 Hungarian

Use the following lines:

More information is available in Hungarian.

4.11 Icelandic and Faroese

The following lines can be added to write Icelandic text:

This changes text like Part into Hluti. It makes additional commands available:

To make the special Icelandic characters available, just add:

The default LaTeX font encoding is OT1, but it contains only 128 characters. The
T1 encoding contains letters and punctuation characters for most of the European
languages using Latin script.

4.12 Italian

Italian is well supported by LaTeX. Just add \usepackage[italian]{babel} at the
beginning of your document and the output of all the commands will be translated
properly.

4.13 Japanese

There is a variant of TeX intended for Japanese named pTeX, which supports
vertical typesetting.

Another possible way to write in Japanese is to use LuaLaTeX and the luatex-ja
package. Adapted example from the Luatexja documentation:

You can also use capabilities provided by the fontspec package and those
provided by luatexja-fontspec to declare the font you want to use in your paper.
Let us take an example:

Use UTF-8 as your encoding. In case you don't know how to do this, take a look
at Texmaker, a LaTeX editor which uses UTF-8 by default.

Another (but old) possible Japanese support is made available thanks to the CJK
package collection. If you are using a package manager or a portage tree, the
CJK collection is usually in a separate package because of its size (mainly due
to fonts).

Make sure your document is saved using the UTF-8 character encoding. See Special
Characters for more details.

Put the parts where you want to write Japanese characters in a CJK environment.

The last argument specifies the font. It must fit the desired language, since
fonts are different for Chinese, Japanese and Korean. min is an example for
Japanese.

4.14 Korean

The two most widely used encodings for Korean text files are EUC-KR and its
upward compatible extension used in Korean MS-Windows,
CP949/Windows-949/UHC. In these encodings each US-ASCII character represents its
normal ASCII character, similar to other ASCII compatible encodings such as
ISO-8859-x, EUC-JP, Big5, or Shift_JIS. On the other hand, Hangul syllables,
Hanjas (Chinese characters as used in Korea), Hangul Jamos, Hiraganas,
Katakanas, Greek and Cyrillic characters and other symbols and letters drawn
from KS X 1001 are represented by two consecutive octets. The first has its MSB
set. Until the mid-1990s, it took a considerable amount of time and effort to
set up a Korean-capable environment under a non-localized (non-Korean) operating
system. You can skim through the now much-outdated http://jshin.net/faq to get a
glimpse of what it was like to use Korean under a non-Korean OS in the
mid-1990s.

TeX and LaTeX were originally written for scripts with no more than 256
characters in their alphabet. To make them work for languages with considerably
more characters such as Korean or Chinese, a subfont mechanism was developed. It
divides a single CJK font with thousands or tens of thousands of glyphs into a
set of subfonts with 256 glyphs each.

For Korean, there are three widely used packages.

HLaTeX by UN Koaunghi

hLaTeXp by CHA Jaechoon

the CJK package by Werner Lemberg

HLaTeX and hLaTeXp are specific to Korean and provide Korean localization on top
of the font support. They
both can process Korean input text files encoded in EUC-KR. HLaTeX can even
process input files encoded in CP949/Windows-949/UHC and UTF-8 when used along
with Λ, Ω.

The CJK package is not specific to Korean. It can process input files in UTF-8
as well as in various CJK encodings including EUC-KR and CP949/Windows-949/UHC,
and it can be used to typeset documents with multilingual content (especially
Chinese, Japanese and Korean). The CJK package has no Korean localization such
as the one offered by HLaTeX and it does not come with as many special Korean
fonts as HLaTeX.

The ultimate purpose of using typesetting programs like TeX and LaTeX is to get
documents typeset in an aesthetically satisfying way. Arguably the most
important element in typesetting is a set of well-designed fonts. The HLaTeX
distribution includes UHC PostScript fonts of 10 different families and
Munhwabu fonts (TrueType) of 5 different families. The CJK package works with a
set of fonts used by earlier versions of HLaTeX and it can use Bitstream's
cyberbit TrueType font.

To use the HLaTeX package for typesetting your Korean text, put the following
declaration into the preamble of your document:

This command turns the Korean localization on. The headings of chapters,
sections, subsections, table of contents and table of figures are all translated
into Korean and the formatting of the document is changed to follow Korean
conventions. The package also provides automatic particle selection. In Korean,
there are pairs of post-fix particles that are grammatically equivalent but
different in form. Which of any given pair is correct depends on whether the
preceding syllable ends with a vowel or a consonant. (It is a bit more complex
than this, but this should give you a good picture.) Native Korean speakers have
no problem picking the right particle, but it cannot be determined in advance
which particle to use for references and other automatic text that will change
while you edit the document. It takes a painstaking effort to place appropriate
particles manually every time you add or remove references or simply shuffle
parts of your document around. HLaTeX relieves its users from this boring and
error-prone process.

In case you don't need Korean localization features but just want to typeset
Korean text, you can put the following line in the preamble instead.

For more details on typesetting Korean with HLaTeX, refer to the HLaTeX Guide.
Check out the web site of the Korean TeX User Group (KTUG).

In the FAQ section of KTUG it is recommended to use the kotex package.

4.15 Persian script

For the Persian language, there is a dedicated package called XePersian which
uses XeLaTeX as the typesetting engine. Just add the following code to your
preamble:

See XePersian page on CTAN.

Moreover, Arabic script can be used to type Persian as illustrated in the
corresponding section.

4.16 Polish

If you plan to use Polish in your UTF-8 encoded document, use the following code

The above code merely allows you to use Polish letters and translates the
automatic text to Polish, so that chapter becomes rozdział. There are a few
additional things one must remember.

4.16.1 Connectives

Polish has many single-letter connectives: a, o, w, i, u, z, etc. Grammar and
typography rules don't allow them to end a printed line. To ensure that LaTeX
won't set them as the last letter in the line, you have to use a non-breakable
space:

4.16.2 Numerals

According to Polish grammar rules, you have to put dots after numerals in
chapter, section, subsection, etc. headers.

This is achieved by redefining a few LaTeX macros.

For books:

For articles:

Alternatively you can use dedicated document classes:

the mwart class instead of article,

mwbk instead of book

and mwrep instead of report.

Those classes have many more European typography settings but do not require the
use of Polish babel settings or character encoding.

Simple usage:

Full documentation for those classes is available at
http://web.archive.org/web/20040609034031/http://www.ci.pwr.wroc.pl/~pmazur/LaTeX/mwclsdoc.pdf
(Polish).
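
As a rough sketch of the macro route for articles, the two Polish conventions
described so far (a dot after sectioning numbers, and a non-breakable space
after single-letter connectives) might be combined as follows. The
redefinitions are an assumption made for illustration rather than a recovered
listing, the sample sentences are invented, and pdfLaTeX with a UTF-8 source
file is presumed.

```latex
% Sketch only: assumes pdfLaTeX, a UTF-8 source file, and the article class.
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[polish]{babel}

% Append a dot to section and subsection numbers (assumed redefinitions).
\renewcommand{\thesection}{\arabic{section}.}
\renewcommand{\thesubsection}{\thesection\arabic{subsection}.}

\begin{document}
\section{Wstęp}
% The tie (~) keeps the one-letter connective attached to the next word.
Tekst z~jednoliterowym spójnikiem, który nie zostanie na końcu wiersza.

\subsection{Szczegóły}
To jest przykład w~języku polskim.
\end{document}
```

One side effect to keep in mind: because the dot becomes part of \thesection,
cross-references produced with \ref carry the trailing dot as well. The mwart,
mwbk and mwrep classes avoid the redefinitions altogether.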

4.16.3 Indentation

It may be customary (depending on the publisher) to indent the first paragraph
in sections and chapters:

4.16.4 Hyphenation and typography

Hyphenating a word across a page break is frowned upon much more than in
American typesetting.

To adjust penalties for hyphenation spanning pages, use this command:

To adjust penalties for leaving widows and orphans (clubs in TeX nomenclature)
use those commands:

4.16.5 Commas in math

According to some typography rules, fractional parts of numbers should be
delimited by a comma, not a dot. To make LaTeX not insert additional space in
math mode after a comma (unless there is a space after the comma), use the
icomma package.

Unfortunately, it is partially incompatible with the dcolumn package. One needs
to either use dots in columns with numerical data in the source file and make
dcolumn switch them to commas for display, or define the column as follows:

The alternative is to use the numprint package, but it is much less convenient.

Another alternative is the siunitx package, which lets you typeset numbers and
their units consistently. Number alignment in tables and different output modes
are supported.

4.16.6 Further information

Refer to the Słownik Ortograficzny (in Polish) for additional information on
Polish grammar and typography rules.

A good extract is available at Zasady Typograficzne Składania Tekstu (in
Polish).

4.17 Portuguese

Add the following code to your preamble:

You can substitute the language for Brazilian Portuguese by choosing brazilian
or brazil.

4.18 Slovak

Basic settings are fine when left the same as Czech, but Slovak needs special
signs for 'ď', 'ť', 'ľ'. To be able to type them from the keyboard use the
following settings:

4.19 Spanish

Include the appropriate Babel option:

The trick is that Spanish has several options and commands to control the
layout. The options may be loaded either at the call to Babel, or before, by
defining the command \spanishoptions. Therefore, the following commands are
roughly equivalent:

On average, the former syntax should be preferred, as the latter is a deviation
from standard Babel behavior, and thus may break other programs (LyX, latex2rtf)
interacting with LaTeX.

Spanish also defines shorthands for the dot and << >> so that they are used as
logical markup: the former is used as decimal marker in math mode, and the
output is typically either a comma or a dot; the latter is used for quoted text,
and the output is typically either «» or “”. This allows different typographical
conventions with the same input, as preferences may differ considerably between,
say, Spain and Mexico.

Two particularly useful options are es-noquoting and es-nolists: some packages
and classes are known to collide with Spanish in the way they handle active
characters, and these options disable the internal workings of Spanish to allow
you to overcome these common pitfalls. Moreover, these options may simplify the
way LyX customizes some features of the Spanish layout from inside the GUI.

The options mexico and mexico-com provide support for local custom in Mexico:
the former using the decimal dot, as customary, and the latter allowing the
decimal comma, as required by the Mexican Official Norm (NOM) of the Department
of Economy for labels in foods and goods. More localizations are in the making.

The other commands modify the Spanish layout after loading Babel. Two
particularly useful commands are \spanishoperators and \spanishdeactivate.

The macro \spanishoperators{<list of operators>} contains a list of Spanish
mathematical operators, and may be redefined at will. For instance, the command
\def\spanishoperators{sen} only defines sen, overriding all other definitions;
the command \let\spanishoperators\relax disables them all. This command supports
accented or spaced operators: the \acute{<letter>} command puts an accent, and
the \, command adds a small space. For instance, the following operators are
defined by default.

Finally, the macro \spanishdeactivate{<list of characters>} disables some active
characters, to keep you out of trouble if they are redefined by other packages.
The candidates for deactivation are the set {<>."'}. Please beware that some
options preempt the availability of some active characters. In particular, you
should not combine the es-noquoting option with \spanishdeactivate{<>}, or the
es-noshorthands option with \spanishdeactivate{<>."}.

Please check the documentation for Babel or spanish.dtx for further details.
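
The two roughly equivalent ways of passing Spanish options might look like the
sketch below. It is an illustration only: mexico is picked arbitrarily from the
options named above, the article class and the sample sentence are invented, and
pdfLaTeX with a UTF-8 source file is presumed.

```latex
% Sketch only: assumes pdfLaTeX and a UTF-8 encoded source file.
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}

% Variant 1 (the form the text prefers): options in the call to babel.
\usepackage[spanish,mexico]{babel}

% Variant 2: define \spanishoptions before loading babel, i.e. replace the
% \usepackage line above with the following two lines:
%   \def\spanishoptions{mexico}
%   \usepackage[spanish]{babel}

\begin{document}
% In math mode the dot acts as the logical decimal marker.
El número $\pi$ vale aproximadamente $3.1416$.
\end{document}
```

Replacing mexico with mexico-com should, per the description above, switch the
decimal marker to a comma without touching the input.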

4.20 Tibetan

One option to use Tibetan script in LaTeX is to add
to your preamble and use a slightly modified Wylie transliteration for input.
Refer to the excellent package documentation for details. More information can
be found on

5 References
[1] The Not So Short Introduction to LaTeX, 2.5.6 Support for Cyrillic, Maksym
    Polyakov

[2] The Not So Short Introduction to LaTeX, Bulgarian translation

[3] babel-french documentation: the French language should now be loaded as
    french, not as frenchb or francais and preferably as a global option of
    \documentclass. Some tolerance still exists in v3.0, but do not rely on it.

6 Text and image sources, contributors, and licenses


6.1 Text
LaTeX/Internationalization Source: https://en.wikibooks.org/wiki/LaTeX/Internationalization?oldid=3179699
Contributors: Derbeth, Jomegat, Alejo2083, Mwtoews, BiT, Hroobjartr, Kovianyo,
Skarakoleva, Pi zero, Eudoxos~enwikibooks, Louabill, Louisix, ChrisHodgesUK,
Drevicko, Piksi, Adrignola, Chaojoker, Mijikenda, Ambrevar, Tomato86, Jlrn,
Harrikoo, Alzahrawi, Gelbukh, ILubeMyCucumbers20, RealSebix, Saippuakauppias,
SamuelLB, Lobaluna, Waylesange, Rotlink, Abalenkm, Hdankowski, Johannes Bo,
RTPK, Panoramedia and Anonymous: 67

6.2 Images
File:LaTeX_logo.svg Source: https://upload.wikimedia.org/wikipedia/commons/9/92/LaTeX_logo.svg
License: Public domain Contributors: ? Original artist: ?

File:Nuvola_apps_important_yellow.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/dc/Nuvola_apps_important_yellow.svg
License: LGPL Contributors: An icon from gnome-themes-extras-0.9.0.tar.bz2
(specifically Nuvola/icons/scalable/emblems/emblem-important.svg) by David
Vignoni. Original artist: Modified to look more like the PNG file by Bastique.
Recolored by amurai.

6.3 Content license


Creative Commons Attribution-Share Alike 3.0
