

"Ion Creangă" Pedagogical State University Chișinău

Faculty of Foreign Languages and Literatures
Specialty: Applied Modern Languages (English)

History of the Indo-European group of
languages, the criteria of their classification
into branches (subgroups)

Made by: Arhip Cristina

301 A group

Chișinău 2018

1. Introduction
2. Purpose and Objectives
3. Abbreviations
4. History of Indo-European Linguistics
5. Proto-Indo-European Language
6. Expansion
7. Criteria of Classification
8. Branches of Indo-European Languages
9. Present Distribution
10. Conclusion
11. Bibliography


Languages evolve over time. Initially, a language diverges into varying dialects, which
are mutually intelligible (e.g. American English and British English). Eventually, dialects become
distinct languages, which are not mutually intelligible (e.g. French and Spanish).
Languages can therefore be organized into family trees. French and Spanish, for instance,
both evolved from Latin; in this instance, Latin is the parent language, while French and Spanish
are both child languages of Latin. The oldest ancestor of a language family (i.e. the language at
the very top of the family tree) is known as the family's proto-language.
Because the chief reason for grouping the Indo-European languages together is that they
share a number of items of basic vocabulary, the main purpose of this report is to investigate the
expansion and historical evolution of Indo-European languages and their criteria of classification
into different branches.
In order to achieve the main purposes of this work the following objectives have been set:
1. to analyze the historical evolution of Indo-European linguistics;
2. to select examples of similar words from different Indo-European languages;
3. to compare the branches of the Indo-European family;
4. to describe the establishment of the family;
5. to determine the main characteristics of the Proto-Indo-European language.
PIE – Proto-Indo-European language;
BC – Before Christ;
CE – Common/Current Era;
B.C.E – Before Common/Christian Era;
Most European languages belong to the Indo-European language family. The proto-
language of this family (known as "Proto-Indo-European" or simply "Indo-European") emerged
in far eastern Europe, from where it spread westward across Europe and eastward into Asia. This
great Indo-European expansion occurred primarily during the period ca. 2000-1000 BC.
The Indo-European languages are a language family of several hundred related languages
and dialects. There are about 445 living Indo-European languages, according to the estimate by
Ethnologue (an annual reference publication in print and online that provides statistics and other
information on the living languages of the world), with over two thirds (313) of them belonging
to the Indo-Iranian branch. The most widely spoken Indo-European languages by native speakers
are Spanish, English, Hindustani (Hindi-Urdu), Portuguese, Bengali, Punjabi, Russian, each with
over 100 million speakers, with German, French, Italian, and Persian also having significant
numbers. Today, about 46% of the human population speaks an Indo-European language as a first
language, by far the highest of any language family.
The Indo-European family includes most of the modern languages of Europe; exceptions
include Hungarian, Finnish, Estonian, several minor Uralic languages, Turkish (a Turkic
language), Basque (a language isolate), and Maltese (a Semitic language). The Indo-European
family is also represented in Asia with the exception of East and Southeast Asia. It was
predominant in ancient Anatolia (present-day Turkey), the ancient Tarim Basin (present-day
Northwest China) and most of Central Asia until the medieval Turkic and Mongol invasions. With
written evidence appearing since the Bronze Age in the form of the Anatolian languages and
Mycenaean Greek, the Indo-European family is significant to the field of historical linguistics as
possessing the second-longest recorded history, after the Afroasiatic family, although certain
language isolates, such as Sumerian, Elamite, Hurrian, Hattian, and Kassite are recorded earlier.
All Indo-European languages are descendants of a single prehistoric language,
reconstructed as Proto-Indo-European, spoken sometime in the Neolithic era. Although no written
records remain, aspects of the culture and religion of the Proto-Indo-European people can also be
reconstructed from the related cultures of ancient and modern Indo-European speakers who
continue to live in the areas to which the Proto-Indo-Europeans migrated from their original homeland.
Several disputed proposals link Indo-European to other major language families.
In ancient times it was noticed that some languages presented striking similarities: Greek
and Latin are a well-known example. During classical antiquity it was noted, for example, that
Greek héks “six” and heptá “seven” were similar to the Latin sex and septem. Furthermore, the
regular correspondence of the initial h- in Greek to the initial s- in Latin was pointed out.
The explanation that the ancients came up with was that the Latin language was a
descendant of the Greek language. Centuries later, during and after the Renaissance, the close
similarities between more languages were also noted, and it was understood that certain groups of
languages were related, such as Icelandic and English, and also the Romance languages. Despite
all of these observations, the science of linguistics did not develop much further until the 18th
century CE.
Although ancient Greek and Roman grammarians noticed similarities between their
languages, as well as with surrounding Celtic and Germanic speakers, the sheer ubiquity of Indo-
European languages around them led them to the assumption that all human languages were
related. This assumption would continue among many grammarians into the early 19th century,
the grammatical similarities among Indo-European languages sometimes being seen as evidence
of the Tower of Babel until the establishment of the study of Indo-European linguistics proper and
the study of non-Indo-European language families.
In the 16th century, European visitors to the Indian subcontinent began to notice
similarities among Indo-Aryan, Iranian, and European languages. In 1583, English Jesuit
missionary and Konkani scholar Thomas Stephens wrote a letter from Goa to his brother (not
published until the 20th century) in which he noted similarities between Indian languages and
Greek and Latin. Another account was made by Filippo Sassetti, a merchant born in Florence in
1540, who travelled to the Indian subcontinent. Writing in 1585, he noted some word similarities
between Sanskrit and Italian (these included devaḥ/dio "God", sarpaḥ/serpe "serpent", sapta/sette
"seven", aṣṭa/otto "eight", and nava/nove "nine"). However, neither Stephens' nor Sassetti's
observations led to further scholarly inquiry.
In 1647, Dutch linguist and scholar Marcus Zuerius van Boxhorn noted the similarity
among certain Asian and European languages and theorized that they were derived from a
primitive common language which he called Scythian. He included in his hypothesis Dutch,
Albanian, Greek, Latin, Persian, and German, later adding Slavic, Celtic, and Baltic languages.
However, Van Boxhorn's suggestions did not become widely known and did not stimulate further
research.
Ottoman Turkish traveler Evliya Çelebi visited Vienna in 1665–1666 as part of a
diplomatic mission and noted a few similarities between words in German and in Persian. Gaston
Coeurdoux and others made observations of the same type. Coeurdoux made a thorough
comparison of Sanskrit, Latin and Greek conjugations in the late 1760s to suggest a relationship
among them. Meanwhile, Mikhail Lomonosov compared different language groups, including
Slavic, Baltic ("Kurlandic"), Iranian ("Medic"), Finnish, Chinese, "Hottentot", and others, noting
that related languages (including Latin, Greek, German and Russian) must have separated in
antiquity from common ancestors.
The hypothesis reappeared in 1786 when Sir William Jones first lectured on the striking
similarities among three of the oldest languages known in his time: Latin, Greek, and Sanskrit, to
which he tentatively added Gothic, Celtic, and Persian, though his classification contained some
inaccuracies and omissions. Jones, a British orientalist and jurist, had become familiar with the
Sanskrit language during the British colonial expansion into India. He was also
knowledgeable in Greek and Latin and was surprised by the similarities between these three
languages. In his lecture of February 2, 1786 CE, he expressed his new ideas:
“The Sanskrit language, whatever be its antiquity, is of a wonderful structure; more perfect
than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing
to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could
possibly have been produced by accident; so strong indeed, that no philologer could examine them
all three, without believing them to have sprung from some common source, which, perhaps, no
longer exists; there is a similar reason, though not quite so forcible, for supposing that both the
Gothic and the Celtic, though blended with a very different idiom, had the same origin with the
Sanskrit; and the old Persian might be added to the same family, if this were the place for
discussing any question concerning the antiquity of Persia.”
The idea that Greek, Latin, Sanskrit, and Persian were derived from a common source
was revolutionary at that time. This was a turning point in the history of linguistics. Rather than
the “daughter” of Greek, Latin was for the first time understood as the “sister” of Greek. By
becoming familiar with Sanskrit, a language geographically far removed from Greek and Latin,
and realizing that chance was an insufficient explanation for the similarities between these
languages, Sir William Jones presented a new insight which triggered the development of modern
linguistics.
Nineteenth-century linguists firmly established the connections that Jones had elucidated
and broadened the family to include Slavic, Baltic, and other language groups. Thomas Young
first used the term Indo-European in 1813, deriving from the geographical extremes of the
language family: from Western Europe to North India. A synonym is Indo-Germanic (Idg. or
IdG.), specifying the family's southeasternmost and northwesternmost branches. This first
appeared in French (indo-germanique) in 1810 in the work of Conrad Malte-Brun; in most
languages this term is now dated or less common than Indo-European, although in German
indogermanisch remains the standard scientific term. A number of other synonymous terms have
also been used.
Franz Bopp wrote in 1816 On the conjugational system of the Sanskrit language
compared with that of Greek, Latin, Persian and Germanic (in which the relation of these five
languages was demonstrated on the basis of a detailed comparison of verb morphology (structure)),
and between 1833 and 1852 he wrote Comparative Grammar (the first full comparative grammar
of the major Indo-European languages). This marks the beginning of Indo-European studies as an
academic discipline. The classical phase of Indo-European comparative linguistics leads from this
work to August Schleicher's 1861 Compendium and up to Karl Brugmann's Grundriss,
published in the 1880s. Brugmann's neogrammarian reevaluation of the field and Ferdinand de
Saussure's development of the laryngeal theory may be considered the beginning of "modern"
Indo-European studies.
The generation of Indo-Europeanists active in the last third of the 20th century (such as
Calvert Watkins, Jochem Schindler, and Helmut Rix) developed a better understanding of
morphology and of ablaut in the wake of the 1956 Apophony in Indo-European by Jerzy Kuryłowicz,
who in 1927 had pointed out the existence of the Hittite consonant ḫ. Kuryłowicz's discovery supported
Ferdinand de Saussure's 1879 proposal of the existence of coefficients sonantiques, elements de
Saussure reconstructed to account for vowel length alternations in Indo-European languages. This
led to the so-called laryngeal theory, a major step forward in Indo-European linguistics and a
confirmation of de Saussure's theory.
Proto-Indo-European (PIE) is the linguistic reconstruction of the hypothetical common
ancestor of the Indo-European languages, the most widely spoken language family in the world.
Far more work has gone into reconstructing PIE than any other proto-language, and it is by far the
best understood of all proto-languages of its age. The vast majority of linguistic work during the
19th century was devoted to the reconstruction of PIE or its daughter proto-languages (e.g. Proto-
Germanic), and most of the modern techniques of linguistic reconstruction such as the comparative
method were developed as a result. These methods supply all of the knowledge concerning PIE
since there is no written record of the language.
PIE is estimated to have been spoken as a single language from 4,500 B.C.E. to 2,500
B.C.E. during the Neolithic Age, though estimates vary by more than a thousand years. According
to the prevailing Kurgan hypothesis, the original homeland of the Proto-Indo-Europeans may have
been in the Pontic–Caspian steppe of Eastern Europe. The linguistic reconstruction of PIE has also
provided insight into the culture and religion of its speakers. As Proto-Indo-Europeans became
isolated from each other through the Indo-European migrations, the dialects of PIE spoken by the
various groups diverged by undergoing certain sound laws and shifts in morphology to transform
into the known ancient and modern Indo-European languages.
PIE had an elaborate system of morphology that included inflectional suffixes as well as
ablaut (vowel alterations, for example, as preserved in English sing, sang, sung) and accent. PIE
nominals and pronouns had a complex system of declension, and verbs similarly had a complex
system of conjugation. The PIE phonology, particles, numerals, and copula are also well-
reconstructed. Today, the most widely-spoken daughter languages of PIE are Spanish, English,
Hindustani (Hindi and Urdu), Portuguese, Bengali, Russian, Punjabi, German, Persian, French,
Italian and Marathi.
The cradle of the Indo-Europeans may never be known, but an ongoing scholarly debate
about the original homeland of Proto-Indo-European (PIE) may some day shed light on the
ancestor of all Indo-European languages as well as the people who spoke it. There are two
schools of thought:
1. Some scholars (e.g., Marija Gimbutas) propose that PIE originated in the steppes north of
the Black and Caspian Seas (the Kurgan hypothesis). Kurgan is a Russian word of Turkic
origin for a type of burial mound over a burial chamber. The Kurgan hypothesis combines
archaeology with linguistics to trace the diffusion of kurgans from the steppes into
southeastern Europe, providing support for the existence of a Kurgan culture that reflected
an early presence of Indo-European people in the steppes and southeastern Europe from
the 5th to the 3rd millennium BC.
2. Other scholars (e.g., Gamkrelidze and Ivanov) suggest that PIE originated around 7,000
BC in Anatolia, a stretch of land that lies between the Black and Mediterranean seas. It lies
across the Aegean Sea to the east of Greece and is thus usually known by its Greek name
Anatolia (Asia Minor). Today, Anatolia is the Asian part of modern Turkey.
When there is evidence for the languages spoken in Europe at the end of the prehistoric
period, it is clear that with few exceptions, such as Basque or Etruscan, they belonged to the Indo-
European language group, which also extended to India and Central Asia. This raises the question
of when these languages, or their ancestral prototype, were first spoken in Europe. One theory
links these languages with a particular population of Indo-Europeans and explains the expansion
of the languages as the result of invasion or immigration; their origin is sought in the east, perhaps
in the area north of the Black and Caspian seas. The invasion is associated with the new patterns
of settlement, economy, material culture, burial, and social organization seen about 3000 BCE.
These innovations, however, may be better attributed to internal developments.
Around 4000 B.C.E., a new population, the Indo-Europeans, began to spread from their
homeland in the Caucasus. Over the next few thousand years, they spread and eventually came to
dominate all of Europe and much of Asia. This is a controversial theory, and some anthropologists
dispute its accuracy.
An alternative explanation for the origin of Indo-European languages associates it with
the immigration of the first farmers from Anatolia at the beginning of the Neolithic Period, but the
spread of farming does not seem to have been a uniform process or to have been achieved
everywhere by population migration. There is, however, no single archaeological pattern that
might correspond to a migration on an appropriate geographic scale throughout Europe, and all
these explanations raise fundamental questions about the development, spread, and adoption of
languages, the relationship of language to ethnic groups, and the correspondence of
archaeologically recognizable patterns of material culture to either language or ethnicity.
Sometime between 3500 BC and 2500 BC, the Indo-Europeans began to fan out across
Europe and Asia, in search of new pastures and hunting grounds, and their languages developed -
and diverged - in isolation. By around 1000 BC, the original Indo-European language had split
into a dozen or more major language groups or families, the main groups being: Hellenic, Italic,
Indo-Iranian, Celtic, Germanic, Armenian, Balto-Slavic, Albanian. In addition, several more
groups (including Anatolian, Tocharian, Phrygian, Thracian, Illyrian, etc.) have since died out
completely, and yet others may have existed which have not even left a trace.
These broad language groups in turn divided over time into scores of new languages, from
Swedish to Portuguese to Hindi to Latin to Frisian. So, it is astounding but true that languages as
diverse as Gaelic, Greek, Farsi and Sinhalese all ultimately derive from the same origin. The
common ancestry of these diverse languages can sometimes be seen quite clearly in the existence
of cognates (similar words in different languages), and the recognition of this common ancestry of
Indo-European languages is usually attributed to the amateur linguist Sir William Jones in 1786.
Examples are:
1. father in English, Vater in German, pater in Latin and Greek, faðir in Old Norse and pitr
in ancient Vedic Sanskrit.
2. brother in English, broeder in Dutch, Bruder in German, braithair in Gaelic, bróðr in Old
Norse and bhratar in Sanskrit.
3. three in English, tres in Latin, tris in Greek, drei in German, drie in Dutch, trí in Sanskrit.
4. is in English, is in Dutch, est in Latin, esti in Greek, ist in Gothic, asti in Sanskrit.
5. me in English, mich or mir in German, mij in Dutch, mik or mis in Gothic, me in Latin, eme
in Greek, mam in Sanskrit.
6. mouse in English, Maus in German, muis in Dutch, mus in Latin, mus in Sanskrit.
Membership of languages in the Indo-European language family is determined by
genealogical relationships, meaning that all members are presumed descendants of a common
ancestor, Proto-Indo-European. Membership in the various branches, groups and subgroups of
Indo-European is also genealogical, but here the defining factors are shared innovations among
various languages, suggesting a common ancestor that split off from other Indo-European groups.
For example, what makes the Germanic languages a branch of Indo-European is that much of their
structure and phonology can be stated in rules that apply to all of them. Many of their common
features are presumed innovations that took place in Proto-Germanic, the source of all the
Germanic languages.
Shared characteristics (fig 7)
The chief reason for grouping the Indo-European languages together is that they share a
number of items of basic vocabulary, including grammatical affixes, whose shapes in the different
languages can be related to one another by statable phonetic rules. Especially important are the
shared patterns of alternation of sounds. Thus, the agreement of Sanskrit ás-ti, Latin es-t, and
Gothic is-t, all meaning ‘is,’ is greatly strengthened by the identical reduction of the root to s- in
the plural in all three languages: Sanskrit s-ánti, Latin s-unt, Gothic s-ind ‘they are.’ Agreements
in pure structure, totally divorced from phonetic substance, are, at best, of dubious value in proving
membership in the Indo-European family.
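The alternation argument above can be sketched as a small check. This is only an illustration, not a method used by the sources: the forms and their hyphenation are the ones quoted in the text, while the names FORMS, root, shows_reduction and RESULTS are hypothetical and introduced here for the example.

```python
# Forms quoted in the text; the hyphen marks the root/ending boundary
# exactly as segmented there.
FORMS = {
    "Sanskrit": {"is": "ás-ti", "they are": "s-ánti"},
    "Latin": {"is": "es-t", "they are": "s-unt"},
    "Gothic": {"is": "is-t", "they are": "s-ind"},
}


def root(form: str) -> str:
    """Return the part of the form before the hyphen (the root)."""
    return form.split("-", 1)[0]


def shows_reduction(paradigm: dict) -> bool:
    """True if the singular root ends in s but is longer than bare s,
    while the plural root is reduced to bare s."""
    full = root(paradigm["is"])
    reduced = root(paradigm["they are"])
    return full.endswith("s") and full != "s" and reduced == "s"


# One boolean per language: does it show the shared alternation?
RESULTS = {lang: shows_reduction(p) for lang, p in FORMS.items()}
```

All three languages pass the same test, which is the point of the argument: the agreement lies in a shared pattern of alternation, not only in the resemblance of single words.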
The Indo-European languages fall into two general branches. At some time in the distant
past, the original Indo-European speakers migrated westward and eastward from a location north
of the Middle East. We can trace those migrations by looking at vocabulary in each language, and
gradually seeing the sound changes that took place over time as the tribes drifted further apart.
The Indo-European tribes that migrated westward tended to pronounce words with hard
/k/ sounds (velar stops). On the other hand, those that migrated eastward pronounced similar words
with /s/ or /sh/ sounds (fricatives). Likewise, the westward travelers tended to have certain
vowel sounds transform into /e/ sounds while the eastward travelers tended to switch to /a/ sounds
over time, and the labio-velar stops in westward traveling tribes tended to turn into velar sounds.
Philologists have named the two branches Centum and Satem. Centum is the ancient
word for "one hundred" in Latin, a language in the western branch of Indo-European. Satem is the
ancient word for "one hundred" in Avestan, a language in the eastern branch of Indo-European.
The two words illustrate the major changes in a single word as the Indo-European tribes drifted in
two different general directions.
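The Centum/Satem criterion described above amounts to a simple rule on the initial sound of the word for "one hundred". The following is a minimal sketch under that simplified reading. Only the two words given in the text are used, and the names HUNDRED, classify and GROUPS are hypothetical.

```python
# Word for "one hundred" and its initial sound, as described in the text.
HUNDRED = {
    "Latin": ("centum", "k"),   # western branch: hard /k/, a velar stop
    "Avestan": ("satem", "s"),  # eastern branch: /s/, a fricative
}

VELAR_STOPS = {"k"}
FRICATIVES = {"s", "sh"}


def classify(initial_sound: str) -> str:
    """Assign a branch label from the initial sound of the 'hundred' word."""
    if initial_sound in VELAR_STOPS:
        return "Centum"
    if initial_sound in FRICATIVES:
        return "Satem"
    return "unclassified"


# Branch label for each language in the table above.
GROUPS = {lang: classify(sound) for lang, (_word, sound) in HUNDRED.items()}
```

A language whose word for "hundred" begins with any other sound is left unclassified here, because the text names no rule for such cases.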
There are more than 100 languages in the family, and they show different degrees of
relatedness that allow them to be classified into a dozen European and Asiatic branches:
a) Asiatic
• Anatolian: all of its members are now extinct, but in former times they were spoken in the
Anatolian peninsula (modern Turkey). They included Hittite, Luvian and Palaic, and their
descendants Lycian, Lydian and Carian.
• Armenian: represented only by Armenian.
• Iranian: includes Modern Persian, Kurdish, Pashto, Baluchi and several extinct
languages of Iran and Central Asia.
• Indo-Aryan: the many languages of South Asia, including Sanskrit and its modern
descendants like Hindi, Bengali, Punjabi, Gujarati, Marathi and others.
• Tocharian: comprises only two languages, Tocharian A and Tocharian B, both extinct,
recorded in Buddhist documents unearthed in some city-oases of the Silk Road in Xinjiang.
b) European
• Germanic: German, Yiddish, English, Dutch, Frisian and the Scandinavian languages
Swedish, Norwegian, Danish, Icelandic, and Faroese.
• Italic: Latin and its descendants, including Spanish, Portuguese, Catalan, Provençal,
French, Italian and Romanian.
• Baltic: Latvian and Lithuanian, besides the extinct Prussian.
• Slavic: Russian, Ukrainian, Polish, Czech, Bulgarian, and Serbo-Croat among others.
• Celtic: Irish, Scottish Gaelic, Welsh and Breton as well as several extinct continental
languages.
• Hellenic: Greek, ancient and modern.
• Albanian: represented only by the Albanian language.
The well-attested languages of the Indo-European family fall fairly neatly into the 10
main branches listed below; these are arranged according to the age of their oldest sizable texts.
1. Anatolian
This branch of languages was predominant in the Asian portion of Turkey and some areas
in northern Syria. The most famous of these languages is Hittite. In 1906 CE, a large number of
Hittite finds were made on the site of Hattusas, the capital of the Hittite Kingdom, where about
10,000 cuneiform tablets and various other fragments were found in the remains of a royal archive.
These texts date back to the mid to late second millennium BCE. Luvian, Palaic, Lycian, and
Lydian are other examples of languages belonging to this group. All languages of this branch are
currently extinct. This branch has the oldest surviving evidence of an Indo-European language,
dated about 1800 BCE.
2. Indo-Iranian
This branch includes two sub-branches: Indic and Iranian. Today these languages are
predominant in India, Pakistan, Iran, and its vicinity and also in areas from the Black Sea to
western China.
Sanskrit, which belongs to the Indic sub-branch, is the best known among the early
languages of this branch; its oldest variety, Vedic Sanskrit, is preserved in the Vedas, a collection
of hymns and other religious texts of ancient India. Indic speakers entered into the Indian
subcontinent, coming from central Asia around 1500 BCE: In the Rig-Veda, the hymn 1.131
speaks about a legendary journey that may be considered a distant memory of this migration.
Avestan is a language that forms part of the Iranian group. Old Avestan (sometimes called
Gathic Avestan), the language used in the early Zoroastrian religious texts, is the oldest preserved
language of the Iranian sub-branch and the “sister” of Sanskrit. Another important language of
the Iranian sub-branch is Old Persian, which is the language found in the royal inscriptions of the
Achaemenid dynasty, starting in the late 6th century BCE. The earliest datable evidence of this
branch dates back to about 1300 BCE.
Today, many Indic languages are spoken in India and Pakistan, such as Hindi-Urdu,
Punjabi, and Bengali. Iranian languages such as Farsi (modern Persian), Pashto, and Kurdish are
spoken in Iraq, Iran, Afghanistan, and Tajikistan.
3. Greek
Rather than a branch of languages, Greek is a group of dialects: During more than 3000
years of written history, Greek dialects never evolved into mutually incomprehensible languages.
Greek was predominant in the southern end of the Balkans, the Peloponnese peninsula, and the
Aegean Sea and its vicinity. The earliest surviving written evidence of a Greek language is
Mycenaean, the dialect of the Mycenaean civilization, mainly found on clay tablets and ceramic
vessels on the isle of Crete. Mycenaean did not have an alphabetic written system, rather it had a
syllabic script known as the Linear B script.
The first alphabetic inscriptions have been dated back to the early 8th century BCE, which
is probably the time when the Homeric epics, the Iliad and the Odyssey, reached their present
form. There were many Greek dialects in ancient times, but because of Athens' cultural supremacy
in the 5th century BCE, it was the Athenian dialect, called Attic, that became the standard
literary language during the Classical period (480-323 BCE). Therefore, the most famous Greek
poetry and prose written in Classical times were written in Attic: Aristophanes, Aristotle,
Euripides, and Plato are just a few examples of authors who wrote in Attic.
4. Italic
This branch was predominant in the Italian peninsula. The Italic people were not natives
of Italy; they entered Italy crossing the Alps around 1000 BCE and gradually moved southward.
Latin, the most famous language in this group, was originally a relatively small local language
spoken by pastoral tribes living in small agricultural settlements in the centre of the Italian
peninsula. The first inscriptions in Latin appeared in the 7th century BCE and by the 6th century
BCE it had spread significantly.
Rome was responsible for the growth of Latin in ancient times. Classical Latin is the form
of Latin used in the most famous works of Roman authors like Ovid, Cicero, Seneca, Pliny, and
Marcus Aurelius. Other languages of this branch are: Faliscan, Sabellic, Umbrian, South Picene,
and Oscan, all of them extinct. Today Romance languages are the only surviving descendants of
the Italic branch.
5. Celtic
This branch contains two sub-branches: Continental Celtic and Insular Celtic. By about
600 BCE, Celtic-speaking tribes had spread from what today are southern Germany, Austria, and
the western Czech Republic in almost all directions, to France, Belgium, Spain, and the British Isles;
then, by 400 BCE, they also moved southward into northern Italy and southeast into the Balkans
and even beyond. During the early 1st century BCE, Celtic-speaking tribes dominated a very
significant portion of Europe. In 50 BCE, Julius Caesar conquered Gaul (ancient France) and
Britain was also conquered about a century later by the emperor Claudius. As a result, this large
Celtic-speaking area was absorbed by Rome, Latin became the dominant language, and the
Continental Celtic languages eventually died out. The chief Continental language was Gaulish.
Insular Celtic developed in the British Isles after Celtic-speaking tribes entered around
the 6th century BCE. In Ireland, Insular Celtic flourished, aided by the geographical isolation
which kept Ireland relatively safe from the Roman and Anglo-Saxon invasions. The only Celtic
languages still spoken today (Irish Gaelic, Scottish Gaelic, Welsh and Breton) all come from
Insular Celtic.
6. Germanic
The Germanic branch is divided into three sub-branches: East Germanic, currently extinct;
North Germanic, containing Old Norse, the ancestor of all modern Scandinavian languages; and
West Germanic, containing Old English, Old Saxon, and Old High German.
The earliest evidence of Germanic-speaking people dates back to the first half of the 1st
millennium BCE, and they lived in an area stretching from southern Scandinavia to the coast of
the North Baltic Sea. During prehistoric times, the Germanic speaking tribes came into contact
with Finnic speakers in the north and also with Balto-Slavic tribes in the east. As a result of this
interaction, the Germanic language borrowed several terms from Finnish and Balto-Slavic.
Several varieties of Old Norse were spoken by most Vikings. Native Nordic pre-Christian
Germanic mythology and folklore have also been preserved in Old Norse, in a dialect named Old
Icelandic. Dutch, English, Frisian, and Yiddish are some examples of modern survivors of the
West Germanic sub-branch, while Danish, Faroese, Icelandic, Norwegian, and Swedish are
survivors of the North Germanic branch.
7. Armenian
The origins of the Armenian-speaking people are still an unresolved topic. It is probable that
the Armenians and the Phrygians belonged to the same migratory wave that entered Anatolia,
coming from the Balkans around the late 2nd millennium BCE. The Armenians settled in an area
around Lake Van, currently Turkey; this region belonged to the state of Urartu during the early 1st
millennium BCE. In the 8th century BCE, Urartu came under Assyrian control and in the 7th
century BCE, the Armenians took over the region. The Medes absorbed the region soon after and
Armenia became a vassal state. During the time of the Achaemenid Empire, the region turned into
a Persian satrapy. The Persian domination had a strong linguistic impact on Armenian, which
misled many scholars in the past into believing that Armenian actually belonged to the Iranian group.
8. Tocharian
The history of the Tocharian-speaking people is still surrounded by mystery. We know
that they lived in the Taklamakan Desert, located in western China. Most of the Tocharian texts
left are translations from well-known Buddhist works, and all of these texts have been dated
between the 6th and the 8th centuries CE. None of these texts speak about the Tocharians
themselves. Two different languages belong to this branch: Tocharian A and Tocharian B.
Remains of the Tocharian A language have only been found in places where Tocharian B
documents have also been found, which would suggest that Tocharian A was already extinct, kept
alive only as a religious or poetic language, while Tocharian B was the living language used for
administrative purposes.
Many well-preserved mummies with Caucasoid features, such as tall stature and red, blonde,
or brown hair, have been discovered in the Taklamakan Desert, dating from 1800 BCE to 200
CE. The weaving style and patterns of their clothes are similar to those of the Hallstatt culture
in central Europe. Physical analysis and genetic evidence have revealed resemblances with the inhabitants
of western Eurasia. This branch is completely extinct. Among all ancient Indo-European
languages, Tocharian was spoken farthest to the east.
9. Balto-Slavic
This branch contains two sub-branches: Baltic and Slavic. During the late Bronze Age,
the Balts' territory may have stretched from around western Poland all the way across to the Ural
Mountains. Afterwards, the Balts occupied a small region along the Baltic Sea. Those in the
northern part of the territory occupied by the Balts were in close contact with Finnic tribes, whose
language was not part of the Indo-European language family: Finnic speakers borrowed a
considerable number of Baltic words, which suggests that the Balts enjoyed significant cultural
prestige in that area. Under the pressure of Gothic and Slavic migrations, the territory of the Balts
shrank towards the 5th century CE.
Archaeological evidence shows that from 1500 BCE, either the Slavs or their ancestors
occupied an area stretching from near the western Polish borders towards the Dnieper River in
Belarus. During the 6th century CE, the Slav-speaking tribes expanded their territory, migrating
into Greece and the Balkans: this is when they are mentioned for the first time, in Byzantine
records referring to this large migration. Some or all of the Slavs must once have been located
further to the east, in or around Iranian territory, since many Iranian words were borrowed into
pre-Slavic at an early stage. Later on, as they moved westward, they came into contact with
Germanic tribes and again borrowed several additional terms.
Only two Baltic languages survive today: Latvian and Lithuanian. A large number of
Slavic languages survive today, such as Bulgarian, Czech, Croatian, Polish, Serbian, Slovak,
Russian, and many others.
10. Albanian
Albanian is the last branch of Indo-European languages to appear in written form. There
are two hypotheses on the origin of Albanian. The first one says that Albanian is a modern
descendant of Illyrian, a language which was widely spoken in the region during classical times.
Since we know very little about Illyrian, this assertion can be neither denied nor confirmed from
a linguistic standpoint. From a historical and geographical perspective, however, this assertion
makes sense. The other hypothesis says that Albanian is a descendant of Thracian, another lost
language that was spoken farther east than Illyrian. Today Albanian is spoken in Albania as the
official language, in several other areas of the former Yugoslavia and also in small enclaves in
southern Italy, Greece and the Republic of Macedonia.
In addition to the classical ten branches listed above, several extinct and little-known
languages and language-groups have existed:
1. Illyrian: possibly related to Albanian, Messapian, or both
2. Venetic: shares several similarities with Latin and the Italic languages, but also has some
affinities with other IE languages, especially Germanic and Celtic.
3. Liburnian: doubtful affiliation, features shared with Venetic, Illyrian, and Indo-Hittite,
significant transition of the Pre-Indo-European elements
4. Messapian: not conclusively deciphered
5. Phrygian: language of the ancient Phrygians
6. Paionian: extinct language once spoken north of Macedon
7. Thracian: possibly including Dacian
8. Dacian: possibly very close to Thracian
9. Ancient Macedonian: proposed relationship to Greek.
10. Ligurian: possibly close to or part of Celtic.
11. Sicel: an ancient language spoken by the Sicels (Greek Sikeloi, Latin Siculi), one of the
three indigenous (i.e. pre-Greek and pre-Punic) tribes of Sicily. Proposed relationship to
Latin or proto-Illyrian (Pre-Indo-European) at an earlier stage.
12. Lusitanian: possibly related to (or part of) Celtic, Ligurian, or Italic
13. Cimmerian: possibly Iranic, Thracian, or Celtic
Today, Indo-European languages are spoken by almost 3 billion native speakers across
all inhabited continents, the largest number by far for any recognised language family. Of the 20
languages with the largest numbers of native speakers according to Ethnologue, 11 are Indo-
European: Spanish, English, Hindustani, Portuguese, Bengali, Russian, Punjabi, German, French,
Marathi, accounting for over 1.7 billion native speakers. Additionally, hundreds of millions of
people worldwide study Indo-European languages as second or third languages, including
in cultures with completely different language families and historical backgrounds; there
are between 600 million and one billion L2 learners of English alone.
The success of the language family, measured by its large number of speakers and the vast
portions of the Earth that they inhabit, is due to several factors. The ancient Indo-European
migrations spread Indo-European culture throughout Eurasia, first through the
Proto-Indo-Europeans themselves and then through their daughter cultures, including the
Indo-Aryans, Iranian peoples, Celts, Greeks, Romans, Germanic peoples, and Slavs. As a result,
by the end of the prehistoric era these peoples' branches of the language family had already
gained a dominant foothold in virtually all of Eurasia except North and East Asia, replacing
the pre-Indo-European languages previously spoken across this extensive area.
Despite being unaware of their common linguistic origin, diverse groups of Indo-
European speakers continued to culturally dominate and replace the indigenous languages of the
western two-thirds of Eurasia. By the beginning of the Common Era, Indo-European peoples
controlled almost the entirety of this area: the Celts western and central Europe, the Romans
southern Europe, the Germanic peoples northern Europe, the Slavs eastern Europe, the Iranian
peoples the entirety of western and central Asia and parts of eastern Europe, and the Indo-Aryan
peoples south Asia, with the Tocharians inhabiting the Indo-European frontier in western China.
By the medieval period, only the Vasconic, Semitic, Dravidian, Caucasian and Uralic languages
remained of the (relatively) indigenous languages of Europe and the western half of Asia.
Despite medieval invasions by Eurasian nomads, a group to which the Proto-Indo-
Europeans had once belonged, Indo-European expansion reached another peak in the early modern
period with the dramatic increase in the population of the Indian subcontinent and European
expansionism throughout the globe during the Age of Discovery, as well as the continued
replacement and assimilation of surrounding non-Indo-European languages and peoples due to
increased state centralization and nationalism. These trends compounded throughout the modern
period due to the general global population growth and the results of European colonization of the
Western Hemisphere and Oceania, leading to an explosion in the number of Indo-European
speakers as well as the territories inhabited by them.
Due to colonization and the modern dominance of Indo-European languages in the fields
of global science, technology, education, finance, and sports, even many modern countries whose
populations largely speak non-Indo-European languages have Indo-European languages as official
languages, and the majority of the global population speaks at least one Indo-European language.
The overwhelming majority of languages used on the Internet are Indo-European, with English
continuing to lead the group; English in general has in many respects become the lingua franca of
global communication.
In terms of number of speakers, Indo-European is the largest linguistic family
today. Its name suggests that some of its members are Asiatic and some European. In fact, the
Indo-European family includes the vast majority of European languages and some Asiatic ones.
In recent centuries, they have spread to all five continents and are now spoken by half of the
population of the planet. They all derive from a hypothetical precursor called
Proto-Indo-European, or simply Indo-European, spoken some six or seven thousand years ago
somewhere in Asia. The Indo-European homeland is a matter of controversy because of the
difficulty of correlating archaeological with ethnic and linguistic data, though several scholars
think it was located close to the Caspian Sea. Others believe it was in Anatolia (today's Turkey).
This family of languages first spread throughout Europe and many parts of South Asia,
and later to every corner of the globe as a result of colonization. The term Indo-European is
essentially geographical, since it refers to the extent of the family from its easternmost reach
in the Indian subcontinent to its westernmost reach in Europe. The family includes most of the languages of
Europe, as well as many languages of Southwest, Central and South Asia. With over 2.6 billion
speakers (or 45% of the world’s population), the Indo-European language family has the largest
number of speakers of all language families as well as the widest dispersion around the world.
It would not have been possible to establish the existence of the Indo-European language
family if scholars had not compared the systematically recurring resemblances among European
languages and Sanskrit, the oldest language of the Indian subcontinent that left many written
documents. The common origin of European languages and Sanskrit was first proposed by Sir
William Jones (1746-1794). Systematic comparisons between these languages by Franz Bopp
supported this theory and laid the foundation for postulating that all Indo-European languages
descended from a common ancestor, Proto-Indo-European (PIE), thought to have been spoken
before 3000 BCE. It then split into different branches which, in turn, split into different languages
in the subsequent millennia.
The majority view in historical linguistics is that the homeland of the Indo-European
language family was located in the Pontic steppes (present-day Ukraine) around 6000 years ago.
The evidence for this comes from linguistic paleontology: in particular, certain words to do with
the technology of wheeled vehicles are arguably present across all the branches of the Indo-
European family; and archaeology tells us that wheeled vehicles arose no earlier than this date.
The minority view links the origins of Indo-European with the spread of farming from Anatolia
8000-9500 years ago.