1. Introduction

1.1. The Indo-European Language Family

Countries with a majority (dark colour) and minority or official status (light) of Indo-European language speakers. (2011, modified from Brianski 2007)

1.1.1. The Indo-European languages are a family of several hundred modern languages and dialects, including most of the major languages of Europe, as well as many in Asia. Contemporary languages in this family include English, German, French, Spanish, Portuguese, Hindustani (i.e., Hindi and Urdu among other modern dialects), Persian and Russian. It is the largest language family in the world today, spoken by approximately half the world’s population as a mother tongue. Furthermore, the majority of the other half speaks at least one of these languages as a second language.

1.1.2. The Romans did not perceive similarities between Latin and the Celtic dialects, but they found obvious correspondences with Greek. According to the grammarian Sextus Pompeius Festus:

Suppum antiqui dicebant, quem nunc supinum dicimus ex Graeco, videlicet pro adspiratione ponentes <s> litteram, ut idem ὕλας dicunt, et nos silvas; item ἕξ sex, et ἑπτά septem

Such findings are not striking, though, as Rome was believed to have been originally founded by the Trojan hero Aeneas and, consequently, Latin was believed to derive from Old Greek.

1.1.3. The Florentine merchant Filippo Sassetti travelled to the Indian subcontinent, and was among the first European observers to study the ancient Indian language, Sanskrit. Writing in 1585, he noted some word similarities between Sanskrit and Italian, e.g. deva/dio ‘God’, sarpa/serpe ‘snake’, sapta/sette ‘seven’, ashta/otto ‘eight’, nava/nove ‘nine’. This observation is today credited with foreshadowing the later discovery of the Indo-European language family.

1.1.4. The first proposal of a possible common origin for some of these languages came from the Dutch linguist and scholar Marcus Zuerius van Boxhorn in 1647. He noted the similarities among Indo-European languages, and supposed the existence of a primitive common language which he called ‘Scythian’. He included in his hypothesis Dutch, Greek, Latin, Persian, and German, later adding Slavic, Celtic and Baltic languages. He excluded languages such as Hebrew from his hypothesis. However, the suggestions of van Boxhorn did not become widely known and did not stimulate further research.

1.1.5. In 1686, German linguist Andreas Jäger published De Lingua Vetustissima Europae, where he identified a remote language, possibly spreading from the Caucasus, from which Latin, Greek, Slavic, ‘Scythian’ (i.e. Persian) and Celtic (or ‘Celto-Germanic’) were derived, and which he named Scytho-Celtic.

1.1.6. The hypothesis re-appeared in 1786 when Sir William Jones first lectured on similarities between four of the oldest languages known in his time: Latin, Greek, Sanskrit and Persian:

“The Sanskrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists: there is a similar reason, though not quite so forcible, for supposing that both the Gothic and the Celtic, though blended with a very different idiom, had the same origin with the Sanskrit; and the old Persian might be added to the same family”

1.1.7. The Danish scholar Rasmus Rask was the first to point out the connection between Old Norwegian and Gothic on the one hand, and Lithuanian, Slavonic, Greek and Latin on the other. Systematic comparison of these and other old languages conducted by the young German linguist Franz Bopp supported the theory, and his Comparative Grammar, appearing between 1833 and 1852, counts as the starting-point of Indo-European studies as an academic discipline.

NOTE. The term Indo-European itself, now current in English literature, was coined in 1813 by the British scholar Thomas Young, although at that time there was no consensus as to the naming of the recently discovered language family. Among the names suggested were indo-germanique (C. Malte-Brun, 1810), Indoeuropean (Th. Young, 1813), japetisk (Rasmus C. Rask, 1815), indisch-teutsch (F. Schmitthenner, 1826), sanskritisch (Wilhelm von Humboldt, 1827), indokeltisch (A. F. Pott, 1840), arioeuropeo (G. I. Ascoli, 1854), Aryan (F. M. Müller, 1861), aryaque (H. Chavée, 1867), etc.

In English, Indo-German was used by J. C. Prichard in 1826, although he preferred Indo-European. In French, use of indo-européen was established by A. Pictet (1836). In German literature, Franz Bopp used Indo-Europäisch from 1835. The term Indo-Germanisch had already been introduced by Julius von Klaproth in 1823, intending to include the northernmost and the southernmost of the family’s branches, as an abbreviation of the full listing of involved languages that had been common in earlier literature; that opened the doors to ensuing fruitless discussions about whether it should not be Indo-Celtic, or even Tocharo-Celtic.

1.2. Traditional Views

1.2.1. In the early days of Indo-European studies using the comparative method, Indo-European was reconstructed as a unitary proto-language. For Rask, Bopp and other linguists, it was a search for the Indo-European language. Such a language was supposedly spoken in a certain region between Europe and Asia and at one point in time.

1.2.2. The Stammbaumtheorie or Genealogical Tree theory states that languages split up into other languages, each of which in turn splits up into others, and so on, like the branches of a tree. For example, a well-known outdated theory about Indo-European is that, within the PIE language, two main groups of dialects known as centum and satem were formed, a model represented by a clean break-up from the parent language.

NOTE. The centum and satem isogloss is one of the oldest known phonological differences of IE languages, and is still used by many to classify PIE into two main dialectal groups (postulating the existence of proto-Centum and proto-Satem languages) according to their pronunciation of PIE *(d)km̥tóm, hundred, disregarding their relevant morphological and syntactical differences, and usually implicitly accepting a common PIE series of palatovelars.

The tree diagram remains the most widely used model for understanding the Indo-European language reconstruction, since it was proposed by A. Schleicher (Compendium, 1866). The problem with its simplicity is that “the branching of the different groups is portrayed as a series of clean breaks with no connection between branches after they have split, as if each dialectal group marched away from the rest. Such sharp splits are possible, but assuming that all splits within Proto-Indo-European were like this is not very plausible, and any linguist surveying the current Indo-European languages would note dialectal variations running through some but not all areas, often linking adjacent groups who may belong to different languages” (Mallory–Adams, 2007).

1.2.3. The Wellentheorie or Waves Theory, of J. Schmidt, states that one language is created from another by the spread of innovations, the way water waves spread when a stone hits the water surface. The lines that define the extension of the innovations are called isoglosses. The convergence of different isoglosses over a common territory signals the existence of a new language or dialect. Where isoglosses from different languages coincide, transition zones are formed.

NOTE. According to Mallory and Adams (2007), “their criteria of inclusion, why we are looking at any particular one, and not another one, are no more solid than those that define family trees. The key element here is what linguistic features actually help determine for us whether two languages are more related or less related to one another.”

1.2.4. Because of the difficulties found in the modelling of Proto-Indo-European branches and daughter languages into the traditional, unitary ‘Diverging Tree’ framework, i.e. a uniform Proto-Indo-European language with its branches, a new model called ‘Converging Association of Languages’ was proposed, in which languages that are in contact (not necessarily related to each other) exchange linguistic elements and rules, thus developing through mutual borrowing. Most linguists have rejected it as an implausible explanation of the irregularities found in the old, static concept of PIE.

NOTE. Among the prominent advocates is N.S. Trubetzkoy (Urheimat, 1939): “The term ‘language family’ does not presuppose the common descent of a quantity of languages from a single original language. We consider a ‘language family’ a group of languages, in which a considerable quantity of lexical and morphological elements exhibit regular equivalences (…) it is not necessary for one to suppose common descent, since such regularity may also originate through borrowings between neighboring unrelated languages (…) It is just as conceivable that the ancestors of the Indo-European language branches were originally different from each other, but through constant contact, mutual influence, and borrowings, approached each other, without however ever becoming identical to one another” (Meier-Brügger, 2003).

Agreeing with Neumann (1996), Meier-Brügger (2003) rejects that association of languages in the Proto-Indo-European case by stating: “that the various Indo-European languages have developed from a prior unified language is certain. Questionable is, however, the concrete ‘how’ of this process of differentiation”, and that this “thesis of a ‘converging association of languages’ may immediately be dismissed, given that all Indo-European languages are based upon the same Proto-Indo-European flexion morphology. As H. Rix makes clear, it is precisely this morphological congruence that speaks against the language association model, and for the diverging tree model.”

1.3. The Theory of the Three Stages

1.3.1. Even the first Indo-Europeanists had noted in their works the possibility of reconstructing older stages of the ‘Brugmannian’ Proto-Indo-European.

NOTE. The development of this theory of three linguistic stages can be traced back to the very origins of Indo-European studies, first as a diffuse idea of a non-static PIE language, and later widely accepted as a dynamic dialectal evolution, already in the twentieth century, after the decipherment of the Anatolian scripts. Most linguists accept that Proto-Indo-European must be the product of a long historical development, since any ‘common language’ is formed gradually, and proto-languages (like languages) have stages, as described by Lehmann (Introducción a la lingüística histórica, Spa. transl. 1961). On this question, H. Rix (Modussystem, 1986) asserts “[w]hereas comparative reconstruction is based upon a group of similar forms in a number of languages, internal reconstruction takes its point of departure from irregularities or inhomogeneities of the system of a single language (…) The fundamental supposition of language-internal reconstruction is that such an irregularity or inhomogeneity in the grammar of a language is the result of a diachronic process, in which an older pattern, or homogeneity is eclipsed, but not fully suppressed”. According to Meier-Brügger (2003), “Rix works back from Late Proto-Indo-European Phase B (reconstructible Proto-Indo-European) using deducible information about an Early Proto-Indo-European Phase A, and gathers in his work related evidence on the Proto-Indo-European verbal system”. On that question, see also the “Late Indo-European” differentiation in Gamkrelidze–Ivanov (1994-1995), Adrados–Bernabé–Mendoza (1995-1998); a nomenclature also widespread today stems from G.E. Dunkel’s Early, Middle, Late Indo-European: Doing it My Way (1997); etc.

1.3.2. Today, a widespread Three-Stage theory divides PIE internal language evolution into three main historic layers or stages, describing branches and languages either as clean breaks from a common source (e.g. PAn and LIE from Indo-Hittite) or as developments from intermediate dialect continua (e.g. Germanic and Balto-Slavic from North-West IE), or attributing similarities to continued linguistic contact (e.g. between Balto-Slavic and Indo-Iranian):

1) Pre-Proto-Indo-European (Pre-PIE), more properly following the current nomenclature Pre-Indo-Hittite (Pre-PIH), also Early PIE, is the hypothetical ancestor of Indo-Hittite, and probably the oldest stage of the language that comparative grammar could help reconstruct using internal reconstruction. There is, however, no common position as to what it was like or when and where it was spoken.

2) The second stage corresponds to a time before the separation of Proto-Anatolian from the common linguistic community, where it would have coexisted (as a Pre-Anatolian dialect) with Pre-LIE. That stage of the language is today commonly called Indo-Hittite (PIH), and also Middle PIE, but often simply Proto-Indo-European; it is identified with early kurgan cultures in the Kurgan Hypothesis.

NOTE. On the place of Anatolian among IE languages, the question is whether it separated first as a language branch from PIE, and to what extent it was thus spared developments common to the remaining Proto-Indo-European language group. There is growing consensus in favour of its early split from Indo-European (discussed under the heading ‘Indo-Hittite’, among others); see N. Oettinger (‘Indo-Hittite’ – Hypothesen und Wortbildung, 1986), A. Lehrman (Indo-Hittite Revisited, 1996), H. Craig Melchert (The Dialectal Position of Anatolian within IE in IE Subgrouping, 1998), etc.

For Kortlandt (The Spread of The Indo-Europeans, JIES 18, 1990): “Since the beginnings of the Yamnaya, Globular Amphora, Corded Ware, and Afanasievo cultures can all be dated between 3600 and 3000 BC, I am inclined to date Proto-Indo-European to the middle of the fourth millennium, and to recognize Proto-Indo-Hittite as a language which may have been spoken a millennium earlier.”

For Ringe (2006), “[i]nterestingly, there is by now a general consensus among Indo-Europeanists that the Anatolian subfamily is, in effect, one half of the IE family, all the other subgroups together forming the other half.”

On the Anatolian question and its implications on nomenclature, West (2007) states that “[t]here is growing consensus that the Anatolian branch, represented by Hittite and related languages of Asia Minor, was the first to diverge from common Indo-European, which continued to evolve for some time after the split before breaking up further. This raises a problem of nomenclature. It means that with the decipherment of Hittite the ‘Indo-European’ previously reconstructed acquired a brother in the shape of proto-Anatolian, and the archetype of the family had to be put back a stage. E. H. Sturtevant coined a new term ‘Indo-Hittite’ (…) The great majority of linguists, however, use ‘Indo-European’ to include Anatolian, and have done, naturally enough, ever since Hittite was recognized to be ‘an Indo-European language’. They will no doubt continue to do so.”

3) The common immediate ancestor of most of the reconstructed IE proto-languages is approximately the same static ‘Brugmannian’ PIE searched for since the start of Indo-European studies, before Hittite was deciphered. It is usually called Late Indo-European (LIE) or Late PIE, generally dated some time ca. 3500-2500 BC using linguistic or archaeological models, or both.

NOTE. According to Mallory–Adams (2007): “Generally, we find some form of triangulation based on the earliest attested Indo-European languages, i.e. Hittite, Mycenaean Greek, and Indo-Aryan, each of these positioned somewhere between c. 2000 and 1500 BC. Given the kind of changes linguists know to have occurred in the attested histories of Greek or Indo-Aryan, etc., the linguist compares the difference wrought by such changes with the degree of difference between the earliest attested Hittite, Mycenaean Greek, and Sanskrit and reconstructed Proto-Indo-European. The order of magnitude for these estimates (or guesstimates) tends to be something on the order of 1,500-2,000 years. In other words, employing some form of gut intuition (based on experience which is often grounded on the known separation of the Romance or Germanic languages), linguists tend to put Proto-Indo-European sometime around 3000 BC plus or minus a millennium (…) the earliest we are going to be able to set Proto-Indo-European is about the fifth millennium BC if we want it to reflect the archaeological reality of Eurasia. We have already seen that individual Indo-European groups are attested by c. 2000 BC. One might then place a notional date of c. 4500-2500 BC on Proto-Indo-European. The linguist will note that the presumed dates for the existence of Proto-Indo-European arrived at by this method are congruent with those established by linguists’ ‘informed estimation’. The two dating techniques, linguistic and archeological, are at least independent and congruent with one another.”
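The linguistic ‘triangulation’ quoted above reduces to simple arithmetic, which can be sketched as follows. This is only an illustrative calculation using the figures given in the quotation; the names `EARLIEST_ATTESTATION_BC`, `ELAPSED_YEARS` and `estimate_range` are hypothetical and do not come from Mallory–Adams.

```python
# Dates BC are written as positive numbers. Both ranges are the figures
# given in the quoted passage; everything else is an illustrative assumption.
EARLIEST_ATTESTATION_BC = (2000, 1500)  # earliest attested IE languages
ELAPSED_YEARS = (1500, 2000)            # estimated time depth of the changes


def estimate_range(attested, elapsed):
    """Add every elapsed-time estimate to every attestation date and
    return the (earliest, latest) resulting dates BC."""
    candidates = [a + e for a in attested for e in elapsed]
    return max(candidates), min(candidates)


earliest_bc, latest_bc = estimate_range(EARLIEST_ATTESTATION_BC, ELAPSED_YEARS)
print(f"Linguistic estimate: c. {earliest_bc}-{latest_bc} BC")
```

The resulting span falls inside both ranges mentioned in the quotation, namely ‘around 3000 BC plus or minus a millennium’ and the notional c. 4500-2500 BC, which is the sense in which the two dating techniques are said to be congruent.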

Likewise, in Meier-Brügger (2003), about a common Proto-Indo-European: “No precise statement concerning the exact time period of the Proto-Indo-European linguistic community is possible. One may only state that the ancient Indo-European languages that we know, which date from the 2nd millennium BC, already exhibit characteristics of their respective linguistic groups in their earliest occurrences, thus allowing one to presume the existence of a separate and long pre-history (…) The period of 5000-3000 BC is suggested as a possible timeframe of a Proto-Indo-European language.”

However, on the early historic and prehistoric finds, and the assumption of linguistic communities linked with archaeological cultures, Hänsel (Die Indogermanen und das Pferd, B. Hänsel, S. Zimmer (eds.), 1994) states that “[l]inguistic development may be described in steps that, although logically comprehensible, are not precisely analyzable without a timescale. The archaeologist pursues certain areas of cultural development, the logic of which (if one exists) remains a mystery to him, or is only accessible in a few aspects of its complex causality. On the other hand, he is provided with concrete ideas with regard to time, as vague as these may be, and works with a concept of culture that the Indo-European linguist cannot attain. For the archaeologist, culture is understood in the sense of a sociological definition (…) The archaeological concept of culture is composed of so many components, that by its very nature its contours must remain blurred. But languages are quite different. Of course there are connections; no one can imagine cultural connections without any possibility of verbal communication. But it is too much to ask that archaeologists equate their concept of culture, which is open and incorporates references on various levels, to the single dimension of linguistic community. Archaeology and linguistics are so fundamentally different that, while points of agreement may be expected, parallels and congruency may not. The advantage of linguistic research is its ability to precisely distinguish between individual languages and the regularity of developments. The strength of archaeology is its precision in developing timelines. What one can do, the other cannot. They could complement each other beautifully, if only there were enough commonality.”

1.3.3. Another division has to be made, so that the dialectal evolution is properly understood. Late Indo-European had at least two main inner dialectal branches, the Southern or Graeco-Aryan (S.LIE) and the Northern (N.LIE) ones.

It seems that speakers of Southern or Graeco-Aryan dialects spread in different directions with the first LIE migrations (ca. 3000-2500 BC in the Kurgan framework), forming at least a South-East (including Pre-Indo-Iranian) and a South-West (including Pre-Greek) group. Meanwhile, speakers of Northern dialects migrated to the North-West (see below), except for speakers of a North-East IE branch (from which Pre-Tocharian developed), who migrated to Asia.

NOTE. Beekes (1995), from an archaeological point of view, on the Yamnaya culture: “This is one of the largest pre-historic complexes in Europe, and scholars have been able to distinguish between different regions within it. It is dated from 3600-2200 B.C. In this culture, the use of copper for the making of various implements is more common. From about 3000 B.C. we begin to find evidence for the presence in this culture of two- and four-wheeled wagons (…) There seems to be no doubt that the Yamnaya culture represents the last phase of an Indo-European linguistic unity, although there were probably already significant dialectal differences within it.”

Fortson (2004) similarly suggests: “[i]n the period 3100-2900 BC came a clear and dramatic infusion of Yamna cultural practice, including burials, into eastern Hungary and along the lower Danube. With this we seem able to witness the beginnings of the Indo-Europeanization of Europe. By this point, the members of the Yamna culture had spread out over a very large area and their speech had surely become dialectally strongly differentiated.”

Meier-Brügger (2003): “Within the group of IE languages, some individual languages are more closely associated with one another owing to morphological or lexical similarities. The cause for this, as a rule, is a prehistoric geographic proximity (perhaps even constituting single linguistic community) or a common preliminary linguistic phase, a middle mother-language phase, which would however then be posterior to the period of the mother language.”

About Tocharian, Adrados–Bernabé–Mendoza (1995-1998): “even if archaic in some respects (its centum character, subjunctive, etc.) it shares common features with Balto-Slavic, among other languages: they must be old isoglosses, shared before it separated and migrated to the East. It is, therefore, [a N.LIE] language. It shows great innovations, too, something normal in a language that evolved isolated.”

On the Southern (Graeco-Aryan or Indo-Greek) LIE dialect, see Tovar (Krahes alteuropäische Hydronymie und die west-indogermanischen Sprachen, 1977; Actas del II Coloquio sobre lenguas y culturas prerromanas de la Península Ibérica, Salamanca, 1979), Gamkrelidze–Ivanov (1993-1994), Clackson (The Linguistic Relationship Between Armenian and Greek, 1994), Adrados–Bernabé–Mendoza (1995-1998), etc. In Mallory–Adams (2007): “Many have argued that Greek, Armenian, and Indo-Iranian share a number of innovations that suggest that there should have been some form of linguistic continuum between their predecessors.”

On the Graeco-Aryan community, West (2007) proposes the latest terminus ante quem for its split: “We shall see shortly that Graeco-Aryan must already have been differentiated from [LIE] by 2500. We have to allow several centuries for the development of [LIE] after its split from proto-Anatolian and before its further division. (…) The first speakers of Greek – or rather of the language that was to develop into Greek; I will call them mello-Greeks – arrived in Greece, on the most widely accepted view, at the beginning of Early Helladic III, that is, around 2300. They came by way of Epirus, probably from somewhere north of the Danube. Recent writers have derived them from Romania or eastern Hungary. (…) we must clearly go back at least to the middle of the millennium for the postulated Graeco-Aryan linguistic unity or community.”

1.3.4. The so-called North-West Indo-European is considered by some to have formed an early linguistic community already separated from other Northern dialects (which included Pre-Tocharian) before or during the LIE dialectal split, and is generally assumed to have been a later IE dialect continuum between different communities in Northern Europe during the centuries on either side of 2500 BC, with a development usually linked to the expansion of the Corded Ware culture.

NOTE. A dialect continuum, or dialect area, was defined by Leonard Bloomfield as a range of dialects spoken across some geographical area that differ only slightly between neighbouring areas, but as one travels in any direction, these differences accumulate such that speakers from opposite ends of the continuum are no longer mutually intelligible. Examples of dialect continua included (now blurred with national languages and administrative borders) the North-Germanic, German, East Slavic, South Slavic, Northern Italian, South French, or West Iberian languages, among others.
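The accumulation of small differences described in this definition can be illustrated with a toy model. The sketch below is a deliberately simplified assumption, not a linguistic method: each dialect is reduced to a tuple of binary features, neighbouring dialects differ in exactly one feature, and ‘mutual intelligibility’ is approximated by an arbitrary threshold on the number of differing features. All names (`make_continuum`, `distance`, `mutually_intelligible`, `THRESHOLD`) are hypothetical.

```python
# Toy model of a dialect continuum (illustrative assumptions only).

def make_continuum(n_dialects):
    """Build a chain in which dialect k has adopted the first k innovations,
    so that each dialect differs from its neighbour in exactly one feature."""
    n_features = n_dialects - 1
    return [
        tuple(1 if i < k else 0 for i in range(n_features))
        for k in range(n_dialects)
    ]


def distance(a, b):
    """Number of features on which two dialects differ (Hamming distance)."""
    return sum(x != y for x, y in zip(a, b))


def mutually_intelligible(a, b, threshold):
    """Crude stand-in for intelligibility: few enough differing features."""
    return distance(a, b) <= threshold


THRESHOLD = 3  # arbitrary cut-off chosen for the illustration
continuum = make_continuum(8)

adjacent = [
    distance(continuum[i], continuum[i + 1])
    for i in range(len(continuum) - 1)
]
end_to_end = distance(continuum[0], continuum[-1])

print("Differences between neighbours:", adjacent)
print("Differences between the two ends:", end_to_end)
print(
    "Ends mutually intelligible:",
    mutually_intelligible(continuum[0], continuum[-1], THRESHOLD),
)
```

In this model every pair of neighbours stays within the threshold, while the two ends of the chain do not, which is the pattern the definition describes.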

A Sprachbund (also known as a linguistic area, convergence area, diffusion area or language crossroads) is a group of languages that have become similar in some way because of geographical proximity and language contact. They may be genetically unrelated, or only distantly related. That was probably the case with Balto-Slavic and Indo-Iranian, v.i. §1.7.

North-West IE was therefore a language or group of closely related dialects that emerged from a parent (N.LIE) dialect, in close contact for centuries, which allowed them to share linguistic developments.

NOTE. On the so-called “North-West Indo-European” dialect continuum, see Tovar (1977, 1979), Eric Hamp (“The Indo-European Horse” in T. Markey and J. Greppin (eds.) When Worlds Collide: Indo-Europeans and Pre-Indo-Europeans, 1990), N. Oettinger Grundsätzliche Überlegungen zum Nordwest-Indogermanischen (1997), and Zum nordwestindogermanischen Lexikon (1999); M. E. Huld Indo-Europeanization of Northern Europe (1996); Adrados–Bernabé–Mendoza (1995-1998); etc.

Regarding the dating of European proto-languages (of ca. 1500-500 BC) to the same time as Proto-Greek or Proto-Indo-Iranian (of ca. 2500-2000), obviating the time span between them, we might remember Kortlandt’s (1990) description of what “seems to be a general tendency to date proto-languages farther back in time than is warranted by the linguistic evidence. When we reconstruct Proto-Romance, we arrive at a linguistic stage which is approximately two centuries later than the language of Caesar and Cicero (cf. Agard 1984: 47-60 for the phonological differences). When we start from the extralinguistic evidence and identify the origins of Romance with the beginnings of Rome, we arrive at the eighth century BC, which is almost a millennium too early. The point is that we must identify the formation of Romance with the imperfect learning of Latin by a large number of people during the expansion of the Roman empire.”

1.3.5. Apart from the shared phonology and vocabulary, North-Western dialects show other common features, such as a trend to reduce the noun inflection system, shared innovations in the verbal system (merger of imperfect, aorist and perfect in a single preterite, although some preterite-presents are found), the -r endings of the middle or middle-passive voice, a common evolution of laryngeals, etc.

The Southern IE dialects, which spread in different directions and evolved without forming a continuum, therefore show a differentiated phonology and vocabulary, but common older developments like the augment in é-, middle desinences in -i, athematic verbal inflection, pluperfect and perfect forms, and aspectual differentiation between the types *bhére/o- and *tudé/o-.

1.4. The Proto-Indo-European Urheimat

The search for the Urheimat or ‘Homeland’ of the prehistoric Proto-Indo-Europeans has developed as an archaeological quest along with the linguistic research looking for the reconstruction of the proto-language.

NOTE. Mallory (Journal of Indo-European Studies 1, 1973): “While many have maintained that the search for the PIE homeland is a waste of intellectual effort, or beyond the competence of the methodologies involved, the many scholars who have tackled the problem have ably evinced why they considered it important. The location of the homeland and the description of how the Indo-European languages spread is central to any explanation of how Europe became European. In a larger sense it is a search for the origins of western civilization.”

According to A. Scherer’s Die Urheimat der Indogermanen (1968), summing up the views of various authors from the years 1892-1963, still followed by mainstream Indo-European studies today, “[b]ased upon the localization of later languages such as Greek, Anatolian, and Indo-Iranian, a swathe of land in southern Russia north of the Black Sea is often proposed as the native area of the speakers of Proto-Indo-European”.

1.4.1. Historical Linguistics

In Adrados–Bernabé–Mendoza (1995-1998), a summary of main linguistic facts is made, supported by archaeological finds:

 “It is communis opinio today that the languages of Europe have developed in situ in our continent; although indeed, because of the migrations, they have remained sometimes dislocated, and also extended and fragmented (…) Remember the recent date of the ‘crystallisation’ of European languages. ‘Old European’ [=North-West IE], from which they derive, is an already evolved language, with opposition masculine/feminine, and must be located in time ca. 2000 BC or before. Also, one must take into account the following data: the existence of Tocharian, related to [Northern LIE], but far away to the East, in the Chinese Turkestan; the presence of [Southern LIE] languages to the South of the Carpathian Mountains, no doubt already in the third millennium (the ancestors of Thracian, Iranian, Greek speakers); differentiation of Hittite and Luwian, within the Anatolian group, already ca. 2000 BC, in the documents of Kültepe, what means that Common Anatolian must be much older.

NOTE. Without taking into account archaeological theories, linguistic data reveals that:

a) [Northern LIE], located in Europe and in the Chinese Turkestan, must come from an intermediate zone, with expansion into both directions.

b) [Southern LIE], which occupied the space between Greece and the north-west of India, communicating both peninsulas through the languages of the Balkans, Ukraine and Northern Caucasus, the Turkestan and Iran, must also come from some intermediate location. Being a different linguistic group, it cannot come from Europe or the Russian Steppe, where Ural-Altaic languages existed.

c) Both groups have been in contact secondarily, taking into account the different ‘recent’ isoglosses in the contact zone.

d) The more archaic Anatolian must have been isolated from the more evolved IE; and that in some region with easy communication with Anatolia.

(…) Only the Steppe North of the Caucasus, the Volga river and beyond can combine all possibilities mentioned: there are pathways that go down into Anatolia and Iran through the Caucasus, through the East of the Caspian Sea, the Gorgan plains, and they can migrate from there to the Chinese Turkestan, or to Europe, where two ways exist: to the North and to the South of the Carpathian mountains.

These linguistic data, presented in a diagram, are supported by strong archaeological arguments: they have been defended by Gimbutas 1985 against Gamkrelidze–Ivanov (1994-1995) (…). This diagram proposes three stages. In the first one, [PIH] became isolated, and from it Anatolian emerged, being first relegated to the North of the Caucasus, and then crossing into the South: Common Anatolian must be located there. Note that there is no significant temporal difference with the other groups; it happens also that the first IE wave into Europe was older. It was somewhere to the North of the people that later went to Anatolia that the great revolution took place that developed [LIE], the ‘common language’.


[Diagram. Panels: Stage 1, Stage 2, Stage 3. Labels: West IE, Bal.-Sla., Germanic, Tocharian, Northern Horde, Southern Horde.]

Diagram of the expansion and relationships of IE languages. Adapted from Adrados (1979).

The following stages refer to that common language. The first is the one that saw both [N.LIE] (to the North) and [S.LIE] (to the South), the former being fragmented in two groups, one that headed West and one that migrated to the East. That is proof that somewhere in European Russia a common language [N.LIE] emerged; to the South, in Ukraine or in Turkestan, [S.LIE].

The second stage continues the movements of both branches, which launched waves to the South but were in contact at some moments, giving rise to isoglosses that unite certain languages of the [Southern IE] group (first Greek, later Iranian, etc.) with those of the rearguard of [Northern IE] (especially Baltic and Slavic, also Italic and Germanic)”.

NOTE. The assumption of three independent series of velars (v.s. Considerations of Method) has logical consequences when trying to arrange a consistent chronological and dialectal evolution from the point of view of historical linguistics. That is necessarily so because phonological change is generally assumed to be easier than morphological evolution for any given language. As a consequence, while morphological change is an agreed way to pinpoint different ancient groups, and lexical equivalences are used to derive late close contacts and culture (using them we could find agreement in grouping e.g. Balto-Slavic, Italo-Celtic, and Germanic between both groups, as well as older Graeco-Aryan dialects), phonetics is often used – whether explicitly or not – as key to the groupings and chronology of the final split up of Late Indo-European, which is at the core of the current archaeological quest.

If we assume that the satem languages show the most natural trend of leniting palatals from an ‘original’ system of three series of velars; if we assume that the other, centum languages, had undergone a trend of (unlikely and unparalleled) depalatalisation of the palatovelars; then the picture of the dialectal split must be different, because centum languages must be more closely related to each other in ancient times (due to the improbability of depalatalisation happening in more than one branch independently). That is the scheme followed in some manuals on IE linguistics or archaeology if three series are reconstructed or accepted, as is commonly the case.

From that point of view, Italic, Celtic and Tocharian must be grouped together, while the satem core can be found in Balto-Slavic and Indo-Iranian. This contradicts the findings on different Northern and Graeco-Aryan dialects, though. As already stated, the Glottalic theory might support that dialectal scheme, by assuming a neater explanation of the natural evolution of glottalic, voiced and voiceless stops, different from the depalatalisation proposal. However, the Glottalic theory is today mostly rejected (see below §1.5). Huld’s (1997) explanation of the three series could also support this scheme (see above).

1.4.2. Archaeology

The Kurgan hypothesis was introduced by Marija Gimbutas (The Prehistory of Eastern Europe, Part 1, 1956) in order to combine archaeology with linguistics in locating the origins of the Proto-Indo-Europeans. She named the set of cultures in question “Kurgan” after their distinctive burial mounds and traced their diffusion into Eastern and Northern Europe.

NOTE. People were buried with their legs flexed, a position which remained typical for peoples identified with Indo-European speakers for a long time. The burials were covered with a mound, a kurgan (Turkish loanword in Russian for ‘tumulus’).

According to her hypothesis, PIE speakers were probably a nomadic tribe of the Pontic-Caspian steppe that expanded in successive stages of the Kurgan culture and three successive “waves” of expansion during the fourth and third millennia BC:

·   Kurgan I, Dnieper/Volga region, earlier half of the fourth millennium BC. Apparently evolving from cultures of the Volga basin, subgroups include the Samara and Seroglazovo cultures.

·   Kurgan II–III, latter half of the fourth millennium BC. Includes the Sredny Stog culture and the Maykop culture of the northern Caucasus. Stone circles, early two-wheeled chariots, anthropomorphic stone stelae of deities.

·   Kurgan IV or Pit Grave culture, first half of the third millennium BC, encompassing the entire steppe region from the Ural to Romania.

Text Box: Hypothetical Urheimat (Homeland) of the first PIE speakers, from 4500 BC onwards. The Yamna (Pit Grave) culture lasted from ca. 3600 till 2200 BC. In this time the first wagons appeared. (PD)


Three successive “waves” of expansion were proposed:

o   Wave 1, predating Kurgan I, expansion from the lower Volga to the Dnieper, leading to coexistence of Kurgan I and the Cucuteni culture. Repercussions of the migrations extend as far as the Balkans and along the Danube to the Vinča and Lengyel cultures in Hungary.

o   Wave 2, mid fourth millennium BC, originating in the Maykop culture and resulting in advances of kurganised hybrid cultures into northern Europe around 3000 BC – Globular Amphora culture, Baden culture, and ultimately Corded Ware culture.

o   Wave 3, 3000-2800 BC, expansion of the Pit Grave culture beyond the steppes; appearance of characteristic pit graves as far as the areas of modern Romania, Bulgaria and eastern Hungary.

The ‘kurganised’ Globular Amphora culture in Europe is proposed as a ‘secondary Urheimat’ of PIE, the culture separating into the Bell-Beaker culture and Corded Ware culture around 2300 BC. This ultimately resulted in the European IE families of Italic, Celtic and Germanic languages, and other, partly extinct, language groups of the Balkans and central Europe, possibly including the proto-Mycenaean invasion of Greece.

1.4.3. Quantitative Analysis

Glottochronology tries to compare lexical, morphological or phonological traits in order to develop more reliable timelines and dialectal groupings. It has not gained much credibility among linguists, though, compared with the comparative method, on which the whole of IE reconstruction is still based.

NOTE. Most of these glottochronological works are highly controversial, partly owing to issues of accuracy, partly to the question of whether its very basis is sound. Two serious arguments that make this method mostly invalid today are the proof that Swadesh formulae would not work on all available material, and that language change arises from socio-historical events which are of course unforeseeable and, therefore, incomputable.
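As a rough illustration of the kind of formula under discussion, the classic glottochronological estimate commonly attributed to Swadesh and Lees derives a separation time t (in millennia) from the share c of cognates two languages still have in common on a basic word list, given an assumed constant retention rate r per millennium: t = ln c / (2 ln r). The sketch below only restates that textbook formula; the function name is hypothetical, and the default rate of 0.86 is the conventional figure usually quoted for a 100-item list, taken here as an assumption rather than an established constant.

```python
import math


def separation_time_millennia(shared_cognate_fraction, retention_rate=0.86):
    """Classic constant-rate estimate: t = ln(c) / (2 * ln(r)).

    shared_cognate_fraction: share of basic-vocabulary items still cognate
        between the two languages (0 < c <= 1).
    retention_rate: assumed share of the list a single language keeps per
        millennium (0 < r < 1); 0.86 is a conventional, disputed value.
    """
    if not 0.0 < shared_cognate_fraction <= 1.0:
        raise ValueError("shared_cognate_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    # Both languages change independently, hence the factor 2.
    return math.log(shared_cognate_fraction) / (2.0 * math.log(retention_rate))


if __name__ == "__main__":
    # Hypothetical input: two languages sharing 60% of the list.
    print(separation_time_millennia(0.60))
```

The single fixed rate is precisely the assumption that the objections above call into question: if the rate depends on unforeseeable socio-historical events, no one value of `retention_rate` can be justified for all languages and periods.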

A variation of traditional glottochronology is phylogenetic reconstruction; in biological systematics, a phylogeny is a graph intended to represent genetic relationships between biological taxa. Linguists try to transfer these biological models to obtain “subgroupings of one or the other branch of a language family”.

NOTE. Clackson (2007) describes a recent phylogenetic study, by Atkinson et al. (“From Words to Dates: Water into Wine, Mathemagic or Phylogenetic Inference?”, Transactions of the Philological Society 103, 2005): “The New Zealand team use models which were originally designed to build phylogenies based on DNA and other genetic information, which do not assume a constant rate of change. Instead, their model accepts that the rate of change varies, but it constrains the variation within limits that coincide with attested linguistic sub-groups. For example, it is known that the Romance languages all derive from Latin, and we know that Latin was spoken 2,000 years ago. The rates of lexical change in the Romance family can therefore be calculated in absolute terms. These different possible rates of change are then projected back into prehistory, and the age of the parent can be ascertained within a range of dates depending on the highest and lowest rates of change attested in the daughter languages. More recently (Atkinson et al. 2005), they have used data based not just on lexical characters, but on morphological and phonological information as well.”
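The calibration idea described in this note can be sketched in a deliberately simplified form: rates of lexical change are first computed from sub-groups whose age is known, and the lowest and highest of those rates are then projected back to give a range of dates for an undated parent. The code below is only a toy version under a constant-rate-per-group assumption; it is not the phylogenetic model of Atkinson et al., the function names are hypothetical, and the input figures are invented purely to show the arithmetic.

```python
import math


def calibrated_retention_rate(shared_fraction, known_age_millennia):
    """Retention rate per millennium implied by a pair of languages whose
    common ancestor has a known age: r = c ** (1 / (2 * t))."""
    if not 0.0 < shared_fraction < 1.0:
        raise ValueError("shared_fraction must be in (0, 1)")
    if known_age_millennia <= 0.0:
        raise ValueError("known_age_millennia must be positive")
    return shared_fraction ** (1.0 / (2.0 * known_age_millennia))


def age_range_millennia(shared_fraction, retention_rates):
    """Youngest and oldest age implied for an undated parent when each
    calibrated rate is projected back in turn."""
    if not 0.0 < shared_fraction < 1.0:
        raise ValueError("shared_fraction must be in (0, 1)")
    ages = [
        math.log(shared_fraction) / (2.0 * math.log(rate))
        for rate in retention_rates
    ]
    return min(ages), max(ages)


if __name__ == "__main__":
    # Invented calibration pairs: (shared fraction, known age in millennia).
    calibration_pairs = [(0.70, 2.0), (0.60, 2.0)]
    rates = [calibrated_retention_rate(c, t) for c, t in calibration_pairs]
    # Invented shared fraction for two branches of an undated family.
    print(age_range_millennia(0.20, rates))
```

Even in this toy form, the width of the resulting range depends entirely on how far apart the calibrated rates are, which is why such dates are best read as intervals rather than points.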

Their results show a late separation of the Northwestern IE languages, with a last core of Romance-Germanic, earlier Celto-Romano-Germanic, and earlier Celto-Romano-Germano-Balto-Slavic. Before that date, Graeco-Armenian would have separated earlier than Indo-Iranian, while Tocharian would have been the earliest to split up from LIE, still within the Kurgan framework, although quite early (ca. 4000-3000 BC). Before that, the Anatolian branch is found to have split much earlier than the dates usually assumed in linguistics and archaeology (ca. 7000-6000 BC).

Holm proposed to apply a Separation-Level Recovery system to PIE. This is made (Holm, 2008) by using the data on the new Lexikon der indogermanischen Verben, 2nd ed. (Rix et al. 2001), considered a “more modern and linguistic reliable database” than the data traditionally used from Pokorny IEW. The results show a similar grouping to those of Atkinson et al. (2005), differentiating between North-West IE (Italo-Celtic, Germanic, Balto-Slavic), and Graeco-Aryan (Graeco-Armenian, Indo-Iranian) groups. However, Anatolian is deemed to have separated quite late compared to linguistic dates, being considered then just another LIE dialect, therefore rejecting the concept of Indo-Hittite altogether. Some of Holm’s studies are available at <http://hjholm.de/>.

The most recent quantitative studies then apparently show similar results in the phylogenetic groupings of recent languages, i.e. Late Indo-European dialects, excluding Tocharian. Their dates remain, at best, just approximations for the separation of late and well attested languages, though, while the dating (and even grouping) of ancient languages like Anatolian or Tocharian with modern evolution patterns remains at best questionable.


1.4.4. Archaeogenetics

Text Box: Distribution of haplotypes R1b (light colour) for Eurasiatic Paleolithic and R1a (dark colour) for Yamna expansion; black represents other haplogroups. (2009, modified from Dbachmann 2007)

Cavalli-Sforza and Alberto Piazza argue that Renfrew (v.i. §1.5) and Gimbutas reinforce rather than contradict each other, stating that “genetically speaking, peoples of the Kurgan steppe descended at least in part from people of the Middle Eastern Neolithic who immigrated there from Turkey”.

Text Box: Distribution of haplogroup R1a (2011, modified from Crates 2009)

NOTE. The genetic record cannot yield any direct information as to the language spoken by these groups. The current interpretation of genetic data suggests a strong genetic continuity in Europe; specifically, studies of mtDNA by Bryan Sykes show that about 80% of the genetic stock of Europeans originated in the Paleolithic.

Spencer Wells suggests that the origin, distribution and age of the R1a1 haplotype point to an ancient migration, possibly corresponding to the spread of the Kurgan people in their expansion across the Eurasian steppe around 3000 BC, stating that “there is nothing to contradict this model, although the genetic patterns do not provide clear support either”.

NOTE. R1a1 is most prevalent in Poland, Russia, and Ukraine, and is also observed in Pakistan, India and central Asia. R1a1 is largely confined east of the Vistula gene barrier and drops considerably to the west. The spread of Y-chromosome DNA haplogroup R1a1 has been associated with the spread of the Indo-European languages too. The mutations that characterise haplogroup R1a occurred ~10,000 years bp. Haplogroup R1a1, whose lineage is thought to have originated in the Eurasian Steppes north of the Black and Caspian Seas, is therefore associated with the Kurgan culture, as well as with the postglacial Ahrensburg culture which has been suggested to have spread the gene originally.

Text Box: (2011, modified from Cadenas 2008)

The present-day populations of R1b haplotype, with extremely high peaks in Western Europe and measured up to the eastern confines of Central Asia, are believed to be the descendants of a refugium in the Iberian peninsula at the Last Glacial Maximum, where the haplogroup may have achieved genetic homogeneity. As conditions eased with the Allerød Oscillation in about 12000 BC, descendants of this group migrated and eventually recolonised all of Western Europe, leading to the dominant position of R1b in varying degrees from Iberia to Scandinavia, so evident in haplogroup maps.

NOTE. High concentrations of Mesolithic or late Paleolithic YDNA haplogroups of types R1b (typically well above 35%) and I (up to 25%) are thought to derive ultimately from the robust Eurasiatic Cro Magnoid Homo sapiens of the Aurignacian culture, and the subsequent gracile leptodolichomorphous people of the Gravettian culture that entered Europe from the Middle East 20,000 to 25,000 years ago, respectively.

1.4.5. The Kurgan Hypothesis and the Three-Stage Theory

ARCHAEOLOGY (Kurgan Hypothesis)

LINGUISTICS (Three-Stage Theory)

ca. 4500-4000 BC.
Archaeology: Sredny Stog, Dnieper-Donets and Samara cultures, domestication of the horse.

ca. 4000-3500 BC.
Archaeology: The Yamna culture, the kurgan builders, emerges in the steppe, and the Maykop culture in northern Caucasus.
Linguistics: Pre-LIE and Pre-PAn dialects evolve in different communities but presumably still in contact within the same territory.

ca. 3500-3000 BC.
Archaeology: Yamna culture at its peak: stone idols, two-wheeled proto-chariots, animal husbandry, permanent settlements and hillforts, subsisting on agriculture and fishing, along rivers. Contact of the Yamna culture with late Neolithic Europe cultures results in kurganised Globular Amphora and Baden cultures. Maykop culture shows earliest evidence of the beginning Bronze Age; bronze weapons and artifacts introduced.
Linguistics: Proto-Anatolian becomes isolated (either to the south of the Caucasus or in the Balkans), and has no more contacts with the linguistic innovations of the common Late Indo-European language. Late Indo-European evolves in turn into dialects, at least a Southern or Graeco-Aryan and a Northern one.

ca. 3000-2500 BC.
Archaeology: The Yamna culture extends over the entire Pontic steppe. The Corded Ware culture extends from the Rhine to the Volga, corresponding to the latest stage of IE unity. Different cultures disintegrate, still in loose contact, enabling the spread of technology.
Linguistics: Dialectal communities begin to migrate, remaining still in loose contact, enabling the spread of the last common phonetic and morphological innovations, and loan words. PAn, spoken in Asia Minor, evolves into Common Anatolian.

ca. 2500-2000 BC.
Archaeology: The Bronze Age reaches Central Europe with the Beaker culture of Northern Indo-Europeans. Indo-Iranians settle north of the Caspian in the Sintashta-Petrovka and later the Andronovo culture.
Linguistics: The breakup of the southern IE dialects is complete. Proto-Greek spoken in the Balkans; Proto-Indo-Iranian in Central Asia; North-West Indo-European in Northern Europe; Common Anatolian dialects in Anatolia.

ca. 2000-1500 BC.
Archaeology: The chariot is invented, leading to the split and rapid spread of Iranians and other peoples from the Andronovo culture and the Bactria-Margiana Complex over much of Central Asia, Northern India, Iran and Eastern Anatolia. Greek Dark Ages and flourishing of the Hittite Empire. Pre-Celtic Unetice culture.
Linguistics: Indo-Iranian splits up in two main dialects, Indo-Aryan and Iranian. European proto-dialects like Pre-Germanic, Pre-Celtic, Pre-Italic, and Pre-Balto-Slavic differentiate from each other. Anatolian languages like Hittite and Luwian are written down; Indo-Iranian attested through Mitanni; a Greek dialect, Mycenaean, is already spoken.

ca. 1500-1000 BC.
Archaeology: The Nordic Bronze Age sees the rise of the Germanic Urnfield and the Celtic Hallstatt cultures in Central Europe, introducing the Iron Age. Italic peoples move to the Italian Peninsula. Rigveda composed. Decline of Hittite Kingdoms and the Mycenaean civilisation.
Linguistics: Celtic, Italic, Germanic, Baltic and Slavic are already different proto-languages, developing in turn different dialects. Iranian and other related southern dialects expand through military conquest, and Indo-Aryan spreads in the form of its sacred language, Sanskrit.

ca. 1000-500 BC.
Archaeology: Northern Europe enters the Pre-Roman Iron Age. Early Indo-European Kingdoms and Empires in Eurasia. In Europe, Classical Antiquity begins with the flourishing of the Greek peoples. Foundation of Rome.
Linguistics: Celtic dialects spread over western Europe, Germanic dialects to the south of Jutland. Italic languages in the Italian Peninsula. Greek and Old Italic alphabets appear. Late Anatolian dialects. Cimmerian, Scythian and Sarmatian in Asia, Palaeo-Balkan languages in the Balkans.

1.5. Other Archaeolinguistic Theories

1.5.1. The best-known alternative theory concerning PIE is the Glottalic theory. It assumes that Proto-Indo-European was pronounced more or less like Armenian, i.e. instead of PIE *p, *b, *bh, the pronunciation would have been *p’, *p, *b, and the same with the other two voiceless-voiced-voiced aspirated series of consonants usually reconstructed. The IE Urheimat would then have been located in the surroundings of Anatolia, especially near Lake Urmia, in northern Iran, hence the archaism of Anatolian dialects and the glottalics found in Armenian.

NOTE. Those linguistic and archaeological findings are supported by Gamkrelidze–Ivanov (“The early history of Indo-European languages”, Scientific American, 1990), where early Indo-European vocabulary deemed “of southern regions” is examined, and similarities with Semitic and Kartvelian languages are also brought to light.

This theory is generally rejected; Beekes (1995) is representative: “But this theory is in fact very improbable. The presumed loan-words are difficult to evaluate, because in order to do so the Semitic words and those of other languages would also have to be evaluated. The names of trees are notoriously unreliable as evidence. The words for panther, lion and elephant are probably incorrectly reconstructed as PIE words.”

1.5.2. Alternative theories include:

I. The European Homeland thesis maintains that the common origin of the IE languages lies in Europe. These hypotheses are often driven by archaeological theories. A. Häusler (Die Indoeuropäisierung Griechenlands, Slovenska Archeológia 29, 1981; etc.) continues to defend the hypothesis that places Indo-European origins in Europe, stating that all the known differentiation emerged in the continuum from the Rhine to the Urals.

NOTE. It has been traditionally located in 1) Lithuania and the surrounding areas, by R.G. Latham (1851) and Th. Poesche (Die Arier. Ein Beitrag zur historischen Anthropologie, 1878); 2) Scandinavia, by K. Penka (Origines ariacae, 1883); 3) Central Europe, by G. Kossinna (“Die Indogermanische Frage archäologisch beantwortet”, Zeitschrift für Ethnologie, 34, 1902), P. Giles (The Aryans, 1922), and by linguist/archaeologist G. Childe (The Aryans. A Study of Indo-European Origins, 1926).

a. The Paleolithic Continuity theory posits that the advent of IE languages should be linked to the arrival of Homo sapiens in Europe and Asia from Africa in the Upper Paleolithic. The PCT proposes a continued presence of Pre-IE and non-IE peoples and languages in Europe from Paleolithic times, allowing for minor invasions and infiltrations of local scope, mainly during the last three millennia.

NOTE. There are some research papers concerning the PCT available at <http://www.continuitas.com/>. Also, the PCT could in turn be connected with Frederik Kortlandt’s Indo-Uralic and Altaic studies <http://kortlandt.nl/publications/>.

On the temporal relationship question, Mallory–Adams (2007): “Although there are still those who propose solutions dating back to the Palaeolithic, these cannot be reconciled with the cultural vocabulary of the Indo-European languages. The later vocabulary of Proto-Indo-European hinges on such items as wheeled vehicles, the plough, wool, which are attested in Proto-Indo-European, including Anatolian. It is unlikely then that words for these items entered the Proto-Indo-European lexicon prior to about 4000 BC.”

b. A new theory put forward by Colin Renfrew relates IE expansion to the Neolithic revolution, which would have caused the peaceful spread of an older pre-IE language into Europe from Asia Minor from around 7000 BC, with the advance of farming. It proposes that the dispersal (discontinuity) of Proto-Indo-Europeans originated in Neolithic Anatolia.

NOTE. Reacting to criticism, Renfrew by 1999 revised his proposal to the effect of taking a pronounced Indo-Hittite position. Renfrew’s revised views place only Pre-Proto-Indo-European in seventh millennium Anatolia, proposing as the homeland of Proto-Indo-European proper the Balkans around 5000 BC, explicitly identified as the “Old European culture” proposed by Gimbutas.

Mallory–Adams (2007): “(…) in both the nineteenth century and then again in the later twentieth century, it was proposed that Indo-European expansions were associated with the spread of agriculture. The underlying assumption here is that only the expansion of a new more productive economy and attendant population expansion can explain the widespread expansion of a language family the size of the Indo-European. This theory is most closely associated with a model that derives the Indo-Europeans from Anatolia about the seventh millennium BC from whence they spread into south-eastern Europe and then across Europe in a Neolithic ‘wave of advance’.

(…) Although the difference between the Wave of Advance and Kurgan theories is quite marked, they both share the same explanation for the expansion of the Indo-Iranians in Asia (and there are no fundamental differences in either of their difficulties in explaining the Tocharians), i.e. the expansion of mobile pastoralists eastwards and then southwards into Iran and India. Moreover, there is recognition by supporters of the Neolithic theory that the ‘wave of advance’ did not reach the peripheries of Europe (central and western Mediterranean, Atlantic and northern Europe) but that these regions adopted agriculture from their neighbours rather than being replaced by them”.

Talking about these new hypotheses, Adrados–Bernabé–Mendoza (1995-1998) discuss the relevance that is given to each new personal archaeological ‘revolutionary’ theory: “[The hypothesis of Colin Renfrew (1987)] is based on ideas about the diffusion of agriculture from Asia to Europe in [the fifth millennium Neolithic Asia Minor], diffusion that would be united to that of Indo-Europeans; it doesn’t pay attention at all to linguistic data. The [hypothesis of Gamkrelidze–Ivanov (1980, etc.)], which places the Homeland in the contact zone between Caucasian and Semitic peoples, south of the Caucasus, is based on real or supposed lexical loans; it disregards morphological data altogether, too. Criticism of these ideas – to which people have paid too much attention – is found, among others, in Meid (1989), Villar (1991), etc.”

II. Another hypothesis, contrary to the European ones, also mainly driven today by nationalistic or religious views, traces back the origin of PIE to Vedic Sanskrit, postulating that this is very pure, and that the origin of common Proto-Indo-European can thus be traced back to the Indus Valley Civilisation of ca. 3000 BC.

NOTE. Pan-Sanskritism was common among early Indo-Europeanists, such as Schlegel, Young, A. Pictet (Les origines indoeuropéens, 1877) or Schmidt (who preferred Babylonia), but is now mainly supported by those who consider Sanskrit almost equal to Late Proto-Indo-European. For more on this, see S. Misra (The Aryan Problem: A Linguistic Approach, 1992), Elst (Update on the Aryan Invasion Debate, 1999), followed up by S.G. Talageri (The Rigveda: A Historical Analysis, 2000), both part of the “Indigenous Indo-Aryan” viewpoint by N. Kazanas, the “Out of India” theory, with a framework dating back to the times of the Indus Valley Civilisation.

1.6. Relationship to Other Languages

1.6.1. Many higher-level relationships between PIE and other language families have been proposed, but these speculative connections are highly controversial. Perhaps the most widely accepted proposal is of an Indo-Uralic family, encompassing PIE and Proto-Uralic, a language from which Hungarian, Finnish, Estonian, Saami and a number of other languages descend. The evidence usually cited in favour of this is the proximity of the proposed Urheimaten for both of them, the typological similarity between the two languages, and a number of apparent shared morphemes.

NOTE. Other proposals, further back in time (and correspondingly less accepted), model PIE as a branch of Indo-Uralic with a Caucasian substratum; link PIE and Uralic with Altaic and certain other families in Asia, such as Korean, Japanese, Chukotko-Kamchatkan and Eskimo-Aleut (representative proposals are Greenberg’s Eurasiatic and its proposed parent-language Nostratic); etc.

1.6.2. Indo-Uralic or Uralo-Indo-European is therefore a hypothetical language family consisting of Indo-European and Uralic (i.e. Finno-Ugric and Samoyedic). Most linguists still consider this theory speculative and its evidence insufficient to conclusively prove genetic affiliation.

NOTE. The problem with lexical evidence is to weed out words due to borrowing, because Uralic languages have been in contact with Indo-European languages for millennia, and consequently borrowed many words from them.

Björn Collinder, author of the path-breaking Comparative Grammar of the Uralic Languages (1960), a standard work in the field of Uralic studies, argued for the kinship of Uralic and Indo-European (1934, 1954, 1965).

The most extensive attempt to establish sound correspondences between Indo-European and Uralic to date is that of the late Slovenian linguist Bojan Čop. It was published as a series of articles in various academic journals from 1970 to 1989 under the collective title Indouralica. The topics to be covered by each article were sketched out at the beginning of “Indouralica II”. Of the projected 18 articles only 11 appeared. These articles have not been collected into a single volume and thereby remain difficult to access.

Dutch linguist Frederik Kortlandt supports a model of Indo-Uralic in which its speakers lived north of the Caspian Sea, and Proto-Indo-Europeans began as a group that branched off westward from there to come into geographic proximity with the Northwest Caucasian languages, absorbing a Northwest Caucasian lexical blending before moving farther westward to a region north of the Black Sea where their language settled into canonical Proto-Indo-European.

1.6.3. The most common arguments in favour of a relationship between PIH and Uralic are based on seemingly common elements of morphology, such as:




Meaning | PIH | Uralic
‘I, me’ | *me ‘me’ (Acc.), *mene ‘my’ (Gen.) | *mun, *mina ‘I’
‘you’ (sg) | *tu (Nom.), *twe (Acc.), *tewe ‘your’ (Gen.) | *tun, *tina
1st P. singular | (…) | (…)
1st P. plural | (…) | (…)
2nd P. singular | *-s (active), *-tHa (perfect) | (…)
2nd P. plural | (…) | (…)
(…) | *so ‘this, he/she’ (animate nom) | *ša (3rd person singular)
Interr. pron. (An.) | *kwi- ‘who?, what?’; *kwo- ‘who?, what?’ | *ken ‘who?’, *ku-, ‘who?’
Relative pronoun | (…) | *-ja (nomen agentis)
Nom./Acc. plural | *-es (Nom. pl.), *-m̥-s (Acc. pl.) | (…)
Oblique plural | *-i (pronomin. pl., cf. *we-i- ‘we’, *to-i- ‘those’) | (…)
(…) | *-s- (aorist); *-es-, *-t (stative substantive) | (…)
Negative particle | *nei, *ne | *ei- [negative verb], *ne
‘to give’ | (…) | (…)
‘to wet’, ‘water’ | *wed- ‘to wet’, *wodr̥- ‘water’ | *weti ‘water’
(…) | *mesg- ‘dip under water, dive’ | *muśke- ‘wash’
‘to assign’, | *nem- ‘to assign, to allot’, *h1nomn̥- ‘name’ | *nimi ‘name’
(…) | *h2weseh2- ‘gold’ | *waśke ‘some metal’
(…) | *mei- ‘exchange’ | *miHe- ‘give, sell’
(…) | *(s)kwalo- ‘large fish’ | *kala ‘fish’
(…) | *galou- ‘husband's sister’ | *kälɜ ‘sister-in-law’
(…) | *polu- ‘much’ | *paljɜ ‘thick, much’


1.7. Indo-European Dialects

Schleicher’s Fable: From PIE to Modern English

The so-called Schleicher's fable is a poem composed in PIE, published by August Schleicher in 1868, originally named “The Sheep and the Horses”. It is written here in the different reconstructible IE dialects for comparison.

NOTE. Only the versions in Late Indo-European early dialects are supposed to use correct dialectal forms and vocabulary. The other examples – in PIH and late European proto-languages – are mainly phonetic examples following the Late Indo-European morphological and syntactical model.


A hypothetical PIH version (ca. 3500 BC?): h3owis h1ekwōskwe. • H3owis, kwesjo wl̥h1neh2 ne h1est, h1ekwoms spekét, h1oinom gwr̥h3úm woghom wéghontm̥, h1oinomkwe megeh2m bhorom, h1oinomkwe dhh1ghmonm̥ h1oh1ku bhérontm̥. • H3owis nu h1ékwobhos weukwét: • “Kr̥d h2éghnutoi h1moí, h1ekwoms h2égontm̥ wih1róm wídn̥tei”. H1ekwōs tu weukwónt: “Klu, h3owi! • kr̥d h2éghnutoi n̥sméi wídn̥tbhos: h2ner, potis, h3owjom-r̥ wl̥h1neh2m̥ • swebhei gwhermom westrom kwr̥neuti”. • H3owjom-kwe wl̥hneh2 ne h1esti. • Tod kékluwos h3owis h2egrom bhugét.

CA (PAn), 2500 BC | NWIE (N.LIE), ca. 2500 BC
Howis ekwōskwu. | Owis ekwōskwe.
Howis, kwosjo ulhneh ne est, | Owis, kwosjo wl̥nā ne est,
ekwons spekét, | ekwons spekét,
oikom gwurrúm wogom wégontm̥, | oinom gwrawúm woghom wéghontm̥,
oikomkwu megehm borom, | oinomkwe megām bhorom,
oikomkwu dgomonm̥ oku bérontm̥. | oinomkwe dhghomonm̥ ōkú bhérontm̥.
Howis nu ékwobos wūkwét: | Owis nu ékwobhos weukwét:
Kr̥di xégnutor moi, | “Kr̥di ághnutor moi,
ekwons xégontm̥ wiróm wídn̥tę”. | ekwons ágontm̥ wīróm widn̥tei”.
Ekwōs tu wūkwónt: “Klu, howi! | Ekwōs tu weukwónt: “Kl̥u, owi!
kr̥di hegnutor n̥smę wídn̥tbos: | kr̥di ághnutor n̥sméi widn̥tbhos:
hnr, potis, howjom-r̥ ulhnehm̥ | neros, potis, owjom ar wl̥nām
swebę gwermom wéstrom kwr̥nūdi”. | sebhei gwhormom westrom kwr̥neuti”.
Howjomkwu ulhneh ne esti. | Owjomkwe wl̥nā ne esti.
Tod kékluwos howis hegrom bugét. | Tod kékluwos owis agrom bhugét.


Proto-Aryan (S.LIE), ca. 2500 BC | Proto-Greek (S.LIE), ca. 2500 BC
Awis aķwāsa. | Owis ekwoikwe.
Awis, kasja wr̥nā na āst, | Ówis, kwohjo wlānā ne ēst,
aķwans spaķát, | ekwons spekét,
aikam gurúm waģham wáģhantm̥, | oiwom kwarúm wokhom wekhontm̥,
aikama maģham bharam, | oiwomkwe megām phorom,
aikama dhģhámanm̥ āķu bharantm̥. | oiwomkwe khthómonm̥ ōku phérontm̥.
Awis nu áķwabhjas áwaukat: | Ówis nu ékwophos éweukwet:
Ķr̥di ághnutai mai, | Kr̥di ákhnutoi moi,
aķwans aģantam wīrám wídn̥tai”. | ekwons ágontm̥ wīróm wídn̥tei”.
Áķwās tu áwawkant: “Ķr̥nudhí avi! | Ékwoi tu éweukwont: “Kl̥nuthi, owi!
ķr̥d ághnutai n̥smái wídn̥tbhjas: | kr̥di ágnutoi n̥sméi wídn̥tphos:
nr, patis, awjam ar wr̥nām | anr, potis, owjom ar wlānām
swabhi gharmam wastram kr̥nauti”. | sephei kwhermom westrom kwr̥neuti”.
Awjama wr̥nā na asti. | Owjom-kwe wlānā ne esti.
Tat ķáķruwas awis aģram ábhugat. | Tot kékluwos owis agrom éphuget.


Proto-Celtic (ca. 1000 BC) | Proto-Italic (ca. 1000 BC)
Owis ekwoikwe. | Owis ekwoikwe.
Owis, kwosjo wlanā ne est, | Owis, kwosjo wlānā ne est,
ekwōs spekét, | ekwōs spekét,
oinom barúm woxom wéxontam, | oinom grāwúm woxom wéxontem,
oinomkwe megam borom, | oinomkwe megam φorom,
oinomkwe dxonjom āku berontam. | oinomkwe xomonem ōku φerontem.
Owis nu ékwobos weukwét: | Owis nu ékwoφos weukwét:
“Kridi áxnutor mai, | Kordi áxnutor mei,
ekwōs ágontom wīróm wídanti”. | ekwōs ágontom wīróm wídentei.
Ekwoi tu wewkwónt: “Kalnéu, owi! | Ekwoi tu wewkwónt: “Kalnéu, owi!
kridi áxnutor ansméi wídantbos: | kordi axnutor ensméi wídentφos:
neros, φotis, owjom ar wlanām | neros, potis, owjom ar wlānām
sebi gwormom westrom kwarneuti”. | seφei ghormom westrom kworneuti”.
Owjomkwe wlanā ne esti. | Owjomkwe wlānā ne esti.
Tod kéklowos owis agrom bugét. | Tud kékluwos owis agrom φugít.


Pre-Proto-Germanic (ca. 1000 BC)

Proto-Balto-Slavic (ca. 1000 BC)

Awiz exwazxwe.

Awis ewōskje.

Awiz, hwas wulnō ne est,

Awis, kasja wilnā ne est,

exwanz spexét,

ewas speét,

ainan karún wagan wéganðun,

ainan grun waġan wéġantun,

ainanxwe mekon baran,

ainanke megan baran,

ainanxwe gúmanan āxu béranðun.

ainanke ġmanan ōku bérantun

Awiz nu éxwamaz weuxwéð:

Awis nu ewamas wjaukjét:

“Hurti ágnuðai mei,

irdi ágnutei mei,

exwanz ákanðun werán wítanðī”.

ekwans ágantun wirán wíduntei”.

Exwaz tu wewxwant: “Hulnéu, awi!

Ewōs tu wjaukunt: “Kludí, awi!

hurti áknuðai unsmí wítunðmaz:

irdi ágnutei insméi wídūntmas:

neraz, faþiz, awjan ar wulnōn

neras, patis, awjam ar wilnān

sibī warman wesþran hwurneuþi”.

sebi gormom westran kjirnjautĭ”.

Awjanxwe wulnō ne isti.

Áwjamkje wilnā ne esti.

Þat héxluwaz awiz akran bukéþ.

Ta kjekluwas awis agram bugít.

Translation: « The Sheep and the Horses. A sheep that had no wool saw horses, one pulling a heavy wagon, one carrying a big load, and one carrying a man quickly. The sheep said to the horses: “My heart pains me, seeing a man driving horses”. The horses said: “Listen, sheep, our hearts pain us when we see this: a man, the master, makes the wool of the sheep into a warm garment for himself. And the sheep has no wool”. Having heard this, the sheep fled into the plain. »


1.7.1. Northern Indo-European dialects

I. North-West Indo-European

1. North-West Indo-European was probably spoken in Europe in the centuries on either side of ca. 2500 BC, including Pre-Celtic, Pre-Italic, Pre-Germanic, Pre-Baltic, and Pre-Slavic, among other ancestors of IE languages attested in Europe. Its original common location is usually traced back to “some place to the East of the Rhine, to the North of the Alps and the Carpathian Mountains, to the South of Scandinavia and to the East of the Eastern European Lowlands or Russian Plain, not beyond Moscow” (Adrados–Bernabé–Mendoza 1995-1998).

Text Box: Generalized distribution of all Corded Ware variants (ca. 3200-2300), with adjacent third millennium cultures. Mallory–Adams (1997). The Globular Amphora culture (ca. 3400-2800) overlaps with the early territory of the Corded Ware culture (ca. 3200-2800 BC), which later expanded to east and west. (2011, modified from Dbachmann 2005).

2. The Corded Ware (also Battle Axe or Single Grave) complex of cultures traditionally represents, for many scholars, the arrival of the first speakers of Northern LIE in central Europe, coming from the Yamna culture. The complex dates from about 3200-2300 BC. The Globular Amphorae culture may be slightly earlier, but the relationship between these cultures remains unclear.

NOTE. From a linguistic-archaeological point of view, Beekes (1995): “The combined use of the horse and the ox-drawn wagon made the Indo-Europeans exceptionally mobile. It is therefore not surprising that they were able to migrate over such a very large area after having first taken possession of the steppes (…). It has long been assumed that the Corded Ware culture (from 3300 to 2300 B.C., in German the ‘Schnurkeramiker’ of which the Battle Axe culture, the Single Grave Folk, the East Baltic and the Fatyanovo culture on the upper reaches of the Volga are all variants) from the middle Dniepr region and the upper Volga as far as Scandinavia and Holland, was developed by an Indo-European people. They would seem to have been nomads, their society was warlike, and they introduced both the horse and wagon. We find them in Holland as early as 3000 B.C., where they are clearly immigrants, and it is here that the earliest wheels of western Europe have been found. There is a problem in the fact that this culture is very early indeed when compared to the Yamnaya culture (3600-2200 B.C., although the Yamnaya may be still older), but the central problem is the origin of Corded Ware. The Globular Amphorae culture (‘Kugelamphoren’ in German) preceded that of the Corded Ware (as of 3500 B.C.) in roughly the same area, though it extended in a more southerly direction and reached as far as the middle Dniepr and the Dniestr. The relation between this culture and the Corded Ware culture is not clear, but it does seem probable that there was a relationship of some kind.”

Mallory–Adams (2007): “Many of the language groups of Europe, i.e. Celtic, Germanic, Baltic, and Slavic, may possibly be traced back to the Corded Ware horizon of northern, central, and eastern Europe that flourished c. 3200-2300 BC. Some would say that the Iron Age cultures of Italy might also be derived from this cultural tradition. For this reason the Corded Ware culture is frequently discussed as a prime candidate for early Indo-European.”

Anthony (2007) gives a detailed account of archaeological events: “The Corded Ware horizon spread across most of northern Europe, from Ukraine to Belgium, after 3000 BCE, with the initial rapid spread happening mainly between 2900 and 2700 BCE. The defining traits of the Corded Ware horizon were a pastoral, mobile economy that resulted in the near disappearance of settlement sites (much like Yamnaya in the steppes), the almost universal adoption of funeral rituals involving single graves under mounds (like Yamnaya), the diffusion of stone hammer-axes probably derived from Polish TRB [=Funnelbeaker] styles, and the spread of a drinking culture linked to particular kinds of cord-decorated cups and beakers, many of which had local stylistic prototypes in variants of TRB ceramics. The material culture of the Corded Ware horizon was mostly native to northern Europe, but the underlying behaviors were very similar to those of the Yamnaya horizon, the broad adoption of a herding economy based on mobility (using ox-drawn wagons and horses), and a corresponding rise in the ritual prestige and value of livestock. The economy and political structure of the Corded Ware horizon certainly was influenced by what had emerged earlier in the steppes (…).

The Yamnaya and Corded Ware horizons bordered each other in the hills between Lvov and Ivano-Frankovsk, Ukraine, in the upper Dniester piedmont around 2800-2600 BCE (see figure). At that time early Corded Ware cemeteries were confined to the uppermost headwaters of the Dniester west of Lvov, the same territory that had earlier been occupied by the late TRB communities infiltrated by late Tripolye groups. If Corded Ware societies in this region evolved from local late TRB origins, as many believe, they might already have spoken an Indo-European language. Between 2700 and 2600 BCE Corded Ware and late Yamnaya herders met each other on the upper Dniester over cups of mead or beer. This meeting was another opportunity for language shift (…). The wide-ranging pattern of interaction that the Corded Ware horizon inaugurated across northern Europe provided an optimal medium for language spread. Late Proto-Indo-European languages penetrated the eastern end of this medium, either through the incorporation of Indo-European dialects in the TRB base population before the Corded Ware horizon evolved, or through Corded Ware-Yamnaya contacts later, or both. Indo-European speech probably was emulated because the chiefs who spoke it had larger herds of cattle and sheep and more horses than could be raised in northern Europe, and they had a politico-religious culture already adapted to territorial expansion.”

3. The Corded Ware horizon spans several centuries. Most linguists agree that Northern LIE dialects shared a common origin within the original Yamnaya territory (ca. 3500-2500 BC), and that North-West Indo-European was a close-knit linguistic community, already separated from Pre-Tocharian, during the time of the first Corded Ware migrations (ca. 2900-2500 BC, in the Kurgan framework). After that period of shared linguistic community, its speakers migrated to the east and west, spreading over a huge territory, which turned into a European continuum of different IE dialects in close contact.

4. The general internal linguistic division proposed for North-West Indo-European includes a West European group, with Pre-Italic and Pre-Celtic, and an East European group, comprising Pre-Baltic and Pre-Slavic. Pre-Germanic is usually assumed to have belonged to the West European core, and to have had contacts with East European later in time, into a loose Balto-Slavo-Germanic community.

NOTE 1. Those who divide between Italo-Celto-Germanic and Balto-Slavic include e.g.:

Burrow (1955): “The Western group of Indo-European languages consisting of Italic, Celtic and Germanic, is distinguished by certain common features in grammar and vocabulary, which indicate a fairly close mutual connection in prehistoric times. These ties are particularly close in the case of Italic and Celtic, even though they are not sufficient to justify the theory of common Italo-Celtic.”

Kortlandt (1990): “If the speakers of the other satem languages can be assigned to the Yamnaya horizon and the western Indo-Europeans to the Corded Ware horizon, it is attractive to assign the ancestors of the Balts and the Slavs to the Middle Dnieper culture [an eastern extension of the Corded Ware culture, of northern Ukraine and Belarus, see below Indo-Iranian].”

Beekes (1995): “Probably the Corded Ware people were the predecessors of the Germanic, Celtic and Italic peoples, and, perhaps, of the Balto-Slavic peoples as well.”

Adrados–Bernabé–Mendoza (1995-1998): “We think, to sum up, that a language more or less common, between Celtic and Germanic, is plausible. And that in equally gradual terms, but with a unity, if not complete, at least approximate, we should think the same for Baltic and Slavic. Even though it is a theory that has awoken polemic discussions, with Meillet and Senn as main representatives of the separation idea, Stang and Scherer of the unity; cf. Untermann 1957, Birnbaum 1975 (…) still more dubious is in relation with Illyrian, Venetic, etc. And models of more unitary ‘common languages’, like Indo-Iranian (…).”

Those who divide between Italo-Celtic and Balto-Slavo-Germanic:

Gamkrelidze–Ivanov (1993-1994), departing from an Anatolian homeland: “Especially intense contacts at level 5 can be found between the Balto-Slavic-Germanic and Italic-Celtic dialect areas. A long list of cognates can be adduced with lexical isoglosses reflecting close historical interaction between these areas (see Meillet 1922) (…) New arrivals joined earlier settlers to form an intermediate homeland shared by the tribes which later moved on to the more western zones of Europe. This intermediate settlement area thus became a zone of contacts and secondary rapprochements of dialects which had partially differentiated before this. This is where the common lexical and semantic innovations were able to arise. (…) The out-migration of the dialects from this secondary area - a secondary, or intermediate, proto-homeland - to central and western Europe laid the foundation for the gradual rise of the individual Italic, Celtic, Illyrian, Germanic, Baltic, and Slavic languages.”

Mallory–Adams (2007), who suppose an early separation of all European dialects independently from the parent language: “A major group presumably created or maintained by contact is labelled the North-West group and comprises Germanic, Baltic, and Slavic (as one chain whose elements may have been in closer contact with one another), and additionally Italic and Celtic. (…) The evidence suggests that this spread occurred at some time before there were marked divisions between these languages so that these words appear to have been ‘inherited’ from an early period”; also, “[t]here are so many of these words that are confined within these five language groups (Celtic, Italic, Germanic, Baltic, and Slavic) that most linguists would regard cognates found exclusively between any two or among all of these groups as specifically North-West Indo-European and not demonstrably Proto-Indo-European. To accept a series of cognates as reflections of a PIE word requires that the evidence come from further afield than a series of contiguous language groups in Europe”; and, “[t]he North-West European languages (Germanic, Baltic, Slavic, Celtic, Italic) shared a series of common loanwords (probably created among themselves as well as derived from some non-Indo-European source) at some period.”

This late continuum of closely related Northwestern IE languages has been linked to the Old European (Alteuropäisch) of Krahe (Unsere ältesten Flußnamen, 1964; Die Struktur der alteuropäischen Hydronymie, 1964), the language of the oldest reconstructed stratum of European hydronymy in Central and Western Europe.

NOTE. This “Old European” is not to be confused with the term as used by Marija Gimbutas, who applies it to Neolithic Europe. The character of these river names is Pre-Germanic and Pre-Celtic, and dated by Krahe to ca. 2000 BC, although according to the recent archaeological and linguistic studies, it should probably be deemed slightly earlier. Old European river names are found in the Baltic and southern Scandinavia, in Central Europe, France, the British Isles, and the Iberian and Italian peninsulas. This area is associated with the spread of the later Western Indo-European dialects, the Celtic, Italic, Germanic, Baltic, Slavic, and Illyrian branches. Notably exempt are the Balkans and Greece. Krahe locates the geographical nucleus of this area as stretching from the Baltic across Western Poland and Germany to the Swiss plateau and the upper Danube north of the Alps, while he considers the Old European river names of southern France, Italy and Spain to be later imports, replacing “Aegean-Pelasgian” and Iberian substrates, corresponding to Italic, Celtic and Illyrian invasions from about 1300 BC.

Tovar (1977, 1979) combines the split of the Graeco-Aryan group with the development of an ‘Old European’ language in Europe, which evolved into the historical languages attested. Adrados (Arqueología y diferenciación del indoeuropeo, Em. 47, 1979) assumes, as we have seen, a North-West Indo-European or Old European language (of ca. 2000 BC or earlier, according to Krahe’s account). In his view, the western core (Italo-Celto-Germanic) is still a unitary dialect in the late dialect continuum, while the eastern core (Pre-Balto-Slavic) is another, closely related dialect. This grouping has been supported by the latest phylogenetic studies (Atkinson et al. 2005, Holm 2008, v.s.). According to that view, the late North-West Indo-European community would have been similar e.g. to the German or to the North-Germanic dialect continua: a West European core (equivalent to the German and Scandinavian cores), plus a more different East European or Pre-Balto-Slavic territory (equivalent to Dutch, and to Icelandic, respectively).

About the identification of the North-West European dialect continuum with the “Old European” concept, Adrados–Bernabé–Mendoza (1995-1998): “The IE languages of Europe are all derived from [Late Indo-European]; most of them are [Northern dialects], Greek (and Thracian, we think) are [Southern] dialects. The first ones “crystallised” late, ca. 1000 BC or even later. But there are marks of earlier IE languages in Europe. Then a hypothesis results, whereby an ancient IE language could have existed in Europe, previous to Baltic, Slavic, Germanic, Latin, etc., a [Late Indo-European], or maybe an [Indo-Hittite] dialect.

This was put forward by the theory defended by Krahe (1964a, 1964b, among many writings), in which the European hydronymy, because of its roots and suffixes, bears witness to the existence of a European language previous to the differentiated languages (Germanic, Celtic, etc.), which would have been born from it in a later date. This is the so-called “Old European” (Alteuropäisch). We would have here a new intermediate language. For a defence of its presence in [the Iberian] Peninsula, cf. de Hoz 1963.

We lack otherwise data to decide the dialectal classification of this hypothetical language (the existence of a distinct feminine speaks in favour of a [LIE dialect]). Some names have been proposed: Drava, Dravos; Druna, Dravina, Dravonus; Dravan-, Dravantia, Druantia; Druta, Drutus. Or, to put other example, Sava, Savos; Savina; Savara, Savira; Savintia; Savistas. In cases like these, the roots are clearly IE, the suffixes too. The thesis that it is an IE language previous to the known ones seems correct, if we take into account the huge time span between the arrival of Indo-Europeans to Europe (in the fourth millennium BC) until the “crystallisation” of European languages, much more recent (…)

Therefore, the proposal of Schmid [(Alteuropäisch und Indogermanisch, 1968)], that the “Old European” of Krahe is simply IE, cannot be accepted. Apart from the arguments of Tovar in different publications, especially in Tovar 1979 and 1977, we have to add that in our view this IE knew the opposition masc./fem. -os/-a (-yə), i.e. it [derived from LIE]. We have to add Tovar’s corrections: we shouldn’t think about a unitary language, impossible without political and administrative unity, but about a series of dialects more or less evolved which clearly shared certain isoglosses. (…)

Indeed, all these discoveries, that took place in the 1950s and later, remain valid today, if we place them within the history of [Late Indo-European]. We still have to broaden its base by setting “Old European” (or more exactly its dialects) to the side of some IE languages whose existence we trace back to Europe in a previous date to the formation of the big linguistic groups that we know. They have left their marks not only in hydronymy (and toponymy and onomastics in general), but also in the vocabulary of the later languages, and even in languages that arrived to the historical age but are too badly attested; and, in any case, they aren’t Celtic, nor Illyrian, nor Venetic, nor any other historical dialect, but independent and – we believe – older languages.

The investigation of “Old European” began precisely with the study of some toponymies and personal names spread all over Europe, previously considered “Ligurian” (by H. d’Arbois de Jubainville and C. Jullian) or “Illyrian” (by J. Pokorny), with which those linguistic groups – in turn badly known – were given an excessive extension, based only on some lexical coincidences. Today those hypotheses are abandoned, but the concept of “Old European” is not always enough. It is commonly spoken about “Pre-Celtic” languages, because in territories occupied by Celts toponyms and ethnic names have non-Celtic phonetics: especially with initial p (Parisii, Pictones, Pelendones, Palantia); there are also, in the Latin of those regions, loans of the same kind (so Paramus in Hispania).

In [the Iberian] Peninsula, more specifically, it has been proposed that peoples like the Cantabri, Astures, Pellendones, Carpetani and Vettones were possibly of Pre-Celtic language (cf. Tovar 1949:12). More closed is the discussion around Lusitanian (…)” (see below).

5. Linguists have pointed out ancient language contacts of Italic with Celtic; Celtic with Germanic; Germanic with Balto-Slavic. Southern dialectal isoglosses affect Balto-Slavic and Tocharian, and only partially Germanic and Latin.

NOTE. According to Adrados–Bernabé–Mendoza (1995-1998): “One has to distinguish, in this huge geographical space, different locations. We have already talked about the situation of Germans to the West, and by their side, Celtic, Latin and Italic speakers; Balts and Slavs to the East, the former to the North of the latter. See, among others, works by Bonfante (1983, 1984), about the old location of Baltic and Slavic-speaking communities. Isoglosses of different chronology let us partially reconstruct the language history. Note that the results obtained with phonetics and morphology essentially match those of Porzig, who worked with lexica.”

Celtic too shares isoglosses with Southern dialects, according to Meier-Brügger (2003): “Celtic contacts with eastern Indo-Europe are ancient. Compare the case, among others, of relative pronouns, which in Celtic, contrarily to the Italic *kwo-/*kwi-, is represented by *Hi̯o-, a characteristic that it shares with Greek, Phrygian, Indo-Iranian and Slavic.”

Against the inclusion of Pre-Latin IE within West Indo-European, there are some archaeological and linguistic theories (Szemerényi, Colin Renfrew; v.s. for J.P. Mallory); Polomé (“The Dialectal Position of Germanic within West-Indo-European”, Proc. of the 13th Int. Congress of Linguists, Tokyo, 1983) and Schmidt (1984, reviewed in Adrados–Bernabé–Mendoza, 1995-1998) argued that innovations common to Celtic and Germanic came from a time when Latin peoples had already migrated to the Italian peninsula, i.e. later than those common to Celtic, Latin and Germanic.

On the unity of Proto-Italic and Proto-Latin, Adrados–Bernabé–Mendoza (1995-1998): “dubious is the old unity scheme, no doubt only partial, between Latin and Osco-Umbrian, which has been rejected by famous Italian linguists, relating every coincidence to recent contacts. I am not so sure about that, as the common innovations are big; cf. Beeler 1966, who doesn’t however dispel the doubts. Obviously, according to the decision taken, there are different historical consequences. If one thinks that both linguistic groups come from the North, through the Alps (cf. Tovar 1950), from the end of the 2nd millennium, a previous unity can be proposed. But authors like Devoto (1962) or Szemerényi (1962) made Latin peoples come from the East, through Apulia.” There has been continued archaeological and (especially) linguistic support in mainstream IE studies for the derivation of Italic (and Latin) from a West Indo-European core, even after criticism of the old Italo-Celtic concept (C. Watkins Italo-Celtic Revisited, 1966, K.H. Schmidt Latein und Keltisch, 1986); see Porzig (Die Gliederung des indogermanischen Sprachgebiets, 1954), Dressler (“Über die Rekonstruktion der idg. Syntax”, KZ 85, 1971), Tovar (1970), Pisani (Indogermanisch und Europa, 1974), Bonfante (“Il celtibèrico, il cèltico e l’indoeuropeo” in RALinc., ser. VIII 1983; “La protopatria degli Slavi”, in Accademia Polacca delle Scienze, Conferenze 89, 1984), Adrados–Bernabé–Mendoza (1995-1998), etc.; on the archaeological question, see Ghirshman (L’Iran et la migration des indo-aryens et des iraniens, 1977), Thomas (“Archaeological Evidence for the Migrations of the Indo-Europeans”, in Polomé (ed.) 1984), Gimbutas (“Primary and Secondary homeland of the Indo-Europeans”, JIES 13, 1985), etc.

On Meillet’s Italo-Celtic, it appears today that the idea is rejected by a majority of scholars, on the grounds that the shared isoglosses do not constitute a community (cf. e.g. Watkins 1966). However, some common choices do reflect that both linguistic domains could in ancient times penetrate each other (Adrados–Bernabé–Mendoza 1995-1998). Recent publications (Gamkrelidze–Ivanov 1994-1995, Kortlandt 2007, etc.), as well as quantitative studies (see above §1.4.3) classify Italic and Celtic within the same branch, although sometimes as a West group including a late Italo-Germanic or Celto-Germanic subgroup.

NOTE 3. Today, the contacts between Balto-Slavic and Indo-Iranian are usually classified as from a late ‘areal’ contact or Sprachbund, or some sort of late North-West–East continuum (so e.g. in Kortlandt 1990, Mallory 1989, Adrados–Bernabé–Mendoza 1995-1998, West 2007, Anthony 2007); e.g. Mallory–Adams (2007): “The Indo-Iranian and Balto-Slavic languages share both satemisation and the ruki-rule and may have developed as some form of west–east (or northwest–south-east) continuum with certain features running through them” (see below Indo-Iranian).

6. The Germanic homeland is usually traced back to the Nordic Late Neolithic in Scandinavia, still in contact with the Italo-Celtic homeland in Central Europe (Proto-Únětice?); the Late Corded Ware groups to the east probably represent the Balto-Slavic homeland (cf. Beekes 1995, Adrados–Bernabé–Mendoza 1995-1998, etc.).

Text Box: Eurasian cultures in 2000 BC, after the disintegration of IEDs. Haywood et al., The Cassell Atlas of World History (1997). (2011, modified from Briangotts 2009)


A. Germanic

Text Box: Germanic languages as first language of the majority (dark colour) or official language of the country (light colour). (2011, modified from Shardz-Hayden 2010)

The largest Germanic languages are English and German, with ca. 340 and some 120 million native speakers, respectively. Other significant languages include Low Germanic dialects (like Dutch) and the Scandinavian languages.

Their common ancestor is Proto-Germanic, probably still spoken in the mid-1st millennium B.C. in Iron Age Northern Europe, since its separation from an earlier Pre-Proto-Germanic, a Northern Indo-European dialect dated ca. 1500-500 BC. The succession of archaeological horizons suggests that before their language differentiated into the individual Germanic branches the Proto-Germanic speakers lived in southern Scandinavia and along the coast from the Netherlands in the west to the Vistula in the east around 750 BC. Early Germanic dialects enter history with the Germanic peoples who settled in northern Europe along the borders of the Roman Empire from the second century AD.

NOTE. A few surviving inscriptions in a runic script from Scandinavia dated to ca. 200 are thought to represent a later stage of Proto-Norse; according to Bernard Comrie, it represents a Late Common Germanic which followed the “Proto-Germanic” stage. Several historical linguists have pointed towards the apparent material and social continuity connecting the cultures of the Nordic Bronze Age (1800-500 BC) and the Pre-Roman Iron Age (500 BC - AD 1) as having implications in regard to the stability and later development of the Germanic language group. Lehmann (1977) writes: “Possibly the most important conclusion based on archeological evidence with relevance for linguistic purposes is the assumption of ‘one huge cultural area’ which was undisturbed for approximately a thousand years, roughly from 1500-500 BC. Such a conclusion in a stable culture permits inferences concerning linguistic stability, which are important for an interpretation of the Germanic linguistic data.”

The earliest evidence of the Germanic branch is recorded from names in the first century by Tacitus, and in a single instance in the second century BC, on the Negau helmet. From roughly the second century AD, some speakers of early Germanic dialects developed the Elder Futhark. Early runic inscriptions are also largely limited to personal names, and difficult to interpret. The Gothic language was written in the Gothic alphabet developed by Bishop Ulfilas for his translation of the Bible in the fourth century. Later, Christian priests and monks who spoke and read Latin in addition to their native Germanic tongue began writing the Germanic languages with slightly modified Latin letters, but in Scandinavia, runic alphabets remained in common use throughout the Viking Age.

Text Box: Negau helmet. It reads (from right to left): /// harikastiteiva\\\ip, “Harigast the priest”. (PD, n.d.)

The so-called Grimm’s law is a set of statements describing the inherited North-West Indo-European stops as they developed in Pre-Proto-Germanic. As it is presently formulated, Grimm’s Law consists of three parts, which must be thought of as three consecutive phases in the sense of a chain shift:

·Voiceless stops change to PGmc. voiceless fricatives: p→*f, t→*θ, k→*x, kw→*xw.

·Voiced stops become PGmc. voiceless stops: b→*p, d→*t, g→*k, gw→*kw.

·Voiced aspirated stops lose their aspiration and change into plain voiced stops: bh→*b, dh→*d, gh→*g, gwh→*gw,*g,*w.
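The three parts listed above can be sketched as a single lookup table applied to every stop at once. This is a minimal illustration only, under stated assumptions: the function name `grimm_shift` and the plain-string transcription are hypothetical, gwh is mapped to *gw alone (the text also allows *g and *w), every written sequence such as kw is read as one labiovelar, and no exceptions or accent effects are modelled.

```python
# Grimm's Law as one simultaneous substitution over NWIE stops.
# Assumption: words are plain strings in the transcription used in the text.
GRIMM = {
    # voiceless stops -> voiceless fricatives
    "p": "f", "t": "θ", "k": "x", "kw": "xw",
    # voiced stops -> voiceless stops
    "b": "p", "d": "t", "g": "k", "gw": "kw",
    # voiced aspirates -> plain voiced stops (gwh -> gw is a simplification)
    "bh": "b", "dh": "d", "gh": "g", "gwh": "gw",
}

# Try the longest symbols first, so "gwh" is not misread as "g" + "w" + "h".
_SYMBOLS = sorted(GRIMM, key=len, reverse=True)


def grimm_shift(word: str) -> str:
    """Apply the table once to each stop; shifted output is never re-shifted."""
    out = []
    i = 0
    while i < len(word):
        for sym in _SYMBOLS:
            if word.startswith(sym, i):
                out.append(GRIMM[sym])
                i += len(sym)
                break
        else:
            # Not a stop covered by the law: copy the character unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)
```

Because each input stop is replaced exactly once, the chain-shift character of the law is preserved: a b that becomes p is not then turned into f. The sketch changes consonants only, so vowels come out exactly as they went in.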

Verner’s Law addresses a category of exceptions, stating that unvoiced fricatives are voiced when preceded by an unaccented syllable: PGmc. *s→*z, *f→*v, *θ→*ð; as, NWIE bhrātēr → PGmc. *brōþēr ‘brother’, but NWIE mātḗr → PGmc. *mōðēr ‘mother’.
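The conditioning by accent can be illustrated with a small sketch. It assumes a hypothetical representation in which a word is given as a list of syllables plus the index of the accented syllable, and it only treats syllable-initial fricatives; the accent positions used for ‘brother’ (first syllable) and ‘mother’ (second syllable) are assumptions made for the illustration. The fricative mapping is the one stated above.

```python
# Verner's Law on pre-syllabified forms (input already shifted by Grimm's Law).
# Assumption: only a fricative at the start of a non-initial syllable is tested.
VERNER = {"s": "z", "f": "v", "θ": "ð"}


def verner(syllables: list[str], accent: int) -> str:
    """Voice a syllable-initial fricative when the preceding syllable is unaccented."""
    out = [syllables[0]]  # a word-initial fricative has no preceding syllable
    for i in range(1, len(syllables)):
        syl = syllables[i]
        preceding_is_accented = (accent == i - 1)
        if not preceding_is_accented and syl[0] in VERNER:
            syl = VERNER[syl[0]] + syl[1:]
        out.append(syl)
    return "".join(out)
```

With the accent on the first syllable the fricative stays voiceless (`verner(["brā", "θēr"], 0)`), while with the accent on the second syllable it is voiced (`verner(["mā", "θēr"], 1)`), which mirrors the brother/mother contrast. Vowel changes are left out of this sketch.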

NOTE 1. W. P. Lehmann (1961) considered that Jacob Grimm’s “First Germanic Sound Shift”, or Grimm’s Law and Verner's Law, which pertained mainly to consonants and were considered for a good many decades to have generated Proto-Germanic, were Pre-Proto-Germanic, and that the “upper boundary” was the fixing of the accent, or stress, on the root syllable of a word, typically the first. Proto-Indo-European had featured a moveable pitch accent comprising “an alternation of high and low tones” as well as stress of position determined by a set of rules based on the lengths of the word's syllables.

The fixation of the stress led to sound changes in unstressed syllables. For Lehmann, the “lower boundary” was the dropping of final -a or -e in unstressed syllables; for example, PIE woid-á > Goth. wait, “knows” (the > and < signs in linguistics indicate a genetic descent). Antonsen (1965) agreed with Lehmann about the upper boundary but later found runic evidence that the -a was not dropped: Gmc. ek wakraz ... wraita ‘I wakraz ... wrote (this)’. He says: “We must therefore search for a new lower boundary for Proto-Germanic”.

Text Box: Nordic Bronze Age culture (ca. 1200 BC), Harper Atlas of World History (1993, PD)

NOTE 2. Sometimes the shift produced allophones (consonants that were pronounced differently) depending on the context of the original. With regard to original PIE k and kw, Trask (2000) says that the resulting PGmc. *x and *xw were reduced to *h and *hw in word-initial position. Consonants were lengthened or prolonged under some circumstances, appearing in some daughter languages as geminated graphemes. Kraehenmann (2003) states that Proto-Germanic already had long consonants, but they contrasted with short ones only word-medially. Moreover, they were not very frequent and occurred only intervocalically, almost exclusively after short vowels. The phonemes *b, *d, *g and *gw, says Ringe (2006), were stops in some environments and fricatives in others.

Effects of the aforementioned sound laws include the following examples:

·  p→f: pods, foot, cf. PGmc. fōts; cf. Goth. fōtus, O.N. fōtr, O.E. fōt, O.H.G. fuoz.

·  t→þ,ð: tritjós, third, cf. PGmc. þriðjaz; cf. Goth. þridja, O.N. þriðe, O.E. þridda, O.H.G. dritto.

·  k→x,h: kwon, dog, cf. PGmc. xunðaz; cf. Goth. hunds, O.N. hundr, O.E. hund, O.H.G. hunt.

·  kw→xw,hw: kwos, what, who, cf. Gmc. hwoz; cf. Goth. hwas, O.N. hverr, O.S. hwe, O.E. hwā, O.Fris. hwa, O.H.G. hwër.

·  b→p: werbō, throw, cf. Gmc. werpō; cf. Goth. wairpan, O.S. werpan, O.N. verpa, O.E. weorpan, M.L.G., Du. werpen, Ger. werfen.

·  d→t: dekm̥, ten, cf. Gmc. tehun; cf. Goth. taihun, O.S. tehan, O.N. tiu, O.Fris. tian, O.Du. ten, O.H.G. zehan.

·  g→k: gelu, ice, cf. Gmc. kaldaz; cf. Goth. kalds, O.N. kaldr, O.E. cald, O.H.G. kalt.

·  gw→kw: gwīwós, alive, cf. Gmc. kwi(k)waz; cf. Goth. kwius, O.N. kvikr, O.E. cwic, O.H.G. quec.

·  bh→b: bhrātēr, brother, cf. Gmc. brōþēr; cf. Goth. bróþar, O.N. brōþir, O.E. brōþor, O.H.G. bruoder.

·  dh→d: dhworis, door, cf. Gmc. duriz; cf. Goth. daúr, O.N. dyrr, O.E. duru, O.H.G. turi.

·  gh→g: ghansis, goose, cf. Gmc. gansiz; cf. Goth. gansus, O.N. gās, O.E. gōs, O.H.G. gans.

·  gwh→gw/g/w: gwhormos, warm, cf. Gmc. warmaz; cf. O.N. varmr, O.E. wearm, O.H.G. warm. For gwhondos, fight, cf. Gmc. gandaz; cf. Goth. gunþs, O.N. gandr, O.E. gūþ, O.H.G. gund.

Text Box: Putzger, Historischer Atlas (1954) (Dbachmann 2005)

A known exception is that the voiceless stops did not become fricatives if they were preceded by PIE s, i.e. sp, st, sk, skw. Similarly, PIE t did not become a fricative if it was preceded by p, k, or kw. This is sometimes treated separately under the Germanic spirant law.
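The two blocking contexts just described can be expressed as a check on the preceding segment. The following is a rough sketch, assuming a hypothetical pre-segmented input (a list of phoneme strings, so that kw is one item) and covering only the voiceless series.

```python
# Shift of the voiceless stops with the two blocking contexts from the text.
# Assumption: the word is already split into segments, e.g. ["s", "p", "e", "k"].
VOICELESS = {"p": "f", "t": "θ", "k": "x", "kw": "xw"}


def shift_voiceless(segments: list[str]) -> list[str]:
    """Turn voiceless stops into fricatives unless a blocking context applies."""
    out = []
    prev = None  # the preceding segment in its original, unshifted form
    for seg in segments:
        after_s = prev == "s" and seg in VOICELESS
        t_after_stop = seg == "t" and prev in {"p", "k", "kw"}
        if seg in VOICELESS and not (after_s or t_after_stop):
            out.append(VOICELESS[seg])
        else:
            out.append(seg)
        prev = seg
    return out
```

For the segments s-p-e-k the p is protected by the preceding s while the final k shifts, giving s-p-e-x, which agrees with the form spexét in the Pre-Proto-Germanic version of the fable. In a sequence k-t, the k shifts to x and the t is left as a stop.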

NWIE vowels: a,o→*a; ā,ō→*ō. PGmc. had then short *i, *u, *e, *a, and long *ī, *ū, *ē, *ō, *ǣ?

NOTE 1. Similar mergers happened in the Slavic languages, but in the opposite direction. At the time of the merger, the vowels probably were [ɒ] and [ɒ:] before their timbres differentiated into maybe [ɑ] and [ɔ:].

NOTE 2. PGmc. *ǣ and *ē are also transcribed as *ē1 and *ē2; *ē2 is uncertain as a phoneme, and only reconstructed from a small number of words; it is posited by the comparative method because whereas all probable instances of inherited NWIE ē (PGmc. *ē1) are distributed in Gothic as ē and the other Germanic languages as ā, all the Germanic languages agree on some occasions of ē (e.g. PGmc. *hē2r → Goth., O.E., O.N. hēr, “here”). Krahe treats *ē2 (secondary *ē) as identical with *ī. It probably continues NWIE ei or ēi, and it may have been in the process of transition from a diphthong to a long simple vowel in the Proto-Germanic period. Gothic makes no orthographic and therefore presumably no phonetic distinction between *ē1 and *ē2. The existence of two Proto-Germanic [e:]-like phonemes is supported by the existence of two e-like Elder Futhark runes, Ehwaz and Eihwaz.

B. Latin

The Romance or Romanic (also Neolatin) languages comprise all languages that descended from Latin, the language of the Roman Empire.

[Figure: Regions where Romance languages are spoken as official languages (dark), by sizeable minorities or official status (lighter) (2011 modified from PD)]

Romance languages have some 800 million native speakers worldwide, mainly in the Americas, Europe, and Africa, as well as in many smaller regions scattered through the world. The largest languages are Spanish and Portuguese, with about 400 and 200 million mother tongue speakers respectively, most of them outside Europe. Within Europe, French (with 80 million) and Italian (70 million) are the largest ones. All Romance languages descend from Vulgar Latin, the language of soldiers, settlers, and slaves of the Roman Empire, which was substantially different from the Classical Latin of the Roman literati. Between 200 BC and AD 100, the expansion of the Empire, coupled with administrative and educational policies of Rome, made Vulgar Latin the dominant native language over a wide area spanning from the Iberian Peninsula to the western coast of the Black Sea. During the Empire’s decline and after its collapse and fragmentation in the fifth century, Vulgar Latin evolved independently within each local area, and eventually diverged into dozens of distinct languages. The overseas empires established by Spain, Portugal and France after the fifteenth century then spread Romance to the other continents, to such an extent that about two thirds of all Romance speakers are now outside Europe.

Latin is usually classified, along with Faliscan, as an Italic dialect. The Italic speakers were not native to Italy, but migrated into the Italian Peninsula in the course of the second millennium BC, and were apparently related to the Celtic tribes that roamed over a large part of Western Europe at the time.

[Figure: Based on The Harper Atlas of World History 1987 (Zymos 2007)]

Archaeologically, the Apennine culture of inhumations enters the Italian Peninsula from ca. 1350 BC, east to west; the Iron Age reaches Italy from ca. 1100 BC, with the Villanovan culture (with the practice of cremation), intruding north to south. The later Osco-Umbrian, Veneti and Lepontii peoples, as well as the Latino-Faliscans, have been associated with this culture. The first settlement on the Palatine hill dates to ca. 750 BC, settlements on the Quirinal to 720 BC, both related to the founding of Rome. As Rome extended its political dominion over Italy, Latin became dominant over the other Italic languages, which ceased to be spoken perhaps sometime in the first century AD.

Italic is usually divided into:

·Sabellic, including:

o Oscan, spoken in south-central Italy.

o Umbrian group:

§ Umbrian.

§ Volscian.

§ Aequian.

§ Marsian.

§ South Picene.

·Latino-Faliscan, including:

o Faliscan, spoken in the area around Falerii Veteres, north of the city of Rome.

o Latin, spoken in west-central Italy. The Roman conquests eventually spread it throughout the Roman Empire and beyond.

[Figure: Ethnic groups within the Italian peninsula, ca. 600-500 BC. In central Italy, Italic languages. (2011, modified from Ewan ar Born)]

The ancient Venetic language, as revealed by its inscriptions (including complete sentences), was also closely related to the Italic languages and is sometimes even classified as Italic. However, since it also shares similarities with other Western Indo-European branches (particularly Germanic), some linguists prefer to consider it an independent IE language.

Phonetic changes from NWIE to Latin include: bh→f/b, dh→f/b, gh→h/f, gw→w/g, kw→kw/k, p→p/kw.

The Italic languages are first attested in writing from Umbrian and Faliscan inscriptions dating to the seventh century BC. The alphabets used are based on the Old Italic alphabet, which is itself based on the Greek alphabet. The Italic languages themselves show minor influence from the Etruscan and somewhat more from the Ancient Greek languages.

Oscan had much in common with Latin, though there are also some differences, and many common word-groups in Latin were represented by different forms; as, Lat. uolo, uelle, uolui, and other such forms from PIE wel-, will, were represented by words derived from gher-, desire, cf. Osc. herest ‘he wants, desires’ as opposed to Lat. uult (id.). Lat. locus ‘place’ was absent and represented by Osc. slaagid.

[Figure: The Duenos (O.Lat. duenus, Lat. bonus) Inscription in Old Latin, sixth century BC. Illustration from Hermes (1881, PD)]

In phonology, Oscan also shows a different evolution, as NWIE kw → Osc. p instead of Lat. kw (cf. Osc. pis, Lat. quis); NWIE gw → Osc. b instead of Latin w; NWIE medial bh, dh → Osc. f, in contrast to Lat. b or d (cf. Osc. mefiai, Lat. mediae); etc.

NOTE. A specimen of Faliscan appears written round the edge of a picture on a patera: foied vino pipafo, cra carefo, which in Old Latin would have been hodie vinom bibabo, cras carebo, translated as ‘today I will drink wine; tomorrow I won't have any’ (R. S. Conway, Italic Dialects). Among other distinctive features, it shows the retention of medial f which in Latin became b, and evolution of NWIE gh→f (fo-, contrast Lat. ho-).

Hence the reconstructed changes of North-West Indo-European into Proto-Italic:

·Voiced labiovelars unround or lenite: gw→*g/*w, gwh→*gh.

·Voiced aspirates become first unvoiced, then fricativise: bh→*ph→*ɸ→*f; dh→*th→*θ; gh→*kh→*x.

NOTE. About intervocalic gh → Ita. *x, linguists (see Joseph & Wallace 1991) generally propose that it evolves as Faliscan g or k, while in Latin it becomes glottal h, without a change of manner of articulation. Picard (1993) rejects that proposal citing abstract phonetic principles, which Chela-Flores (1999) argues citing examples of Spanish phonology.

·   NWIE s → Ita. *θ before r (cf. Ita. kereθrom, Lat. cerebrum); unchanged elsewhere.

Up to 8 cases are found; apart from the 6 cases of Classical Latin (i.e. N-V-A-G-D-Ab), there was a locative (cf. Lat. proxumae viciniae, domī, carthagini; Osc. aasai, Lat. ‘in ārā’, etc.) and an instrumental (cf. Columna Rostrata Lat. pugnandod, marid, naualid, etc.; Osc. cadeis amnud, Lat. ‘inimicitiae causae’; Osc. preiuatud, Lat. ‘prīuātō’, etc.). For originally differentiated genitives and datives, compare genitive (Lapis Satricanus:) Popliosio Valesiosio (the type in -ī is also very old, Segomaros -i), and dative (Praeneste Fibula:) numasioi, (Lucius Cornelius Scipio Epitaph:) quoiei.

C. Celtic

The Celtic languages are the languages descended from Proto-Celtic, or Common Celtic.

[Figure: Diachronic distribution of Celtic-speaking peoples: maximal expansion (ca. 200 BC) and modern Celtic-speaking territories. (2011, modified from Dbachmann 2010)]

During the first millennium BC, especially between 400 and 100 BC, they were spoken across Europe, from the southwest of the Iberian Peninsula and the North Sea, up the Rhine and down the Danube to the Black Sea and the Upper Balkan Peninsula, and into Asia Minor (Galatia). Today, Celtic languages are limited to a few enclaves in the British Isles and on the peninsula of Brittany in France.

The distinction of Celtic into different sub-families probably occurred about 1000 BC. The early Celts are commonly associated with the archaeological Urnfield culture, the La Tène culture, and the Hallstatt culture.

Some scholars distinguish Continental and Insular Celtic, arguing that the differences between the Goidelic and Brythonic languages arose after these split off from the Continental Celtic languages. Other scholars distinguish P-Celtic from Q-Celtic, putting most of the Continental Celtic languages in the former group – except for Celtiberian, which is Q-Celtic.

NOTE. There are two competing schemata of categorisation. One scheme, argued for by Schmidt (1988) among others, links Gaulish with Brythonic in a P-Celtic node, leaving Goidelic as Q-Celtic. The difference between P and Q languages is the treatment of NWIE kw, which became *p in the P-Celtic languages but *k in Goidelic. An example is the Cel. verbal root kwrin- ‘to buy’, which became Welsh pryn-, but O.Ir. cren-.

[Figure: Hallstatt core territory (ca. 800 BC) and its influence (ca. 500 BC); La Tène culture (ca. 450 BC) and its influence (ca. 50 BC). Major Celtic tribes are labelled. (Mod. from Dbachmann 2008)]

The other scheme links Goidelic and Brythonic together as an Insular Celtic branch, while Gaulish and Celtiberian are referred to as Continental Celtic. According to this theory, the ‘P-Celtic’ sound change of kw to p occurred independently or regionally. The proponents of the Insular Celtic hypothesis point to other shared innovations among Insular Celtic languages, including inflected prepositions, VSO word order, and the lenition of intervocalic m to β̃, a nasalised voiced bilabial fricative (an extremely rare sound), etc. There is, however, no assumption that the Continental Celtic languages descend from a common “Proto-Continental Celtic” ancestor. Rather, the Insular/Continental schemata usually consider Celtiberian the first branch to split from Proto-Celtic, and the remaining group would later have split into Gaulish and Insular Celtic.

Known NWIE evolutions into Proto-Celtic include:

·Consonants: p →*ɸ→*h in initial and intervocalic positions. Cel. *ɸs→xs, *ɸt→xt.

[Figure: Gaulish inscription ϹΕΓΟΜΑΡΟϹ ΟΥΙΛΛΟΝΕΟϹ ΤΟΟΥΤΙΟΥϹ ΝΑΜΑΥϹΑΤΙϹ ΕΙωΡΟΥ ΒΗΛΗϹΑΜΙ ϹΟϹΙΝ ΝΕΜΗΤΟΝ "Segomaros, son of Uillū, citizen (toutious) of Namausos, dedicated this sanctuary to Belesama" (Fabrice Philibert-Caillat 2004)]

NOTE. LIE p was lost in Proto-Celtic, apparently going through the stages ɸ (perhaps in Lus. porcos) and h (perhaps attested by the toponym Hercynia if this is of Celtic origin) before being lost completely word-initially and between vowels. NWIE sp- became Old Irish s and Brythonic f; while Schrijver (1995) argues there was an intermediate stage *sɸ- (in which ɸ remained an independent phoneme until after Proto-Insular Celtic had diverged into Goidelic and Brythonic), McCone (1996) finds it more economical to believe that sp- remained unchanged in PC, that is, the change p to *ɸ did not happen when s preceded.

·Aspirated: dh→d, bh→b, gh→x, gwh→gw; but gw→b.

·Vowels: ō→ā, ū (in final syllable); ē→ī; NWIE u-w → Cel. o-w.

·Diphthongs: āi→ai, ēi→ei, ōi→oi; āu→au, ēu, ōu→ou.

·Resonants: l̥→la, li (before stops); r̥→ar, ri (before stops); m̥→am; n̥→an.

Italo-Celtic refers to the hypothesis that Italic and Celtic dialects are descended from a common ancestor, Proto-Italo-Celtic, at a stage post-dating Late Indo-European. Since both Proto-Celtic and Proto-Italic date to the early Iron Age (say, the centuries on either side of 1000 BC), a probable time frame for the assumed period of language contact would be the late Bronze Age, the early to mid-second millennium BC. Such grouping was proposed by Meillet (1890), and has been recently supported by Kortlandt (2007), among others (see above).

NOTE. One argument for Italo-Celtic was the thematic genitive in -ī (e.g. dominus, domini). Both in Italic (Popliosio Valesiosio, Lapis Satricanus) and in Celtic (Lepontic, Celtiberian -o), however, traces of PIE genitive -osjo have been discovered, so that the spread of the ī-genitive could have occurred in the two groups independently, or by areal diffusion. The community of -ī in Italic and Celtic may then be attributable to late contact, rather than to an original unity. The ī-genitive has been compared to the so-called Cvi formation in Sanskrit, but that too is probably a comparatively late development.

Other arguments include that both Celtic and Italic have collapsed the PIE Aorist and Perfect into a single past tense, and the ā-subjunctive, because both Italic and Celtic have a subjunctive descended from an earlier optative in -ā-. Such an optative is not known from other languages, but the suffix occurs in Balto-Slavic and Tocharian past tense formations, and possibly in Hitt. -ahh-.

D. Slavic

[Figure: World map of countries with a majority of Slavic speakers (dark colour), and a significant minority (light) of more than 10%. (Therexbanner 2010)]

The Slavic or Slavonic languages have speakers in most of Eastern Europe, in much of the Balkans, in parts of Central Europe, and in the northern part of Asia. The largest languages are Russian and Polish, with 165 and some 47 million speakers, respectively. The oldest Slavic literary language was Old Church Slavonic, which later evolved into Church Slavonic.

There is much debate on whether Pre-Slavic branched off directly from a Northern LIE dialect, or whether it passed through a common Proto-Balto-Slavic stage, which would necessarily have split apart before 1000 BC into its two main sub-branches.

The original homeland of the speakers of Proto-Slavic remains controversial too. The most ancient recognisably Slavic hydronyms are to be found in northern and western Ukraine and southern Belarus. It has also been noted that Proto-Slavic seemingly lacked a maritime vocabulary.

[Figure: Based on information and maps from Mallory–Adams (1997). (Slovenski Volk 2009)]

The secession of Proto-Slavic from a common Proto-Balto-Slavic is estimated on archaeological and glottochronological criteria to have occurred between 1500 and 1000 BC (see Baltic, below). Common Slavic is usually reconstructible to around AD 600.

By the seventh century, Common Slavic had broken apart into large dialectal zones. Linguistic differentiation received impetus from the dispersion of the Slavic peoples over a large territory – which in Central Europe exceeded the current extent of Slavic-speaking territories. Written documents of the ninth, tenth and eleventh centuries already show some local linguistic features.

NOTE. For example the Freising monuments show a language which contains some phonetic and lexical elements peculiar to Slovenian dialects (e.g. rhotacism, the word krilatec).

In the second half of the ninth century, the dialect spoken north of Thessaloniki became the basis for the first written Slavic language, created by the brothers Cyril and Methodius who translated portions of the Bible and other church books. The language they recorded is known as Old Church Slavonic. Old Church Slavonic is not identical to Proto-Slavic, having been recorded at least two centuries after the breakup of Proto-Slavic, and it shows features that clearly distinguish it from Proto-Slavic. However, it is still reasonably close, and the mutual intelligibility between Old Church Slavonic and other Slavic dialects of those days was proved by Cyril’s and Methodius’ mission to Great Moravia and Pannonia. There, their early South Slavic dialect used for the translations was clearly understandable to the local population which spoke an early West Slavic dialect.

As part of the preparation for the mission, the Glagolitic alphabet was created in 862 and the most important prayers and liturgical books, including the Aprakos Evangeliar (a Gospel Book lectionary containing only feast-day and Sunday readings), the Psalter, and Acts of the Apostles, were translated. The language and the alphabet were taught at the Great Moravian Academy (O.C.S. Veľkomoravské učilište) and were used for government and religious documents and books. In 885, the use of O.C.S. in Great Moravia was prohibited by the Pope in favour of Latin. Students of the two apostles, who were expelled from Great Moravia in 886, brought the Glagolitic alphabet and the Old Church Slavonic language to the Bulgarian Empire, where it was taught and where the Cyrillic alphabet was developed in the Preslav Literary School.

Vowel changes from Late Indo-European to Proto-Slavic:

·    LIE *ī, *ei → Sla. *i1; LIE *i → *i → Sla. *ь; LIE *u → *u → Sla. *ъ; LIE *ū → Sla. *y.

·    LIE *e → Sla. *e; LIE *ē → Sla. *ě1.

·    LIE *en, *em → Sla. *ę; LIE *an, *on; *am, *om → *an; *am → Sla. *ǫ.

·    LIE *a, *o → *a → Sla. *o; LIE *ā, *ō → Sla. *a; LIE *ai, *oi → *ai → Sla. *ě2. Reduced *ai (*ăi/*ui) → Sla. *i2; LIE *au, *ou → *au → Sla. *u.

[Figure: Page from Codex Zographensis (10th - 11th c. AD) in Old Church Slavonic. (PD)]

NOTE. Apart from these simplified equivalences, other patterns appear (see Kortlandt’s article <http://www.kortlandt.nl/publications/art066e.pdf>, From Proto-Indo-European to Slavic):

o  The vowels *i2, *ě2 developed later than *i1, *ě1. In Late Proto-Slavic there were no differences in pronunciation between *i1 and *i2 as well as between *ě1 and *ě2. They had caused, however, different changes of preceding velars, see below.

o  Late Proto-Slavic yers *ь, *ъ < earlier *i, *u developed also from reduced LIE *e, *o respectively. The reduction was probably a morphologic process rather than phonetic.

o  We can observe similar reduction of *ā into *ū (and finally *y) in some endings, especially in closed syllables.

o  The development of the Sla. *i2 was also a morphologic phenomenon, originating only in some endings.

o  Another source of the Proto-Slavic *y is *ō in Germanic loanwords – the borrowings took place when Proto-Slavic no longer had *ō in native words, as LIE *ō had already changed into *ā.

o  LIE *ə disappeared without traces when in a non-initial syllable.

o  LIE *eu probably developed into *jau in Early Proto-Slavic (or during the Balto-Slavic epoch), and eventually into Proto-Slavic *ju.

o  According to some authors, LIE long diphthongs *ēi, *āi, *ōi, *ēu, *āu, *ōu had twofold development in Early Proto-Slavic, namely they shortened in endings into simple *ei, *ai, *oi, *eu, *au, *ou but they lost their second element elsewhere and changed into *ē, *ā, *ō with further development like above.

[Figure: After Barford (A history of Eastern Europe: crisis and change, 2007). (Slovenski Volk 2009)]

Other vocalic changes from Proto-Slavic include *jo, *jъ, *jy changed into *je, *jь, *ji; *o, *ъ, *y also changed into *e, *ь, *i after *c, *ʒ, *s’ which developed as the result of the 3rd palatalisation; *e, *ě changed into *o, *a after *č, *ǯ, *š, *ž in some contexts or words; a similar change of *ě into *a after *j seems to have occurred in Proto-Slavic, but it may later have been modified by analogy.

On the origin of Proto-Slavic consonants, the following relationships are found:

·      LIE *p → Sla. *p; LIE *b, *bh → Sla. *b.

·      LIE *t → Sla. *t; LIE *d, *dh → Sla. *d.

·      LIE *k, *kw → Sla. *k (palatalised *kj → Sla. *s); LIE *g, *gh, *gw, *gwh → Sla. *g (palatalised *gj, *gjh → Sla. *z).

·      LIE *s → Sla. *s; before a voiced consonant LIE *z → Sla. *z; before a vowel when after *r, *u, *k, *i, probably also after *l → Sla. *x.

·      LIE word-final *m → Sla. *n (<BSl. *n).

·      LIE *m̥ → Sla. *im, *um; LIE *n̥ → Sla. *in, *un; LIE *l̥ → Sla. *il, *ul; LIE *r̥ → Sla. *ir, *ur.

·      LIE *w → Sla. *v (<BSl. *w); LIE *j → Sla. *j.

In some words the Proto-Slavic *x developed from LIE phonemes like *ks, *sk.
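The stop correspondences listed above amount to a set of mergers, which a lookup table makes easy to inspect. The sketch below is illustrative only: the names `SLAVIC_STOP_REFLEX` and `merged_sources` are hypothetical, labiovelars and aspirates are spelled as plain letter sequences, and only the non-palatalised outcomes are covered.

```python
from collections import defaultdict

# LIE stop -> Proto-Slavic reflex (non-palatalised outcomes only; assumed spellings).
SLAVIC_STOP_REFLEX = {
    "p": "p", "b": "b", "bh": "b",
    "t": "t", "d": "d", "dh": "d",
    "k": "k", "kw": "k",
    "g": "g", "gh": "g", "gw": "g", "gwh": "g",
}


def merged_sources(table):
    """Group LIE stops by their shared Proto-Slavic reflex (hypothetical helper)."""
    merged = defaultdict(list)
    for source, reflex in table.items():
        merged[reflex].append(source)
    return dict(merged)


# Four LIE stops fall together in Sla. *g.
print(merged_sources(SLAVIC_STOP_REFLEX)["g"])  # ['g', 'gh', 'gw', 'gwh']
```

Inverting the table in this way shows at a glance that the aspirated and plain voiced series, and the labiovelar and plain velar series, are no longer distinguished.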

E. Baltic

The Baltic languages were spoken in areas extending east and southeast of the Baltic Sea in Northern Europe.

The language group is often divided into two sub-groups: Western Baltic, containing only extinct languages such as Prussian or Galindan, and Eastern Baltic, containing extinct languages as well as the two living languages in the group, Lithuanian and Latvian. While related, Lithuanian and Latvian differ substantially from each other and are not mutually intelligible.

The oldest Baltic linguistic record is the Elbinger lexicon of the beginning of the fourteenth century. It contains 802 Old Prussian equivalents of Old Middle German words. The oldest Baltic text is Old Prussian as well; it comes from the middle of the fourteenth century and includes only eleven words. The first Old Lithuanian and Old Latvian texts come from the sixteenth century and appear already in book form, and were translations of a catechism and the Lord’s Prayer.

[Figure: Adapted from Gimbutas (The Balts, 1963). (Map Master 2007)]

Baltic and Slavic share so many similarities that many linguists, following the lead of such notable Indo-Europeanists as August Schleicher and Oswald Szemerényi, take these to indicate that the two groups separated from a common ancestor, the Proto-Balto-Slavic language, dated ca. 1500-500 BC, depending on the different guesstimates.

NOTE 1. About Balto-Slavic guesstimates, “Classical glottochronology” conducted by Czech Slavist M. Čejka in 1974 dates the Balto-Slavic split to 910±340 BC, Sergei Starostin in 1994 dates it to 1210 BC, and “recalibrated glottochronology” conducted by Novotná & Blažek dates it to 1400-1340 BC. This agrees well with the Trzciniec-Komarov culture, localised from Silesia to Central Ukraine and dated to the period 1500-1200 BC.

NOTE 2. Until Meillet’s Dialectes indo-européens of 1908, Balto-Slavic unity was undisputed among linguists – as he notes at the beginning of the Le Balto-Slave chapter, “L’unité linguistique balto-slave est l’une de celles que personne ne conteste”. Meillet’s critique of Balto-Slavic confined itself to the seven characteristics listed by Karl Brugmann in 1903, attempting to show that no single one of these is sufficient to prove genetic unity. Szemerényi in his 1957 re-examination of Meillet’s results concludes that the Balts and Slavs did, in fact, share a “period of common language and life”, and were probably separated due to the incursion of Germanic tribes along the Vistula and the Dnieper roughly at the beginning of the Common Era.

A new theory was proposed in the 1960s by V. Ivanov and V. Toporov: that the Balto-Slavic proto-language split from the start into West Baltic, East Baltic and Proto-Slavic. In their framework, Proto-Slavic is a peripheral and innovative Balto-Slavic dialect which suddenly expanded, due to a conjunction of historical circumstances. Onomastic evidence shows that Baltic languages were once spoken in much wider territory than the one they cover today, and were later replaced by Slavic.

[Figure: Linguistic area of Balto-Slavic, Ramat (1993). (Slovenski Volk 2009)]

NOTE. The most important of these common Balto-Slavic isoglosses are:

o Winter’s law: lengthening of a short vowel before a voiced plosive, usually in a closed syllable.

o Identical reflexes of LIE syllabic resonants, usually developing i and u before them. Kuryłowicz thought that *uR reflexes arose after LIE velars; also notable is the older opinion of J. Endzelīns and R. Trautmann, according to whom *uR reflexes are the result of the zero-grade of morphemes that had LIE *o → PBSl. *a in the normal grade. Matasović (2008) proposes the following internal rules after LIE *r̥ → BSl. *ər: 1) *ə→*i in a final syllable; 2) *ə→*u after velars and before nasals; 3) *ə→*i otherwise.

o Hirt’s law: retraction of LIE accent to the preceding syllable closed by a laryngeal.

o Rise of the Balto-Slavic acute before LIE laryngeals in a closed syllable.

o Replacement of LIE genitive singular of thematic nouns with ablative.

o Formation of past tense in *-ē (cf. Lith. pret. dãvė, “he gave”, O.C.S. imperfect bě, “he was”).

o Generalisation of the LIE neuter to- stem to the nominative singular of masculine and feminine demonstratives instead of LIE so- pronoun, so, sā, tod → BSl. *tos, *tā, *tod.

o Formation of definite adjectives with a construction of adjective and relative pronoun; cf. Lith. geràsis, “the good”, vs. gẽras, “good”; O.C.S. dobrъjь, “the good”, vs. dobrъ, “good”.

Common Balto-Slavic innovations include several other prominent, but non-exclusive isoglosses, such as the satemisation, Ruki, the change of LIE *o → BSl. *a (shared with Germanic, Indo-Iranian and Anatolian) and the loss of labialisation in LIE labiovelars (shared with Indo-Iranian, Armenian and Tocharian). Notable among Balto-Slavic archaisms is the retention of traces of an older LIE pitch accent.

‘Ruki’ is the term for a sound law which is followed especially in BSl. and Aryan dialects. The name of the term comes from the sounds which cause the phonetic change, i.e. LIE *s → *š / r, u, k, i (it associates with a Slavic word which means ‘hands’ or ‘arms’). A sibilant *s is retracted to *ʃ after *i, *u, *r, and after velars (i.e. *k which may have developed from earlier *k, *g, *gh). Due to the character of the retraction, it was probably an apical sibilant (as in Spanish), rather than the dorsal of English. The first phase (*s → *š) seems to be universal, the later retroflexion (in Sanskrit and probably in Proto-Slavic as well) is due to levelling of the sibilant system, and so is the third phase - the retraction to velar *x in Slavic and also in some Middle Indian languages, with parallels in e.g. Spanish. This rule was first formulated for IE by Holger Pedersen.
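The first phase of the Ruki law (*s → *š after r, u, k, i) is a simple context-sensitive rewrite and can be sketched as follows. The function name `apply_ruki` and the pre-segmented input are assumptions made for the illustration; only the four triggers named in the text are checked, so long vowels, diphthongs and the later retroflex and velar phases are not modelled. The test sequences are schematic, not reconstructed words.

```python
# The four triggering sounds that give the law its name.
RUKI_TRIGGERS = {"r", "u", "k", "i"}


def apply_ruki(segments):
    """Retract s to š when the preceding segment is r, u, k or i (hypothetical helper)."""
    out = []
    for i, seg in enumerate(segments):
        if seg == "s" and i > 0 and segments[i - 1] in RUKI_TRIGGERS:
            out.append("š")
        else:
            out.append(seg)
    return out


print(apply_ruki(["i", "s", "a"]))  # ['i', 'š', 'a']
print(apply_ruki(["a", "s", "a"]))  # ['a', 's', 'a']
```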

Baltic and Slavic show a remarkable amount of correspondence in vocabulary too; there are at least 100 words exclusive to BSl., either being a common innovation or sharing the same semantic development from a PIE root; as, BSl. *lēipā, “tilia” → Lith. líepa, O.Prus. līpa, Ltv. lipa; Sla. *lipa; BSl. *rankā, “hand” → Lith. rankà, O.Prus. rānkan, Ltv. rùoka; Sla. *rǭka (cf. O.C.S. rǫka); BSl. *galwā́, “head” → Lith. galvà, O.Prus. galwo, Ltv. galva; Sla. *golvà (cf. O.C.S. glava).


F. Fragmentary Dialects


Messapian (also known as Messapic) is an extinct language of south-eastern Italy, once spoken in the regions of Apulia and Calabria. It was spoken by the three Iapygian tribes of the region: the Messapians, the Daunii and the Peucetii. The language, a centum dialect, has been preserved in about 260 inscriptions dating from the sixth to the first century BC. It became extinct after the Roman Empire conquered the region and assimilated the inhabitants.

Some have proposed that Messapian was an Illyrian language. The Illyrian languages were spoken mainly on the other side of the Adriatic Sea. The link between Messapian and Illyrian is based mostly on personal names found on tomb inscriptions and on classical references, since hardly any traces of the Illyrian language are left.

NOTE. Some phonetic characteristics of the language may be regarded as quite certain:

o  PIE short *o→a, as in the last syllable of the genitive kalatoras.

o  PIE final *m→n, as in aran.

o  PIE *nj→nn, as in the Messapian praenomen Dazohonnes vs. the Illyrian praenomen Dazonius; the Messapian genitive Dazohonnihi vs. Illyrian genitive Dasonii, etc.

o  PIE *tj→tth, as in the Messapian praenomen Dazetthes vs. Illyrian Dazetius; the Messapian genitive Dazetthihi vs. the Illyrian genitive Dazetii; from a Dazet- stem common in Illyrian and Messapian.

o  PIE *sj→ss, as in Messapian Vallasso for Vallasio, a derivative from the shorter name Valla.

o  The loss of final *-d, as in tepise, and probably of final *-t, as in -des, perhaps meaning ‘set’, from PIE *dhe- ‘set, put’.

o  The change of voiced aspirates in Proto-Indo-European to plain voiced consonants: PIE *dh→d, as in Messapian anda (< PIE *en-dha- < PIE *en- ‘in’, compare Gk. entha); and PIE *bh→b, as in Messapian beran (< PIE *bher- ‘to bear’).

o  PIE *au→ā before (at least some) consonants: Bāsta, from Bausta.

o  The form penkaheh – which Torp very probably identifies with the Oscan stem pompaio – a derivative of the Proto-Indo-European numeral *penkwe ‘five’.

o  If this last identification be correct it would show that in Messapian (just as in Venetic and Ligurian) the original labiovelars (*kw, *gw, *gwh) were retained as gutturals and not converted into labials. The change of o to a is interesting, being associated with the northern branches of Indo-European such as Gothic, Albanian and Lithuanian, and not appearing in any other southern dialect hitherto known. The Greek Aphrodite appears in the form Aprodita (Dat. Sg., fem.).

o  The use of double consonants which has been already pointed out in the Messapian inscriptions has been very acutely connected by Deecke with the tradition that the same practice was introduced at Rome by the poet Ennius who came from the Messapian town Rudiae (Festus, p. 293 M).


Venetic was spoken in the Veneto region of Italy, between the Po River delta and the southern fringe of the Alps. It was a centum language.

The language is attested by over 300 short inscriptions dating between the sixth century BC and first century AD. Its speakers are identified with the ancient people called Veneti by the Romans and Enetoi by the Greeks. The inscriptions use a variety of the Northern Italic alphabet, similar to the Old Italic alphabet. It became extinct around the first century when the local inhabitants were assimilated into the Roman sphere.

NOTE. The exact relationship of Venetic to other Indo-European languages is still being investigated, but the majority of scholars agree that Venetic, aside from Liburnian, was closest to the Italic languages. Venetic may also have been related to the Illyrian languages, though the theory that Illyrian and Venetic were closely related is debated by current scholarship.

Interesting parallels with Germanic have also been noted, especially in pronominal forms:

Ven. ego ‘I’, acc. mego ‘me’; Goth. ik, acc. mik; but cf. Lat. ego, acc. me.

Ven. sselboisselboi ‘to oneself’; O.H.G. selb selbo; but cf. Lat. sibi ipsi.

Venetic had about six or even seven noun cases and four conjugations (similar to Latin). About 60 words are known, but some were borrowed from Latin (liber.tos. < libertus) or Etruscan. Many of them show a clear Indo-European origin, such as Ven. vhraterei (< PIE *bhreh2terei) ‘to the brother’.

In Venetic, PIE stops developed as *bh→f, *dh→f, *gh→h in word-initial position (as in Latin and Osco-Umbrian), but as *bh→b, *dh→d, *gh→g in word-internal intervocalic position, as in Latin. For Venetic, at least the developments of *bh and *dh are clearly attested. Faliscan and Osco-Umbrian preserve internal *bh→f, *dh→f, *gh→h.

There are also indications of the developments of PIE initial *gw→w-, PIE *kw→kv and PIE initial *gwh→f in Venetic, all of which are parallel to Latin, as well as the regressive assimilation of the PIE sequence *p...kw... → *kw...kw... (e.g. *penkwe → *kwenkwe, “five”, *perkwu → *kwerkwu, “oak”), a feature also found in Italic and Celtic (Lejeune 1974).
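The regressive assimilation *p...kw... → *kw...kw... can be sketched as a long-distance rewrite on a segmented form. The helper name `assimilate_p_kw` and the segmentation are assumptions made for the example, and the sketch places no limit on the distance between the two sounds, which the text does not specify.

```python
def assimilate_p_kw(segments):
    """Turn p into kw when a kw follows later in the word (hypothetical helper)."""
    out = list(segments)
    for i, seg in enumerate(out):
        # Regressive: the later kw conditions the earlier p.
        if seg == "p" and "kw" in out[i + 1:]:
            out[i] = "kw"
    return out


print("".join(assimilate_p_kw(["p", "e", "n", "kw", "e"])))  # kwenkwe
print("".join(assimilate_p_kw(["p", "e", "r", "kw", "u"])))  # kwerkwu
```

A p with no following kw is left untouched, so a form such as p-a-t-e-r passes through unchanged.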


The Ligurian language was spoken in pre-Roman times and into the Roman era by an ancient people of north-western Italy and south-eastern France known as the Ligures. Very little is known about this language (mainly place names and personal names remain), which is generally believed to have been Indo-European; it appears to have borrowed significantly from other IE languages, primarily Celtic (Gaulish) and Latin.

Strabo states “As for the Alps... Many tribes (éthnê) occupy these mountains, all Celtic (Keltikà) except the Ligurians; but while these Ligurians belong to a different people (hetero-ethneis), still they are similar to the Celts in their modes of life (bíois).”


The Liburnian language is an extinct language spoken by the ancient Liburnians in the region of Liburnia (south of the Istrian peninsula) in classical times. It is usually classified as a centum language. It appears to have been on the same Indo-European branch as the Venetic language; indeed, the Liburnian tongue may well have been a Venetic dialect.

NOTE. No writings in Liburnian are known, though. The grouping of Liburnian with Venetic is based on the Liburnian onomastics. In particular, Liburnian anthroponyms show strong Venetic affinities, with many common or similar names and a number of common roots, such as Vols-, Volt-, and Host- (<PIE *ghos-ti- ‘stranger, guest, host’). Liburnian and Venetic names also share suffixes in common, such as -icus and -ocus.

These features set Liburnian and Venetic apart from the Illyrian onomastic province, though this does not preclude the possibility that Venetic-Liburnian and Illyrian may have been closely related, belonging to the same Indo-European branch. In fact, a number of linguists argue that this is the case, based on similar phonetic features and names in common between Venetic-Liburnian on the one hand and Illyrian on the other.

Liburnia was conquered by the Romans in 35 BC, and its language was eventually replaced by Latin, undergoing language death probably very early in the Common Era.


Lusitanian or Lusatian (so named after the Lusitani or Lusitanians) was a Paleohispanic IE language known from only five inscriptions and numerous toponyms and theonyms. The language was spoken before the Roman conquest of Lusitania, in the territory inhabited by Lusitanian tribes, from the Douro to the Tagus River in the western area of the Iberian Peninsula, where they were already established before the sixth century BC.

Text Box: (2011, modified from Alcides Pinto 2010)

Their language is usually considered a Pre-Celtic (possibly stemming from a common Italo-Celtic) IE dialect, and it is sometimes associated with the language of the Vettones and with the linguistic substratum of the Gallaeci and Astures, based on archaeological findings and descriptions of ancient historians.

NOTE. The affiliation of the Lusitanian language within a Pre-Celtic IE group is supported by Tovar, Schmidt, Gorrochategui, among others, while Untermann e.g. considers it a Celtic language. The theory that it was a Celtic language is largely based upon the historical fact that the only Indo-European tribes that are known to have existed in Hispania at that time were Celtic tribes. The apparent Celtic character of most of the lexicon (anthroponyms and toponyms) may also support a Celtic affiliation. There is a substantial problem in the Celtic theory, though: the preservation of PIE initial *p-, as in Lusitanian pater ‘father’, or porcom ‘pig’. The Celtic languages had lost that initial PIE *p- in their evolution; compare Lat. pater, Gaul. ater, and Lat. porcum, O.Ir. orc. However, that does not necessarily preclude the possibility of Lusitanian being Celtic, because of the theoretical evolution of LIE initial *p → *ɸ → *h → Cel. ∅: it might have been an early Proto-Celtic (or Italo-Celtic) dialect that split off before the loss of *p-, or when *p- had become *ɸ- (before shifting to *h- and then being lost); the letter p of the Latin alphabet could have been used to represent either sound.

F. Villar and R. Pedrero relate Lusitanian to the Italic languages. The theory is based on parallels in the names of deities, as Lat. Consus, Lus. Cossue, Lat. Seia, Lus. Segia, or Marrucinian Iovia, Lus. Iovea(i), etc., and on other lexical items, as Umb. gomia, Lus. comaiam, together with some grammatical elements.

II. Northern Indo-European in Asia: Tocharian

Tocharian or Tokharian is one of the most obscure branches of the Northern dialects. The name of the language is taken from people known to the Greek historians (Ptolemy VI, 11, 6) as the Tocharians (Greek Τόχαροι, Tókharoi).

NOTE. These are sometimes identified with the Yuezhi and the Kushans, while the term Tokharistan usually refers to first millennium Bactria. A Turkic text refers to the Turfanian language (Tocharian A) as twqry. F. W. K. Müller has associated this with the name of the Bactrian Tokharoi. In Tocharian, the language is referred to as arish-käna and the Tocharians as arya.

Tocharian consisted of two languages: Tocharian A (Turfanian, Arsi, or East Tocharian) and Tocharian B (Kuchean or West Tocharian). These languages were spoken roughly from the sixth to ninth centuries; before they became extinct, their speakers were absorbed into the expanding Uyghur tribes. Both languages were once spoken in the Tarim Basin in Central Asia, now the Xinjiang Autonomous Region of China.

NOTE. Properly speaking, based on the tentative interpretation of twqry as related to Tokharoi, only Tocharian A may be referred to as Tocharian, while Tocharian B could be called Kuchean (its native name may have been kuśiññe), but since their grammars are usually treated together in scholarly works, the terms A and B have proven useful.

Tocharian is documented in manuscript fragments, mostly from the eighth century (with a few earlier ones) that were written on palm leaves, wooden tablets and Chinese paper, preserved by the extremely dry climate of the Tarim Basin. Samples of the language have been discovered at sites in Kucha and Karasahr, including many mural inscriptions.

Tocharian A and B were not mutually intelligible. The common Proto-Tocharian language must have preceded the attested languages by several centuries, probably dating to the first millennium BC.

1.7.2. Southern Indo-European Dialects

Text Box: Ancient Greek dialects by 400 BC after R.D. Woodard (2004). (2009, modified from Fut. Perf. 2008)

I. Greek

1. Greek has a documented history of 3500 years. Today, Modern Greek is spoken by 15 million people.

2. The major dialect groups of the Ancient Greek period can be assumed to have developed not later than 1120 BC, at the time of the Dorian invasions, and they first appear in precise alphabetic writing in the eighth century BC.

3. Mycenaean is the most ancient attested form of the Greek branch, spoken on mainland Greece and on Crete between 1600-1100 BC, before the Dorian invasion. It is preserved in inscriptions in Linear B, a script invented on Crete before the fourteenth century BC. Most instances of these inscriptions are on clay tablets found in Knossos and in Pylos. The language is named after Mycenae, the first of the palaces to be excavated.

NOTE. The tablets remained long undeciphered, and every conceivable language was suggested for them, until Michael Ventris deciphered the script in 1952 and proved the language to be an early form of Greek. The texts on the tablets are mostly lists and inventories. No prose narrative survives, much less myth or poetry. Still, much may be glimpsed from these records about the people who produced them, and about the Mycenaean period at the eve of the so-called Greek Dark Ages.

5. Unlike later varieties of Greek, Mycenaean probably had seven grammatical cases, the nominative, the genitive, the accusative, the dative, the instrumental, the locative, and the vocative.

The instrumental and the locative however gradually fell out of use.

Text Box: Mycenaean tablet (MY Oe 106) inscribed in Linear B, from the House of the Oil Merchant (ca. 1250 BC). The tablet registers an amount of wool which is to be dyed. National Archaeological Museum of Athens. (Marsyas 2005)

NOTE. For the locative in *-ei, compare di-da-ka-re, ‘didaskalei’, e-pi-ko-e, ‘Epikóhei’, etc. (in Greek there are syntactic compounds like puloi-genēs, ‘born in Pylos’); also, for remains of an ablative case in *-ōd, compare (months’ names) ka-ra-e-ri-jo-me-no, wo-de-wi-jo-me-no, etc.

6. Proto-Greek (the so-called Proto-Hellenic, or Pre-Greek in Sihler 1995) was a Southern LIE dialect, spoken in the late third millennium BC, roughly at the same time as North-West Indo-European and Proto-Indo-Iranian, most probably in the Balkans.

NOTE. According to Anthony (2007): “Greek shared traits with Armenian and Phrygian, both of which probably descended from languages spoken in southeastern Europe before 1200 BCE, so Greek shared a common background with some southeastern European languages that might have evolved from the speech of the Yamnaya immigrants in Bulgaria”. Proponents of a Proto-Greek homeland in Bulgaria or Romania are found in Sergent (1995), J. Makkay (Atti e memorie del Secondo Congresso Internazionale di Micenologia, 1996; Origins of the Proto-Greeks and Proto-Anatolians from a Common Perspective, 2003).

7. Proto-Greek (Pre-Greek or Proto-Hellenic) has been posited as a probable ancestor of Phrygian, and a possible ancestor of Thracian, Dacian, and Ancient Macedonian. Armenian has traditionally been regarded as derived from it through Phrygian, although this is disputed today.

NOTE. The Graeco-Armenian hypothesis, proposed by Pedersen (1924), posited a close relationship between the original Greek and Armenian communities, departing from a common language. It placed both in the larger context of the Paleo-Balkan Sprachbund, notably including Phrygian, which is widely accepted as particularly close to Greek, and it is consistent with Herodotus’ recording of the Armenians as descending from colonists of the Phrygians. That view, accepted for a long time, was rejected by Clackson (1994) in The linguistic relationship between Armenian and Greek, which, while supporting the Graeco-Aryan community, argues that there are no more coincidences between Armenian and Greek than those found in the comparison between any other IE language pair; shared isoglosses would therefore stem from contiguity within the common S.LIE community. Those findings are supported by Kortlandt in Armeniaca (2003), in which he proposes an old Central IE continuum Daco-Albanian / Graeco-Phrygian / Thraco-Armenian. Adrados (1998) considers an older Southern continuum Graeco-[Daco-]Thraco-Phrygian / Armenian / Indo-Iranian. Olteanu (2009) proposes a Graeco-Daco-Thracian language.

8. The unity of Proto-Greek probably ended as Hellenic migrants entered the Greek peninsula around 2300 BC.

NOTE. About the archaeological quest, Anthony (2007): “The people who imported Greek or Proto-Greek to Greece might have moved several times, perhaps by sea, from the western Pontic steppes to southeastern Europe to western Anatolia to Greece, making their trail hard to find. The EHII/III transition about 2400-2200 BCE has long been seen as a time of radical change in Greece when new people might have arrived (…)”.

In West’s (2007) words, “The first speakers of Greek – or rather of the language that was to develop into Greek; I will call them mello-Greeks – arrived in Greece, on the most widely accepted view, at the beginning of Early Helladic III, that is, around 2300. They came by way of Epirus, probably from somewhere north of the Danube”.

9. The primary sound changes from PIE to Proto-Greek include:

·                 Aspiration of PIE intervocalic *s → PGk h.

NOTE. The loss of PIE prevocalic *s- was not completed entirely, famously evidenced by PGk sūs (also hūs ‘pig’, from PIE *suh-); sun ‘with’, sometimes considered contaminated with PIE *kom (cf. Latin cum) to Homeric / Old Attic ksun, is possibly a consequence of Gk. psi-substrate (Villar).

·                 De-voicing of PIE voiced aspirates: *bh → ph, *dh → th, *gh → kh, *gwh → kwh.

·                 Dissimilation of aspirates (Grassmann’s law), possibly post-Mycenaean.

·                 PIE word-initial *j- (not *Hj-) is strengthened to PGk dj- (later Gk. ζ-).

·                 In the first stage of palatalisation (Sihler 1995), PIE *dj- was possibly palatalised into PGk dz(j)- , while PIE *tj-, *dhj- probably became PGk ts(j)-.

·                 Vocalisation of laryngeals between vowels and initially before consonants, i.e. *h1 → e, *h2 → a, *h3 → o; as, from PIE *h2nēr ‘man’, PGk anēr.

NOTE. That development is common to Greek, Phrygian and Armenian; cf. Gk. anēr, Phrygian anar, and Armenian ayr (from earlier *anir). In other branches, laryngeals did not vocalise in this position and eventually disappeared. The evolution of Proto-Greek should be considered against the background of an early Palaeo-Balkan Sprachbund that makes it difficult to delineate exact boundaries between individual languages. Phrygian and Armenian also share other phonological and morphological peculiarities of Greek.

·                 The sequence CRHC evolves generally as follows: PIE *CRh1C → PGk CRēC; PIE *CRh2C → PGk CRāC; PIE *CRh3C → PGk CRōC.

·                 The sequence PIE *CRHV becomes generally PGk CaRV.

NOTE. It has also been proposed by Sihler (2000) that PIE *Vkw → ukw; cf. PIE *nokwts ‘night’ → PGk nukwts → Gk. nuks/nuxt-; cf. also *kwekwlos ‘wheel’ → PGk kwukwlos → Gk. kuklos; etc. This is related to Cowgill's law, raising *o to u between a resonant and a labial.
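
Two of the changes listed above lend themselves to a mechanical illustration. The following Python sketch is a minimal one under stated assumptions: forms are written in plain ASCII digraphs (`bh`, `gwh`, `h2`), accents are omitted, and the function names are invented here. It covers only the devoicing of voiced aspirates and the word-initial case of laryngeal vocalisation, not the full set of Proto-Greek developments.

```python
import re

# PIE voiced aspirates -> Proto-Greek voiceless aspirates (ASCII spellings).
DEVOICE = {"bh": "ph", "dh": "th", "gh": "kh", "gwh": "kwh"}

# Word-initial laryngeals before a consonant: h1 -> e, h2 -> a, h3 -> o.
LARYNGEAL_VOWEL = {"1": "e", "2": "a", "3": "o"}

# Short and long vowels (a-macron, e-macron, i-macron, o-macron, u-macron).
VOWELS = "aeiou\u0101\u0113\u012b\u014d\u016b"


def devoice_aspirates(form: str) -> str:
    """Replace bh, dh, gh, gwh by ph, th, kh, kwh.

    The lookahead skips an 'h' that belongs to a laryngeal (h1, h2, h3).
    """
    return re.sub(r"(gw|b|d|g)h(?![123])", lambda m: DEVOICE[m.group(0)], form)


def vocalise_initial_laryngeal(form: str) -> str:
    """Turn word-initial h1/h2/h3 before a consonant into e/a/o."""
    pattern = rf"^h([123])(?=[^{VOWELS}])"
    return re.sub(pattern, lambda m: LARYNGEAL_VOWEL[m.group(1)], form)


print(devoice_aspirates("bhereti"))               # phereti
print(devoice_aspirates("gwhermo"))               # kwhermo
print(vocalise_initial_laryngeal("h2n\u0113r"))   # an + e-macron + r
```

The inputs correspond to PIE *bhéreti, *gwher-mo- and *h2nēr as cited in this section; the outputs are schematic Proto-Greek shapes, before later changes such as the treatment of labiovelars.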

10. Later sound changes from Proto-Greek into mello-Greek (or from Pre-Greek into Proto-Greek after Sihler 1995), from which Mycenaean was derived, include:

o  The second stage of palatalisation, which affected all consonants, including the restored tsj and dzj sequences (Sihler 1995).

o  Loss of final stop consonants; final m → n.

o  Syllabic m̥ → am, and n̥ → an, before resonants; otherwise both were nasalised m̥/n̥ → ã → a.

o  Loss of s in consonant clusters, with supplementary lengthening, e.g. esmi → ēmi.

o  Creation of secondary s from clusters, ntja → nsa. Assibilation ti → si only in southern dialects.

o  Mycenaean i-vocalism and replacement of double-consonant -kw- for -kwkw-.

NOTE. On the problematic case of common Greek ἵππος (hippos), horse, derived from PIE and PGk ekwos, Meier-Brügger (2003): “the i-vocalism of which is best understood as an inheritance from the Mycenaean period. At that time, e in a particular phonetic situation must have been pronounced in a more closed manner, cf. di-pa i.e. dipas neuter ‘lidded container for drinking’ vs. the later δέπας (since Homer): Risch (1981), O. Panagl (1989). That the i-form extended to the entire Greek region may be explained in that the word, very central during Mycenaean rule of the entire region (second millennium BC), spread and suppressed the e-form that had certainly been present at one time. On the -pp-: The original double-consonance -ku̯- was likely replaced by -kwkw- in the pre-Mycenaean period, and again, in turn by -pp- after the disappearance of the labiovelars. Suggestions of an ancient -kwkw- are already given by the Mycenaean form as i-qo (a possible *i-ko-wo does not appear) and the noted double-consonance in alphabetic Greek. The aspiration of the word at the beginning remains a riddle.”

Other features common to the earliest Greek dialects include:

Text Box: Main dialectal distribution in territories with Greek-speaking majorities (ca. 15th c.): Koiné, Pontic and Cappadocian Greek. The language distribution in Anatolia remained almost unchanged until the expulsion of Greeks (1914-1923) from Turkey. (2011, modified from Ivanchay, Infocan 2008)

·  Late satemisation trend, evidenced by the post-Mycenaean change of labiovelars into dentals before e; as, kwe → te ‘and’.

·  PIE dative, instrumental and locative were syncretised into a single dative.

·  Dialectal nominative plural in -oi, -ai (shared with Latin) fully replaces LIE common *-ōs, *-ās.

·  The superlative -tatos (<PIE *-to-) becomes productive.

·  The peculiar oblique stem gunaik- ‘women’, attested from the Thebes tablets is probably Proto-Greek; it appears, at least as gunai- also in Armenian.

·  The pronouns houtos, ekeinos and autos are created. Use of ho, hā, ton as articles is post-Mycenaean.

·  The first person middle verbal desinences -mai, -mān replace -ai, -a. The third singular pherei is an analogical innovation, replacing the expected PIE *bhéreti, i.e. Dor. *phereti, Ion. *pheresi.

·  The future tense is created, including a future passive, as well as an aorist passive.

·  The suffix -ka- is attached to some perfects and aorists.

·  Infinitives in -ehen, -enai and -men are also common to Greek dialects.

II. Armenian

The earliest testimony of the Armenian language dates to the fifth century AD, the Bible translation of Mesrob Mashtots. The earlier history of the language is unclear and the subject of much speculation. It is clear that Armenian is an Indo-European language, but its development is opaque.  

Text Box: Armenian manuscript, ca. 5th-6th c. AD (PD)

NOTE. Proto-Armenian sound-laws are varied and eccentric, such as IE *dw- yielding Arm. k-, and in many cases still uncertain. In fact, that phonetic development is usually seen as *dw- to erk-, based on PIE numeral *dwo- ‘two’, a reconstruction Kortlandt (Armeniaca 2003) dismisses, proposing alternative etymologies for the usual examples.

PIE voiceless stops are aspirated in Proto-Armenian.

NOTE. That circumstance gave rise to the Glottalic theory, which postulates that this aspiration may have been sub-phonematic already in Proto-Indo-European. In certain contexts, these aspirated stops are further reduced to w, h or ∅ in Armenian – so e.g. PIE *p’ots, into Arm. otn, Gk. pous ‘foot’; PIE *t’reis, Arm. erek’, Gk. treis ‘three’.

Text Box: Armenia today (darkest colour), Armenian majorities (dark) and greatest extent of the Kingdom of Armenia (light). Territory of the 6 Armenian Vilayets in the Ottoman Empire (dotted line), and areas with significant Armenian population prior to the Armenian Genocide (stripes). Ivaşca Flavius (2010).

III. Indo-Iranian

The Indo-Iranian or Aryan language group consists of two main language subgroups, Indo-Aryan and Iranian. Nuristani has been suggested as a third one, while Dardic is usually classified within Indo-Aryan.

The contemporary Indo-Iranian languages form the second largest sub-branch of Late Indo-European (after North-West Indo-European), with more than one billion speakers in total, stretching from Europe (Romani) and the Caucasus (Ossetian) to East India (Bengali and Assamese). The largest in terms of native speakers are Hindustani (Hindi and Urdu, ca. 540 million), Bengali (ca. 200 million), Punjabi (ca. 100 million), Marathi and Persian (ca. 70 million each), Gujarati (ca. 45 million), Pashto (40 million), Oriya (ca. 30 million), Kurdish and Sindhi (ca. 20 million each).

While the archaeological identification of Pre-Proto-Indo-Iranians and Proto-Indo-Iranians remains unsolved, it is believed that ca. 2500 BC a distinct Proto-Indo-Iranian language must have been spoken in the eastern part of the previous Yamna territory.

NOTE. Parpola (The formation of the Aryan branch of Indo-European, 1999) suggests the following identifications:

Date range | Archaeological culture | Suggested by Parpola
2800-2000 BC | Late Catacomb and Poltavka cultures | LIE to Proto-Indo-Iranian.
2000-1800 BC | Srubna and Abashevo cultures |
2000-1800 BC | |
1900-1700 BC | | “Proto-Dasa” Indo-Aryans establishing themselves in the existing BMAC settlements, defeated by “Proto-Rigvedic” Indo-Aryans around 1700
1900-1400 BC | Cemetery H | Indian Dasa
1800-1000 BC | | Indo-Aryan, including “Proto–Sauma-Aryan” practicing the Soma cult
1700-1400 BC | early Swat culture | Proto-Rigvedic = Proto-Dardic
1700-1500 BC | late BMAC | “Proto–Sauma-Dasa”, assimilation of Proto-Dasa and Proto–Sauma-Aryan
1500-1000 BC | Early West Iranian Grey Ware | Mitanni-Aryan (offshoot of “Proto-Sauma-Dasa”)
1400-800 BC | late Swat culture and Punjab, Painted Grey Ware | late Rigvedic
1400-1100 BC | Yaz II-III, Seistan |
1100-1000 BC | Gurgan Buff Ware, Late West Iranian Buff Ware | Proto-Persian, Proto-Median
1000-400 BC | Iron Age cultures of Xinjiang |

It is generally believed that early Indo-Iranian contacts with the easternmost part of North-West IE (Pre-Balto-Slavic) account for their shared linguistic features, such as satemisation and the Ruki sound law. Assuming – as is commonly done – that both phonetic trends were late developments after the LIE community, an early North-West–South-East Sprachbund or dialect continuum must have existed before the Proto-Indo-Iranian migration to the East.

NOTE. From a linguistic point of view, Burrow (1955): “(…) in the case of Sanskrit migrations at a comparatively late date took it to the extreme East of the Indo-European domain. Before this period its ancestor, primitive Indo-Iranian must have held a fairly central position, being directly in contact with the other dialects of the satem-group, and having to the East of it that form of Indo-European which eventually turned into the dialects A and B of Chinese Turkestan. Its position can further be determined by the especially close relations which are found to exist between it and Balto-Slavonic. Since the Balts and the Slavs are not likely to have moved far from the positions in which they are to be found in their earliest recorded history, the original location of Indo-Iranian towards the South-East of this area becomes highly probable.”

As we have seen, Kortlandt (1990) interprets the linguistic contacts according to Mallory’s (1989) account of archaeological events (v.s. §1.7.1. North-West Indo-European): “If the speakers of the other satem languages can be assigned to the Yamnaya horizon and the western Indo-Europeans to the Corded Ware horizon, it is attractive to assign the ancestors of the Balts and the Slavs to the Middle Dnieper culture”, an identification also made by Anthony (2007).

Text Box: Archaeological cultures associated with late Indo-Iranian migrations. The early phases of the Andronovo culture have often been seen to offer a “staging area” for Indo-Iranian movements. The BMAC offers the Central Asian cultural “filter” through which some argue the Indo-Iranians must have passed southwards to such sites as Mehrgarh and Sibri. Mallory–Adams (1997) (Dbachmann 2005)

Similarly, Adrados–Bernabé–Mendoza (1995-1998), about the dialectal situation of Slavic (from a linguistic point of view): “To a layer of archaisms, shared or not with other languages (…) Slavic added different innovations, some common to Baltic. Some of them are shared with Germanic, as the oblique cases in -m and feminine participle; others with Indo-Iranian, so satemisation, Ruki sound law (more present in Slavic than in Baltic) (…) Most probably, those common characteristics come from a recent time, from secondary contacts between [N.LIE] (whose rearguard was formed by Balto-Slavs) and [S.LIE] (in a time when Greeks were not in contact anymore, they had already migrated to Greece).”

Text Box: Current distribution of Indo-Aryan languages, A Historical Atlas of South Asia (1992) (Dbachmann 2008)

Because Proto-Indo-Aryan is assumed to have been spoken ca. 2000-1500 BC (preceding Vedic cultures), historical linguists broadly estimate that the continuum of Indo-Iranian languages had to diverge ca. 2200-2000 BC, if not earlier. The Aryan expansion before the Indo-Iranian split – for which the terminus ante quem is 2000 BC (cf. Mallory 1989) – implies centuries of previous Pre-Indo-Aryan and Pre-Iranian differentiation. This time is commonly identified with the early bearers of the Andronovo culture (Sintashta-Petrovka-Arkaim, in Southern Urals, 2200-1600 BC), who spread over an area of the Eurasian steppe that borders the Ural River on the west, the Tian Shan on the east – where the Indo-Iranians took over the area occupied by the earlier Afanasevo culture –, and Transoxiana and the Hindu Kush on the south.

A Two-wave model of Indo-Iranian expansion has been proposed (Burrow 1973, Parpola 1999), strongly associated with the chariot. Indo-Aryans left linguistic remains in a Hittite horse-training manual written by one “Kikkuli the Mitannian”. Other evidence is found in references to the names of Mitanni rulers and the gods they swore by in treaties; these remains are found in the archives of the Mitanni’s neighbours, and the time period for this is about 1500 BC.

NOTE. The standard model for the entry of the Indo-European languages into South Asia is that the First Wave went over the Hindu Kush, into the headwaters of the Indus and later the Ganges. The earliest stratum of Vedic Sanskrit, preserved only in the Rigveda, is assigned to roughly 1500 BC. From the Indus, the Indo-Aryan languages spread from ca. 1500 BC to ca. 500 BC, over the northern and central parts of the subcontinent, sparing the extreme south. The Indo-Aryans in these areas established several powerful kingdoms and principalities in the region, from eastern Afghanistan to the doorstep of Bengal.

Text Box: Current distribution of Iranian dialects. Dbachmann (2006)

The Second Wave is interpreted as the Iranian wave. The Iranians would take over all of Central Asia, Iran, and for a considerable period, dominate the European steppe (the modern Ukraine) and intrude north into Russia and west into central and Eastern Europe well into historic times and as late as the Common Era. The first Iranians to reach the Black Sea may have been the Cimmerians in the eighth century BC, although their linguistic affiliation is uncertain. They were followed by the Scythians, who are considered a western branch of the Central Asian Sakas, and the Sarmatian tribes.

The main changes separating Proto-Indo-Iranian from Late Indo-European include (according to Burrow 1955 and Fortson 2004):

·  Early satemisation trend: The satem shift, consisting of two sets of related changes:

o  Palatalisation of LIE velars: *k → t͡ɕ, *g → d͡ʑ, *gh → d͡ʑh; as, *km̥tóm → ķatám, *gónu → ģānu, *ghéimn̥ → ģhima-.

o  Merger of LIE labiovelars with plain velars: *kw → k, *gw → g, *gwh → gh; as, *kʷo- → ka-, *gwou- → gau-, *gwhormó- → gharmá-.

·  These plain velars, when before a front vowel (pre-PII *i or *e) or the glide *j, were then palatalised to affricates: *k → t͡ʃ, *g → d͡ʒ, *gh → d͡ʒh; as, *kwe → ča, *gwīwós → ġīwás, *gwhénti → ġhanti.

NOTE. This palatalisation is often called the Law of Palatals. It must have happened before the merger of PIE *e, *o with *a. An illustrative example is found in the weak perfect stem *kwe-kwr- ‘did’ → pre-PII *ke-kr- → *če-kr- → PII *ča-kr- (Ved. cakr-, Av. and O.Pers. caxr-).

·  Before a dental occlusive, ķ → š, ģ → ž; ģʰ → ž, with aspiration of the occlusive; as, *oķtō → aštā, *mr̥ģt- → mr̥žd-, *uģʰtó- → uždʰá-.

·  The sequence *ķs was simplified to šš; as, *aķs- → ášš-.

·  Assimilation of LIE vowels *e, *o → a; *ē, *ō → ā.

·  Interconsonantal and word-final LIE *H → PII i; cf. *ph2tēr → PII pitār, *-medʰH → *-madhi.

·  LIE *m̥, *n̥ merge with a; as, *km̥tóm → ķatám, *mn̥tó- → matá-.

·  Bartholomae’s law: an aspirate immediately followed by a voiceless consonant becomes voiced stop + voiced aspirate. In addition, *dh+t → dzdh; as, *ubhto- → ubdha-, *urdhto- → urdzdha-, *augh-tá- → augdhá-.

·  The Ruki rule: *s is retracted to š when immediately following *r, *r̥, *u, *k or *i. Its allophone *z likewise becomes ž; as, *wers- → warš-, *pr̥sto- → pr̥šta-, *geus- → ģauš-, *kʷsep- → kšap-, *wis- → wiš-, *nisdo- → nižda-.

·  Brugmann’s law: *o in an open syllable lengthens to ō; *dehtór-m̥ → Pre-PII *dehtōr-m̥ → *dātāram.

·  Resonants are generally stable in PII, but for the confusion *l/*r, which in the oldest Rigveda and in Avestan gives a trend LIE *l̥ → PII r̥, as well as *l → r; as, *wl̥kʷos → wl̥kas / wr̥kas.
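
Three of the changes in the list above (Bartholomae’s law, the Ruki rule and the vowel merger) can be sketched as string rewrites. The Python below is an illustration under explicit assumptions: forms are given without accents, aspirates are spelled with ASCII `h`, the function names are invented here, and the order in which the rules are chained is a convenience of the sketch rather than a claim about relative chronology. Palatalisation, laryngeals and Brugmann’s law are deliberately left out.

```python
import re

RING = "\u0325"  # combining ring below, marking syllabic r
VOICED = {"p": "b", "t": "d", "k": "g"}
# Vowel merger: e, o -> a; e-macron, o-macron -> a-macron.
MERGE = str.maketrans({"e": "a", "o": "a", "\u0113": "\u0101", "\u014d": "\u0101"})


def bartholomae(form: str) -> str:
    """Voiced aspirate + voiceless stop -> voiced stop + voiced aspirate."""
    form = form.replace("dht", "dzdh")  # special outcome of *dh + t
    return re.sub(r"([bdg])h([ptk])",
                  lambda m: m.group(1) + VOICED[m.group(2)] + "h", form)


def ruki(form: str) -> str:
    """Retract s to s-caron after r, syllabic r, u, k or i."""
    return re.sub(rf"(r{RING}?|[uki])s", lambda m: m.group(1) + "\u0161", form)


def merge_vowels(form: str) -> str:
    """Merge e/o with a, keeping vowel length."""
    return form.translate(MERGE)


def to_pii(form: str) -> str:
    """Chain the three rules (ordering is an assumption of this sketch)."""
    return merge_vowels(ruki(bartholomae(form)))


for lie in ("wers", "wis", "ubhto", "urdhto"):
    print(lie, "->", to_pii(lie))
```

Run on the accentless inputs `wers`, `wis`, `ubhto` and `urdhto`, the chain yields `warš`, `wiš`, `ubdha` and `urdzdha`, matching the corresponding examples in the list.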

A synoptic table of the Indo-Iranian phonetic system:

[Table not recoverable; only the column header “dental/alveolar” survives.]

IV. Palaeo-Balkan Languages

A. Phrygian

Text Box: Phrygian Kingdom ca. 800-700 BC, from “Atlas of the Bible Lands” (1959) (2011 from PD)

The Phrygian language was spoken by the Phrygians, a people that settled in Asia Minor during the Bronze Age. It survived probably into the sixth century AD, when it was replaced by Greek.

On the grounds of classical sources, Phrygian has sometimes been associated with Thracian and perhaps even Armenian. Herodotus recorded the Macedonian account that Phrygians migrated into Asia Minor from Thrace, and stated that the Armenians were colonists of the Phrygians, still considered the same in the time of Xerxes I. The earliest mention of Phrygian in Greek sources, in the Homeric Hymn to Aphrodite, depicts it as different from Trojan: in the hymn, Aphrodite, disguising herself as a mortal to seduce the Trojan prince Anchises, tells him:

“Otreus of famous name is my father, if so be you have heard of him, and he reigns over all Phrygia rich in fortresses. But I know your speech well beside my own, for a Trojan nurse brought me up at home”. Of Trojan, unfortunately, nothing is known.

Phrygian is attested by two corpora, one, Palaeo-Phrygian, from around 800 BC and later, and another after a period of several centuries, Neo-Phrygian, from around the beginning of the Common Era. The Palaeo-Phrygian corpus is further divided geographically into inscriptions of Midas-city, Gordion, Central, Bithynia, Pteria, Tyana, Daskyleion, Bayindir, and “various”. The Mysian inscriptions show a language classified as a separate Phrygian dialect, written in an alphabet with an additional letter, the “Mysian s”. We can reconstruct some words with the help of some inscriptions written with a script similar to the Greek one.

Its structure, as far as it can be recovered, was typically LIE, with at least three nominal cases, three gender classes and two grammatical numbers, while the verbs were conjugated for tense, voice, mood, person and number.

Phrygian seems to exhibit an augment, like Greek and Armenian, as in Phryg. eberet, probably corresponding to PIE *é-bher-e-t (cf. Gk. epheret).

A sizable body of Phrygian words is theoretically known; however, the meaning and etymologies and even correct forms of many Phrygian words (mostly extracted from inscriptions) are still being debated.

Phrygian words with possible PIE origin and Graeco-Armenian cognates include:

·  Phryg. bekos ‘bread’, from PIE *bheh3g-; cf. Gk. phōgō ‘to roast’.

·  Phryg. bedu ‘water’, from PIE *wed-; cf. Arm. get ‘river’.

·  Phryg. anar ‘husband, man’, PIE *h2ner- ‘man’; cf. Gk. aner- ‘man, husband’.

·  Phryg. belte ‘swamp’, from PIE root *bhel- ‘to gleam’; cf. Gk. baltos ‘swamp’.

·  Phryg. brater ‘brother’, from PIE *bhreh2ter-; cf. Gk. phrāter-.

·  Phryg. ad-daket ‘does, causes’, from PIE stem *dhē-k-; cf. Gk. ethēka.

·  Phryg. germe ‘warm’, from PIE *gwher-mo-; cf. Gk. thermos.

·  Phryg. gdan ‘earth’, from PIE *dhghom-; cf. Gk. khthōn.

NOTE. For more information on similarities between Greek and Phrygian, see Neumann (Phrygisch und Griechisch, 1988).

B. Illyrian

The Illyrian languages are a group of Indo-European languages that were spoken in the western part of the Balkans in former times by ethnic groups identified as Illyrians: Delmatae, Pannoni, Illyrioi, Autariates, Taulanti.

Text Box: Roman provinces in the Balkans, Droysens Historischem Handatlas (1886)

The main source of authoritative information about the Illyrian language consists of a handful of Illyrian words cited in classical sources, and numerous examples of Illyrian anthroponyms, ethnonyms, toponyms and hydronyms. Some sound-changes and other language features are deduced from what remains of the Illyrian languages, but because no writings in Illyrian are known, there is not sufficient evidence to clarify its place within the Indo-European language family aside from its probable centum nature.

NOTE. A grouping of Illyrian with the Messapian language has been proposed for about a century, but remains an unproven hypothesis. The theory is based on classical sources, archaeology, as well as onomastic considerations. Messapian material culture bears a number of similarities to Illyrian material culture. Some Messapian anthroponyms have close Illyrian equivalents. A relation to the Venetic language and Liburnian language, once spoken in northeastern Italy and Liburnia respectively, is also proposed.

C. Thracian

Excluding Dacian, whose status as a Thracian language is disputed, Thracian was spoken in what is now southern Bulgaria, parts of Serbia, the Republic of Macedonia, Northern Greece – especially prior to Ancient Macedonian expansion –, throughout Thrace (including European Turkey) and in parts of Bithynia (North-Western Anatolia). Most of the Thracians were eventually Hellenised (in the province of Thrace) or Romanised (in Moesia, Dacia, etc.), with the last remnants surviving in remote areas until the fifth century AD.

NOTE. As an extinct language with only a few short inscriptions attributed to it, there is little known about the Thracian language, but a number of features are agreed upon. A number of probable Thracian words are found in inscriptions – most of them written with Greek script – on buildings, coins, and other artifacts. Some Greek lexical elements may derive from Thracian, such as balios ‘dappled’ (< PIE *bhel- ‘to shine’, Pokorny also cites Illyrian as possible source), bounos ‘hill, mound’, etc.

C. Dacian

The Dacian language was spoken by the ancient people of Dacia. It is often considered to have been either a northern variant of the Thracian language, or closely related to it.

There are almost no written documents in Dacian. It used to be one of the major languages of South-Eastern Europe, stretching from what is now Eastern Hungary to the Black Sea shore. Based on archaeological findings, the origins of the Dacian culture are believed to be in Moldavia, being identified as an evolution of the Iron Age Basarabi culture.

It is unclear exactly when the Dacian language became extinct, or even whether it has a living descendant. The initial Roman conquest of part of Dacia did not put an end to the language, as free Dacian tribes such as the Carpi may have continued to speak Dacian in Moldavia and adjacent regions as late as the sixth or seventh century AD, still capable of leaving some influences in the forming of Slavic languages.

E. Paionian

The Paionian language is the poorly attested language of the ancient Paionians, whose kingdom once stretched north of Macedon into Dardania and in earlier times into southwestern Thrace.

Classical sources usually considered the Paionians distinct from Thracians or Illyrians, comprising their own ethnicity and language. Athenaeus seemingly connected the Paionian tongue to the Mysian language, itself barely attested. If correct, this could mean that Paionian was an Anatolian language. On the other hand, the Paionians were sometimes regarded as descendants of Phrygians, which may put Paionian on the same linguistic branch as the Phrygian language.

NOTE. Modern linguists are uncertain about the classification of Paionian, owing to the extreme scarcity of surviving material. However, it seems that Paionian was an independent IE dialect. It shows a/o distinction and does not appear to have undergone satemisation. The Indo-European voiced aspirates became plain voiced consonants, i.e. *bh→b, *dh→d, *gh→g, *gwh→gw; as in Illyrian, Thracian, Macedonian and Phrygian (but unlike Greek).

F. Ancient Macedonian

The Ancient Macedonian language was the tongue of the Ancient Macedonians. It was spoken in Macedon during the first millennium BC. Marginalised from the fifth century BC, it was gradually replaced by the common Greek dialect of the Hellenistic Era. It was probably spoken predominantly in the inland regions away from the coast. It is as yet undetermined whether the language was a dialect of Greek, a sibling language to Greek, or an Indo-European language which is a close cousin to Greek and also related to Thracian and Phrygian.

Knowledge of the language is very limited, because there are no surviving texts that are indisputably written in the language. However, a body of authentic Macedonian words has been assembled from ancient sources, mainly from coin inscriptions, and from the fifth century lexicon of Hesychius of Alexandria, amounting to about 150 words and 200 proper names. Most of these are confidently identifiable as Greek, but some of them are not easily reconciled with standard Greek phonology. The 6,000 surviving Macedonian inscriptions are in the Greek Attic dialect.

NOTE. Suggested phylogenetic classifications of Macedonian include:

·An Indo-European language which is a close cousin to Greek and also related to Thracian and Phrygian languages, suggested by A. Meillet (1913) and I. I. Russu (1938), or part of a Sprachbund encompassing Thracian, Illyrian and Greek (Kretschmer 1896, E. Schwyzer 1959).

·An “Illyrian” dialect mixed with Greek, suggested by K. O. Müller (1825) and by G. Bonfante (1987).

·Various explicitly “Greek” scenarios:

o A Greek dialect, part of the North-Western (Locrian, Aetolian, Phocidian, Epirote) variants of Doric Greek, suggested amongst others by N.G.L. Hammond (1989), Olivier Masson (1996) and Michael Meier-Brügger (2003).

o A northern Greek dialect, related to Aeolic Greek and Thessalian, suggested among others by A. Fick (1874) and O. Hoffmann (1906).

o A Greek dialect with a non-Indo-European substratal influence, suggested by M. Sakellariou (1983).

·A sibling language of Greek within Indo-European, Macedonian and Greek forming two subbranches of a Greco-Macedonian subgroup within Indo-European (sometimes called “Hellenic”), suggested by Joseph (2001) and others.

Text Box: The Pella katadesmos is a katadesmos (a curse, or magic spell) inscribed on a lead scroll, probably dating to 380-350 BC. It was found in Pella in 1986 (PD)

The Pella curse tablet, a text written in a distinct Doric Greek idiom, found in Pella in 1986 and dated to the early to mid fourth century BC, has been put forward as an argument that the Ancient Macedonian language was a dialect of North-Western Greek. Before the discovery it was proposed that the Macedonian dialect was an early form of Greek, spoken alongside Doric proper at that time.

NOTE. Olivier Masson thinks that “in contrast with earlier views which made of it an Aeolic dialect (O.Hoffmann compared Thessalian) we must by now think of a link with North-West Greek (Locrian, Aetolian, Phocidian, Epirote). This view is supported by the recent discovery at Pella of a curse tablet which may well be the first ‘Macedonian’ text attested (...); the text includes an adverb “opoka” which is not Thessalian”. Also, James L. O’Neil states that the “curse tablet from Pella shows word forms which are clearly Doric, but a different form of Doric from any of the west Greek dialects of areas adjoining Macedon. Three other, very brief, fourth century inscriptions are also indubitably Doric. These show that a Doric dialect was spoken in Macedon, as we would expect from the West Greek forms of Greek names found in Macedon. And yet later Macedonian inscriptions are in Koine avoiding both Doric forms and the Macedonian voicing of consonants. The native Macedonian dialect had become unsuitable for written documents.”

From the few words that survive, a notable sound-law may be ascertained, that PIE voiced aspirates *dh, *bh, *gh, appear as δ (=d[h]), β (=b[h]), γ (=g[h]), in contrast to Greek dialects, which unvoiced them to θ (=th), φ (=ph), χ (=kh).

NOTE. Since these languages are all known via the Greek alphabet, which has no signs for voiced aspirates, it is unclear whether de-aspiration had really taken place, or whether the supposed voiced stops β, δ, γ were just picked as the closest matches to express voiced aspirates PIE *bh, *dh, *gh. As to Macedonian β, δ, γ = Greek φ, θ, χ, Claude Brixhe (1996) suggests that it may have been a later development: The letters may already have designated not voiced stops, i.e. [b, d, g], but voiced fricatives, i.e. [β, δ, γ], due to a voicing of the voiceless fricatives [φ, θ, x] (= Classical Attic [ph, th, kh]). Brian Joseph (2001) sums up that “The slender evidence is open to different interpretations, so that no definitive answer is really possible”, but cautions that “most likely, Ancient Macedonian was not simply an Ancient Greek dialect on a par with Attic or Aeolic”. In this sense, some authors also call it a “deviant Greek dialect”.

·PIE *dhenh2- ‘to leave’ → A.Mac. δανός (δanós) ‘death’; cf. Attic θάνατος (thánatos). PIE *h2aidh- → A.Mac. ἄδραια (aδraia) ‘bright weather’, Attic αἰθρία (aithría).

·PIE *bhasko- → A.Mac. βάσκιοι (βáskioi) ‘fasces’. Compare also A.Mac. ἀϐροῦτες (aβroûtes) or ἀϐροῦϝες (aβroûwes) with Attic ὀφρῦς (ophrûs) ‘eyebrows’; and Mac. Βερενίκη (Βere-níkē) with Attic Φερενίκη (Phere-níkē) ‘bearing victory’.

o According to Herodotus (ca. 440 BC), the Macedonians claimed that the Phryges were called Brygoi (<PIE *bhrugo-) before migrating from Thrace to Anatolia ca. 1200 BC.

o In Aristophanes’ The Birds, the form κεϐλήπυρις (keβlē-pyris) ‘red-cap (bird)’, shows a voiced stop instead of a standard Greek unvoiced aspirate, i.e. Macedonian κεϐ(α)λή (keβalē) vs. Greek κεφαλή (kephalē) ‘head’.

·If A.Mac. γοτάν (γotán) ‘pig’, is related to PIE *gwou- ‘cow’, this would indicate that the labiovelars were either intact (hence *gwotán), or merged with the velars, unlike the usual Gk. βοῦς (boûs). Such deviations, however, are not unknown within Greek dialects; compare Dor. γλεπ- (glep-) for common Gk. βλεπ- (blep-), as well as Dor. γλάχων (gláchōn) and Ion. γλήχων (glēchōn) for Gk. βλήχων (blēchōn).

·Examples suggest that voiced velar stops were devoiced, especially word-initially: PIE *genu- → A.Mac. κάναδοι (kánadoi) ‘jaws’; PIE *gombh- → A.Mac. κόμϐους (kómbous) ‘molars’.

o Compared to Greek words, there is A.Mac. ἀρκόν (arkón) vs. Attic ἀργός (argós); the Macedonian toponym Akesamenai, from the Pierian name Akesamenos – if Akesa- is cognate to Greek agassomai, agamai ‘to astonish’; cf. also the Thracian name Agassamenos.
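The contrast between the Macedonian and the Greek treatment of the PIE voiced aspirates described above can be sketched as a small lookup table. The following Python fragment is only an illustration under stated assumptions: the ASCII notation (bh, dh, gh for the voiced aspirates; ph, th, kh for the Greek voiceless aspirates), the table REFLEXES and the function apply_reflexes are hypothetical names chosen for this example. It models only the stop reflexes, and ignores vowels, accent, labiovelars and the devoicing of velars.

```python
# Illustrative sketch only. The notation and all names here are assumptions
# made for this example; they are not an established tool or transcription.

REFLEXES = {
    # PIE voiced aspirates appear as voiced stops (written β, δ, γ)
    "macedonian": {"bh": "b", "dh": "d", "gh": "g"},
    # Greek dialects unvoiced them (written φ, θ, χ)
    "greek": {"bh": "ph", "dh": "th", "gh": "kh"},
}


def apply_reflexes(pie_form: str, language: str) -> str:
    """Replace each PIE voiced aspirate in pie_form by its reflex in language."""
    table = REFLEXES[language]
    out = []
    i = 0
    while i < len(pie_form):
        digraph = pie_form[i:i + 2]
        if digraph in table:
            # A voiced aspirate: emit its reflex and skip both letters.
            out.append(table[digraph])
            i += 2
        else:
            # Any other segment is copied unchanged.
            out.append(pie_form[i])
            i += 1
    return "".join(out)
```

Taking a hypothetical input string bhere- for the first element of the name, apply_reflexes("bhere", "macedonian") gives bere (cf. Βερε-νίκη), while apply_reflexes("bhere", "greek") gives phere (cf. Φερε-νίκη).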

V. Albanian

Text Box: (Megistias 2010)

Albanian is spoken by over 8 million people primarily in Albania, Kosovo, and the Former Yugoslav Republic of Macedonia, but also by smaller numbers of ethnic Albanians in other parts of the Balkans, along the eastern coast of Italy and in Sicily. It has no living close relatives among the modern IE languages. There is no consensus over its origin and dialectal classification.

References to the existence of Albanian survive from the fourteenth century AD, but without recording any specific words. The oldest surviving documents written in Albanian are the Formula e Pagëzimit (Baptismal formula), Unte paghesont premenit Atit et birit et spertit senit ‘I baptise thee in the name of the Father, and the Son, and the Holy Spirit’, recorded by Pal Engjelli, Bishop of Durres in 1462 in the Gheg dialect, and some New Testament verses from that period.

11.7.3. Anatolian Languages

The Anatolian branch is generally considered the earliest to split off from the Proto-Indo-European language, from a stage referred to as Proto-Indo-Hittite (PIH). Typically a date ca. 4500-3500 BC is assumed for the separation.

NOTE. A long period of time is necessary for Proto-Anatolian to develop into Common Anatolian. Craig Melchert and Alexander Lehrman agreed that a separation date of about 4000 BCE between Proto-Anatolian and the Proto-Indo-Hittite language community seems reasonable. The millennium or so around 4000 BC, say 4500 to 3500 BC, constitutes the latest window within which Proto-Anatolian is likely to have separated.

Within a Kurgan framework, there are two possibilities of how early Anatolian speakers could have reached Anatolia: from the north via the Caucasus, and from the west, via the Balkans. The archaeological identification of Anatolian speakers remains highly speculative, as it depends on the broad guesstimates that historical linguistics is able to offer. Nevertheless, the Balkans route appears to be somewhat more likely for archaeologists; so e.g. Mallory (1989) and Steiner (1990).  

Text Box: Map of the Hittite Empire at its greatest extent under Suppiluliuma I (ca. 1350-1322 BC) and Mursili II (ca. 1321–1295 BC). (Javier Fernandez-Vina 2010).

Attested dialects of the Anatolian branch are:

·Hittite (nesili), attested from ca. 1800 BC to 1100 BC, official language of the Hittite Empire.

·Luwian (luwili), close relative of Hittite spoken in Arzawa, to the southwest of the core Hittite area.

·Palaic, spoken in north-central Anatolia, extinct around the thirteenth century BC, known only fragmentarily from quoted prayers in Hittite texts.

·Lycian, spoken in Lycia in the Iron Age, most likely a descendant of Luwian, became extinct ca. the first century BC. A fragmentary language, it is also a likely candidate for the language spoken by Trojans.

·Lydian, spoken in Lydia, extinct ca. the first century BC, fragmentary, possibly from the same dialect group as Hittite.

·Carian, spoken in Caria, fragmentarily attested from graffiti by Carian mercenaries in Egypt from ca. the seventh century BC, extinct ca. the third century BC.

·Pisidian and Sidetic (Pamphylian), fragmentary.

·Milyan, known from a single inscription.

Anatolia was heavily Hellenised following the conquests of Alexander the Great, and it is generally thought that by the first century BC the native languages of the area were extinct.

Hittite proper is known from cuneiform tablets and inscriptions erected by the Hittite kings and written in an adapted form of Old Assyrian cuneiform orthography. Owing to the predominantly syllabic nature of the script, it is difficult to ascertain the precise phonetic qualities of some Hittite sounds.

The Hittite language has traditionally been stratified – partly on linguistic and partly on paleographic grounds – into Old Hittite, Middle Hittite and New or Neo-Hittite, corresponding to the Old, Middle and New Kingdoms of the Hittite Empire, ca. 1750-1500 BC, 1500-1430 BC and 1430-1180 BC, respectively.

Text Box: Clay tablet in Hittite cuneiform containing the correspondence between the Luwian King of Arzawa and the Pharaoh of Egypt. (PD)

Luwian was spoken by population groups in Arzawa, to the west or southwest of the core Hittite area. In the oldest texts, e.g. the Hittite Code, the Luwian-speaking areas including Arzawa and Kizzuwatna were called Luwia. From this homeland, Luwian speakers gradually spread through Anatolia and became a contributing factor to the downfall, after ca. 1180 BC, of the Hittite Empire, where Luwian was already widely spoken. Luwian was also the language spoken in the Neo-Hittite states of Syria, such as Milid and Carchemish, as well as in the central Anatolian kingdom of Tabal that flourished around 900 BC. Luwian has been preserved in two forms, named after the writing systems used: Cuneiform Luwian and Hieroglyphic Luwian.

Text Box: Luwian language spreading, second to first millennium BC (Hendrik Tammen 2006)

For the most part, the immediate ancestor of the known Anatolian languages, Common Anatolian (a late Proto-Anatolian dialect spoken ca. 3000-2000 BC), has been reconstructed on the basis of Hittite. However, the use of the Hittite cuneiform writing system limits the enterprise of understanding and reconstructing Anatolian phonology, partly due to the deficiency of the adopted Akkadian cuneiform syllabary in representing Hittite sounds, and partly due to Hittite scribal practices.

NOTE. This especially pertains to what appears to be confusion of voiceless and voiced dental stops, where signs -dV- and -tV- are employed interchangeably in different attestations of the same word. Furthermore, in syllables of the structure VC only the signs with voiceless stops are generally used. The distribution of spellings with single and geminated consonants in the oldest extant monuments indicates that the reflexes of PIE voiceless stops were spelled as double consonants and the reflexes of PIE voiced stops as single consonants.

Known changes from Indo-Hittite into Common Anatolian include:

·  Voiced aspirates merged with voiced stops: *dh→*d, *bh→*b, *gh→*g.

·  Voiceless stops become voiced after accented long vowel or diphthong: PIH *wēk- → CA *wēg- (cf. Hitt. wēk- ‘ask for’); PIH *dheh1ti ‘putting’ → CA *dǣdi (cf. Luw. taac- ‘votive offering’).

·  Conditioned allophone PIH *tj- → CA *tsj-, as Hittite still shows.

·  PIH *h1 is lost in CA, but for *eh1→*ǣ, appearing as Hitt., Pal. ē, Luw., Lyc., Lyd. ā; word-initial *h2→*x, non-initial *h2→*h; *h3→*h.

NOTE 1. Melchert proposes that CA *x (voiceless fricative) is lenited to *h (voiced fricative) under the same conditions as voiceless stops. Also, word-initial *h3 is assumed by some scholars to have been already lost in CA. 

NOTE 2. There is an important assimilation of laryngeals within CA: a sequence *-VRHV- becomes -VRRV-; cf. PIH *sperh1V- → Hitt. isparr- ‘kick flat’; PIH *sun-h3-V- → Hitt. sunna- ‘fill’, Pal. sunnuttil- ‘outpouring’; etc.

·  PIH resonants are generally stable in CA. Only word-initial *r̥ has been eliminated. Word-initial *je- shows a trend to become CA *e-, but the trend is not complete in CA, as Hittite shows.

·  Diphthongs evolved as PIH *ei → CA *ę̄; PIH *eu → CA *ū. PIE *oi, *ai, *ou, *au also appear in CA.

NOTE. Common Anatolian preserves the PIE vowel system basically intact. Some cite the merger of PIH *o and (a controversial) *a as a Common Anatolian innovation, but according to Melchert that merger was a secondary shared innovation in Hittite, Palaic and Luwian, but not in Lycian. Also, the lengthening of accented short vowels in open syllables cannot be of Common Anatolian date, and neither can lengthening in accented closed syllables.

·  The CA nominal system shows a productive declension in *-i, *-u, considered an archaic feature retained from PIH.

·  There are only two grammatical genders, animate and inanimate; this has usually been interpreted as the original system in PIH.

·  Hittite verbs are inflected according to two general verbal classes, the mi- and the hi-conjugation. They had two voices (active and mediopassive), two moods (indicative and imperative), and two tenses (present and past), two infinitive forms, one verbal substantive, a supine, and a participle.
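Two of the phonological changes listed above, the merger of voiced aspirates with voiced stops and the assimilation *-VRHV- → -VRRV-, can be sketched as ordered rewrite rules. The following Python fragment is a minimal sketch under stated assumptions: laryngeals are written h1, h2, h3 in ASCII, the resonants are taken to be r, l, m, n, and the names merge_voiced_aspirates and assimilate_laryngeal are hypothetical. None of the other changes (voicing after long vowels, vowel developments, the prothetic vowel of Hitt. isparr-) is modelled.

```python
import re

# Illustrative sketch only: notation and function names are assumptions
# made for this example, not a reconstruction tool.

VOWELS = "aeiouāēīōū"
RESONANTS = "rlmn"


def merge_voiced_aspirates(form: str) -> str:
    """*dh -> *d, *bh -> *b, *gh -> *g.

    The lookahead keeps a stop followed by a laryngeal (e.g. d + h1) intact,
    since there the h belongs to the laryngeal and not to an aspirate.
    """
    return re.sub(r"([bdg])h(?![123])", r"\1", form)


def assimilate_laryngeal(form: str) -> str:
    """*-VRHV- -> -VRRV-: a laryngeal between resonant and vowel assimilates."""
    pattern = rf"([{VOWELS}])([{RESONANTS}])h[123](?=[{VOWELS}])"
    return re.sub(pattern, r"\1\2\2", form)


def apply_changes(form: str) -> str:
    """Apply both sketched changes in sequence."""
    return assimilate_laryngeal(merge_voiced_aspirates(form))
```

With the unspecified vowel V of the text filled in as a, assimilate_laryngeal("sunh3a") yields sunna (cf. Hitt. sunna- ‘fill’) and assimilate_laryngeal("sperh1a") yields sperra (cf. Hitt. isparr-).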


Part II

Phonology & Morphology






By Carlos Quiles & Fernando López-Menchero