Third Edition



Part I.

Language and Culture

Part II.

Phonology & Morphology

Part III.


Part IV.

Texts & Dictionary




Carlos Quiles

Fernando López-Menchero

Version 5.00 (April 2011)

© 2011 by Carlos Quiles

© 2011 by Fernando López-Menchero



Avda. Sta. María de la Cabeza, 3, E-LL, Badajoz 06001, Spain.

Badajoz – Leg. Dep. BA-145-0 (2006)  |  Sevilla – Leg. Dep. SE-4405-2007 U.E.

ISBN-13: 978-1461022138  |  ISBN-10: 1461022134

Information, translations and revisions of this title:  <http://indo-european.info/>


Printed in the European Union

Published by the Indo-European Language Association <http://dnghu.org/>

This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit <http://creativecommons.org/licenses/by-sa/3.0/> or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Images taken or modified from Wikimedia projects are referenced with description and author-date, with usernames (or real names, if available), and links to the websites of origin in the Bibliography section, at the end of this book, unless they are in the public domain (PD).

This free (e)book is intended for nonprofit and educational purposes; its authors do not claim authorship of the excerpts referenced; it is not intended for specialised readers in IE linguistics (so the potential market of the copyrighted works remains intact); and the amount and substantiality of the portions used in relation to the copyrighted works as a whole are negligible. Therefore, the use of excerpts should fall within the fair use policy of international copyright laws. Since revisions of this free (e)book are published immediately, no material contained herein remains against the will and rights of authors or publishers.

The cover image has been modified from a photo of the Solvognen (The Sun Carriage) from the Bronze Age, on display at the National Museum (Nationalmuseet) in Denmark (Malene Thyssen 2004). For the epithet ‘wheel of the sun’, see §10.8.

While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein. The publisher has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Table of Contents

Guide to the Reader

Acknowledgements

Considerations of Method

The Three-Dorsal Theory

The Loss of Laryngeals

Conventions Used in This Book

Writing System

Modern Indo-European

1. Introduction

1.1. The Indo-European Language Family

1.2. Traditional Views

1.3. The Theory of the Three Stages

1.4. The Proto-Indo-European Urheimat

1.5. Other Archaeolinguistic Theories

1.6. Relationship to Other Languages

1.7. Indo-European Dialects

Schleicher’s Fable: From PIE to Modern English

1.7.1. Northern Indo-European dialects

1.7.2. Southern Indo-European Dialects

1.7.3. Anatolian Languages

2. Phonology

2.1. Classification of Sounds

2.2. Pronunciation

2.3. Syllables

2.4. Prosody

2.5. Accent

2.6. Vowel Change

2.7. Consonant Change

3. Words and their Forms

3.1. The Parts of Speech

3.2. Inflection

3.3. Root and Stem

3.4. Gender

3.5. Number

4. Nouns

4.1. Declension of Nouns

4.2. First Declension

4.2.1. First Declension Paradigm

4.2.2. First Declension in Examples

4.2.3. The Plural in the First Declension

4.3. Second Declension

4.3.1. Second Declension Paradigm

4.3.2. Second Declension in Examples

4.3.3. The Plural in the Second Declension

4.4. Third Declension

4.4.1. Third Declension Paradigm

4.4.2. In i, u

4.4.3. In Diphthong

4.4.4. The Plural in the Third and Fourth Declension

4.5. Fourth Declension

4.5.1. Fourth Declension Paradigm

4.5.2. In Occlusive, m, l

4.5.3. In r, n, s

4.5.4. The Plural in the Fourth Declension

4.6. Variable Nouns

4.7. Inflection Types

Excursus: Nominal Accent-Ablaut Patterns

4.8. Number Developments: The Dual

5. Adjectives

5.1. Inflection of Adjectives

5.2. The Motion

5.3. Adjective Specialisation

5.4. Comparison of Adjectives

5.5. Numerals

5.5.1. Classification of Numerals

5.5.2. Cardinals and Ordinals

5.5.3. Declension of Cardinals and Ordinals

5.5.4. Distributives

5.5.5. Numeral Adverbs

5.5.6. Multiplicatives

6. Pronouns

6.1. About the Pronouns

6.2. Personal Pronouns

6.3. Reflexive Pronouns

6.4. Possessive Pronouns

6.5. Anaphoric Pronouns

6.6. Demonstrative Pronouns

6.7. Interrogative and Indefinite Pronouns

6.7.1. Introduction

6.7.2. Compounds

6.7.3. Correlatives

6.8. Relative Pronouns

6.9. Other Pronouns

7. Verbs

7.1. Introduction

7.1.1. Voice, Mood, Tense, Person, Number

7.1.2. Voice

7.1.3. Moods

7.1.4. Aspect

7.1.5. Tenses of the Finite Verb

7.2. Forms of the Verb

7.2.1. The Verbal Stems

7.2.2. Verb-Endings

7.2.3. The Thematic Vowel

7.3. The Conjugations

7.4. The Four Stems

7.4.1. Tense-Stems and Verb Derivation

7.4.2. The Present Stem

7.4.3. The Aorist Stem

7.4.4. The Perfect Stem

7.5. Mood Stems

7.7. Noun and Adjective Forms

7.8. Conjugated Examples

7.8.1. Thematic Verbs

7.8.2. Athematic Inflection

7.8.3. Common PIE Stems

7.9. Verbal Composition

7.10. The Verbal Accent

8. Particles

8.1. Particles

8.2. Adverbs

8.3. Derivation of Adverbs

8.4. Prepositions

8.5. Conjunctions

8.6. Interjections

9. Morphosyntax

9.1. Verbal Morphosyntax

9.1.1. Person

9.1.2. Tense-Aspect and Mood

9.1.3. Voice

9.2. Nominal Morphosyntax

9.2.1. Nominative

9.2.3. Vocative

9.2.4. Accusative

9.2.5. Instrumental

9.2.6. Dative

9.2.7. Ablative

9.2.8. Genitive

9.2.9. Locative

9.2.10. Case Forms: Adverbial Elements

10. Sentence Syntax

10.1. The Sentence

10.1.1. Kinds of Sentences

10.1.2. Nominal Sentence

10.1.3. Verbal Sentence

10.2. Sentence Modifiers

10.2.1. Intonation Patterns

10.2.2. Sentence Delimiting Particles

10.3. Verbal Modifiers

10.3.1. Declarative Sentences

10.3.2. Interrogative Sentences

10.3.3. Negative Sentences

10.4. Nominal Modifiers

10.4.1. Adjective and Genitive Constructions

10.4.2. Compounds

10.4.3. Determiners in Nominal Phrases

10.4.4. Apposition

10.5. Modified forms of PIE Simple Sentences

10.5.1. Coordination

10.5.2. Complementation

10.5.3. Subordinate Clauses

10.6. Syntactic Categories

10.6.1. Particles as Syntactic Means of Expression

10.6.2. Marked Order in Sentences

10.6.3. Topicalisation with Reference to Emphasis

10.6.4. Wackernagel’s Law and the placement of clitics

10.7. Phrase and Sentence Prosody

10.8. Poetry

10.9. Names of persons

Appendix I: Indo-European in Use

I.1. Komtloqjom (Conversation)

I.2. Horatjosjo kanm (Horatii Carminvm)

I.3. The New Testament in Indo-European

I.3.1. Pater Nos (Lord’s Prayer)

I.3.2. Slwēje Marijā (Hail Mary)

I.3.2. Kréddhēmi (Nicene Creed)

I.3.3. Noudós Sūnús (Parable of the Prodigal Son)

I.3.4. Newos Bheidhos (New Testament) – Jōhanēs, 1, 1-14

I.4. The Rigveda in Indo-European

Appendix II: Late Indo-European Lexicon

Formal Aspects

II.1. English – Late Indo-European

II.2. Late Indo-European – English

II.3. Etymology From Descendant Languages

Appendix III: In-Depth Analysis

III.1. Root Nouns

III.2. Indefinite, Demonstrative, and Personal Pronouns

III.2.1. Indefinite Pronouns

III.2.2. Demonstrative Pronouns

III.2.3. Personal Pronouns

III.3. Word Formation: Common PIE Lengthenings and Suffixes

Bibliography and Further Reading

Online Resources

Wikipedia

Images and maps



In this newer edition of our Grammar, we keep to the original intention of this work: not to include personal opinions, but to collect the latest, best-reasoned academic work on the latest reconstructible PIE, providing everything that might be useful for the teaching and learning of Indo-European as a living language.

With that aim in mind, and with our commitment to follow the scientific method, we have revised the whole text in search of outdated material and unexplained forms, as well as inconsistencies in reconstructions or conventions. We have also restricted the number of marginal choices in favour of the general agreement, so that we could offer a clear, sober, and commonly agreed manual for learning Indo-European.

The approach this book has followed for more than half a decade is similar to that of Gamkrelidze–Ivanov (1994-1995), and especially to that of Adrados–Bernabé–Mendoza (1995-1998). Both returned to (and revised) the ‘Brugmannian’ Indo-European, the historical result of the development of certain isoglosses, both phonetic (loss of laryngeals, with the development of a short and long vowel system) and morphological (polythematic system in noun and verb, innovations in their inflection).

Adrados–Bernabé–Mendoza (1995-1998) distinguished between Late Indo-European and its parent-language Indo-Hittite – laryngeal, without distinction in vowel length, monothematic system. We developed that trend further, focussing on a post-Late Indo-European period, in search of a more certain, post-laryngeal IE, to avoid the merged laryngeal puzzle of the ‘disintegrating Indo-European’ of Bomhard (1984), and the conventional notation of a schwa indogermanicum (kept in Adrados–Bernabé–Mendoza), most suitable for a description of a complex period of phonetic change – which is possibly behind the flight of all other available modern works on PIE to the highly theoretical (but in all other respects clear and straightforward) PIH phonology. Morphology and syntax thus remain nearest to the older IE languages attested, always compared to Anatolian material, but avoiding the temporal inconsistencies that are found throughout the diachronic reconstructions in other current manuals.

We try to fill the void that Gamkrelidze–Ivanov and Adrados–Bernabé–Mendoza left by following works (Lehmann 1972, Rix 1986, etc.) that already differentiated PIH from Late Indo-European, trying to “see the three-stage theory to the bitter end. Once established the existence of the three-staged IE, a lot must still be done. We have to define the detail, and we must explain the reason for the evolution, which formal elements does PIE deal with, and how they are ascribed to the new functions and categories. These developments shall influence the history of individual languages, which will have to be rewritten. Not only in the field of morphology, but also in phonetics and syntax” (Adrados–Bernabé–Mendoza 1995-1998).

Apart from a reliable reconstruction of the direct ancestors of the older IE languages (North-West Indo-European, Proto-Greek and Proto-Indo-Iranian), this work ‘corrupts’ the natural language – like any classical language grammar – with the intention of showing a living language, and out of the need to establish some minimal writing conventions to embellish the phonetic notation. The question ‘why not learn Indo-European as a living language?’ arises as soon as reconstruction is focussed on a (scientifically) conservative approach – an ultimate consequence of the three-stage theory, and of the search for more certain reconstructions – that yields a reliable language system: one free from the need for theoretical artifices, or for personal opinions on ‘original’ forms, which try to fill the unending phonetic, morphological and syntactical uncertainties of the current diachronic PIE reconstruction.

As the learned reader might have already inferred, the question of “natural” vs. “artificial” is not easily answered concerning ancient languages. Ancient Greek phonetics, for example, is known through internal as well as external reconstruction, and the current state of the art is largely based on the body of evidence discussed extensively by linguists and philologists of the nineteenth and twentieth centuries, with many questions still unsolved. Furthermore, Ancient Greek is not one language; in fact, there are many dialects, each with different periods, and different representations of their sounds, all of which make up what we know by the unitary name Ancient Greek. Another example is Sanskrit, retained as different historical linguistic stages and dialects through oral tradition. Its first writings and grammatical rules were laid down centuries after it had ceased to be spoken, and centuries before it became the classical Indian language. Latin is no different from the above examples: it was systematised in the so-called classical period, while a real, dialectally and temporally variable Vulgar Latin was used by the different peoples who lived in the Roman Empire, so that e.g. some questions about the proper pronunciation are still debated today.

The interest in the study and use of Indo-European as a living language today is equivalent to the interest in the study and use of these ancient languages as learned languages in the Byzantine Empire, India and Mediaeval Europe, respectively. With regard to certainty in reconstruction, Late Indo-European early dialects are no less natural than these classical languages were in the past. Even modern languages, like English, are to a great extent learned languages, in which social trends and linguistic artifices constantly divide usage into formal and colloquial, educated and uneducated, often simply good or bad.

On the question of ‘dead’ vs. ‘living’ languages, heated debate surrounds e.g. the characterisation of Sanskrit, which, unlike other dead languages, is spoken, written and read today in India. The notion of the death of a language thus remains in an unclear realm between academia and public opinion.

I prefer to copy Michael Coulson’s words from the preface of a great introductory work on Sanskrit (from the Teach Yourself® series), referring originally to the way Indians used Sanskrit as a learned (and dead) language, far beyond the rules that grammarians had imposed. I think this text would remain valid if we substituted ‘Indo-European’ for ‘Sanskrit’; the ‘reconstruction’ of ‘IE scholars’ for the ‘rules’ of ‘Sanskrit grammarians’; and the ‘potential future IE writers’ for the ‘renowned Sanskrit writers’:

 «By [the time Kālidāsa, a writer fl. ca. the fifth century AD, lived] Sanskrit was not a mother tongue, but a language to be studied and consciously mastered. This transformation had come about through a gradual process, the beginnings of which are no doubt earlier than Pāṇini [ancient Indian Sanskrit grammarian, fl. fourth century BC] himself. (…) Kālidāsa learnt his Sanskrit from the rules of a grammarian living some 700 years before his time. Such a situation may well strike the Western reader as paradoxical. Our nearest parallel is in the position of Latin in Medieval Europe. There is, however, an important difference. Few would deny Cicero or Vergil a greater importance in Latin literature than any mediaeval author. Conversely, few Sanskritists would deny that the centre of gravity in Sanskrit literature lies somewhere in the first millennium AD, for all that its authors were writing in a so-called ‘dead-language’.

On this point it may be useful to make a twofold distinction – between a living and a dead language, and between a natural and a learned one. A language is natural when it is acquired and used instinctively; it is living when people choose to converse and formulate ideas in it in preference to any other. To the modern Western scholar Sanskrit is a dead as well as a learned language. To Kālidāsa or Śaṅkara [ninth century Indian philosopher from a Dravidian-speaking region] it was a learned language but a living one. (The term ‘learned’ is not entirely satisfactory, but the term ‘artificial’, which is the obvious complementary of ‘natural’, is normally reserved for application to totally constructed languages such as Esperanto.)

(…) Living languages, whether natural or learned, change and develop. But when a learned language such as literary English is closely tied to, and constantly revitalized by, a natural idiom, its opportunities for independent growth are limited. Sanskrit provides a fascinating example of a language developing in complete freedom from such constraints as an instrument of intellectual and artistic expression. To say that Classical Sanskrit was written in conformity with Pāṇini’s rules is true, but in one sense entirely misleading. Pāṇini would have been astounded by the way in which Bāṇa or Bhavabhūti or Abhinavagupta handled the language. It is precisely the fact that Sanskrit writers insisted on using Sanskrit as a living and not as a dead language that has often troubled Western scholars. W. D. Whitney, a great but startlingly arrogant American Sanskritist of the nineteenth century, says of the Classical language: ‘Of linguistic history there is next to nothing in it all; but only a history of style, and this for the most part showing a gradual depravation, an increase of artificiality and an intensification of certain more undesirable features of the language – such as the use of passive constructions and of participles instead of verbs, and the substitution of compounds for sentences.’ Why such a use of passives, participles and compounds should be undesirable, let alone depraved, is left rather vague, and while there have been considerable advances in linguistic science in the past fifty years there seems to have been nothing which helps to clarify or justify these strictures. Indeed, Whitney’s words would not be worth resurrecting if strong echoes of them did not still survive in some quarters.

Acceptance of Pāṇini’s rules implied a final stabilization of the phonology of Sanskrit, and also (at least in the negative sense that no form could be used which was not sanctioned by him) of its morphology. But Pāṇini did not fix syntax. To do so explicitly and incontrovertibly would be difficult in any language, given several ways of expressing the same idea and various other ways of expressing closely similar ideas.»

Badajoz, April 2011


Guide to the Reader

A. Abbreviations

abl.: ablative

acc.: accusative

act.: active

adj.: adjective

adv.: adverb

Alb.: Albanian

Arm.: Armenian

aor.: aorist

aux.: auxiliary

Av.: Avestan

BSl.: Balto-Slavic

CA: Common Anatolian

Cel.: Celtic

cf.: confer ‘compare, contrast’

Cz.: Czech

dat.: dative

Du.: Dutch

e.g.: exempli gratia ‘for example’

Eng.: English

esp.: especially

f.: feminine

fem.: feminine

gen.: genitive

Gaul.: Gaulish

Gk.: Greek

Gmc.: Proto-Germanic

Goth.: Gothic

Hitt.: Hittite

Hom.: Homeric

IE: Indo-European

IED: Late Indo-European dialects

imp.: imperative

imperf.: imperfect

Ind.-Ira.: Indo-Iranian

ins.: instrumental

int.: interrogative

Ita.: Italic

Lat.: Latin

LIE: Late Indo-European

Lith.: Lithuanian

Ltv.: Latvian

loc.: locative

Luw./Luv.: Luvian

Lyc.: Lycian

m.: masculine

masc.: masculine

M.H.G.: Middle High German

mid.: middle-passive voice

MIE: Modern Indo-European

Myc.: Mycenaean

n.: neuter

neu.: neuter

nom.: nominative

NP: noun phrase

NWIE: North-West Indo-European

O: object

Obj.: object

O.Av.: Old Avestan

O.C.S.: Old Church Slavic

O.E.: Old English

O.Ind.: Old Indian

O.Ir.: Old Irish

O.H.G.: Old High German

O.Hitt.: Old Hittite

O.Lat.: Archaic Latin

O.Lith.: Old Lithuanian

O.N.: Old Norse

O.Pers.: Old Persian

O.Pruss.: Old Prussian

O.Russ.: Old Russian

opt.: optative

Osc.: Oscan

OSV: object-subject-verb order

OV: object-verb order

perf.: perfect

PAn: Proto-Anatolian

PGmc.: Pre-Proto-Germanic

PII: Proto-Indo-Iranian

PGk: Proto-Greek

Phryg.: Phrygian

PIE: Proto-Indo-European

PIH: Proto-Indo-Hittite

pl.: plural

pres.: present

pron.: pronoun

Ptc.: particle

Russ.: Russian

sg.: singular

Skt.: Sanskrit

Sla.: Slavic

SOV: subject-object-verb order

subj.: subjunctive

SVO: subject-verb-object order

Toch.: Tocharian

Umb.: Umbrian

Ved.: Vedic

v.i.: vide infra ‘see below’

VO: verb-object order

voc.: vocative

VP: verb phrase

v.s.: vide supra ‘see above’

VSO: verb-subject-object order

1st: first person

2nd: second person

3rd: third person


B. Symbols


denotes a reconstructed form, not preserved in any written documents


denotes a reconstructed form through internal reconstruction

“comes from” or “is derived from”

“turns into” or “becomes”


indicates morpheme boundary, or separates off that part of a word that the reader should focus on

( )

encloses part of a word that is not relevant to the discussion, or that is an optional part

“zero desinence” or “zero-grade”


denotes a wrong formation


C. Spelling Conventions

All linguistic forms are written in italics. The only exceptions are reconstructed IED forms, which are given in boldface, or in italics when they are morphemes or dialectal forms (from PII, PGk, or from East or West European). We use a non-phonetic writing for IEDs, following the conventions in Writing System (see below).

When representing word schemes:

C = consonant

R = resonant (r, l, m, n)

T = dental

K = occlusive

J = glide (j, w)

H = any laryngeal or merged laryngeal

V = vowel

V̄ = long vowel

I = i, u

° = epenthetic or auxiliary vowel

(conventionally, the symbol ° that belongs under the vocalic resonants is placed before them in these cases)

# = syllabic limit

Citation: parenthetical referencing of author-date is used for frequently cited books (referenced in the Bibliography), and author-title for articles and other books.


Acknowledgements

I owe special and personal gratitude to my best friend and now fiancée Mayte, whose many lovely qualities do not include knowledge of or an interest in historical linguistics. But without her this never would have been written.

I have been extremely fortunate to benefit from Fernando López-Menchero’s interest and from his innumerable contributions, revisions, and corrections. Without his deep knowledge of Ancient Greek and Latin, as well as his interest in the most recent research in IE studies, this grammar would have been unthinkable.

I have received the invaluable support of many colleagues and friends from the University of Extremadura (UEx) since we began publishing this book half a decade ago. The University has been crucial to this enterprise: first in 2005, when Prof. Antonio Muñoz PhD, Vice-Dean of the Faculty of Library Science, expert in Business Information, as well as other signatories – doctors in Economics and English Philology –, supported this language revival project before the competition committee and afterwards; in 2006, when representatives of the Dean’s office, of the Regional Government of Extremadura, and of the Mayor’s office of Cáceres recognised our work by awarding our project a prize in the “Entrepreneurship Competition in Imagination Society”, organizing and subsidizing a business trip to Barcelona’s most innovative projects; and in 2007, when we received the unconditional support of the Department of Classical Antiquity of the UEx.

Over the years I have also received feedback from informed end-users, as well as from friends and members of the Indo-European Language Association, who were in the best position to judge such matters as the intelligibility and consistency of the whole. I am also indebted to Manuel Romero from Imcrea Diseño Editorial, for his help with the design and editorial management of the first printed edition.

The influence of the work of many recent scholars is evident on these pages. Those who are most often cited include (in alphabetical order): D.Q. Adams, F.R. Adrados, David Anthony, R.S.P. Beekes, Emile Benveniste, Alberto Bernabé, Thomas Burrow, George Cardona, James Clackson, B.W. Fortson, Matthias Fritz, T.V. Gamkrelidze, Marija Gimbutas, Eric Hamp, V.V. Ivanov, Jay Jasanoff, Paul Kiparsky, Alwin Kloekhorst, F.H.H. Kortlandt, Jerzy Kuryłowicz, W.P. Lehmann, J.P. Mallory, Manfred Mayrhofer, Wolfgang Meid, Michael Meier-Brügger, Torsten Meissner, Craig Melchert, Julia Mendoza, Anna Morpurgo Davies, Norbert Oettinger, Edgar Polomé, C.J. Ruijgh, Paolo Ramat, Donald Ringe, Helmut Rix, A.L. Sihler, Sergei Starostin, J.L. Szemerényi, Francisco Villar, Calvert Watkins, M.L. West.

Considerations of Method

This work is intended for language learners, and is not conceived as a defence of personal research. Excerpts of texts from many different sources have been copied literally, especially regarding controversial or untreated aspects. We feel that, whereas the field of Indo-European studies is indeed mature, and knowledge is out there to be grasped, we lack a comprehensive summary of the available consensual theories, scattered over innumerable specialised personal books and articles.

We must begin this work by clearly setting out our intended working method for selecting and summarising the currently available theories: it is basically, as is commonly accepted today for PIE reconstruction, the comparative method, with the help of internal reconstruction.

NOTE. Adrados–Bernabé–Mendoza (1995-1998): “We think (…) that a linguist should follow, to establish relations among languages, linguistic methods. If then the results are coincident, or compatible, or might be perfected with those obtained by archaeologists, so much the better. But a mixed method creates all types of chain mistakes and arbitrary results. We have seen that many times. And a purely archaeological method like the one supported lately by Renfrew 1987 or, in certain moments, the same Gimbutas 1985, clashes with the results of Linguistics.

The method has to rely on [the comparative method and internal reconstruction]. We have already expressed our mistrust in the results based on typological comparisons with remote languages (glottalic theory, ergative, etc.). Now they are more frequent in books like Gamkrelidze–Ivanov 1994-1995.

And fundamentally lexical comparisons should not be the first argument in comparisons, either. We do not doubt their interest in certain moments, e.g. to illuminate the history of Germanic in relation with Finnish. And they could have interest in different comparisons: with Uralo-Altaic languages, Semitic, Caucasic, Sumerian, etc.”

The guidelines that should be followed, as summarised by Beekes (1995):

1. “See what information is generated by internal reconstruction.

2. Collect all material that is relevant to the problem.

3. Try to look at the problem in the widest possible context, thus in relation to everything else that may be connected with it. (…)

4. Assume that corresponding forms, that is to say, forms whose meaning (probably) and whose structures (probably) seem to be alike, all derive from one common ancestor.

5. The question of how deviant forms should be evaluated is a difficult one to answer. When such a form can be seen as an innovation within a particular language (or group of languages), the solution is that the form in question is young and as such cannot be important for the reconstruction of the original form. Whenever a deviant form resists explanation it becomes necessary to consider the possibility that the very form in question may be one that preserves the original. (…)

6. For every solution the assumed (new) sound-laws must be phonetically probable, and the analogies must be plausible.

7. The reconstructed system must be probable (typological probability). If one should reconstruct a system which is found nowhere else in any of the known languages, there will always be, to say the least, reasons for doubt. On the other hand, every language is unique, and there is thus always the possibility that something entirely unknown must be reconstructed.”

There are two main aspects of the comparative method as it is usually applied that strike the ‘pure scientific’ reader, though, always obsessed with adopting a conservative approach to research, in the sense of security or reliability. We shall take words from Claude Bernard’s major discourse on scientific method, An Introduction to the Study of Experimental Medicine (1865), to illustrate our point:

1. Authority vs. Observation. It is through observation that science is carried forward — not through uncritically accepting the authority of academic or scholastic sources. Observable reality is our only authority. “When we meet a fact which contradicts a prevailing theory, we must accept the fact and abandon the theory, even when the theory is supported by great names and generally accepted”.

NOTE. Authority is certainly a commonly used, strong and generally sound basis to keep working on comparative grammar, though, because this is a field based on ‘pyramidal’ reasoning and not experimental research. But authority should be questioned whenever it is needed. Authority – be it the view of the majority, or the opinion of a renowned linguist or linguistic school – does not mean anything, and ideas are not to be respected because of who supports (or supported) them.

2. Verification and Disproof. “Theories are only hypotheses, verified by more or less numerous facts. Those verified by the most facts are the best, but even then they are never final, never to be absolutely believed”. What is rationally true is the only authority.

In hypothesis testing in science, decisions are usually made using a statistical null-hypothesis test approach. In linguistics and its comparative method, authority is sometimes placed as the null hypothesis or H0 (as in many non-experimental sciences), while counter-arguments must take the H1 position, and are therefore at a disadvantage against the authority view.

If two theories show a strong argument against the basic H0 (“nothing demonstrated”), and are therefore accepted as alternative explanations for an observed fact, then the most reasonable one must be selected as the new H0, on the grounds of the lex parsimoniae (or the so-called Ockham’s razor), whereby H0 should be the competing hypothesis that makes the fewest new assumptions, when the hypotheses are equal in other respects (e.g. both sufficiently explain available data in the first place).

NOTE. The principle is often incorrectly summarised as “the simplest explanation is most likely the correct one”. This summary is misleading, however, since the principle is actually focussed on shifting the burden of proof in discussions. That is, the Razor is a principle that suggests we should tend towards simpler theories until we can trade some simplicity for increased explanatory power. Contrary to the popular summary, the simplest available theory is sometimes a less accurate explanation. Philosophers also add that the exact meaning of “simplest” can be nuanced in the first place.
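
The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hypothesis` record, its fields and the figures attached to "theory A", "theory B" and "theory C" are hypothetical and do not come from any published evaluation. The sketch shows nothing more than the rule itself: among the hypotheses that sufficiently explain the data, the one with the fewest new assumptions becomes the new H0.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains_data: bool    # does it sufficiently explain the available data?
    new_assumptions: int   # number of new assumptions it needs (hypothetical)


def select_h0(candidates: Iterable[Hypothesis]) -> Optional[Hypothesis]:
    """Return the adequate hypothesis that makes the fewest new assumptions."""
    adequate = [h for h in candidates if h.explains_data]
    if not adequate:
        return None  # "nothing demonstrated": the basic H0 stands
    return min(adequate, key=lambda h: h.new_assumptions)


# Hypothetical figures, for illustration only.
candidates = [
    Hypothesis("theory A", explains_data=True, new_assumptions=3),
    Hypothesis("theory B", explains_data=True, new_assumptions=2),
    Hypothesis("theory C", explains_data=False, new_assumptions=1),
]

print(select_h0(candidates).name)
```

Note that "theory C" is discarded even though it is the simplest: simplicity is only compared among hypotheses that are equal in explaining the data, which is the point made in the note above.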

As an example of the applicability of the scientific method, we will take two difficult aspects of PIE reconstructions: the series of velars and the loss of laryngeals.

The problem with these particular reconstructions might be summarised by the words found in Clackson (2007): “It is often a fault of Indo-Europeanists to over-reconstruct, and to explain every development of the daughter languages through reconstruction of a richer system in the parent language.”

The Three-Dorsal Theory

PIE phonetic reconstruction is tied to the past: acceptance of three series of velars in PIE is still widespread today. We followed the reconstruction of ‘palatovelars’, according to general authority and convention, but we have changed our minds since the first edition of this grammar.

Direct comparison in early IE studies, informed by the centum-satem isogloss, yielded the reconstruction of three rows of dorsal consonants in Late Indo-European by Bezzenberger (Die indogermanischen Gutturalreihen, 1890), a theory which became classic after Brugmann included it in the 2nd Edition of his Grundriss. It was based on vocabulary comparison: so e.g. from PIE *km̥tóm ‘hundred’, there are so-called satem (cf. O.Ind. śatám, Av. satəm, Lith. šimtas, O.C.S. sto) and centum languages (cf. Gk. -katón, Lat. centum, Goth. hund, O.Ir. cet).

The palatovelars *kj, *gj, and *gjh were supposedly [k]- or [g]-like sounds which underwent a characteristic phonetic change in the satemised languages – three original “velar rows” had then become two in all Indo-European dialects attested. According to that original belief, then, the centum group of languages merged the palatovelars *kj, *gj, and *gjh with the plain velars *k, *g, and *gh, while the satem group of languages merged the labiovelars *kw, *gw, and *gwh with the plain velars *k, *g, and *gh.

The reasoning for reconstructing three series was very simple: an easy and straightforward solution for the parent PIE language must be that it had all three rows found in the proto-languages, which would have merged into two rows depending on their dialectal (centum vs. satem) situation – even if no single IE dialect shows three series of velars. Also, for a long time this division was identified with an old dialectal division within IE, especially because both groups appeared not to overlap geographically: the centum branches were to the west of satem languages. Such an initial answer should be considered unsound today, at least as a starting-point to obtain a better explanation for this ‘phonological puzzle’ (Bernabé).

Many Indo-Europeanists still maintain three distinct series of velars for Late Indo-European (and also for Indo-Hittite), although research tends to show that the palatovelar series was a late phonetic development of certain satem dialects, later extended to others. This belief was originally formulated by Antoine Meillet (De quelques difficultés de la théorie des gutturales indo-européennes, 1893), and has been followed by linguists like Hirt (Zur Lösung der Gutturalfrage im Indogermanischen, 1899; Indogermanische Grammatik, BD III, Das Nomen 1927), Lehmann (Proto-Indo-European Phonology, 1952), Georgiev (Introduzione allo studio delle lingue indoeuropee, 1966), Bernabé (“Aportaciones al estudio fonológico de las guturales indoeuropeas”, Em. 39, 1971), Steensland (Die Distribution der urindogermanischen sogenannten Gutturale, 1973), Miller (“Pure velars and palatals in Indo-European: a rejoinder to Magnusson”, Linguistics 178, 1976), Allen (“The PIE velar series: Neogrammarian and other solutions in the light of attested parallels”, TPhS, 1978), Kortlandt (“H2 and oH2”, LPosn, 1980), Shields (“A new look at the centum/satem Isogloss”, KZ 95, 1981), etc.

NOTE. There is a general trend to reconstruct labiovelars and plain velars, so that the hypothesis of two series of velars is usually identified with this theory. Among those who support two series of velars there is, however, a minority who consider the labiovelars a secondary development from the pure velars, and reconstruct only velars and palatovelars (Kuryłowicz), a view already criticised by Bernabé, Steensland, Miller and Allen. The proposal to reconstruct only a labiovelar and a palatal series (Magnusson) has found still less acceptance.

Arguments in favour of only two series of velars include:

1. In most circumstances palatovelars appear to be allophones resulting from the neutralisation of the other two series in particular phonetic circumstances. Their dialectal articulation was probably constrained, either to a special phonetic environment (as in the Romance evolution of Latin k before e and i), or to the analogy of alternating phonetic forms.

NOTE. However, it is difficult to pinpoint exactly what the circumstances of the allophony are, although it is generally accepted that neutralisation occurred after s and u, and often before r or a; also apparently before m and n in some Baltic dialects. The original allophonic distinction was disturbed when the labiovelars were merged with the plain velars. This produced a new phonemic distinction between palatal and plain velars, with an unpredictable alternation between palatal and plain in related forms of some roots (those from original plain velars) but not others (those from original labiovelars). Subsequent analogical processes generalised either the plain or palatal consonant in all forms of a particular root. Those roots where the plain consonant was generalised are those traditionally reconstructed as having plain velars in the parent language, in contrast to palatovelars.

2. The reconstructed palatovelars and plain velars appear mostly in complementary distribution, which supports their explanation as allophones of the same phonemes. Meillet (Introduction à l’étude comparative des langues indo-européennes, 1903) established the contexts in which there are only velars: before a, r, and after s, u; while Georgiev (1966) clarified that the palatalisation of velars had been produced before e, i, j, and before liquid or nasal or w + e, i, offering statistical data supporting his conclusions. The presence of a palatalised velar before o is then due to analogy with roots in which (due to the ablaut) the velar phoneme is found before e and o, so the alternation *kje/*ko would be levelled as *kje/*kjo.

3. There is residual evidence of various sorts in satem languages of a former distinction between velar and labiovelar consonants:

·      In Sanskrit and Balto-Slavic, in some environments, resonants become iR after plain velars but uR after labiovelars.

·      In Armenian, some linguists assert that kw is distinguishable from k before front vowels.

·      In Albanian, some linguists assert that kw and gw are distinguishable from k and g before front vowels.

NOTE. This evidence shows that the labiovelar series was distinct from the plain velar series in LIE, and could not have been a secondary development in the centum languages. However, it says nothing about the palatovelar vs. plain velar series. When this debate initially arose, the concept of a phoneme and its historical emergence was not clearly understood, however, and as a result it was often claimed (and sometimes is still claimed) that evidence of three-way velar distinction in the history of a particular IE language indicates that this distinction must be reconstructed for the parent language. This is theoretically unsound, as it overlooks the possibility of a secondary origin for a distinction.

4. The palatovelar hypothesis would support an evolution kj → k of centum dialects, i.e. a move of palatovelars to back consonants, which is clearly against the general tendency of velars to move their articulation forward and palatalise in these environments. A trend of this kind is unparalleled and therefore typologically a priori unlikely (although not impossible), and requires further assumptions.

5. The plain velar series is statistically rarer than the other two in a PIE lexicon reconstructed with three series; it is entirely absent from affixes, and most of the words in which it appears are of a phonetic shape that could have inhibited palatalisation.

NOTE. Common examples are:

o *yug-óm ‘yoke’: Hitt. iukan, Gk. zdugón, Skt. yugá-, Lat. iugum, O.C.S. igo, Goth. juk.

o *ghosti- ‘guest, stranger’: Lat. hostis, Goth. gasts, O.C.S. gostĭ.

“The paradigm of the word for ‘yoke’ could have shown a palatalizing environment only in the vocative *yug-e, which is unlikely ever to have been in common usage, and the word for ‘stranger’ ghosti- only ever appears with the vocalism o”. (Clackson 2007).

6. Alternations between plain velars and palatals are common in a number of roots across different satem languages, where the same root appears with a palatal in some languages but a plain velar in others.

NOTE. This is consistent with the analogical generalisation of one or another consonant in an originally alternating paradigm, but difficult to explain otherwise:

o  *ak-/ok- ‘sharp’, cf.  Lith. akúotas, O.C.S. ostrŭ, O.Ind. asrís, Arm. aseln, but Lith. asrùs.

o  *akmon- ‘stone’, cf.  Lith. akmuõ, O.C.S. kamy, O.Ind. áśma, but Lith. âsmens.

o  *keu- ‘shine’, cf. Lith. kiáune, Russ. kuna, O.Ind. svas, Arm. sukh.

o  *bhleg- ‘shine’, cf. O.Ind.  bhárgas, Lith. balgans, O.C.S. blagŭ, but Ltv. blâzt.

o  *gherdh- ‘enclose’, O.Ind. g, Av. gərəda, Lith. gardas, O.C.S. gradu, Lith. zardas, Ltv. zârdas.

o  *swekros ‘father-in-law’, cf. O.Sla. svekry, O.Ind. śvaśru.

o  *peku- ‘stock animal’; cf. O.Lith. pkus, Skt. paśu-, Av. pasu-.

o  *kleus- ‘hear’; cf. Skt. śrus, O.C.S. slušatĭ, Lith. kláusiu.

A rather weak argument in favour of palatovelars, dismissing these findings, is given in Clackson (2007): “Such forms could be taken to reflect the fact that Baltic is geographically peripheral to the satem languages and consequently did not participate in the palatalization to the same degree as other languages”.

7. There are different pairs of satemised and non-satemised velars found within the same language.

NOTE. The old argument proposed by Brugmann (and later copied by many dictionaries) about “centum loans” is not tenable today. For more on this, see Szemerényi (1978, review from Adrados–Bernabé–Mendoza 1995-1998), Mayrhofer (“Das Gutturalproblem und das indogermanische Wort für Hase”, Studien zur indogermanischen Grundsprache, 1952), Bernabé (1971). Examples include:

o  *selg-  ‘throw’, cf. O.Ind. sjáti, sargas

o  *kau/keu- ‘shout’, cf. Lith. kaukti, O.C.S. kujati, Russ. sova (as Gk. kauax); O.Ind. kauti, suka-.

o  *kleu- ‘hear’, Lith. klausýti, slove, O.C.S. slovo;  O.Ind. karnas, sruti,  srósati, śrnóti, sravas.

o  *leuk-, O.Ind. rokás, ruśant-.

8. The number and periods of satemisation trends reconstructed for the different branches are not coincident.

NOTE. So for example Old Indian shows two stages,

o    PIE *k → O.Ind. s

o    PIE *kwe, *kwi → O.Ind. ke, ki; PIE *ske, *ski > O.Ind. c (cf. cim, candra, etc.)

In Slavic, three stages are found,

o    PIE *k → s

o    PIE *kwe, *kwi → č  (čto, čelobek)

o    PIE *kwoi→*koi→*ke gives ts (as Sla. tsená)

9. In most attested languages which present aspirates as a result of the so-called palatovelars, the palatalisation of other phonemes is also attested (e.g. palatalisation of labiovelars before e, i), which may indicate that there is an old trend to palatalise all possible sounds, of which the palatalisation of velars is the oldest attested result.

NOTE. It is generally believed that satemisation could have started as a late dialectal ‘wave’, which eventually affected almost all PIE dialectal groups. The origin is probably to be found in velars followed by e, i, even though alternating forms like *gen/gon caused natural analogical corrections within each dialect, which obscure the original situation still further. Thus, non-satemised forms in so-called satem languages would be non-satemised remains of the original situation, just as Spanish has feliz and not ˟heliz, or fácil and not ˟hácil, or French facile and nature, and not ˟fêle or ˟nûre as one should expect from its phonetic evolution.

10. The existence of satem languages like Armenian in the Balkans, a centum territory, and the presence of Tocharian, a centum dialect, in Central Asia, being probably a northern IE dialect.

NOTE. The traditional explanation of a three-way dorsal split requires that all centum languages share a common innovation that eliminated the palatovelar series, due to the a priori unlikely move of palatovelars to back consonants (see above). Unlike for the satem languages, however, there is no evidence of any areal connection among the centum languages, and in fact there is evidence against such a connection – the centum languages are geographically noncontiguous. Furthermore, if such an areal innovation happened, we would expect to see some dialect differences in its implementation (cf. the above differences between Balto-Slavic and Indo-Iranian), and residual evidence of a distinct palatalised series. In fact, however, neither type of evidence exists, suggesting that there was never a palatovelar series in the centum languages. (Evidence does exist for a distinct labiovelar series in the satem languages, though; see above.)

11. A system of two gutturals, velars and labiovelars, is a linguistic anomaly, isolated in the IE occlusive subsystem – there are no parallel oppositions bw-b, pw-p, tw-t, dw-d, etc. Only one feature, their pronunciation with an accompanying rounding of the lips, helps distinguish them from each other. Such a system has been attested in some older IE languages. A system of three gutturals – palatovelars, velars and labiovelars –, with a threefold distinction isolated in the occlusive system, is still less likely.

NOTE. In the two-dorsal system, labiovelars turn into velars before -u, and there are some neutralisation positions which help identify labiovelars and velars; also, in some contexts (e.g. before -i, -e) velars tend to move their articulation forward and eventually palatalise. Both trends led eventually to centum and satem dialectalisation.
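
The reasoning behind argument 2 (complementary distribution as support for an allophonic analysis) can be illustrated with a minimal sketch. The function name `shared_environments` and the toy observations are assumptions made for this example only; they are not a real corpus and carry no statistical weight. The check simply asks whether two variants are ever attested in the same environment.

```python
from collections import defaultdict


def shared_environments(observations):
    """Return the environments in which more than one variant is attested."""
    variants_by_env = defaultdict(set)
    for environment, variant in observations:
        variants_by_env[environment].add(variant)
    return {env for env, variants in variants_by_env.items() if len(variants) > 1}


# Toy observations (environment, variant); illustrative only, not real data.
observations = [
    ("before a", "plain"),
    ("after s", "plain"),
    ("before e", "palatal"),
    ("before i", "palatal"),
]

overlap = shared_environments(observations)
print("complementary" if not overlap else f"overlap in: {sorted(overlap)}")
```

An empty result is compatible with treating the two variants as allophones of one phoneme; any environment returned by the function is a case in which both variants occur and which therefore needs a separate explanation, such as analogy.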

Those who support the model of the threefold distinction in PIE cite evidence from Albanian (Pedersen) and Armenian (Pisani), which seem to treat plain velars differently from the labiovelars in at least some circumstances, as well as the fact that Luwian could have had distinct reflexes of all three series.

NOTE 1. It is disputed whether Albanian shows remains of two or three series (cf. Ölberg “Zwei oder drei Gutturalreihen? Vom Albanischen aus gesehen” Scritti…Bonfante 1976; Kortlandt 1980; Pänzer “Ist das Französische eine Satem-Sprache? Zu den Palatalisierungen im Ur-Indogermanischen und in den indogermanischen Einzelsprachen”, Festschrift für J. Hübschmidt, 1982), although it is indeed very unlikely that the worst attested and one of the most recently attested (and neither isolated nor remote) IE dialects should be the only one to show some remains of the oldest phonetic system. Clackson (2007), supporting the three series: “Albanian and Armenian are sometimes brought forward as examples of the maintenance of three separate dorsal series. However, Albanian and Armenian are both satem languages, and, since the *kj series has been palatalised in both, the existence of three separate series need not disprove the two-dorsal theory for PIE; they might merely show a failure to merge the unpalatalised velars with the original labio-velars.”

NOTE 2. Supporters of the palatovelars cite evidence from Luwian, an Anatolian language, which supposedly shows a three-way velar distinction *kj → z (probably [ts]); *k → k; *kw → ku (probably [kw]), as defended by Melchert (“Reflexes of *h3 in Anatolian”, Sprache 38 1987). So, the strongest argument in favour of the traditional three-way system is that the distinction supposedly derived from Luwian findings must be reconstructed for the parent language. However, the underlying evidence “hinges upon especially difficult or vague or otherwise dubious etymologies” (see Sihler 1995); and, even if those findings are supported by other evidence in the future, it is obvious that Luwian might also have been in contact with satemisation trends of other Late IE dialects, that it might have developed its own satemisation trend, or that maybe the whole system was remade within the Anatolian branch. Clackson (2007), supporting the three series, states: “This is strong independent evidence for three separate dorsal series, but the number of examples in support of the change is small, and we still have a far from perfect understanding of many aspects of Anatolian historical phonology.”

Also, one of the most difficult problems which subsists in the interpretation of the satemisation as a phonetic wave is that, even though in most cases the variation *kj/k may be attributed either to a phonetic environment or to the analogy of alternating apophonic forms, there are some cases in which neither one nor the other may be applied, i.e. it is possible to find words with velars in the same environments as words with palatals.

NOTE. Compare for example *okjtō(u), eight, which presents k before an occlusive in a form which shows no change (to suppose a syncope of an older *okjitō, as does Szemerényi, is an explanation ad hoc). Other examples in which the palatalisation cannot be explained by the next phoneme nor by analogy are *swekru- ‘husband’s mother’, *akmōn ‘stone’, *peku ‘cattle’, which are among those not shared by all satem languages. Such unexplained exceptions, however, are not sufficient to justify a third row of ‘later palatalised’ velars (see Bernabé 1971; Cheng & Wang “Sound change: actuation and implementation”, Lg. 51, 1975), although there are still scholars who come back to the support of the hypothesis of three velars. So e.g. Tischler 1990 (reviewed in Meier-Brügger 2003): “The centum-satem isogloss is not to be equated with a division of Indo-European, but rather represents simply one isogloss among many…examples of ‘centum-like aspects’ in satem languages and of ‘satem-like aspects’ in centum languages that may be evaluated as relics of the original three-part plosive system, which otherwise was reduced everywhere to a two-part system.”

Newer trends to support the old assumptions include e.g. Huld (1997, reviewed in Clackson 2007), in which the old palatal *kj is reconstructed as a true velar, and *k as a uvular stop, so that the problem of the a priori unlikely and unparalleled merger of palatal with velar in centum languages is theoretically solved.

As is clear from the development of the dorsal reconstruction, the theory that made the fewest assumptions was that an original Proto-Indo-European had two series of velars. These facts should therefore have shifted the burden of proof already by the time Meillet (1893) rejected the proposal of three series; but the authority of the Neogrammarians and of well-established works of the last century, as well as traditional conventions, probably weighed (and still weigh) more than reasoned argument.

NOTE. More than half a century ago a similar opinion on the most reasonable reconstruction was already held, one that is still not followed today, as the British Sanskritist Burrow (1955) shows: “The difficulty that arises from postulating a third series in the parent language, is that no more than two series (…) are found in any of the existing languages. In view of this it is exceedingly doubtful whether three distinct series existed in Indo-European. The assumption of the third series has been a convenience for the theoreticians, but it is unlikely to correspond to historical fact. Furthermore, on examination, this assumption does not turn out to be as convenient as would be wished. While it accounts in a way for correspondences like the above which otherwise would appear irregular, it still leaves over a considerable number of forms in the satem-languages which do not fit into the framework (…) Examples of this kind are particularly common in the Balto-Slavonic languages (…). Clearly a theory which leaves almost as many irregularities as it clears away is not very soundly established, and since these cases have to be explained as examples of dialect mixture in early Indo-European, it would appear simplest to apply the same theory to the rest. The case for this is particularly strong when we remember that when false etymologies are removed, when allowance is made for suffix alternation, and when the possibility of loss of labialization in the vicinity of the vowel u is considered (e.g. kraví-, ugrá-), not many examples remain for the foundation of the theory.”

Of course, we cannot (and probably never will) actually know if there were two or three series of velars in LIE, or PIH, and because of that the comparative method should be preferred over gut intuition, historical authority, or convention, which are obstacles to progress in a dynamic field like IE studies.

As Adrados (2005) puts it with bitterness: “Indo-Europeanists keep working on a unitary and flat PIE, that of Brugmann’s reconstruction. A reconstruction prior to the decipherment of Hittite and the study of Anatolian! This is but another proof of the terrible conservatism that has seized the scientific discipline that is or must be Indo-European linguistics: it moves forward in the study of individual languages, but the general theory is paralysed”.

The Loss of Laryngeals

Today, the reconstruction of consonantal sounds to explain what was previously reconstructed as an uncertain vocalic schwa indogermanicum or schwa primum is firmly accepted in IE studies in general, and there is general agreement on where laryngeals should be reconstructed. Even the number and quality of those laryngeals is today a matter of common agreement, although alternative numbers of laryngeals and proposals for their actual phonemic value do exist.

However, as Clackson (2007) sums up: “Particularly puzzling is the paradox that laryngeals are lost nearly everywhere, in ways that are strikingly similar, yet apparently unique to each language branch. We can of course assume some common developments already within PIE, such as the effect of the laryngeals *h2 and *h3 to change a neighbouring *e to *a or *o, but the actual loss of laryngeals must be assumed to have taken place separately after the break-up of the parent language (…) it would have seemed a plausible assumption that the retention of *h2, and possibly also *h1 and *h3, is an archaism of Anatolian, and the loss of the laryngeals was made in common by the other languages.”

In the vocalic inventory of current Late Indo-European reconstruction, the following evolution paradigm is widespread, following Beekes (1995), Meier-Brügger (2003) and Ringe (2005):

[Table: evolution of the vocalic inventory from PIH to LIE; surviving entry: *ī]

NOTE 1. A differentiation between early or pre-LIE and late or post-LIE has to be made. An auxiliary vowel was first inserted in the evolution PIH → pre-LIE in a certain position, known because it is found in all dialects alike: *Ch1C → *Ch1°C, *Ch2C → *Ch2°C, *Ch3C → *Ch3°C. By post-LIE we assume a period of a Northern-Southern dialectal division and Southern dialectal split, in which the whole community remains still in contact, allowing the spread of innovations like a generalised vocalisation of the auxiliary vowel (during the first migrations in the Kurgan framework, the assumed end of the LIE community). During that period, the evolution pre-LIE → post-LIE would have been as follows: *Ch1°C → *Ch1əC → *CHəC → *CəC. That evolution reached IEDs differently: whereas in South-West IE (Greek, Armenian, Phrygian, Ancient Macedonian) the pre-LIE laryngeal probably colourised the vocalic output from *Ch1əC as in the general scheme (into e, a, o), in NWIE and PII the late LIE *ə from *CəC was assimilated to another vowel: generally to a in NWIE, and to i in PII. Word-initially, only South-West IE dialects appear to have had an output *H → *Hə → e, a, o, while the other dialects lost them (*H → ∅).
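
The chain of stages given in NOTE 1 for the interconsonantal laryngeal can be written as an ordered list of rewrite steps. The following is a minimal sketch, not a model of any real dialect: the step labels, the `derive` function and the use of plain ASCII `h1`, `h2`, `h3` and `H` are assumptions made for the example, and the later dialectal outcomes (colouring in South-West IE, a in NWIE, i in PII) are deliberately left out.

```python
import re

# Ordered rewrite steps, read off the chain *Ch1°C → *Ch1əC → *CHəC → *CəC.
# The labels are descriptive only and are not terms used in the text.
STEPS = [
    ("auxiliary vowel vocalised", r"°", "ə"),
    ("laryngeals merged", r"h[123]", "H"),
    ("merged laryngeal lost before ə", r"Hə", "ə"),
]


def derive(form):
    """Apply the steps in order and return every intermediate stage."""
    stages = [form]
    for _label, pattern, replacement in STEPS:
        form = re.sub(pattern, replacement, form)
        stages.append(form)
    return stages


print(" → ".join(derive("Ch1°C")))
```

Because the steps are ordered, the same function applied to `Ch2°C` or `Ch3°C` ends in the same final stage, which is what the merger into a single *H implies.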

NOTE 2. The following developments should also be added: 

-  In South-West IE there are no known cases of *Hj- → *Vj-. It has been assumed that this group produced a z in Greek.

-  It seems that some evidence of word-initial laryngeals comes from Indo-Iranian, where some compound words show lengthening of the final vowel before a root presumed to have had an initial laryngeal.

-  The *-ih2 group in auslaut had an alternative form *-j°h2, LIE *-ī/-jə, which could produce IED -ī, -ja (alternating forms are found even within the same dialect).

-  Apparently a reflex of consonantal laryngeals is found between nonhigh vowels as hiatuses (or glottal stops) in the oldest Indo-Iranian languages – Vedic Sanskrit and Old Avestan, as well as in Homeric Greek (Lindeman Introduction to the ‘Laryngeal Theory’, 1987). For a discussion on its remains in Proto-Germanic, see Connolly (“‘Grammatischer Wechsel’ and the laryngeal theory”, IF 85 1980).

-  Contentious is also the so-called Osthoff’s Law (which affected all IE branches but for Tocharian and Indo-Iranian), which possibly shows a general trend of post-LIE date.

-  When *H is in a post-plosive, prevocalic position, the consonantal nature of the laryngeal values is further shown *CHVC → *ChVC; that is more frequent in PII, cf. *pl̥th2ú- → Ved. pr̥thú-; it appears also in the perfect endings, cf. Gk. oistha.

-  The group *CR̥HC is explained differently for the individual dialects without a general paradigm; so e.g. Beekes (1995) or Meier-Brügger (2003) distinguish the different dialectal outputs as: Tocharian (*r̥HC→*r°HC), Germanic (*r̥H→*r̥) and to some extent Balto-Slavic (distinction by accentuation), Italo-Celtic (*r̥H→*r°H), while in Greek the laryngeal determined the vowel: e.g. *r̥h1→*r̥°h1→*r̥eH.

There are multiple examples which do not fit in any dialectal scheme, though; changes of outputs from PIH reconstructed forms with resonants are found even within the same dialects. The  explanation in Adrados–Bernabé–Mendoza (1995-1998) is probably nearer to the actual situation, in going back to the pronunciation of the common (pre-LIE) group: “the different solutions in this case depend solely on two factors: a) if there are one or two auxiliary vowels to facilitate the pronunciation of this group; b) the place where they appear.” So e.g. a group *CR̥HC could be pronounced in LIE with one vowel (*CR°HC or *C°RHC) or with two (*C°R°HC,  *C°RH°C, or *CR°H°C). That solution accounts for all LIE variants found in the different branches, and within them.

-  The laryngeal of *RHC- in anlaut was vocalised in most languages, while the resonant was consonantal (*R̥HC- became *RVC-).

-  In the group *CR̥HV, a vowel generally appears before the resonant and the laryngeal disappears; that vowel is usually coincident with the vocalic output that a resonant alone would usually give in the different dialects, so it can be assumed that generally *CR̥HV → C(V)R̥V, although exceptions can indeed be found. A common example of parallel treatment within the same dialect is Greek pros/paros < *pros/p°ros.

-  Accounting for some irregularities in the outcome of laryngeals (especially with *-h2, but not limited to it) is the so-called “Saussure effect”, whereby LIE dialects do not show the usual reflex of the inherited sequences #HRo- and -oRHC-. According to Nussbaum (Sound law and analogy: papers in honor of Robert S.P. Beekes on the occasion of his 60th birthday, Alexander Lubotsky, 1997), this effect “reflects something that happened, or failed to happen, already in the proto-language”.
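
The one-vowel and two-vowel pronunciations of the group *CR̥HC quoted above from Adrados–Bernabé–Mendoza can be enumerated mechanically. The sketch below is purely combinatorial and rests on an assumption of ours, namely that an auxiliary vowel may stand in any of the three internal slots of the group; the names `SEGMENTS`, `AUX`, `LISTED` and `variants` are hypothetical. It yields one form, CRH°C, that the passage does not list, and nothing is claimed here about whether that form is attested.

```python
from itertools import combinations

SEGMENTS = ["C", "R", "H", "C"]  # the group *CRHC, written without diacritics
AUX = "°"                         # auxiliary vowel, as written in the text

# Forms explicitly listed in the passage.
LISTED = {"CR°HC", "C°RHC", "C°R°HC", "C°RH°C", "CR°H°C"}


def variants(n_vowels):
    """All ways of placing n auxiliary vowels in the three internal slots."""
    forms = []
    for slots in combinations(range(1, len(SEGMENTS)), n_vowels):
        form = ""
        for index, segment in enumerate(SEGMENTS):
            if index in slots:  # slot `index` lies just before this segment
                form += AUX
            form += segment
        forms.append(form)
    return forms


for n in (1, 2):
    for form in variants(n):
        status = "listed" if form in LISTED else "not listed in the passage"
        print(f"{n} vowel(s): *{form} ({status})")
```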

Hence, for the moment, we could assume that South-East and South-West IE dialects were already separated, but still closely related through a common (Northern) IE core, because the loss (or, more exactly, the vocalic evolution) of laryngeals of Northern IE did in fact reach Graeco-Aryan dialects similarly and in a complementary distribution. That is supported by the modern linguistic Northern-Southern separation model (v.i. §§1.3, 1.4, 1.7): “(…) today it is thought that most innovations of Greek took place outside Greece; no doubt, within the Indo-Greek group, but in a moment in which certain eastern isoglosses didn’t reach it.” Adrados–Bernabé–Mendoza (1995-1998).

Apart from those fictions or artifices that help linguists keep on with their work on individual dialects from a secure starting point (conventional PIH phonetics), there is no reason to doubt that the most (scientifically) conservative starting point for PIE evolution is that LIE had lost most laryngeals but for one merged *H – of the “Disintegrating Indo-European” of Bomhard (Toward Proto-Nostratic: A New Approach to the Comparison of Proto-Indo-European and Proto-Afroasiatic, 1984) – into the known timeline and groupings, and that a late post-LIE vocalisation of interconsonantal *H into *Hə and later *ə did eventually replace the original forms, albeit at a different pace, probably arriving somewhat late and incompletely in the earliest dialects to split off, which completed the laryngeal loss independently.

Some individual finds seem to support a different treatment of laryngeals in certain dialects and environments, though.

NOTE. Examples are the contentious Cowgill’s Law (“such shortening is fairly common cross-linguistically, and the IE examples may have each arisen independently”, Fortson 2004), or other peculiar sound changes recently found in Latin and Balto-Slavic, all of them attested in late IE dialects that had already undergone different vocalic evolutions.

Meier-Brügger (2003) mentions 3 non-Anatolian testimonies of laryngeals:

1)    Indo-Iranian: “the Vedic phrase devyètu, i.e. devì etu is best understandable if we suppose that dev ‘goddess’ still contained the laryngeal form *dewíH (with *-iH<*-ih2) at the time of the formulation of the verse in question. In the phase *-íH it was possible for the laryngeal simply to disappear before a vowel”. Another common example is *wr̥kiH. It is not demonstrated, though, that it must represent a sort of unwritten laryngeal, and not an effect of it, i.e. a laryngeal hiatus or glottal stop, from older two-word sandhis that behave as a single compound word, see §2.4.3. It is also interesting that they are in fact from words alternating in pre-LIE *-iH/*-j°H (or post-LIE *-ī/*-jə) which according to Fortson (2004) reflect different syllabification in Indo-Iranian vs. Greek and Tocharian, whilst “[t]he source of the difference is not fully understood”. In line with this problem is that the expected case of *-aH stems is missing, which makes it less likely that the Indo-Iranian examples come from a common hypothetical PII stage in which a word-final *-H had not yet disappeared, and more likely that it was a frozen remnant (probably of a glottal stop) in certain formal expressions. In fact, it has long been recognised that word-final laryngeals show a strong tendency to disappear (so e.g. in Hittite), and most of the time they appear associated with morphological elements (Adrados–Bernabé–Mendoza 1995-1998). They should then be considered – like the hiatuses or glottal stops found in Hom. Gk. and Germanic compositions – probable ancient reminiscences of a frozen formal language.

2)   The sandhi variant in *-aH is found, according to Meier-Brügger (2003) and Ringe (2006), in Greek and Old Church Slavonic. In both “clear traces are missing that would confirm a PIE ablaut with full grade *-eh2- and zero grade *-h2- (…) That is why it appears as if the differentiation between the nominative and vocative singular in this case could be traced to sandhi-influenced double forms that were common at a time when the stems were still composed of *-ah2, and the contraction *-ah2- >*-ā- had not yet occurred”. Szemerényi (1999) among others already rejected it: “The shortening of the original IE ending -ā to -ă is regular, as the voc., if used at the beginning of a sentence or alone, was accented on the first syllable but was otherwise enclitic and unaccented; a derivation from -ah with the assumption of a prevocalic sandhi variant in -a fails therefore to explain the shortening.”

3)   The last example given by Meier-Brügger is found in the unstable *CRHC model (see above), which is explained with PIE *gn̥h1-- ‘created, born’: so in Vedic jātá- < PII ģātó- < *ģaHtó- < *gjn̥h1-, which would mean that the laryngeal merged after the evolution LIE *n̥ → PII a. The other irregular dialectal reconstructions shown are easily explained following the model of epenthetic vowel plus merged laryngeal (or glottal stop?) in *gnəh1-; cf. for the same intermediate grade PGk gnētó- (< post-LIE *gneHtó-), pre-NWIE g(°)naʔ- (< post-LIE *gnəHtó-) into Ita., Cel. *gnātó-, PGmc. *kunʔda-, Bal.-Sla. *ginə-. Such dialectal late loss of the merged laryngeal *H (or glottal stop) is therefore limited to the groups including a sonorant, and the finds support a vocalisation of LIE *n̥, *m̥ → PII a earlier than the loss of the laryngeal (or glottal stop) in that environment. That same glottal stop is possibly behind the other examples in Meier-Brügger: O.Av. va.ata- < PII waʔata-, or Ved. *ca-kar-ʔa (the ʔ still preserved in the period of the activity of Brugmann’s law), or Ved. náus < *naʔus.

In Lubotsky (1997) different outputs are proposed for *CRH groups before certain vowels: “It is clear that the “short” reflexes are due to laryngeal loss in an unaccented position, but the chronology of this loss is not easy to determine. If the laryngeal loss had already occurred in PIIr., we have to assume that PIIr. *CruV subsequently yielded CurvV in Sanskrit. The major problem we face is that the evidence for the phonetically regular outcome of *CriV and *CruV in Indo-Iranian is meager and partly conflicting.” Again, the conflict is solved assuming a late loss of the laryngeal; however, the attestation of remains of glottal stops, coupled with the auxiliary vowel solution of Adrados–Bernabé–Mendoza (1995-1998) solves the irregularities without making new assumptions and dialectal sound laws that in turn need their own further exceptions.

Kortlandt seems to derive the loss of laryngeals from Early Slavic (see below §1.7.1.I.D), a sister language of West and East Baltic languages, according to his view. Also, on Italo-Celtic (2007):  “If my view is correct, the loss of the laryngeals after a vocalic resonant is posterior to the shortening of pretonic long vowels in Italic and Celtic. The specific development of the vocalic liquids, which is posterior to the common shortening of pretonic long vowels, which is in its turn posterior to the development of ē, ā, ō from short vowel plus laryngeal, supports the hypothesis of Italo-Celtic linguistic unity.” Hence the problematic environments with sonorants are explained with a quite late laryngeal loss precisely in those groups.

The most probable assumption then, if some of those peculiar developments are remnants of previous laryngeals, as it seems, is that the final evolution of the merged *H was coincident with LIE disintegration, and might have reached its end in the different early prehistoric communities, while still in contact with each other (in order to allow for the spread of the common trends); the irregular vocalic changes would have then arisen from unstable syllables (mainly those which included a resonant), alternating even within the same branches, and even in the same phonetic environments without laryngeals (v.i. §2.3).

While there are reasons to support a late evolution of the pre-LIE merged laryngeal, there seems to be no strong argument for the survival of LIE merged *H into the later periods of NWIE, PGk or PII dialects, still less into later proto-languages (as Germanic, Slavic, Indo-Aryan, etc.). However, for some linguists, the complete loss of the LIE laryngeal (or even laryngeals) must have happened independently in each dialectal branch attested; so e.g. Meier-Brügger (2003): “As a rule, the laryngeals were disposed of only after the Proto-Indo-European era”; Clackson (2007): “But the current picture of laryngeal reconstruction necessitates repeated loss of laryngeals in each language branch”.

NOTE. The question is then brought by Clackson into the Maltese and Modern Hebrew examples, languages isolated from Semitic into an Indo-European environment for centuries. That is indeed a possible explanation: that all IE branches, after having split up from the LIE common language, would have become independently isolated, and then kept in close contact with (or, following the Maltese example, surrounded by) non-IE languages without laryngeals. Then, every change in all branches could be explained by way of diachronic and irregular developments of vowel quality. In Clackson’s words: “(…) the comparative method does not rely on absolute regularity, and the PIE laryngeals may provide an example of where reconstruction is possible without the assumption of rigid sound-laws.”

Even accepting that typologically both models of (a common, post-LIE vs. an independent, dialectal) laryngeal loss were equally likely, given that all languages had lost the merged laryngeal before being attested, all with similar outputs, and that even the final evolution  (laryngeal hiatuses or glottal stops) must have been shared in an early period – since they are found only in frozen remains in old and distant dialects –, an early IED loss of laryngeals fits into a coherent timeline within the known dialectal evolution. With that a priori assumption, we limit the need for unending ad hoc ‘sound-laws’ for each dialectal difference involving a sonorant, which would in turn need their own exceptions. Therefore, we dispense with unnecessary hypotheses, offering the most conservative approach to the problem.

Conventions Used in This Book

1. We try to keep a consistent nomenclature throughout the book when referring to the different reconstructible stages of Proto-Indo-European (PIE), from Pre-PIH, a highly hypothetical stage that is reachable only through internal reconstruction, to the most conservative reconstruction of early LIE dialects (IEDs). We do so by using the following schema of frequent terms and dates:

[Figure: schema of frequent terms and dates for the reconstructible stages of Proto-Indo-European]

NOTE. This is just a simplified summary to understand the following sections. The full actual nomenclature and archaeological dates are discussed in detail in §§1.3, 1.4, and 1.7.

The dates include an archaeological terminus post quem, and a linguistic terminus ante quem. In such a huge time span we could differentiate between language periods. However, these (linguistic and archaeological) limits are usually difficult to define, and their differentiation hardly necessary in this grammar. Similarly, the terms Hittite, Sanskrit, Ancient Greek, Latin, etc. (as well as modern languages) might refer in the broadest sense to a time span of over 1,000 years in each case, and they are still considered a single language; a selection is made of the prestigious dialect and age for each one, though, as it is done in this grammar, where the prestigious language is Late Indo-European, while phonetics remains nearer to the middle-late period of IEDs, whose post-laryngeal output is more certain.

2. The above graphic is intended to show stemmatic, as well as synchronic levels. The reconstruction of North-West Indo-European is based on secondary materials: it is a level 3 proto-language, reconstructed on the basis of level 5 proto-languages (of ca. 1000 BC), i.e. primary Proto-Celtic, Proto-Italic, secondary Proto-Balto-Slavic (through Proto-Slavic and Proto-Baltic) and secondary Pre-Proto-Germanic (through internal reconstruction), see §1.7.1.

NOTE. Coeval level 3 dialects Proto-Greek (from level 5 Mycenaean and level 6 Ancient Greek primary materials) and Proto-Indo-Iranian (from level 5 Old Indian and level 6 Iranian materials) could be considered reconstructions based on primary as well as secondary materials. All of them, as well as data from other dialects (Tocharian A and B, Armenian, Albanian), make up the secondary and tertiary materials used to reconstruct a level 2 Late Indo-European. Proto-Anatolian is a level 2 internal reconstruction from level 3 Common Anatolian, in turn from level 4 and level 5 primary materials on Anatolian dialects. Both Late Indo-European and Proto-Anatolian help reconstruct a parent language, Indo-Hittite, which is then a level 1 language.

Each reconstructed parent level is, indeed, more uncertain and inconsistent than the previous one, because the older a material is (even primary texts directly attested), the more uncertain the reconstructed language. And more so because all parent reconstructions are in turn helpful to refine and improve the reconstruction of daughter and sister proto-languages. With that scheme in mind, it is logical to consider more consistent and certain the reconstruction of IEDs, these in turn more than LIE, and this more than PIH.

3. Palatovelars are neither reconstructed for Late Indo-European, nor (consequently) for Indo-Hittite. While this is not yet a settled question (v.s. Considerations of Method), we assume that the satem trend began as an areal dialectal development in South-East Indo-European, and spread later (and incompletely) through contact zones – e.g. into Pre-Balto-Slavic.

NOTE. Because West and Central European (Italo-Celtic and Germanic) and Proto-Greek were not affected by that early satemisation trend –although Latin, Greek and Celtic actually show late independent ‘satemisations’ –, the reconstruction of centum NWIE and PGk, and satem PII (the aim of this book) should be an agreed solution, no matter what the different personal or scholarly positions on LIE and PIH might be.

4. We assume an almost fully vocalic – i.e. post-laryngeal – nature of IEDs since the end of the LIE community (assumed to have happened before ca. 2500 BC, according to archaeological dates), although not a settled question either (v.s. Considerations of Method). Whether LIE lost the merged laryngeal *H sooner or later, etymological roots which include laryngeals will be labelled PIH and follow today’s general three-laryngeal convention, while some common LIE vocabulary will be shown either with pre-LIE merged *H or post-LIE vocalic output *ə (which was assimilated to NWIE a, PII i), or with the reconstructed post-LIE glottal stop *ʔ.

NOTE. In this grammar we will show the reconstructed phonetics of a post-LIE period, focussing on NWIE vocalism, while keeping a vocabulary section with a Late Indo-European reconstruction, respecting NWIE/PII dialectal differences; not included are the different vocalic outputs of South-West IE, from word-initial and interconsonantal laryngeals.

Writing System

This table contains common Proto-Indo-European phonemes and their proposed regular corresponding letters in alphabets and Brahmic alphasyllabaries.

Consonants and Consonantal Sounds

Greek | Latin | Perso-Arabic | Armenian | Cyrillic
Π π | P p | ـپ ــ | Պ պ | П п
Β β | B b | ـب ـبـ‎‎ بـ‎‎ | Բ բ | Б б
Βη βη | Bh bh | ‎‎ـبھ ـبھـ‎‎ بھـ‎‎ | Բհ բհ | Бх бх
Τ τ | T t | ـت ـتـ تـ | Տ տ | Т т
Θ θ | Th th | ـتھ ـتھـ تھـ | Թ թ | Тх тх
Δ δ | D d | ـد ـد | Դ դ | Д д
Δη δη | Dh dh | ـدھ ـدھـ دھـ | Դհ դհ | Дх дх
Κ κ | K k | ـک ـكـ كـ | Կ կ | К к
Χ χ | Kh kh | ـكھ ـكھـ كھـ | Ք ք | Кх кх
Γ γ | G g | ـگ ـگـ گـ | Գ գ | Г г
Γη γη | Gh gh | ـگھ ـگھـ گھـ | Գհ գհ | Гх гх
Ϙ ϙ | Q q | ـق ق ـقـ | Խ խ | Къ къ
Ϟ ϟ | C c | ـغ ـغـ غـ | Ղ ղ | Гъ гъ
Ϟη ϟη | Ch ch | ـغھ ـغھـ غھـ | Ղհ ղհ | Гъх гъх
Η η | H h | ـھ ـھـ ھـ | Հ հ | Х х
J ϳ | J j | ـۑـ ـۑ‎‎ ۑـ | Յ յ | Й й
Ϝ ϝ | W w | ـۋ ـۋ ۋ | Ւ ւ | В в
Ρ ρ | R r |  | Ռ ռ | Р р
Λ λ | L l | ـل ـلـ لـ | Լ լ | Л л
Μ μ | M m | ـم ــمــ مـ | Մ մ | М м
Ν ν | N n | ـن ـنـ ـنـ | Ն ն | Н н
Σ σ ς | S s | ـس ـسـ سـ | Ս ս | С с

Sounds found in Proto-Greek only

Greek | Latin | Perso-Arabic | Armenian | Cyrillic
Φ φ | Ph ph | ـپھ ھــ ھ | Փ փ | Пх пх
Ϙη ϙη | Qh qh | ـقھ قھ ـقھـ | Խհ խհ | Чх чх
Τσ τσ | Ts ts | ـتسـ ـتس تسـ | Ծ ծ | Ц ц
Δζ δζ | Dz dz | ـدز ـﺩز دز | Ձ ձ | Дз дз

Sounds found in Proto-Indo-Iranian only

Greek | Latin | Perso-Arabic | Armenian | Cyrillic
Τϻ τϻ | Ķ ķ | ـتژ ـتژ تژ | Չ չ | Ч ч
Δϻ δϻ | Ģ ģ | ـدﮋ ـدﮋ ﺩﮋ | Ց ց | Дщ дщ
Δϻη δϻη | Ģh ģh | دﮋھـ ـدﮋھـ ﺩﮋھـ | Ցհ ցհ | Дщх дщх
Τþ τþ | Ḳ ḳ | ـ ــ ـ | Ճ ճ | Тш тш
Δþ δþ | Ġ ġ | ـ ــ ـ | Ջ ջ | Дж дж
Δþη δþη | Ġh ġh | ـھ ـھـ ھـ | Ջհ ջհ | Джх джх
Ϸ þ | Š š | ـﺶ ـﺶ | Շ շ | Ш ш

Vowels and Vocalic Allophones

Greek | Latin | Perso-Arabic | Armenian | Cyrillic
Α α | A a | ـا ـا ا | Ա ա | А а
Ε ε | E e | ـێـ ـێ ێـ | Է է | E e
Ο ο | O o | ـۆ ـۆ ۆ | Օ օ | О о
 | Ā ā | ـأ ـأ أ | Ա՟ ա՟ | Ā ā
Ē ε̄ | Ē ē | ـێٔـ ـێٔ ێٔـ | Է՟ է՟ | Ē ē
Ō ō | Ō ō | ـۆٔ ۆٔ ـۆٔ | Օ՟ օ՟ | Ō ō
Ι ι | I i | ی ـيـ يـ | Ի ի | И и
 | Ī ī | یٔ ـئـ ئـ | Ի՟ ի՟ | Ӣ ӣ
Υ υ | U u | ـو ـو و | Ո ո | У у
 | Ū ū | ـؤ ـؤ ؤ | Ո՟ ո՟ | Ӯ ӯ
Ρ̣ ρ̣ | Ṛ ṛ | ـرٜ ـرٜ رٜ | Ռՙ ռՙ | Р̣ р̣
Λ̣ λ̣ | Ḷ ḷ | ـلٜ ـلٜـ لٜـ | Լՙ լՙ | Л̣ л̣
Μ̣ μ̣ | Ṃ ṃ | ـمٜ ــمٜــ مٜـ | Մՙ մՙ | М̣ м̣
Ν̣ ν̣ | Ṇ ṇ | ـنٜ ـنٜـ ـنٜـ | Նՙ նՙ | Н̣ н̣

This proposal is purely conventional, and it takes into account values such as availability, simplicity (one letter for each sound), transliteration and tradition.

NOTE. We have followed this order of objectives in non-Brahmic scripts:

·       Availability: especially of letters in common Latin and Cyrillic keyboards and typography, since they account for most of the current Northern IE world.

·       Simplicity: each sound is represented with one letter (or letter plus diacritics). Digraphs are used only when necessary: aspirated consonants are represented with the consonant plus the letter for [h], unless there is an independent character for that aspirated consonant.

·       Equivalence of letters: a character in one alphabet should be transliterated and read directly in any other to allow an automatic change from the main alphabets into the others without human intervention. The lack of adequate characters to represent PIE phonetics (resonants, semivowels, long vowels) in alphabets conditions the final result.

·       Tradition: the historic or modern sound of the letters is to be retained when possible.
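The equivalence objective can be illustrated with a short Python sketch. It is only an illustration and not part of the proposal: the name to_cyrillic is hypothetical, the mapping is restricted to lower-case letters without diacritics, the values are read off the table above, and Cyrillic е is assumed for [e]. Because aspirated consonants are written as consonant plus the letter for [h], a plain letter-by-letter substitution is enough and no digraph handling is needed.

```python
# Illustrative sketch only: Latin -> Cyrillic values taken from the table above.
# Lower-case letters without diacritics; the function name is hypothetical.
LATIN_TO_CYRILLIC = {
    "p": "п", "b": "б", "t": "т", "d": "д", "k": "к", "g": "г",
    "q": "къ", "c": "гъ", "h": "х", "j": "й", "w": "в",
    "r": "р", "l": "л", "m": "м", "n": "н", "s": "с",
    "a": "а", "e": "е", "o": "о", "i": "и", "u": "у",  # Cyrillic е is an assumption
}


def to_cyrillic(word: str) -> str:
    """Replace each Latin letter by its Cyrillic value; keep anything else."""
    return "".join(LATIN_TO_CYRILLIC.get(ch, ch) for ch in word)


if __name__ == "__main__":
    for w in ("pater", "bhe", "qa", "che"):
        print(w, "->", to_cyrillic(w))
```

A reverse mapping would need to read the two-letter values къ and гъ before single letters, so it is not simply the inverted dictionary.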

Writing systems of the Indo-European World (2011, modified from Mirzali Zazaoğlu 2008).

The names of the consonants in Indo-European following the Latin pattern would be – B, be (pronounced bay); Bh, bhe (bhay); C, ce (gway); Ch, che (gwhay); D, de (day); Dh, dhe (dhay); G, ge (gay); Gh, ghe (ghay); H, ha; K, ka; L, el; M, em; N, en; P, pe; Q, qa (kwa); R, er; S, es; T, te; W, wa.

In Aryan, the letters are named with their sound followed by a, as in Sanskrit – ba, bha, ca, cha, da, dha, ga, gha, and so on.


An acute accent (´) is written over the vowel in the accented syllable, except when accent is on the second to last syllable (or paenultima) and in monosyllabic words.

NOTE. Since all non-clitic words of more than one syllable would be marked with one accent, as we have seen, a more elegant convention is not to write every accent. The second to last syllable seems to be the most frequently accented one, so we can spare unnecessary diacritics if the accent is understood to fall in that position unless it is marked on another syllable.
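The accent convention can be stated as a small procedure. The following Python sketch is an illustration under stated assumptions: the word is supplied already divided into syllables (no syllabification is attempted), the function name write_accent and the inventory of accent-bearing letters are hypothetical, and the acute is added as a Unicode combining mark and then composed where a precomposed character exists.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent
# Assumed inventory of letters that can carry the accent (vowels and
# vocalic resonants), normalised so that each is a single character.
NUCLEI = set(unicodedata.normalize("NFC", "aeiouāēīōūṛḷṃṇ"))


def write_accent(syllables: list, accented: int) -> str:
    """Join syllables, writing the acute only where the convention requires it."""
    penult = len(syllables) - 2
    if len(syllables) == 1 or accented == penult:
        return "".join(syllables)  # accent is understood, nothing is written
    out = []
    for i, syl in enumerate(syllables):
        syl = unicodedata.normalize("NFC", syl)
        if i == accented:
            for pos, ch in enumerate(syl):
                if ch in NUCLEI:
                    # put the acute right after the first accent-bearing letter
                    syl = syl[: pos + 1] + ACUTE + syl[pos + 1:]
                    break
        out.append(syl)
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    print(write_accent(["pa", "tēr"], 1))  # accent on the last syllable: written
    print(write_accent(["pa", "ter"], 0))  # accent on the penult: left unwritten
```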

Long vowels are marked with a macron ( ¯ ), and vocalic allophones of resonants are marked with a dot below it ( ̣). Accented long vowels and resonants are represented with special characters that include their diacritics plus an acute accent.

NOTE. It is recommended to write all diacritics if possible, although it is not necessary. The possibility of omitting the diacritical marks arises from the lack of appropriate fonts in traditional typography, or the difficulty of typing those marks on common international keyboards. Therefore, alternative writings include pater/patḗr, m. father, nmrtos/ṇmṛtós, m. immortal, kmtom/kṃtóm, hundred, etc. Such a defective representation of accents and long vowels is common even today in Latin and Greek texts, as well as in most modern languages, which lack a proper representation for sounds. That does not usually hinder an advanced reader from reading a text properly.
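The defective writing described in this note can be derived mechanically from the full one. A minimal Python sketch, assuming Latin-script input and a hypothetical function name strip_diacritics: the text is decomposed with the standard unicodedata module and every combining mark (macron, acute, dot below) is dropped.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks (macron, acute, dot below) after decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for word in ("patḗr", "kṃtóm"):
        print(word, "->", strip_diacritics(word))
```

The opposite direction cannot be automated in the same way, since the omitted marks carry information that the plain letters do not.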

1. The Modern Greek alphabet lacks letters to represent PIE phonetics properly. Therefore, the Ancient Greek letters and values assigned to them are used instead.

NOTE. The consonant cluster [kh] was in Ancient Greece written as X (Chi) in eastern Greek, and Ξ (Xi) in western Greek dialects. In the end, X was standardised as [kh] ([x] in modern Greek), while Ξ represented [ks]. In the Greek alphabet used for IE, X represents [kh], while Ξ represents [kwh], necessary for the representation of a Proto-Greek voiceless aspirate. As in Ancient Greek, Φ stands for [ph], and Θ for [th].

The Greek alphabet lacks a proper representation for long vowels, so they are all marked (as in the other alphabets) with diacritics. Η is used to represent the sound [h], as it was originally used in most Ancient Greek dialects; it is also used to mark (voiced) aspirated phonemes. Ē represents [eː] and Ō stands for [oː] in the Greek alphabet for IE. For more on the problem of historical Eta and its representation in the Modern Greek alphabet, see <http://www.tlg.uci.edu/~opoudjis/unicode/unicode_aitch.html>.

While not a practical solution (in relation to the available Modern Greek keyboards), we keep a traditional Ancient Greek script, assuming that it will enjoy the transliteration of texts mainly written in Latin or Cyrillic letters; so e.g. Archaic koppa Ϙ stood for [k] before back vowels (e.g. Ϙόρινθος, Korinthos), hence its IE value [kw]. Archaic digamma Ϝ represented [w], a sound lost already in Classical Greek. Additions to the IE alphabet are new letter koppa Ϟ for [gw], based on the alternative Unicode shapes of the archaic koppa, and the ‘more traditional’ inverted iota for [j], preferred over Latin yot – although the lack of capital letter for inverted iota makes the use of (at least) a capital J necessary to distinguish [j] from [i]. See <http://www.tlg.uci.edu/~opoudjis/unicode/yot.html>.

2. The Latin alphabet used to write Indo-European is similar to the English, which is in turn borrowed from the Late Latin abecedarium. Because of the role of this alphabet as model for other ones, simplicity and availability of the characters is preferred over tradition and exactitude.

NOTE. The Latin alphabet was borrowed in very early times from the Greek alphabet and did not at first contain the letter G. The letters Y and Z were introduced still later, about 50 BC. The Latin character C originally meant [g], a value always retained in the abbreviations C. (for Gaius) and Cn. (for Gnaeus). That was probably due to Etruscan influence, which copied it from Greek Γ, Gamma, just as later Cyrillic Г, Ge. In early Latin script C came also to be used for [k], and K disappeared except before a in a few words, as Kal. (Kalendae), Karthago. Thus there was no distinction in writing between the sounds [g] and [k]. This defect was later remedied by forming (from C, the original [g]-letter) a new character G. In Modern Indo-European, unambiguous K stands for [k], and G for [g], so C is left without value, being used (taking its oldest value [g]) to represent the labiovelar [gw].

V originally denoted the vowel sound [u] (Eng. oo), and F stood for the sound of consonant [w] (from Gk. ϝ, called digamma). When F acquired the value of our [f], V came to be used for consonant [w] as well as for the vowel [u]. The Latin [w] semivowel developed into Romance [v]; therefore V no longer adequately represented [u] or [w], and the Latin alphabet had to develop alternative letters. The Germanic [w] phoneme was therefore written as VV (a doubled V or U) by the seventh or eighth century by the earliest writers of Old English and Old High German. During the late Middle Ages, two forms of V developed, which were both used for its ancestor U and modern V. The pointed form V was written at the beginning of a word, while a rounded form U was used in the middle or end, regardless of sound. The more recent letters U and Germanic W probably represent the consonantal sounds [u] and [w] respectively more unambiguously than Latin V.

The letter I stood for the vowel [i], and was also used in Latin (as in Modern Greek) for its consonant sound [j]. J was originally developed as a swash character to end some Roman numerals in place of I; both I and J represented [i], [iː], and [j]. In IE, J represents the semivowel [j], an old Latin value current in most Germanic and Slavic languages. Y is used to represent the vowel [y] in foreign words. That [j] value is retained in English J only in foreign words, as Hallelujah or Jehovah. Because Romance languages developed new sounds (from former [j] and [ɡ]) that came to be represented as I and J, English J (from French J), as well as Spanish, Portuguese or Italian J have sound values quite different from [j]. The romanisation of the sound [j] from different writing systems (like Devanagari) as Y –  which originally represented in Latin script the Greek vowel [y] – is due to its modern value in English and French, and has spread a common representation of [j] as Y in Indo-European studies, while J is used to represent other sounds.

A different use of the Latin alphabet to represent PIE, following the Classical Latin tradition, is available at <http://verger1.narod.ru/lang1.htm>.

3. The Perso-Arabic script has been adapted to the needs of a fully differentiated PIE alphabet, following Persian, Urdu and Kurdish examples.

NOTE. The Perso-Arabic script is a writing system that is originally based on the Arabic alphabet. Originally used exclusively for the Arabic language, the Arabic script was modified to match the Persian language, adding four letters: پ [p], چ [tʃ], ژ [ʒ], and گ [ɡ]. Many languages which use the Perso-Arabic script add other letters. Besides the Persian alphabet itself, the Perso-Arabic script has been applied to the Urdu or Kurdish Soraní alphabet.

Unlike the standard Arabic alphabet, which is an abjad (each symbol represents a consonant, the vowels being more or less defective), the IE perso-arabic script is a true alphabet, in which vowels are mandatory, making the script easy to read.

Among the most difficult decisions is the use of letters to represent vowels – as in modern alphabets like Kurdish or Berber – instead of diacritics – as in the traditional Arabic or Urdu scripts. Following tradition, hamza (originally a glottal stop) should probably be placed on the short vowels and resonants, instead of the long ones (especially above ‘alif), but automatic equivalence with the other alphabets makes the opposite selection more practical.

Because waw و  and yodh ي could represent short and long vowels u and i, and consonantal w and j, a conventional selection of current variants has been made: Arabic letter Ve, sometimes used to represent the sound [v] when transliterating foreign words in Arabic, and also used in writing languages with that sound (like Kurdish), is an obvious selection for consonantal [w] because of its availability. The three-dotted yodh then becomes a consequent selection for consonantal yodh. Hamza then distinguishes the long vowels from the short ones, which are represented with the original symbols.

4. Armenian characters, similarly to Greek, need to be adapted to a language with a different series of short and long vowels and aspirated phonemes.

NOTE. Because of that, a tentative selection is made, which need not be final – as with any other script. Because Armenian lacks a proper character for [u], and because it has no distinct characters to represent long vowels other than [eː] or [oː], the more practical choice is to imitate the other alphabets to allow for equivalence. The characters that represent short vowels also represent different sounds; as, Ե for [ɛ] and word initially [jɛ], and Ո for [o] and word initially [vo], so a less ambiguous choice would be Է for [e] and Օ for [o]. Hence the letter Ո historically used to write [o] and [u] (in digraphs) stands for [u].

The conventional selection of one-character representation of aspirated voiceless consonants follows Armenian tradition and equivalence with Greek, a closely related language, as we have already seen; i.e. Proto-Greek is probably the nearest branch to the one Pre-Armenian actually belonged to, and it is therefore practical to retain equivalence between both scripts.

Armenian diacritics (like the abbreviation mark proposed for long vowels) are defined as ‘modifier letters’, not as ‘combining diacritical marks’ in Unicode, so they do not combine as true superscript. Some fonts do combine them, as Everson Mono Ա՟ ա՟ Է՟ է՟ Օ՟ օ՟ Ի՟ ի՟ Ո՟ ո՟.
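Whether a given character is a true combining mark can be checked against the Unicode character database. The following Python sketch is only an illustration: the helper name is_combining_mark is hypothetical, and the code points U+055F (Armenian abbreviation mark) and U+0559 (Armenian modifier letter left half ring) are assumed to be the marks used in the table above, compared here with the combining macron U+0304.

```python
import unicodedata


def is_combining_mark(ch: str) -> bool:
    """True if Unicode classifies ch as a combining mark (Mn, Mc or Me)."""
    return unicodedata.category(ch).startswith("M")


if __name__ == "__main__":
    for ch in ("\u0304", "\u055F", "\u0559"):
        # print code point, official character name, and the check result
        print(f"U+{ord(ch):04X}", unicodedata.name(ch), is_combining_mark(ch))
```

Characters for which the check fails are spacing characters, so whether they appear stacked over the preceding letter depends on the font.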

6. The Cyrillic script is used following its modern trends, taking into account that Russian is the model for most modern keyboards and available typography.

NOTE. Non-Russian characters have been avoided, and we have followed the principle of one letter for each sound: While Й is commonly used to represent [j], Cyrillic scripts usually lack a character to represent consonantal [w], given that usually [v] (written В) replaces it. While У is generally used in Cyrillic for foreign words, a ‘one character, one sound’ policy requires the use of a character complementary to Й, which is logically found in В – a sound lacking in Indo-European.

In Slavistic transcription jer Ъ and front jer Ь were used to denote Proto-Slavic extra-short sounds [ŭ] and [ĭ] respectively (e.g. slověnьskъ adj. ‘slavonic’). Today they are used with other values in the different languages that still use them, but the need for traditional ‘labial’ [w] and ‘palatal’ [j] signs available in most Cyrillic keyboards made them the most logical selection to mark a change of value in the characters representing stops.

7. The Brahmic or Indic scripts are a family of abugida (alphabetic-syllabary) writing systems, historically used within their communities – from Pakistan to Indochina – to represent Sanskrit, whose phonology is similar to the parent PIE language. Devanāgarī has come to be the most commonly used Brahmic script to represent Sanskrit, hence our proposal of its character values for the rest of them.

NOTE. The characters and accents are generally used following their traditional phonetic value. Exceptions are due to the lack of vocalic characters to properly represent [m̥] and [n̥]. Hence anusvara अं, which represents [ṃ], is used to represent [m̥]. Also, visarga अः, which stands for [ḥ] (allophonic with word-final r and s), is proposed for [n̥].

Automatic transliteration between many Brahmic scripts is usually possible, and highly available within scripts used in India.

NOTE. That happens e.g. with the InScript keyboard: because all Brahmic scripts share the same order, any person who knows InScript typing in one script can type in any other Indic script using dictation even without knowledge of that script.
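The shared order of the Brahmic scripts is also reflected in Unicode, where the blocks of the major Indic scripts are for the most part laid out in parallel, so that a letter sits at the same offset in each block. A naive Python sketch of transliteration by offset follows; the function name shift_script is hypothetical, only three block starts are listed, and conjuncts and script-specific letters are not handled.

```python
import unicodedata

BLOCK_SIZE = 0x80
DEVANAGARI, BENGALI, GUJARATI = 0x0900, 0x0980, 0x0A80  # Unicode block starts


def shift_script(text: str, src: int, dst: int) -> str:
    """Move characters of the src block to the same offset in the dst block."""
    out = []
    for ch in text:
        code = ord(ch)
        if src <= code < src + BLOCK_SIZE:
            candidate = chr(code - src + dst)
            # Keep the original if the target position is unassigned ("Cn").
            if unicodedata.category(candidate) != "Cn":
                ch = candidate
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    word = "\u0915\u093e\u092e"  # Devanagari ka + vowel sign aa + ma
    print(shift_script(word, DEVANAGARI, BENGALI))
    print(shift_script(word, DEVANAGARI, GUJARATI))
```

Letters that have no counterpart in the target script are left unchanged, which is one reason why this can only be a first approximation.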

However, due to the lack of characters in western alphabets to represent resonants and long vowels, diacritics are used. These diacritics are not commonly available (but for the Arabic hamza), and therefore if they are not written, transliteration into Brahmic scripts becomes defective. That problem does not exist in the other direction i.e. from Brahmic scripts into the other alphabets.

Modern Indo-European

1. Modern Indo-European (MIE) is therefore a set of conventions or ‘rules’ applied to systematise the reconstructed North-West Indo-European dialect of Late Indo-European – see below §§ 1.3, 1.7.1. Such conventions refer to its writing system, morphology and syntax, and are conceived to facilitate the transition of the reconstructed language into a learned and living one.

2. Because proto-languages were spoken by prehistoric societies, no genuine sample texts are available, and thus comparative linguistics is not in a position to reconstruct exactly how the language was, but only more or less certain approximations, whose statistical confidence decreases as we get further back in time. The hypothesised language will then always be somewhat controversial.

NOTE 1. Mallory–Adams (2007): “How real are our reconstructions? This question has divided linguists on philosophical grounds. There are those who argue that we are not really engaged in ‘reconstructing’ a past language but rather creating abstract formulas that describe the systematic relationship between sounds in the daughter languages. Others argue that our reconstructions are vague approximations of the proto-language; they can never be exact because the proto-language itself should have had different dialects (yet we reconstruct only single proto-forms) and our reconstructions are not set to any specific time. Finally, there are those who have expressed some statistical confidence in the method of reconstruction. Robert Hall, for example, claimed that when examining a test control case, reconstructing proto-Romance from the Romance languages (and obviously knowing beforehand what its ancestor, Latin, looked like), he could reconstruct the phonology at 95% confidence, and the grammar at 80%. Obviously, with the much greater time depth of Proto-Indo-European, we might well wonder how much our confidence is likely to decrease.  Most historical linguists today would probably argue that reconstruction results in approximations. A time traveller, armed with this book and seeking to make him- or herself understood would probably engender frequent moments of puzzlement, not a little laughter, but occasional instances of lucidity.”

On the same question, Fortson (2004): “How complete is our picture of PIE? We know there are gaps in our knowledge that come not only from the inevitable loss and replacement of a percentage of words and grammatical forms over time, but also from the nature of our preserved texts. Both the representative genres and external features such as writing systems impose limits on what we can ascertain about the linguistic systems of both PIE and the ancient IE languages (…)

In spite of all the scholarly disagreements that enliven the pages of technical books and journals, all specialists would concur that enormous progress has been made since the earliest pioneering work in this field, with consensus having been reached on many substantial issues. The Proto-Indo-Europeans lived before the dawn of recorded human history, and it is a testament to the power of the comparative method that we know as much about them as we do.”

NOTE 2. The Hebrew language revival is comparable to our proposal of speaking Indo-European as a living language. We have already said that ‘living’ and ‘dead’, ‘natural’ and ‘learned’, are not easily applicable to ancient or classical languages. It is important to note that, even though there is a general belief that Modern Hebrew and Ancient Hebrew are the same language, among Israeli scholars there have been calls for the “Modern Hebrew” language to be called “Israeli Hebrew” or just “Israeli”, due to the strong divergences that exist – and further develop with its use – between the modern language spoken in Israel and its theoretical basis, the Ancient Hebrew from the Tanakh. The old language system, with its temporal and dialectal variations spanning previous centuries of oral tradition, was compiled probably between 450-200 BC, i.e. when the language was already being substituted by Aramaic. On that interesting question, Prof. Ghil’ad Zuckermann considers that “Israelis are brainwashed to believe they speak the same language as the prophet Isaiah, a purely Semitic language, but this is false. It’s time we acknowledge that Israeli is very different from the Hebrew of the past”. He points to the abiding influence of modern Indo-European dialects – especially Yiddish, Russian and Polish –, in vocabulary, syntax and phonetics, as imported by Israel’s founders.

3. Features of Late Indo-European that are common to IEDs (North-West Indo-European, Proto-Greek and Proto-Indo-Iranian), like most of the nominal and verbal inflection, morphology, and syntax, make it possible for LIE to be proposed as a Dachsprache for the living languages.

NOTE 1. Because North-West Indo-European had other sister dialects that were spoken by coeval prehistoric communities, languages like Modern Hellenic (a revived Proto-Greek) and Modern Aryan (a revived Proto-Indo-Iranian) can also be used in the regions where their surviving dialects are currently spoken. These proto-languages are no more different from North-West Indo-European than English is from Dutch, Czech from Slovenian, or Spanish from Italian today. They might also serve as linguae francae for closely related languages or neighbouring regions; it would be especially interesting to have a uniting Aryan language for today’s religiously divided South and West Asia.

NOTE 2. The terms Ausbausprache-Abstandsprache-Dachsprache were coined by Heinz Kloss (1967), and they are designed to capture the idea that there are two separate and largely independent sets of criteria and arguments for calling a variety an independent “language” rather than a “dialect”: the one based on its social functions, and the other based on its objective structural properties. A variety is called an ausbau language if it is used autonomously with respect to other related languages.

Dachsprache means a language form that serves as standard language for different dialects, even though these dialects may be so different that mutual intelligibility is not possible on the basilectal level between all dialects, particularly those separated by significant geographical distance. So e.g. the Rumantsch Grischun developed as such a Dachsprache for a number of quite different Romansh language forms spoken in parts of Switzerland; or the Euskara Batua, “Standard Basque”, and the Southern Quechua literary standard, both developed as standard languages for dialect continua that had historically been thought of as discrete languages with many dialects and no “official” dialect. Standard German and standard Italian to some extent function (or functioned) in the same way. Perhaps the most widely used Dachsprache is Modern Standard Arabic, which links together the speakers of many different, often mutually unintelligible Arabic dialects.

The standard Indo-European looked for in this grammar takes Late Indo-European reconstruction as the wide Dachsprache necessary to encompass (i.e. to serve as linguistic umbrella for) the modern usage of IEDs, whose – phonetic, morphological, syntactical – peculiarities are also respected.  

4. Modern Indo-European words needed to complete the lexicon of North-West Indo-European, where no common vocabulary is found in Late Indo-European, are to be loan-translated from present-day Northwestern IE languages. Common loan words from sister dialects can also be loan-translated or borrowed directly.

NOTE. Even though the vocabulary reconstructible for IEDs is indeed wider than the common Proto-Indo-European lexicon, a remark of Mallory–Adams (2007) regarding reconstructible PIE words is interesting, in that it shows another difficulty of trying to speak a common LIE or PIH:

“To what extent does the reconstructed vocabulary mirror the scope of the original PIE language? The first thing we should dismiss is the notion that the language (any language) spoken in later prehistory was somehow primitive and restricted with respect to vocabulary. Counting how many words a language has is not an easy task because linguists (and dictionaries) are inconsistent in their definition or arrangement of data. If one were simply to count the headwords of those dictionaries that have been produced to deal with nonliterate languages in Oceania, for example, the order of magnitude is somewhere on the order of 15,000–20,000 ‘words’. The actual lexical units are greater because a single form might have a variety of different meanings, each of which a speaker must come to learn, e.g. the English verb take can mean ‘to seize’, ‘to capture’, ‘to kill’, ‘to win in a game’, ‘to draw a breath’, ‘imbibe a drink’, ‘to accept’, ‘to accommodate’ to name just a few of the standard dictionary meanings. Hence, we might expect that a language spoken c. 4000 BC would behave very much like one spoken today and have a vocabulary on the order of 30,000–50,000 lexical units. If we apply fairly strict procedures to distinguishing PIE lexical items to the roots and words listed in Mallory and Adams’s Encyclopedia or Calvert Watkins’s The American Heritage Dictionary of Indo-European Roots (1985) we have less than 1,500 items. The range of meanings associated with a single lexeme is simply unknown although we occasionally get a hint, e.g. *bher- indicates both ‘carry (a load)’ and ‘bear (a child)’. So the PIE vocabulary that we reconstruct may well provide the basis for a much larger lexicon given the variety of derivational features in PIE.”

Examples of loan translations from modern NWIE languages are e.g. from Latin aquaeduct (Lat. aquaeductus → MIE aqāsduktos) or universe (Lat. uniuersus<*oin(i)-uors-o-<*oino-wr̥t-to- → MIE oinowr̥stós ‘turned into one’); from English, like software (from Gmc. samþu-, warō → MIE somtúworā); from French, like ambassador (from Cel. amb(i)actos → MIE ambhíagtos ‘public servant’); or chamber (from O.Lat. camera, from PGk. kamárā, ‘vault’ → MIE kamarā); from Russian, like bolshevik (MIE belijówikos); etc.

Loan words from sister IE dialects can be either loan-translated or directly taken as loan-words; as e.g. ‘photo’, which should be taken directly as loan-word o-stem pháwotos, from Gk phawots, gen. phawotós, as Gk. φῶς (<φάϝος), φωτός, in compound phawotogphjā, photography, derived from IE root bhā-, shine, which could be loan-translated as MIE ˟bháwots, from ˟bhawotogbhjā, but without having a meaning for extended bha-wes-, still less for bha-wot-, in North-West Indo-European or even Proto-Indo-European, as it is only found in Ancient Greek dialects. Or MIE skhol, from Lat. schola, taken from Gk. σχολή (<PGk. skhol) ‘spare time, leisure, tranquility’, borrowed from Greek with the meaning ‘school’, which was in O.Gk. σχολεῖον (scholeíon), translated as PGk. skholehjom <*-esjo-m, from IE root segh-, which could also be loan-translated as MIE ˟sghol or even more purely (and artificially) ˟sgholesjom, none of them being Proto-Indo-European or common Indo-European terms. Examples from Indo-Iranian include wasāáranas, bazaar, from O.Ira. vahacarana ‘sale-traffic, bazaar’, which could also be translated as proper MIE ˟wesāqólenos, from PIE roots wes- and qel-; or atúrangam, chess, from Skt. caturaŋgam (which entered Europe from Pers. shatranj) a bahuvrihi compound, meaning ‘having four limbs or parts’, which in epic poetry often means ‘army’, possibly shortened from aturangabalam, Skt. caturaŋgabalam, lit. ‘four-member force’, ‘an army comprising of four parts’, could be loan-translated as MIE ˟qaturangom and ˟qaturangobelom, from roots qetur-, ang- and bel-.

Loan words and loan translations might also coexist in specialised terms; as, from *h1rudhrós, red, PGk eruthrós, in loan eruthrókutos, erythrocyte, proper MIE rudhrós, in rudhr (ésenos) kētjā, red (blood) cell; cf. also MIE mūs, musós, mouse, muscle, PGk mūs, muhós, in loan muhokutos, myocyte, for muskosjo kētjā, muscle cell.

1.8.5. The name of Modern Indo-European is eurōpājóm, or eurōpājdghwā, European language, from adj. eurōpājós, m. European, in turn from the Greek noun Eurōpā.

NOTE. Gk. Eurōpā is of unknown origin, even though it has been linked with Homer’s epithet for Zeus euruopa, from *hurú-oqeh2 ‘far-seeing, broad’, or *h1urú-woqeh2 ‘far-sounding’ (Heath, 2005). The Latinate adj. europaeus, which was borrowed by most European languages, comes from Gk. adj. eurōpaíos, in turn from PGk eurōpai-jós < PIE *eurōpeh2-jós → MIE eurōpā-jós. For the evolution PIH *-eh2jo- → PGk *-aijo-, cf. adjective formation in Gk. agor-agoraíos, Ruijgh (1967).

In the old IE languages, those which had an independent name for languages used the neuter. Compare Gk. n.pl. Ἑλληνικά (hellēniká), Skt. n.sg. संस्कृतम् (saṃskṛtam), O.H.G. diutisc, O.Prus. prūsiskan, etc.; cf. also in Tacitus Lat. uōcābulum latīnum. In most IE languages, the language is also referred to as ‘language’ defined by an adjective, whose gender follows the general rule of concordance; cf. Skt. saṃskṛtā vāk ‘refined speech’, Gk. ελληνική γλώσσα, Lat. latīna lingua, O.H.G. diutiska sprāhha (Ger. Deutsche Sprache), O.Prus. prūsiskai bilā, O.C.S. словѣньскыи ѩзыкъ (slověnĭskyi językŭ), etc.

Common scholarly terms would include sindhueurōpājóm, Indo-European, prāmosindhueurōpājóm, Proto-Indo-European, ópitjom sindhueurōpājóm, Modern Indo-European, etc.


Part I

Language & Culture






Collection of texts and images adapted and organised by Carlos Quiles, with contributions by Fernando López-Menchero





1. Introduction

1.1. The Indo-European Language Family

Countries with a majority (dark colour) and minority or official status (light) of Indo-European language speakers. (2011, modified from Brianski 2007)

1.1.1. The Indo-European languages are a family of several hundred modern languages and dialects, including most of the major languages of Europe, as well as many in Asia. Contemporary languages in this family include English, German, French, Spanish, Portuguese, Hindustani (i.e., Hindi and Urdu among other modern dialects), Persian and Russian. It is the largest family of languages in the world today, being spoken by approximately half the world’s population as a mother tongue. Furthermore, the majority of the other half speaks at least one of them as a second language.

1.1.2. The Romans did not perceive similarities between Latin and the Celtic dialects, but they found obvious correspondences with Greek. According to the grammarian Sextus Pompeius Festus:

Suppum antiqui dicebant, quem nunc supinum dicimus ex Graeco, videlicet pro adspiratione ponentes <s> litteram, ut idem ὕλας dicunt, et nos silvas; item ἕξ sex, et ἑπτά septem

Such findings are not striking, though, as Rome was believed to have been originally founded by the Trojan hero Aeneas and, consequently, Latin was believed to derive from Old Greek.

1.1.3. The Florentine merchant Filippo Sassetti travelled to the Indian subcontinent, and was among the first European observers to study the ancient Indian language, Sanskrit. Writing in 1585, he noted some word similarities between Sanskrit and Italian, e.g. deva/dio ‘God’, sarpa/serpe ‘snake’, sapta/sette ‘seven’, ashta/otto ‘eight’, nava/nove ‘nine’. This observation is today credited with having foreshadowed the later discovery of the Indo-European language family.

1.1.4. The first proposal of the possibility of a common origin for some of these languages came from Dutch linguist and scholar Marcus Zuerius van Boxhorn in 1647. He discovered the similarities among Indo-European languages, and supposed the existence of a primitive common language which he called ‘Scythian’. He included in his hypothesis Dutch, Greek, Latin, Persian, and German, adding later Slavic, Celtic and Baltic languages. He excluded languages such as Hebrew from his hypothesis. However, the suggestions of van Boxhorn did not become widely known and did not stimulate further research.

1.1.5. In 1686, German linguist Andreas Jäger published De Lingua Vetustissima Europae, where he identified a remote language, possibly spreading from the Caucasus, from which Latin, Greek, Slavic, ‘Scythian’ (i.e. Persian) and Celtic (or ‘Celto-Germanic’) were derived, namely Scytho-Celtic.

1.1.6. The hypothesis re-appeared in 1786 when Sir William Jones first lectured on similarities between four of the oldest languages known in his time: Latin, Greek, Sanskrit and Persian:

“The Sanskrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists: there is a similar reason, though not quite so forcible, for supposing that both the Gothic and the Celtic, though blended with a very different idiom, had the same origin with the Sanskrit; and the old Persian might be added to the same family”

1.1.7. Danish scholar Rasmus Rask was the first to point out the connection between Old Norse and Gothic on the one hand, and Lithuanian, Slavonic, Greek and Latin on the other. Systematic comparison of these and other old languages conducted by the young German linguist Franz Bopp supported the theory, and his Comparative Grammar, appearing between 1833 and 1852, counts as the starting-point of Indo-European studies as an academic discipline.

NOTE. The term Indo-European itself, now current in English literature, was coined in 1813 by the British scholar Thomas Young, although at that time there was no consensus as to the naming of the recently discovered language family. Among the names suggested were indo-germanique (C. Malte-Brun, 1810), Indoeuropean (Th. Young, 1813), japetisk (Rasmus C. Rask, 1815), indisch-teutsch (F. Schmitthenner, 1826), sanskritisch (Wilhelm von Humboldt, 1827), indokeltisch (A. F. Pott, 1840), arioeuropeo (G. I. Ascoli, 1854), Aryan (F. M. Müller, 1861), aryaque (H. Chavée, 1867), etc.

In English, Indo-German was used by J. C. Prichard in 1826, although he preferred Indo-European. In French, the use of indo-européen was established by A. Pictet (1836). In German literature, Indo-Europäisch was used by Franz Bopp from 1835. The term Indo-Germanisch had already been introduced by Julius von Klaproth in 1823, intending to include the northernmost and the southernmost of the family’s branches, as an abbreviation of the full listing of involved languages that had been common in earlier literature; that opened the door to ensuing fruitless discussions about whether it should not be Indo-Celtic, or even Tocharo-Celtic.

1.2. Traditional Views

1.2.1. In the early days of Indo-European studies using the comparative method, Indo-European was reconstructed as a unitary proto-language. For Rask, Bopp and other linguists, it was a search for the Indo-European. Such a language was supposedly spoken in a certain region between Europe and Asia and at one point in time.

1.2.2. The Stammbaumtheorie or Genealogical Tree theory states that languages split up into other languages, each of which in turn splits up into others, and so on, like the branches of a tree. For example, a well-known outdated theory about Indo-European is that, within the PIE language, two main groups of dialects known as centum and satem were formed, a model represented by a clean break-up from the parent language.

NOTE. The centum and satem isogloss is one of the oldest known phonological differences of IE languages, and is still used by many to classify PIE into two main dialectal groups – postulating the existence of proto-Centum and proto-Satem languages – according to their pronunciation of PIE *(d)km̥tóm, hundred, disregarding their relevant morphological and syntactical differences, and usually implicitly accepting a common PIE series of palatovelars.

Tree diagrams have remained the most used model for understanding Indo-European language reconstruction since they were proposed by A. Schleicher (Compendium, 1866). The problem with its simplicity is that “the branching of the different groups is portrayed as a series of clean breaks with no connection between branches after they have split, as if each dialectal group marched away from the rest. Such sharp splits are possible, but assuming that all splits within Proto-Indo-European were like this is not very plausible, and any linguist surveying the current Indo-European languages would note dialectal variations running through some but not all areas, often linking adjacent groups who may belong to different languages” (Mallory–Adams, 2007).

1.2.3. The Wellentheorie or Wave Theory of J. Schmidt states that one language is created from another by the spread of innovations, the way waves spread when a stone hits the water surface. The lines that define the extension of the innovations are called isoglosses. The convergence of different isoglosses over a common territory signals the existence of a new language or dialect. Where isoglosses from different languages coincide, transition zones are formed.

NOTE. According to Mallory and Adams (2007), on isoglosses: “their criteria of inclusion, why we are looking at any particular one, and not another one, are no more solid than those that define family trees. The key element here is what linguistic features actually help determine for us whether two languages are more related or less related to one another.”
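The notion of coinciding isoglosses in §1.2.3 can be illustrated with a minimal sketch. Everything in it is hypothetical: the five localities, the three features and their variants are invented for illustration, and the threshold of two coinciding isoglosses is an arbitrary assumption. As the remark by Mallory and Adams suggests, the outcome depends entirely on which features one chooses to include.

```python
# Hypothetical survey: five localities ordered along one line, and the
# variant ("x" or "y") of three invented features recorded at each of them.
LOCALITIES = ["A", "B", "C", "D", "E"]
FEATURES = {
    "feature_1": ["x", "x", "x", "y", "y"],
    "feature_2": ["x", "x", "y", "y", "y"],
    "feature_3": ["x", "x", "x", "y", "y"],
}


def isoglosses_between_neighbours(localities, features):
    """Count, for each pair of adjacent localities, how many features change."""
    counts = {}
    for i in range(len(localities) - 1):
        pair = (localities[i], localities[i + 1])
        counts[pair] = sum(
            1 for values in features.values() if values[i] != values[i + 1]
        )
    return counts


def bundles(counts, minimum=2):
    """Return the boundaries where at least `minimum` isoglosses coincide."""
    return [pair for pair, n in counts.items() if n >= minimum]


counts = isoglosses_between_neighbours(LOCALITIES, FEATURES)
print(counts)           # number of isoglosses crossed between each neighbouring pair
print(bundles(counts))  # boundaries where isoglosses coincide
```

In this invented data set two isoglosses coincide between C and D, which the model treats as a dialect boundary, while the single isogloss between B and C leaves C looking like a transition zone.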

1.2.4. Because of the difficulties found in the modelling of Proto-Indo-European branches and daughter languages into the traditional, unitary ‘Diverging Tree’ framework, i.e. a uniform Proto-Indo-European language with its branches, a new model called ‘Converging Association of Languages’ was proposed, in which languages that are in contact (not necessarily related to each other) exchange linguistic elements and rules, thus developing and acquiring from each other. Most linguists have rejected it as an implausible explanation of the irregularities found in the old, static concept of PIE.

NOTE. Among the prominent advocates is N.S. Trubetzkoy (Urheimat, 1939): “The term ‘language family’ does not presuppose the common descent of a quantity of languages from a single original language. We consider a ‘language family’ a group of languages, in which a considerable quantity of lexical and morphological elements exhibit regular equivalences (…) it is not necessary for one to suppose common descent, since such regularity may also originate through borrowings between neighboring unrelated languages (…) It is just as conceivable that the ancestors of the Indo-European language branches were originally different from each other, but through constant contact, mutual influence, and borrowings, approached each other, without however ever becoming identical to one another” (Meier-Brügger, 2003).

Agreeing with Neumann (1996), Meier-Brügger (2003) rejects that association of languages in the Proto-Indo-European case by stating: “that the various Indo-European languages have developed from a prior unified language is certain. Questionable is, however, the concrete ‘how’ of this process of differentiation”, and that this “thesis of a ‘converging association of languages’ may immediately be dismissed, given that all Indo-European languages are based upon the same Proto-Indo-European flexion morphology. As H. Rix makes clear, it is precisely this morphological congruence that speaks against the language association model, and for the diverging tree model.”

1.3. The Theory of the Three Stages

1.3.1. Even the first Indo-Europeanists had noted in their works the possibility of reconstructing older stages of the ‘Brugmannian’ Proto-Indo-European.

NOTE. The development of this theory of three linguistic stages can be traced back to the very origins of Indo-European studies, first as a diffuse idea of a non-static PIE language, and later widely accepted as a dynamic dialectal evolution, already in the twentieth century, after the decipherment of the Anatolian scripts. Most linguists accept that Proto-Indo-European must be the product of a long historical development, since any ‘common language’ is formed gradually, and proto-languages (like languages) have stages, as described by Lehmann (Introducción a la lingüística histórica, Spa. transl. 1961). On this question, H. Rix (Modussystem, 1986) asserts “[w]hereby comparative reconstruction is based upon a group of similar forms in a number of languages, internal reconstruction takes its point of departure from irregularities or inhomogeneities of the system of a single language (…) The fundamental supposition of language-internal reconstruction is that such an irregularity or inhomogeneity in the grammar of a language is the result of a diachronic process, in which an older pattern, or homogeneity is eclipsed, but not fully suppressed”. According to Meier-Brügger (2003), “Rix works back from Late Proto-Indo-European Phase B (reconstructible Proto-Indo-European) using deducible information about an Early Proto-Indo-European Phase A, and gathers in his work related evidence on the Proto-Indo-European verbal system”. On that question, see also the “Late Indo-European” differentiation in Gamkrelidze–Ivanov (1994-1995), Adrados–Bernabé–Mendoza (1995-1998); a nomenclature also widespread today stems from G.E. Dunkel’s Early, Middle, Late Indo-European: Doing it My Way (1997); etc.

1.3.2. Today, a widespread Three-Stage theory divides PIE internal language evolution into three main historic layers or stages, including a description of branches and languages either as clean breaks from a common source (e.g. PAn and LIE from Indo-Hittite) or from intermediate dialect continua (e.g. Germanic and Balto-Slavic from North-West IE), or classifying similarities into continued linguistic contact (e.g. between Balto-Slavic and Indo-Iranian):

1) Pre-Proto-Indo-European (Pre-PIE), more properly following the current nomenclature Pre-Indo-Hittite (Pre-PIH), also Early PIE, is the hypothetical ancestor of Indo-Hittite, and probably the oldest stage of the language that comparative grammar could help reconstruct using internal reconstruction. There is, however, no common position as to what it was like or when and where it was spoken.

2) The second stage corresponds to a time before the separation of Proto-Anatolian from the common linguistic community, where it would have coexisted (as a Pre-Anatolian dialect) with Pre-LIE. That stage of the language is today commonly called Indo-Hittite (PIH), and also Middle PIE, but often simply Proto-Indo-European; it is identified with early kurgan cultures in the Kurgan Hypothesis.

NOTE. On the place of Anatolian among IE languages, the question is whether it separated first as a language branch from PIE, and to what extent it was thus spared developments common to the remaining Proto-Indo-European language group. There is growing consensus in favour of its early split from Indo-European (under the heading, among others, of ‘Indo-Hittite’); see N. Oettinger (‘Indo-Hittite’ – Hypothesen und Wortbildung 1986), A. Lehrman (Indo-Hittite Revisited, 1996), H. Craig Melchert (The Dialectal Position of Anatolian within IE in IE Subgrouping, 1998), etc.

For Kortlandt (The Spread of The Indo-Europeans, JIES 18, 1990): “Since the beginnings of the Yamnaya, Globular Amphora, Corded Ware, and Afanasievo cultures can all be dated between 3600 and 3000 BC, I am inclined to date Proto-Indo-European to the middle of the fourth millennium, and to recognize Proto-Indo-Hittite as a language which may have been spoken a millennium earlier.”

For Ringe (2006), “[i]nterestingly, there is by now a general consensus among Indo-Europeanists that the Anatolian subfamily is, in effect, one half of the IE family, all the other subgroups together forming the other half.”

On the Anatolian question and its implications on nomenclature, West (2007) states that “[t]here is growing consensus that the Anatolian branch, represented by Hittite and related languages of Asia Minor, was the first to diverge from common Indo-European, which continued to evolve for some time after the split before breaking up further. This raises a problem of nomenclature. It means that with the decipherment of Hittite the ‘Indo-European’ previously reconstructed acquired a brother in the shape of proto-Anatolian, and the archetype of the family had to be put back a stage. E. H. Sturtevant coined a new term ‘Indo-Hittite’ (…) The great majority of linguists, however, use ‘Indo-European’ to include Anatolian, and have done, naturally enough, ever since Hittite was recognized to be ‘an Indo-European language’. They will no doubt continue to do so.”

3)  The common immediate ancestor of most of the reconstructed IE proto-languages is approximately the same static ‘Brugmannian’ PIE searched for since the start of Indo-European studies, before Hittite was deciphered. It is usually called Late Indo-European (LIE) or Late PIE, generally dated some time ca. 3500-2500 BC using linguistic or archaeological models, or both.

NOTE. According to Mallory–Adams (2007): “Generally, we find some form of triangulation based on the earliest attested Indo-European languages, i.e. Hittite, Mycenaean Greek, and Indo-Aryan, each of these positioned somewhere between c. 2000 and 1500 BC. Given the kind of changes linguists know to have occurred in the attested histories of Greek or Indo-Aryan, etc., the linguist compares the difference wrought by such changes with the degree of difference between the earliest attested Hittite, Mycenaean Greek, and Sanskrit and reconstructed Proto-Indo-European. The order of magnitude for these estimates (or guesstimates) tends to be something on the order of 1,500-2,000 years. In other words, employing some form of gut intuition (based on experience which is often grounded on the known separation of the Romance or Germanic languages), linguists tend to put Proto-Indo-European sometime around 3000 BC plus or minus a millennium (…) the earliest we are going to be able to set Proto-Indo-European is about the fifth millennium BC if we want it to reflect the archaeological reality of Eurasia. We have already seen that individual Indo-European groups are attested by c. 2000 BC. One might then place a notional date of c. 4500-2500 BC on Proto-Indo-European. The linguist will note that the presumed dates for the existence of Proto-Indo-European arrived at by this method are congruent with those established by linguists’ ‘informed estimation’. The two dating techniques, linguistic and archeological, are at least independent and congruent with one another.”

Likewise, in Meier-Brügger (2003), about a common Proto-Indo-European: “No precise statement concerning the exact time period of the Proto-Indo-European linguistic community is possible. One may only state that the ancient Indo-European languages that we know, which date from the 2nd millennium BC, already exhibit characteristics of their respective linguistic groups in their earliest occurrences, thus allowing one to presume the existence of a separate and long pre-history (…) The period of 5000-3000 BC is suggested as a possible timeframe of a Proto-Indo-European language.”

However, on the early historic and prehistoric finds, and the assumption of linguistic communities linked with archaeological cultures, Hänsel (Die Indogermanen und das Pferd, B. Hänsel, S. Zimmer (eds.), 1994) states that “[l]inguistic development may be described in steps that, although logically comprehensible, are not precisely analyzable without a timescale. The archaeologist pursues certain areas of cultural development, the logic of which (if one exists) remains a mystery to him, or is only accessible in a few aspects of its complex causality. On the other hand, he is provided with concrete ideas with regard to time, as vague as these may be, and works with a concept of culture that the Indo-European linguist cannot attain. For the archaeologist, culture is understood in the sense of a sociological definition (…) The archaeological concept of culture is composed of so many components, that by its very nature its contours must remain blurred. But languages are quite different. Of course there are connections; no one can imagine cultural connections without any possibility of verbal communication. But it is too much to ask that archaeologists equate their concept of culture, which is open and incorporates references on various levels, to the single dimension of linguistic community. Archaeology and linguistics are so fundamentally different that, while points of agreement may be expected, parallels and congruency may not. The advantage of linguistic research is its ability to precisely distinguish between individual languages and the regularity of developments. The strength of archaeology is its precision in developing timelines. What one can do, the other cannot. They could complement each other beautifully, if only there were enough commonality.”

1.3.3. Another division has to be made, so that the dialectal evolution is properly understood. Late Indo-European had at least two main inner dialectal branches, the Southern or Graeco-Aryan (S.LIE) and the Northern (N.LIE) ones.

It seems that speakers of Southern or Graeco-Aryan dialects spread in different directions with the first LIE migrations (ca. 3000-2500 BC in the Kurgan framework), forming at least a South-East (including Pre-Indo-Iranian) and a South-West (including Pre-Greek) group. Meanwhile, speakers of Northern dialects migrated to the North-West (see below), except for speakers of a North-East IE branch (from which Pre-Tocharian developed), who migrated to Asia.

NOTE. Beekes (1995), from an archaeological point of view, on the Yamnaya culture: “This is one of the largest pre-historic complexes in Europe, and scholars have been able to distinguish between different regions within it. It is dated from 3600-2200 B.C. In this culture, the use of copper for the making of various implements is more common. From about 3000 B.C. we begin to find evidence for the presence in this culture of two- and four-wheeled wagons (…) There seems to be no doubt that the Yamnaya culture represents the last phase of an Indo-European linguistic unity, although there were probably already significant dialectal differences within it.”

Fortson (2004) similarly suggests: “[i]n the period 3100-2900 BC came a clear and dramatic infusion of Yamna cultural practice, including burials, into eastern Hungary and along the lower Danube. With this we seem able to witness the beginnings of the Indo-Europeanization of Europe. By this point, the members of the Yamna culture had spread out over a very large area and their speech had surely become dialectally strongly differentiated.”

Meier-Brügger (2003): “Within the group of IE languages, some individual languages are more closely associated with one another owing to morphological or lexical similarities. The cause for this, as a rule, is a prehistoric geographic proximity (perhaps even constituting single linguistic community) or a common preliminary linguistic phase, a middle mother-language phase, which would however then be posterior to the period of the mother language.”

About Tocharian, Adrados–Bernabé–Mendoza (1995-1998): “even if archaic in some respects (its centum character, subjunctive, etc.) it shares common features with Balto-Slavic, among other languages: they must be old isoglosses, shared before it separated and migrated to the East. It is, therefore, [a N.LIE] language. It shows great innovations, too, something normal in a language that evolved isolated.”

On the Southern (Graeco-Aryan or Indo-Greek) LIE dialect, see Tovar (Krahes alteuropäische Hydronymie und die west-indogermanischen Sprachen, 1977; Actas del II Coloquio sobre lenguas y culturas prerromanas de la Península Ibérica, Salamanca, 1979), Gamkrelidze–Ivanov (1993-1994), Clackson (The Linguistic Relationship Between Armenian and Greek, 1994), Adrados–Bernabé–Mendoza (1995-1998), etc. In Mallory–Adams (2007): “Many have argued that Greek, Armenian, and Indo-Iranian share a number of innovations that suggest that there should have been some form of linguistic continuum between their predecessors.”

On the Graeco-Aryan community, West (2007) proposes the latest terminus ante quem for its split: “We shall see shortly that Graeco-Aryan must already have been differentiated from [LIE] by 2500. We have to allow several centuries for the development of [LIE] after its split from proto-Anatolian and before its further division. (…) The first speakers of Greek – or rather of the language that was to develop into Greek; I will call them mello-Greeks – arrived in Greece, on the most widely accepted view, at the beginning of Early Helladic III, that is, around 2300. They came by way of Epirus, probably from somewhere north of the Danube. Recent writers have derived them from Romania or eastern Hungary. (…) we must clearly go back at least to the middle of the millennium for the postulated Graeco-Aryan linguistic unity or community.”

1.3.4. The so-called North-West Indo-European is considered by some to have formed an early linguistic community already separated from other Northern dialects (which included Pre-Tocharian) before or during the LIE dialectal split, and is generally assumed to have been a later IE dialect continuum between different communities in Northern Europe during the centuries on either side of 2500 BC, with a development usually linked to the expansion of the Corded Ware culture.

NOTE. A dialect continuum, or dialect area, was defined by Leonard Bloomfield as a range of dialects spoken across some geographical area that differ only slightly between neighbouring areas, but as one travels in any direction, these differences accumulate such that speakers from opposite ends of the continuum are no longer mutually intelligible. Examples of dialect continua included (now blurred with national languages and administrative borders) the North-Germanic, German, East Slavic, South Slavic, Northern Italian, South French, or West Iberian languages, among others.

A Sprachbund, also known as a linguistic area, convergence area, diffusion area or language crossroads, is a group of languages that have become similar in some way because of geographical proximity and language contact. They may be genetically unrelated, or only distantly related. That was probably the case with Balto-Slavic and Indo-Iranian, v.i. §1.7.

North-West IE was therefore a language or group of closely related dialects that emerged from a parent (N.LIE) dialect, in close contact for centuries, which allowed them to share linguistic developments.

NOTE. On the so-called “North-West Indo-European” dialect continuum, see Tovar (1977, 1979), Eric Hamp (“The Indo-European Horse” in T. Markey and J. Greppin (eds.) When Worlds Collide: Indo-Europeans and Pre-Indo-Europeans, 1990), N. Oettinger Grundsätzliche Überlegungen zum Nordwest-Indogermanischen (1997), and Zum nordwestindogermanischen Lexikon (1999); M. E. Huld Indo-Europeanization of Northern Europe (1996); Adrados–Bernabé–Mendoza (1995-1998); etc.

Regarding the dating of European proto-languages (of ca. 1500-500 BC) to the same time as Proto-Greek or Proto-Indo-Iranian (of ca. 2500-2000), obviating the time span between them, we might remember Kortlandt’s (1990) description of what “seems to be a general tendency to date proto-languages farther back in time than is warranted by the linguistic evidence. When we reconstruct Proto-Romance, we arrive at a linguistic stage which is approximately two centuries later than the language of Caesar and Cicero (cf. Agard 1984: 47-60 for the phonological differences). When we start from the extralinguistic evidence and identify the origins of Romance with the beginnings of Rome, we arrive at the eighth century BC, which is almost a millennium too early. The point is that we must identify the formation of Romance with the imperfect learning of Latin by a large number of people during the expansion of the Roman empire.”

1.3.4. Apart from the shared phonology and vocabulary, North-Western dialects show other common features, such as a trend to reduce the noun inflection system, shared innovations in the verbal system (merger of imperfect, aorist and perfect in a single preterite, although some preterite-presents are found), the -r endings of the middle or middle-passive voice, a common evolution of laryngeals, etc.

The southern IEDs, which spread in different directions and evolved without forming a continuum, show therefore a differentiated phonology and vocabulary, but common older developments like the augment in é-, middle desinences in -i, athematic verbal inflection, pluperfect and perfect forms, and aspectual differentiation between the types *bhére/o- and *tudé/o-.

1.4. The Proto-Indo-European Urheimat

The search for the Urheimat or ‘Homeland’ of the prehistoric Proto-Indo-Europeans has developed as an archaeological quest along with the linguistic research looking for the reconstruction of the proto-language.

NOTE. Mallory (Journal of Indo-European Studies 1, 1973): “While many have maintained that the search for the PIE homeland is a waste of intellectual effort, or beyond the competence of the methodologies involved, the many scholars who have tackled the problem have ably evinced why they considered it important. The location of the homeland and the description of how the Indo-European languages spread is central to any explanation of how Europe became European. In a larger sense it is a search for the origins of western civilization.”

According to A. Scherer’s Die Urheimat der Indogermanen (1968), summing up the views of various authors from the years 1892-1963, still followed by mainstream Indo-European studies today, “[b]ased upon the localization of later languages such as Greek, Anatolian, and Indo-Iranian, a swathe of land in southern Russia north of the Black Sea is often proposed as the native area of the speakers of Proto-Indo-European”.

1.4.1. Historical Linguistics

In Adrados–Bernabé–Mendoza (1995-1998), a summary of main linguistic facts is made, supported by archaeological finds:

 “It is communis opinio today that the languages of Europe have developed in situ in our continent; although indeed, because of the migrations, they have remained sometimes dislocated, and also extended and fragmented (…) Remember the recent date of the ‘crystallisation’ of European languages. ‘Old European’ [=North-West IE], from which they derive, is an already evolved language, with opposition masculine/feminine, and must be located in time ca. 2000 BC or before. Also, one must take into account the following data: the existence of Tocharian, related to [Northern LIE], but far away to the East, in the Chinese Turkestan; the presence of [Southern LIE] languages to the South of the Carpathian Mountains, no doubt already in the third millennium (the ancestors of Thracian, Iranian, Greek speakers); differentiation of Hittite and Luwian, within the Anatolian group, already ca. 2000 BC, in the documents of Kültepe, what means that Common Anatolian must be much older.

NOTE. Without taking into account archaeological theories, linguistic data reveals that:

a)      [Northern LIE], located in Europe and in the Chinese Turkestan, must come from an intermediate zone, with expansion into both directions.

b)      [Southern LIE], which occupied the space between Greece and the north-west of India, communicating both peninsulas through the languages of the Balkans, Ukraine and Northern Caucasus, the Turkestan and Iran, must also come from some intermediate location. Being a different linguistic group, it cannot come from Europe or the Russian Steppe, where Ural-Altaic languages existed.

c)       Both groups have been in contact secondarily, taking into account the different ‘recent’ isoglosses in the contact zone.

d)      The more archaic Anatolian must have been isolated from the more evolved IE; and that in some region with easy communication with Anatolia.

(…) Only the Steppe North of the Caucasus, the Volga river and beyond can combine all possibilities mentioned: there are pathways that go down into Anatolia and Iran through the Caucasus, through the East of the Caspian Sea, the Gorgan plains, and they can migrate from there to the Chinese Turkestan, or to Europe, where two ways exist: to the North and to the South of the Carpathian mountains.

These linguistic data, presented in a diagram, are supported by strong archaeological arguments: they have been defended by Gimbutas 1985 against Gamkrelidze–Ivanov (1994-1995) (…). This diagram proposes three stages. In the first one, [PIH] became isolated, and from it Anatolian emerged, being first relegated to the North of the Caucasus, and then crossing into the South: Common Anatolian must be located there. Note that there is no significant temporal difference with the other groups; it happens also that the first IE wave into Europe was older. It is somewhere to the North of the people that later went to Anatolia that happened the great revolution that developed [LIE], the ‘common language’.


Diagram of the expansion and relationships of IE languages. Adapted from Adrados (1979).

[Figure labels: Stage 1, Stage 2, Stage 3; Anat.; Northern Horde (West IE, Bal.-Sla., Tocharian); Southern Horde (Gk.-Thrac., Arm., Ind.-Ira.); Northern Horde (Germanic, Bal.-Sla., Cel., Ita.); Southern Horde (Indo-Iranian, Thr.).]

The following stages refer to that common language. The first is the one that saw both [N.LIE] (to the North) and [S.LIE] (to the South), the former being fragmented in two groups, one that headed West and one that migrated to the East. That is a proof that somewhere in the European Russia a common language [N.LIE] emerged; to the South, in Ukraine or in the Turkestan [S.LIE].

The second stage continues the movements of both branches, that launched waves to the South, but that were in contact in some moments, arising isoglosses that unite certain languages of the [Southern IE] group (first Greek, later Iranian, etc.) with those of the rearguard of [Northern IE] (especially Baltic and Slavic, also Italic and Germanic)”.

NOTE. The assumption of three independent series of velars (v.s. Considerations of Method) has logical consequences when trying to arrange a consistent chronological and dialectal evolution from the point of view of historical linguistics. That is necessarily so because phonological change is generally assumed to be easier than morphological evolution for any given language. As a consequence, while morphological change is an agreed way to pinpoint different ancient groups, and lexical equivalences to derive late close contacts and culture (using them we could find agreement in grouping e.g. Balto-Slavic, Italo-Celtic, and Germanic between both groups, as well as older Graeco-Aryan dialects), phonetics is often used – whether explicitly or not – as key to the groupings and chronology of the final split up of Late Indo-European, which is at the core of the actual archaeological quest today.

If we assume that the satem languages show the most natural trend of leniting palatals from an ‘original’ system of three series of velars; if we assume that the other, centum languages, had undergone a trend of (unlikely and unparalleled) depalatalisation of the palatovelars; then the picture of the dialectal split must be different, because centum languages must be more closely related to each other in ancient times (due to the improbable happening of depalatalisation in more than one branch independently). That is the scheme followed in some manuals on IE linguistics or archaeology if three series are reconstructed or accepted, as is commonly the case.

From that point of view, Italic, Celtic and Tocharian must be grouped together, while the satem core can be found in Balto-Slavic and Indo-Iranian. This contradicts the finds on different Northern and Graeco-Aryan dialects, though. As already stated, the Glottalic theory might support that dialectal scheme, by assuming a neater explanation of the natural evolution of glottalic, voiced and voiceless stops, different from the depalatalisation proposal. However, the glottalic theory is today mostly rejected (see below §1.5). Huld’s (1997) explanation of the three series could also support this scheme (see above).

1.4.2. Archaeology

The Kurgan hypothesis was introduced by Marija Gimbutas (The Prehistory of Eastern Europe, Part 1, 1956) in order to combine archaeology with linguistics in locating the origins of the Proto-Indo-Europeans. She named the set of cultures in question “Kurgan” after their distinctive burial mounds and traced their diffusion into Eastern and Northern Europe.

NOTE. People were buried with their legs flexed, a position which remained typical for peoples identified with Indo-European speakers for a long time. The burials were covered with a mound, a kurgan (Turkish loanword in Russian for ‘tumulus’).

According to her hypothesis, PIE speakers were probably a nomadic tribe of the Pontic-Caspian steppe that expanded in successive stages of the Kurgan culture and three successive “waves” of expansion during the third millennium BC:

·   Kurgan I, Dnieper/Volga region, earlier half of the fourth millennium BC. Apparently evolving from cultures of the Volga basin, subgroups include the Samara and Seroglazovo cultures.

·   Kurgan II–III, latter half of the fourth millennium BC. Includes the Sredny Stog culture and the Maykop culture of the northern Caucasus. Stone circles, early two-wheeled chariots, anthropomorphic stone stelae of deities.

·   Kurgan IV or Pit Grave culture, first half of the third millennium BC, encompassing the entire steppe region from the Ural to Romania.

Hypothetical Urheimat (Homeland) of the first PIE speakers, from 4500 BC onwards. The Yamna (Pit Grave) culture lasted from ca. 3600 till 2200 BC. In this time the first wagons appeared. (PD)

Three successive “waves” of expansion were proposed:

o Wave 1, predating Kurgan I, expansion from the lower Volga to the Dnieper, leading to coexistence of Kurgan I and the Cucuteni culture. Repercussions of the migrations extend as far as the Balkans and along the Danube to the Vinča and Lengyel cultures in Hungary.

o Wave 2, mid fourth millennium BC, originating in the Maykop culture and resulting in advances of kurganised hybrid cultures into northern Europe around 3000 BC – Globular Amphora culture, Baden culture, and ultimately Corded Ware culture.

o Wave 3, 3000-2800 BC, expansion of the Pit Grave culture beyond the steppes; appearance of characteristic pit graves as far as the areas of modern Romania, Bulgaria and eastern Hungary.

The ‘kurganised’ Globular Amphora culture in Europe is proposed as a ‘secondary Urheimat’ of PIE, the culture separating into the Bell-Beaker culture and Corded Ware culture around 2300 BC. This ultimately resulted in the European IE families of Italic, Celtic and Germanic languages, and other, partly extinct, language groups of the Balkans and central Europe, possibly including the proto-Mycenaean invasion of Greece.

1.4.3. Quantitative Analysis

Glottochronology tries to compare lexical, morphological or phonological traits in order to develop more reliable timelines and dialectal groupings. It has not gained much credibility among linguists, though, compared with the comparative method, on which the whole IE reconstruction is still based.

NOTE. Most of these glottochronological works are highly controversial, partly owing to issues of accuracy, partly to the question of whether its very basis is sound. Two serious arguments that make this method mostly invalid today are the proof that Swadesh formulae would not work on all available material, and that language change arises from socio-historical events which are of course unforeseeable and, therefore, incomputable.
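The kind of calculation at issue can be sketched in a few lines. This is only a minimal illustration, assuming the classical Swadesh relation t = ln(c) / (2 ln(r)), where c is the fraction of shared cognates and r a constant retention rate per millennium. The function name, the default rate of 0.86 and the sample figures are assumptions made for the example, not data from the works cited here.

```python
import math


def swadesh_divergence_time(shared_fraction, retention_rate=0.86):
    """Estimate the time since separation, in millennia.

    Classical formula: t = ln(c) / (2 * ln(r)).
    retention_rate is an ASSUMED constant; 0.86 is a conventional
    textbook value, not a figure taken from this grammar.
    """
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0 < retention_rate < 1:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_fraction) / (2 * math.log(retention_rate))


# Hypothetical input: two languages sharing half of a basic word list.
for rate in (0.86, 0.80):
    millennia = swadesh_divergence_time(0.5, rate)
    print(f"r = {rate:.2f}: about {millennia:.2f} millennia")
```

The two printed estimates differ considerably although the lexical input is identical, which shows how strongly the result depends on the assumed constant rate.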

A variation of traditional glottochronology is phylogenetic reconstruction; in biological systematics, phylogeny is a graph intended to represent genetic relationships between biological taxa. Linguists try to transfer these biological models to obtain “subgroupings” of one or the other branch of a language family.

NOTE. Clackson (2007) describes a recent phylogenetic study, by Atkinson et al. (“From Words to Dates: Water into Wine, Mathemagic or Phylogenetic Inference?”, Transactions of the Philological Society 103, 2005): “The New Zealand team use models which were originally designed to build phylogenies based on DNA and other genetic information, which do not assume a constant rate of change. Instead, their model accepts that the rate of change varies, but it constrains the variation within limits that coincide with attested linguistic sub-groups. For example, it is known that the Romance languages all derive from Latin, and we know that Latin was spoken 2,000 years ago. The rates of lexical change in the Romance family can therefore be calculated in absolute terms. These different possible rates of change are then projected back into prehistory, and the age of the parent can be ascertained within a range of dates depending on the highest and lowest rates of change attested in the daughter languages. More recently (Atkinson et al. 2005), they have used data based not just on lexical characters, but on morphological and phonological information as well.”

Their results show a late separation of the Northwestern IE languages, with a last core of Romance-Germanic, earlier Celto-Romano-Germanic, and earlier Celto-Romano-Germano-Balto-Slavic. Before that date, Graeco-Armenian would have separated earlier than Indo-Iranian, while Tocharian would have been the earliest to split up from LIE, still within the Kurgan framework, although quite early (ca. 4000-3000 BC). Before that, the Anatolian branch is found to have split considerably earlier than the dates usually assumed in linguistics and archaeology (ca. 7000-6000 BC).
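The calibration idea described by Clackson (rates measured on a family of known age, then projected back to give a range of dates) can be illustrated with a deliberately simplified sketch. It is not the model used by Atkinson et al.; it reuses a Swadesh-style exponential decay as a stand-in, and every name and number in it is hypothetical.

```python
import math


def rate_per_millennium(shared_fraction, known_age_millennia):
    """Retention rate implied by a language pair whose split age is known.

    Inverts t = ln(c) / (2 * ln(r)), giving r = c ** (1 / (2 * t)).
    """
    return shared_fraction ** (1.0 / (2.0 * known_age_millennia))


def age_range(shared_fraction, rates):
    """Return (youngest, oldest) age in millennia over the given rates."""
    ages = [math.log(shared_fraction) / (2 * math.log(r)) for r in rates]
    return min(ages), max(ages)


# HYPOTHETICAL calibration pairs, both assumed to descend from a parent
# spoken 2 millennia ago (as Latin is for the Romance languages).
calibration = {"pair A": 0.70, "pair B": 0.78}
rates = [rate_per_millennium(c, 2.0) for c in calibration.values()]

# Project the slowest and fastest calibrated rates onto an older,
# equally hypothetical pair sharing 30% of the word list.
youngest, oldest = age_range(0.30, [min(rates), max(rates)])
print(f"parent dated between {youngest:.1f} and {oldest:.1f} millennia ago")
```

The width of the resulting interval comes only from the spread of rates in the calibration set, so the projected dates are no better than the assumption that prehistoric rates stayed within that spread.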

Holm proposed to apply a Separation-Level Recovery system to PIE. This is made (Holm, 2008) by using the data on the new Lexikon der indogermanischen Verben, 2nd ed. (Rix et al. 2001), considered a “more modern and linguistic reliable database” than the data traditionally used from Pokorny IEW. The results show a similar grouping to those of Atkinson et al. (2005), differentiating between North-West IE (Italo-Celtic, Germanic, Balto-Slavic), and Graeco-Aryan (Graeco-Armenian, Indo-Iranian) groups. However, Anatolian is deemed to have separated quite late compared to linguistic dates, being considered then just another LIE dialect, therefore rejecting the concept of Indo-Hittite altogether. Some of Holm’s studies are available at <http://hjholm.de/>.

The most recent quantitative studies then apparently show similar results in the phylogenetic groupings of recent languages, i.e. Late Indo-European dialects, excluding Tocharian. Their dates remain, at best, just approximations for the separation of late and well attested languages, though, while the dating (and even grouping) of ancient languages like Anatolian or Tocharian with modern evolution patterns remains at best questionable.


1.4.4. Archaeogenetics

Distribution of haplotypes R1b (light colour) for Eurasiatic Paleolithic and R1a (dark colour) for Yamna expansion; black represents other haplogroups. (2009, modified from Dbachmann 2007)

Cavalli-Sforza and Alberto Piazza argue that Renfrew (v.i. §1.5) and Gimbutas reinforce rather than contradict each other, stating that “genetically speaking, peoples of the Kurgan steppe descended at least in part from people of the Middle Eastern Neolithic who immigrated there from Turkey”.

Distribution of haplogroup R1a (2011, modified from Crates 2009)

NOTE. The genetic record cannot yield any direct information as to the language spoken by these groups. The current interpretation of genetic data suggests a strong genetic continuity in Europe; specifically, studies of mtDNA by Bryan Sykes show that about 80% of the genetic stock of Europeans originated in the Paleolithic.

Spencer Wells suggests that the origin, distribution and age of the R1a1 haplotype point to an ancient migration, possibly corresponding to the spread by the Kurgan people in their expansion across the Eurasian steppe around 3000 BC, stating that “there is nothing to contradict this model, although the genetic patterns do not provide clear support either”.

NOTE. R1a1 is most prevalent in Poland, Russia, and Ukraine, and is also observed in Pakistan, India and central Asia. R1a1 is largely confined east of the Vistula gene barrier and drops considerably to the west. The spread of Y-chromosome DNA haplogroup R1a1 has been associated with the spread of the Indo-European languages too. The mutations that characterise haplogroup R1a occurred ~10,000 years bp. Haplogroup R1a1, whose lineage is thought to have originated in the Eurasian Steppes north of the Black and Caspian Seas, is therefore associated with the Kurgan culture, as well as with the postglacial Ahrensburg culture which has been suggested to have spread the gene originally.

The present-day populations carrying the R1b haplotype, with extremely high peaks in Western Europe and measured up to the eastern confines of Central Asia, are believed to be the descendants of a refugium in the Iberian peninsula at the Last Glacial Maximum, where the haplogroup may have achieved genetic homogeneity. As conditions eased with the Allerød Oscillation in about 12000 BC, descendants of this group migrated and eventually recolonised all of Western Europe, leading to the dominant position of R1b in varying degrees from Iberia to Scandinavia, so evident in haplogroup maps.

NOTE. High concentrations of Mesolithic or late Paleolithic YDNA haplogroups of types R1b (typically well above 35%) and I (up to 25%) are thought to derive ultimately from the robust Eurasiatic Cro Magnoid Homo sapiens of the Aurignacian culture, and the subsequent gracile leptodolichomorphous people of the Gravettian culture that entered Europe from the Middle East 20,000 to 25,000 years ago, respectively.

1.4.5. The Kurgan Hypothesis and the Three-Stage Theory

ca. 4500-4000 BC

ARCHAEOLOGY (Kurgan Hypothesis): Sredny Stog, Dnieper-Donets and Samara cultures, domestication of the horse.

ca. 4000-3500 BC

ARCHAEOLOGY: The Yamna culture, the kurgan builders, emerges in the steppe, and the Maykop culture in northern Caucasus.

LINGUISTICS (Three-Stage Theory): Pre-LIE and Pre-PAn dialects evolve in different communities but presumably still in contact within the same territory.

ca. 3500-3000 BC

ARCHAEOLOGY: Yamna culture at its peak: stone idols, two-wheeled proto-chariots, animal husbandry, permanent settlements and hillforts, subsisting on agriculture and fishing, along rivers. Contact of the Yamna culture with late Neolithic European cultures results in kurganised Globular Amphora and Baden cultures. Maykop culture shows earliest evidence of the beginning Bronze Age; bronze weapons and artifacts introduced.

LINGUISTICS: Proto-Anatolian becomes isolated (either to the south of the Caucasus or in the Balkans), and has no more contacts with the linguistic innovations of the common Late Indo-European language. Late Indo-European evolves in turn into dialects, at least a Southern or Graeco-Aryan and a Northern one.

ca. 3000-2500 BC

ARCHAEOLOGY: The Yamna culture extends over the entire Pontic steppe. The Corded Ware culture extends from the Rhine to the Volga, corresponding to the latest stage of IE unity. Different cultures disintegrate, still in loose contact, enabling the spread of technology.

LINGUISTICS: Dialectal communities begin to migrate, remaining still in loose contact, enabling the spread of the last common phonetic and morphological innovations, and loan words. PAn, spoken in Asia Minor, evolves into Common Anatolian.

ca. 2500-2000 BC

ARCHAEOLOGY: The Bronze Age reaches Central Europe with the Beaker culture of Northern Indo-Europeans. Indo-Iranians settle north of the Caspian in the Sintashta-Petrovka and later the Andronovo culture.

LINGUISTICS: The breakup of the southern IE dialects is complete. Proto-Greek spoken in the Balkans; Proto-Indo-Iranian in Central Asia; North-West Indo-European in Northern Europe; Common Anatolian dialects in Anatolia.

ca. 2000-1500 BC

ARCHAEOLOGY: The chariot is invented, leading to the split and rapid spread of Iranians and other peoples from the Andronovo culture and the Bactria-Margiana Complex over much of Central Asia, Northern India, Iran and Eastern Anatolia. Greek Dark Ages and flourishing of the Hittite Empire. Pre-Celtic Unetice culture.

LINGUISTICS: Indo-Iranian splits up in two main dialects, Indo-Aryan and Iranian. European proto-dialects like Pre-Germanic, Pre-Celtic, Pre-Italic, and Pre-Balto-Slavic differentiate from each other. Anatolian languages like Hittite and Luwian are written down; Indo-Iranian attested through Mitanni; a Greek dialect, Mycenaean, is already spoken.

ca. 1500-1000 BC

ARCHAEOLOGY: The Nordic Bronze Age sees the rise of the Germanic Urnfield and the Celtic Hallstatt cultures in Central Europe, introducing the Iron Age. Italic peoples move to the Italian Peninsula. Rigveda composed. Decline of Hittite Kingdoms and the Mycenaean civilisation.

LINGUISTICS: Celtic, Italic, Germanic, Baltic and Slavic are already different proto-languages, developing in turn different dialects. Iranian and other related southern dialects expand through military conquest, and Indo-Aryan spreads in the form of its sacred language, Sanskrit.

ca. 1000-500 BC

ARCHAEOLOGY: Northern Europe enters the Pre-Roman Iron Age. Early Indo-European Kingdoms and Empires in Eurasia. In Europe, Classical Antiquity begins with the flourishing of the Greek peoples. Foundation of Rome.

LINGUISTICS: Celtic dialects spread over western Europe, Germanic dialects to the south of Jutland. Italic languages in the Italian Peninsula. Greek and Old Italic alphabets appear. Late Anatolian dialects. Cimmerian, Scythian and Sarmatian in Asia, Palaeo-Balkan languages in the Balkans.

1.5. Other Archaeolinguistic Theories

1.5.1. The best-known alternative theory concerning PIE is the Glottalic theory. It assumes that Proto-Indo-European was pronounced more or less like Armenian, i.e. instead of PIE *p, *b, *bh, the pronunciation would have been *p’, *p, *b, and the same with the other two voiceless-voiced-voiced aspirated series of consonants usually reconstructed. The IE Urheimat would then have been located in the surroundings of Anatolia, especially near Lake Urmia, in northern Iran, hence the archaism of Anatolian dialects and the glottalics found in Armenian.

NOTE. Those linguistic and archaeological findings are supported by Gamkrelidze–Ivanov (“The early history of Indo-European languages”, Scientific American, 1990), where early Indo-European vocabulary deemed “of southern regions” is examined, and similarities with Semitic and Kartvelian languages are also brought to light.

This theory is generally rejected; Beekes (1995) for all: “But this theory is in fact very improbable. The presumed loan-words are difficult to evaluate, because in order to do so the Semitic words and those of other languages would also have to be evaluated. The names of trees are notoriously unreliable as evidence. The words for panther, lion and elephant are probably incorrectly reconstructed as PIE words.”

1.5.2. Alternative theories include:

I. The European Homeland thesis maintains that the common origin of the IE languages lies in Europe. These hypotheses are often driven by archaeological theories. A. Häusler (Die Indoeuropäisierung Griechenlands, Slovenska Archeológia 29, 1981; etc.) continues to defend the hypothesis that places Indo-European origins in Europe, stating that all the known differentiation emerged in the continuum from the Rhine to the Urals.

NOTE. It has been traditionally located in 1) Lithuania and the surrounding areas, by R.G. Latham (1851) and Th. Poesche (Die Arier. Ein Beitrag zur historischen Anthropologie, 1878); 2) Scandinavia, by K. Penka (Origines ariacae, 1883); 3) Central Europe, by G. Kossinna (“Die Indogermanische Frage archäologisch beantwortet”, Zeitschrift für Ethnologie, 34, 1902), P. Giles (The Aryans, 1922), and by linguist/archaeologist G. Childe (The Aryans. A Study of Indo-European Origins, 1926).

a. The Paleolithic Continuity theory posits that the advent of IE languages should be linked to the arrival of Homo sapiens in Europe and Asia from Africa in the Upper Paleolithic. The PCT proposes a continued presence of Pre-IE and non-IE peoples and languages in Europe from Paleolithic times, allowing for minor invasions and infiltrations of local scope, mainly during the last three millennia.

NOTE. There are some research papers concerning the PCT available at <http://www.continuitas.com/>. Also, the PCT could in turn be connected with Frederik Kortlandt’s Indo-Uralic and Altaic studies <http://kortlandt.nl/publications/>.

On the temporal relationship question, Mallory–Adams (2007): “Although there are still those who propose solutions dating back to the Palaeolithic, these cannot be reconciled with the cultural vocabulary of the Indo-European languages. The later vocabulary of Proto-Indo-European hinges on such items as wheeled vehicles, the plough, wool, which are attested in Proto-Indo-European, including Anatolian. It is unlikely then that words for these items entered the Proto-Indo-European lexicon prior to about 4000 BC.”

b. A new theory put forward by Colin Renfrew relates IE expansion to the Neolithic revolution, causing the peaceful spread of an older pre-IE language into Europe from Asia Minor from around 7000 BC, with the advance of farming. It proposes that the dispersal (discontinuity) of Proto-Indo-Europeans originated in Neolithic Anatolia.

NOTE. Reacting to criticism, Renfrew by 1999 revised his proposal to the effect of taking a pronounced Indo-Hittite position. Renfrew’s revised views place only Pre-Proto-Indo-European in seventh millennium Anatolia, proposing as the homeland of Proto-Indo-European proper the Balkans around 5000 BC, explicitly identified as the “Old European culture” proposed by Gimbutas.

Mallory–Adams (2007): “(…) in both the nineteenth century and then again in the later twentieth century, it was proposed that Indo-European expansions were associated with the spread of agriculture. The underlying assumption here is that only the expansion of a new more productive economy and attendant population expansion can explain the widespread expansion of a language family the size of the Indo-European. This theory is most closely associated with a model that derives the Indo-Europeans from Anatolia about the seventh millennium BC from whence they spread into south-eastern Europe and then across Europe in a Neolithic ‘wave of advance’.

(…) Although the difference between the Wave of Advance and Kurgan theories is quite marked, they both share the same explanation for the expansion of the Indo-Iranians in Asia (and there are no fundamental differences in either of their difficulties in explaining the Tocharians), i.e. the expansion of mobile pastoralists eastwards and then southwards into Iran and India. Moreover, there is recognition by supporters of the Neolithic theory that the ‘wave of advance’ did not reach the peripheries of Europe (central and western Mediterranean, Atlantic and northern Europe) but that these regions adopted agriculture from their neighbours rather than being replaced by them”.

On these new hypotheses, Adrados–Bernabé–Mendoza (1995-1998) discuss the importance given to each new, personal, ‘revolutionary’ archaeological theory: “[The hypothesis of Colin Renfrew (1987)] is based on ideas about the diffusion of agriculture from Asia to Europe in [the fifth millennium Neolithic Asia Minor], diffusion that would be united to that of Indo-Europeans; it doesn’t pay attention at all to linguistic data. The [hypothesis of Gamkrelidze–Ivanov (1980, etc.)], which places the Homeland in the contact zone between Caucasian and Semitic peoples, south of the Caucasus, is based on real or supposed lexical loans; it disregards morphological data altogether, too. Criticism of these ideas – to which people have paid too much attention – is found, among others, in Meid (1989), Villar (1991), etc.”

II. Another hypothesis, contrary to the European ones and also driven today mainly by nationalistic or religious views, traces the origin of PIE back to Vedic Sanskrit, postulating that it is very pure, and that the origin of common Proto-Indo-European can thus be traced back to the Indus Valley Civilisation of ca. 3000 BC.

NOTE. Pan-Sanskritism was common among early Indo-Europeanists, such as Schlegel, Young, A. Pictet (Les origines indoeuropéens, 1877) or Schmidt (who preferred Babylonia), but is now mainly supported by those who consider Sanskrit almost equal to Late Proto-Indo-European. For more on this, see S. Misra (The Aryan Problem: A Linguistic Approach, 1992) and Elst (Update on the Aryan Invasion Debate, 1999), followed up by S.G. Talageri (The Rigveda: A Historical Analysis, 2000), both part of the “Indigenous Indo-Aryan” viewpoint of N. Kazanas, the “Out of India” theory, with a framework dating back to the times of the Indus Valley Civilisation.

1.6. Relationship to Other Languages

1.6.1. Many higher-level relationships between PIE and other language families have been proposed, but these speculative connections are highly controversial. Perhaps the most widely accepted proposal is that of an Indo-Uralic family, encompassing PIE and Proto-Uralic, the language from which Hungarian, Finnish, Estonian, Saami and a number of other languages descend. The evidence usually cited in favour of this is the proximity of the proposed Urheimaten for both of them, the typological similarity between the two languages, and a number of apparent shared morphemes.

NOTE. Other proposals, further back in time (and correspondingly less accepted), model PIE as a branch of Indo-Uralic with a Caucasian substratum; link PIE and Uralic with Altaic and certain other families in Asia, such as Korean, Japanese, Chukotko-Kamchatkan and Eskimo-Aleut (representative proposals are Greenberg’s Eurasiatic and its proposed parent-language Nostratic); etc.

1.6.2. Indo-Uralic or Uralo-Indo-European is therefore a hypothetical language family consisting of Indo-European and Uralic (i.e. Finno-Ugric and Samoyedic). Most linguists still consider this theory speculative and its evidence insufficient to conclusively prove genetic affiliation.

NOTE. The difficulty with lexical evidence lies in weeding out borrowings: Uralic languages have been in contact with Indo-European languages for millennia, and have consequently borrowed many words from them.

Björn Collinder, author of the path-breaking Comparative Grammar of the Uralic Languages (1960), a standard work in the field of Uralic studies, argued for the kinship of Uralic and Indo-European (1934, 1954, 1965).

The most extensive attempt to establish sound correspondences between Indo-European and Uralic to date is that of the late Slovenian linguist Bojan Čop. It was published as a series of articles in various academic journals from 1970 to 1989 under the collective title Indouralica. The topics to be covered by each article were sketched out at the beginning of “Indouralica II”. Of the projected 18 articles only 11 appeared. These articles have not been collected into a single volume and therefore remain difficult to access.

Dutch linguist Frederik Kortlandt supports a model of Indo-Uralic in which its speakers lived north of the Caspian Sea. In this model, Proto-Indo-Europeans began as a group that branched off westward from there and came into geographic proximity with the Northwest Caucasian languages, absorbing a Northwest Caucasian lexical admixture before moving farther westward to a region north of the Black Sea, where their language settled into canonical Proto-Indo-European.

1.6.3. The most common arguments in favour of a relationship between PIH and Uralic are based on seemingly common elements of morphology, such as:




| Meaning | PIH | Proto-Uralic |
|---|---|---|
| ‘I, me’ | *me ‘me’ (Acc.), *mene ‘my’ (Gen.) | *mun, *mina ‘I’ |
| ‘you’ (sg) | *tu (Nom.), *twe (Acc.), *tewe ‘your’ (Gen.) | *tun, *tina |
| 1st P. singular | | |
| 1st P. plural | | |
| 2nd P. singular | *-s (active), *-tHa (perfect) | |
| 2nd P. plural | | |
| | *so ‘this, he/she’ (animate nom) | *ša (3rd person singular) |
| Interr. pron. (An.) | *kʷi- ‘who?, what?’; *kʷo- ‘who?, what?’ | *ken ‘who?’, *ku- ‘who?’ |
| Relative pronoun | | *-ja (nomen agentis) |
| Nom./Acc. plural | *-es (Nom. pl.), *-m̥-s (Acc. pl.) | |
| Oblique plural | *-i (pronomin. pl., cf. *we-i- ‘we’, *to-i- ‘those’) | |
| | *-s- (aorist); *-es-, *-t (stative substantive) | |
| Negative particle | *nei, *ne | *ei- [negative verb], *ne |
| ‘to give’ | | |
| ‘to wet’, ‘water’ | *wed- ‘to wet’, *wodr̥- ‘water’ | *weti ‘water’ |
| | *mesg- ‘dip under water, dive’ | *muśke- ‘wash’ |
| ‘to assign’, ‘name’ | *nem- ‘to assign, to allot’, *h₁nomn̥- ‘name’ | *nimi ‘name’ |
| | *h₂weseh₂- ‘gold’ | *waśke ‘some metal’ |
| | *mei- ‘exchange’ | *miHe- ‘give, sell’ |
| | *(s)kʷalo- ‘large fish’ | *kala ‘fish’ |