In this newer edition of our Grammar, we follow the first intention of this work, trying not to include personal opinions, but a collection of the latest, most reasoned academic papers on the latest reconstructible PIE, providing everything that might be useful for the teaching and learning of Indo-European as a living language.

With that aim in mind, and with our commitment to follow the scientific method, we have revised the whole text in search of outdated material and unexplained forms, as well as inconsistencies in reconstructions or conventions. We have also restricted the number of marginal choices in favour of the general agreement, so that we could offer a clear, sober, and commonly agreed manual to learn Indo-European.

The approach featured in this book for more than half a decade already is similar to the one followed in Gamkrelidze–Ivanov (1994-1995), and especially to that followed by Adrados–Bernabé–Mendoza (1995-1998). Both returned to (and revised) the ‘Brugmannian’ Indo-European, the historical result of the development of certain isoglosses, both phonetic (loss of laryngeals, with the development of a short and long vowel system) and morphological (polythematic system in noun and verb, innovations in their inflection).

Adrados–Bernabé–Mendoza (1995-1998) distinguished between Late Indo-European and its parent-language Indo-Hittite – laryngeal, without distinction in vowel length, monothematic system. We developed that trend further, focussing on a post-Late Indo-European period, in search of a more certain, post-laryngeal IE, to avoid the merged laryngeal puzzle of the ‘disintegrating Indo-European’ of Bomhard (1984), and the conventional notation of a schwa indogermanicum (kept in Adrados–Bernabé–Mendoza), most suitable for a description of a complex period of phonetic change – which is possibly behind the flight of all other available modern works on PIE to the highly theoretical (but in all other respects clear and straightforward) PIH phonology. Morphology and syntax thus remain nearest to the older IE languages attested, always compared to Anatolian material, but avoiding the temporal inconsistencies that are found throughout the diachronic reconstructions in other, current manuals.

We try to fill the void that Gamkrelidze–Ivanov and Adrados–Bernabé–Mendoza left by following works (Lehmann 1972, Rix 1986, etc.) that already differentiated PIH from Late Indo-European, trying to “see the three-stage theory to the bitter end. Once established the existence of the three-staged IE, a lot must still be done. We have to define the detail, and we must explain the reason for the evolution, which formal elements does PIE deal with, and how they are ascribed to the new functions and categories. These developments shall influence the history of individual languages, which will have to be rewritten. Not only in the field of morphology, but also in phonetics and syntax” (Adrados–Bernabé–Mendoza 1995-1998).

Apart from a trustworthy reconstruction of the direct ancestors of the older IE languages (North-West Indo-European, Proto-Greek and Proto-Indo-Iranian), this work ‘corrupts’ the natural language – like any classical language grammar – with the intention to show a living language, and the need to establish some minimal writing conventions to embellish the phonetic notation. The question ‘why not learn Indo-European as a living language?’ arises from the very moment when reconstruction is focussed on a (scientifically) conservative approach – an ultimate consequence of the three-stage theory, and the search for more certain reconstructions –, yielding a reliable language system: one free from the need for theoretical artifices, or personal opinions on ‘original’ forms, that try to fill the unending phonetic, morphological and syntactical uncertainties of the current diachronic PIE reconstruction.

As the learned reader might have already inferred, the question of “natural” vs. “artificial” is not easily answered concerning ancient languages. Ancient Greek phonetics, for example, is known through internal as well as external reconstruction, and the current state of the art is largely based on the body of evidence discussed extensively by linguists and philologists of the nineteenth and twentieth centuries, with lots of questions unsolved. Furthermore, Ancient Greek is not one language; in fact, there are many dialects, each with different periods, and different representations of their sounds, all of which account for what we know with the unitary name Ancient Greek. Another example is Sanskrit, retained as different historical linguistic stages and dialects through oral tradition. Its first writings and grammatical rules were laid down centuries after it had ceased to be spoken, and centuries before it became the classical Indian language. Latin is indeed not different from the above examples, being systematised in the so-called classical period, while a real, dialectally and temporally variable Vulgar Latin was used by the different peoples who lived in the Roman Empire, so that e.g. some questions over the proper pronunciation are still debated today.

The interest in the study and use of Indo-European as a living language today is equivalent to the interest in the study and use of these ancient languages as learned languages in the Byzantine Empire, India and Mediaeval Europe, respectively. With regard to certainty in reconstruction, Late Indo-European early dialects are not less natural than these classical languages were in the past. Even modern languages, like English, are to a great extent learned languages, in which social trends and linguistic artifices are constantly dividing between formal and colloquial, educated and uneducated, often simply good or bad usage of the language.

On the question of ‘dead’ vs. ‘living’ languages, heated debate is held e.g. on the characterisation of Sanskrit, which is unlike other dead languages, being spoken, written and read today in India. The notion of the death of a language thus remains in an unclear realm between academia and public opinion.

I prefer to copy Michael Coulson’s words from the preface of a great introductory work on Sanskrit (from the Teach Yourself® series), referring originally to the way Indians used Sanskrit as a learned (and dead) language, far beyond the rules that grammarians had imposed. I think this text should also be valid if we substituted ‘Indo-European’ for ‘Sanskrit’; the ‘reconstruction’ of ‘IE scholars’ for the ‘rules’ of ‘Sanskrit grammarians’; and the ‘potential future IE writers’ for the ‘renowned Sanskrit writers’:

«By [the time Kālidāsa, a writer fl. ca. the fifth century AD, lived] Sanskrit was not a mother tongue, but a language to be studied and consciously mastered. This transformation had come about through a gradual process, the beginnings of which are no doubt earlier than Pāṇini [ancient Indian Sanskrit grammarian, fl. fourth century BC] himself. (…) Kālidāsa learnt his Sanskrit from the rules of a grammarian living some 700 years before his time. Such a situation may well strike the Western reader as paradoxical. Our nearest parallel is in the position of Latin in Medieval Europe. There is, however, an important difference. Few would deny Cicero or Vergil a greater importance in Latin literature than any mediaeval author. Conversely, few Sanskritists would deny that the centre of gravity in Sanskrit literature lies somewhere in the first millennium AD, for all that its authors were writing in a so-called ‘dead-language’.

On this point it may be useful to make a twofold distinction – between a living and a dead language, and between a natural and a learned one. A language is natural when it is acquired and used instinctively; it is living when people choose to converse and formulate ideas in it in preference to any other. To the modern Western scholar Sanskrit is a dead as well as a learned language. To Kālidāsa or Śaṅkara [ninth century Indian philosopher from a Dravidian-speaking region] it was a learned language but a living one. (The term ‘learned’ is not entirely satisfactory, but the term ‘artificial’, which is the obvious complementary of ‘natural’, is normally reserved for application to totally constructed languages such as Esperanto.)

(…) Living languages, whether natural or learned, change and develop. But when a learned language such as literary English is closely tied to, and constantly revitalized by, a natural idiom, its opportunities for independent growth are limited. Sanskrit provides a fascinating example of a language developing in complete freedom from such constraints as an instrument of intellectual and artistic expression. To say that Classical Sanskrit was written in conformity with Pāṇini’s rules is true, but in one sense entirely misleading. Pāṇini would have been astounded by the way in which Bāṇa or Bhavabhūti or Abhinavagupta handled the language. It is precisely the fact that Sanskrit writers insisted on using Sanskrit as a living and not as a dead language that has often troubled Western scholars. W. D. Whitney, a great but startlingly arrogant American Sanskritist of the nineteenth century, says of the Classical language: ‘Of linguistic history there is next to nothing in it all; but only a history of style, and this for the most part showing a gradual depravation, an increase of artificiality and an intensification of certain more undesirable features of the language – such as the use of passive constructions and of participles instead of verbs, and the substitution of compounds for sentences.’ Why such a use of passives, participles and compounds should be undesirable, let alone depraved, is left rather vague, and while there have been considerable advances in linguistic science in the past fifty years there seems to have been nothing which helps to clarify or justify these strictures. Indeed, Whitney’s words would not be worth resurrecting if strong echoes of them did not still survive in some quarters.

Acceptance of Pāṇini’s rules implied a final stabilization of the phonology of Sanskrit, and also (at least in the negative sense that no form could be used which was not sanctioned by him) of its morphology. But Pāṇini did not fix syntax. To do so explicitly and incontrovertibly would be difficult in any language, given several ways of expressing the same idea and various other ways of expressing closely similar ideas.»

Badajoz, April 2011


Guide to the Reader

A. Abbreviations

abl.: ablative

acc.: accusative

act.: active

adj.: adjective

adv.: adverb

Alb.: Albanian

Arm.: Armenian

aor.: aorist

aux.: auxiliary

Av.: Avestan

BSl.: Balto-Slavic

CA: Common Anatolian

Cel.: Celtic

cf.: confer ‘compare, contrast’

Cz.: Czech

dat.: dative

Du.: Dutch

e.g.: exempli gratia ‘for example’

Eng.: English

esp.: especially

f.: feminine

fem.: feminine

gen.: genitive

Gaul.: Gaulish

Gk.: Greek

Gmc.: Proto-Germanic

Goth.: Gothic

Hitt.: Hittite

Hom.: Homeric

IE: Indo-European

IED: Late Indo-European dialects

imp.: imperative

imperf.: imperfect

Ind.-Ira.: Indo-Iranian

ins.: instrumental

int.: interrogative

Ita.: Italic

Lat.: Latin

LIE: Late Indo-European

Lith.: Lithuanian

Ltv.: Latvian

loc.: locative

Luw./Luv.: Luvian

Lyc.: Lycian

m.: masculine

masc.: masculine

M.H.G.: Middle High German

mid.: middle-passive voice

MIE: Modern Indo-European

Myc.: Mycenaean

n.: neuter

neu.: neuter

nom.: nominative

NP: noun phrase

NWIE: North-West Indo-European

O: object

Obj.: object

O.Av.: Old Avestan

O.C.S.: Old Church Slavic

O.E.: Old English

O.Ind.: Old Indian

O.Ir.: Old Irish

O.H.G.: Old High German

O.Hitt.: Old Hittite

O.Lat.: Archaic Latin

O.Lith.: Old Lithuanian

O.N.: Old Norse

O.Pers.: Old Persian

O.Pruss.: Old Prussian

O.Russ.: Old Russian

opt.: optative

Osc.: Oscan

OSV: object-subject-verb order

OV: object-verb order

perf.: perfect

PAn: Proto-Anatolian

PGmc.: Pre-Proto-Germanic

PII: Proto-Indo-Iranian

PGk: Proto-Greek

Phryg: Phrygian

PIE: Proto-Indo-European

PIH: Proto-Indo-Hittite

pl.: plural

pres.: present

pron.: pronoun

Ptc.: particle

Russ.: Russian

sg.: singular

Skt.: Sanskrit

Sla.: Slavic

SOV: subject-object-verb order

subj.: subjunctive

SVO: subject-verb-object order

Toch.: Tocharian

Umb.: Umbrian

Ved.: Vedic

v.i.: vide infra ‘see below’

VO: verb-object order

voc.: vocative

VP: verb phrase

v.s.: vide supra ‘see above’

VSO: verb-subject-object order

1st: first person

2nd: second person

3rd: third person


B. Symbols


* denotes a reconstructed form, not preserved in any written documents

denotes a reconstructed form through internal reconstruction

< “comes from” or “is derived from”

> “turns into” or “becomes”

- indicates morpheme boundary, or separates off that part of a word that the reader should focus on

( ) encloses part of a word that is not relevant to the discussion, or that is an optional part

∅ “zero desinence” or “zero-grade”

denotes a wrong formation


C. Spelling Conventions

All linguistic forms are written in italics. The only exceptions are reconstructed IED forms, which are given in boldface, or in italics if they are morphemes or dialectal forms (from PII, PGk, or from East or West European). We use a non-phonetic writing for IEDs, following the conventions in Writing System (see below).

When representing word schemes:

C = consonant

R = resonant (r, l, m, n)

T = dental

K = occlusive

J = glide (j, w)

H = any laryngeal or merged laryngeal

V = vowel

V̄ = long vowel

I = i, u

° = epenthetic or auxiliary vowel

(conventionally, the symbol ° under the vocalic resonants is placed before them in these cases)

# = syllabic limit

Citation: parenthetical referencing of author-date is used for frequently cited books (referenced in the Bibliography), and author-title for articles and other books.


I owe special and personal gratitude to my best friend and now fiancée Mayte, whose many lovely qualities do not include knowledge of or an interest in historical linguistics. But without her this never would have been written.

I have been extremely fortunate to benefit from Fernando López-Menchero’s interest and from his innumerable contributions, revisions, and corrections. Without his deep knowledge of Ancient Greek and Latin, as well as his interest in the most recent research in IE studies, this grammar would have been unthinkable.

I have received the invaluable support of many colleagues and friends from the University of Extremadura (UEx), since we began publishing this book half a decade ago. The University has been crucial to this enterprise: first in 2005 when Prof. Antonio Muñoz PhD, Vice-Dean of the Faculty of Library Science, expert in Business Information, as well as other signatories – doctors in Economics and English Philology –, supported this language revival project before the competition committee and afterwards; in 2006, when representatives of the Dean’s office, of the Regional Government of Extremadura, and of the Mayor’s office of Caceres, recognised our work awarding our project a prize in the “Entrepreneurship Competition in Imagination Society”, organizing and subsidizing a business trip to Barcelona’s most innovative projects; and in 2007, when we received the unconditional support of the Department of Classical Antiquity of the UEx.

Over the years I have also received feedback from informed end-users, as well as from friends and members of the Indo-European Language Association, who were in the best position to judge such matters as the intelligibility and consistency of the whole. I am also indebted to Manuel Romero from Imcrea Diseño Editorial, for his help with the design and editorial management of the first printed edition.

The influence of the work of many recent scholars is evident on these pages. Those who are most often cited include (in alphabetical order): D.Q. Adams, F.R. Adrados, David Anthony, R.S.P. Beekes, Emile Benveniste, Alberto Bernabé, Thomas Burrow, George Cardona, James Clackson, B.W. Fortson, Matthias Fritz, T.V. Gamkrelidze, Marija Gimbutas, Eric Hamp, V.V. Ivanov, Jay Jasanoff, Paul Kiparsky, Alwin Kloekhorst, F.H.H. Kortlandt, Jerzy Kuryłowicz, W.P. Lehmann, J.P. Mallory, Manfred Mayrhofer, Wolfgang Meid, Michael Meier-Brügger, Torsten Meissner, Craig Melchert, Julia Mendoza, Anna Morpurgo Davies, Norbert Oettinger, Edgar Polomé, C.J. Ruijgh, Paolo Ramat, Donald Ringe, Helmut Rix, A.L. Sihler, Sergei Starostin, J.L. Szemerényi, Francisco Villar, Calvert Watkins, M.L. West.

Considerations of Method

This work is intended for language learners, and is not conceived as a defence of personal research. Excerpts of texts from many different sources have been copied literally, especially regarding controversial or untreated aspects. We feel that, whereas the field of Indo-European studies is indeed mature, and knowledge is out there to be grasped, we lack a comprehensive summary of the available consensual theories, scattered over innumerable specialised personal books and articles.

We must begin this work by clearly setting out our intended working method in selecting and summing up the current available theories: it is basically, as is commonly accepted today for PIE reconstruction, the comparative method, with the help of internal reconstruction.

NOTE. Adrados–Bernabé–Mendoza (1995-1998): “We think (…) that a linguist should follow, to establish relations among languages, linguistic methods. If then the results are coincident, or compatible, or might be perfected with those obtained by archaeologists, so much the better. But a mixed method creates all types of chain mistakes and arbitrary results. We have seen that many times. And a purely archaeological method like the one supported lately by Renfrew 1987 or, in certain moments, the same Gimbutas 1985, clashes with the results of Linguistics.

The method has to rely on [the comparative method and internal reconstruction]. We have already expressed our mistrust in the results based on typological comparisons with remote languages (glottalic theory, ergative, etc.). Now they are more frequent in books like Gamkrelidze–Ivanov 1994-1995.

And fundamentally lexical comparisons should not be the first argument in comparisons, either. We do not doubt their interest in certain moments, e.g. to illuminate the history of Germanic in relation with Finnish. And they could have interest in different comparisons: with Uralo-Altaic languages, Semitic, Caucasic, Sumerian, etc.”

The guidelines that should be followed, as summarised by Beekes (1995):

1.  “See what information is generated by internal reconstruction.

2. Collect all material that is relevant to the problem.

3. Try to look at the problem in the widest possible context, thus in relation to everything else that may be connected with it. (…)

4. Assume that corresponding forms, that is to say, forms whose meaning (probably) and whose structures (probably) seem to be alike, all derive from one common ancestor.

5.  The question of how deviant forms should be evaluated is a difficult one to answer. When such a form can be seen as an innovation within a particular language (or group of languages), the solution is that the form in question is young and as such cannot be important for the reconstruction of the original form. Whenever a deviant form resists explanation it becomes necessary to consider the possibility that the very form in question may be one that preserves the original. (…)

6. For every solution the assumed (new) sound-laws must be phonetically probable, and the analogies must be plausible.

7.  The reconstructed system must be probable (typological probability). If one should reconstruct a system which is found nowhere else in any of the known languages, there will always be, to say the least, reasons for doubt. On the other hand, every language is unique, and there is thus always the possibility that something entirely unknown must be reconstructed.”

There are two main aspects of the comparative method as it is usually applied that strike the ‘pure scientific’ reader, though, always obsessed with adopting a conservative approach to research, in the sense of security or reliability. We shall take words from Claude Bernard’s major discourse on scientific method, An Introduction to the Study of Experimental Medicine (1865), to illustrate our point:

1. Authority vs. Observation. It is through observation that science is carried forward — not through uncritically accepting the authority of academic or scholastic sources. Observable reality is our only authority. “When we meet a fact which contradicts a prevailing theory, we must accept the fact and abandon the theory, even when the theory is supported by great names and generally accepted”.

NOTE. Authority is certainly a commonly used, strong and generally sound basis to keep working on comparative grammar, though, because this is a field based on ‘pyramidal’ reasoning and not experimental research. But authority should be questioned whenever it is needed. Authority – be it the view of the majority, or the opinion of a renowned linguist or linguistic school – does not mean anything in itself, and ideas are not to be respected because of who supports (or supported) them.

2. Verification and Disproof. “Theories are only hypotheses, verified by more or less numerous facts. Those verified by the most facts are the best, but even then they are never final, never to be absolutely believed”. What is rationally true is the only authority.

In scientific hypothesis testing, decisions are usually made using a statistical null-hypothesis test approach. In linguistics and its comparative method, authority is sometimes placed as the null hypothesis or H0 (as in many non-experimental sciences), while counter-arguments must take the H1 position, and are therefore at a disadvantage against the authority view.

If two theories show a strong argument against the basic H0 (“nothing demonstrated”), and are therefore accepted as alternative explanations for an observed fact, then the most reasonable one must be selected as the new H0, on the grounds of the lex parsimoniae (or the so-called Ockham’s razor), whereby H0 should be the competing hypothesis that makes the fewest new assumptions, when the hypotheses are equal in other respects (e.g. both sufficiently explain available data in the first place).

NOTE. The principle is often incorrectly summarised as “the simplest explanation is most likely the correct one”. This summary is misleading, however, since the principle is actually focussed on shifting the burden of proof in discussions. That is, the Razor is a principle that suggests we should tend towards simpler theories until we can trade some simplicity for increased explanatory power. Contrary to the popular summary, the simplest available theory is sometimes a less accurate explanation. Philosophers also add that the exact meaning of “simplest” can be nuanced in the first place.
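
The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Hypothesis, explains_data, new_assumptions and select_null_hypothesis are hypothetical, the candidate theories and their assumption counts are invented placeholders rather than real linguistic proposals, and ‘fewest new assumptions’ is reduced to a simple integer count, which real cases rarely allow.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains_data: bool   # does it sufficiently explain the available data?
    new_assumptions: int  # illustrative count of new assumptions it requires


def select_null_hypothesis(candidates: Iterable[Hypothesis]) -> Optional[Hypothesis]:
    """Pick the new H0: among hypotheses that explain the data,
    the one making the fewest new assumptions (lex parsimoniae)."""
    adequate = [h for h in candidates if h.explains_data]
    if not adequate:
        return None  # nothing demonstrated: the basic H0 stands
    return min(adequate, key=lambda h: h.new_assumptions)


# Placeholder candidates; the counts are invented for the example.
candidates = [
    Hypothesis("theory A", explains_data=True, new_assumptions=3),
    Hypothesis("theory B", explains_data=True, new_assumptions=2),
    Hypothesis("theory C", explains_data=False, new_assumptions=0),
]

chosen = select_null_hypothesis(candidates)
print(chosen.name if chosen else "no adequate hypothesis")
```

The sketch does not claim that the selected hypothesis is true; it only decides which competitor carries the burden of proof until more explanatory power is offered in exchange for simplicity.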

As an example of the applicability of the scientific method, we will take two difficult aspects of PIE reconstructions: the series of velars and the loss of laryngeals.

The problem with these particular reconstructions might be summarised by the words found in Clackson (2007): “It is often a fault of Indo-Europeanists to over-reconstruct, and to explain every development of the daughter languages through reconstruction of a richer system in the parent language.”

The Three-Dorsal Theory

PIE phonetic reconstruction is tied to the past: acceptance of three series of velars in PIE is still widespread today. We formerly followed the reconstruction of ‘palatovelars’, according to general authority and convention, but we have changed our minds since the first edition of this grammar.

Direct comparison in early IE studies, informed by the centum-satem isogloss, yielded the reconstruction of three rows of dorsal consonants in Late Indo-European by Bezzenberger (Die indogermanischen Gutturalreihen, 1890), a theory which became classic after Brugmann included it in the 2nd Edition of his Grundriss. It was based on vocabulary comparison: thus, e.g., from PIE *km̥tóm ‘hundred’, there are so-called satem (cf. O.Ind. śatám, Av. satəm, Lith. šimtas, O.C.S. sto) and centum languages (cf. Gk. -katón, Lat. centum, Goth. hund, O.Ir. cet).

The palatovelars *kj, *gj, and *gjh were supposedly [k]- or [g]-like sounds which underwent a characteristic phonetic change in the satemised languages – three original “velar rows” had then become two in all Indo-European dialects attested. According to that original belief, then, the centum group of languages merged the palatovelars *kj, *gj, and *gjh with the plain velars *k, *g, and *gh, while the satem group of languages merged the labiovelars *kw, *gw, and *gwh with the plain velars *k, *g, and *gh.
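
The two mergers of this traditional account can be written out as a small Python sketch. It models only the three-series view summarised above, not the reconstruction defended in this grammar; the flat spellings (kj, kw, etc.) follow the text, while the names PALATOVELARS, PLAIN_VELARS, LABIOVELARS, ALL_DORSALS and merge are our own illustrative choices.

```python
# Three dorsal series of the traditional reconstruction, in flat notation:
# "kj" = palatovelar, "k" = plain velar, "kw" = labiovelar.
PALATOVELARS = {"kj", "gj", "gjh"}
PLAIN_VELARS = {"k", "g", "gh"}
LABIOVELARS = {"kw", "gw", "gwh"}
ALL_DORSALS = PALATOVELARS | PLAIN_VELARS | LABIOVELARS


def to_plain(stop: str) -> str:
    """Strip the secondary articulation: kj -> k, gjh -> gh, gwh -> gh."""
    return stop.replace("j", "").replace("w", "")


def merge(stop: str, group: str) -> str:
    """Apply the merger assumed for a dialect group ('centum' or 'satem')."""
    if group not in ("centum", "satem"):
        raise ValueError(f"unknown dialect group: {group}")
    if group == "centum" and stop in PALATOVELARS:
        return to_plain(stop)  # palatovelars fall together with plain velars
    if group == "satem" and stop in LABIOVELARS:
        return to_plain(stop)  # labiovelars fall together with plain velars
    return stop


for group in ("centum", "satem"):
    outcome = sorted({merge(stop, group) for stop in ALL_DORSALS})
    print(group, outcome)
```

Either merger leaves six distinct stops, i.e. two rows of three, which is how the traditional account explains that no attested dialect shows all three series.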

The reasoning for reconstructing three series was very simple: an easy and straightforward solution for the parent PIE language must be that it had all three rows found in the proto-languages, which would have merged into two rows depending on their dialectal (centum vs. satem) situation – even if no single IE dialect shows three series of velars. Also, for a long time this division was identified with an old dialectal division within IE, especially because both groups appeared not to overlap geographically: the centum branches were to the west of satem languages. Such an initial answer should be considered unsound today, at least as a starting-point to obtain a better explanation for this ‘phonological puzzle’ (Bernabé).

Many Indo-Europeanists still maintain three distinct series of velars for Late Indo-European (and also for Indo-Hittite), although research tends to show that the palatovelar series was a late phonetic development of certain satem dialects, later extended to others. This view was originally formulated by Antoine Meillet (De quelques difficultés de la théorie des gutturales indo-européennes, 1893), and has been followed by linguists like Hirt (Zur Lösung der Gutturalfrage im Indogermanischen, 1899; Indogermanische Grammatik, Bd. III, Das Nomen 1927), Lehmann (Proto-Indo-European Phonology, 1952), Georgiev (Introduzione allo studio delle lingue indoeuropee, 1966), Bernabé (“Aportaciones al estudio fonológico de las guturales indoeuropeas”, Em. 39, 1971), Steensland (Die Distribution der urindogermanischen sogenannten Gutturale, 1973), Miller (“Pure velars and palatals in Indo-European: a rejoinder to Magnusson”, Linguistics 178, 1976), Allen (“The PIE velar series: Neogrammarian and other solutions in the light of attested parallels”, TPhS, 1978), Kortlandt (“H2 and oH2”, LPosn, 1980), Shields (“A new look at the centum/satem Isogloss”, KZ 95, 1981), etc.

NOTE. There is a general trend to reconstruct labiovelars and plain velars, so that the hypothesis of two series of velars is usually identified with this theory. Among those who support two series of velars there is, however, a minority who consider the labiovelars a secondary development from the pure velars, and reconstruct only velars and palatovelars (Kuryłowicz), already criticised by Bernabé, Steensland, Miller and Allen. Still less acceptance had the proposal to reconstruct only a labiovelar and a palatal series (Magnusson).

Arguments in favour of only two series of velars include:

1. In most circumstances palatovelars appear to be allophones resulting from the neutralisation of the other two series in particular phonetic circumstances. Their dialectal articulation was probably constrained either to a special phonetic environment (like the Romance evolution of Latin k before e and i) or to the analogy of alternating phonetic forms.

NOTE. However, it is difficult to pinpoint exactly what the circumstances of the allophony are, although it is generally accepted that neutralisation occurred after s and u, and often before r or a; also apparently before m and n in some Baltic dialects. The original allophonic distinction was disturbed when the labiovelars were merged with the plain velars. This produced a new phonemic distinction between palatal and plain velars, with an unpredictable alternation between palatal and plain in related forms of some roots (those from original plain velars) but not others (those from original labiovelars). Subsequent analogical processes generalised either the plain or palatal consonant in all forms of a particular root. Those roots where the plain consonant was generalised are those traditionally reconstructed as having plain velars in the parent language, in contrast to palatovelars.

2. The reconstructed palatovelars and plain velars appear mostly in complementary distribution, which supports their explanation as allophones of the same phonemes. Meillet (Introduction à l’étude comparative des langues indo-européennes, 1903) established the contexts in which there are only velars: before a, r, and after s, u; while Georgiev (1966) clarified that the palatalisation of velars had been produced before e, i, j, and before liquid or nasal or w + e, i, offering statistical data supporting his conclusions. The presence of a palatalised velar before o is then due to analogy with roots in which (due to the ablaut) the velar phoneme is found before e and o, so the alternation *kje/*ko would be levelled as *kje/*kjo.

3. There is residual evidence of various sorts in satem languages of a former distinction between velar and labiovelar consonants:

·In Sanskrit and Balto-Slavic, in some environments, resonants become iR after plain velars but uR after labiovelars.

·In Armenian, some linguists assert that kw is distinguishable from k before front vowels.

·In Albanian, some linguists assert that kw and gw are distinguishable from k and g before front vowels.

NOTE. This evidence shows that the labiovelar series was distinct from the plain velar series in LIE, and could not have been a secondary development in the centum languages. However, it says nothing about the palatovelar vs. plain velar series. When this debate initially arose, the concept of a phoneme and its historical emergence was not clearly understood, however, and as a result it was often claimed (and sometimes is still claimed) that evidence of three-way velar distinction in the history of a particular IE language indicates that this distinction must be reconstructed for the parent language. This is theoretically unsound, as it overlooks the possibility of a secondary origin for a distinction.

4. The palatovelar hypothesis presupposes an evolution kj > k in the centum dialects, i.e. a move of palatovelars to back consonants, which clearly runs against the general tendency of velars to move their articulation forward and palatalise in these environments. A trend of this kind is unparalleled and therefore typologically a priori unlikely (although not impossible), and requires further assumptions.

5. The plain velar series is statistically rarer than the other two in a PIE lexicon reconstructed with three series; it is entirely absent from affixes, and most of the words in which it appears are of a phonetic shape that could have inhibited palatalisation.

NOTE. Common examples are:

o *yug-óm ‘yoke’: Hitt. iukan, Gk. zdugón, Skt. yugá-, Lat. iugum, O.C.S. igo, Goth. juk.

o *ghosti- ‘guest, stranger’: Lat. hostis, Goth. gasts, O.C.S. gostĭ.

“The paradigm of the word for ‘yoke’ could have shown a palatalizing environment only in the vocative *yug-e, which is unlikely ever to have been in common usage, and the word for ‘stranger’ ghosti- only ever appears with the vocalism o”. (Clackson 2007).

6. Alternations between plain velars and palatals are common in a number of roots across different satem languages, where the same root appears with a palatal in some languages but a plain velar in others.

NOTE. This is consistent with the analogical generalisation of one or another consonant in an originally alternating paradigm, but difficult to explain otherwise:

o *ak-/ok- ‘sharp’, cf.  Lith. akúotas, O.C.S. ostrŭ, O.Ind. asrís, Arm. aseln, but Lith. asrùs.

o *akmon- ‘stone’, cf.  Lith. akmuõ, O.C.S. kamy, O.Ind. áśma, but Lith. âsmens.

o *keu- ‘shine’, cf. Lith. kiáune, Russ. kuna, O.Ind. svas, Arm. sukh.

o *bhleg- ‘shine’, cf. O.Ind.  bhárgas, Lith. balgans, O.C.S. blagŭ, but Ltv. blâzt.

o *gherdh- ‘enclose’, O.Ind. ghá, Av. gərəda, Lith. gardas, O.C.S. gradu, Lith. zardas, Ltv. zârdas.

o *swekros ‘father-in-law’, cf. O.Sla. svekry, O.Ind. śvaśru.

o *peku- ‘stock animal’; cf. O.Lith. pkus, Skt. paśu-, Av. pasu-.

o *kleus- ‘hear’; cf. Skt. śrus, O.C.S. slušatĭ, Lith. kláusiu.

A rather weak argument in favour of palatovelars rejecting these finds is found in Clackson (2007): “Such forms could be taken to reflect the fact that Baltic is geographically peripheral to the satem languages and consequently did not participate in the palatalization to the same degree as other languages”.

7. There are different pairs of satemised and non-satemised velars found within the same language.

NOTE. The old argument proposed by Brugmann (and later copied by many dictionaries) about “centum loans” is not tenable today. For more on this, see Szemerényi (1978, review from Adrados–Bernabé–Mendoza 1995-1998), Mayrhofer (“Das Gutturalproblem und das indogermanische Wort für Hase”, Studien zur indogermanischen Grundsprache, 1952), Bernabé (1971). Examples include:

o *selg-  ‘throw’, cf. O.Ind. sjáti, sargas

o *kau/keu- ‘shout’, cf. Lith. kaukti, O.C.S. kujati, Russ. sova (as Gk. kauax); O.Ind. kauti, suka-.

o *kleu- ‘hear’, Lith. klausýti, slove, O.C.S. slovo;  O.Ind. karnas, sruti,  srósati, śrnóti, sravas.

o *leuk-, O.Ind. rokás, ruśant-.

8. The number and periods of satemisation trends reconstructed for the different branches are not coincident.

NOTE. So for example Old Indian shows two stages,

o PIE *k → O.Ind. s

o PIE *kwe, *kwi → O.Ind. ke, ki; PIE *ske, *ski → O.Ind. c (cf. cim, candra, etc.)

In Slavic, three stages are found,

o PIE *k → s

o PIE *kwe, *kwi → č (čto, čelovek)

o PIE *kwoi→*koi→*ke gives ts (as Sla. tsená)

9. In most attested languages which present aspirates as a result of the so-called palatovelars, the palatalisation of other phonemes is also attested (e.g. palatalisation of labiovelars before e, i), which may indicate that there is an old trend to palatalise all possible sounds, of which the palatalisation of velars is the oldest attested result.

NOTE. It is generally believed that satemisation could have started as a late dialectal ‘wave’, which eventually affected almost all PIE dialectal groups. The origin is probably to be found in velars followed by e, i, even though alternating forms like *gen/gon caused natural analogical corrections within each dialect, which obscure the original situation still further. Thus, non-satemised forms in so-called satem languages would be non-satemised remains of the original situation, just as Spanish has feliz and not ˟heliz, or fácil and not ˟hácil, or French facile and nature, and not ˟fêle or ˟nûre as one should expect from its phonetic evolution.

10. The existence of satem languages like Armenian in the Balkans, a centum territory, and the presence of Tocharian, a centum dialect, in Central Asia, being probably a northern IE dialect.

NOTE. The traditional explanation of a three-way dorsal split requires that all centum languages share a common innovation that eliminated the palatovelar series, due to the a priori unlikely move of palatovelars to back consonants (see above). Unlike for the satem languages, however, there is no evidence of any areal connection among the centum languages, and in fact there is evidence against such a connection – the centum languages are geographically noncontiguous. Furthermore, if such an areal innovation happened, we would expect to see some dialect differences in its implementation (cf. the above differences between Balto-Slavic and Indo-Iranian), and residual evidence of a distinct palatalised series. In fact, however, neither type of evidence exists, suggesting that there was never a palatovelar series in the centum languages. (Evidence does exist for a distinct labiovelar series in the satem languages, though; see above.)

11. A system of two gutturals, velars and labiovelars, is a linguistic anomaly, isolated in the IE occlusive subsystem – there are no parallel oppositions bw-b, pw-p, tw-t, dw-d, etc. Only one feature, their pronunciation with an accompanying rounding of the lips, helps distinguish them from each other. Such a system has been attested in some older IE languages. A system of three gutturals – palatovelars, velars and labiovelars –, with a threefold distinction isolated in the occlusive system, is still less likely.

NOTE. In the two-dorsal system, labiovelars turn into velars before -u, and there are some neutralisation positions which help identify labiovelars and velars; also, in some contexts (e.g. before -i, -e) velars tend to move their articulation forward and eventually palatalise. Both trends led eventually to centum and satem dialectalisation.

Those who support the model of the threefold distinction in PIE cite evidence from Albanian (Pedersen) and Armenian (Pisani), which seem to treat plain velars differently from the labiovelars in at least some circumstances, as well as the fact that Luwian could have had distinct reflexes of all three series.

NOTE 1. It is disputed whether Albanian shows remains of two or three series (cf. Ölberg “Zwei oder drei Gutturalreihen? Vom Albanischen aus gesehen” Scritti…Bonfante 1976; Kortlandt 1980; Pänzer “Ist das Französische eine Satem-Sprache? Zu den Palatalisierungen im Ur-Indogermanischen und in den indogermanischen Einzelsprachen”, Festschrift für J. Hübschmidt, 1982), although it is indeed very unlikely that the worst attested and one of the most recently attested IE dialects (neither isolated nor remote) should be the only one to show some remains of the oldest phonetic system. Clackson (2007), supporting the three series: “Albanian and Armenian are sometimes brought forward as examples of the maintenance of three separate dorsal series. However, Albanian and Armenian are both satem languages, and, since the *kj series has been palatalised in both, the existence of three separate series need not disprove the two-dorsal theory for PIE; they might merely show a failure to merge the unpalatalised velars with the original labio-velars.”

NOTE 2. Supporters of the palatovelars cite evidence from Luwian, an Anatolian language, which supposedly shows a three-way velar distinction *kj → z (probably [ts]); *k → k; *kw → ku (probably [kw]), as defended by Melchert (“Reflexes of *h3 in Anatolian”, Sprache 38 1987). So, the strongest argument in favour of the traditional three-way system is that the distinction supposedly derived from Luwian findings must be reconstructed for the parent language. However, the underlying evidence “hinges upon especially difficult or vague or otherwise dubious etymologies” (see Sihler 1995); and, even if those findings are supported by other evidence in the future, it is obvious that Luwian might also have been in contact with satemisation trends of other Late IE dialects, that it might have developed its own satemisation trend, or that maybe the whole system was remade within the Anatolian branch. Clackson (2007), supporting the three series, states: “This is strong independent evidence for three separate dorsal series, but the number of examples in support of the change is small, and we still have a far from perfect understanding of many aspects of Anatolian historical phonology.”

Also, one of the most difficult problems which subsists in the interpretation of the satemisation as a phonetic wave is that, even though in most cases the variation *kj/k may be attributed either to a phonetic environment or to the analogy of alternating apophonic forms, there are some cases in which neither one nor the other may be applied, i.e. it is possible to find words with velars in the same environments as words with palatals.

NOTE. Compare for example *okj(u), eight, which presents k before an occlusive in a form which shows no change (to suppose a syncope of an older *okjitō, as does Szemerényi, is an explanation ad hoc). Other examples in which the palatalisation cannot be explained by the next phoneme nor by analogy are *swekru- ‘husband’s mother’, *akmōn ‘stone’, *peku ‘cattle’, which are among those not shared by all satem languages. Such unexplained exceptions, however, are not sufficient to consider the existence of a third row of ‘later palatalised’ velars (see Bernabé 1971; Cheng & Wang “Sound change: actuation and implementation”, Lg. 51, 1975), although there are still scholars who come back to the support of the hypothesis of three velars. So e.g. Tischler 1990 (reviewed in Meier-Brügger 2003): “The centum-satem isogloss is not to be equated with a division of Indo-European, but rather represents simply one isogloss among many…examples of ‘centum-like aspects’ in satem languages and of ‘satem-like aspects’ in centum languages that may be evaluated as relics of the original three-part plosive system, which otherwise was reduced everywhere to a two-part system.”

Newer trends to support the old assumptions include e.g. Huld (1997, reviewed in Clackson 2007), in which the old palatal *kj is reconstructed as a true velar, and *k as a uvular stop, so that the problem of the a priori unlikely and unparalleled merger of palatal with velar in centum languages is theoretically solved.

As is clear from the development of the dorsal reconstruction, the theory that made the fewest assumptions was that an original Proto-Indo-European had two series of velars. These facts should therefore have shifted the burden of proof, already by the time Meillet (1893) rejected the proposal of three series; but the authority of Neogrammarians and well-established works of the last century, as well as traditional conventions, probably weighed (and still weigh) more than reasons.

NOTE. More than half a century ago there was already a similar opinion on the most reasonable reconstruction, which is still not followed today, as the Sanskritist Burrow (1955) shows: “The difficulty that arises from postulating a third series in the parent language, is that no more than two series (…) are found in any of the existing languages. In view of this it is exceedingly doubtful whether three distinct series existed in Indo-European. The assumption of the third series has been a convenience for the theoreticians, but it is unlikely to correspond to historical fact. Furthermore, on examination, this assumption does not turn out to be as convenient as would be wished. While it accounts in a way for correspondences like the above which otherwise would appear irregular, it still leaves over a considerable number of forms in the satem-languages which do not fit into the framework (…) Examples of this kind are particularly common in the Balto-Slavonic languages (…). Clearly a theory which leaves almost as many irregularities as it clears away is not very soundly established, and since these cases have to be explained as examples of dialect mixture in early Indo-European, it would appear simplest to apply the same theory to the rest. The case for this is particularly strong when we remember that when false etymologies are removed, when allowance is made for suffix alternation, and when the possibility of loss of labialization in the vicinity of the vowel u is considered (e.g. kraví-, ugrá-), not many examples remain for the foundation of the theory.”

Of course, we cannot (and probably never will) actually know if there were two or three series of velars in LIE, or PIH, and because of that the comparative method should be preferred over gut intuition, historical authority, or convention, which are obstacles to progress in a dynamic field like IE studies.

As Adrados (2005) puts it with bitterness: “Indo-Europeanists keep working on a unitary and flat PIE, that of Brugmann’s reconstruction. A reconstruction prior to the decipherment of Hittite and the study of Anatolian! This is but another proof of the terrible conservatism that has seized the scientific discipline that is or must be Indo-European linguistics: it moves forward in the study of individual languages, but the general theory is paralysed”.

The Loss of Laryngeals

Today, the reconstruction of consonantal sounds to explain what was previously reconstructed as an uncertain vocalic schwa indogermanicum or schwa primum is firmly accepted in IE studies in general, and there is general agreement on where laryngeals should be reconstructed. Even the number and quality of those laryngeals is today a matter of common agreement, although alternative numbers of laryngeals and proposals for their actual phonemic value do exist.

However, as Clackson (2007) sums up: “Particularly puzzling is the paradox that laryngeals are lost nearly everywhere, in ways that are strikingly similar, yet apparently unique to each language branch. We can of course assume some common developments already within PIE, such as the effect of the laryngeals *h2 and *h3 to change a neighbouring *e to *a or *o, but the actual loss of laryngeals must be assumed to have taken place separately after the break-up of the parent language (…) it would have seemed a plausible assumption that the retention of *h2, and possibly also *h1 and *h3, is an archaism of Anatolian, and the loss of the laryngeals was made in common by the other languages.”

In the vocalic inventory of current Late Indo-European reconstruction, the following evolution paradigm is widespread, following Beekes (1995), Meier-Brügger (2003) and Ringe (2005):

[Table: evolution of the laryngeals from PIH through pre-LIE and post-LIE to the dialectal outputs]

NOTE 1. A differentiation between early or pre-LIE and late or post-LIE has to be made. An auxiliary vowel was first inserted in the evolution PIH → pre-LIE in a certain position, known because it is found in all dialects alike: *Ch1C → *Ch1°C, *Ch2C → *Ch2°C, *Ch3C → *Ch3°C. By post-LIE we assume a period of a Northern-Southern dialectal division and Southern dialectal split, in which the whole community remains still in contact, allowing the spread of innovations like a generalised vocalisation of the auxiliary vowel (during the first migrations in the Kurgan framework, the assumed end of the LIE community). During that period, the evolution pre-LIE → post-LIE would have been as follows: *Ch1°C → *Ch1əC → *CHəC → *CəC. That evolution reached IEDs differently: whereas in South-West IE (Greek, Armenian, Phrygian, Ancient Macedonian) the pre-LIE laryngeal probably colourised the vocalic output from *Ch1əC as in the general scheme (into e, a, o), in NWIE and PII the late LIE *ə from *CəC was assimilated to another vowel: generally to a in NWIE, and to i in PII. Word-initially, only South-West IE dialects appear to have had an output *H° → *Hə → e, a, o, while the other dialects lost them (*H → ∅).
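
The chain of stages described in NOTE 1 can be read as a set of ordered rewrite rules. The following is only a minimal sketch under stated assumptions: it works on abstract templates (C for any consonant, h1/h2/h3 for the PIH laryngeals, H for the merged laryngeal, ° for the auxiliary vowel), and the names RULES, COLOUR, evolve and dialect_output, as well as the label "SWIE" for South-West IE, are hypothetical conveniences, not part of any established notation.

```python
import re

# Ordered rewrite rules over abstract templates. "C" = any consonant,
# "h1/h2/h3" = PIH laryngeals, "H" = merged laryngeal, "°" = auxiliary
# vowel, "ə" = schwa. Rule list and names are illustrative assumptions.
RULES = [
    ("pre-LIE: auxiliary vowel inserted", r"(?<=C)(h[123])(?=C)", r"\1°"),
    ("post-LIE: auxiliary vowel vocalised", r"(h[123])°", r"\1ə"),
    ("post-LIE: laryngeals merge", r"h[123](?=ə)", "H"),
    ("post-LIE: merged laryngeal lost", r"H(?=ə)", ""),
]

# Vowel colouring assumed for South-West IE (h1 → e, h2 → a, h3 → o).
COLOUR = {"1": "e", "2": "a", "3": "o"}


def evolve(form):
    """Return the successive stages, starting with the PIH template."""
    stages = [form]
    for _label, pattern, repl in RULES:
        stages.append(re.sub(pattern, repl, stages[-1]))
    return stages


def dialect_output(form, dialect):
    """Return the dialectal vowel output for 'SWIE', 'NWIE' or 'PII'."""
    stages = evolve(form)
    if dialect == "SWIE":
        # Colouring applies while the individual laryngeal is still distinct.
        return re.sub(r"h([123])ə", lambda m: COLOUR[m.group(1)], stages[2])
    vowel = {"NWIE": "a", "PII": "i"}[dialect]
    return stages[-1].replace("ə", vowel)


print(" → ".join(evolve("Ch2C")))  # Ch2C → Ch2°C → Ch2əC → CHəC → CəC
```

With these assumptions, dialect_output("Ch3C", "SWIE") gives CoC, while the same template gives CaC for NWIE and CiC for PII. The sketch covers only the interconsonantal environment; the word-initial and resonant environments discussed in the text are left out.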

NOTE 2. The following developments should also be added: 

- In South-West IE there are no cases of known *Hj- → *Vj-. It has been assumed that this group produced in Greek a z.

- It seems that some evidence of word-initial laryngeals comes from Indo-Iranian, where some compound words show lengthening of the final vowel before a root presumed to have had an initial laryngeal.

- The *-ih2 group in auslaut had an alternative form *-j°h2, LIE *-ī/-jə, which could produce IED -ī, -ja (alternating forms are found even within the same dialect).

- Apparently a reflex of consonantal laryngeals is found between nonhigh vowels as hiatuses (or glottal stops) in the oldest Indo-Iranian languages – Vedic Sanskrit and Old Avestan – as well as in Homeric Greek (Lindeman Introduction to the ‘Laryngeal Theory’, 1987). For a discussion on its remains in Proto-Germanic, see Connolly (“‘Grammatischer Wechsel’ and the laryngeal theory”, IF 85 1980).

- Contentious is also the so-called Osthoff’s Law (which affected all IE branches but for Tocharian and Indo-Iranian), which possibly shows a general trend of post-LIE date.

- When *H is in a post-plosive, prevocalic position, the consonantal nature of the laryngeal values is further shown: *CHVC → *ChVC; that is more frequent in PII, cf. *pl̥th2ú- → Ved. pr̥thú-; it appears also in the perfect endings, cf. Gk. oistha.

- The group *CR̥HC is explained differently for the individual dialects without a general paradigm; so e.g. Beekes (1995) or Meier-Brügger (2003) distinguish the different dialectal outputs as: Tocharian (*r̥HC→*r°HC), Germanic (*r̥H→*r̥) and to some extent Balto-Slavic (distinction by accentuation), Italo-Celtic (*r̥H→*r°H), while in Greek the laryngeal determined the vowel: e.g. *r̥h1→*r̥°h1→*r̥eH.

There are multiple examples which do not fit in any dialectal scheme, though; changes of outputs from PIH reconstructed forms with resonants are found even within the same dialects. The  explanation in Adrados–Bernabé–Mendoza (1995-1998) is probably nearer to the actual situation, in going back to the pronunciation of the common (pre-LIE) group: “the different solutions in this case depend solely on two factors: a) if there are one or two auxiliary vowels to facilitate the pronunciation of this group; b) the place where they appear.” So e.g. a group *CR̥HC could be pronounced in LIE with one vowel (*CR°HC or *C°RHC) or with two (*C°R°HC,  *C°RH°C, or *CR°H°C). That solution accounts for all LIE variants found in the different branches, and within them.

- The laryngeal of *RHC- in anlaut was vocalised in most languages, while the resonant was consonantal (*R̥HC- became *RVC-).

- In the group *CR̥HV, a vowel generally appears before the resonant and the laryngeal disappears; that vowel is usually coincident with the vocalic output that a resonant alone would usually give in the different dialects, so it can be assumed that generally *CR̥HV → C(V)R̥V, although exceptions can indeed be found. A common example of parallel treatment within the same dialect is Greek pros/paros < *pros/p°ros.

- Accounting for some irregularities in the outcome of laryngeals (especially with *-h2, but not limited to it) is the so-called “Saussure effect”, whereby LIE dialects do not show the usual reflexes of the inherited sequences #HRo- and -oRHC-. According to Nussbaum (Sound law and analogy: papers in honor of Robert S.P. Beekes on the occasion of his 60th birthday, Alexander Lubotsky, 1997), this effect “reflects something that happened, or failed to happen, already in the proto-language”.

Hence, for the moment, we could assume that South-East and South-West IE dialects were already separated, but still closely related through a common (Northern) IE core, because the loss (or, more exactly, the vocalic evolution) of laryngeals of Northern IE did in fact reach Graeco-Aryan dialects similarly and in a complementary distribution. That is supported by the modern linguistic Northern-Southern separation model (v.i. §§1.3, 1.4, 1.7): “(…) today it is thought that most innovations of Greek took place outside Greece; no doubt, within the Indo-Greek group, but in a moment in which certain eastern isoglosses didn’t reach it.” Adrados–Bernabé–Mendoza (1995-1998).

Apart from those fictions or artifices that help linguists keep on with their work on individual dialects from a secure starting point (conventional PIH phonetics), there is no reason to doubt that the most (scientifically) conservative starting point for PIE evolution is that LIE had lost most laryngeals but for one merged *H – of the “Disintegrating Indo-European” of Bomhard (Toward Proto-Nostratic: A New Approach to the Comparison of Proto-Indo-European and Proto-Afroasiatic, 1984) – into the known timeline and groupings, and that a late post-LIE vocalisation of interconsonantal *H into *Hə and later *ə did eventually substitute the original forms, albeit at a different pace, arriving probably somehow late and incompletely to the earliest dialects to split up, which completed independently the laryngeal loss.

Some individual finds seem to support a different treatment of laryngeals in certain dialects and environments, though.

NOTE. Examples are the contentious Cowgill’s Law (“such shortening is fairly common cross-linguistically, and the IE examples may have each arisen independently”, Fortson 2004), or other peculiar sound changes recently found in Latin and Balto-Slavic, all of them attested in late IE dialects that had already undergone different vocalic evolutions.

Meier-Brügger (2003) mentions 3 non-Anatolian testimonies of laryngeals:

1) Indo-Iranian: “the Vedic phrase devyètu, i.e. devì etu is best understandable if we suppose that dev ‘goddess’ still contained the laryngeal form *dewíH (with *-iH<*-ih2) at the time of the formulation of the verse in question. In the phase *-íH it was possible for the laryngeal simply to disappear before a vowel”. Another common example is *wr̥kiH. It is not justified, though, that it must represent a sort of unwritten laryngeal, and not an effect of it, i.e. a laryngeal hiatus or glottal stop, from older two-word sandhis that behave as a single compound word, see §2.4.3. It is also interesting that they are in fact from words alternating in pre-LIE *-iH/*-j°H (or post-LIE *-ī/*-jə) which according to Fortson (2004) reflect different syllabification in Indo-Iranian vs. Greek and Tocharian, whilst “[t]he source of the difference is not fully understood”. In line with this problem is that the expected case of *-aH stems is missing, which makes it less likely that Indo-Iranian examples come from a common hypothetic PII stage in which a word-final *-H had not yet disappeared, and more likely that it was a frozen remnant (probably of a glottal stop) in certain formal expressions. In fact, it has long been recognised that word-final laryngeals show a strong tendency to disappear (so e.g. in Hittite), and most of the time they appear associated with morphological elements (Adrados–Bernabé–Mendoza 1995-1998). They should then be considered – like the hiatuses or glottal stops found in Hom. Gk. and Germanic compositions – probable ancient reminiscences of a frozen formal language.

2) The sandhi variant in *-aH is found, according to Meier-Brügger (2003) and Ringe (2006), in Greek and Old Church Slavonic. In both “clear traces are missing that would confirm a PIE ablaut with full grade *-eh2- and zero grade *-h2- (…) That is why it appears as if the differentiation between the nominative and vocative singular in this case could be traced to sandhi-influenced double forms that were common at a time when the stems were still composed of *-ah2, and the contraction *-ah2- >*-ā- had not yet occurred”. Szemerényi (1999) among others already rejected it: “The shortening of the original IE ending -ā to -ă is regular, as the voc., if used at the beginning of a sentence or alone, was accented on the first syllable but was otherwise enclitic and unaccented; a derivation from -ah with the assumption of a prevocalic sandhi variant in -a fails therefore to explain the shortening.”

3) The last example given by Meier-Brügger is found in the unstable *CRHC model (see above), which is explained with PIE *gn̥h1-- ‘created, born’: so in Vedic jātá- < PII ģātó- < *ģaHtó- < *gjn̥h1-, which would mean that the laryngeal merged after the evolution LIE *n̥ → PII a. The other irregular dialectal reconstructions shown are easily explained following the model of epenthetic vowel plus merged laryngeal (or glottal stop?) in *gnəh1-; cf. for the same intermediate grade PGk gnētó- (< post-LIE *gneHtó-), pre-NWIE g(°)naʔ- (< post-LIE *gnəHtó-) into Ita., Cel. *gnātó-, PGmc. *kunʔda-, Bal.-Sla. *ginə-. Such dialectal late loss of the merged laryngeal *H (or glottal stop) is therefore limited to the groups including a sonorant, and the finds support a vocalisation of LIE *n̥, *m̥ → PII a earlier than the loss of laryngeal (or glottal stop) in that environment. That same glottal stop is possibly behind the other examples in Meier-Brügger: O.Av. va.ata- < PII waʔata-, or Ved. *ca-kar-ʔa (the ʔ still preserved in the period of the activity of Brugmann’s law), or Ved. náus < *naʔus.

In Lubotsky (1997) different outputs are proposed for *CRH groups before certain vowels: “It is clear that the “short” reflexes are due to laryngeal loss in an unaccented position, but the chronology of this loss is not easy to determine. If the laryngeal loss had already occurred in PIIr., we have to assume that PIIr. *CruV subsequently yielded CurvV in Sanskrit. The major problem we face is that the evidence for the phonetically regular outcome of *CriV and *CruV in Indo-Iranian is meager and partly conflicting.” Again, the conflict is solved assuming a late loss of the laryngeal; however, the attestation of remains of glottal stops, coupled with the auxiliary vowel solution of Adrados–Bernabé–Mendoza (1995-1998) solves the irregularities without making new assumptions and dialectal sound laws that in turn need their own further exceptions.

Kortlandt seems to derive the loss of laryngeals from Early Slavic (see below §1.7.1.I.D), a sister language of West and East Baltic languages, according to his view. Also, on Italo-Celtic (2007):  “If my view is correct, the loss of the laryngeals after a vocalic resonant is posterior to the shortening of pretonic long vowels in Italic and Celtic. The specific development of the vocalic liquids, which is posterior to the common shortening of pretonic long vowels, which is in its turn posterior to the development of ē, ā, ō from short vowel plus laryngeal, supports the hypothesis of Italo-Celtic linguistic unity.” Hence the problematic environments with sonorants are explained with a quite late laryngeal loss precisely in those groups.

The most probable assumption then, if some of those peculiar developments are remnants of previous laryngeals, as it seems, is that the final evolution of the merged *H was coincident with LIE disintegration, and might have reached its end in the different early prehistoric communities, while still in contact with each other (in order to allow for the spread of the common trends); the irregular vocalic changes would have then arisen from unstable syllables (mainly those which included a resonant), alternating even within the same branches, and even in the same phonetic environments without laryngeals (v.i. §2.3).

While there are reasons to support a late evolution of the pre-LIE merged laryngeal, there seems to be no strong argument for the survival of LIE merged *H into the later periods of NWIE, PGk or PII dialects, still less into later proto-languages (as Germanic, Slavic, Indo-Aryan, etc.). However, for some linguists, the complete loss of the LIE laryngeal (or even laryngeals) must have happened independently in each dialectal branch attested; so e.g. Meier-Brügger (2003): “As a rule, the laryngeals were disposed of only after the Proto-Indo-European era”; Clackson (2007): “But the current picture of laryngeal reconstruction necessitates repeated loss of laryngeals in each language branch”.

NOTE. The question is then brought by Clackson into the Maltese and Modern Hebrew examples, languages isolated from Semitic into an Indo-European environment for centuries. That is indeed a possible explanation: that all IE branches, after having split up from the LIE common language, would have become independently isolated, and then kept in close contact with (or, following the Maltese example, surrounded by) non-IE languages without laryngeals. Then, every change in all branches could be explained by way of diachronic and irregular developments of vowel quality. In Clackson’s words: “(…) the comparative method does not rely on absolute regularity, and the PIE laryngeals may provide an example of where reconstruction is possible without the assumption of rigid sound-laws.”

Even accepting that typologically both models of (a common, post-LIE vs. an independent, dialectal) laryngeal loss were equally likely, given that all languages had lost the merged laryngeal before being attested, all with similar outputs, and that even the final evolution  (laryngeal hiatuses or glottal stops) must have been shared in an early period – since they are found only in frozen remains in old and distant dialects –, an early IED loss of laryngeals fits into a coherent timeline within the known dialectal evolution. With that a priori assumption, we limit the need for unending ad hoc ‘sound-laws’ for each dialectal difference involving a sonorant, which would in turn need their own exceptions. Therefore, we dispense with unnecessary hypotheses, offering the most conservative approach to the problem.

Conventions Used in This Book

1. We try to keep a consistent nomenclature throughout the book when referring to the different reconstructible stages of Proto-Indo-European (PIE), from Pre-PIH, a highly hypothetical stage only reconstructible through internal reconstruction, to the most conservative reconstruction of early LIE dialects (IEDs). We do so by using the following schema of frequent terms and dates:

NOTE. This is just a simplified summary to understand the following sections. The full actual nomenclature and archaeological dates are discussed in detail in §§1.3, 1.4, and 1.7.

The dates include an archaeological terminus post quem, and a linguistic terminus ante quem. In such a huge time span we could differentiate between language periods. However, these (linguistic and archaeological) limits are usually difficult to define, and their differentiation hardly necessary in this grammar. Similarly, the terms Hittite, Sanskrit, Ancient Greek, Latin, etc. (as well as modern languages) might refer in the broadest sense to a time span of over 1,000 years in each case, and they are still considered a single language; a selection is made of the prestigious dialect and age for each one, though, as it is done in this grammar, where the prestigious language is Late Indo-European, while phonetics remains nearer to the middle-late period of IEDs, whose post-laryngeal output is more certain.

2. The above graphic is intended to show stemmatic, as well as synchronic levels. The reconstruction of North-West Indo-European is based on secondary materials: it is a level 3 proto-language, reconstructed on the basis of level 5 proto-languages (of ca. 1000 BC), i.e. primary Proto-Celtic, Proto-Italic, secondary Proto-Balto-Slavic (through Proto-Slavic and Proto-Baltic) and secondary Pre-Proto-Germanic (through internal reconstruction), see §1.7.1.

NOTE. Coeval level 3 dialects Proto-Greek (from level 5 Mycenaean and level 6 Ancient Greek primary materials) and Proto-Indo-Iranian (from level 5 Old Indian and level 6 Iranian materials) could be considered reconstructions based on primary as well as secondary materials. All of them, as well as data from other dialects (Tocharian A and B, Armenian, Albanian), make up the secondary and tertiary materials used to reconstruct a level 2 Late Indo-European. Proto-Anatolian is a level 2 internal reconstruction from level 3 Common Anatolian, in turn from level 4 and level 5 primary materials on Anatolian dialects. Both Late Indo-European and Proto-Anatolian help reconstruct a parent language, Indo-Hittite, which is then a level 1 language.

Each reconstructed parent level is, indeed, more uncertain and inconsistent than the previous one, because the older a material is (even primary texts directly attested), the more uncertain the reconstructed language. And more so because all parent reconstructions are in turn helpful to refine and improve the reconstruction of daughter and sister proto-languages. With that scheme in mind, it is logical to consider more consistent and certain the reconstruction of IEDs, these in turn more than LIE, and this more than PIH.
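
The stemmatic levels named above can also be written down as a small data structure, which makes the direction of reconstruction explicit. This is only a sketch: the level numbers and source materials are those given in the text, while the dictionary layout and the names LEVELS, SOURCES, is_consistent and chain are hypothetical. Materials whose level the text does not state (Tocharian, Armenian, Albanian, the individual Anatolian dialects) are left out.

```python
# Level of each language as stated in the text (1 = oldest parent).
LEVELS = {
    "Indo-Hittite": 1,
    "Late Indo-European": 2,
    "Proto-Anatolian": 2,
    "North-West Indo-European": 3,
    "Proto-Greek": 3,
    "Proto-Indo-Iranian": 3,
    "Common Anatolian": 3,
    "Proto-Celtic": 5,
    "Proto-Italic": 5,
    "Proto-Balto-Slavic": 5,
    "Pre-Proto-Germanic": 5,
    "Mycenaean": 5,
    "Old Indian": 5,
    "Ancient Greek": 6,
    "Iranian": 6,
}

# Materials from which each parent is reconstructed, as named in the text.
SOURCES = {
    "Indo-Hittite": ["Late Indo-European", "Proto-Anatolian"],
    "Late Indo-European": ["North-West Indo-European", "Proto-Greek",
                           "Proto-Indo-Iranian"],
    "Proto-Anatolian": ["Common Anatolian"],
    "North-West Indo-European": ["Proto-Celtic", "Proto-Italic",
                                 "Proto-Balto-Slavic", "Pre-Proto-Germanic"],
    "Proto-Greek": ["Mycenaean", "Ancient Greek"],
    "Proto-Indo-Iranian": ["Old Indian", "Iranian"],
}


def is_consistent():
    """Every parent must sit at a lower level number than its sources."""
    return all(LEVELS[parent] < LEVELS[source]
               for parent, sources in SOURCES.items()
               for source in sources)


def chain(name):
    """Return the path of reconstructions from a language up to the root."""
    path = [name]
    while True:
        parent = next((p for p, s in SOURCES.items() if path[-1] in s), None)
        if parent is None:
            return path
        path.append(parent)
```

For instance, chain("Mycenaean") runs through Proto-Greek and Late Indo-European to Indo-Hittite, so a claim about PIH drawn from Mycenaean rests on three successive reconstructions; the longer the chain, the less certain the result.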

3. Palatovelars are reconstructed neither for Late Indo-European nor (consequently) for Indo-Hittite. Although this is not yet a settled question (v.s. Considerations of Method), we assume that the satem trend began as an areal dialectal development in South-East Indo-European, and spread later (and incompletely) through contact zones – e.g. into Pre-Balto-Slavic.

NOTE. Because West and Central European (Italo-Celtic and Germanic) and Proto-Greek were not affected by that early satemisation trend – although Latin, Greek and Celtic actually show late independent ‘satemisations’ –, the reconstruction of centum NWIE and PGk, and satem PII (the aim of this book) should be an agreed solution, no matter what the different personal or scholarly positions on LIE and PIH might be.

4. We assume an almost fully vocalic – i.e. post-laryngeal – nature of IEDs since the end of the LIE community (assumed to have happened before ca. 2500 BC, according to archaeological dates), although not a settled question either (v.s. Considerations of Method). Whether LIE lost the merged laryngeal *H sooner or later, etymological roots which include laryngeals will be labelled PIH and follow today’s general three-laryngeal convention, while some common LIE vocabulary will be shown either with pre-LIE merged *H or post-LIE vocalic output *ə (which was assimilated to NWIE a, PII i), or with the reconstructed post-LIE glottal stop *ʔ.

NOTE. In this grammar we will show the reconstructed phonetics of a post-LIE period, focussing on NWIE vocalism, while keeping a vocabulary section with a Late Indo-European reconstruction, respecting NWIE/PII dialectal differences; not included are the different vocalic outputs of South-West IE, from word-initial and interconsonantal laryngeals.

Writing System

This table contains common Proto-Indo-European phonemes and their proposed regular corresponding letters in alphabets and Brahmic alphasyllabaries.

Consonants and Consonantal Sounds

Greek | Latin | Perso-Arabic | Armenian | Cyrillic
Π π | P p | ـپ ــ | Պ պ | П п
Β β | B b | ـب ـبـ بـ | Բ բ | Б б
Βη βη | Bh bh | ـبھ ـبھـ بھـ | Բհ բհ | Бх бх
Τ τ | T t | ـت ـتـ تـ | Տ տ | Т т
Θ θ | Th th | ـتھ ـتھـ تھـ | Թ թ | Тх тх
Δ δ | D d | ـد ـد | Դ դ | Д д
Δη δη | Dh dh | ـدھ ـدھـ دھـ | Դհ դհ | Дх дх
Κ κ | K k | ـک ـكـ كـ | Կ կ | К к
Χ χ | Kh kh | ـكھ ـكھـ كھـ | Ք ք | Кх кх
Γ γ | G g | ـگ ـگـ گـ | Գ գ | Г г
Γη γη | Gh gh | ـگھ ـگھـ گھـ | Գհ գհ | Гх гх
Ϙ ϙ | Q q | ـق ق ـقـ | Խ խ | Къ къ
Ϟ ϟ | C c | ـغ ـغـ غـ | Ղ ղ | Гъ гъ
Ϟη ϟη | Ch ch | ـغھ ـغھـ غھـ | Ղհ ղհ | Гъх гъх
Η η | H h | ـھ ـھـ ھـ | Հ հ | Х х
J ϳ | J j | ـۑـ ـۑ ۑـ | Յ յ | Й й
Ϝ ϝ | W w | ـۋ ـۋ ۋ | Ւ ւ | В в
Ρ ρ | R r |  | Ռ ռ | Р р
Λ λ | L l | ـل ـلـ لـ | Լ լ | Л л
Μ μ | M m | ـم ــمــ مـ | Մ մ | М м
Ν ν | N n | ـن ـنـ ـنـ | Ն ն | Н н
Σ σ ς | S s | ـس ـسـ سـ | Ս ս | С с


Sounds found in Proto-Greek only

Greek | Latin | Perso-Arabic | Armenian | Cyrillic
Φ φ | Ph ph | ـپھ ھــ ھ | Փ փ | Пх пх
Ϙη ϙη | Qh qh | ـقھ قھ ـقھـ | Խհ խհ | Чх чх
Τσ τσ | Ts ts | ـتسـ ـتس تسـ | Ծ ծ | Ц ц
Δζ δζ | Dz dz | ـدز ـﺩز دز | Ձ ձ | Дз дз


Sounds found in Proto-Indo-Iranian only

Greek | Latin | Perso-Arabic | Armenian | Cyrillic
Τϻ τϻ | Ķ ķ | ـتژ ـتژ تژ | Չ չ | Ч ч
Δϻ δϻ | Ģ ģ | ـدﮋ ـدﮋ ﺩﮋ | Ց ց | Дщ дщ
Δϻη δϻη | Ģh ģh | دﮋھـ ـدﮋھـ ﺩﮋھـ | Ցհ ցհ | Дщ дщ
Τþ τþ | Ḳ ḳ | ـ ــ ـ | Ճ ճ | Тш тш
Δþ δþ | Ġ ġ | ـ ــ ـ | Ջ ջ | Дж дж
Δþη δþη | Ġh ġh | ـھ ـھـ ھـ | Ջհ ջհ | Джх джх
Ϸ þ | Š š | ـﺶ ـﺶ | Շ շ | Ш ш


Vowels and Vocalic Allophones

Greek | Latin | Perso-Arabic | Armenian | Cyrillic
Α α | A a | ـا ـا ا | Ա ա | А а
Ε ε | E e | ـێـ ـێ ێـ | Է է | E e
Ο ο | O o | ـۆ ـۆ ۆ | Օ օ | О о
Ᾱ ᾱ | Ā ā | ـأ ـأ أ | Ա՟ ա՟ | Ā ā
Ē ε̄ | Ē ē | ـێٔـ ـێٔ ێٔـ | Է՟ է՟ | Ē ē
Ō ō | Ō ō | ـۆٔ ۆٔ ـۆٔ | Օ՟ օ՟ | Ō ō

Ι ι | I i | ی ـيـ يـ | Ի ի | И и
Ῑ ῑ | Ī ī | یٔ ـئـ ئـ | Ի՟ ի՟ | Ӣ ӣ
Υ υ | U u | ـو ـو و | Ո ո | У у
Ῡ ῡ | Ū ū | ـؤ ـؤ ؤ | Ո՟ ո՟ | Ӯ ӯ

Ρ̣ ρ̣ | Ṛ ṛ | ـرٜ ـرٜ رٜ | Ռՙ ռՙ | Р̣ р̣
Λ̣ λ̣ | Ḷ ḷ | ـلٜ ـلٜـ لٜـ | Լՙ լՙ | Л̣ л̣
Μ̣ μ̣ | Ṃ ṃ | ـمٜ ــمٜــ مٜـ | Մՙ մՙ | М̣ м̣
Ν̣ ν̣ | Ṇ ṇ | ـنٜ ـنٜـ ـنٜـ | Նՙ նՙ | Н̣ н̣



This proposal is purely conventional, and it takes into account values such as availability, simplicity (one letter for each sound), transliteration, and tradition.

NOTE. We have followed this order of objectives in non-Brahmic scripts:

·  Availability: especially of letters in common Latin and Cyrillic keyboards and typography, since they account for most of the current Northern IE world.

·  Simplicity: each sound is represented with one letter (or letter plus diacritics). Digraphs are used only when necessary: aspirated consonants are represented with the consonant plus the letter for [h], unless there is an independent character for that aspirated consonant.

·  Equivalence of letters: a character in one alphabet should be transliterated and read directly in any other to allow an automatic change from the main alphabets into the others without human intervention. The lack of adequate characters to represent PIE phonetics (resonants, semivowels, long vowels) in alphabets conditions the final result.

·  Tradition: the historic or modern sound of the letters is to be retained when possible.
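The equivalence objective can be illustrated with a minimal sketch. It assumes only a subset of the Latin and Cyrillic correspondences from the table in this section; the names `LATIN_TO_CYRILLIC` and `to_cyrillic` are hypothetical, and Cyrillic е is assumed for [e]. Because aspirates are written in both scripts as consonant plus the letter for [h], a letter-by-letter mapping is enough for these rows:

```python
# Subset of the Latin -> Cyrillic correspondences proposed in the table above.
# Names are illustrative only; Cyrillic "е" for [e] is an assumption.
LATIN_TO_CYRILLIC = {
    "p": "п", "b": "б", "t": "т", "d": "д", "k": "к", "g": "г",
    "q": "къ", "c": "гъ", "h": "х", "j": "й", "w": "в",
    "r": "р", "l": "л", "m": "м", "n": "н", "s": "с",
    "a": "а", "e": "е", "o": "о", "i": "и", "u": "у",
}


def to_cyrillic(word: str) -> str:
    """Transliterate letter by letter; unknown characters pass through unchanged."""
    return "".join(LATIN_TO_CYRILLIC.get(ch, ch) for ch in word.lower())


if __name__ == "__main__":
    for root in ("bher", "qel", "kmtom"):
        print(root, to_cyrillic(root))
```

Long vowels and vocalic resonants are left out of the sketch on purpose: as the last objective notes, the lack of adequate characters for them is what conditions the final result.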

Writing systems of the Indo-European World (2011, modified from Mirzali Zazaoğlu 2008).

The names of the consonants in Indo-European following the Latin pattern would be – B, be (pronounced bay); Bh, bhe (bhay); C, ce (gway); Ch, che (gwhay); D, de (day); Dh, dhe (dhay); G, ge (gay); Gh, ghe (ghay); H, ha; K, ka; L, el; M, em; N, en; P, pe; Q, qa (kwa); R, er; S, es; T, te; W, wa.

In Aryan, the letters are named with their sound followed by a, as in Sanskrit – ba, bha, ca, cha, da, dha, ga, gha, and so on.


An acute accent (´) is written over the vowel in the accented syllable, except when accent is on the second to last syllable (or paenultima) and in monosyllabic words.

NOTE. Since all non-clitic words of more than one syllable would be marked with one accent, as we have seen, a more elegant convention is not to write every accent. The second to last syllable seems to be the most frequently accented one, so we can spare unnecessary diacritics if the accent is understood to fall in that position unless it is marked on another syllable.
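The accent convention just described can be sketched as a small function. This is only an illustration under stated assumptions: the word is supplied already divided into syllables, the accented syllable is given by its index, and the acute is placed on the first vowel of that syllable (or on a dotted resonant if there is no vowel). The name `write_accent` is hypothetical.

```python
import unicodedata

ACUTE = "\u0301"      # combining acute accent
DOT_BELOW = "\u0323"  # combining dot below, marks vocalic resonants


def write_accent(syllables, accented):
    """Join syllables, writing the acute only where the convention requires it."""
    count = len(syllables)
    # No accent is written on monosyllables or on the second to last syllable.
    if count == 1 or accented == count - 2:
        return "".join(syllables)
    syllable = unicodedata.normalize("NFD", syllables[accented])
    # Nucleus: first plain vowel, otherwise the first letter carrying a dot below.
    pos = next((i for i, ch in enumerate(syllable) if ch in "aeiou"), None)
    if pos is None:
        pos = next((i for i in range(len(syllable) - 1)
                    if syllable[i + 1] == DOT_BELOW), None)
    if pos is None:
        raise ValueError("no syllable nucleus found")
    end = pos + 1
    while end < len(syllable) and unicodedata.combining(syllable[end]):
        end += 1  # skip a macron or dot already attached to the nucleus
    parts = list(syllables)
    parts[accented] = syllable[:end] + ACUTE + syllable[end:]
    return unicodedata.normalize("NFC", "".join(parts))


if __name__ == "__main__":
    print(write_accent(["pa", "ter"], 1))  # accent on the last syllable: written
    print(write_accent(["pa", "ter"], 0))  # accent on the second to last: omitted
```

Placing the mark on the first vowel is a simplification; diphthongs would need a more careful rule.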

Long vowels are marked with a macron ( ¯ ), and vocalic allophones of resonants are marked with a dot below it ( ̣). Accented long vowels and resonants are represented with special characters that include their diacritics plus an acute accent.

NOTE. It is recommended to write all diacritics if possible, although it is not necessary. The possibility of omitting the diacritical marks arises from the lack of appropriate fonts in traditional typography, or the difficulty of typing those marks on common international keyboards. Therefore, alternative writings include pater/patḗr, m. father, nmrtos/ṇmṛtós, m. immortal, kmtom/kṃtóm, hundred, etc. Such a defective representation of accents and long vowels is common even today in Latin and Greek texts, as well as in most modern languages, which lack a proper representation for sounds. That does not usually hinder an advanced reader from reading a text properly.
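The plain spellings mentioned in this note can be derived mechanically from fully marked ones. The following is a minimal sketch using Unicode decomposition from the Python standard library; the function name `strip_diacritics` is hypothetical, and it assumes the marked text uses standard Unicode letters and combining marks.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove macrons, acutes and dots below, keeping only the base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    # Escapes are used so that the example does not depend on the editor's font.
    print(strip_diacritics("k\u1e43t\u00f3m"))  # marked form of 'hundred'
    print(strip_diacritics("pat\u1e17r"))       # marked form of 'father'
```

The conversion only works in this direction: once the marks are removed, accent and vowel length cannot be recovered automatically.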

1. The Modern Greek alphabet lacks letters to represent PIE phonetics properly. Therefore, the Ancient Greek letters and values assigned to them are used instead.

NOTE. The consonant cluster [kh] was in Ancient Greece written as X (Chi) in eastern Greek, and Ξ (Xi) in western Greek dialects. In the end, X was standardised as [kh] ([x] in modern Greek), while Ξ represented [ks]. In the Greek alphabet used for IE, X represents [kh], while Ξ represents [kwh], necessary for the representation of a Proto-Greek voiceless aspirate. As in Ancient Greek, Φ stands for [ph], and Θ for [th].

The Greek alphabet lacks a proper representation for long vowels, so they are all marked (as in the other alphabets) with diacritics. Η is used to represent the sound [h], as it was originally used in most Ancient Greek dialects; it is also used to mark (voiced) aspirated phonemes. Ē represents [eː] and Ō stands for [oː] in the Greek alphabet for IE. For more on the problem of historical Eta and its representation in the Modern Greek alphabet, see <>.

While not a practical solution (in relation to the available Modern Greek keyboards), we keep a traditional Ancient Greek script, assuming that it will mostly be produced by transliterating texts written in Latin or Cyrillic letters; so e.g. Archaic koppa Ϙ stood for [k] before back vowels (e.g. Ϙόρινθος, Korinthos), hence its IE value [kw]. Archaic digamma Ϝ represented [w], a sound lost already in Classical Greek. Additions to the IE alphabet are new letter koppa Ϟ for [gw], based on the alternative Unicode shapes of the archaic koppa, and the ‘more traditional’ inverted iota for [j], preferred over Latin yot – although the lack of a capital letter for inverted iota makes the use of (at least) a capital J necessary to distinguish [j] from [i]. See <>.

2. The Latin alphabet used to write Indo-European is similar to the English one, which is in turn borrowed from the Late Latin abecedarium. Because of the role of this alphabet as a model for other ones, simplicity and availability of the characters are preferred over tradition and exactitude.

NOTE. The Latin alphabet was borrowed in very early times from the Greek alphabet and did not at first contain the letter G. The letters Y and Z were introduced still later, about 50 BC. The Latin character C originally meant [g], a value always retained in the abbreviations C. (for Gaius) and Cn. (for Gnaeus). That was probably due to Etruscan influence, which copied it from Greek Γ, Gamma, just as later Cyrillic Г, Ge. In early Latin script C came also to be used for [k], and K disappeared except before a in a few words, as Kal. (Kalendae), Karthago. Thus there was no distinction in writing between the sounds [g] and [k]. This defect was later remedied by forming (from C, the original [g]-letter) a new character G. In Modern Indo-European, unambiguous K stands for [k], and G for [g], so C is left without value, being used (taking its oldest value [g]) to represent the labiovelar [gw].

V originally denoted the vowel sound [u] (Eng. oo), and F stood for the sound of consonant [w] (from Gk. ϝ, called digamma). When F acquired the value of our [f], V came to be used for consonant [w] as well as for the vowel [u]. The Latin [w] semivowel developed into Romance [v]; therefore V no longer adequately represented [u] or [w], and the Latin alphabet had to develop alternative letters. The Germanic [w] phoneme was therefore written as VV (a doubled V or U) by the seventh or eighth century by the earliest writers of Old English and Old High German. During the late Middle Ages, two forms of V developed, which were both used for its ancestor U and modern V. The pointed form V was written at the beginning of a word, while a rounded form U was used in the middle or end, regardless of sound. The more recent letters U and Germanic W probably represent the sounds [u] and [w] respectively more unambiguously than Latin V.

The letter I stood for the vowel [i], and was also used in Latin (as in Modern Greek) for its consonant sound [j]. J was originally developed as a swash character to end some Roman numerals in place of I; both I and J represented [i], [iː], and [j]. In IE, J represents the semivowel [j], an old Latin value current in most Germanic and Slavic languages. Y is used to represent the vowel [y] in foreign words. That [j] value is retained in English J only in foreign words, as Hallelujah or Jehovah. Because Romance languages developed new sounds (from former [j] and [ɡ]) that came to be represented as I and J, English J (from French J), as well as Spanish, Portuguese or Italian J have sound values quite different from [j]. The romanisation of the sound [j] from different writing systems (like Devanagari) as Y –  which originally represented in Latin script the Greek vowel [y] – is due to its modern value in English and French, and has spread a common representation of [j] as Y in Indo-European studies, while J is used to represent other sounds.

A different use of the Latin alphabet to represent PIE, following the Classical Latin tradition, is available at <>.

3. The Perso-Arabic script has been adapted to the needs of a fully differentiated PIE alphabet, following Persian, Urdu and Kurdish examples.

NOTE. The Perso-Arabic script is a writing system that is originally based on the Arabic alphabet. Originally used exclusively for the Arabic language, the Arabic script was modified to match the Persian language, adding four letters: پ [p], چ [tʃ], ژ [ʒ], and گ [ɡ]. Many languages which use the Perso-Arabic script add other letters. Besides the Persian alphabet itself, the Perso-Arabic script has been applied to the Urdu or Kurdish Soraní alphabet.

Unlike the standard Arabic alphabet, which is an abjad (each symbol represents a consonant, the vowels being more or less defective), the IE Perso-Arabic script is a true alphabet, in which vowels are mandatory, making the script easy to read.

Among the most difficult decisions is the use of letters to represent vowels – as in modern alphabets like Kurdish or Berber – instead of diacritics – as in the traditional Arabic or Urdu scripts. Following tradition, hamza (originally a glottal stop) should probably be placed on the short vowels and resonants, instead of the long ones (especially above ‘alif), but automatic equivalence with the other alphabets makes the opposite selection more practical.

Because waw و and yodh ي could represent short and long vowels u and i, and consonantal w and j, a conventional selection of current variants has been made: Arabic letter Ve, sometimes used to represent the sound [v] when transliterating foreign words in Arabic, and also used in writing languages with that sound (like Kurdish), is an obvious selection for consonantal [w] because of its availability. The three-dotted yodh then becomes a consequent selection for consonantal yodh. Hamza then distinguishes the long vowels from the short ones, which are represented with the original symbols.

4. Armenian characters, similarly to Greek, need to be adapted to a language with a different series of short and long vowels and aspirated phonemes.

NOTE. Because of that, a tentative selection is made, which need not be final – as with any other script. Because Armenian lacks a proper character for [u], and because it has no distinct characters to represent long vowels other than [eː] or [oː], the more practical choice is to imitate the other alphabets to allow for equivalence. The characters that represent short vowels also represent different sounds; as, Ե for [ɛ] and word initially [jɛ], and Ո for [o] and word initially [vo], so a less ambiguous choice would be Է for [e] and Օ for [o]. Hence the letter Ո historically used to write [o] and [u] (in digraphs) stands for [u].

The conventional selection of one-character representation of aspirated voiceless consonants follows Armenian tradition and equivalence with Greek, a closely related language, as we have already seen; i.e. Proto-Greek is probably the nearest branch to the one Pre-Armenian actually belonged to, and it is therefore practical to retain equivalence between both scripts.

Armenian diacritics (like the abbreviation mark proposed for long vowels) are defined as ‘modifier letters’, not as ‘combining diacritical marks’ in Unicode, so they do not combine as true superscript. Some fonts do combine them, as Everson Mono Ա՟ ա՟ Է՟ է՟ Օ՟ օ՟ Ի՟ ի՟ Ո՟ ո՟.
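The distinction drawn here can be checked against the Unicode character database shipped with Python. The sketch below is only illustrative: it assumes that the marks printed after the Armenian letters in the table are U+055F (Armenian abbreviation mark) and U+0559 (Armenian modifier letter left half ring), and the helper name `is_combining_mark` is hypothetical.

```python
import unicodedata


def is_combining_mark(char: str) -> bool:
    """True if Unicode classifies the character as a mark (general category M*)."""
    return unicodedata.category(char).startswith("M")


if __name__ == "__main__":
    # Assumed code points: the two Armenian signs, compared with the combining macron.
    for char in ("\u055f", "\u0559", "\u0304"):
        print(
            f"U+{ord(char):04X}",
            unicodedata.name(char),
            unicodedata.category(char),
            is_combining_mark(char),
        )
```

Since neither Armenian sign is classified as a mark, a renderer has no obligation to stack it over the preceding vowel, which is why the result depends on the font.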

6. The Cyrillic script is used following its modern trends, taking into account that Russian is the model for most modern keyboards and available typography.

NOTE. Non-Russian characters have been avoided, and we have followed the principle of one letter for each sound. While Й is commonly used to represent [j], Cyrillic scripts usually lack a character to represent consonantal [w], given that [v] (written В) usually replaces it. While У is generally used in Cyrillic for foreign words, a ‘one character, one sound’ policy requires the use of a character complementary to Й, which is logically found in В – a sound lacking in Indo-European.

In Slavistic transcription jer Ъ and front jer Ь were used to denote Proto-Slavic extra-short sounds [ŭ] and [ĭ] respectively (e.g. slověnьskъ adj. ‘slavonic’). Today they are used with other values in the different languages that still use them, but the need for traditional ‘labial’ [w] and ‘palatal’ [j] signs available in most Cyrillic keyboards made them the most logical selection to mark a change of value in the characters representing stops.

7. The Brahmic or Indic scripts are a family of abugida (alphabetic-syllabary) writing systems, historically used within their communities – from Pakistan to Indochina – to represent Sanskrit, whose phonology is similar to the parent PIE language. Devanāgarī has come to be the most commonly used Brahmic script to represent Sanskrit, hence our proposal of its character values for the rest of them.

NOTE. The characters and accents are generally used following their traditional phonetic value. Exceptions are due to the lack of vocalic characters to properly represent [m̥] and [n̥]. Hence anusvara अं, which represents [ṃ], is used to represent [m̥]. Also, visarga अः, which stands for [ḥ] (allophonic with word-final r and s), is proposed for [n̥].

Automatic transliteration between many Brahmic scripts is usually possible, and highly available within scripts used in India.

NOTE. That happens e.g. with the InScript keyboard: because all Brahmic scripts share the same order, any person who knows InScript typing in one script can type in any other Indic script using dictation even without knowledge of that script.
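A related property can be shown in code. The Unicode blocks of the major scripts of India are laid out in parallel, so that corresponding letters sit at the same position in each 128-character block. The sketch below is not the InScript mechanism itself but a rough illustration of the same shared order; the names `BLOCK_START` and `convert` are hypothetical, and not every position is assigned in every script, so the output should be checked before use.

```python
# Start of the Unicode block of each script; each block is 0x80 code points long.
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}


def convert(text: str, source: str, target: str) -> str:
    """Shift every character of the source block to the same slot in the target block."""
    start = BLOCK_START[source]
    offset = BLOCK_START[target] - start
    return "".join(
        chr(ord(ch) + offset) if start <= ord(ch) < start + 0x80 else ch
        for ch in text
    )


if __name__ == "__main__":
    word = "\u0915\u092e"  # Devanagari letters ka + ma
    print(word, convert(word, "devanagari", "bengali"), convert(word, "devanagari", "gujarati"))
```

Characters outside the source block, such as spaces or Latin letters, are passed through unchanged.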

However, due to the lack of characters in western alphabets to represent resonants and long vowels, diacritics are used. These diacritics are not commonly available (but for the Arabic hamza), and therefore if they are not written, transliteration into Brahmic scripts becomes defective. That problem does not exist in the other direction i.e. from Brahmic scripts into the other alphabets.

Modern Indo-European

1. Modern Indo-European (MIE) is therefore a set of conventions or ‘rules’ applied to systematise the reconstructed North-West Indo-European dialect of Late Indo-European – see below §§ 1.3, 1.7.1. Such conventions refer to its writing system, morphology and syntax, and are conceived to facilitate the transition of the reconstructed language into a learned and living one.

2. Because proto-languages were spoken by prehistoric societies, no genuine sample texts are available, and thus comparative linguistics is not in a position to reconstruct exactly what the language was like, but only more or less certain approximations, whose statistical confidence decreases as we go further back in time. The hypothesised language will therefore always be somewhat controversial.

NOTE 1. Mallory–Adams (2007): “How real are our reconstructions? This question has divided linguists on philosophical grounds. There are those who argue that we are not really engaged in ‘reconstructing’ a past language but rather creating abstract formulas that describe the systematic relationship between sounds in the daughter languages. Others argue that our reconstructions are vague approximations of the proto-language; they can never be exact because the proto-language itself should have had different dialects (yet we reconstruct only single proto-forms) and our reconstructions are not set to any specific time. Finally, there are those who have expressed some statistical confidence in the method of reconstruction. Robert Hall, for example, claimed that when examining a test control case, reconstructing proto-Romance from the Romance languages (and obviously knowing beforehand what its ancestor, Latin, looked like), he could reconstruct the phonology at 95% confidence, and the grammar at 80%. Obviously, with the much greater time depth of Proto-Indo-European, we might well wonder how much our confidence is likely to decrease.  Most historical linguists today would probably argue that reconstruction results in approximations. A time traveller, armed with this book and seeking to make him- or herself understood would probably engender frequent moments of puzzlement, not a little laughter, but occasional instances of lucidity.”

On the same question, Fortson (2004): “How complete is our picture of PIE? We know there are gaps in our knowledge that come not only from the inevitable loss and replacement of a percentage of words and grammatical forms over time, but also from the nature of our preserved texts. Both the representative genres and external features such as writing systems impose limits on what we can ascertain about the linguistic systems of both PIE and the ancient IE languages (…)

In spite of all the scholarly disagreements that enliven the pages of technical books and journals, all specialists would concur that enormous progress has been made since the earliest pioneering work in this field, with consensus having been reached on many substantial issues. The Proto-Indo-Europeans lived before the dawn of recorded human history, and it is a testament to the power of the comparative method that we know as much about them as we do.”

NOTE 2. The Hebrew language revival is comparable to our proposal of speaking Indo-European as a living language. We have already said that ‘living’ and ‘dead’, ‘natural’ and ‘learned’, are not easily applicable to ancient or classical languages. It is important to note that, even though there is a general belief that Modern Hebrew and Ancient Hebrew are the same language, among Israeli scholars there have been calls for the “Modern Hebrew” language to be called “Israeli Hebrew” or just “Israeli”, due to the strong divergences that exist – and further develop with its use – between the modern language spoken in Israel and its theoretical basis, the Ancient Hebrew from the Tanakh. The old language system, with its temporal and dialectal variations spanning previous centuries of oral tradition, was compiled probably between 450-200 BC, i.e. when the language was already being replaced by Aramaic. On that interesting question, Prof. Ghil’ad Zuckermann considers that “Israelis are brainwashed to believe they speak the same language as the prophet Isaiah, a purely Semitic language, but this is false. It’s time we acknowledge that Israeli is very different from the Hebrew of the past”. He points to the abiding influence of modern Indo-European dialects – especially Yiddish, Russian and Polish –, in vocabulary, syntax and phonetics, as imported by Israel’s founders.

3. Features of Late Indo-European that are common to IEDs (North-West Indo-European, Proto-Greek and Proto-Indo-Iranian), like most of the nominal and verbal inflection, morphology, and syntax, make it possible for LIE to be proposed as Dachsprache for the living languages.

NOTE 1. Because North-West Indo-European had other sister dialects that were spoken by coeval prehistoric communities, languages like Modern Hellenic (a revived Proto-Greek) and Modern Aryan (a revived Proto-Indo-Iranian) can also be used in the regions where their surviving dialects are currently spoken. These proto-languages are not more different from North-West Indo-European than are today English from Dutch, Czech from Slovenian, Spanish from Italian. They might also serve as linguae francae for closely related languages or neighbouring regions; especially interesting would be to have a uniting Aryan language for today’s religiously divided South and West Asia.

NOTE 2. The terms Ausbausprache-Abstandsprache-Dachsprache were coined by Heinz Kloss (1967), and they are designed to capture the idea that there are two separate and largely independent sets of criteria and arguments for calling a variety an independent “language” rather than a “dialect”: the one based on its social functions, and the other based on its objective structural properties. A variety is called an ausbau language if it is used autonomously with respect to other related languages.

Dachsprache means a language form that serves as standard language for different dialects, even though these dialects may be so different that mutual intelligibility is not possible on the basilectal level between all dialects, particularly those separated by significant geographical distance. So e.g. the Rumantsch Grischun developed as such a Dachsprache for a number of quite different Romansh language forms spoken in parts of Switzerland; or the Euskara Batua, “Standard Basque”, and the Southern Quechua literary standard, both developed as standard languages for dialect continua that had historically been thought of as discrete languages with many dialects and no “official” dialect. Standard German and standard Italian to some extent function (or functioned) in the same way. Perhaps the most widely used Dachsprache is Modern Standard Arabic, which links together the speakers of many different, often mutually unintelligible Arabic dialects.

The standard Indo-European looked for in this grammar takes Late Indo-European reconstruction as the wide Dachsprache necessary to encompass (i.e. to serve as linguistic umbrella for) the modern usage of IEDs, whose – phonetic, morphological, syntactical – peculiarities are also respected.  

4. Modern Indo-European words to complete the lexicon of North-West Indo-European, in case that no common vocabulary is found in Late Indo-European, are to be loan-translated from present-day Northwestern IE languages. Common loan words from sister dialects can also be loan-translated or borrowed as loan words.

NOTE. Even though the vocabulary reconstructible for IEDs is indeed wider than the common Proto-Indo-European lexicon, a remark of Mallory–Adams (2007) regarding reconstructible PIE words is interesting, in that it shows another difficulty of trying to speak a common LIE or PIH:

“To what extent does the reconstructed vocabulary mirror the scope of the original PIE language? The first thing we should dismiss is the notion that the language (any language) spoken in later prehistory was somehow primitive and restricted with respect to vocabulary. Counting how many words a language has is not an easy task because linguists (and dictionaries) are inconsistent in their definition or arrangement of data. If one were simply to count the headwords of those dictionaries that have been produced to deal with nonliterate languages in Oceania, for example, the order of magnitude is somewhere on the order of 15,000–20,000 ‘words’. The actual lexical units are greater because a single form might have a variety of different meanings, each of which a speaker must come to learn, e.g. the English verb take can mean ‘to seize’, ‘to capture’, ‘to kill’, ‘to win in a game’, ‘to draw a breath’, ‘imbibe a drink’, ‘to accept’, ‘to accommodate’ to name just a few of the standard dictionary meanings. Hence, we might expect that a language spoken c. 4000 BC would behave very much like one spoken today and have a vocabulary on the order of 30,000–50,000 lexical units. If we apply fairly strict procedures to distinguishing PIE lexical items to the roots and words listed in Mallory and Adams’s Encyclopedia or Calvert Watkins’s The American Heritage Dictionary of Indo-European Roots (1985) we have less than 1,500 items. The range of meanings associated with a single lexeme is simply unknown although we occasionally get a hint, e.g. *bher- indicates both ‘carry (a load)’ and ‘bear (a child)’. So the PIE vocabulary that we reconstruct may well provide the basis for a much larger lexicon given the variety of derivational features in PIE.”

Examples of loan translations from modern NWIE languages include, from Latin, aqueduct (Lat. aquaeductus → MIE aqāsduktos) or universe (Lat. uniuersus<*oin(i)-uors-o-<*oino-wt-to- → MIE oinowstós ‘turned into one’); from English, software (from Gmc. samþu-, warō → MIE somtúworā); from French, ambassador (from Cel. amb(i)actos → MIE ambhíagtos ‘public servant’) or chamber (from O.Lat. camera, from PGk. kamárā, ‘vault’ → MIE kamarā); from Russian, bolshevik (MIE belijówikos); etc.

Loan words from sister IE dialects can be either loan-translated or directly taken as loan-words; as e.g. ‘photo’, which should be taken directly as loan-word o-stem pháwotos, from Gk phawots, gen. phawotós, as Gk. φῶς (<φάϝος), φωτός, in compound phawotogphjā, photography, derived from IE root bhā-, shine, which could be loan-translated as MIE ˟bháwots, from ˟bhawotogbhjā, but without having a meaning for extended bha-wes-, still less for bha-wot-, in North-West Indo-European or even Proto-Indo-European, as it is only found in Ancient Greek dialects. Or MIE skhol, from Lat. schola, taken from Gk. σχολή (<PGk. skhol) ‘spare time, leisure, tranquility’, borrowed from Greek with the meaning ‘school’, which was in O.Gk. σχολεῖον (scholeíon), translated as PGk. skholehjom <*-esjo-m, from IE root segh-, which could also be loan-translated as MIE ˟sghol or even more purely (and artificially) ˟sgholesjom, none of them being Proto-Indo-European or common Indo-European terms. Examples from Indo-Iranian include wasāáranas, bazaar, from O.Ira. vahacarana ‘sale-traffic, bazaar’, which could also be translated as proper MIE ˟wesāqólenos, from PIE roots wes- and qel-; or atúrangam, chess, from Skt. caturaŋgam (which entered Europe from Pers. shatranj) a bahuvrihi compound, meaning ‘having four limbs or parts’, which in epic poetry often means ‘army’, possibly shortened from aturangabalam, Skt. caturaŋgabalam, lit. ‘four-member force’, ‘an army comprising of four parts’, could be loan-translated as MIE ˟qaturangom and ˟qaturangobelom, from roots qetur-, ang- and bel-.

Loan words and loan translations might also coexist in specialised terms; as, from *h1rudhs, red, PGk eruthrós, in loan eruthrókutos, erythrocyte, proper MIE rudhrós, in rudhr (ésenos) kētjā, red (blood) cell; cf. also MIE mūs, musós, mouse, muscle, PGk mūs, muhós, in loan muhokutos, myocyte, for muskosjo kētjā, muscle cell.

1.8.5. The name of the Modern Indo-European is eurōpājóm, or eurōpāj dghwā, European language, from adj. eurōpājós, m. European, in turn from the Greek noun Eurōpā.

NOTE. Gk. Eurō is of unknown origin, even though it was linked with Homer’s epithet for Zeus euruo, from *hurú-oqeh2 ‘far-seeing, broad’, or *h1urú-woqeh2 ‘far-sounding’ (Heath, 2005). Latinate adj. europaeus, which was borrowed by most European languages, comes from Gk. adj. eurōpaíos, in turn from PGk eurōpai-jós < PIE *eurōpeh2-jós → MIE eurōpā-jós. For the evolution PIH *-eh2jo- → PGk *-aijo-, cf. adjective formation in Gk. agor-agoraíos, Ruigh (1967).

In the old IE languages, those which had an independent name for languages used the neuter. Compare Gk. Ἑλληνικά (hellēniká), Skt. संस्कृतम् (saṃskṛtam), O.H.G. diutisc, O.Prus. prūsiskan, etc.; cf. also in Tacitus Lat. uōcābulum latīnum. In most IE languages, the language is also referred to as ‘language’ defined by an adjective, whose gender follows the general rule of concordance; cf. Skt. saṃskṛtā vāk ‘refined speech’, Gk. ελληνική γλώσσα, Lat. latīna lingua, O.H.G. diutiska sprāhha (Ger. Deutsche Sprache), O.Prus. prūsiskai bilā, O.C.S. словѣньскыи ѩзыкъ (slověnĭskyi językŭ), etc.

Common scholarly terms would include sindhueurōpājóm, Indo-European, prāmosindhueurōpājóm, Proto-Indo-European, ópitjom sindhueurōpājóm, Modern Indo-European, etc.


Part I

Language & Culture






Collection of texts and images adapted and organised by Carlos Quiles, with contributions by Fernando López-Menchero