Late Indo-European



After 4000 BC, different groups were formed in the steppes. In the west, late Sredni Stog and “Post-Mariupol” (“Extended-Position-Grave”) communities, the heirs of the western early Sredni Stog clans, remained in contact with Trypillian villagers, and some assimilation seems to have happened east of the Dnieper ca. 3700-3500 BC. These contacts are supported by the steppe-related ancestry found in a Trypillian individual ca. 3700 BC from the Verteba Cave, of G2a-P15 lineage[Mathieson et al. 2017].

In the east, early Khvalynsk gave way to late Khvalynsk and Repin societies in the Volga-Don region, whose language is to be associated with a common Late Proto-Indo-European[Anthony 2007]. The split of R1b1a1a2-M269 into the eastern R1b1a1a2a2-Z2103 subclade must have happened early – possibly during the previous westward expansion of early Khvalynsk clans (of R1b1a1a2-M269 and R1b1a1a2a-L23* lineages) in and outside of the Pontic-Caspian steppes, given the similar forming date (ca. 4200 BC) and TMRCA (ca. 4100 BC). The earliest aDNA samples of haplogroup R1b1a1a2a2-Z2103 are three individuals found in the late Khvalynsk area in Lopatino I ca. 3000 BC, Ishkinovka I ca. 3000 BC, and Peshany V[1] ca. 2985 BC[Haak et al. 2015]. All samples from the Samara region are either R1b1a1a2a2-Z2103 or older lineages, except for one R1b1a1a2a-L23 (xR1b1a1a2a2-Z2103, xR1b1a1a2a1-L51) at Lopatino II dated ca. 3000 BC[Haak et al. 2015], which suggests a differentiation of R1b1a1a2a-L23 into its subclades near this region.

Haplogroup R1b1a1a2a1-L51 (formed ca. 4200 BC, TMRCA ca. 3900 BC), given its distribution in western Europe, is hypothesized to have expanded with some success during the common Yamna (“Pit Grave”) period of the Pontic-Caspian steppe cultures, but later and more marginally than R1b1a1a2a2-Z2103 groups.

Given the lack of aDNA from the Western Yamna horizon, and the later westward expansion of R1b1a1a2a2-Z2103 lineages, it is probably safest to assume a western location of R1b1a1a2a1-L51 lineages within Yamna. It would have formed a community with R1b1a1a2a2-Z2103, but somehow separated culturally from it, and thus the two main dialects of Late Proto-Indo-European may have developed separately.

Graeco-Aryan (probably including at least Greek, Armenian, and Indo-Iranian) has been argued to be a dialect continuum or a linguistic community in which a number of common innovations were shared at an early time[Mallory and Adams 2007][West 2007]. North-West Indo-European – including Italic, Celtic, Germanic, and Balto-Slavic – has been proposed as a group of closely related dialects with some form of shared linguistic history, presumably about the 3rd to 2nd millennium BC, after the initial dispersal of the Indo-European languages but before the emergence of the individual language groups in Europe[Oettinger 1997][Oettinger 2003][Adrados 1998][Mallory and Adams 2007][Mallory 2013][Beekes 2011]. Tocharian would have been part of this group at an earlier stage, forming a ‘Northern’ Indo-European group – so called because of their later migrations, in contrast with the ‘Southern’ or Graeco-Aryan Indo-European dialects[Adrados 1998][Mallory and Adams 2007][West 2007]. In light of the most likely distribution of both dialects during the common Yamna period, the names ‘Western’ and ‘Eastern’ Late Proto-Indo-European would probably be more appropriate.

Both linguistic communities thus remained in close contact, and are probably to be located in the eastern Don-Volga steppes, spreading across the Pontic-Caspian steppes after about 3300 BC[Anthony 2007]. Because of their later expansion, their division could be speculatively traced back to the early division of Volga-Don groups: the western, Don-based Repin culture, and the eastern, Volga-based late Khvalynsk culture.

The westward and eastward expansion of the Repin culture about 3300 BC is associated with the rapid diffusion of the Yamna horizon across the Pontic-Caspian steppes, and a common, “disintegrating Indo-European”[Bomhard 2015] must have been spoken in this common period. In this stage laryngeals were already unstable, and had possibly already undergone the first common phase of laryngeal loss, leaving the traditionally reconstructed long and short vowels[Szemerényi 1967]. A single laryngeal[Polomé 1987] would have remained mainly in compounds with sonorants, whose later dialectal evolution is controversial[Adrados, Bernabé, and Mendoza 2010][Clackson 2007].

Figure: Diachronic map of Eneolithic migrations ca. 4000-3100 BC [Anthony 2007][Szmyt 2013][Piezonka 2015], Uni-Köln.


  1. Additional information from Sergey Malyshev at the FTDNA R1 Basal Subclades project reads Z8129/Y12537 (equivalent to Z2103).