I.3. Epipalaeolithic

Starting ca. 12000 BC with the first strong warming period after the last ice age, known as the Bølling-Allerød interstadial, a new migration replaced part of the European population, helped by the melting of the Alpine glacial wall that divided west and east Europe. Individuals associated with diverse Epipalaeolithic cultures (ca. 12000–5000 BC), including the transition to the Epigravettian in southern Europe and the Magdalenian–to–Azilian transition in western Europe, fall into a newly emerged Villabruna genomic cluster, which displaced previous hunter-gatherer populations.

The population ancestral to the Villabruna cluster separated from the ancestors of contemporary populations found in the Near East. It is during this time that western European hunter-gatherers become much more closely related to modern Near Easterners, which suggests that the new migration probably came from the Near East into Europe. The defining sample comes from an Epigravettian individual from Villabruna, Italy (ca. 13000 BC), of hg. R1b1-L754, and this lineage is also found in Loschbour (ca. 9775 BC), along with an individual of hg. I2-M438.

Other individuals in the cluster come from Bichon, Switzerland (ca. 11700 BC), hg. I2a1a1b1-L286; Loschbour, Luxembourg (ca. 6100 BC), hg. I2a1a2-M423; La Braña, Iberia (ca. 5865 BC), hg. C1a2-V20; and Körös, Hungary (ca. 5710 BC), hg. G-M201. Ancient individuals from France, Sicily, Croatia, and Germany share this ancestry, which suggests that the Villabruna cluster was widely distributed in Europe for at least six thousand years, and probably expanded from a south-eastern European refugium following the last Ice Age ca. 13000 BC (Mathieson et al. 2017).

Of the fifteen samples studied, four individuals from central and central-west Europe show a distinct component found in modern East Asians, most clearly Loschbour and La Braña. This indicates gene flow from a population related to modern-day East Asians into some groups of the Villabruna cluster (Fu et al. 2016), and supports the potential arrival of R1b1-L754 lineages from Asia associated with a male-biased migration of an eastern population.

Based on the most recent data of modern populations, an origin of the split into R1b-M343 and R1a-M420 is estimated ca. 20800 BC, with a TMRCA ca. 18400 BC for R1b-M343, and ca. 16200 BC for R1a-M420. The formation of R1b1-L754 is estimated ca. 16900 BC, with a TMRCA ca. 15100 BC, suggesting successive migration events, probably starting near Siberia in Asia, based on the Mal’ta sample of hg. R-M207.

Hunter-gatherers from the Iron Gates show the regional continuity of haplogroup R1b1-L754 (xR1b1a1-P297, xR1b1a1b-M269). These samples probably belonged to branches that have not survived in modern populations, and they cover an extensive period spanning from the first half of the 10th millennium to the first half of the 6th millennium BC, with the latest samples already showing Middle Eastern farmer ancestry (Mathieson et al. 2017; González-Fortes et al. 2017).

More individuals possibly related to these ancient branches are found later in Ukraine, in Iberia, and in the central European Neolithic at Quedlinburg, as R1b1-L754 (xR1b1a1b-M269) ca. 3590 BC (Haak et al. 2015). These samples, coupled with individuals of hg. R-M207 found in Ganj Dareh (Iranian Neolithic) in the first half of the 9th millennium BC, might suggest a southern Eurasian migration route for R1b1-L278 lineages, through the Iranian plateau.

The samples of basal R1b-M343* lineages in modern populations of southern Kazakhstan (Myres et al. 2011) and Iran (Grugni et al. 2012) give further support to the southern migration route into Europe. Basal R1b1-L278* lineages were found in five individuals (three Italians, one West Asian, one East Asian) out of 5,326 samples studied (Cruciani et al. 2010), which also points to a potential ancestral migration into Europe.

During the Bølling-Allerød interstadial, various divergent populations coexisted in Eurasia and Africa (Suppl. Fig. 3):

·         Epipalaeolithic Iberomaurusians: represented by samples from Taforalt (ca. 18000–8000 BC), of hg. E1b1b1a1-M78 (formed ca. 17600 BC, TMRCA ca. 11300 BC); they derive their ancestry from ANA (ca. 45%) and a mix of CWE (ca. 40%) and other “deep” ancestry (ca. 15%). They contributed mainly to Early Neolithic populations from Morocco, and also to the Natufian population.

·         Epipalaeolithic Natufians: represented by samples from the Raqefet Cave (ca. 11300–10800 BC), probably all of hg. E1b1b1a1-M78, they are a Levantine population of hunter-gatherers who lived in permanent dwellings and managed local wild plants. They show contribution from AME (ca. 73%), but also from ANA (ca. 27%), consistent with the spread of morphological features and artefacts into the Near East, as well as Y-chromosome haplogroup E.

·         Anatolian hunter-gatherers (AHG): represented by an individual from Pınarbaşı in Northern Anatolia (ca. 13350 BC), of hg. C1a2-V20, mtDNA K2b, whose ancestry descends mostly from AME (>95%), with small contributions from an ENA/ANE source, from Villabruna, and possibly from the Levant (Feldman et al. 2018). It contributed to Early Anatolian Neolithic populations.

·         West European hunter-gatherers (WHG): derived from CWE, they are represented by the Villabruna cluster in Europe. Because of their common root with AME (relative to which they lack BE-like contribution), they are thought to represent a population in or near Anatolia that expanded to central Europe, probably from a region near the Black Sea. This ancestry is possibly part of a large AME transitional cline that connected WHG and AHG during the Palaeolithic, since south-eastern European hunter-gatherers show extra Anatolian admixture, just as AHG shows small WHG admixture. It dominates most European hunter-gatherer populations until the arrival of the Neolithic ca. 6000 BC. Samples like Villabruna in northern Italy (ca. 12000 BC), OrienteC in Sicily (ca. 12000 BC), Bichon in Switzerland (ca. 11700 BC), Croatia Mesolithic (ca. 7200 BC), Loschbour in north-west Europe (ca. 6100 BC), La Braña 1 in north Iberia (ca. 5900 BC), or Körös in Hungary (ca. 5700 BC), all form a close WHG cluster spanning 6,000 years from the Atlantic façade to Sicily in the south and to the Balkan peninsula in the south-east (Mathieson et al. 2018).

·         ANE: represented in this late Upper Palaeolithic period by the Afontova Gora 3 sample from Lake Baikal, tentatively classified as of haplogroup Q1a-F1096 (formed ca. 24000 BC, TMRCA ca. 23900 BC) or possibly R1-M173 (formed ca. 26200 BC, TMRCA ca. 20800 BC). The creation of EHG ancestry (basically a WHG:ANE cline, see below) in Eastern Europe and the Caucasus was most likely associated with the westward migration of groups of ANE ancestry, probably mainly of hg. Q1a2-M25 (formed ca. 22400 BC, TMRCA ca. 14300 BC) and R1-M173 through North Eurasia. Samples of haplogroup Q-M242 found in a Baltic hunter-gatherer (ca. 6500 BC), and later in Eneolithic populations from the Caucasus, are likely remains of this early expansion. Modern-day Kets, Mansi, Native Americans, Nganasans and Yukaghirs show maximum ANE ancestry (Flegontov et al. 2016).

·         Ancient East Asians (AEA): represented by hunter-gatherers from the Early Neolithic in Lokomotiv and Shamanka (ca. 5200–4200 BC) near Lake Baikal, they show predominantly East Asian ancestry closely related to ancient individuals from the Devil’s Gate Cave (ca. 6000–5500 BC), and some ANE-related contribution (ca. 16%), representing thus another proxy for an ancestral ENA-like ancestry to compare with ANE (de Barros Damgaard, Martiniano, et al. 2018; Lazaridis et al. 2018; Sikora et al. 2018). They show one sample of haplogroup C2a1a1a-F3918 and other five probably N1a2-L666 (formed ca. 13900 BC, TMRCA ca. 6800 BC). The first appearance of AEA-related ancestry in Eastern Europe must have happened quite early, possibly later than the ANE expansion into Europe, and possibly associated with the spread of R1a1-M459 (formed ca. 16200 BC, TMRCA ca. 12000 BC) from Siberia into the Pontic–Caspian area.

·         Eastern hunter-gatherers (EHG): it can be modelled as an admixture of ANE (ca. 63%) with WHG (34%), with additional ancestry related to AEA (ca. 7%), but without additional AME contribution. It is represented by hunter-gatherers from eastern Europe, each with its specific contributions of different components, which suggests that it formed a quite stable east European cline between more WHG-like populations from the west and more ANE-like populations from the east. The presence of a Q1a2-M25 sample ca. 6500 BC in Zvejnieki points probably to the resurgence of this lineage that had spread ANE ancestry to eastern Europe, although R1a1-M459 lineages may have been involved in the creation of the cline, too. The first EHG sample is an individual from Sidelkino, from the Samara region (ca. 9300 BC), of mtDNA U5a2—with mtDNA U5 being a constant in prehistoric eastern European populations. The appearance of hg. J1-M267 in two early EHG samples from Karelia (ca. 6300 BC) may be related to an early expansion from the south, possibly of J1b1-Y6034 (formed ca. 12900 BC, TMRCA 9700 BC) from the Caucasus with ANE ancestry, creating the described cline of ANE:WHG:CHG ancestry (ca. 60:25:15) in these samples (Sikora et al. 2018).

·         Caucasus hunter-gatherers (CHG): represented by samples from Satsurblia (ca. 11300 BC, haplogroup J1-M304), and Kotias Klde (ca. 7800 BC, haplogroup J2a-M410), both sites near Dzudzuana. They show AME ancestry (ca. 56–64%) and a contribution of ENA/ANE-like populations, apart from a small contribution of “deep” ancestry.

·         Iran Neolithic (IN): represented by a Mesolithic child from the Belt Cave (ca. 12000–8000 BC), of hg. E1b1-P2, and individuals from Ganj Dareh in the Zagros Mountains (ca. 9000–8000 BC), of hg. R2-M479. They form an EHG:ANE cline similar to CHG, and thus likely form an ancestral CHG/IN population composed mainly of AME ancestry (ca. 50–58%), where IN shows a slightly higher contribution of ANE, with a statistically significant greater contribution of “deep” ancestry, likely from the south.

·         Ancient Palaeosiberians (AP): represented by the Kolyma1 individual (ca. 7800 BC), they derive their ancestry from a mixture of EEA and ANS ancestry similar to that found in Native Americans (but with greater EEA contribution, 75% vs. 63%), with a closer relation to Mal’ta than to Yana RHS. The divergence of AP/Native Americans and present-day East Asians (Han Chinese) is estimated to have happened ca. 22000 BC, with AP/Native Americans showing further contribution (ca. 18000 BC) related to ANS (Sikora et al. 2018).

·         Ancient Ancestral South Indian (AASI): hypothesised South Asian hunter-gatherer ancestry deeply related to present-day indigenous Andaman Islanders (Mallick et al. 2016), in particular the sampled Onge population, mainly of hg. D-M174 (Thangaraj et al. 2003), and mtDNA M2 and M4 (Reich et al. 2009; Moorjani et al. 2013).
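The approximate proportions quoted in the list above can be tabulated to compare the models side by side. The following is only a minimal sketch: the dictionary layout and the helper names (`total`, `normalise`, `dominant`) are illustrative assumptions, and the figures are the rounded "ca." values from the text, so a model need not add up to exactly 100% (the EHG values, for instance, sum to 104%).

```python
# Approximate ancestry proportions (in percent) as quoted in the list above.
# The structure and the helper names are hypothetical, for illustration only.
MODELS = {
    "Iberomaurusian": {"ANA": 45, "CWE": 40, "deep": 15},
    "Natufian": {"AME": 73, "ANA": 27},
    "EHG": {"ANE": 63, "WHG": 34, "AEA": 7},
    "EHG (Karelia)": {"ANE": 60, "WHG": 25, "CHG": 15},
}


def total(model):
    """Sum of the quoted proportions (may differ from 100 due to rounding)."""
    return sum(model.values())


def normalise(model):
    """Rescale the quoted proportions so that they add up to 100."""
    s = total(model)
    return {component: 100 * value / s for component, value in model.items()}


def dominant(model):
    """Name of the component with the largest quoted proportion."""
    return max(model, key=model.get)


for name, model in MODELS.items():
    print(f"{name}: total {total(model)}%, dominant component {dominant(model)}")
```

Rescaling with `normalise` is only a convenience for comparison; it does not replace the formal admixture modelling reported in the cited studies.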

i.3. Nostratians

Based on the most recent data of modern populations, an origin of the split into R1b-M343 and R1a-M420 is estimated ca. 20800 BC, with an expansion of R1b-M343 (TMRCA ca. 18400 BC), then R1b1-L278 (formed ca. 18400 BC, TMRCA ca. 16900 BC). The finding of intrusive haplogroup R1b1-L754 (formed ca. 16900 BC, TMRCA ca. 15100 BC) with a homogeneous WHG ancestry in Europe, and the consistent presence of mtDNA hg. U5b in different samples of the Villabruna cluster, support a male-biased migration coincident with the Bølling-Allerød interstadial.

The Mal’ta individual and the finding of the other main R1b-M343 subclade, R1b2-PH155 (TMRCA ca. 5200 BC), among early Xiongnu individuals in East Asia (and later accompanying Turkic peoples) support a split of R1b-L278 in eastern Europe or central Eurasia, which was most likely associated with the expansion of ANE and/or EHG ancestry from Asia, possibly in successive waves of expansion that also brought haplogroups Q1a2-M25 and R1a-M459 to the area.

This eastern origin may justify the presence of East Asian ancestry among some samples from the Villabruna cluster associated with expanding R1b1-L754 lineages. Whether R1b-M343 lineages traversed the Middle East or expanded from the Pontic–Caspian region into Europe is unclear, although the high variability of ancient subclades found to date in eastern Europe and the Caucasus supports the regions on both sides of the Urals as the most likely cradle of R1b-M343 expansions.

Sampled hunter-gatherers from south-eastern Europe show the long-term regional continuity of haplogroup R1b1-L754 (xR1b1a1-P297, xR1b1a1b-M269), found in the Iron Gates and also later in Mesolithic and Neolithic populations from Ukraine, the Balkans, Central Europe and Iberia. They have WHG (87%) and EHG (13%) ancestry, show mtDNA K1 and H (not present in WHG and EHG individuals), and many of these samples have been confirmed as of subclade R1b1b-V88 (formed ca. 15100 BC, TMRCA ca. 9700 BC), which must have split at the same time as R1b1-L754 was expanding into Europe. European R1b1b-V88 lineages thus cover an extensive period spanning from the first half of the 10th millennium to the first half of the 3rd millennium BC, and are found widespread from Iberia in the west to the north Pontic area in the east (González-Fortes et al. 2017; Mathieson et al. 2018).

The samples of basal R1b-M343* lineages in modern populations of southern Kazakhstan (Myres et al. 2011) and Iran (Grugni et al. 2012) give further support to an eastern origin in Central Asia. Basal R1b1-L278* lineages were found in five cases (three Italians, one West Asian, one East Asian) out of 5,326 studied (Cruciani et al. 2010), which also points to a potential ancestral migration into Europe. Nevertheless, population movements after their initial expansion may have obscured the original migration route, and an expansion through Anatolia cannot be excluded.

Tracing backwards potential Eurasiatic and Afroasiatic movements, and based on male-driven population expansions, the clearest link to an expanding Nostratic-speaking community is represented by the expansion of R1b1-L754 lineages, starting probably after ca. 16000 BC through the North Pontic area into south-eastern Europe, acquiring along the way the characteristic CWE-like ancestry of the Villabruna cluster.

The presence of R1b1b-V88 lineages widespread among European hunter-gatherers points to a likely early “southern Nostratisation” of Europe from east to west. The expansion of R1b1b-V88 subclades within Africa is most likely linked to the spread of Proto-Afroasiatic (see below §ii.4. Early Afrasians). The expansion of R1b1a1-P297 into north-east Europe, later emerging with post-Swiderian cultures, marks the clearest trace of the potential Eurasiatic expansion (see §ii.1. Eurasians). Even though the precise origin of expansions of R1b1-L754 subclades remains unclear, the regions surrounding the Pontic–Caspian area are the best candidates at this moment.

While the formation of hg. R2-M479 was quite early (ca. 26200 BC), its lineages probably survived somewhere in Asia until its successful expansion (based on its TMRCA ca. 14300 BC), and should probably be identified with the additional ENA/ANE contribution to Iranian Neolithic (and possibly CHG) ancestry, since they are found in samples from Ganj Dareh during the 9th millennium BC. Because one sample is R2a-M124 (formed ca. 14300 BC, TMRCA ca. 9600 BC), Iran Neolithic individuals are probably close descendants of this haplogroup’s successful expansion. Haplogroup R2a-M124 seems to be prevalent among ancient and modern Dravidians (see §viii.19. Dravidians and Indo-Aryans), and is also found in the Caucasus (Huang et al. 2017).

A connection of Dravidian with R1b-M343 is not straightforward, then, lacking fitting ancient DNA samples. Nevertheless, the likely initial expansion of R1b1-L754 lineages with ANE ancestry, as well as early expansions through the Caucasus or Turan, may have contributed to the development of other Nostratic communities in the Near East. Similarly, there is no clear connection between this haplogroup and Kartvelian, although the complex evolution of multiple small communities in the Caucasus probably allowed for many ethnolinguistic changes in the region, associated with different haplogroup expansions.

The expansion of R1b1a1-P297 lineages apparently associated with Eurasians (see below §ii.1. Eurasians) and the later emergence of R1b1a2-V1636 lineages (TMRCA ca. 4700 BC) in the Pontic–Caspian steppe region (see below §iv.2. Indo-Anatolians) support the expansion of their upper clade R1b1a-L388 (TMRCA ca. 13600 BC) from far eastern Europe, having separated (ca. 15100 BC), together with sister clade R1b1b-V88, from the ancestral R1b1-L754 trunk.

This early split of R1b1a-L388 may account for the separation of Eurasians from Pre-Kartvelians, who would have expanded close to the Caucasus with R1b1a2-V1636 lineages, while Eurasians expanded through the north. The early separation of R1b1b-V88 from the eastern European cradle of hg. R1b1-L754 and of ANE/EHG ancestry expansions would support a closer connection of ancestral Eurasiatic, Kartvelian, and possibly also Dravidian communities with each other than with Afroasiatic. The presence of basal R1b1a-L388 subclades in modern individuals from Turkey, Bulgaria, and Italy would also suggest eastern European routes of expansion for this lineage, rather than southern routes through the Caucasus or West Asia.

The timing of expansion and separation of these lineages from the common R1b-M343 trunk (Suppl. Graph. 19), coupled with known admixture events, fits some of the previously published ‘shape-shifting’ Nostratic macro-languages (Campbell 1998), and roughly matches the dates published with the help of language guesstimates coupled with archaeology (Beridze 2019) and statistical models (Pagel et al. 2013), which essentially predict an earlier separation of Dravidian, followed by that of Kartvelian, from the common Eurasiatic superfamily (Figure 4).

Figure 4. Rooted consensus phylogenetic tree of the Eurasiatic superfamily, with estimated dates of origin of the families and of the superfamily, by Pagel et al. (2013). P stands for proto, followed by the initials of the language family: PD proto-Dravidian, PK proto-Kartvelian, PU proto-Uralic, PIE proto–Indo-European, PA proto-Altaic, PCK proto–Chukchi-Kamchatkan, PIY proto–Inuit-Yupik. Consensus tree rooted using proto-Dravidian as the outgroup. The age at the root is 14.45 ± 1.75 kya (95% CI = 11.72–18.38 kya) or a slightly older 15.61 ± 2.29 kya (95% CI = 11.72–20.40 kya) if the tree is rooted with Proto-Kartvelian.
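The root ages in the caption of Figure 4 are given in kya, whereas the haplogroup estimates in the text are given in years BC. A minimal sketch of the conversion follows, assuming that "present" means roughly AD 2000; the function name `kya_to_bc` and the `reference_year` parameter are hypothetical, and the missing year zero is ignored because it is negligible at this precision.

```python
def kya_to_bc(kya, reference_year=2000):
    """Convert an age in thousands of years ago (kya) to an approximate year BC.

    reference_year is an assumption: 'present' is taken here as AD 2000.
    The absence of a year zero is ignored at this level of precision.
    """
    return round(kya * 1000) - reference_year


# Root ages quoted in the caption of Figure 4: (mean, lower 95% CI, upper 95% CI).
ROOT_AGES_KYA = {
    "rooted with Proto-Dravidian": (14.45, 11.72, 18.38),
    "rooted with Proto-Kartvelian": (15.61, 11.72, 20.40),
}

for rooting, (mean, low, high) in ROOT_AGES_KYA.items():
    print(
        f"{rooting}: ca. {kya_to_bc(mean)} BC "
        f"(95% CI ca. {kya_to_bc(high)}-{kya_to_bc(low)} BC)"
    )
```

Under this assumption the two root ages correspond to ca. 12450 BC and ca. 13610 BC, both of which fall after the ca. 16000 BC starting point proposed in the text for the expansion of R1b1-L754 lineages, although the confidence intervals are wide.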