Admixture analysis


Genetic admixture analysis studies the gene flow between populations that had previously been relatively isolated from one another. Since isolated populations develop linguistic differences relatively quickly, linguistic changes might be expected in a newly hybridised population[Jobling et al. 2014]. However, pidgin languages are quite rare, and often one language – usually that of the successful migrants – becomes the superstrate, and another the substrate.

On the other hand, language and culture are unlike a genome in several ways. While it is possible to estimate the admixture percentage contributed by any ancestral population, the reconstruction of ancestral languages and their identification with cultures require careful anthropological investigation. For admixture results to be meaningful, studied loci have to be correctly averaged (and samples should be as complete as possible); genetic drift and selection since admixture have to be taken into account (e.g. distant populations might show a higher differentiation from the original territory); and ancestral populations have to be correctly identified, including their number and precise alleles[Jobling et al. 2014]. Therefore, ancient DNA is best collected with the goal of testing specific hypotheses.
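
As a minimal sketch of why loci have to be averaged, consider the classical two-source model, in which the allele frequency of the hybrid population at each locus is expected to be p_H = m * p_A + (1 - m) * p_B. A single-locus estimate, m = (p_H - p_B) / (p_A - p_B), is unstable wherever the two sources barely differ, so the example below uses a least-squares estimate across loci, which weights each locus by how informative it is. The function name and the allele frequencies are invented for illustration; drift, selection, and misidentified source populations are deliberately ignored.

```python
import numpy as np

def admixture_proportion(p_hybrid, p_source_a, p_source_b):
    """Least-squares estimate of the share m contributed by source A,
    assuming p_hybrid = m * p_source_a + (1 - m) * p_source_b at every locus."""
    p_hybrid = np.asarray(p_hybrid, dtype=float)
    p_a = np.asarray(p_source_a, dtype=float)
    p_b = np.asarray(p_source_b, dtype=float)
    delta = p_a - p_b  # per-locus difference between the two sources
    if not np.any(delta):
        raise ValueError("sources are identical at all loci; m is undefined")
    m = np.sum((p_hybrid - p_b) * delta) / np.sum(delta ** 2)
    return float(np.clip(m, 0.0, 1.0))

# Invented allele frequencies at five loci (illustration only).
p_a = np.array([0.90, 0.10, 0.60, 0.30, 0.75])
p_b = np.array([0.20, 0.50, 0.55, 0.80, 0.25])
p_h = 0.7 * p_a + 0.3 * p_b  # hybrid built with a true m of 0.7

m_hat = admixture_proportion(p_h, p_a, p_b)
print(f"estimated contribution of source A: {m_hat:.2f}")
```

The estimate recovers the true proportion here only because the hybrid was constructed from exactly these two sources; with a wrong or incomplete set of sources the same formula still returns a number, which is the risk described above.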

Some linguists have used the biological foundations of phylogenetics to extrapolate questionable methods to linguistics, and have thus obtained questionable results[Gray and Atkinson 2003]. Similarly, scientists are using the available statistical means to study genetic admixture in modern human populations, extrapolating admixture mapping methods to scarce ancient human samples, and deriving simplistic, far-fetched conclusions. This paper demonstrates the need to include wide anthropological investigation of the historical context of the samples studied, including linguistics, archaeology, and cultural anthropology, as well as careful investigation of haplogroups, to obtain plausible explanations for the complex data obtained in human biology.

It has been proposed that the migration of Yamna pastoralists into already expanding Corded Ware groups[Wencel 2015] might have created the necessary environment for the spread of Indo-European languages. Previous mainstream models for Indo-European expansion, based on the “kurgan hypothesis”[Gimbutas 1977], associated the spread of Pre-Germanic (adopted on the Dniester) and Pre-Balto-Slavic (adopted on the middle Dnieper) with the expansion of Corded Ware cultures[Anthony 2007]. Given the lack of direct cultural connections between Yamna and the Corded Ware culture, this spread was explained in terms of either an incorporation of languages through centuries of interaction into Funnel Beaker cultures[Kristiansen 1989], or through the emulation of the language of Indo-European chiefs by Corded Ware cultures (beginning ca. 2700-2600 BC) for politico-religious reasons[Anthony 2007].

The component associated with certain Yamna samples is found elevated (up to 76%) in samples of the Corded Ware culture, which has been said to support the migration of Yamna populations into Corded Ware groups. The lower percentage of this component found in Bell Beaker and Únětice groups (50-70%) has been explained as a subsequent, less profound displacement process triggered by western and central European groups[Haak et al. 2015][Allentoft et al. 2015][Mathieson et al. 2015]. It has also been found that samples from the Globular Amphora culture do not show evidence of such steppe-related ancestry[Mathieson et al. 2017].

These limited results, apparently challenging archaeological interpretations previously considered established, are propagating quickly within the field of Indo-European studies. David W. Anthony has recently supported the appearance of the Corded Ware culture through the contacts of Yamna immigrants with indigenous people of the Globular Amphora culture in southern Poland[Anthony and Brown 2017], based on their previously known contacts and early dating. Similarly, Kristian Kristiansen has supported the dominance of Corded Ware in central Europe south and north of the Carpathians, asserting that their pottery was apparently shared later by the Bell Beaker culture[Kristiansen et al. 2017]. Many concerns have been raised about obtaining simplistic conclusions based on genetic results[Heyd 2017][Kristiansen et al. 2017]:


Modified file from recent papers on ancient samples from Eastern European, Southeastern European, Western European, and Bell Beaker cultures: Left: ADMIXTURE clustering analysis with k=8 showing ancient individuals. E/M/MLN, Early/Middle/Middle Late Neolithic; W/E/S/CHG, Western/Eastern/Scandinavian/Caucasus hunter-gatherers[Olalde et al. 2017]. Center: Supervised ADMIXTURE plot, modeling each ancient individual (one per row), as a mixture of populations represented by clusters containing Anatolian Neolithic (grey), Yamnaya from Samara (yellow), EHG (pink) and WHG (green). Dates indicate approximate range of individuals in each population[Mathieson et al. 2017]. Right: Ancestral components in ancient individuals estimated by ADMIXTURE (k=11)[Mittnik et al. 2017]. Original images under a CC-BY-NC 4.0 International license.

Yamna ancestry: CHG before, during, and after Chalcolithic migrations

Eastern samples

Samples from the Pontic-Caspian steppe – from which ‘steppe’ or ‘Yamna’ ancestry has been defined as a precise combination of EHG and CHG ancestry – are scarce, and the most recent ones mostly from one eastern region (Kalmykia). Because of that, east Yamna was considered the best-known proximate source for the incoming gene flow in Corded Ware samples. The exact source could have been another, yet unsampled, group of people closely related to them[Kristiansen et al. 2017], and a western or earlier (pre-Yamna) steppe population has been suggested as the potential missing link in the chain of transmission of steppe ancestry[Haak et al. 2015].

This is being confirmed by the finding of elevated steppe ancestry in samples from north Pontic steppe cultures unrelated to the Yamna culture[Mathieson et al. 2017]. This so-called ‘steppe’ ancestry defined by eastern Yamna samples has been found elevated in Late Sredni Stog, Corded Ware, Afanasevo, Andronovo, and Srubna cultures, and even a late individual of Bronze Age Bulgaria from Merichleri, ca. 1690 BC (of R1a1a1b2-Z93 lineage, probably related to the westward expansion of the Srubna culture). All of them show higher ‘steppe’ ancestry than some samples clearly identified as from the Yamna culture in Ukraine and in the Balkans, from more than a thousand years earlier to more than a thousand years after it[Mathieson et al. 2017].

All those cultures related to the Corded Ware culture cluster closely to Sredni Stog samples in PCA – but for Afanasevo, which clusters closely to eastern Yamna. On the other hand, ancestry components similar to western Yamna samples are found in a sample from Vučedol – which probably also descended directly from early western Yamna migrants – and in samples of the Eastern Bell Beaker culture.

Samples from central Balkans show in fact a relative increase in steppe ancestry later, during the Bronze Age – unlike west Europe and the southern Balkans, where ancient Indo-European languages were most likely spoken by that time. Furthermore, admixture analyses of modern populations show more steppe proportion in modern north-eastern European populations (including peoples probably speaking Finno-Ugric languages since the Corded Ware expansion) than in western European peoples that are known to have spoken Indo-European languages for millennia[1].

Most Corded Ware samples are late, almost coincident with the Bell Beaker expansion. Scarce or no samples have been published from potentially controversial areas – like the Contact Zone, north-eastern Europe, the Baltic and the Forest Zone, and western Yamna – during the most relevant periods, before and after the Chalcolithic expansions. Old samples (closer to admixture events) tend to show a higher range of variation, and could inform better of the real impact of migrations, while younger ones – depending on non-random mating processes, influenced by geographic structure or socioeconomic factors – may falsely show a relatively homogenous high or low ancestral contribution[Jobling et al. 2014], as is evident from the analysis of East Bell Beaker samples.

Detail of PCA analysis of free datasets including Minoans and Mycenaeans[Lazaridis et al. 2017], and Scythian and Sarmatian[Unterländer et al. 2017] samples. PC2 vs. PC1. The graphic has been arranged so that ancestries and samples are located in geographically friendly axes similar to north-south (Y), east-west (X). Symbols are used, in a simplified manner, in accordance with symbols for Y-DNA haplogroups used in the maps. Labels have been used for simplification. Areas are drawn surrounding Yamna/Poltavka, Corded Ware (including samples from Estonia, Battle Axe, and Poltavka outlier), and succeeding Sintashta and Andronovo cultures, as well as Bell Beaker. Corded Ware sample I0104, from Esperstedt, has also been labelled.

The oldest sample from Esperstedt (labelled I0104), a second-degree relative to the Esperstedt family[Monroy Kuhn, Jakobsson, and Günther 2017], has been found to cluster the closest to steppe samples, closer than any other Corded Ware sample, previous or posterior, or samples from eastern Corded Ware-derived cultures Sintashta, or Andronovo[Haak et al. 2015]. This, connected with the exogamy prevalent among Corded Ware peoples, and the nomadic nature of its culture, precludes a proper interpretation of the ancestry found in the family. A similar case is that of the Karsdorf Late Neolithic sample dated ca. 2470 BC (attributed tentatively to the Corded Ware culture, because of its burial position), which also clusters close to steppe samples. Both samples may be directly related to early East Bell Beaker samples, which show a variable, potentially quite high steppe component, and thus also to Balkan EBA samples.

In fact, PCA analysis reveals that most samples from the Corded Ware culture and related cultures – like those from Estonia, Swedish Battle Axe, Sintashta, Andronovo (and the Poltavka outlier, in contrast with Poltavka samples) – cluster close to Sredni Stog and central European populations rather than to steppe samples such as Yamna, Poltavka, and Afanasevo. Only one sample classified as from Latvian Late Neolithic / Corded Ware dated ca. 2885 BC shows more steppe component (and clusters closer to steppe samples) than others, which is compatible with the exogamy practiced by the Corded Ware population, and the contemporary expansion of Yamna along the Prut.

Western samples

Yamna migrants from the eastern zone (whose samples are used to define steppe ancestry) had migrated westward to the north Pontic area and beyond along the Danube at least twice: first in the formation process of the early Khvalynsk and Sredni Stog cultures, and later during the formation of the Yamna culture. While late Sredni Stog regions seem to have adopted a different culture than the developing Yamna to the east, potentially suggesting a different ethnolinguistic nature, Eneolithic samples from the region already show an elevated steppe component.

The northern Pontic area, from where Yamna migrants expanded west along the Danube, was a zone of interaction with peoples from the upper Danube region. The so-called ‘Yamna component’ is thus paradoxically found in lesser proportion in western Yamna samples. These include the so-called Yamna outlier from Ozera (ca. 3005 BC), one of only three samples from Ukraine; and one sample from a Yamna migrant in Mednikarovo (ca. 2960 BC). Both samples show Anatolian Neolithic ancestry, clustering closer to Balkan samples[Mathieson et al. 2017][Haak et al. 2015].


Modified from Mathieson et al. (2017). «Individuals projected onto axes defined by the principal components of 799 present-day West Eurasians (not shown in this plot for clarity, but shown in Extended Data Figure 1). Projected points include selected published individuals (faded colored circles, labeled) and newly reported individuals (other symbols; outliers shown by additional black circles). Colored polygons indicate the individuals that had cluster memberships fixed at 100% for the supervised admixture analysis [on the right]».

A simplistic assumption of recent genetic models, based on proportions of Yamna admixture, suggested that Yamna migration contributed to the expansion of the Corded Ware culture, and this in turn to the creation of the Bell Beaker culture[Haak et al. 2015], which has supported recent proposals of a direct evolution from the former to the latter[Kristiansen et al. 2017][Anthony and Brown 2017]. The presence of this component in the Sredni Stog culture, as well as the influence of the Corded Ware outlier from Esperstedt in assessing steppe ancestry of the Corded Ware culture as a whole – and the elevated steppe ancestry found recently in Bell Beaker samples from Hungary and western Europe –, demonstrate that this assumption is wrong.

The closer position of Bell Beaker samples to Yamna samples – closer than any other sample of the Corded Ware in PCA[Olalde et al. 2017] excluding the I0104 outlier –, as well as the different position of western Yamna samples and Vučedol, makes such a direct connection with Corded Ware migrants even more unlikely. Traditional models of Yamna and Bell Beaker expansion[Harrison and Heyd 2007][Heyd 2007][Heyd 2012], accompanied by the expansion of North-West Indo-European[Mallory 2013][Prescott 2012], therefore, seem to be sustained by genetic investigation.

Ancient and modern samples

It is known that the genetic isolation of Eurasian hunter-gatherers came to an end with the arrival of farming and pastoralism. This is seen in the evolution of Middle Eastern ancestries during the Neolithic and Chalcolithic[Lazaridis et al. 2016], and it is becoming clearer too with the genetic flow seen in eastern Europe during the Neolithic and Chalcolithic. Even though samples are scarce and distant, Neolithic individuals from Comb Ware (Zvejnieki), Late Khvalynsk (Samara), and Old Europe (Varna I, Smyadovo) cultures show a clear pattern towards lesser inter-group genetic distances, clearly seen in the appearance of CHG ancestry in steppe and steppe-related cultures, and in their convergence in PCA analysis[Mathieson et al. 2017].

Two female samples from Bohemia were misidentified as Bell Beaker[Allentoft et al. 2015], when they were in fact three millennia younger, from Czech Slavs[Mathieson et al. 2017]. PCA or Admixture did not (and cannot) show differences with Bell Beaker or Balkan samples, since parental populations need to be available, or else archaeological context is needed to define demographic models and potential ancestral populations, to ascertain their actual link to the so-called steppe ancestry. In fact, there is a clear north-south cline of steppe ancestry in modern populations, peaking in the Forest Zone, which mimics to some extent its geographic distribution after the Corded Ware and Yamna expansions[Haak et al. 2015], and thus also potentially to some extent a previous situation[Klejn et al. 2017].

Demographic issues

The migration of Pontic-Caspian steppe populations into Neolithic/Bronze Age central Europe has been argued to be strongly male-biased[Poznik et al. 2016], with a study suggesting up to 14 migrating males for every migrating female[Goldberg, Günther, et al. 2017], although at different rates for Corded Ware, Bell Beaker, and Únětice. The results of the latter study have been disputed[Lazaridis and Reich 2017], and this in turn contested by the original authors based on the impact of small, low-coverage ancient samples in admixture analyses[Goldberg, Gunther, et al. 2017]. This calls into question the accuracy of predictions made based on certain samples and methods used.

In terms of mtDNA, Bell Beaker and Corded Ware samples both show cases of common hunter-gatherer haplogroups U5, U2, or U4. Unlike Corded Ware, the Bell Beaker culture shows a higher proportion of haplogroup H[Brandt et al. 2013][Olalde et al. 2017], typical of western Europe, coinciding thus with its expansion to the west. Out of seventeen samples from the Corded Ware culture, four (all of Esperstedt) and one out of five of Sintashta include different J1c subclades, only found previously accompanying the expansion of Neolithic Middle Eastern farmers, including the Globular Amphora culture (in two out of nine samples).

The early Latvian Corded Ware sample (of ca. 2885 BC) is of mtDNA haplogroup U5a1b[Mathieson et al. 2017], a haplogroup found previously in four Sredni Stog samples from Deriivka (ca. 5150 BC). Later it is also found in a central European sample at Benzingerode – dated as of ca. 2275 BC, and dubiously classified as of the Bell Beaker culture, based only on the burial position[Haak et al. 2015] –, and in a sample of the Únětice culture from Przecławice (ca. 1790 BC).

Ascertaining global differences in demographic changes is especially important in light of an apparently mostly peaceful Yamna migration along the Danube[Heyd 2012], contrasting with the potentially violent and strong patrilocality shown by peoples of the Corded Ware cultures[Kristiansen et al. 2017]. Peoples of the expanding Corded Ware horizon were of nomadic and exogamic nature, keeping direct contacts with the steppe since its expansion.

Also relevant to the effect of an invading population are the population density prior to the invasions and the estimated increase in population after such expansions, both of which were greater in south-east Europe[Müller 2013]. There are therefore great potential differences in population admixture between the Corded Ware and the Bell Beaker cultures, roughly expanding westward to the north and south of the loess belt that had previously divided the expanding farmers from hunter-gatherers.

All these differences might have greatly influenced the genetic drift observed, and must be taken into account to make inferences about the actual origin and influence of the population involved in Corded Ware and Yamna/Bell Beaker migrations.

Technical issues

Shortcomings of methods used for the analysis of ancestral populations are usually not evident, and may affect any theory developed based solely on these methods.

Extraction techniques, analysis in different sequencing centres and compilation in different platforms, classification of poorly known individuals into cultures, and variability of radiocarbon dates obtained in different labs, are just a few of the many known issues involved in human evolutionary biology.

The scarcity of samples adds difficulty to the classic problem of characterisation of discrete population structure in the presence of continuous patterns of genetic differentiation. The “clines versus cluster” problem in modelling population genetic variation should be addressed taking into account geographical barriers[Bradburd, Coop, and Ralph 2017], which necessarily involves a detailed description of ancient geography, ecology, mobility, etc. for any period investigated.

Principal component analysis (PCA) is a variable-reduction technique, similar to exploratory factor analysis. It reduces a larger set of variables into a smaller set of artificial variables, called principal components (PC), which account for most of the variance in the original variables. The method assumes that there is a linear relationship between variables and that there is sampling adequacy: a precise number of cases is difficult to establish, but it is to be assumed that scarce, damaged samples of ancient DNA preclude an ideal sample size. Variables also need to have adequate correlations in order for them to be reduced to a smaller number of components, and there should be no significant outliers.

PCA of ancient DNA samples usually shows a large number of principal components, of which the ones most commonly selected for graphic analysis (PC1 and PC2) can usually explain (that is, their eigenvalues jointly account for) no more than 5-10% of the total variance, depending on the samples selected[2]. This is in line with the prediction that most eigenvalues of the theoretical covariance will be ‘small’, nearly equal, but is in contrast with the expectation that a few eigenvalues will be ‘large’, reflecting past demographic events[Patterson, Price, and Reich 2006].
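
The share of variance attributed to each component is simply its eigenvalue divided by the sum of all eigenvalues. The following is a minimal sketch with NumPy on simulated genotypes that carry no population structure at all (not real ancient DNA); the function name, sample size, and SNP count are assumptions chosen for illustration. It shows how, in the absence of strong structure, the eigenvalues come out nearly equal and the first two components explain only a small fraction of the variance.

```python
import numpy as np

def explained_variance_ratio(genotypes):
    """Share of total variance captured by each principal component.

    genotypes: (individuals x SNPs) array of allele counts (0, 1, 2)."""
    x = np.asarray(genotypes, dtype=float)
    x = x - x.mean(axis=0)              # centre every SNP column
    singular_values = np.linalg.svd(x, compute_uv=False)
    eigenvalues = singular_values ** 2  # proportional to the PC variances
    return eigenvalues / eigenvalues.sum()

rng = np.random.default_rng(0)
# Simulated, unstructured genotypes: 50 individuals, 2000 SNPs.
frequencies = rng.uniform(0.1, 0.9, size=2000)
genotypes = rng.binomial(2, frequencies, size=(50, 2000))

ratios = explained_variance_ratio(genotypes)
print(f"PC1 + PC2 explain {100 * ratios[:2].sum():.1f}% of the variance")
```

A two-dimensional plot of such data would still show a cloud of points with apparent neighbours, which is why the percentage of variance behind the plotted axes should always be reported alongside the plot.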


PCA analysis of dataset including Minoans and Mycenaeans[Lazaridis et al. 2017], and Scythian and Sarmatian samples[Unterländer et al. 2017]. Plots of different pairs of consecutive PCs, with symbols and corresponding samples. PC1 (3.756%) vs. PC2 (2.654%), PC3 (2.165%) vs. PC4 (2.146%), PC5 (2.129%) vs. PC6 (2.118%), and PC7 (2.114%) vs. PC8 (2.106%).
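
The percentages in this caption can be added up directly. The short calculation below only restates the reported figures, to show how much of the total variance lies in the plane usually plotted and in the first eight components together.

```python
# Percentages of variance reported in the caption above for PC1 to PC8.
pc_variance = [3.756, 2.654, 2.165, 2.146, 2.129, 2.118, 2.114, 2.106]

first_two = sum(pc_variance[:2])
first_eight = sum(pc_variance)
print(f"PC1 + PC2: {first_two:.3f}%")     # 6.410%
print(f"PC1 to PC8: {first_eight:.3f}%")  # 19.188%
```

More than four fifths of the variance therefore remains outside even the first eight components of this dataset.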

Theoretically, though, under a small number of ancestral populations, with small divergence among them, just two significant eigenvalues will exist, and selecting the two main significant axes of variation captures the most relevant information. On the other hand, ancestral populations – like certain African populations, and ancient hunter-gatherers – may show dozens of significant axes, whose meaning is unclear.

Analysis with STRUCTURE[Pritchard, Stephens, and Donnelly 2000] / ADMIXTURE[Alexander, Novembre, and Lange 2009] is, like PCA, mainly reported in genetic papers through graphical patterns. It is important to take into account that mixed ancestry in an individual can result – apart from genetic admixture of two isolated populations, which is the object of the study – from shared ancestry (coinheritance of more than one ancestry from the same parental source, incomplete differentiation), and also from assimilation.


Detail of unsupervised ADMIXTURE plot from k=3-6, 8, 10, and 12, on a dataset consisting of 318 ancient individuals, including Minoan and Mycenaean samples[Lazaridis et al. 2017]. The Corded Ware outlier of Esperstedt (I0104) and members of his family have been marked and labelled. Component colours have been used in accordance with those used for cultures in the maps.

The number of ancestral populations selected is based on cross-validation error estimation and graphical analysis. It uses, therefore, a combination of numerical and graphical methods in a similar way to the factor extraction in PCA, but less formally explored.
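
The numerical half of that procedure can be sketched in a few lines. The cross-validation errors below are invented for illustration (they are not taken from any of the cited studies), and the tolerance used to call two values of K "practically indistinguishable" is likewise an assumption; the point is only that the minimum is often shallow, so several values of K are nearly equally supported.

```python
# Hypothetical cross-validation errors per K (invented for illustration),
# of the kind ADMIXTURE reports when run with its --cv option.
cv_error = {2: 0.562, 3: 0.548, 4: 0.541, 5: 0.538, 6: 0.537, 7: 0.539, 8: 0.544}

# K with the lowest cross-validation error.
best_k = min(cv_error, key=cv_error.get)

# Assumed tolerance below which two errors are treated as equivalent.
tolerance = 0.0025
candidates = sorted(
    k for k, err in cv_error.items() if err - cv_error[best_k] <= tolerance
)

print(f"minimum CV error at K={best_k}")
print(f"K values within tolerance of the minimum: {candidates}")
```

With these made-up numbers the minimum falls at K=6, but K=5 and K=7 are within the tolerance, so the final choice among them would rest on the graphical inspection mentioned above.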

In a recent study, evidence supported the selection of 21 ancestries to delineate genetic structure of present-day human population[Baker, Rotimi, and Shriner 2017], although this is debatable[3]. It is unclear whether this ‘ideal’ number would be greater or lesser for ancestral, more isolated populations, and the lack of proper sampling precludes a proper selection of K. The usual small number of inferred ancestral components, selected to show ancestries in a simplified manner (K=2-6), may thus be too simplistic, although a K = ~8-11 appears to be a good range of components for the study of modern populations.

Therefore, the selection and naming of a population as ‘ancestral’ to another is indeed conventional, and can lead to error when its nature as approximate source or proxy among poorly investigated populations is not fully understood. With new results, the naming of certain ancestral populations may become obsolete, as more ancestral proxy populations are discovered.


«Cross-validation error as a function of the number of ancestral components K. The red symbol indicates the minimum cross-validation error, which occurs at K=21»[Baker, Rotimi, and Shriner 2017].

Formal tests to investigate whether mixture occurred, and to infer proportions and dates of mixture are relatively new, and include the three-population test, D-statistics, F4-ratio estimation, admixture graph fitting, and rolloff, included – among other tools – in the free software package ADMIXTOOLS[Patterson et al. 2012]. They are robust tools based on statistical methods, but each method is dependent on certain assumptions. So, for example, an estimation of mixing proportions in a three-population test, when phylogeny for the populations studied is incorrect, leaves such proportions without useful meaning. Even discussing mixing from an ancestral population, when an intermediate admixing event occurs, hardly makes sense.
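
To make the three-population test concrete, the following is a minimal sketch of the f3 statistic, f3(C; A, B) = mean over SNPs of (c - a)(c - b), where a, b, and c are allele frequencies; a clearly negative value is taken as evidence that C is a mixture of populations related to A and B. This is not the ADMIXTOOLS implementation: it omits the finite-sample bias correction and the block-jackknife standard error, and the allele frequencies are simulated under assumed parameters.

```python
import numpy as np

def f3_statistic(p_target, p_source1, p_source2):
    """Naive f3(target; source1, source2) = mean((c - a) * (c - b)) over SNPs.
    No finite-sample bias correction or block-jackknife standard error."""
    c = np.asarray(p_target, dtype=float)
    a = np.asarray(p_source1, dtype=float)
    b = np.asarray(p_source2, dtype=float)
    return float(np.mean((c - a) * (c - b)))

rng = np.random.default_rng(1)
# Simulated allele frequencies for two diverged source populations.
a = rng.uniform(0.05, 0.95, size=10_000)
b = np.clip(a + rng.normal(0.0, 0.1, size=a.size), 0.0, 1.0)
# A 50/50 mixture of both sources, and a population that only drifted from a.
mixed = 0.5 * a + 0.5 * b
unmixed = np.clip(a + rng.normal(0.0, 0.1, size=a.size), 0.0, 1.0)

print("f3(mixed; a, b)   =", f3_statistic(mixed, a, b))
print("f3(unmixed; a, b) =", f3_statistic(unmixed, a, b))
```

The mixed population yields a negative value and the drifted one a positive value. A non-negative f3 does not rule out admixture, since drift after the mixing event can mask the signal, and the interpretation in either case depends on the assumed relationship among the three populations.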

Cognitive bias, conflicts of interest, contextual bias

In the academic community, prestige, access to grants, and even jobs depend on getting articles published in journals of high impact factor. These journals prefer short articles, mainly based on mathematical methods (preferably with reference to improvements in such methods), groundbreaking conclusions, and self-important titles, with a tendency to “culture-historicism”.

Pressure to publish means also pressure to gather, analyse and interpret the data. However, knowledge and expertise in gathering genetic data from archaeological remains does not mean expertise in statistics and computer science. Statistical knowledge does not qualify one to infer conclusions based on results either, unless one has some previous knowledge of the anthropological subjects involved. Otherwise, researchers concerned with fieldwork and statistical methods are exposed, during the interpretation of results, to the risks of circular reasoning and confirmation bias, by searching only for anthropological information that might fit their results. In this sense, a clear trend can be observed in recent publications, whereby wide-ranging conclusions in genetic papers tend to become outdated in very short periods, as new samples become available.

For the general population, SNP investigation offers a simple view of one’s own paternal line, that a thousand years (or ca. 30 generations) ago would represent a 1,000,000,000th of one’s own genealogical tree; four or five thousand years ago, its contribution to a personal ethnolinguistic definition is non-existent. This, together with the perceived complexity of (and lack of familiarity with) intricately linked anthropological disciplines, has made human ancestry investigation quite popular among amateur geneticists, who can easily play with published open source software programs and free aDNA datasets, due to their accessibility. However, the correct use of these programs needs much more than just knowing how to apply certain commands to some data. The quest for one’s own personal and national “ethnic proportion”, often as part of pre-existing simplistic ethnolinguistic beliefs and socio-political agendas, is also being promoted by commercial genetic testing companies to sell their products, in what would certainly be a reason for Kossinna’s smile today.
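
The figure of one billionth follows from the doubling of ancestor positions in each generation, as the short calculation below shows. The generation count is the approximate one given in the text, and the positions counted are slots in the genealogical tree rather than distinct individuals, since the same ancestor occupies many slots.

```python
generations = 30                    # ca. 1,000 years, as stated in the text
ancestor_slots = 2 ** generations   # positions in the genealogical tree
paternal_line_share = 1 / ancestor_slots

print(f"{ancestor_slots:,} ancestor slots")                    # 1,073,741,824
print(f"paternal line share: {paternal_line_share:.2e}")       # 9.31e-10
```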


  • [Allentoft et al. 2015] Allentoft, Morten E., Martin Sikora, Karl-Goran Sjogren, Simon Rasmussen, Morten Rasmussen, Jesper Stenderup, Peter B. Damgaard, Hannes Schroeder, Torbjorn Ahlstrom, Lasse Vinner, Anna-Sapfo Malaspinas, Ashot Margaryan, Tom Higham, David Chivall, Niels Lynnerup, Lise Harvig, Justyna Baron, Philippe Della Casa, Pawel Dabrowski, Paul R. Duffy, Alexander V. Ebel, Andrey Epimakhov, Karin Frei, Miroslaw Furmanek, Tomasz Gralak, Andrey Gromov, Stanislaw Gronkiewicz, Gisela Grupe, Tamas Hajdu, Radoslaw Jarysz, Valeri Khartanovich, Alexandr Khokhlov, Viktoria Kiss, Jan Kolar, Aivar Kriiska, Irena Lasak, Cristina Longhi, George McGlynn, Algimantas Merkevicius, Inga Merkyte, Mait Metspalu, Ruzan Mkrtchyan, Vyacheslav Moiseyev, Laszlo Paja, Gyorgy Palfi, Dalia Pokutta, Lukasz Pospieszny, T. Douglas Price, Lehti Saag, Mikhail Sablin, Natalia Shishlina, Vaclav Smrcka, Vasilii I. Soenov, Vajk Szeverenyi, Gusztav Toth, Synaru V. Trifanova, Liivi Varul, Magdolna Vicze, Levon Yepiskoposyan, Vladislav Zhitenev, Ludovic Orlando, Thomas Sicheritz-Ponten, Soren Brunak, Rasmus Nielsen, Kristian Kristiansen, and Eske Willerslev. 2015. Population genomics of Bronze Age Eurasia. Nature 522 (7555):167-172.
  • [Anthony 2007] Anthony, D. 2007. The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World. Princeton and Oxford: Princeton University Press.
  • [Anthony and Brown 2017] Anthony, D. W., and D. R. Brown. 2017. Molecular Archaeology and Indo-European linguistics: Impressions from new data. In Usque ad Radices: Indo-European Studies in Honour of Birgit Anette Olsen, edited by B. Simmelkjær, S. Hansen, A. Hyllested, A. R. Jørgensen, G. Kroonen, J. H. Larsson, B. N. Whitehead, T. Olander and T. M. Søborg. Copenhagen: Museum Tusculanum Press.
  • [Bradburd, Coop, and Ralph 2017] Bradburd, Gideon, Graham Coop, and Peter Ralph. 2017. Inferring Continuous and Discrete Population Genetic Structure Across Space. bioRxiv.
  • [Brandt et al. 2013] Brandt, G., W. Haak, C. J. Adler, C. Roth, A. Szecsenyi-Nagy, S. Karimnia, S. Moller-Rieker, H. Meller, R. Ganslmeier, S. Friederich, V. Dresely, N. Nicklisch, J. K. Pickrell, F. Sirocko, D. Reich, A. Cooper, K. W. Alt, and Consortium Genographic. 2013. Ancient DNA reveals key stages in the formation of central European mitochondrial genetic diversity. Science 342 (6155):257-61.
  • [Gimbutas 1977] Gimbutas, Marija. 1977. The first wave of Eurasian pastoralists into Copper Age Europe. JIES 5 (4):277-338.
  • [Goldberg, Günther, et al. 2017] Goldberg, Amy, Torsten Günther, Noah A. Rosenberg, and Mattias Jakobsson. 2017. Ancient X chromosomes reveal contrasting sex bias in Neolithic and Bronze Age Eurasian migrations. Proceedings of the National Academy of Sciences:201616392.
  • [Goldberg, Gunther, et al. 2017] Goldberg, Amy, Torsten Gunther, Noah A Rosenberg, and Mattias Jakobsson. 2017. Reply To Lazaridis And Reich: Robust Model-Based Inference Of Male-Biased Admixture During Bronze Age Migration From The Pontic-Caspian Steppe. bioRxiv.
  • [Gray and Atkinson 2003] Gray, R. D., and Q. D. Atkinson. 2003. Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426 (6965):435-439.
  • [Haak et al. 2015] Haak, W., I. Lazaridis, N. Patterson, N. Rohland, S. Mallick, B. Llamas, G. Brandt, S. Nordenfelt, E. Harney, K. Stewardson, Q. Fu, A. Mittnik, E. Banffy, C. Economou, M. Francken, S. Friederich, R. G. Pena, F. Hallgren, V. Khartanovich, A. Khokhlov, M. Kunst, P. Kuznetsov, H. Meller, O. Mochalov, V. Moiseyev, N. Nicklisch, S. L. Pichler, R. Risch, M. A. Rojo Guerra, C. Roth, A. Szecsenyi-Nagy, J. Wahl, M. Meyer, J. Krause, D. Brown, D. Anthony, A. Cooper, K. W. Alt, and D. Reich. 2015. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522 (7555):207-11.
  • [Heyd 2012] Heyd, Volker. 2012. Yamnaya groups and tumuli west of the Black Sea. Travaux de la Maison de l'Orient et de la Méditerranée. Série recherches archéologiques 58 (1):535-555.
  • [Heyd 2017] Heyd, Volker. 2017. Kossinna's smile. Antiquity 91 (356):348-359.
  • [Jobling et al. 2014] Jobling, M. A., E. Hollox, M. Hurles, T. Kivisild, and C. Tyler-Smith. 2014. Human Evolutionary Genetics. Second edition. New York and Abingdon: Garland Science, Taylor & Francis Group.
  • [Jones et al. 2017] Jones, Eppie R., Gunita Zarina, Vyacheslav Moiseyev, Emma Lightfoot, Philip R. Nigst, Andrea Manica, Ron Pinhasi, and Daniel G. Bradley. 2017. The Neolithic Transition in the Baltic Was Not Driven by Admixture with Early European Farmers. Current Biology.
  • [Klejn et al. 2017] Klejn, Leo S., Wolfgang Haak, Iosif Lazaridis, Nick Patterson, David Reich, Kristian Kristiansen, Karl-Göran Sjögren, Morten Allentoft, Martin Sikora, and Eske Willerslev. 2017. Discussion: Are the Origins of Indo-European Languages Explained by the Migration of the Yamnaya Culture to the West? European Journal of Archaeology:1-15.
  • [Kristiansen 1989] Kristiansen, Kristian. 1989. Prehistoric Migrations - the Case of the Single Grave and Corded Ware Cultures. Journal of Danish Archaeology 8 (1):211-225.
  • [Kristiansen et al. 2017] ^ 1 2 3 4 5 Kristiansen, Kristian, Morten E. Allentoft, Karin M. Frei, Rune Iversen, Niels N. Johannsen, Guus Kroonen, Łukasz Pospieszny, T. Douglas Price, Simon Rasmussen, Karl-Göran Sjögren, Martin Sikora, and Eske Willerslev. 2017. Re-theorising mobility and the formation of culture and language among the Corded Ware Culture in Europe. Antiquity 91 (356):334-347.
  • [Lazaridis et al. 2016] ^ Lazaridis, I., D. Nadel, G. Rollefson, D. C. Merrett, N. Rohland, S. Mallick, D. Fernandes, M. Novak, B. Gamarra, K. Sirak, S. Connell, K. Stewardson, E. Harney, Q. Fu, G. Gonzalez-Fortes, E. R. Jones, S. A. Roodenberg, G. Lengyel, F. Bocquentin, B. Gasparian, J. M. Monge, M. Gregg, V. Eshed, A. S. Mizrahi, C. Meiklejohn, F. Gerritsen, L. Bejenaru, M. Bluher, A. Campbell, G. Cavalleri, D. Comas, P. Froguel, E. Gilbert, S. M. Kerr, P. Kovacs, J. Krause, D. McGettigan, M. Merrigan, D. A. Merriwether, S. O'Reilly, M. B. Richards, O. Semino, M. Shamoon-Pour, G. Stefanescu, M. Stumvoll, A. Tonjes, A. Torroni, J. F. Wilson, L. Yengo, N. A. Hovhannisyan, N. Patterson, R. Pinhasi, and D. Reich. 2016. Genomic insights into the origin of farming in the ancient Near East. Nature 536 (7617):419-24.
  • [Lazaridis et al. 2017] ^ 1 2 3 Lazaridis, Iosif, Alissa Mittnik, Nick Patterson, Swapan Mallick, Nadin Rohland, Saskia Pfrengle, Anja Furtwängler, Alexander Peltzer, Cosimo Posth, Andonis Vasilakis, P. J. P. McGeorge, Eleni Konsolaki-Yannopoulou, George Korres, Holley Martlew, Manolis Michalodimitrakis, Mehmet Özsait, Nesrin Özsait, Anastasia Papathanasiou, Michael Richards, Songül Alpaslan Roodenberg, Yannis Tzedakis, Robert Arnott, Daniel M. Fernandes, Jeffery R. Hughey, Dimitra M. Lotakis, Patrick A. Navas, Yannis Maniatis, John A. Stamatoyannopoulos, Kristin Stewardson, Philipp Stockhammer, Ron Pinhasi, David Reich, Johannes Krause, and George Stamatoyannopoulos. 2017. Genetic origins of the Minoans and Mycenaeans. Nature 548 (7666):214-218.
  • [Lazaridis and Reich 2017] ^ Lazaridis, Iosif, and David Reich. 2017. Failure to Replicate a Genetic Signal for Sex Bias in the Steppe Migration into Central Europe. bioRxiv.
  • [Mallory 2013] ^ Mallory, J.P. 2013. The Indo-Europeanization of Atlantic Europe. In Celtic From the West 2: Rethinking the Bronze Age and the Arrival of Indo-European in Atlantic Europe, edited by J. T. Koch and B. Cunliffe. Oxford: Oxbow Books.
  • [Mathieson et al. 2015] ^ Mathieson, I., I. Lazaridis, N. Rohland, S. Mallick, N. Patterson, S. A. Roodenberg, E. Harney, K. Stewardson, D. Fernandes, M. Novak, K. Sirak, C. Gamba, E. R. Jones, B. Llamas, S. Dryomov, J. Pickrell, J. L. Arsuaga, J. M. de Castro, E. Carbonell, F. Gerritsen, A. Khokhlov, P. Kuznetsov, M. Lozano, H. Meller, O. Mochalov, V. Moiseyev, M. A. Guerra, J. Roodenberg, J. M. Verges, J. Krause, A. Cooper, K. W. Alt, D. Brown, D. Anthony, C. Lalueza-Fox, W. Haak, R. Pinhasi, and D. Reich. 2015. Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528 (7583):499-503.
  • [Mathieson et al. 2017] ^ 1 2 3 4 5 6 7 8 Mathieson, Iain, Songül Alpaslan Roodenberg, Cosimo Posth, Anna Szécsényi-Nagy, Nadin Rohland, Swapan Mallick, Iñigo Olalde, Nasreen Broomandkhoshbacht, Olivia Cheronet, Daniel Fernandes, Matthew Ferry, Beatriz Gamarra, Gloria González Fortes, Wolfgang Haak, Eadaoin Harney, Ben Krause-Kyora, Isil Kucukkalipci, Megan Michel, Alissa Mittnik, Kathrin Nägele, Mario Novak, Jonas Oppenheimer, Nick Patterson, Saskia Pfrengle, Kendra Sirak, Kristin Stewardson, Stefania Vai, Stefan Alexandrov, Kurt W. Alt, Radian Andreescu, Dragana Antonović, Abigail Ash, Nadezhda Atanassova, Krum Bacvarov, Mende Balázs Gusztáv, Hervé Bocherens, Michael Bolus, Adina Boroneanţ, Yavor Boyadzhiev, Alicja Budnik, Josip Burmaz, Stefan Chohadzhiev, Nicholas J. Conard, Richard Cottiaux, Maja Čuka, Christophe Cupillard, Dorothée G. Drucker, Nedko Elenski, Michael Francken, Borislava Galabova, Georgi Ganetovski, Bernard Gely, Tamás Hajdu, Veneta Handzhyiska, Katerina Harvati, Thomas Higham, Stanislav Iliev, Ivor Janković, Ivor Karavanić, Douglas J. Kennett, Darko Komšo, Alexandra Kozak, Damian Labuda, Martina Lari, Catalin Lazar, Maleen Leppek, Krassimir Leshtakov, Domenico Lo Vetro, Dženi Los, Ivaylo Lozanov, Maria Malina, Fabio Martini, Kath McSweeney, Harald Meller, Marko Menđušić, Pavel Mirea, Vyacheslav Moiseyev, Vanya Petrova, T. Douglas Price, Angela Simalcsik, Luca Sineo, Mario Šlaus, Vladimir Slavchev, Petar Stanev, Andrej Starović, Tamás Szeniczey, Sahra Talamo, Maria Teschler-Nicola, Corinne Thevenet, Ivan Valchev, Frédérique Valentin, Sergey Vasilyev, Fanica Veljanovska, Svetlana Venelinova, Elizaveta Veselovskaya, Bence Viola, Cristian Virag, Joško Zaninović, Steve Zäuner, Philipp W. Stockhammer, Giulio Catalano, Raiko Krauß, David Caramelli, Gunita Zariņa, Bisserka Gaydarska, Malcolm Lillie, Alexey G. Nikitin, Inna Potekhina, Anastasia Papathanasiou, Dušan Borić, Clive Bonsall, Johannes Krause, Ron Pinhasi, and David Reich. 2017. The Genomic History Of Southeastern Europe. bioRxiv.
  • [Mittnik et al. 2017] ^ Mittnik, Alissa, Chuan-Chao Wang, Saskia Pfrengle, Mantas Daubaras, Gunita Zariņa, Fredrik Hallgren, Raili Allmäe, Valery Khartanovich, Vyacheslav Moiseyev, Anja Furtwängler, Aida Andrades Valtueña, Michal Feldman, Christos Economou, Markku Oinonen, Andrejs Vasks, Mari Tõrv, Oleg Balanovsky, David Reich, Rimantas Jankauskas, Wolfgang Haak, Stephan Schiffels, and Johannes Krause. 2017. The Genetic History of Northern Europe. bioRxiv.
  • [Müller 2013] ^ Müller, J. 2013. Demographic traces of technological innovation, social change and mobility: from 1 to 8 million Europeans (6000-2000 BCE). In Environment and subsistence – forty years after Janusz Kruk’s „Settlement studies…” (= Studien zur Archäologie in Ostmitteleuropa / Studia nad Pradziejami Europy Środkowej 11), edited by S. Kadrow and P. Włodarczak. Rzeszów, Bonn: Mitel & Verlag Dr. Rudolf Habelt.
  • [Olalde et al. 2017] ^ 1 2 3 Olalde, Iñigo, Selina Brace, Morten E. Allentoft, Ian Armit, Kristian Kristiansen, Nadin Rohland, Swapan Mallick, Thomas Booth, Anna Szécsényi-Nagy, Alissa Mittnik, Eveline Altena, Mark Lipson, Iosif Lazaridis, Nick J. Patterson, Nasreen Broomandkhoshbacht, Yoan Diekmann, Zuzana Faltyskova, Daniel M. Fernandes, Matthew Ferry, Eadaoin Harney, Peter de Knijff, Megan Michel, Jonas Oppenheimer, Kristin Stewardson, Alistair Barclay, Kurt W. Alt, Azucena Avilés Fernández, Eszter Bánffy, Maria Bernabò-Brea, David Billoin, Concepción Blasco, Clive Bonsall, Laura Bonsall, Tim Allen, Lindsey Büster, Sophie Carver, Laura Castells Navarro, Oliver Edward Craig, Gordon T. Cook, Barry Cunliffe, Anthony Denaire, Kirsten Egging Dinwiddy, Natasha Dodwell, Michal Ernée, Christopher Evans, Milan Kuchařík, Joan Francès Farré, Harry Fokkens, Chris Fowler, Michiel Gazenbeek, Rafael Garrido Pena, María Haber-Uriarte, Elżbieta Haduch, Gill Hey, Nick Jowett, Timothy Knowles, Ken Massy, Saskia Pfrengle, Philippe Lefranc, Olivier Lemercier, Arnaud Lefebvre, Joaquín Lomba Maurandi, Tona Majó, Jacqueline I. McKinley, Kathleen McSweeney, Mende Balázs Gusztáv, Alessandra Modi, Gabriella Kulcsár, Viktória Kiss, András Czene, Róbert Patay, Anna Endródi, Kitti Köhler, Tamás Hajdu, João Luís Cardoso, Corina Liesau, Michael Parker Pearson, Piotr Włodarczak, T. Douglas Price, Pilar Prieto, Pierre-Jérôme Rey, Patricia Ríos, Roberto Risch, Manuel A. Rojo Guerra, Aurore Schmitt, Joël Serralongue, Ana Maria Silva, Václav Smrčka, Luc Vergnaud, João Zilhão, David Caramelli, Thomas Higham, Volker Heyd, Alison Sheridan, Karl-Göran Sjögren, Mark G. Thomas, Philipp W. Stockhammer, Ron Pinhasi, Johannes Krause, Wolfgang Haak, Ian Barnes, Carles Lalueza-Fox, and David Reich. 2017. The Beaker Phenomenon And The Genomic Transformation Of Northwest Europe. bioRxiv.
  • [Poznik et al. 2016] ^ Poznik, G. David, Yali Xue, Fernando L. Mendez, Thomas F. Willems, Andrea Massaia, Melissa A. Wilson Sayres, Qasim Ayub, Shane A. McCarthy, Apurva Narechania, Seva Kashin, Yuan Chen, Ruby Banerjee, Juan L. Rodriguez-Flores, Maria Cerezo, Haojing Shao, Melissa Gymrek, Ankit Malhotra, Sandra Louzada, Rob Desalle, Graham R. S. Ritchie, Eliza Cerveira, Tomas W. Fitzgerald, Erik Garrison, Anthony Marcketta, David Mittelman, Mallory Romanovitch, Chengsheng Zhang, Xiangqun Zheng-Bradley, Goncalo R. Abecasis, Steven A. McCarroll, Paul Flicek, Peter A. Underhill, Lachlan Coin, Daniel R. Zerbino, Fengtang Yang, Charles Lee, Laura Clarke, Adam Auton, Yaniv Erlich, Robert E. Handsaker, The 1000 Genomes Project Consortium, Carlos D. Bustamante, and Chris Tyler-Smith. 2016. Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences. Nature Genetics 48 (6):593-599.
  • [Prescott 2012] ^ Prescott, Christopher. 2012. No longer north of the Beakers. Modeling an interpretative platform for third millennium transformations in Norway. In Background to Beakers: Inquiries in Regional Cultural Backgrounds of the Bell Beaker Complex, edited by H. Fokkens and F. Nicolis. Leiden: Sidestone Press.
  • [Saag et al. 2017] Saag, Lehti, Liivi Varul, Christiana Lyn Scheib, Jesper Stenderup, Morten E Allentoft, Lauri Saag, Luca Pagani, Maere Reidla, Kristiina Tambets, Ene Metspalu, Aivar Kriiska, Eske Willerslev, Toomas Kivisild, and Mait Metspalu. 2017. Extensive farming in Estonia started through a sex-biased migration from the Steppe. bioRxiv.
  • [Unterländer et al. 2017] ^ 1 2 Unterländer, Martina, Friso Palstra, Iosif Lazaridis, Aleksandr Pilipenko, Zuzana Hofmanová, Melanie Groß, Christian Sell, Jens Blöcher, Karola Kirsanow, Nadin Rohland, Benjamin Rieger, Elke Kaiser, Wolfram Schier, Dimitri Pozdniakov, Aleksandr Khokhlov, Myriam Georges, Sandra Wilde, Adam Powell, Evelyne Heyer, Mathias Currat, David Reich, Zainolla Samashev, Hermann Parzinger, Vyacheslav I. Molodin, and Joachim Burger. 2017. Ancestry and demography and descendants of Iron Age nomads of the Eurasian Steppe. Nature Communications 8:14615.
  • [Wencel 2015] ^ Wencel. 2015. An Absolute Chronological Framework for the Central-Eastern European Eneolithic. Oxford Journal of Archaeology 34 (1):33-43.


  1. Haak et al. 2015, fig. 3
  2. This is a guesstimate based on the limited experience of the author with free datasets.
  3. Iosif Lazaridis (Twitter, 3/9/2017) criticises the choice of K=21 as a “minimum”, as well as the concept of “mixed ancestry” defined as possessing more than one of the K=21 components in an admixture analysis over ~19k SNPs.