Distribution of European Y-chromosome DNA (Y-DNA) haplogroups by region in percentage
Last update : February 2010 (Armenians, Azeris, Basques, Bashkirs, Bosnians, Cantabrians, Cypriots, Galicians, Kurds, Macedonians)
Human Y-chromosome DNA can be divided into genealogical groups sharing a common ancestor. These are called haplogroups. To find out which ancient ethnic group is associated with each haplogroup, see European Haplogroups: origins, geographic spread and relation to ethnic groups.
Note that figures are only indicative. Several sources were used, and averages were recalculated by merging the available data. Being approximations, numbers were rounded to the nearest 0.5%. Frequencies below 0.25% are indicated as 0%. A non-exhaustive list of the sources used for this page is available separately.
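The merging and rounding described above can be sketched as follows. This is only an illustration, not the page's actual procedure: it assumes that "merging" means pooling raw counts across studies (a sample-size-weighted average) and that values are rounded to the nearest 0.5% with halves rounded up. The function names and the example figures are hypothetical.

```python
import math


def pooled_frequency(studies):
    """Pool several studies into one percentage.

    `studies` is a list of (carriers, sample_size) pairs.
    Pooling raw counts is an assumption made for this sketch.
    """
    carriers = sum(c for c, _ in studies)
    total = sum(n for _, n in studies)
    if total == 0:
        raise ValueError("no samples to pool")
    return 100.0 * carriers / total


def round_to_half_percent(pct):
    """Round to the nearest 0.5%, so anything below 0.25% becomes 0."""
    return math.floor(pct * 2 + 0.5) / 2


# Hypothetical counts from two studies of the same region.
studies = [(30, 120), (55, 200)]
freq = pooled_frequency(studies)      # 85 / 320 = 26.5625%
print(round_to_half_percent(freq))    # 26.5
print(round_to_half_percent(0.2))     # 0.0
```

Pooling counts rather than averaging the per-study percentages keeps a small study from weighing as much as a large one.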
Note: the number in each ROW indicates a percentage of the population, relative to each haplogroup.
Notes
Turkey is the only country that includes a sizeable percentage of Asian and African haplogroups not listed in this table (A, ExE1b1b, C, H, L, O, R2) representing 8.5% of the total. Haplogroup L alone makes up 4% of the Turkish population.
The division of Italy is as follows: North Italy is everything as far south as Liguria and Emilia-Romagna; Central Italy comprises Tuscany, Marche, Umbria, Latium and Abruzzo; South Italy is everything else to the south, except Sardinia and Sicily, which have been made into separate categories due to their specific history and relative geographic isolation.
Our division of Germany is as follows: North Germany includes Schleswig-Holstein, Lower Saxony (plus Hamburg and Bremen) and Mecklenburg-Western Pomerania. West Germany is the Rhineland, Hesse and Saarland. South Germany is Baden-Württemberg and Bavaria. East Germany is composed of Brandenburg, Berlin, Saxony-Anhalt, Saxony and Thuringia.
The sample size for each country or region is at least 100. Italy, Germany, England and Ireland have over 2000 samples each, France and Spain over 1000, Portugal over 900, Belgium over 750, the Netherlands, Finland and Hungary over 650, Greece and Turkey over 500.
The percentages of haplogroups H1, H3 and U5 are given in addition to the total for H and U. This is useful to assess the proportion of Paleolithic European (Cro-Magnon) lineages, as opposed to later arrivals.
The "Other" category includes mostly the older haplogroups N, R, pre-HV and HV, but also occasionally a few African (L) or Asian haplogroups (A, B, C, D, M, Z).
The largest sample sizes in this database are for Germany (n = 2610), England (n = 1577), Scotland (n = 1413), Ireland (n = 1397), France (n = 878), Italy (n = 808), Norway (n = 703), Finland (n = 580), and Iceland (n = 511). Each country has at least 100 samples.
The origins of R1 remain unclear. On the one hand, there is a significant presence as far south as Central Africa, for example in Cameroon. Although this is generally seen as the result of back-migration from Eurasia, it has also been taken, especially in conjunction with high levels of R1* in Jordan, as an indication that R1 originated in the Middle East. On the other hand, looking at R1's relatives more generally, haplogroup R is part of the family of haplogroup P and is therefore a sibling clade of haplogroup Q, which is common in the Americas and, in Eurasia, is associated with eastern areas such as Siberia. Such information has been used to suggest an origin for R1 to the east of the Middle East. For example, Kivisild et al. (2003) argue that the evidence "suggests that southern and western Asia might be the source of this haplogroup". Referencing Kivisild et al., Soares et al. (2010) concluded in their review of the literature that the case for South Asian origins is strongest, with the Central Asian origin argued by Wells et al. (2001) also worthy of consideration.
Haplogroup R1 is fairly common throughout Europe, South Asia and Central Asia. It also occurs in Africa, the Near East and among Native Americans from North America, and at low frequencies in Siberia, the Malay Archipelago and among Indigenous Australians.
R1 is very common throughout all of Eurasia except East Asia and Southeast Asia. Its distribution is believed to be associated with the re-settlement of Eurasia following the Last Glacial Maximum. Its main subgroups are R1a (M420) and R1b (M343). R1b, especially its subclade R1b1a2 (R-M269), is the most common haplogroup in Western Europe and Bashkortostan, while R1a, especially its subclade R1a1a (R-M17 or R-M198), is the most common haplogroup in large parts of South Asia, Eastern Europe, Central Asia, Western China and South Siberia.
Individuals whose Y-chromosomes possess all the mutations on internal nodes of the Y-DNA tree down to and including M207 (which defines Haplogroup R) but which display neither the M173 mutation that defines Haplogroup R1 nor the M479 mutation that defines Haplogroup R2 are categorised as belonging to group R*. Haplogroup R* has been found in 10.3% (10/97) of a sample of Burusho and 6.8% (3/44) of a sample of Kalash from northern Pakistan.
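The rule above for assigning a sample to R*, R1 or R2 can be written as a small decision function. This is a minimal sketch under stated assumptions: the function name and the input format (a mapping from marker name to whether the derived state was observed) are hypothetical, and an untested marker is treated here as ancestral, which a real analysis would have to handle separately.

```python
def classify_r(snps):
    """Assign a Y chromosome to R*, R1 or R2 from SNP results.

    `snps` maps a marker name to True (derived) or False (ancestral).
    Returns None when M207 is not derived, i.e. the sample falls
    outside haplogroup R. Missing markers count as ancestral
    (an assumption made for this sketch).
    """
    if not snps.get("M207", False):
        return None
    if snps.get("M173", False):
        return "R1"
    if snps.get("M479", False):
        return "R2"
    return "R*"


print(classify_r({"M207": True, "M173": False, "M479": False}))  # R*
print(classify_r({"M207": True, "M173": True}))                  # R1

# The reported R* frequencies follow directly from the counts.
print(round(100 * 10 / 97, 1))  # 10.3 (Burusho)
print(round(100 * 3 / 44, 1))   # 6.8 (Kalash)
```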
Among Indigenous American groups, R1 is the most common haplogroup after Q, especially in North America: it is found at 79% among the Ojibwe, 62% among the Chipewyan, 50% among the Seminole, 47% among the Cherokee, 40% among the Dogrib and 38% among the Papago.
One isolated clade (or clades) of Y chromosomes that appear to belong to Haplogroup R1b1* (P25-derived) is found at high frequency among the native populations of northern Cameroon, such as the Kirdi, in west-central Africa, which is believed to reflect a prehistoric back-migration of an ancient proto-Eurasian population into Africa.
R1a and R1a1a are believed to have originated somewhere within Eurasia, most likely in the area from Eastern Europe to South Asia. Several recent studies have proposed that South Asia is the most likely region of origin. But on the other hand, as will be discussed below, some researchers continue to treat modern Indian R1a as being largely due to immigration from the Central Eurasian steppes or Southwestern Asia.
R1a has been found in high frequency at both the eastern and western ends of its core range, for example in India and Tajikistan on the one hand, and Poland on the other. Throughout all of these regions, R1a is dominated by the R1a1a (R-M17 or R-M198) sub-clade.
In South Asia, R1a1a has often been observed at high frequency in a number of demographic groups. The two main subclades of R1a1a are R1a1a* and R1a1a7. R1a1a7 is positive for M458, an SNP that separates it from the rest of R1a1a. This is significant because M458 is a European marker whose epicenter is Poland; it is rare in India.
In India, high percentages of this haplogroup are observed among West Bengal Brahmins (72%) in the east, Konkanastha Brahmins (48%) in the west, Khatris (67%) in the north and Iyengar Brahmins (31%) in the south. It has also been found in several South Indian Dravidian-speaking Adivasi groups, including the Chenchu (26%), the Valmikis of Andhra Pradesh and the Kallar of Tamil Nadu, suggesting that M17 is widespread among tribal southern Indians.
Besides these, studies show high percentages in regionally diverse groups such as Manipuris (50%) to the extreme North East and in Punjab (47%) to the extreme North West.
In Pakistan it is found at 71% among the Mohanna tribe in Sindh province in the south and at 46% among the Baltis of Gilgit-Baltistan in the north. In Sri Lanka, 13% of Sinhalese were found to be R1a1a (R-M17) positive.
Hindus of the Terai region of Nepal show it at 69%.
In Afghanistan, R1a1a (R-M17) is found at 51.02% among the Pashtuns (the largest ethnic group in Afghanistan) and 30.36% among the Tajiks, but it is less frequent among the Hazaras (6.67%) and the Turkic-speaking Uzbeks (17.65%).
R1a1 among other European haplogroups
In Europe, R1a, again almost entirely in the R1a1a sub-clade, is found at highest levels among peoples of Eastern European descent (Sorbs, Poles, Russians and Ukrainians; 50 to 65%). In the Baltic countries R1a frequencies decrease from Lithuania (45%) to Estonia (around 30%). Levels in Hungarians have been noted between 20 and 60%.
There is a significant presence in peoples of Scandinavian descent, with the highest levels in Norway and Iceland, where between 20 and 30% of men are in R1a1a. Vikings and Normans may also have carried the R1a1a lineage westward, accounting for at least part of the small presence in the British Isles. In East Germany, Haplogroup R1a averages between 20 and 30%, reaching a peak frequency of 31.3% in Rostock.
Haplogroup R1a1a was found at elevated levels in a sample of the Israeli population who identified themselves as Ashkenazi Jews, possibly reflecting gene flow into Ashkenazi populations from surrounding Eastern European populations over the course of centuries. This finding was apparently consistent with the latest SNP microarray analysis, which argued that up to 55 percent of the modern Ashkenazi genome is specifically traceable to Europe. Ashkenazim were found to have a significantly higher frequency of the R-M17 haplogroup. Behar reported R-M17 to be the dominant haplogroup in Ashkenazi Levites (52%), although it is rare in Ashkenazi Cohanim (1.3%) and Israelites (4%).
In Southern Europe, R1a1a is not common among the general population, but significant levels have been found in pockets, such as the Pas Valley in northern Spain, areas of Venice, and Calabria in Italy. The Balkans show lower frequencies, with significant variation between areas: for example, over 30% in Slovenia, Croatia and Greek Macedonia, but under 10% in Albania, Kosovo and parts of Greece.
The remains of a father and his two sons, from an archaeological site discovered in 2005 near Eulau (in Saxony-Anhalt, Germany) and dated to about 2600 BCE, tested positive for the Y-SNP marker SRY10831.2. The R1a1 clade was thus present in Europe at least 4600 years ago, in association with one site of the widespread Corded Ware culture.
Central and Northern Asia
R1a1a frequencies are patchy in Central Asia. This variation is possibly a consequence of population bottlenecks in isolated areas and the movements of Scythians in ancient times and later the Turco-Mongols.
High frequencies of R1a1a (R-M17 or R-M198; 50 to 70%) are found among the Ishkashimis, Khujand Tajiks, Panjakent Tajiks, Turkic-speaking Kyrgyz, and in several peoples of Russia's Altai Republic, but frequencies are lower (16 to 25%) among the Dushanbe Tajiks, Samarkand Tajiks, Yaghnobis and Shughnis.
Although levels are comparatively low amongst some Turkic-speaking groups (e.g. Turks, Azeris, Kazakhs, Yakuts), levels are high (19 to 28%) in certain Turkic or Mongolic-speaking groups of Northwestern China, such as the Bonan, Dongxiang, Salar, and Uyghurs.
In Eastern Siberia, R1a1a is found among certain indigenous ethnic groups, including Kamchatkans and Chukotkans, peaking at 22% among the Itel'man.
Middle East and Caucasus
R1a1a has been found, in various forms, in most parts of Western Asia, in widely varying concentrations: from almost no presence in areas such as Jordan to much higher levels in parts of Kuwait, Turkey and Iran.
The Shimar (Shammar) Bedouin tribe in Kuwait show the highest frequency in the Middle East at 43%.
Wells et al. (2001) noted that Iranians in the western part of the country show low R1a1a levels, while males in eastern parts of Iran carried up to 35% R1a. Nasidze et al. (2004) found R1a in approximately 20% of Iranian males from the cities of Tehran and Isfahan. Regueiro et al. (2006), in a study of Iran, noted much higher frequencies in the south than in the north.
Turkey also shows high but unevenly distributed R1a levels among some sub-populations. For example, Nasidze et al. (2005) found relatively high levels among two Kurdish groups of Turkey, the Kurmanji (13%) and Zazaki (26%).
Further to the north of these Middle Eastern regions, on the other hand, R1a levels start to increase in the Caucasus, once again unevenly. Several populations studied have shown no sign of R1a, while the highest levels so far discovered in the region appear to belong to speakers of the Karachay-Balkar language, among whom about one quarter of the men tested so far are in haplogroup R1a1a.
Possible place of origin: Southwest Asia
Descendants: R1b1a (R-P297), R1b1b (R-M335), R1b1c (R-V88)
Defining mutations:
1. M343 defines R1b in the broadest sense.
2. P25 defines R1b1, making up most of R1b, and is often used to test for R1b.
3. In some cases, major downstream mutations such as M269 are used to identify R1b, especially in regional or out-of-date studies.
Highest frequencies: Western Europe, Northern Cameroon, Hazara, Bashkirs
In human genetics, Haplogroup R1b is the most frequently occurring Y-chromosome haplogroup in Western Europe, parts of central Eurasia (for example Bashkortostan), and in parts of sub-Saharan Central Africa (for example around Chad and Cameroon). R1b is also present at lower frequencies throughout Eastern Europe, Western Asia, Central Asia, and parts of South Asia and North Africa. Due to European emigration it also reaches high frequencies in the Americas and Australia. While Western Europe is dominated by the R1b1a2 (R-M269) branch of R1b, the Chadic-speaking area in Africa is dominated by the branch known as R1b1c (R-V88). These represent two very successful "twigs" on a much bigger "family tree."
R1b1c is found in northern Cameroon in west central Africa at a very high frequency, where it is considered to be caused by a pre-Islamic movement of people from Eurasia.
Other studies, which did not test for the full range of new markers discovered by Cruciani et al., have also reported suggestive results that might belong to R-V88.
Wood et al. reported high frequencies of men who were P25 positive and M269 negative in the same northern Cameroon area where Cruciani et al. reported high R-V88 levels. However, they also found such cases among 3% (1/32) of Fante from Ghana, 9% (1/11) of Bassa from southern Cameroon, 4% (1/24) of Herero from Namibia, 5% (1/22) of Ambo from Namibia, 4% (4/92) of Egyptians, and 4% (1/28) of Tunisians.
Luis et al. found the following cases of men M173 positive (R1), but negative for M73 (R1b1b1), M269 (R1b1b2), M18 (R1b1a1, a clade with V88, M18 having been discovered before V88) and M17 (R1a1a): 1 of 121 Omanis, 3 of 147 Egyptians, 2 of 14 Bantu from southern Cameroon, and 1 of 69 Hutu from Rwanda.
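To compare these counts with the percentages quoted from Wood et al., the raw counts can be converted to frequencies. The snippet below is a simple worked calculation using only the counts given above; the dictionary layout and function name are illustrative choices, not anything taken from the cited study.

```python
# (carriers, sample size) as reported above for Luis et al.
counts = {
    "Omanis": (1, 121),
    "Egyptians": (3, 147),
    "Bantu (southern Cameroon)": (2, 14),
    "Hutu (Rwanda)": (1, 69),
}


def to_percent(carriers, sample_size):
    """Frequency as a percentage, rounded to one decimal place."""
    return round(100 * carriers / sample_size, 1)


for population, (carriers, sample_size) in counts.items():
    print(f"{population}: {to_percent(carriers, sample_size)}% "
          f"({carriers}/{sample_size})")
```

With samples as small as 14 men, a single additional carrier shifts the percentage by about seven points, so these figures are best read as rough indications.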
Pereira et al. (2010) in a study of several Saharan Tuareg populations, found one third of 31 men tested from near Tanut in Niger to be in R1b.
R1b1c1 is a sub-clade of R-V88 which is defined by the presence of SNP marker M18. It has been found only at low frequencies in samples from Sardinia and Lebanon.
The DNA tests that assisted in the identification of Czar Nicholas II of Russia found that he had haplogroup R1b.
Haplogroup K-M526 (Y-DNA)
K(xLT), formerly known as MNOPS, is the ancestral haplogroup to haplogroups K1, K2, K3, K4, M, NO, P (which contains haplogroups Q and R), and S. It possibly originated 35,000-45,000 years BP in South or Central Asia.
Haplogroup P-M45 (Y-DNA) is the parent of haplogroups P*, Q and R. It is believed to have arisen 27,000-41,000 years BP in Central or South Asia.
This haplogroup contains the patrilineal ancestors of most Europeans and almost all of the indigenous peoples of the Americas. It also contains approximately one third to two thirds of the males among various populations of Central Asia and Southern Asia.
Haplogroup R-M207 (Y-DNA)
In human population genetics, haplogroups define the major lineages of direct paternal (male) lines back to a shared common ancestor in Africa.
Haplogroup R-M207 is a Y-chromosome DNA haplogroup. It marks a major split in Paleolithic lineages: some descendant lines are common throughout Europe, Central Asia and South Asia, and also common in parts of West Asia and Africa, while others are primarily from West Asia and South Asia. This line is a descendant of haplogroup P-M45.
This haplogroup is believed to have arisen around 20,000-34,000 years ago (Karafet 2008), somewhere in Central Asia or South Asia, where its ancestor Haplogroup P-M45 is most often found at polymorphic frequencies (Wells 2001).
The two currently defined subclades are R-M173 and R-M479. Haplogroup R-M173 is estimated to have arisen during the height of the Last Glacial Maximum (LGM), about 18,500 years ago, most likely in southwestern Asia.
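The parent and child relationships described in this article can be held in a simple lookup table, which makes it easy to trace any clade back through its ancestors. This is a hedged sketch: it covers only the clades named in the text, labelled by their defining markers, and the names `PARENT`, `lineage` and `are_siblings` are hypothetical helpers rather than part of any standard tool.

```python
# Child -> parent, restricted to clades named in this article.
PARENT = {
    "P-M45": "K-M526",
    "Q-M242": "P-M45",
    "R-M207": "P-M45",
    "R-M173": "R-M207",  # R1
    "R-M479": "R-M207",  # R2
    "R-M420": "R-M173",  # R1a
    "R-M343": "R-M173",  # R1b
}


def lineage(haplogroup):
    """Return the clade followed by its ancestors, nearest first."""
    path = [haplogroup]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def are_siblings(a, b):
    """True when two distinct clades share the same direct parent."""
    return a != b and a in PARENT and PARENT.get(a) == PARENT.get(b)


print(lineage("R-M343"))
# ['R-M343', 'R-M173', 'R-M207', 'P-M45', 'K-M526']
print(are_siblings("R-M207", "Q-M242"))  # True
```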
Y-haplogroup R-M207 is found on all continents, and is fairly common throughout Europe, South Asia and Central Asia. Small frequencies are found in Malaysia, Indonesia, the Philippines, and among Indigenous Australians (Kayser 2003). It also occurs in the Caucasus, the Near East, West China, Siberia and some parts of Africa. It has a high frequency among Native Americans, due primarily to the introduction of Eurasian lineages in the last 500 years.
Haplogroup R-M173 (Y-DNA)
In human genetics, Haplogroup R-M173 is a Y-chromosome DNA haplogroup, a subgroup of haplogroup R, associated with the M173 mutation. It is dominated in modern populations by two Eurasian clades, R-M420 and R-M343, which together are found all over Eurasia except in Southeast Asia and East Asia. However, other types of R-M173, less well known and so far undefined by any identified SNP, and therefore referred to collectively as R-M173*, have been reported in the Americas, all over Asia and in Oceania.
In the Americas, it is not a pre-Columbian founding lineage. However, it is the second most common haplogroup among Indigenous peoples of the Americas, after haplogroup Q-M242, and is especially widespread among Algonquian peoples of the United States and Canada.
The origins of R-M173 remain unclear. Haplogroup R-M207 is part of the family of haplogroup P-M45 and is therefore a sibling clade of haplogroup Q-M242, which is common in the Americas and Eurasia. In Eurasia, Q-M242's geography includes eastern areas such as Siberia. Based on these related lineages, an origin for R-M173 to the east of West Asia has been inferred. For example, Kivisild 2003 argues that the evidence "suggests that southern and western Asia might be the source of this haplogroup" and that "Given the geographic spread and STR diversities of sister clades R1 and R2, the latter of which is restricted to India, Pakistan, Iran, and southern central Asia, it is possible that southern and western Asia were the source for R1 and R1a differentiation." Soares 2010 concluded in their review of the literature that the case for South Asian origins is strongest, with the Central Asian origin argued by Wells 2001 also worthy of consideration.
Haplogroup R-M173 is fairly common throughout Europe, South Asia and Central Asia. It also occurs in Africa, the Near East and among Native Americans from North America, and at low frequencies in Siberia, the Malay Archipelago and among Indigenous Australians.
Among Indigenous American groups, R-M173 is the most common haplogroup after the various Q-M242 lineages, especially in North America: it is found at 79% among the Ojibwe, 62% among the Chipewyan, 50% among the Seminole, 47% among the Cherokee, 40% among the Dogrib and 38% among the Papago. The decreasing gradient of haplogroup R-M207 from northeastern to southwestern North America is evidence that this results from European admixture.