Because of its strategic geographic location, Tunisia has been a crossroads of multiple civilizations and a point of contact for many human migrations during successive historical periods. At present, both Arabic- and Berber-speaking populations live in Tunisia. Arab origins in Tunisia can be traced back to the Islamic expansions of the 7th century AD, but the origin of the Berbers, both within and outside of Tunisia, is a matter of debate. Berbers are commonly considered in situ descendants of local Palaeolithic and/or Mesolithic populations, and archaeological data point to a human occupation of North Africa since at least 45,000 years ago, as attested by the Aterian industry (Garcea and Giraudi, 2006) and subsequent cultures until the Neolithic (the adoption of agriculture and husbandry), which began around 5,500 years ago in the region (Camps, 1974; Camps, 1982). As a result, the current Tunisian population is composed of an ancient Berber background together with influences from the different civilizations settled in this region in historical times: Phoenicians (814–146 B.C.), Romans (146 B.C.–439 A.D.), Vandals (439–534 A.D.), and Byzantines (534–647 A.D.). By the end of the 7th century, Arab armies migrated from the Arabian Peninsula to expand the Islamic religion and Arabic language into North Africa, and the 11th century saw an important movement of Arab populations within North Africa, mostly Bedouins (Murdock, 1959; Hiernaux, 1975). These later migrations within Africa had a strong cultural influence on Berber customs, religion, and language (Ibn Khaldoun, 1968). The only populations that escaped this influence were some tribes that were forced back to the mountains or settled in remote villages located in the north and south of Tunisia and maintained their tribal structures (e.g., Jradou, Sened, Matmata, and Chenini–Douiret).
In more recent times, the most important population movement from Iberia to North Africa resulted from the expulsion of the “Moors” residing in Andalusia (in southern Spain) after Granada's capitulation (in 1492 A.D.). Tunisia received a large number of immigrants, estimated to have been over 80,000 “Andalusians” (Abdul-Wahab, 1917), who settled in the north of the country in dispersed, small isolated villages like Zaghouan village. “Andalusian” villages were similar to ancient Arabic towns in Spain (Latham, 1957), recognized in the many archaeological and cultural imprints, such as the architecture of houses and mosques (Marcias, 1942; Saadani, 1990), Arab–Andalusian music (Marzouki, 1994), marital clothes, and culinary traditions (DeEpalza, 1980; Skhiri, 1968; Skhiri, 1969). Nowadays, “Andalusian” families are often identifiable by their surnames, such as Zbiss, Marco, and Blanco, which are very different from Arabic ones (Marcias, 1942).
Many studies have attempted to describe the genetic structure of Tunisian populations using different autosomal markers: the GM and KM allotypes (Chaabani et al., 1984; Loueslati et al., 2001; Fadhlaoui-Zid et al., 2004a), HLA class II polymorphisms (Guenounou et al., 2006; Fadhlaoui-Zid et al., 2010), autosomal short tandem repeats (STRs) (Bosch et al., 2001a; Khodjet-El-Khil et al., 2008), and polymorphic Alu insertions (Ennafaa et al., 2006; Frigi et al., 2010). These results have suggested a certain inter-population diversity in Tunisia compared with Europe and sub-Saharan Africa, with migrations from both neighboring regions but with a greater Eurasian contribution. Concerning uniparental markers, mitochondrial DNA analyses have been performed on different populations from Tunisia (Berbers, Arabs, and Andalusians), showing that Tunisian Berber groups present a remarkable genetic inter-population diversity (Fadhlaoui-Zid et al., 2004b; Cherni et al., 2005a; Frigi et al., 2006a; Loueslati et al., 2006). Small effective sizes, founder effects, and isolation processes followed by genetic drift have been postulated as the main factors contributing to the current differentiation of these Berber samples. Compared with Arab and Andalusian groups in Tunisia, Berbers are genetically more heterogeneous, and the maternal lineages found in Tunisians (including Berbers and non-Berbers) are the result of admixture from Eurasian, sub-Saharan, and autochthonous North African components. Although the Eurasian maternal component is the most frequent in all Tunisian populations, variations of sub-Saharan African traces in their gene pool could be partly responsible for this considerable diversity. In addition, these studies have pointed to the mosaic structure of Tunisian populations with an absence of ethnic, linguistic, and geographic effects (Fadhlaoui-Zid et al., 2004b).
In contrast with the abundant mtDNA data in Tunisia, the data for the Y-chromosome are scarce. Previous studies have focused on Y-STR polymorphisms in several Tunisian groups, namely Berbers (Cherni et al., 2005a; Khodjet-el-Khil et al., 2005; Frigi et al., 2006a), Arabs (Cherni et al., 2005a; Khodjet-el-Khil et al., 2005), Andalusians (Cherni et al., 2005a), individuals from Sfax (Ayadi et al., 2006), or the general Tunisian population (Brandt-Casadevall et al., 2003; Arredi et al., 2004; Cherni et al., 2005a). Unfortunately, only two studies of Y-chromosome haplogroup structure within Tunisia have been carried out. These studies, based on relatively low phylogenetic resolution, were restricted to Jerba Island (Khodjet-el-Khil et al., 2005) and the general population (Arredi et al., 2004). Nevertheless, despite the known sub-Saharan origin of Jerbans, the paternal lineage results revealed a quite homogeneous pattern of diversity among Tunisians with a high frequency of the autochthonous North African haplogroup E-M81 (E1b1b1b, following the nomenclature by Karafet et al., 2008). However, up to now, no high-resolution Y-chromosome analyses in well-defined Tunisian ethnic groups (Berbers, Arabs, and Andalusians) have been carried out.
In order to fill this void and to unravel the genetic structure of male lineages in Tunisian ethnic groups, we have performed the first high resolution Y-chromosome lineage analysis of Berber-speaking isolates from the north and south of Tunisia (Jradou, Sened, and Chenini–Douiret), Andalusians from Zaghouan, and Cosmopolitan Arabs from Tunis. In addition, 17 Y-STR loci were typed to determine the lineage diversity of these ethnic groups.
The aim of this work is to evaluate the possible population diversity of Tunisians, previously described for the mtDNA, and to analyze the genetic relationships between different ethnic groups. Overall, these comparisons should give us a better understanding of the Tunisian paternal genetic structure and in turn help us to understand the intricate history of this region.
MATERIALS AND METHODS
Blood samples were collected from 159 unrelated healthy men belonging to the following communities: an urban sample of Arab speakers from the capital Tunis (n = 33); one “Andalusian” community (Zaghouan, n = 32); and four small Berber communities, two of which were pooled (Sened, n = 35; Jradou, n = 32; Chenini–Douiret, n = 27). Samples from Chenini and Douiret, two neighboring villages 20 km apart, were pooled in the analyses. Their geographic location is shown in Figure 1. The sampling in the “Andalusian” community was based on a patronymic criterion, whereas Berbers were Berber Chelha speakers born in one of the four villages mentioned above. Genealogical information was acquired to avoid paternal relatedness for at least three generations. All subjects were volunteers and consented to participate in this study by donating a blood sample according to the ethical standards of the institutions involved. Genomic DNA was extracted using a standard phenol-chloroform method. The DNA concentration of all samples was normalized to 1 ng/μl after quantification with the Quantifiler™ Human DNA Quantification Kit (Applied Biosystems) on the ABI 7500 Real-time PCR System (Applied Biosystems).
DNA amplification of 17 Y-specific STR loci (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS385a, DYS385b, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, and Y-GATA H4) was performed using the AmpFlSTR®YFiler™ PCR Amplification kit (Applied Biosystems) according to manufacturer's instructions. DNA fragment separation and detection was achieved in an ABI Prism 3130 Genetic Analyzer (Applied Biosystems). Genescan500 LIZ was utilized as an internal size standard. Amplicon sizes were determined using the GeneMapper v3.2 software.
Biallelic marker genotyping
Forty-two Y-chromosome binary genetic markers were genotyped hierarchically using TaqMan® probes (Applied Biosystems). Information about primer sequences, polymorphic positions, and haplogroup nomenclature can be found in Karafet et al., 2008. All the samples were first analyzed for the biallelic markers (M168, M89, M9, M45, M207, M174, M96, M201, M69, M170, M304, and M20), which classify samples within the main branches of the Y-chromosome phylogeny (YCC 2002; Karafet et al., 2008). The reaction volume was set to 5 μl, containing 1 μl of DNA template, 0.25 μl of 20× SNP (single nucleotide polymorphism) Genotyping Assay, 1.25 μl of sterile-filtered water, and 2.5 μl of TaqMan® Universal PCR Master Mix, and was amplified under standard conditions. For the remaining biallelic markers, the following hierarchical genotyping scheme was used. Samples derived for M96 were tested for P147, M75, P177, M33, P2, M215, M35, M78, M81, M123, M224, M107, and M165. Samples derived for M267 and ancestral for M172 were tested for M62, M365, M390, P56, P58, M367, M368, and M369. One individual derived for the marker M172 was tested for M410, M12, M47, M67, M68, M92, M137, and M158. One individual derived for M201 was screened with the M285, P287, and P15 markers. All these samples were genotyped using the same polymerase chain reaction (PCR) conditions described above. After the PCR amplification, alleles were discriminated with the SDS Software™ v2.3 (Applied Biosystems).
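The hierarchical scheme above can be summarized as a lookup from a triggering marker to its follow-up panel. The following Python sketch is illustrative only: the names `FOLLOW_UP` and `markers_to_type` are hypothetical, and because the text does not state when M267 and M172 themselves were typed, the function simply takes the set of markers already found in the derived state as input.

```python
# Follow-up panels keyed by the marker whose derived state triggers them,
# transcribed from the genotyping scheme described in the text.
FOLLOW_UP = {
    "M96": ["P147", "M75", "P177", "M33", "P2", "M215", "M35", "M78",
            "M81", "M123", "M224", "M107", "M165"],
    "M267": ["M62", "M365", "M390", "P56", "P58", "M367", "M368", "M369"],
    "M172": ["M410", "M12", "M47", "M67", "M68", "M92", "M137", "M158"],
    "M201": ["M285", "P287", "P15"],
}


def markers_to_type(derived):
    """Return the follow-up markers for one sample.

    `derived` is the collection of markers already observed in the
    derived state for that sample (an assumed input format).
    """
    derived = set(derived)
    panel = []
    for trigger, markers in FOLLOW_UP.items():
        if trigger not in derived:
            continue
        # The M267 panel applies only to samples ancestral for M172.
        if trigger == "M267" and "M172" in derived:
            continue
        panel.extend(markers)
    return panel


if __name__ == "__main__":
    # A sample derived for M201 is screened with three further markers.
    print(markers_to_type({"M168", "M89", "M201"}))
```

A sample with no triggering marker in the derived state receives an empty panel, which mirrors the idea that only the relevant branch of the phylogeny is explored further.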
Summary statistics [haplotype diversity (HD), mean number of pairwise differences (MPD), population-pairwise genetic distances FST (for Y-SNP haplogroups) and RST (for Y-STR haplotypes)] were calculated using the Arlequin Software ver. 3.5 (Excoffier et al., 2005). Haplogroup frequencies were estimated by direct counting.
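As an illustration of two of these statistics, the sketch below computes HD with Nei's unbiased estimator, n/(n − 1) × (1 − Σp²), and MPD as the average number of loci that differ between pairs of haplotypes. It is not Arlequin's implementation; the function names and the toy haplotypes are hypothetical, and counting differing loci (rather than repeat-size differences) is an assumption about the distance used.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def mean_pairwise_differences(haplotypes):
    """Average number of differing loci over all pairs of samples."""
    pairs = list(combinations(haplotypes, 2))
    diffs = [sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs]
    return sum(diffs) / len(pairs)


if __name__ == "__main__":
    # Toy data (hypothetical): four samples typed at three loci.
    toy = [(14, 13, 30), (14, 13, 30), (14, 12, 30), (15, 13, 29)]
    print(round(haplotype_diversity(toy), 4))
    print(mean_pairwise_differences(toy))
```

Applied to haplogroup labels instead of STR haplotypes, the same diversity formula gives the haplogroup diversity (HgD); a sample carrying a single haplogroup yields a value of zero.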
Genetic structure was examined by performing two sets of analyses of molecular variance (AMOVA; Y-STR and Y-SNP) using the Arlequin software ver. 3.5 (Excoffier et al., 2005). The molecular distances between haplogroups were considered by taking into account the number of mutational steps among haplogroups.
Genetic relationships among Tunisian populations were analyzed by two approaches. The first approach was based on Y-STR genetic distances (RST) calculated between our samples and seven other native Tunisian populations collected from the literature (Supporting Information Table 3). Genetic distances were estimated from haplotypes of twelve Y-STR loci (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS385a, DYS385b, DYS437, DYS438, and DYS439) with the Arlequin software ver. 3.5 (Excoffier et al., 2005) and represented in a Multidimensional Scaling (MDS) plot. The second approach was based on absolute haplogroup frequencies, including the seven populations used in the first approach, for which haplogroups were predicted using Athey's haplogroup predictor (http://www.hprg.com/hapest5/index.html) (Athey and Nordtvet, 2005). In addition, five samples without STR profiles were added in this second analysis: the three ethnic groups from Jerba Island, namely Berber, Arab, and sub-Saharan (Khodjet-el-Khil et al., 2005), and two general Arab-speaking populations (Brandt-Casadevall et al., 2003; Arredi et al., 2004). To make comparisons reliable, haplogroup frequencies were normalized to the equivalent phylogenetic haplogroup resolution level. Haplogroup frequencies were then used to construct a Correspondence Analysis (CA) plot. Both analyses (MDS and CA) were performed with the SPSS statistical package 17.0 (SPSS).
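The normalization to an equivalent resolution level can be pictured as collapsing each haplogroup label onto the deepest clade that all compared data sets resolve. The Python sketch below is one possible way to do this, not the procedure actually used; `collapse`, `normalized_counts`, and the `shared_levels` list are hypothetical, and the prefix matching relies on YCC-style names in which a sub-clade's name extends that of its parent.

```python
from collections import Counter


def collapse(haplogroup, shared_levels):
    """Map a haplogroup label to the deepest shared level containing it."""
    name = haplogroup.rstrip("*")  # treat paragroups such as F* as F
    matches = [lvl for lvl in shared_levels if name.startswith(lvl)]
    if not matches:
        return "other"
    return max(matches, key=len)


def normalized_counts(samples, shared_levels):
    """Absolute haplogroup counts after collapsing to the shared levels."""
    return Counter(collapse(h, shared_levels) for h in samples)


if __name__ == "__main__":
    # Hypothetical resolution shared by all compared data sets.
    shared_levels = ["E1b1b1", "J1", "J2", "F", "G"]
    samples = ["E1b1b1b", "E1b1b1a", "E1b1b1*", "J1e", "J2a5", "G2a", "F*"]
    print(normalized_counts(samples, shared_levels))
```

Counts obtained this way are directly comparable across studies typed at different depths, at the cost of discarding the finer sub-haplogroup information.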
Median-joining networks of STR haplotypes for the M81-E1b1b1b and P58-J1e haplogroups were constructed after processing the data with the reduced median network method using the NETWORK software package available at http://www.fluxus-engineering.com (Bandelt et al., 1999). The network generated for the five Tunisian populations reported in this study is based on 16 of the Y-STR loci typed. The times to the most recent common ancestor for the E1b1b1b-M81 and J1e-P58 clades were calculated from STR variances. Mean STR variance was estimated as proposed by Kayser et al. (2001) and transformed into divergence time using a mean STR mutation rate of 0.00069 per generation of 25 years (Zhivotovsky et al., 2004). For these estimations, locus DYS385 was excluded, and the repeats of DYS389I were subtracted from those of DYS389II so that its diversity was not considered twice.
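The conversion from STR variance to time can be sketched as follows. This is a minimal illustration that uses the mutation rate and generation time quoted above; the function names are hypothetical, and taking the population variance around the mean repeat count at each locus is an assumption, since the exact estimator of Kayser et al. (2001) is not reproduced here.

```python
from statistics import mean, pvariance

MUTATION_RATE = 0.00069  # per locus per generation (value quoted in the text)
GENERATION_YEARS = 25    # years per generation (value quoted in the text)


def prepare(haplotype):
    """Drop DYS385 and replace DYS389II by DYS389II minus DYS389I."""
    h = {k: v for k, v in haplotype.items() if not k.startswith("DYS385")}
    if "DYS389I" in h and "DYS389II" in h:
        h["DYS389II"] = h["DYS389II"] - h["DYS389I"]
    return h


def years_from_variance(mean_variance):
    """Convert a mean STR variance into years."""
    return mean_variance / MUTATION_RATE * GENERATION_YEARS


def tmrca_years(haplotypes):
    """Estimate clade age from a list of {locus: repeat count} dicts."""
    prepared = [prepare(h) for h in haplotypes]
    loci = list(prepared[0].keys())
    # Assumed estimator: variance around the mean at each locus.
    per_locus = [pvariance([h[locus] for h in prepared]) for locus in loci]
    return years_from_variance(mean(per_locus))


if __name__ == "__main__":
    # Toy haplotypes (hypothetical), not data from this study.
    toy = [
        {"DYS19": 14, "DYS389I": 13, "DYS389II": 30, "DYS385a": 11},
        {"DYS19": 15, "DYS389I": 13, "DYS389II": 30, "DYS385a": 14},
    ]
    print(round(tmrca_years(toy)))
```

With these parameters, a mean variance of about 0.204 corresponds to roughly 7.4 thousand years, which gives a sense of the scale of variance behind such age estimates.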
Paternal lineage composition in Tunisian populations
A total of eight paternal haplogroups were identified among the five Tunisian samples analyzed, namely E1b1b1*, E1b1b1a, E1b1b1b, E1b1b1c, J1e, J2a5, F*, and G2a. Hierarchical phylogenetic relationships among Y-chromosome haplogroups and their frequencies in Tunisia are shown in Figure 2, whereas their STR-haplotype frequencies are shown in Supporting Information Table 1.
Table 1. Genetic diversity parameters in Tunisian populations analyzed in this study
HD | MPD | HgD
0.9886 ± 0.0131 | 5.8774 ± 2.8971 | 0.0000 ± 0.0000
0.9765 ± 0.0132 | 6.7445 ± 3.2568 | 0.4824 ± 0.0634
0.8407 ± 0.0557 | 2.3629 ± 1.3222 | 0.0000 ± 0.0000
0.9940 ± 0.0095 | 7.5907 ± 3.6366 | 0.6270 ± 0.0469
0.9924 ± 0.0104 | 9.0340 ± 4.2673 | 0.5076 ± 0.0818
n: number of samples, K: number of haplotypes, HD: haplotype diversity, MPD: mean number of pairwise differences, and HgD: haplogroup diversity.
The most frequent haplogroup was E1b1b1b (71% of the samples), defined by the M81 marker and previously reported to be frequent in other North African populations (Bosch et al., 2001b; Arredi et al., 2004; Cruciani et al., 2004). This haplogroup is the only one found in two of the Berber samples analyzed (Jradou and Chenini–Douiret). The network constructed from the E1b1b1b STR profiles (Figure 3a) shows a total of 77 haplotypes, of which the most frequent is haplotype H30 (Supporting Information Table 1), found in 12 Berber individuals from Jradou, whereas no differential distribution in the network was found in the Chenini–Douiret Berber group. This frequent haplotype has also been described in the Y Chromosome Haplotype Reference Database (YHRD, www.yhrd.org) in one individual from Morocco. The low E1b1b1b HD in Jradou Berbers (0.8407 ± 0.0557) contrasts with the higher diversity found in the Chenini–Douiret Berber group (0.9886 ± 0.0131), which also presents exclusively E1b1b1b lineages. The age estimation for the whole Tunisian haplogroup E-M81 is 7.4 ± 5.5 kya.
Besides the common E1b1b1b lineage, traces of other lineages within the major E1b1b haplogroup that are present in North and East Africa, such as E1b1b1* (M35), E1b1b1a (M78), and E1b1b1c (M123), were also detected in Andalusians and in Cosmopolitan Arabs. However, no traces of haplogroup E1b1a, which is particularly frequent in sub-Saharan Africa (Underhill et al., 2000; Cruciani et al., 2002) and present in some North African samples as a result of sub-Saharan migration (Bosch et al., 2001b; Arredi et al., 2004; Robino et al., 2008), were observed in the present sample set.
The second most prevalent haplogroup in the present sample set was major haplogroup J, which has been postulated to have a Middle Eastern origin (Semino et al., 2004). Haplogroup J1e, defined by P58, which is supposed to have originated in eastern Anatolia (Chiaroni et al., 2010), was observed in three populations with frequencies over 30%. A total of 18 STR haplotypes were found within haplogroup J1e (Figure 3b), and haplotype diversities were similar in Andalusians and Cosmopolitan Arabs (0.9670 ± 0.0366 and 0.9643 ± 0.0772, respectively), whereas they were slightly lower for Berbers from Sened (0.9273 ± 0.0665). The J1e modal haplotype (H83) was found in two Moroccan Arabs (YHRD, www.yhrd.org, Laouina et al., 2011). The age estimation of haplogroup J1e was 4.4 ± 4.5 kya, which is more recent than that found in the Arabian Peninsula, Europe, and Africa (10.1 kya, Chiaroni et al., 2010). One Andalusian individual belonged to haplogroup J2a5, a lineage rarely found (Underhill et al., 2000; Cinnioglu et al., 2004; Sengupta et al., 2006) and absent in a recent study along the Mediterranean basin (Semino et al., 2004).
Besides the E1b1b and J lineages, some traces of G2a and F* haplogroups were detected in the present Tunisian samples (see Figure 2). G2a has previously been found in European samples (e.g., see Supporting Information in Francalacci and Sanna, 2008 and references therein) as well as in North African Berbers (Alonso et al., 2005), whereas F* has been found in southern Europe and Turkey at very low frequencies (Flores et al., 2004; Capelli et al., 2007; Morelli et al., 2010), which makes it difficult to assign these samples to a geographical origin. However, no traces of typical European lineages (such as R1a, R1b, and I; Karafet et al., 2008) were found in the present sample set.
The diversity parameters of each sample are shown in Table 1. With the exception of the Berber community of Jradou, all samples showed high HD, with the highest values being observed in the Andalusians and Cosmopolitan Arab samples. In addition, the MPD is lower in most Berber samples compared with Cosmopolitan Arabs.
Pairwise RST genetic distances between samples showed significant distances (P < 0.001) among all three Berber groups and also between Andalusians and Jradou and Chenini–Douiret Berbers, whereas Cosmopolitan Arabs showed significant distances with Andalusians and Sened (Supporting Information Table 2). In order to establish the genetic relationship between the Tunisian samples under study and the other populations from the same geographical region (Supporting Information Table 3), pairwise RST genetic distances were calculated between populations based on 12-locus Y-STR haplotypes and subsequently represented in an MDS plot (Supporting Information Fig. 1). The structure shown in the MDS does not clearly distinguish between Arabs and Berbers. Andalusian samples tended to cluster together with Cosmopolitan Arabs. Berbers from Chenini–Douiret and Jradou are located at opposite edges of the second dimension.
Table 2. Analyses of Molecular Variance (AMOVA) in Tunisian samples
b Jerba Island was excluded from the Southern group.
* P < 0.05.
** P < 0.01.
On the basis of haplogroup frequencies, a CA was performed with the 12 samples analyzed for the MDS, plus five Tunisian samples for which the STR profile was not available (Supporting Information Table 3, Fig. 4). Berbers from Chenini–Douiret and Jradou are associated in the plot with haplogroup E1b1b, and cluster with Berber populations from Takrouna and Kesra-Skira as well as Arabs from Zriba. The rest of the samples are scattered in the plot except for the outlier position of Arabs from Jerba. The CA did not reveal distinct groupings based on either geographic or linguistic affiliation.
In order to explore potential correlations between genetic diversities and linguistic or cultural partitioning, and to evaluate the inter-population diversity among Tunisian samples, an AMOVA was performed using both Y-STR haplotypes and the Y-SNP haplogroups data. The results of the AMOVAs for the five samples from the present study, in addition to the Tunisian data set described in Supporting Information Table 3, are shown in Table 2.
When all samples were considered as a single group, 10.58% and 9.50% of the variance were attributed to differences among populations for haplogroups and haplotypes, respectively (Table 2). These results show a remarkable genetic heterogeneity among native Tunisian populations. In order to evaluate if this sample heterogeneity was mainly due to any of the ethnic groups analyzed, the AMOVA was performed separately in Arabs, Andalusians, and Berbers. Table 2 shows that Berbers exhibit the highest inter-population diversity for both haplogroup and haplotype composition. Nonetheless, no significant differences were found when comparing these three groups or when performing pairwise comparisons, pointing to a lack of correlation between male lineages and cultural or linguistic criteria.
Tunisian populations were next classified according to geographic criteria in order to detect if population structure may be correlated with regional or environmental segregation. Populations were first clustered according to their cardinal localization (North vs South), and AMOVA was performed within each group. No significant differences were found among groups for haplogroups and haplotypes. When the analysis was also performed considering two groups (populations in Mountains vs populations in Plains), the variation among groups was significant for both haplogroups and haplotypes (Table 2).
The paternal gene pool in the studied Tunisian populations is characterized by the high frequency of the specific North African haplogroup E-M81 (E1b1b1b lineage), comprising 71% of the Y-chromosome lineages and being the only haplogroup present in Berber samples from Chenini–Douiret and Jradou. Our results are in agreement with previous patterns found in other North African populations (Bosch et al., 2001b; Arredi et al., 2004; Cruciani et al., 2004; Semino et al., 2004) where the phylogeographic analysis of the E-M81 lineage has shown it to be remarkably frequent in Berber-speaking groups and with declining frequency from west to east (80% in Mozabites in Algeria; 65%–73% in Berbers from Morocco; and ∼10% in Egypt; Arredi et al., 2004; Cruciani et al., 2004; Semino et al., 2004). In Europe, this haplogroup has been found in frequencies ∼5% in those regions that were invaded from North Africa in historical times, namely Iberia and Sicily (Maca-Meyer et al., 2003; Semino et al., 2004; Beleza et al., 2006; Adams et al., 2008; Capelli et al., 2009). The origin of this autochthonous North African haplogroup has been controversial. The early works on Y haplogroups in Northwest African populations suggested a Paleolithic origin (Bosch et al., 2001b), whereas others have pointed to a Neolithic origin (Arredi et al., 2004; Cruciani et al., 2004; Semino et al., 2004; Cruciani et al., 2007). According to Keita (2008), E-M81 is not native to the Middle East, indicating that M81 most likely emerged from clade M35 either in the Maghreb, or possibly as far south as the Horn of Africa. In Tunisia, the expansion time of haplogroup E-M81 points to a Neolithic origin (7.4 ± 5.5 kya), in agreement with the age estimated by Arredi et al. (2004).
On the other hand, the Y-chromosome background of Tunisians also indicates gene flow from the Near East due to the presence of haplogroup J. Previous studies have shown that haplogroup J1-M267 is found at high frequencies among the Arabic-speaking populations of the Middle East (Semino et al., 2004; Cadenas et al., 2008; Zalloua et al., 2008; Tofanelli et al., 2009) and have concluded that subhaplogroup J1e has a coalescence time of ∼10 kya (Chiaroni et al., 2010). In Tunisia, J1e lineages are found in Cosmopolitan Arabs, Andalusians, and Berbers from Sened, with a coalescence age of 4.4 ± 4.5 kya, which suggests a post-Neolithic signal from the Middle East. The presence of this haplogroup in Berbers from Sened (31.4%) attests to gene flow from the Near East and contrasts with its absence in the rest of the Tunisian Berber populations analyzed, which suggests regional differences in the genetic structure and migration patterns in Berbers.
The Tunisian male genetic composition contrasts with what has been observed in maternal lineages. The data obtained for the Y-chromosome did not show evidence of sub-Saharan lineages (such as haplogroup E1b1a) or European lineages (such as R1a, R1b, and I). This pattern contrasts with the relevant mitochondrial DNA sub-Saharan contribution (shown by L lineages) that varied from 3% in Takrouna to 49% in Kesra (Cherni et al., 2005b; Frigi et al., 2006b) and the European contribution (mainly shown by H, U5, and V lineages) with 45.6% in Tunisia (Fadhlaoui-Zid et al., 2004b). The contrast between male and female contributions in Tunisia suggests a large degree of asymmetrical gene flow that might be explained by a higher female than male migration rate due to the patrilocal social structure. However, one of the limitations of the comparison between Y-chromosome and mtDNA results is the uncertainty of geographical assignment of some mtDNA haplogroups (such as subhaplogroups within H), which tends to classify some lineages within the “European” pool, overemphasizing the European influence in North Africa (Cherni et al., 2009). In fact, when the Tunisian mtDNA data (Fadhlaoui-Zid et al., 2004b), with similar populations as the ones presented here for the Y-chromosome, are compared in a matching analysis to North African and European hypervariable region I data (Mendizabal et al., 2008; Fadhlaoui-Zid et al., 2011), all sequences belonging to H, HV0, R0a, and T haplogroups match with individuals in both European and North African populations, which prevents the proper assignment of the origin of these sequences given the resolution of HVSI. A refinement of the mtDNA resolution using complete sequences or specific mtDNA coding markers might help to resolve this uncertainty.
The comparison between male and female data also shows the degree of heterogeneity among Tunisian samples. The mtDNA composition of Tunisian groups has been shown to be heterogeneous, especially in Berber groups, which has been explained by isolation processes that have caused haplogroup frequencies to drift to unusual values when compared with other samples (Fadhlaoui-Zid et al., 2004b). The present Y-chromosome analysis also points to a certain inter-population diversity between Tunisian groups, especially relevant in Berbers as shown in the AMOVA and correspondence analyses, despite the reduced number of paternal lineages found in these groups.
Our overall results show that the Berbers from Chenini–Douiret and Jradou seem to be the most genetically isolated samples, with an exclusively autochthonous component in their paternal gene pool. In terms of Y-STR diversity, Berbers from Jradou present the lowest HD, with a strong founder effect that yields a reduction in microsatellite diversity (HD = 0.8407), which is shown in the E-M81 network, whereas the Y-STR haplotypes for Berbers from Chenini–Douiret are more diversified (HD = 0.9886), suggesting a less dramatic effect of genetic drift in this group. The outlying position of Chenini–Douiret Berbers has been previously observed with GM markers (Fadhlaoui-Zid et al., 2004a), mtDNA (Fadhlaoui-Zid et al., 2004b), and autosomal STRs (Khodjet-El-Khil et al., 2008). Sub-Saharan African GM haplotypes and mtDNA haplogroups were present at low frequency (7% and 13%, respectively) among the Berbers from Chenini–Douiret, whereas the Eurasian component was much more frequent (62% and 87%, respectively, for GM and mtDNA). The genetic isolation of these two Berber groups (Chenini–Douiret and Jradou), who inhabit mountainous areas, might be explained by the fact that around 1048 A.D., two Arab tribes (Beni Hilal and Beni Souleïm), originally from the Arabian Peninsula and living in Egypt, migrated in significant numbers (estimated at 125,000) and established themselves in North Africa (Renfrew, 1991), forcing most Berbers to live in the mountains for fear of persecution. During the Arab conquests of North Africa, gene flow between immigrants and local populations might have occurred differentially in remote or isolated areas. Berber populations might have been genetically much more homogeneous before the first Arab invasion, and they further differentiated either by genetic drift, in the remote areas where some of them were isolated, or by Arab admixture in more accessible places.
Such a hypothesis is compatible with our findings because, among our data, small and isolated Berber populations situated in mountainous localities (Chenini–Douiret and Jradou) are genetically the most divergent.
Besides Berbers, another differentiated ethnic group in Tunisia is the Andalusians, who are supposed to be descendants of some groups expelled from southern Spain at the end of the 15th century. Despite their cultural differences with surrounding Arab and Berber populations, their male genetic background is similar to that of Cosmopolitan Arabs in Tunisia, without traces of European lineages, suggesting a North African, rather than European, origin of this group. This result agrees with the data provided by mtDNA lineages in Andalusians (Cherni et al., 2009) and points to a cultural rather than a genetic influence from southern Spain in these Andalusian groups.
Finally, the lack of differentiation between North African Arabs, Berbers, and Andalusians has been observed using different genetic markers such as classical markers (Bosch et al., 1997), autosomal STRs (Bosch et al., 2000), Alu insertion polymorphisms (Comas et al., 2000), Y-chromosome lineages (Bosch et al., 2001b; Cherni et al., 2005a), and mtDNA lineages (Plaza et al., 2003; Fadhlaoui-Zid et al., 2004b). The cultural differentiation present in Tunisia between Berber, Arab, and Andalusian samples seems not to reflect genetic differences between these three groups. This pattern suggests that the male gene pool of Tunisian groups, regardless of their ethnic affiliation, has a common origin and that the differences found when comparing samples are the result of isolation and genetic drift in some populations, especially in geographically remote Berber groups.
The authors thank Roger Anglada and Stéphanie Plaza (UPF, Barcelona, Spain) for technical support and Brandon Invergo for his advice. They also thank Dr. Habib Belhadi and Dr. Ahmed Khazzen for their help with sample collection, and all the volunteers that donated their DNA, making this study possible.