S32.1: Evolution of the mitochondrial control-region in populations of galliforms (Alectoris, Tetrao and Lagopus)

Ettore Randi, Vittorio Lucchini & Patrick De Marta

Istituto Nazionale per la Fauna Selvatica, Via Cà Fornacetta 9, 40064 Ozzano dell'Emilia (BO), Italy, fax 39 51 79 6628, e-mail met0217@iperbole.bologna.it

Randi, E., Lucchini, V., & De Marta, P. 1999. Evolution of the mitochondrial control-region in populations of galliforms (Alectoris, Tetrao and Lagopus). In: Adams, N.J. & Slotow, R.H. (eds) Proc. 22 Int. Ornithol. Congr., Durban: 1873-1880. Johannesburg: BirdLife South Africa.

The geographic distributions of west European populations of partridges (Alectoris) and grouse (Tetrao, Lagopus) have been sharply affected by cyclical climate and landscape changes during the Pleistocene. Patterns of local extinction and recolonisation might have determined the distribution of genetic variability within and among populations. Geographic variation in partridge and grouse populations could have been affected differently, because the two groups should have had different biogeographic histories. Partridges presently prefer arid open habitats, and should have been restricted to southern refuges during glacial periods. In contrast, grouse live at higher altitudes and in northern latitudes in woodland and cold forests, and their fossil remains were widespread throughout central Europe during glacials. Therefore, present partridge populations derive from recent postglacial expansions, while the fragmented grouse populations are the relicts of recent contraction and isolation in cold and Alpine areas. We have sequenced the hypervariable domain I of the mtDNA control-region, and used classical molecular analyses of variance and coalescent models to investigate the patterns of geographic distribution of genetic variability in west European populations of grouse and partridges. Results suggest that each species reacted differently and independently to similar habitat changes, and that past demographic and genetic events can be at least partially reconstructed using mtDNA sequences.

 

INTRODUCTION

The amount of polymorphism maintained in a population can be estimated from an aligned set of DNA sequences by counting the number (or frequency) of different haplotypes (H), the number of different mutations (η), the number (or percent) of segregating sites (s), and the average number of pairwise nucleotide differences (k). Using these data one can compute the haplotypic (h) and nucleotide (π) diversities, which correspond to values of heterozygosity at the gene and nucleotide level, respectively.

According to the neutral theory (Kimura 1968; Watterson 1975), the expected (E) values of s and k roughly correspond to the population genetic parameter θ = 4Nu (or 2Nu in the case of haploid genomes such as the mitochondrial DNA), where N is the effective population size and u is the mutation rate per sequence per generation. Therefore:

E(k) ≈ E(s) ≈ θ

These expectations will be realised only if the studied populations conform to the theoretical model of neutral evolution, which states that: (1) polymorphic sites are selectively neutral; (2) every mutation is counted as a polymorphism and there are no multiple hits (‘infinite site model’); (3) there is no recombination among the sequenced haplotypes; (4) the population is random mating with non-overlapping generations; (5) the population is isolated and there is no migration; (6) the effective population size N is large and stable in time (i.e. the population is in demographic-genetic equilibrium). Of course, it is very implausible that real populations conform to all these assumptions, and therefore classical equilibrium models can produce biased estimates of the population parameters.
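
The summary statistics defined above can be illustrated with a minimal Python sketch. The function name `diversity_stats` and the four toy sequences are hypothetical and are not data from this study; the sketch assumes equal-length aligned sequences without gaps, and it counts segregating sites rather than mutations.

```python
from collections import Counter
from itertools import combinations


def diversity_stats(seqs):
    """Summary statistics for equal-length aligned sequences (no gaps assumed)."""
    n = len(seqs)
    length = len(seqs[0])
    # s: alignment columns that show more than one nucleotide.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    # k: average number of differences over all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # pi: nucleotide diversity, i.e. k expressed per site.
    pi = k / length
    # h: haplotype diversity, n/(n-1) * (1 - sum of squared haplotype frequencies).
    counts = Counter(seqs)
    h = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))
    return {"H": len(counts), "s": s, "k": k, "pi": pi, "h": h}


# Hypothetical toy alignment: four sequences of 10 nt, three haplotypes.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC", "ATGTACGAAC"]
stats = diversity_stats(toy)
print(stats)
```

In this toy alignment two columns are variable, so s = 2, while k is the mean of the six pairwise comparisons.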

However, an alignment of DNA sequences contains not just information on pairwise differences, but also genealogical information. Using a variety of phylogenetic methods it is possible to reconstruct the genealogies representing the most probable evolutionary relationships among the sampled haplotypes (or alleles). Coalescent methods (Hudson 1990) are based on the assumption that all the haplotypes sampled in a population must derive from common ancestors. Pairs of haplotypes are linked by a continuous chain of common ancestors going backward in time to their single most recent common ancestor, which is at the origin of the genealogy. Under a neutral model a genealogy has simple properties, which have received rigorous mathematical treatment and have been implemented in computer programmes. Neutral models of coalescence are used as null hypotheses to predict the distribution of variability and estimate the relevant genetic parameters in the studied populations. A strength of the coalescent is that it is possible to modify the neutral models and incorporate the effects of balancing selection, selective sweeps, migration between fragmented populations, fluctuating effective population size and recombination.
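
As an illustration of the simple properties of a neutral genealogy, the following Python sketch simulates the time to the most recent common ancestor under the standard neutral coalescent for a haploid, constant-size population. It is not the procedure implemented in the programmes used in this study; the function names, the sample size of 10 and the number of replicates are arbitrary assumptions. Time is measured in units of N generations, so that j ancestral lineages coalesce at rate j(j - 1)/2.

```python
import random


def simulate_tmrca(n, rng):
    """Time to the most recent common ancestor of n haplotypes (units of N generations)."""
    t = 0.0
    for lineages in range(n, 1, -1):
        # With `lineages` ancestral lines, the next coalescence occurs at rate C(lineages, 2).
        rate = lineages * (lineages - 1) / 2
        t += rng.expovariate(rate)
    return t


def expected_tmrca(n):
    """Analytical expectation under the standard neutral coalescent: 2 * (1 - 1/n)."""
    return 2 * (1 - 1 / n)


rng = random.Random(1)  # fixed seed, chosen arbitrarily for reproducibility
replicates = [simulate_tmrca(10, rng) for _ in range(20000)]
mean_tmrca = sum(replicates) / len(replicates)
print(round(mean_tmrca, 2), expected_tmrca(10))
```

The simulated mean should lie close to the analytical expectation; departures from such neutral expectations are what the tests and modified models mentioned above are designed to detect.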

Aligned mitochondrial DNA (mtDNA) sequences are being increasingly used in population genetic studies. The mtDNA is maternally inherited and does not recombine. Nucleotide sequences evolve quickly, in particular at domain I of the control-region (CR-I; Baker & Marshall 1997; Randi & Lucchini 1998; Lucchini & Randi 1998). Within-population sequence variability is generated mainly by point mutations; it is expected to be neutral and, at low divergence times, it should conform to the infinite site model. Therefore, CR-I has favourable genetic properties, which might be usefully exploited to describe and analyse the extent and dynamics of genetic diversity at the population level within a Pleistocene time frame (about 1.5 million years, MYR).

We have studied some west European populations of three species of galliforms (the Rock Partridge Alectoris graeca, the Rock Ptarmigan Lagopus mutus, and the Black Grouse Tetrao tetrix) as case studies to explore the application of classical and coalescent genetic models, with the following aims: 1) to describe the geographic distribution of genetic variability; 2) to describe rates and patterns of gene flow; 3) to infer the likelihood of possible alternative scenarios of historical population dynamics; and 4) to understand the consequences of Pleistocene climate and landscape changes on the geographic structuring of genetic diversity. The Pleistocene histories of the studied species are expected to be different. In fact, partridges prefer arid, open mountain habitats, and should have been restricted to southern refuges during glacial periods. In contrast, grouse live in woodland and cold forests, and their fossil remains were widespread throughout central Europe during glacials. Therefore, some of the present populations of partridges are the result of recent postglacial expansions, while the fragmented grouse populations are the result of recent contraction and isolation in cold and Alpine areas.

In this study we have sequenced the mtDNA CR-I, which is the hypervariable part of the mtDNA. Classical population genetic and coalescent methods are used to describe the patterns of geographic variation and explain the genetic structure of the galliform populations. In particular, we have applied: (1) classical (Tajima's D; Tajima 1989) and coalescent (Fu & Li's D* and F*; Fu & Li 1993) tests of neutrality of evolution of the CR-I sequences; (2) analyses of mismatch distributions of pairwise nucleotide differences (Rogers 1995); (3) classical Fst-based (Hudson et al. 1992) and coalescent (Beerli 1998) models to estimate the rate of gene flow (Nm).

METHODS

DNA samples

Total DNA was extracted from feather tips and tissue samples, stored in 95% ethanol, using guanidinium thiocyanate and diatomaceous silica particles (a procedure modified after Gerloff et al. 1995). We collected these DNA samples from the Rock Partridge, the Rock Ptarmigan, and the Black Grouse.

PCR amplification and sequencing

The entire mtDNA CR was amplified using the Polymerase Chain Reaction (PCR) and following the protocol described by Randi & Lucchini (1998). Purified PCR products were sequenced by double-stranded DNA cycle sequencing with ABI Prism Dye Terminator chemicals in an ABI 373 automatic sequencer.

Sequence analyses

CR-I sequences were aligned using CLUSTAL W (Thompson et al. 1994). Estimates of within-population genetic diversity, and Tajima's and Fu & Li's neutrality tests, were computed using DnaSP 2.9 (Rozas & Rozas 1997). Genealogical relationships among CR-I haplotypes were inferred by neighbor joining (NJ; Saitou & Nei 1987) using MEGA 1.01 (Kumar et al. 1993) with pairwise exclusion of gaps, and by minimum spanning tree networks (MST), using MinSpNet (Excoffier 1997). Classical analyses of Fst and Nm were performed by AMOVA (Excoffier et al. 1992) and Slatkin's (Hudson et al. 1992) procedures, as implemented in ARLEQUIN 1.1 (Schneider et al. 1997). Coalescent analyses of gene flow were performed using MIGRATE 0.4 (Beerli 1998). Classical and coalescent estimates of θ = 2Nu were computed with DnaSP and FLUCTUATE 1.1 (Kuhner & Yamato 1998), respectively. Mismatch analyses and simulations were computed using MISMATCH (Rogers 1995).

RESULTS

Population genetic structure and geographic variation in Rock Partridge Alectoris graeca

We have sequenced 436 nucleotides (nt) of CR-I in 70 samples of Rock Partridges collected in Italy (Sicily, n = 10; central Apennines, n = 14; Italian Alps, n = 41) and in Albania (n = 5). In this sample of sequences there were 25 mutations (η), generating 24 segregating sites (s). Therefore, one site mutated twice (in violation of the infinite site model). There were 13 different haplotypes (H). Haplotype diversity was h = 0.834 (S.D. = 0.027), nucleotide diversity was π = 0.014 (S.D. = 0.007), and the average number of pairwise differences among haplotypes was k = 6.072 (S.D. = 2.926). Values of 2Nu, estimated using s and k, were θs = 0.011 and θk = 0.010, respectively. Neutrality tests computed as Tajima's D and Fu & Li's D* and F* were not significantly different from 0. Therefore, these sequences are consistent with neutral evolution, from both a classical and a coalescent perspective.
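
The estimate of 2Nu based on segregating sites can be checked from the summary values given above. The sketch below assumes that the reported value is Watterson's (1975) estimator expressed per nucleotide site; the function name is hypothetical.

```python
def watterson_theta_per_site(s, n, length):
    """Watterson's estimator: s / a_n per sequence, divided by the sequence length."""
    # a_n is the sum of 1/i for i = 1 .. n-1, where n is the number of sequences.
    a_n = sum(1 / i for i in range(1, n))
    return s / a_n / length


# Values reported for the Rock Partridge: s = 24, n = 70 sequences, 436 nt.
theta_s = watterson_theta_per_site(24, 70, 436)
print(round(theta_s, 3))  # 0.011
```

Under this assumption the computation returns 0.011, in agreement with the value estimated from s in the text.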

Interpopulation genetic diversity was highly significant: the AMOVA proportion of variability among Sicily, Apennines, Alps and Albania was Fst = 0.81. The most important contribution to geographic variation was due to the Sicilian Rock Partridge population, which had pairwise Fst values > 0.90 vs. all the other populations. However, Fst values were significant also among the geographically continuous Alpine populations. In particular, Fst was 0.61 between the western and eastern Italian Alps. Therefore, Rock Partridge populations showed highly significant geographic structure. Genealogical relationships among CR haplotypes defined four distinct clades corresponding to the highly divergent population of Sicily, to Albania/Apennines, and to the (prevalently) western Alpine and eastern Alpine populations, respectively. Congruence between genealogical clades and geographic distribution of haplotypes indicates that Rock Partridge populations were phylogeographically structured.

Phylogeographic structuring was the cause of the ragged observed mismatch distribution, which showed two main waves, corresponding to the pairwise differences between Sicily and all the other populations, and between the Albania/Apennines and the Alpine populations, respectively. The observed mismatch distribution was therefore mainly determined by interpopulation genetic divergence, and can be considered an intermatch distribution (sensu Rogers 1995). The separate mismatch distributions of the single geographic populations were unimodal, corresponding to an expected moderate population growth with Tau ≈ 2. A very moderate population growth, or stability, was also suggested by coalescent simulations computed by FLUCTUATE. These findings suggest that: 1) genetic divergence among Rock Partridge populations occurred before any possible recent population expansion; and 2) postglacial colonisation of the Italian Alps was probably not associated with a population expansion large enough to leave a genetic signature on the distribution of genetic variability at CR-I sequences.
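
The way in which divergent clades generate separate waves in a mismatch distribution can be sketched in Python as follows. The function name and the two toy clades are hypothetical; the sketch only tabulates observed pairwise differences and does not fit the expansion model of the MISMATCH programme.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of each pairwise difference count among aligned sequences."""
    diffs = [sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)]
    counts = Counter(diffs)
    # Report every class from 0 to the maximum, including empty classes.
    return {d: counts[d] / len(diffs) for d in range(max(diffs) + 1)}


# Hypothetical sample made of two divergent clades (not data from this study).
clade_1 = ["AAAAAAAA", "AAAAAAAT", "AAAAAATA"]
clade_2 = ["CCCCCAAA", "CCCCCAAT"]
dist = mismatch_distribution(clade_1 + clade_2)
for d, freq in dist.items():
    print(d, "#" * round(freq * 20))
```

Comparisons within clades pile up at 1 to 2 differences and comparisons between clades at 5 to 7 differences, with empty classes in between, which is the two-wave pattern described above.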

Classical Fst-based estimates of Nm suggested that historical gene flow was generally low among Italian peninsular Rock Partridge populations (Nm was about 1, or lower, except for central and eastern Alpine populations, which had Nm = 6.4). The lowest Nm was 0.1, between Apennines and eastern Alps. Nm was also low between the continuous Alpine populations (Nm was only 0.3 between eastern and western Alpine Rock Partridges), suggesting reduced or absent gene flow across the Alps. Coalescent estimates computed using MIGRATE suggest that gene flow could be (or has been) asymmetrical: immigration from western Alps into the Apennines was 0, while emigration from the Apennines to western Alps was 0.6 to 1.9. As Nm values are probably related to historical rather than current gene flow, these findings support historical contacts (either gene flow or genetic tracks of postglacial colonisation routes) between the Apennines and western Italian Alps.
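
The relation between Fst and Nm can be made explicit with a short Python sketch. It assumes the island-model relation for a haploid, maternally inherited marker, Fst = 1/(1 + 2Nm), following Hudson et al. (1992); whether ARLEQUIN applied exactly this formula is an assumption, and the function name is hypothetical.

```python
def nm_from_fst(fst):
    """Haploid island model: Fst = 1 / (1 + 2Nm), hence Nm = (1/Fst - 1) / 2."""
    return (1 / fst - 1) / 2


# Fst = 0.61 was reported between the western and eastern Italian Alps.
print(round(nm_from_fst(0.61), 1))  # 0.3
# Pairwise Fst values above 0.90 (Sicily) imply Nm below about 0.06.
print(round(nm_from_fst(0.90), 2))  # 0.06
```

Under this assumption Fst = 0.61 gives Nm of about 0.3, which matches the value reported between eastern and western Alpine Rock Partridges.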

Population genetic structure and geographic variation in Rock Ptarmigan Lagopus mutus

We have sequenced 481 nt of CR-I in 125 samples of Rock Ptarmigan collected in the Italian Alps (n = 41), French Pyrenees (n = 49), Scotland (n = 6) and Scandinavia (n = 29). In this sample of sequences there were 20 mutations (η), generating 20 segregating sites (s) and 23 different haplotypes (H). Haplotype diversity was h = 0.879 (S.D. = 0.014), nucleotide diversity was π = 0.004 (S.D. = 0.002), and the average number of pairwise differences among haplotypes was k = 2.085 (S.D. = 21.173). Values of 2Nu estimated using s and k were θs = 0.008 and θk = 0.017, respectively. Tajima's D was -1.2 (P < 0.10), and Fu & Li's D* was -3.4 (P < 0.02). Therefore, these sequences departed from neutral expectations, at least from a coalescent perspective.

Interpopulation genetic diversity was significant: the AMOVA proportion of variability among Pyrenees, Alps, Scotland and Scandinavia was Fst = 0.57. Pairwise Fst values among the four populations were about 0.6 to 0.8, except for a lower value of 0.18 between the Alps and Scandinavia. The geographically continuous western, central and eastern Alpine populations were not significantly differentiated. Samples collected in Pyrenees and Scotland had unique CR-I haplotypes, while some highly frequent haplotypes were widespread and shared between Alpine and Scandinavian Rock Ptarmigan populations. Mutational distances among haplotypes were low (k = 2.08), and the genealogical trees were not structured. The MST network was star-shaped, with one common haplotype at the centre. This kind of genealogical structure suggests a sudden change of population size (or a selective sweep of the mtDNA genome), and consequently the mismatch distribution was unimodal with Tau = 2.0. The intermatch and the separate mismatch distributions of the single geographic populations were similar and unimodal, corresponding to moderate population growth, with Tau ≈ 2.0. A moderate population growth, or stability, was also suggested by coalescent simulations computed by FLUCTUATE: in the Italian Alps the expected maximum likelihood θ was 0.011, allowing for fluctuating population size, and it was in accordance with a slightly positive population growth and female N ≈ 2900. Similar values of θ and N were obtained for the Scandinavian population. These findings suggest that: 1) a moderate population expansion occurred before recent population fragmentation into the present geographical isolates; and 2) postglacial colonisation of the Italian Alps was not associated with a large population expansion.

Tajima's D was slightly negative, and Fu & Li's D* and F* were significantly negative, indicating departure from neutrality. D becomes negative when k is lower than the estimate of θ obtained from s. The average number of pairwise differences (k) is affected more heavily than the number of segregating sites (s) in the case of a selective sweep or recent population bottleneck. Fu & Li's statistics are expected to become negative when there is an excess of external mutations, which can result from positive selection (fixation of advantageous mutations or a selective sweep), or from a population bottleneck. The observed negative values could be the result of natural selection on mtDNA haplotypes or of a sudden population change at Tau = 2.0. Both causes are expected to produce the observed star genealogy of CR-I haplotypes in Rock Ptarmigan, but at the moment it is not possible to decide between them, and data on other genes, not linked to the mtDNA, are necessary. In fact, a sudden population change is expected to affect the entire genome at the same time, while it is very implausible that selective pressures on mtDNA mutations will also affect independent nuclear loci.
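
The behaviour of D can be followed numerically with a sketch of Tajima's (1989) statistic, computed here from the sample size, the number of segregating sites and the average number of pairwise differences reported for the Rock Ptarmigan. The function name is hypothetical, and the calculation is a plain transcription of the published formula rather than the DnaSP implementation.

```python
from math import sqrt


def tajima_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s and mean pairwise differences k."""
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Numerator: k minus the estimate of theta from s; denominator: its standard deviation.
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Rock Ptarmigan values reported above: n = 125, s = 20, k = 2.085.
d = tajima_d(n=125, s=20, k=2.085)
print(round(d, 2))
```

With these inputs k is smaller than s divided by the harmonic sum a1, so D is negative, and the result is close to the reported value of -1.2.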

Population genetic structure and geographic variation in Black Grouse Tetrao tetrix

We have sequenced 479 nt of CR-I in 105 samples of Black Grouse collected in the Italian Alps (n = 76), Norway (n = 27), and Kazakhstan (n = 2). In this sample of sequences there were 27 mutations (η), generating 27 segregating sites (s) and 29 different haplotypes (H). Haplotype diversity was h = 0.928 (S.D. = 0.012), nucleotide diversity was π = 0.011 (S.D. = 0.006), and the average number of pairwise differences among haplotypes was k = 5.409 (S.D. = 2.627). Values of 2Nu, estimated using s and k, were θs = 0.011 and θk = 0.025, respectively. Neutrality tests computed as Tajima's D, and Fu & Li's D* and F* were not significantly different from 0. Therefore, these sequences are consistent with neutral evolution, from both a classical and a coalescent perspective.

Interpopulation genetic diversity was significant with average Fst = 0.38 (AMOVA). Fst values were not significant among the geographically continuous Alpine populations, except for western vs. eastern populations (Fst = 0.08). CR-I haplotypes were different and not shared among the three sampled regions, and clustered into four distinct clades (two distinct Alpine clades, Norway and Kazakhstan).

Population divergence produced a ragged mismatch distribution of pairwise differences, with two ‘waves’, corresponding to the pairwise differences between Norway and all the other populations, and between the two Alpine clades, respectively. The observed mismatch distribution was determined by interpopulation genetic divergence, and can be considered an intermatch distribution. The separate mismatch distributions of the single geographic populations were unimodal, corresponding to an expected moderate population growth, with Tau ≈ 1.5 to 2.0. Very moderate population growth, or stability, was also suggested by coalescent simulations computed by FLUCTUATE: in Norway the expected maximum likelihood estimate of θ was 0.011, allowing for fluctuating population size, and it was in accordance with a slightly positive population growth with female N ≈ 4500. In the Italian Alps θ was 0.011, allowing for fluctuating population size, and it was in accordance with a very slight negative population growth with female N ≈ 2800. These findings suggest that: 1) genetic divergence among Black Grouse populations predated any possible recent population expansion; and 2) postglacial colonisation of the Italian Alps was probably not associated with any population expansion. The Alps could have been colonised by populations of stable size.

Classical Fst-based estimates of Nm suggested that historical gene flow was high in the Italian Alps (Nm was higher than 5), and that Alpine populations are not reproductively isolated. In contrast, coalescent estimates computed by MIGRATE suggested that gene flow between eastern and western Alpine populations was low and asymmetrical: emigration from western populations was 0.5, while emigration from eastern populations was 2.0 to 4.0. These findings are in accordance with the low Fst = 0.15 between Alps and Kazakhstan, and suggest that the Alps were colonised from central European Black Grouse populations.

DISCUSSION

Comparative analyses of population genetic structure and geographic variation in three species of galliforms distributed in western Europe suggest that they had different recent histories. Population structure was strong in Rock Partridges, which had been sampled at a relatively local geographic scale. The strong divergence between Sicily and continental populations indicates that Sicilian partridges have been isolated for a long time, whereas the Albanian-Apennine populations could have been in contact since the last glacial maximum through the north Adriatic land-bridge. Divergence in the Alps suggests that eastern and western parts of the range could have been colonised at two different times or by two different source populations. The ragged intermatch distribution of pairwise differences was generated by geographic subdivisions that predated any possible population fluctuation. Therefore, postglacial colonisation of the Alps was not correlated with a significant demographic expansion. Intermatch distributions were different in the two species of grouse, the Rock Ptarmigan and the Black Grouse. A ragged intermatch distribution suggests that Black Grouse populations are of ancient origin and underwent geographical differentiation, probably as a consequence of repeated cycles of isolation and dispersal in the different areas of their distribution. Therefore, geographical structuring preceded any possible population fluctuation. In contrast, the ptarmigan populations probably originated and were isolated more recently, perhaps at the end of the last glaciation, in the present fragmented areas, and did not develop any geographic divergence, except for the effects of random drift. The smooth intermatch distribution of pairwise differences suggested that some demographic fluctuations preceded fragmentation and population divergence.

Assuming that:

(1) the avian mtDNA evolves at an average mutation rate μ = 1 to 2% per nucleotide per myr, that is, per nucleotide per year:

μ mtDNA = 1 to 2 × 10⁻⁸

(2) CR domain I evolves about 10 to 20 times faster than the average mtDNA mutation rate:

μ CR-I = 1 to 4 × 10⁻⁷

therefore,

(3) the substitution rate for a sequence about 450 nt long is:

u = μ × nt = 0.5 to 2.0 × 10⁻⁴

and,

(4) since Tau = 2ut, one unit of Tau corresponds to t = 1/2u = 10000 to 2500 generations, that is, to a time T = 20000 to 5000 years if generation time is 2 years in grouse and partridges.

In accordance with these very tentative estimates, we can date the main events in the recent population history of the studied species as follows:

(1) Rock Partridges: isolation in Sicily at Tau = 15 and T = 300000 to 75000 years, corresponding to the second part of the mid-Pleistocene; divergence between Albania/Apennines and Alpine populations at Tau = 2.5 and T = 50000 to 12500 years, corresponding to the second part of the last glaciation and initial Holocene.

(2) Rock Ptarmigan: sudden population change at Tau = 2.0 (intermatch value), and T = 40000 to 10000 years, corresponding to the second part of the last glaciation and to the initial Holocene. The Alpine population has Tau = 1.4, corresponding to T = 28000 to 7000 years, which corresponds to the period of the last deglaciation of the Italian Alps.

(3) Black Grouse: main population differentiation at Tau = 7 and T = 150000 to 37500 years, corresponding to the second part of the mid-Pleistocene; divergence between the Alpine populations at Tau = 1.5 and T = 30000 to 7500 years, corresponding to the second part of the last glaciation and initial Holocene.
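
The conversion from Tau to years used for these tentative dates can be reproduced with a short Python sketch. It follows the arithmetic given above (Tau = 2ut, with u = 0.5 to 2.0 × 10⁻⁴ treated as a rate per generation and a generation time of 2 years); the function and variable names are hypothetical.

```python
def tau_to_years(tau, u, generation_time=2):
    """Tau = 2ut, so t = Tau / (2u) generations; multiply by generation time for years."""
    return tau / (2 * u) * generation_time


U_SLOW, U_FAST = 0.5e-4, 2.0e-4  # bounds of the sequence substitution rate u given above

events = [
    ("Rock Partridge, Sicily", 15),
    ("Rock Partridge, Apennines vs. Alps", 2.5),
    ("Rock Ptarmigan, intermatch", 2.0),
    ("Rock Ptarmigan, Alps", 1.4),
    ("Black Grouse, Alpine clades", 1.5),
]
for label, tau in events:
    oldest = round(tau_to_years(tau, U_SLOW))
    youngest = round(tau_to_years(tau, U_FAST))
    print(f"{label}: Tau = {tau}, T = {oldest} to {youngest} years")
```

Because u spans a fourfold range, every date inherits a fourfold uncertainty, quite apart from the uncertainty in Tau itself.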

ACKNOWLEDGEMENTS

The results presented in this paper are part of an ongoing project involving the study of phylogenetic relationships, and population and conservation genetics of Galliformes; it is in progress at the Italian Institute of Wildlife Biology, Ozzano dell'Emilia (BO), Italy. We thank the many collaborators who helped us with sample collection. This work was supported by the Italian Institute of Wildlife Biology.

REFERENCES

Baker, A.J., & Marshall, H.D. 1997. Mitochondrial control region sequences as tools for understanding evolution. In: Mindell, D.P. (ed) Avian molecular evolution and systematics. San Diego: Academic Press: 51-82.

Beerli, P. 1998. MIGRATE: Documentation and program, part of LAMARC. http://evolution.genetics.washington.edu/lamarc.html.

Excoffier, L., Smouse, P., & Quattro, J. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial restriction data. Genetics 131: 479-491.

Excoffier, L. 1997. MinSpNet. Windows Minimum Spanning Network. ftp://anthropologie.unige.ch/pub/comp/win/min-span-net.

Fu, Y.-X., & Li, W.-H. 1993. Statistical tests of neutrality of mutations. Genetics 133: 693-709.

Gerloff, U., Schlötterer, C., Rassmann, K., Rambold, I., Hohmann, G., Fruth, B., & Tautz, D. 1995. Amplification of hypervariable simple sequence repeats (microsatellites) from excremental DNA of wild living Bonobos (Pan paniscus). Molecular Ecology 4: 515-518.

Hudson, R.R. 1990. Gene genealogies and the coalescent process. Oxford Surveys in Evolutionary Biology 7: 4-44.

Hudson, R.R., Slatkin, M., & Maddison, W.P. 1992. Estimation of levels of gene flow from DNA sequence data. Genetics 132: 583-589.

Kuhner, M., & Yamato, J. 1998. FLUCTUATE: Documentation and program, part of LAMARC. http://evolution.genetics.washington.edu/lamarc.html.

Kimura, M. 1968. Evolutionary rate at the molecular level. Nature 217: 624-626.

Kumar, S., Tamura, K., & Nei, M. 1993. MEGA: Molecular Evolutionary Genetics Analysis, Version 1.01. Pennsylvania State University.

Lucchini, V., & Randi, E. 1998. Mitochondrial DNA sequence variation and phylogeographic structure of rock partridge (Alectoris graeca) population. Heredity 81: 528-536.

Randi, E., & Lucchini, V. 1998. Organization and evolution of the mitochondrial DNA control-region in the avian genus Alectoris. Journal of Molecular Evolution 47: 449-462.

Rogers, A.R. 1995. Genetic evidence for a Pleistocene population explosion. Evolution 49: 608-615.

Rozas, J., & Rozas, R. 1997. DnaSP version 2.0: A novel software package for extensive molecular population genetics analysis. Computer Applications in the Biosciences 13: 307-311.

Saitou, N., & Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4: 406-425.

Schneider, S., Kueffer, J.-M., Roessli, D., & Excoffier, L. 1997. ARLEQUIN ver 1.1. A software for population genetic analysis. http://anthropologie.unige.ch/arlequin.

Tajima, F. 1989. Statistical methods for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585-595.

Thompson, J.D., Higgins, D.G., & Gibson, T.J. 1994. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22: 4673-4680.

Watterson, G. 1975. On the number of segregating sites in genetical models without recombination. Theoretical Population Biology 7: 256-276.