Deep mining of data offers information on human evolution and relationships among populations.
Through sophisticated statistical analyses and advanced computer simulations, researchers are learning more about the genomic patterns of human population structure around the world.
Revealing such patterns provides insights into the history of human evolution, the predominant evolutionary forces that shaped local populations, and the relationships among populations.
"Studying genomic patterns of human population structure also has practical applications in disease-gene mapping," noted Dr. Joshua M. Akey, University of Washington (UW) assistant professor of genome sciences. Akey is senior author of new genomic research findings about the fine-scale structure of diverse human populations. The results will be published May 15 in the American Journal of Human Genetics. The lead authors were Shameek Biswas and Dr. Laura B. Scheinfeldt, both of the UW Department of Genome Sciences.
A statistical method called principal component analysis allows researchers to look through a thick, voluminous fog of genetic data and see significant variations. The UW researchers applied this method to a data set of almost 650,000 single nucleotide polymorphisms, or SNPs (pronounced "snips").
A single nucleotide polymorphism is a genetic variation in which the DNA code differs by only one "letter" at the same position in the DNA sequences of two individuals of the same species. The data set of almost 650,000 SNPs came from 944 unrelated persons representing 52 broadly classified populations in seven continental groups: Africa, America, Central and South East Asia, East Asia, Europe, Middle East, and Oceania. This global sampling came from the Human Genome Diversity Project - Center for the Study of Human Polymorphisms.
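To make the idea concrete, here is a minimal sketch of principal component analysis on SNP data. It is not the UW team's pipeline: the genotypes are simulated toy data, the 0/1/2 coding (copies of one allele per person) is a common convention assumed here, and the function names `simulate_genotypes` and `genotype_pca` are hypothetical.

```python
import numpy as np

def simulate_genotypes(n_per_pop, n_snps, n_pops, rng, spread=0.1):
    """Toy data: each population's allele frequencies sit near a shared ancestral value."""
    ancestral = rng.uniform(0.1, 0.9, size=n_snps)
    rows, labels = [], []
    for pop in range(n_pops):
        # Nudge the ancestral frequencies to mimic modest population divergence.
        freqs = np.clip(ancestral + rng.normal(0.0, spread, size=n_snps), 0.01, 0.99)
        # Each person carries 0, 1 or 2 copies of the allele at every SNP.
        rows.append(rng.binomial(2, freqs, size=(n_per_pop, n_snps)))
        labels += [pop] * n_per_pop
    return np.vstack(rows), np.array(labels)

def genotype_pca(genotypes, n_axes):
    """Return each person's coordinates on the top axes and the variance share of each axis."""
    g = genotypes.astype(float)
    g -= g.mean(axis=0)                    # center every SNP column
    std = g.std(axis=0)
    g = g[:, std > 0] / std[std > 0]       # drop SNPs that do not vary, scale the rest
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return u[:, :n_axes] * s[:n_axes], explained[:n_axes]

rng = np.random.default_rng(0)
genotypes, labels = simulate_genotypes(n_per_pop=20, n_snps=500, n_pops=3, rng=rng)
coords, explained = genotype_pca(genotypes, n_axes=2)
print(coords.shape, explained)
```

Each row of `coords` places one simulated person on the top two axes of variation. In data like these, people from the same simulated population would be expected to land near one another.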
Most previous genomic studies of this nature have focused on broad-scale patterns of structure among geographically diverse populations, the UW researchers noted. These studies concluded that 85 to 95 percent of human genetic variation can be attributed to differences among individuals within populations, and 5 to 15 percent is due to differences between populations.
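One common way to split variation into within-population and between-population shares is Wright's fixation index, often written F_ST. The article does not say which statistic the earlier studies used, so the sketch below is only an illustration under that assumption. The allele frequencies are made up, the populations are treated as equal in size, and `apportion_variation` is a hypothetical name.

```python
import numpy as np

def apportion_variation(pop_freqs):
    """Split diversity at one SNP into within- and between-population shares.

    pop_freqs holds the frequency of one allele in each population
    (populations are assumed to be equally sized).
    """
    p = np.asarray(pop_freqs, dtype=float)
    # Average expected heterozygosity inside each population.
    h_within = np.mean(2.0 * p * (1.0 - p))
    # Expected heterozygosity if all populations were pooled into one.
    p_bar = p.mean()
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    between = (h_total - h_within) / h_total   # this ratio is F_ST
    return 1.0 - between, between

within, between = apportion_variation([0.3, 0.5, 0.7])
print(f"within populations: {within:.1%}, between populations: {between:.1%}")
```

For these invented frequencies, roughly 89 percent of the variation lies within populations and about 11 percent between them, which happens to fall inside the range quoted above.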
In contrast to these broad-scale patterns, more recently researchers have tried to look at fine-scale population structure. Usually these studies focus on only the two or three top-ranking axes of genetic variation emerging from the statistical analysis. For example, studies of European individuals have shown a strong correlation between the top two axes of genetic variation and the geographical locations where the individuals were sampled.
In their newly published findings, UW researchers demonstrated that substantial information on population structure is hidden more deeply in the genomic data. They were able to identify 18 significant, informative axes of variation. Some of these distinguished particular populations.
The UW researchers also conservatively estimated the set of all of the SNPs, or specific, tiny DNA code differences, associated with each of the most informative axes of variation. These variations represent numerous fixed positions on the human genome where different biomarkers can sit and thereby form a "genomic signature" of population structure. They also allow for more detailed inferences, the UW researchers noted, about the evolutionary forces that shape the fine-scale patterns of human population structure.
"The genome-wide distribution of these markers," Akey believes, "can largely be accounted for by genetic drift." Genetic drift is the gradual accumulation of random changes in a population's gene pool, an effect that is strongest in small populations. Akey added that some of these variations, however, do cluster in regions of the human genome considered to be targets of recent adaptive evolution.
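Genetic drift can be illustrated with a short simulation. The sketch below uses the textbook Wright-Fisher model, which is an assumption of this example and not something taken from the study. The population sizes, the starting frequency and the name `wright_fisher` are all hypothetical.

```python
import numpy as np

def wright_fisher(pop_size, start_freq, generations, replicates, rng):
    """Follow one allele's frequency in many independent populations under drift alone."""
    n_copies = 2 * pop_size                       # two gene copies per person
    freqs = np.full(replicates, start_freq, dtype=float)
    for _ in range(generations):
        # Each new generation is a random draw of gene copies from the previous one.
        freqs = rng.binomial(n_copies, freqs) / n_copies
    return freqs

rng = np.random.default_rng(1)
small = wright_fisher(pop_size=50, start_freq=0.5, generations=50, replicates=200, rng=rng)
large = wright_fisher(pop_size=5000, start_freq=0.5, generations=50, replicates=200, rng=rng)
print("spread of final frequencies, small populations:", small.var())
print("spread of final frequencies, large populations:", large.var())
```

With no selection in the model, the only thing changing the frequencies is chance. The small populations should end up far more scattered around the starting value than the large ones.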
The researchers also observed patterns of human genetic variation that correlated with geography in essentially every continental group. While such geographical patterns have been described in European samples, the researchers think that the extent to which such geographic correlations might be found on other continents may not be fully appreciated.
In mentioning the limitations of his study approach, Akey cautioned that there are still questions about the best way to design human population genetics research in terms of sampling individuals and populations, but that progress is likely.
"Now that we have increasingly dense catalogs of genetic variation," Akey wrote, "the details of human population structure are becoming more tractable. The testing of increasingly refined hypotheses about human population structure should yield new insights into the history and relationships among human genomes."
Eurekalert 2009.