hum-molgen.org is starting a series of interviews and discussions with authors of recent papers that have been highly influential in human molecular genetics.
We start with an article that has been central to the consolidation of genetic association studies, "The future of genetic studies of complex human diseases" (1). This two-page Perspective was published in Science in September 1996. The authors were Neil Risch, PhD (University of California at San Francisco) and Kathleen Merikangas, PhD (National Institute of Mental Health).
The authors presented the idea that association studies could be more powerful than linkage studies for the analysis of complex human disorders. They based their conclusions on a series of calculations of the numbers of families required to achieve 80% power with the two methodological approaches: linkage (analyzing affected sibling pairs) and association (analyzing transmission of alleles in trios, or association in pairs of affected siblings). The calculations covered several frequencies of the disease allele (0.01, 0.1, 0.5 and 0.8) and several Genotypic Risk Ratios, or GRR (1.5, 2.0 and 4.0). For example, for a GRR of 1.5 and an allele frequency of 0.1, the sample size needed was around 30 times larger for the linkage approach than for an association study using trios (67,816 vs 2,218, respectively).
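The trio calculation can be sketched in a few lines. This is a minimal illustration, not the authors' own code: it assumes a multiplicative risk model, a one-sided significance level of 5 x 10^-8, 80% power and a normal approximation, and the function name tdt_trios_needed is hypothetical. Under those assumptions it lands close to the trio sample size quoted above for a GRR of 1.5 and an allele frequency of 0.1; small differences come from rounding of the normal quantiles.

```python
from math import ceil, sqrt
from statistics import NormalDist


def tdt_trios_needed(grr, freq, alpha=5e-8, power=0.80):
    """Approximate number of trios needed by a transmission test.

    Assumes a multiplicative model: relative risk grr for one copy of
    the risk allele and grr**2 for two copies. alpha is one-sided.
    """
    q = 1.0 - freq
    # Probability that a parent of an affected child is heterozygous.
    het = freq * q * (grr + 1.0) / (freq * grr + q)
    # Excess transmission of the risk allele from a heterozygous parent:
    # P(transmit) - P(not transmit) = (grr - 1) / (grr + 1).
    excess = (grr - 1.0) / (grr + 1.0)
    # Mean and standard deviation of the per-parent score, which is
    # scaled to have variance 1 under the null hypothesis.
    mu = sqrt(het) * excess
    sigma = sqrt(1.0 - het * excess ** 2)
    z_alpha = -NormalDist().inv_cdf(alpha)  # one-sided critical value
    z_power = NormalDist().inv_cdf(power)
    # Each trio contributes two parents, hence the factor of 2.
    return ceil((z_alpha + sigma * z_power) ** 2 / (2.0 * mu ** 2))


# GRR of 1.5 with allele frequency 0.1, then GRR of 4.0 for comparison.
print(tdt_trios_needed(1.5, 0.1))
print(tdt_trios_needed(4.0, 0.1))
```

Lowering the GRR or moving the allele frequency towards 0.01 makes the required number of trios grow quickly, which is the pattern the Perspective tabulated.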
They also stressed the importance of developing genome-wide exploration of polymorphisms for association studies. This was in 1996, years before the sequencing of the human genome, the consolidation of repositories with millions of SNPs, the identification of genome-wide patterns of LD, haplotype blocks and htSNPs, and the development of genotyping chips for hundreds of thousands of SNPs. Although the number of human genes expected in 1996 was 5 times the actual number, they proposed a genome-wide significance level that is similar to the one currently used (10^-8).
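A threshold of this order follows from a simple Bonferroni correction. The sketch below is only an illustration: the family-wise error rate of 0.05 and the test count (100,000 genes, 5 variants per gene, 2 alleles per variant) are assumed figures chosen to match the gene estimate mentioned above, and the function name bonferroni_threshold is hypothetical.

```python
def bonferroni_threshold(family_wise_alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_wise_alpha / n_tests


# Assumed, illustrative test count: genes x variants per gene x alleles.
n_tests = 100_000 * 5 * 2
print(bonferroni_threshold(0.05, n_tests))
```

With one million tests, the per-test level falls in the 10^-8 range, regardless of how the tests are distributed across genes.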
According to Google Scholar, this paper has been cited 3012 times. It has been one of the theoretical cornerstones of the common disease/common variant hypothesis and one of the initial inspirations for the consolidation of genome-wide association studies (GWAS).
We asked Drs. Merikangas and Risch some questions in order to put the current context of GWAS into perspective and to highlight possible implications for the future of genetic association studies.
-In your opinion, 13 years after the publication of your paper, what has been its largest influence on human genetics?
The paper proposed that sequencing of one individual would not provide the incredibly valuable tool of knowing the genetically variable sites in the genome, which would require sequencing multiple individuals. So, in general terms, probably its greatest impact was that it led to the identification and characterization of genetic variation in the human genome. This information has now had a significant impact on a variety of areas, including human evolution and genetic relatedness of individuals and populations, the distribution of linkage disequilibrium between and within populations and its distribution across the genome, the characterization of individual ancestry, and, perhaps most significantly, it has now led to the novel identification of hundreds of genetic variants associated with both common and rare diseases, as we had originally proposed. GWAS have now led to the identification of more than 300 new variants for 70 complex diseases (2).
We were originally writing a perspective based on another story we had seen in Science Magazine titled "Has epidemiology reached its limits?" At the time, the major tool in human genetics used to identify disease susceptibility variants for complex diseases was linkage analysis. We could already see the limitations of that approach, based on the studies that had been done up to that point. That article did not comment at all on what was going on in human genetics and genetic epidemiology. So, originally we were going to write an article on "Has human genetics reached its limits?" But instead, we tried to decide what the real limitations were at the time in terms of human genetic analysis.
Instead of simply writing a negative piece, we wanted to give a more positive message. So, we thought if we could have any tool to do genetic studies, what would it be? We decided that annotating the genetic variation in the human genome and creating platforms for assaying it in disease studies was what we would want. We already knew from some candidate gene studies, primarily those based on HLA, that weak associations could be identified if you knew the right gene to study, and that signals from association studies were far stronger than for linkage studies. But the problem was that in most cases we have no knowledge, or even a guess really, as to the "right" gene. In fact, there might be many such genes.
Therefore, we considered the whole genome as a candidate, and asked what the power would be if we needed to assay genetic variation across the entire genome. We showed that even employing a very strict Bonferroni correction, you still had excellent power to detect weak associations that could never be identified in linkage studies. Thus, we decided to propose this approach as an alternative to linkage studies (also, association studies can be done on singletons and do not require multi-case families, which are often more difficult and/or expensive to recruit). So, ultimately we ended up with that as the message of the perspective. It appears that time has borne out the logic of our conclusions.
-You showed that the main differences between linkage and association approaches were for GRR of 1.5. Many of the recent GWAS are showing that the ORs for the positive markers are below 1.5. What are your opinions about sample sizes/study designs under these circumstances?
Actually, when the GRR is less than 1.5, linkage studies are even less practical. Of course, GRRs below 1.5 require larger sample sizes, but fortunately the technology has advanced to a point that allows economical large-scale genotyping of tens of thousands of individuals. However, there probably will be a limit in terms of how low a GRR is detectable, as there should be. For example, a GRR of 1.1 would be a reasonable lower limit. Of course, this also depends on allele frequencies, so this would be true for a common variant. For a rare variant, the GRR would need to be higher. So, because the costs are no longer prohibitive, there is room for more large-scale studies, to see how much we can learn from this approach.
-What are your current opinions about the controversy around the Common Disease/Common Variants and the Common Disease/Rare Variants hypotheses?
This is a misnomer, and terminology we never used. How rare is rare? How common is common? The prevalence of disease is really irrelevant, because common genetic variants can underlie risk of rare diseases as well as common ones. Of course, association studies as they are being conducted today are not generally powerful to identify rare variants (frequency of 5% or less), which require a different approach.
It is also a false dichotomy. There is a distribution of allele frequencies, which is reasonably uniform until you get to 10%, and below 10% the number of variants starts to increase exponentially (or more). So, it is true that many variants with frequencies less than 10% have probably not been well identified in the current wave of GWAS studies. However, as the allele frequency gets low, the power will also be low unless the GRR is high. Ultimately, that starts to move into the realm of variants identifiable by linkage analysis.
Already the field has started to move towards sequencing, to identify lower frequency variation and its association with disease. The problem here is the power will be low unless GRR is high. Time will tell how many such variants will be successfully identified.
-Is there any recent paper that you think will be highly influential for human genetics in the near future (to recommend to our readers)?
Probably the most significant impact will come from efficiencies in high throughput large scale sequencing. Of course, analysis of those data will also create new (but interesting!) challenges. There are several companies that are focusing on developing advanced technologies to do this. Advances in sequencing technology will be one of the most important future developments.
Application of the tools of systems biology, which attempts to put together information from SNP variation studies with expression studies and disease information, will enlighten the biologic interpretations of SNP associations. Several recent commentaries have described how GWAS may serve as a starting point for future functional studies of the biology underlying complex diseases (3). Integrative thinking about future applications of systems biology can be found in articles such as those by Eric Schadt and his colleagues; for example, see reference (4).
-Dr. Merikangas, you are currently working in psychiatric genetics. Do you consider that psychiatric disorders need special GWAS approaches?
No, I do not think that we will need special approaches for GWAS. The chief impediment to identification of genetic variants underlying psychiatric disorders is the complexity of the phenotypes. We still lack understanding of their etiology and valid disease markers, and there is also increasing evidence that the disorders have multifactorial etiology and are both phenotypically and etiologically heterogeneous. For example, emerging evidence from prospective studies has implicated exposure to environmental factors at critical periods in the development of schizophrenia.
-Dr. Risch, you have a very strong mathematical background. Do you think that mathematics will be even more central in human genetics in the coming years?
Indeed. Putting together all the vast amount of information is going to require new analytic tools as well as computational approaches. Having strong mathematical and computational skills will be a great asset in the years to come in the field of human genetics.
(1). Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science. 1996 Sep 13;273(5281):1516-7.
(2). Donnelly P. Progress and challenges in genome-wide association studies in humans. Nature. 2008 Dec 11;456(7223):728-31.
(3). McCarthy MI, Hirschhorn JN. Genome-wide association studies: potential next steps on a genetic journey. Hum Mol Genet. 2008 Oct 15;17(R2):R156-65.
(4). Schadt EE, Zhang B, Zhu J. Advances in systems biology are enhancing our understanding of disease and moving us closer to novel disease treatments. Genetica. 2009, in press.
Neil Risch, PhD
University of California, San Francisco, California, USA
Kathleen Merikangas, PhD
National Institute of Mental Health, Bethesda, MD, USA
We thank Drs. Merikangas and Risch for sharing with us these very interesting and inspiring comments. We will be covering other topics and influential papers during the coming weeks. Stay tuned!
Diego A. Forero, MD