"Pythagoras wrote that numbers set a limit to the limitless and that they constitute the true nature of things." Perhaps this goes some way to explaining the frenzied fixation with the total number of genes in the human genome: that, in having a grip on the size of our genetic complement and complexity, we are better acquainted with our genomes or, some might even say, ourselves. On a more practical level, the number of human genes has some commercial relevance. Some biotechnology companies, such as Incyte and Human Genome Sciences, assert that they have evidence (and, by implication, the sequences) of over 100,000 genes, thereby making access to their private databases more enticing to paying customers.
Two papers in this month's issue of Nature Genetics support a much lower number of human genes. Brent Ewing and Philip Green (University of Washington) present an elegant study in which they conclude that the true number is closer to 34,000 [Nature Genetics Volume 25 Number 2 Page 232-234 (2000)]. Strikingly, a similar estimate of 30,000 genes is made by Jean Weissenbach and colleagues (of Genoscope, France) using an independent method that involves comparing the human genome sequence (which is becoming more and more comprehensive) with the genome of another vertebrate (the pufferfish Tetraodon) [Nature Genetics Volume 25 Number 2 Page 235-238 (2000)]. These results are also consistent with predictions based on the combined number of genes in chromosomes 21 and 22.
In contrast is a third study carried out by John Quackenbush and colleagues at The Institute for Genomic Research [Nature Genetics Volume 25 Number 2 Page 239-240 (2000)]. Using a different method altogether, these researchers come up with an estimate of 120,000 genes.
Why the difference? This question was the subject of considerable debate at a recent meeting at Cold Spring Harbor Laboratory. One explanation for the high estimate obtained by Quackenbush and colleagues concerns the method that they used. Like Ewing and Green, they use a method that involves comparing genomic DNA with expressed sequence tags (ESTs; these are stretches of DNA that represent parts of genes, and can be easily synthesized in the laboratory). However, it is becoming increasingly clear that one cannot assume that a collection of ESTs accurately mirrors the gene content of a genome; moreover, the fact that one gene can generate two or more different kinds of ESTs further complicates the exercise of 'calling' genes from ESTs. Both groups took precautions to minimize these complications, although they used different methods to do so, in addition to using analytical approaches.
The present studies are the latest in a series of predictions regarding the number of genes in the human genome and therefore have the advantage of making use of an augmented data set. The coincident findings of two of the groups provide strong support for those who champion a more modest human gene complement, and are consistent with the emerging view that biological complexity is not dependent on gene number, but on gene regulation, splicing and evolution.
- Dr. Philip Green
Department of Molecular Biotechnology
University of Washington
Telephone: +1 206 685 4341
Fax: +1 206 685 7344
- Dr. Jean Weissenbach
Centre National de Séquençage
Telephone: +33 1 60 87 25 02
Fax: +33 1 60 87 25 32
- Dr. John Quackenbush
The Institute for Genomic Research
Telephone: +1 301 838 3528
Fax: +1 301 838 0208
- Dr. Samuel Aparicio
Wellcome Trust Center
Telephone: +44 1223 762 663
Fax: +44 1223 336 902
(C) Nature Genetics press release.
Message posted by: Frank S. Zollmann