On 12 February Nature (Vol. 409, No. 6822, 15 Feb 2001) will publish a series of papers on the human genome that will have a significant impact on medicine and biology. The papers will be freely accessible on the Nature site at www.nature.com. The print version will appear in the 15 February issue. In all, Nature will publish 150 pages on the genome, organized into three major sections:
Research. This section comprises 4 Research Articles and 7 Letters covering a host of aspects of genome sequencing and mapping. Included is the definitive analysis of the draft sequence. It's long (over 30,000 words) but we think it's fascinating: one of our reviewers confessed that it "gave him goosebumps". Another Article covers mapping of the genome, dealing with what Clinton has described as "the most important, most wondrous map ever produced by mankind". Other papers focus on the Y chromosome, SNP maps, telomere regions, physical and genetic distances, and cytogenetics.
Analysis. Several months ago we paired up nine leading biological and biomedical scientists with bioinformatics experts and set them loose on the genome. The result is the first comprehensive exploration of the practical impact that the genome sequence might have on science. Look out for some important insights into areas such as addiction, cancer biology, circadian rhythms, membrane traffic and transcription.
News and Views. Seven of the most distinguished scientists in the world have provided articles on topics including the evolution and migration of humans, implications for medical genetics and the need for new bioinformatics tools.
Information about the individual papers follows below...
Lander et al.
This is the main sequence paper from the International Human Genome Sequencing Consortium. It's 62 pages long, with 49 figures and 27 tables. (Size: 1.4 megabytes)
Geoffrey Spencer (Press Office, NHGRI)
tel +1 301 402 2219
Joni Westerhouse (Press Office, Washington University, St Louis)
tel +1 314 286 0120
Seema Kumar (Press Office, Whitehead Institute)
tel +1 617 258 7270
Shaun Griffin/Noorece Ahmed (Wellcome Trust Press Office)
tel +44 20 7611 8612 / 8540
McPherson et al.; Bruls et al.; Bentley et al.; Kucherlapati et al.; Page et al.: the whole-genome clone-based physical map and other maps (+Olson N&V)
Most whole genomes are sequenced by 'shotgun sequencing'. The genome is blasted into fragments and then reassembled. This is easy for small, simple genomes but is a real challenge for the human genome because it is large compared to previously sequenced genomes and also because it is more complex and more repeat-laden than any other genome to have been tackled. The same segment of sequence is often repeated over and over again, sometimes in tandem and sometimes in completely different parts of the genome - this can confuse reassembly of the fragments. To get around this problem, the HGP created a physical map of the genome. They generated a series of overlapping fragments of about 100-200 kilobases covering the entire genome and fingerprinted each of them so they could be mapped relative to each other and positioned on the chromosomes. Each fragment was sequenced, then the sequence was overlaid onto the map scaffold and merged to reassemble the human genome. In their paper the International Human Genome Mapping Consortium describe how they constructed the first whole-genome physical map and how they created the templates from which the genome was sequenced, and demonstrate how the map was essential for the accurate assembly of the human genome by the publicly funded effort. Four short reports accompanying the whole-genome mapping paper (Bruls; Bentley; Kucherlapati; Page) describe alternative mapping strategies that were implemented for chromosomes 12, 14 and Y, as well as a host of other chromosomes. Information from all these papers was integrated into the whole-genome paper and demonstrates how a rich resource of mapping information can be generated by the cooperation of international independent efforts.
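The repeat problem described above can be illustrated with a toy sketch in Python. It is not the consortium's assembly method: the sequences, the function names and the minimum overlap length are all invented for illustration. It only shows that when two fragments end in the same repeat, sequence overlap alone cannot decide which fragment comes next.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def successors(read: str, reads: list[str], min_len: int = 3) -> list[str]:
    """Fragments that could follow `read`, judging by sequence overlap alone."""
    return [r for r in reads if r != read and overlap(read, r, min_len) > 0]


# Invented toy genome in which the 7-base repeat GATTACA occurs twice:
# CCGT GATTACA TTGC GATTACA AGGA
reads = [
    "CCGTGATTACA",  # unique start, ends in the first repeat copy
    "GATTACATTGC",  # first repeat copy, then the unique middle
    "TTGCGATTACA",  # unique middle, ends in the second repeat copy
    "GATTACAAGGA",  # second repeat copy, then the unique end
]

# The first fragment overlaps both repeat-led fragments equally well,
# so overlap alone cannot say which one really follows it.
candidates = successors(reads[0], reads)
print(candidates)
```

A physical map removes this ambiguity because each sequenced fragment is already tied to a known position, so the choice no longer rests on overlap alone.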
Dr John McPherson
tel +1 314 286 1800
Dr Thomas Bruls
Dr David Bentley
tel +44 1223 834 244
Dr Raju Kucherlapati
tel +1 718 430 2069
Dr David Page
tel +1 617 258 5203
Dr Maynard Olson
tel +1 206 685 7346
Riethman et al.: telomere map
Telomeres are the tips of the chromosomes. They are crucial in maintaining the chromosomes' stability and are important in the cell cycle and ageing. Because of the way the physical maps are constructed, many telomeres of chromosomes are left out. Riethman et al. used a special way of capturing the ends of all the telomeres to ensure that the whole-genome map stretches all the way to the tips of the chromosomes. The authors also show from their computational analysis that these areas are not boring junk DNA but contain many interesting gene sequences likely to be important to our cells.
Dr H C Riethman
tel +1 215 898 3872
Page et al.: Y chromosome
The Y is under siege. Not only has it been reduced to a little stump, it has also shut down most of the expression of its genes. It has essentially hunkered down to try to protect itself during battles of the sexes with the X chromosome (women have two X chromosomes; men have one X and one Y). While constructing the map of the Y chromosome, the authors stumbled across some surprises - basically, large chunks of the Y chromosome are repeats, and the regions are so similar that they are almost impossible to tell apart. Many of the male-specific genes lie in these regions, such as those involved in testes development and sperm production. It is possible that this duplication of the regions is a means of ensuring that they aren't lost in the battle with the X.
Dr David Page
tel +1 617 258 5203
Trask et al.: cytogenetic map
This paper takes a macroscopic view of the genome, visualizing all of the chromosomes. It places 7,500 landmarks across the entire set of chromosomes, like signposts, so that you always know which sector of the genome you're in. Each landmark was mapped to specific bands on the chromosomes using a method where each tag is labelled with a fluorescent dye and then bound to the genome to determine which part of the chromosome it lights up and therefore which position in the genome it corresponds to. This tagging provides a powerful tool for studying human diseases, such as cancer, which are frequently caused by chromosomes breaking in certain places. Researchers can much more rapidly determine which regions are 'broken' and which genes are affected. Indeed, the authors describe how they used their cytogenetic map to track down the region disrupted in a case of mental retardation and identified several strong candidate genes.
Dr Barbara Trask
tel +1 206 667 1470
Weber et al.: recombination and the human genome
Each of us carries two sets of 23 chromosomes. During formation of our sex cells, each pair of chromosomes joins together to exchange homologous regions, so each offspring inherits a unique combination of chunks swapped between each parent's paired chromosomes. The rate of exchange between chromosomes drives our diversity and influences the evolution of our species. With the physical maps and sequence to hand, it is now possible to take a glimpse at the rate at which chromosomes swap with their partners. This reveals that recombination is not uniform across the genome. Rather, our chromosomes are a mixed terrain, with 'deserts' where recombination seemingly happens infrequently, and 'jungles', where exchange often occurs. Understanding the pattern of recombination across the genome is important for mapping human disease genes because it will influence how these studies are designed. The higher the rate of recombination in a region, the higher the density of genetic markers needed to determine, at sufficient resolution, whether the region of the genome is inherited with a disease and therefore potentially contains the causative gene aberration.
Dr James Weber
tel +1 715 387 9179
Altshuler et al.: the map of human variation (+Chakravarti, Stoneking N&Vs)
Most human genetic variation occurs as different nucleotides at single base positions - called single nucleotide polymorphisms, or SNPs. The latest map of nucleotide diversity across the human genome, from Altshuler et al., catalogues 1.42 million SNPs across the genome. On average, there is one SNP every 1.9 kilobases. Nucleotide diversity varies greatly across the genome, and the pattern of diversity varies for different populations. This chart of human variation will be crucial for studying complex traits, where subtle changes in one or several genes can lead to the common diseases that affect the human species, such as cardiovascular disease, diabetes, and so on. In addition, mapping the variation in genomes between populations in different nations will help us understand human evolution and migration.
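As a rough consistency check on the two figures quoted above, multiplying the SNP count by the average spacing gives the span of sequence that the map implies. This is back-of-envelope arithmetic in Python, not a calculation taken from the paper, and the variable names are illustrative.

```python
snp_count = 1_420_000   # SNPs catalogued, as quoted in the text
spacing_kb = 1.9        # average spacing between SNPs in kilobases, as quoted

# kilobases -> bases -> billions of bases
implied_span_gb = snp_count * spacing_kb * 1_000 / 1e9
print(f"Implied span: about {implied_span_gb:.1f} billion bases")
```

The two numbers together imply a span of roughly 2.7 billion bases.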
Dr David Altshuler
Dr Aravinda Chakravarti
tel +1 410 502 7525
Dr Mark Stoneking
tel +49 341 995 2502
Boguski et al.: microarray annotation
Guessing the number of human genes is a speculative business. Boguski et al. have devised a means of experimentally validating gene predictions and refining the definition of gene structures using microarray technology. As proof of principle, the authors analyse gene predictions for chromosome 22, the first chromosome to be fully sequenced and exhaustively annotated. Although they were able to validate the majority of known genes on chromosome 22q, some were missed, indicating that their algorithms for detecting subtle differences in gene expression need further refinement. But more than half of the gene predictions based solely on ab initio computer predictions were confirmed - far exceeding earlier expectations - illustrating that this is an effective means of quickly assessing the validity of computational predictions. It's a long way from comprehensively defining the structure of every gene in the human genome, but this novel approach offers a rapid means of validating computational predictions and training the next generation of gene-hunting programs.
Mary C. Drummond (Rosetta Inpharmatics Inc)
tel +1 425 823 7369
For more detailed information from Rosetta, see http://www.rii.com/200102natPress/
WHAT WE LEARN FROM THE SEQUENCE: THE HIGHLIGHTS (+Baltimore, Rubin & Bork N&Vs)
Professor David Baltimore
tel +1 626 395 6301
Dr Gerald Rubin
tel +1 510 643 9945
Dr Peer Bork
tel +49 6221 387526
We have many fewer genes than might have been expected for a relatively complex organism. Flies (Drosophila) have 13,000 genes, nematode worms (Caenorhabditis elegans) have 18,000 and thale cress (Arabidopsis thaliana) has 26,000. From analysis of the human draft genome, there only seem to be 30,000-40,000 genes. Furthermore, the additional genes are not primarily the result of invention of new protein domains. We have many of the same protein families as flies and worms, although we have more in each family. The additional genes come from reshuffling the number and order of protein domains, analogous to making new cars out of old parts. So if the increasing complexity of humans isn't due to a big rise in gene number, what might explain it? It is likely to be an intricate combination of carefully timed gene expression, processing of gene products and modifications of proteins.
A repetitious genome
More than half of the euchromatic genome consists of repeat sequences, with the vast majority (45% of the genome) accounted for by repeats derived from parasitic DNA, called 'transposable elements' or 'transposons'. The elements propagate by replicating themselves at one site in the genome and then inserting the copy into another site. This degree of transposition is unprecedented among sequenced genomes, including those of fly and worm. Curiously, much of our repeat content represents ancient remnants of long-'dead' transposons, contrasting with the fly and mouse genomes, which harbour younger, more active elements. The type and distribution of the transposable elements in our genome are indicative of the elements that have shaped its evolution. Some types of transposons seemingly flourish in our genome, such as LINE1 and Alu elements (which represent more than 60% of all interspersed repeat sequences). Others seem to have found the environment unsavoury: for example, only faint traces of LTR retroposons are detectable in the human genome, yet they are alive and kicking in the mouse genome.
There is evidence that repeats shaped the evolution of our genome and mediated the creation of new genes. Analysis of the draft genome has brought the total number of genes likely to be derived from transposons to 47, including the genes encoding telomerase and RAG1 and RAG2, the recombinases involved in construction of immune system receptors. Several hundred genes use fragments of transposons in the regulatory sequences that control expression and transcription termination. This suggests that, at least in part, transposons are retained because they confer some advantage.
Surprisingly, one of the two most prolific transposable elements found in the human genome, Alu, a type of short interspersed element (SINE), is enriched in the GC-rich regions of the genome where most genes are found. This is purely correlative, but it suggests that Alu elements like to snuggle up to actively transcribed genes and potentially convey some selective advantage, such as that observed to be offered by SINEs in other species that promote protein production in times of stress. But not all transposition is good. Approximately 1 in 600 mutations in humans is the result of transposition and, although most probably have little consequence, some may be detrimental. For example, a transposition event in the APC gene is responsible for a case of colorectal cancer.
Duplications also appear to have had a significant role in genome evolution, with roughly 5% of the sequence arising from duplication of large blocks (of more than 10 kilobases) within and between chromosomes. This is a much more prevalent feature in human than in fly, worm or yeast. Duplications enable one copy of a gene to relocate to a new site where it may take on a distinct physiological function. Highly homologous duplicated regions are likely to have contributed greatly to the expansion of gene families in humans, an extreme example being the large olfactory receptor gene family, which comprises more than 1,000 members. But duplication can also cause problems. Unequal crossing over between large, nearly identical regions on the same chromosome can delete large chunks of genome, causing disease. Examples are DiGeorge syndrome, or velocardiofacial syndrome, which arises from deletions between duplicated regions on chromosome 22, and Williams-Beuren syndrome, which arises from deletions on chromosome 7.
Input from bacteria
Bacteria have also left their mark on our genome. Remarkably, 223 genes found in humans are more similar to bacterial genes than to anything seen in yeast, worm, fly or plants. And they appear to have been transferred from a range of bacterial species. The same genes are found in other vertebrate species, indicating that they were introduced early in the vertebrate lineage. Is this a case of bacterial genes hitchhiking an evolutionary ride, or is there something in it for us? Most of the bacterially inherited genes encode enzymes and these have been sequestered into specific pathways, such as stress responses and metabolism of xenobiotics, indicating that the genes have been adapted to important physiological functions. For example, relatives have been discovered of monoamine oxidase, an enzyme of the mitochondrial outer membrane that is involved in the metabolism of neuromediators and is a target of important psychiatric drugs.
ANALYSIS: THE DATA-MINING REPORTS
In a series of reports accompanying the research papers, Nature invited nine experts from different areas of the biological sciences to explore the human genome sequence and see how many genes from specific gene families they could find and what novel insight they could decipher from the sequence. As Birney et al. explain in an accompanying introduction to the data-mining series, a draft genome sequence is a challenging dataset and there are many hurdles to be overcome and things to be wary of when exploring the sequence.
Dr Ewan Birney
tel +44 1223 494420
Reppert et al.: clock genes
Our daily biological rhythm, or circadian clock, is controlled by genes. They drive our sleep-wake cycle, hormone levels and variations in body temperature. Disturbances in the rhythm of our inner clocks can lead to sleep disorders, poor health and neuropsychiatric disorders. It's also the reason why we suffer jet lag at the end of a long flight. Reppert et al. screened the genome for new clock genes. They found several that may help us understand the genetic basis of our daily rhythm, provide new targets for treating jet lag and sleep disorders, and identify the genes that make some of us early risers and others night owls.
Dr Steve Reppert
tel +1 617 726 8450
Nestler et al.: genetics of addiction
Much of the risk of addiction is genetic. Addiction to drugs typically involves changes in the sensitivity of a brain receptor. Nestler and Landsman scoured the human genome for relatives of genes encoding brain receptors that are involved in drug addiction. Their findings are likely to improve the understanding of neurological changes associated with addiction and why some of us are more vulnerable to substance abuse than others.
Dr Eric Nestler
tel +1 214 648 1111
Green et al.: transcription and gene expression
Green et al. studied genes involved in the various processes that influence gene expression. They searched for relatives of genes critical to these processes. They found that the increased intricacy in the machinery that controls gene expression is an important reason why humans are more complex than 'simple' creatures such as flies and worms.
Dr Michael Green
tel +1 508 856 5331
Scheller et al.: compartment organization and trafficking
Our cells are divided into compartments where different cellular processes take place. Many proteins transiently pass through various compartments en route to their final destinations. The transport system of a cell involves bubble-like vesicles budding off from one compartment and then being trafficked to another. A number of well defined gene products are involved in this transport process. Scheller et al. explore the genome, finding many new members of these families; they also more clearly define distinct subgroups within families.
Dr Richard Scheller
tel +1 650 723 9075
Pollard: cytoskeleton and motility
The cytoskeleton shuttles proteins around the interior of a cell. It is made of actin filaments, intermediate filaments and microtubules. Myosin proteins move cargo along the actin filament conveyor belts; dynein and kinesin motors drive movement along the microtubules. Pollard hunted for new proteins of the cytoskeleton that are encoded by the genome sequence. The author tapped into a set of molecular targets for treating cardiovascular and skeletal muscle disorders, which can often involve disruption of the cell's scaffold and contractile motors.
Dr Thomas Pollard
tel +1 858 453 4100 ext 1716
Li et al.: evolutionary analyses
Li et al. examine repeats, the sharing and conservation of domains between species, and duplications, in the context of how these have influenced human evolution and how we differ from flies, worms and yeast. They conclude that most of our genome is composed of repeats - repeats that drive the evolution of new genes, because duplication of regions of the genome is another route to expansion of gene families. They also conclude that we share many common protein domains with other species but that our increase in gene number seems to be derived from unusual ways of mixing and matching the same domains rather than creating novel domains.
Professor Wen-Hsiung Li
tel +1 773 702 3104
Fahrer et al.: immunology
Fahrer et al. look for new members of three examples of immunologically important proteins: the tumour-necrosis factor receptor family, the B7 family of costimulatory proteins and cytokines. They find new candidates that potentially reveal new regulators and mediators of the immune system that underpin immunological traits and treatments for aberrations of the immune system.
Dr Aude Fahrer
Murray et al.: cell cycling
The cell cycle is an intricate choreography of cellular processes including DNA replication, segregation of the chromosomes and cell division. The cell cycle is governed by the activity of cyclins and cyclin-dependent kinases. The search of Murray et al. for relatives of these proteins reveals how amazingly conserved multicellular organisms are, from flies to humans, in using the same machinery to regulate the cell cycle.
Dr Andrew Murray
tel +1 617 496 1350
Stratton et al.: cancer
Cancer isn't typically caused by a single mutation - rather, it is a large-scale genome meltdown, with chromosomes snapped in places, whole chunks rearranged, some bits lost and others duplicated. Stratton et al.'s perusal of the draft genome sequence did not reveal any new relatives of the infamous cancer genes, such as p53. But they did find that comparing the genomes of cancerous cells with the normal reference human genome sequence can uncover new genes involved in the major rearrangements of chromosomes that contribute to cancer.
Dr Michael Stratton
tel +44 1223 494757
Valle et al.: human disease genes
From analysis of a compilation of nearly 1,000 disease genes, Valle et al. show that the largest share (just over 30%) of inherited diseases caused by a single gene defect are associated with genes that encode enzymes. Overall, they find strong correlations between the function of the gene and the features of the disease, including age of onset and pattern of inheritance. Specifically, each of the four functional categories defined has a different peak age at onset: transcription factors in utero, enzymes by one year, receptors between year one and puberty and modifiers of protein function in early adulthood.
Dr David Valle
tel +1 410 955 4260
Church et al.: the comparison
The authors compare the vital statistics for the publicly funded Human Genome Project and the private Celera effort in terms of amount of sequence, gaps, number of unique sequences and continuity of sequence. They find that the two sequences are essentially comparable. While there are some differences at the detailed level, such as how the genomes are packaged and the gap distributions, the authors predict that these differences will quickly be resolved as both sequences become more complete.
Dr George Church
tel +1 617 432 7562
Schuler et al.: the guide
This 'Guide' helps the novice navigate around the different features of the genome and the type of research described in Nature's pages.
Dr Greg Schuler
tel +1 301 496 2475
Quotes from Philip Campbell, Editor in Chief of Nature:
"Nature has published many groundbreaking papers. But never before have we published a collection of papers as informative or as breathtaking in the scope of what it reveals about human life, as today's papers on the analysis and sequence of the human genome."
"We at Nature are delighted to uphold the principle at the heart of the Human Genome Project: free and unrestricted access. Today Nature is making the Project's analysis of the details and significance of the sequence freely available to the world without restriction on the Web."
"Our bodies, our behaviour and our minds couldn't exist without the human genome, and are all profoundly influenced by its content in ways that we have yet to fathom. The sequence unveiled today by the International Human Genome Sequencing Consortium unlocks new ways of exploring those influences."
"Never have we published in one issue analyses that relate to so much of the essential make-up of every human being."
"Every editor dreams of a day like today."
(C) Nature press release.
Message posted by: Trevor M. D'Souza