Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
April 29-30, 2008
The Genome Access Course is an intensive two-day introduction to bioinformatics. The course is broken into modules, each designed to give a broad overview of a given topic, with ample time for examples chosen by the instructors. Each module features a brief lecture describing the theory, methods and tools, followed by a set of worked examples that students complete. Students are encouraged to engage instructors during the course with specific tasks or problems that pertain to their own research. The core of the course is the analysis of sequence information framed in the context of completed genome sequences. Featured resources and examples come primarily from mammalian species, but the concepts can be applied to any species. The course also features methods to assist in the analysis and prioritization of gene lists from large-scale microarray gene expression and proteomics experiments. A summary of the topics is listed below:

Sequence Resources and Databases
- Extracting information from Entrez and other repositories
- Sequence formats, including multiple sequence formats
- Sequence conversion – using PISE
- NCBI Entrez Gene, GenBank and other databases
- NCBI RefSeq – Reference sequences
- NCBI features such as LinkOut and BLink
- NCBI TaxBrowser – Taxonomic information
- Ensembl BioMart – bulk sequence and information retrieval
- Model Organism Databases – e.g. Mouse Genome Informatics, Rat Genome Database, FlyBase
- Expasy resources – SWISS-PROT
- PANTHER
- OMIM and human disease information
- JCVI resources

Genome Browsers
- Overview of major genome browsers: Ensembl, UCSC and NCBI Map Viewer browsers
- Compare content of major browsers
- Adding custom tracks
- de novo Analysis of Sequences

Pairwise and Multiple Sequence Alignments
- Methods: Dot Matrix Analysis; Dynamic Programming; k-tuple Methods
- Comparing sequences directly using blast2sequences
- Understanding BLAST and BLAT searches
- Scoring Matrices: PAM, BLOSUM
- Iterative profile and pattern searches: PSI-BLAST, PHI-BLAST
- Multiple sequence alignment programs: CLUSTALW, T-Coffee
- Visualizing & editing multiple alignments

Sequence Polymorphisms
- SNPs – SNP Consortium
- Genotyping
- Haplotypes & HapMap
- Haploview
- Entrez dbSNP & Genome Browsers
- HGVbase

Genome Analysis
- Finding functionally important units in genome sequence by comparing genomes
- Ortholog and paralog prediction at NCBI and Ensembl
- Multicontigview in Ensembl
- Comparison tracks in the UCSC Genome Browser
- DCODE Resources: The ECR Genome Browser, zPicture and rVista
- Jim Watson's Genome

Protein Function and Disease
- Interrogating interlinked databases for protein function and associated disease-relevant information
- Entrez at NCBI: UniGene and HomoloGene
- InterPro at EBI
- Organism-specific databases: OMIM (Human), MGI (Mouse), ZFIN (Zebrafish), FlyBase (Drosophila)
- The Gene Ontology Consortium

Protein Structure
- Protein Data Bank 3D structure database
- DeepView structure viewer
- NCBI Entrez Structure
- Molecular Modeling Database – MMDB
- Conserved Domain Database – CDD
- Cn3D – NCBI structure viewer
- Structure Comparison using VAST and DALI
- Fold Classification with FSSP
- Catalog of Membrane Protein Structures
- Catalytic Site Atlas

Gene Set Enrichment and Pathways Analysis
- Prioritize genes from microarray or proteomics experiments
- Gene set enrichment analysis tools – GSEA and DAVID
- Pathway resources – Reactome, HPRD NetPath, KEGG
- Protein interaction resources – MIPS, MINT, BIND, DIP