Images and Other Help on the Internet

BMB 400

Many resources useful to both students and researchers in molecular biology and biochemistry are available on the Internet. Here are a few of the sites that are particularly useful for the material we are covering.

Access to information and analysis of genes and genomes

1. http://www.ncbi.nlm.nih.gov/

This is a wonderful resource maintained by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine, part of the National Institutes of Health. It provides easy access to sequences, searches of sequence databases (using BLAST), pairwise sequence comparisons (again using BLAST), organized information about human and mouse genes (LocusLink), genome maps (for many species), 3-D structures, literature, and much more.

2. http://www.genome.ad.jp/

The GenomeNet WWW Server in Japan has a great set of sequence analysis tools, information on genomes, and a project to integrate the products of all known genes into metabolic pathways (KEGG).

3. http://www.hgsc.bcm.tmc.edu/SearchLauncher/

This server at the Baylor College of Medicine has many useful tools for sequence analysis and comparisons.

4. http://www.ebi.ac.uk/

The server at the European Bioinformatics Institute has many resources, including databases and tools. It has one of the fastest and most flexible servers for the multiple sequence alignment program CLUSTALW.

5. http://genes.mit.edu/GENSCAN.html

The GENSCAN server analyzes sequences for high-probability candidate exons and introns in complete genes, as well as promoters and polyA signals. "GENSCAN was developed by Chris Burge in the research group of Samuel Karlin, Department of Mathematics, Stanford University. The program and the model that underlies it are described in Burge, C. and Karlin, S. (1997) Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78-94." This is one of the most effective ab initio programs for predicting genes in a DNA sequence.

6. http://ftp.genome.washington.edu/cgi-bin/RepeatMasker

RepeatMasker (A. Smit and P. Green) compares an input sequence to a large database of known repetitive elements (RepBase), identifies and locates the repeats, and masks them so that the non-repetitive portion of the sequence can be analyzed more thoroughly. RepBase has an extensive collection of repetitive elements from many eukaryotic organisms, especially mammals, plants and flies (Drosophila).
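To make the idea of "masking" concrete, here is a minimal Python sketch. It is not RepeatMasker's own code or algorithm: it assumes the repeats have already been located and simply replaces those positions with N. The function name mask_repeats, the toy sequence, and the repeat coordinates are all hypothetical, chosen only for illustration.

```python
def mask_repeats(sequence, repeats, mask_char="N"):
    """Return `sequence` with every repeat interval replaced by mask_char.

    `repeats` is a list of (start, end) pairs, 0-based and half-open,
    so (4, 10) covers positions 4 through 9.
    """
    masked = list(sequence)
    for start, end in repeats:
        if not 0 <= start <= end <= len(sequence):
            raise ValueError(f"interval ({start}, {end}) is outside the sequence")
        # Overwrite the repeat with the masking character, keeping the length.
        masked[start:end] = mask_char * (end - start)
    return "".join(masked)


# Hypothetical example: a toy sequence with one "repeat" at positions 4-9.
toy_sequence = "ACGTAAAAAAGGCT"
toy_repeats = [(4, 10)]
print(mask_repeats(toy_sequence, toy_repeats))  # prints ACGTNNNNNNGGCT
```

Because the masked sequence keeps its original length, any match found later in the non-repetitive portion still has the same coordinates as in the unmasked input.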

7. http://bio.cse.psu.edu/

This site (W. Miller and colleagues at Penn State University) specializes in alignments of long genomic sequences, with tools such as PipMaker and Laj.

8. http://genome.cs.mtu.edu/sas.html

The Sequence Analysis Server at Michigan Tech has some powerful programs for pairwise and multiple sequence alignment.

9. http://genome-www.stanford.edu/Saccharomyces/

http://speedy.mips.biochem.mpg.de/mips/yeast/index.htmlx

These two sites provide "all you ever wanted to know about" yeast.

10. http://www.wormbase.org/

"WormBase is a repository of mapping, sequencing and phenotypic information about the C. elegans nematode. This prototype of the final database is layered on top of ACeDB, and has not been subjected to the rigorous curation expected of the ultimate product. The data available here correspond to the July 2000 release and contain the 'essentially complete' genomic sequence."

11. http://flybase.bio.indiana.edu/

FlyBase has almost "all you ever wanted to know about" Drosophila.

http://www.fruitfly.org/

The Berkeley Drosophila Genome Project.

12. http://www.informatics.jax.org/

Mouse Genome Informatics, from The Jackson Laboratory, provides a wealth of information about the mouse genome and mouse genes.

13. http://gdbwww.gdb.org/

This site provides access to databases containing extensive information on human genes and the human genome.

14. http://www.ncbi.nlm.nih.gov/genome/seq/HsHome.shtml

Current status of the public human genome sequencing effort.

15. http://globin.cse.psu.edu

This site is a good example of integrating information about one particular locus (the β-globin gene cluster of humans), covering gene cluster sequences, naturally occurring mutations (hemoglobinopathies) and globin gene regulation. It also explores applying this approach to larger segments of sequenced DNA (e.g. two bacterial species, or long segments of sequenced DNA in human and mouse).

16. http://www.public.iastate.edu/~pedro/research_tools.html

Pedro's Molecular Biology Research Tools. Pedro did a good job.

Images of macromolecules on the Web

17. http://www.ncbi.nlm.nih.gov/

This is the NCBI site again. Click on the "Structure" button to go to the Molecular Modeling Database. NCBI has developed a special viewer for 3-D images (called Cn3D) that should work on any current computer platform.

18. http://kinemage.biochem.duke.edu/default.html

http://www.prosci.org/Kinemage/

These are good sources for the MAGE program, kinetic images, and information about them. The first is from the authors of MAGE, David Richardson and Jane Richardson at the Department of Biochemistry at Duke University. The second is maintained by the Protein Society, publishers of Protein Science. The Richardsons have made these programs freely available for generating images based on known 3-dimensional structures of molecules and viewing them as movable, 3-D objects in real time. A file with the atomic coordinates of a molecule (obtained, e.g., from the Protein Data Bank at Brookhaven) can be converted into a "kinetic image," or "kinemage," using the program PREKIN_2.5.1, which produces a file with the .kin suffix. These files can be viewed, moved, rotated, etc. using the program MAGE_4.0; the images are kinetic, hence the name "kinemage." This ability to move the image really brings out the 3-D aspects. In addition, one can select the parts of the molecule to be shown, so that complex structures can be broken down into simpler parts for better viewing.
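For a sense of what a "file with the atomic coordinates" contains, the Python sketch below reads one ATOM record in the fixed-column PDB text format. It is only an illustration of the input that programs such as PREKIN start from, not a description of how PREKIN itself works. The function name parse_atom_record is hypothetical, and the example record uses made-up coordinates.

```python
def parse_atom_record(line):
    """Extract atom name, residue name and x/y/z from one ATOM/HETATM line.

    PDB files use fixed columns: atom name in columns 13-16, residue name
    in 18-20, and x, y, z in columns 31-38, 39-46 and 47-54 (1-based).
    """
    if not line.startswith(("ATOM", "HETATM")):
        return None  # not a coordinate record
    return {
        "atom": line[12:16].strip(),
        "residue": line[17:20].strip(),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
    }


# A made-up record, assembled piece by piece so the columns are explicit.
example_line = (
    "ATOM  "      # columns 1-6: record name
    "    1"       # columns 7-11: atom serial number
    " "           # column 12: blank
    " N  "        # columns 13-16: atom name
    " "           # column 17: alternate location indicator
    "MET"         # columns 18-20: residue name
    " "           # column 21: blank
    "A"           # column 22: chain identifier
    "   1"        # columns 23-26: residue sequence number
    "    "        # columns 27-30: insertion code and blanks
    "  27.340"    # columns 31-38: x coordinate (Angstroms)
    "  24.430"    # columns 39-46: y coordinate
    "   2.614"    # columns 47-54: z coordinate
)

print(parse_atom_record(example_line))
```

A viewer needs little more than these x, y, z values for every atom (plus which atoms are bonded) to draw a molecule that can be rotated on screen.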

19. http://www.umass.edu/microbio/chime/explorer/

Protein Explorer, from Eric Martz at the University of Massachusetts, provides a very accessible, informative, and powerful web interface to 3-D structures. It runs on Chime (from MDL), which in turn is built on Roger Sayle's RasMol code. There are some pre-packaged routines, such as those for DNA and for GAL4 complexed to DNA, and you can get a PE image of any molecule in the crystallographic databases (easy-to-use search engines are provided). This is a terrific resource.

20. http://www.bio.cmu.edu/Courses/03438/

This site at Carnegie Mellon tells you how to access RasMol or MAGE and set up your WWW browser (such as Netscape) to run these programs automatically. Other sources of information on MAGE are listed under item 18 above.

21. http://www.bio.cmu.edu/Courses/03438/

This is a site at Carnegie Mellon that supports the nucleic acid portion of a physical biochemistry course. I liked the images, and the site directs you to comprehensive resources for biomolecular structures.

22. http://molbio.info.nih.gov/cgi-bin/pdb

This site at the NIH, called "Molecules R Us," gives easy access to all the 3-D structures in the Protein Data Bank (PDB) at Brookhaven National Laboratory. You can get "gif" files (images) of any structure in the PDB, in a variety of formats. These are flat, 2-D images. It is most helpful to obtain the programs RasMol, MAGE or Cn3D, which allow you to move images generated from the atomic coordinates and thereby get a variety of 3-D views.

Web sites with on-line courses similar to BMB 400

23. http://www.medkem.gu.se/edu/

This has an interactive molecular biology quiz and links to useful tools. It has a heavy emphasis on retrieving information from molecular biology and sequence databases.

24. http://esg-www.mit.edu:8001/esgbio/

This is the MIT Biology Hypertextbook. Click on The Biology Hypertextbook Chapters and follow the links from there. You will find text, figures and additional problems.

25. http://www.life.uiuc.edu/micro/316/#resources

This is a site at the University of Illinois (Urbana-Champaign) that can take you to additional problems in microbial genetics, including several on complementation analysis, a microbial genetics glossary, and other useful stuff.