Bioinformatics and Comparative Genomics of Chlorobi

Genome sequencing, assembly and annotation

Sanger sequencing (with JGI-DOE except as noted; the abbreviation used for each genome is given in parentheses):

  • Chlorobium chlorochromatii CaD3 (cagg)
  • Chlorobium ferrooxidans DSM 13031 (cfer)
  • Chlorobium limicola DSM 245 (clim)
  • Chlorobium phaeobacteroides BS1 (cphb)
  • Chlorobium phaeobacteroides DSM 266 (cpha)
  • Chlorobaculum (formerly Chlorobium) tepidum TLS (ctep) (sequenced in collaboration with TIGR)
  • Pelodictyon luteolum DSM 273 (plut)
  • Pelodictyon phaeoclathratiforme BU-1 (ppha)
  • Prosthecochloris aestuarii DSM 271 (paes)
  • Prosthecochloris vibrioformis DSM 265 (cvib)

Pyrosequencing (PSU with Dr. Stephan C. Schuster):

  • Chlorobaculum parvum (cpar)
  • Chloroherpeton thalassium (cher)

Assembly methods

  • Phred/Phrap/Consed
  • Multiplex PCR
  • Pheromone-based genetic algorithm for comparative genome assembly

Annotation methods

  • Glimmer
  • COG, NCBI-NR database
  • TIGR Manatee

Whole-genome phylogeny of Chlorobi

Phylogenetic relationships among the 12 sequenced green sulfur bacteria (GSB), based on their 813 core gene sets. Green values are bootstrap values for the neighbor-joining (NJ) tree built from the concatenated protein sequences of all 813 core sets. Red values are consensus values obtained by randomly choosing 100 proteins from the 813 core sets, constructing a maximum-likelihood (ML) phylogeny from their concatenated sequences, and repeating this process 100 times. Blue values are taken from the consensus tree (50% cutoff) of 813 phylogenies constructed by the ML method.
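The resampling step behind the red values (draw 100 of the 813 core proteins, concatenate them, repeat 100 times) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary layout of the alignments and the fixed random seed are hypothetical, and tree building and consensus are left to whatever external ML program is used.

```python
import random


def subsample_and_concatenate(alignments, n_genes, rng):
    """Pick n_genes core gene sets at random and concatenate them per taxon.

    alignments: dict mapping gene id -> dict mapping taxon -> aligned sequence.
    Assumption: every gene set contains every taxon (a "core" gene set).
    """
    chosen = rng.sample(sorted(alignments), n_genes)  # sampling without replacement
    taxa = sorted(alignments[chosen[0]])
    matrix = {t: "".join(alignments[g][t] for g in chosen) for t in taxa}
    return matrix, chosen


def replicate_supermatrices(alignments, n_genes=100, n_replicates=100, seed=0):
    """Yield one concatenated alignment (and its gene list) per replicate."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    for _ in range(n_replicates):
        yield subsample_and_concatenate(alignments, n_genes, rng)


# Toy input with three made-up gene sets and two taxa (not real data).
toy = {
    "geneA": {"ctep": "MKT", "clim": "MKS"},
    "geneB": {"ctep": "LLV", "clim": "LIV"},
    "geneC": {"ctep": "AG", "clim": "AG"},
}
for matrix, chosen in replicate_supermatrices(toy, n_genes=2, n_replicates=3):
    print(chosen, matrix)
```

Each yielded supermatrix would then be passed to the ML program, and the resulting 100 trees summarized as a consensus.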

General genome features of Chlorobi

                               Cvib       Ctep       Cpar       Plut      Cfer*       Cagg       Paes       Cphb       Clim       Ppha       Cpha      Cher*
Size (bp)                 1,966,858  2,154,939  2,289,237  2,364,842 ~2,538,957  2,572,079  2,579,695  2,736,403  2,763,182  3,018,240  3,133,902 ~3,270,419
GC content (%)                52.99      56.53      55.81      57.33      50.11      44.28      50.05      48.92      51.33      48.05      48.35      45.06
# ORFs                        1,809      2,026      2,141      2,188      2,349      2,099      2,380      2,596      2,623      2,971      3,070      2,955
Avg. ORF length (bp)            997        926        932        967        941      1,087        933        900        930        889        883        975
# rRNA operons                    1          2          2          2          2          1          1          2          2          3          2          1
# tRNAs                          45         50         49         48         44         45         46         46         48         49         47         44
# ORFs in COGs                1,390      1,543      1,581      1,613      1,645      1,460      1,674      1,760      1,794      1,916      1,890      1,976
# unique ORFs                   119        140        216        214        328        307        303        371        349        528        643      1,139
# ISs                            22         19          2          8          5         25         43         45         56         48         88         10

Note: * indicates data retrieved from an incomplete but nearly finished genome.
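Two of the summary statistics in the table, GC content and average ORF length, are simple to compute from a sequence and a list of ORF coordinates. The sketch below shows one way to do so; the function names and the 1-based inclusive coordinate convention are assumptions made for illustration, not a description of how the values above were produced.

```python
def gc_content_percent(sequence):
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt


def mean_orf_length(orfs):
    """Return the mean ORF length in bp.

    orfs: list of (start, end) pairs, assumed 1-based and inclusive;
    reverse-strand ORFs may be given with start > end.
    """
    if not orfs:
        raise ValueError("no ORFs given")
    lengths = [abs(end - start) + 1 for start, end in orfs]
    return sum(lengths) / len(lengths)


# Toy inputs (not real genome data).
print(gc_content_percent("GGCCAATT"))
print(mean_orf_length([(1, 300), (900, 301)]))
```

Ambiguous bases such as N are excluded from the denominator here; counting them instead would slightly lower the reported percentage for draft genomes.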

Genomic structures of Chlorobi

Comparisons of genome structures among Chlorobi. The central blue lines represent the chromosomes of four Chlorobi species.

Linear genomic comparison of C. aggregatum (top), C. tepidum (center), and C. limicola (bottom). Red lines show the orthologous genes on the forward strands; blue lines show the orthologous genes on the reverse strands.

Orthologous and non-orthologous genes in C. vibrioforme and P. luteolum. Circles on the map mark the genomic islands along the chromosome.

Unequal evolutionary rates in Chlorobi

The accelerated evolution of C. aggregatum. (A) Comparison of nonsynonymous substitution rates (dN) between C. aggregatum (in yellow) and C. limicola (in blue). dN is significantly elevated in C. aggregatum, indicating accelerated evolution of this symbiotic green sulfur bacterium. (B) The relative branch depth of C. aggregatum derived from the 813 core gene sets shows that the protein sequences of C. aggregatum evolve faster than those of the other species. (C) The relationships between evolutionary rates and topologies. (D) The frequency spectra of the relative branch depth of C. aggregatum under the three different topologies shown in panel C. Thus, the incongruence between the gene phylogenies of C. aggregatum and the species phylogeny may be due to long-branch attraction caused by the accelerated evolution of C. aggregatum.
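The caption does not define relative branch depth. As an illustration only, the sketch below assumes it is the root-to-tip distance of the focal taxon divided by the mean root-to-tip distance of the other taxa in the same gene tree, so that values above 1 indicate a longer (faster-evolving) branch. The tree encoding, function names and branch lengths are all hypothetical.

```python
def root_to_tip(parent_of, tip):
    """Sum branch lengths from a tip up to the root.

    parent_of: dict mapping node -> (parent node, branch length);
    the root is the only node without an entry.
    """
    depth = 0.0
    node = tip
    while node in parent_of:
        parent, length = parent_of[node]
        depth += length
        node = parent
    return depth


def relative_branch_depth(parent_of, tips, focal):
    """Focal root-to-tip distance divided by the mean over the other tips.

    Assumption: this ratio is one plausible reading of "relative branch
    depth"; the source does not state the exact definition.
    """
    depths = {t: root_to_tip(parent_of, t) for t in tips}
    others = [d for t, d in depths.items() if t != focal]
    return depths[focal] / (sum(others) / len(others))


# Toy rooted tree with made-up branch lengths: ((cagg, clim)n1, cher)root
toy_tree = {
    "n1": ("root", 0.1),
    "cher": ("root", 0.5),
    "cagg": ("n1", 0.6),
    "clim": ("n1", 0.2),
}
print(relative_branch_depth(toy_tree, ["cagg", "clim", "cher"], "cagg"))
```

Computing this ratio for each core gene tree and grouping the values by tree topology would give frequency spectra of the kind described for panel D.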

Support from