Identifying the metabolic potential of Chlamydomonas reinhardtii by large-scale annotation of its encoded open reading frames
Lila Ghamsari1,2, Balaji Santhanam1,2, Roger L. Chang3, Haiyuan Yu1,2,4, Yun Shen1, Xinping Yang1,2, Ani Manichaikul5, Erik F. Y. Hom6, Dawit Balcha1,2, Marc Vidal1,2, Jason A. Papin5, and Kourosh Salehi-Ashtiani1,2
1) Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA
2) Harvard Medical School, Boston, MA
3) University of California San Diego, La Jolla, CA
4) Cornell University, Ithaca, NY
5) University of Virginia, Charlottesville, VA
6) Harvard University, Cambridge, MA
 
We recently described an integrated method for metabolic transcript verification and network reconstruction in C. reinhardtii (Manichaikul et al., Nature Methods 2009). Using this approach with the newly released JGIv4.0 and "Augustus5" gene models, we carried out functional annotation and structural verification of the organism's metabolic open reading frames (ORFs) in conjunction with genome-scale reconstruction of its metabolic network. We assigned enzymatic functions to the translated ORF models of C. reinhardtii by reciprocal BLAST analysis of the putative proteomes against UniProt protein sequences. The respective paralogs were assigned EC numbers by clustering ORFs within each annotation group. Although the ORFs from the two sets carry largely the same EC numbers, fewer than half of the ORFs belonging to either JGIv4.0 or Augustus5 were 100% identical in sequence between the two sets, indicating that the structural annotations of the metabolic ORFs need to be revisited. We therefore carried out RT-PCR followed by Gateway cloning to experimentally verify the structure of the metabolic ORFs. After 454 FLX sequencing of the cloned ORFs, we obtained more than 90% coverage of the ORF length for about 60% of the Augustus5 and 75% of the JGIv4.0 metabolic ORF models. Based on the annotated ORFs and literature resources, we constructed a genome-scale metabolic model that accounts for 1069 genes associated with 1788 reactions and includes 1066 unique metabolites. Our genome-scale reconstruction of C. reinhardtii metabolism provides a valuable quantitative and predictive resource for metabolic engineering. The generated metabolic ORF clones will be available without restrictions to the research community.
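
The reciprocal BLAST step mentioned above can be illustrated with a minimal sketch. It is not the pipeline used in this work: the function names, the (query, subject, bit score) hit format, and the choice of bit score to pick the best hit are all assumptions made for illustration. A pair is kept only when each sequence is the other's top-scoring hit.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit record: (query_id, subject_id, bit_score),
# e.g. parsed from tabular BLAST output.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first one wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit],
                         reverse: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return (orf, reference) pairs that are each other's best hit.

    forward: ORF models searched against the reference proteins.
    reverse: reference proteins searched against the ORF models.
    """
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Illustrative identifiers only; these are not real accessions.
forward = [("orfA", "P1", 200.0), ("orfA", "P2", 150.0), ("orfB", "P2", 90.0)]
reverse = [("P1", "orfA", 198.0), ("P2", "orfA", 140.0), ("P2", "orfB", 80.0)]
pairs = reciprocal_best_hits(forward, reverse)
```

In this toy input, orfB's best hit is P2, but P2's best hit is orfA, so orfB receives no reciprocal assignment. A real analysis would also apply E-value and alignment-coverage thresholds, which are omitted here.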
 
 
 
e-mail address of presenting author: lila_ghamsari@dfci.harvard.edu