The US Department of Energy Joint Genome Institute (DOE JGI) has released a preliminary assembly and annotation of the genome of the soybean, Glycine max, to the greater scientific community to enable bioenergy research. The preliminary data can be accessed at http://www.phytozome.net/soybean.
The large-scale shotgun DNA sequencing project began in the middle of 2006 and will be completed in 2008. A total of about 13 million shotgun reads have been produced and deposited in the National Center for Biotechnology Information (NCBI) Trace Archive in accordance with the consortium’s commitment to early access and consistent with the Fort Lauderdale genome data release policy.
Although the first plant and animal genomes were sequenced by a BAC-by-BAC (bacterial artificial chromosome) approach, almost all current animal and fungal genome sequencing projects use the whole genome shotgun strategy, in which the entire genome is randomly sheared, subcloned, and redundantly sequenced. The ease, cost-efficiency, and speed of the whole genome shotgun approach have made it the method of choice in many cases, but there are lingering concerns about its effectiveness for large repeat-rich plant genomes, especially grasses. Soybean is the most complex plant genome sequenced to date by this strategy.
The current assembly (representing 7.23x coverage), gene set, and browser are collectively referred to as “Glyma0”. Glyma0 is a preliminary release, based on a partial dataset. It is expected to be replaced with an improved, chromosome-scale Glyma1 version by the end of 2008. Early users of this data are encouraged to track their favorite genes by saving local copies of the DNA sequences of these loci, and not by identifier or sequence coordinate, as identifiers and coordinates will change in future versions.
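The advice to track loci by sequence rather than by identifier or coordinate can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the toy sequences, and the scaffold names are hypothetical and are not part of the Glyma0 release or of any DOE JGI tool. The sketch uses exact string matching on both strands; because sequences may themselves be corrected between assemblies, a real workflow would more likely use an alignment tool such as BLAST.

```python
from __future__ import annotations


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into {record name: sequence}."""
    records: dict[str, str] = {}
    name = None
    chunks: list[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].split()[0]  # keep only the first word of the header
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def locate(saved_seq: str, assembly: dict[str, str]) -> list[tuple[str, int, str]]:
    """Find exact matches of a locally saved sequence in an assembly.

    Returns (scaffold name, 0-based start, strand) for every hit,
    searching both the forward and reverse-complement strands.
    """
    query = saved_seq.upper()
    patterns = [("+", query)]
    rc = reverse_complement(query)
    if rc != query:  # avoid reporting a palindromic query twice
        patterns.append(("-", rc))
    hits = []
    for scaffold, seq in assembly.items():
        seq = seq.upper()
        for strand, pattern in patterns:
            pos = seq.find(pattern)
            while pos != -1:
                hits.append((scaffold, pos, strand))
                pos = seq.find(pattern, pos + 1)
    return hits


# Hypothetical example: a sequence saved from one release is re-located
# in a later release whose scaffold names and coordinates differ.
saved_locus = "ATGGCCTTAG"
later_release = parse_fasta(
    ">chr_demo\nTTTTATGGCCTTAGCCCC\n>scaf_demo\nGGGCTAAGGCCATAAA\n"
)
hits = locate(saved_locus, later_release)
```

Keeping the sequence itself as the key means the locus can be found again even after scaffold names and coordinates have been reassigned.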
DOE JGI’s interest in sequencing the soybean stems from its role as a principal source of biodiesel.
Detailed knowledge of the soybean genetic code will enable crop improvements for more effective application of this plant for clean bioenergy generation. By knowing which genes control specific traits, researchers will be able to change the type, quantity, and/or location of oil produced by the crop. Using the sequence information generated by DOE JGI, it may be possible to develop a customized biomass production platform that combines oil seed production for biodiesel with enhanced vegetative growth for ethanol conversion, which would double the energy output of the crop. (Earlier post.)
In 2004, more than 3.1 billion bushels of soybeans were grown on nearly 75 million acres in the US, with an estimated annual value exceeding $17 billion—second only to corn, and about twice that of wheat.
The soybean genome project was initiated through the DOE JGI Community Sequencing Program (CSP) by a consortium led by DOE JGI’s Dan Rokhsar, Stanford’s Jeremy Schmutz, Gary Stacey of the University of Missouri-Columbia, Randy Shoemaker of Iowa State University, and Scott Jackson of Purdue University, with support from the US Department of Agriculture and the National Science Foundation.
The US Department of Energy Joint Genome Institute, supported by the DOE Office of Science, unites the expertise of five national laboratories—Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, and Pacific Northwest—along with the Stanford Human Genome Center to advance genomics in support of the DOE missions related to clean energy generation and environmental characterization and cleanup.