Sequencing of Cacao Genome will Help US Chocolate Industry, Subsistence Farmers
Having the genome sequenced is expected to speed up the process of identifying genetic markers for specific genes that confer beneficial traits, enabling breeders to produce superior new lines through traditional breeding techniques.
Sep 16 2010 --- U.S. Department of Agriculture (USDA) scientists and their partners have announced the preliminary release of the sequenced genome of the cacao tree. The achievement will help sustain the supply of high-quality cocoa to the $17 billion U.S. chocolate industry and protect the livelihoods of small farmers around the world by speeding up the development, through traditional breeding techniques, of trees better equipped to resist the droughts, diseases and pests that threaten this vital agricultural crop.
The effort is the result of a partnership between USDA's Agricultural Research Service (ARS); Mars, Inc., of McLean, Va., one of the world's largest manufacturers of chocolate-related products; scientists at IBM's Thomas J. Watson Research Center in Yorktown, N.Y.; and researchers from the Clemson University Genomics Institute, the HudsonAlpha Institute for Biotechnology, Washington State University, Indiana University, the National Center for Genome Resources, and PIPRA (Public Intellectual Property Resource for Agriculture) at the University of California-Davis.
Team leaders from USDA included molecular biologist David Kuhn and geneticist Raymond Schnell, both at the ARS Subtropical Horticulture Research Station in Miami, Fla., and ARS computational biologist Brian Scheffler at the Jamie Whitten Delta States Research Center in Stoneville, Miss. ARS is the principal intramural scientific research agency of USDA. This research supports the USDA priority of promoting international food security, and USDA's commitment to agricultural sustainability.
"Because of the talent and dedication brought together by this unique partnership, researchers and plant breeders will be able to accelerate the genetic improvement of the cacao crop now cultivated in tropical regions around the world," said Edward B. Knipling, ARS administrator. "This will benefit not only the chocolate industry, but also millions of small farmers who will be able to continue to make their living from cacao."
Cocoa comes from the cacao tree, Theobroma cacao. The tree's seeds are processed into cocoa beans that are the source of cocoa, cocoa butter and chocolate. But fungal diseases can destroy seed-bearing pods, wipe out up to 80 percent of the crop and cause an estimated $700 million in losses each year.
Worldwide demand for cacao now exceeds production, and hundreds of thousands of small farmers and landholders throughout the tropics depend on cacao for their livelihoods. An estimated 70 percent of the world's cocoa is produced in West Africa.
Scientists worldwide have been searching for years for ways to produce cacao trees that can resist evolving pests and diseases, tolerate droughts and produce higher yields. ARS researchers have been testing new cacao tree varieties developed with genetic markers. But having the genome sequenced is expected to speed up the process of identifying genetic markers for specific genes that confer beneficial traits, enabling breeders to produce superior new lines through traditional breeding techniques.
Sequencing cacao's genome also will help researchers develop an overall picture of the plant's genetic makeup, uncover the relationships between genes and traits, and broaden scientific understanding of how the interplay of genetics and the environment determines a plant's health and viability.
Although the project is led and funded by a private company, Mars Inc., Cacao Genome Database scientists say one of their chief concerns has been making sure the Theobroma cacao genome data was published for all to see -- especially cacao farmers and breeders in West Africa, Asia and South America, who can use genetic information to improve their planting stocks and protect their often-fragile incomes.
"When you have to wait three or more years for a tree you plant to bear the beans you sell, you want as much information as possible about the seedlings you're planting," said Keithanne Mockaitis, IU Center for Genomics and Bioinformatics (CGB) sequencing director and IU project leader. "We expect this information will positively impact some of the poorest regions in the world, where tropical tree crops are grown. Making the genome data public further enables breeders, farmers and researchers around the world to use a common set of tools, and to share information that will help them fight the spread of disease in their crops."
Mockaitis, a biochemist-turned-genomicist, joined the project in early 2009, and quickly set to work with her collaborators to tackle the challenge of sequencing and accurately pasting together the approximately 400 million base pairs of the tree's genome. Mockaitis' Cacao Genome Group partners at the U.S. Department of Agriculture's Subtropical Horticulture Research Station in Miami sent samples to Bloomington, and these were prepared and sequenced in a redundant manner by her sequencing team in the CGB genomics laboratory. Sequence data for some of the same material were also generated using additional methods in laboratories of the USDA Agricultural Research Service (USDA-ARS) and at the National Center for Genome Resources in Santa Fe, N.M.
Raw data were then sent to the HudsonAlpha Institute for Biotechnology, a partner of the U.S. Department of Energy-funded Joint Genome Institute, for assembly. Other important datasets generated by Mockaitis' group were sequences not of the DNA itself but of the RNA transcripts produced in different tissues of the tree. Transcript sequences reveal which genes are expressed (turned on).
Finally, IU Bloomington Department of Biology scientist Don Gilbert analyzed both the genome and transcriptome sequences and generated the annotations that point to the locations in which each active gene and its components (exons and introns) reside.
"The final number of genes is still being counted and validated, but we currently estimate the cacao plant has about 35,000 genes," Mockaitis said.
That's a typical gene number for flowering plants whose genomes have thus far been sequenced. Humans have approximately 30,000 genes. Rice has about 40,000.
Since its inception about 11 years ago, the CGB has been involved in dozens of different projects that address the workings of different species' genomes with the use of high-throughput technologies.
"Cacao is something of a first for us," Mockaitis said. "This is the largest genome the CGB has sequenced to date. As a group we now have more experience and more resources to take on a wider variety of projects."
Mockaitis says the relative efficiency of the project so far has been due to Mars' support of the academic and non-profit contributing laboratories.
"We've benefited from having a collegial group of researchers, from the USDA-ARS and a variety of genomics-focused laboratories, that each bring different scientific expertise to the table to complete this genome. It's also been particularly inspiring to see West African cacao researchers come to some of our meetings -- they listen to us talk about the esoteric technologies we're using, and we know that they'll soon go to work and start benefitting from the data. That's a rare treat for an academic researcher."