The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization

Published in:

Plant J (Nov 21 2017)

Author(s):

McCormick, R. F., Truong, S. K., Sreedasyam, A., Jenkins, J., Shu, S., Sims, D., Kennedy, M., Amirebrahimi, M., Weers, B. D., McKinley, B., Mattison, A., Morishige, D. T., Grimwood, J., Schmutz, J., Mullet, J. E.

DOI:

10.1111/tpj.13781

Abstract:

Sorghum bicolor is a drought-tolerant C4 grass used for production of grain, forage, sugar, and lignocellulosic biomass and a genetic model for C4 grasses due to its relatively small genome (~800 Mbp), diploid genetics, diverse germplasm, and colinearity with other C4 grass genomes. In this study, deep sequencing, genetic linkage analysis, and transcriptome data were used to produce and annotate a high-quality reference genome sequence. Reference genome sequence order was improved, 29.6 Mbp of additional sequence was incorporated, the number of genes annotated increased 24% to 34,211, average gene length and N50 increased, and error frequency was reduced 10-fold to 1 per 100 kbp. Sub-telomeric repeats with characteristics of Tandem Repeats In Miniature (TRIM) elements were identified at the termini of most chromosomes. Nucleosome occupancy predictions identified nucleosomes positioned immediately downstream of transcription start sites and at different densities across chromosomes. Alignment of more than 50 resequenced genomes from diverse sorghum genotypes to the reference genome identified ~7.4M SNPs and 1.9M indels. Large-scale variant features in euchromatin were identified with periodicities of ~25 kbp. An RNA transcriptome atlas of gene expression was constructed from 47 RNA-seq profiles of growing and developed tissues of the major plant organs (roots, leaves, stems, panicles, and seed) collected during the juvenile, vegetative, and reproductive phases. Analysis of the transcriptome data indicated that tissue type and protein kinase expression had large influences on transcriptional profile clustering. The updated assembly, annotation, and transcriptome data represent a resource for C4 grass research and crop improvement.
