2023 Progress Report

DOE Joint Genome Institute staff walk past murals on the third floor of the Integrative Genomics Building, located at Lawrence Berkeley National Laboratory in Berkeley, California.

Vision

Lead genomic innovation for a sustainable bioeconomy.

Mission

As a U.S. Department of Energy Office of Science user facility, we provide advanced genomic capabilities, large-scale data, and professional expertise to support the global research community in studies of complex biological and environmental systems. We optimize our service to the community through responsibly managing our people and resources.

Director’s Perspective

Nigel Mouncey, Director, DOE Joint Genome Institute

Honors & Awards

In 2023, several researchers were recognized by multiple organizations.

Yasuo award
Nikos award
Highly cited

Accomplishments at a Glance

Below are briefs on some of the scientific collaborations that came out of the JGI in 2023, as well as highlights around our outreach efforts.

A Collaboration to Improve Plant Genome Annotations Across Species: Researchers have developed an atlas that maps gene expression patterns in the Arabidopsis root from single root cell profiles.
Sequencing Sphagnum Leads to Discovery of Sex Chromosomes: Research illuminates the significant role sex plays in how this moss, which constitutes most peatlands, grows, stores carbon and responds to stress.
Eelgrass Proves to be Much Younger than We Thought: Investigating the evolutionary history of eelgrass shows it colonized the Arctic Ocean as recently as 25,000 years ago.
Crops as Tough as World Cup Turf: Grasses do better in our warming world than crops like corn and sorghum, so researchers looked into traits that could transfer from one to the other.
Busting the Unbreakable Lignin: This ground-breaking discovery demonstrated anaerobic fungi’s ability to break down lignin, the hardiest of plant materials.
Tracing the Evolution of Shiitake Mushrooms: A fresh analysis of global Lentinula samples saw 24 new mushroom genomes sequenced and genomes assembled from 60 existing sequences.
Supercharging SIP in the Fungal Hyphosphere: Successful automation of aspects of stable isotope probing, used to study microbial communities, proved to vastly reduce labor and improve results.
New Research Sheds Light on Diversity in the Deep Sea: Charting variability in microbial communities around hydrothermal vents and underwater volcanoes across five oceanic regions.
You Can Move, But You Can’t Hide: The geNomad software tool quickly identifies and classifies mobile genetic elements based upon their gene content and their genetic sequences.
iPHoP: A Matchmaker for Phages and their Hosts: Building on existing virus-host prediction approaches, iPHoP reliably matches viruses with their archaea and bacteria hosts.
For the Tiniest Archaea, A Genomic Switch of Friend or Foe: Meta-omics datasets show that CRISPR-Cas systems determine mutualism or parasitism between some archaeal hosts and their hitchhikers.
New Research Finds Flagella in the Terrestrial Roots of Marine Bacteria: A deeper understanding of how these bacteria evolved could help inform engineering microorganisms for biofuel production and other sustainable applications.
Methane Makers in Yosemite’s Lakes: The microbial communities in isolated, mountaintop lakes of the Sierra Nevada have a methane paradox, and show how climate change affects freshwater ecosystems.

From Berkeley to Binghamton: Tracking Strawberry Evolution: A method for correctly identifying polyploid subgenomes without knowing their ancestral genomes was applied to plants including cotton and strawberry.
JGI-UC Merced Genomics Internship Program at 10: From two interns to 75 alumni in the decade since the inception of the JGI’s flagship internship program with UC Merced.
Experimenting with EcoFABs for Student Labs: Encouraging STEM careers with small plastic growth chambers that enable researchers to compare their work consistently.

Impact: By the Numbers

Spending Profile

Animated gif shows spending profile by percentage of total budget. Genomic technologies: 34.7%; science programs and analysis: 32.2%; data science and informatics: 15%; management: 7.5%; project management office: 3.8%; compute infrastructure & support team (@NERSC): 4.2%; operations: 2.6%

Users on the Global Map

An animated map of the world shows how users are distributed globally. North America: 1,694 (U.S.: 1,592); South America: 34; Europe: 479; Africa: 14; Asia: 82; Australia/New Zealand: 70. A chart in the corner shows distribution by type of user. Academic: 1,708; Industry: 38; DOE nat'l labs: 208; Government: 235; Other: 184; Total Users: 2,373

North America (1,694): United States 1,592; Canada 95; Mexico 7
South America (34): Argentina 1; Brazil 23; Chile 2; Colombia 1; Ecuador 1; Peru 1; Uruguay 5
Europe (479): Austria 13; Belgium 17; Croatia 1; Czech Republic 13; Denmark 17; Estonia 2; Finland 13; France 62; Germany 86; Greece 10; Hungary 10; Iceland 1; Ireland 3; Italy 22; Netherlands 29; Norway 18; Poland 2; Portugal 7; Russia 5; Serbia 2; Slovenia 2; Spain 43; Sweden 28; Switzerland 12; Turkey 1; United Kingdom 60
Africa (14): Morocco 1; South Africa 12; Tunisia 1
Asia (82): China 24; India 10; Israel 9; Japan 29; Malaysia 1; Singapore 3; South Korea 6
Australia + New Zealand (70): Australia 53; New Zealand 17

Users on the U.S. Map

A map of the U.S. shows a breakdown of users in the country: (354) California; (30–74) Wisconsin, Michigan, Washington, Colorado, New York, Georgia, Illinois, Massachusetts, Tennessee, Ohio, Arizona, Missouri, North Carolina, Florida, Minnesota, Montana, Texas; (11–29) Oregon, New Mexico, Connecticut, Alabama, Indiana, Maryland, Nebraska, New Jersey, Delaware, New Hampshire, Pennsylvania, Utah, Hawaii, Iowa, Virginia; (0–10) Kansas, Oklahoma, South Carolina, Louisiana, West Virginia, Maine, Nevada, Rhode Island, Arkansas, Idaho, Missouri, North Dakota, Washington, D.C., Puerto Rico, Vermont, Kentucky, Wyoming, Alaska, South Dakota. A box in the corner shows a breakdown by type of user. Academic: 1,190; Industry: 28; DOE nat'l labs: 208; Government: 76; Other: 90; Total Users: 1,592

Cumulative Number of Projects Completed

Cumulative Number of Scientific Publications

A side-by-side graphic shows projects completed and publications over the last 10 years. Projects: 2014-11,770; 2015-16,313; 2016-23,047; 2017-31,476; 2018-38,999; 2019-82,539; 2020-104,988; 2021-142,076; 2022-179,990; 2023-219,860. Publications: 2014-1,207; 2015-1,388; 2016-1,555; 2017-1,702; 2018-1,934; 2019-2,156; 2020-2,368; 2021-2,610; 2022-2,862; 2023-3,100.

Sequencing Output

(in billions of bases or GB)

The JGI supports short- and long-read sequencers, where a read refers to a sequence of DNA bases. Short-read sequencers produce billions of paired-end 150-base-pair (bp) reads used for quantification, such as in gene expression analysis. Long-read sequencers currently average 60,000–70,000 bp reads and are used for de novo genome assembly. Combined short-read and long-read totals per year give the JGI’s annual sequence output. The total sequence output in 2023 was 716,929 GB.

Sequencing Productivity

Billions of Base Pairs

A motion graphic shows sequencing productivity. Single molecule long-read sequencing: 2014-596; 2015-1,470; 2016-1,907; 2017-3,625; 2018-7,305; 2019-24,744; 2020-123,793; 2021-158,690; 2022-161,463; 2023-281,928. Massively parallel short-read sequencing: 2014-100,013; 2015-141,707; 2016-139,964; 2017-174,519; 2018-217,995; 2019-301,746; 2020-166,699; 2021-308,500; 2022-312,309; 2023-435,001.
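As a quick check on how the annual total is derived, the sketch below adds the long-read and short-read figures listed above for 2022 and 2023. The variable and function names are hypothetical and chosen only for illustration; the numbers are copied from the figures above, in billions of bases.

```python
# Illustrative sketch only: yearly totals copied from the Sequencing
# Productivity figures, in billions of bases. Names are hypothetical.
LONG_READ = {2022: 161_463, 2023: 281_928}
SHORT_READ = {2022: 312_309, 2023: 435_001}


def annual_output(year: int) -> int:
    """Return combined long-read plus short-read output for a year."""
    return LONG_READ[year] + SHORT_READ[year]


if __name__ == "__main__":
    for year in sorted(LONG_READ):
        print(f"{year}: {annual_output(year):,} billion bases")
```

For 2023 the sum is 716,929, which agrees with the total sequence output reported under Sequencing Output.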

Users Letters of Intent/Proposals Submitted & Approved

A bar graph shows letters of intent (LOIs) received and approved, as well as proposals submitted and approved, for proposal calls.
Community Science Program: (2022) LOIs submitted-50; LOIs approved-46; proposals submitted-41; proposals approved-17. (2023) LOIs submitted-46; LOIs approved-39; proposals submitted-35; proposals approved-19.
FICUS (JGI-EMSL): (2022) LOIs submitted-56; LOIs approved-54; proposals submitted-52; proposals approved-11. (2023) LOIs submitted-47; LOIs approved-45; proposals submitted-43; proposals approved-10.
New Investigator: (2022) proposals submitted-55; proposals approved-20. (2023) proposals submitted-64; proposals approved-26.
Functional Genomics: (2022) proposals submitted-24; proposals approved-10. (2023) proposals submitted-19; proposals approved-9.

Computational Infrastructure

Two bioinformatics researchers sit together at a conference room table while a graph cycles through different data visualizations on the screen above them.

Daniela Cassol (right) and Mario Melara (left), two members of the JGI’s advanced analysis group, work behind the scenes on the bioinformatics tools that users have access to via the JGI Analysis Workflow Service (JAWS). These tools make it easier to run bioinformatics workflows across multiple resources, in order to foster scalable projects and collaboration.

Users of JGI Tools & Data

The Genome Portal provides unified access to all JGI genomic databases and analytical tools. Users can search, download and explore data sets available for all JGI sequencing projects including their status, assemblies, and annotations of sequenced genomes. The Data Portal allows JGI users to more easily access public data sets through a common set of metadata across files submitted by each scientific program. The Genome Portal will be retired once the Data Portal reaches data- and feature-parity with its predecessor. FY2023 improvements to the Data Portal include improved data parity, cart download, navigation by pagination, and significant progress on privileged access and access management.

 

An infographic shows "Known Data System Usage since 2000" in both the biosciences and other disciplines. Under biosciences: genetics-12,000; zoology-100; biochem and cell biology-3,600; ecology-2,300; microbiology-10,000; evolutionary biology-1,000; informatics and comp. bio.-6,200; general biosci.-900; industrial biotech-1,400; plant biology-5,800. In other disciplines: environmental sciences-500; chemical sciences-500; misc. + other-300; agriculture, veterinary and food sciences-3,200; biomedical and clinical sciences-1,700; earth sciences-200; engineering-200; info. and comp. sciences-300.

JGI Archive and Metadata Organizer (JAMO):

15.192 million file records

JAMO Archived Data Footprint:

15.952 Petabytes

Data Downloads in FY23:

Genome Portal: 7.286 million file-downloads
Data Portal: 0.633 million file-downloads
Total: 7.919 million file-downloads

 

 

Since the retirement of NERSC’s Cori system, a number of JGI’s pipelines and processes have moved out of NERSC to the JGI’s new informatics cluster, Dori, situated at LBL’s LabIT, and the computing infrastructure at IGB. This has required JGI’s data management system, JAMO (JGI Archive and Metadata Organizer), to expand operations across data centers — both ingesting new data and delivering data stored at NERSC for processes running at these other centers. This has been our first step in creating a distributed version of JAMO, which in the future will be capable of sharing data across registered lab members.

As part of our business continuity planning, JGI has worked with the Environmental Molecular Sciences Laboratory to enable JAMO to automatically transmit and store files on EMSL’s HPSS tape system via Globus. Currently, all new raw sequencing data is being transmitted. Over the next year, all legacy data will be restored from NERSC’s tape system and transmitted automatically to EMSL.

 

Computational graphic by Neil Byers, JGI.  Photography and cinemagraphs by Thor Swift, Berkeley Lab. ‘By the Numbers’ infographics by Creative Services, IT Division, Berkeley Lab. 

Lawrence Berkeley National Lab Biosciences Area
A project of the US Department of Energy, Office of Science

JGI is a DOE Office of Science User Facility managed by Lawrence Berkeley National Laboratory

© 1997-2025 The Regents of the University of California