efam: an expanded, metaproteome-supported HMM profile database of viral protein families

Published in:

Bioinformatics (2021)

Author(s):

Zayed, Ahmed A, Lücking, Dominik, Mohssen, Mohamed, Cronin, Dylan, Bolduc, Ben, Gregory, Ann C, Hargreaves, Katherine R, Piehowski, Paul D, White, Richard A, III, Huang, Eric L, Adkins, Joshua N, Roux, Simon, Moraru, Cristina, Sullivan, Matthew B

DOI:

10.1093/bioinformatics/btab451

Abstract:

Viruses infect, reprogram, and kill microbes, leading to profound ecosystem consequences, from elemental cycling in oceans and soils to microbiome-modulated diseases in plants and animals. Although metagenomic datasets are increasingly available, identifying viruses in them is challenging due to poor representation and annotation of viral sequences in databases. Here we establish efam, an expanded collection of Hidden Markov Model (HMM) profiles that represent viral protein families conservatively identified from the Global Ocean Virome 2.0 dataset. This resulted in 240,311 HMM profiles, each with at least 2 protein sequences, making efam >7-fold larger than the next largest, pan-ecosystem viral HMM profile database. Adjusting the criteria for viral contig confidence from “conservative” to “eXtremely Conservative” resulted in 37,841 HMM profiles in our efam-XC database. To assess the value of this resource, we integrated efam-XC into VirSorter viral discovery software to discover viruses from less-studied, ecologically distinct oxygen minimum zone (OMZ) marine habitats. This expanded database led to an increase in viruses recovered from every tested OMZ virome by
