EnZymClass: Substrate specificity prediction tool of plant acyl-ACP thioesterases based on ensemble learning

Published in:

Current Research in Biotechnology 4, 1-9 (2022)

Author(s):

Banerjee, Deepro, Jindra, Michael A., Linot, Alec J., Pfleger, Brian F., Maranas, Costas D.

DOI:

10.1016/j.crbiot.2021.12.002

Abstract:

Experimentally characterizing the functional properties of plant acyl-ACP thioesterases (TEs), a key enzyme class used in the production of renewable oleochemicals in microbial hosts, can be an expensive and time-consuming process, since it requires manual screening of thousands of candidates in a database. Using amino acid sequence to computationally predict an enzyme’s function might accelerate this process; however, obtaining the amount of information on previously characterized enzymes and their respective sequences that standard Machine Learning (ML) based approaches require to accurately infer sequence-function relationships can be prohibitive, especially with a low-throughput testing cycle. Experimental noise and unbalanced datasets, in which high sequence similarity does not always imply identical functional properties, further prevent robust prediction performance. Herein we present an ML method, Ensemble method for enZyme Classification (EnZymClass), that is specifically designed to address these issues. We used EnZymClass to classify TEs into short, long, and mixed free fatty acid substrate specificity categories. While general guidelines for inferring substrate specificity have been proposed before, prediction of chain-length preference from primary sequence has remained elusive for plant acyl-ACP TEs. By applying EnZymClass to a subset of TEs in the ThYme database, we identified two medium-chain TEs, ClFatB3 and CwFatB2, with previously uncharacterized activity in E. coli fatty acid production hosts. EnZymClass can be readily applied to other protein classification challenges and is available at: https://github.com/deeprob/ThioesteraseEnzymeSpecificity.
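
For readers unfamiliar with ensemble classification of sequences, the general idea can be illustrated with a minimal sketch. This is not the EnZymClass implementation (see the repository above for that). The k-mer frequency features, the nearest-centroid base classifiers, the majority-vote rule, and the toy sequences and labels below are all assumptions chosen only to keep the example short and self-contained.

```python
from collections import Counter
import math

def kmer_profile(seq, k):
    """Normalized k-mer frequency profile of a sequence (hypothetical feature choice)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse profiles stored as dicts."""
    dot = sum(v * b.get(key, 0.0) for key, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class CentroidClassifier:
    """Base learner: assigns the class whose mean k-mer profile is most similar."""

    def __init__(self, k):
        self.k = k
        self.centroids = {}

    def fit(self, seqs, labels):
        sums = {}
        sizes = Counter(labels)
        for seq, label in zip(seqs, labels):
            acc = sums.setdefault(label, Counter())
            for kmer, freq in kmer_profile(seq, self.k).items():
                acc[kmer] += freq
        # Average the per-sequence profiles within each class.
        self.centroids = {
            label: {kmer: v / sizes[label] for kmer, v in acc.items()}
            for label, acc in sums.items()
        }
        return self

    def predict(self, seq):
        profile = kmer_profile(seq, self.k)
        # sorted() makes tie-breaking deterministic.
        return max(sorted(self.centroids),
                   key=lambda label: cosine(profile, self.centroids[label]))

def ensemble_predict(models, seq):
    """Majority vote over the base learners' predictions."""
    votes = Counter(model.predict(seq) for model in models)
    return votes.most_common(1)[0][0]

# Toy, made-up sequences and labels (not real thioesterases).
train_seqs = ["AAAAAAGAAA", "AAAGAAAAAA",
              "LLLLLLVLLL", "LLVLLLLLLL",
              "ALALALALAL", "LALALALALA"]
train_labels = ["short", "short", "long", "long", "mixed", "mixed"]

# One base learner per k-mer length; each sees the same data through different features.
models = [CentroidClassifier(k).fit(train_seqs, train_labels) for k in (1, 2, 3)]

for query in ("AAAAGAAAAA", "LLLLVLLLLL", "ALALALALLA"):
    print(query, ensemble_predict(models, query))
```

Combining learners that view the same sequences through different feature representations is one common way to reduce the sensitivity of any single model to noise in a small training set, which is the kind of setting the abstract describes.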
