Statistics Guide

Stats generates basic assembly statistics such as scaffold count, N50, L50, GC content, and gap percent. It can also report GC content per sequence. Stats was written to replace earlier tools that had similar functions but could not scale to large metagenomes; it can process an assembly of practically unbounded size, with sequences of practically unbounded length, and it does so rapidly and in a small amount of memory. Stats can also estimate the memory requirements of BBMap for a given assembly and kmer length.
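
For readers unfamiliar with the N50/L50 pair, the idea can be sketched in a few lines of shell. This is only an illustration of the concept, not how Stats is implemented: sort the sequence lengths from longest to shortest and walk down the list until the running sum reaches half of the total. The length of the sequence at that point and the number of sequences needed to reach it are the two values reported. Tools differ on which of the two is labeled N50 and which L50, so the sketch uses neutral names; the function name half_assembly_point and the one-length-per-line input are assumptions made for this example.

```shell
# Hypothetical helper, for illustration only (not part of BBTools).
# Reads one sequence length per line on stdin and prints:
#   length_at_half : length of the sequence at which the running sum
#                    (longest first) reaches half of the total
#   count_at_half  : how many sequences were needed to get there
#   total          : sum of all lengths
half_assembly_point() {
  sort -rn | awk '
    { len[NR] = $1; total += $1 }
    END {
      for (i = 1; i <= NR; i++) {
        running += len[i]
        if (running * 2 >= total) {
          printf "length_at_half=%d count_at_half=%d total=%d\n", len[i], i, total
          exit
        }
      }
    }'
}

# Example: four sequences of 10, 20, 30 and 40 bases.
printf '10\n20\n30\n40\n' | half_assembly_point
```

When comparing these numbers with the output of Stats, check the labels in the Stats report to see which value it calls N50 and which L50.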

*Notes*

Memory:

Stats uses 120MB of RAM regardless of the assembly size.

Threads:

Stats is single-threaded; unlike other BBTools, it does not do garbage collection or even use independent threads for I/O streams.

*Usage Examples*

To get stats on an assembly:

stats.sh in=contigs.fa

To compare multiple assemblies:

statswrapper.sh in=a.fa,b.fa,c.fa format=6

To print GC and length information per sequence:

stats.sh in=contigs.fa gc=gc.txt gcformat=4
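
To make clear what "GC and length information per sequence" means, here is a minimal sketch that computes the same kind of values directly from a FASTA file with awk. It is not the Stats implementation and makes no claim about the exact columns Stats writes for a given gcformat value. The function name per_sequence_gc is hypothetical, and the sketch makes a simplifying assumption: every character of a sequence line counts toward the length, including N and other ambiguity codes.

```shell
# Hypothetical helper, for illustration only (not part of BBTools).
# Prints one tab-separated line per FASTA record: name, length, GC fraction.
# Assumption: all sequence characters (including N) count toward length.
per_sequence_gc() {
  awk '
    function emit() {
      if (name != "")
        printf "%s\t%d\t%.4f\n", name, len, (len > 0 ? gc / len : 0)
    }
    /^>/ { emit(); name = substr($0, 2); len = 0; gc = 0; next }
    {
      seq = toupper($0)                # treat soft-masked bases like upper case
      len += length(seq)
      gc  += gsub(/[GC]/, "", seq)     # gsub returns the number of G/C removed
    }
    END { emit() }
  ' "$@"
}

# Example with a tiny two-record FASTA on stdin.
printf '>a\nGGCC\n>b\nAATg\n' | per_sequence_gc
```

A file can be passed as an argument instead of stdin, for example per_sequence_gc contigs.fa. For real assemblies, Stats remains the better choice, since this sketch is meant only to show the calculation.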

Lawrence Berkeley National Lab Biosciences Area
A project of the US Department of Energy, Office of Science

JGI is a DOE Office of Science User Facility managed by Lawrence Berkeley National Laboratory

© 1997-2025 The Regents of the University of California