The quest for environmental analytical microbiology: absolute quantitative microbiome using cellular internal standards
Microbiome volume 13, Article number: 26 (2025)
Abstract
Background
High-throughput sequencing has revolutionized environmental microbiome research, providing both quantitative and qualitative insights into nucleic acid targets in the environment. The resulting microbial composition (community structure) data are essential for environmental analytical microbiology, enabling characterization of community dynamics and assessing microbial pollutants for the development of intervention strategies. However, the relative abundances derived from sequencing impede comparisons across samples and studies.
Results
This review systematically summarizes various absolute quantification (AQ) methods and their applications to obtain the absolute abundance of microbial cells and genetic elements. By critically comparing the strengths and limitations of AQ methods, we advocate the use of cellular internal standard-based high-throughput sequencing as an appropriate AQ approach for studying environmental microbiome originated from samples of complex matrices and high heterogeneity. To minimize ambiguity and facilitate cross-study comparisons, we outline essential reporting elements for technical considerations, and provide a checklist as a reference for environmental microbiome research.
Conclusions
In summary, we propose absolute microbiome quantification using cellular internal standards for environmental analytical microbiology, and we anticipate that this approach will greatly benefit future studies.

1. Introduction
The environmental microbiome, a complex assembly of microbial cells and their genetic constituents, plays multifaceted roles in natural biochemical processes and engineered systems [1,2,3,4,5], and has advantageous or deleterious impacts on humans, animals, plants, and the environment [6]. Analogous to environmental analytical chemistry, which investigates the distribution and concentrations of chemical pollutants [7], we propose a new discipline, “Environmental Analytical Microbiology (EAM),” which treats microbes and related genetic elements in the environment as analytes. This will encompass the documentation of various microbial cells in different habitats (Fig. 1a), and enable the spatiotemporal monitoring of microbial pollutants such as pathogens (Fig. 1b) [1, 6] and microbial genetic elements like antibiotic resistance genes (ARGs) (Fig. 1c) [8, 9]. Additionally, EAM facilitates the profiling of the physiological properties, such as cell growth, metabolic pathways, and activity, of desired functional microbes and their response to changes in environmental variables [10, 11]. By integrating EAM with appropriate management practices, it is possible to augment the favorable effects of the microbiome on humans, animals, plants, and the environment. This integration involves establishing connections between microbiomes and the physicochemical conditions of systems [12], pinpointing biomarkers to assess the performance of engineered systems [13, 14], mitigating the negative impacts through bioaugmentation remediation technology [15], and enhancing the beneficial effects on the environment through enriching functional populations to accelerate rates of biochemical reactions in element cycling, such as nitrogen fixation or the mineralization of organic matter [16, 17].
Fig. 1 Absolute abundance of the environmental microbiome, including (a) total microbial loads across different habitats measured using diverse AQ methods (data source: [6, 18,19,20,21,22,23,24,25,26]); (b) densities of pathogens and pathogenic antibiotic-resistant bacteria (PARB) in water samples and wastewater treatment plants (WWTPs) (data source: [1, 23, 27]); and (c) ARG concentrations in water samples and WWTPs measured using IS-based AQ methods (data source: [1, 6, 27])
The advent of high-throughput sequencing technologies has revolutionized the way researchers explore the microbial world and profoundly influenced the field of genomics, offering numerous benefits such as larger sample sizes (such as global and regional surveillance), culture-independent analysis, reduced labor intensity, comprehensive scanning, in-depth investigations, and rapid detection [1, 28, 29]. The incorporation of this rapidly advancing technology into environmental studies has the potential to greatly increase the efficiency of the analytical processes and yield a more comprehensive understanding of the examined microbiome. However, a few challenges reduce the reliability of quantitative results generated from high-throughput sequencing data [30]. First, technical bias can be introduced at any stage of microbiome analysis, from sample collection to results interpretation [31, 32]. Variability can easily arise from technical factors such as sampling strategy (e.g., grab vs. composite sampling [33]), sample preservation and storage (e.g., the effect of ethanol concentration [32, 34], storage temperature, and duration [35]), DNA extraction methods and/or kits [34, 36, 37], biological replication [38], technical replication [39, 40], sequencing library preparation [41, 42], uneven sequencing of samples in a multiplexed run [28], and different sequencing platforms [43].
To eliminate potential biases caused by these sources of variability and to conduct EAM, microbiome quantification should be performed in absolute values to support inter-sample comparison and reliable statistical analyses. Absolute values help correct for possible compositional artifacts [44] associated with relative abundances [18, 31]. Without considering variations in total microbial loads (also referred to as cell density, i.e., cell counts per unit mass/volume), the compositional nature of the relative data obtained from the sequencing of different samples can result in misinterpretations of microbial findings [18, 19, 45, 46]. Since compositional data are constrained to a constant sum, an increase in one taxon’s relative abundance inevitably leads to a concurrent decrease in the relative abundance of other taxa [19, 20]. This characteristic can lead to high false-positive rates in differential abundance analyses [47,48,49], introduce spurious correlations [30], and miss possible correlation pairs [18], and these effects are especially severe when communities have dominant taxa [50]. All of these factors hinder inter-sample and inter-study comparisons. Additionally, biological and ecological insights, including microbiota-host interplay [18] and inter-species interactions in engineered systems [21], can be enhanced by the knowledge of absolute abundances [45].
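The constant-sum constraint can be illustrated with a small worked example. The sketch below uses made-up cell densities (the taxon names and all numbers are hypothetical, not data from any cited study): one taxon keeps the same absolute abundance in two samples, yet its relative abundance drops because another taxon grows.

```python
def to_relative(abs_counts):
    """Convert absolute counts (e.g., cells/mL) to fractions that sum to 1."""
    total = sum(abs_counts.values())
    return {taxon: n / total for taxon, n in abs_counts.items()}


def to_absolute(rel_abundance, total_load):
    """Anchor relative abundances with a measured total microbial load."""
    return {taxon: frac * total_load for taxon, frac in rel_abundance.items()}


# Hypothetical cell densities (cells/mL); taxon_2 is identical in both samples.
sample_a = {"taxon_1": 1.0e6, "taxon_2": 1.0e6}
sample_b = {"taxon_1": 3.0e6, "taxon_2": 1.0e6}

rel_a = to_relative(sample_a)
rel_b = to_relative(sample_b)
abs_b = to_absolute(rel_b, total_load=sum(sample_b.values()))

print(rel_a["taxon_2"], rel_b["taxon_2"])  # relative abundance of taxon_2 falls
print(abs_b["taxon_2"])                    # absolute abundance is recovered
```

Read from relative data alone, taxon_2 appears to decline between the two samples; once the total load is known, the apparent decline disappears.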
To achieve absolute quantification (AQ), various methods can be adopted to determine the absolute abundance of microbial taxa via known “anchor” points that convert relative data into absolute values [19, 45]. In this review, we categorize the current methods into two groups: (i) incorporating relative abundance with total microbial load and (ii) internal standard (IS)-based AQ (also known as IS-facilitated AQ). We argue that conventional cultivation-based and direct counting methods, while backed by well-established protocols, may not be able to fulfill the key criteria of applicability, reliability, rapidity, and affordability, particularly for complicated environmental samples, owing to their inherent limitations. These conventional methods often face challenges with the complexity and variability of these samples, including diverse matrix effects, leading to less reliable results, longer processing times, and increased costs. Instead, we propose that IS-based metagenomic quantification can benefit EAM because it (i) is applicable to diverse environmental samples, regardless of whether cells are in a free-living state or in flocs; (ii) is independent of cultivation, which matters because the majority of bacteria in natural or engineered systems have not been isolated [51]; and (iii) supports wide-spectrum scanning, including the enumeration of both single species and higher phylogenetic taxa (e.g., genera, classes, or phyla). However, this approach has several drawbacks, including potential biases arising from the selection of IS and sequencing technologies, the requirement for specialized computational resources and expertise to analyze the data and interpret quantitative results, and relatively high limits of detection (LoD) compared with the conventional methods.
Furthermore, we outline key elements for technical consideration, aimed at improving the reliability and feasibility of AQ methods, and provide an element checklist as a reference for future studies in the field of the environmental microbiome to advance EAM.
2. Estimation of total microbial load
The total microbial load can be estimated using methods that fall into three groups: (1) direct counting, including cultivation, microscopic counting, and flow cytometry (FCM); (2) indirect microbial indicator measurements, including biomass dry weight, total extracted DNA, OD600, microbial biomass carbon and nitrogen (MBC and MBN), phospholipid fatty acid (PLFA), and adenosine triphosphate (ATP); and (3) molecular methods, including quantitative polymerase chain reaction (qPCR), digital PCR (dPCR), and computation-based metagenomic absolute quantification. Each method has advantages and shortcomings and is affected by diverse factors (as summarized in Table S1 and Fig. 2).
2.1 Direct counting methods
The heterotrophic plate count (HPC), measured in colony-forming units (CFU) [52], and the multiple-tube fermentation technique, measured in most probable number (MPN) [53], are cultivation-based methods for counting cell numbers in water and wastewater samples [54]. The major limitation is the underestimation of the total number of cells because the “unseen majority” is non-culturable under the given conditions [55]. To include cells in a viable but nonculturable state, as well as dead and non-culturable cells, microscopic counting via fluorescence microscopy or hemocytometers can be employed. The fluorescence microscopic method involves staining cells with DNA-specific dyes or probes, capturing cells on a membrane, and enumerating them under a microscope [51, 56,57,58,59]. Additionally, the hemocytometer, a specialized counting chamber, allows cells suspended in a sample to be counted under a light microscope [60]. However, these methods are strongly affected by the distribution of cells in the samples and by the skills of the operators. Typically, pre-treatments or specialized devices (such as a multi-volume hemocytometer) are required to keep the number of cells within measurable ranges [60, 61].
FCM has gained considerable traction for cell number enumeration because of its highly informative nature in counting various cells in microbial communities [62, 63], ability to distinguish live and dead cells using different DNA dyes [64], and advantages of high accuracy, reproducibility, rapid processing, automation potential, and cost-effectiveness [55]. For example, FCM can deliver reproducible results within 15 min with relative standard deviations of less than 3% [55]. However, FCM faces challenges such as potential bias during sample preparation [55], the lack of a universal FCM setting and analytical protocol [21], and interference from cell debris and aggregates [22, 51]. FCM counting is more suitable for environmental samples with low biomass and well-dispersed cells, such as drinking water [62], cooling water samples [46], and river samples [23].
As an alternative to DNA dyes, oligonucleotide probes targeting ribosomal RNA (rRNA) genes can be used for cell identification via fluorescence in situ hybridization (FISH); when FISH is combined with FCM and confocal laser scanning microscopy, free-living and aggregated microbes can be quantified in absolute values [51, 65, 66]. Catalyzed reporter deposition FISH (CARD-FISH) can amplify signals from microbes with low abundance or low activity, recovering an average of 94% of cells [67], and performs well in monitoring and localizing microbial populations within particles [68]. However, the application of FISH-facilitated methods for absolute quantification in complex samples is limited by the high demand on operator experience, sample preparation challenges, and target molecule accessibility [69, 70].
2.2 Indirect microbial indicator measurements
A variety of indirect microbial indicators can be used for approximating cell numbers, and these methods require a known relationship between actual cell counts and the parameter measured by a specific method. The volatile suspended solids (VSS) and total DNA amount (in the unit of µg DNA per mg sample) serve as proxies for biomass in wastewater treatment systems [71,72,73] and for the microbial load [74], respectively. These estimations may be imprecise because non-microbial organic particles in VSS interfere with measurements, and non-bacterial DNA sources and variations in microbial genome size affect the conversion from total DNA to cell numbers [74]. The UV–Vis spectrophotometric optical density (OD) at a wavelength of 600 nm is commonly employed to determine the number of microorganisms in pure liquid cultures [75, 76], but is not commonly used for environmental samples. Other methods involve the extraction of relevant materials from cells to estimate the microbial cell numbers, including (i) MBC [77,78,79] and MBN [80]; (ii) PLFA with an empirical value of 1.40 × 10⁻⁸ nmol bacterial PLFA/cell [81]; and (iii) ATP with an empirical value of 1.75 × 10⁻¹⁰ nmol ATP/cell [73, 77, 82] and other case-specific conversion values [83]. However, calibration between cell numbers and the values of MBC, MBN, PLFA, and ATP is needed, which is very challenging for environmental samples containing diverse microbes and cells with different activities.
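As a rough illustration of how such empirical conversion values are applied, the sketch below divides a measured indicator amount by the per-cell value quoted above. The measured amounts in the example are hypothetical, and the function name is introduced here only for illustration; real samples would need the calibration discussed in the text.

```python
# Empirical per-cell values quoted in the text (nmol per cell).
PLFA_NMOL_PER_CELL = 1.40e-8
ATP_NMOL_PER_CELL = 1.75e-10


def cells_from_indicator(measured_nmol, nmol_per_cell):
    """Estimate cell number from an indicator amount and a per-cell value."""
    if measured_nmol < 0 or nmol_per_cell <= 0:
        raise ValueError("amounts must be non-negative and per-cell value positive")
    return measured_nmol / nmol_per_cell


# Hypothetical measurements per gram of sample.
plfa_cells = cells_from_indicator(14.0, PLFA_NMOL_PER_CELL)  # about 1e9 cells/g
atp_cells = cells_from_indicator(0.35, ATP_NMOL_PER_CELL)    # about 2e9 cells/g
print(f"{plfa_cells:.2e} cells/g (PLFA), {atp_cells:.2e} cells/g (ATP)")
```

The two estimates disagree by a factor of two in this made-up case, which is the kind of discrepancy that sample-specific calibration is meant to resolve.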
2.3 Molecular methods
qPCR and dPCR, both reliant on PCR technology and fluorescent signals, provide quantitative data on microbial nucleic acid targets in diverse environments [40, 84]. The advantages of these methods include high analytical sensitivity and specificity, high throughput for multiple genes and samples, extensive dynamic ranges, and rapid turnaround times [40]. qPCR achieves quantification by comparing Ct values with a standard curve generated from DNA fragments with known copy numbers, while dPCR partitions PCR reactions, measures fluorescence signals, and applies Poisson statistics for quantification [40, 85]. dPCR offers several advantages over qPCR for quantification, including enhanced sensitivity, greater tolerance to PCR inhibitors, and independence from a standard curve [86, 87]. However, qPCR is generally more cost-effective and faster, and has well-established protocols [88, 89]. One critical requirement of PCR is specific primer pairs, which may not be readily available depending on the study objectives, necessitating design and evaluation before application [90]. Importantly, contamination is a common issue [40], particularly for samples with low total microbial loads [19]. Moreover, the main challenges for qPCR and dPCR are amplification bias and the matrix effect of complex environmental samples, which can obscure the real quantitative results [91,92,93,94].
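The two quantification principles described above can be sketched in a few lines. This is a minimal illustration of the generic textbook relations (a linear standard curve of Ct against log10 copies for qPCR, and the Poisson correction for dPCR), not a procedure taken from the cited studies. The standard-curve values and partition counts below are hypothetical.

```python
import math


def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template copies in a reaction."""
    return 10 ** ((ct - intercept) / slope)


def dpcr_copies_per_partition(positive, total):
    """Poisson-corrected mean copies per partition: lambda = -ln(1 - p)."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total partitions")
    return -math.log(1 - positive / total)


# Hypothetical dilution series of a standard (10^3 to 10^7 copies).
slope, intercept = fit_standard_curve([3, 4, 5, 6, 7],
                                      [30.0, 26.68, 23.36, 20.04, 16.72])
unknown = copies_from_ct(23.36, slope, intercept)

# Hypothetical dPCR run: half of 20,000 partitions are positive.
lam = dpcr_copies_per_partition(10_000, 20_000)
print(round(slope, 2), f"{unknown:.3g}", round(lam, 4))
```

Dividing the per-partition value by the partition volume of the instrument in use gives a concentration; that volume is platform-specific and is left out here.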
Like qPCR and dPCR quantification methods, the DNA mass-based metagenomic absolute quantification method assumes a DNA extraction efficiency of 100% that is consistent across all studied samples [2]. The method involves the coverage (D) of microbial cells or genetic elements in the sequencing reads (Eq. 1 [95]) (Table 1), and the sequencing rate of DNA (Sr), which is the ratio between the DNA mass of metagenomic sequenced reads (m) and the total extracted DNA mass (M) (calculated using Eqs. 2 and 3) [2]. The absolute abundance can be calculated by dividing the coverage by the sequencing rate [2]. In practice, the quantification of total extracted DNA mass (M) might be influenced by instrumental measurement, while bioinformatic analysis, including the selection of alignment algorithms, tools, and cutoffs, might affect coverage (D) values.
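The relationship between D, Sr, m, and M described above can be written out as a short sketch. Eqs. 1 to 3 are not reproduced in this text, so the code takes D, m, and M as given inputs rather than deriving them; the function names, the per-gram normalization, and the example numbers are assumptions made for illustration.

```python
def sequencing_rate(sequenced_dna_mass, extracted_dna_mass):
    """Sr = m / M, with both masses in the same unit (e.g., ng)."""
    if extracted_dna_mass <= 0 or not 0 < sequenced_dna_mass <= extracted_dna_mass:
        raise ValueError("require 0 < m <= M")
    return sequenced_dna_mass / extracted_dna_mass


def dna_mass_based_abundance(coverage, sr, sample_amount=1.0):
    """Absolute abundance = D / Sr, optionally per unit of sample processed."""
    return coverage / sr / sample_amount


# Hypothetical inputs: D = 2.0x coverage, m = 5 ng sequenced, M = 1000 ng extracted,
# from 0.25 g of sample.
sr = sequencing_rate(5.0, 1000.0)
per_extract = dna_mass_based_abundance(2.0, sr)
per_gram = dna_mass_based_abundance(2.0, sr, sample_amount=0.25)
print(sr, per_extract, per_gram)
```

Because the calculation assumes complete extraction, any loss during cell lysis or purification propagates directly into the result as underestimation.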
2.4 Summary and outlook
Each microbial load estimation method has specific application scenarios, and the method selection should depend on the physicochemical characteristics of the samples and the inherent requirements of the methods. For example, FCM may be preferred for well-dispersed water samples, whereas qPCR and dPCR could be more suitable for samples requiring high sensitivity and specificity, such as those from clinical settings. Integrating multiple techniques can enhance microbiome quantification; for example, combining FISH and FCM could achieve the quantification and sorting of multiple microorganisms by targeting specific signals. Furthermore, the development of rapid and high-throughput absolute quantification will significantly benefit decision-making, such as the timely assessment of microbial pollutants in environmental samples to safeguard public health.
3. IS-based absolute quantification
Environmental samples often contain highly aggregated microbes, with cells clustering in flocs rather than being evenly distributed, and complicated environmental matrices with interfering substances. Together, these features diminish the feasibility of current methods for absolute quantification of microbial populations in diverse environmental habitats. Given the extensive experience with ISs in analytical chemistry, incorporating ISs as references in environmental samples is advantageous. ISs undergo the same processes as indigenous populations from sample pre-treatment to data generation [6, 21] and function as an “anchor” to convert the relative abundance in sequencing datasets to absolute abundance. The principles underpinning the use of ISs for calibration to achieve AQ encompass (i) the capacity of the IS to capture variations introduced by different steps in the experimental workflow, i.e., sample pre-treatment, DNA extraction, library construction, and sequencing [21, 45]; (ii) the assumption that throughout the workflow, the ISs behave analogously to the native cells and/or DNA molecules in the sample (e.g., in GC content, length, and primer binding affinity) [24, 42, 97]; and (iii) the ability to unequivocally identify the DNA-based-IS bioinformatically [1, 6], or to ensure that the ISs are exogenous to the studied biological samples [31, 98].
3.1 Equations used in IS-based AQ
An IS can serve as an “anchor” for calculating the scaling factor (SF), which is the ratio between the number of ISs spiked in the sample per unit volume/mass and the number of ISs covered in the sequencing dataset, for example, marker-gene-based SFs (Eq. 4) [1, 21] (Table 1). Similarly, other parameters have been proposed for the downstream calibration, such as the size factor [98] and the IS normalization factor [99]. These factors are both sample- and IS-specific, facilitating the quantification of absolute abundance for a particular taxon and gene. The underlying assumption for using marker-gene-based SF is that one cellular IS cell carries one copy of the fluorescent marker gene [1, 21]. However, it has been noted that, under nutritious conditions, the ratio might surpass 1:1 due to active and constant cell division [6]. As a result, the SF can be refined by introducing an adjustment factor, which is the ratio between the coverage of the marker gene and the coverage of the cellular IS genome for bacteria under exponential growth, and then calculating the adjusted SF [6]. An alternative calculation determines the microbial load (X) by leveraging the relative abundance of the ISs in amplicon datasets (Eq. 5) [31]. If the DNA extraction efficiency (η) can be ascertained, the absolute abundance of a specific lineage can be calculated via Eq. 6 (Table 1).
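The SF definition given in words above can be sketched as follows. Eq. 4 itself is not reproduced in this text, so the code follows the verbal definition only. How the adjustment factor enters the adjusted SF (here, by multiplication) is an assumption of this sketch, and all names and numbers are hypothetical.

```python
def scaling_factor(is_spiked_per_unit, is_coverage):
    """SF = ISs spiked per unit volume/mass divided by IS coverage in the dataset."""
    if is_coverage <= 0:
        raise ValueError("IS was not detected; SF is undefined")
    return is_spiked_per_unit / is_coverage


def adjusted_scaling_factor(sf, marker_gene_coverage, is_genome_coverage):
    """Rescale SF when a dividing IS cell carries more than one marker copy.

    Assumption of this sketch: adjustment factor = marker coverage / genome
    coverage, applied multiplicatively so that SF is expressed per IS genome.
    """
    return sf * (marker_gene_coverage / is_genome_coverage)


def is_based_abundance(target_coverage, sf):
    """Convert the coverage of a taxon or gene to absolute abundance."""
    return target_coverage * sf


# Hypothetical sample: 1e7 IS cells/mL spiked, marker gene covered 50x,
# IS genome covered 40x, target taxon covered 3x.
sf = scaling_factor(1.0e7, 50.0)
sf_adj = adjusted_scaling_factor(sf, 50.0, 40.0)
print(is_based_abundance(3.0, sf), is_based_abundance(3.0, sf_adj))
```

Since SF is computed per sample, technical losses that affect the IS and the native cells equally cancel out in the final abundance.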
3.2 IS selection and IS-based AQ applications
ISs can be classified into two categories: DNA-based-ISs, which are specific DNA molecules, and cellular ISs (cell-based-ISs), which are whole cells. DNA-based-ISs encompass genomic DNA (gDNA) from populations that are exogenous to the indigenous microbiome, and synthetic DNA (sDNA) sequences. For instance, Smet et al. (2016) pioneered the DNA-based-IS AQ method by investigating the feasibility of adopting gDNA from Aliivibrio fischeri (DSM 507, with a GC content of 38%, commonly associated with marine environments) and Thermus thermophilus (DSM 46338, with a GC content of 69%, typically found in geothermal environments) as DNA-based-ISs to study soil microbiomes [25]. To quantify microbial abundance in 16S and 18S rRNA gene copies per milliliter of seawater, the gDNA of T. thermophilus (representing the 16S rRNA gene IS) and Schizosaccharomyces pombe (representing the 18S rRNA gene IS) were spiked into cells on the filter membrane collected from marine water, followed by amplicon sequencing after the DNA extraction [100]. Given the influence of GC content on quantitative accuracy, Venkataraman et al. (2018) incorporated gDNA from two bacterial species, i.e., Aliivibrio fischeri and Rhodopseudomonas palustris with GC contents of 38% and 65%, respectively, into fecal samples [101]. In addition, gDNA-based-ISs from a marine bacterium, Marinobacter hydrocarbonoclasticus (ATCC 700491, Gram negative, GC content of 57%), were introduced into the extracted DNA of dairy samples, followed by metagenomic sequencing, enabling the quantification of the absolute copy number of thousands of genes concurrently [99].
Besides, sDNA-based-ISs, which usually have two parts, i.e., conserved regions for universal primer binding and artificial regions for differentiating ISs from native sequences of indigenous populations [24, 96, 102, 103], have been utilized in a few environmental microbiome studies to compute the absolute abundance of amplicons. Taking into account common primer binding sites from genes used for the identification of prokaryotes (16S), eukaryotes (18S), and fungi (ITS), and sequence lengths, sDNA-based-ISs were directly added to soil samples, followed by DNA extraction, and amplification using universal PCR primers to calculate the absolute abundance of amplicon families (i.e., 16S, 18S, and ITS) [24]. Five sDNA-based-ISs were generated by referring to the most abundant bacterial and fungal populations in the studied environment and added in gradient concentrations to solid-state fermentation samples for liquor production to quantify absolute abundance of microbiota [103]. Maintaining the conserved regions of selected natural 16S rRNA genes from five microorganisms, Tourlousse et al. (2017) replaced the variable regions with artificial sequences to generate twelve sDNA-based-ISs and added them to calculate the abundance of microbes in sludge and soil samples [102]. The same strategy was adopted to design a single sDNA-based-IS with modification of a 45-bp region of a 733-bp segment from an Escherichia coli strain 16S rRNA gene, which was used to retrieve the concentrations of the 16S rRNA gene per gram of fecal sample [96]. Nine sDNA-based-ISs at different concentrations were mixed with a DNA pool of DNA extracted from the soil samples to quantify the absolute abundance of soil microbiome [104]. As an alternative, sDNA-based-ISs can be designed by referring to other feature sequences of whole genomes [105]. 
For instance, microbial genomes representing features such as taxa, genome size, GC content, rRNA operon count, and isolation environment were retrieved from RefSeq; subsequently, subsequences of each genome were selected and inverted, while the original genome features were retained, to generate 86 sDNA-based-ISs with varying lengths (1–10 kb) and GC contents for absolute quantitative microbial profiling using metagenomes [106]. In these studies, in silico analyses were conducted to ensure that sDNA-based-IS sequences satisfy the target GC content ranges, lack homopolymers, and have no repeats of > 16 bp and no self-complementary regions of > 10 bp [102, 103], and to confirm the uniqueness of artificial sequences by conducting BLAST searches against NCBI databases [102, 103, 106].
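The in silico screening criteria listed above can be approximated with simple string checks. The sketch below is an illustration, not the pipeline used in the cited studies: the GC range and the maximum homopolymer length are assumed defaults (the text gives no numeric thresholds for them), a "repeat" is interpreted as any k-mer that occurs more than once, and a "self-complementary region" as any k-mer whose reverse complement also occurs in the sequence. The uniqueness check by BLAST is not covered.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    if not seq:
        return 0
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def has_repeat(seq, min_len=17):
    """True if any k-mer of min_len occurs more than once (repeat > 16 bp)."""
    seen = set()
    for i in range(len(seq) - min_len + 1):
        kmer = seq[i:i + min_len]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def has_self_complementary(seq, min_len=11):
    """True if a k-mer and its reverse complement both occur (> 10 bp)."""
    kmers = {seq[i:i + min_len] for i in range(len(seq) - min_len + 1)}
    return any(k.translate(_COMPLEMENT)[::-1] in kmers for k in kmers)


def passes_screen(seq, gc_range=(0.4, 0.6), max_homopolymer=4):
    """Apply all four checks; gc_range and max_homopolymer are assumed values."""
    seq = seq.upper()
    return (gc_range[0] <= gc_content(seq) <= gc_range[1]
            and longest_homopolymer(seq) <= max_homopolymer
            and not has_repeat(seq)
            and not has_self_complementary(seq))


print(passes_screen("ACGTTGCA"), passes_screen("AAAAAAAACG"))
```

A candidate that passes these checks would still need the database search described in the text to confirm that it is absent from known genomes.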
However, DNA-based-ISs cannot correct for the cell lysis efficiency in DNA extraction; thus, cellular ISs are suggested. The rationale for selecting a cellular IS is based on the assumption that the extraction efficiencies and sequencing biases of endogenous microbes are the same as those of the selected cellular ISs. Additionally, the cellular ISs should either be absent from the studied environment or be clearly distinguishable from endogenous microorganisms. Stämmler et al. (2016) spearheaded the adoption of three exogenous bacterial strains (i.e., Salinibacter ruber (Gram negative), Rhizobium radiobacter (Gram negative), and Alicyclobacillus acidiphilus (Gram positive)) as cellular ISs to quantify the bacterial absolute abundance of gut specimens [98]. Using an environmental bacterium, Sporosarcina pasteurii (Gram positive), as a cellular IS together with replicate sampling, a study of reproducibility in the absolute quantification of microbial populations in soil and stool samples revealed that more than 50% of species changes were due to technical variation, such as in the sampling step [31].
Besides, engineered microbes with marker gene labels can be used as cellular ISs when the marker gene is absent from the studied environment and is distinguishable from genes of indigenous populations. Using marker-gene labeled cellular ISs has two additional assumptions: (i) the genome coverage of the cellular ISs is the same as the coverage of the single-copy marker gene or at a fixed ratio; and (ii) the mapping recovery rates of different genes/populations are the same as those of the marker gene. The cellular IS-based method was developed using mClover-labeled E. coli with nanopore sequencing to achieve rapid identification and quantification of pathogens and ARGs in influent and effluent samples from wastewater treatment plants (WWTPs) [1]. Considering the inconsistent DNA extraction efficiencies between Gram positive and negative microbes, mCherry-labeled Bacillus amyloliquefaciens (Gram positive) and gfp-labeled Pseudomonas putida (Gram negative) were applied as cellular ISs with Illumina sequencing to profile the population dynamics under various anaerobic digestion operating conditions [21]. The cellular IS-based AQ application was broadened from documenting the absolute abundance of microbial populations to quantitative microbial risk assessment (QMRA) for beach water using egfp-labeled Pseudomonas_E hunanensis via nanopore sequencing [6].
The accurate identification and quantification of ISs have been demonstrated via mock communities which have known amounts, as their quantitative results closely align with pre-designated values [102, 106], which validates the feasibility of adopting ISs in diverse environmental samples. For example, when sDNA-based-ISs were added to a plasmid-based DNA mock community of 15 bacterial strains at known concentrations, the resulting proportion was 35.8 ± 4.2% (range 31.6–41.0%), which is comparable to the expected value of 44.4% [102]. A staggered mixture of sDNA-based-ISs was incorporated into gDNA from a mock community at an abundance of 1%, and a value of 1.5% was observed [106].
4. Addressing uncertainties of IS-based AQ with a technical checklist
It is crucial to pose critical questions that can guide the selection of the most suitable AQ method for an environmental microbiome study. One should assess whether the cells within samples are well suspended or in a heterogeneous state. If cells are well suspended, direct counting could be used together with ISs for a double check, while, in principle, the IS already can capture technical variations in multiple-step analyses [45]. A systematic approach involving a checklist and technical suggestions can aid in identifying the optimal strategy for using an IS-based AQ (Figure S1 and SI 1). The key factors to consider in the experimental design include determining the most suitable IS type (e.g., DNA-based-IS or cellular IS), assessing the optimal quantity of ISs to be incorporated into the samples, and evaluating the lowest concentrations that can be reliably discerned. Additionally, it is also important to ascertain at which processing step the IS should be spiked, select the sequencing approach (i.e., metagenomic or amplicon), and validate the accuracy and reliability of the chosen method (Fig. 3 and Table S2).
4.1 Optimizing IS addition for accurate quantification
A positive correlation has been observed between the IS input and the total number of read counts assigned to the IS in both amplicon and metagenomic sequencing datasets [98, 99, 102, 103]. This relationship has been substantiated in multiple studies, including those utilizing staggered IS mixtures (of 12 sDNA-based-ISs) with a dynamic range of approximately 2¹⁰ [102], inputting IS concentrations varying across three orders of magnitude, i.e., spike-in mass percentage (ratio of mass between spiked-in DNA/total DNA) of 0.1%, 1%, and 10% [99], and employing sDNA-based-ISs at five distinct concentrations, ranging from 10⁴ to 10⁸ copies/g solid fermentation samples [103].
However, using low IS amounts may lead to biased quantitative results [24], whereas using high IS amounts will sacrifice the effective sequencing dataset size. As benchmarked by a Rhizobium pure culture, accurate estimates of Rhizobium abundance were achieved with high levels of IS addition, whereas low levels led to substantial underestimation [24]. Moreover, a dilution series experiment revealed a more robust linear relationship between soil weight and total 16S rRNA gene copies quantified using gDNA-based-IS AQ when gDNA-based-ISs were added at 1% rather than at 0.1% of total DNA [25]. A study using three gDNA-based-IS addition percentages (0.1%, 1%, and 10%, IS DNA/total DNA mass) suggested that IS addition should exceed 0.1% to ensure the detection of ISs in metagenomic sequencing [99]. These findings underscore the importance of optimizing IS addition to environmental samples, as it is crucial for ensuring accurate detectability [45, 99] while efficiently allocating sequencing resources to microbes in samples [24, 25, 45, 106, 107].
4.2 Correcting variability in DNA extraction
The variability in DNA extraction across samples can be attributed to differences in environmental matrices, microbial populations, extraction methods/kits, etc. Considering the high diversity and heterogeneity of microbial populations in environmental samples, we suggest using a larger starting amount of sample, within the mass or volume range recommended in the protocol, to obtain more DNA and better representativeness, in addition to replicate samples and homogenization pre-treatment.
The DNA extraction process typically comprises three main steps: cell lysis, DNA isolation from cell lysates, and DNA purification. Sometimes, concentration is required for low-biomass samples. Generally, DNA extraction efficiency (η) is calculated by using spiked cellular ISs or DNA-based-ISs (which do not simulate cell lysis), and by comparing the detected values with the known spiking amount of ISs [96, 103]. When sDNA-based-ISs were spiked into samples, η was 40–84% for soil samples using the Quick-DNA™ Fecal or Soil Microbe Miniprep Kit™ [96], and 72–87% for the bacteria in a community using the E.Z.N.A. soil DNA kit [103].
To include the cell lysis efficiency in calculating DNA extraction efficiency, cellular ISs are more appropriate, under the assumption that extraction efficiency is consistent between cellular IS cells and indigenous cells of environmental samples. The η was 75–110% using the QIAamp PowerFecal kit for manure slurry and manure stockpile samples [99]. There are two ways to calculate η. One method is to take the ratio between the copies of the IS in the extracted DNA, measured using qPCR, and the copies of the ISs added to the sample, as shown in Eq. 7. Alternatively, η can be calculated using the coverage of the ISs in the extracted DNA (i.e., the coverage of the ISs normalized against the sequencing rate), divided by the number of the ISs added to a sample (Eq. 8).
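The two routes to η described above can be sketched directly from their verbal definitions. Eqs. 7 and 8 are not reproduced in this text, so the function names and the example numbers below are hypothetical and serve only to show the arithmetic.

```python
def efficiency_from_qpcr(is_copies_recovered, is_copies_added):
    """qPCR route: IS copies measured in the extract / IS copies spiked."""
    if is_copies_added <= 0:
        raise ValueError("spiked amount must be positive")
    return is_copies_recovered / is_copies_added


def efficiency_from_coverage(is_coverage, sequencing_rate, is_copies_added):
    """Sequencing route: (IS coverage / sequencing rate) / IS copies spiked."""
    if sequencing_rate <= 0 or is_copies_added <= 0:
        raise ValueError("sequencing rate and spiked amount must be positive")
    return (is_coverage / sequencing_rate) / is_copies_added


# Hypothetical sample: 1e8 IS copies spiked and 7.5e7 recovered by qPCR;
# in the metagenome the IS is covered 3750x at a sequencing rate of 0.005,
# with 1e6 IS copies spiked.
eta_qpcr = efficiency_from_qpcr(7.5e7, 1.0e8)
eta_seq = efficiency_from_coverage(3750.0, 0.005, 1.0e6)
print(round(eta_qpcr, 3), round(eta_seq, 3))
```

Values above 1, such as the upper end of the 75–110% range quoted above, are arithmetically possible and reflect measurement uncertainty in either the numerator or the spiked amount.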
4.3 LoD in absolute quantification using metagenomic sequencing
The LoD serves as a crucial indicator of AQ methods, with a lower LoD indicating increased sensitivity in AQ methods. However, the LoD of AQ methods based on metagenomic sequencing is subject to the influence of various factors such as sequencing depth [1, 99] and microbial loads [106]. An increase in the LoD, i.e., lower sensitivity, was observed when the sequencing depth decreased or the microbial load of the sample increased [1, 21]. It is essential to recognize that LoD is also species-specific due to different genome sizes [21].
Definitions of LoD are inconsistent across studies. One study added gDNA-based-ISs into the extracted DNA of samples at three percentages (0.1%, 1%, and 10%) and sequenced each metagenome at a depth of 50 million 150 bp paired-end reads [99]. The LoD was determined to be 3.2 × 10⁷ gene copies/g (dry weight) of dairy waste sample, since 95 out of 4272 genes of the ISs could not be detected at the 0.1% IS addition percentage [99]. Indeed, higher coverage of genes/genomes in AQ methods (Eq. 1) ensures higher reliability of absolute quantitative results [106]. However, setting the coverage cutoff too high may result in an unnecessarily conservative LoD, which underestimates the sensitivity of the method. For example, the LoD defined via the criterion in the above study, which requires that the majority of genes on the genome of a species have at least 1 × coverage [99], is very conservative compared with another criterion, which only requires 1 × coverage of a region in any of the unique genes of microbial genomes [1, 21].
Our previous studies determined the LoD by explicitly specifying the minimum read length assigned to a particular taxon/gene [1, 6, 21]. With a minimum nanopore read mapping length of 1 kb as the cutoff for a prokaryotic taxon, the LoDs of Klebsiella pneumoniae were 19 cells/mL and 591 cells/mL at sequencing depths (sequencing data amounts) of 12.6 Gb and 0.57 Gb, respectively, in the influent sample (with a microbial load of 3.43 × 10¹¹ cells/L) from sewage treatment works [1]. When 150 bp was used as the minimum Illumina read mapping cutoff, the developed AQ method reported LoDs of 131 ± 94 cells/mL and 265 ± 136 cells/mL for Gram-negative and Gram-positive microbes, respectively, in anaerobic digestion of sewage sludge at a sequencing depth of 10 Gb [21]. We suggest that, in metagenomic studies using IS-based AQ methods, the LoD (in units of cells or copies per unit volume or mass) should be calculated from the minimum alignment length of metagenomic reads assigned to a specific taxon/gene, normalized by genome size or gene length, and converted to absolute abundance using SFs. The LoD obtained with the same approach will differ among samples with different microbial loads. Interestingly, this LoD is independent of the amount of ISs added. Microbial loads and genome size/gene length cannot be controlled, whereas sequencing depth is a crucial factor affecting LoD that can be adjusted; striking a balance between sequencing effort and LoD therefore constitutes another critical technical consideration, particularly in metagenomic sequencing.
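The suggested LoD calculation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the exact procedure of the cited studies: the scaling factor (SF) is assumed to be the spiked IS cells per unit volume divided by the observed IS genome coverage, and all names and numbers are hypothetical.

```python
def scaling_factor(is_cells_spiked_per_ml: float, is_genome_coverage: float) -> float:
    """ASSUMED definition: cells/mL represented by 1x genome coverage."""
    if is_genome_coverage <= 0:
        raise ValueError("IS genome coverage must be positive")
    return is_cells_spiked_per_ml / is_genome_coverage


def limit_of_detection(min_aligned_bp: float, genome_size_bp: float, sf: float) -> float:
    """LoD in cells/mL for one taxon.

    The minimum alignment length is normalized by genome size to give the
    smallest detectable genome coverage, then converted with the SF.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    min_coverage = min_aligned_bp / genome_size_bp
    return min_coverage * sf


if __name__ == "__main__":
    # Illustrative values only.
    sf = scaling_factor(is_cells_spiked_per_ml=1.0e6, is_genome_coverage=10.0)
    for genome_size in (2.0e6, 5.0e6):
        lod = limit_of_detection(min_aligned_bp=1000, genome_size_bp=genome_size, sf=sf)
        print(f"genome {genome_size:.0e} bp -> LoD {lod:.1f} cells/mL")
```

Under these assumptions, a smaller genome gives a higher LoD at the same alignment cutoff, which is consistent with the LoD being species-specific. Deeper sequencing raises the IS coverage, lowers the SF, and therefore lowers the LoD.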
4.4 Enhancing AQ with multiple ISs and using ISs to calibrate variability
Multiple ISs can benefit AQ in three respects. First, compared with the use of a single IS, the adoption of multiple ISs can provide insights into variations inherent to sequences or cells that emulate a variety of taxa [102], different Gram-stain phenotypes (reflecting cell wall structure) [6, 21], genomes of varied GC contents [102, 106], cell shapes, etc. Second, by benchmarking known ratios between multiple ISs, potential outlier ISs can be easily identified and omitted from the subsequent calibration [45, 108]. Third, the recovery rates of different ISs have been shown to vary widely owing to their differential detection efficiencies. Multiple ISs can be used to “average out” differential detection rates to normalize the technical noise [102], or the read counts of all ISs can be summed for the calibration to reduce errors [98]. Multiple ISs can be employed in two ways: (i) all ISs are mixed at approximately equal amounts [6, 21]; or (ii) they are combined as staggered mixtures covering a wide range of IS concentrations [102, 106].
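As a hedged illustration of how multiple ISs might be used for calibration, the Python sketch below computes a per-IS scaling factor, flags ISs whose factor deviates strongly from the median, and then pools the remaining ISs by summing spiked amounts and read counts. The 3-fold outlier threshold, the IS names, and the numbers are assumptions made for illustration only; the cited studies may use different criteria.

```python
from statistics import median


def per_is_scaling_factors(spiked: dict, observed: dict) -> dict:
    """Scaling factor of each IS: spiked copies per observed read (or coverage unit)."""
    return {name: spiked[name] / observed[name] for name in spiked}


def flag_outliers(factors: dict, max_fold: float = 3.0) -> list:
    """Flag ISs whose factor differs from the median by more than max_fold.

    ASSUMPTION: a simple fold-change rule; other outlier tests are possible.
    """
    med = median(factors.values())
    return [name for name, f in factors.items()
            if f > med * max_fold or f < med / max_fold]


def pooled_scaling_factor(spiked: dict, observed: dict, exclude=()) -> float:
    """Sum spiked amounts and read counts over the retained ISs."""
    kept = [name for name in spiked if name not in exclude]
    if not kept:
        raise ValueError("no internal standards left after exclusion")
    return sum(spiked[n] for n in kept) / sum(observed[n] for n in kept)


if __name__ == "__main__":
    # Hypothetical equal-amount mixture of four ISs.
    spiked = {"IS_A": 1e6, "IS_B": 1e6, "IS_C": 1e6, "IS_D": 1e6}
    observed = {"IS_A": 1000, "IS_B": 1250, "IS_C": 800, "IS_D": 100}
    factors = per_is_scaling_factors(spiked, observed)
    outliers = flag_outliers(factors)
    print("outliers:", outliers)
    print("pooled SF:", pooled_scaling_factor(spiked, observed, exclude=outliers))
```

Pooling by summation weights each IS by its read count, whereas averaging the per-IS factors would weight every IS equally; which is preferable depends on whether the mixture is equal or staggered.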
Bias could be introduced in different steps of microbiome studies, from sample pre-treatment to DNA extraction, library construction, sequencing, and bioinformatic analyses (Fig. 3). Like cellular ISs, DNA-based-ISs can be spiked at different steps to benchmark variations across various technical stages. When added to soil samples [25, 96], to human fecal samples [101], to seawater samples [100], and liquor solid-fermentation samples [103] before DNA extraction, DNA extraction efficiencies can be calculated although DNA-based-ISs do not simulate cell lysis. Even just spiking DNA-based-ISs into DNA pools extracted from manure-related samples [99], soil [104], and saltmarsh samples [106] allows for benchmarking the technical variations caused by library construction and sequencing for better quantification.
4.5 Validation of IS-based AQ
The feasibility of most IS-based AQ methods is substantiated by employing mock communities or alternative independent enumeration techniques. Among the sixteen IS-based AQ investigations, seven utilized mock communities to corroborate the accuracy of the quantitative results generated by the suggested methodologies. These include the incorporation of commercialized ZymoBIOMICS™ microbial community standards [1, 6] and laboratory-made artificial communities, such as one consisting of 16 bacteria and fungi at equal cell concentration [101], gDNA from a well-defined MBARC-26 mock community [106], plasmids containing near-full-length 16S rRNA genes of 15 different bacteria [102], a microbial community consisting of five bacteria and five fungi [103], and a Rhizobium leguminosarum pure culture cell suspension [24]. In conjunction with mock communities, various enumeration methods have been employed to assess the feasibility of IS-based AQ methods, such as qPCR [24, 98, 99, 102, 103], direct counting using FCM [21, 100], and indirect estimation using the PLFA [25].
In addition to validating the accuracy of IS-based AQ methods, mock communities and ISs can also be used to identify uncertainties in bioinformatic analysis, especially when using metagenomic sequencing. Mapping reads to the reference ISs for quantification is influenced by multiple factors, such as sequencing read types, mappers and parameter settings, reference gene length, reference gene GC contents, and nonspecific mapping [1, 99, 106]. With respect to the evaluation of mapping algorithms, Bowtie2 (paired mode) outperformed Kallisto when Illumina reads were used, as determined by the gene recovery of the spiked gDNA [99]. Consequently, an 80% identity threshold was employed using nanopore sequences with Minimap2 (map-ont) as the mapper [6], and a 99% identity threshold was deployed using Illumina reads with Bowtie2 (paired mode) as the mapper [21] for mapping the fluorescence marker gene mCherry. Additionally, reference IS recovery may be underestimated when the gene length is near or shorter than the library insert size, since read pairs are less likely to map accurately when a substantial portion of the read pair extends beyond the target gene reference sequence [99]. Furthermore, DNA-based-IS recovery decreased with increasing GC content and then gradually increased on the HiSeq4000 platform [99]. The mismatch error rate of each sDNA-based-IS steadily increased with increasing GC content, but limited GC bias was observed in the coverage of reference genes using PCR-free nanopore sequencing [106]. GC bias has been reported for DNA sequencing library preparation methods and sequencing platforms, most notably for MiSeq and NextSeq, followed by Nextera XT, PacBio, and HiSeq [109]. When gDNA is used as ISs, nonspecific mapping can impact the mapping process and distort the quantification results [99].
In this situation, more mapping tools with algorithms developed under different assumptions should be tested, and stricter mapping parameter cutoffs, such as higher alignment identities and coverage thresholds, are recommended. Obviously, in such instances, sDNA-based-ISs can alleviate bias caused by nonspecific mapping [106].
4.6 Metagenomics vs. amplicon sequencing
Both metagenomic and amplicon sequencing can be integrated with ISs to achieve AQ for microbial populations and/or elements in environmental samples. These two sequencing technologies possess distinct advantages and limitations and offer specialized insights into microbial communities.
Compared with metagenomic sequencing, inherent PCR amplification biases in amplicon sequencing can distort quantitative results. First, biases attributed to templates of different GC contents have been observed [100], as templates with higher GC contents have higher melting temperatures and are less efficiently amplified [92, 94]. Second, amplification bias is introduced by primer sequence selection [91, 93] and the number of PCR cycles [110], as DNA sequences with G/C at the degenerate position can be overamplified compared with sequences with A/T [100]. Third, the PCR process is influenced by sample characteristics: for example, low-biomass samples are prone to bias caused by overamplification, and as the number of PCR cycles increases, contaminating microorganisms become cumulatively over-represented [19, 41]. If amplicon sequencing is adopted, ISs should be designed according to the primers, GC contents, and sequencing length corresponding to the amplification regions [24, 96, 102, 103] and tested with different primer sets [102]. The necessity of converting 16S rRNA gene copy numbers into cell numbers is a matter of debate. Caution should be exercised in rRNA gene copy number correction, as a significant portion of operational taxonomic units (OTUs) or amplicon sequence variants (ASVs) remains unclassified, and the limited number of known 16S rRNA gene copy numbers from sequenced genomes likely does not reflect the natural variability in 16S rRNA gene copy numbers [100]. Additionally, existing tools perform poorly for the majority of tested genomes and OTUs [111].
Metagenomic sequencing has been reported to be less susceptible to PCR bias [42, 99, 106] and allows for the profiling of the full-spectrum of genes in an entire community [112], transcending the limits of phylogenetic marker gene analysis [41]. Additionally, owing to the ongoing improvement in chemical and sequencing technologies, significant progress has been made in long-read nanopore and PacBio sequencing. However, while there are inherent problems with the sensitivity and specificity of mapping algorithms, challenges also arise from the relatively high error rate in nanopore long reads during quantification. Additionally, the quality of DNA is crucial when nanopore sequencing is employed. For instance, it has been suggested that the DNA purity should meet the requirements of an A260/A230 ratio > 2 and an A260/A280 ratio > 1.8 for human gut microbiome studies [113]. Researchers should carefully review and modify existing DNA extraction and purification protocols prior to nanopore sequencing to ensure optimal sequencing results.
4.7 Summary and outlook for IS-based AQ methods
Current practices advocated for standardizing microbiome studies include adopting consistent storage approaches and pre-treatment methods, following identical experimental protocols, and employing the same analytical tools. Quantifying microbial populations in absolute abundance has biological meaning and provides more comparable, reliable, and reproducible datasets. Arguments about the poor representativeness of cellular ISs for indigenous microbial populations highlight the need to develop more engineered cellular ISs. By leveraging our knowledge of the phylogeny of microbes in the studied environment, we can genetically modify these species by adding marker genes to their genomes and use the marker-gene-labeled engineered cells as cellular ISs, thereby simulating indigenous microbes. This approach shares the same experimental design as quantitative stable isotope probing, which is crucial for the identification and quantification of active microbial populations under given conditions [114, 115]. Moreover, marker-gene-labeled engineered archaeal populations are essential, as archaea play critical roles in diverse environments and biochemical cycles. To increase the reliability and applicability of IS-based AQ methods, incorporating a checklist of technical factors into the method’s development and application processes is of high importance (SI 1 and Figure S1). Before IS application, ISs at a gradient of amounts should be sequenced together with real samples to establish a relationship between spiked amounts and quantified values. Important actions include spiking cellular ISs during sample collection or pre-treatment to capture variations inherent in the entire methodological workflow, employing multiple ISs, optimizing IS addition amounts, and validating the method with microbial mock communities or independent enumeration approaches.
Furthermore, metagenomics offers a broader scope of information than amplicon sequencing does, especially as the ongoing development of long-read sequencing technologies overcomes the limitation of short reads. However, meticulously addressing biases stemming from bioinformatics analyses, which could be evaluated and partially resolved using simulated datasets, is crucial.
5. Application opportunities of EAM
The field of EAM has advanced to enable the detection of even trace amounts of microbial pollutants, such as pathogens and ARGs. By utilizing absolute quantitative metagenomics, this scientific discipline allows for the identification of microbial pollutants across diverse systems and establishes links between targeted ecological or engineered systems and key microbial populations or pollutants. As a result, such knowledge lays the foundation for the development of precise control strategies in various systems through the deliberate manipulation of microorganisms (Fig. 4).
5.1 Quantitative microbiome profiling
Quantitative microbiome profiling circumvents compositionality effects by documenting microbial dynamics in absolute values [18]. This approach enables the reconstruction of reliable co-occurrence networks of microorganisms, the identification of genuine inter-species interactions, and the provision of insights into underlying ecological laws in response to varying disturbance variables. For example, the absolute quantitative metagenomics allows the calculation of the growth and decay rates of microbes under different living conditions [21, 26].
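To make the growth and decay rate calculation concrete, the following Python sketch derives a net specific rate from absolute abundances at two time points. It assumes simple first-order (exponential) kinetics between the two samplings, which is an assumption for illustration and not necessarily the model used in the cited studies; the function name and numbers are hypothetical.

```python
import math


def net_specific_rate(abundance_start: float, abundance_end: float, dt_days: float) -> float:
    """Net specific rate (per day) assuming N(t) = N0 * exp(rate * t).

    Positive values indicate net growth, negative values net decay.
    Abundances must be absolute (e.g., cells/mL), not relative abundances.
    """
    if abundance_start <= 0 or abundance_end <= 0:
        raise ValueError("abundances must be positive")
    if dt_days <= 0:
        raise ValueError("time interval must be positive")
    return math.log(abundance_end / abundance_start) / dt_days


if __name__ == "__main__":
    # Illustrative values only.
    print(net_specific_rate(1.0e6, 4.0e6, dt_days=2.0))   # net growth
    print(net_specific_rate(1.0e8, 1.0e6, dt_days=5.0))   # net decay
```

The same calculation on relative abundances would be misleading, because a taxon's share can rise while its absolute abundance falls if the total microbial load changes.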
5.2 Quantitative microbial risk assessment
Key questions that must be addressed by QMRA include the following: (1) Which microbial pollutants exist, pathogens and/or ARGs? (2) What is their quantity? (3) Are pathogens viable? (4) Do ARGs have high dissemination potential [116]? The insights gained from QMRA are crucial for setting standards of microbial regulatory parameters and legislative frameworks.
IS-based methods in analytical chemistry are used to develop standard methods, such as U.S. EPA Method 200.8 [117], which is employed for regulatory monitoring of trace elements in drinking water and wastewater in compliance monitoring programs under, for example, the Clean Water Act and the Safe Drinking Water Act. IS-based analytical methods are also used to detect organic pollutants; for example, U.S. EPA Method 1694 [118] is a standard method developed to screen samples from various sources to profile the occurrence and concentrations of pharmaceuticals and personal care products. However, standard methods used for microbial pollutants are primarily cultivation-based, such as methods for enumerating cells of fecal coliforms or E. coli [119]. The selection of E. coli as a regulatory parameter of public health significance hinges on the availability of a standard analytical method [119], underscoring the necessity for standardized methods for microbiome studies. The WHO has published an updated list of bacterial priority pathogens, including 24 bacteria across 15 families of antibiotic-resistant pathogens [120]. High throughput and wide-spectrum screening capabilities are additional requirements for standard methods. As sequencing becomes prevalent in microbiome research and as genomic analysis matures and becomes more quantitative, IS-based absolute quantitative metagenomics enables the generation of reliable, comparable, systematic, and reproducible datasets. These datasets can be applied to establish regional and global baselines and to conduct QMRA. In turn, this process can facilitate the formulation of regulatory standards and legislative frameworks to safeguard the total environment.
5.3 Improvement of mathematical modeling
Mathematical modeling facilitates the comprehension of complex ecological interactions, including how microbial populations respond to environmental changes, how inter-species interactions affect community dynamics, and how microbial metabolic traits contribute to biogeochemical cycles and system stability [1]. By incorporating the absolute abundance of individual taxa to train models, the accuracy and predictive power of models can be greatly enhanced. Furthermore, the integration of multi-omics data in absolute abundance values, such as metatranscriptomics and metabolomics, offers a multi-layer perspective on microbe-mediated systems by capturing the interplay between community composition (community structure), gene expression, and metabolic activities in response to different environmental variables. This multidimensional approach promotes the development of more advanced mathematical models.
5.4 Precision control in engineered systems and microbial community interventions
By adopting absolute abundance data, the removal efficiencies of microbial pollutants in sewage treatment works can be calculated [1], and the specific activity of functional microorganisms can be computed to evaluate the performance of engineered systems [21]. Documenting the dynamics of microbes serves as a crucial reference for fine-tuned manipulation of complex communities by simplifying or deconstructing them using a drop-out approach after key environmental variables that promote the enrichment of functional microbes are identified [121,122,123]. Moreover, employing absolute abundance information in the design process of engineered systems enables a better representation of desired microbial populations, ultimately leading to improved system performance and a more tailored approach to address specific environmental and biological challenges.
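As a small worked example of the removal efficiency calculation, the Python sketch below compares absolute abundances in the influent and effluent of a treatment process. The log removal value is a convention commonly used in water treatment and is added here for illustration; it is not taken from the cited studies, and the function names and numbers are hypothetical.

```python
import math


def removal_efficiency(influent: float, effluent: float) -> float:
    """Fraction removed, from absolute abundances in the same units (e.g., cells/mL)."""
    if influent <= 0:
        raise ValueError("influent abundance must be positive")
    return 1.0 - effluent / influent


def log_removal_value(influent: float, effluent: float) -> float:
    """log10 reduction between influent and effluent."""
    if influent <= 0 or effluent <= 0:
        raise ValueError("abundances must be positive to take a logarithm")
    return math.log10(influent / effluent)


if __name__ == "__main__":
    # Illustrative abundances of one pathogen, in cells/mL.
    influent, effluent = 1.0e6, 1.0e4
    print(f"removal efficiency: {removal_efficiency(influent, effluent):.2%}")
    print(f"log removal value: {log_removal_value(influent, effluent):.1f}")
```

If the target falls below the LoD in the effluent, only a lower bound on removal can be reported, obtained by substituting the LoD for the effluent abundance.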
6. Conclusion
The rapid increase in studies on the environmental microbiome using high-throughput sequencing has facilitated the development of EAM. Nonetheless, bias in different steps of sequencing-based EAM and compositional data can compromise the reliability of quantitative results and impede effective comparisons. To address these challenges, quantifying microbial elements in absolute abundance using ISs is crucial for standardizing microbiome analyses. The estimation of total microbial load and DNA mass-based metagenomic absolute quantification approaches can be utilized for AQ, although inherent limitations of quantification methods restrict their application. To capture technical variability of multiple-step microbial analysis, IS-based AQ methods are recommended as they provide both quantitative and qualitative data, while also serving as quality control. However, uncertainties in IS-based AQ methods should be properly resolved to achieve reliable comparisons across samples and studies. In principle, EAM using an absolute quantitative microbiome enables applications such as microbial co-occurrence network reconstruction and risk assessment, contributing to a deeper understanding of microbial interaction and effective management strategies. Integrating multi-omics data and tracking microbial taxa dynamics also facilitates advanced model development and improved performance of engineered systems.
Data availability
No datasets were generated or analysed during the current study.
References
Yang Y, Che Y, Liu L, Wang C, Yin X, Deng Y, et al. Rapid absolute quantification of pathogens and ARGs by nanopore sequencing. Sci Total Environ. 2021;809:152190.
Yin X, Yang Y, Deng Y, Huang Y, Li L, Chan LY, et al. An assessment of resistome and mobilome in wastewater treatment plants through temporal and spatial metagenomic analysis. Water Res. 2022;209:117885.
Li L-G, Yin X, Zhang T. Tracking antibiotic resistance gene pollution from different sources using machine-learning classification. Microbiome. 2018;6:1–12.
Evans PN, Boyd JA, Leu AO, Woodcroft BJ, Parks DH, Hugenholtz P, et al. An evolving view of methane metabolism in the Archaea. Nat Rev Microbiol. 2019;17(4):219–32.
Mei R, Liu W-T. Quantifying the contribution of microbial immigration in engineered water systems. Microbiome. 2019;7(1):1–8.
Yang Y, Deng Y, Shi X, Liu L, Yin X, Zhao W, et al. QMRA of beach water by nanopore sequencing-based viability-metagenomics absolute quantification. Water Res. 2023;235:119858.
Barceló D, Petrovic M. Challenges and achievements of LC-MS in environmental analysis: 25 years on. TrAC, Trends Anal Chem. 2007;26(1):2–11.
Yin X, Deng Y, Ma L, Wang Y, Chan LY, Zhang T. Exploration of the antibiotic resistome in a wastewater treatment plant by a nine-year longitudinal metagenomic study. Environ Int. 2019;133:105270.
Yin X, Li L, Chen X, Liu Y-Y, Lam TT-Y, Topp E, et al. Global environmental resistome: distinction and connectivity across diverse habitats benchmarked by metagenomic analyses. Water Res. 2023;235:119875.
Wang Y, Niu Q, Zhang X, Liu L, Wang Y, Chen Y, et al. Exploring the effects of operational mode and microbial interactions on bacterial community assembly in a one-stage partial-nitritation anammox reactor using integrated multi-omics. Microbiome. 2019;7(1):1–15.
Nobu MK, Narihiro T, Mei R, Kamagata Y, Lee PK, Lee P-H, et al. Catabolism and interactions of uncultured organisms shaped by eco-thermodynamics in methanogenic bioprocesses. Microbiome. 2020;8:1–16.
Ju F, Lau F, Zhang T. Linking microbial community, environmental variables, and methanogenesis in anaerobic biogas digesters of chemically enhanced primary treatment sludge. Environ Sci Technol. 2017;51(7):3982–92.
Wang C, Wang Y, Wang Y, Cheung K k, Ju F, Xia Y, et al. Genome-centric microbiome analysis reveals solid retention time (SRT)-shaped species interactions and niche differentiation in food waste and sludge co-digesters. Water Res. 2020;181:115858.
Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, et al. Metagenomic biomarker discovery and explanation. Genome Biol. 2011;12:1–18.
Gentry T, Rensing C, Pepper I. New approaches for bioaugmentation as a remediation technology. Crit Rev Environ Sci Technol. 2004;34(5):447–94.
Thauer RK, Kaster A-K, Seedorf H, Buckel W, Hedderich R. Methanogenic archaea: ecologically relevant differences in energy conservation. Nat Rev Microbiol. 2008;6(8):579–91.
Qin W, Zheng Y, Zhao F, Wang Y, Urakawa H, Martens-Habbena W, et al. Alternative strategies of nutrient acquisition and energy conservation map to the biogeography of marine ammonia-oxidizing archaea. ISME J. 2020;14(10):2595–609.
Vandeputte D, Kathagen G, D’hoe K, Vieira-Silva S, Valles-Colomer M, Sabino J, et al. Quantitative microbiome profiling links gut community variation to microbial load. Nature. 2017;551(7681):507–11.
Barlow JT, Bogatyrev SR, Ismagilov RF. A quantitative sequencing framework for absolute abundance measurements of mucosal and lumenal microbial communities. Nat Commun. 2020;11(1):2590.
Jian C, Luukkonen P, Yki-Järvinen H, Salonen A, Korpela K. Quantitative PCR provides a simple and accessible method for quantitative microbiota profiling. PLoS ONE. 2020;15(1):e0227285.
Wang C, Yang Y, Wang Y, Wang D, Xu X, Wang Y, et al. Absolute quantification and genome-centric analyses elucidate the dynamics of microbial populations in anaerobic digesters. Water Res. 2022;224:119049.
Foladori P, Bruni L, Tamburini S, Ziglio G. Direct quantification of bacterial biomass in influent, effluent and activated sludge of wastewater treatment plants by using flow cytometry. Water Res. 2010;44(13):3807–18.
Liang J, Mao G, Yin X, Ma L, Liu L, Bai Y, et al. Identification and quantification of bacterial genomes carrying antibiotic resistance genes and virulence factor genes for aquatic microbiological risk assessment. Water Res. 2020;168:115160.
Tkacz A, Hortala M, Poole PS. Absolute quantitation of microbiota abundance in environmental samples. Microbiome. 2018;6(1):110.
Smets W, Leff JW, Bradford MA, McCulley RL, Lebeer S, Fierer N. A method for simultaneous measurement of soil bacterial abundances and community composition via 16S rRNA gene sequencing. Soil Biol Biochem. 2016;96:145–51.
Long AM, Hou S, Ignacio-Espinoza JC, Fuhrman JA. Benchmarking microbial growth rate predictions from metagenomes. ISME J. 2021;15(1):183–95.
Wang C, Yin X, Xu X, Wang D, Liu L, Zhang X, et al. Metagenomic absolute quantification of antibiotic resistance genes and virulence factor genes-carrying bacterial genomes in anaerobic digesters. Water Res. 2024;253:121258.
Brennan C, Salido RA, Belda-Ferre P, Bryant M, Cowart C, Tiu MD, et al. Maximizing the potential of high-throughput next-generation sequencing through precise normalization based on read count distribution. mSystems. 2023;8:e0000623.
Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc Natl Acad Sci USA. 2011;108(supplement_1):4516–22.
Tsilimigras MC, Fodor AA. Compositional data analysis of the microbiome: fundamentals, tools, and challenges. Ann Epidemiol. 2016;26(5):330–5.
Ji BW, Sheth RU, Dixit PD, Huang Y, Kaufman A, Wang HH, et al. Quantifying spatiotemporal variability and noise in absolute microbiota abundances using replicate sampling. Nat Methods. 2019;16(8):731–6.
Kumar T, Bryant M, Cantrell K, Song SJ, McDonald D, Tubb HM, et al. Effects of variation in sample storage conditions and swab order on 16S vaginal microbiome analyses. Microbiol Spectr. 2024;12(1):e03712-e3723.
Bautista de Los Santos QM, Schroeder JL, Blakemore O, Moses J, Haffey M, Sloan W, et al. The impact of sampling, PCR, and sequencing replication on discerning changes in drinking water bacterial community over diurnal time-scales. Water Res. 2016;90:216–24.
Li A-D, Metch JW, Wang Y, Garner E, Zhang AN, Riquelme MV, et al. Effects of sample preservation and DNA extraction on enumeration of antibiotic resistance genes in wastewater. FEMS Microbiol Ecol. 2018;94(2):fix189.
Poulsen CS, Kaas RS, Aarestrup FM, Pamp SJ. Standard sample storage conditions have an impact on inferred microbiome composition and antimicrobial resistance patterns. Microbiol Spectr. 2021;9(2):e01387-e1421.
Brooks JP, Edwards DJ, Harwich MD, Rivera MC, Fettweis JM, Serrano MG, et al. The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies. BMC Microbiol. 2015;15:1–14.
Cruaud P, Vigneron A, Lucchetti-Miganeh C, Ciron PE, Godfroy A, Cambon-Bonavita M-A. Influence of DNA extraction method, 16S rRNA targeted hypervariable regions, and sample origin on microbial diversity detected by 454 pyrosequencing in marine chemosynthetic ecosystems. Appl Environ Microbiol. 2014;80(15):4626–39.
Davis BC, Brown C, Gupta S, Calarco J, Liguori K, Milligan E, et al. Recommendations for the use of metagenomics for routine monitoring of antibiotic resistance in wastewater and impacted aquatic environments. Crit Rev Environ Sci Technol. 2023;53(19):1731–56.
Reese SE, Archer KJ, Therneau TM, Atkinson EJ, Vachon CM, De Andrade M, et al. A new statistic for identifying batch effects in high-throughput genomic data that uses guided principal component analysis. Bioinformatics. 2013;29(22):2877–83.
Borchardt MA, Boehm AB, Salit M, Spencer SK, Wigginton KR, Noble RT. The environmental microbiology minimum information (EMMI) guidelines: qPCR and dPCR quality and reporting for environmental microbiology. Environ Sci Technol. 2021;55(15):10210–23.
Knight R, Vrbanac A, Taylor BC, Aksenov A, Callewaert C, Debelius J, et al. Best practices for analysing microbiomes. Nat Rev Microbiol. 2018;16(7):410–22.
Jones MB, Highlander SK, Anderson EL, Li W, Dayrit M, Klitgord N, et al. Library preparation methodology can influence genomic and functional predictions in human microbiome research. Proc Natl Acad Sci USA. 2015;112(45):14024–9.
Yang Y, Deng Y, Liu L, Yin X, Xu X, Wang D, et al. Establishing reference material for the quest towards standardization in environmental microbial metagenomic studies. Water Res. 2023;245:120641.
Arumugam M, Raes J, Pelletier E, Le Paslier D, Yamada T, Mende DR, et al. Enterotypes of the human gut microbiome. Nature. 2011;473(7346):174–80.
Harrison JG, John Calder W, Shuman B, Alex BC. The quest for absolute abundance: the use of internal standards for DNA-based community ecology. Mol Ecol Resour. 2021;21(1):30–43.
Props R, Kerckhof F-M, Rubbens P, De Vrieze J, Sanabria EH, Waegeman W, et al. Absolute quantification of microbial taxon abundances. ISME J. 2017;11(2):584–7.
Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome datasets are compositional: and this is not optional. Front Microbiol. 2017;8:2224.
Weiss S, Xu ZZ, Peddada S, Amir A, Bittinger K, Gonzalez A, et al. Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome. 2017;5:1–18.
Hawinkel S, Mattiello F, Bijnens L, Thas O. A broken promise: microbiome differential abundance methods do not control the false discovery rate. Brief Bioinform. 2019;20(1):210–21.
Li H. Microbiome, metagenomics, and high-dimensional compositional data analysis. Annu Rev Stat Appl. 2015;2:73–94.
Daims H, Ramsing NB, Schleifer K-H, Wagner M. Cultivation-independent, semiautomatic determination of absolute bacterial cell numbers in environmental samples by fluorescence in situ hybridization. Appl Environ Microbiol. 2001;67(12):5810–8.
APHA. 9215 Heterotrophic plate count. In: Clesceri LS, Greenberg AE, Eaton AD, editors. Standard Methods for the Examination of Water and Wastewater. Washington, DC: American Public Health Association; 1998.
APHA. 9221 C. Estimation of bacterial density. In: Clesceri LS, Greenberg AE, Eaton AD, editors. Standard Methods for the Examination of Water and Wastewater. Washington, DC: American Public Health Association; 1998.
Gronewold AD, Wolpert RL. Modeling the relationship between most probable number (MPN) and colony-forming unit (CFU) estimates of fecal coliform concentration. Water Res. 2008;42(13):3327–34.
Van Nevel S, Koetzsch S, Proctor CR, Besmer MD, Prest EI, Vrouwenvelder JS, et al. Flow cytometric bacterial cell counts challenge conventional heterotrophic plate counts for routine microbiological drinking water monitoring. Water Res. 2017;113:191–206.
APHA. 9216 Direct total microbial count. In: Clesceri LS, Greenberg AE, Eaton AD, editors. Standard Methods for the Examination of Water and Wastewater. Washington, DC: American Public Health Association; 1998.
Lisle JT, Hamilton MA, Willse AR, McFeters GA. Comparison of fluorescence microscopy and solid-phase cytometry methods for counting bacteria in water. Appl Environ Microbiol. 2004;70(9):5343–8.
Hobbie JE, Daley RJ, Jasper S. Use of nuclepore filters for counting bacteria by fluorescence microscopy. Appl Environ Microbiol. 1977;33(5):1225–8.
Kepner RL Jr, Pratt JR. Use of fluorochromes for direct enumeration of total bacteria in environmental samples: past and present. Microbiol Rev. 1994;58(4):603–15.
Absher M. Hemocytometer counting. In: Tissue culture. Academic Press; 1973. p. 395–7.
Thunyaporn R, Doh I, Lee DW. Multi-volume hemacytometer. Sci Rep. 2021;11(1):14106.
Hammes F, Berney M, Wang Y, Vital M, Köster O, Egli T. Flow-cytometric total bacterial cell counts as a descriptive microbiological parameter for drinking water treatment processes. Water Res. 2008;42(1–2):269–77.
Prest E, El-Chakhtoura J, Hammes F, Saikaly P, van Loosdrecht MC, Vrouwenvelder JS. Combining flow cytometry and 16S rRNA gene pyrosequencing: a promising approach for drinking water monitoring and characterization. Water Res. 2014;63:179–89.
Berney M, Vital M, Hülshoff I, Weilenmann H-U, Egli T, Hammes F. Rapid, cultivation-independent assessment of microbial viability in drinking water. Water Res. 2008;42(14):4010–8.
Amann RI, Binder BJ, Olson RJ, Chisholm SW, Devereux R, Stahl D. Combination of 16S rRNA-targeted oligonucleotide probes with flow cytometry for analyzing mixed microbial populations. Appl Environ Microbiol. 1990;56(6):1919–25.
Daims H, Lücker S, Wagner M. Daime, a novel image analysis program for microbial ecology and biofilm research. Environ Microbiol. 2006;8(2):200–13.
Pernthaler A, Pernthaler J, Amann R. Fluorescence in situ hybridization and catalyzed reporter deposition for the identification of marine bacteria. Appl Environ Microbiol. 2002;68(6):3094–101.
Woebken D, Fuchs BM, Kuypers MM, Amann R. Potential interactions of particle-associated anammox bacteria with bacterial and archaeal partners in the Namibian upwelling system. Appl Environ Microbiol. 2007;73(14):4648–57.
Wallner G, Erhart R, Amann R. Flow cytometric analysis of activated sludge with rRNA-targeted probes. Appl Environ Microbiol. 1995;61(5):1859–66.
Amann RI, Ludwig W, Schleifer K-H. Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol Rev. 1995;59(1):143–69.
Solera R, Romero L, Sales D. Determination of the microbial population in thermophilic anaerobic reactor: comparative analysis by different counting methods. Anaerobe. 2001;7(2):79–86.
Mei R, Kim J, Wilson FP, Bocher BT, Liu W-T. Coupling growth kinetics modeling with machine learning reveals microbial immigration impacts and identifies key environmental parameters in a biological wastewater treatment process. Microbiome. 2019;7(1):65.
Vignola M, Werner D, Hammes F, King LC, Davenport RJ. Flow-cytometric quantification of microbial cells on sand from water biofilters. Water Res. 2018;143:66–76.
Contijoch EJ, Britton GJ, Yang C, Mogno I, Li Z, Ng R, et al. Gut microbiota density influences host physiology and is shaped by host and microbial factors. Elife. 2019;8:e40553.
Myers JA, Curtis BS, Curtis WR. Improving accuracy of cell and chromophore concentration measurements using optical density. BMC Biophys. 2013;6:1–16.
Beal J, Farny NG, Haddock-Angelli T, Selvarajah V, Baldwin GS, Buckley-Taylor R, et al. Robust estimation of bacterial cell count from optical density. Commun Biol. 2020;3(1):512.
Zhang Z, Qu Y, Li S, Feng K, Wang S, Cai W, et al. Soil bacterial quantification approaches coupling with relative abundances reflecting the changes of taxa. Sci Rep. 2017;7(1):4837.
Vance ED, Brookes PC, Jenkinson DS. An extraction method for measuring soil microbial biomass C. Soil Biol Biochem. 1987;19(6):703–7.
Jenkinson DS, Powlson DS. The effects of biocidal treatments on metabolism in soil—I. Fumigation with chloroform. Soil Biol Biochem. 1976;8(3):167–77.
Brookes P, Landman A, Pruden G, Jenkinson D. Chloroform fumigation and the release of soil nitrogen: a rapid direct extraction method to measure microbial biomass nitrogen in soil. Soil Biol Biochem. 1985;17(6):837–42.
Frostegård A, Bååth E. The use of phospholipid fatty acid analysis to estimate bacterial and fungal biomass in soil. Biol Fertil Soils. 1996;22:59–65.
Hammes F, Goldschmidt F, Vital M, Wang Y, Egli T. Measurement and interpretation of microbial adenosine tri-phosphate (ATP) in aquatic environments. Water Res. 2010;44(13):3915–23.
Velten S, Hammes F, Boller M, Egli T. Rapid and direct estimation of active biomass on granular activated carbon through adenosine tri-phosphate (ATP) determination. Water Res. 2007;41(9):1973–83.
Mao X, Yin X, Yang Y, Che Y, Xu X, Deng Y, et al. Standardization in global environmental antibiotic resistance genes (ARGs) surveillance. Crit Rev Environ Sci Technol. 2024;54(22):1633–50.
Hindson BJ, Ness KD, Masquelier DA, Belgrader P, Heredia NJ, Makarewicz AJ, et al. High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal Chem. 2011;83(22):8604–10.
Cao Y, Raith MR, Griffith JF. Droplet digital PCR for simultaneous quantification of general and human-associated fecal indicators for water quality assessment. Water Res. 2015;70:337–49.
Ding J, Xu X, Deng Y, Zheng X, Zhang T. Comparison of RT-ddPCR and RT-qPCR platforms for SARS-CoV-2 detection: implications for future outbreaks of infectious diseases. Environ Int. 2024;183:108438.
Pabinger S, Rödiger S, Kriegner A, Vierlinger K, Weinhäusel A. A survey of tools for the analysis of quantitative PCR (qPCR) data. Biomol Detect Quantif. 2014;1(1):23–33.
Pfaffl MW. Quantification strategies in real-time PCR. A–Z of Quantitative PCR. 2004;1:89–113.
Li F, Liu J, Maldonado-Gómez MX, Frese SA, Gänzle MG, Walter J. Highly accurate and sensitive absolute quantification of bacterial strains in human fecal samples. Microbiome. 2024;12(1):168.
Walker AW, Martin JC, Scott P, Parkhill J, Flint HJ, Scott KP. 16S rRNA gene-based profiling of the human infant gut microbiota is strongly influenced by sample processing and PCR primer choice. Microbiome. 2015;3(1):1–11.
Suzuki MT, Giovannoni SJ. Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Appl Environ Microbiol. 1996;62(2):625–30.
Soergel DA, Dey N, Knight R, Brenner SE. Selection of primers for optimal taxonomic classification of environmental 16S rRNA gene sequences. ISME J. 2012;6(7):1440–4.
Dutton CM, Christine P, Sommer SS. General method for amplifying regions of very high G+C content. Nucleic Acids Res. 1993;21(12):2953–4.
Sims D, Sudbery I, Ilott NE, Heger A, Ponting CP. Sequencing depth and coverage: key considerations in genomic analyses. Nat Rev Genet. 2014;15(2):121–32.
Zemb O, Achard CS, Hamelin J, De Almeida ML, Gabinaud B, Cauquil L, et al. Absolute quantitation of microbes using 16S rRNA gene metabarcoding: a rapid normalization of relative abundances by quantitative PCR targeting a 16S rRNA gene spike-in standard. MicrobiologyOpen. 2020;9(3):e977.
Satinsky BM, Gifford SM, Crump BC, Moran MA. Use of internal standards for quantitative metatranscriptome and metagenome analysis. Methods in Enzymology. Vol. 531. Academic Press. 2013. p. 237–50.
Stämmler F, Gläsner J, Hiergeist A, Holler E, Weber D, Oefner PJ, et al. Adjusting microbiome profiles for differences in microbial load by spike-in bacteria. Microbiome. 2016;4:1–13.
Crossette E, Gumm J, Langenfeld K, Raskin L, Duhaime M, Wigginton K. Metagenomic quantification of genes with internal standards. mBio. 2021;12(1):e03173. https://doi.org/10.1128/mbio.03173-20.
Lin Y, Gifford S, Ducklow H, Schofield O, Cassar N. Towards quantitative microbiome community profiling using internal standards. Appl Environ Microbiol. 2019;85(5):e02634-e2718.
Venkataraman A, Parlov M, Hu P, Schnell D, Wei X, Tiesman JP. Spike-in genomic DNA for validating performance of metagenomics workflows. Biotechniques. 2018;65(6):315–21.
Tourlousse DM, Yoshiike S, Ohashi A, Matsukura S, Noda N, Sekiguchi Y. Synthetic spike-in standards for high-throughput 16S rRNA gene amplicon sequencing. Nucleic Acids Res. 2017;45(4):e23.
Wang S, Wu Q, Han Y, Du R, Wang X, Nie Y, et al. Gradient internal standard method for absolute quantification of microbial amplicon sequencing data. mSystems. 2021;6(1):e00964. https://doi.org/10.1128/msystems.00964-20.
Jiang S-Q, Yu Y-N, Gao R-W, Wang H, Zhang J, Li R, et al. High-throughput absolute quantification sequencing reveals the effect of different fertilizer applications on bacterial community in a tomato cultivated coastal saline soil. Sci Total Environ. 2019;687:601–9.
Tatusova T, Ciufo S, Federhen S, Fedorov B, McVeigh R, O’Neill K, et al. Update on RefSeq microbial genomes resources. Nucleic Acids Res. 2015;43(D1):D599–605.
Hardwick SA, Chen WY, Wong T, Kanakamedala BS, Deveson IW, Ongley SE, et al. Synthetic microbe communities provide internal reference standards for metagenome sequencing and analysis. Nat Commun. 2018;9(1):3096.
Camacho-Sanchez M. A new spike-in-based method for quantitative metabarcoding of soil fungi and bacteria. Int Microbiol. 2024;27(3):719–30.
Ji Y, Huotari T, Roslin T, Schmidt NM, Wang J, Yu DW, et al. SPIKEPIPE: a metagenomic pipeline for the accurate quantification of eukaryotic species occurrences and intraspecific abundance change using DNA barcodes or mitogenomes. Mol Ecol Resour. 2020;20(1):256–67.
Browne PD, Nielsen TK, Kot W, Aggerholm A, Gilbert MTP, Puetz L, et al. GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms. GigaScience. 2020;9(2):giaa008.
Bonnet R, Suau A, Doré J, Gibson GR, Collins MD. Differences in rDNA libraries of faecal bacteria derived from 10-and 25-cycle PCRs. Int J Syst Evol Microbiol. 2002;52(3):757–63.
Louca S, Doebeli M, Parfrey LW. Correcting for 16S rRNA gene copy numbers in microbiome surveys remains an unsolved problem. Microbiome. 2018;6:1–12.
Abubucker S, Segata N, Goll J, Schubert AM, Izard J, Cantarel BL, et al. Metabolic reconstruction for metagenomic data and its application to the human microbiome. PLoS Comput Biol. 2012;8(6):e1002358.
Maghini DG, Moss EL, Vance SE, Bhatt AS. Improved high-molecular-weight DNA extraction, nanopore sequencing and metagenomic assembly from the human gut microbiome. Nat Protoc. 2021;16(1):458–71.
Hungate BA, Mau RL, Schwartz E, Caporaso JG, Dijkstra P, van Gestel N, et al. Quantitative microbial ecology through stable isotope probing. Appl Environ Microbiol. 2015;81(21):7570–81.
Radajewski S, Ineson P, Parekh NR, Murrell JC. Stable-isotope probing as a tool in microbial ecology. Nature. 2000;403(6770):646–9.
Haas CN. Quantitative microbial risk assessment and molecular biology: paths to integration. Environ Sci Technol. 2020;54(14):8539–46.
U.S. Environmental Protection Agency. Determination of trace elements in water and wastes by inductively coupled plasma-mass spectrometry. Revision 5.4. U.S. EPA; 2015. https://www.epa.gov/sites/default/files/2015-06/documents/epa-200.8.pdf.
U.S. Environmental Protection Agency. Pharmaceuticals and personal care products in water, soil, sediment, and biosolids by HPLC/MS/MS. U.S. EPA; 2007. https://www.epa.gov/sites/default/files/2015-10/documents/method_1694_2007.pdf.
World Health Organization (WHO). Guidelines for safe recreational water environments: coastal and fresh waters. Vol. 1. World Health Organization; 2003.
World Health Organization (WHO). WHO bacterial priority pathogens list, 2024. Geneva: World Health Organization; 2024.
McClure R, Farris Y, Danczak R, Nelson W, Song H-S, Kessell A, et al. Interaction networks are driven by community-responsive phenotypes in a chitin-degrading consortium of soil microbes. mSystems. 2022;7(5):e00372-e422.
Huet S, Romdhane S, Breuil M-C, Bru D, Mounier A, Spor A, et al. Experimental community coalescence sheds light on microbial interactions in soil and restores impaired functions. Microbiome. 2023;11(1):42.
Jing J, Garbeva P, Raaijmakers JM, Medema MH. Strategies for tailoring functional microbial synthetic communities. ISME J. 2024;18:049.
Acknowledgements
C. Wang, Y. Deng, and L. Li thank The University of Hong Kong for financial support of their research assistant professorships. X. Xu, D. Wang, and L. Liu thank The University of Hong Kong for postdoctoral fellowships. Y. Yang and X. Shi thank The University of Hong Kong for postgraduate studentships.
Funding
This work was financially supported by the Hong Kong Theme-based Research Scheme (T21-705/20-N), Hong Kong GRF (17212124), and the Seed Fund for Basic Research for New Staff from The University of Hong Kong.
Author information
Contributions
C. Wang performed the literature review and prepared the manuscript. T. Zhang supervised the study and revised the manuscript. Y. Yang, X. Xu, D. Wang, X. Shi, L. Liu, Y. Deng, and L. Li contributed to data interpretation and revision of the manuscript. All authors reviewed and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
The manuscript does not report data collected from humans or animals.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, C., Yang, Y., Xu, X. et al. The quest for environmental analytical microbiology: absolute quantitative microbiome using cellular internal standards. Microbiome 13, 26 (2025). https://doi.org/10.1186/s40168-024-02009-2
DOI: https://doi.org/10.1186/s40168-024-02009-2