Considerations for designing mock calibration controls

mock communities, experimental design, calibration

Some considerations for designing mock communities to use as calibration controls in community sequencing experiments.

Michael R. McLaren (North Carolina State University) https://callahanlab.cvm.ncsu.edu/
2021-02-06

Status: Partial draft.

This post summarizes various choices and important considerations when designing mock communities to serve as calibration controls: samples that are used to estimate the bias associated with the taxa they contain (the control taxa) so that this bias can be corrected in measurements of the focal experimental samples. The recommendations that follow are based on observations in McLaren, Willis, and Callahan (2019), our experience in an unpublished experiment we conducted to compare bias between in vitro and in vivo communities, and personal communications I’ve had with experimental microbiologists.
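As a rough sketch of what estimating and correcting bias means here, assume the multiplicative model of McLaren, Willis, and Callahan (2019), in which the observed proportions equal the actual proportions multiplied by taxon-specific efficiencies and then renormalized. The function names and all numbers below are hypothetical and chosen only for illustration.

```python
import math

def normalize(x):
    """Scale a dict of non-negative values so that they sum to 1."""
    total = sum(x.values())
    return {k: v / total for k, v in x.items()}

def estimate_bias(observed, actual):
    """Estimate relative efficiencies from a single control sample.

    Assumes observed proportions are proportional to actual * efficiency.
    Efficiencies are only identifiable up to a constant factor, so they are
    scaled to have a geometric mean of 1. All proportions must be positive.
    """
    log_ratio = {t: math.log(observed[t] / actual[t]) for t in actual}
    center = sum(log_ratio.values()) / len(log_ratio)
    return {t: math.exp(v - center) for t, v in log_ratio.items()}

def calibrate(observed, bias):
    """Divide observed proportions by the estimated efficiencies, renormalize."""
    return normalize({t: observed[t] / bias[t] for t in observed})

# Hypothetical control: an even three-taxon mock and its measured proportions.
actual_mock = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
observed_mock = normalize({"A": 0.6, "B": 0.3, "C": 0.1})
bias = estimate_bias(observed_mock, actual_mock)

# Hypothetical target sample, simulated with the same bias and then calibrated.
true_target = {"A": 0.1, "B": 0.2, "C": 0.7}
observed_target = normalize({t: true_target[t] * bias[t] for t in true_target})
calibrated = calibrate(observed_target, bias)
```

In this idealized example the calibrated proportions recover the true target composition exactly, because the target was simulated with the same bias as the control. The rest of this post is about the ways that real controls and targets can depart from that assumption.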

Constructing mocks that resemble the target samples

A primary concern is that the bias in the mock controls could differ significantly from that in the target samples. To minimize this risk, one would like the controls to resemble the target samples (biologically, physically, and chemically) as closely as possible. Alternatively, having multiple mock controls in a variety of conditions can give a sense of the robustness of the estimated bias, and thus the reliability of calibration, under varying conditions. There are tradeoffs between representativeness and experimental effort, and controls can still be useful without following all or even most of these suggestions; you’d just want to treat the calibrated microbiome measurements more skeptically.

Cell vs. DNA controls

Cell or DNA mixtures can be used as controls. Cellular controls are ideal because they can go through the exact same measurement protocol as the real samples, allowing a measurement of the full protocol bias, whereas bias measured from DNA controls will not include bias due to DNA extraction. But DNA controls are (probably) better than nothing, as they still allow estimating bias due to PCR and to variation in 16S/ITS copy number. My understanding is that it is much easier to make well-quantified DNA mixtures than cell mixtures for at least some taxa, so I can imagine in some cases choosing to use DNA controls over cell controls.
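To make the difference concrete, the sketch below assumes (following the multiplicative model in McLaren, Willis, and Callahan 2019) that the total bias is the product of the biases of the individual steps. The per-step efficiency values are invented for illustration and are not measurements.

```python
import math

def geometric_center(eff):
    """Rescale efficiencies so that their geometric mean is 1."""
    log_mean = sum(math.log(v) for v in eff.values()) / len(eff)
    return {t: v / math.exp(log_mean) for t, v in eff.items()}

# Hypothetical relative efficiencies for each step of the protocol.
extraction = {"A": 4.0, "B": 1.0, "C": 0.25}
pcr = {"A": 1.0, "B": 2.0, "C": 0.5}
copy_number = {"A": 2.0, "B": 1.0, "C": 0.5}
taxa = list(extraction)

# A cell control passes through every step, so it sees the full product.
bias_cell_control = geometric_center(
    {t: extraction[t] * pcr[t] * copy_number[t] for t in taxa}
)

# A DNA control skips extraction, so it only sees PCR and copy-number bias.
bias_dna_control = geometric_center({t: pcr[t] * copy_number[t] for t in taxa})

# The part of the bias that a DNA control cannot estimate.
missed = {t: bias_cell_control[t] / bias_dna_control[t] for t in taxa}
```

Under these assumptions, `missed` equals the extraction efficiencies, which is the component that calibrating with a DNA control would leave uncorrected.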

Sample matrix

Any chemical or physical material that differs between controls and targets could influence bias through interactions with extraction and/or PCR. For example, the fecal or plant matter in gut and plant microbiome experiments might affect bias, and so one might choose to mix the mocks with fecal or plant matter.

Storage and freeze/thaw cycles

Preservative chemicals and temperature (and especially temperature changes) have the potential to affect bias (especially extraction bias). Ideally, the mocks and target samples would undergo the same treatment.

Composition and total concentration

The core assumption of our model of bias is that it is independent of the underlying (relative or absolute) abundances of taxa in a sample. However, this assumption has not been extensively tested, and it is plausible that bias will be affected by large changes in total biomass/cell concentration or by the abundances of specific taxa. For this reason, I currently suggest aiming to construct cellular controls with concentrations spanning the range of expected concentrations in the target samples (or in the middle of this range, if that is not possible). This could be achieved by constructing a mock at a high concentration and then creating one or more dilutions. Similarly, it is useful to have each taxon appear in a range of relative abundances (say, spanning 0.01 to 0.3), to verify that bias is approximately independent of composition and to gain confidence in calibration over the range of compositions seen in real data.
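One way to plan such a set of mocks is sketched below. It is only an illustration: the function names, the stock concentration, the tenfold dilution factor, and the abundance levels are all assumptions. Rotating a fixed set of abundance levels across the taxa is one simple design among many, and the top level of 0.5 is used only so that the five example levels sum to 1.

```python
def dilution_series(stock_conc, dilution_factor, n_steps):
    """Concentrations of the stock and of n_steps successive dilutions."""
    return [stock_conc / dilution_factor**i for i in range(n_steps + 1)]

def rotated_compositions(taxa, levels):
    """Build one mock per cyclic shift of `levels`.

    Each taxon takes each abundance level exactly once across the mocks.
    `levels` are relative abundances, one per taxon, that must sum to 1.
    """
    if len(taxa) != len(levels):
        raise ValueError("need exactly one abundance level per taxon")
    if abs(sum(levels) - 1.0) > 1e-9:
        raise ValueError("levels must sum to 1")
    n = len(taxa)
    return [
        {taxa[i]: levels[(i + shift) % n] for i in range(n)}
        for shift in range(n)
    ]

# Hypothetical stock at 1e9 cells/mL with three tenfold dilutions.
concentrations = dilution_series(1e9, 10, 3)

# Hypothetical five-taxon design with abundances from 0.01 to 0.5.
taxa = ["A", "B", "C", "D", "E"]
levels = [0.01, 0.04, 0.15, 0.30, 0.50]
mocks = rotated_compositions(taxa, levels)
```

Here `concentrations` covers three orders of magnitude and `mocks` contains five compositions. Whether the lowest levels can be constructed accurately is a separate question.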

When determining which mock compositions to construct, it may be important to consider tradeoffs between construction accuracy and variation in abundances (see the Quantification section below). For example, it may be more difficult to accurately construct a mock with 1% of a taxon than with 10% of that taxon.

Experimental batches

Subtle variation in experimental conditions can in principle affect bias. Thus, if the sequencing experiment is done in multiple batches of extraction and/or sequencing, then it would be best to have mock controls in each batch, to maintain the ability to capture any batch effects.
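With controls in every batch, the bias can be estimated separately per batch and compared. The sketch below assumes the multiplicative bias model of McLaren, Willis, and Callahan (2019) and uses a simple average of centered log efficiencies within each batch. The function names, batch labels, and proportions are hypothetical.

```python
import math
from collections import defaultdict

def log_bias(observed, actual):
    """Centered log efficiencies estimated from one control sample."""
    log_ratio = {t: math.log(observed[t] / actual[t]) for t in actual}
    center = sum(log_ratio.values()) / len(log_ratio)
    return {t: v - center for t, v in log_ratio.items()}

def bias_by_batch(controls):
    """Average the centered log efficiencies of the controls in each batch.

    `controls` is a list of (batch, observed, actual) tuples.
    """
    grouped = defaultdict(list)
    for batch, observed, actual in controls:
        grouped[batch].append(log_bias(observed, actual))
    return {
        batch: {
            t: sum(e[t] for e in estimates) / len(estimates)
            for t in estimates[0]
        }
        for batch, estimates in grouped.items()
    }

def max_batch_shift(batch_bias, a, b):
    """Largest absolute difference in log efficiency between two batches."""
    return max(abs(batch_bias[a][t] - batch_bias[b][t]) for t in batch_bias[a])

# Hypothetical even mock measured twice in batch1 and once in batch2.
even = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
controls = [
    ("batch1", {"A": 0.60, "B": 0.30, "C": 0.10}, even),
    ("batch1", {"A": 0.58, "B": 0.31, "C": 0.11}, even),
    ("batch2", {"A": 0.40, "B": 0.40, "C": 0.20}, even),
]
batch_bias = bias_by_batch(controls)
shift = max_batch_shift(batch_bias, "batch1", "batch2")
```

A large `shift` relative to the variation among replicate controls within a batch would suggest a batch effect, in which case each batch could be calibrated with its own bias estimate.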

Quantification

Here the key concerns are quantifying the mocks in useful biological units and avoiding large (unquantified) error in the true mock compositions. Ideally, controls have accurately quantified relative abundances in the desired units (e.g., cell concentration), along with an estimate of the precision or uncertainty in the true relative abundances.
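As an illustration of attaching uncertainty to the true composition, the sketch below propagates replicate concentration measurements into intervals on the relative abundances by Monte Carlo simulation. It assumes independent, approximately log-normal measurement error for each taxon and that the measurements are of each taxon's concentration in the final mixture. The function names and data are hypothetical, and this is not a method taken from the paper.

```python
import math
import random
import statistics

def relative_abundances(conc):
    """Convert a dict of concentrations to proportions that sum to 1."""
    total = sum(conc.values())
    return {t: c / total for t, c in conc.items()}

def simulate_composition(replicates, n_draws=2000, seed=1):
    """Return (lower, point, upper) relative abundance for each taxon.

    `replicates` maps taxon -> list of replicate concentration measurements
    (at least two per taxon). Error is modeled on the log scale, using the
    standard error of the mean log concentration.
    """
    rng = random.Random(seed)
    logs = {t: [math.log(x) for x in v] for t, v in replicates.items()}
    mean = {t: statistics.mean(v) for t, v in logs.items()}
    sem = {t: statistics.stdev(v) / math.sqrt(len(v)) for t, v in logs.items()}

    # Point estimate from the geometric mean concentration of each taxon.
    point = relative_abundances({t: math.exp(m) for t, m in mean.items()})

    draws = {t: [] for t in replicates}
    for _ in range(n_draws):
        sample = {t: math.exp(rng.gauss(mean[t], sem[t])) for t in replicates}
        for t, p in relative_abundances(sample).items():
            draws[t].append(p)

    summary = {}
    for t, values in draws.items():
        values.sort()
        lower = values[int(0.025 * n_draws)]
        upper = values[int(0.975 * n_draws) - 1]
        summary[t] = (lower, point[t], upper)
    return summary

# Hypothetical triplicate concentration measurements (cells/mL) per taxon.
replicates = {
    "A": [1.0e8, 1.2e8, 0.9e8],
    "B": [5.0e7, 4.5e7, 5.5e7],
    "C": [1.0e7, 1.3e7, 0.8e7],
}
summary = simulate_composition(replicates)
```

Each entry of `summary` gives an approximate 95% interval around the point estimate for one taxon. Intervals like these could then be carried forward into the bias estimate rather than treating the nominal composition as exact.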

Possible quantification methods

Cells:

DNA:

References

McLaren, Michael R., Amy D. Willis, and Benjamin J. Callahan. 2019. “Consistent and correctable bias in metagenomic sequencing experiments.” eLife 8: e46923. https://doi.org/10.7554/eLife.46923.

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/mikemc/qmm, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

McLaren (2021, Feb. 6). Quantitative Microbiome Measurement: Considerations for designing mock calibration controls. Retrieved from https://microbiomemeasurement.org/posts/mock-design/

BibTeX citation

@misc{mclaren2021considerations,
  author = {McLaren, Michael R.},
  title = {Quantitative Microbiome Measurement: Considerations for designing mock calibration controls},
  url = {https://microbiomemeasurement.org/posts/mock-design/},
  year = {2021}
}