Journal of Bioscience and Bioengineering, Vol.124, No.3, 359-364, 2017
Estimation of the influence of sequencing errors and distribution of random sequence tags on quantitative sequencing
To simultaneously sequence and quantify target DNA, quantitative sequencing (qSeq) employs stochastic labeling of target DNA molecules with random-sequence tags (RSTs). This recently developed approach allows parallel quantification of hundreds of microorganisms in natural habitats in a single sequencing run. However, no study has addressed the extent to which sequencing errors affect quantification or how many sequence reads are needed for quantification. Here, we addressed these issues using numerical simulations and experimental data from second-generation sequencing of various RSTs. We found that the heterogeneous distribution of observed RSTs affected the number of sequence reads required to quantify target genes, whereas the effect of sequencing errors was smaller than that of the RST distribution. Because of the heterogeneous RST distribution, 15-fold more sequence reads than the number of observed RSTs should be obtained to retrieve almost all RSTs needed for quantification; in that case, the quantification error is estimated to be within 5%. © 2017, The Society for Biotechnology, Japan. All rights reserved.
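The read-depth question raised in the abstract can be illustrated with a small numerical sketch. This is not the authors' simulation. It is a minimal toy model that assumes reads are drawn independently from a fixed pool of RSTs, with either equal abundances or a log-normal spread of abundances. The function names, the log-normal choice, the value of `sigma`, and the pool size of 1000 are all hypothetical. The sketch reports what fraction of distinct RSTs is recovered at a given fold of reads per RST.

```python
import random


def fraction_rsts_recovered(weights, fold, seed=0):
    """Draw fold * len(weights) reads and return the fraction of distinct RSTs seen.

    weights: relative abundance of each RST in the sequenced pool (assumed model).
    fold:    number of reads drawn per RST in the pool.
    """
    rng = random.Random(seed)
    n_rsts = len(weights)
    # Each read samples one RST with probability proportional to its weight.
    reads = rng.choices(range(n_rsts), weights=weights, k=fold * n_rsts)
    return len(set(reads)) / n_rsts


def lognormal_weights(n_rsts, sigma, seed=0):
    """Hypothetical heterogeneous abundances: one log-normal draw per RST."""
    rng = random.Random(seed)
    return [rng.lognormvariate(0.0, sigma) for _ in range(n_rsts)]


if __name__ == "__main__":
    n = 1000  # assumed number of distinct RSTs in the pool
    uniform = [1.0] * n
    skewed = lognormal_weights(n, sigma=1.5)
    for fold in (1, 5, 15):
        u = fraction_rsts_recovered(uniform, fold)
        s = fraction_rsts_recovered(skewed, fold)
        print(f"fold={fold:>2}  uniform={u:.3f}  heterogeneous={s:.3f}")
```

In this toy model an uneven RST distribution needs more reads than an even one to recover the same fraction of tags, which is the qualitative effect the abstract describes. The specific numbers it prints depend on the assumed abundance model and should not be read as reproducing the reported 15-fold figure.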