Elsevier

Analytica Chimica Acta

Volume 852, 10 December 2014, Pages 274-283

A real-time decoding sequencing based on dual mononucleotide addition for cyclic synthesis

https://doi.org/10.1016/j.aca.2014.09.009

Highlights

  • Templates are determined without directly measuring the base sequence in this method.

  • Templates are sequenced with the incorporation of AG/CT, AC/GT or AT/CG.

  • Templates will be sequentially decoded by two sets of encodings.

  • This method uses fewer cycles to obtain a longer read length.

  • This method can be applied to differentiate nucleic acid sequences.

Abstract

We propose a real-time decoding sequencing strategy in which a template is determined not by directly measuring its base sequence but by decoding two sets of encodings obtained from two parallel sequencing runs. The strategy relies on adding a mixture of two different nucleotides, A + G, C + T, A + C, G + T, A + T or C + G (abbreviated as AG, CT, AC, GT, AT or CG), into the reaction each time. When a template is cyclically interrogated twice, each time with a different kind of dual mononucleotide addition (AG/CT, AC/GT or AT/CG), two sets of encodings are obtained. These two sets of encodings allow the bases to be decoded sequentially, from first to last, in a deterministic manner. The strategy uses fewer cycles to obtain a longer read length than the traditional real-time sequencing strategy [1]. A partial rnpB gene sequence was used to verify the applicability of the decoding strategy via pyrosequencing. The results indicated that the sequence could be reconstructed by decoding the two sets of encodings. Moreover, streptococcal strains could be differentiated by comparing the signal intensity in each cycle and the encoding size of each template. The strategy is therefore likely to be applicable to differentiating nucleic acid sequences, because the encoding size and the signal intensity in each cycle vary with sequence length and base composition. Furthermore, it has potential as an alternative to conventional sequencing systems.

Introduction

Primer-directed polymerase extension can incorporate thousands of bases, which indicates that sequencing-by-synthesis (SBS) technologies have great potential in read length. Existing SBS-based high-throughput sequencing methodologies are classified as single nucleotide addition [2], [3], [4], [5] or four nucleotides addition [6], [7], [8], [9], [10]. The former adds only one kind of nucleotide to a reaction and determines the number of incorporated nucleotides; the latter adds specially modified monomers and determines the type of incorporated nucleotide. In high-throughput DNA sequencing, the length of one read is an important indicator for evaluating sequencing strategies. For SBS, the read length depends on the number of reaction steps as well as on the type of incorporated nucleotides. A large number of steps per base read leads to significant inefficiencies [11]. In SBS with four nucleotides addition, the read length and the cycle efficiency (Ceff) are related by $(C_{\mathrm{eff}})^{\text{read length}} = 0.5$ [12]. When Ceff is 90%, the read length is about 7 bp, whereas a read length greater than 100 bp requires a Ceff of more than 99%. Therefore, any additional step (such as cleavage or incorporation of nucleotides) is likely to affect Ceff and, eventually, the read length. In addition, the use of modified nucleotides gives rise to much shorter read lengths because of asynchrony [13]. That is probably why existing commercial sequencing instruments that use natural nucleotides achieve longer read lengths than those that use modified nucleotides.
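
The relation between cycle efficiency and read length quoted above can be checked numerically. The following is a minimal Python sketch, not part of the original work; the function name `read_length` and the `remaining` parameter (the 0.5 on the right-hand side of the relation) are illustrative assumptions.

```python
import math


def read_length(ceff: float, remaining: float = 0.5) -> float:
    """Solve ceff ** n == remaining for n, the read length in bases.

    ceff      -- per-cycle efficiency as a fraction, e.g. 0.99 for 99%
    remaining -- fraction on the right-hand side of the relation (0.5 in the text)
    """
    if not 0.0 < ceff < 1.0:
        raise ValueError("ceff must lie strictly between 0 and 1")
    if not 0.0 < remaining < 1.0:
        raise ValueError("remaining must lie strictly between 0 and 1")
    return math.log(remaining) / math.log(ceff)


if __name__ == "__main__":
    for ceff in (0.90, 0.99, 0.995):
        print(f"Ceff = {ceff:.3f} -> read length of about {read_length(ceff):.1f} bases")
```

Under this relation a Ceff of 90% gives roughly 7 bases and a Ceff of exactly 99% gives roughly 69 bases, so read lengths above 100 bases need a cycle efficiency somewhat higher than 99%.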

Real-time sequencing methods have many advantages, such as retaining the characteristics of natural nucleotides and eliminating post-processing between sequencing cycles. One drawback remains: not every sequencing reaction yields an informative signal, which may affect Ceff and thus the read length. Some strategies may extend the read length by using fewer than four labeled nucleotides. One of them uses six runs, each with a different two-base combination, and reconstructs the sequence from the order of the two bases in each run [14]. Another, called the ordered label strategy, uses all combinations of two labeled nucleotides to determine the order for each set and then reconstructs the full sequence information [15]. More recently, DNA sequencing technologies such as ABI SOLiD sequencing have been developed that do not measure the base sequence directly but measure DNA bases in pairs and in an encoded form, such as two-base encoding. To compare such data with a reference sequence, the encodings must be decoded into a base sequence [16], [17]. This technology has great potential for error tolerance because it can differentiate biological variants from sequencing errors. However, its read length is limited by the use of fluorescently labeled nucleotides and by the large number of steps needed to read one base. In this report, we describe a real-time decoding sequencing strategy that sequences the template twice with dual mononucleotide addition, obtains two sets of encodings, and finally reconstructs the base sequence from them. The strategy relies on adding one of the two-base combinations AG, CT, AC, GT, AT and CG into the reaction each time. Here, AG, CT, AC, GT, AT and CG refer to reagent additions containing mixtures of the two corresponding nucleotide triphosphates.
These two-base combinations form three kinds of dual mononucleotide addition: AG/CT, AC/GT and AT/CG. The strategy interrogates the template sequence with any two of the three. It is based on the principle that the signal intensity of the released detection molecules is proportional to the number of incorporated nucleotides. Because two nucleotides are added to the reaction each time, stronger signals are captured, which makes it possible to detect trace amounts of template. Moreover, the encoding obtained from a single sequencing run is also useful for differentiating nucleic acid sequences, since the signal intensity in each cycle and the encoding size vary with sequence composition and length. The strategy uses fewer cycles to obtain a longer read length than the traditional real-time sequencing strategy. It is compatible with most commercial sequencing instruments and could serve as an alternative to conventional sequencing systems. We hope it will provide researchers with a new technology for analyzing nucleic acid sequences.
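
To make the idea of an encoding concrete, the sketch below simulates a single run of dual mononucleotide addition on a known sequence. It is an illustration under stated assumptions rather than the authors' implementation: the input is taken to be the strand being synthesized (the complement of the template), the two mixtures of a scheme are assumed to be dispensed alternately starting with the first one listed, incorporation is assumed to be ideal, and the names `SCHEMES` and `encode` are hypothetical.

```python
from itertools import cycle

# The three kinds of dual mononucleotide addition named in the text.
SCHEMES = {
    "AG/CT": ("AG", "CT"),
    "AC/GT": ("AC", "GT"),
    "AT/CG": ("AT", "CG"),
}


def encode(synthesized: str, scheme: str) -> list[tuple[str, int]]:
    """Return one (mixture, incorporated count) pair per reagent addition.

    Assumption: each addition extends the strand through every consecutive
    base contained in the dispensed mixture, and the signal equals that count.
    """
    if set(synthesized) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    mixes = cycle(SCHEMES[scheme])
    encoding = []
    pos = 0
    while pos < len(synthesized):
        mix = next(mixes)
        start = pos
        # Extend while the next base belongs to the dispensed mixture.
        while pos < len(synthesized) and synthesized[pos] in mix:
            pos += 1
        encoding.append((mix, pos - start))
    return encoding


if __name__ == "__main__":
    print(encode("AGGCTTAC", "AG/CT"))
    # [('AG', 3), ('CT', 3), ('AG', 1), ('CT', 1)]
    print(encode("AGGCTTAC", "AC/GT"))
    # [('AC', 1), ('GT', 2), ('AC', 1), ('GT', 2), ('AC', 2)]
```

In this toy example the eight-base sequence is covered by four additions in the AG/CT run and five in the AC/GT run. A zero count can occur only for the very first addition, when the sequence starts with a base outside the first mixture.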

Sequences and reagents

All oligonucleotide sequences used are shown in Table 1. Synthetic sequences were purchased from Invitrogen (Shanghai, China), which also performed the 5′ modifications. The SQA PyroMark Gold Q96 Reagents (1 × 96) and solutions including annealing buffer (20 mM Tris–Acetate, 2 mM MgAc2, pH 7.6), denaturation buffer (0.2 M NaOH), wash buffer (10 mM Tris–Acetate, pH 7.6) and binding buffer (10 mM Tris–HCl, 2 M NaCl, 1 mM EDTA, 0.1% Tween 20, pH 7.6) were purchased from

Principle of the decoding sequencing strategy

Generally, when the first sequencing reaction is performed in the presence of two different nucleotides (named X and Y), an encoding XYn is generated, in which ‘X’ and ‘Y’ represent the types of the two incorporated nucleotides and ‘n’ represents the number of incorporated nucleotides (in this study, a sequencing reaction is defined as a cycle). The obtained encoding XYn corresponds to a total of $2^n$ possible sequences (Table 2). Obviously, a single encoding XYn is not sufficient to determine the
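
Why two encodings suffice can be illustrated with a short sketch. A single run only says which mixture each base came from, so a signal of n bases is compatible with 2^n sequences. A second run with a different scheme assigns each base to a second mixture, and any mixture from one scheme shares exactly one nucleotide with any mixture from another (for example, AG and AC share only A). The Python below decodes by that position-wise intersection. It is an illustration under stated assumptions and not necessarily the authors' decoding procedure: `expand` and `decode` are hypothetical names, and encodings are assumed to be lists of (mixture, count) pairs from error-free runs.

```python
def expand(encoding: list[tuple[str, int]]) -> list[str]:
    """Turn [(mixture, n), ...] into one mixture label per base position."""
    return [mix for mix, n in encoding for _ in range(n)]


def decode(encoding_1: list[tuple[str, int]],
           encoding_2: list[tuple[str, int]]) -> str:
    """Rebuild the base sequence from two encodings of the same strand."""
    labels_1 = expand(encoding_1)
    labels_2 = expand(encoding_2)
    if len(labels_1) != len(labels_2):
        raise ValueError("the two encodings cover different numbers of bases")
    bases = []
    for mix_1, mix_2 in zip(labels_1, labels_2):
        # Mixtures from two different schemes share exactly one nucleotide.
        shared = set(mix_1) & set(mix_2)
        if len(shared) != 1:
            raise ValueError(f"{mix_1} and {mix_2} do not identify a single base")
        bases.append(shared.pop())
    return "".join(bases)


if __name__ == "__main__":
    run_ag_ct = [("AG", 3), ("CT", 3), ("AG", 1), ("CT", 1)]
    run_ac_gt = [("AC", 1), ("GT", 2), ("AC", 1), ("GT", 2), ("AC", 2)]
    print(decode(run_ag_ct, run_ac_gt))  # AGGCTTAC
```

Passing two encodings from the same scheme raises an error, because identical mixtures share two nucleotides and leave each base ambiguous.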

Discussion

As described above, this strategy makes it possible to characterize natural nucleic acids and to obtain a longer read length with fewer cycles. It is based on the principle that the amount of released molecules (here, pyrophosphate) is proportional to the number of incorporated nucleotides. In this strategy, the base sequence can be determined by two parallel sequencing runs. Interestingly, for specified nucleic acid fragments (such as a specified PCR product), the encoding information obtained

Conclusion

In summary, we have developed a novel real-time decoding sequencing strategy in this study. By sequentially decoding two sets of encodings obtained from two parallel sequencing runs, the template sequences were reconstructed successfully. We also applied this decoding strategy to differentiate nucleic acid sequences: four species with few differences in the P3 regions were successfully differentiated with only a single sequencing run. This strategy uses fewer steps to obtain longer

Acknowledgements

This work was supported by the Major State Basic Research Development Program of China (2012CB517706); the National Natural Science Foundation of China (60971018, 61227803); and the Fundamental Research Funds for the Central Universities (CXLX13_112).

References (26)

  • E.Y. Chan, Advances in sequencing technology, Mutat. Res. (2005)
  • M. Ronaghi et al., A sequencing method based on real-time pyrophosphate, Science (1998)
  • P.A. Sims et al., Fluorogenic DNA sequencing in PDMS microreactors, Nat. Methods (2011)
  • M. Margulies et al., Genome sequencing in microfabricated high-density picolitre reactors, Nature (2005)
  • J.M. Rothberg et al., An integrated semiconductor device enabling non-optical genome sequencing, Nature (2011)
  • J. Eid et al., Real-time DNA sequencing from single polymerase molecules, Science (2009)
  • D.C. Knapp et al., Fluoride-cleavable, fluorescently labelled reversible terminators: synthesis and use in primer extension, Chem.-Eur. J. (2011)
  • J. Bowers et al., Virtual terminator nucleotides for next-generation DNA sequencing, Nat. Methods (2009)
  • D.R. Bentley et al., Accurate whole human genome sequencing using reversible terminator chemistry, Nature (2008)
  • F. Chen et al., Reconstructed evolutionary adaptive paths give polymerases accepting reversible terminators for sequencing and SNP detection, Proc. Natl. Acad. Sci. U. S. A. (2010)
  • J.Y. Ju et al., Four-color DNA sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators, Proc. Natl. Acad. Sci. U. S. A. (2006)
  • M.L. Metzker, Sequencing technologies – the next generation, Nat. Rev. Genet. (2010)
  • M.L. Metzker et al., Termination of DNA synthesis by novel 3′-modified-deoxyribonucleoside 5′-triphosphates, Nucleic Acids Res. (1994)
  • 1

    The author contributed equally to this work.
