  • Brief Communication

T cell fate and clonality inference from single-cell transcriptomes

Abstract

We developed TraCeR, a computational method to reconstruct full-length, paired T cell receptor (TCR) sequences from T lymphocyte single-cell RNA sequence data. TraCeR links T cell specificity with functional response by revealing clonal relationships between cells alongside their transcriptional profiles. We found that T cell clonotypes in a mouse Salmonella infection model span early activated CD4+ T cells as well as mature effector and memory cells.

Figure 1: Length distributions of reconstructed TCR sequences.
Figure 2: Clonal CD4+ T cell expansion during Salmonella infection.
Figure 3: Distribution of expanded clonotypes throughout the TH1 response to Salmonella infection.

Accession codes

Primary accessions

ArrayExpress

Acknowledgements

We thank V. Svensson, T. Hagai, J. Henriksson and other members of the Teichmann laboratory along with G. Lythe for helpful discussions. We thank the Wellcome Trust Sanger Institute Sequencing Facility for performing Illumina sequencing and the Wellcome Trust Sanger Institute Research Support Facility for care of the mice used in these studies. This work was supported by the European Research Council (grant ThSWITCH, number 260507, to S.A.T.) and the Lister Institute for Preventative Medicine (S.A.T.).

Author information

Contributions

M.J.T.S. conceived the project, designed the computational method, wrote the software, designed PCR sequencing primers, analyzed data, generated figures and wrote the manuscript. T.L. and S.C. designed and performed the Salmonella experiments. T.L. performed cell collection and purification, generated scRNA-seq libraries, performed gene expression analyses, analyzed data, generated figures and wrote the manuscript. V.P. performed PCR-based TCR-sequencing experiments. A.O.S. designed the cell-sorting strategy, performed the sorting and generated figures. S.A.T. and G.D. supervised work and wrote the manuscript.

Corresponding author

Correspondence to Sarah A Teichmann.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Method for reconstructing TCR sequences from single-cell RNA-seq data.

(a) Overview of data-processing steps for TCR sequence reconstruction. Single-cell RNA-sequencing was performed on individual T lymphocytes to produce a pool of paired-end sequencing reads for each cell. These reads were used to quantify gene expression within each cell. In addition, sequencing reads derived from TCR mRNA were extracted and assembled into long contiguous TCR sequences. TCR contigs were filtered and analysed with IgBlast to determine the gene segments used and the junctional nucleotides. (b) Example of a combinatorial recombinome entry used as an alignment reference for extraction of TCR-derived reads. Each TCR locus is represented by a FASTA file containing one entry for every possible combination of V and J genes for that locus. Each V–J combination contains the sequence of the appropriate constant gene along with stretches of N nucleotides to represent the V leader and variable junctional regions.
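
The combinatorial recombinome described in (b) can be illustrated with a minimal Python sketch. It is not the TraCeR implementation: the function name, the header format, the toy sequences and the numbers of N nucleotides (leader_n, junction_n) are all assumptions chosen for illustration.

```python
from itertools import product

def build_recombinome(locus, v_genes, j_genes, constant_seq,
                      leader_n=20, junction_n=7):
    """Return FASTA text with one entry per V-J combination for one locus.

    leader_n and junction_n are hypothetical lengths for the N stretches
    standing in for the V leader and the variable junctional region.
    """
    entries = []
    for (v_name, v_seq), (j_name, j_seq) in product(v_genes.items(),
                                                    j_genes.items()):
        header = f">{locus}_{v_name}_{j_name}"
        # N-masked leader, V gene, N-masked junction, J gene, constant gene
        sequence = ("N" * leader_n + v_seq + "N" * junction_n
                    + j_seq + constant_seq)
        entries.append(f"{header}\n{sequence}")
    return "\n".join(entries) + "\n"

# Toy gene segments (not real IMGT sequences)
v_genes = {"TRAV1": "ACGTAC", "TRAV2": "GGCATT"}
j_genes = {"TRAJ1": "TTGACA", "TRAJ2": "CCGTAA", "TRAJ3": "AGGTCC"}
fasta = build_recombinome("TRA", v_genes, j_genes, "GATTACA",
                          leader_n=4, junction_n=3)
print(fasta)
```

With two V genes and three J genes the sketch emits six entries, one per V-J pair, which is the property that lets reads from any recombinant align to some reference entry.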

Supplementary Figure 2 FACS strategy for cells used in this study.

(a) Uninfected control and day 14 cells; (b) day 49 cells. In both cases, cells were sorted to be CD4+ TCRβ+ NK1.1− CD44hi CD62Llo. Additionally, cells from the mouse at day 49 were sorted to be CD127hi.

Supplementary Figure 3 Validation of RNA-seq TCR reconstruction.

(a) Schematic illustrating approach for targeted PCR amplification and sequencing of recombined TCR genes. (b) Numbers of concordant and discordant events from comparison between RNA-seq and PCR. Concordant events include 39 occasions where no sequence was detected by either method for a particular locus. (c) Discordancy between PCR and RNA-seq TCR sequence due to sequencing error. The TCR identifiers above were found in the same cell by RNA-seq and PCR. They differ solely by two G residues within the long homopolymeric G tract within the junctional region. (d) Expression levels of concordant and discordant recombinant sequences as determined by RNA-seq (upper) or by targeted PCR (lower). Expression levels of TCR sequences were calculated as transcripts per million (TPM) from RNA-seq data or as numbers of reads from PCR data. P values were calculated using the Mann-Whitney U test since the data are not normally distributed. (e) Number of cells with zero, one, two or three recombinants for each TCR locus from combined RNA-seq and PCR results, shown either for all recombinants (‘All’) or for productive recombinants only (‘Prod’).
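
The comparison in (d) relies on the Mann-Whitney U test. As a rough sketch of how such a test works, the Python below computes the U statistic from average ranks and a two-sided P value from the normal approximation with tie correction. The legend does not state which software was used, so this is only an illustration, and the expression values are invented.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected,
    no continuity correction). Returns (U, p_value)."""
    n1, n2 = len(x), len(y)
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    tie_term = 0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, grp) in zip(ranks, combined) if grp == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2))

# Invented TPM values for illustration only
concordant_tpm = [850.0, 1200.0, 430.0, 2100.0, 990.0]
discordant_tpm = [12.0, 95.0, 40.0, 7.0]
u_stat, p_value = mann_whitney_u(concordant_tpm, discordant_tpm)
print(u_stat, p_value)
```

A rank-based test is a natural fit here because it makes no assumption that expression levels follow a normal distribution. For small samples an exact P value would be preferable to the normal approximation used in this sketch.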

Supplementary Figure 4 Sensitivity analysis of RNA-seq reconstruction.

All single-cell datasets from day 14 mouse 1 were randomly subsampled three independent times to contain decreasing total read numbers followed by TCR reconstruction. Points representing each TCR sequence found in the full datasets are plotted according to their expression levels and the minimum total read depth required for detection in at least (a) three, (b) two or (c) one out of three subsamples. For clarity, points are jittered about the y-axis.
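
The subsampling procedure can be sketched in Python as follows. The detection step here is a deliberately simple stand-in (a TCR counts as detected when at least min_reads of its reads are sampled) and does not reproduce the actual reconstruction pipeline. The read labels, depths and threshold are assumptions.

```python
import random
from collections import Counter

def detect_tcrs(reads, min_reads=2):
    """Toy stand-in for TCR reconstruction: a TCR is 'detected' when at
    least min_reads of the sampled reads carry its label."""
    counts = Counter(r for r in reads if r is not None)
    return {tcr for tcr, c in counts.items() if c >= min_reads}

def min_depth_for_detection(reads, depths, n_subsamples=3, seed=0):
    """For each TCR found in the full dataset, return the smallest tested
    depth at which it is detected in at least k of n_subsamples draws."""
    rng = random.Random(seed)
    full_set = detect_tcrs(reads)
    hits = {tcr: {} for tcr in full_set}
    for depth in depths:
        found = [detect_tcrs(rng.sample(reads, depth))
                 for _ in range(n_subsamples)]
        for tcr in full_set:
            hits[tcr][depth] = sum(tcr in f for f in found)
    return {
        tcr: {k: min((d for d in depths if hits[tcr][d] >= k), default=None)
              for k in range(1, n_subsamples + 1)}
        for tcr in full_set
    }

# Hypothetical cell: 1,000 reads, None marks reads not derived from a TCR
reads = ["TRA_1"] * 50 + ["TRB_1"] * 5 + [None] * 945
depths = [1000, 500, 250, 100, 50]
result = min_depth_for_detection(reads, depths)
for tcr, by_k in sorted(result.items()):
    print(tcr, by_k)
```

In this toy setting the highly expressed sequence remains detectable at lower depths than the weakly expressed one, which is the relationship between expression level and required read depth that the figure plots.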

Supplementary Figure 5 Performance of TCR reconstruction with single-end reads and/or 50-bp reads.

Length reconstructions and sensitivity analyses as per Fig. 1 and Supplementary Fig. 4, respectively.

Supplementary Figure 6 Clonotype network graph from uninfected mouse.

Each node in the graph represents an individual splenic CD4+ T lymphocyte. Identifiers within the nodes indicate the reconstructed TCR sequences that were detected for each cell. Dark coloured identifiers are productive, light coloured are non-productive. The lack of edges between nodes in this graph indicates that no nodes share TCR sequences.

Supplementary Figure 7 Clonotype network graph from day 14, mouse 1.

Each node in the graph represents an individual splenic CD4+ T lymphocyte. Identifiers within the nodes indicate the reconstructed TCR sequences that were detected for each cell. Dark coloured identifiers are productive, light coloured are non-productive. Red edges between the nodes indicate shared TCRα sequences whilst blue edges indicate shared TCRβ sequences. Edge thickness is proportional to the number of shared sequences. For clarity, the nodes without edges are not displayed.
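
The structure of these clonotype network graphs can be sketched in a few lines of Python: cells are nodes, an edge is drawn for each locus at which two cells share a sequence, and the edge weight is the number of shared sequences. The cell names and TCR identifiers below are hypothetical, and this is not the graph-drawing code used to produce the figures.

```python
from itertools import combinations

EDGE_COLOURS = {"A": "red", "B": "blue"}  # TCRα edges red, TCRβ edges blue

def build_clonotype_edges(cells):
    """cells maps cell id -> {"A": set of TCRα ids, "B": set of TCRβ ids}.
    Returns (cell1, cell2, locus, n_shared) for every pair sharing a TCR."""
    edges = []
    for c1, c2 in combinations(sorted(cells), 2):
        for locus in ("A", "B"):
            shared = cells[c1][locus] & cells[c2][locus]
            if shared:
                edges.append((c1, c2, locus, len(shared)))
    return edges

# Hypothetical cells and identifiers
cells = {
    "cell1": {"A": {"TRA_x"}, "B": {"TRB_y"}},
    "cell2": {"A": {"TRA_x"}, "B": {"TRB_y", "TRB_z"}},
    "cell3": {"A": {"TRA_q"}, "B": {"TRB_z"}},
    "cell4": {"A": {"TRA_r"}, "B": {"TRB_s"}},
}
edges = build_clonotype_edges(cells)
connected = {c for edge in edges for c in edge[:2]}
for c1, c2, locus, n_shared in edges:
    print(c1, c2, EDGE_COLOURS[locus], n_shared)
```

Here cell4 shares nothing with any other cell, so it would be omitted from the displayed graph, while cell2 and cell3 are linked only through a β chain.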

Supplementary Figure 8 Clonotype network graph from day 14, mouse 2.

Each node in the graph represents an individual splenic CD4+ T lymphocyte. Identifiers within the nodes indicate the reconstructed TCR sequences that were detected for each cell. Dark coloured identifiers are productive, light coloured are non-productive. Red edges between the nodes indicate shared TCRα sequences whilst blue edges indicate shared TCRβ sequences. Edge thickness is proportional to the number of shared sequences. For clarity, the nodes without edges are not displayed.

Supplementary Figure 9 Clonotype network graph from day 49 mouse.

Each node in the graph represents an individual splenic CD4+ T lymphocyte. Identifiers within the nodes indicate the reconstructed TCR sequences that were detected for each cell. Dark coloured identifiers are productive, light coloured are non-productive. Red edges between the nodes indicate shared TCRα sequences whilst blue edges indicate shared TCRβ sequences. Edge thickness is proportional to the number of shared sequences. For clarity, the nodes without edges are not displayed.

Supplementary Figure 10 Clonotype network graph from day 14, mouse 1 showing examples of cells with the same TCRβ but different TCRα sequences.

Nodes filled with gray share at least one TCRβ sequence with the other members of the clonotype network but do not share alpha chains.

Supplementary Figure 11 Distribution of shared TCR sequences within the Fluidigm C1 integrated fluidics circuit (IFC) and 96-well plate.

Transcripts per million (TPM) expression values of the two most highly-shared TCR sequences are shown within the C1 IFC capture sites, harvest sites and the resulting 96-well plate that contained the associated single cells.
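
For readers unfamiliar with the unit, TPM normalises read counts by transcript length and then scales so that the values for all transcripts in a cell sum to one million. The Python below sketches this standard definition. Quantification tools typically use effective rather than raw transcript lengths, so the values reported in the paper need not match this simplified calculation, and the counts and lengths shown are invented.

```python
def tpm(read_counts, lengths):
    """Standard TPM: per-transcript read rate (reads / length) divided by
    the sum of all rates, scaled to one million."""
    rates = {t: read_counts[t] / lengths[t] for t in read_counts}
    total = sum(rates.values())
    return {t: rate / total * 1e6 for t, rate in rates.items()}

# Invented counts and lengths (in nucleotides)
counts = {"TCR_1": 100, "gene_2": 100}
lengths = {"TCR_1": 1000, "gene_2": 2000}
values = tpm(counts, lengths)
print(values)
```

Two transcripts with equal read counts receive different TPM values when their lengths differ, because the shorter transcript yields fewer reads per molecule.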

Supplementary Figure 12 Clonotype distribution in gene-expression space.

All clonotypes from day 14 mouse 2 are shown as purple points on top of all other cells within the gene expression space.

Supplementary Figure 13 Clonotype distribution in gene-expression space.

All clonotypes from day 14 mouse 2 are shown as purple points on top of all other cells within the gene expression space.

Supplementary Figure 14 Clonotype distribution in gene-expression space.

All clonotypes from the day 49 mouse are shown as turquoise points on top of all other cells within the gene expression space.

Supplementary Figure 15 Comparison of methods for calculating TCR productivity.

TCR productivity was assessed (a) from full-length sequences generated from IMGT reference sequences or (b) the assembled contigs generated from sequencing reads. See Online Methods for more detail.
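
The exact productivity criteria are given in the Online Methods and are not reproduced here. As a rough illustration of what such a check involves, the Python sketch below applies a simplified and assumed criterion: a coding sequence is treated as productive when its length is a multiple of three and it contains no in-frame stop codon. The function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def is_productive(coding_seq):
    """Toy productivity check on a sequence assumed to start in frame.

    Assumed criterion (not taken from the paper): length divisible by
    three and no stop codon in the reading frame.
    """
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return not any(codon in STOP_CODONS for codon in codons)

print(is_productive("ATGGCCTGT"))  # in frame, no stop codon
print(is_productive("ATGTAAGCC"))  # contains an in-frame stop codon
print(is_productive("ATGGCCTG"))   # length not a multiple of three
```

Applying a check of this kind to a full-length sequence rebuilt from reference gene segments, as in (a), or directly to an assembled contig, as in (b), can give different answers when the contig is incomplete or contains assembly errors.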

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–15, Supplementary Table 1 and Supplementary Notes 1–4 (PDF 3438 kb)

Supplementary Table 2

TCR sequences reconstructed from single-cell RNA-sequencing data. (XLSX 82 kb)

Supplementary Table 3

Comparison between RNA-seq reconstruction and PCR-based detection of TCR sequences. (XLSX 60 kb)

Supplementary Table 4

Antibodies used in FACS sorting. (XLSX 39 kb)

Supplementary Table 5

PCR primers used for TCR sequencing. (XLSX 35 kb)

Supplementary Software

Tracer-master Software (ZIP 12223 kb)

Cite this article

Stubbington, M., Lönnberg, T., Proserpio, V. et al. T cell fate and clonality inference from single-cell transcriptomes. Nat Methods 13, 329–332 (2016). https://doi.org/10.1038/nmeth.3800

  • DOI: https://doi.org/10.1038/nmeth.3800
