Software Tool Article
Revised

scRepertoire: An R-based toolkit for single-cell immune receptor analysis

[version 2; peer review: 2 approved]
PUBLISHED 15 Jun 2020

This article is included in the RPackage gateway.

Abstract

Single-cell sequencing is an emerging technology in the fields of immunology and oncology that allows researchers to couple RNA quantification with other modalities, such as immune cell receptor profiling, at the level of an individual cell. A number of workflows and software packages have been created to process and analyze single-cell transcriptomic data. These packages allow users to take the vast dimensionality of the data generated in single-cell-based experiments and distill the data into novel insights. Unlike the transcriptomic field, there is a lack of software options for single-cell immune receptor profiling. To enable users to easily combine mRNA and immune profiling, scRepertoire was built to process data derived from 10x Genomics Chromium Immune Profiling for both T-cell receptor (TCR) and immunoglobulin (Ig) enrichment workflows and to subsequently interact with a number of popular R packages for single-cell expression, such as Seurat. The scRepertoire R package and processed data are open source and available on GitHub, together with in-depth tutorials on the capabilities of the package.

Keywords

Single-cell RNA sequencing, immune receptor profiling, R, clonotypic analysis

Amendments from Version 1

After receiving the very generous reviews, the new version of the manuscript and software reflects our attempts at improving and expanding the usability of scRepertoire as a whole. As Drs. Drufuca, Bonnal, and Pagani suggested, we have added scRepertoire interaction with a number of R packages and changed the definition of clonotype for B cells. Per the comments made by Dr. Stuart, we have extensively modified the code and accompanying documentation. In addition, we are in the process of submitting the package to Bioconductor. We have added an author, Gloria Kraus, who assisted in the development of the software after the paper was initially submitted. We want to thank the reviewers, as well as a number of users, for suggestions that immensely improved the scRepertoire package.

See the authors' detailed response to the review by Lorenzo Drufuca, Raoul Jean Pierre Bonnal and Massimiliano Pagani
See the authors' detailed response to the review by Tim Stuart

Introduction

The molecular resolution offered by single-cell sequencing (SCS) technologies has led to extensive investigations in the realms of developmental biology, oncology, and immunology. In terms of the latter field, SCS offers the ability to couple the exploration of transcriptomic heterogeneity in immune cells along a disease process with clonality [1]. A number of methods exist for dimensional reduction of mRNA data, reviewed by Chen et al. [2], that have been implemented into R packages to assist in processing and analysis of SCS experiments. However, the current SCS R packages leave a gap in the processing of V(D)J sequencing data, descriptive statistics, clonal comparisons, and repertoire diversity.

With these limitations in mind, scRepertoire [3] was generated (Figure 1). Built using R, scRepertoire is a toolkit to assist in the analysis of immune profiles for both B and T cells, while interacting with the popular Seurat pipeline [4-6], as well as SingleCellExperiment and monocle3 class expression objects. scRepertoire also includes processed single-cell mRNA and V(D)J sequencing data of 12,911 tumor-infiltrating and peripheral-blood T cells derived from three renal clear cell carcinoma patients, which are characterized below to demonstrate the capabilities of the package.


Figure 1. A general workflow for single-cell data analysis involving scRepertoire.

The analysis starts with the single-cell immune and mRNA sequencing and Cell Ranger-based alignment with the 10x Genomics pipeline. With the TCR or Ig sequencing, scRepertoire can import the filtered overlapping DNA segments, or contigs. The alignments are filtered by cell type of interest and combined using the individual cell barcodes. Clonotypes can be called using the gene sequence of the immune receptor loci, CDR3 nucleotide sequence or CDR3 amino acid sequence. After clonotype assignment, more extensive clonotypic analysis can be performed at the individual sample level or across all samples. General outputs from scRepertoire can be imported into a number of single-cell expression formats to visualize clonotype data overlaid onto the cell clustering. Likewise, metadata from the expression objects can be imported into scRepertoire to analyze clonotypes by assigned clusters.

Methods

Operation

System requirements for running scRepertoire [3] include the installation of R v3.5.1 and the Seurat R package (v3.1.2). Resource use depends on the total number of single cells being processed, with a base estimate of 1 GB of random-access memory and a modern CPU.

Data

The isolation and processing of the 10x-Genomics-based single-cell mRNA and V(D)J Chromium sequencing data for immune cells have previously been described [7,8]. In addition, T cells were identified using expression values for canonical T cell markers (CD3D, CD4, CD8A, CD8B1) and previous clustering. T cells were isolated and reclustered using the integration method from the Seurat R package (v3.1.2) with 20 principal components and a resolution of 0.5 [4]. All code used to generate the figures appearing in the manuscript is available at https://github.com/ncborcherding/scRepertoire.

Implementation

scRepertoire was built and tested in R v3.5.1. The analyses in scRepertoire were inspired by the bulk immune profiling R package tcR (v2.2.4), without deriving any code from it [9]. Clonotypes can be called using the combination of immune loci genes, a more sensitive approach, or the nucleotide/amino acid sequence of the complementarity-determining region 3 (CDR3). In addition to the base functions in R, data processing was performed using the dplyr (v0.8.3) and reshape2 (v1.4.3) R packages. Visualizations are generated using the ggplot2 (v3.2.1) and ggalluvial (v0.11.1) R packages with color palettes derived from the colorRamps (v2.3) and RColorBrewer (v1.1.2) R packages. Diversity metrics are calculated using the vegan (v2.5-6) R package. Visual outputs of functions are returned as ggplot objects built from geometric and statistical layers, allowing users to easily modify presentation.
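The clonotype-calling options described above can be illustrated with a small, language-agnostic sketch. It is written in Python rather than R and is not the scRepertoire implementation: the function name `call_clonotypes`, the record fields (`barcode`, `chain`, `gene`, `cdr3_nt`, `cdr3_aa`), and the gene and nucleotide placeholder values are assumptions made for illustration. Only the amino acid sequences are taken from the Results, and the `NA` marker for a missing chain follows the "NA_CATSATLRVVAEKLFF" label reported there.

```python
from collections import defaultdict

# Which (hypothetical) contig fields make up the clonotype label per calling mode.
CLONE_CALL_FIELDS = {
    "gene": ("gene",),
    "nt": ("cdr3_nt",),
    "aa": ("cdr3_aa",),
    "gene+nt": ("gene", "cdr3_nt"),
}


def call_clonotypes(contigs, clone_call="aa", chains=("TRA", "TRB")):
    """Return {barcode: clonotype label}, combining chains by cell barcode."""
    if clone_call not in CLONE_CALL_FIELDS:
        raise ValueError(f"unknown clone_call: {clone_call!r}")
    fields = CLONE_CALL_FIELDS[clone_call]
    per_cell = defaultdict(dict)
    for contig in contigs:
        label = ";".join(contig[f] for f in fields)
        # Simplification: a later contig for the same chain overwrites an earlier one.
        per_cell[contig["barcode"]][contig["chain"]] = label
    # Chains that were not sequenced for a cell are marked "NA".
    return {
        barcode: "_".join(found.get(chain, "NA") for chain in chains)
        for barcode, found in per_cell.items()
    }


# Gene names and nucleotide strings below are placeholders, not real annotations.
contigs = [
    {"barcode": "cell1", "chain": "TRA", "gene": "TRAV-x",
     "cdr3_nt": "nt-1", "cdr3_aa": "CAVNGGSQGNLIF"},
    {"barcode": "cell1", "chain": "TRB", "gene": "TRBV-x",
     "cdr3_nt": "nt-2", "cdr3_aa": "CSAEREDTDTQYF"},
    {"barcode": "cell2", "chain": "TRB", "gene": "TRBV-y",
     "cdr3_nt": "nt-3", "cdr3_aa": "CATSATLRVVAEKLFF"},
]

print(call_clonotypes(contigs, "aa"))
print(call_clonotypes(contigs, "gene+nt"))
```

Keying the label on the cell barcode is what lets the same clonotype assignment be joined later to per-cell expression metadata.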

Results

Clonal analysis

scRepertoire [3] can be used to call clonotypes using the CDR3 amino acid/nucleotide sequences, by gene usage, or by the combination of CDR3 nucleotide sequences and genes. Using the quantContig function, unique clonotypes can be visualized as raw values or scaled to the size of the library for samples or by type (Figure 2A). The total abundance of clonotypes can be visualized by calling abundanceContig (Figure 2B), as can the relative abundance of clonotypes (Figure 2C). Additionally, the length distribution of CDR3 nucleotide or amino acid sequences for clonotypes can be visualized with lengthContig (Figure 2D). More advanced distribution analysis is also available using the clonesizeDistribution function, based on recent work using Jensen-Shannon divergence.
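As a rough illustration of the quantities behind these plots, the sketch below counts unique clonotypes (raw or scaled by the number of cells) and tabulates clone sizes. It is a Python sketch under stated assumptions, not the quantContig or abundanceContig code: the function names are hypothetical, and scaling is shown as a simple fraction of the sample's cells.

```python
from collections import Counter


def unique_clonotypes(labels, scale=False):
    """Number of distinct clonotypes, optionally divided by the number of cells."""
    if not labels:
        raise ValueError("labels must contain at least one cell")
    n_unique = len(set(labels))
    return n_unique / len(labels) if scale else n_unique


def abundance(labels):
    """Map clone size -> number of clonotypes observed at that size."""
    clone_sizes = Counter(labels)          # clonotype -> number of cells
    return dict(Counter(clone_sizes.values()))


# One clonotype label per cell for a single toy sample.
sample = ["A", "A", "A", "B", "C"]

print(unique_clonotypes(sample))              # raw count of unique clonotypes
print(unique_clonotypes(sample, scale=True))  # fraction of cells that are unique clonotypes
print(abundance(sample))                      # how many clonotypes have each clone size
```

Scaling matters when samples differ in the number of receptors recovered, since a larger library will tend to show more unique clonotypes regardless of its underlying clonality.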


Figure 2. Basic clonotypic analysis functions in scRepertoire.

(A) Scaled unique clonotypes by total number of TCRs sequenced by patient and type of sample (peripheral, P; tumor, T), using the quantContig function. (B) Total abundance of clonotypes by sample and type using the abundanceContig function. (C) Relative abundance of clonotypes using density comparing peripheral blood to tumor samples. (D) CDR3 nucleotide length analysis by sample using the lengthContig function. The bimodal nature of the curve is a function of calling clonotypes for cells with both one and two immune receptors sequenced.

Proportional analysis and diversity measures

More in-depth analysis of clonal architecture is also available. Within the framework of scRepertoire, clonal homeostasis, or the clonal space occupied by clonotypes of specific proportions, can be visualized with the clonalHomeostasis function (Figure 3A). Similarly, clonalProportion can be called to look at the proportion of clonal space occupied by specific clonotypes (Figure 3B). Overlap between the samples can be calculated and visualized with clonalOverlap, using either the overlap coefficient or the Morisita index (Figure 3C). Measures of diversity across samples or groups can be quantified with the clonalDiversity function, demonstrating an overall reduction in clonal diversity in tumor samples (Figure 3D).
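The two overlap measures named above can be written out from their textbook definitions. The following Python sketch assumes each sample is a mapping from clonotype label to cell count; the function names are hypothetical and the code is not taken from clonalOverlap, whose exact variant of the Morisita index is not specified here. The overlap coefficient is the number of shared clonotypes divided by the size of the smaller repertoire, and the Morisita index is 2·Σxᵢyᵢ / ((Dx + Dy)·X·Y), with Dx = Σxᵢ(xᵢ − 1) / (X(X − 1)).

```python
import math


def overlap_coefficient(x, y):
    """Shared clonotypes divided by the number of clonotypes in the smaller sample."""
    smaller = min(len(x), len(y))
    if smaller == 0:
        raise ValueError("both samples need at least one clonotype")
    return len(set(x) & set(y)) / smaller


def morisita_index(x, y):
    """Morisita overlap index for two {clonotype: count} mappings."""
    total_x, total_y = sum(x.values()), sum(y.values())
    if total_x < 2 or total_y < 2:
        raise ValueError("each sample needs at least two cells")
    # Simpson-type dominance terms for each sample.
    d_x = sum(n * (n - 1) for n in x.values()) / (total_x * (total_x - 1))
    d_y = sum(n * (n - 1) for n in y.values()) / (total_y * (total_y - 1))
    if d_x + d_y == 0:
        return math.nan  # undefined when every clonotype is a singleton in both samples
    cross = sum(n * y.get(clone, 0) for clone, n in x.items())
    return 2 * cross / ((d_x + d_y) * total_x * total_y)


peripheral = {"A": 3, "B": 1}
tumor = {"A": 2, "C": 2}

print(overlap_coefficient(peripheral, tumor))
print(morisita_index(peripheral, tumor))
```

The overlap coefficient uses only presence or absence, while the Morisita index weights shared clonotypes by their abundance, so an expanded clone present in both samples raises the latter much more than a shared singleton does.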


Figure 3. Advanced clonal measures between samples.

(A) Clonal homeostatic space representations across all six samples using the gene and CDR3 AA sequence for clonotype calling. (B) Relative proportional space occupied by specific clonotypes across all six samples using the gene and CDR3 AA sequence for clonotype calling. (C) Morisita overlap quantifications for clonotypes across all six samples. (D) Diversity measures based on clonotypes by sample type using Shannon, Inverse Simpson, Chao, and abundance-based coverage estimator (ACE) indices.
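For readers unfamiliar with the diversity indices in panel D, two of them can be computed directly from clonotype counts. This is a minimal Python sketch of the standard definitions (Shannon entropy with the natural logarithm, and inverse Simpson as 1/Σp²), not the vegan or clonalDiversity code; the function names are hypothetical, and the Chao and ACE estimators are omitted.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over clonotype proportions."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must sum to a positive number")
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p^2) over clonotype proportions."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must sum to a positive number")
    return 1 / sum((n / total) ** 2 for n in counts)


even_sample = [2, 2, 2, 2]      # four clonotypes, equally abundant
expanded_sample = [5, 1, 1, 1]  # same number of cells, one expanded clone

print(shannon(even_sample), inverse_simpson(even_sample))
print(shannon(expanded_sample), inverse_simpson(expanded_sample))
```

Both indices fall when a few clonotypes dominate the sample, which is the pattern a clonally expanded population would be expected to show.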

Expression interaction

After the processing and analysis of the TCR repertoire with the base features, the next step is using scRepertoire to interact with the single-cell mRNA data. The expression data for the 12,911 cells built into the package have already been clustered (Figure 4A), with a clear distribution of the clusters into peripheral-blood-predominant versus tumor-predominant (Figure 4B). Using the combineExpression function in scRepertoire, we can look at the clonotypic frequencies of cells that comprise the UMAP-based clusters (Figure 4C). This function also works with the SingleCellExperiment and monocle3 classes of expression objects. In addition to clonal distribution, we can also use highlightClonotypes to set specific sequences of clonotypes to be visualized (Figure 4D), with clonotype 1 referring to the amino acid sequence "CAVNGGSQGNLIF_CSAEREDTDTQYF" and clonotype 2 to the amino acid sequence "NA_CATSATLRVVAEKLFF". Interestingly, clonotype 2 is restricted to a subcluster of the C6 cluster (Figure 4D). After combining the clonotype and expression data, interaction between categories, such as cluster label and clonotype frequency, can be visualized with the alluvialClonotypes function (Figure 4E). This function can also be used to examine the dynamics of single or multiple expanded clonotypes across the categorical variables (Figure 4E). Further, after the attachment of the clonotype information to a single-cell expression object, the expression2List() function allows users to generate analyses based on any categorical variable in the metadata.
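The idea of attaching clonotype frequencies to per-cell metadata can be sketched as a join on cell barcode followed by binning. The Python below is an illustration only, not the combineExpression code: the function name `attach_clonotypes`, the metadata layout, and the bin names and cut-offs are all hypothetical.

```python
from collections import Counter

# Hypothetical upper bounds (inclusive) and names for clonotype frequency groups.
BINS = [(1, "single"), (5, "small"), (20, "medium")]


def attach_clonotypes(cell_meta, clonotype_by_barcode, bins=BINS, top_label="large"):
    """Return a copy of cell_meta with clonotype, frequency and group per barcode."""
    # Frequency is counted over every cell that has a clonotype call.
    freq = Counter(clonotype_by_barcode.values())
    combined = {}
    for barcode, meta in cell_meta.items():
        row = dict(meta)
        clone = clonotype_by_barcode.get(barcode)
        row["clonotype"] = clone
        if clone is None:
            # Cell has expression data but no recovered receptor.
            row["frequency"] = 0
            row["clone_group"] = None
        else:
            n = freq[clone]
            row["frequency"] = n
            row["clone_group"] = next(
                (name for upper, name in bins if n <= upper), top_label
            )
        combined[barcode] = row
    return combined


cell_meta = {
    "c1": {"cluster": "C1"},
    "c2": {"cluster": "C1"},
    "c3": {"cluster": "C6"},
    "c4": {"cluster": "C6"},
}
clonotypes = {"c1": "X", "c2": "X", "c3": "Y"}

for barcode, row in attach_clonotypes(cell_meta, clonotypes).items():
    print(barcode, row)
```

Once the group label sits alongside the cluster label in the metadata, any categorical variable can be cross-tabulated against clonal expansion, which is the kind of relationship an alluvial plot displays.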


Figure 4. Interaction of scRepertoire with the single-cell expression R packages.

(A) UMAP projection from Seurat of the ccRCC T cells (n=12,911) into 12 distinct clusters. (B) UMAP projection with peripheral blood (red) and tumor (blue) populations highlighted and an accompanying relative proportion composition of each cluster, scaled by the total number of peripheral blood and tumor cells, respectively. (C) The combineExpression function places individual cells into groups by the number of clonotypes, which can then be displayed overlaid on the UMAP projection. (D) After combining the clonotype information with the Seurat object, highlightClonotypes can be used to specifically highlight the individual clonotypes of interest using the sequence information. (E) Interaction of clonotypes between multiple categories can be examined using the alluvialClonotypes function.

Conclusions

scRepertoire [3] is an R-based toolkit for the analysis of single-cell immune receptor profiling. The package is able to take the annotated filtered outputs from the 10x Genomics Cell Ranger platform and provide analysis across a number of modalities, including calling clonotypes, clonal space/homeostasis, clonal diversity, and repertoire overlap between samples. Outputs from scRepertoire can be combined with dimensional reduction strategies for single-cell RNA quantifications, allowing users to analyze mRNA and immune profiles together. Visualization functions in scRepertoire have a parameter, exportTable, allowing users to examine the quantifications underlying the generation of the graphs. Under the Creative Commons Attribution 4.0 license, the scRepertoire package is freely available from the GitHub repository and is extensively annotated to assist in implementation and modification.

Data availability

Source data

Zenodo: scRepertoire. https://doi.org/10.5281/zenodo.3856827 [3].

Folder ‘Data’ contains all data required to run the vignettes described in the Results. This is also available on GitHub.

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

Software availability

Source code is available from GitHub: https://github.com/ncborcherding/scRepertoire.

Archived source code at the time of publication: https://doi.org/10.5281/zenodo.3856827 [3].

License: Creative Commons Attribution 4.0 International.

How to cite this article
Borcherding N, Bormann NL and Kraus G. scRepertoire: An R-based toolkit for single-cell immune receptor analysis [version 2; peer review: 2 approved] F1000Research 2020, 9:47 (https://doi.org/10.12688/f1000research.22139.2)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to reviewer statuses
Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.
Version 2
Published 15 Jun 2020 (revised)
Reviewer Report 03 Aug 2020
Raoul Jean Pierre Bonnal, National Institute of Molecular Genetics, Milan, Italy 
Lorenzo Drufuca, School of Medicine and Surgery, Università degli Studi di Milano Bicocca, Milan, Italy 
Massimiliano Pagani, National Institute of Molecular Genetics, Milan, Italy 
Approved
I can confirm that the authors addressed all the ...
How to cite this report
Bonnal RJP, Drufuca L and Pagani M. Reviewer Report For: scRepertoire: An R-based toolkit for single-cell immune receptor analysis [version 2; peer review: 2 approved]. F1000Research 2020, 9:47 (https://doi.org/10.5256/f1000research.27282.r64820)
Reviewer Report 09 Jul 2020
Tim Stuart, New York Genome Center, New York City, NY, USA 
Approved
The authors have now ...
How to cite this report
Stuart T. Reviewer Report For: scRepertoire: An R-based toolkit for single-cell immune receptor analysis [version 2; peer review: 2 approved]. F1000Research 2020, 9:47 (https://doi.org/10.5256/f1000research.27282.r64821)
Version 1
Published 27 Jan 2020
Reviewer Report 09 Mar 2020
Tim Stuart, New York Genome Center, New York City, NY, USA 
Approved with Reservations
This paper presents a new R package for the analysis of T-cell and B-cell clonotypes inferred from V(D)J recombination events. The package is designed for the analysis of data generated using the 10x Genomics V(D)J kit that allows clonotypes and ...
How to cite this report
Stuart T. Reviewer Report For: scRepertoire: An R-based toolkit for single-cell immune receptor analysis [version 2; peer review: 2 approved]. F1000Research 2020, 9:47 (https://doi.org/10.5256/f1000research.24415.r60873)
  • Author Response 15 Jun 2020
    Nicholas Borcherding, Medical Scientist Training Program, University of Iowa, Iowa City, USA
    This paper presents a new R package for the analysis of T-cell and B-cell clonotypes inferred from V(D)J recombination events. The package is designed for the analysis of data generated ...
Reviewer Report 03 Mar 2020
Lorenzo Drufuca, School of Medicine and Surgery, Università degli Studi di Milano Bicocca, Milan, Italy 
Raoul Jean Pierre Bonnal, National Institute of Molecular Genetics, Milan, Italy 
Massimiliano Pagani, National Institute of Molecular Genetics, Milan, Italy 
Approved with Reservations
The paper describes a new package (scRepertoire) of functions for the analysis of clonality in single cell experiments in the R analytical environment.

While it is true, as it is claimed, that 'a gap exists in the ...
How to cite this report
Drufuca L, Bonnal RJP and Pagani M. Reviewer Report For: scRepertoire: An R-based toolkit for single-cell immune receptor analysis [version 2; peer review: 2 approved]. F1000Research 2020, 9:47 (https://doi.org/10.5256/f1000research.24415.r59209)
  • Author Response 15 Jun 2020
    Nicholas Borcherding, Medical Scientist Training Program, University of Iowa, Iowa City, USA
    The paper describes a new package (scRepertoire) of functions for the analysis of clonality in single cell experiments in the R analytical environment.

    While it is true, as it ...