Introduction

There is a need in molecular biology and biomedical research for open-ended, hypothesis-generating research, in order to discover previously unknown molecular mechanisms. Genetic screening provides a powerful approach for identifying genes, pathways and mechanisms involved in a given phenotype or biological process. This is illustrated by the many successes of forward genetics in cell lines1 and in model organisms such as flies2,3, worms4, yeast5, plants6 and fish7, and pioneering work in RNA interference (RNAi) screens8,9.

CRISPR screens exploit the efficiency and flexibility of CRISPR–Cas genome editing10. They have become a popular and productive tool for biological discovery in a broad range of applications11,12. In a typical pooled CRISPR screen (Fig. 1), a CRISPR guide RNA (gRNA) library is introduced in bulk into cells, such that individual cells receive different gRNAs and are perturbed according to the gRNA they receive. These gRNAs are usually delivered by lentiviral transduction and are integrated into the DNA of the target cells, making it possible to efficiently determine the induced perturbations based on the gRNA sequence. The CRISPR–Cas protein is either stably expressed in the cells or ectopically introduced as a plasmid, virus, mRNA or protein. The gene-edited cells are challenged with a selective pressure such as drug treatment, viral infection or cell proliferation, such that the cells compete with each other based on the fitness effect of the engineered genetic perturbations. The gRNAs are then counted in the pool of cells retained after the challenge, usually by high-throughput sequencing, and their representation is compared between different challenges or different time points. In the resulting data, depletion of specific gRNAs identifies genes whose disruption sensitizes cells to the challenge, whereas enrichment identifies genes whose disruption confers a selective advantage.

Fig. 1: Experimental design for CRISPR screening.

CRISPR screens can be described along four dimensions: the biological model in which the screen is conducted; perturbations introduced using CRISPR technology; challenges to which the perturbed cells are exposed; and a read-out that measures the induced molecular or cellular effects. In pooled CRISPR screens, perturbations are introduced in bulk. They are genetically encoded and typically read out by guide RNA (gRNA) sequencing. In arrayed CRISPR screens, different perturbations are introduced separately — for example, in different wells of a 96-well plate. As each reaction compartment is subjected to a defined perturbation, the read-out does not need to include gRNA sequencing.

In contrast to pooled screens, arrayed CRISPR screens maintain physical separation between perturbations throughout the screen (Fig. 1). As each target gene occupies a separate compartment — for example, different wells on a 96-well plate — arrayed screens tend to be more labour-intensive, costly and limited in scale than pooled screens. Their advantage is that the perturbation a cell receives is predefined by the study design and does not need to be measured explicitly. Therefore, arrayed screens are easier to combine with read-outs that do not involve any sequencing, such as imaging, proteomics and metabolomics profiling.

For reasons of scale and scope, pooled screens are primarily used for discovery, whereas arrayed screens are primarily used for validation and follow-up investigation. Nevertheless, recent technological advances make it possible to obtain detailed biological insights as part of discovery-oriented pooled CRISPR screens. The use of sophisticated models such as organoids and whole organisms, flexible perturbations such as gene activation and repression, diverse biological challenges and data-rich read-outs such as single-cell sequencing and imaging are establishing pooled CRISPR screens as a powerful method for functional biology. Such high-content CRISPR screens provide exciting opportunities to perform mechanistic research at scale.

This Primer introduces concepts, practical considerations and applications of CRISPR screens, with a focus on pooled screens and high-content approaches. We describe how a typical CRISPR screen is designed and executed, including the choice and optimization of the model system. We discuss various CRISPR perturbations such as gene knockout, activation and inhibition, and complex read-outs such as single-cell sequencing and imaging. We outline good practices for analysing and interpreting data from CRISPR screens, and review groundbreaking applications of CRISPR screening across a broad range of fields. We also describe how CRISPR screens should be documented to enhance their reproducibility and provide lasting value. Finally, we outline current challenges and future developments in the area of high-content CRISPR screening.

Experimentation

The experimental design of a typical CRISPR screen comprises four main elements (Fig. 1): the biological model; a method for CRISPR-based perturbation; biological challenges that influence the competition among the perturbed cells; and a read-out that connects the observed biological phenotypes to the gRNAs that induced them. The research question typically defines the selection of the right model and the most relevant biological challenges, and recent advances in CRISPR technology and high-throughput profiling provide flexibility for selecting suitable perturbations and read-outs. This section provides an overview of typical CRISPR screens. In addition, a checklist of considerations when starting a CRISPR screen is shown in Box 1.

Conducting a pooled CRISPR screen

Selection of the biological model

The first step for a successful CRISPR screen is to select a model system that captures the relevant biological processes and is amenable to genetic screening. Immortalized cell lines provide an inexpensive and easy-to-handle model for studying biological mechanisms that are adequately represented by simple in vitro cultures. To study more complex and context-dependent biological phenomena, screens can be conducted in primary cells, in tissue explants or in stem-cell-derived cultures including organoids. CRISPR screening is also possible in living animals13, albeit at a smaller scale than is feasible for in vitro screens. For example, screens in mouse models can be performed by editing cells ex vivo and then transplanting them into the organism, or by delivering the gRNAs and the CRISPR–Cas protein into the mice for in vivo editing.

To enhance the efficiency of CRISPR screening, the target cells can be engineered to express the CRISPR–Cas protein. This way, only the gRNAs need to be delivered ectopically during the screen. This separation can also improve safety as no single construct contains both components needed for inducing double-strand breaks. Clones with high and stable Cas9 expression can be preselected14. However, working with just one or a few clones increases the risk of clone-specific artefacts and requires careful validation to ensure that the selected clones are representative and informative for the biological question. For in vivo screens in mice and ex vivo screens in mouse primary cells, transgenic mice can be used that constitutively express Cas9 (refs15,16) or one of its derivatives17. For human primary cells, viral delivery of the gRNAs can be combined with transfection of a Cas-encoding plasmid, a chemically modified mRNA or the CRISPR–Cas protein itself18.

Careful selection and optimization of the screening model are essential to ensure that the results are broadly relevant to the investigated biological phenomena. It is often advisable to use several variants of the same model — for example, cell lines with different genetic backgrounds — to enhance the relevance and interpretability19,20. Such biological replicates are particularly important because the same perturbation may cause different phenotypic consequences depending on the genetic background21. CRISPR screens should be performed using at least three biological replicates to obtain robust and interpretable results, although well-designed and carefully implemented screens can produce reliable data with fewer replicates22.

Perturbation and CRISPR library design

The CRISPR–Cas protein is expressed in the model cells to enable CRISPR-based perturbations. This is done either transiently by introducing the plasmid, mRNA or protein into the cells, or stably by lentiviral transduction or genome engineering, with or without selection or subcloning of individual clones with high CRISPR–Cas protein levels. The efficiency of CRISPR perturbation should be tested and, if necessary, optimized; this can be achieved by delivering gRNAs targeting several test loci by lentiviral transduction23,24,25 and evaluating editing efficiency at the DNA level, for example using targeted DNA sequencing. It is also advisable to confirm successful perturbation at the protein level26 using techniques such as western blotting or flow cytometry.

For the CRISPR screen, a gRNA library is prepared (Fig. 2) and transduced into the cells in bulk. Most screens target protein-coding genes, although CRISPR screens can also be performed for non-coding DNA and gene regulatory regions27,28. CRISPR gRNA libraries may be genome-wide or focused on a smaller set of tens to thousands of genes. Genome-wide screens do not depend on prior knowledge and may reuse existing genome-wide gRNA libraries; however, they require a large number of cells for adequate coverage, which makes them labour-intensive and costly — or even infeasible for rare cell types. Focused screens often provide a useful alternative to genome-wide screens, with the limitation that their scope is restricted to the chosen target genes and that unexpected biological mechanisms are easily missed. It is possible to combine both strategies, by performing a genome-wide screen with modest coverage (which includes all genes but has comparatively low sensitivity for each individual gene) and a targeted screen with high coverage (which focuses on specific candidate genes or gene sets, and thereby achieves higher sensitivity of detection for these genes).

Fig. 2: Preparation of CRISPR gRNA libraries.

The guide RNA (gRNA) library defines which genes are probed in a CRISPR screen. Application-specific libraries are designed using bioinformatic tools (detailed in Table 1), synthesized as oligonucleotide pools, cloned in bulk into the plasmid vector and packaged into a lentivirus for delivery into cells.

Various tools and resources facilitate the selection of gRNAs for the widely used Streptococcus pyogenes-derived Cas9 protein22,29,30,31,32 and alternatives such as AsCas12a and LbCas12a (refs33,34). For both genome-scale and focused CRISPR knockout (CRISPRko) screens, it is advisable to include at least four different gRNAs for each target gene22, although carefully designed and validated gRNA libraries with fewer gRNAs per gene can provide reliable results35. All gRNA libraries should include gRNAs for application-specific positive and negative control genes, which are important to validate the screen. In addition, they should include control gRNAs that target known safe harbour loci or other genomic regions where no specific effects of gene editing are expected, in order to account for the DNA damage response and for non-specific reduction in cell proliferation caused by CRISPRko; this is particularly relevant in aneuploid cancer cells36 or when targeting multiple loci per cell33,37.

In pooled CRISPR screens, the gRNA library is typically cloned into a lentiviral vector, and the cells are transduced at a relatively low multiplicity of infection (MOI), often between 0.3 and 0.5. This is to ensure that few cells receive more than one gRNA simultaneously. As a result, it is not usually necessary to account for potential genetic interactions between different gRNAs in the same cell. Screens with much higher MOIs are used when cell numbers are limited38, when most gRNAs are not expected to have any effect39 or when studying genetic interactions40. As an alternative to combinatorial screens with high MOIs, multiplexed gRNA expression systems provide finer control of gRNA expression. Such screens can be implemented with paired expression cassettes41,42,43,44,45 or by exploiting the ability of Cas12a to process multiple gRNAs33,37,46,47.
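The reasoning behind a low MOI can be illustrated with a small calculation. The sketch below assumes that the number of gRNA integrations per cell follows a Poisson distribution with mean equal to the MOI; this is a common simplification and not a statement made in the text above, and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell carries exactly k gRNA integrations (Poisson model)."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def transduced_fraction(moi: float) -> float:
    """Fraction of all cells that receive at least one gRNA."""
    return 1.0 - poisson_pmf(0, moi)


def multi_grna_fraction(moi: float) -> float:
    """Among transduced cells, the fraction that carries more than one gRNA."""
    transduced = transduced_fraction(moi)
    return (transduced - poisson_pmf(1, moi)) / transduced


# Compare the typical low-MOI range with a deliberately high MOI.
for moi in (0.3, 0.5, 2.0):
    print(
        f"MOI {moi}: {transduced_fraction(moi):.1%} of cells transduced, "
        f"{multi_grna_fraction(moi):.1%} of those carry more than one gRNA"
    )
```

Under this model, lowering the MOI reduces the share of cells with multiple gRNAs at the cost of leaving more cells untransduced, which is the trade-off behind the low-MOI recommendation.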

The goals of the screen influence the advisable coverage of cells per target gene. For positive selection screens, for example to identify perturbations that confer drug resistance, a coverage of 100–200× per target gene is desired; this can be broken down to 4 gRNAs per gene with 25–50× coverage each. By contrast, negative selection screens tend to require a coverage of 500–1,000× per target gene to detect essential genes with high sensitivity. Bioinformatic tools such as CRISPulator48 can be used to simulate the effect of different design decisions on the coverage and statistical power of CRISPR screens.

The number of target genes multiplied by the desired coverage provides a rough estimate for the number of cells that must be infected and maintained during the screen. If this estimate exceeds what is practically feasible, it is typically better to select fewer target genes than to run the screen with low coverage and risk poor reproducibility. Once the scale of the screen has been established, it is advisable to perform a series of pilot experiments for optimization, with the goal of achieving high consistency between biological replicates and minimizing unwanted selective pressures, cell stress or population bottlenecks. Moreover, it is important to ensure that cell culture vessels and media changes support the planned cell numbers and growth rates, particularly for large and logistically challenging screens with cell numbers in the hundreds of millions.
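The coverage arithmetic described above can be sketched as follows. The first two functions follow the text directly (target genes multiplied by coverage, and per-gene coverage split across gRNAs). The third adds an assumption that is not in the text: a Poisson model of transduction to estimate how many cells must be exposed to virus. All function names and the 20,000-gene example are hypothetical.

```python
import math


def transduced_cells_needed(n_genes: int, coverage_per_gene: int) -> int:
    """Rough estimate from the text: target genes multiplied by desired coverage."""
    return n_genes * coverage_per_gene


def coverage_per_grna(coverage_per_gene: float, grnas_per_gene: int = 4) -> float:
    """Split the per-gene coverage evenly across the gRNAs targeting that gene."""
    return coverage_per_gene / grnas_per_gene


def cells_to_expose(n_genes: int, coverage_per_gene: int, moi: float = 0.3) -> int:
    """Cells to expose to virus so that enough of them end up transduced.

    ASSUMPTION: Poisson transduction, so only 1 - exp(-moi) of cells get a gRNA.
    """
    fraction_transduced = 1.0 - math.exp(-moi)
    needed = transduced_cells_needed(n_genes, coverage_per_gene)
    return math.ceil(needed / fraction_transduced)


# Hypothetical negative selection screen: 20,000 genes at 500x coverage per gene.
print(transduced_cells_needed(20_000, 500))
print(cells_to_expose(20_000, 500, moi=0.3))
# Positive selection example from the text: 100x per gene with 4 gRNAs per gene.
print(coverage_per_grna(100))
```

The same cell number has to be maintained at every passage to avoid population bottlenecks, so the estimate applies to culture capacity as well as to the initial transduction.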

Selection and calibration of the challenge

In pooled CRISPR screens, the perturbed cells compete with each other for representation in the final pool of cells. By adding a biological challenge that modulates the selective pressures, such screens can be tailored to many different research questions. For biological processes that are closely linked to cell survival and proliferation, unconstrained in vitro proliferation may be sufficient as a challenge, whereas other research questions often require tailored biological challenges such as drug treatment, viral infection or functional assays.

For a successful screen, it is important to understand the dose–response relationship of the challenge, and to calibrate the selective pressures such that the perturbed cells compete with each other in a meaningful way. For negative selection screens, the selection should be mild — for example, using an effective dose of a drug or virus that kills 25–50% of perturbed cells; by contrast, stronger selection with effective doses that kill above 50% of cells is advisable for positive selection screens49,50.

The timing of the challenge is an important consideration that can influence the results of CRISPR screens. In screens for essential genes, for example, read-outs at early time points tend to enrich for genes whose knockout causes immediate cell death — such as genes involved in transcription and translation — whereas later time points also identify genes that affect cell proliferation and fitness more indirectly51. Inducible Cas9 makes it possible to screen for the early effects of targeting essential genes in genome-wide screens52 and optogenetic modulation of Cas9 activity enables precise spatio-temporal control of genome editing53,54.

Guide RNA frequency as the screening read-out

To assess the effect of different perturbations, gRNA frequencies can be counted and compared between conditions. This either includes the entire cell population (as a read-out of cell survival and proliferation) or is artificially restricted to certain cell populations to investigate specific biological phenomena. For example, cells can be enriched based on surface markers or cellular phenotypes prior to gRNA counting. Importantly, the gRNAs in a CRISPR screen both induce the perturbation and serve as genetic barcodes that denote the perturbations that the cells have been exposed to. It is thus possible to obtain a precise quantitative representation of gRNAs by sequencing gRNA amplicon libraries prepared from each analysed sample. The enrichment or depletion of each gRNA and target gene is determined relative to the representation of gRNAs in the gRNA library and by comparing between experimental conditions.
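As a minimal sketch of this comparison, the code below converts raw gRNA read counts into relative frequencies and computes a log2 fold change for each gRNA against a reference sample (for example, the gRNA library or an untreated condition). The pseudocount and the use of the median to summarize gRNAs per gene are illustrative assumptions, not a method prescribed here; dedicated analysis tools use more elaborate statistics. All names are hypothetical.

```python
import math
from statistics import median


def frequencies(counts: dict, pseudocount: float = 0.5) -> dict:
    """Convert raw read counts to relative frequencies; the pseudocount avoids log(0)."""
    total = sum(counts.values()) + pseudocount * len(counts)
    return {grna: (count + pseudocount) / total for grna, count in counts.items()}


def log2_fold_changes(treated: dict, reference: dict) -> dict:
    """Per-gRNA log2 ratio of frequencies in the treated versus the reference sample."""
    # gRNAs that dropped out of the treated sample are counted as zero reads.
    treated_full = {grna: treated.get(grna, 0) for grna in reference}
    f_treated, f_reference = frequencies(treated_full), frequencies(reference)
    return {grna: math.log2(f_treated[grna] / f_reference[grna]) for grna in reference}


def gene_scores(lfc: dict, grna_to_gene: dict) -> dict:
    """Summarize gRNAs per gene by the median log2 fold change (an assumption)."""
    per_gene = {}
    for grna, value in lfc.items():
        per_gene.setdefault(grna_to_gene[grna], []).append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


# Toy counts: two gRNAs against a hypothetical geneA and two control gRNAs.
reference = {"geneA_g1": 100, "geneA_g2": 100, "ctrl_g1": 100, "ctrl_g2": 100}
treated = {"geneA_g1": 25, "geneA_g2": 25, "ctrl_g1": 175, "ctrl_g2": 175}
grna_to_gene = {"geneA_g1": "geneA", "geneA_g2": "geneA", "ctrl_g1": "ctrl", "ctrl_g2": "ctrl"}
print(gene_scores(log2_fold_changes(treated, reference), grna_to_gene))
```

Negative scores indicate depletion and positive scores indicate enrichment relative to the reference sample.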

gRNAs targeting control genes with known phenotypic effects are useful for validating the screen, quantifying the signal-to-noise ratio55,56,57 and calibrating the selective pressure. It is advisable to perform small test screens that target a handful of positive and negative control genes for different strengths of the challenge and to select the conditions that provide maximum separation of gRNA counts between the positive and negative controls. It can also be useful to perform variations of the same screen with different conditions — for example, different concentrations of a drug — in order to capture the full spectrum of genes that contribute to the biological process of interest58. Whereas many CRISPR screens analyse only a single ‘end point’ that has been optimized for best separation between positive and negative controls59,60, it can be informative to assess multiple time points to capture dynamic changes over the course of the screen.
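One way to compare pilot conditions is to score how well each condition separates positive from negative control gRNAs. The sketch below uses the strictly standardized mean difference (SSMD) as the separation measure; this choice of metric is an assumption made for illustration, as the text does not prescribe one, and the condition names and values are invented toy data.

```python
import math
from statistics import mean, variance


def ssmd(positive: list, negative: list) -> float:
    """Strictly standardized mean difference between two groups of log2 fold changes."""
    return (mean(positive) - mean(negative)) / math.sqrt(
        variance(positive) + variance(negative)
    )


def best_condition(lfc_by_condition: dict) -> str:
    """Return the condition with the largest absolute separation of the controls."""
    return max(
        lfc_by_condition,
        key=lambda name: abs(ssmd(*lfc_by_condition[name])),
    )


# Toy pilot data: (positive-control LFCs, negative-control LFCs) per condition.
pilot = {
    "low_dose": ([-0.2, -0.4, -0.3], [0.0, 0.1, -0.1]),
    "mid_dose": ([-2.0, -2.2, -1.8], [0.0, 0.1, -0.1]),
}
for name, (pos, neg) in pilot.items():
    print(name, round(ssmd(pos, neg), 2))
print("selected:", best_condition(pilot))
```

Each group needs at least two values for the sample variance to be defined, so a handful of control gRNAs per group is the practical minimum for this kind of comparison.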

When interpreting the results of a pooled CRISPR screen, the target genes are often ranked in order to identify the most promising screening hits for follow-up. However, even the top-ranking hits may contain false positives, which can arise from biases in the assay or random fluctuations. Moreover, the precise ranking of hits can be noisy even in successful screens. This is particularly true when the observed differences are small. It is advisable to select multiple hits for experimental validation, for example by more focused pooled or arrayed validation screens or small-scale assays, ideally using complementary models and read-outs. Although it is common practice to cherry-pick interesting screening hits for validation, this approach cannot validate the screen as a whole. For a representative assessment of the screening hits, some hits should be selected for validation solely based on their ranks — for example, the top 20 hits as well as 10 hits each around the 5th, 10th, 25th, 50th and 75th percentiles.
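The rank-based selection described above can be sketched as a short function. It assumes that the input is a list of genes ordered from strongest to weakest hit and that a percentile refers to position in this ranked list counted from the top; both are interpretations for illustration, and the function and gene names are hypothetical.

```python
def select_for_validation(
    ranked_genes: list,
    n_top: int = 20,
    n_per_percentile: int = 10,
    percentiles: tuple = (5, 10, 25, 50, 75),
) -> list:
    """Pick the top hits plus a window of hits centred on each rank percentile."""
    n = len(ranked_genes)
    selected = list(ranked_genes[:n_top])
    seen = set(selected)
    for p in percentiles:
        centre = round(n * p / 100)
        # Centre the window on the percentile and keep it inside the list bounds.
        start = max(0, min(n - n_per_percentile, centre - n_per_percentile // 2))
        for gene in ranked_genes[start:start + n_per_percentile]:
            if gene not in seen:  # avoid duplicates when windows overlap
                selected.append(gene)
                seen.add(gene)
    return selected


# Toy ranking of 1,000 genes, best hit first.
ranked = [f"gene{i:04d}" for i in range(1000)]
picks = select_for_validation(ranked)
print(len(picks), picks[:3], picks[20:23])
```

For short gene lists the windows can overlap with the top hits, in which case fewer genes are returned than the nominal total.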

Alternative perturbations

CRISPR interference and CRISPR activation

Most CRISPR screens to date have used CRISPRko, where DNA double-strand breaks are induced in a gRNA-directed way and target genes are knocked out, typically as the result of frameshift mutations introduced by the cell’s DNA repair machinery. Beyond CRISPRko, which is based on Cas nucleases, the CRISPR–Cas system can also be used for RNA-directed recruitment of other molecular functions to specific loci in the genome. To that end, a nuclease-deactivated Cas protein that cannot cut DNA — for example, Cas9 endonuclease dead, usually abbreviated as dCas9 — is combined with protein domains that implement the desired function (Fig. 3).

Fig. 3: CRISPR-mediated perturbation of cells.

CRISPR technology provides many options to perturb cells. a | Genome editing. Directed by a guide RNA (gRNA), Cas9 nucleases introduce double-strand breaks into the target site; subsequent DNA repair results in compromised gene function (CRISPR knockout (CRISPRko)). CRISPR base editors induce specific mutations, by combining a base modification enzyme, a uracil DNA glycosylase inhibitor (UGI) domain that inhibits base excision repair and a Cas nickase that nicks the non-edited strand of DNA to favour repair with the edited base. A cytosine base editor is shown; adenosine base editors are also available. CRISPR prime editing can introduce new sequence information into the genome; it uses a Cas9 nickase fused to a reverse transcriptase and a prime editing gRNA (pegRNA) corresponding to the target locus, which also provides new genetic information. b | Epigenome editing. Cas9 endonuclease dead (dCas9) can be combined with epigenetic writer and eraser enzymes such as the demethylase Tet1 (not pictured), the methyltransferase DNMT3A or the H3K27 acetyltransferase p300 to induce changes in DNA methylation or histone marks. c | Transcriptional control. CRISPR interference (CRISPRi) uses dCas9 fused to transcriptional repressors such as Krüppel associated box (KRAB), causing repression of genes close to the gRNA target site. CRISPR activation (CRISPRa) uses dCas9 with transcriptional activators such as the VP64 domain and MCP–p65–HSF1 fusion proteins recruited via an MS2 stem–loop sequence. d | RNA modulation. RNA base editors induce specific mutations into RNA molecules using an adenosine deaminase (ADAR) targeted to RNA; it converts adenosine into inosine, which acts as guanine during translation. RNA splicing can be altered at the RNA level by substituting the RNA-binding domain of the RBFOX1 protein with dCas9. Finally, RNA interference (RNAi) can be achieved by a ribonuclease (CasRx) that binds to RNA and cleaves it. 
PAM, protospacer adjacent motif; RT, reverse transcription.

Recruitment of transcriptional repressors enables CRISPR-mediated downregulation of target genes, a technique known as CRISPR interference (CRISPRi); and transcriptional activators enable CRISPR-mediated upregulation, known as CRISPR activation (CRISPRa)60,61,62,63,64,65,66,67. In CRISPRi, dCas9 is fused with a Krüppel associated box (KRAB) domain, which represses gene expression when targeted to promoter regions63,68,69,70,71. As CRISPRi does not induce DNA damage, it is less toxic to cells than CRISPRko55,72. Gene repression may yield different phenotypes to gene knockout as it is less prone to activating compensatory pathways73. A limitation of CRISPRi compared with CRISPRko is the need for continuous expression of the dCas9 protein and gRNA to maintain inactivation, although recent work shows that long-term stable repression can be achieved by extensive epigenetic remodelling at the target sites74. In CRISPRa, fusions of dCas9 and the transcriptional activator VP64 have a modest effect on gene activation63,75. Improvements include the dCas9–SunTag system, which increases gene activation by recruiting many copies of VP64 (ref.66), and the dCas9–VPR system, which combines three activator domains (VP64 and activator domains from the transcription factors p65 and Rta) into a single fusion protein60. Finally, the widely used synergistic activation mediator (SAM) system recruits the two transcription factors HSF1 and p65 to enhance gene activation on top of a dCas9–VP64 fusion61.

Cas proteins other than Cas9 have also been adapted for CRISPRi and CRISPRa. Cas12a (previously known as Cpf1) can cleave a single transcript into multiple gRNAs76, which facilitates the simultaneous targeting of several genes77. dCas12a–VPR and dCas12f–VPR work well as transcriptional activators, whereas dCas12a–KRAB shows modest repression activity compared with dCas9–KRAB (refs77,78). Further, Cas proteins that target RNA have been fused to transcriptional repressor domains, giving rise to post-transcriptional repression with better specificity than conventional RNAi79,80. It is even possible to regulate RNA splicing using CRISPR technology, for example by fusing dCas13d to splicing factors81 or by editing of splice donor and acceptor sites82,83.

CRISPRi and CRISPRa induce epigenetic changes as a side effect of transcriptional repression or activation. In addition, CRISPR technology can be used to directly perturb the epigenome. For example, fusions of dCas9 to epigenetic effectors enable the editing of methylated histone H3K4, H3K9, H3K27, H3K79 and H4K20, acetylated H3K27 and DNA methylation84,85,86,87,88,89,90,91,92,93. The effects of these epigenetic modifications vary widely in the size of the affected target region (footprint), the stability of the induced changes over time and the strength of the effect, which includes the ability to affect the expression of neighbouring genes. CRISPR has also been used to manipulate the three-dimensional structure of the chromatin, namely by inducing chromatin looping between two genomic loci94,95 and by recruiting gRNA-specified genomic loci to nuclear structures such as Cajal bodies or the nuclear periphery96. Epigenome editing is also feasible for RNA and the epitranscriptome; for example, N6-methyladenosine (m6A) modifications can be induced by fusing dCas9 with an RNA methyltransferase complex comprising the methyltransferases METTL3 and METTL14 and the splicing regulator WTAP97,98.

Base editing

CRISPR base editors are fusions of dCas9 with protein domains that chemically modify bases in the DNA; this enables the introduction of genetic changes without inducing double-strand breaks. Base editing has been used to modify disease-associated genetic variants, where it provides better control of the induced changes compared with CRISPRko99. Base editing can also be used to perform single-nucleotide-level mapping of regulatory elements, with higher resolution but usually smaller effects than screens using CRISPRi or CRISPRa100.

CRISPR base editors typically combine a base modification enzyme that targets single-stranded DNA, a uracil DNA glycosylase inhibitor (UGI) domain that inhibits base excision repair and a Cas9 nickase that nicks the non-edited strand of DNA to trigger repair according to the edited base99. Several types of base editors have been developed, changing A to G (adenosine base editors) or C to T (cytosine base editors)101,102. Most cytosine base editors use APOBEC1 or CDA1 as the base modification enzyme, and most adenosine base editors use an evolved version of the Escherichia coli adenosine deaminase TadA. The efficiency of cytosine base editing was reported to be ~50%, with low indel generation (<1%) and little off-target editing (<1%)99,103. Base editing efficiency is lower in post-mitotic cells (~10%), but still more efficient than homology-directed repair in such cells104. Although less established than the systems described above, C to G base editors are also available and can provide high editing efficiency in some instances105.

Genome sequence context affects base editing efficiency and needs to be considered during gRNA design101,102,106. Most base editors can modify their target base within a small window around the gRNA-encoded genomic position, potentially giving rise to bystander edits99,103,107. They can also introduce off-target effects elsewhere in the genome and transcriptome108. Base editing of RNA has been demonstrated using a fusion of dCas13b to the adenosine deaminase ADAR2, which converts adenosine into inosine (translationally equivalent to guanine) with an on-target efficiency of ~45% and a low rate of off-target editing109. This system was subsequently extended to add cytosine deaminase properties110. Owing to constant RNA turnover, RNA editing does not cause permanent changes; therefore, the Cas protein must be expressed throughout the duration of the experiment to maintain the editing effect.

Insertions and deletions

CRISPRko introduces small insertions and deletions that typically comprise only a few base pairs. For well-designed gRNAs this is usually sufficient to knock out the target gene, either by introducing frameshift mutations or by disrupting essential protein domains111. By contrast, large-scale genome engineering such as insertion or deletion of entire genes requires other methods. One strategy is to induce double-strand breaks at both ends of the target locus, such that the intervening region is either deleted or replaced by a co-delivered homology-directed repair template that carries the desired DNA sequence112. However, this method suffers from low efficiency and the need to deliver several matched components to the same cell.

CRISPR prime editing is a promising approach for introducing new DNA sequences into the genome113. This method uses a fusion protein comprising a Cas9 nickase and an engineered reverse transcriptase, together with a prime editing gRNA (pegRNA) that both encodes the target site and acts as a template for the new genetic information to be introduced into the DNA113. Although the initial efficiency of CRISPR prime editing was low, a CRISPRi screen conducted on DNA repair proteins identified DNA mismatch repair as an impediment to prime editing, and the study devised ways to overcome this by transient inhibition of this process or by modification of the pegRNA sequence114. The efficiency of prime editing is further enhanced by engineering the pegRNA for higher stability115, and there is an increasing collection of software tools that facilitate the design of efficient pegRNAs116,117,118.

A promising application of prime editing is the introduction of many genetic variants in a pooled screen, as demonstrated in a recent preprint article119. Although prime editing is generally restricted to short edits of fewer than 20 bp, it is possible to introduce larger genomic deletions by targeting complementary prime editing events on both sides of the target DNA120,121. With a similar approach, it is possible to introduce gene-sized DNA fragments into specific loci, for example placing genes in safe harbour loci or tagging proteins with fluorescent labels122,123. These demonstrations of versatility, together with recent gains in efficiency, suggest that CRISPR prime editing screens are becoming broadly useful for investigating complex genetic changes.

High-content read-outs

Cell sorting and enrichment

Counting gRNAs provides a simple and scalable read-out for studying molecular mechanisms linked to cell survival and proliferation. This technique can be generalized to other biological phenomena by sorting or enriching cells with certain biological characteristics (Fig. 4). For example, a reporter cell line can be constructed for a signalling pathway of interest, expressing a fluorescent protein under the control of a promoter that is stably activated by this pathway124. Reporter cell lines require extensive validation before use in CRISPR screening to ensure that the activity of the reporter correlates with the biological phenotype of interest and that it provides a sufficient signal-to-noise ratio. Some biological signals are weak and difficult to detect, which has led to the development of synthetic biological circuits for signal amplification125.

Fig. 4: CRISPR screening with high-content read-out.

Sequencing-based counting of guide RNAs (gRNAs) is a straightforward and widely used read-out of pooled CRISPR screens, especially for phenotypes that affect cell proliferation and survival. To broaden CRISPR screening to additional cellular phenotypes, several high-content read-outs have been introduced. Cell sorting prior to gRNA sequencing makes it possible to identify gRNAs and genes that affect predefined, sortable cellular phenotypes — such as expression of a fluorescent reporter. Single-cell sequencing of transcriptomes and matched gRNAs can identify regulators of gene expression and transcriptome-linked cell states. Spatial imaging combined with methods to distinguish gRNAs in individual cells makes it possible to identify genes that affect imaging-based cellular phenotypes, for example inducing changes in cell shape.

The enrichment of marker-positive cells is typically performed by fluorescence-activated cell sorting (FACS). One common strategy is to sort cells with high marker expression versus low marker expression (for example, in the top and bottom 1% or 10%, respectively) and to compare gRNA frequencies between these populations. However, processing large numbers of cells using FACS can be time-consuming and costly, especially in the context of genome-wide screens with hundreds of millions of cells. This limitation can be addressed by alternative methods of marker-based cell enrichment based on magnetic beads or microfluidics126,127,128.

Fluorescent dyes can be used to mark specific parts of the cell or to record cell division129 and fluorescently labelled antibodies can detect proteins of interest, without the need for engineering transgenic reporter cells. This approach has been used in screens of T cell tumour engagement130, cancer signalling pathways131, immune cell activation132 and cell differentiation133,134,135. Fluorescence in situ hybridization (FISH) enables quantification of mRNA abundance and can be combined with flow cytometry, which was used in screens for regulators of gene expression136,137. Further, mass cytometry can facilitate the parallel analysis of many antibodies, enabled by combinatorial labelling of gRNAs with heavy metals to detect them using mass spectrometry138.

Single-cell CRISPR sequencing

Single-cell sequencing provides a data-rich read-out for CRISPR screens, which is particularly useful for biological phenotypes that are not easily measured by a single marker gene (Fig. 4). CRISPR screens with a single-cell sequencing read-out simultaneously determine the gRNAs that induce a perturbation as well as the corresponding transcriptome profiles in single cells. We refer to this approach as scCRISPR-seq, including methods such as Perturb-seq139,140, CRISP-seq141, CROP-seq142 and Mosaic-seq143. Transcriptomes provide a particularly high-content read-out for a CRISPR screen, for example allowing scCRISPR-seq screens to determine the type and state of the perturbed cells and to quantify induced changes in gene expression, gene regulatory networks, signalling pathway activity and other properties that can be inferred from the single-cell transcriptomes.

Such scCRISPR-seq screens provide high flexibility for subsequent data analysis; for example, many different virtual screens can be performed on the same scCRISPR-seq data set by assessing gRNA enrichment for different gene signatures, and the data can be re-analysed in light of new biological insights by adding new gene signatures to the bioinformatic analysis, without the need for new experiments. Because scCRISPR-seq screens readily detect differences in cell type among the assayed cells, they are well suited for screens in complex, heterogeneous biological systems such as organoids and primary tissues. The feasibility of in vivo scCRISPR-seq has been demonstrated by two studies focusing on the mouse haematopoietic system and the brain141,144.
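The idea of a virtual screen on an existing scCRISPR-seq data set can be illustrated with a short sketch. It assumes that each cell has already been assigned a single perturbation target and a normalized expression profile; the data layout, the function name and the simple mean-based signature score are assumptions of this example, not the approach of a particular software package.

```python
from statistics import mean

def virtual_screen(cells, signature, control_label="non-targeting"):
    """Score each perturbation by its shift in mean signature expression.

    cells: list of (target, {gene: expression}) tuples, one per cell.
    signature: list of gene names that define the phenotype of interest.
    Returns {target: mean signature score minus control mean}.
    """
    def signature_score(expression):
        return mean(expression.get(gene, 0.0) for gene in signature)

    scores_by_target = {}
    for target, expression in cells:
        scores_by_target.setdefault(target, []).append(signature_score(expression))

    baseline = mean(scores_by_target[control_label])
    return {
        target: mean(scores) - baseline
        for target, scores in scores_by_target.items()
        if target != control_label
    }

# Toy data: four cells, two hypothetical signature genes.
cells = [
    ("non-targeting", {"sig1": 1.0, "sig2": 1.0}),
    ("non-targeting", {"sig1": 3.0, "sig2": 3.0}),
    ("geneA", {"sig1": 0.0, "sig2": 1.0}),
    ("geneB", {"sig1": 2.0, "sig2": 2.0}),
]
print(virtual_screen(cells, ["sig1", "sig2"]))
```

Re-running the same function with a different gene signature corresponds to a new virtual screen on the same data, without any additional experiment.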

A technical challenge with scCRISPR-seq screens is that CRISPR gRNAs are typically expressed by RNA polymerase III, and therefore not polyadenylated and not captured by standard single-cell RNA sequencing (RNA-seq) assays. This issue can be addressed in several ways. First, individual gRNAs can be linked to matched barcodes that are expressed under an RNA polymerase II promoter and are readily detected by single-cell RNA-seq139,140,141,143. A limitation of this approach is lentiviral ‘template switching’ when preparing gRNA libraries in bulk, which can break the association between gRNAs and expressed barcodes145; this can be reduced by decoy transfer plasmids (as shown in a preprint article)146 or — for small screens — eliminated by preparing the lentivirus in an arrayed format. Second, the CROP-seq vector creates a polyadenylated copy of the gRNA, making it directly readable in single-cell RNA-seq profiles142. Third, gRNA capture enables the amplification and sequencing of the gRNAs without the need for a polyadenylation tail147,148.

Although genome-wide screens with a single-cell RNA-seq read-out are conceptually feasible, such screens are currently limited by the high cost of single-cell sequencing. Therefore, scCRISPR-seq screens are typically conducted at a scale of up to a few thousand target genes, for example to validate and better characterize gene sets obtained from genome-wide CRISPR screens, population genomics studies and general biological knowledge.

To reduce sequencing costs, gene panels can be used to assay only genes of specific biological interest148,149. Further, scCRISPR-seq screens can be performed with high MOIs, such that most cells carry several perturbations; this is particularly useful when most gRNAs are not expected to have any effect39. Finally, cost-effective methods for single-cell RNA-seq such as combinatorial indexing150,151 and the related scifi-RNA-seq assay152 will facilitate scCRISPR-seq and arrayed CRISPR screens with millions of single-cell transcriptomes.
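The effect of the MOI on the number of perturbations per cell can be approximated with a Poisson model. This model is a common simplifying assumption (it treats integration events as independent and ignores cell-to-cell variation in transduction efficiency), and the MOI values below are illustrative rather than recommendations.

```python
import math

def poisson_pmf(k, moi):
    """Probability that a cell carries exactly k gRNAs under a Poisson model."""
    return math.exp(-moi) * moi ** k / math.factorial(k)

def fraction_with_at_least(n, moi):
    """Expected fraction of cells carrying n or more gRNAs."""
    return 1.0 - sum(poisson_pmf(k, moi) for k in range(n))

for moi in (0.3, 1.0, 5.0):
    print(
        f"MOI {moi}: "
        f"no gRNA {poisson_pmf(0, moi):.3f}, "
        f"one gRNA {poisson_pmf(1, moi):.3f}, "
        f"two or more {fraction_with_at_least(2, moi):.3f}"
    )
```

Under this approximation, a low MOI keeps multiply perturbed cells rare, whereas a high MOI makes them the majority, which is the regime exploited by screens in which most gRNAs are expected to have no effect.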

The ongoing development of single-cell multi-omics assays153,154 will likely lead to screens with single-cell read-outs of the genome, epigenome, transcriptome, proteome and/or metabolome. Indeed, scCRISPR-seq screens have already been conducted with a single-cell ATAC-seq read-out, providing insights into the complexities of chromatin regulation155,156,157. Screens with combined measurement of the transcriptome and cell surface protein have also been established147 and applied to study the regulation of immune checkpoints and genes mediating immune evasion in cancer treatment158,159.

CRISPR screening with spatial imaging read-outs

Imaging provides an attractive read-out for CRISPR screens that complements the molecular perspective of single-cell sequencing with a focus on cell morphology (Fig. 4). Exploiting recent advances in imaging technology, it is now possible to assay the localization, transport and dynamics of individual molecules, molecular assemblies and organelles inside cells, in addition to characterizing cell shape and behaviour.

CRISPR screens with imaging read-outs can be performed in an arrayed format following established concepts from high-content drug screening160. In addition, imaging-based pooled CRISPR screens have been demonstrated using several approaches. Following a similar concept to FACS-based screens, cells with specific phenotypes can be identified by imaging, isolated and subjected to gRNA sequencing to determine the genetic perturbations. In one method, this was achieved by culturing cells on arrays containing 40,000 individual microrafts, physically isolating microrafts that contain cells with the imaging-based phenotype and harvesting them for gRNA sequencing161. Using this method, 1,078 RNA-binding proteins were screened for potential roles in stress granule formation. In an alternative method, a photoactivatable protein was selectively illuminated in cells with an imaging-based phenotype of interest, and these fluorescently marked cells were enriched by FACS and subjected to gRNA sequencing162,163,164. This method was used to investigate the subcellular localization of the transcription factor EB163 and the regulation of nuclear size164.

A limitation of imaging followed by cell selection and gRNA sequencing is the small number of cellular phenotypes that can be studied in a single experiment. This limitation is addressed by methods that determine the gRNA identity for each cell directly based on the imaging data165,166. One such method uses a microfluidic chip to randomly seed transfected cells into individual trap chambers, where the cells divide and fill each chamber with clones that share the same perturbation and barcode. The induced phenotypes are observed by imaging, and the barcodes are determined by sequential rounds of FISH imaging166. This method was applied to a CRISPRi screen in E. coli, investigating the effect of 235 genes on cell division and the location of the replication fork as measured by live-cell imaging167. A conceptually similar method uses multiplexed error-robust FISH (MERFISH)168 to read the barcodes associated with the perturbations165,169. This method has been applied to screen 54 RNA-binding proteins for their effect on the localization of long non-coding RNAs in mammalian cells169.

Perturbation-specific barcodes can also be determined by in situ sequencing170. The barcodes are reverse transcribed, subjected to rolling circle amplification and profiled using in situ sequencing by synthesis171,172. This approach was used to analyse 952 genes for potential roles in the nuclear translocation of NF-κB and p65 (ref.170). The same study also demonstrated direct in situ sequencing of gRNAs using the CROP-seq vector, which removes the need for perturbation-specific barcodes and may help increase scalability.

Results

The primary results of CRISPR screens are gRNA counts based on amplicon sequencing for pooled screens with cell survival, proliferation and FACS-based read-outs; single-cell sequencing data for scCRISPR-seq screens; and microscopy data for imaging-based screens. Bioinformatic methods have been developed for analysing and interpreting these results; this typically involves the five main steps of data processing, quality control, gene ranking, hit analysis and visual interpretation (Fig. 5). Multiple open-source software tools are available to support the analysis of CRISPR screens; these are listed in Table 1.

Fig. 5: Bioinformatic analysis of CRISPR screening data.

Starting from sequencing data, typical steps in the analysis of a pooled CRISPR screen comprise data processing, quality control, gene ranking, hit analysis and visual interpretation. This workflow is depicted with a description of tasks and an illustration of typical results. gRNA, guide RNA.

Table 1 Software tools for analysing CRISPR screens

Data processing

Raw data such as sequencing reads or imaging data are processed and converted to count matrices, which form the basis of subsequent analyses. For pooled CRISPR screens with gRNA amplicon sequencing as the read-out, the raw sequencing reads are mapped to a reference of gRNA sequences, either as part of an integrated analysis pipeline such as MAGeCK173, CERES174 or CB2 (ref.175), or using standard sequence alignment tools such as Bowtie176 or BWA177. The result is a matrix that contains count data for each analysed gRNA. For scCRISPR-seq screens, the gRNA sequences and single-cell sequencing data are typically processed separately and connected using unique cell barcodes. Processing of the single-cell sequencing data follows established practices178,179 with bioinformatic software tools such as Seurat180, ScanPy181 and Monocle150. Imaging-based screens can build on the extensive methodology for analysing high-throughput imaging data182, including software tools such as CellProfiler183 and EBImage184. However, these tools do not explicitly account for gRNA detection and analysis, and custom processing scripts are required to annotate the single-cell sequencing or imaging data with the corresponding gRNAs as the basis for further analysis.
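The conversion of amplicon reads into gRNA counts can be sketched in a few lines. The example assumes that the gRNA occupies a fixed position in each read and that only perfect matches are counted; dedicated pipelines and alignment tools additionally handle variable offsets, mismatches and quality filtering. The function name, the parameters and the toy sequences are assumptions of this sketch.

```python
def count_grnas(reads, library, start=0, length=20):
    """Count perfect matches of reads against a gRNA reference.

    reads: iterable of read sequences (strings).
    library: dict mapping gRNA sequence -> gRNA identifier.
    start, length: assumed fixed position of the gRNA within each read.
    Returns (counts per gRNA identifier, number of unmapped reads).
    """
    counts = {name: 0 for name in library.values()}
    unmapped = 0
    for read in reads:
        candidate = read[start:start + length].upper()
        name = library.get(candidate)
        if name is None:
            unmapped += 1
        else:
            counts[name] += 1
    return counts, unmapped

# Toy reference with two gRNAs and three reads (made-up sequences).
library = {"ACGT" * 5: "geneA_g1", "TTTT" * 5: "geneB_g1"}
reads = ["ACGT" * 5 + "GG", "acgt" * 5, "GGGG" * 5]

counts, unmapped = count_grnas(reads, library)
print(counts, unmapped)
```

Running the function once per sample and combining the resulting dictionaries yields the count matrix that forms the basis of the subsequent steps.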

Quality control

Quality control is essential for reliable downstream analysis. Relevant quality metrics for pooled CRISPR screens include the average read and/or cell coverage per gRNA, the percentage of missing gRNAs and the evenness of read coverage across gRNAs as measured by the Gini coefficient185. The consistency of the results across biological replicates can be evaluated by analysing pairwise correlations and visualizing the global similarity of all experiments, for example using principal component analysis. It is advisable to compare the depletion of known essential genes against non-essential genes to determine the effectiveness of the perturbations57,186. Finally, screens that involve single-cell sequencing or imaging data should undergo corresponding quality control178,179,182, including bioinformatic detection of potential technical artefacts and correction for batch effects. Box 2 provides a list of important quality control criteria for pooled CRISPR screens.
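Several of these quality metrics can be computed directly from the gRNA counts of one sample, as in the following sketch. It computes the Gini coefficient on raw counts, which is a simplification (some tools apply it to transformed counts), and the function names are assumptions of this example.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n

def library_qc(counts):
    """Summarize read coverage, missing gRNAs and evenness for one sample."""
    values = list(counts.values())
    return {
        "mean_reads_per_grna": sum(values) / len(values),
        "percent_missing": 100.0 * sum(v == 0 for v in values) / len(values),
        "gini": gini(values),
    }

# Toy counts for a four-gRNA library, one of which was not detected.
print(library_qc({"g1": 0, "g2": 10, "g3": 20, "g4": 10}))
```

A Gini coefficient near 0 indicates even coverage across gRNAs, whereas values approaching 1 indicate that a few gRNAs dominate the sample.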

Gene ranking

Following quality control, gRNAs and their target genes or genomic regions are ranked according to their effect on the phenotype of interest. For pooled screens with an amplicon sequencing read-out, these phenotypes are typically defined by the stimuli and conditions in the study design. For scCRISPR-seq and imaging-based screens, they can be derived from the high-content data — for example by unsupervised clustering or supervised classification of cell states based on gene signatures139,140,141,142 or spatial location141,142,170. Enrichment/depletion scores and corresponding P values are calculated for each phenotype, gRNA and target gene using statistical approaches such as negative binomial, Bayesian or hierarchical mixture models, which may account for variability in gRNA efficiency185,187,188, off-target effects174 and the effect of population bottlenecks189.
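As a deliberately simplified illustration of gene ranking, the sketch below computes a log2 fold change for each gRNA between two samples and summarizes each gene by the median across its gRNAs. It does not implement the negative binomial, Bayesian or hierarchical mixture models mentioned above and produces no P values; the normalization by total counts, the pseudocount and all names are assumptions of this example.

```python
import math
from statistics import median

def rank_genes(control, treated, grna_to_gene, pseudocount=1.0):
    """Rank genes by the median log2 fold change of their gRNAs.

    control, treated: dicts mapping gRNA identifier -> read count.
    grna_to_gene: dict mapping gRNA identifier -> target gene.
    Returns a list of (gene, score), most depleted gene first.
    """
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    lfcs_by_gene = {}
    for grna, gene in grna_to_gene.items():
        control_freq = (control.get(grna, 0) + pseudocount) / control_total
        treated_freq = (treated.get(grna, 0) + pseudocount) / treated_total
        lfcs_by_gene.setdefault(gene, []).append(
            math.log2(treated_freq / control_freq)
        )
    scores = {gene: median(lfcs) for gene, lfcs in lfcs_by_gene.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Toy counts: both gRNAs for geneA drop out after the challenge.
control = {"A_g1": 100, "A_g2": 100, "B_g1": 100, "B_g2": 100}
treated = {"A_g1": 10, "A_g2": 20, "B_g1": 100, "B_g2": 110}
grna_to_gene = {"A_g1": "geneA", "A_g2": "geneA", "B_g1": "geneB", "B_g2": "geneB"}

for gene, score in rank_genes(control, treated, grna_to_gene):
    print(f"{gene}\t{score:+.2f}")
```

Using the median makes the gene-level score robust to a single outlier gRNA, but it discards information on gRNA efficiency and count variance that the dedicated statistical models exploit.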

To improve gene ranking, many software packages including MAGeCK173, CERES174, CRISPY190 and CRISPRcleanR191 statistically correct for DNA copy number variation, which is a relevant source of bias in CRISPR screens for cell survival and proliferation in cancer cell lines36. These tools use existing profiles of copy number variation (where available) or estimate this effect from gRNAs that target nearby genomic locations in genome-scale libraries. Differences in gRNA efficiency can affect the gene ranking, and statistical models have been developed to model such variability and to account for off-target effects187,192. For robust and interpretable results it is useful to compare gRNA frequencies across related conditions. Several software packages provide built-in support for different study designs, including paired samples, multiple time points or alternative treatment conditions185,193,194. For example, MAGeCKFlute194 and DrugZ195 were designed for the common scenario of comparing gRNA frequencies between two conditions (for example, screens of drug-treated and untreated cells). Finally, dedicated software tools have been developed for the analysis of scCRISPR-seq screens, which account for heterogeneity among perturbed cells140,159,194,196,197.

Hit analysis

Once target genes have been ranked by their enrichment/depletion score, the top hits can be assessed in terms of plausibility and biological relevance. Such manual inspection is assisted by online resources including PubMed, Ensembl, GeneCards, ClinVar, OMIM and COSMIC. To detect biological patterns that are shared among several top hits, it can be informative to perform enrichment analyses using databases such as MSigDB198 and STRING199, and software tools such as Enrichr for genes200 and LOLA for genomic regions201. For the most interesting hits of a CRISPR screen, it is advisable to manually review the abundance of the corresponding gRNAs across all tested conditions. Highly variable gRNA frequencies across biological replicates can be an indicator of a noisy and unreliable screen, and gRNA enrichment or depletion among the negative controls may point to technical biases affecting the results.
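A common statistical basis for such enrichment analyses is a one-sided hypergeometric test, which asks how likely it is to observe at least the given overlap between the hit list and a gene set by chance. The sketch below shows this calculation with the Python standard library; it is not a description of how any of the named tools compute their scores, and the numbers in the example are invented.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_in_set, n_hits, n_overlap):
    """P(overlap >= n_overlap) if n_hits genes were drawn at random.

    n_background: number of genes covered by the screen.
    n_in_set: number of background genes that belong to the gene set.
    n_hits: number of top hits selected from the screen.
    n_overlap: number of hits that belong to the gene set.
    """
    upper = min(n_in_set, n_hits)
    total = comb(n_background, n_hits)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Invented example: 5 of 50 hits fall into a 100-gene set, 20,000 genes screened.
print(hypergeom_enrichment_p(20000, 100, 50, 5))
```

When many gene sets are tested, the resulting P values need to be corrected for multiple testing before interpretation.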

Visual interpretation

Visualizations facilitate the biological interpretation of CRISPR screens. A heat map of sample correlations helps evaluate consistency across biological replicates and the global differences between conditions. Receiver operating characteristic curves are used to assess the sensitivity and specificity of essential gene depletion, a widely useful metric of data quality. Scatter plots of gRNA counts or log fold changes visualize the top hits in the context of the screen’s background. Differences between two conditions can also be visualized using volcano plots, which plot log fold changes against their associated P values. Many software tools for CRISPR data analysis include dedicated visualization modules (Table 1). For example, the R functions of MAGeCKFlute can help visualize screening results for individual genes across multiple experimental conditions194. Such visualizations for individual target genes are often useful for assessing the enrichment/depletion of all gRNAs targeting those genes. For example, consistent behaviour across different gRNAs targeting the same gene supports the validity of a top hit, whereas outliers and inconsistent results may raise caution about a hit’s reliability.
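The receiver operating characteristic analysis of essential gene depletion can be summarized by the area under the curve (AUC), which equals the probability that a randomly chosen essential gene is more depleted than a randomly chosen non-essential gene. The sketch below computes this value by pairwise comparison; the function name and the toy scores are assumptions of this example, and the reference gene lists would need to come from an external source.

```python
def depletion_auc(essential_scores, nonessential_scores):
    """AUC for separating essential from non-essential genes by depletion.

    Scores are gene-level log fold changes (more negative = more depleted).
    Ties contribute one half, as in the Mann-Whitney U statistic.
    """
    wins = 0.0
    for essential in essential_scores:
        for nonessential in nonessential_scores:
            if essential < nonessential:
                wins += 1.0
            elif essential == nonessential:
                wins += 0.5
    return wins / (len(essential_scores) * len(nonessential_scores))

# Toy log fold changes for three essential and two non-essential genes.
print(depletion_auc([-2.0, -1.0, 0.5], [0.0, 1.0]))
```

An AUC close to 1 indicates that known essential genes are clearly depleted, whereas a value near 0.5 suggests that the perturbations were ineffective or the screen was too noisy to separate the two groups.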

Applications

The versatility and discovery power of CRISPR screening are demonstrated by its wide range of applications, which include the investigation of fundamental molecular mechanisms, genetic diseases, processes relevant to cancer, immune regulation and microbiology (Fig. 6).

Fig. 6: Applications of CRISPR screening.

CRISPR screens are broadly contributing to our understanding of biology. Carefully designed screens can help address a wide range of research topics, some of which are outlined here.

Molecular and cell biology

Pooled CRISPR screens with simple gRNA count-based read-outs have uncovered genes involved in fundamental biological processes such as transcription regulation202,203, epigenetic mechanisms204, protein production205,206, cell signalling207,208, proliferation209,210 and differentiation211,212. Such screens have also identified regulators of cellular organelles such as mitochondria124,213, lysosomes214, proteasomes52 and the autophagosome215,216. scCRISPR-seq screens have been used to dissect transcription-regulatory processes associated with dendritic cell stimulation140, T cell receptor (TCR) induction142 and epithelial to mesenchymal transition217. Further, CRISPR screens with imaging read-outs have investigated regulators of NF-κB translocation170. As single-cell sequencing and imaging read-outs do not depend on cell proliferation, they are applicable to perturbations that do not cause cell death and cell types that are naturally post-mitotic; for example, a scCRISPR-seq screen examined how autism-linked genes affect cell types and states in the brain144.

Most CRISPR screens target only one gene per cell; this facilitates data analysis but makes it difficult to dissect biological processes characterized by redundancy218,219,220,221, where a phenotypic effect may be detected only when several paralogous genes are perturbed simultaneously. Accounting for redundancy is important for understanding quantitative traits and genetic diseases222,223 and can be exploited for combination therapies224. Pooled CRISPR screens focusing on genetic interactions have mapped more than 200,000 gene pairs in cell lines44 and several hundred gene pairs in vivo in mice47. Single-cell sequencing read-outs have helped dissect complex, non-additive effects on cell state in the context of combinatorial gene regulation140 and the unfolded protein response139. More recently, scCRISPR-seq was used to profile hits from a CRISPRa screen to identify genetic interactions that drive differentiation to specific cell types225. For complex molecular and cellular read-outs, two genes may synergize with respect to some aspects of their induced transcriptomes, while acting antagonistically with regard to other aspects. Computational methods for manifold learning may help interpret genetic interaction data based on scCRISPR225 and imaging read-outs226, towards the goal of inferring models of genetic interactions that are both predictive and interpretable227.

Medical genetics and rare genetic diseases

CRISPR screening facilitates the annotation of disease-linked genes and genetic variants, specifically by assessing the biological function of variants of uncertain significance in disease-causing genes228. Deleterious genetic variants in BRCA1 are a risk factor for breast cancer with high clinical relevance, yet many variants are too rare for medical genetics data alone to establish whether they are pathogenic or benign. This challenge has been tackled by saturation genome editing, using CRISPR and homology-directed repair to introduce several thousand single-nucleotide variants into the BRCA1 locus and measuring their effect in vitro229. Further, cytosine base editors have been used to assess genetic variants in BRCA1 and BRCA2 (refs230,231), and a recent study extended this approach to 86 genes involved in the DNA damage response232.

Not all genetic variants can be targeted with base editors owing to their sequence specificity and the need for a Cas-specific PAM sequence close to the target site. Nevertheless, a gRNA library has been developed that targets more than 50,000 ClinVar variants with base editing230, and CRISPR prime editing promises to provide even more flexibility to engineer a wide range of genetic variants across the genome113.

CRISPR screening is useful for the functional analysis of genetic variants associated with polygenic diseases, including risk alleles identified through genome-wide association studies (GWAS) and population genome sequencing. Such studies have statistically linked thousands of genomic regions to a wide range of diseases and human phenotypes, although their rate of pinpointing causal variants and underlying mechanisms has been low. CRISPR screens can complement genetic association studies by testing the biological function of a large number of genetic variants in parallel before labour-intensive investigation of individual variants. In one such screen, gRNAs were tiled across the BCL11A enhancer to map which parts of this enhancer are associated with fetal haemoglobin expression, which may be therapeutically relevant for β-thalassaemia and sickle cell anaemia233. Another study applied CRISPRi screens with a FISH read-out of target gene expression to quantify gene regulatory effects for thousands of candidate enhancer–gene pairs136.

A challenge of using CRISPR screens for assaying genetic variants is the need to develop and validate specific reporter assays for each gene or phenotype of interest. This can be avoided with scCRISPR-seq, exploiting the versatility of transcriptional profiles as correlates of diverse cellular phenotypes. For example, CRISPRi followed by a single-cell RNA-seq read-out was used to measure the effect of enhancer silencing on the transcriptome143 and to obtain single-cell profiles for the effect of several thousand enhancers in a single experiment39. This approach is capable of linking genetic variants to the genes they regulate, which is an important task given that most disease-linked genetic variants identified by GWAS lie in non-coding genomic regions.

Cancer research

CRISPR screens are well suited for studying cancer biology given the wide range of available models and the cancer relevance of readily screenable phenotypes such as cell proliferation and drug resistance. Initial studies focused on mapping essential genes in cancer cell lines57,59, following the hypothesis that cancer-specific gene essentiality may indicate worthwhile drug targets. CRISPR screens have been performed in hundreds of cancer cell lines21,51,57,174,220, making it possible to distinguish between cancer-specific effects and core essential genes, and to study the context dependence of gene essentiality. It has been predicted from RNAi data that at least a thousand cancer cell lines need to be screened to observe most cancer-relevant gene dependencies at least once234 and it will take many more to chart reliable genetic fitness landscapes of cancer cells. Encouragingly, CRISPR screening data have been successfully aggregated across different laboratories20,235, suggesting that large-scale efforts that combine data across multiple sites are feasible.

CRISPR screens conducted in isogenic pairs of cancer cell lines220,236 provide a powerful tool to identify cases of synthetic lethality237. As many cancers lack dominant, druggable oncogenes for targeted therapy, one promising approach is to identify genes that are non-essential but synthetic lethal with cancer-specific gene alterations that are not themselves druggable. This concept is illustrated by the synthetic lethal effect of PARP1 inhibition in BRCA1/BRCA2 mutant cancers238,239. CRISPR screens recently identified a potent synthetic lethal interaction between the helicase-encoding WRN gene and microsatellite instability21,240,241,242.

CRISPR screens in cancer cell lines are broadly useful for investigating tumour-specific biological processes, including oncogenic transcription regulation243, hypoxia58, metabolic stress244, cytokines19, immune evasion245,246 and DNA damage247,248. However, certain aspects such as metastasis and the tumour microenvironment are difficult to model in vitro. To address these limitations, CRISPR screens have been performed using mouse models of cancer19,133,249, either based on engraftment of gene-edited cells or in vivo genome editing. Because the number of cells that engraft or are edited in vivo is typically low, such screens tend to be limited to a few hundred target genes. Cancer organoids have the potential to combine certain advantages of in vivo models including their three-dimensional structure with those of in vitro models, such as high throughput and easy access for perturbations. Indeed, organoids have a greater overlap with the in vivo properties of human tumours than cancer cell lines and have helped uncover relevant cancer vulnerabilities35,250,251.

CRISPR screens also provide new insights into therapeutically relevant mechanisms. In immuno-oncology, they found that tumour immune evasion occurs through diverse mechanisms including Ras signalling, interferon, antigen presentation, autophagy and epigenetic remodelling19,50,129,130,249,252. Relevant to molecularly targeted therapy, a CRISPR screen that challenged cells with the BRAF inhibitor vemurafenib found that depletion of neurofibromin, merlin and the mediator complex component MED12 conferred drug resistance in BRAF-mutant melanoma cells29,210. Similarly, mutagenesis of the proteasome component PSMB5 in a base editing screen revealed novel mutations that confer resistance to the cancer drug bortezomib253.

Immunology

CRISPR screens for immunological mechanisms can be conducted in primary immune cells by exploiting methods for efficient delivery of Cas9 and gRNAs209,254. For example, the SLICE method combines lentiviral delivery of gRNAs into stimulated human CD8+ T cells with electroporation of the cells to introduce the Cas9 protein129,255. This method led to the identification of genes that modulate the proliferation response of CD8+ T cells129. CRISPR screens in primary T cells have identified new genes involved in T cell proliferation129, activation132 and antitumour activity130,131,256,257. Further, screens in dendritic cells stimulated with bacterial lipopolysaccharide (LPS) uncovered novel regulators of Toll-like receptor 4 (TLR4) signalling258, and screens in macrophages identified genes involved in inflammasome activation and other inflammatory pathways259,260. CRISPR screens have also identified host factors involved in SARS-CoV-2 viral infection, contributing to our understanding of host cell entry and exit, and virus replication261,262,263,264,265,266,267,268,269.

Pooled CRISPR screens are generally restricted to studying cell-intrinsic regulatory mechanisms; by contrast, arrayed CRISPR screens support the investigation of cell-extrinsic effects and complex immune cell interactions270. Further, in vitro screens are usually performed in super-physiological conditions and may overestimate perturbation effects. For example, TCR activation assays use a bead-based method for immune stimulation, which does not adequately recapitulate the complexities of the immunological synapse271. These challenges can be addressed by in vivo CRISPR screens130,133,257,272,273, either in immunocompetent mice for a focus on murine immune cells, or in immunodeficient or humanized mice with xenotransplanted human immune cells274. Such screens may provide insights into tissue-resident immune cells275 and the role of structural cells such as epithelial cells, endothelial cells and fibroblasts276, which have not yet been a focus of in vivo CRISPR screening.

In vivo screens face multiple challenges. First, delivery of the CRISPR machinery can be challenging and inefficient in living animals; hence, it is often advisable to use transgenic mice that constitutively express the Cas protein. Second, in vivo screens are more limited in scale compared with in vitro screens; it is therefore important to define relevant target genes. Data from the ImmGen consortium277, the Human Cell Atlas278, the BLUEPRINT project279, public databases and the scientific literature can facilitate the design of application-specific gRNA libraries for in vivo screens in haematopoietic cells. Third, the antigenic repertoire of T cells and B cells can affect clonal dynamics independent of the CRISPR-induced perturbations and add noise to the screen, thus requiring multiple replicates.

Microbiology

CRISPR screens are broadly useful for microbiological applications, including the analysis of animal and plant pathogens and the development of new biotechnological tools for food production, waste treatment and pharmaceutical manufacturing (Fig. 7). CRISPR technology has enabled the genetic manipulation of microbial species that were considered genetically intractable, including species of bacteria, fungi and parasites280,281. CRISPRi-based gene knockdown is predominantly used for screens in microbiology because it circumvents potential species-specific differences in homologous recombination and the repair of double-strand breaks.

Fig. 7: Applications of CRISPR screening in diverse microorganisms.

a | Gene repression by CRISPR interference (CRISPRi) facilitated genome-wide screening for essential genes in the model organism Escherichia coli283. In the depicted system, the guide RNA (gRNA) sequence is constitutively expressed from a replicating plasmid, while the gene encoding endonuclease-dead Cas9 (dCas9) is integrated in the bacterial genome under an inducible Ptet promoter. Gene repression is induced by addition of anhydrotetracycline (aTc), an antibiotic derivative of tetracycline. b | A similar inducible CRISPRi approach was used in the bacterial pathogen Streptococcus pneumoniae to study population bottlenecks during in vivo infection of a murine host290. c | CRISPR knockout (CRISPRko) enabled the large-scale construction of double mutants to map genetic interactions in the yeast pathogen Candida albicans280. d | CRISPRko screening and barcode tagging of clones in the parasite Leishmania mexicana uncovered genes involved in stress adaptation during sandfly infection336.

Early examples of CRISPR screens in bacteria included CRISPRi-based analysis of non-coding RNAs in E. coli282 and identification of host factors relevant to bacteriophage infection for improving phage therapy283. These studies challenged the essentiality of genes previously characterized by transposon insertion sequencing (Tn-seq) and established CRISPRi as a complementary method for identifying essential genes in both model and non-model bacteria. For example, a CRISPRi-based analysis of Bacillus subtilis284 was extended to diverse species of Firmicutes and Gammaproteobacteria285, and a comparative CRISPRi analysis of gene essentiality included diverse isolates of E. coli286. CRISPR screens also identified condition-specific essential genes in Mycobacterium smegmatis287 and in the marine bacterium Vibrio natriegens288.

One key advantage of CRISPR-based methods in microbiology is their robustness across a wide range of species289. This has enabled the discovery of novel virulence mechanisms for bacterial pathogens such as Streptococcus pneumoniae290 and technically challenging pathogenic fungi291 and parasites292. For example, CRISPR screens identified virulence factors in Toxoplasma gondii293,294, Candida albicans295,296 and Cryptococcus neoformans297. CRISPR screening can also help identify novel targets for antimicrobial drugs based on gene essentiality282,283,284 in pathogens such as S. pneumoniae, Streptococcus mutans and Vibrio cholerae298,299,300. CRISPR screens have been used to identify complex genetic interactions and synthetic lethality295,296,301,302, novel targets for combination antimicrobial therapeutics303, genetic determinants of resistance to existing antimicrobial therapeutics296,302,304 and mechanisms of bacterial resistance to bacteriophage infection305. Finally, it is possible to combine pathogen screens with screens of human host factors306.

Beyond microbial pathogens, CRISPR screening can help identify key factors in industrially relevant microorganisms with the goal of optimizing bioprocesses282,307,308,309,311. For example, genome-wide CRISPRi screening in E. coli identified new genes that confer resistance to isobutanol and furfural, which are important traits for biofuel production282. Pooled CRISPRi-based screens in Synechocystis spp. cyanobacteria have been used to increase productivity and tolerance for L-lactate, which is an important compound for renewable plastics309,310. In Saccharomyces cerevisiae, a gene knockdown library identified genes involved in furfural tolerance307.

Advances in CRISPR technology broaden the scope for CRISPR screening in microbiology. A recent study combined CRISPRi and CRISPRa for genome-wide titration of gene expression in S. cerevisiae311, and a modified CRISPRi platform with mismatched gRNAs enabled fine-grained control of gene expression in B. subtilis and E. coli312. CRISPRi has been combined with the fluorescent TIMER protein and FACS enrichment in E. coli to identify slow-growing mutant strains with high metabolic activity313. Multilocus CRISPR editing with homologous recombination has been optimized for use in E. coli308 and S. cerevisiae314, and applied as a molecular barcoding tool to study genotype–phenotype relationships. Finally, CRISPR base editors have been used to assess the effect of genetic variants on the S. cerevisiae proteome at single-residue resolution315.

Reproducibility and data deposition

CRISPR screens are sometimes performed with little consideration for assay optimization and reproducibility, essentially relying on follow-up experiments to obtain robust results for a small subset of manually selected hits. Although this approach has been successful in some cases, it does not realize the full potential of CRISPR screens as a method for systematic biological discovery. Poorly conducted screens can have many false positives, which add to the validation burden and potentially yield spurious patterns of biological enrichment among the hits of the screen. Moreover, they tend to produce false negatives, compromising the screen’s ability to provide a reliable list of genes that have a strong effect in the investigated model.

To quantify and enhance reproducibility, CRISPR screens should be conducted with at least three biological replicates and the results should be compared using quantitative metrics such as correlation coefficients and receiver operating characteristic curves. Further, the most interesting hits should be validated with complementary assays. This selection should include some randomly selected hits in order to counter expert selection bias. Complementary to the in-depth validation of a few hits, medium-scale validation screens with high-content read-outs such as single-cell RNA-seq or imaging can help investigate a broader range of hits than is feasible with small-scale assays. This strategy can also provide an estimate for the false positive rate of the primary screen258. Whenever possible, validations should include additional gRNAs to assess off-target effects, or use alternative perturbations — for example, CRISPRa or CRISPRi following a primary screen using CRISPRko. On the computational side, reproducibility can be improved by using workflow management systems such as Snakemake316 for data processing and R/Python notebooks for documentation of the analysis.
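As an illustration of the quantitative metrics mentioned above, the following minimal sketch computes a Pearson correlation between two replicates and a receiver operating characteristic area under the curve (AUC) for the separation of reference essential genes from non-essential genes. The gene-level log2 fold changes, the reference labels and the function names are hypothetical and chosen only for demonstration; the sketch assumes a dropout screen in which essential genes are expected to be depleted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def roc_auc(scores, labels):
    """ROC AUC computed from pairwise comparisons (Mann-Whitney identity).

    A positive/negative pair is ranked correctly when the positive has the
    LOWER score, because essential genes are depleted in a dropout screen.
    Ties count as half a correct ranking.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum((p < q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical gene-level log2 fold changes from two biological replicates.
rep1 = [-3.1, -2.7, -2.9, 0.1, -0.2, 0.3]
rep2 = [-2.8, -3.0, -2.5, 0.2, 0.0, -0.1]
# Hypothetical reference set: True marks a known essential gene.
is_essential = [True, True, True, False, False, False]

replicate_r = pearson(rep1, rep2)
auc_rep1 = roc_auc(rep1, is_essential)
print(f"replicate correlation: {replicate_r:.3f}, ROC AUC: {auc_rep1:.2f}")
```

In practice, such metrics would be computed on the full count-derived score tables of each replicate, and established statistical libraries would typically replace these hand-written helper functions.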

An essential aspect of reproducibility is proper documentation of the screening data and the experimental and computational workflows, such that other researchers can verify and build upon the results. Raw and processed data from each published screen (including the raw sequencing reads, count matrices, ranked and annotated lists of hits, and global metrics of screening performance) must be deposited in suitable data repositories together with a detailed description of the experiment. Raw and processed data should be submitted to the NCBI Gene Expression Omnibus (GEO) or the EBI ArrayExpress database. Alternatively, raw sequencing data can be submitted to the International Nucleotide Sequence Database Collaboration, which comprises the NCBI Sequence Read Archive (SRA), the EBI European Nucleotide Archive (ENA) and others. Submission to the EBI European Genome-phenome Archive (EGA) or the NCBI database of Genotypes and Phenotypes (dbGaP) may be preferred or required for screens performed on primary human cells or human tissue to comply with data protection regulations. Additional raw and processed data that do not fit into any application-specific database can be submitted to all-purpose repositories such as Zenodo.

There is currently a lack of established community standards for the documentation of CRISPR screens. In particular, no reporting standard exists for CRISPR screens that would correspond to standardization efforts in other areas of high-throughput biology, such as the Minimum Information About a Microarray Experiment (MIAME), Minimum Information About a Next-generation Sequencing Experiment (MINSEQE) and Minimum Information About a Proteomics Experiment (MIAPE) guidelines. In the absence of a widely accepted community standard, Box 3 provides a brief outline of specific points that we consider important for reporting a CRISPR screen, in order to enhance reproducibility.

Carefully conducted and well-documented CRISPR screens can be highly reproducible across replicates and across laboratories. This was confirmed by a comparative analysis of two large CRISPR screening data sets for cancer cell lines generated independently by the Broad Institute and the Sanger Institute, which identified essential genes with high reproducibility20,235. Despite many differences between the reagents and methodologies used to generate these data sets, variation between the two data sets was largely attributed to gRNA library design and to differences in the duration of the screens following CRISPRko, both of which are experimental sources of variation that can be mitigated with appropriate study design and validated methodology.

The accumulation of raw CRISPR screening data sets in public databases provides interesting opportunities for meta-analysis of genotype–phenotype relations. Dedicated databases have been developed to find, retrieve and analyse these data sets. The DepMap resource provides CRISPR screening data and associated genomic profiles for hundreds of cancer cell lines; the CRISP-view database provides a standardized reanalysis of published CRISPR screening data sets using the MAGeCK-VISPR pipeline317; and the BioGRID ORCS database collects and curates CRISPR screening data sets from the scientific literature318.

Limitations and optimizations

A successful CRISPR screen depends on the combination of a suitable biological model, efficient perturbation of the target genes, well-calibrated stimuli and a read-out that captures relevant biological processes. Each of these aspects comes with certain limitations, which can be addressed by careful optimization.

Efficient delivery of the Cas protein and the gRNAs can be a challenge and needs careful optimization, especially for CRISPR screens in primary cells and in vivo. Further, some experimental models are characterized by strong population bottlenecks — for example, only a few cells engraft in most xenotransplantation models. In such cases, genetic drift and selective forces unrelated to the CRISPR-induced perturbation may dominate the analysis and cause high levels of noise. It is therefore important to minimize all population bottlenecks that are unrelated to the screened phenotype; this can be achieved by optimizing cell viability in vitro or engraftment rates in vivo, and by selecting a sufficiently low number of target genes to ensure adequate coverage.
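The trade-off between library size and coverage under a population bottleneck can be made concrete with simple arithmetic. The sketch below is illustrative only: the cell numbers, the number of gRNAs per gene and the target coverage are assumed values, not recommendations, and the calculation assumes that gRNAs are uniformly represented among the surviving cells.

```python
def cells_per_grna(surviving_cells, n_genes, grnas_per_gene):
    """Average number of surviving cells that carry each gRNA."""
    return surviving_cells / (n_genes * grnas_per_gene)


def max_target_genes(surviving_cells, grnas_per_gene, min_coverage):
    """Largest gene set that still reaches min_coverage cells per gRNA."""
    return surviving_cells // (grnas_per_gene * min_coverage)


# All numbers below are illustrative assumptions.
engrafted_cells = 1_000_000  # cells that pass the bottleneck
grnas_per_gene = 4
target_coverage = 500        # desired cells per gRNA

# A genome-wide library (assumed 20,000 genes) is far below the target.
genome_wide = cells_per_grna(engrafted_cells, 20_000, grnas_per_gene)
# A focused library can be sized to meet the target coverage.
focused_limit = max_target_genes(engrafted_cells, grnas_per_gene, target_coverage)

print(f"genome-wide coverage: {genome_wide} cells per gRNA")
print(f"largest gene set at {target_coverage}x coverage: {focused_limit} genes")
```

Under these assumed numbers, a genome-wide library would be covered by only a handful of cells per gRNA, whereas a focused library of a few hundred genes would retain the desired coverage.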

CRISPRko is a mature technology with high efficiency and few off-target effects; however, a subset of the perturbed cells may retain target protein function owing to in-frame editing. Within CRISPR screens, it is usually not possible to verify the edits made in individual cells, although scCRISPR-seq provides some opportunities in this regard. In-frame editing is therefore difficult to exclude and can reduce the screen’s signal-to-noise ratio. For alternative perturbations such as CRISPRi and CRISPRa, selecting a suitable gRNA design can be challenging and is often cell type-specific30,187,319. Where possible, gRNAs should be validated in the cell type of interest, for example by measuring their effect on the expression of their respective target genes in arrayed experiments225. Alternative perturbations also tend to suffer from stronger off-target effects than CRISPRko320.
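Where amplicon sequencing of the edited locus is available (for example, in arrayed validation experiments), the share of in-frame edits can be estimated from the observed indel lengths. The following sketch assumes that indel lengths have already been called (negative values for deletions, positive for insertions, zero for unedited alleles); the example values are hypothetical. Note that frame preservation is only a proxy: an in-frame indel may still disrupt protein function, and a frameshift does not guarantee complete loss of function.

```python
def is_in_frame(indel_length):
    """An indel preserves the reading frame if its length is a non-zero multiple of 3."""
    return indel_length != 0 and indel_length % 3 == 0


def in_frame_fraction(indel_lengths):
    """Fraction of edited alleles (non-zero indels) that preserve the reading frame."""
    edited = [n for n in indel_lengths if n != 0]
    if not edited:
        return 0.0
    return sum(is_in_frame(n) for n in edited) / len(edited)


# Hypothetical indel lengths from one target site (0 = unedited allele).
observed = [-1, -3, 1, -7, 6, 0, -2, 3]
fraction = in_frame_fraction(observed)
print(f"in-frame fraction among edited alleles: {fraction:.2f}")
```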

Most stimuli that have been used in CRISPR screens impose strong selective pressures in a setting of rapidly growing cell lines cultured in nutrient-rich media. Although this set-up has identified important biological mechanisms, the physiological challenges faced by cells in vivo are often less pronounced, are longer lasting and occur in an environment with a comparatively low cell growth rate and nutrient supply. It usually takes careful optimization to calibrate the stimuli in a CRISPR screen in a way that maximizes physiological relevance.

A key limitation of gRNA amplicon sequencing as the screening read-out is that it reduces the screen’s biological complexity to one-dimensional gRNA enrichment/depletion scores. High-content read-outs with single-cell sequencing or imaging provide much more detail on the perturbed cells but are limited by their complexity and assay costs. Finally, simple sequencing-based read-outs cannot readily detect clonal outgrowth and PCR amplification artefacts, a limitation that can be addressed by perturbation and sequencing protocols that incorporate unique molecular identifiers to distinguish individual editing events38,321,322.
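The benefit of unique molecular identifiers (UMIs) can be illustrated with a minimal counting sketch. It assumes that each sequencing read has already been parsed into a pair of gRNA identifier and UMI sequence; the identifiers, UMI sequences and function name are hypothetical. A gRNA with many reads but few distinct UMIs points to clonal outgrowth or PCR amplification bias rather than to many independent editing events.

```python
from collections import defaultdict


def count_reads_and_umis(reads):
    """Count raw reads and distinct UMIs per gRNA.

    reads: iterable of (gRNA id, UMI) pairs, one pair per sequencing read.
    Returns two dicts: reads per gRNA and distinct UMIs per gRNA.
    """
    read_counts = defaultdict(int)
    umis = defaultdict(set)
    for grna, umi in reads:
        read_counts[grna] += 1
        umis[grna].add(umi)
    return dict(read_counts), {g: len(u) for g, u in umis.items()}


# Hypothetical reads: both gRNAs have six reads, but gRNA_A stems from one UMI.
reads = (
    [("gRNA_A", "ACGT")] * 6
    + [("gRNA_B", u) for u in ["AAAA", "CCCC", "GGGG", "TTTT", "ACAC", "AAAA"]]
)

read_counts, umi_counts = count_reads_and_umis(reads)
for grna in read_counts:
    print(grna, "reads:", read_counts[grna], "distinct UMIs:", umi_counts[grna])
```

In this toy example, read counts alone would score both gRNAs identically, whereas the UMI counts reveal that only one of them is supported by several independent events. Real protocols additionally need to handle sequencing errors within UMIs, which this sketch ignores.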

Outlook

Genetic screens play a major role in shaping our understanding of biology. Over the past two decades, RNAi and gene-trap screens have established the feasibility of genome-wide screens in mammalian cells; and CRISPR–Cas has emerged as a highly efficient perturbation tool that can be programmed with gRNA and read out with sequencing. These developments have dramatically increased the power of genetic screens for biological discovery across a broad range of models. We anticipate that progress in models, perturbations, stimuli and read-outs will continue to enhance the practical utility and discovery power of CRISPR screening.

In terms of new models, it is useful to look beyond cancer cell lines, which have been by far the most widely used biological model in the first wave of CRISPR screens. For example, organoids replicate important aspects of human physiology and pathophysiology in vitro and are amenable to CRISPR screening250,323,324,325. Humanized mice enable in vivo screens of human immune cells, helping to dissect species-specific differences in the regulation of the immune system326,327. As CRISPR screening technology works well across various species, there are unique opportunities to apply CRISPR screens in emerging model organisms with interesting biological properties or biotechnological applications, ranging from mammals to microorganisms.

New methods for CRISPR-based perturbation complement established CRISPRko technology and provide additional options for the manipulation of cell states. Once current challenges of efficiency and reliable gRNA design are resolved, powerful gain-of-function screens will be possible using CRISPRa and comprehensive characterization of regulatory regions and epigenetic cell states will be achievable using epigenome editing. Further, CRISPR screening with base editors, homologous recombination and prime editing will enable the functional analysis of a wide range of potentially disease-linked genetic alterations at high resolution and throughput.

Much potential lies in broadening the range of cellular stimuli used in CRISPR screens. The first generation of pooled CRISPR screens exposed cell lines to competitive growth conditions or challenged them with drugs or viruses; now, methods and practices have advanced enough to apply CRISPR screens to milder and more complex challenges. For example, cells can be exposed to specific microenvironments, metabolites or cell–cell interactions. Cells can also be challenged in terms of their regulatory plasticity, ability to adapt to dynamically changing environmental conditions and response to paracrine signalling in heterogeneous biological models, with applications in areas such as immuno-oncology and regenerative medicine.

Although well-designed pooled CRISPR screens with gRNA counting as their read-out continue to be broadly useful, screens with high-content read-outs such as single-cell sequencing and imaging can provide additional insights into molecular mechanisms as part of the screen itself. Such screens do not require the ex ante selection of markers; rather, they support rapid computational iteration of hypotheses and analyses based on the screening data. High-content screens have so far focused mainly on preselected sets of candidate target genes; however, cost-effective high-throughput assays make it possible to screen gene sets comprising several thousand genes with adequate coverage, for example all kinases, transcription factors or cellular transporters. Given their flexibility, robustness and biological depth, it seems possible that such high-content screens will eventually become the primary method of CRISPR screening.

In summary, we describe the building blocks of effective CRISPR screening across a wide range of applications. High-content CRISPR screens conducted in complex biological models with single-cell sequencing or imaging read-outs provide a versatile approach for functional analysis at scale, bridging the gap between genome-wide descriptive assays and small-scale mechanistic dissection.