  • Brief Communication
Deep learning improves prediction of CRISPR–Cpf1 guide RNA activity

Abstract

We present two algorithms to predict the activity of AsCpf1 guide RNAs. Indel frequencies for 15,000 target sequences were used in a deep-learning framework based on a convolutional neural network to train Seq-deepCpf1. We then incorporated chromatin accessibility information to create the better-performing DeepCpf1 algorithm for cell lines for which such information is available. We show that both algorithms outperform previous machine learning algorithms on our own and published data sets.
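
To make the two-stage idea in the abstract concrete, the following is a minimal sketch of how a target sequence might be one-hot encoded, passed through a single convolutional filter, and optionally combined with a chromatin accessibility flag. It is not the published Seq-deepCpf1 or DeepCpf1 architecture: the function names, the hand-set filter weights, the example sequence, and the choice to represent accessibility as a single 0/1 feature are all assumptions made for illustration.

```python
# Illustrative sketch only: the filter values, example sequence, and feature
# layout below are assumptions, not the published architecture.

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element 0/1 rows (A, C, G, T)."""
    rows = []
    for base in seq.upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        rows.append([1.0 if base == b else 0.0 for b in BASES])
    return rows

def conv1d_relu(encoded, kernel):
    """Slide one filter (width x 4 weights) along the sequence, then apply ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            for channel in range(4):
                total += encoded[start + offset][channel] * kernel[offset][channel]
        out.append(max(0.0, total))
    return out

def build_features(seq, kernel, accessible=None):
    """Max-pool the filter response; optionally append a 0/1 accessibility flag."""
    features = [max(conv1d_relu(one_hot(seq), kernel))]
    if accessible is not None:
        features.append(1.0 if accessible else 0.0)
    return features

# Hypothetical filter that responds to the motif "TTT" (weights chosen by hand).
ttt_kernel = [[0.0, 0.0, 0.0, 1.0]] * 3

# Sequence-only features, in the spirit of a sequence-based model.
seq_only = build_features("ACGTTTACGA", ttt_kernel)

# Sequence features plus an accessibility flag, in the spirit of a
# chromatin-aware model.
with_chromatin = build_features("ACGTTTACGA", ttt_kernel, accessible=True)
```

In a trained network the filter weights would be learned from measured indel frequencies rather than set by hand, and many filters would feed further layers that output a predicted activity. The sketch only shows where sequence and accessibility information could enter such a model.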

Figure 1: Deep learning outperforms conventional machine learning for the task of predicting Cpf1 activity based on the target sequence composition.
Figure 2: Consideration of chromatin accessibility significantly improves the prediction of Cpf1 activities at endogenous target sites.

Accession codes

Primary accessions

Sequence Read Archive

Acknowledgements

The authors thank E.-S. Lee for proofreading and R. Gopalappa, N. Kim, S. Park, and J. Park for assisting in sample preparation. This work was supported in part by the National Research Foundation of Korea (grants 2017R1A2B3004198 (H.K.), 2017M3A9B4062403 (H.K.), 2013M3A9B4076544 (H.K.), 2014M3C9A3063541 (S.Y.)), Brain Korea 21 Plus Project (Yonsei University College of Medicine), Brain Korea 21 Plus Project (SNU ECE) in 2017, Institute for Basic Science (IBS; IBS-R026-D1), and the Korean Health Technology R&D Project, Ministry of Health and Welfare, Republic of Korea (grants HI17C0676 (H.K.), and HI16C1012 (H.K.)).

Author information

Contributions

H.K.K., M.S., and S.J. performed experiments to build data sets of AsCpf1 indel frequencies. S.M. and S.Y. developed the framework, and carried out the model training and computational validation. J.W.C. performed bioinformatic analyses. Y.K. and S.L. made substantial contributions to the performance of the experiments including cell culture and deep-sequencing. H.H.K. conceived and designed the study. H.K.K., S.M., S.Y., and H.H.K. analyzed the data and wrote the manuscript.

Corresponding authors

Correspondence to Sungroh Yoon or Hyongbum (Henry) Kim.

Ethics declarations

Competing interests

Yonsei University and Seoul National University have filed a patent based on this work, in which H.K.K., S.M., M.S., S.J., S.Y., and H.K. are co-inventors.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–14 and Supplementary Note (PDF 2816 kb)

Life Sciences Reporting Summary (PDF 130 kb)

Supplementary Tables

Combined file containing Supplementary Tables 2, 4, and 6 (PDF 521 kb)

Supplementary Table 1

Source data used for this study. (XLSX 2463 kb)

Supplementary Table 3

Model selection results of Seq-deepCpf1 (XLSX 19 kb)

Supplementary Table 5

Oligonucleotides used in this study (XLSX 40 kb)

Supplementary Table 7

Confidence intervals for the result values (XLSX 15 kb)

Supplementary Code

The source code of Seq-deepCpf1 and DeepCpf1 (ZIP 750 kb)

About this article

Cite this article

Kim, H., Min, S., Song, M. et al. Deep learning improves prediction of CRISPR–Cpf1 guide RNA activity. Nat Biotechnol 36, 239–241 (2018). https://doi.org/10.1038/nbt.4061

  • DOI: https://doi.org/10.1038/nbt.4061
