Rolling back human pluripotent stem cells to an eight-cell embryo-like stage

Abstract

After fertilization, the quiescent zygote experiences a burst of genome activation that initiates a short-lived totipotent state. Understanding the process of totipotency in human cells would have broad applications. However, in contrast to in mice (refs. 1, 2), demonstration of the time of zygotic genome activation or the eight-cell (8C) stage in in vitro cultured human cells has not yet been reported, and the study of embryos is limited by ethical and practical considerations. Here we describe a transgene-free, rapid and controllable method for producing 8C-like cells (8CLCs) from human pluripotent stem cells. Single-cell analysis identified key molecular events and gene networks associated with this conversion. Loss-of-function experiments identified fundamental roles for DPPA3, a master regulator of DNA methylation in oocytes (ref. 3), and TPRX1, a eutherian totipotent cell homeobox (ETCHbox) family transcription factor that is absent in mice (ref. 4). DPPA3 induces DNA demethylation throughout the 8CLC conversion process, whereas TPRX1 is a key executor of 8CLC gene networks. We further demonstrate that 8CLCs can produce embryonic and extraembryonic lineages in vitro or in vivo in the form of blastoids (ref. 5) and complex teratomas. Our approach provides a resource to uncover the molecular process of early human embryogenesis.

Fig. 1: Generation of 8CLCs from human PSCs.
Fig. 2: Chromatin landscape of 8CLCs.
Fig. 3: Molecular roadmap of 8CLC conversion.
Fig. 4: Generation of TSCs and blastoids from 8CLCs.
Fig. 5: scRNA-seq of teratomas from 4CL naive PSCs and 8CLCs.

Data availability

All data needed to evaluate the conclusions in the paper are included in the paper and/or the supplementary materials. Raw sequencing data have been deposited in the CNGB Nucleotide Sequence Archive under the accession number CNP0001454. Reference datasets from other works can be accessed either in the NCBI Gene Expression Omnibus (GEO) repository under their GSE numbers or in the database of the EMBL’s European Bioinformatics Institute under their E-MTAB numbers: accession numbers of human embryo scRNA-seq data are E-MTAB-3929 and GSE36552; the accession number of human embryo ATAC-seq data is GSE101571; the accession number of human embryo histone CUT&RUN data is GSE124718; accession numbers of other naive PSC/EPSC media single-cell sequencing data are GSE150311 (RSeT and t2iLGö), GSE166422 (PXGL) and GSE150578 (5iLA); accession numbers of other naive/EPSC media RNA-seq data are GSE52617 (NHSM), E-MTAB-2857 (t2iLGö), GSE59435 (5iLAF) and E-MTAB-7254 (EPSC); accession numbers of DNA methylation data are GSE49828 (human early embryo), GSE52617 (NHSM), GSE60945 (t2iLGö-NK2), GSE111018 (5iLAF), GSE136715 (mouse early embryo) and GSE75751 (mouse ES cells and 2CLCs); the accession number of mouse early embryo scRNA-seq data is GSE45719; the accession number of mouse ES cell and 2CLC data is GSE168728; and the accession number of TSC RNA-seq data is GSE138762. The human reference genome is available at http://ftp.ensembl.org/pub/release-105/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_rm.alt.fa.gz. All reference datasets with links are summarized in Supplementary Table 8. Processed datasets with links and descriptions are summarized in Supplementary Table 11.
Processed bulk RNA-seq data can be accessed at https://figshare.com/s/8ee132ff1366fa89d35a; processed scRNA-seq data can be accessed at https://figshare.com/s/34110eebb58462a79dd5; processed scATAC-seq data can be accessed at https://figshare.com/s/760d3ff54f1214a50cc2; processed RRBS and WGBS data can be accessed at https://figshare.com/s/ff707bf8242f7b3ed8f5; processed teratoma and blastoid scRNA-seq data can be accessed at https://figshare.com/s/037b348b1da763fb41d0; processed SMART-seq2 data can be accessed at https://figshare.com/s/a1b03a1463865b8a56c8; and processed single-cell multiomics data can be accessed at https://figshare.com/s/9c01c3b58d34b80de230. Source data are provided with this paper.
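
The statement above ties each accession prefix to a repository (GSE numbers to GEO, E-MTAB numbers to EMBL-EBI, CNP numbers to the CNGB archive). As a minimal sketch, assuming one only wants to sort accession strings by repository, the grouping could look like the following. The function name `group_accessions` and the sample sentence are illustrative and not part of the paper.

```python
import re

# Prefix-to-repository mapping, taken from the data availability statement.
REPOSITORIES = {
    "GSE": "NCBI Gene Expression Omnibus",
    "E-MTAB": "EMBL-EBI",
    "CNP": "CNGB Nucleotide Sequence Archive",
}

# Matches accessions such as GSE36552, E-MTAB-3929 or CNP0001454.
ACCESSION_RE = re.compile(r"\b(GSE|E-MTAB-|CNP)(\d+)\b")


def group_accessions(text):
    """Return {repository name: [accessions in order of first appearance]}."""
    grouped = {name: [] for name in REPOSITORIES.values()}
    for prefix, digits in ACCESSION_RE.findall(text):
        repository = REPOSITORIES[prefix.rstrip("-")]
        accession = prefix + digits
        if accession not in grouped[repository]:
            grouped[repository].append(accession)
    return grouped


# Illustrative input built from accessions quoted in the statement.
sample = "scRNA-seq data are E-MTAB-3929 and GSE36552; raw data are under CNP0001454."
grouped = group_accessions(sample)
```

The sketch deliberately stops at grouping and does not construct download URLs, since the statement itself gives only the accession numbers.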

Code availability

The analysis pipelines for scRNA-seq data generated by DNBeLab C4 are implemented using an in-house workflow (https://github.com/MGI-tech-bioinformatics/DNBelab_C_Series_HT_scRNA-analysis-software).

References

  1. Macfarlan, T. S. et al. Embryonic stem cell potency fluctuates with endogenous retrovirus activity. Nature 487, 57–63 (2012).

  2. De Iaco, A. et al. DUX-family transcription factors regulate zygotic genome activation in placental mammals. Nat. Genet. 49, 941–945 (2017).

  3. Li, Y. et al. Stella safeguards the oocyte methylome by preventing de novo methylation mediated by DNMT1. Nature 564, 136–140 (2018).

  4. Maeso, I. et al. Evolutionary origin and functional divergence of totipotent cell homeobox genes in eutherian mammals. BMC Biol. 14, 45 (2016).

  5. Yanagida, A. et al. Naive stem cell blastocyst model captures human embryo lineage segregation. Cell Stem Cell 28, 1014–1022.e4 (2021).

  6. Manor, Y. S., Massarwa, R. & Hanna, J. H. Establishing the human naive pluripotent state. Curr. Opin. Genet. Dev. 34, 35–45 (2015).

  7. Rodriguez-Terrones, D. et al. A distinct metabolic state arises during the emergence of 2-cell-like cells. EMBO Rep. 21, e48354 (2020).

  8. Wang, Y. et al. Unique molecular events during reprogramming of human somatic cells to induced pluripotent stem cells (iPSCs) at naive state. eLife 7, e29518 (2018).

  9. Yang, Y. et al. Derivation of pluripotent stem cells with in vivo embryonic and extraembryonic potency. Cell 169, 243–257.e25 (2017).

  10. Yang, J. et al. Establishment of mouse expanded potential stem cells. Nature 550, 393–397 (2017).

  11. Gao, X. et al. Establishment of porcine and human expanded potential stem cells. Nat. Cell Biol. 21, 687–699 (2019).

  12. Posfai, E. et al. Evaluating totipotency using criteria of increasing stringency. Nat. Cell Biol. 23, 49–60 (2021).

  13. Guo, G. et al. Human naive epiblast cells possess unrestricted lineage potential. Cell Stem Cell 28, 1040–1056.e6 (2021).

  14. Esteban, M. A. et al. Vitamin C enhances the generation of mouse and human induced pluripotent stem cells. Cell Stem Cell 6, 71–79 (2010).

  15. Miranda, T. B. et al. DZNep is a global histone methylation inhibitor that reactivates developmental genes not silenced by DNA methylation. Mol. Cancer Ther. 8, 1579–1588 (2009).

  16. Wu, J. et al. Chromatin analysis in human early development reveals epigenetic transition during ZGA. Nature 557, 256–260 (2018).

  17. Theunissen, T. W. et al. Systematic identification of culture conditions for induction and maintenance of naive human pluripotency. Cell Stem Cell 15, 471–487 (2014).

  18. Surani, M. A., Hayashi, K. & Hajkova, P. Genetic and epigenetic regulators of pluripotency. Cell 128, 747–762 (2007).

  19. Liu, C. et al. A portable and cost-effective microfluidic system for massively parallel single-cell transcriptome profiling. Preprint at https://doi.org/10.1101/818450 (2019).

  20. Petropoulos, S. et al. Single-cell RNA-seq reveals lineage and X chromosome dynamics in human preimplantation embryos. Cell 165, 1012–1026 (2016).

  21. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).

  22. Shakiba, N. et al. CD24 tracks divergent pluripotent states in mouse and human cells. Nat. Commun. 6, 7329 (2015).

  23. Deng, Q., Ramskold, D., Reinius, B. & Sandberg, R. Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science 343, 193–196 (2014).

  24. Goke, J. et al. Dynamic transcription of distinct classes of endogenous retroviral elements marks specific populations of early human embryonic cells. Cell Stem Cell 16, 135–141 (2015).

  25. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).

  26. Yan, L. et al. Single-cell RNA-seq profiling of human preimplantation embryos and embryonic stem cells. Nat. Struct. Mol. Biol. 20, 1131–1139 (2013).

  27. Yu, L. et al. Blastocyst-like structures generated from human pluripotent stem cells. Nature 591, 620–626 (2021).

  28. Liu, X. et al. Reprogramming roadmap reveals route to human induced trophoblast stem cells. Nature 586, 101–107 (2020).

  29. Gafni, O. et al. Derivation of novel human ground state naive pluripotent stem cells. Nature 504, 282–286 (2013).

  30. Bayerl, J. et al. Principles of signaling pathway modulation for enhancing human naive pluripotency induction. Cell Stem Cell 28, 1549–1565.e12 (2021).

  31. Zhang, M. et al. β-catenin safeguards the ground state of mouse pluripotency by strengthening the robustness of the transcriptional apparatus. Sci. Adv. 6, eaba1593 (2020).

  32. Guo, H. et al. The DNA methylation landscape of human early embryos. Nature 511, 606–610 (2014).

  33. Smith, Z. D. et al. DNA methylation dynamics of the human preimplantation embryo. Nature 511, 611–615 (2014).

  34. Pastor, W. A. et al. Naive human pluripotent cells feature a methylation landscape devoid of blastocyst or germline memory. Cell Stem Cell 18, 323–329 (2016).

  35. Di Stefano, B. et al. Reduced MEK inhibition preserves genomic stability in naive human embryonic stem cells. Nat. Methods 15, 732–740 (2018).

  36. Takashima, Y. et al. Resetting transcription factor control circuitry toward ground-state pluripotency in human. Cell 158, 1254–1269 (2014).

  37. Guo, G. et al. Epigenetic resetting of human pluripotency. Development 144, 2748–2763 (2017).

  38. Wang, Y. et al. Single-cell multiomics sequencing reveals the functional regulatory landscape of early embryos. Nat. Commun. 12, 1247 (2021).

  39. Eckersley-Maslin, M. A. et al. MERVL/Zscan4 network activation results in transient genome-wide DNA demethylation of mESCs. Cell Rep. 17, 179–192 (2016).

  40. Shen, H. et al. Mouse totipotent stem cells captured and maintained through spliceosomal repression. Cell 184, 2843–2859.e20 (2021).

  41. Stirparo, G. G. et al. Integrated analysis of single-cell embryo data yields a unified transcriptome signature for the human pre-implantation epiblast. Development 145, dev158501 (2018).

  42. Xia, W. et al. Resetting histone modifications during human parental-to-zygotic transition. Science 365, 353–360 (2019).

  43. Yeom, Y. I. et al. Germline regulatory element of Oct-4 specific for the totipotent cycle of embryonal cells. Development 122, 881–894 (1996).

  44. Pastor, W. A. et al. TFAP2C regulates transcription in human naive pluripotency by opening enhancers. Nat. Cell Biol. 20, 553–564 (2018).

  45. Liu, L. et al. An integrated chromatin accessibility and transcriptome landscape of human pre-implantation embryos. Nat. Commun. 10, 364 (2019).

  46. Macnair, W. & Claassen, M. psupertime: supervised pseudotime inference for single cell RNA-seq data with sequential labels. Preprint at https://doi.org/10.1101/622001 (2019).

  47. Bredenkamp, N. et al. Wnt inhibition facilitates RNA-mediated reprogramming of human somatic cells to naive pluripotency. Stem Cell Rep. 13, 1083–1098 (2019).

  48. Morabito, S. et al. Single-nucleus chromatin accessibility and transcriptomic characterization of Alzheimer’s disease. Nat. Genet. 53, 1143–1155 (2021).

  49. Xu, X. et al. Dppa3 expression is critical for generation of fully reprogrammed iPS cells and maintenance of Dlk1–Dio3 imprinting. Nat. Commun. 6, 6008 (2015).

  50. Sang, H. et al. Dppa3 is critical for Lin28a-regulated ES cells naive-primed state conversion. J. Mol. Cell. Biol. 11, 474–488 (2019).

  51. Mulholland, C. B. et al. Recent evolution of a TET-controlled and DPPA3/STELLA-driven pathway of passive DNA demethylation in mammals. Nat. Commun. 11, 5972 (2020).

  52. Wang, L. et al. Overexpression of Stella improves the efficiency of nuclear transfer reprogramming. J. Genet. Genomics 44, 363–366 (2017).

  53. Hayashi, K., de Sousa Lopes, S. M. C., Tang, F. & Surani, M. A. Dynamic equilibrium and heterogeneity of mouse pluripotent stem cells with distinct functional and epigenetic states. Cell Stem Cell 3, 391–401 (2008).

  54. Dong, C. et al. Derivation of trophoblast stem cells from naive human pluripotent stem cells. eLife 9, e52504 (2020).

  55. Castel, G. et al. Induction of human trophoblast stem cells from somatic cells and pluripotent stem cells. Cell Rep. 33, 108419 (2020).

  56. Velychko, S. et al. Excluding Oct4 from Yamanaka cocktail unleashes the developmental potential of iPSCs. Cell Stem Cell 25, 737–753.e4 (2019).

  57. Huang, K. et al. BMI1 enables interspecies chimerism with human pluripotent stem cells. Nat. Commun. 9, 4649 (2018).

  58. McDonald, D. et al. Defining the teratoma as a model for multi-lineage human development. Cell 183, 1402–1419.e18 (2020).

  59. Xue, Z. et al. Genetic programs in human and mouse early embryos revealed by single-cell RNA sequencing. Nature 500, 593–597 (2013).

  60. Payer, B. et al. Stella is a maternal effect gene required for normal early development in mice. Curr. Biol. 13, 2110–2117 (2003).

  61. Granja, J. M. et al. ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat. Genet. 53, 403–411 (2021).

  62. Okae, H. et al. Derivation of human trophoblast stem cells. Cell Stem Cell 22, 50–63.e6 (2018).

  63. Lee, C. Q. et al. What is trophoblast? A combination of criteria define human first-trimester trophoblast. Stem Cell Rep. 6, 257–272 (2016).

  64. Smallwood, S. A. et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat. Methods 11, 817–820 (2014).

  65. Yu, Y. et al. Single-nucleus chromatin accessibility landscape reveals diversity in regulatory regions across distinct adult rat cortex. Front. Mol. Neurosci. 14, 651355 (2021).

  66. Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).

  67. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  68. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  69. Ramirez, F., Dundar, F., Diehl, S., Gruning, B. A. & Manke, T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–W191 (2014).

  70. Guo, W. et al. BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data. BMC Genomics 14, 774 (2013).

  71. Stuart, T., Srivastava, A., Madad, S., Lareau, C. A. & Satija, R. Single-cell chromatin state analysis with Signac. Nat. Methods 18, 1333–1341 (2021).

Acknowledgements

We thank all members of our laboratories for their support; L. Lai and G. Pan (Guangzhou Institutes of Biomedicine and Health) for their helpful comments during the execution of this project; X. Quan (the Experimental Animal Center of Guangzhou Institutes of Biomedicine and Health) for technical help with the chimera experiments; staff at the CNGB for providing technical support; and staff at the instrument platforms of the Guangzhou Institutes of Biomedicine and Health and the Bioland Laboratory for their technical support. This work was supported by the National Key Research and Development Program of China (2018YFA0106903 and 2016YFA0100102 to M.A.E.), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA16030502 to M.A.E.), the National Natural Science Foundation of China (U20A2015 to M.A.E., 31900466 to L.L., 32150410348 to M.A.M., 32070861 to B.Q. and 31950410553 to C.W.), the Guangdong Provincial Key Laboratory of Genome Read and Write (2017B030301011 to X.X.), the Guangdong Basic and Applied Basic Research Foundation (2021A1515110180 to Y.L.), and the Guangzhou Science and Technology Foundation (2021000122 to W.L.). M.A.M. was funded by CAS-TWAS President’s PhD Fellowship and CAS President’s International Fellowship Initiative (PIFI) for special experts (2020FSB0002).

Author information

Contributions

M.A.E., M.A.M. and W.L. conceived the original idea and designed the experiments. M.A.E., M.A.M., W.L. and L.L. supervised the study. M.A.M., W.L., Z.L., C.L. and Y.L. conducted most of the experiments (with help from W.J., Y.J., H.L., L.F., Y. Yang, D.P.I., J. Lai, P.G., Y. Yuan, Q.D., Y.W., Y. Liu, J.W. and G.W.). C.W., Y. Lai, L.W., J. Li, W.J., X.W. and J.A. performed bioinformatics analysis. M.A.E., M.A.M. and W.L. interpreted the data. F.G., S.Z., B.Q., G.W., P.H.M. and X.X. provided relevant advice regarding data interpretation and manuscript preparation. M.A.E. provided most of the financial support and L.L. provided essential materials and infrastructural support for the single-cell technologies. M.A.M., W.L. and M.A.E. wrote the manuscript with input from all authors. All authors approved the final version of the manuscript.

Corresponding authors

Correspondence to Md. Abdul Mazid, Longqi Liu, Wenjuan Li or Miguel A. Esteban.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Janet Rossant and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Medium optimization for generating human naïve PSC.

a. Schematic of the compound screening workflow. b. Representative phase contrast images of primed H9 ESC cultured in different basal recipes for three passages. hLIF, human LIF; CHIR, CHIR99021; PD, PD0325901; 2i, PD+CHIR. Scale, 40 μm. Representative of three independent experiments. c. RT-qPCR for the indicated pluripotency genes in primed H9 ESC cultured in different basal recipes for three passages. Data are the mean values ± standard error of the mean (SEM) of the fold-change compared to primed ESC. n = 3 biological replicates. d. Table of tested compounds and their known targets. e and f. RT-qPCR for the indicated pluripotency genes in primed H9 ESC cultured in different modified basal recipes for three passages. Data are presented as mean values ± SD of fold-change compared to primed ESC in basal medium with PD, IWR1 and human LIF. n = 3 biological replicates. g. RT-qPCR for the indicated pluripotency genes in human primed HUES1 and HUES7 ESC, and human primed iPSC-1, iPSC-2 and iPSC-3 clones cultured in 4CL for three passages (day 12). h. Representative images of G-banding karyotype of primed H9 ESC and primed iPSC-4 clone cultured in 4CL for 15 passages. Twenty metaphases were counted for each. All 20 metaphase spreads for H9 ESC are shown in Supplementary Figure 1. i. Representative phase contrast images of primed H9 ESC cultured in 4CL on feeders, ECM-coated surface (feeder-free), or in suspension, for three passages (day 12). Scale, 40 μm. Representative of three independent experiments. j. RT-qPCR for the indicated pluripotency genes in H9 ESC cultured in 4CL on either feeders or ECM-coated surface for three passages (day 12). Data are presented as mean values ± SD of fold-change compared to primed ESC. n = 3 biological replicates. k. RT-qPCR for the indicated pluripotency genes in primed H9 ESC, HUES1 ESC and iPSC-4 clone converted by 4CL in suspension for three passages (day 12). l. RT-qPCR for the indicated pluripotency genes in primed H9 ESC cultured in 4CL with or without (w/o) Vc for three passages (day 12). Data are presented as mean values ± SD of fold-change compared to 4CL with Vc. n = 3 biological replicates. P value was calculated using two-tailed unpaired Student’s t-test. m. RT-qPCR for the indicated genes in H9 ESC cultured in 4CL with different TSA or DZNep substitutes for three passages (day 12). Data are presented as mean values ± SD of fold-change compared to 4CL with TSA and DZNep. n = 3 biological replicates.
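
The RT-qPCR panels in this legend are summarized as mean ± SD of fold-change over n = 3 biological replicates, with a two-tailed unpaired Student's t-test where a P value is reported. A minimal sketch of that summary is given here, assuming the per-replicate fold-change values are already available. The values and function names are hypothetical and are not taken from the paper; the equal-variance form of the t statistic is also an assumption.

```python
import math
import statistics


def summarize_fold_change(values):
    """Return (mean, sample SD) of per-replicate fold-change values."""
    return statistics.mean(values), statistics.stdev(values)


def unpaired_t_statistic(a, b):
    """Student's t statistic (equal-variance form) and degrees of freedom.

    The two-tailed P value would then be read from the t distribution with
    the returned degrees of freedom, for example via a statistics library.
    """
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance across both groups.
    pooled = ((n_a - 1) * statistics.variance(a)
              + (n_b - 1) * statistics.variance(b)) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical fold-change values for one gene, three replicates each.
with_vc = [1.0, 1.1, 0.9]
without_vc = [0.4, 0.5, 0.6]

mean_fc, sd_fc = summarize_fold_change(without_vc)
t_stat, dof = unpaired_t_statistic(with_vc, without_vc)
print(f"fold-change {mean_fc:.2f} ± {sd_fc:.2f}; t = {t_stat:.2f}, df = {dof}")
```

With three replicates per group the test has only four degrees of freedom, so the SD estimate itself is noisy; the sketch reports it as is.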

Extended Data Fig. 2 Validation of 4CL and e4CL media.

a. Heatmap showing the expression of preimplantation ICM-enriched genes in human naïve PSC cultured in NHSM (ref. 29), tt2iLGö (this study), 5iLAF (ref. 17) and 4CL (passage 5), and human ICM cells (ref. 16) compared with primed PSC. All reference datasets used in this study are summarized in Supplementary Table 8. H9 ESC were used to generate our dataset. The full list of DEG is included in Supplementary Table 1. b. RT-qPCR validation of totipotency genes in primed H9 ESC cultured in stepwise e4CL (day 5). Data are presented as mean values ± SD of fold-change compared to 4CL-day 12 naïve ESC. n = 3 biological replicates. c. Heatmap showing the expression of totipotency genes in naïve ESC cultured in NHSM, tt2iLGö, 5iLAF or stepwise e4CL (day 5), EPSC (ref. 11), sorted 8CLC from e4CL-day 5 cells, and human 8C-embryo cells (ref. 16) compared with primed PSC. H9 ESC were used to generate our dataset. d. Representative images of G-banding karyotype of primed H9 ESC and iPSC-4 cultured in stepwise e4CL (day 5). Twenty metaphases were counted for each. All 20 metaphase spreads for H9 ESC are shown in Supplementary Figure 1. e. RT-qPCR validation of totipotency genes in primed H9 ESC cultured in direct e4CL (day 7). Data are presented as mean values ± SD of fold-change compared to primed ESC. n = 3 biological replicates.

Extended Data Fig. 3 scRNA-seq (droplet-based) analysis of 8CLC generated from human PSC.

a. UMAP comparing the developmental rolling back from human E7 to E3 embryonic stages (ref. 20) in stepwise or direct e4CL induction scRNA-seq time courses. All reference datasets used in this study are summarized in Supplementary Table 8. H9 ESC were used to generate our dataset. b and c. UMAP visualization based on panel a highlighting the expression of the primed-enriched gene CD24 and the shared naïve pluripotency/8CLC genes DPPA3 and KLF17 (b), and several totipotency genes (TPRX1, DUXA and ZNF280A) (c) as human ESC transition from a primed state to a naïve and then 8CLC state. d. Representative immunostaining images for TPRX1, KLF17 and NANOG of primed H9 ESC converted by direct e4CL (day 7). Scale, 20 μm. Representative of three independent experiments.

Extended Data Fig. 4 Extended scRNA-seq (droplet-based) analysis of human 8CLC generated from PSC.

a. UMAP visualization of stepwise e4CL-day 5 cells shows seven clusters. The encircled cluster 5 (8CLC) comprises 11.9% of the whole population. b. Violin plot showing the log normalized expression of representative pluripotency and totipotency genes for each of the seven clusters (n = 111, 109, 99, 97, 92, 70 and 12 cells for clusters 0, 1, 2, 3, 4, 5 and 6, respectively) of panel a. c. Bubble plot representing the frequency of expression and average expression of representative pluripotency and totipotency genes in early human embryonic stages (ref. 31) and primed ESC untreated or converted by 4CL (days 8 [passage 2] and 12 [passage 3]) and e4CL (day 5 C5 [8CLC] and non-8CLC [all other clusters summed]). D, day. All reference datasets used in this study are summarized in Supplementary Table 8. d. Pseudobulk correlation analysis of the in vitro time course data with the embryo data based on the top 2,000 most variable genes. The average linkage hierarchical clustering of the Pearson correlation is shown. e. Pseudobulk correlation analysis of e4CL-D5 clusters (in Extended Data Fig. 4a) with the embryo data based on the top 2,000 most variable genes. The average linkage hierarchical clustering of the Pearson correlation is shown. f. Representative co-immunostaining images of stepwise e4CL-day 5 cells showing mutual exclusivity between SOX2 and TPRX1 (see arrows). Nuclei were counterstained with DAPI. Scale, 20 μm. Representative of three independent experiments. g. UMAP visualization of primed ESC converted by direct e4CL at day 7 shows nine clusters. The encircled cluster 3 (C3) cells resemble 8CLC and comprise 15.6% of the whole population. h. Violin plot showing the log normalized expression of representative pluripotency and totipotency genes for each of the nine clusters (n = 738 cells, 621 cells, 464 cells, 432 cells, 205 cells, 165 cells, 72 cells, 38 cells, and 36 cells for clusters 0, 1, 2, 3, 4, 5, 6, 7, and 8, respectively) of panel g. i. Violin plot showing the log normalized expression pattern of representative early embryo-enriched TE in early human embryonic stages and primed ESC untreated or converted by 4CL (days 8 and 12) or stepwise e4CL (day 5; 8CLC and non-8CLC).
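
Panels d and e rest on a pseudobulk correlation: cells are pooled per group, the most variable genes are selected, and groups are compared by Pearson correlation. The sketch below illustrates those three steps on toy counts. The log normalization, the function names and the data are assumptions made for illustration, and the final average-linkage clustering step is omitted; it would operate on the correlation matrix produced here.

```python
import math


def pseudobulk(counts, labels):
    """Sum per-cell count vectors into one vector per group label."""
    bulk = {}
    for cell, label in zip(counts, labels):
        acc = bulk.setdefault(label, [0.0] * len(cell))
        for g, c in enumerate(cell):
            acc[g] += c
    return bulk


def log_normalize(vec, scale=1e4):
    """Library-size normalization followed by log1p (an assumed choice)."""
    total = sum(vec)
    return [math.log1p(c / total * scale) for c in vec]


def top_variable_genes(profiles, n_top):
    """Indices of the n_top genes with the highest variance across profiles."""
    def variance(g):
        vals = [p[g] for p in profiles]
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)
    return sorted(range(len(profiles[0])), key=variance, reverse=True)[:n_top]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pseudobulk_correlation(counts, labels, n_top=2000):
    """Pairwise Pearson correlation between pseudobulk profiles."""
    bulk = {k: log_normalize(v) for k, v in pseudobulk(counts, labels).items()}
    names = sorted(bulk)
    genes = top_variable_genes([bulk[k] for k in names], n_top)
    sub = {k: [bulk[k][g] for g in genes] for k in names}
    return {(a, b): pearson(sub[a], sub[b]) for a in names for b in names}


# Toy counts: rows are cells, columns are four hypothetical genes.
counts = [
    [10, 0, 5, 1],
    [12, 1, 4, 0],
    [0, 9, 5, 2],
    [1, 11, 6, 1],
    [11, 1, 5, 1],
]
labels = ["8CLC", "8CLC", "primed", "primed", "embryo_8C"]
corr = pseudobulk_correlation(counts, labels, n_top=3)
```

In this toy input the 8CLC pseudobulk correlates more strongly with the embryo_8C profile than with the primed profile, which is the kind of comparison the panels display.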

Extended Data Fig. 5 Extended analysis of scRNA-seq data generated by SMART-seq2.

a. UMAP visualization based on Fig. 1d highlighting the cells in this study (left panel), in Petropoulos et al. (ref. 20) (middle panel) and in Yan et al. (ref. 26) (right panel). P, passage; blast., blastocyst. All reference datasets used in this study are summarized in Supplementary Table 8. b. Hierarchical clustering diagram of sorted 8CLC (78 cells) and embryo E3/E4 cells showing the cell-to-cell similarity on the basis of Spearman correlation coefficients. Five closely correlated regions are defined: R1 and R2 include all E3 cells, 10 E4 cells and 31 sorted 8CLC (39.8%). R3 and R4 include E4, all late E4 cells and no sorted 8CLC. R5 includes 47 sorted 8CLC (60.2%) which share similarity with both E3 and E4. c–e. UMAP visualization based on Fig. 1d highlighting the expression of several totipotency genes (TPRX1, ZSCAN4, DUXA and ZSCAN5B) (c), the shared naïve pluripotency/8CLC genes DPPA3 and KLF17 (d) and the primed-enriched gene CD24 (e) as human ESC transition from a primed state to an 8CLC state. f. Pseudobulk correlation analysis of the in vitro culture data with the embryo data based on all the TE. The average linkage hierarchical clustering of the Pearson correlation is shown. g. Heatmap showing the embryonic stage-specific TE expression profiles in human E3 to E7, sorted 8CLC, 4CL-day 12 naïve ESC and primed ESC. The full list of stage-specific TE is included in Supplementary Table 2. h. GFP+ sorted 8CLC from e4CL-day 5 cells were cultured in 4CL for 24 h and subjected to FACS analysis. 48.8% remained GFP+ and 51.2% became GFP- (exited from the 8CLC state) (upper panels). GFP- cells were sorted from stepwise e4CL-day 5 and were cultured in e4CL for 24 h and subjected to FACS analysis. 5.76% became GFP+ (entered the 8CLC state) (lower panels). D, day.
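
Panel b measures cell-to-cell similarity with Spearman correlation coefficients, that is, Pearson correlation computed on ranks. A minimal sketch of that coefficient follows, with tied values given their average rank. The expression vectors are hypothetical and the helper names are illustrative.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression vectors for two cells over four genes.
cell_a = [5.0, 0.0, 2.0, 0.0]
cell_b = [9.0, 0.5, 3.0, 0.1]
rho = spearman(cell_a, cell_b)
```

Because only ranks enter the calculation, the coefficient is insensitive to monotonic rescaling of expression values, which is one reason it is often preferred for sparse single-cell data.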

Extended Data Fig. 6 Identification of 8CLC in scRNA-seq from other naïve PSC recipes.

a. UMAP visualization highlighting putative 8CLC and non-8CLC and the expression of the totipotency genes TPRX1, DUXA and ZNF280A for the indicated naïve media (refs. 13, 27, 28). Annotation was based on the averaged expression scores of the gene signature used for defining 8CLC in this study. All reference datasets used in this study are summarized in Supplementary Table 8. b. UMAP visualization highlighting putative 8CLC and non-8CLC and the expression of the totipotency genes TPRX1, DUXA and ZNF280A in stepwise e4CL-day 5 cells. c. Bubble plot representing the frequency of expression and average expression of representative pluripotency and totipotency genes in early human embryonic stages (refs. 20, 26) and the annotated putative 8CLC in the indicated culture conditions. OCT4 reads have been removed from reprogramming t2iLGö and RSeT datasets due to overlap with the signal of the overexpressed factor.
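
The annotation in panels a and b is described as an averaged expression score over a gene signature. As a hedged illustration, a per-cell score and a threshold-based label could be computed as follows. The three-gene signature, the threshold and the expression values are placeholders chosen for the example; the study's actual signature and cutoff are not given in this legend.

```python
def signature_score(cell_expr, signature):
    """Mean expression of the signature genes in one cell.

    Genes absent from the cell's expression dictionary count as 0.
    """
    return sum(cell_expr.get(g, 0.0) for g in signature) / len(signature)


def annotate_cells(cells, signature, threshold):
    """Label each cell 'putative 8CLC' or 'non-8CLC' by its signature score."""
    labels = {}
    for cell_id, expr in cells.items():
        score = signature_score(expr, signature)
        labels[cell_id] = "putative 8CLC" if score >= threshold else "non-8CLC"
    return labels


# Placeholder signature and threshold (assumptions, not the study's values).
SIGNATURE = ["TPRX1", "DUXA", "ZNF280A"]
cells = {
    "cell_1": {"TPRX1": 2.1, "DUXA": 1.8, "ZNF280A": 1.5, "CD24": 0.0},
    "cell_2": {"TPRX1": 0.0, "DUXA": 0.1, "CD24": 2.4},
}
labels = annotate_cells(cells, SIGNATURE, threshold=1.0)
```

A fixed threshold is the simplest possible rule; in practice the cutoff would need to be chosen per dataset, since expression scales differ between culture conditions and platforms.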

Extended Data Fig. 7 Generation of 8CLC from other human naïve PSC and 2CLC from mouse naïve PSC.

a. RT-qPCR for the indicated pluripotency and totipotency genes in primed H9 ESC cultured in PXGL47, PXGL+DZNep, PXGL+TSA and PXGL+DZNep+TSA, and PXGL cells cultured in e4CL (PXGL to e4CL) for five days. Data are presented as mean values ± SD of the fold-change compared to cells in PXGL medium. n = 3 biological replicates. b. RT-qPCR for the indicated pluripotency and totipotency genes in primed H9 ESC cultured in 5iLA17, 5iLA+DZNep, 5iLA+TSA, and 5iLA+DZNep+TSA, and 5iLA cells cultured in e4CL (5iLA to e4CL) for five days. Data are presented as mean values ± SD of the fold-change compared to cells in 5iLA medium. n = 3 biological replicates. c. RT-qPCR for the indicated pluripotency and totipotency genes in primed H9 ESC cultured in HENSM30, HENSM+DZNep, HENSM+TSA, HENSM+DZNep+TSA and HENSM cells cultured in e4CL (HENSM to e4CL) for five days. Data are presented as mean values ± SD of the fold-change compared to cells in HENSM medium. n = 3 biological replicates. d. Representative immunofluorescence images for pluripotency (NANOG, KLF17 and TFAP2C) and totipotency (TPRX1) markers in PXGL, 5iLA or HENSM-derived naïve H9 ESC cultured in e4CL for five days. Scale, 20 μm. Representative of three independent experiments. e. Representative fluorescent (green) images of MuERV-L-LTR-GFP reporter mouse ESC cultured in the indicated conditions for three days. Scale, 40 μm. Representative of three independent experiments. m, mouse. f. Mouse MuERV-L-LTR-GFP reporter ESC cultured in the indicated conditions for three days and subjected to FACS analysis to determine the percentage of GFP+ cells. m, mouse. g. Bar plot indicating the percentage of MuERV-L-LTR-GFP+ cells generated in the indicated conditions measured by FACS. Data are presented as mean values ± SD. n = 3 biological replicates. P value was calculated using two-tailed unpaired Student’s t-test. h. Heatmap showing the expression of pluripotency and 2C-embryo enriched genes in MuERV-L-LTR-GFP reporter mouse ESC cultured in the indicated conditions for three days. m, mouse.
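The RT-qPCR panels in this legend report mean values ± SD of the fold-change relative to a reference medium. The following is a minimal sketch of that summary, assuming the fold-change is each treated replicate divided by the mean of the control replicates; the legend does not specify the normalization, and the function name and input values are hypothetical.

```python
from statistics import mean, stdev

def fold_change_summary(treated, control):
    """Return (mean, sample SD) of treated replicates expressed as
    fold-change over the mean of the control replicates (assumed scheme)."""
    baseline = mean(control)
    folds = [value / baseline for value in treated]
    return mean(folds), stdev(folds)

# Hypothetical relative expression values, n = 3 biological replicates each.
control = [0.5, 1.0, 1.5]
treated = [4.0, 5.0, 6.0]

avg, sd = fold_change_summary(treated, control)
print(f"fold-change: {avg:.2f} ± {sd:.2f}")
```

The sample standard deviation (n - 1 denominator) is used here because three replicates are treated as a sample; this choice is also an assumption.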

Source data

Extended Data Fig. 8 DNA methylation (measured by RRBS) status of human PSC converted by 4CL and e4CL.

a. Violin plot showing global CpG methylation levels measured by RRBS of human PSC cultured in primed conditions (n = 2 technical replicates), 4CL (day 12) (n = 2 technical replicates), 5iLAF35 (n = 2 technical replicates), t2iLGö-NK236 (n = 3 technical replicates), NHSM29 (n = 3 technical replicates), stepwise e4CL (day 5) (n = 2 technical replicates) and direct e4CL (day 7) (n = 3 technical replicates), human 8C-embryo (n = 3 biological replicates with 2 technical replicates each) and ICM32 (n = 2 technical replicates). All reference datasets used in this study are summarized in Supplementary Table 8. Our dataset was generated using H9 ESC. For boxplots, the blue central line is the median and the boxes indicate the upper and lower quartiles. b. Heatmap showing CpG methylation levels at a panel of imprinting control regions of human PSC cultured in primed conditions, 4CL (day 12), 5iLAF17, t2iLGö37, NHSM29 and stepwise e4CL (day 5), ICM and postimplantation embryo32. c and d. Genome browser tracks showing CpG methylation levels at the indicated naïve pluripotency (blue) (c) and totipotency (red) loci (d) of PSC cultured in primed conditions, 4CL (day 12), 5iLAF, t2iLGö, NHSM, stepwise e4CL (day 5) and direct e4CL (day 7), human 8C-embryo and ICM. Each bar represents a single CpG and the height indicates the percentage of methylation.

Source data

Extended Data Fig. 9 DNA methylation (measured by WGBS) status of primed human PSC converted by 4CL and sorted 8CLC.

a. Bar plot comparing global CpG methylation levels in the human 8C-embryo, ICM and postimplantation embryo32, and primed ESC, 4CL-day 12 naïve ESC and sorted 8CLC from e4CL-day 5. All reference datasets used in this study are summarized in Supplementary Table 8. Our dataset was generated using H9 ESC. b. Comparative DNA methylation level at the TSS, gene body and TES in primed ESC, 4CL-day 12 naïve ESC and sorted 8CLC. c. Comparative DNA methylation level in different genomic regions in primed ESC, 4CL-day 12 naïve ESC and sorted 8CLC. d. Genome browser tracks visualization of DNA methylation levels measured by WGBS at the indicated naïve pluripotency (blue) and totipotency (red) loci of primed ESC, 4CL-day 12 naïve ESC and sorted 8CLC. Each bar represents a single CpG and the height indicates the percentage of methylation. e. Comparative DNA methylation level at human early embryo-enriched TE (MLT2A1, MLT2A2 and LTR12C) in primed ESC, 4CL-day 12 naïve ESC and sorted 8CLC. f. Genome-wide comparative DNA methylation profiles in human 8C-embryo, ICM, postimplantation embryo, primed ESC, 4CL-day 12 naïve ESC, sorted 8CLC from e4CL-D5, 5iLAF35 and NHSM29. Data from different biological replicates were pooled and binned into consecutive 10-kb tiles. Only tiles covered by all the datasets are shown.
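Panel f describes binning methylation data into consecutive 10-kb tiles and keeping only tiles covered by all datasets. The sketch below illustrates that idea under stated assumptions: each CpG is given as a (chromosome, position, methylation percentage) tuple, a tile value is the unweighted mean of its CpGs, and replicate pooling is omitted. The function names and example values are hypothetical and do not come from the authors' pipeline.

```python
from collections import defaultdict

TILE_SIZE = 10_000  # 10-kb tiles, as stated in the legend

def tile_methylation(cpgs, tile_size=TILE_SIZE):
    """Average per-CpG methylation (%) within consecutive fixed-size tiles.

    cpgs: iterable of (chromosome, position, methylation_percent).
    Returns {(chromosome, tile_index): mean methylation}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # tile -> [running total, CpG count]
    for chrom, pos, meth in cpgs:
        key = (chrom, pos // tile_size)
        sums[key][0] += meth
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def shared_tiles(datasets):
    """Tile every dataset and keep only tiles present in all of them."""
    tiled = {name: tile_methylation(cpgs) for name, cpgs in datasets.items()}
    common = set.intersection(*(set(tiles) for tiles in tiled.values()))
    return {name: {key: tiles[key] for key in sorted(common)}
            for name, tiles in tiled.items()}

# Hypothetical CpG calls for two samples.
datasets = {
    "primed": [("chr1", 1_500, 80.0), ("chr1", 9_000, 90.0), ("chr1", 25_000, 70.0)],
    "8CLC": [("chr1", 3_000, 40.0), ("chr1", 12_000, 30.0)],
}
result = shared_tiles(datasets)
print(result)
```

In this toy input only the first tile of chr1 is covered by both samples, so it is the only tile retained.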

Source data

Extended Data Fig. 10 DNA methylation and gene expression analysis in mouse embryo and 2CLC.

a. Genome browser tracks visualization of DNA methylation levels measured by RRBS at the indicated naïve pluripotency (blue) and totipotency (red) loci of mouse embryonic stages38, ESC and 2CLC39. Each bar represents a single CpG and the height indicates the percentage of methylation. All reference datasets used in this study are summarized in Supplementary Table 8. b. Violin plots showing the log normalized expression of representative naïve pluripotency and totipotency genes in mouse early embryonic stages23 (left panel), ESC and 2CLC40 (right panel).

Source data

Extended Data Fig. 11 Extended analysis of 8CLC chromatin landscape.

a. UMAP visualization of gene score for all genes in the scATAC-seq of primed ESC untreated (red) or converted by 4CL (day 12; blue) or stepwise e4CL (day 5; green). b and c. UMAP visualization based on panel a highlighting the gene score for primed (ZIC2), shared naïve pluripotency/8CLC (DPPA3) (b) and totipotency (c) genes projected onto each individual cell in the scATAC-seq of primed ESC untreated or converted by 4CL (day 12) and stepwise e4CL (day 5). d. UMAP visualization based on panel a highlighting the gene score for classical pluripotency genes (OCT4 and SOX2) projected onto each individual cell in the scATAC-seq of primed ESC untreated or converted by 4CL (day 12) and stepwise e4CL (day 5). e and f. UMAP visualization showing the gene score for totipotency (e) and classical pluripotency (f) genes projected onto the individual cells of stepwise e4CL-day 5 of Fig. 2a. g. UMAP visualization showing DNA binding motif deviation scores of the lineage specifiers (GSC, PITX1 and PITX2) and classical pluripotency (SOX2) gene projected onto the UMAP of Fig. 2a. h. Violin plot showing the log normalized expression of the lineage specifiers GATA2, GATA4, GATA6 and PITX2 in human embryonic stages and human ESC passage 10 (ref. 26) (left panel), and in primed ESC untreated or converted by 4CL (day 12) and e4CL (day 5 C5 [8CLC] and non-8CLC [all other clusters summed]) (right panel). Data were taken from the droplet-based scRNA-seq time course. All reference datasets used in this study are summarized in Supplementary Table 8. hESC, human ESC; P, passage. i. RT-qPCR showing upregulation of lineage specifiers (GATA6 and PITX2) in stepwise e4CL day 5 cells compared to primed ESC and 4CL-day 12 naïve ESC. H9 ESC were used to generate these data. Data are presented as mean values ± SD. n = 3 biological replicates. D, day. j. Violin plot showing the log normalized expression of indicated lineage specifiers in early mouse embryonic stages23 (left panel) and mouse 2CLC compared to naïve ESC39 (right panel). blast., blastocyst. k. Genome browser tracks showing chromatin accessibility, H3K27ac level42 and transcription factor DNA binding motif location at the naïve pluripotency KLF17 locus, multiple totipotency (TPRX1 and ZNF280A) loci and lineage specifier (GATA6) locus in the indicated cell types.

Source data

Extended Data Fig. 12 Cell type-specific chromatin accessibility and interactions.

a–d. Genome browser tracks visualizing cell type-enriched (primed ESC, 4CL-day 12 naïve ESC, and non-8CLC and 8CLC in cluster 5 of stepwise e4CL-day 5 cells) chromatin accessibility (upper panel) and co-accessibility loops (middle panel) for the indicated classical pluripotency (a), naïve pluripotency (b) and classical totipotency loci (c), and the 8C-enriched lineage specifier GATA6 locus (d). Color intensities of the loops represent the significance level of peak co-accessibility. A violin plot showing the log normalized expression level in the same cell types is displayed in the bottom panel for each locus. e. Average normalized chromatin accessibility signal for the early human embryo-enriched TE MLT2A2 in early human embryo bulk ATAC-seq16 and pseudobulk of our scATAC-seq data. All reference datasets used in this study are summarized in Supplementary Table 8. f. UMAP visualization of gene score of all genes in the scATAC-seq of primed H9 ESC untreated (green) or converted by 4CL (day 12; blue) and 8CLC (red) detected in Fig. 2a. g. Trajectory based on gene score and DNA binding motif distribution along the stepwise process of 8CLC induction from primed ESC to 8CLC passing through 4CL-day 12 naïve ESC projected onto the UMAP of panel f. h. UMAP visualization based on panel g highlighting the gene score for primed (ZIC2) and shared naïve pluripotency/8CLC (DPPA3) and totipotency (ZNF280A) genes projected onto each individual cell of primed ESC untreated or converted by 4CL (day 12) and 8CLC detected in Fig. 2a.

Source data

Extended Data Fig. 13 Extended multiomics analysis of 8CLC chromatin landscape.

a. UMAP visualization of scRNA-seq in the multiomics analysis of stepwise e4CL-day 5 cells showing annotated 8CLC (13.5%) and non-8CLC clusters. b. Violin plot showing the log normalized expression of representative pluripotency and totipotency genes for 8CLC and non-8CLC of panel a. c. UMAP visualization of scATAC-seq in the multiomics analysis of stepwise e4CL-day 5 cells showing annotated 8CLC (5.1%) and non-8CLC clusters. d. Genome browser tracks visualization of chromatin accessibility for the indicated pluripotency and totipotency loci in 8CLC and non-8CLC of panel c. e. Venn diagram showing the overlap between 8CLC detected by scATAC-seq and 8CLC detected by scRNA-seq. f. Genome browser tracks visualization of chromatin accessibility and violin plot showing the gene expression for the indicated genes in 8CLC (cluster 5) and non-8CLC (all other clusters summed) of Fig. 2h.

Source data

Extended Data Fig. 14 Dynamics of gene expression during the 8CLC conversion.

a. Hierarchical clustering of 1,497 variable genes among the indicated scRNA-seq (droplet based) samples along the pseudotime of stepwise e4CL induction of 8CLC from primed ESC identifies five gene groups. k = 5. b. Line plots showing the mean standardized gene score of the indicated scATAC-seq samples for each gene group of panel a. 8CLC (cluster 2) and non-8CLC (all other clusters summed) are based on Fig. 2a. The red line represents the median values for the group center. c. Enriched GO terms for biological processes in each of the five groups of panel a relative to all other gene groups. P value was calculated using hypergeometric test and adjusted for multiple testing using Benjamini-Hochberg correction. d. Pseudotime showing the expression pattern of the indicated genes for each cluster of panel a during the stepwise e4CL induction of 8CLC from primed ESC. e. Pseudotime showing the expression pattern of the same genes of panel d during early human embryonic development26. blast., blastocyst. All reference datasets used in this study are summarized in Supplementary Table 8. f. Hierarchical clustering of variable genes among the indicated scRNA-seq samples along the pseudotime of direct e4CL induction of 8CLC from primed ESC identifies five groups. k = 5. Each transcript expression is shown as a gray line; the black line represents the mean expression. g. Pseudotime showing the expression pattern of the indicated genes for each cluster of panel f during the direct e4CL induction of 8CLC from primed ESC. h. Enriched GO terms for biological processes in each of the five groups of panel f relative to all other gene clusters. P value was calculated using hypergeometric test and adjusted for multiple testing using Benjamini-Hochberg correction.
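The GO enrichment panels in this legend state that P values come from a hypergeometric test adjusted with the Benjamini-Hochberg correction. As an illustration only, the sketch below computes an upper-tail hypergeometric P value for the overlap between a gene group and a GO term, then applies the Benjamini-Hochberg step-up adjustment. The function names and all numbers are hypothetical, and the authors' actual software and background gene set are not specified here.

```python
from math import comb

def hypergeom_pvalue(overlap, group_size, term_size, background_size):
    """P(X >= overlap) when group_size genes are drawn without replacement
    from a background containing term_size genes annotated to the term."""
    upper = min(group_size, term_size)
    total = comb(background_size, group_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, group_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical example: one P value per GO term tested for a gene group.
raw = [
    hypergeom_pvalue(overlap=12, group_size=300, term_size=150, background_size=20_000),
    hypergeom_pvalue(overlap=3, group_size=300, term_size=200, background_size=20_000),
    hypergeom_pvalue(overlap=1, group_size=300, term_size=50, background_size=20_000),
]
print(benjamini_hochberg(raw))
```

Exact integer binomials from `math.comb` avoid floating-point underflow in the tail sum, at the cost of speed for very large gene sets.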

Source data

Extended Data Fig. 15 Networks and regulators controlling the 8CLC conversion.

a and b. Cytoscape network visualization of top hub genes (highlighted in yellow) from Fig. 3a and their targets with interconnections in 4CL-day 12 naïve ESC (a) and primed ESC (b). c. Enriched GO terms for biological processes in DEG of 8CLC in stepwise e4CL-day 5 cells of Fig. 3c compared to non-8CLC. P value was calculated using hypergeometric test and adjusted for multiple testing using Benjamini-Hochberg correction. d. Violin plot showing the log normalized expression pattern of 8C-embryo enriched ETCHbox family members in early human embryonic stages20 and primed ESC untreated or converted by 4CL (days 8 and 12) and non-8CLC and 8CLC of stepwise e4CL (day 5) of Extended Data Fig. 4a. D, day. All reference datasets used in this study are summarized in Supplementary Table 8. e and f. RT-qPCR showing the expression of the indicated genes in DPPA3 knockout (DPPA3−/−) ESC compared to wild-type after conversion from a primed state to 4CL-day 12 naïve PSC (e) or direct e4CL-day 7 cells (f). Data are presented as mean values ± SD. Two independent DPPA3 knockout clones were included. n = 3 biological replicates. g. RT-qPCR showing expression of the indicated genes in TPRX1 knockout (TPRX1−/−) compared to wild-type after conversion of ESC from a primed state to 4CL-day 12 PSC (left panel), stepwise e4CL-day 5 cells (middle panel) or direct e4CL-day 7 cells (right panel). Data are presented as mean values ± SD. Two independent TPRX1 knockout clones were included. n = 3 biological replicates. h. Enriched GO terms for downregulated genes in DPPA3 knockout (DPPA3−/−) ESC compared to wild type in direct e4CL-day 7. P value was calculated using hypergeometric test and adjusted for multiple testing using Benjamini-Hochberg correction. i. Enriched GO terms for downregulated genes in TPRX1 knockout (TPRX1−/−) ESC compared to wild type in stepwise e4CL-day 5. P value was calculated using hypergeometric test and adjusted for multiple testing using Benjamini-Hochberg correction.

Source data

Extended Data Fig. 16 Extended analysis of 8CLC induction regulators.

a. Representative immunostaining images demonstrating the nucleocytoplasmic translocation of UHRF1 (UHRF1-mCherry) after DPPA3 overexpression in H9 ESC cultured in primed condition, 4CL (day 12) or e4CL (day 5). Nuclei were counterstained with DAPI. Scale, 20 μm. Representative of three independent experiments. b. RT-qPCR (upper panels, data are presented as mean values ± SD, n = 3 biological replicates) and western blot (lower panels, representative of three independent experiments) analysis of UHRF1 mRNA and protein levels, respectively, in 4CL-day 12 naïve ESC (left), direct e4CL-day 7 cells (middle) and stepwise e4CL-day 5 cells (right) for the indicated samples. D, day. c. Violin plot showing global CpG DNA methylation levels assessed by RRBS in knockout DPPA3 ESC compared to wild-type after conversion from a primed state to 4CL-day 12 naïve PSC or direct e4CL-day 7 cells. DPPA3 knockout clone 1 was used. n = 3 technical replicates. The blue central line is the median, the boxes indicate the upper and lower quartiles. d. Unbiased clustering of differentially methylated CpG within 3 kb of the TSS in wild-type and DPPA3 knockout (DPPA3−/−) ESC converted from a primed state to 4CL-day 12 PSC (left panel; n = 3 technical replicates) or direct e4CL-day 7 cells (right panel; n = 3 technical replicates). DPPA3 knockout clone 1 was used. e. Genome browser tracks showing CpG DNA methylation levels at the naïve pluripotency KHDC3L locus (left panel) and the totipotency locus TRIM43 (right panel) in wild-type and DPPA3 knockout (DPPA3−/−) ESC converted from a primed state to 4CL-day 12 PSC. Each bar represents a single CpG and the height indicates the percentage of methylation. Bulk RNA-seq tracks showing KHDC3L and TRIM43 expression are also included. f. Unbiased clustering of differentially methylated CpG in the intergenic regions in wild-type and DPPA3 knockout (DPPA3−/−) ESC converted from a primed state to 4CL-day 12 PSC (left panel; n = 3 technical replicates) or direct e4CL-day 7 cells (right panel; n = 3 technical replicates). Representative genes closest to the differentially methylated intergenic regions are shown. DPPA3 knockout clone 1 was used. g. RT-qPCR showing expression of the indicated genes after shRNA lentivirus-mediated Dppa3 knockdown (shDppa3-2 and shDppa3-4) compared to control (shLuc) in mouse ESC cultured in serum+mLIF. Data are presented as mean values ± SD. n = 3 biological replicates. m, mouse. h. RT-qPCR showing the expression of the indicated genes in shRNA lentivirus-mediated knockdown of Dppa3 (shDppa3-2 and shDppa3-4) compared to control (shLuc) in mouse 2CLC converted in serum+mLIF with combined TSA and DZNep. Data are presented as mean values ± SD. n = 3 biological replicates. m, mouse.

Source data

Extended Data Fig. 17 Synergistic effect of TSA and DZNep on 8CLC induction.

a. Heatmaps showing the global gene expression differences in H9 ESC cultured in stepwise e4CL at day 5 with or without (w/o) either DZNep or TSA. Example genes are shown for each cluster. n = 2 technical replicates. 8CLC network hub genes are highlighted in red. b. Heatmaps showing the expression of totipotency genes in human PSC cultured in direct e4CL at day 7 with or without DZNep or TSA. n = 2 technical replicates. 8CLC network hub genes are highlighted in red.

Source data

Extended Data Fig. 18 Extraembryonic differentiation potency of primed PSC converted by 4CL and e4CL and sorted 8CLC.

a. Representative phase contrast images showing TSC derived from sorted 8CLC in e4CL-day 5, stepwise e4CL-day 5 cells or 4CL-day 12 naïve PSC. H9 ESC were used for these experiments. Scale, 40 μm. D, day. Representative of four independent experiments. b. Representative immunostaining images for GATA3 (green) and KRT7 (red) in TSC derived from primed ESC converted by 4CL (day 12) or stepwise e4CL (day 5). Nuclei were counterstained with DAPI. Scale, 50 μm. D, day. Representative of three independent experiments. c. RT-qPCR for the indicated genes in TSC derived from primed ESC converted by 4CL (day 12) or stepwise e4CL (day 5). Data are presented as mean values ± SD of fold-change compared to PSC in 4CL or stepwise e4CL. n = 3 biological replicates. d. DNA methylation plots for the ELF5 promoter in primed ESC untreated or converted by 4CL (day 12) and 4CL-day 12 naïve ESC-derived TSC. Percentages are the proportion of methylated (closed circles) to non-methylated (open circles) CpG sites. A representative experiment is shown. e. Hierarchical clustering of the bulk RNA-seq of primed ESC converted by 4CL (day 12), 4CL-day 12 naïve ESC-derived TSC and a primary human TSC dataset54. All reference datasets used in this study are summarized in Supplementary Table 8. Our dataset was generated using H9 ESC. f. Representative immunostaining images for CGB and SDC1 in SCT differentiated from 4CL-day 12 naïve PSC-derived TSC. Nuclei were counterstained with DAPI. Scale, 50 μm. Representative of three independent experiments. g. ELISA assay detecting the concentration of hCG secreted from 4CL-day 12 naïve ESC-derived TSC and SCT differentiated from 4CL-day 12 naïve ESC-derived TSC. Representative of two independent experiments.

Source data

Extended Data Fig. 19 Blastoid formation capacity of 4CL naïve PSC, e4CL-day 5 cells and sorted 8CLC.

a. Representative phase contrast images of sorted 8CLC-derived blastoids from day 1 to day 5. n = 5 biological replicates. Scale, 100 μm. b. Representative phase contrast images showing self-organized blastoids from sorted 8CLC, stepwise e4CL-day 5 cells and 4CL-day 12 naïve ESC. H9 ESC were used in all cases. D, day. Representative of five independent experiments. Scale, 50 μm. c. Representative immunofluorescence images of self-organized blastoids from 4CL-day 12 naïve ESC (left panel) and stepwise e4CL-day 5 cells (right panel). OCT4 and GATA2 were used as ICM and trophectoderm markers, respectively. Nuclei were counterstained with DAPI (blue). Scale, 20 μm. H9 ESC were used for these experiments. D, day. Representative of three independent experiments. d. UMAP visualization based on Fig. 4g highlighting the averaged expression scores of gene signatures and examples for the epiblast lineage in sorted 8CLC-derived blastoids and human E5–E7 blastocysts20. EPI, epiblast. All reference datasets used in this study are summarized in Supplementary Table 8. e. UMAP visualization based on Fig. 4g highlighting the averaged expression scores of gene signatures and examples for hypoblast lineage in sorted 8CLC-derived blastoids and human E5–E7 blastocysts. HYPO, hypoblast. f. UMAP visualization based on Fig. 4g highlighting the averaged expression scores of gene signatures and examples for the trophectoderm lineage in sorted 8CLC-derived blastoids and human E5–E7 blastocysts. TE, trophectoderm.

Source data

Extended Data Fig. 20 Interspecies chimeras using 4CL and e4CL and purified 8CLC.

a. Schematic showing the workflow of the interspecies chimera assay by aggregation method56. Eight to ten cells for each condition (primed PSC, 4CL-day 12 naïve PSC and purified 8CLC derived from TPRX1-EGFP KI HN10 ESC labelled with DsRed [HN10-DsRed]57) were aggregated with 8C-embryos of BDF1 mice. The aggregated embryos were cultured in vitro for 24 h to reach the blastocyst stage. Blastocysts were then implanted into ICR pseudopregnant mice and allowed to develop until E10.5. b. Summary of chimera assay results at blastocyst stage. TE, trophectoderm; D, day. c. Representative phase contrast (upper panel), fluorescence (DsRed, middle panel) and merged (lower panel) images showing the integration of DsRed+ cells from the indicated injected cell types into mouse blastocysts. Scale, 20 μm. Representative of three independent experiments. d. Summary of chimera assay results at E10.5 mouse embryos. Em, embryonic lineage; ExEm, extraembryonic lineage. D, day. e. Representative fluorescence (DsRed) images showing cell integration into fetus, placenta, and yolk sac at E10.5 mouse embryos. Scale, 1 mm. Representative of three independent experiments. f. Representative images showing integration of DsRed+ cells into mouse E10.5 embryos. Anti-DsRed antibodies were used together with anti-SOX1 (upper panel) or anti-GATA6 (lower panel) antibodies. Nuclei were counterstained with DAPI. Scale, 100 μm. Representative of three independent experiments. g. Representative immunofluorescence images showing integration of DsRed+ cells into mouse E10.5 placenta. Anti-DsRed antibodies were used together with anti-KRT7 antibodies. Nuclei were counterstained with DAPI. Scale, 20 μm. Representative of three independent experiments. h. Quantitative PCR testing the presence of human mitochondrial DNA in fetus, placenta, and yolk sac of mouse E10.5 embryos. Chimeric embryos (E10.5) containing 8CLC were used for this assay. Ms, mouse; Hu, human.

Source data

Extended Data Fig. 21 Teratoma formation by 4CL naïve PSC, e4CL-day 5 cells and sorted 8CLC.

a. Summary of teratoma assays using ESC primed and converted by 4CL or stepwise e4CL (day 5) and sorted 8CLC. D, day; P, passage. b. Images of teratomas derived from primed ESC converted by 4CL (passage 15, representative of two independent experiments) or stepwise e4CL (day 5, representative of eight independent experiments) and sorted 8CLC (representative of three independent experiments). H9 ESC were used for this assay. D, day; P, passage. c. Hematoxylin and eosin staining of teratomas derived from primed ESC converted by 4CL (passage 15, representative of two independent experiments) or stepwise e4CL (day 5, representative of eight independent experiments) and sorted 8CLC (representative of three independent experiments). Representative images of tissues corresponding to the three germ layers are shown. Scale, 50 μm. H9 ESC were used for this assay. D, day; P, passage. d. Bubble plot representing the frequency of expression and average expression of the marker set used to identify the cell types for each lineage. e. UMAP visualization based on Fig. 5a highlighting the identified cell types in the scRNA-seq datasets generated from primed ESC, 4CL naïve PSC, stepwise e4CL-day 5 cells and sorted 8CLC.

Source data

Extended Data Fig. 22 Extended analysis of teratoma formation.

a. UMAP visualization based on Fig. 5a highlighting the distribution and the expression of related markers in trophoblast cells from the indicated teratomas. b. UMAP visualization showing the annotated sub-clusters of extraembryonic lineages (left panel) and the relative contribution of teratomas from sorted 8CLC, e4CL-day 5 cells, 4CL naïve ESC and primed ESC (right panel). c. UMAP visualization showing the annotated sub-clusters of immune cells (left panel) and the contribution of primed PSC, 4CL naïve ESC, stepwise e4CL-day 5 cells and sorted 8CLC-derived teratoma cells (right panel). d. Bubble plot representing the frequency of expression and average expression of marker genes in different immune cell subtypes. e. Bar plot showing the contribution of sorted 8CLC, stepwise e4CL-day 5, 4CL naïve and primed PSC-derived teratomas to different immune cell subtypes.

Source data

Supplementary information

41586_2022_4625_MOESM1_ESM.pdf

Supplementary Figures. Supplementary Fig. 1 | Metaphase spreads for H9 ESCs cultured in 4CL or stepwise e4CL. a, Twenty metaphase spreads for H9 ESCs cultured in 4CL for 15 passages, related to Extended Data Fig. 1h. b, Twenty metaphase spreads for H9 ESCs cultured in stepwise e4CL (day 5), related to Extended Data Fig. 2d. Supplementary Fig. 2 | Uncropped western blot scans. Uncropped western blot scans of UHRF1 and actin, related to Extended Data Fig. 16b. Supplementary Fig. 3 | Representative flow cytometry gating. a, Negative control and TPRX1–GFP+ cells gating strategy, related to Fig. 1c, Extended Data Fig. 5h and all 8CLC sorting. b, MuERV-L::GFP+ cells in different media (serum+mLIF+DMSO/DZNep/TSA/DZNep+TSA) gating strategy, related to Extended Data Fig. 7f, g.

Reporting Summary

Supplementary Table 1

log2 fold-change (FC) of all DEGs relative to primed PSCs. log2(FC) of DEG in 4CL, multiple naive media, ICM, EPSCs, e4CL and 8C embryo relative to primed PSCs. Related to Extended Data Fig. 2a and Extended Data Fig. 2c.

Supplementary Table 2

Full list of stage-specific TEs. TE name, expression stage, average log2(FC) relative to all other stages, P value and adjusted P value are shown. Related to Extended Data Fig. 5g.

Supplementary Table 3

Full lists of genes and motifs along the pseudotime 8CLC induction trajectory. a, Full gene list based on gene score analysis along the pseudotime 8CLC induction trajectory of scATAC-seq in stepwise e4CL-D5 cells. Related to Fig. 2c. b, Full motif list based on motif analysis along the pseudotime 8CLC induction trajectory of scATAC-seq in stepwise e4CL-D5 cells. Related to Fig. 2d.

Supplementary Table 4

Full list of 1,497 variable genes along the pseudotime of stepwise e4CL induction. Correlation score with the pseudotime, gene name, n value (total gene number in the same group) and group number are shown. Related to Extended Data Fig. 14a.

Supplementary Table 5

Full list of stage-specific hub genes. Full list of 2,162 hub genes and their relative expression level in primed PSC, 4CL-D12 naive PSCs and sorted 8CLCs. Related to Fig. 3a.

Supplementary Table 6

Full list of DEGs between 8CLCs and non-8CLCs in scRNA-seq (droplet-based) of stepwise e4CL-D5 cells. Gene name, expression level in 8CLCs and non-8CLCs, average logFC relative to non-8CLCs, P value and adjusted P value are shown. Related to Fig. 3c.

Supplementary Table 7

Summary of teratoma scRNA-seq and annotation. a, Summary of teratoma scRNA-seq. Numbers of sequenced cells and sequenced teratomas are shown for primed, 4CL, stepwise e4CL and sorted 8CLCs. Related to Fig. 5, Extended Data Fig. 21 d, e and Extended Data Fig. 22. b, Summary of teratoma cell-type annotation. Annotated cell type, starting cell state, annotated cell number and percentage are shown. Related to Fig. 5a, b and Extended Data Fig. 21e. c, Summary of teratoma germ layer annotation. Annotated germ layer, starting cell state, annotated cell number and percentage are shown. Related to Fig. 5c. d, Summary of teratoma cell-subtype annotation. Annotated trophoblast and immune cell subtypes, starting cell state, annotated cell number and percentage are shown. Related to Fig. 5d, e and Extended Data Fig. 22.

Supplementary Table 8

Published datasets used in this study. Summary of published datasets used in this study. Name of dataset, accession number, citation information and DOI are shown.

Supplementary Table 9

PCR primers and sgRNA, siRNA or shRNA target sequences. Summary of PCR primers and sgRNA, siRNA or shRNA target sequences. RT–qPCR primers, genotyping PCR primers, mitochondrial DNA PCR primers, ELF5 promoter bisulfite PCR primers, DPPA3 and TPRX1 sgRNA sequences, DPPA3 siRNA sequence, and mouse Dppa3 shRNA sequences are shown.

Supplementary Table 10

Details of antibodies used in this study, including antibody names, company, catalogue numbers, dilution ratio and manufacturer links.

Supplementary Table 11

Processed datasets with link and description. Processed datasets of this study can be accessed on the Figshare website. Processed dataset names, web links for datasets, data types and related figures are summarized.

Peer Review File

Source data

Source Data Fig. 1

Source Data Fig. 2

Source Data Fig. 3

Source Data Fig. 4

Source Data Fig. 5

Source Data Extended Data Fig. 1

Source Data Extended Data Fig. 2

Source Data Extended Data Fig. 3

Source Data Extended Data Fig. 4

Source Data Extended Data Fig. 5

Source Data Extended Data Fig. 6

Source Data Extended Data Fig. 7

Source Data Extended Data Fig. 8

Source Data Extended Data Fig. 9

Source Data Extended Data Fig. 10

Source Data Extended Data Fig. 11

Source Data Extended Data Fig. 12

Source Data Extended Data Fig. 13

Source Data Extended Data Fig. 14

Source Data Extended Data Fig. 15

Source Data Extended Data Fig. 16

Source Data Extended Data Fig. 17

Source Data Extended Data Fig. 18

Source Data Extended Data Fig. 19

Source Data Extended Data Fig. 20

Source Data Extended Data Fig. 21

Source Data Extended Data Fig. 22

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mazid, M.A., Ward, C., Luo, Z. et al. Rolling back human pluripotent stem cells to an eight-cell embryo-like stage. Nature 605, 315–324 (2022). https://doi.org/10.1038/s41586-022-04625-0

  • DOI: https://doi.org/10.1038/s41586-022-04625-0
