Chromatin accessibility dynamics across C. elegans development and ageing

Version of Record

Accepted for publication after peer review and revision.

Version of Record published: November 12, 2018
Accepted Manuscript published: October 26, 2018
Accepted: October 25, 2018
Received: April 7, 2018

Abstract

An essential step for understanding the transcriptional circuits that control development and physiology is the global identification and characterization of regulatory elements. Here, we present the first map of regulatory elements across the development and ageing of an animal, identifying 42,245 elements accessible in at least one Caenorhabditis elegans stage. Based on nuclear transcription profiles, we define 15,714 protein-coding promoters and 19,231 putative enhancers, and find that both types of element can drive orientation-independent transcription. Additionally, more than 1000 promoters produce transcripts antisense to protein coding genes, suggesting involvement in a widespread regulatory mechanism. We find that the accessibility of most elements changes during development and/or ageing and that patterns of accessibility change are linked to specific developmental or physiological processes. The map and characterization of regulatory elements across C. elegans life provides a platform for understanding how transcription controls development and ageing.

https://doi.org/10.7554/eLife.37344.001

Introduction

The genome encodes the information for organismal life. Because the deployment of genomic information depends in large part on regulatory elements such as promoters and enhancers, their identification and characterization are essential for understanding genome function and its regulation.

Regulatory elements are typically depleted for nucleosomes, which facilitates their identification using sensitivity to digestion by nucleases such as DNase I or Tn5 transposase, termed DNA accessibility (Sabo et al., 2006; Crawford et al., 2006; Buenrostro et al., 2013). In different organisms, large repertoires of regulatory elements have been determined by profiling DNA accessibility genome-wide in different cell types and developmental stages (Thomas et al., 2011; Kharchenko et al., 2011; Thurman et al., 2012; Yue et al., 2014; Kundaje et al., 2015; Daugherty et al., 2017; Ho et al., 2017). However, no study has yet investigated regulatory element usage across the life of an animal, from the embryo to the end of life. Such information is important, because different transcriptional programs operate in different periods of life and ageing. Caenorhabditis elegans is ideal for addressing this question, as it has a simple anatomy, well-defined cell types, and short development and lifespan. A map of regulatory elements and their temporal dynamics would facilitate understanding of the genetic control of organismal life.

Active regulatory elements have previously been shown to have different transcriptional outputs and chromatin modifications (Andersson, 2015; Kim and Shiekhattar, 2015). Transcription is initiated at both promoters and enhancers, with most elements having divergent initiation events from two independent sites (Core et al., 2008; Kim et al., 2010; De Santa et al., 2010; Koch et al., 2011; Chen et al., 2013). However, promoters and enhancers differ in the production of stable transcripts. At protein-coding promoters, productive transcription elongation produces a stable transcript, whereas enhancers and the upstream divergent initiation from promoters generally produce short, aborted, unstable transcripts (Core et al., 2014; Andersson et al., 2014; Rennie et al., 2017).

Promoters and enhancers have also been shown to be differently enriched for specific patterns of histone modifications. In particular, promoters often have high levels of H3K4me3 and low levels of H3K4me1, whereas enhancers tend to have the opposite pattern of higher H3K4me1 and lower H3K4me3 (Heintzman et al., 2007; Heintzman et al., 2009). However, in human and Drosophila cell lines, it was observed that H3K4me3 and H3K4me1 levels correlate with levels of transcription at regulatory elements, rather than whether the element is a promoter or an enhancer (Core et al., 2014; Henriques et al., 2018; Rennie et al., 2018). Further, analyses of genes that are highly regulated in development showed that their promoters lacked chromatin marks associated with activity (including H3K4me3), even when the associated genes are actively transcribed (Zhang et al., 2014; Pérez-Lluch et al., 2015). Therefore, stable elongating transcription, rather than histone modification patterns, appears to be the defining feature that distinguishes active promoters from active enhancers (reviewed in Andersson, 2015; Andersson et al., 2015; Kim and Shiekhattar, 2015; Henriques et al., 2018; Rennie et al., 2018).

Regulatory elements have not been systematically mapped and annotated in C. elegans. Promoter identification has been hampered because the 5’ ends of ~70% of protein-coding transcripts are trans-spliced to a 22 nt leader sequence (Allen et al., 2011). Because the region from the transcription initiation site to the trans-splice site (the ‘outron’) is removed and degraded, the 5’ end of the mature mRNA does not mark the transcription start site. To overcome this difficulty, previous studies identified transcription start sites for some genes through profiling transcription initiation and elongation in nuclear RNA or by inhibiting trans-splicing at a subset of stages (Gu et al., 2012; Chen et al., 2013; Kruesi et al., 2013; Saito et al., 2013). In addition, two recent studies used ATAC-seq or DNase I hypersensitivity to map regions of accessible chromatin in some developmental stages, and predicted element function by proximity to first exons or chromatin state (Daugherty et al., 2017; Ho et al., 2017).

Toward building a comprehensive map of regulatory elements and their use during the life of an animal, here we used multiple assays to systematically identify and annotate accessible chromatin in the six C. elegans developmental stages and at five time points of adult ageing. Strikingly, most elements undergo a significant change in accessibility during development and/or ageing. Clustering the patterns of accessibility changes in promoters reveals groups that act in shared processes. This map makes a major step toward defining regulatory element use during C. elegans life.

Results and discussion

Defining and annotating regions of accessible DNA

To define and characterize regulatory elements across C. elegans life, we collected biological replicate samples from a developmental time course and an ageing time course (Figure 1A). The developmental time course consisted of wild-type samples from six developmental stages (embryos, four larval stages, and young adults). For the ageing time course, we used glp-1(e2144ts) mutants to prevent progeny production, since they lack germ cells at the restrictive temperature. Five adult ageing time points were collected, starting from the young adult stage (day 1) and ending at day 13, just before the major wave of death.

Figure 1A outlines the datasets generated. For all developmental and ageing time points, we used ATAC-seq to identify accessible regions of DNA. We also sequenced strand-specific nuclear RNA (>200 nt long) to determine regions of transcriptional elongation, because previous work demonstrated that this approach could capture outron signal linking promoters to annotated exons (Chen et al., 2013; Kruesi et al., 2013; Saito et al., 2013). For the development time course, we additionally sequenced short (<100 nt) capped nuclear RNA to profile transcription initiation, profiled four histone modifications to characterize chromatin state (H3K4me3, H3K4me1, H3K36me3, and H3K27me3), and performed a DNase I concentration course to investigate the relative accessibility of elements. Micrococcal nuclease (MNase) data were also collected for the embryo stage. As previously noted by others, we found that the ATAC-seq accessibility signal is similar to that observed using a low concentration of DNase I or MNase, and that the ATAC-seq data have the highest signal-to-noise ratio (Buenrostro et al., 2013; Figure 1—figure supplement 1A,C).

Figure 1 with 3 supplements

Overview of the project.

(A) Overview of genome-wide assays and time points of developmental and ageing samples. For development samples, chromatin accessibility, transcription initiation, productive elongation, and chromatin state were profiled in six stages of wild-type animals (embryos, four larval stages, young adults). For ageing samples, chromatin accessibility and productive transcription elongation were profiled in five time points of sterile adult glp-1 mutants (Day 1/Young adult, Day 2, Day 6, Day 9, Day 13). (B) Representative screenshot of normalized genome-wide accessibility profiles in the eleven samples (chrIII:9,041,700–9,196,700, 154 kb).

https://doi.org/10.7554/eLife.37344.002

Figure 1—source data 1. Accessible sites identified using ATAC-seq.
● chrom_ce10, start_ce10, end_ce10: location of the accessible site (bed-style coordinates, ce10).
● atac_%stage_height: maximum SPMR-normalized ATAC-seq signal at the peak in %stage (one of wt_emb, wt_l1, wt_l2, wt_l3, wt_l4, wt_ya, glp1_d1, glp1_d2, glp1_d6, glp1_d9, glp1_d13).
● atac_source: source of the ATAC-seq peak call (see Materials and methods).
  ○ atac_wt_pe: wt (developmental) ATAC-seq treated as paired-end.
  ○ atac_wt_se: wt (developmental) ATAC-seq treated as single-end.
  ○ atac_glp1_se: glp-1 (ageing) ATAC-seq, single-end only.
https://doi.org/10.7554/eLife.37344.006
Download elife-37344-fig1-data1-v2.txt

To define sites that are accessible in at least one developmental or ageing stage, focal peaks of significant ATAC-seq enrichment were identified across all developmental and ageing samples, yielding 42,245 individual elements (Figure 1B, Figure 1—source data 1; see Materials and methods for details). Of these, 72.8% overlap a transcription factor binding site (TFBS) mapped by the modENCODE or modERN projects (Araya et al., 2014; Kudron et al., 2018), supporting their potential regulatory functions (Figure 2—figure supplement 1A).

Two recent studies reported accessible regions in C. elegans identified using DNase I hypersensitivity or ATAC-seq (Ho et al., 2017; Daugherty et al., 2017). The 42,245 accessible elements defined here overlap 33.7% of the DNase I hypersensitive sites of Ho et al. (2017) and 47.9% of the ATAC-seq peaks of Daugherty et al. (2017) (Figure 2—figure supplement 1B,C). Examination of the non-overlapping sites from pairwise comparisons suggests that differences in peak calling methods account for some of the differences. Accessible regions determined here required a focal peak of enrichment, whereas the other studies found both focal sites and broad regions with increased signal. Consistent with these differences in methods, sites unique to the two studies are enriched for exonic chromatin, depleted for both TFBS and transcription initiation sites, and often found in broad regions of increased accessibility across transcriptionally active gene bodies (Figure 2—figure supplement 1B–E). Similarly, using MACS2 to call peaks on the ATAC-seq data reported here, as used by Daugherty et al. (2017), identified a group of exon-enriched sites not found using our peak calling method (Figure 2—figure supplement 2A). However, the fraction of such sites is relatively small, indicating that other differences also contribute, such as signal-to-noise ratio or nematode growth methods.

To functionally classify elements, we annotated each of the 42,245 elements for transcription initiation and transcription elongation signals on both strands (Figure 2A,B; Figure 2—source data 1; see Materials and methods for details). Overall, 37.1% of elements had promoter activity, defined by a significant increase in transcription elongation signal originating at the element in at least one stage and one direction. Promoters were assigned to protein-coding genes or pseudogenes if continuous transcription elongation signal extended from the element to an annotated first exon (covering the outron). Promoters were unassigned if transcription elongation signal was not linked to an annotated gene. We observed detectable transcription initiation signal at 82.3% of elements (Figure 2—source data 1); those with no significant transcription elongation signal in either direction were annotated as putative enhancers (hereafter referred to as ‘enhancers’). The remaining elements had no detectable transcriptional activity or overlapped ncRNAs (tRNA, snRNA, snoRNA, rRNA, or miRNA) (Figure 2B; Figure 2—source data 1). We found that accessible sites are preferentially located within outrons or intergenic regions (Figure 2—figure supplement 3).
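The decision rule in the paragraph above can be sketched as a small function. This is only an illustration of the stated logic, not the pipeline used for the annotation: the function name, the boolean inputs, and the returned labels are hypothetical, the significance tests behind each boolean are described in Materials and methods, and elements overlapping ncRNAs are not handled here.

```python
def annotate_element(elongation_fwd, elongation_rev, initiation):
    """Toy classification of one accessible element (hypothetical helper).

    elongation_fwd / elongation_rev: True if significant transcription
        elongation signal originates at the element on that strand in at
        least one stage.
    initiation: True if transcription initiation signal is detected.
    """
    # Promoter activity: elongation signal in at least one direction.
    if elongation_fwd or elongation_rev:
        return "promoter"
    # Initiation without elongation in either direction: putative enhancer.
    if initiation:
        return "putative_enhancer"
    # Neither signal detected.
    return "no_activity"
```

For example, an element with initiation signal but no elongation on either strand would be labelled `putative_enhancer` under this sketch.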

Figure 2 with 6 supplements

Annotation of accessible elements.

(A) Top, strand-specific nuclear RNA in each developmental stage monitors transcription elongation; plus strand, blue; minus strand, red. Below is transcription initiation signal, accessible elements (colored by annotation), and gene models (chrI:12,675,000–12,683,400, 8.4 kb). The left side of each element is colored by the reverse strand annotation whereas the right side of an element is colored by the forward strand annotation (color key at bottom). (B) Left, distribution of accessible sites in four categories: promoters (one or both strands), putative enhancers, no activity, or overlapping a tRNA, snRNA, snoRNA, rRNA, or miRNA. Right, distribution of different types of promoter annotations. (C) Left, distribution of the number of promoters and enhancers per gene; right, boxplot shows that genes with more promoters also have more enhancers.

https://doi.org/10.7554/eLife.37344.007

Figure 2—source data 1. Regulatory annotation of accessible sites.
● chrom_ce10, start_ce10, end_ce10: location of the accessible site (bed-style coordinates, ce10).
● chrom_ce11, start_ce11, end_ce11: as above, but lifted over to ce11.
● annot: final regulatory element type, obtained by combining strand-specific transcription patterns (see Materials and methods).
● annot_%strand: annotation of the strand-specific transcription patterns at the site (%strand is either fwd or rev).
● promoter_gene_id_%strand, promoter_locus_id_%strand, promoter_gene_biotype_%strand: WormBase gene id, locus id, biotype for sites annotated as coding_promoter, pseudogene_promoter or non-coding_RNA on %strand.
● associated_gene_id, associated_locus_id: WormBase gene id, locus id of genes whose gene body or outron region overlaps the site. These are defined for sites annotated as unassigned_promoter, putative_enhancer or other_element. If a site overlaps multiple genes, all overlaps are reported, separated by commas.
● tss_%strand_ce10: representative transcription initiation mode (Materials and methods) on %strand, ce10 coordinates.
● tss_%strand_ce11: as above, but lifted over to ce11.
● scap_%strand_passed: True or False based on whether the site has reproducible transcription initiation (Materials and methods).
● lcap_%stage_%strand_passed_jump: True or False based on whether the site passed the jump test for elongating transcription (Materials and methods; %stage is one of wt_emb, wt_l1, wt_l2, wt_l3, wt_l4, wt_ya, glp1_d1, glp1_d2, glp1_d6, glp1_d9, glp1_d13).
● lcap_%stage_%strand_passed_incr: True or False based on whether the site passed the incr test for elongating transcription (Materials and methods).
https://doi.org/10.7554/eLife.37344.014
Download elife-37344-fig2-data1-v2.txt

Within the promoter class, we defined 13,596 protein-coding promoters: 11,478 elements are unidirectional promoters and 2118 are divergent promoters that drive expression of two oppositely oriented protein-coding genes (Figure 2—source data 1). In total, promoters were defined for 11,196 protein-coding genes, with 3000 genes having >1 promoter (Figure 2C). The protein-coding promoter annotations show good overlap with four sets of TSSs previously defined based on mapping transcription (Chen et al., 2013; Kruesi et al., 2013; Saito et al., 2013; Gu et al., 2012; 76.8–85.1%; Figure 2—figure supplement 5). Enhancers (n = 19,231) were assigned to a gene if they are located within the region from its most upstream promoter to its gene end; 6668 genes have at least one associated enhancer, and 3240 genes have >1 enhancer (Figure 2C).
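The enhancer-to-gene assignment rule stated above (an enhancer belongs to a gene if it lies between the gene's most upstream promoter and the gene end) can be illustrated as follows. This is a minimal sketch under simplifying assumptions that are not taken from the text: each element is reduced to a single coordinate, and the function name and arguments are hypothetical.

```python
def enhancer_in_gene_region(enhancer_pos, promoter_positions, gene_end, strand):
    """Return True if an enhancer falls within a gene's regulatory span.

    Assumption for illustration: elements are represented by one
    coordinate each (for example their midpoint).
    """
    if strand == "+":
        # Most upstream promoter has the smallest coordinate.
        start, end = min(promoter_positions), gene_end
    else:
        # On the minus strand, upstream means larger coordinates.
        start, end = gene_end, max(promoter_positions)
    return start <= enhancer_pos <= end
```

With promoters at 100 and 120 and a gene end at 500 on the plus strand, an enhancer at 150 would be assigned to the gene and one at 90 would not.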

The locations of unassigned promoters (n = 3106) suggest different potential functions. A large fraction (35.1%) generate antisense transcripts within the body of a protein coding gene, suggesting a possible role in regulating expression of the associated gene (Figure 2—figure supplement 5). Another large group (38.4%) produce antisense transcripts from an element that is a protein coding promoter in the sense direction, a pattern seen in many mammalian promoters, termed upstream antisense (uaRNA) or promoter upstream (PROMPT) transcripts (Figure 2—figure supplement 5; Preker et al., 2008; Flynn et al., 2011; Sigova et al., 2013). Most of the rest (21.7%) are intergenic and may define promoters for unannotated transcripts.

Patterns of histone marks at promoters and enhancers

Promoters and enhancers show general differences in patterns of histone modifications, such as higher levels of H3K4me3 at promoters or H3K4me1 at enhancers, and chromatin states are frequently used to define elements as promoters or enhancers (Heintzman et al., 2007; Ernst and Kellis, 2010; Ernst et al., 2011; Kharchenko et al., 2011; Hoffman et al., 2013; Daugherty et al., 2017). However, it has been shown that H3K4me3 levels correlate with transcriptional activity rather than with function (Pekowska et al., 2011; Core et al., 2014; Andersson et al., 2014; Henriques et al., 2018; Rennie et al., 2018), suggesting that defining regulatory elements solely based on chromatin state is likely to lead to incorrect annotations.

To further investigate the relationship between chromatin marking and element function, we mapped four histone modifications at each developmental stage (H3K4me3, H3K4me1, H3K27me3, H3K36me3) and examined their patterns around coding promoters and enhancers. As expected, many coding promoters had high levels of H3K4me3 and were depleted for H3K4me1 (Figure 3A). Moreover, enhancers had generally low levels of H3K4me3 and higher levels of H3K4me1 than promoters (Figure 3A). However, many elements did not have these patterns. For example, about 50% of coding promoters have a high level of H3K4me1 and no or low H3K4me3 marking (Figure 3A).

Figure 3 with 1 supplement

Chromatin state and sequence features of promoters and enhancers.

(A) Heatmaps of indicated histone modifications and CV values at coding promoters (top), and enhancers (bottom), aligned at element midpoints. Elements are ranked by mean H3K4me3 levels. Low CV values indicate broad expression across development and cell types and high CV values indicate regulated expression. Promoters of genes with low CV values have high H3K4me3 levels. (B) Distribution of initiator Inr motif, TATA motif, and CpG content at coding promoters and enhancers, separated by H3K4me3 level (top, middle, and bottom thirds). Grey-shaded regions represent 95% confidence intervals of the sample mean at the genomic position with the highest signal.

https://doi.org/10.7554/eLife.37344.015

To investigate the nature of these patterns, we examined coefficients of variation of gene expression (CV; Gerstein et al., 2014) of the associated genes. Genes with broad stable expression across cell types and development, such as housekeeping genes, have low variation of gene expression levels and hence a low CV value. In contrast, genes with regulated expression, such as those expressed only in particular stages or cell types have a high CV value. We found a strong inverse correlation between a gene’s CV value and its promoter H3K4me3 level (−0.64, p<10⁻¹⁵, Spearman's rank correlation; Figure 3; Figure 3—figure supplement 1A). Furthermore, promoters with low or no H3K4me3 marking are enriched for H3K27me3 (Figure 3; Figure 3—figure supplement 1A), which is associated with regulated gene expression (Tittel-Elmer et al., 2010; Pérez-Lluch et al., 2015; Evans et al., 2016). These results support the view that H3K4me3 marking may be a specific feature of promoters with broad stable activity, consistent with the finding that active promoters of regulated genes lack H3K4me3 (Pérez-Lluch et al., 2015). The profiling here was done in whole animals, which may have precluded detecting modifications occurring in a small number of nuclei. Nevertheless, the results indicate that chromatin state alone is not a reliable metric for element annotation. Histone modification patterns at many promoters resemble those at enhancers, and vice versa.
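The two quantities used in this comparison, a per-gene coefficient of variation and a Spearman rank correlation, can be computed as in the sketch below. It is an illustration only: the CV values analysed here come from Gerstein et al. (2014), whose exact definition is not reproduced in this section, so the formula below (population standard deviation divided by mean) is an assumption, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV as standard deviation / mean (assumed definition)."""
    return pstdev(values) / mean(values)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

A negative coefficient, as reported above, means that genes with higher CV values tend to have lower promoter H3K4me3 levels.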

Promoters and enhancers also share sequence features. Both are enriched for initiator (Inr) elements, although enhancers have a slightly lower Inr frequency (Figure 3B and Figure 3—figure supplement 1B). Promoters and enhancers are also both enriched for CpG dinucleotides (Figure 3B and Figure 3—figure supplement 1B). Promoters with high H3K4me3 and low CV values (broadly expressed genes) have the highest CpG content, whereas those with low H3K4me3 and high CV values have the lowest CpG content (Figure 3B and Figure 3—figure supplement 1B). However, promoters differ from enhancers in the presence of TATA motifs, which occur predominantly at genes with low H3K4me3 and high CV values (i.e. with regulated expression; Figure 3B and Figure 3—figure supplement 1B).

Promoters and enhancers can drive gene expression in an orientation independent manner

To validate the promoter annotations, we compared them with studies where small regions of DNA had been defined as promoters using transgenic assays. These comprised 10 regions defined based on transcription initiation signal (Chen et al., 2014), nine regions defined based on proximity to a germ line gene (Merritt et al., 2008), and four defined by proximity to the first exon of a muscle-expressed gene (Hunt-Newbury et al., 2007). Of these 23 regions, 21 overlap an element in our set of accessible sites, 19 of which are annotated as protein-coding promoters (Figure 2—figure supplement 6A). One of the remaining two is annotated as an enhancer and the other overlaps an accessible element for which no transcriptional signal was detected. We further directly tested three elements annotated as promoters (for the hlh-2, ztf-11 and bed-3 genes), and found that all three drove robust expression of a histone-GFP reporter (Figure 2—figure supplement 6A). Overall, there is good concordance between promoter annotation and promoter activity.

Most of the elements annotated as protein-coding promoters are flanked by bidirectional transcription initiation signal (74.0%), similar to the pattern seen in mammals. Most (84.4%; 11,478 of 13,596) are unidirectional promoters, producing a protein-coding transcript in one direction, but no stable transcript from the upstream initiation site. To test whether such upstream antisense initiation sites could function as promoters, we inverted the orientation of two active unidirectional promoters (ztf-11 and F58D5.5). If the lack of in vivo transcription elongation were a property of the element or initiation site itself, the GFP fusion should not be expressed. However, we observed that the two inverted unidirectional promoters both drove GFP expression. The expression patterns generated were similar in both orientations, although the ztf-11 promoter was weaker when inverted (Figure 2—figure supplement 6B,C). These results suggest that signals for productive elongation occur downstream of the transcription initiation site.

Similar to the upstream antisense transcription initiation observed at promoters, enhancers also show transcription initiation signals but generally do not produce stable transcripts (Core et al., 2014; Andersson et al., 2014). Previous studies have reported that some enhancers can function as promoters in transgenic assays and also at endogenous loci (Kowalczyk et al., 2012; Leung et al., 2015; Nguyen et al., 2016; van Arensbergen et al., 2017; Mikhaylichenko et al., 2018). To assess the potential promoter activities of C. elegans enhancers, we directly fused 12 putative enhancers that had transcription initiation signal in embryos to a histone-GFP reporter gene and assessed transgenic strains for embryo expression. Two of the tested enhancers are located in introns, and one of these, from the bro-1 gene, has been previously validated as an enhancer (Brabin et al., 2011); most of the others are associated with the hlh-2 or ztf-11 genes. We found that 10 of 12 tested regions drove reporter expression in embryos, including the two intronic enhancers (Figure 2—figure supplement 6B,C). Whereas the hlh-2 and ztf-11 promoters drove strong, broad expression, the associated enhancers were active in a smaller number of cells and expression levels were overall lower (Figure 2—figure supplement 6B,C). We also tested two enhancers in inverted orientation and found that both showed similar activity in both orientations, as observed for the two tested promoters (Figure 2—figure supplement 6B,C). The percentage of enhancers that functioned as active promoters is higher than that observed in a cell-based assay (Nguyen et al., 2016), possibly because all cell types are tested in an intact animal. Episomal-based assays have also been reported to underestimate activity (Inoue et al., 2017).

Extensive regulation of chromatin accessibility in development

We observed extensive changes in chromatin accessibility across development, with most elements showing a significant difference within the developmental time course (71%; ≥2-fold change, FDR < 0.01; Figure 4—source data 1; see Materials and methods). To investigate how accessibility relates to gene expression, we focused on the 13,596 elements annotated as protein-coding promoters. Of these, 10,199 displayed significant changes in accessibility in development, with the remaining 3397 promoters classified as having stable accessibility. We note that the detected changes could be due to regulation of accessibility, or alternatively to changes in cell number during development (e.g. the number of germ line nuclei increases from two in L1 larvae to ~2000 in young adults).

We reasoned that promoters having similar patterns of accessibility changes over development may regulate genes that function in shared processes and be regulated by shared sets of transcription factors. To investigate this, we applied k-medoid clustering to the 10,199 promoters with developmental changes in accessibility, defining 16 clusters (Figure 4A, Figure 4—figure supplement 1, Figure 4—figure supplement 2, and Figure 4—source data 1; see Materials and methods). Within clusters, we observed that promoter accessibility and nuclear RNA levels are usually correlated (mean r = 0.47 (sd = 0.11) across all clusters), indicating that accessibility is a good metric of promoter activity and overall gene expression (Figure 4—figure supplement 1 and Figure 4—figure supplement 2).
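The quantity that is clustered, relative promoter accessibility, is defined in the Figure 4 legend as log2 of the depth-normalized ATAC-seq coverage at a time point divided by the mean coverage across the time series. A minimal sketch of that transformation is shown below; the function name is hypothetical, and the input is assumed to be already depth-normalized and strictly positive (a zero would need a pseudocount, which is not described here).

```python
from math import log2


def relative_accessibility(coverage):
    """log2(coverage at each time point / mean coverage across the series).

    Assumes depth-normalized, strictly positive coverage values;
    log2 raises ValueError on zero.
    """
    mean_cov = sum(coverage) / len(coverage)
    return [log2(c / mean_cov) for c in coverage]
```

Because each promoter is scaled by its own mean, promoters with very different absolute accessibility but the same temporal shape yield the same profile, which is what allows them to be grouped by dynamics.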

Figure 4 with 3 supplements

Shared dynamics of promoter accessibility in development and ageing.

Clusters of promoters with shared relative accessibility patterns across (A) development or (B) ageing. Relative promoter accessibility is log2 of the depth-normalized ATAC-seq coverage at a given time point divided by the mean ATAC-seq coverage across the time series (see Materials and methods). The percentage of associated genes that have enriched expression in the indicated tissues was determined from single-cell L2 larval RNA-seq data (Cao et al., 2017; see Materials and methods). Right hand panels show examples of GO terms enriched in genes associated with development or ageing clusters.

https://doi.org/10.7554/eLife.37344.017

Figure 4—source data 1. Element accessibility dynamics and promoter accessibility clusters in development and ageing.
● chrom_ce10, start_ce10, end_ce10: location of the accessible site (bed-style coordinates, ce10).
● devel_is_dynamic: True or False based on whether the site shows differential accessibility between any two developmental stages.
● ageing_is_dynamic: True or False based on whether the site shows differential accessibility between any two ageing time points.
● devel_prom_cluster_label: assigned developmental accessibility promoter cluster.
● ageing_prom_cluster_label: assigned ageing accessibility promoter cluster.
● HOTness: based on the number of transcription factors overlapping the accessible site, either HOT (19 or more factors), cold (between 1 and 18 factors) or none (zero factors).
● factor_count: number of transcription factors with binding sites overlapping the accessible site.
● factor_names: comma-separated list of the names of transcription factors with binding sites overlapping the accessible site.
https://doi.org/10.7554/eLife.37344.021
Download elife-37344-fig4-data1-v2.txt
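The HOTness column follows directly from factor_count using the thresholds listed above. As a small illustration (the function name is hypothetical; the thresholds and labels are those given in the column description):

```python
def hotness(factor_count):
    """Map a transcription factor count to the HOTness label."""
    if factor_count >= 19:
        return "HOT"   # 19 or more factors
    if factor_count >= 1:
        return "cold"  # between 1 and 18 factors
    return "none"      # zero factors
```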

To investigate whether the shared patterns of accessibility changes over development identify promoters of genes involved in common processes, we took advantage of recent single-cell profiling data obtained from L2 larvae, which provides gene expression measurements in different tissues (Cao et al., 2017). We find that half of the developmental promoter clusters are enriched for genes with tissue biased expression (Figure 4A, Figure 4—figure supplement 1 and Figure 4—figure supplement 2). Based on these patterns of enrichment, we defined four gonad promoter clusters (G1-G4), two intestine clusters (I1, I2), one hypodermal cluster (H) and one cluster enriched for neural and muscle expression (N + M) (Figure 4A, Figure 4—figure supplement 1 and Figure 4—figure supplement 2). Genes associated with the remaining eight promoter clusters (Mix1–8) are generally expressed in multiple tissues, but predominantly in the soma (Figure 4A, Figure 4—figure supplement 1 and Figure 4—figure supplement 2). As expected, genes linked to the stable promoters are widely expressed. Interestingly, within a tissue, promoter clusters can exhibit similar variations in accessibility but with different amplitude. For instance, gonad clusters G1 and G2 both show a sharp increase in accessibility at the L3 stage; however, the increase is 1.5-fold larger in G2 than in G1. The gonad clusters are generally characterized by an increase of promoter accessibility starting in L3 when germ cell number strongly increases.

To further investigate promoter clusters sharing accessibility dynamics, we performed Gene Ontology analyses on the associated genes. As expected, we found that clusters containing genes enriched for expression in a particular tissue are also associated with GO terms related to that tissue (Figure 4A, Figure 4—figure supplement 1 and Figure 4—figure supplement 2). For instance, cluster H contains genes highly expressed in hypodermis and GO terms linked to cuticle development. Of note, the four accessibility clusters enriched for expression in germ line are associated with GO terms for different sets of germ line functions (Figure 4—figure supplement 1 and Figure 4—figure supplement 2). Similarly, the two intestinal clusters also identify genes with different types of intestinal function. Furthermore, accessibility dynamics can reflect the temporal function of the associated promoters. For instance, cluster Mix4 has GO terms indicative of neuronal development and highest accessibility in the embryo, when many neurons develop. These results suggest that promoter clusters contain genes acting in a shared process and having a similar mode of regulation.

To identify potential transcriptional regulators, we asked whether the binding of particular transcription factors is enriched in any promoter clusters, using TF binding data from the modENCODE and modERN projects (Araya et al., 2014; Kudron et al., 2018). TFs with enriched binding were found for each cluster (Figure 5A), and the expression of such TFs was generally enriched in the expected tissue. For example, we found that ELT-2, an intestine-specific GATA protein (Fukushige et al., 1998), has enriched binding at promoters in intestinal clusters I1 and I2. Similarly, hypodermal transcription factors BLMP-1 (Horn et al., 2014), NHR-25 (Gissendanner and Sluder, 2000) and ELT-3 (Gilleard et al., 1999) are enriched in the hypodermal promoter cluster, and binding of the germ line factor XND-1 (Wagner et al., 2010) is enriched in the germ line clusters of promoters. We also identified novel tissue-specific associations for uncharacterized transcription factors, such as ZTF-18 and ATHP-1 with germ line promoter clusters and CRH-2 with the intestinal clusters (Figure 5A). These results agree with and extend those of Cao et al. (2017), who identified TFs for which binding was correlated with cell-type-specific expression levels.

Figure 5

Transcription factor binding enrichment in developmental and ageing promoter clusters.

Transcription factor (TF) binding enrichments in developmental (A) or ageing (B) promoter clusters from Figure 4. TF-binding data are from modENCODE/modERN (Araya et al., 2014; Kudron et al., 2018); peaks in HOT regions were excluded (see Materials and methods). Only TFs enriched more than twofold in at least one cluster are shown, and only enrichments with p < 0.01 (Fisher’s exact test) are shown. Plots show the TF binding enrichment odds ratio (left), the expression of the TF in each tissue relative to its expression across all tissues (log2(TF tissue TPM/mean of the TF’s TPMs across all tissues), middle), and the decile of expression of the TF in each tissue (right; TPMs < 1 are not taken into account when calculating TPM deciles). Expression data are from Cao et al. (2017).

https://doi.org/10.7554/eLife.37344.022

Figure 5—source data 1

TF datasets used for analyses.

● factor: transcription factor name.
● dataset_name: modENCODE/modERN DCC dataset name(s), separated by commas if multiple datasets from the same transcription factor were used.
● dataset_id: modENCODE/modERN DCC dataset ID(s), comma-separated as above.

https://doi.org/10.7554/eLife.37344.023
Download elife-37344-fig5-data1-v2.txt
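
The relative expression measure plotted in the middle panel of Figure 5, log2(TF tissue TPM/mean of the TF's TPMs across all tissues), can be illustrated with a short Python sketch. This is not the analysis code used for the figure: the function name, the tissue names and the TPM values are hypothetical, and returning negative infinity for a tissue with zero TPM is an assumption made here only to keep the example well defined.

```python
from math import log2


def relative_expression(tpm_by_tissue):
    """Return log2(tissue TPM / mean TPM across all tissues) for one TF.

    `tpm_by_tissue` maps a tissue name to the TF's TPM in that tissue.
    Tissues with zero TPM are reported as -inf (an assumption of this sketch).
    """
    mean_tpm = sum(tpm_by_tissue.values()) / len(tpm_by_tissue)
    if mean_tpm == 0:
        raise ValueError("TF has zero TPM in every tissue")
    return {
        tissue: (log2(tpm / mean_tpm) if tpm > 0 else float("-inf"))
        for tissue, tpm in tpm_by_tissue.items()
    }


# Hypothetical TPM values for a single TF (illustrative only, not real data).
example_tpm = {"intestine": 100.0, "hypodermis": 25.0, "gonad": 12.5, "neurons": 62.5}
rel = relative_expression(example_tpm)
for tissue, value in rel.items():
    print(f"{tissue}: {value:+.2f}")
```

Positive values indicate tissues in which the TF is expressed above its own cross-tissue mean, and negative values indicate tissues in which it is expressed below that mean.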

We also observed differences in TF-binding enrichments between promoter clusters associated with the same tissue. For example, clusters G1-G4 all contain promoters associated with germline-enriched genes (Figure 4A). However, the binding enrichments observed at promoters in G1-G2 differ from those in G3-G4, with the latter showing enrichment for LIN-35 and DPL-1, two members of the DREAM complex, which controls cell cycle progression (Figure 5A). Taken together, the results suggest that promoters with shared accessibility patterns have shared cell- and process-specific activity, and they highlight potential regulators that are candidates for future studies.
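
The Figure 5 legend reports enrichments as odds ratios with Fisher's exact test p-values, keeping TFs enriched more than twofold with p < 0.01. The Python sketch below shows one way such a test can be computed for a single TF and a single promoter cluster. It is an illustration under stated assumptions, not the authors' code: the function names and the counts are hypothetical, the test is taken to be one-sided (enrichment only), and "more than twofold" is interpreted here as an odds ratio above 2.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]].

    a: promoters in the cluster bound by the TF
    b: promoters in the cluster not bound
    c: promoters outside the cluster bound by the TF
    d: promoters outside the cluster not bound
    """
    if b == 0 or c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test p-value, P(X >= a), from the
    hypergeometric distribution with all table margins fixed."""
    in_cluster = a + b
    bound = a + c
    total = a + b + c + d
    k_max = min(in_cluster, bound)
    numerator = sum(
        comb(bound, k) * comb(total - bound, in_cluster - k)
        for k in range(a, k_max + 1)
    )
    return numerator / comb(total, in_cluster)


def is_reported(a, b, c, d, min_odds_ratio=2.0, max_p=0.01):
    """Apply the thresholds from the figure legend (as interpreted above)."""
    return odds_ratio(a, b, c, d) > min_odds_ratio and fisher_enrichment_p(a, b, c, d) < max_p


# Hypothetical counts: 30 of 100 cluster promoters are bound,
# compared with 100 of 1,000 promoters outside the cluster.
print(odds_ratio(30, 70, 100, 900))
print(fisher_enrichment_p(30, 70, 100, 900))
print(is_reported(30, 70, 100, 900))
```

Using exact integer binomial coefficients avoids floating-point underflow for large promoter sets, at the cost of speed; a statistics library would normally be preferred for genome-scale analyses.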

Analysis of ageing clusters

We next focused on chromatin accessibility changes during ageing. In contrast to the development time course, the accessibility of most promoters is stable during ageing, with only 13% (n = 1,800) of promoters showing changes (Figure 4—source data 1). Interestingly, 75% of these also had regulated accessibility in development.

As for the development time course, we clustered accessibility changes in ageing. We identified eight clusters of promoters with similar accessibility changes across ageing and annotated them based on tissue biases in gene expression (Figure 4B; Figure 4—source data 1). This defined one intestinal cluster (I), two clusters enriched for intestine- or hypodermis-biased expression (I + H) and five mixed clusters. Several mixed clusters show weak gene expression enrichments, such as intestine expression in Mix1-2 and neural expression in Mix3 (Figure 4B). As observed for the development clusters, enriched GO terms were consistent with gene expression biases (Figure 4B, Figure 4—figure supplement 3).

We then evaluated the enrichment of transcription factors at each ageing promoter cluster. The binding of DAF-16/FoxO, a master regulator of ageing (Lin et al., 2001), is associated with five ageing promoter clusters (Figure 5B). Consistent with a prominent role in the intestine (Figure 4B; Kaplan and Baugh, 2016), promoter clusters enriched for DAF-16 binding are also enriched for intestinal genes (Figure 4B). The binding enrichment patterns of five other TFs implicated in ageing (DVE-1, NHR-80, ELT-2, FOS-1 and PQM-1; Uno et al., 2013; Folick et al., 2015; Goudeau et al., 2011; Mann et al., 2016; Tian et al., 2016; Mao et al., 2016; Tepper et al., 2013) are similar to that of DAF-16 (Figure 5B). These TFs and DAF-16 are also enriched in developmental intestine promoter clusters (Figure 5A), supporting cooperation between them in development and ageing. A group of hypodermal TFs including BLMP-1, ELT-1 and ELT-3 are found enriched at promoters in one of the two I + H ageing clusters (Figure 5B). Finally, CEBP-1 binding is enriched in clusters Mix3 and Mix4, which are characterized by a continuous increase of promoter accessibility across ageing. This suggests a potential role of CEBP-1 in activating a subset of genes during ageing, as is the case for its homologue CEBP-β in mouse (Sandhir and Berman, 2010).

Conclusion

For the first time, we systematically map regulatory elements across the lifespan of an animal. We identified 42,245 accessible sites in C. elegans chromatin and functionally annotated them based on transcription patterns at the accessible site. This avoided the problems of histone-mark-based approaches for defining element function (Core et al., 2014; Henriques et al., 2018; Rennie et al., 2018). Our map identified promoters active across development and ageing, but we did not find promoters for every gene. Classes that would have been missed are those for genes expressed only in males or dauer larvae (which we did not profile) and genes not active under laboratory conditions. In addition, whole-animal profiling would miss promoters active in only a small number of cells. In the future, assaying accessible chromatin and nuclear transcription in specific cell types should identify many of these missed elements.

We found that accessibility of most elements changes during the life of the worm, supporting a key role played by chromatin structure. Despite the map being based on bulk profiling in whole animals, we find that regulatory elements with shared accessibility dynamics often share patterns of tissue-specific expression, GO annotation, and TF binding. The promoters with shared accessibility changes are therefore excellent starting points for studies of cell- and process-specific gene expression. In summary, our identification of regulatory elements across C. elegans life together with an initial characterization of their properties provides a key resource that will enable future studies of transcriptional regulation in development and ageing.

Author details

Jürgen Jänes (contributed equally)
Yan Dong (contributed equally)
Michael Schoof (contributed equally)
Jacques Serizay (contributed equally)
Alex Appert
Chiara Cerrato
Carson Woodbury
Ron Chen
Carolina Gemma
Ni Huang
Djem Kissiov
Przemyslaw Stempor
Annette Steward
Eva Zeiser
Sascha Sauer
Julie Ahringer (for correspondence)
