Stanislav Bellaousov, Jessica S. Reuter, Matthew G. Seetin, David H. Mathews, RNAstructure: web servers for RNA secondary structure prediction and analysis, Nucleic Acids Research, Volume 41, Issue W1, 1 July 2013, Pages W471–W474, https://doi.org/10.1093/nar/gkt290
Abstract
RNAstructure is a software package for RNA secondary structure prediction and analysis. This contribution describes a new set of web servers to provide its functionality. The web server offers RNA secondary structure prediction, including free energy minimization, maximum expected accuracy structure prediction and pseudoknot prediction. Bimolecular secondary structure prediction is also provided. Additionally, the server can predict secondary structures conserved in either two homologs or more than two homologs. Folding free energy changes can be predicted for a given RNA structure using nearest neighbor rules. Secondary structures can be compared using circular plots or the scoring methods, sensitivity and positive predictive value. Additionally, structure drawings can be rendered as SVG, postscript, jpeg or pdf. The web server is freely available for public use at: http://rna.urmc.rochester.edu/RNAstructureWeb.
INTRODUCTION
RNA is an important biomolecule, as a catalyst (1), a director of post-transcriptional modification (2) and gene regulation (3), a target of drugs (4,5) and also a pharmaceutical (6,7). Its structure is generally hierarchical (8). Primary structure is the sequence of nucleotides, and is a covalent bond structure. Secondary structure is the set of the canonical base pairs, and tertiary structure is the set of additional contacts and the complete three-dimensional structure.
Because RNA folding is generally hierarchical, secondary structure can often be predicted and analyzed without predicting tertiary structure. Secondary structure prediction can provide a framework for understanding the mechanism of action of RNA. It is also an important consideration in the design of siRNA and antisense DNA oligonucleotides, both to avoid structure in the mRNA target of these agents and to avoid self-structure in the agents themselves, either of which prevents them from hybridizing with their targets (9–14). RNA structure prediction can also indicate which regions of a sequence are accessible for interacting with proteins (15). Finally, secondary structure prediction can be used to identify novel functional RNA sequences encoded in genomes (16–18).
RNAstructure is a software package for RNA secondary structure prediction and analysis (19). It was first reported in 1998 as a free energy minimization program for Microsoft Windows (20). It has been expanded to include methods for predicting bimolecular structure (12), conserved structures in multiple homologs (21–24) and siRNA design (9). Several methods are available for predicting structures for a single sequence, including maximum expected accuracy (25), stochastic sampling (26), exhaustive traceback (27) and pseudoknot prediction (28). Graphical user interfaces are provided for Microsoft Windows, Macintosh OS-X and Linux. Command line interfaces are also available for these operating systems. Finally, the set of underlying C++ classes is available as a library for use by programmers.
This report describes a new set of web servers that provide the functionality of RNAstructure. These web servers are open to the public and can be found at http://rna.urmc.rochester.edu/RNAstructureWeb.
ORGANIZATION
The RNAstructure web servers are organized around two schemes. The first scheme provides specific programs for web users; Table 1 lists the programs that are available. The second scheme is a set of themes defined by the biological problem. For example, to predict a secondary structure for a single sequence, the server called Predict a Secondary Structure can be used to run calculations by multiple programs. The themes and the programs each one runs are listed in Table 2. This approach is designed to be user-friendly because it bundles all the available methods for addressing a specific problem, and for this reason most users will want to use these servers.
Servers | References | Description |
---|---|---|
AllSub | (27,29) | Generate all possible low free energy structures for a nucleic acid sequence. |
bifold | (12) | Predict the lowest free energy structure for two interacting sequences, allowing intramolecular base pairs. |
bipartition | (30) | Perform a partition function calculation for two interacting nucleic acid sequences. Intramolecular pairs are not allowed. |
CircleCompare | (19) | Compare two structures using overlaid circular plots to emphasize pairing differences and pseudoknots. |
ct2dot | (19) | Convert a CT-formatted structure into a dot bracket file. |
dot2ct | (19) | Convert a dot bracket file into a CT file. |
draw | (19) | Draw the secondary structure of a nucleic acid strand, with or without color annotation. |
DuplexFold | (31) | Predict the lowest free energy structure for two interacting sequences, not allowing intramolecular base pairs. |
Dynalign | (17,21,32) | Calculate the lowest free energy secondary structures common to two unaligned sequences. |
efn2 | (33) | Calculate the folding free energy of structures in a CT file. |
EnsembleEnergy | (30) | Calculate the ensemble folding free energy change for a sequence. |
Fold | (33,34) | Predict the lowest free energy structure and a set of low free energy structures for a sequence. |
MaxExpect | (25) | Generate a structure or structures composed of highly probable base pairs. |
Multilign | (22) | Predict low free energy secondary structures common to three or more sequences using progressive iterations of Dynalign. |
oligoscreen | (35) | Predict stability of hybridization and self-structure for a list of oligonucleotides. |
OligoWalk for siRNA Design | (9,11) | Design efficient siRNAs for a given mRNA target. |
partition | (30) | Perform a partition function calculation on a single sequence to calculate base pair probabilities. |
PARTS | (23,36) | Predict the common secondary structure, including base pair probabilities, for two unaligned sequences. |
ProbablePair | (30) | Generate secondary structures composed of base pairs with probabilities that exceed a specified threshold. |
ProbKnot | (28) | Predict a secondary structure composed of probable base pairs, which might include pseudoknots. |
RemovePseudoknots | (19,37) | Remove pseudoknots from a structure. |
scorer | (31,30) | Calculate the sensitivity and positive predictive value (PPV) for a predicted as compared to the accepted structure. |
stochastic | (26,36) | Generate a representative sample of structures using stochastic sampling. |
TurboFold/TurboKnot | (24,38) | Calculate the conserved structures of three or more unaligned sequences using iteratively refined partition functions. |
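The two scoring methods named in the scorer entry of Table 1 can be illustrated with a minimal sketch. Sensitivity is the fraction of accepted base pairs that appear in the predicted structure, and PPV is the fraction of predicted pairs that appear in the accepted structure. The function name `score_pairs` and the example pair lists are hypothetical, and the sketch counts only exact pair matches, which is a simplifying assumption and not necessarily how the scorer program itself counts a correct pair.

```python
def score_pairs(predicted, accepted):
    """Return (sensitivity, ppv) for two collections of (i, j) base pairs.

    Assumption: a predicted pair is correct only if it matches an
    accepted pair exactly.
    """
    # Normalize each pair so that (i, j) and (j, i) compare equal.
    pred = {tuple(sorted(pair)) for pair in predicted}
    acc = {tuple(sorted(pair)) for pair in accepted}
    correct = len(pred & acc)
    # Guard against empty structures to avoid division by zero.
    sensitivity = correct / len(acc) if acc else 0.0
    ppv = correct / len(pred) if pred else 0.0
    return sensitivity, ppv


# Hypothetical example: 4 accepted pairs, 5 predicted pairs, 3 in common.
accepted = [(1, 20), (2, 19), (3, 18), (4, 17)]
predicted = [(1, 20), (2, 19), (3, 18), (5, 16), (6, 15)]
sensitivity, ppv = score_pairs(predicted, accepted)
print(f"sensitivity={sensitivity:.2f} ppv={ppv:.2f}")
```

In this example three of the four accepted pairs are recovered (sensitivity 0.75) and three of the five predicted pairs are correct (PPV 0.60).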
Servers | Programs run (see Table 1 for description) |
---|---|
Predict a secondary structure | Fold, MaxExpect, ProbKnot, partition |
Predict a secondary structure common to two sequences | Dynalign, PARTS |
Predict a secondary structure common to three or more sequences | Multilign, TurboFold/TurboKnot |
Predict a bimolecular secondary structure | bifold, DuplexFold |
INPUT
The structure prediction servers work using either bare sequences pasted into a window, or FASTA-formatted files uploaded to the server. The exception to this is the set of servers that predict a structure common to three or more sequences. For these servers, including Multilign and TurboFold, the sequence window requires multiple FASTA-formatted sequences. Alternatively, a FASTA-formatted file with multiple sequences can be uploaded.
Servers that act on structures, such as CircleCompare, draw, efn2, RemovePseudoknots or scorer, require an upload of a CT file. These can be generated manually or by a structure prediction server. Another alternative is to use a dot-bracket formatted structure and convert it to CT format using the dot2ct component of the RNAstructure server.
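To make the relationship between the two structure formats concrete, the following is a minimal sketch of a dot-bracket to CT conversion. It is not the dot2ct implementation; the function name `dot_bracket_to_ct` and the title string are hypothetical. It assumes the common CT layout of a header line giving the sequence length and a title, followed by one line per nucleotide with the index, the base, the previous index, the next index (0 for the last nucleotide), the pairing partner (0 if unpaired) and the original numbering. Only a single bracket type is handled, so pseudoknotted structures are outside the scope of this sketch.

```python
def dot_bracket_to_ct(sequence, structure, title="example"):
    """Convert a sequence and dot-bracket string into CT-formatted text."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    n = len(sequence)
    partner = [0] * (n + 1)  # 1-indexed; 0 means unpaired
    stack = []  # positions of '(' that are still waiting for a ')'
    for i, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            partner[i], partner[j] = j, i
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")

    lines = [f"{n} {title}"]
    for i, base in enumerate(sequence, start=1):
        next_index = i + 1 if i < n else 0
        lines.append(f"{i} {base} {i - 1} {next_index} {partner[i]} {i}")
    return "\n".join(lines)


# Hypothetical hairpin: three base pairs closing a three-nucleotide loop.
print(dot_bracket_to_ct("GGGAAACCC", "(((...)))"))
```

The stack pairs each closing bracket with the most recent unmatched opening bracket, which is what makes the nested (pseudoknot-free) assumption necessary.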
For each individual server, sample input data can be generated automatically to illustrate format and to provide a test case for the server. For structure prediction servers, clicking a link pastes sample sequences into the sequence windows. For the servers that act on structures, a link is provided to download a sample CT file, and this can be uploaded back to the server as sample data.
For Fold, partition and Predict a Secondary Structure, structure prediction can be restrained using SHAPE mapping data, when available, to improve the accuracy of structure prediction (39). AllSub, Dynalign, Fold, partition and Predict a Secondary Structure accept folding constraints, including the ability to force specific base pairs, forbid specific pairs, force a nucleotide to be paired, force a nucleotide to be unpaired, and specify that a nucleotide is accessible to chemical modification (33,34). These constraints and restraints are supplied as file uploads.
OUTPUT FORMATS
Once a calculation is submitted, a notification page is displayed. This page continues to refresh until the calculation output is available. At that time, the page is replaced with a page that contains the output. This page can be bookmarked and returned to at a later time. Alternatively, an e-mail address can be provided as part of the input data. If this is done, an e-mail is sent to the address when the calculation is complete and the results are available.
The structure prediction servers display the predicted structure using an SVG drawing, which can be rendered by web browsers with an SVG plugin (Figure 1). If more than one structure is predicted, i.e. suboptimal structures or a structure sample are present, ‘previous’ and ‘next’ buttons are displayed to enable scrolling between structures. Each predicted structure can be downloaded as a jpeg, svg, pdf, postscript or CT file.
Each output page displays the RNAstructure executable and the command line used to generate the results (40). Each input file is a link, allowing download of the processed-form data. This makes clear exactly what calculation is performed, and facilitates a user learning the capabilities of the RNAstructure software.
IMPLEMENTATION
The web servers use PHP to process queries. The server does not rely on JavaScript running on the client side; the single exception is the previously available OligoWalk web server (11). The server itself is a computer cluster with a head node and five compute nodes, each with two dual core processors. A queuing system determines the order in which queries are executed, which is generally the order in which they were made. The exceptions to this rule are calculations submitted for Dynalign, Multilign or TurboFold, which are submitted to the server as multiple-core calculations. These may need to wait in the queue at times when a number of other time-consuming calculations are already running.
The RNAstructure web servers have limitations on the calculations that can be done. This ensures that the resource will be available to the broader community with reasonable wait times. A list of limitations is available with the online help. Currently, for example, the single sequence structure prediction methods are limited to sequences of 2500 nucleotides or fewer. TurboFold is limited to a maximum of 12 sequences of up to 1600 nucleotides in length. If the required calculation exceeds the web server limits, the software can be downloaded (http://rna.urmc.rochester.edu/RNAstructure.html) and run locally.
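A user preparing input may want to check it against these limits before submitting. The sketch below parses multi-sequence FASTA text and tests it against the two example limits quoted above. All function and constant names are hypothetical, the limits are the ones stated in this report and may differ from those currently enforced, and the sketch does not reproduce the server's own validation.

```python
# Limits quoted in the text; the online help holds the authoritative list.
MAX_SINGLE_SEQUENCE_NT = 2500
MAX_TURBOFOLD_SEQUENCES = 12
MAX_TURBOFOLD_NT = 1600


def parse_fasta(text):
    """Return a list of (name, sequence) tuples from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the record that was being collected.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
        else:
            raise ValueError("sequence data found before the first '>' header")
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def fits_single_sequence_limit(sequence):
    """True if one sequence is within the single-sequence length limit."""
    return len(sequence) <= MAX_SINGLE_SEQUENCE_NT


def fits_turbofold_limits(records):
    """True if the records respect the quoted TurboFold count and length limits."""
    return (len(records) <= MAX_TURBOFOLD_SEQUENCES
            and all(len(seq) <= MAX_TURBOFOLD_NT for _, seq in records))


# Hypothetical input with three short sequences.
example = ">seq1\nGGGAAACCC\n>seq2\nGGCAAAGCC\n>seq3\nGCGAAACGC\n"
records = parse_fasta(example)
print(len(records), fits_turbofold_limits(records))
```

Input that fails either check is a candidate for a local run with the downloaded software.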
HELP
Extensive online help is available for using the RNAstructure web server. The help documents each of the input and output fields, and is organized by individual server. For each server help page, a link is provided to the underlying RNAstructure executable (19), which provides additional details about the program. A separate help page details the server limitations, as explained above.
CONCLUSION
RNAstructure is designed to be a user-friendly software package, accessible to the community of investigators studying RNA (19). These new web servers continue this tradition, and provide the software to a wider user base. The accuracy of the algorithms has been extensively tested in prior publications. As the algorithms in RNAstructure are improved, and new algorithms developed, the web servers will be updated to continue to make RNAstructure available to the community.
FUNDING
National Institutes of Health (NIH) [R01 GM076485 to D.H.M.]. Funding for open access charge: NIH [R01 GM076485].
Conflict of interest statement. None declared.