Overview
[1] P.A. Limbach, P.F. Crain and J.A. McCloskey. Summary: the modified nucleosides of RNA. Nucleic Acids Res. 22, 2183-2196 (1994).
Both the structural diversity and the extent of posttranscriptional modification in RNA are remarkable [1], with 96 different nucleosides presently known in all types of RNA. The discovery of new modified nucleosides, together with increasing knowledge of the array of functional roles of modification, based largely on extensive studies of tRNA [2-5], calls for a comprehensive database of RNA nucleosides. The RNA Modification Database is maintained as an extension of the initial version published in mid-1994 [1]. The database consists of all RNA-derived ribonucleosides of known structure, including those from established sequence positions as well as those detected or characterized from hydrolysates of RNA. The information provided gives access to the modified nucleoside literature through both computer-searchable Chemical Abstracts registry numbers and key literature citations.
The reader is referred to the earlier publication [1] or to paragraphs below for discussion of selected topics.
The authors invite comments concerning new entries, errors or omissions and on the format presently used for electronic access to the database. The email address for this purpose is: RNAmods@lib.med.utah.edu
In general, the symbol "N" is preferred for modified nucleosides of unknown structure, or "Nm" if ribose methylation of O-2' has been established. If the heterocyclic moiety is known or assumed, for example from the corresponding gene sequence, the designation A*, U*, etc. is recommended. The use of "X" for an unknown nucleoside is not recommended because of confusion with the nucleoside acp3U (see comment in the acp3U file). The superscript "x" is useful to indicate a form of modification without designation of position; for example, "mxC" for cytidine monomethylated in the base at an unknown or unspecified position. Likewise, "x" is sometimes used in the fashion "x5s2U" to designate generically the 5-substituted derivatives of 2-thiouridine. The use of "m" for "modified", as in "mA" (occasionally used in the rRNA literature), is strongly discouraged because of the ambiguity with the use of "m" for methylation.
Additional comments are found under Symbol in the Description of Entries section.
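The symbol recommendations above can be summarized as a small lookup routine. The following is only an illustrative sketch: the function names placeholder_symbol and discouraged_reason are hypothetical and are not part of the database, and the handling of cases the text does not cover (for example, a known base combined with established ribose methylation, or "m" prefixes other than "mA") is an assumption made for the example.

```python
from typing import Optional

STANDARD_BASES = {"A", "C", "G", "U"}


def placeholder_symbol(base: Optional[str] = None,
                       ribose_methylated: bool = False) -> str:
    """Return a placeholder symbol for a modified nucleoside of unknown structure.

    base: "A", "C", "G" or "U" if the heterocyclic moiety is known or assumed
          (for example from the gene sequence), otherwise None.
    ribose_methylated: True if methylation of O-2' has been established.
    """
    if base is not None:
        if base not in STANDARD_BASES:
            raise ValueError(f"unexpected base: {base!r}")
        # Assumption: the text gives no combined form for a known base plus
        # ribose methylation, so only the base designation is returned here.
        return base + "*"
    return "Nm" if ribose_methylated else "N"


def discouraged_reason(symbol: str) -> Optional[str]:
    """Return why a symbol is discouraged, or None if no objection is known."""
    if symbol == "X":
        return "may be confused with the nucleoside acp3U"
    # Assumption: the text names only "mA"; the other bases are included
    # here by analogy.
    if symbol in {"m" + b for b in STANDARD_BASES}:
        return 'ambiguous: "m" is used for methylation, not "modified"'
    return None
```

For example, placeholder_symbol() gives "N", placeholder_symbol(ribose_methylated=True) gives "Nm", and placeholder_symbol(base="U") gives "U*".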
Entries in each file are categorized by type of RNA, although in early work total cellular RNA was often isolated with no clear distinction made between tRNA ("soluble RNA"), the various rRNAs ("microsomal") or others. In such cases the listing given may correspond to contemporary knowledge of distribution of the nucleoside within certain types of RNA. Some of the earlier literature concerning modification in rRNA is of questionable accuracy, and this particular problem is addressed in the next section ("Nucleoside modification in ribosomal RNA").
Excluded from the database are those modifications known to be artifactual, for example as a consequence of degradation, or those having clearly incorrect structures. If the basis of structure assignment or identification is considered to be inconclusive, this is indicated under Comments in individual files.
Also, there are two notable examples of modifications that have not been included in the compilation.
A total of 96 modified nucleosides with assigned structures have been reported in RNA. The largest number, 81, with the greatest structural diversity, are found in tRNA, with 30 in rRNA, 12 in mRNA and 13 in other RNA species, most notably snRNA. (These counts sum to more than 96 because a number of nucleosides occur in more than one type of RNA.) Based on lessons learned primarily from tRNA, it is clear that many modification motifs and their sequence locations tend to be conserved, although distinct differences among the three primary phylogenetic domains are observed. The smaller number of modifications reported from the archaeal RNAs reflects, to a limited extent, fewer investigations compared with bacteria and eukarya. In any event, it is clear that present knowledge of the structural diversity of RNA nucleosides, and certainly of their distribution, is somewhat narrowly confined to relatively few organisms. It is our judgement that the total number of RNA nucleosides listed, and the chemical structures reported, are very accurate. However, the distributions listed are in some cases a matter of concern, due primarily to the possibility of RNA inhomogeneity and the use of methods of nucleoside identification that are not sufficiently rigorous. Reinvestigation of some of the unusual or single-report source distributions is warranted, and will likely lead to future refinements in the listings.
This compilation is maintained as a continuously updated database. Comments concerning format, omissions or newly published material are solicited.