Computing similarity between RNA structures

被引:28
作者
Ma, B
Wang, LS [1 ]
Zhang, KZ
机构
[1] Peking Univ, Dept Math, Beijing 100871, Peoples R China
[2] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
[3] Univ Western Ontario, Dept Comp Sci, London, ON N6A 5B7, Canada
基金
加拿大自然科学与工程研究理事会;
关键词
molecular biology; RNA structures; similarity; inapproximability;
D O I
10.1016/S0304-3975(01)00192-X
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
The primary structure of a ribonucleic acid (RNA) molecule is a sequence of nucleotides (bases) over the four-letter alphabet {A, C, G, U}. The secondary or tertiary structure of an RNA is a set of base-pairs (nucleotide pairs) which forms bonds between A - U and C - G. For secondary structures, these bonds have been traditionally assumed to be one to one and non-crossing. This paper considers a notion of similarity between two RNA molecule structures taking into account the primary, the secondary and the tertiary structures. We show that, for tertiary structures, it is Max SNP-hard for both minimization and maximization versions. We show a stronger result for the maximization version where it cannot be approximated within ratio 2(logdeltan) in polynomial time, unless NP subset of or equal to DTIME[2(poly logn)]. We then present an algorithm that can be used for practical application. Our algorithm will produce an optimal solution for the case where at least one of the RNA involved is of a secondary structure. We also show an approximation algorithm. (C) 2002 Elsevier Science B.V. All rights reserved.
引用
收藏
页码:111 / 132
页数:22
相关论文
共 17 条
[1]  
Bafna V, 1995, LECT NOTES COMPUT SC, V937, P1
[2]  
CORPET F, 1994, COMPUT APPL BIOSCI, V10, P389
[3]   On the complexity of comparing evolutionary trees [J].
Hein, J ;
Jiang, T ;
Wang, LS ;
Zhang, KZ .
DISCRETE APPLIED MATHEMATICS, 1996, 71 (1-3) :153-169
[4]  
KLEIN PN, 1989, COMPUT BIOMED RES, V22, P461
[5]  
LE SY, 1989, COMPUT APPL BIOSCI, V5, P205
[6]   A GENERAL METHOD APPLICABLE TO SEARCH FOR SIMILARITIES IN AMINO ACID SEQUENCE OF 2 PROTEINS [J].
NEEDLEMAN, SB ;
WUNSCH, CD .
JOURNAL OF MOLECULAR BIOLOGY, 1970, 48 (03) :443-+
[7]   OPTIMIZATION, APPROXIMATION, AND COMPLEXITY CLASSES [J].
PAPADIMITRIOU, CH ;
YANNAKAKIS, M .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1991, 43 (03) :425-440
[8]  
SHAPIRO BA, 1990, COMPUT APPL BIOSCI, V6, P309
[9]  
SHAPIRO BA, 1988, COMPUT APPL BIOSCI, V4, P387
[10]   IDENTIFICATION OF COMMON MOLECULAR SUBSEQUENCES [J].
SMITH, TF ;
WATERMAN, MS .
JOURNAL OF MOLECULAR BIOLOGY, 1981, 147 (01) :195-197