Detecting cryptically simple protein sequences using the SIMPLE algorithm

Cited by: 54
Authors
Albà, MM
Laskowski, RA
Hancock, JM [1]
Affiliations
[1] Royal Holloway Univ London, Dept Comp Sci, Egham TW20 0EX, Surrey, England
[2] Univ Pompeu Fabra, Grp Recerca Informat Biomed, Barcelona 08003, Spain
[3] Univ London Birkbeck Coll, Dept Crystallog, London WC1E 7HX, England
DOI
10.1093/bioinformatics/18.5.672
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Low-complexity or cryptically simple sequences are widespread in protein sequences, but their evolution and function are poorly understood. To date, methods for the detection of low complexity in proteins have been directed towards filtering such regions prior to sequence homology searches, but not towards the analysis of the regions per se. However, many of these regions are encoded by non-repetitive DNA sequences and may therefore result from selection acting on protein structure and/or function. Results: We have developed a new tool, based on the SIMPLE algorithm, that facilitates the quantification of the amount of simple sequence in proteins and determines the type of short motifs that show clustering above a certain threshold. By modifying the sensitivity of the program, simple sequence content can be studied at various levels, from highly organised tandem structures to complex combinations of repeats. We compare the relative amount of simplicity in different functional groups of yeast proteins and determine the level of clustering of the different amino acids in these proteins.
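Illustrative sketch (not part of the original record): the abstract describes scoring short motifs by how strongly they cluster and reporting those above a threshold. The Python below is a toy illustration of that general idea only; it is not the SIMPLE algorithm or the authors' tool. The function names, the motif length `k`, the `window` size, and the `threshold` value are all hypothetical choices made for this example, and no significance testing is included.

```python
from collections import Counter


def motif_clustering_scores(seq, k=1, window=10):
    """Score each position by how often its k-mer recurs nearby.

    Assumption for illustration: the score of position i is the number of
    other positions within +/- `window` that start the same k-mer.
    """
    n = len(seq) - k + 1  # number of k-mer start positions
    scores = []
    for i in range(n):
        motif = seq[i:i + k]
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        score = sum(
            1 for j in range(lo, hi) if j != i and seq[j:j + k] == motif
        )
        scores.append(score)
    return scores


def clustered_motifs(seq, k=1, window=10, threshold=4):
    """Count, per motif, the positions whose score reaches `threshold`."""
    scores = motif_clustering_scores(seq, k, window)
    hits = Counter()
    for i, score in enumerate(scores):
        if score >= threshold:
            hits[seq[i:i + k]] += 1
    return hits


def simple_fraction(seq, k=1, window=10, threshold=4):
    """Fraction of k-mer positions flagged as clustered (0.0 if none exist)."""
    n = len(seq) - k + 1
    if n <= 0:
        return 0.0
    return sum(clustered_motifs(seq, k, window, threshold).values()) / n


if __name__ == "__main__":
    # Hypothetical sequence with a polyglutamine-like run.
    example = "MKT" + "Q" * 8 + "LVA"
    print(dict(clustered_motifs(example)))
    print(simple_fraction(example))
```

Lowering `threshold` or widening `window` in this sketch flags looser groupings of repeats, which loosely mirrors the abstract's point that changing sensitivity moves the analysis from tight tandem runs towards more complex combinations of repeats.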
Pages: 672-678
Page count: 7