Singular value decomposition analysis of protein sequence alignment score data

Cited by: 4
Authors
Fogolari, F [1]
Tessari, S [1]
Molinari, H [1]
Affiliations
[1] Univ Verona, Fac Sci, Dipartimento Sci Tecnol, I-37100 Verona, Italy
Source
PROTEINS-STRUCTURE FUNCTION AND GENETICS | 2002 / Vol. 46 / Issue 02
Keywords
SVD; matrix analysis; matrix computation; clustering; array;
DOI
10.1002/prot.10032
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
One of the standard tools for the analysis of data arranged in matrix form is singular value decomposition (SVD). Few applications to genomic data have been reported to date, mainly for the analysis of gene expression microarray data. We review SVD properties, examine mathematical terms and assumptions implicit in the SVD formalism, and show that SVD can be applied to the analysis of matrices representing pairwise alignment scores between large sets of protein sequences. In particular, we illustrate SVD capabilities for data dimension reduction and for clustering protein sequences. A comparison is performed between SVD-generated clusters of proteins and the annotation reported in the SWISS-PROT Database for a set of protein sequences forming the calycin superfamily, comprising all entries corresponding to the lipocalin, cytosolic fatty acid-binding protein, and avidin-streptavidin Prosite patterns. Proteins 2002;46:161-170. (C) 2001 Wiley-Liss, Inc.
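The idea summarized in the abstract, applying SVD to a matrix of pairwise alignment scores to reduce its dimension and group sequences, can be illustrated with a minimal sketch. This is not the authors' procedure: the 6 x 6 score matrix is invented toy data for two hypothetical families, the helper name `svd_embed` is made up for this example, and the sign-based split on the second component is a simplification chosen only because the toy data has exactly two groups.

```python
import numpy as np

def svd_embed(scores, k):
    """Project each row (sequence) onto the top-k singular directions.

    Returns the k-dimensional coordinates and the full list of singular values.
    """
    u, s, vt = np.linalg.svd(scores, full_matrices=False)
    coords = u[:, :k] * s[:k]  # scale left singular vectors by singular values
    return coords, s

# Hypothetical pairwise alignment scores: sequences 0-2 and 3-5 form two
# families with high within-family and low between-family scores.
scores = np.array([
    [100,  80,  75,  10,  12,   8],
    [ 80, 100,  78,  11,   9,  10],
    [ 75,  78, 100,   9,  10,  12],
    [ 10,  11,   9, 100,  85,  82],
    [ 12,   9,  10,  85, 100,  79],
    [  8,  10,  12,  82,  79, 100],
], dtype=float)

coords, s = svd_embed(scores, k=2)

# Fraction of the squared Frobenius norm captured by the first two components.
energy = (s[:2] ** 2).sum() / (s ** 2).sum()

# Simplified two-way clustering: split on the sign of the second coordinate.
# The sign of a singular vector is arbitrary, so only the grouping matters,
# not which family receives label 0 or 1.
labels = (coords[:, 1] > 0).astype(int)

print("singular values:", np.round(s, 1))
print("energy in first two components:", round(energy, 3))
print("cluster labels:", labels)
```

For real data with more than two families, the reduced coordinates would normally be passed to a general clustering method rather than split by sign.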
Pages: 161-170
Page count: 10