Multigenic families and proteomics: Extended protein characterization as a tool for paralog gene identification

Cited by: 21
Authors
Delalande, F
Carapito, C
Brizard, JP
Brugidou, C
Van Dorsselaer, A
Affiliations
[1] Lab Spectrometrie Phys Masse Bioorgan, F-67087 Strasbourg 2, France
[2] IRD, Montpellier, France
Keywords
multigenic family; paralog gene; post-transcriptional gene silencing; rice genome;
DOI
10.1002/pmic.200400954
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
In classical proteomic studies, protein database searches mostly identify protein function by homology, because the databases are not exhaustive. The quality of an identification depends on the organism studied, its complexity, and how well it is represented in the protein databases. This basic functional identification is nevertheless insufficient for certain applications, notably the development of RNA-based gene-silencing strategies, commonly termed RNA interference (RNAi) in animals and post-transcriptional gene silencing (PTGS) in plants, which require unambiguous identification of the targeted gene sequence. A PTGS strategy was considered for studying the infection of Oryza sativa by the Rice Yellow Mottle Virus (RYMV). RYMV is suspected to recruit host proteins after entering plant cells, forming a complex that facilitates virus multiplication and spread. The protein partners of this complex were identified by a classical proteomic approach, nano liquid chromatography tandem mass spectrometry. Among the identified proteins, several were retained for a PTGS strategy. However, most of the candidate proteins appear to belong to multigenic families for which not all paralog genes are present in protein databases, so the paralog gene actually expressed cannot be identified with classical protein database searches. Because the genome contains all genes, and thus all paralog genes, a whole-genome search strategy was developed to determine the specific expressed paralog gene. With this approach, identifying peptides that match only a single gene, called discriminant peptides, provides definitive proof that this gene is expressed. The strategy has two requirements: (i) a completely sequenced and accessible genome; (ii) high protein sequence coverage.
In the present work, through three examples, we report and validate for the first time a genome database search strategy to specifically identify paralog genes belonging to multigenic families expressed under specific conditions.
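The idea of a discriminant peptide can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `find_discriminant_peptides`, the paralog names, and all sequences below are invented for illustration. It also assumes the paralog protein sequences are already available as plain strings and uses exact substring matching, whereas a real search would score MS/MS spectra against the translated genome.

```python
def find_discriminant_peptides(peptides, paralogs):
    """Return {peptide: paralog_name} for peptides found in exactly one paralog.

    Hypothetical helper: exact substring matching stands in for a real
    MS/MS database search.
    """
    discriminant = {}
    for peptide in peptides:
        # Collect every paralog whose sequence contains this peptide.
        hits = [name for name, seq in paralogs.items() if peptide in seq]
        # A peptide is discriminant only if it matches a single paralog.
        if len(hits) == 1:
            discriminant[peptide] = hits[0]
    return discriminant


# Toy paralog sequences (invented, not taken from the rice genome).
paralogs = {
    "paralog_A": "MKTLLVAGGSKDEVRQWER",
    "paralog_B": "MKTLLVAGGSKDEARQWER",
    "paralog_C": "MKTLLVAGGTKDEVRQWER",
}

# Toy identified peptides: some shared by all paralogs, some unique to one.
peptides = ["MKTLLVAGG", "GSKDEVR", "RQWER", "GSKDEA", "GTKDEVR"]

result = find_discriminant_peptides(peptides, paralogs)
for peptide, gene in result.items():
    print(f"{peptide} -> {gene}")
```

In this toy case, the peptides shared by all three sequences say nothing about which paralog is expressed, while each peptide covering a variable position points to a single gene. This is also why high sequence coverage matters: the more of the protein is observed, the more likely a variable position is covered.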
Pages: 450-460
Number of pages: 11