A double-screening method to identify reliable candidate non-synonymous SNPs from chicken EST data

Cited by: 38
Authors
Kim, H
Schmidt, CJ
Decker, KS
Emara, MG
Affiliations
[1] Univ Delaware, Dept Anim & Food Sci, Newark, DE 19716 USA
[2] Univ Delaware, Dept Comp & Informat Sci, Newark, DE 19716 USA
Keywords
chicken; double-screening; expressed sequence tag data mining; non-synonymous; SNP;
DOI
10.1046/j.1365-2052.2003.01003.x
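Chinese Library Classification number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture];
Discipline classification code
0905;
Abstract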
Discovery of non-synonymous single nucleotide polymorphisms (nsSNPs), which cause amino acid substitutions, is important because they are more likely to alter protein function than synonymous SNPs (sSNPs), which do not result in amino acid changes. By changing coding sequences, nsSNPs may play a role in heritable differences between individual organisms. In the chicken and many other vertebrates, the main obstacle to identifying nsSNPs is that there is insufficient protein and mRNA sequence information for self-species referencing, so determining the correct reading frame for expressed sequence tags (ESTs) is difficult. Therefore, to estimate the correct reading frame at nsSNPs in chicken ESTs, a double-screening approach was designed that uses self- or cross-species protein referencing in addition to the ESTScan coding region estimation programme. Starting with 23 427 chicken ESTs, 1210 potential SNPs were discovered using a phred/phrap/polyphred/consed pipeline, and among these, 108 candidate nsSNPs were identified with the double-screening method. A searchable SNP database (chicksnps) for the candidate chicken SNPs, including both nsSNPs and sSNPs, is available at http://chicksnps.afs.udel.edu. The chicken SNP data described in this paper have been submitted to the SNP database (dbSNP) of the National Center for Biotechnology Information under assay IDs ss4387050-ss4388259.
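The role of the reading frame in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`codon_effect`, `classify`, `double_screen`) are hypothetical, the two frame estimates are passed in as plain integers in place of real protein-referencing and ESTScan output, and the rule that a candidate is kept only when both estimates agree is an assumption about how the two screens are combined. Sequences are assumed to be uppercase A/C/G/T on the coding strand, translated with the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_effect(seq, pos, alt, frame):
    """Return (ref_aa, alt_aa) for a SNP at 0-based `pos` under `frame` (0, 1 or 2).

    Returns None when the SNP falls in an incomplete codon at either end.
    """
    start = pos - ((pos - frame) % 3)  # first base of the codon containing pos
    if start < 0 or start + 3 > len(seq):
        return None
    ref_codon = seq[start:start + 3]
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    return CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]


def classify(seq, pos, alt, frame):
    """Label a SNP as 'synonymous', 'non-synonymous' or 'unknown' for one frame."""
    effect = codon_effect(seq, pos, alt, frame)
    if effect is None:
        return "unknown"
    ref_aa, alt_aa = effect
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


def double_screen(seq, pos, alt, frame_protein, frame_estscan):
    """Keep a candidate nsSNP only if both frame estimates agree (assumed rule)
    and the SNP changes the amino acid in that frame."""
    if frame_protein is None or frame_protein != frame_estscan:
        return False
    return classify(seq, pos, alt, frame_protein) == "non-synonymous"


if __name__ == "__main__":
    est = "ATGGCCAAA"  # toy sequence: Met-Ala-Lys in frame 0
    print(classify(est, 4, "T", 0))          # GCC -> GTC (Ala -> Val)
    print(classify(est, 5, "T", 0))          # GCC -> GCT (Ala -> Ala)
    print(double_screen(est, 4, "T", 0, 0))  # frames agree
    print(double_screen(est, 4, "T", 0, 1))  # frames disagree
```

The same base change can be synonymous in one frame and non-synonymous in another, which is why a candidate whose two frame estimates disagree is discarded in this sketch.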
Pages: 249-254
Number of pages: 6