A statistical method for predicting splice variants between two groups of samples using GeneChip® expression array data

Cited by: 10
Authors
Fan, Wenhong [1]
Khalid, Najma [1]
Hallahan, Andrew R. [2, 3]
Olson, James M. [2]
Zhao, Lue Ping [1]
Affiliations
[1] Fred Hutchinson Canc Res Ctr, Div Publ Hlth Sci, Seattle, WA 98109 USA
[2] Fred Hutchinson Canc Res Ctr, Div Clin Res, Seattle, WA 98109 USA
[3] Univ Queensland, Dept Paediat & Child Hlth, Herston, Qld 4029, Australia
Source
THEORETICAL BIOLOGY AND MEDICAL MODELLING | 2006 / Vol. 3
Funding
US National Institutes of Health (NIH)
DOI
10.1186/1742-4682-3-19
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification code
07; 0710; 09
Abstract
Background: Alternative splicing of pre-messenger RNA results in RNA variants with combinations of selected exons. It is one of the essential biological functions and regulatory components in higher eukaryotic cells. Some of these variants are detectable with the Affymetrix GeneChip®, which uses multiple oligonucleotide probes per gene (i.e., a probe set), since the target sequences for the multiple probes are adjacent within each gene. Hybridization intensity from a probe correlates with abundance of the corresponding transcript. Although the multiple-probe feature in the current GeneChip® was designed to assess expression values of individual genes, it also measures transcriptional abundance for a sub-region of a gene sequence. This additional capacity motivated us to develop a method to predict alternative splicing, taking advantage of extensive repositories of GeneChip® gene expression array data.
Results: We developed a two-step approach to predict alternative splicing from GeneChip® data. First, we clustered the probes from a probe set into pseudo-exons based on similarity of probe intensities and physical adjacency. A pseudo-exon is defined as a sequence in the gene within which multiple probes have comparable probe intensity values. Second, for each pseudo-exon, we assessed the statistical significance of the difference in probe intensity between two groups of samples. Differentially expressed pseudo-exons are predicted to be alternatively spliced. We applied our method to empirical data generated from GeneChip® Hu6800 arrays, which include 7129 probe sets and twenty probes per probe set. The dataset consists of sixty-nine medulloblastoma samples (27 metastatic and 42 non-metastatic) and four cerebellum samples as normal controls. We predicted that 577 genes would be alternatively spliced when we compared normal cerebellum samples to medulloblastomas, and that thirteen genes would be alternatively spliced when we compared metastatic medulloblastomas to non-metastatic ones. We checked the consistency of some of our findings with information in the UCSC Human Genome Browser.
Conclusion: The two-step approach described in this paper is capable of predicting some alternative splicing from multiple oligonucleotide-based gene expression array data with GeneChip® technology. Our method employs the extensive repositories of gene expression array data available and generates alternative splicing hypotheses, which can be further validated by experimental studies.
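
The two-step idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the greedy adjacency rule, the `max_gap` and `t_cutoff` thresholds, the use of a Welch t statistic, the function names, and the toy data are all assumptions chosen only to make the idea concrete. The input is assumed to be log2 probe intensities for one probe set, with probes listed in their physical order along the gene.

```python
from math import sqrt
from statistics import mean, variance


def cluster_pseudo_exons(probe_means, max_gap=0.5):
    """Step 1 (assumed rule): walk along physically adjacent probes and start
    a new pseudo-exon whenever a probe's mean log-intensity differs from the
    running mean of the current segment by more than max_gap."""
    segments = [[0]]
    for i in range(1, len(probe_means)):
        seg_mean = mean(probe_means[j] for j in segments[-1])
        if abs(probe_means[i] - seg_mean) <= max_gap:
            segments[-1].append(i)
        else:
            segments.append([i])
    return segments


def welch_t(a, b):
    """Welch t statistic for two independent groups (stand-in for the
    paper's significance test, which the abstract does not specify)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def predict_splicing(group_a, group_b, max_gap=0.5, t_cutoff=4.0):
    """Step 2: for each pseudo-exon, compare per-sample mean intensity
    between the two groups and flag large differences."""
    n_probes = len(group_a[0])
    pooled = group_a + group_b
    probe_means = [mean(s[p] for s in pooled) for p in range(n_probes)]
    results = []
    for seg in cluster_pseudo_exons(probe_means, max_gap):
        a = [mean(s[p] for p in seg) for s in group_a]
        b = [mean(s[p] for p in seg) for s in group_b]
        t = welch_t(a, b)
        results.append((seg, t, abs(t) > t_cutoff))
    return results


# Toy data (invented): 3 samples per group, 6 probes in genomic order.
group_a = [
    [8.0, 8.1, 7.9, 11.0, 11.1, 10.9],
    [8.2, 8.0, 8.1, 11.2, 11.0, 11.1],
    [7.9, 8.0, 8.0, 10.9, 11.0, 10.8],
]
group_b = [
    [8.1, 8.0, 8.0, 9.5, 9.6, 9.4],
    [8.0, 7.9, 8.1, 9.4, 9.5, 9.6],
    [8.1, 8.2, 7.9, 9.6, 9.4, 9.5],
]

results = predict_splicing(group_a, group_b)
for seg, t, flagged in results:
    print(seg, round(t, 2), flagged)
```

In this toy case the six probes separate into two pseudo-exons, and only the second differs clearly between the groups. A real analysis would also need intensity normalization, probe-affinity adjustment, and multiple-testing control across the 7129 probe sets, none of which is shown here.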
Pages: 9
Related papers
18 in total
[1]   A class of models for analyzing GeneChip® gene expression analysis array data (art. no. 16) [J].
Fan, WH ;
Pritchard, JI ;
Olson, JM ;
Khalid, N ;
Zhao, LP .
BMC GENOMICS, 2005, 6 (1)
[2]   Pre-mRNA splicing and human disease [J].
Faustino, NA ;
Cooper, TA .
GENES & DEVELOPMENT, 2003, 17 (04) :419-437
[3]   Genomic organization, alternative splicing, and expression patterns of the DSCR1 (Down syndrome candidate region 1) gene [J].
Fuentes, JJ ;
Pritchard, MA ;
Estivill, X .
GENOMICS, 1997, 44 (03) :358-361
[4]   GENOMIC STRUCTURE, CHROMOSOMAL LOCALIZATION, AND CONSERVED ALTERNATIVE SPLICE FORMS OF THROMBOPOIETIN [J].
GURNEY, AL ;
KUANG, WJ ;
XIE, MH ;
MALLOY, BE ;
EATON, DL ;
DESAUVAGE, FJ .
BLOOD, 1995, 85 (04) :981-988
[5]   Predicting splice variant from DNA chip expression data [J].
Hu, GK ;
Madore, SJ ;
Moldover, B ;
Jatkoe, T ;
Balaban, D ;
Thomas, J ;
Wang, YX .
GENOME RESEARCH, 2001, 11 (07) :1237-1245
[6]   Molecular isolation and characterization of a soluble isoform of activated leukocyte cell adhesion molecule that modulates endothelial cell function [J].
Ikeda, K ;
Quertermous, T .
JOURNAL OF BIOLOGICAL CHEMISTRY, 2004, 279 (53) :55315-55323
[7]   Exon/intron organization, chromosome localization, alternative splicing, and transcription units of the human apolipoprotein E receptor 2 gene [J].
Kim, DH ;
Magoori, K ;
Inoue, TR ;
Mao, CC ;
Kim, HJ ;
Suzuki, H ;
Fujita, T ;
Endo, Y ;
Saeki, S ;
Yamamoto, TT .
JOURNAL OF BIOLOGICAL CHEMISTRY, 1997, 272 (13) :8498-8504
[8]  
KRAWCZAK M, 1992, HUM GENET, V90, P41
[9]   Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection [J].
Li, C ;
Wong, WH .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2001, 98 (01) :31-36
[10]  
LIANG KY, 1986, BIOMETRIKA, V73, P13, DOI 10.1093/biomet/73.1.13