QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data

Cited by: 453
Authors
Colella, Stefano
Yau, Christopher
Taylor, Jennifer M.
Mirza, Ghazala
Butler, Helen
Clouston, Penny
Bassett, Anne S.
Seller, Anneke
Holmes, Christopher C.
Ragoussis, Jiannis
Affiliations
[1] Wellcome Trust Ctr Human Genet, Genom Lab, Oxford OX3 7BN, England
[2] Life Sci Interface Doctoral Training Ctr, Oxford OX1 3QD, England
[3] Univ Oxford, Dept Stat, Henry Wellcome Ctr Gene Funct, Oxford OX1 3TG, England
[4] Churchill Hosp, Oxford Med Genet Labs, Oxford OX3 7LJ, England
[5] Univ Toronto, Ctr Addict & Mental Hlth, Toronto, ON M6J 1H4, Canada
[6] MRC, Mammalian Genet Unit, Didcot OX11 0RD, Oxon, England
Funding
Medical Research Council (UK); Wellcome Trust (UK)
DOI
10.1093/nar/gkm076
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Array-based technologies have been used to detect chromosomal copy number changes (aneuploidies) in the human genome. Recent studies identified numerous copy number variants (CNVs), some of which are common polymorphisms that may contribute to disease susceptibility. We developed, and experimentally validated, a novel computational framework (QuantiSNP) for detecting regions of copy number variation from BeadArray™ SNP genotyping data using an Objective Bayes Hidden-Markov Model (OB-HMM). Objective Bayes measures are used to set certain hyperparameters in the priors, using a novel re-sampling framework to calibrate the model to a fixed Type I (false positive) error rate. Other parameters are set via maximum marginal likelihood on prior training data of known structure. QuantiSNP provides probabilistic quantification of state classifications and significantly improves the accuracy of segmental aneuploidy identification and mapping relative to existing analytical tools (BeadStudio, Illumina), as demonstrated by validation of breakpoint boundaries. QuantiSNP identified both novel and validated CNVs. QuantiSNP was developed using BeadArray™ SNP data, but it can be adapted to other platforms, and we believe that the OB-HMM framework has widespread applicability in genomic research. In conclusion, QuantiSNP is a novel algorithm for high-resolution CNV/aneuploidy detection with application to clinical genetics, cancer and disease association studies.
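To illustrate the general idea of segmenting probe intensities with a hidden Markov model, here is a minimal sketch in Python. It is not the QuantiSNP implementation: it uses a plain three-state Gaussian HMM with Viterbi decoding, it omits the Objective Bayes calibration and the re-sampling step described in the abstract, and every numeric value (state means, standard deviation, transition probability) is an assumption chosen only for illustration. The function names `viterbi` and `segments` are hypothetical.

```python
import math

# Hypothetical copy-number states and the mean log R ratio assumed for each.
STATES = [1, 2, 3]                 # deletion, normal, duplication
MEANS = {1: -0.6, 2: 0.0, 3: 0.4}  # assumed values, not taken from the paper
SIGMA = 0.2                        # assumed emission standard deviation
STAY = 0.99                        # assumed probability of staying in a state


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def log_trans(a, b):
    """Log transition probability: stay with STAY, otherwise split evenly."""
    if a == b:
        return math.log(STAY)
    return math.log((1 - STAY) / (len(STATES) - 1))


def viterbi(log_r):
    """Return the most probable state path for a list of log R ratios."""
    if not log_r:
        return []
    # Uniform initial distribution over states.
    score = {s: -math.log(len(STATES)) + log_gauss(log_r[0], MEANS[s], SIGMA)
             for s in STATES}
    back = []
    for x in log_r[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + log_trans(p, s))
            new_score[s] = score[prev] + log_trans(prev, s) + log_gauss(x, MEANS[s], SIGMA)
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def segments(path):
    """Collapse a state path into (start_index, end_index, state) runs."""
    runs = []
    for i, s in enumerate(path):
        if runs and runs[-1][2] == s:
            runs[-1] = (runs[-1][0], i, s)
        else:
            runs.append((i, i, s))
    return runs
```

For example, `segments(viterbi([0.0] * 10 + [-0.6] * 8 + [0.0] * 10))` reports a run of state 1 spanning probes 10 to 17 between two normal runs, whereas a single outlying probe is absorbed into the surrounding normal segment because the assumed transition penalty outweighs one probe's evidence.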
Pages: 2013-2025
Page count: 13