Optimized design and assessment of whole genome tiling arrays

Cited by: 41
Authors
Graef, Stefan
Nielsen, Fiona G. G.
Kurtz, Stefan
Huynen, Martijn A.
Birney, Ewan
Stunnenberg, Henk
Flicek, Paul [1]
Affiliations
[1] EMBL European Bioinformat Inst, Cambridge, England
[2] Radboud Univ Nijmegen, Nijmegen Ctr Mol Life Sci, Nijmegen, Netherlands
[3] Radboud Univ Nijmegen Med Ctr, Nijmegen Ctr Mol Life Sci, Nijmegen, Netherlands
[4] Univ Hamburg, Ctr Bioinformat, Hamburg, Germany
Funding
Wellcome Trust (UK)
DOI
10.1093/bioinformatics/btm200
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Recent advances in microarray technologies have made it feasible to interrogate whole genomes with tiling arrays, and this technique is rapidly becoming one of the most important high-throughput functional genomics assays. For large mammalian genomes, analyzing oligonucleotide tiling array data is complicated by the presence of non-unique sequences on the array, which increases the overall noise in the data and may lead to false positive results due to cross-hybridization. The ability to create custom microarrays using maskless array synthesis has led us to consider ways to optimize array design characteristics for improving data quality and analysis. We have identified a number of design parameters to be optimized, including uniqueness of the probe sequences within the whole genome, melting temperature and self-hybridization potential.
Results: We introduce the uniqueness score, U, a novel quality measure for oligonucleotide probes, and present a method to quickly compute it. We show that U is equivalent to the number of shortest unique substrings in the probe and describe an efficient greedy algorithm to design mammalian whole genome tiling arrays using probes that maximize U. Using the mouse genome, we demonstrate how several optimizations influence the tiling array design characteristics. With a sensible set of parameters, our designs cover 78% of the mouse genome, including many regions previously considered 'untilable' due to the presence of repetitive sequence. Finally, we compare our whole genome tiling array designs with commercially available designs.
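The statement that U equals the number of shortest unique substrings in the probe can be illustrated with a small sketch. It is not the paper's method: the abstract gives no formal definition, so the code assumes one plausible reading, namely that a probe scores one point for each start position whose minimal unique prefix (the shortest substring starting there that occurs exactly once in the genome) lies entirely inside the probe. The function names `min_unique_prefix_lengths` and `uniqueness_score` are hypothetical, the counting is naive rather than index-based, and reverse-complement matches are ignored.

```python
from collections import Counter
from typing import List, Optional


def min_unique_prefix_lengths(genome: str, max_len: int) -> List[Optional[int]]:
    """For each start i, return the smallest l <= max_len such that
    genome[i:i+l] occurs exactly once in genome, or None if there is none.

    Naive illustration only; a whole-genome design would need an index
    structure instead of repeated substring counting.
    """
    n = len(genome)
    result: List[Optional[int]] = [None] * n
    unresolved = set(range(n))
    for length in range(1, max_len + 1):
        # Count every substring of the current length once.
        counts = Counter(genome[i:i + length] for i in range(n - length + 1))
        for i in list(unresolved):
            if i + length <= n and counts[genome[i:i + length]] == 1:
                result[i] = length
                unresolved.discard(i)
    return result


def uniqueness_score(mup: List[Optional[int]], start: int, probe_len: int) -> int:
    """Assumed reading of U: count positions in the probe whose minimal
    unique prefix ends inside the probe."""
    end = start + probe_len
    return sum(
        1
        for i in range(start, end)
        if mup[i] is not None and i + mup[i] <= end
    )


if __name__ == "__main__":
    toy_genome = "ACGTACGA"  # toy sequence, not real data
    mup = min_unique_prefix_lengths(toy_genome, max_len=len(toy_genome))
    for start in range(len(toy_genome) - 3):
        probe = toy_genome[start:start + 4]
        print(start, probe, uniqueness_score(mup, start, 4))
```

Under this reading, a probe made up entirely of repeated sequence scores 0, and the score rises as more of the probe's positions anchor a substring found nowhere else in the genome.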
Pages: I195-I204
Page count: 10