NotI flanking sequences: a tool for gene discovery and verification of the human genome

Cited by: 17
Authors
Kutsenko, AS
Gizatullin, RZ
Al-Amin, AN
Wang, FL
Kvasha, SM
Podowski, RM
Matushkin, YG
Gyanchandani, A
Muravenko, OV
Levitsky, VG
Kolchanov, KA
Protopopov, AI
Kashuba, VI
Kisselev, LL
Wasserman, W
Wahlestedt, C
Zabarovsky, ER
Affiliations
[1] Karolinska Inst, Ctr Microbiol & Tumor Biol, S-17177 Stockholm, Sweden
[2] Karolinska Inst, Ctr Gen & Bioinformat, S-17177 Stockholm, Sweden
[3] Russian Acad Sci, Inst Cytol & Genet, Novosibirsk 630090, Russia
[4] Russian Acad Sci, VA Engelhardt Mol Biol Inst, Moscow 119991, Russia
DOI
10.1093/nar/gkf428
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
A set of 22 551 unique human NotI flanking sequences (16.2 Mb) was generated. More than 40% of the set had regions with significant similarity to known proteins and expressed sequences. The data demonstrate that regions flanking NotI sites are less likely to form nucleosomes efficiently and resemble promoter regions. The draft human genome sequence contained 55.7% of the NotI flanking sequences, Celera's database contained matches to 57.2% of the clones, and all public databases (including non-human and previously sequenced NotI flanks) matched 89.2% of the NotI flanking sequences (identity ≥90% over at least 50 bp, data from December 2001). The data suggest that the shotgun sequencing approach used to generate the draft human genome sequence resulted in a bias against cloning and sequencing of NotI flanks. A rough estimation (based primarily on chromosomes 21 and 22) is that the human genome contains 15 000-20 000 NotI sites, of which 6000-9000 are unmethylated in any particular cell. The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and that unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.
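The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes only the NotI recognition sequence (GCGGCCGC), the match criterion quoted in the abstract (identity ≥90% over at least 50 bp), and the commonly used observed/expected CpG ratio. The function names `find_noti_sites`, `cpg_stats`, and `passes_match_criterion` are hypothetical and were chosen for this example.

```python
import re

NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence


def find_noti_sites(seq):
    """Return 0-based start positions of every NotI site in seq."""
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are also reported.
    return [m.start() for m in re.finditer(f"(?={NOTI_SITE})", seq)]


def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for seq."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    # Observed CpG count divided by the count expected from base composition.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def passes_match_criterion(identity_pct, aligned_len):
    """Apply the abstract's threshold: identity >= 90% over at least 50 bp."""
    return identity_pct >= 90.0 and aligned_len >= 50
```

A window can have a high observed/expected CpG ratio while its overall GC fraction stays low. Reporting the two values separately lets a caller look for the kind of region the abstract describes, which a single combined threshold might exclude.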
Pages: 3163-3170
Page count: 8