The properties of high-dimensional data spaces: implications for exploring gene and protein expression data

Cited by: 388
Authors
Clarke, Robert [1, 2, 3]
Ressom, Habtom W. [1, 2, 4]
Wang, Antai [4]
Xuan, Jianhua [5]
Liu, Minetta C. [1, 2]
Gehan, Edmund A. [4]
Wang, Yue [5]
Affiliations
[1] Georgetown Univ, Sch Med, Dept Oncol, Washington, DC 20057 USA
[2] Georgetown Univ, Sch Med, Lombardi Comprehens Canc Ctr, Washington, DC 20057 USA
[3] Georgetown Univ, Sch Med, Dept Physiol & Biophys, Washington, DC 20057 USA
[4] Georgetown Univ, Sch Med, Dept Biostat & Bioinformat & Biomath, Washington, DC 20057 USA
[5] Virginia Polytech Inst & State Univ, Sch Sci & Engn, Bradley Dept Elect & Comp Engn, Arlington, VA 22203 USA
DOI
10.1038/nrc2294
Chinese Library Classification (CLC) number
R73 [Oncology]
Discipline classification code
100214
Abstract
High-throughput genomic and proteomic technologies are widely used in cancer research to build better predictive models of diagnosis, prognosis and therapy, to identify and characterize key signalling networks and to find new targets for drug development. These technologies present investigators with the task of extracting meaningful statistical and biological information from high-dimensional data spaces, wherein each sample is defined by hundreds or thousands of measurements, usually concurrently obtained. The properties of high dimensionality are often poorly understood or overlooked in data modelling and analysis. From the perspective of translational science, this Review discusses the properties of high-dimensional data spaces that arise in genomic and proteomic studies and the challenges they can pose for data analysis and interpretation.
Pages: 37-49
Number of pages: 13