Data mining in bioinformatics using Weka

Cited by: 699
Authors
Frank, E
Hall, M
Trigg, L
Holmes, G
Witten, IH
Affiliations
[1] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
[2] Reel Two, Hamilton, New Zealand
DOI
10.1093/bioinformatics/bth261
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
The Weka machine learning workbench provides a general-purpose environment for automatic classification, regression, clustering and feature selection, all of which are common data mining problems in bioinformatics research. It contains an extensive collection of machine learning algorithms and data pre-processing methods, complemented by graphical user interfaces for data exploration and for the experimental comparison of different machine learning techniques on the same problem. Weka can process data given in the form of a single relational table. Its main objectives are to (a) assist users in extracting useful information from data and (b) enable them to easily identify a suitable algorithm for generating an accurate predictive model from it.
Pages: 2479-2481
Page count: 3
Related papers
9 in total
  • [1] Automated annotation of keywords for proteins related to Mycoplasmataceae using machine learning techniques
    Bazzan, ALC
    Engel, PM
    Schroeder, LF
    da Silva, SC
    [J]. BIOINFORMATICS, 2002, 18 : S35 - S43
  • [2] Towards a computational model for -1 eukaryotic frameshifting sites
    Bekaert, M
    Bidou, L
    Denise, A
    Duchateau-Nguyen, G
    Forest, JP
    Froidevaux, C
    Hatin, I
    Rousset, JP
    Termier, M
    [J]. BIOINFORMATICS, 2003, 19 (03) : 327 - 335
  • [3] Automatic rule generation for protein annotation with the C4.5 data mining algorithm applied on SWISS-PROT
    Kretschmann, E
    Fleischmann, W
    Apweiler, R
    [J]. BIOINFORMATICS, 2001, 17 (10) : 920 - 926
  • [4] LI J, 2003, BIOINFORMATICS S2, V19, P93
  • [5] Simple rules underlying gene expression profiles of more than six subtypes of acute lymphoblastic leukemia (ALL) patients
    Li, JY
    Liu, HQ
    Downing, JR
    Yeoh, AEJ
    Wong, LS
    [J]. BIOINFORMATICS, 2003, 19 (01) : 71 - 78
  • [6] Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns
    Li, JY
    Wong, LS
    [J]. BIOINFORMATICS, 2002, 18 (05) : 725 - 734
  • [7] Application of metabolomics to plant genotype discrimination using statistics and machine learning
    Taylor, J
    King, RD
    Altmann, T
    Fiehn, O
    [J]. BIOINFORMATICS, 2002, 18 : S241 - S248
  • [8] Tobler J B, 2002, Bioinformatics, V18 Suppl 1, pS164
  • [9] Witten IH, Frank E: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edition. San Francisco: Morgan Kaufmann Publishers; 2005:560. ISBN 0-12-088407-0, £34.99
    Francisco Azuaje
    [J]. BioMedical Engineering OnLine, 5 (1)