An Introduction to Data Mining

Cited by: 10
Authors
Apostolakis, Joannis [1]
Affiliation
[1] Univ Munich, Inst Informat, D-80538 Munich, Germany
Source
DATA MINING IN CRYSTALLOGRAPHY | 2010 / Vol. 134
Keywords
MULTILAYER FEEDFORWARD NETWORKS; RELATE; 2; SETS; POTENTIALS; ROTATION;
DOI
10.1007/430-2009_1
Chinese Library Classification
O61 [Inorganic chemistry];
Discipline classification code
070301 ; 081704 ;
Abstract
Data mining aims at the automated discovery of knowledge from typically large repositories of data. In science, this knowledge is most often integrated into a model describing a particular process or natural phenomenon. Requirements with respect to the predictivity and the generality of the resulting models are usually significantly higher than in other application domains. Therefore, in the use of data mining in the sciences, and crystallography in particular, methods from machine learning and statistics play a significantly larger role than in other application areas. In the context of crystallography, data collection, cleaning, and warehousing are aspects of standard data mining that play an important role, whereas the analysis of the data mostly relies on techniques from machine learning and statistical analysis. The purpose of this chapter is to introduce the reader to the concepts from the latter part of the knowledge discovery process and to provide a general intuition for the methods and possibilities of the different tools for learning from databases.
Pages: 1 / 35
Number of pages: 35