Predictive lithologic mapping of South Korea from geochemical data using decision trees

Cited by: 18
Authors
Bacal, Ma Chrizelle Joyce Orillo [1]
Hwang, SangGi [1]
Guevarra-Segura, Ivy [1]
Affiliation
[1] Pai Chai Univ, Dept Civil Environm & Railrd Engn, Daejeon, South Korea
Keywords
Multivariate classification; Decision trees; Geochemical pattern; Digital geologic mapping; South Korea; Remote-sensing data; Support vector machine; Mineral prospectivity; Random forests; Airborne geophysics; Anomalies; Classification; Identification; Deposits; Province
DOI
10.1016/j.gexplo.2019.06.008
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification code
0708; 070902
Abstract
Two machine learning algorithms, C4.5 and random forest, both decision-tree-based methods, were used to establish a direct relationship between geochemical maps of South Korea and its geology. Using a large database of geochemical and lithologic properties, inconsistencies in the a priori lithologic information were corrected through confusion matrix analysis and F-measure comparison across iterative C4.5 runs. This corrective step yielded eighteen rock classes, but the subsequent C4.5 and random forest applications focused on classifying only the ten most common rock units. Geologic age was included as an attribute at this stage. Results were assessed using accuracy, precision, recall, the kappa statistic, and F-measure. Average concentrations of major oxides, computed from records of correctly classified rock units, were evaluated after Z-score normalization. C4.5 classification predicted the spatial distribution of key lithologic units at 87%, whereas random forest classification reached 96%. For both decision tree models, the average standardized concentrations of major oxides in each lithology agree with accepted geologic knowledge, which supports the validity of the results. Rock age was identified as the most important predictor, whereas the major oxides Al2O3, Na2O, and MgO, together with the trace elements Cr, Ni, and Cu, are the strongest numeric predictors. Misclassified data points are mainly attributed to interpolation errors at or near map polygon boundaries, especially where map polygons are smaller than 50 km², and/or to natural and anthropogenic contamination. Despite these misclassifications, decision trees proved effective for classifying lithologic units and can therefore reproduce a reliable geologic map from the geochemical data of South Korea.
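The assessment measures named in the abstract (accuracy, precision, recall, F-measure, the kappa statistic, and Z-score normalization) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `evaluate` and `z_scores`, the rock labels, and the toy data are all hypothetical, and kappa is computed here as Cohen's kappa from the confusion counts, which is an assumption about the statistic the paper reports.

```python
from collections import Counter
from statistics import mean, pstdev


def evaluate(actual, predicted):
    """Return (accuracy, Cohen's kappa, per-class precision/recall/F-measure)."""
    labels = sorted(set(actual) | set(predicted))
    # Confusion counts keyed by (actual label, predicted label);
    # a Counter returns 0 for pairs that never occur.
    counts = Counter(zip(actual, predicted))
    n = len(actual)

    accuracy = sum(counts[(c, c)] for c in labels) / n

    per_class = {}
    expected = 0.0  # agreement expected by chance, used for kappa
    for c in labels:
        tp = counts[(c, c)]
        n_pred = sum(counts[(a, c)] for a in labels)  # column total
        n_act = sum(counts[(c, p)] for p in labels)   # row total
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_act if n_act else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        per_class[c] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }
        expected += (n_act / n) * (n_pred / n)

    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0
    return accuracy, kappa, per_class


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical toy labels, for illustration only (not data from the paper).
actual = ["granite", "granite", "granite", "gneiss", "gneiss", "basalt"]
predicted = ["granite", "granite", "gneiss", "gneiss", "gneiss", "basalt"]
accuracy, kappa, per_class = evaluate(actual, predicted)
```

In this sketch, a per-class F-measure that stays low across repeated runs would flag a rock class whose a priori labels deserve a second look, which is the general idea behind the corrective step described above.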
Pages: 15