Machine learning methods for microbial source tracking

Cited by: 30
Authors
Belanche-Munoz, Lluis [1]
Blanch, Anicet R. [2]
Affiliations
[1] Univ Politecn Cataluna, Dept Software, Barcelona, Catalonia, Spain
[2] Univ Barcelona, Dept Microbiol, Barcelona, Spain
Keywords
microbial source tracking; water; machine learning methods; microbial indicators; faecal pollution
DOI
10.1016/j.envsoft.2007.09.013
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This paper reports a successful application of statistical and inductive learning methods to identify optimal discriminating parameters and to develop predictive models for determining the sources of faecal pollution in waters recently and heavily polluted with wastewaters (microbial source tracking). The data come from an international study in which various microbial and chemical parameters were measured in heavily polluted waters from diverse geographical areas. A total of 38 variables derived from the microbial and chemical parameters were defined to characterise the 103 available observations. Four methods were evaluated: a Euclidean k-nearest-neighbour classifier, a linear Bayesian classifier, a quadratic Bayesian classifier and a support vector machine. The main aim was to obtain highly accurate predictive models using the smallest possible number of variables. After a strong feature selection process, the results show that predictive models using only two variables achieve 100% correct classification. These solutions use a linear combination of a discriminating tracer (the enumeration of phages infecting Bacteroides thetaiotaomicron) and a universal non-discriminating faecal indicator. Other models that do not use the discriminating tracer were also developed, although they needed more variables to achieve a high rate of correct classification. (c) 2007 Elsevier Ltd. All rights reserved.
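As a rough illustration of the workflow the abstract describes (an exhaustive search for a small variable subset, scored with a Euclidean k-nearest-neighbour classifier), the following minimal sketch may help. It is not the authors' code or data: the synthetic data set, the leave-one-out scoring, the choice of k = 3, and the names `make_synthetic`, `loo_accuracy` and `best_subset` are all assumptions made for this example. Variable 0 merely stands in for a hypothetical discriminating tracer.

```python
import itertools
import math
import random


def make_synthetic(n=60, n_vars=6, seed=0):
    """Build a toy two-class data set (hypothetical, not the study's data).

    Every variable is bounded noise in [-1, 1]; variable 0 is shifted by 10
    for class 1, so it acts as a stand-in for a discriminating tracer.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.uniform(-1.0, 1.0) for _ in range(n_vars)]
        row[0] += 10.0 * label
        X.append(row)
        y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(sorted(votes), key=votes.get)


def loo_accuracy(X, y, columns, k=3):
    """Leave-one-out accuracy of k-NN restricted to the given columns."""
    proj = [[row[c] for c in columns] for row in X]
    hits = 0
    for i, x in enumerate(proj):
        train_X = proj[:i] + proj[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, x, k) == y[i]
    return hits / len(proj)


def best_subset(X, y, size=2, k=3):
    """Exhaustively score every variable subset of the given size."""
    candidates = itertools.combinations(range(len(X[0])), size)
    return max(
        ((loo_accuracy(X, y, cols, k), cols) for cols in candidates),
        key=lambda scored: scored[0],
    )


if __name__ == "__main__":
    X, y = make_synthetic()
    accuracy, columns = best_subset(X, y, size=2)
    print(f"best pair of variables: {columns}, leave-one-out accuracy: {accuracy:.2f}")
```

Exhaustive search is only practical for very small subsets: with 38 variables there are 703 pairs, which is cheap, but the count grows quickly with subset size. The abstract does not say which feature selection procedure or validation scheme the study used, so both choices here are placeholders.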
Pages: 741-750
Page count: 10