Hyperopt: A Python library for model selection and hyperparameter optimization

Cited by: 273
Authors
Bergstra, James [1]
Komer, Brent [1]
Eliasmith, Chris [1]
Yamins, Dan [2]
Cox, David D [3]
Affiliations
[1] University of Waterloo, Canada
[2] Massachusetts Institute of Technology, United States
[3] Harvard University, United States
Funding
National Science Foundation (United States)
Keywords
Bayesian optimization; machine learning; Python; Scikit-learn
DOI
10.1088/1749-4699/8/1/014008
Abstract
Sequential model-based optimization (also known as Bayesian optimization) is one of the most efficient methods (per function evaluation) of function minimization. This efficiency makes it appropriate for optimizing the hyperparameters of machine learning algorithms that are slow to train. The Hyperopt library provides algorithms and parallelization infrastructure for performing hyperparameter optimization (model selection) in Python. This paper presents an introductory tutorial on the usage of the Hyperopt library, including the description of search spaces, minimization (in serial and parallel), and the analysis of the results collected in the course of minimization. This paper also gives an overview of Hyperopt-Sklearn, a software project that provides automatic algorithm configuration of the Scikit-learn machine learning library. Following Auto-Weka, we take the view that the choice of classifier and even the choice of preprocessing module can be taken together to represent a single large hyperparameter optimization problem. We use Hyperopt to define a search space that encompasses many standard components (e.g. SVM, RF, KNN, PCA, TFIDF) and common patterns of composing them together. We demonstrate, using search algorithms in Hyperopt and standard benchmarking data sets (MNIST, 20-newsgroups, convex shapes), that searching this space is practical and effective. In particular, we improve on best-known scores for the model space for both MNIST and convex shapes. The paper closes with some discussion of ongoing and future work. © 2015 IOP Publishing Ltd.
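The abstract's view that the choice of classifier and its hyperparameters form one large search problem can be illustrated with a minimal sketch. It uses only the Python standard library and plain random search; it is not Hyperopt's API and not sequential model-based optimization. The names sample_config, toy_loss and random_search are hypothetical, and the loss is a made-up stand-in for a real validation score.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from a conditional search space.

    The classifier is chosen first; only the hyperparameters that
    belong to the chosen classifier are then sampled.
    """
    kind = rng.choice(["svm", "knn"])
    if kind == "svm":
        # Log-uniform draw for C over [1e-3, 1e3] (assumed range).
        return {"classifier": "svm", "C": 10 ** rng.uniform(-3, 3)}
    # Uniform integer draw for the neighbour count (assumed range).
    return {"classifier": "knn", "n_neighbors": rng.randint(1, 30)}


def toy_loss(config):
    """Made-up stand-in for a validation loss.

    A real objective would train the configured model and return
    its error on held-out data.
    """
    if config["classifier"] == "svm":
        return 0.10 + 0.05 * abs(math.log10(config["C"]) - 1.0)
    return 0.15 + 0.01 * abs(config["n_neighbors"] - 5)


def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and keep every result."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        config = sample_config(rng)
        trials.append({"config": config, "loss": toy_loss(config)})
    # The stored trials allow later analysis; the best one has the lowest loss.
    best = min(trials, key=lambda t: t["loss"])
    return best, trials


if __name__ == "__main__":
    best, trials = random_search(n_trials=50)
    print(best["config"], best["loss"])
```

Keeping the full list of trials, rather than only the best configuration, mirrors the abstract's point that the results collected during minimization are themselves worth analysing. A model-based method would replace the independent draws in random_search with proposals informed by earlier trials.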