Global optimization for neural network training

Cited by: 78
Authors
Shang, Y
Wah, BW
Affiliations
[1] University of Illinois, Urbana-Champaign, IL
[2] Department of Computer Science, University of Illinois, Urbana-Champaign, IL
[3] Institute of Computing Technology, Academia Sinica, Beijing
[4] Dept. of Elec. and Comp. Engineering, Coordinated Science Laboratory, University of Illinois, Urbana-Champaign, IL
[5] Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801
Funding
National Science Foundation (US);
Keywords
DOI
10.1109/2.485892
CLC number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Many learning algorithms find their roots in function-minimization algorithms that can be classified as local- or global-minimization algorithms. Algorithms that focus exclusively on either extreme, local search or global search, do not work well. The authors propose a hybrid method, called NOVEL for Nonlinear Optimization via External Lead, that combines global and local searches to explore the solution space, locate promising regions, and find local minima. To guide exploration of the solution space, it uses a continuous, terrain-independent trace that does not get trapped in local minima. NOVEL next uses the local gradient to attract the search to a local minimum, but the trace pulls it out once little improvement is found. NOVEL then selects one initial point for each promising region and uses these points in a descent algorithm to find local minima. It thus avoids searching unpromising local minima from random starting points using computationally expensive descent algorithms. In an implementation using differential- and difference-equation solvers, NOVEL demonstrated superior performance in five benchmark comparisons against the best global optimization algorithms.
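The two-stage search described in the abstract can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the test function `f`, the linear trace schedule, and all step sizes are assumptions chosen for a one-dimensional multimodal function. A trace sweeps the domain independently of the terrain, the local gradient pulls the search point toward nearby minima while the moving trace drags it back out, and the lowest points sampled become starting points for plain gradient descent.

```python
import math

def f(x):
    """Multimodal 1-D test function with many local minima; global minimum at x = 0."""
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))

def grad_f(x):
    return 2.0 * x + 20.0 * math.pi * math.sin(2.0 * math.pi * x)

def trace_stage(steps=2000, span=5.0, pull=0.05, lr=0.002):
    """Global stage: a terrain-independent trace sweeps [-span, span]; the
    search point is attracted to the moving trace and pulled by the local
    gradient, so it visits one basin after another without getting stuck."""
    samples = []
    x = -span
    for t in range(steps):
        trace = -span + 2.0 * span * t / (steps - 1)  # linear trace schedule
        x += pull * (trace - x) - lr * grad_f(x)      # trace attraction + gradient pull
        samples.append((f(x), x))
    return samples

def descent(x, lr=0.002, iters=1000):
    """Local stage: plain gradient descent from a promising starting point."""
    for _ in range(iters):
        x -= lr * grad_f(x)
    return x

samples = trace_stage()
starts = sorted(samples)[:3]              # lowest function values seen along the trace
minima = [descent(x) for _, x in starts]  # polish each candidate with local descent
best = min(minima, key=f)
```

The descent step size is kept below 2/L for this function's curvature bound (about 397), so the local stage decreases `f` monotonically and the polished result is never worse than the best point the trace sampled.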
Pages: 45+
Number of pages: 1
Related papers
12 in total
[1] Battiti, R. First-order and second-order methods for learning: between steepest descent and Newton's method. Neural Computation, 1992, 4(2): 141-166.
[2] Corana, A.; Marchesi, M.; Martini, C.; Ridella, S. Minimizing multimodal functions of continuous variables with the "simulated annealing" algorithm. ACM Transactions on Mathematical Software, 1987, 13(3): 262-280.
[3] Dixon, L. C. W. NATO Advanced Science Institutes Series, 1994, Vol. 434: 513.
[4] Fahlman, S. Advances in Neural Information Processing Systems, 1990, Vol. 2: 524.
[5] Hindmarsh, A. C. IMACS Transactions on Scientific Computation, 1983, Vol. 1: 55.
[6] Hwang, J. IEEE Transactions on Neural Networks, 1994, Vol. 5: 1.
[7] Levy, A. V. Topics in Global Optimization, 1981, Vol. 909.
[8] Luenberger, D. G. Linear and Nonlinear Programming, 4th ed., 2015.
[9] Michalewicz, Z. Genetic Algorithms + Data Structures = Evolution Programs, 1994.
[10] Sejnowski, T. J. Complex Systems, 1987, Vol. 1: 145.