Connection pruning with static and adaptive pruning schedules

Cited by: 23
Authors
Prechelt, L
Affiliation
[1] Fakultät für Informatik, Universität Karlsruhe
Keywords
empirical study; pruning; early stopping; generalization
DOI
10.1016/S0925-2312(96)00054-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Neural network pruning methods that operate at the level of individual network parameters (e.g. connection weights) can improve generalization, as this empirical study shows. However, an open problem in the pruning methods known today (e.g. OBD, OBS, autoprune, epsiprune) is how to select the number of parameters to remove in each pruning step (the pruning strength). This work presents a pruning method, Iprune, that automatically adapts the pruning strength to the evolution of the weights and to the loss of generalization during training. The method requires no adjustment of algorithm parameters by the user. Results of statistical significance tests comparing autoprune, Iprune, and static networks with early stopping are given, based on extensive experimentation with 14 different problems. The results indicate that training with pruning is often significantly better, and rarely significantly worse, than training with early stopping without pruning. Furthermore, Iprune is often superior to autoprune (which is in turn superior to OBD) on diagnosis tasks, unless severe pruning early in the training process is required.
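The idea of tying pruning strength to the loss of generalization can be illustrated with a small sketch. This is not the rule used by Iprune, which the abstract does not specify: the linear schedule in `adaptive_prune_fraction`, its parameters `base`, `gain`, and `cap`, and the use of plain weight magnitude as the importance measure are all assumptions made for illustration. Generalization loss is taken here as the percentage by which the latest validation error exceeds the best validation error seen so far, and validation errors are assumed to be positive.

```python
def generalization_loss(val_errors):
    """Percent increase of the latest validation error over the best seen so far."""
    best = min(val_errors)
    return 100.0 * (val_errors[-1] / best - 1.0)


def adaptive_prune_fraction(val_errors, base=0.05, gain=0.02, cap=0.5):
    """Hypothetical schedule (an assumption, not the paper's rule):
    prune a larger fraction when validation error has drifted above its best."""
    return min(cap, base + gain * generalization_loss(val_errors))


def prune_smallest(weights, fraction):
    """Zero out the given fraction of still-active weights with smallest magnitude.
    Magnitude is a stand-in importance measure, chosen only for simplicity."""
    active = [i for i, w in enumerate(weights) if w != 0.0]
    n_remove = int(fraction * len(active))
    to_remove = sorted(active, key=lambda i: abs(weights[i]))[:n_remove]
    pruned = list(weights)
    for i in to_remove:
        pruned[i] = 0.0
    return pruned


# Toy data: ten weights and a validation curve that has started to rise.
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, -0.03, 0.6, 0.1, -0.3]
val_errors = [0.50, 0.40, 0.44]

fraction = adaptive_prune_fraction(val_errors)
pruned = prune_smallest(weights, fraction)
```

With a static schedule, `fraction` would be a fixed constant chosen by the user; in the sketch it is recomputed from the validation curve before each pruning step, which is the kind of adaptation the abstract describes.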
Pages: 49-61
Number of pages: 13