Convergence of exponentiated gradient algorithms

被引：23

作者：

Hill, SI ^{[1
]}

Williamson, RC ^{[1
]}

机构：

[1] Australian Natl Univ, Res Sch Informat Sci & Engn, Canberra, ACT, Australia

来源：

IEEE TRANSACTIONS ON SIGNAL PROCESSING | 2001年 / 49卷 / 06期

基金：

澳大利亚研究理事会;

关键词：

acoustic echo cancellation; echo; exponentiated gradient descent; learning rate; mean squared error; room acoustics;

D O I：

10.1109/78.923303

中图分类号：

TM [电工技术]; TN [电子技术、通信技术];

学科分类号：

0808 ; 0809 ;

摘要：

This paper studies three related algorithms: the (traditional) gradient descent (GD) algorithm, the exponentiated gradient algorithm with positive and negative weights (EG +/- algorithm), and the exponentiated gradient algorithm with unnormalized positive and negative weights (EGU +/- algorithm). These algorithms have been previously analyzed using the "mistake-bound framework" in the computational learning theory community, In this paper, we perform a traditional signal processing analysis in terms of the mean square error. A relationship between the learning rate and the mean squared error (MSE) of predictions is found for the family of algorithms. This is used to compare the performance of the algorithms by choosing learning rates such that they converge to the same steady-state MSE, We demonstrate that if the target weight vector is sparse, the EG +/- algorithm typically converges more quickly than the GD or EGU +/- algorithms that perform very similarly. A side effect of our analysis is a reparametrization of the algorithms that provides insights into their behavior, The general form of the results we obtain are consistent with those obtained in the mistake-bound framework. The application of the algorithms to acoustic echo cancellation is then studied, and it is shown in some circumstances that the EG +/- algorithm will converge faster than the other two algorithms.

引用

页码：1208 / 1215

页数：8

共 13 条

[1] IMAGE METHOD FOR EFFICIENTLY SIMULATING SMALL-ROOM ACOUSTICS [J].

ALLEN, JB ;

BERKLEY, DA .

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1979, 65 (04) :943-950

[2]

[Anonymous], 1988, LECT ADAPTIVE PARAME

[3]

Beranek L.L., 1996, Acoustics, Vfifth

[4]

Clarkson P. M., 1993, OPTIMAL ADAPTIVE SIG

[5]

Gordon G. J., 1999, Proceedings of the Twelfth Annual Conference on Computational Learning Theory, P29, DOI 10.1145/307400.307410

[6]

Grove A. J., 1997, Proceedings of the Tenth Annual Conference on Computational Learning Theory, P171, DOI 10.1145/267460.267493

[7] LMS estimation via structural detection [J].

Homer, J ;

Mareels, I ;

Bitmead, RR ;

Wahlberg, B ;

Gustafsson, F .

IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1998, 46 (10) :2651-2663

[8] Exponentiated gradient versus gradient descent for linear predictors [J].

Kivinen, J ;

Warmuth, MK .

INFORMATION AND COMPUTATION, 1997, 132 (01) :1-63

[9]

Kivinen J, 1998, ADV NEUR IN, V10, P287

[10] A fast convergence algorithm for adaptive FIR filters under computational constraint for adaptive tap-position control [J].

Sugiyama, A ;

Sato, H ;

Hirano, A ;

Ikeda, S .

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-ANALOG AND DIGITAL SIGNAL PROCESSING, 1996, 43 (09) :629-636

← 1 2 →