40 entries in total
[1]
[Anonymous], 2010, P PYTH SCI C
[2]
[Anonymous], rmsprop: divide the gradient by a running average of its recent magnitude
[3]
[Anonymous], 2013, ASS COMPUT LINGUIST
[4]
[Anonymous], 2013, P 7 INT WORKSH SEM E
[5]
[Anonymous], 2016, P NAACL HLT
[6]
[Anonymous], 2005, P 43 ANN M ASS COMP
[7]
[Anonymous], 2015, P 6 INT WORKSHOP HLT, DOI 10.18653/V1/W15-2608
[8]
LEARNING LONG-TERM DEPENDENCIES WITH GRADIENT DESCENT IS DIFFICULT [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 1994, 5(02): 157-166
[9]
Chalapathy R., ARXIV160907585
[10]
Collobert R, 2011, J MACH LEARN RES, V12, P2493