A general framework for adaptive processing of data structures

Cited by: 301
Authors
Frasconi, P [1 ]
Gori, M
Sperduti, A
Affiliations
[1] Univ Florence, Dipartimento Sistemi & Informat, I-50139 Florence, Italy
[2] Univ Siena, Dipartimento Ingn Informaz, I-53100 Siena, Italy
[3] Univ Pisa, Dipartimento Informat, Pisa, Italy
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1998, Vol. 9, No. 5
Keywords
graphical models; graphs; learning data structures; problem-solving; recurrent neural networks; recursive neural networks; sequences; syntactic pattern recognition;
DOI
10.1109/72.712151
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive models like artificial neural nets and belief nets for the problem of processing structured information. In particular, relations between data variables are expressed by directed acyclic graphs, where both numerical and categorical values coexist. The general framework proposed in this paper can be regarded as an extension of both recurrent neural networks and hidden Markov models to the case of acyclic graphs. In particular, we study the supervised learning problem as the problem of learning transductions from an input structured space to an output structured space, where transductions are assumed to admit a recursive hidden state-space representation. We introduce a graphical formalism for representing this class of adaptive transductions by means of recursive networks, i.e., cyclic graphs where nodes are labeled by variables and edges are labeled by generalized delay elements. This representation makes it possible to incorporate the symbolic and subsymbolic nature of data. Structures are processed by unfolding the recursive network into an acyclic graph called the encoding network. In so doing, inference and learning algorithms can be easily inherited from the corresponding algorithms for artificial neural networks or probabilistic graphical models.
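The core idea of the abstract, processing a DAG by unfolding a shared recursive cell over its nodes so that each node's state depends on its label and its children's states, can be illustrated with a minimal sketch. This is not the authors' code; the scalar weights `w_label`, `w_child`, and `bias` and the `tanh` cell are hypothetical stand-ins for the shared parameters of a recursive network.

```python
# Illustrative sketch (not from the paper): encoding a labeled DAG by
# "unfolding" one shared cell over its nodes, children before parents,
# which mirrors the construction of the encoding network.
import math

def recursive_encode(nodes, children, labels, w_label, w_child, bias):
    """Compute a scalar hidden state for every node of a DAG.

    nodes: node ids in topological order (parents before children)
    children: dict node -> list of child node ids
    labels: dict node -> numeric node label
    w_label, w_child, bias: hypothetical scalar weights of the shared cell
    """
    state = {}
    # Reverse topological order guarantees every child is encoded
    # before its parents, so child states are always available.
    for node in reversed(nodes):
        child_sum = sum(state[c] for c in children.get(node, []))
        state[node] = math.tanh(w_label * labels[node]
                                + w_child * child_sum + bias)
    return state

# Tiny DAG: root 'a' points to 'b' and 'c'; 'b' also points to 'c',
# so node 'c' is shared (a DAG, not a tree).
nodes = ["a", "b", "c"]
children = {"a": ["b", "c"], "b": ["c"]}
labels = {"a": 1.0, "b": 0.5, "c": -0.5}
states = recursive_encode(nodes, children, labels, 0.8, 0.6, 0.1)
# states["a"] is a fixed-size encoding of the whole structure, which a
# supervised transduction could then map to an output.
```

In the full framework the same unfolded graph also supports the probabilistic reading: replacing the deterministic cell with conditional distributions yields the hidden-Markov-model-style instantiation mentioned in the abstract.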
Pages: 768-786
Page count: 19
References (60 total; first 10 shown)
[1] Angluin, D. (1983). ACM Computing Surveys, 15, 237.
[2] Anonymous. (1991). Advances in Neural Information Processing Systems.
[3] Back, A. D. (1995). Advances in Neural Information Processing Systems, 7, 883.
[4] Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157-166.
[5] Bengio, Y., & Frasconi, P. (1996). Input-output HMM's for sequence processing. IEEE Transactions on Neural Networks, 7(5), 1231-1249.
[6] Bengio, Y. (1996). Advances in Neural Information Processing Systems, 8.
[7] Bishop, C. M. (1995). Neural Networks for Pattern Recognition.
[8] Buntine, W. L. (1994). Operations for learning with graphical models. Journal of Artificial Intelligence Research, 2, 159-225.
[9] McCabe, T. J. (1976). IEEE Transactions on Software Engineering, 2, 308.
[10] Cadoret, V. (1994). Proceedings of ECAI, Amsterdam, Netherlands, 555.