The Tibetan Microblog Text Representation Method Based on Shallow Parsing

Cited by: 3
Authors
Li Ailin [1]
Yu Hongzhi [1]
Yuan Bin [1]
Affiliation
[1] Northwest Univ Nationalities, Natl Languages Informat Technol, Lanzhou 730030, Gansu, Peoples R China
Source
2015 8TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 1 | 2015
Keywords
text representation; Tibetan microblog; semantic space; K-means; sentiment categorization;
DOI
10.1109/ISCID.2015.297
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tibetan text representation, which strongly influences Tibetan text categorization and clustering, is the groundwork of Tibetan text mining. Tibetan microblogs are among the most popular Tibetan network media, and research on them is increasing. However, because of the special features of microblog text and of the Tibetan language, traditional Tibetan text representation methods cannot meet this need. This paper proposes a Tibetan microblog text representation method based on shallow parsing and evaluates it in a Tibetan microblog sentiment analysis experiment. First, the syntactic structure of the Tibetan microblog text is generated using a syntactic tree. Second, a semantic feature space is built from the semantic features of the syntactic structures. Then, semantic cluster centroids are formed in the feature space with the K-means method. Last, the cluster-based TF-IDF value is calculated. In the experiment, the proposed method is compared with the SVM+TF-IDF and Naive Bayes + Maximum Entropy methods, and its F-measure is as high as 91.4%.
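The last two steps of the abstract (K-means clustering in the feature space, then TF-IDF computed over clusters) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature definitions, the number of clusters, or the exact TF-IDF formula, so the two-dimensional feature vectors, the placeholder unit names `u1` to `u4`, the choice `k=2`, the helper names `kmeans` and `cluster_tfidf`, and the `tf * log(N / df)` weighting are all assumptions made for illustration.

```python
import math
from collections import Counter

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_tfidf(docs, unit_to_cluster, k):
    """TF-IDF in which the 'terms' are cluster ids rather than surface units."""
    cluster_docs = [[unit_to_cluster[u] for u in doc] for doc in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each cluster id.
    df = Counter(c for doc in cluster_docs for c in set(doc))
    vectors = []
    for doc in cluster_docs:
        tf = Counter(doc)
        vectors.append([
            (tf[c] / len(doc)) * math.log(n_docs / df[c]) if df[c] else 0.0
            for c in range(k)
        ])
    return vectors

# Toy, made-up feature vectors for four placeholder units (not real Tibetan data).
features = {"u1": [0.0, 0.0], "u2": [0.1, 0.0], "u3": [5.0, 5.0], "u4": [5.1, 5.0]}
units = list(features)
labels = kmeans([features[u] for u in units], k=2)
unit_to_cluster = dict(zip(units, labels))

# Two hypothetical microblog posts, each a sequence of parsed units.
docs = [["u1", "u2"], ["u3", "u4", "u1"]]
vectors = cluster_tfidf(docs, unit_to_cluster, k=2)
print(labels)
print(vectors)
```

Each post ends up as a vector with one weight per semantic cluster, so units that fall in the same cluster share a dimension. Such vectors could then be passed to any classifier for sentiment categorization.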
Pages: 35-38
Number of pages: 4