Development of a patent document classification and search platform using a back-propagation network

Cited by: 101
Authors
Trappey, Amy J. C. [1]
Hsu, Fu-Chiang
Trappey, Charles V.
Lin, Chia-I
Affiliations
[1] Natl Tsing Hua Univ, Dept Ind Engn & Engn Management, Hsinchu 300, Taiwan
[2] Queensland Univ Technol, Fac Business, St Lucia, Qld, Australia
[3] Ind Technol Res Inst, Ctr Aerosp & Syst Technol, Hsinchu 300, Taiwan
Keywords
knowledge document management; document classification; patent search; neural networks
DOI
10.1016/j.eswa.2006.01.013
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In order to process large numbers of explicit knowledge documents such as patents in an organized manner, automatic document categorization and search are required. In this paper, we develop a document classification and search methodology based on neural network technology that helps companies manage patent documents more effectively. The classification process begins by extracting key phrases from the document set by means of automatic text processing and determining the significance of key phrases according to their frequency in text. In order to maintain a manageable number of independent key phrases, correlation analysis is applied to compute the similarities between key phrases. Phrases with higher correlations are synthesized into a smaller set of phrases. Finally, the back-propagation network model is adopted as a classifier. The target output identifies a patent document's category based on a hierarchical classification scheme, in this case, the international patent classification (IPC) standard. The methodology is tested using patents related to the design of power hand-tools. Related patents are automatically classified using pre-trained neural network models. In the prototype system, two modules are used for patent document management. The automatic classification module helps the user classify patent documents and the search module helps users find relevant and related patent documents. The result shows an improvement in document classification and identification over previously published methods of patent document management. (c) 2006 Elsevier Ltd. All rights reserved.
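The pipeline outlined in the abstract (phrase frequencies, correlation-based merging of key phrases, then a back-propagation classifier) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function names, the 0.9 correlation threshold, the greedy merge rule, the hidden-layer size, the learning rate, and the toy documents and two-category labels. It does not reproduce the authors' key-phrase extraction or the IPC hierarchy.

```python
import math
import random

def phrase_frequencies(docs, phrases):
    """Count how often each candidate key phrase occurs in each document."""
    return [[doc.lower().count(p) for p in phrases] for doc in docs]

def pearson(x, y):
    """Pearson correlation of two sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def merge_correlated(freqs, phrases, threshold=0.9):
    """Greedy merge (an assumed rule): a phrase joins the first group whose
    leading phrase correlates with it at or above the threshold."""
    columns = list(zip(*freqs))
    groups = []  # each group is a list of phrase indices
    for j in range(len(phrases)):
        for g in groups:
            if pearson(columns[g[0]], columns[j]) >= threshold:
                g.append(j)
                break
        else:
            groups.append([j])
    names = ["+".join(phrases[j] for j in g) for g in groups]
    merged = [[sum(row[j] for j in g) for g in groups] for row in freqs]
    return names, merged

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(W1, W2, x):
    """One hidden layer of sigmoid units; the last weight in each row is a bias."""
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in W1]
    o = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in W2]
    return h, o

def train_bpn(X, Y, hidden=3, epochs=2000, lr=0.5, seed=0):
    """Online back-propagation on squared error (hyperparameters are assumed)."""
    rng = random.Random(seed)
    n_in, n_out = len(X[0]), len(Y[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)] for _ in range(n_out)]
    for _ in range(epochs):
        for x, y in zip(X, Y):
            h, o = forward(W1, W2, x)
            # Error terms for the output layer, then propagated to the hidden layer.
            d_o = [(o[k] - y[k]) * o[k] * (1 - o[k]) for k in range(n_out)]
            d_h = [h[i] * (1 - h[i]) * sum(d_o[k] * W2[k][i] for k in range(n_out))
                   for i in range(hidden)]
            for k in range(n_out):
                for i, v in enumerate(h + [1.0]):
                    W2[k][i] -= lr * d_o[k] * v
            for i in range(hidden):
                for j, v in enumerate(x + [1.0]):
                    W1[i][j] -= lr * d_h[i] * v
    return W1, W2

# Toy data, invented for this sketch (not taken from the paper).
docs = [
    "cordless drill with battery pack and drill chuck",
    "battery pack charger for a cordless drill",
    "saw blade guard for a circular saw blade",
    "circular saw with blade guard",
]
phrases = ["cordless drill", "battery pack", "saw blade", "blade guard"]
labels = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # two made-up categories

freqs = phrase_frequencies(docs, phrases)
names, merged = merge_correlated(freqs, phrases)
W1, W2 = train_bpn(merged, labels)
scores = forward(W1, W2, merged[0])[1]  # one score per category for the first document
```

In this toy set, "cordless drill" and "battery pack" always occur together, so they collapse into a single input feature, which is the role the abstract assigns to correlation analysis: keeping the number of network inputs manageable. A real system would need a proper key-phrase extractor and one output unit per IPC category of interest.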
Pages: 755-765
Page count: 11