An Automated System to Predict Popular Cybersecurity News Using Document Embeddings

Cited by: 5
Authors
Saeed, Ramsha [1]
Rubab, Saddaf [1]
Asif, Sara [1]
Khan, Malik M. [1]
Murtaza, Saeed [1]
Kadry, Seifedine [2]
Nam, Yunyoung [3]
Khan, Muhammad Attique [4]
Affiliations
[1] Natl Univ Sci & Technol, Islamabad, Pakistan
[2] Beirut Arab Univ, Beirut, Lebanon
[3] Soonchunhyang Univ, Dept Comp Sci & Engn, Asan, South Korea
[4] Hitec Univ, Taxila, Pakistan
Source
CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES | 2021 / Vol. 127 / No. 2
Keywords
Embeddings; semantics; cosine similarity; popularity; word2vec
DOI
10.32604/cmes.2021.014355
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Intense competition in the news industry puts editors under pressure to post the articles most likely to attract user attention. Anticipating the popularity of a news article can therefore help editorial teams decide whether to post it. Existing popularity prediction approaches have found article similarity, computed over articles posted within a short period of time, to be a useful feature. This work proposes a new approach that adds semantics to similarity-based popularity estimation. The proposed semantically enriched model estimates news popularity by measuring the cosine similarity between document embeddings of the news articles. A word2vec model is used to generate distributed representations of the news content. In this work, popularity is defined as the number of times a news article is posted on different websites. Data are collected from websites that post cybersecurity news, and the popularity of that news is estimated. The proposed approach is compared with other models and is shown to outperform them.
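The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy word vectors stand in for a trained word2vec model, averaging word vectors into a document embedding is an assumed pooling choice, and the function names and the 0.9 threshold are hypothetical.

```python
import math

# Hypothetical toy vectors standing in for a trained word2vec model.
TOY_VECTORS = {
    "ransomware": [0.9, 0.1, 0.0],
    "malware": [0.8, 0.2, 0.1],
    "attack": [0.7, 0.3, 0.0],
    "hospital": [0.1, 0.9, 0.1],
    "patch": [0.1, 0.2, 0.9],
    "update": [0.0, 0.3, 0.8],
    "browser": [0.2, 0.1, 0.9],
}


def document_embedding(text, vectors=TOY_VECTORS):
    """Average the vectors of known words (assumed pooling strategy)."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None  # no known words, so no embedding
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def estimate_popularity(article, other_articles, threshold=0.9):
    """Count other articles whose embedding is close to the target's.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    target = document_embedding(article)
    if target is None:
        return 0
    count = 0
    for other in other_articles:
        emb = document_embedding(other)
        if emb is not None and cosine_similarity(target, emb) >= threshold:
            count += 1
    return count


if __name__ == "__main__":
    others = ["malware attack hospital", "browser patch update"]
    print(estimate_popularity("ransomware attack hospital", others))
```

Counting near-duplicate articles from other sites mirrors the abstract's definition of popularity as the number of times an article is posted on different websites; a real pipeline would replace the toy table with vectors from a trained model.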
Pages: 533-547
Page count: 15