Tag optimization based on semantic similarity

QIAN Suchi, PENG Furong, LU Jianfeng   

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China
  • Received: 2014-05-23 Revised: 2014-11-20 Online: 2015-04-20 Published: 2014-05-23

Abstract: To address problems such as missing and misused tags in social media, a tag optimization method based on content similarity and semantic similarity was proposed. First, TF-IDF (term frequency-inverse document frequency) was used to calculate text similarity. Next, an objective function was defined by the consistency between text similarity and tag similarity. Finally, a correction term was added to the optimization process to reduce the deviation of the tags provided by users. The method was applied to Douban Movie to optimize movie tags, and the optimized tags were compared with the original tags. The comparison showed that the optimized tags were more accurate than the original ones. Experimental results showed that the method could effectively optimize tags and mitigate the problems of missing and misused tags.
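
The steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption: the TF-IDF weighting variant, the use of cosine similarity for both text and tags, the squared-error form of the consistency term, the reading of the correction term as a penalty on drift from the user-provided tags, and the weight `lam`. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def objective(text_sim, tags, original_tags, lam=0.5):
    """Assumed objective: consistency term plus lam times a correction term.

    consistency: squared gap between text similarity and tag similarity
                 over all ordered pairs of distinct items.
    correction:  squared distance between current and user-provided tag weights.
    """
    n = len(tags)
    consistency = sum(
        (text_sim[i][j] - cosine(tags[i], tags[j])) ** 2
        for i in range(n) for j in range(n) if i != j
    )
    correction = sum(
        (tags[i].get(k, 0.0) - original_tags[i].get(k, 0.0)) ** 2
        for i in range(n)
        for k in set(tags[i]) | set(original_tags[i])
    )
    return consistency + lam * correction


# Hypothetical toy data: tokenized movie descriptions and user tag weights.
docs = [
    ["space", "alien", "war"],
    ["space", "alien", "robot"],
    ["love", "paris", "wedding"],
]
user_tags = [{"sci-fi": 1.0}, {"romance": 1.0}, {"romance": 1.0}]

vectors = tfidf_vectors(docs)
text_sim = [[cosine(a, b) for b in vectors] for a in vectors]

# Candidate correction: retag the second movie as sci-fi.
candidate_tags = [{"sci-fi": 1.0}, {"sci-fi": 1.0}, {"romance": 1.0}]
score_before = objective(text_sim, user_tags, user_tags)
score_after = objective(text_sim, candidate_tags, user_tags)
```

In this sketch a lower objective value means the tag similarities agree better with the text similarities, while `lam` controls how strongly the optimized tags are held to what users originally supplied. An actual optimizer would search over tag weights to minimize this value; that search is omitted here.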

Key words: semantic similarity, social media, content similarity, tag optimization, movie tags

