A word-extended LDA model for short text sentiment

SHEN Ji, MA Zhiqiang*, LI Tuya, ZHANG Li   

  1. College of Information Engineering, Inner Mongolia University of Technology, Hohhot 010080, Inner Mongolia, China
  • Received: 2017-05-09 Online: 2018-06-20 Published: 2017-05-09

Abstract: To address the low accuracy of sentiment polarity analysis for short texts, this research presents a sentiment analysis model for short texts based on latent Dirichlet allocation (LDA). The model identifies emotional words in a short text by their part of speech and expands them, under constraints, into an extended set, which raises the co-occurrence frequency between emotional words. The extended set is then added to the emotional words found in the short text, which lengthens the text, extracts emotional information, and turns topic clustering into emotion topic clustering. Experiments were conducted on 4 000 positive and negative short texts. The results show that the model improves sentiment classification by 11.8% over the joint sentiment topic model and by 9.5% over the latent sentiment model, while also finding more emotional words. This indicates that the model extracts richer emotion features from short texts and achieves higher classification accuracy in sentiment analysis.

Key words: short text, word extend function, latent Dirichlet allocation, unsupervised learning, sentiment analysis, document-topic generative model
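
The word-extension step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the part-of-speech tags treated as emotional (`EMOTION_TAGS`), the expansion lexicon (`EXPANSION`), the cap on added words (`max_new`), and the function name `extend_short_text` are all hypothetical, chosen only to show how appending a restrained set of related words lengthens a short text before it is passed to an LDA-style topic model. The input is assumed to be already tokenized and tagged.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Assumption: adjectives, verbs and adverbs count as candidate emotional words.
# The abstract does not state which parts of speech the model uses.
EMOTION_TAGS: Set[str] = {"ADJ", "VERB", "ADV"}

# Hypothetical expansion lexicon mapping an emotional word to related words.
EXPANSION: Dict[str, List[str]] = {
    "great": ["good", "excellent", "wonderful"],
    "awful": ["bad", "terrible", "horrible"],
}


def extend_short_text(
    tagged: Sequence[Tuple[str, str]],
    expansion: Dict[str, List[str]] = EXPANSION,
    emotion_tags: Set[str] = EMOTION_TAGS,
    max_new: int = 2,
) -> List[str]:
    """Return the original words followed by a capped set of expansion words."""
    extended = [word for word, _ in tagged]
    for word, tag in tagged:
        if tag not in emotion_tags:
            continue  # only words with an emotional part of speech are expanded
        # Restrained expansion (assumed form): add at most max_new words per hit.
        extended.extend(expansion.get(word, [])[:max_new])
    return extended


# A short text given as (word, part-of-speech) pairs.
doc = [("the", "DET"), ("screen", "NOUN"), ("is", "AUX"), ("great", "ADJ")]
extended_doc = extend_short_text(doc)
print(extended_doc)
```

The extended token lists would then serve as the documents for topic modelling, so that emotional words co-occur more often than in the original short texts.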

