Journal of Shandong University (Engineering Science), 2022, Vol. 52, Issue (4): 131-138. doi: 10.6040/j.issn.1672-3961.0.2021.311
SUN Zhiwei1, SONG Mingyang1, PAN Zehua2, JING Liping1*
Abstract: To address the loss of word-level context during topic identification, a convolutional neural network is used to embed specific contextual information into the word vectors, which are then fed into a discriminative topic model. The method can incorporate additional label information for supervised training and thus handle downstream tasks such as document classification. Comparison and analysis against existing discriminative topic models show that the method obtains more coherent topics and achieves better predictive performance on text classification tasks, which verifies its effectiveness and accuracy.
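The abstract describes a pipeline in which a CNN contextualizes word vectors before they enter a discriminative topic model trained with document labels. The following is a minimal PyTorch sketch of that idea only; the class name, dimensions, and the simple softmax "topic" layer and mean pooling are illustrative assumptions, not the authors' actual architecture.

```python
# Illustrative sketch (not the paper's model): a 1-D CNN mixes each word
# embedding with its neighbours to inject context, the contextualized word
# vectors are mapped to per-word topic distributions, pooled into a
# document-topic mixture, and a label classifier enables supervised training
# for document classification.
import torch
import torch.nn as nn


class ContextCNNTopicClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, num_topics=50,
                 num_classes=20, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # e.g. GloVe-initialized
        # Convolution over the word sequence: each output vector combines a
        # word with its local context window.
        self.context_cnn = nn.Conv1d(embed_dim, embed_dim, kernel_size,
                                     padding=kernel_size // 2)
        self.to_topics = nn.Linear(embed_dim, num_topics)   # word -> topic logits
        self.classifier = nn.Linear(num_topics, num_classes)  # supervised head

    def forward(self, token_ids):                           # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)           # (batch, dim, seq)
        x = torch.relu(self.context_cnn(x)).transpose(1, 2) # contextualized words
        word_topics = torch.softmax(self.to_topics(x), dim=-1)
        doc_topics = word_topics.mean(dim=1)                # document-topic mixture
        return self.classifier(doc_topics), doc_topics


# Usage: supervised training with cross-entropy on the document labels.
model = ContextCNNTopicClassifier(vocab_size=30000)
logits, doc_topics = model(torch.randint(0, 30000, (8, 120)))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 20, (8,)))
```

Under this reading, the label supervision shapes the topic space through the classification loss, while `doc_topics` remains an interpretable mixture that can be inspected for topic coherence.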