Journal of Shandong University (Engineering Science), 2022, Vol. 52, Issue (4): 131-138. doi: 10.6040/j.issn.1672-3961.0.2021.311
SUN Zhiwei1, SONG Mingyang1, PAN Zehua2, JING Liping1*
Abstract: To address the loss of word-level context during topic identification, a convolutional neural network is used to embed specific contextual information into the word vectors, which are then fed into a discriminative topic model. The method can incorporate additional label information for supervised training and thus handle downstream tasks such as document classification. Comparison and analysis against existing discriminative topic models show that the method obtains more coherent topics and achieves better predictive performance on text classification tasks, which verifies its effectiveness and accuracy.
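The abstract describes a pipeline in which a CNN contextualizes word vectors before they enter a discriminative topic model trained with document labels. The following is a minimal PyTorch sketch of that idea only; the class name, dimensions, and the simple softmax "topic" layer and mean pooling are illustrative assumptions, not the authors' actual architecture.

```python
# Illustrative sketch (not the paper's model): a 1-D CNN mixes each word
# embedding with its neighbours to inject context, the contextualized word
# vectors are mapped to per-word topic distributions, pooled into a
# document-topic mixture, and a label classifier enables supervised training
# for document classification.
import torch
import torch.nn as nn


class ContextCNNTopicClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, num_topics=50,
                 num_classes=20, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # e.g. GloVe-initialized
        # Convolution over the word sequence: each output vector combines a
        # word with its local context window.
        self.context_cnn = nn.Conv1d(embed_dim, embed_dim, kernel_size,
                                     padding=kernel_size // 2)
        self.to_topics = nn.Linear(embed_dim, num_topics)   # word -> topic logits
        self.classifier = nn.Linear(num_topics, num_classes)  # supervised head

    def forward(self, token_ids):                           # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)           # (batch, dim, seq)
        x = torch.relu(self.context_cnn(x)).transpose(1, 2) # contextualized words
        word_topics = torch.softmax(self.to_topics(x), dim=-1)
        doc_topics = word_topics.mean(dim=1)                # document-topic mixture
        return self.classifier(doc_topics), doc_topics


# Usage: supervised training with cross-entropy on the document labels.
model = ContextCNNTopicClassifier(vocab_size=30000)
logits, doc_topics = model(torch.randint(0, 30000, (8, 120)))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 20, (8,)))
```

Under this reading, the label supervision shapes the topic space through the classification loss, while `doc_topics` remains an interpretable mixture that can be inspected for topic coherence.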