
Journal of Shandong University (Engineering Science), 2022, Vol. 52, Issue 4: 131-138. doi: 10.6040/j.issn.1672-3961.0.2021.311


Context-aware discriminative topic model

SUN Zhiwei1, SONG Mingyang1, PAN Zehua2, JING Liping1*

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  2. Beijing Newlink Technology Co., Ltd., Beijing 100044, China
  • Published: 2022-08-24
  • About the authors: SUN Zhiwei (born 1998), female, from Fuyang, Anhui; master's student; main research interests: natural language processing and topic models. E-mail: 19120401@bjtu.edu.cn. *Corresponding author: JING Liping (born 1978), female, from Nanyang, Henan; professor, Ph.D., doctoral supervisor; main research interests: machine learning and its applications. E-mail: lpjing@bjtu.edu.cn
  • Funding:
    National Natural Science Foundation of China (61822601, 61773050, 61632004); Beijing Natural Science Foundation (Z180006); National Science and Technology Research and Development Program (2020AAA0106800, 2017YFC1703506); Fundamental Research Funds for the Central Universities (2019JBZ110)

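Abstract: To address the loss of a word's surrounding context during topic identification, a convolutional neural network is used to embed specific contextual information into the word vectors, which are then fed into a discriminative topic model. The method can incorporate additional label information for supervised training and can handle downstream tasks such as document classification. Comparison with existing discriminative topic models shows that the method obtains more coherent topics and achieves better predictive performance on text classification, which supports its effectiveness and accuracy.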

Keywords: topic model, word embedding representation, discriminative model, contextual semantics, text classification

CLC number: TP391.1
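The pipeline outlined in the abstract (convolve over word vectors to inject context, then pass the result to a discriminative topic model) can be illustrated with a minimal sketch. This is not the authors' architecture: the window width, the tanh activation, the random stand-in vectors, and the choice to average per-word topic distributions into a document-level distribution are all assumptions made for illustration, and the names `conv1d_context` and `document_topics` are hypothetical.

```python
import math
import random


def conv1d_context(embeddings, filters, bias):
    """Mix each word vector with its neighbours via a 1D convolution.

    embeddings: T word vectors of dimension D.
    filters: W matrices (odd window width W), each of shape D_out x D.
    Zero padding keeps one output vector per input word.
    """
    width = len(filters)
    half = width // 2
    pad = [0.0] * len(embeddings[0])
    padded = [pad] * half + list(embeddings) + [pad] * half
    out = []
    for t in range(len(embeddings)):
        vec = list(bias)
        for k in range(width):
            x = padded[t + k]  # neighbour at offset k - half from word t
            for o, row in enumerate(filters[k]):
                vec[o] += sum(w * xi for w, xi in zip(row, x))
        out.append([math.tanh(v) for v in vec])  # assumed activation
    return out


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def document_topics(context_vectors, topic_weights):
    """Score each contextual word vector against K topics, then average.

    Averaging per-word topic distributions is an illustrative assumption.
    """
    per_word = [
        softmax([sum(w * x for w, x in zip(row, vec)) for row in topic_weights])
        for vec in context_vectors
    ]
    n = len(per_word)
    return [sum(p[j] for p in per_word) / n for j in range(len(topic_weights))]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, D, D_OUT, K, WIDTH = 6, 4, 5, 3, 3
words = random_matrix(rng, T, D)  # stand-in for pretrained word vectors
filters = [random_matrix(rng, D_OUT, D) for _ in range(WIDTH)]
bias = [0.0] * D_OUT
topic_weights = random_matrix(rng, K, D_OUT)

context = conv1d_context(words, filters, bias)
theta = document_topics(context, topic_weights)
```

Here `context` holds one context-aware vector per word and `theta` is a distribution over the `K` topics for the document. In a supervised setting, the filters and topic weights would be learned from labelled documents rather than drawn at random.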