
山东大学学报 (工学版) ›› 2024, Vol. 54 ›› Issue (4): 35-41. doi: 10.6040/j.issn.1672-3961.0.2023.160

• Machine Learning and Data Mining •

Unsupervised graph embedding algorithm based on Gaussian distribution and Householder flow

LIU Guojun, FAN Tianxiang, WANG Naizheng, ZHANG Zhengda, QI Guangzhi

  1. Faculty of Computing, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China
  • Published: 2024-08-20
  • About the author: LIU Guojun (1979— ), male, born in Harbin, Heilongjiang, Ph.D., associate professor and doctoral supervisor; his main research interests include machine learning, computer vision, image processing, and pattern recognition. E-mail: hitliu@hit.edu.cn
  • Supported by: the National Natural Science Foundation of China (61976071); the Heilongjiang Provincial Joint Fund (LH2020F012)

Abstract: To represent nodes better, a new graph embedding method was proposed, in which each node was represented as a Gaussian distribution defined by a mean and a variance. By applying a series of invertible Householder transformations, a relatively simple distribution was transformed into a more flexible one, which better captured the uncertainty of the representation. To improve stability, the Wasserstein distance was adopted as the metric between distributions. Experimental results showed that, on several benchmark datasets, the Graph2Gauss (G2G) algorithm with Householder transformations outperformed the original model on link prediction. The node classification results showed that, for graphs with missing node attributes, using the Wasserstein distance substantially increased the F1 score of node classification.
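To make the two ideas in the abstract concrete, the following NumPy sketch (a minimal illustration, not the authors' implementation; the function names, the toy dimensionality, and the randomly drawn reflection vectors are assumptions made here) shows how a chain of invertible Householder reflections turns a diagonal Gaussian node embedding into a rotated, full-covariance Gaussian, and evaluates the closed-form squared 2-Wasserstein distance between two diagonal Gaussians that can serve as the metric between node representations.

import numpy as np

def householder_matrix(v):
    # Householder reflection H = I - 2 v v^T / ||v||^2: orthogonal and its own inverse.
    v = v / np.linalg.norm(v)
    return np.eye(v.size) - 2.0 * np.outer(v, v)

def householder_flow(mu, sigma, vs):
    # Push a diagonal Gaussian N(mu, diag(sigma^2)) through a sequence of reflections.
    # Each step keeps the distribution Gaussian (mean -> H mean, cov -> H cov H^T),
    # so the final covariance is a rotated, generally non-diagonal matrix.
    mean, cov = mu.copy(), np.diag(sigma ** 2)
    for v in vs:  # vs: list of (learned) reflection vectors
        H = householder_matrix(v)
        mean = H @ mean
        cov = H @ cov @ H.T
    return mean, cov

def wasserstein2_diag(mu1, sigma1, mu2, sigma2):
    # Squared 2-Wasserstein distance between diagonal Gaussians:
    # W_2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2.
    return np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2)

# Toy usage with two 4-dimensional node embeddings.
rng = np.random.default_rng(0)
mu, sigma = rng.normal(size=4), np.abs(rng.normal(size=4))
mean, cov = householder_flow(mu, sigma, [rng.normal(size=4) for _ in range(3)])
d = wasserstein2_diag(mu, sigma, rng.normal(size=4), np.abs(rng.normal(size=4)))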

Key words: unsupervised learning, graph embedding, Gaussian distribution, Householder flow, Wasserstein distance

CLC number: TP181

References
[1] HE X, DENG K, WANG X, et al. LightGCN: simplifying and powering graph convolution network for recommendation[C] //Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. New York, USA: ACM, 2020: 639-648.
[2] WAIKHOM L, PATGIRI R. Graph neural networks: methods, applications, and opportunities[EB/OL].(2021-08-24)[2023-07-01]. https://arxiv.org/abs/2108.10733.
[3] NIEPERT M, AHMED M, KUTZKOV K. Learning convolutional neural networks for graphs[C] //Proceedings of the 33rd International Conference on Machine Learning. New York, USA: PMLR, 2016: 2014-2023.
[4] YANG L, CHEUNG N M, LI J, et al. Deep clustering by Gaussian mixture variational autoencoders with graph embedding[C] //Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea: IEEE, 2019: 6440-6449.
[5] HAMILTON W, YING Z, LESKOVEC J. Inductive representation learning on large graphs[C] //Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017: 1024-1034.
[6] QU M, BENGIO Y, TANG J. GMNN: graph Markov neural networks[C] //Proceedings of the 36th International Conference on Machine Learning. Long Beach, USA: PMLR, 2019: 5241-5250.
[7] WANG X, BO D, SHI C, et al. A survey on heterogeneous graph embedding: methods, techniques, applications and sources[J]. IEEE Transactions on Big Data, 2023, 9(2): 415-436.
[8] GOYAL P, FERRARA E. Graph embedding techniques, applications, and performance: a survey[J]. Knowledge-Based Systems, 2018, 151: 78-94.
[9] VELICKOVIC P, CUCURULL G, CASANOVA A, et al. Graph attention networks[EB/OL].(2018-02-04)[2023-07-01]. https://arxiv.org/abs/1710.10903.
[10] YING C, CAI T, LUO S, et al. Do transformers really perform badly for graph representation?[EB/OL].(2021-11-24)[2023-07-01]. https://arxiv.org/abs/2106.05234v1.
[11] WANG Z, YU D, LI Q, et al. SR-HGN: semantic-and relation-aware heterogeneous graph neural network[J]. Expert Systems with Applications, 2023, 224: 119982.
[12] BOJCHEVSKI A, GÜNNEMANN S. Deep Gaussian embedding of graphs: unsupervised inductive learning via ranking[EB/OL].(2018-02-27)[2023-07-01]. https://arxiv.org/abs/1707.03815v2.
[13] REZENDE D, MOHAMED S. Variational inference with normalizing flows[C] //Proceedings of the 32nd International Conference on Machine Learning. Lille, France: PMLR, 2015: 1530-1538.
[14] KOBYZEV I, PRINCE S J D, BRUBAKER M A. Normalizing flows: an introduction and review of current methods[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(11): 3964-3979.
[15] HOUSEHOLDER A S. Unitary triangularization of a nonsymmetric matrix[J]. Journal of the ACM, 1958, 5(4): 339-342.
[16] LIU G, LIU Y, GUO M, et al. Variational inference with Gaussian mixture model and householder flow[J]. Neural Networks, 2019, 109: 43-55.
[17] WANG Mei. Decomposition of some special symmetric matrices[D]. Tianjin: Tianjin University of Technology, 2019. (in Chinese)
[18] SUN X, BISCHOF C. A basis-kernel representation of orthogonal matrices[J]. SIAM Journal on Matrix Analysis and Applications, 1995, 16(4): 1184-1196.
[19] ZHU D, CUI P, WANG D, et al. Deep variational network embedding in Wasserstein space[C] //Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. London, UK: ACM, 2018: 2827-2836.
[20] LIU Zehua. Calculus on Wasserstein space and its application in distributed robust optimization[D]. Nanjing: Nanjing University, 2020. (in Chinese)
[21] ÇELIK T Ö, JAMNESHAN A, MONTÚFAR G, et al. Wasserstein distance to independence models[J]. Journal of Symbolic Computation, 2021, 104: 855-873.
[22] MCCALLUM A K, NIGAM K, RENNIE J, et al. Automating the construction of internet portals with machine learning[J]. Information Retrieval, 2000, 3: 127-163.
[23] GILES C L, BOLLACKER K D, LAWRENCE S. CiteSeer: an automatic citation indexing system[C] //Proceedings of the Third ACM Conference on Digital Libraries. Pittsburgh, USA: ACM, 1998: 89-98.
[24] PAN S, WU J, ZHU X, et al. Tri-party deep network representation[C] //Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence. New York, USA: IJCAI, 2016: 1895-1901.
[25] SEN P, NAMATA G, BILGIC M, et al. Collective classification in network data[J]. AI Magazine, 2008, 29(3): 93-106.
[26] GROVER A, LESKOVEC J. Node2vec: scalable feature learning for networks[C] //Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. San Francisco, USA: ACM, 2016: 855-864.
[27] YANG C, LIU Z, ZHAO D, et al. Network representation learning with rich text information[C] //Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence. Buenos Aires, Argentina: AAAI, 2015: 2111-2117.
[28] FAWCETT T. An introduction to ROC analysis[J]. Pattern Recognition Letters, 2006, 27(8): 861-874.
[29] KIPF T N, WELLING M. Variational graph auto-encoders[EB/OL].(2016-11-26)[2023-07-01]. https://arxiv.org/abs/1611.07308.
[30] PEROZZI B, AL-RFOU R, SKIENA S. DeepWalk: online learning of social representations[C] //Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York, USA: ACM, 2014: 701-710.