
Journal of Shandong University (Engineering Science) ›› 2010, Vol. 40 ›› Issue (2): 1-10.

• Machine Learning and Data Mining •

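Knowledge preserving embedding method

ZHANG Dao-qiang

  1. College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China
  • Received: 2010-02-10 Online: 2010-04-16 Published: 2010-02-10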

Knowledge preserving embedding

ZHANG Dao-qiang   

  1. Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
  • Received: 2010-02-10 Online: 2010-04-16 Published: 2010-02-10
  • About author: ZHANG Dao-qiang (1978-), male, born in Shandong, China, Ph.D., Professor. His research interests include machine learning, pattern recognition, data mining, and image processing. E-mail: dqzhang@nuaa.edu.cn
  • Supported by:

    This work was supported by the National Science Foundation of China (60875030)

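Abstract (translated from the Chinese):

This paper considers dimensionality reduction with domain knowledge about the data. Here, domain knowledge refers to additional supervision information about the data, such as class labels, or similarity and dissimilarity constraints between samples, which are weaker than labels. Constraints can be generated from labels, but labels cannot be recovered from constraints, so constraints are more general than labels. In addition, in practical applications such as image retrieval, constraints are easier to obtain than labels. For these reasons, this paper focuses on constraint-based dimensionality reduction. A constraint preserving embedding (COPE) algorithm that effectively uses constraints for dimensionality reduction is proposed; it is placed within the unified graph embedding framework, and its relationship to similar methods is pointed out. Furthermore, a semi-supervised COPE algorithm is proposed by introducing unlabeled samples, and a kernel COPE is proposed to reveal nonlinear structure in the data. Finally, results from a series of experiments on face recognition, image retrieval, and semi-supervised clustering verify the effectiveness of the algorithms.

Key words: semi-supervised dimensionality reduction, pairwise constraint, domain knowledge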

Abstract:

The problem of dimensionality reduction given some domain knowledge about the data is considered. Here, domain knowledge denotes supervision information in addition to the data itself, e.g., class labels or, more weakly, pairwise similarity or dissimilarity constraints. The focus is on the latter because it is more general than the former: given class labels, the corresponding pairwise similarity or dissimilarity constraints can be generated, but not vice versa. Moreover, in real-world applications such as image retrieval, obtaining pairwise constraints is much easier than obtaining labels. A simple algorithm called constraint preserving embedding (COPE) is presented, which can effectively use pairwise constraints for better embedding. The algorithm is formulated under a unified spectral graph embedding framework, and its relationship to existing related methods is indicated. Moreover, COPE is extended to the semi-supervised and kernel cases in order to include unlabeled data and capture nonlinear relationships in the data. The performance of the proposed algorithms is evaluated through a series of experiments, including face image recognition and retrieval and semi-supervised clustering. Experimental results show that the algorithms are effective and promising in learning from pairwise constraints.
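
The abstract's remark that constraints can be generated from labels, but not the other way round, can be illustrated with a minimal Python sketch. This is not the COPE algorithm itself, which the abstract does not specify in detail; the function name `constraints_from_labels` and the toy label lists are assumptions made purely for illustration.

```python
from itertools import combinations


def constraints_from_labels(labels):
    """Derive pairwise constraints from class labels (illustrative helper).

    Returns two lists of index pairs (i, j) with i < j:
    similarity ("must-link") pairs share a label,
    dissimilarity ("cannot-link") pairs have different labels.
    """
    must_link, cannot_link = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            must_link.append((i, j))
        else:
            cannot_link.append((i, j))
    return must_link, cannot_link


# Two hypothetical labelings that differ only in the label names.
labels_a = ["a", "a", "b", "b", "c"]
labels_b = ["x", "x", "y", "y", "z"]

must_a, cannot_a = constraints_from_labels(labels_a)
must_b, cannot_b = constraints_from_labels(labels_b)

print(must_a)         # [(0, 1), (2, 3)]
print(len(cannot_a))  # 8

# Both labelings yield identical constraints, so the constraints alone
# cannot tell us which label names were used.
print(must_a == must_b and cannot_a == cannot_b)  # True
```

Because differently named labelings collapse to the same constraint sets, the mapping from labels to constraints is not invertible. In practice usually only a small subset of such pairs would be available, which carries even less information than the full set built here.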

Key words: pairwise constraint, domain knowledge, semi-supervised dimensionality reduction
