﻿ 基于全局距离和类别信息的邻域保持嵌入算法
 «上一篇 下一篇»
 山东大学学报(工学版)  2016, Vol. 46 Issue (1): 10-14  DOI: 10.6040/j.issn.1672-3961.0.2015.296 0

### 引用本文

MEI Qinglin, ZHANG Huaxiang. A neighborhood preserving embedding algorithm based on global distance and label information[J]. Journal of Shandong University(Engineering Science), 2016, 46(1): 10-14. DOI: 10.6040/j.issn.1672-3961.0.2015.296.

### 文章历史

1. 山东师范大学信息科学与工程学院, 山东 济南 250014;
2. 山东省分布式计算机软件新技术重点实验室, 山东 济南 250014

A neighborhood preserving embedding algorithm based on global distance and label information
MEI Qinglin1, 2, ZHANG Huaxiang1, 2
1. School of Information Science and Engineering, Shandong Normal University, Jinan 250014, Shandong, China;
2. Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan 250014, Shandong, China
Abstract: An algorithm of neighborhood preserving embedding based on global distance and label information was proposed. A global factor that characterized the global distance and a function term that characterized the label information were added in the traditional Euclidean distance formula of adjacent graph. Global factor could make unevenly dirtibuted samples smooth and uniform, label information could make intra-class compact and inter-class separable, which improved quality of neighborhood and constructed an optimal adjacency graph, and improved classification accuracy. Experimental results showed that the proposed algorithm had higher accuracy and performed more effective than traditional neighborhood preserving embedding algorithm.
Key words: dimension reduction    neighborhood preserving embedding algorithm    global distance    label information    neighborhood optimization
0 引言

1 NPE算法

(1)构造邻接图G

Gn个样本点构成的图,如果两个样本点xixj是近邻,则用边将其连接,否则不连接。判断两点是否为近邻有两种方法,一种是KNN方法,另一种是ε-邻域方法,两种方法均是基于传统欧氏距离。

(2)计算权值矩阵W

 $\min \sum\limits_i {||{x_i}} - \sum\limits_j {{W_{ij}}} {x_j}|{|^2},\sum\limits_j {{W_{ij}}}=1\;.$ (1)

(3)计算投影矩阵A

 $\min \sum\limits_i {||{y_i}} - \sum\limits_j {{W_{ij}}} {y_j}|{|^2}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{=}}\;{\rm{min}}\;{A^T}XM{X^T}A\;,$ (2)

 $XM{X^T}a=\lambda X{X^{\rm{T}}}a,$ (3)

2 基于全局距离和类别信息的邻域保持嵌入算法

NPE是一种基于局部的线性降维算法,通过选取样本点的邻近点构造权值矩阵进行投影,这种基于局部的方法往往忽略了样本点与非邻近点之间的关系。对于分布不均匀的样本,距离较远的两个样本点,其相对距离不一定远。比如两个样本点xixj,xi与其他样本点距离都很近,xj与其他样本点距离都很远,而xixj的最近邻点,通过传统NPE构造邻域图的距离公式计算的话,xixj的邻近点,而xj并不是xi的一个邻近点,忽略了xjxi的影响。通过这种方法选取邻近点面对分布不均匀的样本存在一定的局限性,不能准确找到近邻。基于全局的方法不仅考虑到样本点与邻近点之间的关系,同样考虑到样本点与非邻近点之间的关系。

 ${d_{ij}}=\frac{{||{x_i} - {x_j}||}}{{\sqrt {{D_i}{D_j}} }}\;,$ (4)

 $\delta({x_i},{x_j})=\left\{ \begin{array}{l} \gamma \;\;\;\;\;\;\;\;\;\;\;{x_i}{\rm{和}}{x_j}{\rm{同类}}\\ \frac{1}{\gamma }\;\;\;\;\;\;\;\;\;\;{x_i}{\rm{和}}{x_j}{\rm{不同类}} \end{array} \right.{\rm{ }},$ (5)

 ${d_{ij}}=\frac{{||{x_i} - {x_j}||}}{{\sqrt {{D_i}{D_j}} }}\delta({x_i},{x_j})\;.$ (6)

xixj同类时,δ(xi,xj)为(0,1)之间的常数,dij减小;当xixj不同类时,δ(xi,xj)为大于1的常数,dij增大。从而使类内紧凑,类间疏离。改进的GLI-NPE算法具体步骤如下:

(1)通过距离公式(6)选取K个邻近点,构造邻域图。

(2)与NPE算法的第2步相同,对构造的邻域图计算重构权值矩阵W

(3)与NPE算法的第3步相同,利用M=(I-W)T(I-W)构造矩阵M,通过公式(3)计算M的特征值和特征向量,得到投影矩阵A,通过Y=ATX,得到样本X在低维空间的表示Y

3 试验结果与分析

3.1 基于UCI数据集的试验

 图1 zoo数据集 Fig. 1 zoo data set
 图2 wine数据集 Fig. 2 wine data set
 图3 heart数据集 Fig. 3 heart data set
 图4 sonar数据集 Fig. 4 sonar data set
 图5 german数据集 Fig. 5 german data set

3.2 基于人脸数据集的试验

 图6 ORL数据集 Fig. 6 ORL data set
 图7 Yale数据集 Fig. 7 Yale data set
 图8 Yale B数据集 Fig. 8 YaleB data set

4 结论

 [1] 周志华, 杨强. 机器学习及其应用[M].北京:清华大学出版社, 2011:87-95.(1) [2] EGECIOGLU O, FERHATOAMANOGLU H, OGRAS U. Dimensionality reduction and similarity computation by inner-product approximations[J].IEEE Transactions on Knowledge and Data Engineering, 2004, 16(6):714-726.(1) [3] BELKIN M, NIYOGI P. Laplacian eigenmaps for dimensionality reduction and data representation[J]. Neural Computation, 2003, 15(6):1373-1396.(1) [4] 赵连伟, 罗四维, 赵艳敞, 等. 高维数据流形的低维嵌入及嵌入维数研究[J]. 软件学报, 2005, 16(8):1423-1430.ZHAO Lianwei, LUO Siwei, ZHAO Yanchang, et al. Study on the low-dimensional embedding and the embedding dimensionality of manifold of high-dimensional data[J].Journal of Software, 2005, 16(8):1423-1430.(1) [5] TENENBAUMe J B, DE SILVA V, LANGFORD J C. A global geometric framework for nonlinear dimensionality reduction[J].Science, 2000, 290(5500):2319-2323.(1) [6] HUANG Hong, HUANG Yunbiao. Improved discriminant sparsity neighborhood preserving embedding for hyperspectral image classification[J].Neurocomputing, 2014, 136(2014):224-234.(1) [7] JOLLIFFE I T. Principle component analysis[M]. Berlin: Springer-Verlag, 1986: 384-389.(1) [8] WANG Fei, WANG Xin, ZHANG Daoqiang, et al. MarginFace: A novel face recognition method by average neighborhood margin maximization[J]. Pattern Recognition, 2009, 42(11):2863-2875.(1) [9] BELHUMEUR P N, HESPANDA J P, KRIEGMAN D J. Eigenfaces vs. fisherfaces: Recognition using class specific linear projection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(7):711-720.(1) [10] HE Xiaofei, NIYOGI P. Locality preserving projections[J]. Advances in Neural Information Processing Systems, 2005, 45(1):186-197.(1) [11] HAM J, LEE D D, SAUL L K. Semi-supervised alignment of manifolds[J]. International Workshop on Articial Intelligence ＆ Statistics, 2005:120-127.(1) [12] ROWEIS S T, SAUL L K. Nonlinear dimensionality reduction by locally linear embedding[J]. Science, 2000, 290(5500):2323-2326.(1) [13] HE Xiaofei, CAI Deng, YAN Shuicheng, et al. Neighborhood preserving embedding[C]／／Proceedings of the Tenth IEEE International Conference on Computer Vision Workshops. New York, USA: IEEE, 2005:1208-1213.(1) [14] YIN Fei, JIAO L C, SHANG Fanhua, et al. Sparse regularization discriminant analysis for face recognition[J]. Neurocomputing, 2014, 128:341-362.(1) [15] ZHANG Wei, XUE Xiangyang, LU Hong, et al. Discriminant neighborhood embedding for classification[J]. Pattern Recognition, 2006, 39(11):2240-2243.(1) [16] GUI Jie, SUN Zhenan, JIA Wei, et al. Discriminant sparse neighborhood preserving embedding for face recognition[J]. Pattern Recognition, 2012, 45(8):2884-2893.(1) [17] CHENG Bin, YANG Jianchao, YAN Shuicheng, et al. Learning with l1-graph for image analysis[J]. IEEE Transactions on Image Processing, 2010, 19(4):858-866.(1) [18] QIAO Lishan, CHEN Songcan, TAN Xiaoyan. Sparsity preserving projections with applications to face recognition[J]. Pattern Recognition, 2010, 43(1):331-341.(1) [19] 包兴, 张莉, 赵梦梦, 等. 基于类别信息的邻域保持嵌入算法[J]. 计算机科学, 2015, 42(5):94-97.BAO Xing, ZHANG Li, ZHAO Mengmeng, et al. Label information-based neighbourhood preserving embedding[J]. Computer Science, 2015, 42(5):94-97.(1) [20] WANG Yong, WU Yi. Complete neighborhood preserving embedding for face recognition[J]. Pattern Recognition, 2010, 43(3):1008-1015.(1) [21] SAMARIA F S, HARTER A C. Parameterisation of a stochastic model for human face identification[C]／／Proceedings of the Second IEEE Workshop on Applications of Computer Vision. New York, USA: IEEE, 1994:138-142.(1) [22] LEE K C, HO J, KRIEGMAN D J. Acquiring linear subspaces for face recognition under variable lighting[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(5):684-698.(1)