Journal of Shandong University (Engineering Science), 2019, Vol. 49, Issue (2): 107-115. doi: 10.6040/j.issn.1672-3961.0.2018.458

• Machine Learning & Data Mining •

Transfer fuzzy clustering based on self-constraint of multiple medoids

Jun QIN1, Yuanpeng ZHANG1,2,*, Yizhang JIANG2, Wenlong HANG3

  1. Department of Medical Informatics, Nantong University, Nantong 226001, Jiangsu, China
  2. School of Digital Media, Jiangnan University, Wuxi 214122, Jiangsu, China
  3. School of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, Jiangsu, China
  • Received:2018-10-29 Online:2019-04-20 Published:2019-04-19
  • Contact: Yuanpeng ZHANG E-mail:2432533512@qq.com;155297131@qq.com
  • Supported by:
    National Natural Science Foundation of China (81701793); Nantong Science and Technology Program (MS12017016-2); Jiangsu Provincial Social Science Foundation (18YSC009)

Abstract:

Transfer clustering approaches derived from the fuzzy C-means (FCM) framework, which treat virtual centers from source domains as transfer knowledge, inherit the shortcomings of FCM: they are not robust to outliers and noise, and a single center per cluster is insufficient to capture the inner structure of a cluster. To address these problems, a transfer fuzzy clustering approach based on the self-constraint of multiple medoids was proposed. Prototype weights were introduced and assigned to each object to capture the inner structures of clusters. This weighting strategy captured the inner structures of clusters more fully and made the clustering more robust to outliers and noise. Furthermore, using the distribution of data in the source domain, the inner structure of data in the target domain was reconstructed, and the resulting structure was used as the transfer knowledge to guide clustering in the target domain. Compared with using a single virtual center per cluster as transfer knowledge, the updated inner structures of data in the target domain contained more knowledge. Experimental results demonstrated that the proposed approach achieved improvements of 0.6745 and 0.6084 in normalized mutual information (NMI) and adjusted Rand index (ARI), respectively, on synthetic and real-life datasets compared with the benchmark approaches. Therefore, based on the transfer principle of the self-constraint of multiple medoids, the proposed clustering approach performed well in the transfer environment.
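
The idea of describing a cluster by weighted objects instead of one virtual center can be illustrated with a minimal sketch. It is not the MMSC-TFC objective, which the abstract does not state: the function name `fuzzy_memberships`, the FCM-style membership formula, the toy data, and the hand-picked prototype weights are all assumptions made for illustration only.

```python
def fuzzy_memberships(dist, weights, m=2.0, eps=1e-12):
    """Return an n x K membership matrix (illustrative, hypothetical helper).

    dist    : n x n pairwise dissimilarities between objects
    weights : K lists of length n; weights[k][j] is the prototype weight of
              object j in cluster k (assumed non-negative, summing to 1)
    m       : fuzzy index (> 1), as in FCM
    """
    n = len(dist)
    memberships = []
    for i in range(n):
        # Object-to-cluster dissimilarity: a weighted sum over all objects
        # acting as medoids, rather than a distance to a single center.
        d = [max(sum(w[j] * dist[i][j] for j in range(n)), eps) for w in weights]
        # FCM-style normalization (an assumption, not taken from the paper).
        inv = [x ** (-1.0 / (m - 1.0)) for x in d]
        total = sum(inv)
        memberships.append([v / total for v in inv])
    return memberships


# Toy 1-D data with two well-separated groups (made up for this sketch).
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
dist = [[abs(a - b) for b in points] for a in points]
weights = [
    [0.25, 0.5, 0.25, 0.0, 0.0, 0.0],  # cluster 0 spreads weight over 3 objects
    [0.0, 0.0, 0.0, 0.25, 0.5, 0.25],  # cluster 1 likewise
]
u = fuzzy_memberships(dist, weights)
```

Because each cluster is described by several weighted objects, one outlying object carrying a small weight shifts the cluster description less than it would shift a single mean-based center.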

Key words: fuzzy clustering, transfer clustering, multiple exemplars, transfer learning, unsupervised learning

CLC Number: 

  • TP391

Fig.1 Transfer learning of PPKTFCM

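Table 1 Parameter settings of each approach

| Algorithm | Description | Parameter settings |
| --- | --- | --- |
| FCM | Fuzzy C-means clustering based on virtual cluster centers and "soft" partitioning | Fuzzy index m: [1.1, 1.2, …, 3] |
| AP and TAP | AP is a clustering algorithm based on "message passing" between data points. TAP generalizes AP: it uses the statistical characteristics of the source-domain data (distribution-matching transfer strategy) and the geometric characteristics between source-domain and target-domain data (instance-preserving transfer strategy) to perform transfer clustering | Number of best nearest neighbors: [1, 2, …, 7], λ1: [0.1, 0.2, …, 1], λ2: [1, 2, …, 10] |
| PPKTFCM | A fuzzy transfer clustering algorithm proposed under the FCM framework, based on the rule of minimizing the sum of distances between sample points and historical cluster centers and the rule of minimizing the change in memberships | Fuzzy index m: [1.1, 1.2, …, 3], λ1: [0.1, 0.2, …, 10], λ2: [1, 2, …, 10] |
| Degenerate MMSC-TFC and MMSC-TFC | The algorithms proposed in this study | λ1: [0.1, 0.2, …, 10], λ2: [1, 2, …, 10], $\boldsymbol{\epsilon} = 10^{-4}$ |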

Fig.2 Synthetic datasets

Table 2 Clustering results on D-T of the target domain

| Algorithm | NMI (K=2) | ARI (K=2) | NMI (K=3) | ARI (K=3) |
| --- | --- | --- | --- | --- |
| FCM | 0.2462 | 0.0862 | 0.7967 | 0.8221 |
| AP | 0.3373 | 0.4087 | 0.6218 | 0.6256 |
| Degenerate MMSC-TFC | 0.7357 | 0.8282 | 0.8479 | 0.8862 |
| PPKTFCM | 0.2961 | 0.1503 | 0.7610 | 0.7845 |
| TAP | 0.6139 | 0.7098 | 0.7948 | 0.8052 |
| MMSC-TFC | 0.8674 | 0.9121 | 1.0000 | 1.0000 |
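
NMI and ARI both score a predicted partition against reference labels and are unaffected by how the cluster labels are named. The following is a minimal pure-Python sketch of the two measures. The function names are illustrative, and the geometric-mean normalization used for NMI is an assumption, since the source does not say which normalization was applied.

```python
from collections import Counter
from math import comb, log, sqrt


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index computed from the pair-counting contingency table."""
    n = len(truth)
    if n < 2:
        return 1.0
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:  # degenerate case, e.g. one cluster on both sides
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_information(truth, pred):
    """NMI with geometric-mean normalization (an assumed convention)."""
    n = len(truth)
    count_t, count_p = Counter(truth), Counter(pred)
    joint = Counter(zip(truth, pred))
    mi = sum(
        (c / n) * log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    h_t = -sum((c / n) * log(c / n) for c in count_t.values())
    h_p = -sum((c / n) * log(c / n) for c in count_p.values())
    if h_t == 0.0 or h_p == 0.0:  # at least one side is a single cluster
        return 1.0 if h_t == h_p else 0.0
    return mi / sqrt(h_t * h_p)


# Same partition under swapped label names: both scores are 1.
same_ari = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
same_nmi = normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
# Partitions that share no information: NMI is 0 and ARI is negative.
cross_ari = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
cross_nmi = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

ARI is corrected for chance and can therefore fall below zero, whereas NMI stays within [0, 1].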

Fig.3 Clustering results on the synthetic dataset

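Table 3 Sample composition of two transfer scenarios: source domain and target domain

| Scenario | Dataset | Size | Dimensions | Number of clusters |
| --- | --- | --- | --- | --- |
| comp VS sci | Source domain | 1500 | 350 | 2 |
| comp VS sci | Target domain | 150 | 350 | 2 |
| rec VS talk | Source domain | 1500 | 350 | 2 |
| rec VS talk | Target domain | 150 | 350 | 2 |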

Table 4 Clustering results on real-life datasets

| Algorithm | NMI (comp VS sci) | ARI (comp VS sci) | NMI (rec VS talk) | ARI (rec VS talk) |
| --- | --- | --- | --- | --- |
| FCM | 0.2214 | 0.1897 | 0.2357 | 0.2574 |
| AP | 0.4892 | 0.5011 | 0.2020 | 0.6885 |
| Degenerate MMSC-TFC | 0.7322 | 0.8012 | 0.7125 | 0.8752 |
| PPKTFCM | 0.7236 | 0.7896 | 0.8922 | 0.9114 |
| TAP | 0.7951 | 0.7754 | 0.8230 | 0.9387 |
| MMSC-TFC | 0.7852 | 0.7981 | 0.9102 | 0.9226 |