• Machine Learning and Data Mining •

### A weight-based initial centers selection algorithm for K-modes clustering

JIANG Feng1, DU Junwei1, LIU Guozhu1, SUI Yuefei2

1. College of Information Science & Technology, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China
2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

• Received: 2015-04-15 Online: 2016-04-20 Published: 2015-04-15
• About the author: JIANG Feng (born 1978), male, from Pengze, Jiangxi; associate professor, Ph.D.; main research interests are data mining and rough sets. E-mail: jiangkong@163.net
• Supported by: National Natural Science Foundation of China (60802042, 61273180); Natural Science Foundation of Shandong Province (ZR2011FQ005); Science and Technology Program of Higher Education Institutions of Shandong Province (J11LG05)

Abstract: Existing initialization methods for K-modes clustering do not account for differences in attribute significance. To solve this problem, an initial center selection algorithm based on weighted density and weighted overlap distance, called Ini-Weight, was proposed. Ini-Weight selected initial centers by calculating the density of each object and the distance between every pair of objects; in both calculations, each attribute was assigned a weight according to its significance. Ini-Weight was then compared with existing methods on UCI data sets. The results showed that Ini-Weight could effectively distinguish between attributes of different significance and improve the accuracy of initial center selection.
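The abstract does not give the exact formulas, so the following Python sketch only illustrates the general idea of density- and distance-based initialization with attribute weights. Everything in it is an assumption made for illustration: the function names, the form of the density (average weighted similarity to all objects), the selection rule (density multiplied by the distance to the nearest chosen center), and the treatment of the weights as a given input rather than values derived from attribute significance.

```python
from typing import List, Sequence, Tuple

Obj = Tuple[str, ...]


def weighted_overlap_distance(x: Obj, y: Obj, weights: Sequence[float]) -> float:
    """Sum the weights of the attributes on which x and y disagree."""
    return sum(w for a, b, w in zip(x, y, weights) if a != b)


def weighted_density(i: int, data: List[Obj], weights: Sequence[float]) -> float:
    """Assumed density: average weighted similarity of object i to all objects."""
    total = sum(weights)
    sim = sum(total - weighted_overlap_distance(data[i], y, weights) for y in data)
    return sim / (len(data) * total)


def select_initial_centers(data: List[Obj], weights: Sequence[float], k: int) -> List[Obj]:
    """Pick k initial centers that are dense and far from centers already chosen."""
    n = len(data)
    dens = [weighted_density(i, data, weights) for i in range(n)]
    # The first center is the densest object.
    chosen = [max(range(n), key=lambda i: dens[i])]
    while len(chosen) < k:
        def score(i: int) -> float:
            # Assumed rule: density times distance to the nearest chosen center.
            nearest = min(
                weighted_overlap_distance(data[i], data[c], weights) for c in chosen
            )
            return dens[i] * nearest

        candidates = [i for i in range(n) if i not in chosen]
        chosen.append(max(candidates, key=score))
    return [data[i] for i in chosen]


if __name__ == "__main__":
    # Hypothetical categorical data and hypothetical attribute weights.
    sample = [
        ("a", "x", "p"), ("a", "x", "q"), ("a", "x", "p"),
        ("b", "y", "r"), ("b", "y", "r"), ("b", "z", "r"),
    ]
    print(select_initial_centers(sample, weights=[0.5, 0.3, 0.2], k=2))
```

With uniform weights this reduces to an unweighted density and overlap distance; unequal weights let a mismatch on a more significant attribute count for more in both quantities.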

• CLC Number: TP181
