Journal of Shandong University (Engineering Science) ›› 2012, Vol. 42 ›› Issue (4): 1-7.

• Machine Learning and Data Mining •

A self-adaptive classification method for concept-drifting data streams

GUO Gong-de1,2, LI Nan1,2, CHEN Li-fei1,2

  1. School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 350007, China;
  2. Key Laboratory of Network Security and Cryptography, Fujian Normal University, Fuzhou 350007, China
  • Received: 2011-04-05 Online: 2012-08-20 Published: 2011-04-05
  • About the author: GUO Gong-de (1965- ), male, from Longyan, Fujian; professor, Ph.D.; main research interests: pattern recognition and artificial intelligence. E-mail: ggd@fjnu.edu.cn
  • Supported by:

    Key Project of the Special Scientific Research Fund for Provincial Universities of Fujian Province (JK2009006); Major Project of University-Industry Cooperation of Fujian Province (2010H6007)

Abstract:

A novel method is proposed for classifying concept-drifting data streams; it detects concept drift in a stream and adapts to it quickly. The method divides the original data stream into data blocks along the time axis and trains the model on representative data selected from the first block. Selecting representative data reduces the effect of noise and boundary data on classification accuracy and makes drift detection more comprehensive without being overly sensitive to outliers. The model then classifies each subsequent data block, and the classification results are used to adjust the current classification model dynamically. Experimental results show that the method adjusts the classification model automatically according to the current state of the data stream, adapts quickly to concept drift, and achieves good classification performance.

Key words: concept drift, data stream, classification, outlier, time axis
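
The block-wise procedure summarized in the abstract can be sketched roughly as follows. The abstract does not specify the base classifier, how representative data are selected, or how the model is adjusted, so everything concrete here is an assumption made for illustration: a nearest-centroid classifier stands in for the model, "representative" means the fraction `keep` of each class closest to its class centroid, and the adjustment blends old and new centroids with a weight `rate`. The names `centroids_of`, `select_representatives`, `predict`, and `classify_stream` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from math import dist


def centroids_of(points, labels):
    """Return the mean vector of each class as {label: tuple}."""
    groups = defaultdict(list)
    for p, y in zip(points, labels):
        groups[y].append(p)
    return {y: tuple(sum(c) / len(ps) for c in zip(*ps))
            for y, ps in groups.items()}


def select_representatives(points, labels, keep=0.8):
    """Assumed selection rule: per class, keep the `keep` fraction of
    points closest to the class centroid, dropping the farthest ones
    (likely outliers or boundary points)."""
    cents = centroids_of(points, labels)
    groups = defaultdict(list)
    for p, y in zip(points, labels):
        groups[y].append(p)
    reps, rep_labels = [], []
    for y, ps in groups.items():
        ps = sorted(ps, key=lambda p: dist(p, cents[y]))
        n = max(1, int(len(ps) * keep))
        reps.extend(ps[:n])
        rep_labels.extend([y] * n)
    return reps, rep_labels


def predict(cents, p):
    """Nearest-centroid prediction (stand-in for the paper's model)."""
    return min(cents, key=lambda y: dist(p, cents[y]))


def classify_stream(blocks, first_labels, keep=0.8, rate=0.5):
    """Train on representatives of blocks[0], then classify each later
    block and adjust the model from the classification results."""
    reps, rep_labels = select_representatives(blocks[0], first_labels, keep)
    cents = centroids_of(reps, rep_labels)
    all_preds = []
    for block in blocks[1:]:
        preds = [predict(cents, p) for p in block]
        all_preds.append(preds)
        # Assumed update rule: move each centroid toward the centroid of
        # the representative points just assigned to that class.
        reps, rep_labels = select_representatives(block, preds, keep)
        for y, c in centroids_of(reps, rep_labels).items():
            cents[y] = tuple((1 - rate) * a + rate * b
                             for a, b in zip(cents[y], c))
    return all_preds
```

Because the update in this sketch relies on predicted labels only, it can follow gradual drift but would reinforce its own mistakes under abrupt drift; a real system would need a drift test or occasional true labels, which the abstract does not describe.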
