
Journal of Shandong University (Engineering Science) ›› 2016, Vol. 46 ›› Issue (6): 8-14. doi: 10.6040/j.issn.1672-3961.1.2016.294


Bayesian combination of SVR on regularization path based on KNN of input

WANG Mei1,2, ZENG Zhaohu3, SUN Yingqi1, YANG Erlong4*, SONG Kaoping2,4

  1. School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, Heilongjiang, China;
  2. Postdoctoral Scientific Research Workstation, Beijing Deweijiaye Science and Technology Corporation Ltd., Beijing 100020, China;
  3. Information Center of Production Plant No.5, PetroChina Daqing Oilfield, Daqing 163318, Heilongjiang, China;
  4. Key Laboratory on Enhanced Oil and Gas Recovery of the Ministry of Education, Northeast Petroleum University, Daqing 163318, Heilongjiang, China
  • Received: 2016-03-31 Online: 2016-12-20 Published: 2016-03-31
  • Corresponding author: YANG Erlong (born 1976), male, from Baoding, Hebei; professor and doctoral supervisor; main research interest: knowledge engineering. E-mail: erlongyang.nepu@gmail.com
  • About the first author: WANG Mei (born 1976), female, from Baoding, Hebei; associate professor, Ph.D.; main research interests: model selection and kernel methods. E-mail: wangmei@nepu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (51574085); Natural Science Foundation of Heilongjiang Province (F2015020); Beijing Postdoctoral Research Foundation (2015ZZ-120); Postdoctoral Research Foundation of Chaoyang District, Beijing (2014ZZ-14); Cultivation Foundation of Northeast Petroleum University (XN2014102)

Abstract: Building on the regularization path of ε-insensitive support vector regression (ε-SVR), a three-step model combination method based on the K-nearest neighbors (KNN) of the input was proposed. First, ε-SVR was trained on the whole training set to obtain its regularization path, and the initial model set was determined from the piecewise linearity of that path. Second, an average Bayesian Information Criterion (BIC) strategy was applied to prune the initial model set, excluding poorly performing models to obtain the candidate model set; this pruning reduces the size of the candidate set and improves both the computational efficiency and the prediction performance of the combination. Third, in the testing or prediction phase, the final combination model set was determined by the KNN method according to the input vector of the sample, and the Bayesian combination prediction was computed. The Lε-risk consistency of the ε-SVR model combination was defined and proved, providing a mathematical foundation and a sample-based justification for the method. Experimental results demonstrated the effectiveness and efficiency of the KNN-based Bayesian combination of ε-SVR on the regularization path.

Key words: regularization path, model combination, support vector regression, KNN, consistency
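
The pruning and combination steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the candidate models are hypothetical linear stand-ins rather than ε-SVR solutions taken from a regularization path, the degrees of freedom are supplied by hand, the pruning rule keeps the models whose BIC is at or below the mean BIC, and the combination weights are taken proportional to exp(-local ε-insensitive loss) on the K nearest training inputs.

```python
import math

def eps_loss(residual, eps):
    """Epsilon-insensitive loss: zero inside the tube, linear outside."""
    return max(0.0, abs(residual) - eps)

def bic(model, dof, X, y):
    """Gaussian-form BIC, n*log(RSS/n) + dof*log(n); dof is given by the caller."""
    n = len(X)
    rss = sum((yi - model(xi)) ** 2 for xi, yi in zip(X, y))
    return n * math.log(max(rss / n, 1e-12)) + dof * math.log(n)

def prune_by_average_bic(models, dofs, X, y):
    """Assumed rule: keep models whose BIC does not exceed the mean BIC."""
    scores = [bic(m, d, X, y) for m, d in zip(models, dofs)]
    mean_score = sum(scores) / len(scores)
    return [m for m, s in zip(models, scores) if s <= mean_score]

def k_nearest(x, X, k):
    """Indices of the k training inputs closest to x (Euclidean distance)."""
    order = sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))
    return order[:k]

def knn_bayes_predict(x, models, X, y, k=3, eps=0.1):
    """Assumed rule: weight each model by exp(-eps-loss on the k neighbours of x)."""
    idx = k_nearest(x, X, k)
    losses = [sum(eps_loss(y[i] - m(X[i]), eps) for i in idx) for m in models]
    best = min(losses)  # shift by the minimum for numerical stability
    weights = [math.exp(-(loss - best)) for loss in losses]
    total = sum(weights)
    return sum(w / total * m(x) for w, m in zip(weights, models))

# Toy data: y is roughly equal to x, with small alternating noise.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
y = [0.1, 0.9, 2.1, 2.9, 4.1]

# Hypothetical initial model set: lines through the origin with fixed slopes,
# standing in for the SVR models found along a regularization path.
slopes = [0.0, 0.8, 1.0, 1.2]
initial = [lambda x, s=s: s * x[0] for s in slopes]
dofs = [1, 1, 1, 1]  # hand-set degrees of freedom (assumption)

kept = prune_by_average_bic(initial, dofs, X, y)
pred = knn_bayes_predict((2.5,), kept, X, y, k=3, eps=0.1)
print(len(kept), pred)
```

In this toy setting the weights depend on the query point, because each model is scored only on the training samples nearest to that input. A real use would replace the stand-in lines with the ε-SVR solutions from the path and a principled estimate of their degrees of freedom.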

CLC Number: TP181