Journal of Shandong University (Engineering Science) ›› 2019, Vol. 49 ›› Issue (2): 8-16. doi: 10.6040/j.issn.1672-3961.0.2018.271

• Machine Learning and Data Mining •

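Image attribute annotation based on extreme gradient boosting algorithm

Hongbin ZHANG1, Diedie QIU1, Renzhong WU1, Tao ZHU2, Jin HUA2, Donghong JI3

  1. Software School, East China Jiaotong University, Nanchang 330013, Jiangxi, China
  2. School of Information Engineering, East China Jiaotong University, Nanchang 330013, Jiangxi, China
  3. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, Hubei, China
  • Received: 2018-07-04 Online: 2019-04-20 Published: 2019-04-19
  • About the author: ZHANG Hongbin (born 1979), male, from Rugao, Jiangsu; associate professor, master's supervisor, Ph.D. His main research interests are image annotation, natural language processing, and machine learning. E-mail: zhanghongbin@whu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61762038, 61741108, 61861016); Humanities and Social Sciences Research Planning Fund of the Ministry of Education of China (16YJAZH029, 17YJAZH117)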

Abstract:

To improve annotation performance, a novel image attribute annotation model based on the eXtreme gradient boosting (XGBoost) algorithm is proposed. First, four kinds of image features, namely local binary patterns (LBP), Gist, scale invariant feature transform (SIFT), and visual geometry group (VGG) features, are extracted to characterize the key visual content of images. The XGBoost algorithm is then used to build a strong classifier by integrating a group of weak classifiers, and image attribute annotation is performed with this strong classifier. Next, the deep semantics implied by image attributes are mined to create a novel hierarchical attribute representation that is closer to human cognition. Finally, transfer learning strategies are designed and classification models are combined to further improve annotation performance. Experimental results show that the Gist feature faithfully characterizes the key visual content of images. Compared with the best result before transfer learning, the basic transfer (BT) learning strategy improves accuracy by about 8.69%. Compared with the best BT result, the hybrid transfer (HT) learning strategy, together with a suitable combination of classification models, improves accuracy by about 17.55%. The presented model thus improves image attribute annotation accuracy.

Key words: image attribute annotation, eXtreme gradient boosting, transfer learning, weak classifiers, deep semantics
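The idea of integrating weak classifiers into a strong one can be illustrated with a small sketch. The code below is not the authors' model and does not use the XGBoost library: it is plain gradient boosting with squared loss over depth-1 trees (stumps), without XGBoost's regularized second-order objective. The function names (`fit_stump`, `fit_boosted`, `predict`) and the one-dimensional toy data are assumptions made only for illustration.

```python
from typing import List, Tuple

# A stump is (feature index, threshold, value if x <= threshold, value otherwise).
Stump = Tuple[int, float, float, float]


def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Fit a depth-1 regression tree to the residuals by least squares."""
    overall = sum(residuals) / len(residuals)
    # Fallback when no feature can be split: predict the mean everywhere.
    best: Stump = (0, float("inf"), overall, overall)
    best_err = float("inf")
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if err < best_err:
                best_err, best = err, (j, t, lv, rv)
    return best


def stump_value(stump: Stump, row: List[float]) -> float:
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv


def fit_boosted(X: List[List[float]], y: List[int],
                n_rounds: int = 20, learning_rate: float = 0.5) -> List[Stump]:
    """Labels are -1/+1. Each round fits a stump to the residuals y - F(x)."""
    scores = [0.0] * len(X)
    ensemble: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [yi - si for yi, si in zip(y, scores)]
        j, t, lv, rv = fit_stump(X, residuals)
        # Shrink the stump's outputs by the learning rate before adding it.
        stump = (j, t, learning_rate * lv, learning_rate * rv)
        ensemble.append(stump)
        scores = [s + stump_value(stump, row) for s, row in zip(scores, X)]
    return ensemble


def predict(ensemble: List[Stump], row: List[float]) -> int:
    """Sign of the summed stump outputs."""
    return 1 if sum(stump_value(s, row) for s in ensemble) >= 0 else -1


# Toy data (hypothetical): the positive class is an interval, so no single
# threshold separates it, but a sum of stumps can.
X = [[float(v)] for v in range(10)]
y = [-1, -1, -1, 1, 1, 1, 1, -1, -1, -1]

weak = fit_boosted(X, y, n_rounds=1)
strong = fit_boosted(X, y, n_rounds=20)
weak_errors = sum(predict(weak, row) != yi for row, yi in zip(X, y))
strong_errors = sum(predict(strong, row) != yi for row, yi in zip(X, y))
print("errors with 1 stump:", weak_errors, "| errors with 20 stumps:", strong_errors)
```

In a setting like the one the abstract describes, each row of `X` would be an extracted feature vector (LBP, Gist, SIFT, or VGG) and the labels would be attribute classes; a production model would more likely rely on the XGBoost library itself than on a hand-written loop like this.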

CLC number: TP391

Fig. 1 Basic framework of image attribute annotation

Table 1 Comparison of the annotation results of the main models before transfer learning (%)

| Model | Precision_material (Pu) | Precision_material (Canvas) | Precision_material (Polyester) | Precision_material (Nylon) | AP_material | Accuracy |
|---|---|---|---|---|---|---|
| Gist_XGBoost | 64.45 | 57.71 | 38.27 | 43.09 | 50.88 | 52.54 |
| Gist_GBDT | 64.65 | 56.99 | 38.37 | 40.89 | 50.23 | 51.46 |
| Gist_NB | 43.05 | 37.91 | 24.41 | 30.30 | 33.92 | 38.28 |
| Gist_RF | 61.00 | 54.45 | 37.56 | 40.87 | 48.47 | 50.22 |
| Gist_DT | 49.16 | 39.69 | 29.11 | 29.33 | 36.82 | 37.39 |
| Gist_KNN | 57.65 | 43.60 | 34.89 | 44.46 | 45.15 | 45.79 |
| Gist_LR | 55.91 | 41.76 | 0.00 | 39.25 | 34.23 | 46.19 |
| SIFT_XGBoost | 75.89 | 56.86 | 41.03 | 56.54 | 57.58 | 60.04 |
| SIFT_GBDT | 72.65 | 55.56 | 43.45 | 59.53 | 57.80 | 59.83 |
| SIFT_NB | 84.27 | 57.04 | 37.59 | 49.29 | 57.05 | 57.05 |
| SIFT_RF | 69.36 | 49.87 | 40.00 | 53.02 | 53.06 | 55.97 |
| SIFT_DT | 64.62 | 37.76 | 29.78 | 38.24 | 42.60 | 43.52 |
| SIFT_KNN | 55.33 | 39.27 | 32.74 | 38.02 | 41.34 | 41.90 |
| LBP_XGBoost | 66.67 | 57.42 | 42.06 | 44.67 | 52.71 | 54.24 |
| LBP_GBDT | 64.40 | 57.65 | 40.87 | 42.93 | 51.46 | 52.89 |
| LBP_NB | 45.99 | 39.61 | 32.41 | 33.45 | 37.87 | 39.34 |
| LBP_RF | 60.43 | 51.93 | 40.00 | 41.38 | 48.44 | 50.40 |
| LBP_DT | 49.61 | 38.35 | 29.00 | 30.01 | 36.74 | 36.80 |
| LBP_KNN | 52.15 | 45.05 | 38.79 | 45.00 | 45.25 | 46.38 |
| LBP_LR | 60.73 | 53.33 | 39.72 | 41.40 | 48.80 | 50.03 |
| VGG16_XGBoost | 55.68 | 49.36 | 35.58 | 35.09 | 43.93 | 45.01 |
| VGG16_GBDT | 53.79 | 46.69 | 33.61 | 34.19 | 42.07 | 43.39 |
| VGG16_NB | 26.75 | 0.00 | 0.00 | 0.00 | 6.69 | 26.75 |
| VGG16_RF | 53.63 | 46.91 | 32.69 | 35.58 | 42.20 | 43.44 |
| VGG16_DT | 42.47 | 38.80 | 28.35 | 26.93 | 34.14 | 34.15 |
| VGG16_KNN | 41.28 | 35.19 | 31.51 | 33.10 | 35.27 | 36.20 |
| VGG16_LR | 28.34 | 0.00 | 31.44 | 0.00 | 14.95 | 28.89 |
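The AP values in Table 1 appear to be the unweighted mean of the four per-material precision values. This page does not state the definition, so the check below is an assumption tested against a few rows copied from the table; the helper name `average_precision` is hypothetical.

```python
from typing import Dict, List, Tuple

# (per-material precision in %: Pu, Canvas, Polyester, Nylon), reported AP in %.
# Values are copied from Table 1.
rows: Dict[str, Tuple[List[float], float]] = {
    "Gist_XGBoost": ([64.45, 57.71, 38.27, 43.09], 50.88),
    "SIFT_XGBoost": ([75.89, 56.86, 41.03, 56.54], 57.58),
    "SIFT_NB": ([84.27, 57.04, 37.59, 49.29], 57.05),
    "Gist_LR": ([55.91, 41.76, 0.00, 39.25], 34.23),
    "VGG16_NB": ([26.75, 0.00, 0.00, 0.00], 6.69),
}


def average_precision(precisions: List[float]) -> float:
    """Unweighted mean of the per-class precision values."""
    return sum(precisions) / len(precisions)


# Largest gap between the recomputed mean and the reported AP, in percentage points.
max_gap = max(abs(average_precision(p) - ap) for p, ap in rows.values())

for name, (precisions, reported_ap) in rows.items():
    print(f"{name}: computed {average_precision(precisions):.2f}, reported {reported_ap:.2f}")
```

For these rows the recomputed means agree with the reported AP to within two-decimal rounding. This also explains rows such as VGG16_NB, where a classifier that never predicts three of the four materials keeps a nonzero accuracy but has a very low AP.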

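Fig. 2 P-R curves of the main annotation models before transfer learning

Fig. 3 Accuracy comparison of NT vs. BT vs. HT (G: Gist, S: SIFT, L: LBP)

Fig. 4 Accuracy comparison of various model combinations

Fig. 5 Comparison of annotation accuracy between the results of this study and the main baselines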
