Journal of Shandong University(Engineering Science) ›› 2019, Vol. 49 ›› Issue (2): 8-16.doi: 10.6040/j.issn.1672-3961.0.2018.271

• Machine Learning & Data Mining •

Image attribute annotation based on extreme gradient boosting algorithm

Hongbin ZHANG1(),Diedie QIU1,Renzhong WU1,Tao ZHU2,Jin HUA2,Donghong JI3   

  1. Software School, East China Jiaotong University, Nanchang 330013, Jiangxi, China
    2. School of Information, East China Jiaotong University, Nanchang 330013, Jiangxi, China
    3. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, Hubei, China
  • Received: 2018-07-04  Online: 2019-04-20  Published: 2019-04-19
  • Supported by:
    National Natural Science Foundation of China (61762038, 61741108, 61861016); Humanities and Social Science Research Planning Foundation of the Ministry of Education of China (16YJAZH029, 17YJAZH117)

Abstract:

To improve annotation performance, a novel image attribute annotation model based on the eXtreme gradient boosting (XGBoost) algorithm was proposed. Image features, i.e., local binary patterns (LBP), Gist, scale-invariant feature transform (SIFT), and visual geometry group (VGG) features, were extracted to characterize the key visual content of images. The state-of-the-art boosting algorithm XGBoost was then used to build a strong classifier by integrating a group of weak classifiers, and image attribute annotation was implemented with this strong classifier. The valuable deep semantics implied by image attributes were mined in turn to create a novel hierarchical attribute representation mechanism that was closer to human cognition. Finally, transfer learning strategies were designed to further improve annotation performance. Experimental results showed that the Gist feature characterized the key visual content of images best. Compared with the best competitor before transfer learning, the basic transfer (BT) learning strategy improved accuracy by about 8.69%; compared with the best BT competitor, the hybrid transfer (HT) learning strategy improved accuracy by about 17.55%. The proposed model thus improved annotation accuracy.
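The central mechanism of the abstract — building a strong classifier by combining many weak classifiers, each new one trained on the residual errors of the ensemble so far — can be sketched in miniature. This is an illustrative toy, not the paper's implementation: the random 2-D "feature vectors", the decision-stump weak learner, and the squared-loss objective are all assumptions made here for brevity; the actual model applies the XGBoost library to LBP, Gist, SIFT, and VGG image features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for image feature vectors with a binary attribute label.
X = rng.uniform(0.0, 1.0, size=(400, 2))
y = ((X[:, 0] > 0.5) | (X[:, 1] > 0.5)).astype(float)

def fit_stump(X, residual):
    """Fit a depth-1 regression tree (decision stump) to the current residuals."""
    best = (np.inf, 0, 0.0, 0.0, 0.0)  # (sse, feature, threshold, left_val, right_val)
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], np.linspace(0.1, 0.9, 9)):
            left = X[:, j] <= t
            if left.sum() in (0, len(X)):   # skip degenerate splits
                continue
            lv, rv = residual[left].mean(), residual[~left].mean()
            sse = ((residual - np.where(left, lv, rv)) ** 2).sum()
            if sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]

def stump_predict(stump, X):
    j, t, lv, rv = stump
    return np.where(X[:, j] <= t, lv, rv)

def boost_fit(X, y, n_rounds=50, lr=0.3):
    """Gradient boosting with squared loss: each stump fits the ensemble's residuals."""
    base = y.mean()
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y - pred)       # weak learner on current residuals
        stumps.append(stump)
        pred += lr * stump_predict(stump, X)  # shrunken additive update
    return base, stumps

def boost_predict(model, X, lr=0.3):
    base, stumps = model
    score = np.full(len(X), base)
    for s in stumps:
        score += lr * stump_predict(s, X)
    return (score > 0.5).astype(float)

model = boost_fit(X, y)
acc = (boost_predict(model, X) == y).mean()
```

A single stump can express only one threshold on one feature, but the boosted sum of 50 stumps recovers the additive structure of the label and reaches high training accuracy. XGBoost builds on this basic scheme with second-order gradient information, explicit regularization of leaf weights, and column subsampling.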

Key words: image attribute annotation, eXtreme gradient boosting, transfer learning, weak classifiers, deep semantic

CLC Number: TP391

Fig.1

Basic framework for image attribute annotation

Table 1

Comparison of annotation results with the main baselines before transfer learning

(all values in %)

| Model | Precision (PU) | Precision (Canvas) | Precision (Polyester) | Precision (Nylon) | AP_material | Accuracy |
|---|---|---|---|---|---|---|
| Gist_XGBoost | 64.45 | 57.71 | 38.27 | 43.09 | 50.88 | 52.54 |
| Gist_GBDT | 64.65 | 56.99 | 38.37 | 40.89 | 50.23 | 51.46 |
| Gist_NB | 43.05 | 37.91 | 24.41 | 30.30 | 33.92 | 38.28 |
| Gist_RF | 61.00 | 54.45 | 37.56 | 40.87 | 48.47 | 50.22 |
| Gist_DT | 49.16 | 39.69 | 29.11 | 29.33 | 36.82 | 37.39 |
| Gist_KNN | 57.65 | 43.60 | 34.89 | 44.46 | 45.15 | 45.79 |
| Gist_LR | 55.91 | 41.76 | 0.00 | 39.25 | 34.23 | 46.19 |
| SIFT_XGBoost | 75.89 | 56.86 | 41.03 | 56.54 | 57.58 | 60.04 |
| SIFT_GBDT | 72.65 | 55.56 | 43.45 | 59.53 | 57.80 | 59.83 |
| SIFT_NB | 84.27 | 57.04 | 37.59 | 49.29 | 57.05 | 57.05 |
| SIFT_RF | 69.36 | 49.87 | 40.00 | 53.02 | 53.06 | 55.97 |
| SIFT_DT | 64.62 | 37.76 | 29.78 | 38.24 | 42.60 | 43.52 |
| SIFT_KNN | 55.33 | 39.27 | 32.74 | 38.02 | 41.34 | 41.90 |
| LBP_XGBoost | 66.67 | 57.42 | 42.06 | 44.67 | 52.71 | 54.24 |
| LBP_GBDT | 64.40 | 57.65 | 40.87 | 42.93 | 51.46 | 52.89 |
| LBP_NB | 45.99 | 39.61 | 32.41 | 33.45 | 37.87 | 39.34 |
| LBP_RF | 60.43 | 51.93 | 40.00 | 41.38 | 48.44 | 50.40 |
| LBP_DT | 49.61 | 38.35 | 29.00 | 30.01 | 36.74 | 36.80 |
| LBP_KNN | 52.15 | 45.05 | 38.79 | 45.00 | 45.25 | 46.38 |
| LBP_LR | 60.73 | 53.33 | 39.72 | 41.40 | 48.80 | 50.03 |
| VGG16_XGBoost | 55.68 | 49.36 | 35.58 | 35.09 | 43.93 | 45.01 |
| VGG16_GBDT | 53.79 | 46.69 | 33.61 | 34.19 | 42.07 | 43.39 |
| VGG16_NB | 26.75 | 0.00 | 0.00 | 0.00 | 6.69 | 26.75 |
| VGG16_RF | 53.63 | 46.91 | 32.69 | 35.58 | 42.20 | 43.44 |
| VGG16_DT | 42.47 | 38.80 | 28.35 | 26.93 | 34.14 | 34.15 |
| VGG16_KNN | 41.28 | 35.19 | 31.51 | 33.10 | 35.27 | 36.20 |
| VGG16_LR | 28.34 | 0.00 | 31.44 | 0.00 | 14.95 | 28.89 |

Fig.2

P-R curves of the proposed models before transfer learning

Fig.3

Accuracy comparisons of NT vs BT vs HT (G: Gist, S: SIFT, L: LBP)

Fig.4

Accuracy comparisons of various model combinations

Fig.5

Accuracy comparisons between the proposed model and the state-of-the-art baselines
