Cross-media retrieval model based on choosing key canonical correlated vectors

doi:10.6040/j.issn.1672-3961.0.2017.552

Abstract

In cross-media retrieval, capturing the core semantic correlations between heterogeneous media is one of the most important factors affecting final retrieval performance. To improve retrieval performance, a modified kernel canonical correlation analysis (MKCCA) model is presented. Two image features, SIFT (scale-invariant feature transform) and GIST, were extracted to characterize the key visual content of images, and the TF (term frequency) feature was extracted to describe the key characteristics of texts. The extracted features were then mapped into a high-dimensional space by kernel functions, yielding two kernel matrices that describe the mapped features. Based on these kernel matrices, the non-linear semantic correlations between images and texts were mined with a canonical correlation analysis (CCA) model. More importantly, a semantic correlation threshold was used to select the core canonical correlation vectors, which suppresses semantic noise and depicts the key semantic correlations between images and texts more robustly. Experimental results showed that the feature combination SIFT-TF gave the best overall retrieval performance, and that the MKCCA model combined with the Gaussian kernel achieved the highest retrieval performance. Compared with the best competitor, the mean average precision (MAP) of the "images retrieve texts (I_R_T)" task improved by about 3.06 percentage points, and that of the "texts retrieve images (T_R_I)" task by about 1.18 percentage points.
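The pipeline described in the abstract (kernel mapping, CCA on the kernel matrices, threshold-based selection of canonical vectors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one common regularized kernel CCA formulation, a "keep components whose correlation is at least the threshold" selection rule, and synthetic stand-ins for the SIFT/GIST and TF features. The names `gaussian_kernel`, `center_kernel`, `kcca_with_threshold`, `reg`, and `threshold` are hypothetical.

```python
import numpy as np


def gaussian_kernel(X, sigma=1.0):
    """Gram matrix with K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))


def center_kernel(K):
    """Center a Gram matrix in feature space: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    return H @ K @ H


def kcca_with_threshold(K_img, K_txt, reg=0.1, threshold=0.5):
    """Regularized kernel CCA that keeps only sufficiently correlated components.

    Assumed formulation: maximize a^T Kx Ky b subject to
    a^T (Kx + reg*I)^2 a = 1 and b^T (Ky + reg*I)^2 b = 1.
    The selection rule (correlation >= threshold) is also an assumption.
    """
    n = K_img.shape[0]
    Kx, Ky = center_kernel(K_img), center_kernel(K_txt)
    Rx = np.linalg.inv(Kx + reg * np.eye(n))
    Ry = np.linalg.inv(Ky + reg * np.eye(n))
    # Singular values of T are the regularized canonical correlations (descending).
    T = Rx @ Kx @ Ky @ Ry
    U, rho, Vt = np.linalg.svd(T)
    keep = rho >= threshold
    alpha = Rx @ U[:, keep]      # dual coefficients for the image view
    beta = Ry @ Vt.T[:, keep]    # dual coefficients for the text view
    return alpha, beta, rho[keep]


# Synthetic paired data standing in for image and text features.
rng = np.random.default_rng(0)
img_feats = rng.normal(size=(40, 5))
txt_feats = img_feats @ rng.normal(size=(5, 4)) + 0.1 * rng.normal(size=(40, 4))

K_img = gaussian_kernel(img_feats, sigma=2.0)
K_txt = gaussian_kernel(txt_feats, sigma=2.0)
alpha, beta, rho = kcca_with_threshold(K_img, K_txt, reg=0.1, threshold=0.5)

# Project both views into the shared space spanned by the kept components.
img_embed = center_kernel(K_img) @ alpha
txt_embed = center_kernel(K_txt) @ beta
```

Raising `threshold` keeps fewer, more strongly correlated components, which is the noise-suppression effect the abstract attributes to the semantic correlation threshold.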

Key words: canonical correlated vectors, cross-media retrieval, kernel canonical correlation analysis, semantic correlation threshold, Gaussian kernel

CLC Number: TP391

Guangli LI, Bin LIU, Tao ZHU, Yi YIN, Hongbin ZHANG. Cross-media retrieval model based on choosing key canonical correlated vectors[J]. Journal of Shandong University (Engineering Science), 2018, 48(5): 38-46.

Unit: %
Model	GIST-TF MAP (I_R_T)	GIST-TF MAP (T_R_I)	GIST-TF average	SIFT-TF MAP (I_R_T)	SIFT-TF MAP (T_R_I)	SIFT-TF average	Model average
CCA	33.93	17.34	25.64	20.61	37.06	28.84	27.24
OKCCA+linear kernel	28.76	32.19	30.48	28.64	25.22	26.93	28.71
OKCCA+gauss kernel	18.27	29.34	23.81	35.46	31.59	33.53	28.67
OKCCA+poly kernel	18.99	18.01	18.50	31.45	22.98	27.22	22.86
MKCCA+linear kernel	30.50	33.37	31.94	27.78	26.39	27.09	29.52
MKCCA+gauss kernel	20.29	29.09	24.69	38.95	30.08	34.52	29.61
MKCCA+poly kernel	18.98	17.89	18.44	30.23	21.47	25.85	22.15
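The tables report mean average precision (MAP). As a reminder of how that metric is usually computed, here is a minimal sketch. It assumes binary relevance labels listed in ranked order for each query; the source does not state the exact evaluation protocol (rank cutoff, relevance definition), so those details are assumptions, and the function names are hypothetical.

```python
import numpy as np


def average_precision(ranked_relevance):
    """AP for one query: mean of precision@k over the ranks k of relevant items."""
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0  # no relevant item retrieved for this query
    hits = np.cumsum(rel)
    ranks = np.arange(1, len(rel) + 1)
    precision_at_k = hits / ranks
    return float((precision_at_k * rel).sum() / rel.sum())


def mean_average_precision(relevance_lists):
    """MAP: average of the per-query AP values."""
    return float(np.mean([average_precision(r) for r in relevance_lists]))


# Two toy queries: 1 marks a relevant result, 0 an irrelevant one.
queries = [[1, 0, 1, 0], [0, 1]]
map_value = mean_average_precision(queries)
```

For the first toy query the relevant items sit at ranks 1 and 3, so its AP is (1/1 + 2/3) / 2; the second query has its only relevant item at rank 2, so its AP is 1/2.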

Unit: %
Semantic distance	GIST-TF_SCM MAP (I_R_T)	GIST-TF_SCM MAP (T_R_I)	GIST-TF_SCM average	SIFT-TF_SCM MAP (I_R_T)	SIFT-TF_SCM MAP (T_R_I)	SIFT-TF_SCM average	Semantic distance average
KL	24.36	21.46	22.91	31.01	24.33	27.67	25.29
JS	25.96	23.45	24.71	34.77	27.38	31.08	27.89
L1	26.24	23.62	24.93	34.69	27.55	31.12	28.03
L2	27.22	23.67	25.45	35.89	28.19	32.04	28.74
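The four measures compared in the table above (KL, JS, L1, L2) can be sketched as follows. This is an illustration only: it assumes the compared vectors are discrete probability distributions (for example class-posterior vectors), uses plain rather than symmetrized KL divergence, and adds a small smoothing constant `eps` to avoid taking the logarithm of zero. None of these choices is confirmed by the source.

```python
import numpy as np


def _as_distribution(p, eps=1e-12):
    """Smooth and renormalize so that the vector sums to 1 with no zeros."""
    p = np.asarray(p, dtype=float) + eps
    return p / p.sum()


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q), natural logarithm."""
    p, q = _as_distribution(p), _as_distribution(q)
    return float(np.sum(p * np.log(p / q)))


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by log(2)."""
    p, q = _as_distribution(p), _as_distribution(q)
    m = 0.5 * (p + q)
    return 0.5 * float(np.sum(p * np.log(p / m))) + 0.5 * float(np.sum(q * np.log(q / m)))


def l1_distance(p, q):
    """Sum of absolute differences between the two distributions."""
    return float(np.abs(_as_distribution(p) - _as_distribution(q)).sum())


def l2_distance(p, q):
    """Euclidean distance between the two distributions."""
    return float(np.linalg.norm(_as_distribution(p) - _as_distribution(q)))


# Toy posteriors over three classes for an image and a text.
img_posterior = [0.7, 0.2, 0.1]
txt_posterior = [0.6, 0.3, 0.1]
distances = {
    "KL": kl_divergence(img_posterior, txt_posterior),
    "JS": js_divergence(img_posterior, txt_posterior),
    "L1": l1_distance(img_posterior, txt_posterior),
    "L2": l2_distance(img_posterior, txt_posterior),
}
```

In a retrieval setting, candidates would be ranked by increasing distance to the query under the chosen measure.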

Unit: %
Task and feature combination	CCA MAP	OKCCA MAP	MKCCA MAP	SCM MAP	Feature average
T_R_I with SIFT-TF	37.06	31.59	30.08	28.19	31.73
T_R_I with GIST-TF	17.34	32.19	33.37	23.67	26.64
I_R_T with SIFT-TF	20.61	35.46	38.95	35.89	32.73
I_R_T with GIST-TF	33.93	28.76	30.50	27.22	30.10
Average	27.24	32.00	33.23	28.74
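The averages in the table above, and the gains quoted in the abstract, can be re-derived from the table values. The sketch below copies the numbers from the table. Its reading that the 3.06 and 1.18 figures correspond to the "I_R_T with SIFT-TF" and "T_R_I with GIST-TF" rows is an inference from the arithmetic, not something the source states, and `mkcca_gain` is a hypothetical helper.

```python
# MAP values (%) copied from the table above; column order follows its header.
methods = ["CCA", "OKCCA", "MKCCA", "SCM"]
table = {
    "T_R_I with SIFT-TF": [37.06, 31.59, 30.08, 28.19],
    "T_R_I with GIST-TF": [17.34, 32.19, 33.37, 23.67],
    "I_R_T with SIFT-TF": [20.61, 35.46, 38.95, 35.89],
    "I_R_T with GIST-TF": [33.93, 28.76, 30.50, 27.22],
}

# Per-method mean over the four rows (the "Average" row of the table).
column_means = {
    m: sum(row[i] for row in table.values()) / len(table)
    for i, m in enumerate(methods)
}

# Per-row mean over the four methods (the "Feature average" column).
row_means = {name: sum(row) / len(row) for name, row in table.items()}


def mkcca_gain(row_name):
    """MKCCA MAP minus the best MAP among the other methods in that row."""
    row = dict(zip(methods, table[row_name]))
    best_other = max(v for m, v in row.items() if m != "MKCCA")
    return row["MKCCA"] - best_other


gain_i_r_t = mkcca_gain("I_R_T with SIFT-TF")   # 38.95 - 35.89
gain_t_r_i = mkcca_gain("T_R_I with GIST-TF")   # 33.37 - 32.19
```

Note that for "T_R_I with SIFT-TF" the same helper returns a negative value, since plain CCA scores highest in that row.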
