Journal of Shandong University(Engineering Science) ›› 2020, Vol. 50 ›› Issue (4): 28-34.doi: 10.6040/j.issn.1672-3961.0.2019.454
LIAO Nanxing1, ZHOU Shibin1*, ZHANG Guopeng1, CHENG Deqiang2
CLC Number:
[1] KARPATHY A, LI F F. Deep visual-semantic alignments for generating image descriptions[C] // Proceedings of the IEEE conference on computer vision and pattern recognition. Boston, USA: IEEE, 2015: 3128-3137. [2] VINYALS O, TOSHEV A, BENGIO S, et al. Show and tell: a neural image caption generator[C] // Proceedings of the IEEE conference on computer vision and pattern recognition. Boston, USA: IEEE, 2015: 3156-3164. [3] XU K, BA J, KIROS R, et al. Show, attend and tell: Neural image caption generation with visual attention [C] // Proceedings of the International conference on machine learning. Lille, France: JMLR, 2015: 2048-2057. [4] MAO J, XU W,YANG Y, et al. Deep captioning with multimodal recurrent neural networks(m-rnn)[C] // Proceedings of the International Conference on Learning Representations. San Diego, USA: ICLR, 2014: 13-29. [5] ZHOU B, KHOSLA A, LAPEDRIZA A, et al. Learning deep features for discriminative localization[C] // Proceedings of the IEEE conference on computer vision and pattern recognition. Las Vegas, USA: IEEE, 2016: 2921-2929. [6] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-cam:explanations from deep networks via gradient-based localization[C] // Proceedings of the IEEE International Conference on Computer Vision. Honolulu, USA: IEEE, 2017: 618-626. [7] LIN M, CHEN Q, YAN S. Network in network[C] // Proceedings of the International Conference on Learning Representations. Banff, Canada: ICLR, 2013: 284-294. [8] MNIH V, HEESS N, GRAVES A. Recurrent models of visual attention[C] // Proceedings of the Advances in Neural Information Processing Systems. Montreal, Canada: NIPS, 2014: 2204-2212. [9] BAHDANAU D, CHOROWSKI J, SERDYUK D, et al. End-to-end attention-based large vocabulary speech recognition[C] // Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing(ICASSP). Shanghai, China: IEEE, 2016: 4945-4949. [10] LU J, XIONG C, PARIKH D, et al. Knowing when to look: adaptive attention via a visual sentinel for image captioning[C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017: 375-383. [11] YANG Z, ZHANG Y-J, UR REHMAN S, et al. Image captioning with object detection and localization[C] // Proceedings of the International Conference on Image and Graphics. Singapore: Springer, 2017: 109-118. [12] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 770-778. [13] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C] // Proceedings of the International Conferenceon Learning Representations.[S.l.] : ICLR, 2014: 4-11. [14] ANDERSON P, HE X, BUEHLER C, et al. Bottom-up and top-down attention for image captioning and visual question answering[C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018: 6077-6086. [15] PAPINENI K, ROUKOS S, WARD T, et al. BLEU: a method for automatic evaluation of machine translation[C] // Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Philadelphia, USA: ACL, 2002: 311-318. [16] LIN C-Y. Rouge: a package for automatic evaluation of summaries[C] // Proceedings of the Text Summarization Branches out. Barcelona, Spain: ACL, 2004: 74-81. [17] LAVIE A, AGARWAL A. METEOR: an automatic metric for MT evaluation with high levels of correlation with human judgments[C] // Proceedings of the Second Workshop on Statistical Machine Translation. Prague, Czech Republic: ACL, 2007: 228-231. [18] VEDANTAM R, LAWRENCE ZITNICK C, PARIKH D. Cider: consensus-based imagedescription evaluation[C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 4566-4575. [19] DONAHUE J, ANNE HENDRICKS L, GUADARRA-MA S, et al. Long-term recurrent convolutional networks for visual recognition and description[C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 2625-2634. [20] KIROS R, SALAKHUTDINOV R, ZEMEL R. Multimodal Neural Language Models[C] // Proceedings of the Machine Learning Research. Bejing, China: PMLR, 2014: 595-603. |
[1] | CAI Guoyong, HE Xinhao, CHU Yangyang. Visual sentiment analysis based on spatial attention mechanism and convolutional neural network [J]. Journal of Shandong University(Engineering Science), 2020, 50(4): 8-13. |
[2] | Shiqi SONG,Yan PIAO,Zexin JIANG. Vehicle classification and tracking for complex scenes based on improved YOLOv3 [J]. Journal of Shandong University(Engineering Science), 2020, 50(2): 27-33. |
[3] | Zhifu CHANG,Fengyu ZHOU,Yugang WANG,Dongdong SHEN,Yang ZHAO. A survey of image captioning methods based on deep learning [J]. Journal of Shandong University(Engineering Science), 2019, 49(6): 25-35. |
[4] | Xiaoxiong HOU,Xinzheng XU,Jiong ZHU,Yanyan GUO. Computer aided diagnosis method for breast cancer based on AlexNet and ensemble classifiers [J]. Journal of Shandong University(Engineering Science), 2019, 49(2): 74-79. |
[5] | Fang GUO,Lei CHEN,Ziwen YANG. Real-time traffic prediction based on MGU for large-scale IP backbone networks [J]. Journal of Shandong University(Engineering Science), 2019, 49(2): 88-95. |
[6] | Mengmeng LIANG,Tao ZHOU,Yong XIA,Feifei ZHANG,Jian YANG. Lung tumor images recognition based on PSO-ConvK convolutional neural network [J]. Journal of Shandong University(Engineering Science), 2018, 48(5): 77-84. |
[7] | Pu ZHANG,Chang LIU,Yong WANG. Suggestion sentence classification model based on feature fusion and ensemble learning [J]. Journal of Shandong University(Engineering Science), 2018, 48(5): 47-54. |
[8] | XIE Zhifeng, WU Jiaping, MA Lizhuang. Chinese financial news classification method based on convolutional neural network [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2018, 48(3): 34-39. |
[9] | LI Yuxin, PU Yuanyuan, XU Dan, QIAN Wenhua, LIU Hejuan. Image aesthetic quality evaluation based on embedded fine-tune deep CNN [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2018, 48(3): 60-66. |
[10] | HE Zhengyi, ZENG Xianhua, GUO Jiang. An ensemble method with convolutional neural network and deep belief network for gait recognition and simulation [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2018, 48(3): 88-95. |
[11] | ZHAO Yanxia, WANG Xizhao. Multipurpose zero watermarking algorithm for color image based on SVD and DCNN [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2018, 48(3): 25-33. |
[12] | XU Shan-shan, LIU Ying-an*, XU Sheng. Wood defects recognition based on the convolutional neural network [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2013, 43(2): 23-28. |
|