Journal of Shandong University (Engineering Science), 2020, Vol. 50, Issue 4: 28-34. DOI: 10.6040/j.issn.1672-3961.0.2019.454
LIAO Nanxing1, ZHOU Shibin1*, ZHANG Guopeng1, CHENG Deqiang2