Journal of Shandong University (Engineering Science), 2021, Vol. 51, Issue (2): 9-18. doi: 10.6040/j.issn.1672-3961.0.2020.227
Junsan ZHANG, Qiaoqiao CHENG, Yao WAN, Jie ZHU, Shidong ZHANG
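Abstract:
Generating medical imaging reports from image understanding is more challenging than conventional image understanding tasks. For this task, a medical image report generative adversarial network (MIRGAN) model is proposed. A co-attention mechanism jointly processes the visual and semantic features of multiple feature regions and generates a description for each of these regions. Generative adversarial network (GAN) and reinforcement learning (RL) methods are combined to optimize the generative model so that it outputs higher-quality reports. Experimental results verify the effectiveness of the MIRGAN model.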
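The abstract does not spell out how the GAN and RL components are combined. One common scheme, used in SeqGAN-style training, treats the discriminator's score for a sampled sequence as the reward in a policy-gradient (REINFORCE) update of the generator. The sketch below illustrates only that general idea on a toy problem and is not the MIRGAN implementation. The names `toy_discriminator` and `train_generator`, the position-wise softmax policy, and the fixed `reference` sequence are all hypothetical stand-ins chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one index from a categorical distribution."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def toy_discriminator(tokens, reference):
    """Hypothetical stand-in for a learned discriminator.

    Returns a score in [0, 1]: the fraction of positions that match
    a fixed 'real' sequence. A real discriminator would be a trained
    classifier, not a lookup like this.
    """
    return sum(t == r for t, r in zip(tokens, reference)) / len(reference)


def train_generator(reference, vocab_size, steps=2000, lr=0.5, seed=0):
    """REINFORCE with the discriminator score as the sequence reward."""
    rng = random.Random(seed)
    # Assumption: one independent softmax policy per output position.
    logits = [[0.0] * vocab_size for _ in reference]
    baseline = 0.0  # moving-average baseline to reduce variance
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        tokens = [sample(p, rng) for p in probs]
        reward = toy_discriminator(tokens, reference)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        for pos, tok in enumerate(tokens):
            for v in range(vocab_size):
                # Gradient of log softmax w.r.t. logit v: 1[v == tok] - p_v
                grad = (1.0 if v == tok else 0.0) - probs[pos][v]
                logits[pos][v] += lr * advantage * grad
    return [softmax(row) for row in logits]


reference = [2, 0, 3, 1]
final_probs = train_generator(reference, vocab_size=4)
for pos, p in enumerate(final_probs):
    print(pos, [round(x, 3) for x in p])
```

In a full adversarial setup the discriminator would be retrained alternately with the generator, and the reward would be computed on generated report text conditioned on the image; both steps are omitted here to keep the sketch short.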