MIRGAN: 一种基于GAN的医学影像报告生成模型

doi:10.6040/j.issn.1672-3961.0.2020.227

Abstract

Abstract:

The medical image report generation task based on image understanding became a widely concerned issue. Compared with the traditional image understanding task, medical image report generation was a more challenging task. We proposed a medical image report generative adversarial network (MIRGAN) model for this task. A co-attention mechanism was adopted to synthesize the visual and semantic features of multiple feature areas and generate descriptions corresponding to these areas. Combining the generative adversarial networks (GAN) and reinforcement learning (RL) optimized the performance of the generative model to output higher quality reports. The experiment results demonstrated the effectiveness of our proposed MIRGAN model.

Key words: image understanding task, medical image report generation, co-attention mechanism, generative adversarial network, reinforcement learning

CLC Number:

TP391

Junsan ZHANG,Qiaoqiao CHENG,Yao WAN,Jie ZHU,Shidong ZHANG. MIRGAN: a medical image report generation model based on GAN[J].Journal of Shandong University(Engineering Science), 2021, 51(2): 9-18.

Figures/Tables 8

Fig.1

Fig.2

Fig.3

Table 1

Table 2

Fig.4

Table 3

Fig.5

References 29

1	VINYALS O, TOSHEV A, BENGIO S, et al. Show and tell: a neural image caption generator[C]//Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 3156-3164.
2	JING Baoyu, XIE Pengtao, XING ERIC. On the automatic generation of medical imaging reports[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia: ACL, 2018.
3	GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]//Proceedings of the Advances in Neural Information Processing Systems. Montreal, Canada: MIT Press, 2014.
4	SUTTON R S , BARTO A G . Introduction to reinforcement learning[M]. Cambridge, UK: MIT Press, 1998.
5	DENTON E L, CHINTALA S, FERGUS R. Deep generative image models using a laplacian pyramid of adversarial networks[C]//Proceedings of the Advances in Neural Information Processing Systems. Montreal, Canada: MIT Press, 2015.
6	LI Changliang, SU Yixin, LIU Wenju. Text-to-text generative adversarial networks[C]//2018 International Joint Conference on Neural Networks (IJCNN). Rio de Janeiro, Brazil: IEEE, 2018: 1-7.
7	CHEN L, ZHANG H, XIAO J, et al. SCA-CNN: spatial and shannel-wise attention in convolutional networks for image captioning[C]//Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016.
8	XU K, BA J, KIROS R, et al. Show, attend and tell: neural image caption generation with visual attention[C]//Proceedings of the International Conference on Machine Learning. Lille, France: ACM, 2015: 2048-2057.
9	YOU Quanzeng, JIN Hailin, WANG Zhaowen, et al. Image captioning with semantic attention[C]//Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 4651-4659.
10	LU Jiasen, XIONG Caiming, PARIKH DEVI, et al. Knowing when to look: adaptive attention via a visual sentinel for image captioning[C]//Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017: 375-383.
11	WANG Xiaosong, PENG Yifan, LU Zhiyong, et al. Tienet: Text-image embedding network for common thorax disease classification and reporting in chest x-rays[C]//Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018: 9049-9058.
12	LI Y, LIANG X, HU Z, et al. Hybrid retrieval-generation reinforced agent for medical image report generation[C]// Advances in Neural Information Processing Systems. Montreal, Canada: MIT Press, 2018: 1530-1540.
13	KISILEV P , WALACH E , BARKAN E , et al. From medical image to automatic medical report generation[J]. IBM Journal of Research and Development, 2015, 59 (2/3): 2:1- 2:7.
14	SHIN H C, ROBERTS K, LU Le, et al. Learning to read chest x-rays: recurrent neural cascade model for automated image annotation[C]//Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 2497-2506.
15	ZHANG Y, GAN Z, LAWRENCE C. Generating Text via Adversarial Training[C]//Advances in Neural Information Processing Systems. Barcelona, Spain: MIT Press, 2016.
16	BACHMAN P, PRECUP D. Data generation as sequential decision making[C]//Advances in Neural Information Processing Systems. Montreal, Canada: MIT Press, 2015: 3249-3257.
17	SUTTON R S, MCALLESTER D A, SINGH S P, et al. Policy gradient methods for reinforcement learning with function approximation[C]//Advances in Neural Information Processing Systems. Denver, USA: MIT Press, 2000.
18	YU Lantao, ZHANG Weinan, WANG Jun, et al. Seqgan: sequence generative adversarial nets with policy gradient[C]//AAAI Conference on Artificial Intelligence. San Francisco, USA: AAAI, 2017.
19	KRAUSE J, JOHNSON J, KRISHNA R, et al. A hierarchical approach for generating descriptive image paragraphs[C]//Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017: 317-325.
20	VESELY K, GHOSHAL A, BURGET L, et al. Sequence-discriminative training of deep neural networks[C]//Interspeech. Lyon, France: IEEE, 2013: 2345-2349.
21	KIM Y. Convolutional neural networks for sentence classification[C]//Empirical Methods in Natural Language Processing. Doha, Qatar: ACL, 2014: 1746-1751.
22	LAI Siwei, XU Liheng, LIU Kang, et al. Recurrent convolutional neural networks for text classification[C]//29th AAAI Conference on Artificial Intelligence. Austin Texas, USA: AAAI, 2015.
23	DEMNER-FUSHMAN D , KOHLI M D , ROSENMAN M B , et al. Preparing a collection of radiology examinations for distribution and retrieval[J]. Journal of the American Medical Informatics Association, 2016, 23 (2): 304- 310.
24	HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]//Computer Visual on and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 770-778.
25	DONAHUE J, ANNE HENDRICKS L, GUADARRAMA S, et al. Long-term recurrent convolutional networks for visual recognition and description[C]//Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015.
26	PAPINENI K, ROUKOS S, WARD T, et al. BLEU: a method for automatic evaluation of machine translation[C]// Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Grenoble, France: ACL, 2002.
27	DENKOWSKI M, LAVIE A. Meteor universal: language specific translation evaluation for any target language[C]// Proceedings of the 9th Workshop on Statistical Machine Translation. Baltimore, USA: ACL, 2014: 376-380.
28	CHIN-YEW L. Rouge: a package for automatic evaluation of summaries[C]//Proceedings of the 42th Annual Meeting on Association for Computational Linguistics. Barcelona, Spain: ACL, 2004: 74-81.
29	VEDANTAM R, LAWRENCE Z C, PARIKH D. Cider: consensus-based image description evaluation[C]//Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 4566-4575.

Metrics

Viewed

Full text

585

HTML			PDF

Just accepted	Online first	Issue	Just accepted	Online first	Issue
0	0	56	0	0	529

From	Others	local

Times	91	494
Rate	16%	84%

Abstract

1534

Just accepted	Online first	Issue

0	0	1534

From	Others	local

Times	1532	2
Rate	100%	0%

Cited

Web of Science	Crossref	ScienceDirect	Search for Citations in Google Scholar >>


This page requires you have already subscribed to WoS.

Shared

Discussed

Comments

Recommended 10

[1]	WANG Li-ju,HUANG Qi-cheng,WANG Zhao-xu . [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2006, 36(6): 51 -56 .
[2]	LIU Zhongguo,ZHANG Xiaojing,LIU Boqiang,LIU Changchun, . The development of ultrasonic characterization of the biological tissue elasticity[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2006, 36(3): 34 -38 .
[3]	JI Tao,GAO Xu/sup>,SUN Tong-jing,XUE Yong-duan/sup>,XU Bing-yin/sup> . Characteristic analysis of fault generated traveling waves in 10 Kv automatic blocking and continuous power transmission lines[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2006, 36(2): 111 -116 .
[4]	SUN Cong-zheng,GUAN Cong-sheng,QIN Jing-yu,CHENG Chuan . The structure and performances of the electroless Ni-P alloy coating on aluminum alloy[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2007, 37(5): 108 -112 .
[5]	XU Li-li,JI Zhong,XIA Ji-mei . The optimum algorithm for the container loading problem with homogeneous cargoes[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2008, 38(3): 14 -17 .
[6]	DIAO Yong, TIAN Si-Meng, CAO Zhe-Meng. Geological work method for the construction of the Yichang Wanzhou Railway tunnel in high risk karst areas[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2009, 39(5): 91 -95 .
[7]	GAO Hou-Lei, TIAN Jia, DU Jiang, WU Zhi-Gang, LIU Chu-Min. Distributed generation—new technology in energy development[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2009, 39(5): 106 -110 .
[8]	YUE Yuan-Zheng. Relaxation in glasses far from equilibrium[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2009, 39(5): 1 -20 .
[9]	CHEN Huaxin, CHEN Shuanfa, WANG Binggang. The aging behavior and mechanism of base asphalts[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2009, 39(2): 125 -130 .
[10]	XU Xiaodan, DUAN Zhengjie, CHEN Zhongyu. The sentiment mining method based on extended sentiment dictionary and integrated features[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2014, 44(6): 15 -18 .

内容	描述
数据数量	7470
“impression”部分	总体诊断
“findings”部分	局部诊断
“tags”部分	关键字

数据	数量
标签	351
字典大小	2195
训练集^#	6470
测试集^#	500
验证集^#	500

模型	BLEU-1	BLEU-2	BLEU-3	BLEU-4	METEOR	ROUGE	CIDEr
CNN-RNN^[1]	0.316	0.211	0.140	0.095	0.159	0.267	0.111
LRCN^[25]	0.369	0.229	0.149	0.099	0.155	0.278	0.190
AdtAtt^[10]	0.369	0.226	0.151	0.108	0.171	0.323	0.155
CoAtt^[2]	0.382	0.248	0.176	0.125	0.184	0.300	0.287
MIRGAN-visual attention^[8]	0.381	0.246	0.171	0.119	0.190	0.327	0.304
MIRGAN-semantic attention^[10]	0.372	0.232	0.169	0.120	0.187	0.311	0.298
MIRGAN	0.401	0.257	0.178	0.125	0.194	0.336	0.313

MIRGAN: a medical image report generation model based on GAN

RichHTML

PDF (PC)

Abstract

Cite this article

share this article

Figures/Tables 8

References 29

Related Articles 5

Metrics

Comments

Recommended 10

[1]	ZHENG Zijun, FENG Xiang, YU Huiqun, LI Xiuquan. Dynamic prediction of spatiotemporal big data based on relationship transfer and reinforcement learning [J]. Journal of Shandong University(Engineering Science), 2021, 51(2): 105-114.
[2]	ZHANG Yuefang, DENG Hongxia, HU Chunxiang, QIAN Guanyu, LI Haifang. Hippocampal segmentation combining residual attention mechanism and generative adversarial networks [J]. Journal of Shandong University(Engineering Science), 2020, 50(6): 76-81.
[3]	Chunyang LI,Nan LI,Tao FENG,Zhuhe WANG,Jingkai MA. Abnormal sound detection of washing machines based on deep learning [J]. Journal of Shandong University(Engineering Science), 2020, 50(2): 108-117.
[4]	Zhifu CHANG,Fengyu ZHOU,Yugang WANG,Dongdong SHEN,Yang ZHAO. A survey of image captioning methods based on deep learning [J]. Journal of Shandong University(Engineering Science), 2019, 49(6): 25-35.
[5]	SHEN Jing, LIU Hai-bo, ZHANG Ru-bo, WU Yan-xia, CHENG Xiao-bei. Multi-robot hierarchical reinforcement learning based on semi-Markov games [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2010, 40(4): 1-7.