
Journal of Shandong University (Engineering Science) ›› 2025, Vol. 55 ›› Issue (4): 18-28. doi: 10.6040/j.issn.1672-3961.0.2024.047

• Special Topic on Deep Learning and Vision •


Face image inpainting based on texture and structure interaction

ZHOU Zunfu1,2, ZHANG Qian3*, SHI Jiliang1,2, YUE Shiqin4   

  1. College of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025, Guizhou, China;
    2. Key Laboratory of Pattern Recognition and Intelligent System, Guizhou Minzu University, Guiyang 550025, Guizhou, China;
    3. Academic Affairs Office, Guizhou Minzu University, Guiyang 550025, Guizhou, China;
    4. College of Automotive Engineering, Wuhan University of Technology, Wuhan 430070, Hubei, China
  • Published: 2025-08-31
  • About the authors: ZHOU Zunfu (1996— ), male, born in Bijie, Guizhou, is a master's student whose main research interest is computer vision. E-mail: zzf08100429@163.com. *Corresponding author: ZHANG Qian (1984— ), male, born in Guiding, Guizhou, is a Ph.D., professor, and master's supervisor whose main research interest is computer vision. E-mail: gzmuzq@gzmu.edu.cn
  • Supported by:
    the National Natural Science Foundation of China (62062024), the University-Level Research Project of Guizhou Minzu University (GZMUZK[2021]YB23), and the Natural Science Research Project of the Department of Education of Guizhou Province (黔教技[2022]015号)



Abstract: To address the loss of contextual semantic information when learning-based face image inpainting methods extracted deep features, a generator with an efficient normalization-based attention mechanism was proposed, which extracted deep features from face images more effectively and better aggregated low-level and high-level features at multiple scales. To enhance the consistency of the generated images, a bi-level gated feature fusion module with a residual main-path transformation was introduced, which further fused the decoded texture and structure information, and an enhanced contextual feature aggregation block was designed, in which an improved prompt generation block enabled prompt parameters to interact across features at multiple scales, guiding the inpainting network to adjust dynamically and generate realistic, plausible face images. Experimental results showed that, on the CelebA-HQ dataset, the proposed method achieved 37.74 dB, 0.983 0, 0.24%, and 1.489 in terms of peak signal-to-noise ratio (RPSN), structural similarity (SSIM), mean absolute error (EMA), and Fréchet inception distance (DFI); on the LFW dataset, it achieved 39.19 dB, 0.987 7, 0.21%, and 3.555, respectively. Compared with five other mainstream methods, the proposed method achieved highly competitive results. Qualitative and quantitative experiments demonstrated that the proposed method could effectively restore corrupted facial structure and texture information.
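For reference, the two pixel-level metrics reported above have standard definitions; the following is a minimal NumPy sketch, assuming 8-bit images and using hypothetical helper names (not the authors' code):

```python
import numpy as np


def psnr(reference: np.ndarray, restored: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two 8-bit images."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)


def mae_percent(reference: np.ndarray, restored: np.ndarray, peak: float = 255.0) -> float:
    """Mean absolute error expressed as a percentage of the dynamic range."""
    err = np.mean(np.abs(reference.astype(np.float64) - restored.astype(np.float64)))
    return 100.0 * err / peak
```

SSIM and the Fréchet inception distance additionally require a structural-similarity window and a pretrained Inception network, respectively, so they are omitted from this sketch.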

Key words: face image inpainting, normalized attention module, prompt generation module, multi-scale feature fusion, two-stream discriminator

CLC number:

  • TP391.41
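To make two of the mechanisms named in the abstract concrete, the sketch below is a minimal PyTorch illustration: the channel attention follows the normalization-based attention module (NAM, Liu et al., arXiv:2111.12419) that the keyword list refers to, while the gated fusion of decoded texture and structure features is reduced to a single learned soft gate. Class and parameter names are hypothetical; this is an assumption-laden sketch, not the paper's implementation, which uses a bi-level gate with a residual main-path transformation.

```python
import torch
import torch.nn as nn


class ChannelNAM(nn.Module):
    """Channel attention in the spirit of NAM: the scale factors of a
    batch-norm layer measure per-channel variance and are reused as
    attention weights (hypothetical sketch)."""

    def __init__(self, channels: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.bn(x)
        # Normalize the BN scale factors to sum to 1 and use them as
        # per-channel importance weights.
        gamma = self.bn.weight.abs()
        x = x * (gamma / gamma.sum()).view(1, -1, 1, 1)
        return torch.sigmoid(x) * residual


class GatedTextureStructureFusion(nn.Module):
    """Single-level soft gate over two decoded streams: the gate decides,
    per pixel and per channel, how much texture versus structure to keep."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, texture: torch.Tensor, structure: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(torch.cat([texture, structure], dim=1)))
        return g * texture + (1.0 - g) * structure


if __name__ == "__main__":
    texture = torch.randn(1, 64, 32, 32)    # decoded texture features
    structure = torch.randn(1, 64, 32, 32)  # decoded structure features
    fused = GatedTextureStructureFusion(64)(ChannelNAM(64)(texture), structure)
    print(fused.shape)  # torch.Size([1, 64, 32, 32])
```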