Journal of Textile Research ›› 2024, Vol. 45 ›› Issue (05): 155-164. doi: 10.13475/j.fzxb.20230502601

• Apparel Engineering •

Occlusive clothing image segmentation based on context extraction and attention fusion

GU Meihua, HUA Wei, DONG Xiaoxiao, ZHANG Xiaodan

  1. School of Electronics and Information, Xi'an Polytechnic University, Xi'an, Shaanxi 710048, China
  • Received: 2023-05-11  Revised: 2024-01-08  Published: 2024-05-15  Online: 2024-05-31
  • Author biography: GU Meihua (1980—), female, associate professor, Ph.D. Her main research interests are image processing and analysis. E-mail: gumh2001@163.com
  • Supported by: National Natural Science Foundation of China (61901347); General Projects of the Science and Technology Department of Shaanxi Province (2022JM-146, 2024JC-YBMS-491)

Abstract:

To address the low accuracy of occluded clothing image segmentation, an instance segmentation method for occluded clothing images that combines context extraction with an attention mechanism is proposed. Taking Mask R-CNN as the base network, a context extraction module is first used to optimize the output features of ResNet, fusing multi-path features at different rates to capture contextual information from multiple receptive fields and strengthening the ability to recognize and extract feature representations of occluded clothing. A residual connection of channel attention and spatial attention is then introduced to adaptively capture the semantic inter-dependencies of occluded clothing images in the spatial and channel dimensions, reducing the probability of mis-localization and mis-identification caused by the redundant contextual relationships amplified when the context extraction module processes the feature maps. Finally, the computation principle of the object detection loss CIoU is adopted as the criterion for non-maximum suppression; by attending to both the overlapping and non-overlapping regions of the predicted and ground-truth boxes, the optimal target box of the occluded clothing is selected to the greatest extent and the predicted box fits the ground-truth box more closely. The results show that, compared with other methods, the improved method markedly alleviates mis-segmentation of clothing images with different degrees of occlusion and extracts more accurate clothing instances, and its average segmentation accuracy on occluded clothing images is 4.4% higher than that of the original model.

Objective Visual analysis of clothing has attracted increasing attention, but conventional clothing parsing methods fail to capture rich information about clothing details because of factors such as complex backgrounds and mutual occlusion between garments. Therefore, a novel clothing image instance segmentation method is proposed to effectively extract and segment multi-pose and mutually occluded target clothing in complex scenes, supporting subsequent clothing analysis, retrieval, and related tasks and better meeting the needs of personalized clothing design, retrieval, and matching.

Method The output features of ResNet were optimized by a context extraction module to enhance the recognition and extraction of feature representations of occlusive clothing. An attention mechanism with a residual connection was then introduced to adaptively capture the semantic inter-dependencies of occlusive clothing images in the spatial and channel dimensions. As the last step, the CIoU computation principle was adopted as the criterion for non-maximum suppression, taking into account both the overlapping and non-overlapping regions of the predicted box and the ground-truth box, so as to select the optimal target box covering the occlusive clothing to the fullest extent.
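To make the CIoU criterion concrete, the following is a minimal NumPy sketch of greedy non-maximum suppression scored by CIoU instead of plain IoU; the (x1, y1, x2, y2) box format, the 0.5 threshold, and the function names are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def ciou(box_a, box_b):
    """CIoU of two boxes in (x1, y1, x2, y2) format:
    CIoU = IoU - rho^2 / c^2 - alpha * v, where rho is the centre distance,
    c the diagonal of the smallest enclosing box, and v an aspect-ratio term."""
    # Plain IoU
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    iou = inter / (area_a + area_b - inter + 1e-9)

    # Normalised centre distance rho^2 / c^2
    cxa, cya = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    cxb, cyb = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    rho2 = (cxa - cxb) ** 2 + (cya - cyb) ** 2
    ex1, ey1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    ex2, ey2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9

    # Aspect-ratio consistency term
    wa, ha = box_a[2] - box_a[0], box_a[3] - box_a[1]
    wb, hb = box_b[2] - box_b[0], box_b[3] - box_b[1]
    v = (4 / np.pi ** 2) * (np.arctan(wb / (hb + 1e-9)) - np.arctan(wa / (ha + 1e-9))) ** 2
    alpha = v / (1 - iou + v + 1e-9)
    return iou - rho2 / c2 - alpha * v

def ciou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box when its CIoU with a kept box exceeds threshold."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        order = np.array([j for j in order[1:] if ciou(boxes[i], boxes[j]) <= threshold], dtype=int)
    return keep

boxes = np.array([[10, 10, 60, 60], [12, 12, 58, 64], [100, 100, 150, 150]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(ciou_nms(boxes, scores))  # [0, 2]: the second box is suppressed by the first
```

Because CIoU subtracts centre-distance and aspect-ratio penalties from IoU, overlapping boxes with clearly different centres or shapes score lower and are less likely to suppress each other, which is the behaviour the method relies on when neighbouring garments overlap.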

Results In qualitative comparison with Mask R-CNN, Mask Scoring R-CNN, and YOLACT, the proposed method showed stronger mask perception and inference ability, effectively decoupling the overlapping relationships between occluded clothing instances and producing visually more accurate segmentation. For quantitative analysis, average precision (AP) was used as the evaluation metric: the segmentation accuracy APm averaged over different IoU thresholds reached 49.3%, 3.6% higher than the original model. Comparing the segmentation accuracy of each improved model under different occlusion degrees showed that the original Mask R-CNN had the lowest accuracy at every occlusion level, whereas with the CEM, AM, and CIoU optimizations the accuracy of the improved model under minor occlusion (APL1), moderate occlusion (APL2), and severe occlusion (APL3) increased by 4.3%, 4.2%, and 4.8%, respectively, the largest gain being obtained for severely occluded clothing. Finally, the accuracy of the proposed method was compared with that of Mask R-CNN, Mask Scoring R-CNN, SOLOv1, and YOLACT. The overall accuracy of YOLACT for clothing with different degrees of occlusion was slightly lower, Mask Scoring R-CNN was slightly more accurate than Mask R-CNN, and SOLOv1 achieved accuracy similar to Mask R-CNN. The proposed method was significantly more accurate than the other methods at all occlusion degrees; APL3 for severely occlusive clothing improved the most, being 4.8% higher than Mask R-CNN and 4.2%-11.1% higher than the other models.
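The AP figures above follow the COCO-style convention of averaging precision over IoU thresholds from 0.50 to 0.95, plus the single-threshold AP50 and AP75 values. A minimal sketch of that evaluation with pycocotools is shown below, assuming COCO-format annotation and result files; the file names are placeholders, and this is not necessarily the exact evaluation script used in the paper.

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder paths: COCO-format ground-truth annotations and detection results
coco_gt = COCO("annotations/instances_val.json")
coco_dt = coco_gt.loadRes("segm_results.json")

# iouType="segm" evaluates mask AP (averaged over IoU 0.50:0.95, plus AP50/AP75)
evaluator = COCOeval(coco_gt, coco_dt, iouType="segm")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP, AP50, AP75, and size-stratified metrics
```

Occlusion-level metrics such as APL1-APL3 would then be obtained by running the same evaluation separately on ground-truth subsets grouped by occlusion degree.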

Conclusion By embedding the context extraction module, the attention mechanism module, and the CIoU computation strategy into the Mask R-CNN network, a novel clothing instance segmentation model is constructed with enhanced ability to recognize and extract clothing features. The semantic inter-dependencies between occluded clothing feature maps in the spatial and channel dimensions are captured, the segmentation accuracy for each garment is improved, and the optimal target box is predicted for each clothing instance, which improves the accuracy of the model in segmenting occlusive clothing instances. A series of comprehensive experiments demonstrates the feasibility and effectiveness of the proposed method, providing a new idea for research on clothing image instance segmentation.

Key words: image segmentation, occlusive clothing, context extraction, attention mechanism, CIoU computational principle

CLC number: TS941.2

Fig. 1  Segmentation results of occluded clothing images

Fig. 2  Structure of the improved Mask R-CNN clothing image instance segmentation model
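For orientation, the base of the architecture in Fig. 2 is a standard Mask R-CNN. The snippet below loads the stock torchvision implementation purely as a reference point; the paper's context extraction module, attention embedding, and CIoU-based NMS are additions to this kind of network and are not part of the off-the-shelf model.

```python
import torch
import torchvision

# Stock Mask R-CNN (ResNet-50 + FPN) from torchvision, used only as the
# unmodified baseline; weights=None avoids downloading pretrained parameters.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights=None)
model.eval()

# In eval mode the model takes a list of CHW image tensors and returns, per
# image, a dict with bounding boxes, labels, scores, and instance masks.
with torch.no_grad():
    prediction = model([torch.rand(3, 512, 512)])[0]
print(sorted(prediction.keys()))  # ['boxes', 'labels', 'masks', 'scores']
```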

Fig. 3  Network structure of the context extraction module
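As a rough PyTorch-style illustration of the multi-receptive-field idea behind the context extraction module in Fig. 3, the sketch below fuses parallel 3×3 convolution branches with different dilation rates back into the input feature map; the number of branches, the rates, the channel width, and the residual fusion are assumptions for illustration, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn

class ContextExtractionModule(nn.Module):
    """Illustrative multi-rate context block: parallel dilated 3x3 convolutions
    capture different receptive fields and are fused back to the input width."""

    def __init__(self, channels=256, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for r in rates
        )
        # 1x1 convolution fuses the concatenated branches back to `channels`
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)

    def forward(self, x):
        context = torch.cat([branch(x) for branch in self.branches], dim=1)
        # Residual addition keeps the original backbone feature accessible
        return x + self.fuse(context)

# Example: refine one backbone/FPN-level feature map
feat = torch.randn(1, 256, 64, 64)
print(ContextExtractionModule()(feat).shape)  # torch.Size([1, 256, 64, 64])
```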

Fig. 4  Embedding of the attention mechanism
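Fig. 4 concerns how channel and spatial attention are embedded with a residual connection. The sketch below gives one plausible CBAM-style arrangement in the spirit of reference [27], with channel attention followed by spatial attention and the result added back to the input; the serial order, the pooling choices, and the reduction ratio are assumptions rather than the paper's exact wiring.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling branch
        return torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return torch.sigmoid(self.conv(pooled))

class ResidualAttention(nn.Module):
    """Channel attention followed by spatial attention, added back to the input."""
    def __init__(self, channels):
        super().__init__()
        self.ca, self.sa = ChannelAttention(channels), SpatialAttention()

    def forward(self, x):
        refined = x * self.ca(x)
        refined = refined * self.sa(refined)
        return x + refined  # residual connection around the attention block

feat = torch.randn(1, 256, 64, 64)
print(ResidualAttention(256)(feat).shape)  # torch.Size([1, 256, 64, 64])
```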

Fig. 5  Schematic diagram of CIoU prediction box optimization

Tab. 1  Comparison of segmentation accuracy of each improved model under different IoU thresholds

CEM  AM  CIoU  APm/%  AP50m/%  AP75m/%
               45.7   75.9     51.8
               46.2   77.3     53.2
               46.0   76.9     53.4
               46.4   77.3     54.2
               47.8   79.8     56.7
               49.3   80.4     57.0

Tab. 2  Comparison of segmentation accuracy of each improved model under different occlusion degrees

CEM  AM  CIoU  APL1/%  APL2/%  APL3/%
               61.2    57.3    47.5
               63.3    59.2    49.1
               62.4    58.5    48.5
               62.8    58.5    48.4
               63.9    59.9    49.8
               65.5    61.5    52.3

Fig. 6  Loss curves of the improved models

Fig. 7  Comparison of clothing instance segmentation results of different methods

Tab. 3  Comparison of clothing instance segmentation accuracy of different models (unit: %)

Method                    AP50   AP75   APL1   APL2   APL3
YOLACT[9]                 72.9   43.1   53.8   50.1   41.2
Mask Scoring R-CNN[29]    79.3   57.2   63.7   59.1   48.1
SOLOv1[30]                79.1   54.1   61.4   56.5   46.7
Mask R-CNN[7]             78.7   53.8   61.2   57.3   47.5
Proposed method           87.8   63.1   65.5   61.5   52.3
[1] TOU Wu, WANG Xiaoyu, GAO Yakun, et al. Apparel style recognition based on improved edge detection algorithm[J]. Journal of Textile Research, 2021, 42(10): 157-162. doi: 10.13475/j.fzxb.20201205006
[2] WU Chuanbin, LIU Li, FU Xiaodong, et al. Combining significant region detection and hand sketching for garment image retrieval[J]. Journal of Textile Research, 2019, 40(7): 174-181.
[3] KURNIA R, HERYANI W. Fashion harmony of blouse color and pants/skirts for women clothing using fuzzy logic[C]// International Conference on Multimedia and Image Processing. Brunei: IEEE, 2016: 57-60.
[4] ZHAO B, WU X, PENG Q, et al. Clothing cosegmentation for shopping images with cluttered background[J]. IEEE Transactions on Multimedia, 2016, 18(6): 1111-1123.
[5] LIU F, DAI S. Normal distribution sampling convolutional neural network for fine-grained image classification[C]// Proceedings of 2019 Chinese Intelligent Systems Conference. Singapore: Springer, 2020: 645-652.
[6] HUANG Tao, LI Hua, ZHOU Gui, et al. A review of instance segmentation methods[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(4): 810-825.
[7] HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2980-2988.
[8] ZHANG Xuyi, CAO Jiale. A single-stage instance segmentation network based on contour point mask refinement[J]. Acta Optica Sinica, 2020, 40(21): 113-121.
[9] BOLYA D, ZHOU C, XIAO F, et al. YOLACT: real-time instance segmentation[C]// IEEE International Conference on Computer Vision. Seoul: IEEE, 2019: 9157-9166.
[10] BOLYA D, ZHOU C, XIAO F, et al. YOLACT++: better real-time instance segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(2): 1108-1121.
[11] LI K, MALIK J. Amodal instance segmentation[C]// Computer Vision-ECCV 2016. Amsterdam: Springer International Publishing, 2016: 677-693.
[12] SALEH K, SZÉNÁSI S, VÁMOSSY Z. Occlusion handling in generic object detection: a review[C]// 2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics. Slovakia: IEEE, 2021: 477-484.
[13] KE L, TAI Y. Deep occlusion-aware instance segmentation with overlapping bilayers[C]// IEEE Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 4019-4028.
[14] WANG P, YUILLE A. DOC: deep occlusion estimation from a single image[C]// Computer Vision-ECCV 2016. Amsterdam: Springer International Publishing, 2016: 545-561.
[15] ZHOU Y Z, ZHU Y, YE Q X, et al. Weakly supervised instance segmentation using class peak response[C]// IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 3791-3800.
[16] LI K, HARIHARAN B, MALIK J. Iterative instance segmentation[C]// IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 3659-3667.
[17] LIU Z W, LUO P, QIU S. DeepFashion: powering robust clothes recognition and retrieval with rich annotations[C]// IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 1305-1311.
[18] SRIRAM G, GANESH Babu T R, PRAVEENA R, et al. Classification of leukemia and leukemoid using VGG-16 convolutional neural network architecture[J]. Molecular & Cellular Biomechanics, 2022, 19(1): 29-40.
[19] GE Y Y, ZHANG R M, WANG X G, et al. DeepFashion2: a versatile benchmark for detection, pose estimation, segmentation and re-identification of clothing images[C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 5337-5345.
[20] JIA M L, SHI M Y, MIKHAIL S, et al. Fashionpedia: ontology, segmentation, and an attribute localization dataset[C]// Computer Vision-ECCV 2020. [s.l.]: Springer International Publishing, 2020: 316-332.
[21] GU Meihua, LIU Jie, LI Liyao, et al. Combining feature learning and attention mechanism for garment image segmentation[J]. Journal of Textile Research, 2022, 43(11): 163-171. doi: 10.13475/j.fzxb.20210901109
[22] HUA Wei, GU Meihua, LI Liyao, et al. Improved SOLOv2 algorithm for garment image segmentation[J]. Basic Sciences Journal of Textile Universities, 2021, 34(4): 74-81.
[23] WANG X L, ZHANG R F, KONG T, et al. SOLOv2: dynamic and fast instance segmentation[C]// Advances in Neural Information Processing Systems. Vancouver: NIPS Foundation, 2020: 17721-17732.
[24] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
doi: 10.1109/TPAMI.2016.2577031 pmid: 27295650
[25] YUAN Mingyang, SONG Yalin, ZHANG Chao, et al. Underwater object detection based on GA-RetinaNet[J]. Computer Systems & Applications, 2023, 32(6): 80-90.
[26] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 834-848.
[27] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 15th European Conference on Computer Vision. Munich: Springer International Publishing, 2018: 3-19.
[28] ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// AAAI Conference on Artificial Intelligence. New York: AAAI, 2020: 12993-13000.
[29] HUANG Z J, HUANG L C, GONG Y C, et al. Mask scoring R-CNN[C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 6409-6418.
[30] WANG X L, KONG T, SHEN C H, et al. SOLO: Segmenting objects by locations[C]// Computer Vision-ECCV 2020. Glasgow: Springer International Publishing, 2020: 649-665.