Submitted: 2024-05-12  Revised: 2024-06-16
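Chinese abstract (translated): Traditional neural networks for copper ore image classification are constrained by a limited receptive field and by blocked information flow across dimensions. To address these problems, an improved Swin-Transformer model based on X-ray transmission imaging is proposed. The model uses Swin-Transformer as its base framework and adds a Mixing Block to the second and third stages of the backbone network. Bidirectional interaction between local-window self-attention and depthwise convolution significantly enlarges the model's receptive field and thereby strengthens its feature representation and modeling capability. In addition, an EMA (Efficient Multi-Scale Attention) module reshapes part of the channels into the batch dimension and groups the channel dimension into multiple sub-features, so that spatial semantic features are evenly distributed within each feature group. This improves the fusion of channel and multi-scale spatial information and strengthens the representation of features in regions of interest. Experiments show that the improved model solves the problems of limited receptive field and restricted dimensional information, reaching an accuracy of 94.40% on the copper ore intelligent recognition task.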
Keywords: deep learning; X-ray imaging; ore recognition; Swin-Transformer
Abstract: To solve the problems that traditional neural networks face in copper ore image classification because of receptive field limitations and blocked dimensional information, an improved Swin-Transformer model based on X-ray transmission imaging technology is proposed. The model takes Swin-Transformer as its base framework. Mixing Blocks are added in the second and third stages of the backbone network, and the bidirectional interaction between local window self-attention and depthwise convolution significantly enlarges the model's receptive field, thus enhancing its feature representation and modeling capabilities. At the same time, the introduced EMA (Efficient Multi-Scale Attention) module reshapes some channels into the batch dimension and groups the channel dimension into multiple sub-features, so that the spatial semantic features are evenly distributed in each feature group. This improves the model's ability to fuse channel information with multi-scale spatial information and enhances the representation of features in regions of interest. Experimental verification shows that the improved model solves the problems of limited receptive field and restricted dimensional information, and achieves an accuracy of 94.40% on the copper ore intelligent recognition task.
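The channel grouping step described for the EMA module can be illustrated with a small sketch. It is not the authors' implementation: the function name `group_channels_into_batch`, the group count of 4, and the use of nested lists in place of tensors are all assumptions made for illustration. It shows only the bookkeeping of folding channel groups into the batch axis, which corresponds to reshaping a (B, C, H, W) feature tensor into (B*G, C/G, H, W).

```python
def group_channels_into_batch(x, groups):
    """Fold channel groups into the batch axis.

    x is a nested list indexed as x[batch][channel]; each x[b][c] stands
    for one feature map and is treated as opaque here. The result has
    batch * groups entries, each holding channels // groups feature maps,
    mirroring a (B, C, H, W) -> (B*G, C/G, H, W) reshape.
    """
    if not x:
        return []
    channels = len(x[0])
    if channels % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = channels // groups
    out = []
    for sample in x:
        for g in range(groups):
            # Consecutive channels stay together in the same group.
            out.append(sample[g * per_group:(g + 1) * per_group])
    return out


# Toy input (hypothetical sizes): 2 samples, 8 channels.
# Each "feature map" is just a label so the regrouping is easy to follow.
batch = [[f"b{b}c{c}" for c in range(8)] for b in range(2)]
grouped = group_channels_into_batch(batch, groups=4)
print(len(grouped), len(grouped[0]))
```

After grouping, each entry can be processed as if it were an independent sample, which is what allows attention weights to be computed per sub-feature group without adding parameters for each group.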
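Funding: National Natural Science Foundation of China (U2067202); Jiangxi Provincial Key Research and Development Program (20203BBG73069); Jiangxi Provincial Training Program for Academic and Technical Leaders of Major Disciplines (20225BCJ22004)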
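Author | Affiliation | Email |
HUANG Yongjin (黄永进)* | School of Information Engineering, East China University of Technology | 2022120124@ecut.edu.cn |
HE Jianfeng (何剑锋) | School of Information Engineering, East China University of Technology | |
LI Weidong (李卫东) | School of Information Engineering, East China University of Technology | |
XIA Fei (夏菲) | School of Information Engineering, East China University of Technology | |
WANG Shan (王杉) | School of Information Engineering, East China University of Technology | |
WANG Xueyuan (汪雪元) | School of Information Engineering, East China University of Technology | |
ZHONG Guoyun (钟国韵) | School of Information Engineering, East China University of Technology | |
QU Jinhui (瞿金辉) | School of Information Engineering, East China University of Technology | |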
Citation:
黄永进,何剑锋,李卫东,夏菲,王杉,汪雪元,钟国韵,瞿金辉.基于改进Swin-Transformer模型的铜矿X射线图像分类研究[J].有色金属(选矿部分),2024(12):112-118.
HUANG Yongjin, HE Jianfeng, LI Weidong, XIA Fei, WANG Shan, WANG Xueyuan, ZHONG Guoyun, QU Jinhui. Research on Copper Mine X-ray Image Classification Based on Improved Swin Transformer Model[J]. Nonferrous Metals (Mineral Processing Section), 2024(12): 112-118.