[1] Huang Wenxiu, Zhou Shucheng, Chen Xinyuan, et al. Image Classification Based on Image Fusion and Attention Mechanism[J]. Journal of Nanjing Normal University (Natural Science Edition), 2025, 48(03): 120-128. [doi:10.3969/j.issn.1001-4616.2025.03.014]

Image Classification Based on Image Fusion and Attention Mechanism

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Volume:
48
Issue:
2025, No. 03
Pages:
120-128
Section:
Computer Science and Technology
Publication date:
2025-06-20

Article Info

Title:
Image Classification Based on Image Fusion and Attention Mechanism
Article ID:
1001-4616(2025)03-0120-09
Author(s):
Huang Wenxiu [1], Zhou Shucheng [2], Chen Xinyuan [1,3], Zhou Zhongmei [4], Wang Rongguo [1]
(1. School of Technology, Fuzhou Technology and Business College, Fuzhou 350715, China)
(2. College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China)
(3. College of Information Technology, Universiti Kuala Lumpur, Kuala Lumpur 50250, Malaysia)
(4. School of Computer Science, Minnan Normal University, Zhangzhou 363000, China)
Keywords:
image fusion; attention mechanisms; deep learning; image classification; convolutional networks
CLC number:
TP301
DOI:
10.3969/j.issn.1001-4616.2025.03.014
Document code:
A
Abstract:
Image classification is a key task in computer vision and matters in many application scenarios. To improve the accuracy and robustness of image classification, a classification method based on image fusion and an attention mechanism is proposed. First, ResNet-152 is selected as the base model for image classification, and the public dataset is preprocessed. In the feature fusion stage, three parallel branches are used, each applying convolution kernels of a different size to extract features. An attention mechanism is then introduced after the residual network structure; it combines a constructed Gram matrix, average pooling, and max pooling to highlight the regions that are useful for classification. Extensive experiments on public image datasets show that the proposed method performs well in practice, raising classification accuracy from an initial 96.68% to 98.87%. The method is also more robust than traditional methods. This study therefore provides an effective improvement for image classification with broad application prospects.
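
The abstract names three ingredients of the attention step (a Gram matrix, average pooling, and max pooling) but does not say how they are combined. The sketch below is one possible reading, not the authors' implementation: it assumes each channel of a feature map is flattened to a list of values, that the three descriptors are simply summed, and that a sigmoid turns the sum into a per-channel weight. The function names `gram_matrix`, `channel_attention`, and `apply_attention` are hypothetical, and a real model would likely use learned layers rather than a plain sum.

```python
import math


def gram_matrix(features):
    """Gram matrix of C flattened channels: G[i][j] = <f_i, f_j> / N."""
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features):
    """Per-channel weights in (0, 1).

    Assumption (not from the paper): the Gram, average-pool and max-pool
    descriptors are summed and passed through a sigmoid, with no learned
    parameters.
    """
    gram = gram_matrix(features)
    num_channels = len(features)
    weights = []
    for i, channel in enumerate(features):
        avg_desc = sum(channel) / len(channel)   # global average pooling
        max_desc = max(channel)                  # global max pooling
        gram_desc = sum(gram[i]) / num_channels  # mean correlation with all channels
        weights.append(sigmoid(avg_desc + max_desc + gram_desc))
    return weights


def apply_attention(features):
    """Rescale every channel by its attention weight."""
    weights = channel_attention(features)
    return [[w * v for v in channel] for w, channel in zip(weights, features)]


if __name__ == "__main__":
    # Two channels of a 2x2 feature map, flattened: one active, one silent.
    feature_map = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
    print(channel_attention(feature_map))
```

In this toy input the active channel receives a weight close to 1 while the all-zero channel receives exactly 0.5, which illustrates how such a weighting can emphasize channels that respond strongly.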

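Memo

Received: 2025-01-25.
Funding: National Natural Science Foundation of China (61672159); Natural Science Foundation of Fujian Province (2022J01398); Fujian Provincial Education and Scientific Research Project for Young and Middle-aged Teachers, Science and Technology Category (JAT211024, JAT211001); Fujian Provincial Lifelong Education Quality Improvement Project (ZS22026).
Corresponding author: Zhou Zhongmei, Ph.D., professor; research interests: data mining, artificial intelligence. E-mail: 251128555@qq.com
Last Update: 2025-06-20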