[1] Gu Xueru, Zhang Beiwei, Ding Wen. Human Behavior Recognition Based on Multi-stream Adaptive Spatio-temporal Graph Convolutional Networks[J]. Journal of Nanjing Normal University (Natural Science Edition), 2025, 48(03): 112-119. [doi:10.3969/j.issn.1001-4616.2025.03.013]

Human Behavior Recognition Based on Multi-stream Adaptive Spatio-temporal Graph Convolutional Networks

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Volume:
48
Issue:
No. 03, 2025
Pages:
112-119
Section:
Computer Science and Technology
Publication date:
2025-06-20

Article Info

Title:
Human Behavior Recognition Based on Multi-stream Adaptive Spatio-temporal Graph Convolutional Networks
Article ID:
1001-4616(2025)03-0112-08
Author(s):
Gu Xueru (1), Zhang Beiwei (1), Ding Wen (2)
(1. School of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023, Jiangsu, China)
(2. School of Digitalization and Microelectronics, Shazhou Polytechnic College, Zhangjiagang 215699, Jiangsu, China)
Keywords:
action recognition; attention mechanism; graph convolutional neural network; multi-stream network
CLC number:
TP391.41; TP183
DOI:
10.3969/j.issn.1001-4616.2025.03.013
Document code:
A
Abstract:
To address the insufficient description of global relationships and the inadequate mining of spatio-temporal features in current human skeleton action recognition algorithms, this paper proposes a skeleton-based action recognition model built on a multi-stream adaptive spatio-temporal graph convolutional network (MsAST-GCN). First, an attention mechanism and a neural tensor network (NTN) are used to compute the connection strength between every pair of joints, from which a global adjacency matrix is constructed. Next, a top-K strategy dynamically selects the K neighbor nodes with the highest connection strength and updates the global adjacency matrix. A hybrid pooling model then extracts global context information and temporal key-frame features. Joint, bone, joint-motion, and bone-motion information are modeled simultaneously to strengthen how well the extracted features represent actions. Finally, experiments are conducted on the NTU-RGB+D dataset. The results show that the model performs well on human skeleton action recognition and effectively improves recognition accuracy.
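
As an illustration of two steps named in the abstract, the following pure-Python sketch shows (a) a top-K update that keeps only the K strongest connections per joint in a connection-strength matrix, and (b) how the four input streams (joint, bone, joint motion, bone motion) can be derived from raw joint coordinates. The abstract does not give the exact formulas, so everything here is an assumption: the function names, the toy 3-joint skeleton, the `parents` list, the zero-padding of the last frame, and the hand-written `strength` matrix (which in the paper would come from the attention and NTN modules) are hypothetical and are not the authors' implementation.

```python
def topk_adjacency(strength, k):
    """Keep the k largest scores in each row of `strength`; zero the rest."""
    adjacency = []
    for row in strength:
        # Rank column indices by score, strongest first (ties keep index order).
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        kept = set(ranked[:k])
        adjacency.append([s if j in kept else 0.0 for j, s in enumerate(row)])
    return adjacency


def bone_stream(joints, parents):
    """Bone vector of joint j = position of j minus position of its parent."""
    return [
        [tuple(c - p for c, p in zip(frame[j], frame[parents[j]]))
         for j in range(len(frame))]
        for frame in joints
    ]


def motion_stream(seq):
    """Frame-to-frame difference; the last frame is padded with zeros."""
    motion = []
    for t in range(len(seq)):
        if t + 1 < len(seq):
            motion.append([tuple(b - a for a, b in zip(cur, nxt))
                           for cur, nxt in zip(seq[t], seq[t + 1])])
        else:
            motion.append([tuple(0 for _ in vec) for vec in seq[t]])
    return motion


# Toy data (hypothetical): 2 frames, 3 joints, joint 0 is the root.
joints = [
    [(0, 0, 0), (1, 0, 0), (1, 2, 0)],
    [(0, 0, 0), (1, 1, 0), (2, 2, 0)],
]
parents = [0, 0, 1]  # the root is its own parent, so its bone vector is zero

bones = bone_stream(joints, parents)   # bone stream
joint_motion = motion_stream(joints)   # joint-motion stream
bone_motion = motion_stream(bones)     # bone-motion stream

# Hand-written pairwise connection strengths (stand-in for attention + NTN).
strength = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.9], [0.4, 0.4, 0.1]]
adjacency = topk_adjacency(strength, k=2)
```

In a multi-stream setup of this kind, each of the four sequences would typically be fed to its own network branch and the branch outputs fused; how MsAST-GCN fuses them is not stated in the abstract.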



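Memo

Received: 2024-10-25.
Funding: National Natural Science Foundation of China (61877061).
Corresponding author: Zhang Beiwei, Ph.D., associate professor; research interests: big data processing, image processing, pattern recognition. E-mail: zhangbeiwei@nufe.edu.cn
Last Update: 2025-06-20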