[1] Han Qianqian, Gu Huaning, Jing Pen, et al. Basketball Player Detection and Tracking Method Based X3D Feature and Semantic Fusion[J]. Journal of Nanjing Normal University (Natural Science Edition), 2025, 48(06): 101-110. [doi:10.3969/j.issn.1001-4616.2025.06.011]

Basketball Player Detection and Tracking Method Based X3D Feature and Semantic Fusion

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Volume:
48
Issue:
2025, No. 06
Pages:
101-110
Section:
Computer Science and Technology
Publication date:
2025-12-20

Article Info

Title:
Basketball Player Detection and Tracking Method Based X3D Feature and Semantic Fusion
Article ID:
1001-4616(2025)06-0101-10
Author(s):
Han Qianqian1, Gu Huaning2, Jing Pen3, Zhao Yongqiang4
(1. College of Sports and Recreation, Guangdong Ocean University, Zhanjiang 524088, China)
(2. College of Management Science, Chengdu University of Technology, Chengdu 610059, China)
(3. College of Chemistry and Chemical Engineering, Northwest Minzu University, Lanzhou 730030, China)
(4. Infrastructure Construction Department, Guangxi Normal University, Guilin 541004, China)
Keywords:
basketball video; detection and tracking; X3D network; homography stabilization; feature fusion
CLC number:
TP391
DOI:
10.3969/j.issn.1001-4616.2025.06.011
Document code:
A
Abstract:
To address the tracking difficulties caused by frequent player occlusion, similar team uniforms, and camera shake in single-camera basketball video, this paper proposes a real-time detection and tracking method that fuses X3D spatiotemporal features with multimodal semantics. First, homography-based stabilization registers the live stream to a reference plane to suppress background drift. Second, a lightweight X3D network extracts 1024-dimensional spatiotemporal descriptors from 16-frame clips under a 5 GFLOPs compute budget, capturing the key action patterns of basketball play while meeting the latency requirements of edge deployment. Finally, an attention-driven feature fusion module adaptively combines geometric displacement, appearance histograms, and X3D features. Experiments on the public NBA-SYN and UCF-Sports-Basket datasets show that the method achieves MOTA of 77.43% and 79.30%, respectively, at a real-time speed of 45.31 FPS, significantly outperforming existing approaches and providing reliable technical support for basketball video analysis under limited hardware conditions.
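The fusion step in the abstract can be illustrated with a minimal sketch. It is not the authors' module: the abstract does not specify how the attention weights are produced, so the sketch assumes they come from three logits supplied by the caller (in a trained model these would be predicted from the input). The cue functions, the dictionary keys `center`, `hist`, `feat`, and the `scale` parameter are all hypothetical names chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax; returns weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def geometric_similarity(p, q, scale=50.0):
    """Map the pixel displacement between two box centers to (0, 1].
    `scale` is an assumed decay constant, not a value from the paper."""
    return math.exp(-math.dist(p, q) / scale)

def histogram_intersection(h1, h2):
    """Overlap of two normalized appearance histograms, in [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def cosine_similarity(a, b):
    """Cosine similarity of two descriptors (e.g. X3D clip features)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def fused_similarity(track, det, attn_logits):
    """Attention-weighted sum of the three cues for one track/detection pair."""
    cues = [
        geometric_similarity(track["center"], det["center"]),
        histogram_intersection(track["hist"], det["hist"]),
        cosine_similarity(track["feat"], det["feat"]),
    ]
    weights = softmax(attn_logits)
    return sum(w * c for w, c in zip(weights, cues))

# Toy example with 3-bin histograms and 4-dimensional descriptors
# (the paper's descriptors are 1024-dimensional).
track = {"center": (100.0, 200.0), "hist": [0.5, 0.3, 0.2], "feat": [1.0, 0.0, 2.0, 1.0]}
near = {"center": (104.0, 203.0), "hist": [0.45, 0.35, 0.2], "feat": [0.9, 0.1, 2.1, 1.0]}
far = {"center": (400.0, 50.0), "hist": [0.1, 0.2, 0.7], "feat": [0.0, 2.0, 0.1, 0.0]}

logits = [0.2, 0.1, 0.7]  # assumed fixed here; learned in a real model
score_near = fused_similarity(track, near, logits)
score_far = fused_similarity(track, far, logits)
```

A tracker would typically compute such a score for every track/detection pair and pass the resulting matrix to an assignment step; because the weights are normalized, the fused score stays within the range of the individual cues.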

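Memo

Received: 2025-07-17.
Funding: National Natural Science Foundation of China (42402278); Gansu Provincial Science and Technology Program (23YFGA0073); 2023 Guangdong Provincial Education Science Planning Project, Higher Education Special Project (2023GXJK297).
Corresponding author: Han Qianqian, PhD candidate, lecturer; research interests: artificial intelligence, object recognition, sports training. E-mail: 19232022628@163.com
Last Update: 2025-12-20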