
Vehicle and Pedestrian Detection Network Based on Lightweight SSD

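Journal of Nanjing Normal University (Natural Science Edition) [ISSN:1001-4616/CN:32-1239/N]

Volume:: 42
Issue:: 2019, No. 01

Page:: 73

Column:: Special Column on Artificial Intelligence Algorithms and Applications

Publication date:: 2019-03-20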

Article Info

Title:: Vehicle and Pedestrian Detection Model Based on Lightweight SSD

Article ID:: 1001-4616(2019)01-0073-09

Author(s):: Zheng Dong¹; Li Xiangqun¹; Xu Xinzheng¹,²; (1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China) (2. Key Laboratory of Data Science and Intelligence Application, Fujian Province University, Zhangzhou 363000, China)

Keywords:: object detection; convolutional neural network; lightweight neural network; SSD; MobileNetV2

Classification number:: TP193

DOI:: 10.3969/j.issn.1001-4616.2019.01.012

Document code:: A

Abstract:: Object detection algorithms based on deep learning have developed rapidly in recent years, but deep networks are often too large to deploy widely on embedded platforms. This paper reduces the model size of SSD (Single Shot MultiBox Detector): it introduces the lightweight convolutional neural network MobileNetV2, analyzes the inverted residual and linear bottleneck structures in MobileNetV2, and compares the structure of SSD with that of its lightweight version, SSDLite. On this basis, we propose LVP-DN (Lightweight Vehicle and Pedestrian Detection Network), a vehicle and pedestrian detection model based on lightweight SSD. First, MobileNetV2 replaces VGG as the base network for feature extraction. Then, SSDLite replaces SSD to reduce the model size and speed up detection. Optimizing the aspect ratios of the default boxes further improves detection accuracy for pedestrians. Finally, on the KITTI and PASCAL VOC datasets, we compare how three factors affect network performance: the base network, the input image size, and whether a pre-trained model is used. The experimental results show that, compared with other popular object detection models, the proposed model achieves good results in accuracy, speed, and model size.
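The two levers named in the abstract, a lighter SSDLite head and adjusted default-box aspect ratios, can be illustrated with a minimal sketch. It assumes the usual definitions: SSDLite, as described in the MobileNetV2 paper, replaces the regular convolutions in the SSD prediction layers with depthwise separable ones, and an SSD default box with scale s and aspect ratio a (width/height) has width s·√a and height s/√a. The function names, the channel counts, and the 1/3 ratio below are hypothetical illustrations; the abstract does not state the layer sizes or ratios used in LVP-DN.

```python
import math


def conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weight count of a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias ignored)."""
    return k * k * c_in + c_in * c_out


def default_box_shapes(scale, aspect_ratios):
    """Return (width, height) pairs, relative to the image size, for the
    default boxes of one feature-map cell.

    Uses the SSD convention: aspect ratio = width / height,
    width = scale * sqrt(ratio), height = scale / sqrt(ratio).
    """
    return [
        (scale * math.sqrt(ratio), scale / math.sqrt(ratio))
        for ratio in aspect_ratios
    ]


if __name__ == "__main__":
    # Hypothetical 3 x 3 prediction layer with 256 input and output channels.
    standard = conv_params(3, 256, 256)
    separable = separable_conv_params(3, 256, 256)
    print(f"standard: {standard}, separable: {separable}")

    # Hypothetical ratios: tall, narrow boxes (ratio < 1) suit pedestrians.
    for w, h in default_box_shapes(0.2, [1.0, 1 / 2, 1 / 3]):
        print(f"w={w:.3f}, h={h:.3f}")
```

For a fixed scale, every box keeps the same area (s²), so lowering the aspect ratio only trades width for height. That is why adding ratios below 1 gives the detector default boxes closer to an upright pedestrian without changing the scale schedule.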

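Memo

Memo:: Received: 2018-08-19.
Funding: National Natural Science Foundation of China (61672522); Open Project of the Key Laboratory of Data Science and Intelligence Application, Fujian Province University (D1804).
Corresponding author: Xu Xinzheng, Ph.D., associate professor; research interests: machine learning and pattern recognition. E-mail: xuxinzh@163.com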

Last Update: 2019-03-30