Sun Rujun, Zhang Lufei. Dynamic Sparse Method for Deep Learning Execution[J]. Journal of Nanjing Normal University (Natural Science Edition), 2019, 42(03): 11-19. [doi:10.3969/j.issn.1001-4616.2019.03.002]





Dynamic Sparse Method for Deep Learning Execution
Sun Rujun, Zhang Lufei
State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125, Jiangsu, China
Keywords: deep learning; sparse method; resource limitation; dynamic scheduling
Artificial intelligence, represented by deep learning, is developing rapidly. Massive data, large models, heavier computation, and more complex control flow all challenge model execution in both the training and inference stages. In practice, the dynamic characteristics of resources and applications, together with changing user demands, must be met by executing models dynamically, and sparsification is a key means of dynamic model execution under resource constraints and shifting user requirements. Current mainstream sparsification techniques mostly target specific problems; they address inference far more often than training and lack mechanisms for dynamic adjustment and sparsification during the training stage. Building on a sparsifiability analysis of the basic computational units of deep learning, this paper further analyzes the sparsification capacity of different layers and different components of a model. After modeling dynamic requirements and sparsification strategies, it proposes a dynamically guided sparse execution method for deep learning models and reports basic experiments. Finally, it outlines future research from the perspective of quantitative modeling and quantitative experiments.
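The paper's own sparsification strategy is not spelled out in this abstract. As a minimal illustrative sketch of the general idea it describes (raising model sparsity when run-time resources tighten), the snippet below applies standard unstructured magnitude pruning to a weight matrix, with the target sparsity derived from a resource budget. The function names (`magnitude_prune`, `sparsity_for_budget`) and the linear budget-to-sparsity mapping are assumptions for illustration, not the method proposed in the paper.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries so that roughly a
    `sparsity` fraction of the matrix becomes zero."""
    if sparsity <= 0.0:
        return weights.copy()
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of entries to drop
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def sparsity_for_budget(budget: float) -> float:
    """Map a dynamic resource budget in [0, 1] (1 = full resources)
    to a target sparsity: tighter budgets force sparser models.
    This linear mapping is a hypothetical policy for illustration."""
    return min(max(1.0 - budget, 0.0), 0.95)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
# Resources shrink at run time -> raise the target sparsity accordingly.
w_sparse = magnitude_prune(w, sparsity_for_budget(0.5))
print(np.mean(w_sparse == 0.0))  # about half the entries become zero
```

In a dynamic setting, `sparsity_for_budget` would be re-evaluated during training as resources or user demands change, and the pruning mask recomputed at some interval rather than fixed once before inference.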




[1] Zheng Depeng, Du Jixiang, Zhai Chuanmin. Age Estimation Based on Deep Learning MPCANet[J]. Journal of Nanjing Normal University (Natural Science Edition), 2017, 40(03): 20. [doi:10.3969/j.issn.1001-4616.2017.01.004]
[2] Zhu Fan, Wang Hongyuan, Zhang Ji. A Survey of Person Re-identification Based on Deep Learning[J]. Journal of Nanjing Normal University (Natural Science Edition), 2018, 41(04): 93. [doi:10.3969/j.issn.1001-4616.2018.04.015]
[3] Zhao Wenfang, Lin Runsheng, Tang Wei, et al. Forecasting Model of Short-Term PM2.5 Concentration Based on Deep Learning[J]. Journal of Nanjing Normal University (Natural Science Edition), 2019, 42(03): 32. [doi:10.3969/j.issn.1001-4616.2019.03.005]
[4] Zhang Xinfeng, Yan Kunpeng, Zhao Xun. Handwriting Chinese Text Recognition Using BiLSTM Network[J]. Journal of Nanjing Normal University (Natural Science Edition), 2019, 42(03): 58. [doi:10.3969/j.issn.1001-4616.2019.03.008]
[5] Jia Yufu, Hu Shenghong, Liu Wenping, et al. Wild Image Enhancement with Conditional Generative Adversarial Network[J]. Journal of Nanjing Normal University (Natural Science Edition), 2019, 42(03): 88. [doi:10.3969/j.issn.1001-4616.2019.03.012]
[6] Tang Kai, He Qing, Zhao Qun, et al. Image Recognition Based on Improved Deep Neural Network[J]. Journal of Nanjing Normal University (Natural Science Edition), 2019, 42(03): 115. [doi:10.3969/j.issn.1001-4616.2019.03.015]


Received: 2019-07-05. Foundation items: National Key R&D Program of China (2016YFB1000505, 2017YFB0202001); National Natural Science Foundation of China (9143020017). Corresponding author: Sun Rujun, Ph.D. candidate, research interest: runtime environments for artificial intelligence. E-mail: sun.rujun@meac-skl.cn
Last Update: 2019-09-30