[1]OUYANG G L,LI Z F. Problems and countermeasures of dialect identification in investigative applications[J]. Journal of Shanxi Police College,2017,25(1):51-54.
[2]HOU J,LIU Y,ZHENG T F,et al. Multi-layered features with SVM for Chinese accent identification[C]//2010 International Conference on Audio,Language and Image Processing. Shanghai,2010:25-30.
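[3]PANG C,WANG X L,ZHANG J,et al. GMM-based Mandarin Chinese accent recognition with multi-feature fusion[J]. Journal of Huazhong University of Science and Technology(Natural Science Edition),2015(S1):5.
[4]YANG W,YANG J J. An exploration of automatic accent recognition based on linguistic phonological example characters[J]. Chinese Journal of Forensic Sciences,2021(2):5.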
[5]YANG S W,CHI P H,CHUANG Y S,et al. Superb:speech processing universal performance benchmark[DB/OL]. arXiv preprint arXiv:2105.01051. [2021-03-03]. https://doi.org/10.48550/arXiv.2105.01051
[6]BAI Z,ZHANG X L. Speaker recognition based on deep learning:an overview[J]. Neural networks,2021,140:65-99.
[7]SNYDER D,GARCIA-ROMERO D,SELL G,et al. X-vectors:robust DNN embeddings for speaker recognition[C]//2018 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP),Calgary,Canada:IEEE,2018:5329-5333.
[8]HAJIBABAEI M,DAI D. Unified hypersphere embedding for speaker recognition[DB/OL]. arXiv preprint arXiv:1807.08312. [2018-07-22]. https://doi.org/10.48550/arXiv.1807.08312
[9]WANG F,CHENG J,LIU W Y,et al. Additive margin softmax for face verification[J]. IEEE signal processing letters,2018,25(7):926-930.
[10]SHI X,YU F,LU Y,et al. The accented English speech recognition challenge 2020:open datasets,tracks,baselines,results and methods[C]//ICASSP 2021-2021 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP),Toronto,Canada:IEEE,2021:6918-6922.
[11]ZHANG Z,WANG Y,YANG J. Accent recognition with hybrid phonetic features[J]. Sensors,2021,21(18):6258.
[12]WANG W,ZHANG C,WU X. Deep discriminative feature learning for accent recognition[DB/OL]. arXiv preprint arXiv:2011.12461. [2020-11-25]. https://arxiv.org/pdf/2011.12461.pdf
[13]PENG Y,ZHANG J,ZHANG H,et al. Multilingual approach to joint speech and accent recognition with DNN-HMM Framework[C]//2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference(APSIPA ASC),Tokyo,Japan:IEEE,2021:1043-1048.
[14]DEHAK N,KENNY P,DEHAK R,et al. Front-end factor analysis for speaker verification[J]. IEEE transactions on audio,speech,and language processing,2011,19(4):788-798.
[15]SNYDER D,GARCIA-ROMERO D,POVEY D,et al. Deep neural network embeddings for text-independent speaker verification[C]//Interspeech,Stockholm,Sweden,2017:999-1003.
[16]PEDDINTI V,POVEY D,KHUDANPUR S. A time delay neural network architecture for efficient modeling of long temporal contexts[C]//Sixteenth Annual Conference of the International Speech Communication Association,Dresden,Germany,2015.
[17]CHUNG J S,NAGRANI A,ZISSERMAN A. Voxceleb2:deep speaker recognition[DB/OL]. arXiv preprint arXiv:1806.05622. [2018-06-14]. https://doi.org/10.21437/Interspeech.2018-1929
[18]OKABE K,KOSHINAKA T,SHINODA K. Attentive statistics pooling for deep speaker embedding[DB/OL]. arXiv preprint arXiv:1803.10963. [2018-03-29]. https://doi.org/10.21437/Interspeech.2018-993
[19]jiaaro.com. Pydub[EB/OL]. https://github.com/jiaaro/pydub.(2021-03-10)[2022-07-04].
[20]Speechbrain. Speaker Verification with xvector embeddings on Voxceleb[EB/OL]. https://huggingface.co/speechbrain/spkrec-xvect-voxceleb,(2021-05-03). [2021-07-04].
[21]HOCHREITER S,SCHMIDHUBER J. Long short-term memory[J]. Neural computation,1997,9(8):1735-1780.
[22]ZAREMBA W,SUTSKEVER I,VINYALS O. Recurrent neural network regularization[DB/OL]. arXiv preprint arXiv:1409.2329. [2014-09-08]. https://arxiv.org/pdf/1409.2329.pdf
[23]GAO Q,WU H,SUN Y,et al. An end-to-end speech accent recognition method based on hybrid CTC/attention transformer ASR[C]//ICASSP 2021-2021 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP),Toronto,Canada:IEEE,2021:7253-7257.
[24]SNYDER D,CHEN G,POVEY D. MUSAN:a music,speech,and noise corpus[DB/OL]. arXiv preprint arXiv:1510.08484. [2015-10-28]. https://doi.org/10.48550/arXiv.1510.08484
[25]RAVANELLI M,PARCOLLET T,PLANTINGA P,et al. SpeechBrain:a general-purpose speech toolkit[DB/OL]. arXiv preprint arXiv:2106.04624. [2021-06-08]. https://doi.org/10.48550/arXiv.2106.04624