Journal of Shandong University (Engineering Science), 2016, Vol. 46, Issue (2): 57-63. doi: 10.6040/j.issn.1672-3961.2.2015.147


An endpoint detection algorithm based on frequency-domain characteristics and transition fragment judgment

GUO Yu, ZHANG Erhua*, LIU Chi   

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China
  • Received: 2015-05-12 Online: 2016-04-20 Published: 2015-05-12

Abstract: To improve the accuracy of speech endpoint detection and the robustness of the detection algorithm in noisy environments, two new endpoint detection parameters were proposed. The critical-band spectrum entropy took into account both the perceptual characteristics of the human auditory system and the differences between the frequency-domain distributions of speech and noise, while the minus frequency-domain energy parameter captured the difference in frequency-domain energy between speech frames and silence frames. The advantages of the two parameters were combined into a single robust endpoint detection parameter. To avoid the misjudgments caused by a single threshold, a transition fragment judgment based on the statistics of the feature distribution was applied. Experimental results showed that the algorithm discriminated better between speech frames and silence frames and achieved higher accuracy than conventional noise-robust endpoint detection algorithms in different noisy environments and at low signal-to-noise ratios; for non-stationary noise in particular, accuracy improved by more than 5%.
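The idea behind a critical-band spectrum entropy can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the function names are hypothetical, the band edges use the Zwicker-Terhardt approximation of the Bark scale, and the entropy is the plain Shannon entropy of the per-band energy distribution. It is not the authors' exact parameter, and it does not include the minus frequency-domain energy term or the transition fragment judgment.

```python
import cmath
import math


def bark(freq_hz):
    """Critical-band rate in Bark (Zwicker-Terhardt approximation; assumed here)."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


def power_spectrum(frame):
    """One-sided power spectrum from a naive DFT (slow, but dependency-free)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def critical_band_entropy(frame, sample_rate):
    """Shannon entropy (in nats) of the frame's energy distribution over Bark bands."""
    n = len(frame)
    band_energy = {}
    for k, power in enumerate(power_spectrum(frame)):
        band = int(bark(k * sample_rate / n))  # integer part of the Bark value = band index
        band_energy[band] = band_energy.get(band, 0.0) + power
    total = sum(band_energy.values())
    if total <= 0.0:
        return 0.0  # all-zero frame: there is no distribution to measure
    probs = [e / total for e in band_energy.values() if e > 0.0]
    return -sum(p * math.log(p) for p in probs)
```

Under these assumptions, a frame whose energy sits in one critical band (for example a pure tone) yields an entropy near zero, while a broadband noise frame spreads energy over many bands and yields a larger value. A detector would compare such per-frame values against thresholds; how the paper sets and combines them is not stated in the abstract.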

Key words: transition fragment judgment, energy entropy, frequency-domain energy, critical band, endpoint detection, spectrum entropy

CLC Number: TP391.42