<?xml version="1.0" encoding="UTF-8"?>
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ali="http://www.niso.org/schemas/ali/1.0/" article-type="research-article" dtd-version="1.2" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="issn">1561-5405</journal-id>
      <journal-id journal-id-type="doi">10.24151/1561-5405</journal-id>
      <journal-id journal-id-type="publisher-id">Proceedings of Universities. Electronics</journal-id>
      <journal-title-group>
        <journal-title xml:lang="en">Scientific and technical journal "Proceedings of Universities. Electronics"</journal-title>
        <trans-title-group xml:lang="ru">
          <trans-title>Научно-технический журнал «Известия высших учебных заведений. Электроника»</trans-title>
        </trans-title-group>        
      </journal-title-group>      
      <issn publication-format="print">1561-5405</issn>
      <issn publication-format="online">2587-9960</issn>
      <publisher>
        <publisher-name xml:lang="en">National Research University of Electronic Technology</publisher-name>
        <publisher-name xml:lang="ru">Национальный исследовательский университет "Московский институт электронной техники"</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>                                    
      
    <article-id pub-id-type="doi">10.24151/1561-5405-2021-26-2-184-196</article-id><article-id pub-id-type="udk">004.021:004.421:004.932</article-id><article-categories><subj-group><subject>Информационно-коммуникационные технологии</subject></subj-group></article-categories><title-group><article-title xml:lang="en">Methods and Algorithms for Real-Time Voice Noise Cleaning</article-title><trans-title-group xml:lang="ru"><trans-title>Методы и алгоритмы шумоочистки звука в реальном времени</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><string-name xml:lang="ru">Вишняков Игорь Эдуардович</string-name><name-alternatives><name xml:lang="ru"><surname>Вишняков</surname><given-names>Игорь Эдуардович</given-names></name><name xml:lang="en"><surname>Vishnyakov</surname><given-names>Igor E.</given-names></name></name-alternatives><string-name xml:lang="en">Igor E. Vishnyakov</string-name><xref ref-type="aff" rid="AFF-1"/></contrib><contrib contrib-type="author"><string-name xml:lang="ru">Масягин Михаил Михайлович</string-name><name-alternatives><name xml:lang="ru"><surname>Масягин</surname><given-names>Михаил Михайлович</given-names></name><name xml:lang="en"><surname>Masyagin</surname><given-names>Mikhail M.</given-names></name></name-alternatives><string-name xml:lang="en">Mikhail M. Masyagin</string-name><xref ref-type="aff" rid="AFF-1"/></contrib><contrib contrib-type="author"><string-name xml:lang="ru">Одинцов Олег Александрович</string-name><name-alternatives><name xml:lang="ru"><surname>Одинцов</surname><given-names>Олег Александрович</given-names></name><name xml:lang="en"><surname>Odintsov</surname><given-names>Oleg A.</given-names></name></name-alternatives><string-name xml:lang="en">Oleg A. Odintsov</string-name><xref ref-type="aff" rid="AFF-1"/></contrib><contrib contrib-type="author"><string-name xml:lang="ru">Слюсарь Валентин Викторович</string-name><name-alternatives><name xml:lang="ru"><surname>Слюсарь</surname><given-names>Валентин Викторович</given-names></name><name xml:lang="en"><surname>Sliusar</surname><given-names>Valentin V.</given-names></name></name-alternatives><string-name xml:lang="en">Valentin V. Sliusar</string-name><xref ref-type="aff" rid="AFF-2"/></contrib><aff id="AFF-1" xml:lang="ru">Московский государственный технический университет имени Н.Э. Баумана (национальный исследовательский университет)</aff><aff id="AFF-2" xml:lang="ru">Национальный исследовательский университет «МИЭТ», г. Москва, Россия</aff></contrib-group><fpage>184</fpage><lpage>196</lpage><self-uri>http://ivuz-e.ru/issues/2-_2021/metody_i_algoritmy_shumoochistki_zvuka_v_realnom_vremeni/</self-uri><abstract xml:lang="en"><p>Voice noise cleaning methods and algorithms play a key role both in preprocessing speech for further analysis and recognition and in improving the quality of communication between users of information networks. Real-time streaming noise cleaning is the most important and most challenging area. The requirement to process streaming data without delay imposes significant restrictions on an algorithm: it cannot be iterative with a previously unknown number of iterations, and it cannot explicitly use data located before or after the block currently being processed. This work proposes an adaptive noise reduction method for speech that operates with minimal signal transmission delay. A large-scale study of existing approaches has been conducted, with special attention paid to two groups of algorithms: noise detection and noise suppression. On this basis, an algorithm meeting the stated requirements has been built and analyzed. A dataset of Russian speech with various superimposed noises has been created.
The algorithm has been tested and compared with existing state-of-the-art noise cleaning methods. The proposed adaptive noise cleaning method operates in real time without specialized hardware or auxiliary information. Testing of the developed algorithm using the segmental SNR and PESQ metrics has shown the high efficiency of the development and its superiority over the widely used noise cleaning implementations Speex and WebRTC in both noise cleaning quality and processing speed.</p></abstract><trans-abstract xml:lang="ru"><p>Методы и алгоритмы шумоочистки голоса применяются как для предобработки речи с целью ее дальнейшего анализа и распознавания, так и для непосредственного улучшения качества связи между пользователями информационных сетей. Наиболее важное и сложное направление - потоковая шумоочистка звука в реальном времени. Возможность обработки потоковых данных без задержек налагает на алгоритм ряд существенных ограничений: он не может быть итеративным с заранее неизвестным числом итераций и не может явно использовать данные, находящиеся до или после текущего обрабатываемого блока. В работе предложены адаптивные методы и алгоритмы шумоподавления для речи, работающие с минимальными задержками передачи сигнала. Исследованы существующие подходы к шумоочистке звука в реальном времени, особое внимание уделено алгоритмам детектирования шума и подавления шума. На их базе построены и проанализированы алгоритмы, удовлетворяющие поставленным требованиям. Создан набор аудиоданных русской речи с наложенными на нее различными шумами. Проведены тестирование предложенных решений и их сравнение с существующими актуальными методами шумоочистки. Предложенные методы и алгоритмы шумоочистки звука без использования специализированных аппаратных средств и вспомогательной информации работают в режиме реального времени. 
Тестирование разработанных алгоритмов с помощью метрик сегментного отношения зашумленного сигнала к шуму и PESQ показало высокую эффективность разработки и ее преимущество перед распространенными реализациями шумоочистки Speex и WebRTC по качеству шумоочистки и скорости работы.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>алгоритм шумоподавления</kwd><kwd>винеровская фильтрация</kwd><kwd>априорное отношение сигнал/шум</kwd><kwd>апостериорное отношение сигнал/шум</kwd><kwd>речевой сигнал</kwd></kwd-group><funding-group/></article-meta>
  </front>
  <body/>
  <back>
    <ref-list><ref id="B1"><label>1.</label><mixed-citation xml:lang="ru">Lukin A.S. AES San Francisco 2008: Tutorial T3. Broadband noise reduction: theory and application // Aes.org: [Web] / Audio Engineering Society. October 2–5, 2008. URL: https://www.aes.org/events/125/tutorials/session.cfm?code=T3 (accessed 31.03.2021).</mixed-citation></ref><ref id="B2"><label>2.</label><mixed-citation xml:lang="ru">Pascual S., Bonafonte A., Serra J. SEGAN: speech enhancement generative adversarial network // Interspeech 2017. Stockholm: ISCA, 2017. P. 3642–3646.</mixed-citation></ref><ref id="B3"><label>3.</label><mixed-citation xml:lang="ru">Gabbay A., Shamir A., Peleg Sh. Visual speech enhancement // The Hebrew University of Jerusalem. 2018. P. 1170–1174.</mixed-citation></ref><ref id="B4"><label>4.</label><mixed-citation xml:lang="ru">Schoenenberg K., Raake A., Koeppe J. Why are you so slow? Misattribution of transmission delay to attributes of the conversation partner at the far-end // International Journal of Human-Computer Studies. 2014. Vol. 72. Issue 5. P. 477–487.</mixed-citation></ref><ref id="B5"><label>5.</label><mixed-citation xml:lang="ru">Burnett G.C. Noise suppressing multi-microphone headset // The Journal of the Acoustical Society of America. 2013. No. 133. P. 4352.</mixed-citation></ref><ref id="B6"><label>6.</label><mixed-citation xml:lang="ru">Doclo S. Multi-microphone noise reduction and dereverberation techniques for speech applications: PhD Diss. Katholieke Universiteit Leuven, 2003. 50 p.</mixed-citation></ref><ref id="B7"><label>7.</label><mixed-citation xml:lang="ru">Khan F., Milner B.P. Speaker separation using visually-derived binary masks // Auditory-Visual Speech Processing. Annecy: ISCA, 2013. P. 215–220.</mixed-citation></ref><ref id="B8"><label>8.</label><mixed-citation xml:lang="ru">Speex [Web]. URL: https://www.speex.org/ (accessed 20.05.2020).</mixed-citation></ref><ref id="B9"><label>9.</label><mixed-citation xml:lang="ru">WebRTC [Web]. URL: https://webrtc.org/ (accessed 20.05.2020).</mixed-citation></ref><ref id="B10"><label>10.</label><mixed-citation xml:lang="ru">Cohen I., Berdugo B. Speech enhancement for non-stationary noise environments // Signal Processing. 2001. Vol. 81. No. 11. P. 2403–2418.</mixed-citation></ref><ref id="B11"><label>11.</label><mixed-citation xml:lang="ru">Ephraim Y., Malah D. Speech enhancement using minimum mean-square error log-spectral amplitude estimator // IEEE Transactions on Acoustics, Speech, and Signal Processing. 1985. P. 443–445.</mixed-citation></ref><ref id="B12"><label>12.</label><mixed-citation xml:lang="ru">Shrawankar U., Thakare V. Performance analysis of noise filters and speech enhancement techniques in adverse mixed noisy environment for HCI // International Journal of Research and Reviews in Computer Science. 2012. No. 3. P. 1817–1825.</mixed-citation></ref><ref id="B13"><label>13.</label><mixed-citation xml:lang="ru">Loizou P. Speech enhancement. Theory and practice. 2nd ed. CRC Press, 2013. 705 p.</mixed-citation></ref><ref id="B14"><label>14.</label><mixed-citation xml:lang="ru">Rangachari S., Loizou P.C. A noise-estimation algorithm for highly nonstationary environments // Speech Communication. 2006. Vol. 48. No. 2. P. 220–231.</mixed-citation></ref><ref id="B15"><label>15.</label><mixed-citation xml:lang="ru">Scalart P., Filho J.V. Speech enhancement based on a priori signal to noise estimation // 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing. Atlanta, GA: IEEE, 1996. Vol. 2. P. 629–632.</mixed-citation></ref><ref id="B16"><label>16.</label><mixed-citation xml:lang="ru">Plapous C., Marro C., Mauuary L., Scalart P. A two-step noise reduction technique // 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing. Montreal: IEEE, 2004. Vol. 1. P. 289–292.</mixed-citation></ref><ref id="B17"><label>17.</label><mixed-citation xml:lang="ru">Shifeng O., Chao G., Ying G. Improved a priori SNR estimation for speech enhancement incorporating speech distortion component // TELKOMNIKA Indonesian Journal of Electrical Engineering. 2013. Vol. 11 (9). P. 5359–5364.</mixed-citation></ref><ref id="B18"><label>18.</label><mixed-citation xml:lang="ru">glibc // The GNU Operating System: [Web] / Free Software Foundation. URL: https://www.gnu.org/software/libc/ (accessed 26.05.2020).</mixed-citation></ref><ref id="B19"><label>19.</label><mixed-citation xml:lang="ru">Rix A., Beerends J., Hollier M., Hekstra A.P. Perceptual evaluation of speech quality (PESQ) // 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Salt Lake City, UT: IEEE, 2001. P. 1–4.</mixed-citation></ref><ref id="B20"><label>20.</label><mixed-citation xml:lang="ru">ZvukiPro [Web]. URL: https://zvukipro.com/ (accessed 26.05.2020).</mixed-citation></ref></ref-list>
  </back>
</article>
