Abstract
Dysarthria is a motor speech disorder caused by impaired movement of the muscles needed to produce speech, including those of the lips, tongue, vocal cords, and diaphragm. It varies in severity, and slurred, slow, or imprecise speech may be its first sign. Conditions that can cause dysarthria include Parkinson’s disease, muscular dystrophy, multiple sclerosis, brain tumors, brain damage, and amyotrophic lateral sclerosis. This research develops an efficient method for extracting features from speech signals and classifying speakers affected by dysarthria. The input speech signal is first pre-processed to improve the detection of dysarthric speech: a Butterworth band-pass filter and a Savitzky-Golay digital FIR filter are used to smooth the raw data. The pre-processed signals are then passed to feature extraction techniques, namely Yule-Walker autoregressive modelling, Mel-frequency cepstral coefficients, and perceptual linear prediction, to extract the important features. Finally, dysarthric speech is detected using a classifier based on an Enhanced Elman Spike Neural Network (EESNN), whose weights are selected optimally by Hunter Prey Optimization (HPO). The proposed algorithm achieves 94.25% accuracy and 94.26% specificity, which suggests that the approach is well suited to detecting dysarthria from speech signals.
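Two of the stages named in the abstract, Savitzky-Golay smoothing and Yule-Walker autoregressive feature extraction, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 5-point quadratic smoothing window, the AR order, the function names, and the synthetic test signal are all assumptions made for illustration, and the Butterworth, MFCC, PLP, EESNN, and HPO stages are not shown.

```python
import random


def savgol_smooth_5pt(x):
    """Savitzky-Golay smoothing with an assumed 5-sample window and quadratic fit.

    Uses the standard FIR coefficients (-3, 12, 17, 12, -3) / 35.
    The two samples at each edge are left unsmoothed.
    """
    coeffs = (-3.0, 12.0, 17.0, 12.0, -3.0)
    y = list(x)
    for i in range(2, len(x) - 2):
        y[i] = sum(c * x[i + k - 2] for k, c in enumerate(coeffs)) / 35.0
    return y


def autocorr(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of the mean-removed signal."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    return [
        sum(z[t] * z[t + lag] for t in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


def yule_walker(x, order):
    """Estimate AR coefficients by solving the Yule-Walker equations.

    The equations are solved with the Levinson-Durbin recursion.
    Returns (a, noise_var) for the model
        x[t] = a[0]*x[t-1] + ... + a[order-1]*x[t-order] + e[t].
    """
    r = autocorr(x, order)
    a = []
    err = r[0]
    for m in range(order):
        # Reflection coefficient for model order m + 1.
        acc = r[m + 1] - sum(a[k] * r[m - k] for k in range(m))
        k_m = acc / err
        a = [a[k] - k_m * a[m - 1 - k] for k in range(m)] + [k_m]
        err *= 1.0 - k_m * k_m
    return a, err


if __name__ == "__main__":
    # Hypothetical stand-in for a speech frame: a synthetic AR(2) process.
    rng = random.Random(0)
    frame = [0.0, 0.0]
    for _ in range(2000):
        frame.append(0.6 * frame[-1] - 0.3 * frame[-2] + rng.gauss(0.0, 1.0))
    smoothed = savgol_smooth_5pt(frame)
    features, noise_var = yule_walker(smoothed, order=4)
    print("AR feature vector:", features)
    print("Residual variance:", noise_var)
```

In a pipeline of the kind the abstract describes, the AR coefficients of each frame would form one part of the feature vector passed to the classifier; the frame length and AR order would need to be chosen for the actual recordings.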
ACKNOWLEDGMENTS
AUTHOR CONTRIBUTIONS. The corresponding author made the major contribution to the paper, including formulation, analysis, and editing. The co-authors provided guidance in verifying the analysis results and in editing the manuscript.
Funding
This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.
Ethics declarations
CONFLICT OF INTEREST
The authors of this work declare that they have no conflicts of interest.
DATA AND MATERIAL AVAILABILITY
Not applicable.
CODE AVAILABILITY
Not applicable.
Additional information
Publisher’s Note.
Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Pranav Kumar, Ahmad, M.T. & Kumari, R. HPO Based Enhanced Elman Spike Neural Network for Detecting Speech of People with Dysarthria. Opt. Mem. Neural Networks 33, 205–220 (2024). https://doi.org/10.3103/S1060992X24700097