Елементи методології прецизійного фонетичного аналізу фонограм усного мовлення

Данильчук, О. М.; Ковтун, В. В.; Никитенко, О. Д.; Нестюк, Ю. Ю.; Присяжнюк, В. В.; Danylchuk, O. M.; Kovtun, V. V.; Nykytenko, O. D.; Nestiuk, Yu. Yu.; Prysiazhniuk, V. V.

dc.contributor.author	Данильчук, О. М.	uk
dc.contributor.author	Ковтун, В. В.	uk
dc.contributor.author	Никитенко, О. Д.	uk
dc.contributor.author	Нестюк, Ю. Ю.	uk
dc.contributor.author	Присяжнюк, В. В.	uk
dc.contributor.author	Danylchuk, O. M.	en
dc.contributor.author	Kovtun, V. V.	en
dc.contributor.author	Nykytenko, O. D.	en
dc.contributor.author	Nestiuk, Yu. Yu.	en
dc.contributor.author	Prysiazhniuk, V. V.	en
dc.date.accessioned	2023-05-22T07:11:59Z
dc.date.available	2023-05-22T07:11:59Z
dc.date.issued	2022
dc.identifier.citation	Елементи методології прецизійного фонетичного аналізу фонограм усного мовлення [Текст] / О. М. Данильчук, В. В. Ковтун, О. Д. Никитенко [та ін.] // Вісник Вінницького політехнічного інституту. – 2022. – № 3. – С. 36–51.	uk
dc.identifier.citation	Данильчук О. М., Ковтун В. В., Никитенко О. Д., Нестюк Ю. Ю., Присяжнюк В. В. Елементи методології прецизійного фонетичного аналізу фонограм усного мовлення. Вісник Вінницького політехнічного інституту. 2022. № 3. С. 36–51.	uk
dc.identifier.issn	1997-9266
dc.identifier.uri	http://ir.lib.vntu.edu.ua//handle/123456789/37178
dc.description.abstract	Дослідження наріжного для сучасної лінгвістики об’єкта — процесу мовленнєвої і текстової міжособистісної комунікації, зважаючи на обсяг інфосфери двадцять першого століття, є неможливим без ґрунтовного та цілеспрямованого залучення інформаційних технологій з інших галузей знань, зокрема, комп’ютерних наук. Утворена в результаті порівняно молода наука — комп’ютерна лінгвістика, ставить за мету автоматичний аналіз природних мов у всіх спектрах їх реалізацій. З довгого списку актуальних задач, активно досліджуваних у парадигмі комп’ютерної лінгвістики, згадаємо автоматизацію складання та лінгвістичної обробки мовних корпусів, автоматизовану класифікацію та реферування документів, створення точних лінгвістичних моделей природних мов, екстракцію фактографічної інформації з неформалізованих лінгвістичних даних тощо. Рушійною силою для поліпшення результатів розв’язання цих дослідницьких задач потенційно є ефективна, строго формалізована методологія обчислювального фонетичного аналізу лінгвістичної інформації, особливо мовленнєвої. Цей тезис цілком відповідає вмісту статті, що доводить актуальність поданих в ній наукових і прикладних результатів. Відповідно, в роботі подані елементи методології прецизійного фонетичного аналізу фонограм усного мовлення з урахуванням явища фонетичної фузії. Математичний апарат створених методів ґрунтується на положеннях теорії розпізнавання образів, теорії інформації і акустичної теорії мовотворення. Цей базис забезпечив основу для аналітичної формалізації проблеми багатокритеріальності процесу розпізнавання мовних одиниць мовлення людиною. В результаті, запропоновано метод для достовірної кластеризації персональних фонетичних алфавітів мовців. 
Також запропоновані: метод для детектування потенційно ненадійно класифікованих мовних одиниць та коригування результатів процесу автоматизованого транскрибування мовленнєвих сигналів; метод оцінювання впливу середовища поширення досліджуваних мовленнєвих сигналів на результат транскрибування.	uk
dc.description.abstract	The central object of modern linguistics, the process of spoken and textual interpersonal communication, cannot be studied at the scale of the twenty-first-century infosphere without the sound and purposeful involvement of information technologies from other fields of knowledge, in particular computer science. The resulting relatively young science, computational linguistics, aims at the automatic analysis of natural languages across the full spectrum of their realizations. From the long list of topical problems actively studied within computational linguistics, we mention the automated compilation and linguistic processing of language corpora, automated classification and summarization of documents, the creation of accurate linguistic models of natural languages, and the extraction of factual information from unformalized linguistic data. An effective, strictly formalized methodology for computational phonetic analysis of linguistic information, especially speech, is potentially a driving force for improving the results obtained on these research problems. This thesis corresponds fully to the content of the article, which demonstrates the relevance of the scientific and applied results presented in it. Accordingly, the paper presents elements of a methodology for precision phonetic analysis of phonograms of oral speech that takes the phenomenon of phonetic fusion into account. The mathematical apparatus of the proposed methods is based on pattern recognition theory, information theory, and the acoustic theory of speech production. This foundation made possible an analytical formalization of the multicriteria nature of human recognition of the linguistic units of speech. As a result, a method for reliable clustering of speakers' personal phonetic alphabets is proposed. 
Also proposed are a method for detecting potentially unreliably classified linguistic units and correcting the results of automated transcription of speech signals, and a method for estimating the influence of the propagation medium of the studied speech signals on the transcription result.	en
dc.language.iso	uk_UA	uk_UA
dc.publisher	ВНТУ	uk
dc.relation.ispartof	Вісник Вінницького політехнічного інституту. № 3 : 36–51.	uk
dc.relation.uri	https://visnyk.vntu.edu.ua/index.php/visnyk/article/view/2771
dc.subject	комп’ютерна лінгвістика	uk
dc.subject	класифікація мовних одиниць	uk
dc.subject	автоматизоване транскрибування	uk
dc.subject	фонетичний аналіз мовлення	uk
dc.subject	computational linguistics	en
dc.subject	classification of language units	en
dc.subject	automated transcription	en
dc.subject	phonetic analysis of speech	en
dc.title	Елементи методології прецизійного фонетичного аналізу фонограм усного мовлення	uk
dc.title.alternative	Elements of Methodology of Precision Phonetic Analysis of Oral Phonograms	en
dc.type	Article
dc.identifier.udc	004.942
dc.relation.references	A. Mandal, Kumar Prasanna, and P. K. R. Mitra, “Recent developments in spoken term detection: a survey,” Int. J. Speech Technol 17, pp. 183-198, 2014. https://doi.org/10.1007/s10772-013-9217-1 .	en
dc.relation.references	C. China Bhanja, M. A. Laskar, and R. H. Laskar, “Modelling multi-level prosody and spectral features using deep neural network for an automatic tonal and non-tonal pre-classification-based Indian language identification system,” Lang Resources & Evaluation, 2021. https://doi.org/10.1007/s10579-020-09527-z .	en
dc.relation.references	S. S. Agrawal, A. Jain, and S. Sinha, “Analysis and modeling of acoustic information for automatic dialect classification,” Int. J. Speech Technol 19, pp. 593-609, 2016. https://doi.org/10.1007/s10772-016-9351-7 .	en
dc.relation.references	S. Gholamdokht Firooz, S. Reza, and Y. Shekofteh, “Spoken language recognition using a new conditional cascade method to combine acoustic and phonetic results,” Int. J. Speech Technol 21, pp. 649-657, 2018. https://doi.org/10.1007/s10772-018-9526-5 .	en
dc.relation.references	D. Duran, et al. “A Computational Model of Unsupervised Speech Segmentation for Correspondence Learning,” Res on Lang and Comput, no. 8, pp. 133-168, 2010. https://doi.org/10.1007/s11168-011-9075-4 .	en
dc.relation.references	D. Mirman, “Mechanisms of Semantic Ambiguity Resolution: Insights from Speech Perception,” Res on Lang and Comput no. 6, pp. 293-309, 2008. https://doi.org/10.1007/s11168-008-9055-5 .	en
dc.relation.references	E. M. Bender, et al. “Grammar Customization,” Res on Lang and Comput no. 8, pp. 23-72, 2010. https://doi.org/10.1007/s11168-010-9070-1 .	en
dc.relation.references	M. Dickinson, “On Morphological Analysis for Learner Language, Focusing on Russian,” Res on Lang and Comput no. 8, pp. 273, 2010. https://doi.org/10.1007/s11168-011-9079-0 .	en
dc.relation.references	S. Moran, E. Grossman, and A. Verkerk, “Investigating diachronic trends in phonological inventories using BDPROTO,” Lang Resources & Evaluation no. 55, pp. 79-103, 2021. https://doi.org/10.1007/s10579-019-09483-3 .	en
dc.relation.references	C. van Bael, H. van den Heuvel, and H. Strik, “Validation of phonetic transcriptions in the context of automatic speech recognition,” Lang Resources & Evaluation no. 41, pp. 129-146, 2007. https://doi.org/10.1007/s10579-007-9033-9 .	en
dc.relation.references	N. B. Chittaragi, S. G. Koolagudi, “Automatic dialect identification system for Kannada language using single and ensemble SVM algorithms,” Lang Resources & Evaluation no. 54, pp. 553-585, 2020. https://doi.org/10.1007/s10579-019-09481-5	en
dc.relation.references	L. Pearl, S. Goldwater, and M. Steyvers, “Online Learning Mechanisms for Bayesian Models of Word Segmentation,” Res on Lang and Comput no. 8, pp. 107-132, 2010. https://doi.org/10.1007/s11168-011-9074-5 .	en
dc.relation.references	M. Kurimo, et al. “Modeling under-resourced languages for speech recognition,” Lang Resources & Evaluation no. 51, pp. 961-987, 2017. https://doi.org/10.1007/s10579-016-9336-9 .	en
dc.relation.references	A. Masmoudi, et al. “Automatic speech recognition system for Tunisian dialect,” Lang Resources & Evaluation no. 52, pp. 249-267, 2018. https://doi.org/10.1007/s10579-017-9402-y .	en
dc.relation.references	W. Elvira-García, et al. “A tool for automatic transcription of intonation: Eti_ToBI a ToBI transcriber for Spanish and Catalan,” Lang Resources & Evaluation no. 50, pp. 767-792, 2016. https://doi.org/10.1007/s10579-015-9320-9 .	en
dc.relation.references	H. Strik, M. Hulsbosch, and C. Cucchiarini, “Analyzing and identifying multiword expressions in spoken language,” Lang Resources & Evaluation no. 44, pp. 41-58, 2010. https://doi.org/10.1007/s10579-009-9095-y .	en
dc.relation.references	M. Aissiou, “A genetic model for acoustic and phonetic decoding of standard arabic vowels in continuous speech,” Int J Speech Technol no. 23, pp. 425-434, 2020. https://doi.org/10.1007/s10772-020-09694-y .	en
dc.relation.references	C. Santhosh Kumar, V. P. Mohandas, “Robust features for multilingual acoustic modeling,” Int J Speech Technol no. 14, pp. 147-155, 2011. https://doi.org/10.1007/s10772-011-9092-6 .	en
dc.relation.references	N. B. Chittaragi, S. G. Koolagudi, “Acoustic-phonetic feature based Kannada dialect identification from vowel sounds,” Int J Speech Technol no. 22, pp. 1099-1113, 2019. https://doi.org/10.1007/s10772-019-09646-1 .	en
dc.relation.references	N. T. Kleynhans, E. Barnard, “Efficient data selection for ASR,” Lang Resources & Evaluation no. 49, pp. 327-353, 2015. https://doi.org/10.1007/s10579-014-9285-0 .	en
dc.relation.references	C. Clavel, et al. “Spontaneous speech and opinion detection: mining call-centre transcripts,” Lang Resources & Evaluation no. 47, pp. 1089-1125, 2013. https://doi.org/10.1007/s10579-013-9224-5 .	en
dc.relation.references	F. Anitha Florence Vinola, G. Padma, “A probabilistic stochastic model for analysis on the epileptic syndrome using speech synthesis and state space representation,” Int J Speech Technol, no. 23, pp. 35-360, 2020. https://doi.org/10.1007/s10772-020-09702-1 .	en
dc.relation.references	M. Mehrabani, J. H. L. Hansen, “Automatic analysis of dialect/language sets,” Int J Speech Technol no. 18, pp. 277-286, 2015. https://doi.org/10.1007/s10772-014-9268-y .	en
dc.relation.references	X. Ma, “Evocation: analyzing and propagating a semantic link based on free word association,” Lang Resources & Evaluation no. 47, pp. 819-837, 2013. https://doi.org/10.1007/s10579-013-9219-2 .	en
dc.relation.references	J. Chaki, “Pattern analysis based acoustic signal processing: a survey of the state-of-art,” Int J Speech Technol, 2020. https://doi.org/10.1007/s10772-020-09681-3 .	en
dc.relation.references	K. B. Bhangale, and K. Mohanaprasad, “A review on speech processing using machine learning paradigm,” Int J Speech Technol no. 24, pp. 367-388, 2021. https://doi.org/10.1007/s10772-021-09808-0 .	en
dc.relation.references	P. Verma, and P. K. Das, “i-Vectors in speech processing applications: a survey,” Int J Speech Technol, no. 8, pp. 529-546, 2015. https://doi.org/10.1007/s10772-015-9295-3 .	en
dc.relation.references	T. Drugman, and N. Dutoit, “The Deterministic Plus Stochastic Model of the Residual Signal and Its Applications,” IEEE Transactions on Audio, Speech, and Language Processing, 20, no. 3, pp. 968-981, 2012. https://doi.org/10.1109/TASL.2011.2169787 .	en
dc.relation.references	X. Chen, and C. Bao, “Phoneme-Unit-Specific Time-Delay Neural Network for Speaker Verification,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, no. 29, pp. 1243-1255, 2021. https://doi.org/10.1109/TASLP.2021.3065202 .	en
dc.relation.references	I. Omer, M. Zampieri, and M. Oakes, “Phonetic differences for dialect clustering,” 9th International Conference on Information and Communication Systems (ICICS), 2018, pp. 145-150. https://doi.org/10.1109/IACS.2018.8355457 .	en
dc.relation.references	H. Van hamme, “Phonetic analysis of a computational model for vocabulary acquisition from auditory inputs,” IEEE International Conference on Development and Learning (ICDL), 2011, pp. 1-6. https://doi.org/10.1109/DEVLRN.2011.6037365 .	en
dc.relation.references	Z. Wang, C. Liu, H. Wang, Y. Hu, and L. Dai, “Phonetic clustering based confidence measure for embedded speech recognition,” in 7th International Symposium on Chinese Spoken Language Processing, 2010, pp. 186-189. https://doi.org/10.1109/ISCSLP.2010.5684914 .	en
dc.relation.references	P. Kannadaguli, and V. Bhat, “A comparison of Bayesian multivariate modeling and hidden Markov modeling (HMM) based approaches for automatic phoneme recognition in kannada,” Recent and Emerging trends in Computer and Computational Sciences (RETCOMP), 2015, pp. 1-5. https://doi.org/10.1109/RETCOMP.2015.7090795 .	en
dc.relation.references	F. A. A. Laleye, E. C. Ezin, and C. Motamed, “Automatic Text-Independent Syllable Segmentation Using Singularity Exponents and Rényi Entropy,” J Sign Process Syst no. 88, pp. 439-451, 2017. https://doi.org/10.1007/s11265-016-1183-9 .	en
dc.relation.references	J. Kang, et al. “Lattice Based Transcription Loss for End-to-End Speech Recognition,” J Sign Process Syst no. 90, pp. 1013-1023, 2018. https://doi.org/10.1007/s11265-017-1292-0 .	en
dc.relation.references	Y. Qian, et al. “Spoken Language Understanding of Human-Machine Conversations for Language Learning Applications,” J Sign Process Syst no. 92, pp. 805-817, 2020. https://doi.org/10.1007/s11265-019-01484-3 .	en
dc.relation.references	Y. Cui, et al. “Simultaneous Predictive Gaussian Classifiers,” J. Classif no. 33, pp. 73-102, 2016. https://doi.org/10.1007/s00357-016-9197-3 .	en
dc.relation.references	O. Bisikalo, O. Boivan, N. Khairova, O. Kovtun, and V. Kovtun, “Precision Automated Phonetic Analysis of Speech Signals for Information Technology of Text-dependent Authentication of a Person by Voice,” CEUR Workshop Proceedings, no. 2853, pp. 276-288, 2021. urn:nbn:de:0074-2853-7 .	en
dc.identifier.doi	https://doi.org/10.31649/1997-9266-2022-162-3-36-51
