
dc.contributor.author: Oralbekova, D. [en]
dc.contributor.author: Mamyrbayev, O. [en]
dc.contributor.author: Azarova, L. [en]
dc.contributor.author: Kurmetkan, T. [en]
dc.contributor.author: Gordiichuk, Н. [en]
dc.contributor.author: Zhumazhan, N. [en]
dc.contributor.author: Sawicki, D. [en]
dc.contributor.author: Азарова, Л. Є. [uk]
dc.date.accessioned: 2026-01-09T09:21:25Z
dc.date.available: 2026-01-09T09:21:25Z
dc.date.issued: 2025
dc.identifier.citation: Oralbekova D., Mamyrbayev O., Azarova L., Kurmetkan T., Gordiichuk Н., Zhumazhan N., Sawicki D. Synthetic Data Generation for Kazakh Speech Separation and Diarization based on the use of neural networks // Proc. SPIE. Photonics Applications in Astronomy, Communications, Industry, and High Energy Physics Experiments 2025, Vol. 14009, Lublin, Poland, 30 December 2025. Lublin, 2025. DOI: https://doi.org/10.1117/12.3099516. [en]
dc.identifier.uri: https://ir.lib.vntu.edu.ua//handle/123456789/50414
dc.description.abstract: This paper explores the impact of various synthetic data generation methods on the performance of speech separation and diarization models. Three approaches are considered: simple audio track overlay, synthetic dialogue generation, and acoustic condition modeling. To evaluate their effectiveness, we used Conv-TasNet for speech separation and EEND-Conformer for diarization, both trained on a 400-hour Kazakh speech corpus. Experiments demonstrated that synthetic data can significantly enhance model performance when adapting to low-resource languages. The most effective method was synthetic dialogue generation, yielding results close to those obtained with real data for both speech separation and diarization. In contrast, acoustic condition modeling showed the highest deviations, indicating the need for further refinement. The findings confirm the potential of synthetic data for speech processing tasks. The proposed methods can improve the performance of automatic speech recognition models in scenarios with limited labeled data and challenging acoustic environments. [en]
dc.language.iso: en_US [en_US]
dc.publisher: SPIE [en]
dc.relation.ispartof: Proc. SPIE. Photonics Applications in Astronomy, Communications, Industry, and High Energy Physics Experiments 2025, Vol. 14009, Lublin, Poland, 30 December 2025. [en]
dc.subject: synthetic data generation [en]
dc.subject: speech separation [en]
dc.subject: diarization [en]
dc.subject: Kazakh language [en]
dc.subject: Conv-TasNet [en]
dc.subject: EEND-Conformer [en]
dc.title: Synthetic data generation for Kazakh speech separation and diarization based on the use of neural networks [en]
dc.type: Article, Scopus-WoS
dc.type: Article
dc.identifier.doi: https://doi.org/10.1117/12.3099516

