Research of the influence of phonation variability on the result of the process of recognition of language units
Authors
Bisikalo, O.
Boivan, O.
Kovtun, O.
Kovtun, V.
Бісікало, О. В.
Бойван, О.
Ковтун, О.
Ковтун, В. В.
Date
2022
Collections
- Research papers of the AIIT Department [268]
Abstract
The limited use of specialized services in the corporate and government segments of cyberspace
suggests that the task of recognizing the speech of more than one speaker in non-laboratory
conditions remains relevant. The article presents a technology for improving the recognition
of language units by integrating a model of their phonation variability into the decision
rule. In the proposed technology, in contrast to existing ones, recognition occurs by
comparing the sound schemes of empirical and reference (etalon) language material in the
common parametric space of the acoustic, generative and language models. This allowed us to
formalize the concepts for taking phonation variability into account when determining the
reference sound schemes of language units within the paradigm of pattern recognition theory,
and to formulate a UML activity diagram of the mechanism for calculating the parameters of
these concepts. The classification results obtained on a test sample with high variability of
speech material confirm that the authors' mechanisms compensate for the influence of
phonation variability at the level of the decision rule and increase recognition accuracy by
5-8 percentage points (from the original 52% to 57-60%). Experiments showed that, for all
test samples, the decision rules based on the authors' concept, which took into account the
optimal and suboptimal reference sound schemes respectively, outperformed the decision rule
that took the reference sound schemes into account but ignored their frequency. The authors'
mechanisms for compensating for phonation variability proved inadvisable for classifying
speech material with a low or moderate degree of variability.
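
The contrast drawn in the abstract, between a decision rule that ignores the frequency of etalon (reference) sound schemes and one that accounts for it, can be illustrated with a minimal sketch. This is not the authors' method. The lexicon `REFERENCE_SCHEMES`, the phoneme strings, the frequencies, the `similarity` measure and the multiplicative weighting are all hypothetical assumptions chosen only to show how the two kinds of rule can disagree.

```python
from difflib import SequenceMatcher

# Hypothetical lexicon (assumption): each language unit maps to its
# reference sound schemes (space-separated phonemes) and the relative
# frequency with which each scheme is observed.
REFERENCE_SCHEMES = {
    "data": {"d ey t ah": 0.7, "d ae t ah": 0.3},
    "date": {"d ey t": 1.0},
}


def similarity(observed, reference):
    """Similarity in [0, 1] between two phoneme sequences (assumed measure)."""
    return SequenceMatcher(None, observed.split(), reference.split()).ratio()


def recognize(observed, use_frequency=True):
    """Return the unit whose best-scoring reference scheme wins.

    With use_frequency=False every reference scheme counts equally;
    with use_frequency=True each similarity is scaled by the scheme's
    relative frequency (an illustrative weighting, not the paper's rule).
    """
    best_unit, best_score = None, -1.0
    for unit, schemes in REFERENCE_SCHEMES.items():
        for scheme, freq in schemes.items():
            score = similarity(observed, scheme)
            if use_frequency:
                score *= freq
            if score > best_score:
                best_unit, best_score = unit, score
    return best_unit


if __name__ == "__main__":
    print(recognize("d ae t", use_frequency=False))  # data
    print(recognize("d ae t", use_frequency=True))   # date
```

In this toy lexicon the observed sequence `d ae t` is closest to a rare variant of "data", so the frequency-blind rule picks "data", while the frequency-weighted rule prefers "date". Which behaviour is preferable depends on the variability of the speech material, which is the question the experiments in the abstract address.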
URI:
http://ir.lib.vntu.edu.ua//handle/123456789/36150