Method of multi-purpose term search in the terminology database

Yarovyi, A. A.; Kudriavtsev, D. S.; Яровий, А. А.; Кудрявцев, Д. С.

dc.contributor.author	Yarovyi, A. A.	en
dc.contributor.author	Kudriavtsev, D. S.	en
dc.contributor.author	Яровий, А. А.	uk
dc.contributor.author	Кудрявцев, Д. С.	uk
dc.date.accessioned	2025-06-24T11:08:19Z
dc.date.available	2025-06-24T11:08:19Z
dc.date.issued	2024
dc.identifier.citation	Yarovyi A., Kudriavtsev D. Method of multi-purpose term search in the terminology database // Information Technologies and Computer Engineering. 2024. 21(3). P. 20-28.	en
dc.identifier.issn	1999-9941
dc.identifier.uri	https://ir.lib.vntu.edu.ua//handle/123456789/46722
dc.description.abstract	This study investigated a method of multi-purpose term search in a terminological knowledge base, based on semantic analysis and modern natural language processing methods. The study considered the key factors affecting search efficiency, including the structure of data organisation, data format and parameters, and sample size. Particular focus was placed on the semantic similarity between terms, which makes it possible to increase search accuracy by using vector representations and the Louvain algorithm. The study also described the use of cosine similarity to quantify the similarity between terms. Furthermore, the search process was optimised by filtering relevant databases and dynamically identifying relevant terms using the modularity metric. A comparative analysis of existing term search methods was conducted with respect to the identified factors. The study noted the advantages and disadvantages of the Louvain algorithm in comparison with search algorithms for graph data structures. A series of experiments was conducted on data samples organised as dictionary, graph, and network data structures. The study analysed the use of logistic constraints for searching in network data structures and noted the possibility of optimisation through uniform and dynamic data distribution. Experimental results showed the effectiveness of combining the Louvain algorithm with network data structures in terminological knowledge bases. Examples of the method's scope of application in information technologies for searching and processing text data were given. A software architecture scheme was developed that uses a software interface and can be integrated into web applications as a package or library.
The proposed approach demonstrates effectiveness in the context of intelligent decision support systems and automated chatbots, which makes it particularly useful for industries where access to accurate professional terms is critical. A basic version of the software interface was developed for using this method in information technologies for searching and analysing data in search engines.	en
dc.description.abstract	У цьому дослідженні досліджувався метод багатоцільового пошуку термінів у термінологічній базі знань, який базується на семантичному аналізі та використанні сучасних методів обробки природної мови. У дослідженні розглянуто ключові фактори, що впливають на ефективність пошуку, включаючи структуру організації даних, формат та параметри даних, а також розмір вибірки. Особливу увагу було приділено семантичній подібності між термінами, що дозволяє підвищити точність пошуку за допомогою векторних представлень та алгоритму Лувена. У дослідженні також описано використання косинусної подібності для кількісної оцінки подібності між термінами. Крім того, процес пошуку було оптимізовано шляхом фільтрації релевантних баз даних та динамічної ідентифікації релевантних термінів за допомогою метрики модульності. Було проведено порівняльний аналіз існуючих методів пошуку термінів за виявленими факторами. У дослідженні зазначено переваги та недоліки використання алгоритму Лувена порівняно з алгоритмами пошуку в графових структурах даних. Було проведено серію експериментів на вибірках даних, включаючи словникові, графові та мережеві структури даних. У дослідженні проаналізовано використання логістичних обмежень для пошуку в мережевих структурах даних та зазначено можливість оптимізації завдяки рівномірному та динамічному розподілу даних. Експериментальні результати показали ефективність використання комбінації алгоритму Лувена та мережевих структур даних у термінологічних базах знань. Наведено приклади сфери застосування цього методу в інформаційних технологіях для пошуку та обробки текстових даних. Розроблено схему архітектури програмного забезпечення з використанням програмного інтерфейсу та можливістю інтеграції для веб-додатків у вигляді пакета або бібліотеки. 
Запропонований підхід демонструє ефективність у контексті інтелектуальних систем підтримки рішень та автоматизованих чат-ботів, що робить його особливо корисним для галузей, де доступ до точних професійних термінів є критичним. Розроблено базову версію програмного інтерфейсу для використання цього методу в інформаційних технологіях для пошуку та аналізу даних для використання в пошукових системах.	uk
dc.language.iso	uk_UA	uk_UA
dc.publisher	ВНТУ	uk
dc.relation.ispartof	Information Technologies and Computer Engineering. 21(3) : 20-28.	en
dc.relation.uri	https://itce.com.ua/en/journals/t-21-3-2024/metod-bagatotsilovogo-poshuku-termiv-v-terminologichniy-bazi
dc.subject	термінологічна база знань	uk
dc.subject	семантична подібність	uk
dc.subject	алгоритм Лувена	uk
dc.subject	векторні представлення	uk
dc.subject	обробка природної мови	uk
dc.subject	terminological knowledge base	en
dc.subject	semantic similarity	en
dc.subject	Louvain algorithm	en
dc.subject	vector representations	en
dc.subject	natural language processing	en
dc.title	Method of multi-purpose term search in the terminology database	en
dc.title.alternative	Метод багатоцільового пошуку термів в термінологічній базі	uk
dc.type	Article, professional native edition
dc.type	Article
dc.identifier.udc	004.896
dc.relation.references	Abdykerimova, L., Abdikerimova, G.B., Konyrkhanova, A., Nurova, G., Bazarova, M., Bersugir, M., Kaldarova, M., & Yerzhanova, A. (2024). Analysis of the emotional coloring of text using machine and deep learning methods. International Journal of Electrical and Computer Engineering (IJECE), 14, article number 3055. doi: 10.11591/ijece.v14i3.pp3055-3063	en
dc.relation.references	Baqal, H., & Sidiq, M. (2024). Graph databases: Revolutionizing database design and data analysis. Current Journal of Applied Science and Technology, 43, 45-56. doi: 10.9734/cjast/2024/v43i114443	en
dc.relation.references	Beeram, D. (2024). Combining deep learning and heuristic search for efficient text summarization. International Research Journal of Engineering and Technology (IRJET), 11(8), 23-34	en
dc.relation.references	Bienvenu, M., Bourgaux, C., & Jean, R. (2024). Cost-based semantics for querying inconsistent weighted knowledge bases. In Proceedings of the 21st international conference on principles of knowledge representation and reasoning (pp. 167-177). Hanoi: CAI Organization. doi: 10.24963/kr.2024/16	en
dc.relation.references	Bourgaux, C., Guimarães, R., Koudijs, R., Lacerda, V., & Ozaki, A. (2024). Knowledge base embeddings: Semantics and theoretical properties. In Proceedings of the 21st international conference on principles of knowledge representation and reasoning (pp. 823-833). Hanoi: International Joint Conferences on Artificial Intelligence Organization. doi: 10.24963/kr.2024/77	en
dc.relation.references	Gabriel, A. (2020). Kensho derived Wikimedia dataset. Retrieved from https://www.kaggle.com/datasets/kenshoresearch/kensho-derived-wikimedia-data	en
dc.relation.references	George, S., Elayidom, M.S., & Santhanakrishnan, T. (2019). Semantic desktop search engine using graph database. International Journal of Recent Technology and Engineering, 8(1S2), 373-375.	en
dc.relation.references	Gupta, A., & Singh, T. (2024). Study of various frameworks to develop intelligent chatbots. International Journal of Innovative Science and Research Technology (IJISRT), 9(4), 2969-2978. doi: 10.38124/ijisrt/IJISRT24APR1290	en
dc.relation.references	Kaya, C., Kilimci, Z.H., Uysal, M., & Kaya, M. (2024). A review of metaheuristic optimization techniques in text classification. International Journal of Computational and Experimental Science and Engineering, 10(2). doi: 10.22399/ijcesen.295	en
dc.relation.references	Li, C., Liang, M., & Qiu, D. (2022). An intelligent search system based on knowledge graph. In 2022 International conference on artificial intelligence of things and crowdsensing (AIoTCs) (pp. 66-70). Nicosia: IEEE. doi: 10.1109/AIoTCs58181.2022.00017	en
dc.relation.references	Lindemann, N.F. (2024). Chatbots, search engines, and the sealing of knowledges. AI & Society. doi: 10.1007/s00146-024-01944-w	en
dc.relation.references	Mohabir, S.E., & Joshi, Y.C. (2024). A bibliometric analysis of the knowledge base on multinational corporations’ behavior. SN Business & Economics, 4, article number 105. doi: 10.1007/s43546-024-00705-7	en
dc.relation.references	Morayo, A., Samuel, J., Kennedy, O., Adeyinka, A., Adenugba, A., & Imhade, O. (2024). Development of an artificial intelligent health chatbot for improved telemedicine. In C. So In, N.D. Londhe, N. Bhatt & M. Kitsing (Eds.), Information systems for intelligent systems. ISBM 2023. Smart innovation, systems and technologies (Vol. 379, pp. 585-600). Singapore: Springer. doi: 10.1007/978-981-99-8612-5_48	en
dc.relation.references	Rathje, S., Mirea, D.-M., Sucholutsky, I., Marjieh, R., Robertson, C., & Van Bavel, J. (2024). GPT is an effective tool for multilingual psychological text analysis. Proceedings of the National Academy of Sciences of the United States of America, 121, article number e2308950121. doi: 10.1073/pnas.2308950121	en
dc.relation.references	Roy, S., Bharaty, A., Sarkar, S., Sehgal, M., & Panchal, R. (2024). A hybrid ensemble approach for short-text sentiment analysis integrating deep learning and traditional machine learning methods. ResearchGate. doi: 10.13140/RG.2.2.15182.88643	en
dc.relation.references	Sattar, N.S., & Arifuzzaman, S. (2018). Parallelizing Louvain algorithm: Distributed memory challenges. In 2018 IEEE 16th Intl conf on dependable, autonomic and secure computing, 16th intl conf on pervasive intelligence and computing, 4th intl conf on Big Data intelligence and computing and cyber science and technology congress (DASC/PiCom/DataCom/CyberSciTech) (pp. 695-701). Athens: IEEE. doi: 10.1109/DASC/PiCom/DataCom/CyberSciTec.2018.00122	en
dc.relation.references	Simian, D., & Șerban, M.-E. (2024). Improving search query accuracy for specialized websites through intelligent text correction and reconstruction models. Information, 15, article number 683. doi: 10.3390/info15110683	en
dc.relation.references	Sutramiani, N., Arthana, I.M.T., Lampung, P.F., Aurelia, S., Fauzi, M., & Darma, I.W.A.S. (2024). The performance comparison of DBSCAN and K-Means clustering for MSMEs grouping based on asset value and turnover. Journal of Information Systems Engineering and Business Intelligence, 10, 13-24. doi: 10.20473/jisebi.10.1.13-24	en
dc.relation.references	Wu, L., Hu, J., Teng, F., Li, T., & Du, S. (2023). Text semantic matching with an enhanced sample building method based on contrastive learning. International Journal of Machine Learning and Cybernetics, 14, 3105-3112. doi: 10.1007/s13042-023-01823-8	en
dc.relation.references	Yarovyi, A., & Kudriavtsev, D. (2021). Multi-purpose search to determine the context of a text message based on the dictionary data structure. In 2021 IEEE 16th international conference on computer sciences and information technologies (CSIT) (pp. 65-68). Lviv: IEEE. doi: 10.1109/CSIT52700.2021.9648803	en
dc.relation.references	Yuehgoh, F., Djebali, S., & Travers, N. (2024). Leveraging recommendations using a multiplex graph database. International Journal of Web Information Systems, 20(5). doi: 10.1108/IJWIS-05-2024-0137	en
dc.relation.references	Zhang, Y. et al. (2024). A materials terminology knowledge graph automatically constructed from text corpus. Scientific Data, 11, article number 600. doi: 10.1038/s41597-024-03448-0	en
dc.relation.references	Zhao, Y., & Wang, T. (2024). Knowledge base embeddings for a recommendation based on overlapping knowledge and graph learning. Arabian Journal for Science and Engineering. doi: 10.1007/s13369-024-09573-7	en
dc.identifier.doi	https://doi.org/10.63341/vitce/3.2024.20
dc.identifier.orcid	https://orcid.org/0000-0002-6668-2425
dc.identifier.orcid	https://orcid.org/0000-0001-7116-7869

Files in this item

Name: 179762.pdf
Size: 773.4 KB
Format: PDF

This item appears in the following Collection(s)

Research papers of the Department of Computer Science [893]
articles, conference proceedings
Information Technologies and Computer Engineering. 2024. No. 3 [3]