Моделі глибокого навчання для вирішення задач класифікації текстової інформації

Концевой, А. О.; Бісікало, О. В.; Kontsevoi, A. O.; Bisikalo, O. V.

dc.contributor.author	Концевой, А. О.	uk
dc.contributor.author	Бісікало, О. В.	uk
dc.contributor.author	Kontsevoi, A. O.	en
dc.contributor.author	Bisikalo, O. V.	en
dc.date.accessioned	2023-01-09T11:09:50Z
dc.date.available	2023-01-09T11:09:50Z
dc.date.issued	2022
dc.identifier.citation	Концевой А. О. Моделі глибокого навчання для вирішення задач класифікації текстової інформації [Текст] / А. О. Концевой, О. В. Бісікало // Інформаційні технології та комп'ютерна інженерія. – 2022. – № 3. – С. 13–20.	uk
dc.identifier.issn	1999-9941
dc.identifier.uri	http://ir.lib.vntu.edu.ua//handle/123456789/36149
dc.description.abstract	Аналіз тексту в цілому є новою галуззю вивчення. Такі галузі, як маркетинг, управління продуктами, наукові дослідження та управління, вже використовують процес аналізу та вилучення інформації з текстових даних. У попередньому дописі ми обговорили технологію класифікації тексту, одну з найважливіших частин аналізу тексту. Класифікація тексту або категоризація тексту - це діяльність по позначенню текстів природною мовою відповідними категоріями із заздалегідь визначеного набору. Якщо говорити непросто, класифікація тексту - це процес вилучення загальних тегів із неструктурованого тексту. Ці загальні теги походять із набору заздалегідь визначених категорій. Класифікація вмісту та продуктів за категоріями допомагає користувачам легко шукати веб-сайт чи програму та переходити до них. Класифікація тексту, також відома як категоризація тексту, є класичною проблемою в обробці природної мови (NLP), метою якої є призначення міток або тегів для текстових одиниць, таких як речення, запити, абзаци та документи. Вона має широкий спектр застосувань, включаючи відповіді на запитання, виявлення спаму, аналіз настроїв, категоризацію новин, класифікацію намірів користувача, модерування вмісту тощо. Текстові дані можуть надходити з різних джерел, включаючи веб-дані, електронні листи, чати, соціальні мережі, квитки, страхові виплати, відгуки користувачів, а також запитання та відповіді від служби підтримки клієнтів. Текст є надзвичайно багатим джерелом інформації. Але витягувати корисні дані з тексту зазвичай складно та займає багато часу через неструктурований характер природно-мовної інформації. Моделі, засновані на глибокому навчанні, перевершили класичні підходи на основі машинного навчання в різних завданнях класифікації текстів, включаючи аналіз настроїв, категоризацію новин, відповіді на запитання та умовивід природної мови. У цій статті проводиться огляд найбільш поширених моделей класифікації текстів на основі глибокого навчання, розроблених за останні роки, проаналізовано їхній технічний внесок, схожість та сильні сторони.	uk
dc.description.abstract	Text analysis as a whole is a new field of study. Fields such as marketing, product management, research, and management already use the process of analysing and extracting information from textual data. In the previous post, we discussed text classification technology, one of the most important parts of text analysis. Text classification, or text categorisation, is the activity of labelling natural-language texts with appropriate categories from a predetermined set. Put simply, text classification is the process of extracting generic tags from unstructured text. These generic tags come from a set of predefined categories. Categorising content and products helps users easily find and navigate to a website or app. Text classification, also known as text categorisation, is a classic problem in natural language processing (NLP) that aims to assign labels or tags to text units such as sentences, queries, paragraphs, and documents. It has a wide range of applications, including question answering, spam detection, sentiment analysis, news categorisation, user intent classification, content moderation, and more. Text data can come from a variety of sources, including web data, emails, chats, social media, tickets, insurance claims, user feedback, and customer service questions and answers. Text is an extremely rich source of information. But extracting useful data from text is usually difficult and time-consuming due to the unstructured nature of natural language information. Deep learning-based models have surpassed classical machine learning-based approaches in various text classification tasks, including sentiment analysis, news categorisation, question answering, and natural language inference. In this paper, we provide a comprehensive review of the most widespread deep learning-based models for text classification developed in recent years, and discuss their technical contributions, similarities, and strengths.	en
dc.language.iso	uk_UA	uk_UA
dc.publisher	ВНТУ	uk
dc.relation.ispartof	Інформаційні технології та комп'ютерна інженерія. № 3 : 13–20.	uk
dc.relation.uri	https://itce.vntu.edu.ua/index.php/itce/article/view/901
dc.subject	класифікація тексту	uk
dc.subject	аналіз настроїв	uk
dc.subject	відповіді на запитання	uk
dc.subject	категоризація новин	uk
dc.subject	глибоке навчання	uk
dc.subject	висновок з природної мови	uk
dc.subject	класифікація тем	uk
dc.subject	text classification	en
dc.subject	sentiment analysis	en
dc.subject	question answering	en
dc.subject	news categorisation	en
dc.subject	deep learning	en
dc.subject	natural language inference	en
dc.subject	topic classification	en
dc.title	Моделі глибокого навчання для вирішення задач класифікації текстової інформації	uk
dc.title.alternative	Analysis of deep learning models for text information classification tasks	en
dc.type	Article
dc.identifier.udc	004.912
dc.relation.references	Bisikalo O. System for definition of indicator characteristics of social networks participants profiles / Oleg Bisikalo, Anton Kontsevoi // Proceedings of the 4th International Conference on Computational Linguistics and Intelligent Systems (COLINS 2020). – CEUR Workshop Proceedings Volume 2604, 2020. – Lviv, Ukraine, April 23-24, 2020. – Pp. 77-88. – ISSN: 1613-0073.	en
dc.relation.references	I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT press, 2016.	en
dc.relation.references	S. Wang and C. D. Manning, “Baselines and bigrams: Simple, good sentiment and topic classification,” in Proceedings of the 50th annual meeting of the association for computational linguistics: Short papers-volume 2. Association for Computational Linguistics, 2012.	en
dc.relation.references	R. Socher, A. Perelygin, J. Wu, J. Chuang, C. D. Manning, A. Y. Ng, and C. Potts, “Recursive deep models for semantic compositionality over a sentiment treebank,” in Proceedings of the 2013 conference on empirical methods in natural language processing, 2013.	en
dc.relation.references	X. Zhang, J. Zhao, and Y. LeCun, “Character-level convolutional networks for text classification,” in Advances in neural information processing systems, 2015.	en
dc.relation.references	W. Zhao, H. Peng, S. Eger, E. Cambria, and M. Yang, “Towards scalable and reliable capsule networks for challenging NLP applications,” in ACL, 2019.	en
dc.relation.references	W. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Advances in neural information processing systems, 2017.	en
dc.relation.references	Y. Sun, S. Wang, Y.-K. Li, S. Feng, H. Tian, H. Wu, and H. Wang, “ERNIE 2.0: A continual pretraining framework for language understanding,” in AAAI, 2020.	en
dc.identifier.doi	10.31649/1999-9941-2022-55-3-13-20

Files in this item

Name: 114649.pdf
Size: 742.9 KB
Format: PDF

This item appears in the following collection(s)

Research papers of the АІІТ Department [315]
articles, conference proceedings
Інформаційні технології та комп'ютерна інженерія. 2022. № 3 [10]
