Analysis of benchmarks for the robustness of large language models to misinformation and various types of manipulation

Левіцький, С. М.; Мокін, В. Б.; Levitskyi, S. M.; Mokin, V. B.

dc.contributor.author	Левіцький, С. М.	uk
dc.contributor.author	Мокін, В. Б.	uk
dc.contributor.author	Levitskyi, S. M.	en
dc.contributor.author	Mokin, V. B.	en
dc.date.accessioned	2025-09-12T10:06:40Z
dc.date.available	2025-09-12T10:06:40Z
dc.date.issued	2025
dc.identifier.citation	Levitskyi S. M., Mokin V. B. Analysis of benchmarks for the robustness of large language models to misinformation and various types of manipulation // Proceedings of the LIV All-Ukrainian Scientific and Technical Conference of VNTU Departments, Vinnytsia, 24-27 March 2025. Electronic text data. 2025. URI: https://conferences.vntu.edu.ua/index.php/all-fksa/all-fksa-2025/paper/view/24338.	en
dc.identifier.isbn	978-617-8132-48-8
dc.identifier.uri	https://ir.lib.vntu.edu.ua//handle/123456789/49249
dc.description.abstract	Розглянуто найновіші підходи до оцінювання та підвищення стійкості великих мовних моделей до дезінформації та маніпулятивних атак, таких як дрейф знань, ін'єкція промптів та інші. Узагальнено сучасні виклики, які стоять перед дослідниками мовних моделей та підприємцями, які інтегрують моделі в свої програмні продукти. Запропоновано практичні рекомендації до підвищення стійкості мовних моделей, що має особливе значення для їхнього безпечного застосування в критично важливих галузях. Виявлено, що великі мовні моделі потребують всебічного тестування, тому також запропоновано удосконалення бенчмарку авторського MST з розширенням критеріїв оцінювання.	uk
dc.description.abstract	The article reviews the latest approaches to evaluating and improving the robustness of large language models against misinformation and manipulative attacks such as knowledge drift and prompt injection. It summarizes current challenges facing language model researchers and businesses that integrate these models into their software products. Practical recommendations for improving the robustness of language models are proposed, which is particularly important for their safe use in critical industries. Because large language models were found to require comprehensive testing, an improvement to the authors' MST benchmark with expanded evaluation criteria is also proposed.	en
dc.language.iso	uk_UA	uk_UA
dc.publisher	ВНТУ	uk
dc.relation.ispartof	Proceedings of the LIV All-Ukrainian Scientific and Technical Conference of VNTU Departments, Vinnytsia, 24-27 March 2025.	en
dc.relation.uri	https://conferences.vntu.edu.ua/index.php/all-fksa/all-fksa-2025/paper/view/24338
dc.subject	LLM	en
dc.subject	еталонний тест	uk
dc.subject	дезінформація	uk
dc.subject	маніпуляція фактами	uk
dc.subject	маніпуляція промптами	uk
dc.subject	інженерія промптів	uk
dc.subject	benchmark	en
dc.subject	misinformation	en
dc.subject	factual manipulation	en
dc.subject	prompt manipulation	en
dc.subject	prompt engineering	en
dc.title	Analysis of benchmarks for the robustness of large language models to misinformation and various types of manipulation	en
dc.type	Thesis
dc.identifier.udc	004.9+556
dc.relation.references	Stephanie Lin, Jacob Hilton, Owain Evans. TruthfulQA: Measuring How Models Mimic Human Falsehoods, arXiv preprint, arXiv:2109.07958	en
dc.relation.references	Wei J. et al. Measuring short-form factuality in large language models, arXiv preprint, arXiv:2411.04368, Nov 2024.	en
dc.relation.references	Jia-Yu Yao et al. LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples, arXiv preprint, arXiv:2310.01469	en
dc.relation.references	Alina Fastowski, Gjergji Kasneci. Understanding Knowledge Drift in LLMs through Misinformation. arXiv preprint, arXiv:2409.07085v1	en
dc.relation.references	Levitskyi S. M., Mokin V. B. A method for synthesizing a benchmark for evaluating the robustness of large language models to misinformation and fact manipulation, Visnyk Vinnytskoho politekhnichnoho instytutu, no. 1, 2025.	en
dc.relation.references	Patrick Lewis et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, arXiv preprint, arXiv:2005.11401, May 2020.	en
dc.relation.references	Weiqiang Jin et al. Veracity-Oriented Context-Aware Large Language Models–Based Prompting Optimization for Fake News Detection, International Journal of Intelligent Systems, 15 January 2025. https://doi.org/10.1155/int/5920142	en
dc.relation.references	Zekun Li et al. Evaluating the Instruction-Following Robustness of Large Language Models to Prompt Injection, arXiv preprint, arXiv:2308.10819	en
dc.relation.references	Sippo Rossi et al. An Early Categorization of Prompt Injection Attacks on Large Language Models, arXiv preprint, arXiv:2402.00898	en
dc.relation.references	Huachuan Qiu et al. Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output Robustness of Large Language Models, arXiv preprint, arXiv:2307.08487	en
dc.relation.references	Kaijie Zhu et al. PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts, arXiv preprint, arXiv:2306.04528	en

Files in this item

Name: 24338.pdf
Size: 698.5Kb
Format: PDF

This item appears in the following Collection(s)

Scientific and Technical Conference of VNTU Departments. Faculty of Intelligent Information Technologies and Automation (2025) [171]
