Investigating the Influence of Large Language Model Parameters on the Diversity of Generated Text

Мокін, В. Б.; Варер, Б. Ю.

dc.contributor.author	Мокін, В. Б.	uk
dc.contributor.author	Варер, Б. Ю.	uk
dc.date.accessioned	2025-09-12T10:06:04Z
dc.date.available	2025-09-12T10:06:04Z
dc.date.issued	2025
dc.identifier.citation		uk
dc.identifier.uri	https://ir.lib.vntu.edu.ua//handle/123456789/49239
dc.description.abstract	Робота присвячена дослідженню впливу параметрів генерування (temperature та top_p) на різноманітність тексту, створеного великими мовними моделями. Запропоновано методологію оцінювання різних аспектів різноманітності на основі шести метрик, об'єднаних в єдиний показник. Результати експериментального дослідження 30 різних промптів із 36 комбінаціями параметрів для моделі Meta Llama 3.2 3B-instruct показали, що оптимальні значення різноманітності досягаються при помірних і високих значеннях temperature та top_p.	uk
dc.description.abstract	This work investigates how the generation parameters temperature and top_p affect the diversity of text produced by large language models. It proposes a methodology for evaluating different aspects of diversity using six metrics combined into a single indicator. An experimental study of 30 different prompts and 36 parameter combinations for the Meta Llama 3.2 3B-instruct model showed that the best diversity values are achieved at moderate and high values of temperature and top_p. The research is of practical significance for creating high-quality synthetic datasets, which reduces the risk of model overfitting and improves the models' ability to generalize.	en
dc.language.iso	uk_UA	uk_UA
dc.publisher	ВНТУ	uk
dc.relation.ispartof	// Proceedings of the LIV Scientific and Technical Conference of VNTU Departments, Vinnytsia, March 24-27, 2025	en
dc.relation.uri	https://conferences.vntu.edu.ua/index.php/all-fksa/all-fksa-2025/paper/view/24236
dc.subject	великі мовні моделі	uk
dc.subject	параметри генерування	uk
dc.subject	різноманітність тексту	uk
dc.subject	синтетичні дані	uk
dc.subject	донавчання	uk
dc.subject	large language models	en
dc.subject	generation parameters	en
dc.subject	text diversity	en
dc.subject	synthetic data	en
dc.subject	fine-tuning	en
dc.title	Investigating the Influence of Large Language Model Parameters on the Diversity of Generated Text	en
dc.type	Thesis
dc.identifier.udc	004.8: 004.91
dc.relation.references	VM K., Warrier H., Gupta Y. Fine tuning LLM for enterprise: Practical guidelines and recommendations // arXiv preprint arXiv:2404.10779. 2024.
dc.relation.references	Kang A., Chen J.Y., Lee-Youngzie Z., Fu S. Synthetic data generation with LLM for improved depression prediction // arXiv preprint arXiv:2411.17672. 2024.
dc.relation.references	Setlur A., Garg S., Geng X., Garg N., Smith V., Kumar A. RL on incorrect synthetic data scales the efficiency of LLM math reasoning by eight-fold // Advances in Neural Information Processing Systems. 2024. Vol. 37. P. 43000-43031.
dc.relation.references	Wei J., Huang D., Lu Y., Zhou D., Le Q.V. Simple synthetic data reduces sycophancy in large language models // arXiv preprint arXiv:2308.03958. 2023.
dc.relation.references	Woolsey C.R., Bisht P., Rothman J., Leroy G. Utilizing large language models to generate synthetic data to increase the performance of BERT-based neural networks // AMIA Joint Summits on Translational Science Proceedings. 2024. Vol. 2024. P. 429-438. PMID: 38827067; PMCID: PMC11141799.
dc.relation.references	Bisbee J., Clinton J.D., Dorff C., Kenkel B., Larson J.M. Synthetic replacements for human survey data? The perils of large language models // Political Analysis. 2024. Vol. 32, No. 4. P. 401-416. DOI: 10.1017/pan.2024.5.
dc.relation.references	Meta. Llama 3.2 3B [Electronic resource]. URL: https://huggingface.co/meta-llama/Llama-3.2-3B (accessed: 22.03.2025).

Files in this item

Name: 24236.pdf
Size: 688.0Kb
Format: PDF

This item appears in the following collection(s)

VNTU Scientific and Technical Conference of Departments. Faculty of Intelligent Information Technologies and Automation (2025) [171]
