A study of how the interaction of different models affects the next-token probability distribution in large language models

Варер, Б. Ю.; Мокін, В. Б.

dc.contributor.author	Варер, Б. Ю.	uk
dc.contributor.author	Мокін, В. Б.	uk
dc.date.accessioned	2025-08-13T09:51:16Z
dc.date.available	2025-08-13T09:51:16Z
dc.date.issued	2025
dc.identifier.citation		uk
dc.identifier.uri	https://ir.lib.vntu.edu.ua//handle/123456789/48228
dc.description.abstract	The influence of changing the large language model with a fixed context on the next-token probability distribution was investigated and compared with the influence of changing the context with a fixed model. An experimental comparison of the model-change and context-change factors was carried out using models from Meta	uk
dc.description.abstract	This study investigates how changing the large language model while keeping the context fixed affects the next-token probability distribution, compared with changing the context while keeping the model fixed. The two factors were compared experimentally using the Meta LLaMA 3.2-3B and Microsoft Phi-4-mini models on a dataset of 60 questions from various subject domains. Measured by Jensen-Shannon divergence (JSD), changing the model with a fixed context shifts the next-token distribution (JSD 0.640-0.678) by an amount comparable to changing the context with a fixed model (JSD 0.638-0.721). The results confirm the importance of optimal model selection when designing effective artificial intelligence systems.	en
dc.language.iso	uk_UA	uk_UA
dc.publisher	ВНТУ	uk
dc.relation.ispartof	// Proceedings of the All-Ukrainian Scientific and Practical Internet Conference "Youth in Science: Research, Problems, Prospects (МН-2025)", 15-16 June 2025.	uk
dc.relation.uri	https://conferences.vntu.edu.ua/index.php/mn/mn2025/paper/view/25613
dc.subject	large language models	uk
dc.subject	model cooperation	uk
dc.subject	agent systems	uk
dc.subject	Jensen-Shannon divergence	uk
dc.subject	probability distribution	uk
dc.subject	artificial intelligence	uk
dc.description	Modern artificial intelligence systems increasingly combine several large language models (LLMs) to solve complex tasks.	uk
dc.title	A study of how the interaction of different models affects the next-token probability distribution in large language models	uk
dc.type	Thesis
dc.identifier.udc	004.8
dc.relation.references	1. Li X. et al. A survey on LLM-based multi-agent systems: workflow, infrastructure, and challenges // Vicinagearth. 2024. Vol. 1, No. 1. Art. 9. DOI: 10.1007/s44336-024-00009-2. ISSN 3005-060X.
dc.relation.references	2. Shen Y. et al. HuggingGPT: solving AI tasks with ChatGPT and its friends in Hugging Face // Advances in Neural Information Processing Systems. 2023. Vol. 36. P. 38154-38180.
dc.relation.references	3. Hong S. et al. METAGPT: meta programming for a multi-agent collaborative framework // Proceedings of the 12th International Conference on Learning Representations (ICLR 2024), Vienna (Austria), 7-11 May 2024. 2024. URL: https://github.com/geekan/MetaGPT (accessed: 10.06.2025).
dc.relation.references	4. Jiang D., Ren X., Lin B. Y. LLM-Blender: ensembling large language models with pairwise comparison and generative fusion // Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Toronto (Canada), 9-14 July 2023. 2023. P. 14165-14178. DOI: 10.18653/v1/2023.acl-long.792. URL: https://aclanthology.org/2023.acl-long.792 (accessed: 10.06.2025).
dc.relation.references	5. Vaswani A. et al. Attention is all you need // Advances in Neural Information Processing Systems. 2017. Vol. 30. P. 5998-6008.
dc.relation.references	6. Lin J. Divergence measures based on the Shannon entropy // IEEE Transactions on Information Theory. 1991. Vol. 37, No. 1. P. 145-151. DOI: 10.1109/18.61115. ISSN 0018-9448.
dc.relation.references	7. Meta AI. LLaMA 3.2-3B-Instruct: multilingual large language model // Hugging Face : [Electronic resource]. 2024. URL: https://huggingface.co/meta-llama/Llama-3.2-3B-instruct (accessed: 10.06.2025).
dc.relation.references	8. Microsoft Research. Phi-4-mini-instruct: small language model // Hugging Face : [Electronic resource]. 2025. URL: https://huggingface.co/microsoft/Phi-4-mini-instruct (accessed: 10.06.2025).
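
The abstract compares next-token distributions using Jensen-Shannon divergence. As a rough illustration of that measure only, the sketch below computes JSD between two distributions given as token-to-probability mappings. It assumes base-2 logarithms, so the value lies in [0, 1]. The record does not state which logarithm base or variant the authors used, nor how the different vocabularies of the two models were aligned. The function name and the toy distributions are hypothetical and are not taken from the paper.

```python
import math


def jensen_shannon_divergence(p, q):
    """Base-2 JSD between two distributions given as {token: probability} dicts.

    Assumption: both dicts are keyed by comparable token strings and each
    sums to 1; tokens missing from a dict are treated as probability 0.
    """
    support = set(p) | set(q)

    def kl_to_mixture(a):
        # KL(a || m), where m is the equal-weight mixture of p and q.
        total = 0.0
        for token in support:
            a_i = a.get(token, 0.0)
            if a_i > 0.0:
                m_i = 0.5 * (p.get(token, 0.0) + q.get(token, 0.0))
                total += a_i * math.log2(a_i / m_i)
        return total

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Toy next-token distributions (hypothetical values, for illustration only).
model_a = {"Paris": 0.90, "London": 0.05, "Rome": 0.05}
model_b = {"Paris": 0.60, "Lyon": 0.30, "Rome": 0.10}

print(jensen_shannon_divergence(model_a, model_a))  # identical distributions: 0.0
print(jensen_shannon_divergence(model_a, model_b))  # some value between 0 and 1
```

Under this definition, identical distributions give 0 and distributions with no tokens in common give 1, which is one way to read the reported values of roughly 0.64 to 0.72 as substantial shifts, provided the authors used the same convention.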

Files in this item

Name: 25613.pdf
Size: 381.7Kb
Format: PDF

This item appears in the following collection(s)

Youth in Science: Research, Problems, Prospects (МН-2025) [960]
Youth scientific and practical internet conference for students, postgraduate students, and young scientists

Related items

Neural network technologies of investment risk estimation taking into account the legislative aspect

Epidemiological and network models of disinformation spread: a review of approaches and cases

The role of integrated models in forecasting the spread of disinformation
