Applying the Transformer Architecture to the Super-Resolution Task

Козлов, С. Л.; Колесницький, О. К.

dc.contributor.author	Козлов, С. Л.	uk
dc.contributor.author	Колесницький, О. К.	uk
dc.date.accessioned	2024-06-21T07:52:40Z
dc.date.available	2024-06-21T07:52:40Z
dc.date.issued	2024
dc.identifier.citation	Козлов С. Л., Колесницький О. К. Застосування архітектури трансформер до задачі super-resolution. Наукові праці ВНТУ. Електрон. текст. дані. 2024. № 1. URI: https://praci.vntu.edu.ua/index.php/praci/article/view/726.	uk
dc.identifier.issn	2307-5376
dc.identifier.uri	https://ir.lib.vntu.edu.ua//handle/123456789/42849
dc.description.abstract	Over the past 15 years, convolutional neural networks have been the dominant approach to computer vision tasks and have demonstrated high performance. However, the transformer architecture, which achieved strong results in natural language processing (NLP), is increasingly applied to computer vision tasks and shows comparable or better results. We consider the application of the transformer architecture to the super-resolution task and give a brief overview of earlier approaches. Direct application of the original transformer architecture achieved performance comparable to that of contemporary convolutional neural networks. However, applying the transformer architecture efficiently to computer vision tasks involves challenges that stem from the differences between the visual and language domains. The first difference is scale: images contain visual elements of different scales, which complicates their processing by the transformer architecture, which, by analogy with token processing in NLP, operates on fragments of a single size. The second is the volume of information: the computational complexity of self-attention is quadratic in the length of the input sequence, which becomes especially critical when processing high-resolution images. The article analyzes 12 works on this topic, published since 2021, that propose approaches to overcoming these difficulties. The following directions can be identified in the analyzed works: the study of local attention with windows of various shapes, including sparse attention windows; the study of channel self-attention and its combination with spatial self-attention; and the study of extending the transformer architecture with convolutional blocks. These studies have substantially improved the quality of reconstructed images, but they are not exhaustive.	en
dc.language.iso	uk_UA	uk_UA
dc.publisher	ВНТУ	uk
dc.relation.ispartof	Наукові праці ВНТУ. № 1.	uk
dc.relation.uri	https://praci.vntu.edu.ua/index.php/praci/article/view/726
dc.subject	super-resolution	en
dc.subject	transformer architecture	en
dc.subject	convolutional neural network	en
dc.subject	computer vision	en
dc.title	Applying the Transformer Architecture to the Super-Resolution Task	en
dc.type	Article
dc.identifier.udc	004.8
dc.relation.references	New edge-directed interpolation [Electronic resource] / Li Xin, M. T. Orchard // IEEE Transactions on Image Processing. – 2001. – Vol. 10, № 10. – P. 1521 – 1527. – Access mode : https://doi.org/10.1109/83.951537 (date of access: 15.02.2024).	en
dc.relation.references	SoftCuts: A Soft Edge Smoothness Prior for Color Image Super-Resolution [Electronic resource] / Shengyang Dai, Mei Han, Wei Xu [et al.] // IEEE Transactions on Image Processing. – 2009. – Vol. 18, № 5. – P. 969 – 981. – Access mode : https://doi.org/10.1109/tip.2009.2012908 (date of access: 15.02.2024).	en
dc.relation.references	Image super-resolution using gradient profile prior [Electronic resource] / Jian Sun, Zongben Xu, Heung-Yeung Shum // 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, AK, 23–28 June 2008. – Access mode : https://doi.org/10.1109/cvpr.2008.4587659 (date of access: 15.02.2024).	en
dc.relation.references	Super-resolution through neighbor embedding [Electronic resource] / Hong Chang, Dit-Yan Yeung, Yimin Xiong // Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., Washington, DC, USA. – Access mode : https://doi.org/10.1109/cvpr.2004.1315043 (date of access: 15.02.2024).	en
dc.relation.references	Image Deblurring and Super-Resolution by Adaptive Sparse Domain Selection and Adaptive Regularization [Electronic resource] / Weisheng Dong [et al.] // IEEE Transactions on Image Processing. – 2011. – Vol. 20, № 7. – P. 1838 –1857. – Access mode : https://doi.org/10.1109/tip.2011.2108306 (date of access: 15.02.2024).	en
dc.relation.references	Fast image super resolution via local regression [Electronic resource] / Gu Shuhang, Sang Nong, Ma Fan // Proceedings of the 21st International Conference on Pattern Recognition, Tsukuba, 11–15 October 2012. – P. 3128 – 3131. – Access mode : https://ieeexplore.ieee.org/document/6460827 (date of access: 15.02.2024).	en
dc.relation.references	ImageNet classification with deep convolutional neural networks [Electronic resource] / Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton // Communications of the ACM. – 2017. – Vol. 60, № 6. – P. 84 – 90. – Access mode : https://doi.org/10.1145/3065386 (date of access: 15.02.2024).	en
dc.relation.references	Learning a Deep Convolutional Network for Image Super-Resolution [Electronic resource] / Dong Chao, Chen Change Loy, Kaiming He [et al.] // Computer Vision – ECCV 2014, 6–12 September 2014. – P. 184 – 199. – Access mode : https://doi.org/10.1007/978-3-319-10593-2_13 (date of access: 15.02.2024).	en
dc.relation.references	Accurate Image Super-Resolution Using Very Deep Convolutional Networks [Electronic resource] / Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. – Access mode : https://doi.org/10.1109/cvpr.2016.182 (date of access: 15.02.2024).	en
dc.relation.references	Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network [Electronic resource] / Christian Ledig, Lucas Theis, Ferenc Huszár [et al.] // 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 21–26 July 2017. – Access mode : https://doi.org/10.1109/cvpr.2017.19 (date of access: 15.02.2024).	en
dc.relation.references	Enhanced Deep Residual Networks for Single Image Super-Resolution [Electronic resource] / Bee Lim, Sanghyun Son, Heewon Kim [et al.] // 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017. – Access mode : https://doi.org/10.1109/cvprw.2017.151 (date of access: 15.02.2024).	en
dc.relation.references	Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network [Electronic resource] / Wenzhe Shi, Jose Caballero, Ferenc Huszár [et al.] // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. – Access mode : https://doi.org/10.1109/cvpr.2016.207 (date of access: 15.02.2024).	en
dc.relation.references	Image Super-Resolution Using Very Deep Residual Channel Attention Networks [Electronic resource] / Zhang Yulun, Kunpeng Li, Kai Li [et al.] // Computer Vision – ECCV 2018, Munich, 8–14 September 2018. – P. 294 – 310. – Access mode : https://doi.org/10.1007/978-3-030-01234-2_18 (date of access: 15.02.2024).	en
dc.relation.references	ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks [Electronic resource] / Wang Xintao, Ke Yu, Shixiang Wu [et al.] // Computer Vision - ECCV 2018 Workshops, Munich, 8–14 September 2018. – P. 63 – 79. – Access mode : https://doi.org/10.1007/978-3-030-11021-5_5 (date of access: 15.02.2024).	en
dc.relation.references	SRDiff: Single image super-resolution with diffusion probabilistic models [Electronic resource] / Haoying Li, Yifan Yang, Meng Chang [et al.] // Neurocomputing. – 2022. – Vol. 479. – P. 47 – 59. – Access mode : https://doi.org/10.1016/j.neucom.2022.01.029 (date of access: 15.02.2024).	en
dc.relation.references	Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution [Electronic resource] / Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja [et al.] // 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 21–26 July 2017. – Access mode : https://doi.org/10.1109/cvpr.2017.618 (date of access: 15.02.2024).	en
dc.relation.references	Image Quality Assessment: From Error Visibility to Structural Similarity [Electronic resource] / Z. Wang, A. C. Bovik, H. R. Sheikh [et al.] // IEEE Transactions on Image Processing. – 2004. – Vol. 13, № 4. – P. 600 – 612. – Access mode : https://doi.org/10.1109/tip.2003.819861 (date of access: 15.02.2024).	en
dc.relation.references	An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale [Electronic resource] / A. Dosovitskiy, L. Beyer, A. Kolesnikov [et al.] // International Conference on Learning Representations, 3–7 May 2021.– Access mode : https://openreview.net/pdf?id=YicbFdNTTy (date of access: 15.02.2024).	en
dc.relation.references	Attention is All you Need [Electronic resource] / Ashish Vaswani, Noam Shazeer, Niki Parmar [et al.] // Advances in Neural Information Processing Systems, 4 – 9 December 2017. – Access mode : https://papers.nips.cc/paper_files/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html (date of access: 15.02.2024).	en
dc.relation.references	Early Convolutions Help Transformers See Better [Electronic resource] / Xiao Tete, Mannat Singh, Eric Mintun [et al.] // Advances in Neural Information Processing Systems: 2021, 6 – 14 December 2021. – Access mode : https://proceedings.neurips.cc/paper/2021/hash/ff1418e8cc993fe8abcfe3ce2003e5c5-Abstract.html (date of access: 15.02.2024).	en
dc.relation.references	On the Relationship between Self-Attention and Convolutional Layers [Electronic resource] / Cordonnier Jean-Baptiste, Loukas Andreas, Martin Jaggi // International Conference on Learning Representations, 27 – 30 April 2020. – Access mode : https://openreview.net/forum?id=HJlnC1rKPB (date of access: 15.02.2024).	en
dc.relation.references	Pre-Trained Image Processing Transformer [Electronic resource] / Hanting Chen, Yunhe Wang, Tianyu Guo [et al.] // 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20 – 25 June 2021. – Access mode : https://doi.org/10.1109/cvpr46437.2021.01212 (date of access: 15.02.2024).	en
dc.relation.references	Swin Transformer: Hierarchical Vision Transformer using Shifted Windows [Electronic resource] / Ze Liu, Yutong Lin, Yue Cao, Han Hu [et al.] // 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021. – Access mode : https://doi.org/10.1109/iccv48922.2021.00986 (date of access: 15.02.2024).	en
dc.relation.references	SwinIR: Image Restoration Using Swin Transformer [Electronic resource] / Jingyun Liang, Jiezhang Cao, Guolei Sun [et al.] // 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, QC, Canada, 11–17 October 2021. – Access mode : https://doi.org/10.1109/iccvw54120.2021.00210 (date of access: 15.02.2024).	en
dc.relation.references	On Efficient Transformer-Based Image Pre-training for Low-Level Vision [Electronic resource] / Wenbo Li, Xin Lu, Shengju Qian [et al.] // International Joint Conference on Artificial Intelligence, Macao, 19–25 August 2023. – Access mode : https://www.ijcai.org/proceedings/2023/0121.pdf (date of access: 15.02.2024).	en
dc.relation.references	Accurate Image Restoration with Attention Retractable Transformer [Electronic resource] / Jiale Zhang, Yulun Zhang, Jinjin Gu [et al.] // The Eleventh International Conference on Learning Representations, Kigali, 30 April – 5 May 2023. – Access mode : https://openreview.net/pdf?id=IloMJ5rqfnt (date of access: 15.02.2024).	en
dc.relation.references	Image Super-Resolution Using Dilated Window Transformer [Electronic resource] / Soobin Park, Yong Suk Choi // IEEE Access. – 2023. – P. 1. – Access mode : https://doi.org/10.1109/access.2023.3284539 (date of access: 15.02.2024).	en
dc.relation.references	Cross Aggregation Transformer for Image Restoration [Electronic resource] / Zheng Chen, Yulun Zhang, Jinjin Gu [et al.] // Advances in Neural Information Processing Systems, New Orleans, 11–19 December 2022. – Access mode : https://openreview.net/forum?id=wQ2QNNP8GtM (date of access: 15.02.2024).	en
dc.relation.references	Image Super-Resolution with Unified-Window Attention [Electronic resource] / Gunhee Cho, Yong Suk Choi // IEEE Access. – 2024. – P. 1. – Access mode : https://doi.org/10.1109/access.2024.3368436 (date of access: 15.02.2024).	en
dc.relation.references	SRFormer: Permuted Self-Attention for Single Image Super-Resolution [Electronic resource] / Yupeng Zhou, Zhen Li, Chun-Le Guo [et al.] // 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023. – Access mode : https://doi.org/10.1109/iccv51070.2023.01174 (date of access: 15.02.2024).	en
dc.relation.references	SwinFIR: Revisiting the SwinIR with Fast Fourier Convolution and Improved Training for Image Super-Resolution [Electronic resource] / Dafeng Zhang, Feiyu Huang, Shizhuo Liu [et al.]. : arxiv.org, 2023. – 14 p. – Access mode : https://arxiv.org/pdf/2208.11247.pdf (date of access: 15.02.2024).	en
dc.relation.references	Resolution-robust Large Mask Inpainting with Fourier Convolutions [Electronic resource] / Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin [et al.] // 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2022. – Access mode : https://doi.org/10.1109/wacv51458.2022.00323 (date of access: 15.02.2024).	en
dc.relation.references	Activating More Pixels in Image Super-Resolution Transformer [Electronic resource] / Xiangyu Chen, Xintao Wang, Jiantao Zhou [et al.] // 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023. – Access mode : https://doi.org/10.1109/cvpr52729.2023.02142 (date of access: 15.02.2024).	en
dc.relation.references	Dual Aggregation Transformer for Image Super-Resolution [Electronic resource] / Zheng Chen, Yulun Zhang, Jinjin Gu [et al.] // 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023. – Access mode : https://doi.org/10.1109/iccv51070.2023.01131 (date of access: 15.02.2024).	en
dc.relation.references	Recursive Generalization Transformer for Image Super-Resolution [Electronic resource] / Zheng Chen, Yulun Zhang, Jinjin Gu [et al.] // The Twelfth International Conference on Learning Representations, Vienna, 7–11 May 2024. – Access mode : https://openreview.net/forum?id=owziuM1nsR (date of access: 15.02.2024).	en
dc.relation.references	Single image super-resolution from transformed self-exemplars [Electronic resource] / Jia-Bin Huang, Abhishek Singh, Narendra Ahuja // 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015. – Access mode : https://doi.org/10.1109/cvpr.2015.7299156 (date of access: 15.02.2024).	en
dc.relation.references	Image Super-Resolution with Non-Local Sparse Attention [Electronic resource] / Yiqun Mei, Yuchen Fan, Yuqian Zhou // 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20 – 25 June 2021. – Access mode : https://doi.org/10.1109/cvpr46437.2021.00352 (date of access: 15.02.2024).	en
dc.relation.references	NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results [Electronic resource] / Radu Timofte, Eirikur Agustsson, Luc Van Gool [et al.] // 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21 – 26 July 2017. – Access mode : https://doi.org/10.1109/cvprw.2017.149 (date of access: 15.02.2024).	en
dc.relation.references	Relating transformers to models and neural representations of the hippocampal formation [Electronic resource] / James C. R. Whittington, Joseph Warren, Tim E. J Behrens // International Conference on Learning Representations, 25–29 April 2022. – Access mode : https://openreview.net/forum?id=B8DVo9B1YE0 (date of access: 15.02.2024).	en
dc.relation.references	Бардаченко В. Ф. Перспективи застосування імпульсних нейронних мереж з таймерним представленням інформації для розпізнавання динамічних образів / В. Ф. Бардаченко, О. К. Колесницький, С. А. Василецький // УСіМ. – 2003. – № 6. – С. 73 – 82.	uk
dc.relation.references	Колесницький О. К. Принципи побудови архітектури спайкових нейрокомп’ютерів / О. К. Колесницький // Вісник Вінницького політехнічного інституту. – Вінниця: УНІВЕРСУМ-Вінниця. – 2014. – № 4 (115). – С. 70 – 78.	uk
dc.relation.references	Spikformer: When Spiking Neural Network Meets Transformer [Electronic resource] / Zhaokun Zhou, Yuesheng Zhu, Chao He [et al.] // The Eleventh International Conference on Learning Representations, Kigali, 1–5 May 2023. – Access mode : https://openreview.net/forum?id=frE4fUwz_h (date of access: 15.02.2024).	en
dc.relation.references	Optoelectronic implementation of pulsed neurons and neural networks using bispin-devices [Electronic resource] / O. K. Kolesnytskyj, I. V. Bokotsey, S. S. Yaremchuk // Optical Memory and Neural Networks. – 2010. – Vol. 19, № 2. – P. 154 – 165. – Access mode : https://doi.org/10.3103/s1060992x10020062 (date of access: 15.02.2024).	en
dc.relation.references	Optoelectronic spiking neural network [Electronic resource] / V. P. Kozemiako, O. K. Kolesnytskyj, T. S. Lischenko [et al.] // Optical Fibers and Their Applications 2012, Krasnobrod, Poland. – 2013. – Access mode : https://doi.org/10.1117/12.2019340 (date of access: 25.04.2024).	en
dc.relation.references	Neurocomputer architecture based on spiking neural network and its optoelectronic implementation [Electronic resource] / Oleh K. Kolesnytskyj, Vladislav V. Kutsman, Krzysztof Skorupski [et al.] // Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2019, Wilga, Poland, 25 May – 2 June 2019 / ed. by R. S. Romaniuk, M. Linczuk. – [S. l.], 2019. – Access mode : https://doi.org/10.1117/12.2536607 (date of access: 25.04.2024).	en
dc.identifier.doi	https://doi.org/10.31649/2307-5376-2024-1-7-18
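
The abstract notes that the cost of self-attention grows quadratically with the input sequence length, and that several of the surveyed works restrict attention to local windows. The sketch below only counts query-key pairs to illustrate that scaling argument; it is not taken from the article. It assumes each spatial position is one token and that windows are non-overlapping squares. The function names and the window size of 8 are hypothetical choices made for illustration.

```python
def global_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when every token attends to every other token."""
    tokens = height * width
    return tokens * tokens


def window_attention_pairs(height: int, width: int, window: int) -> int:
    """Query-key pairs when attention is limited to non-overlapping
    window x window blocks (an assumed, simplified windowing scheme)."""
    if height % window or width % window:
        raise ValueError("height and width must be multiples of the window size")
    num_windows = (height // window) * (width // window)
    tokens_per_window = window * window
    return num_windows * tokens_per_window * tokens_per_window


if __name__ == "__main__":
    # Hypothetical feature-map sizes; window=8 is an illustrative value.
    for side in (64, 128, 256):
        g = global_attention_pairs(side, side)
        w = window_attention_pairs(side, side, window=8)
        print(f"{side}x{side}: global={g:,} windowed={w:,} ratio={g // w}")
```

Under these assumptions, doubling the side of the feature map multiplies the global count by 16 but the windowed count by only 4, since the windowed count grows linearly with the number of tokens for a fixed window size.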
