The Operation and Training of Large Language Models and a Strategic Analysis of Their Application, Part 2

doi: 10.32561/nsz.2025.2.3

Abstract

The complexity of large language models makes them a difficult subject to understand and explain, but this two-part study attempts to present their basic principles, and the phenomena they give rise to, in a clear and accessible way. The second part explains the training cycle of large language models, the types of datasets on which training is based, the methodology and characteristics of pre-training and fine-tuning, and common applications. Furthermore, to provide a comprehensive picture of the advantages and disadvantages of large language models, a detailed SWOT analysis evaluates the internal and external factors that may affect an organisation's information processing procedures. The study is recommended for readers interested in national security information processing who seek comprehensive knowledge of large language models and wish to apply it in their own research.

Keywords:

large language model, artificial intelligence, ChatGPT, machine learning

