More Data, More Problems? Exploring the Impacts of Using External Datasets when Training Deep Learning Models for Remote Sensing Applications

Hammas Attila

doi:10.32567/hm.2025.3.4

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Abstract

The resurgence of large-scale conventional warfare, exemplified by the ongoing conflict in Ukraine, has highlighted the critical importance of modern technologies in enhancing situational awareness and target acquisition on the battlefield. In particular, the integration of unmanned aerial vehicles (UAVs) and artificial intelligence (AI)-driven computer vision systems has emerged as a key enabler of real-time intelligence and precision engagement. This paper presents an approach to improving object detection models on aerial imagery for military applications. Initial experiments revealed shortcomings in detecting certain object classes, particularly in complex environments and under variable lighting conditions. To address these issues, the paper investigates the impact of cross-dataset training on the robustness and accuracy of object detection models. Through selective label integration and careful dataset curation, the paper demonstrates that incorporating assets from external sources significantly enhances generalisation and detection performance. The results underline the potential of leveraging large-scale annotated datasets to augment domain-specific applications with minimal additional labelling cost.
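As an illustration only, selective label integration can be pictured as filtering an external dataset's annotations down to the classes that have a counterpart in the target dataset and remapping their class ids. The abstract does not state the annotation format or the class mapping used in the paper, so the sketch below assumes plain-text labels of the form `class x_center y_center width height`, and the mapping `EXTERNAL_TO_TARGET` and the function `integrate_labels` are hypothetical names introduced here.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from an external dataset's class ids to the target
# dataset's class ids. External classes that are not listed are dropped.
EXTERNAL_TO_TARGET: Dict[int, int] = {
    3: 0,  # e.g. external "car" -> target "vehicle" (assumed classes)
    5: 0,  # e.g. external "truck" -> target "vehicle" (assumed classes)
    0: 1,  # e.g. external "pedestrian" -> target "person" (assumed classes)
}


def integrate_labels(lines: Iterable[str], class_map: Dict[int, int]) -> List[str]:
    """Keep annotations whose class appears in class_map and remap the class id.

    Each input line is assumed to hold five whitespace-separated fields:
    class id followed by four bounding-box values.
    """
    kept: List[str] = []
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        external_id = int(parts[0])
        if external_id not in class_map:
            continue  # class has no counterpart in the target dataset
        kept.append(" ".join([str(class_map[external_id])] + parts[1:]))
    return kept


if __name__ == "__main__":
    sample = [
        "3 0.5 0.5 0.1 0.1",
        "7 0.2 0.2 0.1 0.1",
        "0 0.1 0.1 0.05 0.05",
    ]
    for row in integrate_labels(sample, EXTERNAL_TO_TARGET):
        print(row)
```

Dropping unmapped classes, rather than assigning them to a catch-all class, is one possible design choice; it keeps the target label set unchanged, which may or may not match the curation procedure used in the paper.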

Keywords:

Intelligence, Surveillance, and Reconnaissance (ISR); IMINT; computer vision
