This PDF file contains the front matter associated with SPIE Proceedings Volume 13039, including the Title Page, Copyright information, Table of Contents, and Conference Committee information.
Machine Learning for Automatic Target Recognition I: Joint Session with Conferences 13036 and 13039
This conference presentation was prepared for SPIE Defense + Commercial Sensing, 2024
Automatic Target Recognition (ATR) often confronts intricate visual scenes, necessitating models capable of discerning subtle nuances. Real-world datasets like the Defense Systems Information Analysis Center (DSIAC) ATR database exhibit unimodal characteristics, hindering performance, and lack contextual information for each frame. To address these limitations, we enrich the DSIAC dataset by algorithmically generating captions and proposing new train/test splits, thereby creating a rich multimodal training landscape. To effectively leverage these captions, we explore the integration of a vision-language model, specifically Contrastive Language-Image Pre-training (CLIP), which combines visual perception with linguistic descriptors. At the core of our methodology lies a homotopy-based multi-objective optimization technique, designed to achieve a harmonious balance between model precision, generalizability, and interpretability. Our framework, developed using PyTorch Lightning and Ray Tune for advanced distributed hyperparameter optimization, enhances models to meet the intricate demands of practical ATR applications. All code and data are available at https://github.com/sabraha2/ATR-CLIP-Multi-Objective-Homotopy-Optimization.
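For orientation, a minimal sketch of the homotopy idea described above: a continuation parameter t is swept from 0 to 1 during training, continuously deforming a single-objective loss into a blended multi-objective one. The function name, the linear schedule, and the two-objective form are our assumptions for illustration, not the authors' exact formulation.

def homotopy_loss(task_loss, aux_loss, step, total_steps):
    # t sweeps from 0 to 1 over training, deforming the objective from the
    # pure task loss (e.g., CLIP contrastive accuracy) toward a blend that
    # also weights secondary objectives such as interpretability.
    t = min(step / total_steps, 1.0)
    return (1.0 - t) * task_loss + t * aux_loss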
We propose a new method of utilizing a spatial light modulator to generate adversarial examples against image classifiers in a black-box scenario. The method incorporates a simple-shape-focused strategy that queries the target network and estimates the effect of perturbing specific regions of the Fourier plane. This work extends previous work that uses a spatial light modulator to perturb the phase of incoming light, generating adversarial patterns via l2-norm optimization. Our new method uses only the final logits of the target network, allowing it to be used not only in “white box” scenarios but also in information-constrained “black box” scenarios. Our shape-based algorithm is shown to be widely effective on the original dataset benchmark without requiring knowledge of the target network architecture. Our experiments explore how manipulating the size, shape, number, and magnitude of the regions tested affects the efficacy and the number of pattern cycles needed to generate a successful attack. Different combinations showed average efficacies ranging from 32% to 63% under a consistent objective function. Our new method also proved effective on a smaller dataset (meaning fewer classes toward which classification can be misdirected). We validate our method using a physical setup.
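A toy sketch of the query loop implied by the abstract: candidate Fourier-plane regions are perturbed one at a time, the target network's final logits are queried, and perturbations that lower the true-class margin are kept. Here model_logits and apply_phase_mask are hypothetical stand-ins for the classifier and the spatial-light-modulator optics, and the greedy acceptance rule is our simplification.

import numpy as np

def margin(logits, true_label):
    # Attack objective: drive the true-class logit below the best other class.
    others = np.delete(logits, true_label)
    return logits[true_label] - others.max()

def greedy_region_attack(image, true_label, regions, model_logits, apply_phase_mask):
    mask = np.zeros_like(image, dtype=float)            # Fourier-plane phase mask
    best = margin(model_logits(apply_phase_mask(image, mask)), true_label)
    for region in regions:                              # simple shapes: disks, bars, ...
        trial = mask.copy()
        trial[region] += np.pi                          # perturb phase over this region
        m = margin(model_logits(apply_phase_mask(image, trial)), true_label)
        if m < best:                                    # keep perturbations that help the attack
            mask, best = trial, m
        if best < 0:                                    # margin negative: misclassified
            break
    return mask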
This work provides a detailed survey of progress in dictionary-learning methods for Automatic Target Recognition (ATR) systems, emphasizing the importance of these techniques in general, and their role in improving accuracy and efficiency in particular, for the tasks of identifying and classifying targets. Using an approach that combines literature review and bibliometric analysis, we unravel the history of dictionary learning in ATR, pointing out its interaction with machine learning algorithms, sparse representation, and radar imaging technologies. The core themes and innovations that shape the field are identified in an analysis aided by VOSviewer, with the integration of radar imaging and machine learning emerging as vital for developing efficient target recognition strategies. This study reveals a major trend of utilizing advanced computational models to deal with the complexities of modern surveillance and reconnaissance missions, which helps increase operational efficiency in military as well as civilian applications. This review not only provides an overview of the current state of ATR research but also identifies potential lines of development, highlighting the crucial role of continuous improvement of computational algorithms and the interrelation between signal processing and machine learning in realizing unparalleled accuracy and efficiency in ATR systems.
Electro-Optical and Infrared Detection and Tracking I
Modern computer vision systems typically ingest full-resolution, uncompressed imagery from high-definition cameras, then process these images using deep convolutional neural networks (CNNs). These CNNs typically run on high Size, Weight, and Power (SWAP) GPU systems. Reducing the SWAP requirements of these systems is an active area of research. CNNs can be customized and compiled to lower-precision versions, “pruned” to lower complexity, or compiled to run on FPGAs to reduce power consumption.
Advances in camera design have resulted in next-generation “event-based” imaging sensors. These sensors provide very high temporal resolution in individual pixels but only pick up changes in the scene. This enables interesting new capabilities, such as low-power computer-vision bullet tracking and hostile-fire detection, and their low power consumption makes them attractive for edge systems.
However, computer vision algorithms require massive amounts of data for system training and development, and these data collections are expensive and time-consuming; it is unlikely that future event-based computer vision development efforts will want to re-collect the volumes of data already captured and curated. Therefore, it is of interest to explore whether and how current data can be modified to simulate event-based images for training and evaluation. In this work, we present results from training and testing CNN architectures on both simulated and real event-based imaging sensor systems.
Relative performance comparisons as a function of various simulated event-based sensing parameters are presented and comparisons between approaches are provided.
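As a rough indication of how conventional video can be converted into simulated event data, the sketch below applies the common log-intensity change-threshold model of an event pixel. This is a generic simplification for illustration; the paper's actual simulation parameters and fidelity are not reflected here.

import numpy as np

def frames_to_events(frames, threshold=0.2, eps=1e-6):
    # frames: (T, H, W) array of intensities; returns (T-1, H, W) maps of
    # +1 (ON), -1 (OFF), and 0 (no event) per pixel per step.
    log_prev = np.log(frames[0] + eps)
    events = []
    for frame in frames[1:]:
        log_curr = np.log(frame + eps)
        diff = log_curr - log_prev
        on = diff > threshold                 # brightness rose enough
        off = diff < -threshold               # brightness fell enough
        events.append(on.astype(np.int8) - off.astype(np.int8))
        fired = on | off
        # Only pixels that fired update their reference level, mimicking
        # the per-pixel memory of an event camera.
        log_prev = np.where(fired, log_curr, log_prev)
    return np.stack(events)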
In this study, we present a real-time vehicle detection program that combines the You-Only-Look-Once-X (YOLOX) object detection algorithm with a multi-object Kalman filter tracker, specifically designed for analyzing 3D light detection and ranging (LiDAR) data. The use of an active imager, such as LiDAR, offers significant advantages over conventional passive 2D imagery. By providing its own illumination source, LiDAR eliminates color fluctuations caused by shadowing or diurnal cycling, resulting in improved precision and accuracy for object detection and classification. Our approach involves capturing videos of 8 vehicles using an Advanced Scientific Concepts TigerCub 3D Flash LiDAR camera, which provides intensity and range data sequences. These sequences are then converted into representative color images, which are used to train the YOLOX object detector neural network.
To further enhance detection accuracy for obscured vehicles and minimize the rate of incorrect detections, we integrate Kalman filter trackers into the detection algorithm. These trackers identify the vehicles and predict their future locations, effectively reducing both false positive and false negative detections. The resulting algorithm is lightweight and capable of producing highly accurate inference results in near real-time on a live stream of LiDAR data.
To demonstrate the applicability of our approach on small, unmanned vehicles/drones, we deploy the application on NVIDIA's Jetson Orin Nano embedded processor for AI. By optimizing the code using TensorRT for real-time performance, we achieve object detection and classification of flash LiDAR data at an average precision exceeding 95% and a rate of 60 frames-per-second. MATLAB plays a crucial role in enabling rapid prototyping and algorithm testing, facilitating the smooth transfer and deployment of the complex deep learning logic to an edge device without compromising performance or accuracy.
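A bare-bones constant-velocity Kalman filter of the kind used here to predict vehicle locations between frames and gate new detections. Noise levels and the 60 Hz time step are illustrative assumptions, not the authors' tuned values.

import numpy as np

class KalmanTrack:
    def __init__(self, x, y, dt=1/60, q=1.0, r=5.0):
        self.s = np.array([x, y, 0.0, 0.0])               # state: [x, y, vx, vy]
        self.P = np.eye(4) * 100.0                        # state covariance
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
        self.Q = np.eye(4) * q                            # process noise
        self.R = np.eye(2) * r                            # measurement noise

    def predict(self):
        self.s = self.F @ self.s
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.s[:2]                                 # predicted (x, y)

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.s        # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)          # Kalman gain
        self.s = self.s + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P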
Visually detecting camouflaged objects is a hard problem for both humans and computer vision algorithms. Strong similarities between object and background appearance make the task significantly more challenging than traditional object detection or segmentation tasks. Current state-of-the-art models use either convolutional neural networks or vision transformers as feature extractors. They are trained in a fully supervised manner and thus need a large amount of labeled training data. In this paper, both self-supervised and frugal learning methods are introduced to the task of Camouflaged Object Detection (COD). The overall goal is to fine-tune two COD reference methods, namely SINet-V2 and HitNet, pre-trained for camouflaged animal detection, to the task of camouflaged human detection. To this end, we use the public CPD1K dataset, which contains camouflaged humans in a forest environment. We create a strong baseline using supervised frugal transfer learning for the fine-tuning task. Then, we analyze three pseudo-labeling approaches to perform the fine-tuning task in a self-supervised manner. Our experiments show that we achieve similar performance with pure self-supervision compared to fully supervised frugal learning.
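One plausible shape for the pseudo-labeling approaches mentioned: the pre-trained COD model labels unlabeled frames, and only its confident pixels are used as training targets. The confidence threshold and per-pixel loss are our assumptions; the paper analyzes three specific variants not detailed here.

import torch
import torch.nn.functional as F

def pseudo_label_step(model, optimizer, images, conf_thresh=0.9):
    with torch.no_grad():
        probs = torch.sigmoid(model(images))             # soft masks in [0, 1]
    confident = (probs > conf_thresh) | (probs < 1 - conf_thresh)
    pseudo = (probs > 0.5).float()                       # hard pseudo-masks
    logits = model(images)
    # Fine-tune only on pixels the model already labels confidently.
    loss = F.binary_cross_entropy_with_logits(logits[confident], pseudo[confident])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()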
Radar Frequency and Synthetic Aperture Radar Automatic Target Recognition I
Radar target recognition with Random Forests (RF) using stepped-frequency radar features is the focus of this paper. Recent comparative studies between RF and convolutional neural networks (CNN) showed that RF yields reliable, robust target recognition results with relatively fast training and testing times. The appeal of RF is that they can be implemented in parallel and have far fewer tunable parameters than CNN. In addition to providing measures of variable significance and permitting differential class weighting, RF can help with imputation of missing data [1]. These properties make RF a good alternative for target recognition, especially in scenarios where the data is occluded or corrupted with extraneous scatterers, or when the target signature at certain azimuth positions (or aspect angles) changes drastically compared to other likely positions. This paper uses real radar data of commercial aircraft models recorded in a compact range. The results show that RF offers a fast and reliable alternative for target recognition systems, especially under realistic radar operating conditions.
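The RF properties cited above map directly onto standard library options; a minimal, hedged example with scikit-learn, where the feature vectors are random placeholders for the stepped-frequency features described:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(200, 64)            # placeholder stepped-frequency feature vectors
y = np.random.randint(0, 4, size=200)  # placeholder aircraft-class labels

clf = RandomForestClassifier(
    n_estimators=500,                  # trees are independent, so training parallelizes
    n_jobs=-1,                         # train trees across all cores
    class_weight="balanced",           # differential class weighting
    oob_score=True,                    # out-of-bag estimate of generalization
)
clf.fit(X, y)
print(clf.oob_score_)                  # clf.feature_importances_ gives variable significance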
Reliable computer vision object classification is important for security applications that make high-stakes decisions based on automated algorithms. In real-world scenarios, it is often impractical to meet the implicit assumption that all relevant, labelled data can be obtained prior to training. To avoid performance degradation, a recently developed open-set detection framework is applied to the classification of ships versus clutter in satellite Electro-Optical (EO) imagery and is shown to reliably identify data that is out of distribution from the training data. A Binary Classifier (BC) and a Category-aware Binary Classifier (CBC) model were compared to OpenMax and found to provide improvements in identifying unknown imagery. This enables an operator to know whether to believe classification results from a deep learning-based algorithm.
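Schematically, open-set frameworks of this kind attach a known-vs-unknown score to each input and only trust the closed-set label when that score clears a threshold. The sketch below shows the gating step only; the BC/CBC models that produce the score are not reproduced here.

import torch

def open_set_predict(logits, known_scores, reject_thresh=0.5):
    # logits: (N, C) closed-set class scores; known_scores: (N,) in [0, 1].
    pred = logits.argmax(dim=1)
    pred[known_scores < reject_thresh] = -1   # -1 flags out-of-distribution inputs
    return pred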
This paper describes a comprehensive computational imaging field trial conducted in Meppen, Germany, aimed at assessing the performance of cutting-edge computational imaging systems (compressive hyperspectral, visible/shortwave infrared single-pixel, wide-area infrared, neuromorphic, high-speed, and photon-counting cameras, among others) by the members of NATO SET-RTG-310. The trial encompassed a diverse set of targets, including dismounts equipped with various two-handed objects and wearing a range of camouflage patterns, as well as fixed- and rotary-wing Unmanned Aerial System (UAS) targets. These targets covered the entire spectrum of spatial, temporal, and spectral signatures, forming a comprehensive trade space for performance evaluation of each system.
The trial, which serves as the foundation for subsequent data analysis, encompassed a multitude of scenarios designed to challenge the limits of computational imaging technologies. The diverse set of targets, each with its unique set of challenges, allows for the examination of system performance across various environmental and operational conditions.
Electro-Optical and Infrared Detection and Tracking II
Multi-agent hybrid dynamical systems are a natural model for collaborative missions in which several steps and behaviors are required to achieve the goal of the mission. Missions are tasks featuring interacting subtasks, such as the decision of where to search, how to search, and when to transition from a search behavior to a rescue behavior. While the discrete nature of mission actions (which subtask to accomplish) and the continuous nature of real-world physical state spaces make hybrid systems a good model, control in such systems is poorly understood. Theoretical results on state reachability rely on restrictive assumptions which hinder formal verification and optimization of such systems. Despite this, we find the formalism to have significant value and develop hierarchical state estimation tools to control agents in a hybrid framework and execute missions. In past work, we developed hierarchical dynamic target modeling to estimate the progress of search and track scenarios with moving targets. In this work, we consider the related problem of searching for stationary targets that appear in formation. While this may seem easier than searching for moving targets (e.g. because a preplanned search is guaranteed to find all targets), executing the search efficiently and gaining situational awareness while doing so presents unique challenges. We develop a generative hierarchical model for target locations that relies on stochastic clustering techniques and ideas from object Simultaneous Localization and Mapping (SLAM) to address these challenges and demonstrate their efficacy in single- and multi-agent scenarios.
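A minimal generative sketch of hierarchically clustered stationary targets, in the spirit of the model described: formation centers are drawn first, then targets are scattered around each center. The specific distributions and parameters are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def sample_formations(n_formations, area=1000.0, mean_size=4.0, spread=25.0):
    centers = rng.uniform(0, area, size=(n_formations, 2))   # formation centers
    targets = []
    for c in centers:
        k = rng.poisson(mean_size) + 1                       # targets per formation
        targets.append(c + rng.normal(0, spread, size=(k, 2)))
    return centers, np.vstack(targets)

centers, targets = sample_formations(n_formations=3)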
Low-resolution image object recognition and tracking is often required for battlefield reconnaissance. High-cost military tracking systems can use standard signal processing techniques; however, low-cost systems require simpler approaches. We developed a fast, detector-agnostic tracker for improving situational awareness using electro-optical video data. Our approach uses computationally inexpensive techniques such as YOLO, matched filters, and shape transforms to segment objects of interest in an image. From two or more successive detections, we initialize an alpha-beta filter that predicts the location of the target of interest in the image. Next, we restrict segmentation of subsequent frames to a search area around the predicted region. This increases the sensitivity of the detector by improving the average signal-to-noise ratio, and it also decreases the false alarm rate. The reduction in the size of the processing area can improve the detection speed per frame by an order of magnitude relative to a full-sized frame. By using detectors such as YOLO that handle variable-size objects, this algorithm can be adapted to track virtually any object captured in a video.
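The alpha-beta filter at the core of the tracker admits a very compact sketch: predict the target's next position from its velocity, then correct both with the residual against the new detection. The gains here are generic illustrative values, applied independently per image axis.

def alpha_beta_step(x, v, z, dt=1.0, alpha=0.85, beta=0.005):
    x_pred = x + dt * v            # predicted position (center of the search area)
    r = z - x_pred                 # residual against the new detection z
    x_new = x_pred + alpha * r     # corrected position
    v_new = v + (beta / dt) * r    # corrected velocity
    return x_new, v_new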
Machine Learning for Automatic Target Recognition II
Many sensors produce data that rarely, if ever, is viewed by a human, and yet sensors are often designed to maximize subjective image quality. For sensors whose data is intended for embedded exploitation, maximizing the subjective image quality to a human will generally decrease the performance of downstream exploitation. In recent years, computational imaging researchers have developed end-to-end learning methods that co-optimize the sensing hardware with downstream exploitation via end-to-end machine learning. This talk will describe two such approaches at Kitware. In the first, we use an end-to-end ML approach to design a multispectral sensor that’s optimized for scene segmentation and, in the second, we optimize post-capture super-resolution in order to improve the performance of airplane detection in overhead imagery.
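Conceptually, such co-design treats the sensor as the first differentiable layer of the network. A hedged sketch, entirely our illustration rather than Kitware's design: a learnable spectral mixing layer stands in for the multispectral sensor and is trained jointly with a downstream task head.

import torch
import torch.nn as nn

class LearnedSpectralSensor(nn.Module):
    def __init__(self, n_scene_bands=100, n_sensor_channels=4):
        super().__init__()
        # Each sensor channel is a learned mixture of scene spectral bands.
        self.mix = nn.Parameter(torch.rand(n_sensor_channels, n_scene_bands))

    def forward(self, cube):                       # cube: (N, bands, H, W)
        w = torch.softmax(self.mix, dim=1)         # normalized spectral responses
        return torch.einsum("cb,nbhw->nchw", w, cube)

sensor = LearnedSpectralSensor()
task_head = nn.Conv2d(4, 21, kernel_size=1)        # stand-in segmentation head
# Backpropagating the task loss through both modules optimizes the simulated
# sensor's spectral responses for the downstream exploitation task.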
Remote sensing and vision problems, such as object detection and recognition from various active and passive sensors, are of great value to many DoD use cases. Usually, due to sensor or communication-link limitations, the received images are of low resolution and quality and contain compression artifacts. To combat this, we developed a new approach that directly recovers the vision-task feature pyramid using joint frequency- and pixel-domain neural learning. It has had many successes in problems such as ATR from low-resolution SAR and EO images, joint deblurring and target detection, and very-low-bit-rate complex SAR image compression for phase recovery.
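One plausible reading of the joint frequency- and pixel-domain idea is a restoration loss with terms in both domains; the sketch below is that reading, not the authors' architecture.

import torch

def joint_domain_loss(pred, target, w_freq=0.1):
    pixel = torch.mean(torch.abs(pred - target))              # pixel-domain L1
    freq = torch.mean(torch.abs(torch.fft.fft2(pred)
                                - torch.fft.fft2(target)))    # frequency-domain L1
    return pixel + w_freq * freq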
The field of quantum computing, especially quantum machine learning (QML), has been the subject of much research in recent years. Leveraging the quantum properties of superposition and entanglement promises exponential decreases in computation cost. With the promise of increased speed and accuracy in the quantum paradigm, many classical machine learning algorithms have been adapted to run on quantum computers, typically using a quantum-classical hybrid model. While some work has been done to compare classical and quantum classification algorithms in the Electro-Optical (EO) image domain, this paper compares the performance of classical and quantum-hybrid classification algorithms in their application to Synthetic Aperture Radar (SAR) data using the MSTAR dataset. We find that there is no significant difference in classification performance when training with quantum algorithms in ideal simulators as compared to their classical counterparts. However, the true performance benefits will become more apparent as the hardware matures.
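For readers unfamiliar with the hybrid pattern, here is a generic variational-circuit classifier layer in PennyLane running on an ideal simulator, matching the paper's experimental setting in spirit; the ansatz and sizes are generic choices, not the authors' exact model.

import pennylane as qml
from pennylane import numpy as np

n_qubits, n_layers = 4, 2
dev = qml.device("default.qubit", wires=n_qubits)      # ideal (noiseless) simulator

@qml.qnode(dev)
def circuit(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(n_qubits))            # encode features
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits)) # trainable ansatz
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

shape = qml.StronglyEntanglingLayers.shape(n_layers=n_layers, n_wires=n_qubits)
weights = np.random.random(shape, requires_grad=True)
features = np.array([0.1, 0.5, 0.9, 0.3])
print(circuit(features, weights))   # expectation values feed a classical head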
Radar Frequency and Synthetic Aperture Radar Automatic Target Recognition II
This paper summarizes our work in alleviating the vulnerability of neural networks for Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) to adversarial perturbations. We propose an approach of robust SAR image classification that integrates Bayesian Neural Networks (BNNs) to harness epistemic uncertainty for distinguishing between clean and adversarially manipulated SAR images. Additionally, we introduce a visual explanation method that employs a probabilistic variant of Guided Backpropagation (GBP) specifically adapted for BNNs. This method generates saliency maps highlighting critical pixels, thereby aiding human decision-makers in identifying adversarial scatterers within SAR imagery. Our experiments demonstrate the effectiveness of our approach in maintaining high True Positive Rates (TPR) while limiting False Positive Rates (FPR), and in accurately identifying adversarial scatterers, showcasing our method’s potential to enhance the reliability and interpretability of SAR ATR systems in the face of adversarial threats. Details of our method and experiments can be found in Ref. 1.
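Schematically, the epistemic-uncertainty screen works as below: sample the Bayesian network several times and flag inputs whose predictions disagree across draws. The variance statistic and threshold are illustrative simplifications of the paper's method (see Ref. 1 for the actual formulation).

import torch

def epistemic_flag(bnn, x, n_samples=30, thresh=0.15):
    with torch.no_grad():
        # Each forward pass draws a new set of weights from the posterior.
        probs = torch.stack([torch.softmax(bnn(x), dim=1)
                             for _ in range(n_samples)])
    mean = probs.mean(dim=0)                        # predictive distribution
    epistemic = probs.var(dim=0).sum(dim=1)         # disagreement across draws
    return mean.argmax(dim=1), epistemic > thresh   # label + adversarial flag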
We compare the effectiveness of using a trained-from-scratch, unsupervised deep generative Variational Autoencoder (VAE) model as a solution to generic representation learning problems for Synthetic Aperture Radar (SAR) data, as compared to the more common approach of using an Electro-Optical (EO) transfer learning method. We find that a simple, unsupervised VAE training framework outperforms an EO transfer learning model at classification.
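A compact VAE of the kind compared, written here for single-channel chips scaled to [0, 1]; layer sizes are illustrative. After unsupervised training, the encoder mean serves as the learned representation for a downstream classifier.

import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, d_in=64 * 64, d_latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(d_in, 512), nn.ReLU())
        self.mu = nn.Linear(512, d_latent)
        self.logvar = nn.Linear(512, d_latent)
        self.dec = nn.Sequential(nn.Linear(d_latent, 512), nn.ReLU(),
                                 nn.Linear(512, d_in), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    rec = nn.functional.binary_cross_entropy(recon, x.flatten(1), reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())   # KL to N(0, I)
    return rec + kld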
The lack of large, relevant, and labeled datasets for synthetic aperture radar (SAR) automatic target recognition (ATR) poses a challenge for deep neural network approaches. In the case of SAR ATR, transfer learning offers promise: models are pre-trained on synthetic SAR, alternatively collected SAR, or non-SAR source data and then fine-tuned on a smaller target SAR dataset. The idea is that the neural network can learn fundamental features from the more abundant source domain, resulting in accurate and robust models when fine-tuned on a smaller target domain. One open question with this transfer learning strategy is how to choose source datasets that will improve the accuracy on a target SAR dataset when the model is fine-tuned. Here, we apply a set of model and dataset transferability analysis techniques to investigate the efficacy of transfer learning for SAR ATR. In particular, we examine Optimal Transport Dataset Distance (OTDD), Log Maximum Evidence (LogME), Log Expected Empirical Prediction (LEEP), Gaussian Bhattacharyya Coefficient (GBC), and H-Score. These methods consider properties such as task relatedness, statistical properties of learned embeddings, and distribution distances between the source and target domains. We apply these transferability metrics to ResNet18 models trained on a set of non-SAR as well as SAR datasets. Overall, we present an investigation into quantitatively analyzing transferability for SAR ATR.
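Of the metrics listed, LEEP has a particularly simple closed form and is sketched below from its published definition: the source model's soft predictions on target data induce a "dummy" label-transfer classifier whose average log-likelihood scores transferability, with no fine-tuning required. Variable names are ours.

import numpy as np

def leep(source_probs, target_labels):
    # source_probs: (n, Z) softmax outputs over source labels z.
    # target_labels: (n,) integer target labels y.
    n, _ = source_probs.shape
    n_target = target_labels.max() + 1
    joint = np.zeros((n_target, source_probs.shape[1]))
    for theta, y in zip(source_probs, target_labels):
        joint[y] += theta / n                            # empirical P(y, z)
    cond = joint / joint.sum(axis=0, keepdims=True)      # P(y | z)
    # Average log-likelihood of the dummy classifier sum_z P(y|z) * theta_z.
    return np.mean([np.log(cond[y] @ theta)
                    for theta, y in zip(source_probs, target_labels)])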
Deep neural networks for automatic target recognition (ATR) have been shown to be highly successful on a large variety of Synthetic Aperture Radar (SAR) benchmark datasets. However, the black-box nature of neural network approaches raises concerns about how models come to their decisions, especially in high-stakes scenarios. Accordingly, a variety of techniques are being pursued that seek to offer understanding of machine learning algorithms. In this paper, we first provide an overview of explainability and interpretability techniques, introducing their concepts and the insights they produce. Next, we summarize several methods for computing specific approaches to explainability and interpretability as well as analyzing their outputs. Finally, we demonstrate the application of several attribution map methods, applying both attribution analysis metrics and localization interpretability analysis to six neural network models trained on the Synthetic and Measured Paired Labeled Experiment (SAMPLE) dataset to illustrate the insights these methods offer for analyzing SAR ATR performance.
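As a concrete anchor for the attribution-map family discussed, the simplest member, a vanilla gradient saliency map, fits in a few lines; the paper's specific attribution methods and metrics are richer than this.

import torch

def saliency_map(model, x, target_class):
    # x: (1, C, H, W) input chip; returns an (H, W) attribution map.
    x = x.clone().requires_grad_(True)
    score = model(x)[0, target_class]      # logit of the class of interest
    score.backward()
    # Per-pixel attribution: gradient magnitude, collapsed over channels.
    return x.grad.abs().max(dim=1)[0].squeeze(0)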
Artificial Intelligence and Deep Learning: Joint Session with Conferences 13039 and 13046
We present a streamlined pipeline that generates a YOLO object detection application using MATLAB software and NVIDIA® hardware. The application utilizes MATLAB GPU Coder and NVIDIA® TensorRT™ to accelerate inferencing on NVIDIA processors, specifically the latest Jetson Orin™ embedded processor. We evaluated the object detector on the open U.S. Army Automated Target Recognition (ATR) Development Image Dataset (ADID) for multi-class vehicle detection and classification. Overall, this workflow decreases development time over traditional approaches and provides a quick route to low-code deployment on the latest NVIDIA Jetson Orin. This work offers value to researchers and practitioners in many application areas aiming to harness the power of NVIDIA processors for rapid, efficient object detection solutions.
Automatic target recognition (ATR) algorithms that rely on machine learning approaches are limited by the quality of the training dataset and by out-of-domain performance. The performance of a two-step ATR algorithm (ATR-EnvI) that relies on fusing thermal imagery with environmental data is investigated using thermal imagery containing buried and surface objects collected in New Hampshire, Mississippi, Arizona, and Panama. An autoencoder neural network is used to encode the salient environmental conditions for a given climatic condition into an environmental feature vector. The environmental feature vector allows for the inclusion of environmental data with varying dimensions and robustly treats missing data. Using this architecture, we evaluate the performance of the two-step ATR on a test dataset collected in an unseen climatic condition, e.g., a tropical wet climate, when the training dataset contains imagery collected in similar conditions, e.g., subtropical, and in dissimilar climates. Lastly, we evaluate the impact that including physics-based synthetic training data has on performance for out-of-domain climates.
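The varying-dimension and missing-data handling suggests a masking scheme like the one sketched here: absent measurements are zeroed and an availability mask is concatenated so the encoder sees absence explicitly. Sizes, layers, and the masking choice are our assumptions about the described autoencoder.

import torch
import torch.nn as nn

class EnvEncoder(nn.Module):
    def __init__(self, n_env=16, d_feat=8):
        super().__init__()
        # Input: environmental measurements plus a 0/1 availability mask.
        self.net = nn.Sequential(nn.Linear(2 * n_env, 64), nn.ReLU(),
                                 nn.Linear(64, d_feat))

    def forward(self, env, mask):
        env = torch.where(mask.bool(), env, torch.zeros_like(env))  # zero missing
        return self.net(torch.cat([env, mask], dim=-1))  # environmental feature vector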
Traditional data collects of high-priority targets require immense planning and resources. When novel operating conditions (OCs) or imaging parameters need to be explored, synthetic simulations are typically leveraged. While synthetic data can be used to assess automatic target recognition (ATR) algorithms, some simulation environments may inaccurately represent sensor phenomenology. To alleviate this issue, a scale-model approach is utilized to provide accurate data in a laboratory setting. This work demonstrates the effectiveness of a resource-cognizant approach for collecting IR imagery suitable for assessing ATR algorithms. A target of interest is 3D printed at 1/60th scale with a commercial printer and readily available materials. The printed models are imaged with a commercially available IR camera in a simple laboratory setup. The collected imagery is used to test ATR algorithms trained on a standard IR ATR dataset: the publicly available ARL Comanche FLIR dataset. The performance of the selected ATR algorithms when given samples of scale-model data is compared to the performance of the same algorithms on the provided measured data.
With the proliferation of space-based optical systems and corresponding increased volume of overhead imagery data, there is a growing need for automatic target recognition (ATR) algorithms that both effectively identify objects of interest and remove the burden of analysis from a finite number of human ground operators. Although recent state-of-the-art (SOTA) deep learning architectures like Convolutional Neural Networks (CNNs) and Detection Transformers (DETRs) have shown great performance in accomplishing these tasks, this performance can degrade substantially when operating on data containing unexpected anomalies outside their original training sets. Likewise, the increased amount of automation in these processes magnifies the negative impact that an ATR algorithm can cause prior to a human analyst recognizing any problem in the data downstream. Space-based optical systems rely on accurate calibration to create clean imagery, and as such, these ATR algorithms are subject to sensor calibration artifacts. Previous work has characterized common calibration artifacts such as sensor noise and failed detectors as they affect the performance of an Inceptionv1-based ATR algorithm. This paper extends this analysis to multiple CNN and transformer-based object detection architectures to characterize differences in performance degradation across various SOTA ATR algorithms. Notably, we found the RT-DETR architecture is more robust to uniformly distributed random scaling factors and random pixel failures than YOLOv8, YOLOv9, and Faster-RCNN, particularly when detecting large objects like container ships and tankers. These results are useful in summarizing the expected performance impact of common calibration artifacts on ATR algorithms as well as informing algorithm selection when designing systems that leverage overhead imagery.
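The two artifact types highlighted in the results, uniformly distributed random gain errors and random pixel failures, are straightforward to inject for robustness testing; the ranges below are placeholders rather than the study's parameters.

import numpy as np

def inject_artifacts(img, gain_spread=0.2, fail_rate=0.001, rng=None):
    rng = rng or np.random.default_rng()
    gains = rng.uniform(1 - gain_spread, 1 + gain_spread, img.shape)
    out = img * gains                          # miscalibrated per-pixel gains
    dead = rng.random(img.shape) < fail_rate   # randomly failed detectors
    out[dead] = 0.0                            # stuck-at-zero pixels
    return out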