Curriculum-heavy reinforcement learning for multi-domain operations

Nicholas Pioch; Lucas Sheldon; Thomas Harris; Matt Henry; Andrew Spisak; Mikayla Timm

doi:10.1117/12.2663299

12 June 2023

Proceedings Volume 12538, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications V; 1253819 (2023) https://doi.org/10.1117/12.2663299
Event: SPIE Defense + Commercial Sensing, 2023, Orlando, Florida, United States

Abstract

We present Curriculum-Heavy Accelerated Learning in a Competitive Environment (CHALICE), a framework for developing software agents via Machine Learning (ML). CHALICE is designed to train and deploy intelligent agents capable of executing strategies for air-ground combat as embodied in AFRL’s MIST turn-based wargame system. Such agents can be used to suggest courses of action to operational planners in real time and to provide an adversarial opponent for evaluating proposed courses of action. CHALICE uses state-of-the-art Deep Neural Networks (DNNs) to represent the state of the environment and Deep Reinforcement Learning (DRL) to train each agent via repeated feedback from outcomes of the MIST Stratagem game. Unlike recent DRL approaches for strategy games such as Go or StarCraft [1] [2], CHALICE minimizes dependence on existing corpora of human gameplay and trains efficiently with low computational resources and short convergence time (hours to days rather than weeks to months). Over the course of four government-led competitions, CHALICE produced agents that continually improved their performance, resulting in competitive play against human and automated opposing agents at relatively low training cost and time. In this paper, we motivate the operational problem and technical challenges, provide an overview of our technical approach, elaborate on our vision-based and graph-based DNN architecture design and agent training procedure, and present results from the most recent Stratagem competition. We close with recommendations for future research.

Citation

Nicholas Pioch, Lucas Sheldon, Thomas Harris, Matt Henry, Andrew Spisak, and Mikayla Timm "Curriculum-heavy reinforcement learning for multi-domain operations", Proc. SPIE 12538, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications V, 1253819 (12 June 2023); https://doi.org/10.1117/12.2663299

PROCEEDINGS
12 PAGES + PRESENTATION

KEYWORDS

Machine learning

Feature extraction

Ecosystems

Artificial intelligence

Deep learning

Software development
