2026 AI in T&E Forum

WHEN: March 17 - 18, 2026

WHERE: Washington, DC

THEME: Integrating AI-Enabled Systems through Digital Engineering for Decision Advantage


AGENDA & BIOS

Agenda link here.

Bios link here.


OVERVIEW

This forum will focus on:

  1. AI T&E Policies and Guidance: Sharing the latest information.
  2. AI-Enabled Systems: Focusing on new technology being integrated into military capabilities.
  3. Integrating and Validating: Focusing on primary challenges and goals of the T&E community: how to safely and effectively incorporate these complex systems into existing platforms and ensure they work correctly.
  4. Digital Engineering (DE): DE and Model-Based Systems Engineering (MBSE). Using virtual proving grounds, digital twins, and authoritative data sources is seen as the key to accelerating processes and reducing costs.
  5. Decision Advantage: Connecting the technical work of T&E and DE to the ultimate strategic goal for the warfighter: faster, better-informed decisions on the battlefield, which is a core tenet of the DoD’s AI adoption strategy.

SPEAKERS

Congresswoman Jen Kiggans, Virginia Second District, US House of Representatives – Invited

Dr. Amy E. Henninger, Senior Science Advisor for Advanced Computing, Science and Technology Directorate
U.S. Department of Homeland Security, “Adversarial and Counter AI:  Why it Matters Now”

Dr. James Sharp, Defence Science and Technology Laboratory (Dstl), Ministry of Defence, UK

Matt Maroofi, Senior Director of Product Development, Shield AI

Dr. Sandeep (Sandy) Patel, KBR, AI/ML & Space Enterprise Manager and Deputy Program Manager for DIA/DOT&E/TETRA Contract

Dr. Jeremy S. Werner, Defense Tech Architect & Ambassador, Cadence Design Systems, “Crossing the Valley of Death: Shifting Left using AI and Hardware-Accurate Digital Twins to Accelerate Acquisition”

Dr. Kerianne Hobbs, Senior Engineering Specialist, Vehicle Autonomy & System Trust, The Aerospace Corporation

Abstract: The rapid evolution of AI-enabled autonomy is reshaping operations across multiple domains. This talk presents an emerging integrated framework combining guardrails, watchdogs, live-virtual-constructive (LVC) testing, human-autonomy teaming (HAT), traditional processor/hardware-in-the-loop (PIL/HIL) methods, and unique approaches to test case generation to accelerate the responsible deployment of AI-enabled autonomy without sacrificing safety. As autonomous systems undertake critical decision-making, it is essential to establish clear behavioral boundaries, manage risk with structured representative testing at scale, and integrate human oversight with machine decision-making.

 


PANEL DISCUSSIONS

Overview: T&E Transformation

Moderated by: Daria Stafford, Technical Director, Director Operational Test and Evaluation

Digital Engineering (DE), Artificial Intelligence (AI), and Acquisition Transformation necessitate a fundamental rethink of our traditional processes. Led by Daria Stafford, Technical Director for DOT&E, this panel brings together leaders to explore how Test and Evaluation is shifting from a late-stage “final gate” to a continuous engine of discovery and learning. The panel will cover how T&E can provide decision advantage: the ability to make faster, better-informed choices in a digitally competitive world. It will discuss moving T&E from a series of discrete events at the end of the “V” to a mission-engineering continuum, and the potential for leveraging DE and MBSE across LVC environments to move toward an authoritative data environment that validates complex AI-enabled systems earlier in the lifecycle. However, while DE and MBSE offer a path toward more agile validation, the panel will also address the significant technical and cultural hurdles of creating truly useful digital test environments. This includes the difficult work of integrating authoritative data sources with operational test characteristics, such as representative users and realistic combat environments, to ensure digital models provide a high-fidelity reflection of the battlefield. The discussion will also tackle the unique challenges of AI-enabled systems. Panelists will share insights into the use of guardrails and Bayesian models to build confidence in AI performance across the acquisition lifecycle. Finally, the panel will discuss policy and guidance updates that align with these technology advances. Ultimately, this session challenges T&E professionals to refine their role in ensuring a lethal, effective, and AI-ready force through more integrated, data-driven outcomes.

Panelists 

Dr. Laura Freeman, Deputy Director, Virginia Tech National Security Institute, Assistant Dean for Research, College of Science, and ITEA Fellow
To Be Announced, Chief Digital and Artificial Intelligence Office
To Be Announced, Military Service Representative

Testing and Evaluation Strategies with and for AI in Complex Systems

Moderated by John Frederick, Director, Innovation and Testing Strategies, Veracity Engineering

This panel will explore how T&E and organizational cultures must adapt to assess complex, nondeterministic AI and ML enabled safety and security critical systems. Panelists will discuss technical challenges, system characteristics, and metrics for integrating and testing AI, focusing on how verification and validation (V&V) evidence builds decision confidence, supports certification, and ensures operational suitability. The panel will examine the design and validation of data ontologies, decision support models, and data governance as essential enablers of AI in complex environments. The role of digital engineering, including the integral relationship between digital twins and AI/ML capabilities, will be highlighted. Finally, panelists will explore how AI and ML methods can enhance the effectiveness, efficiency, and coverage of V&V.

Panelists

Dr. Ian Levitt, Distinguished Board Member, National Aerospace Research & Technology Park

Eman Kawas, Independent Advisor, Decision Assurance using AI-Enabled Digital Twins

Dr. Antonios Kontsos, Henry M. Rowan Foundation Professor, Director of the Digital Engineering Hub

Jonathan Dziok, Systems Engineer, Veracity Engineering

 

Artificial Intelligence: An Industry Perspective 

Moderated by Bryan Vandrovec, Chief Technologist, Autonomous and AI Systems, Booz Allen Hamilton

This panel examines how a Digital Proving Ground overcomes the limitations of traditional physical testing for complex AI-enabled systems. Industry experts will discuss leveraging a generative AI-powered knowledge assistant for automated test planning, high-fidelity test range reconstruction, physics-based digital twins to generate synthetic data, and interpretable runtime guardrails to assess machine reasoning. Together, these capabilities accelerate evaluation processes, realize significant cost savings, and establish the calibrated trust required for modern autonomous systems development.

Panelists

Judy Brown Stoer, Autonomy Test Team Lead, Weather Gage Technologies

Johannes Waldstein, Founder & CEO, PiLogic Inc.

Dr. Policarpio Soberanis, Synopsys Inc.

Nelson Santini, Senior Vice President, Edge Case Defense


AUDIENCE PARTICIPATION

Sara Jordan, Institute for Defense Analyses (IDA)

The audience will have an opportunity to review and comment on the latest version of the Practical Strategies for Design and Execution of Test and Evaluation of AI Enabled Systems (AIES).


TRACK: Academic & Government Voices at the Forefront of AI T&E

This technical track features independent research perspectives on adversarial AI assurance, mission-centric evaluation design, digital twin readiness, and auditable AI reasoning, advancing the science behind trusted, responsible AI for the warfighter.

Steve Robert Crews II, PhD., Georgia Tech Research Institute

A Digital Twin Maturity Model for Digital Engineering and Test of AI-Enabled Space Systems

Digital twins and virtual proving grounds are rapidly becoming the backbone of digital engineering and model-based systems engineering for complex, AI-enabled systems. In practice, however, “digital twin” still covers everything from a simple playback tool to a high-fidelity, closed-loop mission twin wired into a digital thread. Test and evaluation teams have no consistent way to describe how mature a given twin is for specific uses across the lifecycle, or to decide when it is ready to live inside a government digital environment.

This presentation introduces a Digital Twin Maturity Model (DTMM) aimed at defense space systems and their supporting kill chain. The DTMM defines six dimensions, each with five maturity levels, that characterize how a twin behaves as a digital-engineering asset rather than a one-off model: (1) Twin Functional Capability; (2) Data & Connectivity Integration; (3) Model Fidelity & Realism; (4) Lifecycle Integration & Operations; (5) Verification, Validation, & Trust; and (6) Interoperability with Government-Controlled Environments. Each cell is defined in observable, DE-friendly terms such as sim-ready deliverables, authoritative data sources, participation in virtual ranges, VV&A artifacts, and links to MBSE models.

The model is anchored in current DoD and USSF digital-engineering and M&S guidance on digital twins, authoritative data, and VV&A expectations, and informed by external work such as Digital Twin Consortium capability frameworks and industry virtual-proving-ground practices. It is designed to align with the forum’s Digital Engineering (DE) and MBSE focus on using digital twins, high-fidelity models, virtual environments, and trusted data sources to accelerate testing, improve quality, and reduce lifecycle costs. The talk will (1) walk through the six dimensions and five levels per dimension, emphasizing how they map to common DE/MBSE artifacts—system models, environment and threat models, digital threads, and test data pipelines; (2) illustrate scoring examples for AI-enabled orbital warfare scenarios, showing what “Level 2 vs. Level 4” looks like in practice for autonomy-on-orbit, AI-assisted C2, and synthetic-data generation; and (3) demonstrate how the DTMM can be used pragmatically in T&E planning and acquisition language—for example, to set minimum maturity thresholds for using a twin in AI-in-the-loop testing, to prioritize model improvements that unlock reuse across design, test, and training, and to compare alternative solutions on a common, transparent scale. By treating maturity as a structured, multi-dimensional property of digital twins and their digital-engineering context, this work offers the community a practical tool for deciding when and how to rely on twins in the AI-intensive test campaigns that DE and MBSE are enabling.
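As a rough illustration of the "minimum maturity thresholds" idea in this abstract, the sketch below scores a twin on the six named dimensions and lists where it falls short of a required level. It is only a sketch under stated assumptions: the abstract does not give a scoring scheme, so the 1 to 5 level numbering, the function names, the example scores, and the threshold values are all hypothetical and are not part of the DTMM itself.

```python
# The six DTMM dimensions, named as in the abstract.
DIMENSIONS = (
    "Twin Functional Capability",
    "Data & Connectivity Integration",
    "Model Fidelity & Realism",
    "Lifecycle Integration & Operations",
    "Verification, Validation, & Trust",
    "Interoperability with Government-Controlled Environments",
)

# Five maturity levels per dimension; numbering them 1-5 is an assumption.
MIN_LEVEL, MAX_LEVEL = 1, 5


def validate_scores(scores):
    """Check that a score sheet covers exactly the six dimensions with levels 1-5."""
    if set(scores) != set(DIMENSIONS):
        raise ValueError("scores must cover exactly the six DTMM dimensions")
    for name, level in scores.items():
        if not MIN_LEVEL <= level <= MAX_LEVEL:
            raise ValueError(f"{name}: level {level} is outside {MIN_LEVEL}-{MAX_LEVEL}")


def shortfalls(scores, thresholds):
    """Return {dimension: (scored, required)} for each dimension below its threshold.

    `thresholds` may name only a subset of the dimensions (hypothetical usage).
    """
    validate_scores(scores)
    return {
        name: (scores[name], required)
        for name, required in thresholds.items()
        if scores[name] < required
    }


# Hypothetical score sheet for one twin and hypothetical minimums for
# AI-in-the-loop testing; none of these numbers come from the abstract.
twin_scores = dict(zip(DIMENSIONS, (4, 3, 2, 3, 2, 1)))
ai_in_loop_thresholds = {
    "Model Fidelity & Realism": 3,
    "Verification, Validation, & Trust": 3,
}
gaps = shortfalls(twin_scores, ai_in_loop_thresholds)
for name, (scored, required) in gaps.items():
    print(f"{name}: level {scored}, needs {required}")
```

Keeping each dimension's level separate, rather than averaging them into one number, matches the abstract's framing of maturity as a multi-dimensional property: a twin can be strong in functional capability and still fall below the minimum for a particular test use.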

Dr. Rachel Brower-Sinning, Carnegie Mellon Software Engineering Institute

Using MLTE to Support Integrated T&E for ML-Enabled Systems

Delays in fielding systems in the DoW are a known issue, with problems found during developmental test (DT) and operational test (OT) noted as causes, and the integration of machine learning (ML) capabilities into DoW systems is expected to further increase these delays. Current practice for testing ML capabilities during development is largely limited to testing model properties, such as model performance, without consideration of mission and system requirements. This can lead to failures in model integration, deployment, and operations. Discovering problems attributed to ML capabilities in a system context during OT is problematic because fixing them might require additional data collection and retraining, further delaying fielding. Delays may be exacerbated because test and evaluation (T&E) organizations may be segregated: OT organizations work independently from DT organizations, which can lead to uninformed and inefficient testing; and model developers doing contractor testing (CT) may not have access to mission and system requirements and therefore fail to adequately address the real-world operational environment. Integrated T&E strives to bring DT and OT earlier in the testing process, ensuring that mission and system requirements are considered during development, to minimize costly fixes and delays.

MLTE (ML Test and Evaluation) is a process and tool that enables negotiation, specification, and testing of an ML component’s functional and non-functional requirements. Designed to support integrated T&E efforts, MLTE produces evidence of testing activities that can be shared with model acquirers and integrators to inform integration and T&E activities from CT to DT to OT, thus enabling traceability of requirements, data, and test results throughout the T&E process.

MLTE integrates a quality model that defines ML component quality through a set of characteristics that correspond to quality attributes (QAs), which are measurable or testable properties of a component that are used to indicate how well the component satisfies its system-derived requirements beyond the basic function of the component. The quality model serves as a guide for requirements elicitation and negotiation, providing a common vocabulary to specify system-derived requirements and focus testing efforts. Evaluations of MLTE in practice show the value of artifacts generated and maintained during the model development and testing process: (1) the Negotiation Card identifies a larger number of relevant model requirements early in the development process; (2) the Test Catalog supports the development and reuse of test code for validating these requirements; and (3) the Test Code and MLTE Report provide evidence of testing, which increases the trustworthiness of ML models. MLTE is open-source and available at https://github.com/mlte-team/mlte.

Kelli Esser, PhD., Chief Strategy Officer, Virginia Tech National Security Institute (VTNSI)

A Mission-Centric Approach to AI T&E: Extending Mission Engineering for AI-Enabled Systems

Co-Author: James D. Moreland Jr., PhD., Owner / Principal Engineer, MEI Innovative Solutions, Inc

As artificial intelligence (AI) becomes increasingly embedded in defense systems, confidence in AI-enabled capabilities can no longer be established through component- or model-level testing alone. Current acquisition, test, and governance practices often assess AI performance in isolation, disconnected from the mission context, system-of-systems (SoS) integration effects, and operational uncertainty that ultimately determine mission success. This gap presents a growing challenge for test and evaluation (T&E) organizations, program managers, and senior leaders responsible for ensuring the safe, responsible, and effective fielding of AI-enabled systems. This presentation introduces a Mission Engineering & Integration for AI-Enabled Systems (MEI-AIES) Framework that extends established Department of Defense (DoD) mission engineering principles to address the unique characteristics of AI-enabled capabilities. MEI-AIES re-centers AI T&E and assurance on mission outcomes rather than isolated technical metrics, providing a structured approach to defining, measuring, and governing AI performance across the lifecycle. The framework is aligned with emerging DoD policies and guidance that emphasize SoS thinking, continuous assurance, and mission-focused evaluation for advanced and autonomous systems.

At the core of the MEI-AIES framework is a distinction between (1) stable, mission-anchored definitions of performance and (2) cross-cutting measures used to evaluate performance as AI-enabled capabilities evolve in autonomy, integration, and operational complexity. Stable performance definitions ensure that mission intent remains constant over time, even as AI models adapt, are retrained, or are deployed in new contexts. Cross-cutting measures address how performance, behavior, dependencies, and uncertainty are assessed as capabilities mature and interact with other systems and human decision-makers.

The presentation illustrates the MEI-AIES Framework using a representative intelligence, surveillance, and reconnaissance (ISR) use case structured across three illustrative tiers: passive single-domain analytics, passive multi-domain integration, and active cross-domain autonomy. These tiers demonstrate how AI-enabled capabilities deliver increasing mission value – such as improved situational awareness, accelerated decision timelines, and adaptive tasking – while simultaneously introducing new challenges for traceability, assurance, and uncertainty management. Importantly, the framework highlights how uncertainty evolves in character, not just magnitude, as AI capabilities move from assistive roles to autonomous mission execution under human supervision.

A central contribution of the MEI-AIES Framework is its explicit linkage of Measures of Performance (MoP), Measures of Effectiveness (MoE), and Measures of Success (MoS) across tactical, operational, and strategic levels of warfare. This traceability enables T&E practitioners and decision-makers to understand how local variations in AI performance propagate to mission outcomes and strategic risk, supporting more informed decisions about deployment, integration, and governance. The framework also reinforces the need for continuous, lifecycle-based assurance rather than one-time certification events, consistent with best practices emerging across DoD AI policy and guidance. By providing a disciplined, mission-centric analytic structure, MEI-AIES offers a practical path forward for integrating AI T&E with broader DoD mission engineering, acquisition, and governance processes. The framework is intended to support early pilot applications, experimentation, and policy-aligned implementation, helping organizations move beyond technology-centric evaluation toward trusted, mission-effective employment of AI-enabled systems.

Josef B. Schaff, DSc., Chief Scientist, Cyber Dominance Group (A4J), Non-Kinetic Warfare Branch, Johns Hopkins Applied Physics Lab

Current Systems-of-Systems (SoS) are increasing in complexity, requiring advances in testing methodology. Some SoS dynamically adapt to environmental changes, which requires either algorithms that are not fully predictable or nonlinear control feedback loops. Testing such SoS requires test systems that can adapt to, or predict, the upcoming states of the systems under test. Some of these require the use of Machine Learning (ML) to effectively “learn” the behaviors of such systems, thus making the testing faster, better (more comprehensive), and cheaper. The presentation will discuss a suite of algorithms developed for predicting system destabilizations, and their utility both in testing edge cases and constraint parameters and in discovering the overall system’s performance.


POSTER PRESENTATIONS

We received many outstanding abstracts through our Call for Papers. With space available for only one track of presentations, we invited several highly qualified authors to consider presenting their work as poster papers. We are delighted with the strong response and can confidently say these poster submissions represent exceptional technical quality.

We encourage you to review the poster abstracts in advance and make a point to visit with the authors during the event. Their work reflects significant expertise and innovation, and your engagement will make these sessions even more valuable for everyone involved.


VENUE & HOTEL ACCOMMODATIONS

Location

HELIX, Booz Allen’s Center of Innovation

901 15th St NW, Washington, DC 20005

The program will be held on the 1st floor.

Parking

We strongly encourage you to avoid parking at The Helix. Parking at the office building is limited and extremely difficult to navigate. We urge you to take Uber, Lyft, a taxi, or the Metro (The Helix is 1 block from McPherson Square station, and Farragut North station is a short walk of 0.4 miles). If you decide to drive, please note that the IMPARK Garage is located at the Helix address, charges a daily fee, and closes at 7 PM.

Hotel Accommodations

Hotel accommodations should be made on your own to fit your budget.

Timeline – Guide

We expect the following:  8:00 AM Registration Opens | 9:00 AM Program Begins | Tuesday Networking Reception | Wednesday Program Concludes at 5:00 PM – Subject to change 


SPONSOR

Sponsorship opportunities are available to highlight your organization before and during the event. From small businesses with 10 or fewer employees to our biggest industry leaders, our price point and return on investment will be just what you need to succeed in gaining visibility within the T&E community.

Levels of Sponsorships
$2,500 | $1,000 | $500

Lunch and Reception $3,000
Breaks $1,500

Application and Benefits

Questions? Contact Jenna Reza [jenna@itea.org]

 


REGISTRATION PRICING

ITEA Member and Full-time Government and Active-Duty Military $350

ITEA Non-Member $450

Category: Speaker/Presenter/Participant $250

Early Career Professional (< 5 years T&E) $350

Full-time Student $95

REGISTER NOW 

OUR SPONSORS