JUNE 2025 | Volume 46, Issue 2
IN THIS JOURNAL:
- Issue at a Glance
- Chairman’s Message
Workforce of the Future
- Encouraging Diversity in AI Test and Evaluation
Technical Articles
- Model Based Test and Evaluation Master Plan Technical Introduction
- Integrating RAG, HCD, and PD in MBSE for Mission Problem Framing
- Then What? The Need for Iterative Assessments to Achieve Successful Operational Capabilities
- Surpass the Adversary: Enhanced Mission Training through Digital Engineering
- Adaptive Algorithms for LIDAR Semantic Segmentation on Edge Devices
- 2025 AI in T&E Forum
- UC UK ITEA Event Summary
- AI and ML Methods in Verification and Validation
News
- Association News
- Chapter News
- Corporate Member News
Model Based Test and Evaluation Master Plan Technical Introduction

Dr. Randy Saunders (JHU/APL)
Johns Hopkins University Applied Physics Laboratory
Clarksville, Maryland
Dr. Jeremy Werner (DOT&E)
Office of the Secretary of Defense (OSD) / Director, Operational Test and Evaluation (DOT&E)
Washington, DC
Abstract
Providing timely decision support to decision-making authorities during the various phases of an acquisition program is critical for the on-time delivery of operationally effective weapon systems that meet the needs of the warfighter. To ensure decision makers are equipped with the necessary test and evaluation (T&E) data to inform decisions, the Department of Defense (DoD) mandated the use of the Integrated Decision Support Key (IDSK) as a tool to encapsulate (i.e., succinctly record) a program’s decisions and the T&E data necessary to support those decisions. To create the information and test results required for decision making and to support the IDSK, programs must develop and execute testing that provides the needed data. Traditionally, DoD programs conduct their testing based on a test plan; the first planning document created by the program office is the Test and Evaluation Master Plan (TEMP), which records the program’s plan to conduct and manage test and evaluation for the development and fielding of the system. An approach that utilizes digital engineering, specifically model-based systems engineering (MBSE) best practices, to standardize the linkage of test data to decisions therefore presents a significant value proposition for decision-making authorities: linking data from an acquisition program’s digital system models (including design and requirements models), through models of test planning and design, to key acquisition decisions and test execution. An overt value of this approach is the resulting digital thread that connects data sources (e.g., digital models) into an authoritative source of truth to both inform and validate decisions. To this end, this paper presents the development of a model-based Test and Evaluation Master Plan (MB-TEMP) reference architecture that integrates and links data from multiple digital models to a standardized set of acquisition, technical, and T&E decisions.
The resulting MB-TEMP reference architecture provides a standardized template and approach for developing program-specific MB-TEMPs from which standardized data and decision table formats can then be generated to support program acquisition and T&E decision-making.
Keywords: Decision Support, IDSK, Test and Evaluation Master Plan, Model Based Systems Engineering, Reference Architecture
Distribution Statement A. Approved for public release: distribution unlimited
1.0 Introduction
The Test and Evaluation Master Plan (TEMP) has been an integral part of program-level test planning for many years. In many cases, however, historical TEMP documents have been static, left unrevised after completion (typically before any major program review), and of limited use for later analysis of how to execute a test program or for trade studies of which tests need to be performed during the development and fielding of a new system.
The process of creating a TEMP is nonetheless valuable, because it brings the key stakeholders in the system’s development together to make decisions on the test plan at the time the TEMP is originally developed.
There is a long-standing need to capture the process of creating the TEMP, and the data within it, in a form that can be easily updated and maintained and made available on demand to the different stakeholders in the acquisition process who need the data. Digital engineering methods and MBSE present a significant opportunity to capture this analysis and decision making while adding capability and utility to test planning.
1.1 Digital Transformation
In June 2018, the Department of Defense established its expectations for digital transformation in the DoD Digital Engineering Strategy. The strategy outlines five goals aimed at establishing a digital engineering environment for more rapid and effective development and fielding of weapon systems, including the use of models to inform decision making, establishing an infrastructure to enable digital engineering methods, and transforming the workforce to adopt digital engineering methods across the acquisition lifecycle. The DoD followed this strategy with formal guidance in DoD Instruction 5000.97, which ensures that the Director, Operational Test and Evaluation (DOT&E) will utilize digital engineering methods to achieve test objectives for operational assessment and live fire testing. Also in 2023, DOT&E released its Strategy Implementation Plan (I-Plan),1 which includes objectives and key actions to develop digital, or model-based, Test and Evaluation Master Plans (TEMPs) and Integrated Decision Support Keys (IDSKs). As recently as December 2024, the Department released an update to DoD Instruction 5000.98,2 along with five DoD manuals, further refining the description and use of digital methods for the entire DoD test community.
In support of the above-mentioned DoD agency expectations, a team of industry, academic, and government representatives has embarked on developing exemplar tooling, models, and architectures to realize these goals and strategy. This team has brought together the technical abilities and goals of DOT&E and Developmental Test, Evaluation, and Assessments (DTE&A), facilitating workshops and hackathons to provision weapon system program offices and T&E practitioners with the ability to comply with DoD test objectives. This article brings together the latest integrated work of contributing researchers on model-based decision support and test planning.
1 DOT&E Strategy Implementation Plan (I Plan), https://www.dote.osd.mil/Portals/97/pub/reports/DOTE_Strategy_Imp_Plan-Apr2023.pdf?ver=jQHyC5uHXsvM25sYurv5Zw%3d%3d
1.2 Test and Evaluation Enterprise
The test and evaluation enterprise within the Department of Defense has many stakeholders. One of the biggest challenges to digital transformation in the T&E space is that the different stakeholders are transforming at different rates and using different tools and infrastructure. With the development of the Model-Based TEMP Reference Architecture, we have created a key acquisition artifact that can be used and referenced across both planning and execution of the acquisition program, across several organizations, and across different parts of the lifecycle. The next key decision is how best to use the digital TEMP to support future planning and coordination activities.
1.3 IDSK
The Integrated Decision Support Key (IDSK) represents a framework created to identify the critical T&E data required to inform acquisition and program decisions (DOT&E, 2023).3 It articulates acquisition lifecycle decision-making informed by operational and technical capabilities evaluation and defines how Developmental Test (DT) and Operational Test (OT) are used to inform a program’s key decisions. The IDSK is a critical element of the Integrated T&E Framework highlighted in Figure 1. The Integrated T&E Framework is used by stakeholders to understand the scope of evaluations required, define up front the end state for the evaluations, and develop an integrated testing approach and evaluation strategy that maps together evaluation focus areas, critical decision points, and specific data requirements (Executive Services Directorate, 2020). Furthermore, the IDSK enables the program manager (PM) and the T&E working-level integrated product team (WIPT) to ensure that critical operational issues (COIs) are tied to unit mission accomplishment and are unit-focused, rather than focused solely on the technical specifications of the system.

Figure 1: Integrated T&E Framework (modified from DoDI 5000.89)
2 DoD Instruction 5000.98, https://www.dote.osd.mil/Portals/97/pub/presentations/2024/DoDI%205000.98%20Overview%202024.12.09.pdf?ver=dumbj7_N6pIDW3HkRDoXeA%3d%3d
3 DOT&E, 2023, https://www.dau.edu/sites/default/files/Migrate/EventAttachments/1044/DAU%20Seminar%20-%20Pillar%202_9Aug_%20(003)-Updated.pdf
Although it is a decision support tool that can articulate cradle-to-grave acquisition lifecycle decision making, the IDSK in its traditional form falls short as a digital engineering tool capable of providing a more holistic approach to data-driven decision making. Current implementations of the traditional IDSK are structured as a set of Excel spreadsheets linking together decisions, data, test planning data, resources, and schedules. This document-centric format does not allow for integration across programs in a portfolio, does not support linking requirements or risk into the analysis of decisions, and limits the ability of the IDSK to use important data and relationships captured in a system’s digital models to inform decision making in a holistic, data-driven digital engineering manner. To this end, a key strategy of the Director, Operational Test and Evaluation (DOT&E), as stated in (DOT&E Strategy, 2023), is to accelerate the development of solutions that enable digital representations of numerous T&E tools and artifacts, including test planning and decision support.
To realize the IDSK’s potential to positively impact acquisition outcomes and program decisions, a Model-Based IDSK (MB-IDSK) developed using model-based systems engineering would address a majority of the shortcomings of the traditional IDSK and provide significant benefits to decision makers and stakeholders across the acquisition and T&E enterprise. These benefits include (i) supporting digital transformation by integrating the IDSK into a program’s digital engineering ecosystem; (ii) mapping decisions to development (i.e., acquisition) risk, test risk, and test resource models, thereby allowing for more sophisticated analysis, including probability-of-success analysis; and (iii) expanding the ability to link different aspects of the system design, capabilities, and testing to critical program decisions.
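The probability-of-success analysis mentioned in benefit (ii) can be sketched in code. The following is a minimal illustrative example only: the class names, risk values, and the naive independence assumption are all hypothetical, not part of the MB-IDSK architecture itself.

```python
# Hypothetical sketch: an MB-IDSK could link decisions to risk and resource
# models, enabling probability-of-success analysis. All names and numeric
# values below are illustrative assumptions, not program data.
from dataclasses import dataclass, field


@dataclass
class TestRisk:
    name: str
    p_success: float  # estimated probability the supporting test succeeds


@dataclass
class Decision:
    name: str
    supporting_risks: list = field(default_factory=list)

    def probability_of_success(self) -> float:
        # Naive model: the decision is adequately informed only if every
        # supporting test activity succeeds (independence assumed).
        p = 1.0
        for r in self.supporting_risks:
            p *= r.p_success
        return p


milestone_b = Decision("Milestone B", [
    TestRisk("Sub-system qualification test", 0.95),
    TestRisk("Integrated lab demonstration", 0.90),
])
print(f"{milestone_b.name}: P(success) = {milestone_b.probability_of_success():.3f}")
```

A production analysis would draw these links and probabilities from the program's digital models rather than hard-coded values, and would use a richer dependency model than simple multiplication.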
In (Anyanhun & Arndt, 2023),4 an MB-IDSK reference architecture (MB-IDSK RA) was proposed and developed to support the digital transformation efforts of DOT&E. The motivation behind defining an MB-IDSK RA was based on the premise that an architecture should reflect the organization of the owning enterprise (CAS, 2022). Therefore, for a hierarchical organization such as the DoD T&E enterprise, developing an IDSK RA is a critical first step toward preventing conflicting business objectives for programs of record, by serving as a medium to flow down the overarching business objectives for different program MB-IDSKs as perceived by the DoD/DOT&E authorities. Specifically, the MB-IDSK RA represents an essential tool to facilitate communication and alignment of current and future IDSK architectures. Figure 2 depicts the IDSK architecture strategy as adapted from the DoD Comprehensive Architecture Strategy.

Figure 2: IDSK RA Architecture Strategy (Adapted from Figure 1 of the DoD CAS (CAS, 2022))
Equipping DoD acquisition programs with overarching guidance on how to leverage digital engineering for decision support is critical to achieving enterprise-wide business and mission objectives of providing weapon systems at the speed of need and relevancy. As reported in (CAS, 2022 & Muller, 2007), a Reference Architecture provides a method for focusing all architecture and design decisions with the intent to enforce common applicable standards and provide a tailorable template. The MB-IDSK RA is developed to demonstrate and provide guidance on how the T&E enterprise and acquisition programs implementing digital engineering could leverage existing digital models created during the various acquisition phases as real-time data sources to inform key program decisions and improve decision outcomes.
4 Arndt, C., Anyanhun, A., & Werner, J. S. (2023). Shifting Left: Opportunities to Reduce Defense Acquisition Cycle Time by Fully Integrating Test and Evaluation in Model Based Systems Engineering. Acquisition Research Program
2.0 Need / Requirements
The Test and Evaluation Master Plan (TEMP) is one of the core artifacts in the DoD acquisition process (DoD 5000.01),5 just as test planning documents (and models) are core to any highly complex system development in industry. However, the TEMP and the test process as a whole need to be understood as part of the larger research, engineering, development, and acquisition process. Figure 3 illustrates the larger process needed to understand a test and its relationship to requirements and the use of the system (mission data).

Figure 3: Development Iceberg6
The wide range of interconnections and relationships is a compelling reason for the use of digital engineering and MBSE in both the development of the Test and Evaluation Master Plan and the ongoing planning and execution of a test program.
2.1 Value Proposition for a Model-Based Reference Architecture
There has been natural skepticism about adopting these new methods and model-based tools. Concerns over data integrity, cybersecurity, tooling costs, interoperability and resilience, and maturity of method have made program decision makers uneasy about diving into this digital abyss. Technical and program leaders who hold confidence in the tried methods of acquisition that have existed for the past several decades are looking to others to prove the worth of a digital or model-based engineering and program decision-making environment. More specifically, programs seek to better understand the value proposition of converting their test and evaluation methods. Though the skepticism is understandable, a case can be made that there is value in cost, technical coverage, and schedule savings that warrants conversion.
As James Collins highlights in his well-known book Good to Great,7 success doesn’t come from the blind application of technology but from the application of carefully selected technology to achieve the fundamental and specific objectives of an organization. This principle also holds for the application of digital engineering methods to test and evaluation planning, execution, and decision-making. The integration of federated data sources, if executed correctly, equips program T&E WIPTs with the ability to understand a weapon system’s technical compliance and challenges. Traditional Test and Evaluation Master Plans evolve over long periods of time and are updated at major program milestones. TEMPs represent the negotiated work of multiple test organizations thinking through the necessary contractor, developmental, live fire, and finally operational test and evaluation needed to provide fielding recommendations. Much of the value of TEMPs is gained from the planning process itself rather than from the documentation, which can become “shelfware” once officially approved and released. From a schedule perspective, if test organizations and program offices have real- or near-real-time awareness of test results, and of how those results validate or call into question the system’s key performance parameters and measures of effectiveness, they are equipped to pivot or adjust test plans for efficiency and thus reduce development program execution timelines.
5DoD 5000.01, https://www.dau.edu/cop/pm/documents/dod-directive-500001-defense-acquisition-system-9-september-2020
6Jeremy Werner, Kelli Esser, Craig Arndt, Trisha Radocaj, Awele Anyanhun, Daniel Wolodkin, Geoffrey Kerr, Laura Freeman, “Integrated Decision Support Key: Advancing Acquisition Decisions with Data Models and Tools”, Naval Engineering Journal, Spring 2024, pages 121-133
2.2 Model-Based TEMP
The MB-TEMP reference architecture captures the essence of the test planning and decision support domain relative to the needs of program offices, DOT&E, DTE&A, T&E practitioners, and decision-makers. Specifically, it represents an instantiable template developed using MBSE principles and best practices to guide the development of new and/or extended versions of program-specific MB-TEMPs. A needs analysis was performed to ascertain the goals, stakeholders, scope, and context for the TEMP RA. The needs analysis provided crucial insight into the current and future needs of key stakeholders, which for the reference architecture included T&E oversight organizations, program managers, program offices, scheduling staff, decision makers, and test teams. From an architectural point of view, the perspectives of the key stakeholders were identified to facilitate the definition and development of the relevant viewpoints and views (ISO 42010, 2022) needed to create an architectural description for the MB-TEMP RA. The views of the TEMP RA, depicted using diagrams, serve as digestible chunks of the complete RA and address specific concerns of acquisition test-planning stakeholders and decision makers as they relate to test planning and decision support needs. Additionally, the TEMP RA is developed to facilitate both current and future program-specific TEMP implementations by utilizing architectural principles such as separation of concerns, managing key interfaces, and ensuring minimal coupling between elements. For this article, abstraction and simplification concepts have been applied to how some diagram views appear. More importantly, the architectural strategy employed in the development of the TEMP RA results in a digital engineering artifact (tool) that, when instantiated, will seamlessly integrate into the digital engineering ecosystem of a program.
The primary strategic driver for the Model-Based (MB) TEMP Reference Architecture (RA) is timely, data-driven decision support. The MB-TEMP RA stakeholders include program offices, mission program offices, test oversight organizations, test teams/decision makers, testing organizations, and Combatant Commands (COCOMs). Program offices need the MB-TEMP to provide a practical level of standardization and interoperability; be easily maintainable; link to program risk; import from and export to table-based tools and formats; contain the TEMP elements specified in the most current policy and guidance; be consistent and interoperable in style with other DoD acquisition system models; have intelligent digital integrations with databases; be easy to architect and use; consist of a set of standardized decisions and operational performance metrics; and be evolvable and implementable.
7 Collins, James C. Good to Great and the Social Sectors: Why Business Thinking Is Not the Answer: A Monograph to Accompany Good to Great: Why Some Companies Make the Leap–and Others Don’t. [Boulder, Colo.?]: J. Collins, 2005.
3.0 Development Approach
As the Model-Based TEMP has been developed, several similar and related efforts have also been advancing digital technologies in the DoD test and evaluation space. The MB-TEMP RA development team is well aware of the efforts in different parts of the DoD to create a digital TEMP, and decided early, as part of the development and fielding of the MB-TEMP, to ensure compatibility with as many of these efforts as reasonably possible. Two of the most important efforts in this domain to date have been the JHU/APL Meta-Model (2011) and the University of Arizona T&E ontology (2024).
Model-Based TEMP
The Model-Based Test and Evaluation Master Plan Reference Architecture (MB-TEMP RA) Model was developed using a domain-based approach. The MB-TEMP RA architecturally references MB-IDSK Reference Architectures, mission models, test range and facility models, test models, requirements models, system models, and the Test and Evaluation (T&E) Reference Metadata Model (2011). The architectural views of the MB-TEMP RA portrayed in Figures 4 through 9 represent a small subset of the complete set of views that together make up the MB-TEMP RA:
- the TEMP RA Domain view, the overarching view of the TEMP RA capturing the key elements of a generic TEMP architecture;
- the MB-TEMP Evaluation Approach view, created to address the concern “What are the key structural elements and relationships required to specify an MB-TEMP Evaluation Approach view?”;
- the SUT Full Spectrum Survivability view, which addresses the concern “What are the key structural elements and relationships required to specify cross-cutting SUT Operational Survivability Capability views?”;
- the MB-TEMP IDSK Decisions view, which captures the limited number of critical decisions that must be made at different times throughout the acquisition process as a standardized set of program decisions, and provides a format to link them to the developmental, operational, and integrated test data needed to inform those decisions;
- the MB-TEMP IDSK Key Elements view, which captures elements used in generating pertinent data and information required by decision makers; and
- the MB-TEMP Requirements view, created to portray the various types of requirements defined as part of the TEMP RA and to provide insight into the TEMP RA’s requirements pattern/schema.

Figure 4: The TEMP Domain view of the MB-TEMP RA
Figure 4, the TEMP Domain view of the MB-TEMP RA, provides crucial insight into the top-level composition of the TEMP domain. The view links together elements defined within the TEMP model and elements already defined in digital models that exist within a program’s digital engineering ecosystem, including a program office model, requirements model, system model, SUT model, and test range models.

Figure 5: MB-TEMP Evaluation Approach View

Figure 6: MB-TEMP SUT Full Spectrum Survivability Capability View
Figure 5 presents the MB-TEMP Evaluation Approach view, and Figure 6 presents the MB-TEMP SUT Full Spectrum Survivability Capability view.

Figure 7: Key Decisions view of the MB-TEMP RA
Figure 7, the Key Decisions view of the MB-TEMP RA, depicts the five key decision classes: Class I, Critical Technical Requirements Decisions; Class II, Program Milestones / Technical Reviews Decisions; Class III, Sub-System Critical Performance and Technology Maturity Decisions; Class IV, Major Performance Characteristics Decisions; and Class V, Programmatic Decisions. Sample instantiations of each decision class are also highlighted.

Figure 8: IDSK Key Elements view of the MB-TEMP RA
Figure 8, the IDSK Key Elements view of the MB-TEMP RA, provides crucial insight into the top-level composition of the decision support domain. The architecture allows various elements to be linked together in order to generate views needed to support decision-making across programs and portfolios.

Figure 9: Requirements view of the MB-TEMP RA
Figure 9, the Requirements view of the MB-TEMP RA, is created to address the concern “What are the requirements, relationships, and TEMP elements required to support the generation of test planning and decision support views?”
Also created as part of the MB-TEMP RA is a set of standardized test planning and IDSK table formats that capture pertinent information contained in the IDSK RA model and collectively represent the integration of information, knowledge, capabilities, and data necessary to support decision-making by program offices and the T&E enterprise in achieving their strategic objectives. A major benefit of this model-based architectural approach to test planning and decision support is the ability to query the model to generate views that can be tailored and configured based on the needs of the key stakeholders. Notional examples of several test planning and IDSK table formats are reported in (Anyanhun & Arndt, 2023).8
In essence, the MB-TEMP RA provides consistency, integrity, balance, and practical guidelines for program-specific implementations. Specifically, an MB-TEMP will improve test planning and decision-making processes by making the TEMP model compatible and interactive with other digital engineering models of the system under development (SUD). Additionally, a library of standardized, tailorable IDSK table templates, fully consistent with the traditional document- and table-based IDSKs used in other DoD programs, is generated to support test planning and decision making.
In the TEMP Guidebook Version 3.1 “Document in Model Form” phase,9 the Phase 1 MB-TEMP Reference Architecture (RA) Model was developed by defining the table of contents and data elements in SysML from the Director, Operational Test and Evaluation (DOT&E) Test and Evaluation Master Plan (TEMP) Guidebook Version 3.1 (2017). The Phase 1 MB-TEMP RA Model is intended for programs that are in the early stages of digital transformation, where the intended supporting digital models are not yet available. Given these conditions, a program would specialize from the Phase 1 MB-TEMP RA Model and utilize the Phase 1 MB-TEMP Process activity to develop a Program-Specific Phase 1 MB-TEMP Objective Architecture (OA) Model. An MB-TEMP Mapping Model was developed to audit the traceability between the MB-TEMP T&E data objects and the TEMP Guidebook Version 3.1 table of contents sections.
In the MB-IDSK RA-Integrated TEMP Model phase, a version of the MB-TEMP Reference Architecture (RA) Model replaces the Phase 1 MB-TEMP RA Model elements with corresponding elements in the MB-IDSK Reference Architecture (RA). In the MB-TEMP RA Model phase, the MB-IDSK RA model elements are updated to ensure complete coverage of all concepts associated with the elements specified in the TEMP Guidebook Version 3.1. In addition, a design-of-experiments domain was defined to inform Milestone A, B, and C decision making. The MB-TEMP RA Model is intended to be implemented by programs with mature digital engineering processes and ecosystems and existing digital engineering artifacts (e.g., system models, requirements models, and an MB-IDSK RA model). If a Program-Specific MB-IDSK OA Model is not in development using the MB-IDSK RA Model, a program would specialize from the MB-TEMP RA Model and utilize the Develop MB-TEMP Using Approach 1 activity to develop a Program-Specific MB-TEMP OA Model. Otherwise, a program should utilize the Develop MB-TEMP Using Approach 2 activity; this activity presupposes that key links and relationships have been, or are in the process of being, developed between elements from digital models. Program stakeholders would prepare for MB-TEMP OA Model development, tailor the MB-TEMP OA Model development requirements depending on the Adaptive Acquisition Pathway, and proceed depending on the availability of digital models.
8 Arndt, C., Anyanhun, A., & Werner, J. S. (2023). Shifting Left: Opportunities to Reduce Defense Acquisition Cycle Time by Fully Integrating Test and Evaluation in Model Based Systems Engineering. Acquisition Research Program
9 TEMP Guidebook Version 3.1, https://www.dote.osd.mil/Guidance/DOT-E-TEMP-Guidebook/
Meta Model
One unique attribute of a model-based TEMP is the ability to process it with computerized analytics. Management questions like “What parts of the IDSK have test data so far?” or enterprise questions like “What programs would be impacted if capability X at range Y was offline for an upgrade during August?” could be constructed once and evaluated routinely to update a dashboard or status report. This sort of analytic cannot be automated with legacy TEMPs; humans would need to construct those answers. As a result, answers are not readily available, and decisions with program impacts are made without program inputs. When a program chooses to use a model-based TEMP, it also enables many other decisions to benefit from this sort of analytic.
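The "construct once, evaluate routinely" idea can be sketched as a simple coverage query. This is an illustrative assumption of how IDSK content might be represented; a real implementation would query the program's model repository rather than an in-memory list, and the decision and measure names here are invented.

```python
# Illustrative sketch of a reusable analytic over a model-based TEMP/IDSK.
# The rows, decision names, and measures below are hypothetical examples;
# real data would come from the program's SysML/ontology model repository.

idsk_rows = [
    {"decision": "Milestone B", "measure": "Detection range", "test_events": ["DT-1"]},
    {"decision": "Milestone B", "measure": "Track accuracy", "test_events": []},
    {"decision": "Milestone C", "measure": "Mission reliability", "test_events": ["DT-2", "OT-1"]},
]


def coverage_report(rows):
    """Answer 'What parts of the IDSK have test data so far?' in one pass."""
    return {
        (r["decision"], r["measure"]): bool(r["test_events"])
        for r in rows
    }


for (decision, measure), has_data in coverage_report(idsk_rows).items():
    status = "has data" if has_data else "NO DATA YET"
    print(f"{decision} / {measure}: {status}")
```

Because the query is written against the model structure rather than a document, it can run automatically whenever test results are recorded, feeding the dashboard or status report described above.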
For analytics to be constructed once and widely used, all model-based TEMPs must share some structural elements. JHU/APL made an initial model to identify all the elements necessary to capture legacy TEMP content. This initial model was used on the NGJ Pilot Program to consider possible deployment strategies. The most obvious strategy, requiring all programs to build their system models as specializations of a mandated model, was discarded as unusable. Even though not all programs have fully embraced SysML, the programs that have would not consider a model-based TEMP that required them to throw out their model and start again. Instead, the NGJ Pilot Program developed a meta-model view through which their system and testing models could be viewed. This meta-model was aligned with the initial JHU/APL model, as well as with other standard models such as the UML Testing Profile v2 (UTP2)10 and the Unified Architecture Framework (UAF v1.1).11
The resulting meta-model was validated at several DoD-wide events to produce the current candidate for standardization. Adoption of a standardized meta-model gives developers the flexibility to model in whatever approach best represents their program’s system and processes, while providing analytics a minimum set of known elements to enable robust, reusable analytics.
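The meta-model-as-a-view idea resembles the adapter pattern: each program keeps its native model structure and exposes only the minimum standard elements analytics depend on. The interface and element names below are purely illustrative assumptions, not the actual meta-model.

```python
# Illustrative sketch of a meta-model view over a program's native model.
# TempMetaModelView stands in for the standardized element set; the method
# names and the native model structure are invented for this example.
from abc import ABC, abstractmethod


class TempMetaModelView(ABC):
    """Minimum elements every model-based TEMP must expose to analytics."""

    @abstractmethod
    def test_events(self) -> list: ...

    @abstractmethod
    def resources_for(self, event: str) -> list: ...


class ProgramXAdapter(TempMetaModelView):
    """Wraps a program's native model without forcing it to restructure."""

    def __init__(self, native_model: dict):
        self._m = native_model

    def test_events(self):
        return list(self._m["events"].keys())

    def resources_for(self, event):
        return self._m["events"][event]["ranges"]


native = {"events": {"DT-1": {"ranges": ["Range A"]},
                     "OT-1": {"ranges": ["Range B"]}}}
view = ProgramXAdapter(native)
print(view.test_events())  # an analytic written once works on any adapter
```

An analytic written against `TempMetaModelView` never touches a program's internal structure, which is why a mandated common model is unnecessary.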
10 UML Testing Profile v2 (UTP2), https://www.omg.org/spec/UTP2/2.1/About-UTP2/
11 Unified Architecture Framework (UAF v1.1). https://www.omg.org/spec/UAF/1.1/UAFP/PDF
Consistency with other models can also serve as an additional guideline during the implementation of a Model-Based TEMP based on the Reference Architecture.
University of Arizona Digital Test and Evaluation Ontology
To facilitate interoperability and consistency of implementation of the Model-Based TEMP across different programs, maintaining a common and consistent ontology is important. The University of Arizona has done significant work in developing and publishing a standard test and evaluation ontology. The Arizona digital TEMP is an ontological representation of the Test and Evaluation Master Plan (TEMP) as defined in DoD Instruction (DoDI) 5000.89. Leveraging digital engineering principles, the ontology integrates mission models, requirements, test data, and resources into a cohesive digital ecosystem.
This framework is underpinned by an ontology-based structure that organizes key T&E elements into a modular, scalable system, built on top of the University of Arizona Ontology Stack (UAOS), an open-source set of interoperable ontologies.
The UAOS comprises foundational, core, and domain-specific ontologies. These modular layers define the roles, relationships, and rules essential to T&E. The larger ontology comprises nine ontologies, each roughly corresponding to a section in DoDI 5000.89. The overall structure of the ontology stack is presented in Figure 10.
These features are enabled by semantic web technologies. The ontology stack has been written in the Ontological Modeling Language (OML) and can therefore leverage reasoning, querying, and validation of data to provide comprehensive insights into T&E processes. The ontology stack centralizes models, data, and test artifacts within a digital ecosystem, ensuring consistency, enabling automation, and supporting real-time updates. This environment also supports live data integration, allowing for rapid updates and scenario analysis as test events unfold.

Figure 10: The Ontology Stack
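As an illustration of the kind of rule-based validation such a semantic stack enables, the sketch below checks one traceability rule over a toy triple set in plain Python. The class and predicate names are illustrative stand-ins, not the UAOS vocabulary, and a real implementation would use OML's reasoning, querying, and validation machinery rather than hand-written loops.

```python
# Plain-Python stand-in for rule-based validation that an OML/semantic-web
# toolchain would perform; class and predicate names are illustrative.
triples = {
    ("OT-1", "type", "TestEvent"),
    ("OT-1", "verifies", "REQ-7"),
    ("REQ-7", "type", "Requirement"),
    ("DT-2", "type", "TestEvent"),   # no 'verifies' link: should be flagged
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is asserted."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

def unverified_test_events():
    """Rule: every TestEvent must trace to at least one Requirement."""
    events = {s for (s, p, o) in triples if p == "type" and o == "TestEvent"}
    return sorted(e for e in events
                  if not any("Requirement" in objects(v, "type")
                             for v in objects(e, "verifies")))

print(unverified_test_events())  # ['DT-2']
```

Running such rules continuously against the centralized ecosystem is what allows gaps to surface in real time as test events and requirements are added or updated.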
As part of the development of the DOT&E standard model-based TEMP Reference Architecture, we evaluated the Arizona ontology to ensure that the MB-TEMP-RA was consistent with it, and made adjustments and additions to the Reference Architecture to bring it in line with the ontology. These updates will be released as part of the next release of the MB-TEMP-RA.
4.0 Use of Products
To use a model-based TEMP effectively, it was determined that the model-based format needed to be fully backward compatible with both current and past paper-based TEMP formats. TEMPs have been used by programs and testing organizations for many years. It is important that existing and new programs be able to develop their TEMPs in whatever format works for their program, and then easily convert that format into one of several available model-based TEMP formats. In this way it becomes possible to standardize the format of any program and, for example, to aggregate TEMPs in a digital format for additional analysis.
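As a sketch of the aggregation payoff, assuming a hypothetical minimal interchange shape (the field names below are illustrative, not a mandated schema), a shared digital format makes cross-program queries straightforward:

```python
import json

# Hypothetical minimal interchange shape for a digitized TEMP;
# field names are illustrative, not a mandated schema.
temp_a = {"program": "NGJ", "events": [{"name": "DT-1", "resource": "open-air range"}]}
temp_b = {"program": "XYZ", "events": [{"name": "OT-1", "resource": "open-air range"}]}

def events_by_resource(temps):
    """Cross-program view: which programs plan to use the same test resource?"""
    out = {}
    for t in temps:
        for e in t["events"]:
            out.setdefault(e["resource"], []).append((t["program"], e["name"]))
    return out

print(json.dumps(events_by_resource([temp_a, temp_b]), indent=2))
```

A query like this, run across every digitized TEMP, is exactly the kind of portfolio-level analysis (e.g., spotting contention for a shared test range) that paper-based TEMPs cannot support.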
A principal objective of the reference architecture is to provide practical guidelines and support by embodying a framework that enables technical and program leaders, test planning personnel, T&E decision-makers, and program T&E WIPTs to identify and solve problems relating to test planning and system deficiencies before making final acquisition or fielding decisions. The set of key questions and corresponding answers highlighted in Table 1 was used to guide the architecting and development approach of the MB-TEMP RA. Reasoning about the relationships between the key questions and corresponding answers provided a tailored systems engineering approach to the RA development and to the subsequent implementation approach by programs.
Table 1: MB-TEMP RA Key Questions

The phased approach portrayed in Figure 11 was used to evolve the TEMP into a digital engineering artifact and tool, and outlines the phases through which the TEMP model is expected to evolve. Initial baseline TEMP RA models currently exist for Phase 1, Phase 2, and Phase 3 MB-TEMPs.

Figure 11: Four Phase MB-TEMP RA Development Approach
The meta-model provides a format to adhere to, which will increase compatibility among different implementations of the Model-Based TEMP. The current plan is to fully incorporate the UML Testing Profile v2 (UTP2) and Unified Architecture Framework (UAF v1.1) meta-models in an update in the spring of 2025 for use by different acquisition programs.
5.0 Planned Future Development and Improvements
The Model-Based TEMP is a critical tool for expanding the capability to do long-range planning of test programs and to perform trade-space analysis of different test approaches, test capabilities, and simulation capabilities in order to build and optimize a test plan for a given program.
To realize the potential of the Model-Based TEMP, several feature enhancements are planned for the Reference Architecture in the near term. Two of the most important features currently under development are a design of experiments function that will aid test planners in designing tests to satisfy specific requirements, and a function to incorporate mission-based risk assessments into the TEMP to help with understanding how to test for these contingencies.
5.1 Integration of Bayesian Inference Design of Experiments
In the MB-TEMP RA Model phase, an initial design of experiments domain was defined based on informal requirements in the TEMP Guidebook Version 3.1, which specifies general expectations for design of experiments to support TEMP milestones, test plans, and test designs. The Design of Experiments (DOE) views were developed from the viewpoint of defining the relationships among the PMO, the MB-IDSK decision domain, and the design of experiments domain. Model elements represented in this view include program milestone/technical review decisions and the milestone-specific design of experiments domains. The requirements defined in this phase will be refined based on stakeholder needs. The strategy for the next phase of the MB-TEMP RA is to further link experimental design data to test and evaluation planning model elements.
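As a hedged illustration of how Bayesian inference can size a test within a DOE function, the sketch below uses a standard Beta-Binomial model (an illustrative choice, not necessarily the method the RA will adopt) to find the smallest failure-free demonstration that yields a desired posterior confidence in a reliability threshold.

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density, computed via log-gamma for numerical stability."""
    ln = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
          + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))
    return math.exp(ln)

def prob_reliability_exceeds(threshold, successes, failures, a0=1.0, b0=1.0, steps=10000):
    """P(reliability > threshold) under a Beta(a0, b0) prior, after observing
    the given successes/failures, via trapezoidal integration of the posterior."""
    a, b = a0 + successes, b0 + failures
    h = (1.0 - threshold) / steps
    xs = [threshold + i * h for i in range(steps + 1)]
    ys = [beta_pdf(min(max(x, 1e-12), 1 - 1e-12), a, b) for x in xs]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

# Size a reliability demonstration: the smallest number of failure-free
# trials giving >= 80% posterior confidence that reliability exceeds 0.90.
for trials in range(1, 40):
    if prob_reliability_exceeds(0.90, trials, 0) >= 0.80:
        print(trials)  # prints 15; matches the closed form 1 - 0.9**(n+1) >= 0.8
        break
```

A calculation of this shape lets a test planner trade the number of trials against the confidence delivered to a milestone decision, which is precisely the linkage between the DOE domain and the MB-IDSK decision domain described above.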
5.2 Mission-Based Risk Assessment
Designed for application across the acquisition lifecycle, the Mission-Based Risk Assessment (MBRA) ensures systems are developed and tested to meet operational, survivability, and lethality requirements under contested, congested, constrained conditions. By identifying vulnerabilities and prioritizing risks, MBRA supports the development of actionable test plans, acquisition strategies, and risk mitigation measures. This dynamic process adapts to new data, threats, and system updates, ensuring that decision-makers have the insights needed for informed risk management.
MBRA is conducted collaboratively by Program Managers, Test & Evaluation Working-level Integrated Product Teams (T&E WIPT/ITTs), Operational Test Agencies (OTAs), Live Fire Test & Evaluation (LFT&E) organizations, cybersecurity specialists, system engineers, and intelligence community representatives. As such, the MBRA is a critical part of future test planning activities and will be added to the Model Based TEMP.
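A minimal sketch of the prioritization step, assuming an illustrative likelihood/consequence risk-matrix scoring (the risk names and 5x5 scale are hypothetical, not the official MBRA methodology):

```python
# Illustrative risk prioritization in the spirit of MBRA; the risk names
# and 5x5 scoring scale are hypothetical, not the official methodology.
risks = [
    {"risk": "GPS-denied navigation degradation",   "likelihood": 4, "consequence": 5},
    {"risk": "Data-link latency under jamming",     "likelihood": 3, "consequence": 4},
    {"risk": "Thermal margins in desert operations", "likelihood": 2, "consequence": 3},
]

for r in risks:
    r["score"] = r["likelihood"] * r["consequence"]  # simple risk-matrix score

# Highest-scoring risks drive test planning priority.
for r in sorted(risks, key=lambda r: r["score"], reverse=True):
    print(f'{r["score"]:>2}  {r["risk"]}')
```

In a model-based TEMP, each prioritized risk would link to the test events and resources planned to retire it, so reprioritization propagates directly into the test plan.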
6.0 Integration with Other Digital Models and Systems
Model-based systems engineering creates several key opportunities in the development of systems, including the transparency of data and the ability to move data and other information across different models. As we develop and improve the different models for systems, it is critical that these models are compatible in meaningful ways. This requires a concerted effort to ensure that data and information created and stored in these models can be accessed and used by the model-based TEMP. Several key aspects of creating these interfaces are noted below, organized by the classes of models that will need to be integrated with the MB-TEMP as it is implemented on real-world systems in the near future.
6.1 Mission Models
Mission models are designed to capture the concepts of operations of different systems and how those systems interact with the battlefield and with other systems contributing to the specific mission of interest. Mission models are particularly important in the context of the TEMP because they help us develop meaningful requirements for the system under development as well as meaningful test cases to ensure the operational effectiveness of these systems.
6.2 System Models
The system models describing the performance, design, and behavior of the system under development are created by the program offices and by the vendors during development. It is critical that the TEMP models interface with these system models to link them to both requirements and test cases.
6.3 Modeling and Simulation
For the TEMP digital models, there is an additional need to create and maintain good interfaces with the major Modeling and Simulation (M&S) tools used by both the government and the vendors to design and test the systems under development. These interfaces will increase the effectiveness of the system models and our ability to use M&S as part of comprehensive test programs.
7.0 Conclusions
The development of the Model-Based TEMP creates an advancement and an opportunity for the Department of Defense to take advantage of the digital advancements in other parts of the acquisition process. With the development of the Model-Based TEMP, programs, test organizations, vendors, and DOT&E can share data and plans and make the test planning process both interactive and dynamic.
The current version of the Model-Based TEMP RA was designed to accommodate all levels of digital engineering maturity among programs and other key stakeholders. The MB-TEMP can be developed by digitizing current paper-based versions of the TEMP or can be implemented in a fully digital format consistent with digitally native programs. The continued development of the MB-TEMP, including the addition of new features, will increase its usability as a dynamic planning and analysis tool for program offices and test working groups.
The development of the Model-Based TEMP and the associated integration of the MB-TEMP with other modeling efforts within programs will make it possible to better use the IDSK as a decision support tool, as it was intended.
8.0 Recommendations
The Model-Based TEMP is an important advancement in the use of digital engineering across all parts of the acquisition lifecycle. In order to capitalize on the new capabilities that the MB-TEMP provides to programs and test organizations, we recommend the following activities (many of which are already planned).
First, as we move forward with the major digital engineering efforts within the DoD, we are finding that different organizations are developing different technical approaches to digital engineering systems and artifacts. To maximize the utility of these efforts and to ensure these systems remain useful as they are integrated, it is critical that their architects work together to maintain the compatibility of the different digital engineering systems. Second, efforts should be undertaken to integrate as many of the system models used by programs as possible, to facilitate the transfer and transparency of data across the different models developed by the program offices and vendors. Lastly, to specifically enhance the test planning aspects of the TEMP, we recommend efforts to better model the test capabilities of the Services and the vendor community (including models of the capabilities of major Modeling and Simulation (M&S) tools). This will improve the quality of test program development and allow for earlier and better decision making (IDSK) based on knowledge of the future performance of systems under development.
References
Executive Services Directorate. (2020). DoD Instruction 5000.89. Retrieved from Executive Service Directorate: https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/500089p.PDF
DOT&E. (2022). Office of the Director, Operational Test and Evaluation Strategy Update – Strategic Pillars, viewed 30 March, 2023, FINAL DOTE 2022 Strategy Update 20220613.pdf (osd.mil)
Beers, S. M., Hutchison, S., & Mosser-Kerner, D. (2013). Developmental Test Evaluation Framework: Focus on Evaluation and Analysis for Acquisition Success. Phalanx, 46(3), 36-39.
Beers, S. M. (2023). Integrated Decision Support Key (IDSK) for Capability Evaluation-Informed MDO Decision-Making, ITEA MDO Workshop, July 2022.
Collins, C. & Beers, S. M. (2021). Mission Engineering, Capability Evaluation & Digital Engineering Informing DoD Technology, Prototyping and Acquisition Decision, NDIA S&ME conference, Dec 2021.
Werner, J. S., & Arndt, C. (2023). Development of Digital Engineering Artifacts in Support of MBSE based Test Planning, Execution, and Acquisition Decision Making. Acquisition Research Program.
Army Aviation and Missile Command Fort Eustis VA. (2022). Comprehensive Architecture Strategy (CAS). https://apps.dtic.mil/sti/pdfs/AD1185001.pdf
Muller, G. & Hole, Eirik. (2007). Reference Architectures; Why, What and How, White Paper Resulting from Architecture Forum Meeting, Embedded Systems Institute and Stevens Institute of Technology, Hoboken NJ, USA March 12 & 13, 2007
ISO/IEC/IEEE (2022). Systems and Software Engineering—Architecture Description, document ISO/IEC/IEEE 42010:2022(en), 2022.
Arndt, C., Anyanhun, A., & Werner, J. S. (2023). Shifting Left: Opportunities to Reduce Defense Acquisition Cycle Time by Fully Integrating Test and Evaluation in Model Based Systems Engineering. Acquisition Research Program.
Range Commanders Council. (2011). Test and Evaluation Metadata Reference Model.
Director, Operational Test and Evaluation (DOT&E) Test and Evaluation Master Plan (TEMP) Guidebook Version 3.1. (2017).
J. Gregory and A. Salado, “An Ontology-based Digital Test and Evaluation Master Plan (dTEMP) Compliant with DoD Policy,” Syst. Eng., 2024.
J. Gregory and A. Salado, “dTEMP: From Digitizing to Modeling the Test and Evaluation Master Plan,” Nav. Eng. J., vol. 136, no. 1, pp. 134–146, 2024.
Department of Defense, "DoD Instruction 5000.89: Test and Evaluation," 2020.
J. Gregory and A. Salado, “Towards a Systems Engineering Ontology Stack,” in INCOSE International Symposium, Dublin, Ireland, 2024.
OpenCAESAR, “Ontological Modeling Language 2.0.0,” 2023. [Online]. Available: https://www.opencaesar.io/oml/.
Jeremy Werner, Kelli Esser, Craig Arndt, Trisha Radocaj, Awele Anyanhun, Daniel Wolodkin, Geoffrey Kerr, Laura Freeman, “Integrated Decision Support Key: Advancing Acquisition Decisions with Data Models and Tools”, Naval Engineering Journal, Spring 2024, pages 121-133
Author Biographies
Dr. Craig Arndt Dr. Arndt has extensive experience as a senior executive and technology leader in research, education, engineering, defense, homeland security, and intelligence technologies, and as an innovative leader in industry, academia, and government. Dr. Arndt currently serves as a principal research engineer on the faculty of the Georgia Tech Research Institute (GTRI) in the Systems Engineering Research division of the Electronic Systems Lab. Dr. Arndt is a licensed Professional Engineer (PE), a Certified Human Factors Professional (CHFP), and an Expert Systems Engineering Professional (ESEP), and he has over 40 years of professional engineering experience throughout the defense and government engineering community. He is widely published in the areas of electrical, systems, test, and human factors engineering, and serves on the boards of several technical organizations. Dr. Arndt holds engineering degrees in electrical engineering, systems engineering, and human factors engineering, and a Master of Arts in strategic studies from the US Naval War College.
Dr. Randy Saunders Randy Saunders is a Principal Staff Analyst at the Johns Hopkins University Applied Physics Laboratory. He has over 40 years of experience in the analysis, model-based design, implementation, and integration of high-fidelity simulations for military and government customers.
Geoffrey Kerr Geoffrey Kerr is a Senior Research Associate at the Virginia Tech National Security Institute with expertise in Systems and Digital Engineering. Mr. Kerr is a 34-year veteran of the Aerospace and Defense industry where he has vast experience in Systems Engineering, Program and Project Management, and diverse technical leadership in developing aircraft for the US Government and allied nation states. Mr. Kerr held a prominent executive role implementing Digital Transformation across the Lockheed Martin Corporation prior to joining VTNSI in April 2022.
Amanda Crawford received her B.S. in Industrial Engineering from California State Polytechnic University, Pomona (2021) and is a research engineer specializing in Model-Based Systems Engineering (MBSE) at the Georgia Tech Research Institute. Her research interests focus on applying innovative MBSE solutions to support the enterprise digital transformation of test and evaluation and acquisition processes through the development of digital models and guidebooks for MBSE professionals. She has previous systems safety engineering experience at Northrop Grumman.
Dewey Classification: L 681 12



