DECEMBER 2024 | Volume 45, Issue 4

A Case Study-based Assessment of a Model-driven Testing Methodology for Applicability and Metrics of Model Reuse

Jose L. Alvarado, Jr.

AFOTEC Det 5, Edwards AFB,
California

Dr. Thomas H. Bradley

Colorado State University, Ft. Collins,
Colorado

DOI: 10.61278/itea.45.4.1002

Abstract

The Department of Defense (DoD) Test and Evaluation (T&E) community has fully embraced digital engineering, as defined in the 2018 Digital Engineering Strategy, motivating the ongoing development and adoption of model-based testing methodologies. The present article expands on existing grey box model-driven test design (MDTD) approaches [1] by leveraging model-based systems engineering (MBSE) artifacts to generate flight test planning models and documents. A baseline model of a system under test (SUT) and two additional system case studies are used to assess the MDTD process. We illustrate the method’s applicability to these case studies, evaluate the benefits of MDTD by applying novel metrics of model element reuse, and discuss the relevance to operational flight testing. This approach is novel within the flight-testing community as it is the first implementation of MDTD in United States Air Force (USAF) operational testing applications. Whereas previous studies have explored SysML model reuse in small-scale problems or product families [2] [3], MBSE model management for operational tests at flight-system scale and assessment of reuse in the T&E phase of the systems engineering (SE) lifecycle are unresearched to date. This methodology and the case studies will be of particular interest to those involved in developing, executing, and reporting on flight test plans in the context of the DoD Digital Engineering transformation.

Keywords: AFOTEC, flight test, MDTD, MBSE, Reusability, SysML, T&E, test planning

Introduction

Within the Department of Defense (DoD), flight test plans are developed to “communicate technical details and logistics for executing and reporting the outcomes of flight, ground, and laboratory tests on air vehicles, subsystems, and components” [4]. They establish a connection between test points and specified requirements, confirming the SUT aligns with system specifications and delivers the intended performance in the proposed operational environment [5]. These plans encompass all aspects of test execution, including allocating critical test infrastructure and resources, collecting data, and analyzing data post-flight. Test plans capture and communicate a broad set of derived requirements that must be met for the successful execution of flight tests and are an integral aspect of the DoD acquisition process [6]. These derived requirements are meticulously designed to ensure that the necessary data for evaluating system performance are collected within the constraints of existing flight test resources, processed into meaningful information, and thoroughly analyzed. The conclusions drawn from the post-flight data analysis are then incorporated into reports to communicate and validate the SUT’s flight test performance [5].

Within existing DoD acquisition processes, flight testing is an essential component of a sequential, document-centric, and resource-intensive process that often does not include early flight test planning. This sequential acquisition process commonly leads to prolonged cycle times, difficulties assimilating warfighter insights and feedback, an inability to swiftly integrate emerging technologies, and poorly justified or uncertain decision-making [7]. The existing flight test planning procedure is a meticulous, documentation-focused process, demanding extensive configuration management efforts to ensure the currency of essential documents relative to the SUT. Considerations for flight testing most often arise late in the acquisition process, resulting in program delays due to test planning, instrumentation, test operations, and analysis. Adding to the causes of program delays, the traditional flight test planning activity is driven by consensus among experts, which consumes significant labor hours and costs.

Integrating flight test planning into a model-based testing (MBT) and broader MBSE environment, as envisioned in the DoD’s Digital Engineering Transformation, has the potential to address many causes of program delays related to physical flight testing. This paper builds on existing model-based grey-box testing processes [1] by developing and demonstrating a flight-test-specific MBT methodology. This methodology leverages digital engineering models to represent the outcomes of a DoD MBSE-enabled product development process. Test planning experts examine these models to assess the formatted information and determine whether it is adequate for test planning. The MBT process identifies missing parts or information by reviewing the native digital template and the resulting draft test plan. The proposed MBT process seeks to allow a new version of the test plan to be re-populated and re-generated when new information, updates, or new technology is inserted into the model. These assertions are tested through an assessment of the MBT methodology’s ability to generate model-driven test plan artifacts that are updateable, consistent, and adaptable across acquisition programs [1].

Background

The DoD employs Digital Engineering (DE) methods and tools to transform how data is used to plan, design, develop, test, deploy, and sustain new systems and mission support capabilities [8]. While DE adoption is increasing, current programs face challenges due to existing infrastructure, processes, and workforce limitations. MBSE for T&E applies a digital approach to support test planning, execution, and the analysis of results while integrating model-based data as needed [9]. MBSE allows engineering teams to operate and handle the complexity of modern systems within a digital environment [9]. This digital environment promotes interoperability, helps identify conflicting requirements or technical gaps, and ensures configuration management as system designs evolve [9].

The “MBSE for T&E” roadmap presented in the “Positioning Test and Evaluation for the Digital Paradigm” [8] is a structured shift from a paper-based, siloed acquisition process to a data-centric T&E acquisition lifecycle as part of the DE transformation [8]. This roadmap is divided into three phases. Phase 1, Model-Based T&E Planning and Control, replaces traditional paper-based artifacts with digital ones, including descriptive and execution models. Phase 2, Dynamic T&E Planning and Execution, builds on the Phase 1 framework by incorporating quantitative methods that allow dynamic updates to test plans. Finally, Phase 3, Coupled Mission, System, and T&E Decision Making, integrates the Phase 2 framework with mission and systems engineering models [8].

MBT is a subset of the emerging domain of MBSE [10]. MBSE comprises a set of interconnected processes, methods, and tools operating within a “model-based” or “model-driven” framework [11]. MBSE methodologies create digital models that function as abstract representations [12] of real-world systems, much like statistical, mathematical, or physics-based models. These MBSE models leverage computing power to capture, recall, and relate relevant data efficiently. MBSE artifacts can describe aspects of a physical SUT’s desired structure and behavior, business processes, and requirements. In practice, the set of interconnected state machine diagrams, use-case diagrams, sequence diagrams, and mathematical models enables testers to create independent test cases, scenarios, and plans abstracted from specific design implementations [13] [14]. Under the MBT paradigm, the model should precisely define test parameters and automatically generate detailed test plans for test cases. Abstract computational representation enables automated and model-based testing, investigation, and optimization, reducing the acquisition cycle’s dependence on physical assets. An MBSE testing model enables the parametric optimization of various test parameters, allowing agile or evergreen reuse without rewriting the test plan when new requirements or changes to the SUT are introduced. This reuse contrasts with the current document-centric process, which requires manual revisions each time the SUT or its requirements change.

A component of MBT is MDTD, which seeks to develop a comprehensive understanding of relationships and dependencies that govern test design. MDTD facilitates the thorough assessment of the testing implications of changes made at any stage of the product’s design cycle [15]. MDTD significantly enhances the overall testing development process [15]. The MDTD process allows a subset of experienced testers to handle the high-level, statistical, and defensibility aspects of test design and development at an abstract level. In contrast, other testers plan the details of Validation and Verification (V&V) criteria, testing operations, and safety. This abstraction allows for the early involvement of designers responsible for the test planning process to incorporate testing requirements into the system design [16]. Ideally, MDTD empowers the operators and test team members to influence and contribute expertise in shaping and refining different aspects of a physical SUT’s intended structure and behavior by incorporating instrumentation, system monitoring, and data processing into the SUT.

Methods

Baseline Operational Test & Evaluation Test Planning Process

The OT&E Construct is the Air Force’s framework for conducting an operational test. It identifies two elements of an operational test and their relationship to each other [5]:

  • Operational Effectiveness measures the system’s overall capability to accomplish a mission when used by representative personnel in the expected or planned operational environment while considering organization, doctrine, supportability, survivability, vulnerability, and threat.
  • Operational Suitability measures the degree to which a system can be satisfactorily used in the field while considering its availability, compatibility, transportability, interoperability, reliability, wartime usage rates, maintainability, safety, human factors, workforce supportability, logistics supportability, natural environmental effects, and impacts, documentation, and training requirements.

These two elements of the operational test (Effectiveness and Suitability) are assessed through critical operational issues (COIs), each of which addresses a vital mission element or operational objective to determine the system’s overall capability to support mission accomplishment. These COIs are assigned measures designed to convey information about a system’s effectiveness, capabilities, functions, properties, or structure. The measures are usually categorized as measures of effectiveness (MOE), measures of performance (MOP), or measures of suitability (MOS) [17]. These MOEs, MOPs, and MOSs are the quantified criteria for assessing changes in system behavior, capability, or operational environment. They measure the attainment of an end state, the achievement of an objective, or the creation of an effect. An MOE is designed to assess how well the SUT can accomplish mission objectives and achieve desired mission effects. A MOP measures a system’s capabilities, while a MOS measures the ability to support the SUT in its intended operational environment [17]. Furthermore, each measure (MOS and MOP) is composed of data elements, defined as any measurement or information used to determine the value of a measure. Data elements may be quantitative, qualitative, subjective, or objective. Figure 1 illustrates the hierarchy of these measures, outlined in red, and how they fall within the overall AFOTEC OT&E construct.

Figure 1. Measure Hierarchies and the Overall AFOTEC OT&E Construct
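
The hierarchy described above can be sketched as a small data structure. This is a minimal illustration only, not the SysML structure used in the study: the class names, field names, and placeholder content are assumptions, and the sketch simplifies the construct by attaching every measure directly to a COI.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataElement:
    """A measurement or piece of information used to determine a measure's value."""
    name: str
    kind: str  # e.g. "quantitative", "qualitative", "subjective", "objective"


@dataclass
class Measure:
    """An MOE, MOP, or MOS attached to a COI."""
    identifier: str
    category: str  # "MOE", "MOP", or "MOS"
    description: str
    data_elements: List[DataElement] = field(default_factory=list)


@dataclass
class COI:
    """A critical operational issue under Effectiveness or Suitability."""
    identifier: str
    element: str  # "Effectiveness" or "Suitability"
    description: str
    measures: List[Measure] = field(default_factory=list)

    def measures_of(self, category: str) -> List[Measure]:
        """Return the measures of one category, e.g. all MOPs."""
        return [m for m in self.measures if m.category == category]


# Hypothetical content, for illustration only.
coi = COI(
    identifier="COI 1",
    element="Effectiveness",
    description="Placeholder issue statement",
    measures=[
        Measure("MOE 1", "MOE", "Placeholder effectiveness measure"),
        Measure(
            "MOP 1", "MOP", "Placeholder performance measure",
            data_elements=[DataElement("position error", "quantitative")],
        ),
    ],
)

print([m.identifier for m in coi.measures_of("MOP")])
```

Keeping each measure as a single object linked to its COI is what lets a later step reference it from several places without retyping it.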

AFOTEC uses a well-established test planning process outlined in the relevant guides [5] and [17]. The flight-testing community has not demonstrated this process in a digital engineering construct, nor have acquisition systems been developed using an MBSE approach. The methods proposed in this paper define how this process may be adapted to enable an MDTD test planning process for future digital engineering-enabled test articles.

The current “document-centric reuse” process requires test planners to review and research previous test plans and documents as a starting point for drafting new plans. This approach relies heavily on “copy and paste” to develop critical elements such as COIs, MOEs, MOPs, and MOSs, which appear multiple times throughout the document and must be manually replicated. As test plans evolve, these components often change, requiring planners to update them repeatedly across different sections. This repetitive process consumes significant time, labor hours, and resources. Additionally, the document-centric approach lacks an efficient reuse mechanism because of personnel turnover and because committees and stakeholders must reconvene to debate changes.

Proposed Model-Based AFOTEC Testing Methodology

Test planning is the process of identifying and communicating the conditions to be implemented during testing. Test plans communicate how an SUT will be tested, how data will be analyzed, and how results will be reported to validate product requirements [18]. Test plan measures confirm the implementation of product requirements that align with the warfighter’s mission objectives and provide insights into the system’s structure and usability in the field [19]. Using an MBSE-enabled process, testing engineers can generate test plans by building SysML test models while the product and its associated models are being designed and developed. This approach allows the testing process to be modeled in parallel with the design and guides all users on the testing requirements and methodology.

A pragmatic approach to formulating a test plan from a given model involves furnishing the model with testing elements to be used in developing the test plan. These elements should contain the critical measures (as illustrated in Figure 1) and operational details needed to create an operational test plan. Developing and implementing a testing model based on the OT&E construct within an MBSE-enabled MDTD enables the creation of test plan models.
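
As a rough illustration of generating a draft plan from testing elements held in a model, the sketch below renders a plain-text outline from a list of COIs and their measures, then re-renders it after one element is edited. The function name, dictionary keys, and placeholder text are assumptions made for this example; the study itself generates test plans from SysML models in Cameo, not from Python dictionaries.

```python
from typing import Dict, List


def render_draft_plan(sut_name: str, cois: List[Dict]) -> str:
    """Render a plain-text draft outline from COIs and their measures."""
    lines = [f"Draft Operational Test Plan: {sut_name}", ""]
    for coi in cois:
        lines.append(f"{coi['id']}: {coi['text']}")
        for measure in coi["measures"]:
            lines.append(f"  {measure['id']}: {measure['text']}")
    return "\n".join(lines)


# Hypothetical model content, for illustration only.
model = [
    {
        "id": "COI 1",
        "text": "Placeholder issue statement",
        "measures": [{"id": "MOE 1", "text": "Placeholder measure"}],
    },
]

first = render_draft_plan("Example SUT", model)

# Edit the element once in the model, then regenerate the whole plan.
model[0]["measures"][0]["text"] = "Revised placeholder measure"
second = render_draft_plan("Example SUT", model)
print(second)
```

The point of the sketch is that the edit is made in one place and every section derived from that element picks it up on regeneration, in contrast to updating each occurrence by hand.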

Model Reuse Assessment

In this study, we have developed a set of case studies that can be used to derive examples of MBSE-enabled test plan models. We then determine the degree of reuse enabled by the proposed MBSE-enabled MDTD process. Although many benefits have been asserted in literature for an MBSE-enabled MDTD process, model reuse is stated consistently to be a key benefit of MBSE relative to the conventional document-centric test planning process [20] [10]. In the context of the continuous testing activity that is the responsibility of AFOTEC, model reuse is a crucial characteristic of improving the efficiency and effectiveness of test plan development under the Digital Engineering construct. Sharing information and automating administrative tasks save time and facilitate process reuse [19].

The Systems Modeling Language (SysML) is a graphical modeling language used in MBSE that employs diagrams and notations to represent a system’s structure, behavior, requirements, and interactions. Since MBSE is still in the early stages of adoption within the DoD and the aerospace industry, a clear definition and ontology of SysML model reuse are required. In this study, we adopt a definition of reuse from software engineering [21]. In the software engineering community, reuse often involves a “copy & paste” method, where specific artifacts are copied and then adapted to meet the current project’s needs [22]. This type of reuse extends the software artifact’s functionality, enabling it to be integrated or repurposed for different applications [23], albeit with emergent system-of-systems risks arising from the multi-proprietary and multi-generational nature of such reuse [24].

To achieve significant benefits, reuse must be systematic, and organizations implementing reuse programs must measure their strategies’ effectiveness [25]. Reuse can be classified into three distinct categories [22]:

  1. Identical Subset Reuse: This occurs when a subset of initial functions or elements is identical to a subset of the new functions or elements, requiring only minor changes such as renaming.
  2. Modified Subset Reuse: This involves a subset of new functions or elements that are modified versions of the initial functions or elements. This category can be further divided into:
    1. Minor Modifications: This involves minor changes, such as renaming or altering input/output values.
    2. Major Modifications: Significant modifications to the content are required while maintaining the same structure.
  3. Creation of New Elements: This involves the creation of entirely new functions or elements.

A standard metric used to measure systematic software reuse is Product Reuse Percent (PRP), which indicates how much of the product can be attributed to reuse. This metric is calculated using the amount of reused source instructions (RSI) and shipped source instructions (SSI). Reuse percent is calculated as follows [25]:

(1) $\mathrm{PRP} = \dfrac{\mathrm{RSI}}{\mathrm{RSI} + \mathrm{SSI}}$

When the product contains no reused instructions, RSI=0, resulting in PRP=0. When the product consists entirely of reused instructions, the SSI term in the denominator vanishes, as all shipped instructions result from reuse. In this case, SSI=0 and PRP=1.
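
The two limiting cases can be checked numerically. The sketch below assumes PRP is the ratio RSI / (RSI + SSI), which is the form implied by the limiting cases described here, with SSI counting only the instructions that were not reused. The function name and the sample counts are hypothetical.

```python
def product_reuse_percent(rsi: int, ssi: int) -> float:
    """Return PRP as a fraction in [0, 1], assuming PRP = RSI / (RSI + SSI)."""
    if rsi < 0 or ssi < 0:
        raise ValueError("instruction counts must be non-negative")
    total = rsi + ssi
    if total == 0:
        raise ValueError("product has no instructions")
    return rsi / total


print(product_reuse_percent(0, 500))    # no reuse
print(product_reuse_percent(500, 0))    # entirely reused
print(product_reuse_percent(250, 750))  # mixed, hypothetical counts
```

The first call gives 0.0 and the second gives 1.0, matching the two cases in the text; the mixed case gives 0.25.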

An alternate metric of software reuse is Reuse Value Added (RVA) [25]. This metric evaluates the potential benefits of consuming and producing reusable software and is based on a productivity index that determines the efficiency of reuse across multiple products. The RVA is calculated as follows [25]:

(2) $\mathrm{RVA} = \dfrac{(\mathrm{SSI} + \mathrm{RSI}) + \mathrm{SIRBO}}{\mathrm{SSI}}$

Products that do not reuse code have RVA = 1, while an RVA = 2 indicates the product is twice as efficient as it would have been if created without reuse [25]. Higher RVA scores indicate higher total efficiency of a development product.

Source Instructions Reused by Others (SIRBO) represent the total lines of code that new products reuse from existing ones. SIRBO measures the extent and success of reusing components from older products [25]. Organizations that develop successful, reusable components will have a high SIRBO, as the metric increases each time another product reuses the software. SIRBO is high in organizations that produce high-quality, well-documented, broadly applicable, and reusable components [25]. SIRBO is calculated by summing the code lines for all product parts that contribute to reuse.

(3) SIRBO
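
A short numerical sketch of the RVA metric follows. It assumes the form commonly given for this metric in the software reuse literature, RVA = (SSI + RSI + SIRBO) / SSI; that form is an assumption here, adopted because it reproduces RVA = 1 for a product with no reuse. SIRBO is passed in as an already computed total, and the function name and sample values are hypothetical.

```python
def reuse_value_added(ssi: int, rsi: int, sirbo: int) -> float:
    """Return RVA, assuming RVA = (SSI + RSI + SIRBO) / SSI."""
    if min(ssi, rsi, sirbo) < 0:
        raise ValueError("counts must be non-negative")
    if ssi == 0:
        raise ValueError("RVA is undefined when SSI is zero")
    return (ssi + rsi + sirbo) / ssi


# No reuse consumed or produced.
print(reuse_value_added(ssi=1000, rsi=0, sirbo=0))

# Hypothetical product that reuses as many instructions as it writes.
print(reuse_value_added(ssi=1000, rsi=1000, sirbo=0))

# Hypothetical product whose components are also reused by others.
print(reuse_value_added(ssi=1000, rsi=500, sirbo=1500))
```

Under this assumed form the three calls give 1.0, 2.0, and 3.0, so both consuming reusable parts (RSI) and producing parts that others reuse (SIRBO) raise the score.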

Substituting the SIRBO equation into the RVA calculation provides a simple formula using direct measurements of a product [26].

(4) SIRBO equation into the RVA calculation

Since our models contain modeling elements and not lines of code, we will redefine RSI as the number of reused elements (RSE) and SSI as the number of model elements (ME). Thus, we can define a new metric for model reusability, called Model Reuse percent (MR%, analogous to PRP), which is calculated as follows:

(5) $\mathrm{MR\%} = \dfrac{\mathrm{RSE}}{\mathrm{RSE} + \mathrm{ME}}$
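
To make the substitution concrete, the sketch below tags each element of a model with one of the reuse categories listed earlier and computes MR% from the tags. It rests on two assumptions that are not stated in the text: MR% takes the same ratio form as PRP, RSE / (RSE + ME), and ME counts only newly created elements, by analogy with SSI. The enum, function name, and element tags are hypothetical and do not come from the case study models.

```python
from enum import Enum
from typing import Iterable


class Reuse(Enum):
    IDENTICAL = "identical subset reuse"
    MINOR_MOD = "modified subset reuse, minor"
    MAJOR_MOD = "modified subset reuse, major"
    NEW = "new element"


def model_reuse_percent(categories: Iterable[Reuse]) -> float:
    """Return MR% as a fraction, assuming MR% = RSE / (RSE + ME)."""
    tags = list(categories)
    rse = sum(1 for tag in tags if tag is not Reuse.NEW)  # reused elements
    me = sum(1 for tag in tags if tag is Reuse.NEW)       # newly created elements
    if rse + me == 0:
        raise ValueError("model has no elements")
    return rse / (rse + me)


# Hypothetical tags for eight elements of a derived model.
tags = (
    [Reuse.IDENTICAL] * 2
    + [Reuse.MINOR_MOD] * 3
    + [Reuse.MAJOR_MOD] * 1
    + [Reuse.NEW] * 2
)

print(f"MR% = {model_reuse_percent(tags):.0%}")
```

With six of eight elements reused in some form, the sketch reports 75%. Whether heavily modified elements should count fully toward RSE is a judgment call that this example does not settle.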

Moreover, since lines of code reused in the SIRBO equation are equivalent to reused elements, and the number of products is the same as the number of models, we can restate the RVA formula for use in modeling as follows:

(6) RVA formula for use in modeling

The MR% metric primarily assesses the reuse of artifacts between two consecutive generations of models. On the other hand, the RVA metric can evaluate reuse between consecutive model generations and between a baseline model and subsequent models that are multiple generations removed. Figure 2 illustrates the relationships between these two metrics and the models. The MR% and RVA metrics will be applied to evaluate the extent of reuse.

Figure 2.  MR% and RVA Metrics as Applied Between the Generations of Case Study Models.

The proposed refined RVA metric offers a straightforward way to assess how efficiently elements are reused from the baseline model through multiple generations of reuse in other models. A score above 1 signifies successful reuse, indicating that developers consistently reuse components with each new model iteration. This metric encourages developers to create high-quality, well-documented, widely applicable components, which in turn supports the creation of more reusable artifacts.

Case Studies Definition

Three case study models were developed in SysML using the Dassault Systemes’ Cameo Systems Modeler™ software package to execute and assess the proposed process. The first model, the Combat Search and Rescue Locator (CSARL), serves as the baseline case and was created from a design and development point of view. The model excluded participation and inputs from the testing community during its development. The second model, the E-X airborne multi-sensor platform, is a revision and update of an example in the textbook “Effective Model-Based Systems Engineering” [20]. The third model, the next-generation sensor (NGS) system, is a generalized version of a “real-world” example model developed by the Air Force Materiel Command (AFMC) digital engineering program office [27]. The NGS model is used to teach government and contractor personnel how to use and implement the weapons government reference architecture (weapons GRA), which standardizes the Air Force’s implementation of digital engineering in the DoD’s acquisition process. Details of these case studies and why they were chosen are presented in the following paragraphs.

The baseline case study focuses on AFOTEC’s Combat Search and Rescue Locator (CSARL) system. CSARL is a fictional system developed as a training case for AFOTEC flight test personnel learning to create test plans. It is well-documented, well-accepted in the testing community, and has a thorough test plan “answer key.” The CSARL system is notionally a portable hand-held Global Positioning System (GPS) receiver system offering enhanced digital moving maps and real-time navigational capabilities for downed aircrew and search and rescue (SAR) forces during combat and peacetime survival scenarios. Its purpose is to complement current GPS-integrated survival radios, enhancing existing survival and rescue navigation capabilities without relying on traditional map-and-compass techniques [28]. CSARL is an example system demonstrating the proposed methods for relevant flight-test test plan generation. Since the system is well-documented and all AFOTEC personnel are familiar with the application and model, it is a good benchmark for applying the grey box-inspired testing technique implemented in an MBSE environment to create a flight test plan without direct SME intervention. Moreover, it provides a means to compare outputs between the document-centric and model-centric processes.

The first case study uses a system described in the Effective Model-Based Systems Engineering textbook [20]. The textbook introduces the “E-X Airborne Sensor Platform” as a hypothetical system used to exemplify the development of various diagrams for defining and describing software-intensive aerospace systems using SysML. The E-X airborne multi-sensor platform gathers intelligence, surveillance, and reconnaissance (ISR) data. Operating in collaboration with ground stations and affiliated organizations, it facilitates the collection and dissemination of ISR products. The E-X platform executes ISR missions and performs aerial imaging and sensing functions [20]. The E-X case study is a valuable example of a moderately complex system, using SysML diagrams readily accessible in the textbook [20] and easily replicated in Cameo. Like the CSARL case study, the E-X system includes comprehensive documentation, making it an ideal standard for applying grey box-inspired testing techniques within an MBSE environment. Additionally, it offers another basis for comparing outcomes between document-centric and model-centric processes.

The second case study uses an example model developed by the AFMC digital engineering program office called the Next Generation Sensor Platform (NGS) System [27]. The NGS system is a notional airborne sensor platform that can be configured to use various sensors to aid the ISR collection mission. The NGS test case study provides a practical illustration of a moderately complex physical system, using SysML diagrams to which the grey box-inspired testing techniques within an MBSE construct can be applied.

Table 1 summarizes the features of each case study model and visually demonstrates the increase in complexity across the test cases. The number of elements and diagrams increases as the test cases become more representative of real-world complex systems. This, in turn, outlines the characteristics of these systems and highlights the correlation between system complexity and the number of elements and diagrams. The selection of two case studies and one baseline case for this research aims to test the applicability to diverse systems with distinct missions and similar systems with identical missions but varying degrees of complexity.

Table 1. Summary Table of Case Study Model Characteristics

Applicability Assessment Methods

The CSARL case study was used to create and develop the model-based AFOTEC testing model (Figure 3), which includes all the elements used in a test plan generation process, as outlined in the Operational Test & Evaluation Construct section. Each folder within the “AFOTEC Methodology” folder of the CSARL system was populated with details to meet the requirements of the AFOTEC test planning guide [5]. Thus, details of each MOE, MOP, and MOS linked to each COI were also populated with all the necessary specifics that each element requires to meet the AFOTEC test planning guide [5]. Appendix Table A-1 provides a detailed, comprehensive overview of the COI, MOE, MOP, and MOS subfolders within the AFOTEC Methodology folder for the CSARL case study. This methodology was then used to generate a sample test plan using the elements within the AFOTEC test structure. The plan was compared with its document-centric counterpart and assessed for content and methodology.

Figure 3. The CSARL Model Layout Incorporates the AFOTEC Test Model, Labeled “AFOTEC Methodology.”

Once the “AFOTEC Methodology” model was created and assessed to contain all the information needed to generate a preliminary draft test plan, the structure was adapted to the E-X airborne multi-sensor platform model. To match the requirements of E-X test planning, the AFOTEC test plan model elements required varying levels of modification. These ranged from simple edits, such as changing names and parameters to match the model, to significant changes that required rewriting the information to add detail and tailor each element to the relevant model functions or behaviors. This level of adjustment was expected since the CSARL and the E-X platform models are very different in scope, concept, and mission.

Once the elements within the “AFOTEC Methodology” model for the E-X platform model were modified, they were assessed for content to ensure that each MOE, MOP, and MOS was linked to its associated COI and populated with all the essential details required to meet the AFOTEC test planning guide [5]. Appendix Table A-2 provides a detailed breakdown of the COI, MOE, MOP, and MOS subfolders within the AFOTEC Methodology Model for the E-X test case study.

Last, the “AFOTEC Methodology” model used in the E-X airborne multi-sensor platform was adapted to the second test case, the NGS system model. The elements were again reviewed and modified to fit the model’s parameters outlined in Appendix Table A-3. The elements associated with the NGS system required only minimal modification, with changes limited to simple edits to change names and parameters to match the model since the E-X airborne multi-sensor platform and the NGS system are relatively similar systems. Table 2 summarizes the various components of the AFOTEC Methodology Model and indicates the number of COIs, MOEs, MOPs, and MOSs associated with each case study.

Table 2. Summary Table of AFOTEC Methodology Model Elements for the Two Case Study Models and the Baseline Case Model

Methods Summary

In summary, the two case study models and the one baseline case model, created using MDTD processes to generate model-driven test plans from MBSE-enabled system models, were serially developed and then assessed to evaluate the effectiveness and applicability of the proposed AFOTEC Methodology. The outcome of this process is model-based descriptions of the elements within the “AFOTEC Methodology” model, which are used to develop an operational test plan for any SUT.

Results

The results first compare the baseline CSARL model to the E-X model, focusing on the MR% and RVA metrics of reuse. Next, the E-X platform model is compared to the NGS system model. Once again, the MR% and RVA metrics are assessed and discussed. Lastly, the RVA reuse metric is evaluated between the baseline CSARL case model and the second test case NGS system model. These results illustrate the function of the proposed MDTD methodology and indicate the degree of reuse between the models.

Reuse Between Systems with Different Missions (CSARL and E-X Platform)

Comparing the elements and structures of the baseline CSARL (Table A-1) and the E-X platform models (Table A-2), we observed that the “AFOTEC Methodology” model folder structure and model architecture, as shown in Figure 3, remained the same. However, adapting the folder structure and model architecture to the E-X platform model required adjustments within the “AFOTEC Methodology” model. To accommodate the additional COIs needed for the capabilities of the E-X platform, we duplicated the existing COI 1 from the baseline CSARL case model to develop new COIs for the E-X platform model. We modified the duplicated COI’s contents to reflect the information needed for the new COI in the E-X platform model. The E-X platform model incorporates modified versions of the original three CSARL COIs and two additional COIs created using this duplication process. Similarly, the existing MOE 1 from the CSARL model was repeatedly duplicated, reconfigured, and modified to align with the E-X platform model’s requirements, creating eight more MOEs. This duplication process was also applied to the MOP and MOS sections of the E-X platform model, yielding three more MOPs and 14 more MOSs for the E-X platform model.

The replication and modification process used to adapt the “AFOTEC Methodology” model illustrates the model’s architectural adaptability to meet the specific demands of the E-X platform model. This process highlighted the importance of reuse in streamlining the development and creation of new structures within a model. Moreover, by reusing aspects within the “AFOTEC Methodology” model, this approach maintains consistency among individual elements across models, aids clarity, and reduces ambiguities in information. However, caution is required when reusing models: elements applied to a new SUT without careful consideration of the new context can introduce significant risks.
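
The duplicate-then-modify step described in this section can be sketched in a few lines. This is an illustrative analogy only: the study performed the duplication inside Cameo Systems Modeler, and the class, field names, and placeholder text below are assumptions. The `derived_from` field is a hypothetical addition showing one way to keep a record of which baseline element a copy came from, which is useful when counting reused elements later.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelElement:
    identifier: str
    text: str
    derived_from: Optional[str] = None  # provenance of a duplicated element


def duplicate_and_modify(source: ModelElement, new_id: str, new_text: str) -> ModelElement:
    """Copy an existing element, then tailor the copy to the new model."""
    clone = copy.deepcopy(source)          # the source element is left untouched
    clone.identifier = new_id
    clone.text = new_text
    clone.derived_from = source.identifier
    return clone


# Hypothetical baseline element and tailored copy.
baseline_coi = ModelElement("COI 1", "Placeholder baseline issue statement")
new_coi = duplicate_and_modify(
    baseline_coi, "COI 4", "Placeholder issue statement for the new system"
)

print(new_coi.identifier, "derived from", new_coi.derived_from)
```

Because the copy is deep, later edits to the new element cannot silently alter the baseline model, which is one of the risks of careless reuse noted above.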

Reuse Between Systems with Similar Missions (E-X Platform and NGS System)

When the AFOTEC test structure developed for the E-X platform (Table A-2) was adapted to the NGS system (Table A-3), the folder structure and model architecture again remained consistent. Given the similarity in mission, the two models share many analogous functions and related behaviors. As such, transferring the “AFOTEC Methodology” model to the “AFOTEC Methodology” folder of the new model necessitated only minor adjustments to subsequent elements in the design folder. The COIs remained unchanged apart from minor terminology adjustments to reflect the NGS system’s components. Similarly, adapting the MOEs, MOPs, and MOSs involved renaming and updating elements while largely reusing model architectures and elements. Since the E-X platform and NGS system models are very similar, no additional MOEs, MOPs, or MOSs were added to the NGS system model and applied to the test plan.
As before, the duplication and modification processes were used to adapt the “AFOTEC Methodology” model from the E-X platform model to the requirements of the NGS system model. This reuse process for creating new structures within a model was straightforward to implement. As with the previous case, reusing aspects of the “AFOTEC Methodology” model ensures consistency among individual elements across models, enhances clarity, and minimizes ambiguity. As stated previously, careful consideration is necessary when reusing models: elements that are not thoughtfully adapted to a new SUT can introduce significant risks.

Discussion

Implications of Reuse Applied to the Case Studies

Model reuse is often cited as a primary mechanism by which MBSE processes (including MDTD) can realize efficiencies and improvements [2] over traditional document-centric processes. Still, research literature has rarely measured the degree of reuse and its value [29]. A few studies have defined reuse in SysML models in toy problems [30] or product families [3], but none measure reuse outside of the architectural and design stages (as this study does in the testing stage) of the MBSE lifecycle.

When comparing the types of reuse present between the baseline CSARL and first test case E-X platform models, we found that reuse predominantly fell into the Modified Subset Reuse category, specifically within the Major Modifications sub-category, as the process involved extensively modifying and duplicating elements to meet the E-X platform model’s requirements. Conversely, when examining the E-X platform and NGS system models, reuse predominantly aligned with the Modified Subset Reuse category but within the Minor Modifications sub-category, where adjustments were limited to name changes required to integrate seamlessly with the NGS system model.

Table 3 shows the percentage of element reuse between the baseline CSARL and first test case E-X models, whose applications differ significantly. The average percentage of element reuse was 25.5%. Table 3 also displays the values for the RVA metric, which average RVA = 1.72, indicating the E-X platform test case model is 72% more efficient than if the product were created without reuse [29].

Table 3. Reuse of Model Elements Between the Baseline CSARL and the E-X Platform Case Study Models

Similarly, Table 4 presents the percentage of element reuse between the E-X platform and the NGS system models. The table also shows the RVA metric between these two models and between the second test case, the NGS system model, and the baseline CSARL model, which is one generation older. The average element reuse percentage between the E-X platform and the NGS system models is 100%, demonstrating that the E-X test case model is completely reused. The average RVA metric between the E-X platform and the NGS system models is 2.0, indicating the model is twice as efficient as if it were created without reuse [29]. Furthermore, the RVA metric between the NGS system model and the baseline CSARL model averages 2.08, indicating the NGS system model is over twice as efficient as if it were created without reuse [29], despite the generational difference.
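The efficiency statements above follow from reading RVA as a ratio against a no-reuse value of 1, which is how the text interprets the metric. A minimal sketch of that arithmetic is shown here; the function name `efficiency_gain_percent` is a hypothetical helper, not part of the cited metric definition, and the inputs are the average RVA values reported in the text.

```python
def efficiency_gain_percent(rva: float) -> float:
    """Percent gain over a no-reuse baseline, assuming RVA = 1.0 means no reuse benefit."""
    return round((rva - 1.0) * 100.0, 1)


# Average RVA values reported in the text for the three model comparisons.
reported_rva = {
    "CSARL to E-X": 1.72,
    "E-X to NGS": 2.0,
    "CSARL to NGS": 2.08,
}

for comparison, rva in reported_rva.items():
    print(comparison, efficiency_gain_percent(rva))
```

Under this reading, RVA = 1.72 corresponds to a 72% gain, RVA = 2.0 to a 100% gain (twice as efficient), and RVA = 2.08 to a 108% gain.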

Table 4. Reuse of Model Elements Between the E-X Platform and the NGS System Case Study Models

Since the E-X platform and NGS system models are similar, and the model changes from the E-X platform to the NGS system are primarily limited to name changes, it is no surprise that the reusability of elements from the E-X model to the NGS model is 100%. However, the reuse percentage formula does not distinguish between identical subset and modified subset reuse, nor does it account for the two subcategories within modified subset reuse. This distinction is crucial for understanding the nuances of reuse. So, although these results indicate a high degree of reuse, the reuse metrics may not fully characterize the ease or overall value of reuse. By distinguishing between the Identical and Modified Subset Reuse categories, and further discriminating between minor changes and major modifications within the Modified Subset Reuse category, the detailed results can give a more complete accounting of the degree of reuse available under MDTD constructs.
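The gap between an undifferentiated reuse percentage and a per-category accounting can be illustrated with a simple tally. The sketch below is an assumption-laden example, not the article's metric: the `Reuse` labels mirror the categories named in the text, but the helper functions, the definition of the overall percentage as reused elements divided by total elements, and the five-element sample are all hypothetical.

```python
from collections import Counter
from enum import Enum


class Reuse(Enum):
    IDENTICAL = "identical subset"
    MODIFIED_MINOR = "modified subset, minor modifications"
    MODIFIED_MAJOR = "modified subset, major modifications"
    NEW = "not reused"


def overall_reuse_percent(labels):
    """Undifferentiated reuse: every reused element counts the same (assumed definition)."""
    reused = sum(1 for label in labels if label is not Reuse.NEW)
    return 100.0 * reused / len(labels)


def reuse_breakdown(labels):
    """Share of elements in each category, as a percentage of all elements."""
    counts = Counter(labels)
    return {category: 100.0 * counts[category] / len(labels) for category in Reuse}


# Hypothetical sample: five elements, each reused with name changes only.
labels = [Reuse.MODIFIED_MINOR] * 5

print(overall_reuse_percent(labels))
for category, share in reuse_breakdown(labels).items():
    print(category.value, share)
```

In this sample the overall figure is 100%, yet the breakdown shows that none of the elements were reused identically, which is the kind of nuance a single percentage hides.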

Implications for the Model-Driven Test Design

These test models demonstrate that the MDTD development process is effective for the case studies’ SUTs and can be applied to simple and complex MBSE artifacts. In the authors’ assessment, these case study results show that even for very different projects with low apparent similarity, adapting the AFOTEC MDTD Methodology proved more feasible than expected, highlighting the utility of incremental model-driven systems engineering and MDTD.

These findings also have implications for developing strategies for transitioning to MBSE and Product Line Engineering constructs. This research indicates that even simple MBSE projects can lead to an improved understanding of the processes and to a well-documented library of models ready for reuse. Even in very different modeling and application domains, this study measured a 13% reuse rate between the testing models, indicating a clear pathway to effective model-driven test development. However, a significant challenge in realizing the potential of MBSE is the need for expertise for effective implementation. This study leveraged experienced MBSE practitioners to realize well-designed, reusable models with durable value in revision and reuse.

This approach provides a foundation for effective model-driven test development, demonstrating the applicability of the AFOTEC MDTD methodology and the potential for successful model reuse. The two case studies and the baseline case presented in the paper indicate that by using the MBSE-enabled MDTD process to build test models, we can achieve a reuse rate of model components between 25.5% and 100%. Using the RVA metric to measure reuse efficiency, we find that models created with reuse can be twice as efficient as those developed without reuse. Thus, the reuse of even simple MBSE models plays a crucial role in enhancing the total effectiveness of program development, with evidence from this study showing benefits in MDTD example cases.

Discussion Summary

In the context of MDTD, the reduction in effort indicated by the reuse metrics results from the ability to derive and reuse test models and plans from a model-driven test development process. Reusing these test models and plans significantly reduces the effort required to develop test models for future SUTs.

Conclusion

These two test cases and the baseline have demonstrated that using the “AFOTEC Methodology” model to generate the elements needed for creating test plans can be applied to a range of applications and complex SUTs. The extent to which modeling components can be reused depends on the differences between the models. Still, essential elements can be duplicated and modified to meet the new model’s requirements. Metrics of model reuse can be challenging to define and may not be reliable indicators of the ease or value of reuse. Therefore, greater granularity must be incorporated into the calculation to achieve a metric that effectively spans a broader range of values. Implementing the AFOTEC test methodology using a model-based approach ensures that test planning and documentation can take advantage of MBSE artifacts accompanying future flight test systems. This approach enables the reuse of test plan elements, with corresponding benefits in configuration management, consistency, and standardization across the DoD testing enterprise.

Acknowledgements

The authors thank Dr. J. Shelley, a co-worker and adjunct faculty member from the California State University, Long Beach Antelope Valley Engineering Program, for helping refine and consolidate the information in this document.

References

[1] J. L. Alvarado Jr and T. H. Bradley, “Developing Model-Based Flight Test Scenarios,” The ITEA Journal of Test and Evaluation, vol. 44, no. 4, 2023.

[2] F. Wilking, D. Horber, S. Goetz and S. Wartzack, “Utilization of system models in model-based systems engineering: definition, classes and research directions based on a systematic literature review,” January 2024. [Online]. Available: https://www.cambridge.org/core/journals/design-science/article/utilization-of-system-models-in-modelbased-systems-engineering-definition-classes-and-research-directions-based-on-a-systematic-literature-review/04066CE1D254458E17969AC5F2D80347. [Accessed 18 August 2024].

[3] H. Mause and J. Hummell, “Model‐based Product Line Engineering–Enabling Product Families with Variants,” IEEE Aerospace Conference, vol. 22, no. 2, pp. 1-8, 2015.

[4] G. Van Peteghem, C. J. Liebmann, S. S. Mailen, J. D. Martin, and K. L. Peck, “Test Plan Author’s Guide,” Engineering Directorate, 412th Test Wing, Edwards AFB, 2021.

[5] AFOTEC A-2/9, “Air Force Operational Test and Evaluation Center Test Design Guide,” Albuquerque: AFOTEC, 2018.

[6] Air Education and Training Command, “Air Force Instruction 99-103 – Capabilities-Based Test and Evaluation,” 2020.

[7] J. Gansler, W. Lucyshyn and A. Spiers, “Using Spiral Development to Reduce Acquisition Cycle Times,” p. 76, 2008.

[8] L. Freeman, P. Beling, K. Esser, P. Wach, G. Kerr, A. Salado, J. Werner, and S. Hobson, “Positioning Test and Evaluation for the Digital Paradigm,” The ITEA Journal of Test and Evaluation, vol. 44, no. 2, June 2023.

[9] R. Dunning, W. Matteson, R. Wise, and J. Sharpe, “Using a Model-Based Approach for Test and Evaluation,” in 2020 NDIA Ground Vehicle Systems Engineering and Technology Symposium, Novi, Michigan, 2020.

[10] D. Verma, A. M. Madni, S. Hoffenson and L. Xiao, The Proceedings of the 2023 Conference on Systems Engineering Research: Systems Engineering Towards a Smart and Sustainable World, Springer, 2024.

[11] K. Henderson, T. McDermott, E. Van Aken, and A. Salado, “Towards Developing Metrics to Evaluate Digital Engineering,” Systems Engineering, vol. 26, no. 1, pp. 3-31, 2023.

[12] S. Friedenthal, A. Moore, and R. Steiner, “A practical guide to SysML: the systems modeling language,” Morgan Kaufmann, 2014.

[13] J. Gregory, L. Berthoud, T. Tryfonas, A. Rossignol, and L. Faure, “The long and winding road: MBSE adoption for functional avionics of spacecraft,” Journal of Systems and Software, vol. 160, p. 110453, 2020.

[14] Y. Wang and M. Zheng, “Test case generation from UML models,” in 45th Annual Midwest Instruction and Computing Symposium, Cedar Falls, Iowa, 2012, vol. 4.

[15] M. Mussa, S. Ouchani, W. Al Sammane, and A. Hamou-Lhadj, “A survey of model-driven testing techniques,” in 2009 Ninth International Conference on Quality Software, 2009: IEEE, pp. 167-172.

[16] S. Beydeda, M. Book, and V. Gruhn, Model-driven software development. Springer, 2005.

[17] AFOTEC A-2/9, “Air Force Operational Test and Evaluation Center Measures Guide,” AFOTEC HQ, Albuquerque, March 2021.

[18] “AcqNotes Program Management Tool for Aerospace,” AcqNotes LLC, 19 12 2021. [Online]. Available: https://acqnotes.com/acqnote/acquisitions/emd-phase. [Accessed 28 5 2022].

[19] “Air Force Digital Transformation,” [Online]. Available: https://usaf.dps.mil/teams/afmcde/SitePages/Functional-Speciality—Test-and-Evaluation.aspx. [Accessed 28 5 2022].

[20] J. M. Borky and T. H. Bradley, Effective Model-Based Systems Engineering. Springer International Publishing, 2018.

[21] Q. Wu, D. Gouyon, and E. Levrat, “Maturity assessment of systems engineering reusable assets to facilitate MBSE adoption,” IFAC-PapersOnLine, vol. 54, no. 1, pp. 851-856, 2021.

[22] M. Shinozaki, F. Mhenni, J.-Y. Choley, and A. Ming, “Reuse of SysML model to support innovation in mechatronic systems design,” in 2017 Annual IEEE International Systems Conference (SysCon), 2017: IEEE, pp. 1-6.

[23] Q. Wu, D. Gouyon, É. Levrat, and S. Boudau, “Use of patterns for know-how reuse in a model-based systems engineering framework,” IEEE Systems Journal, vol. 14, no. 4, pp. 4765-4776, 2020.

[24] F. H. Ferreira, E. Y. Nakagawa, and R. P. dos Santos, “Reliability in Software-intensive Systems: Challenges, Solutions, and Future Perspectives,” in 47th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 2021: IEEE, pp. 54-61.

[25] W. Frakes and C. Terry, “Software reuse: metrics and models,” ACM Computing Surveys (CSUR), vol. 28, no. 2, pp. 415-435, 1996.

[26] J. S. Poulin, J. M. Caruso, and D. R. Hancock, “The business case for software reuse,” IBM Systems Journal, vol. 32, no. 4, pp. 567-594, 1993.

[27] Air Combat Command A5I, “Architecture Data Standards for Next Generation Sensor (NGX),” Department of the Air Force, Washington D.C., 30 Nov 2022.

[28] M. E. Khan and F. Khan, “A comparative study of white box, black box and grey box testing techniques,” International Journal of Advanced Computer Science and Applications, vol. 3, no. 6, 2012.

[29] A. Albers, N. Bursac and E. Wintergerst, “Product generation development – importance and challenges from a design research perspective,” Proceedings of the International Conference on Mechanical Engineering, pp. 16-21, 2015.

[30] M. Atif and H. Stephan, “A Modelling Method for Describing and Facilitating the Reuse of SysML Models during Design Process,” Proceedings of the Design Society, vol. 2, pp. 1925-1934, 2022.

[31] “AcqNotes Program Management Tool for Aerospace,” AcqNotes LLC, 7 6 2021. [Online]. [Accessed 28 5 2022].

[32] A. Akundi and W. Ankobiah, “Mapping industry workforce needs to academic curricula – A workforce development effort in model-based systems engineering,” Systems Engineering, 2024.

Appendix

Table A-1. Name of Elements Contained Within the CSARL Model Structure Subfolders.

Table A-2. Name of Elements Contained Within the E-X Model Structure Subfolders.

Table A-3. Name of Elements Contained Within the NGS Model Structure Subfolders.

Biographies

Jose Alvarado is a senior test engineer and system analyst for AFOTEC at Edwards AFB, California, with over 33 years of developmental and operational test and evaluation experience. He is a Ph.D. candidate in the Systems Engineering doctorate program at Colorado State University. He is interested in applying MBSE concepts to the flight test engineering domain and implementing test process improvements through MBT. Jose holds a B.S. in Electrical Engineering from California State University, Fresno (1991) and an M.S. in Electrical Engineering from California State University, Northridge (2002). He serves as an adjunct faculty member for the electrical engineering department at the Antelope Valley Engineering Program (AVEP) overseen by California State University, Long Beach. He is a member of the International Test and Evaluation Association, Antelope Valley Chapter.

Thomas H. Bradley, Ph.D., serves as the Woodward Foundation Professor and Department Head for the Department of Systems Engineering at Colorado State University. He conducts research and teaches various courses in system engineering, multidisciplinary optimization, and design. Dr. Bradley’s research interests are focused on applications in Automotive and Aerospace System Design, Energy System Management, and Lifecycle Assessment. Bradley earned a BS and MS in Mechanical Engineering at the University of California – Davis (2000, 2003) and a PhD in Mechanical Engineering at Georgia Institute of Technology (2008). He is a member of INCOSE, SAE, ASME, IEEE, and AIAA.

ISSN: 1054-0229, ISSN-L: 1054-0229
Dewey Classification: L 681 12
