SEPTEMBER 2024 | Volume 45, Issue 3

Dodging Pitfalls in Packages for Artificial Intelligence and Machine Learning

Justin Krometis

Research Assistant Professor,
Virginia Tech National Security Institute;
Blacksburg, VA

William Snyder

Biomedical Engineering and Mechanics,
Virginia Tech;
Blacksburg, VA

DOI: 10.61278/itea.45.3.1004

Abstract

Recent years have seen an explosion in the application of artificial intelligence and machine learning (AI/ML) to practical problems from computer vision to game playing to algorithm design. This growth has been mirrored and, in many ways, been enabled by the development and maturity of publicly-available software packages that make model building, training, and testing easier than ever. While these packages provide tremendous power and flexibility to users, and greatly facilitate learning and deploying AI/ML techniques, they and the models they provide are extremely complicated and as a result can present a number of subtle but serious pitfalls. This paper presents three examples where obscure settings or bugs in these packages dramatically changed model behavior or performance – one from a deep learning regression application, one from reinforcement learning, and one from computer vision classification. These examples illustrate the importance of thinking carefully about the results that a model is producing and carefully checking each step in its development before trusting its output.

Keywords: Machine learning, artificial intelligence, test and evaluation, best practices, case studies, software packages, software bugs, deep learning, reinforcement learning, regression, classification

Introduction

Artificial intelligence and machine learning (AI/ML) have exploded into almost ubiquitous use in recent years. In contrast with models based on physics, these algorithms have enabled development of models driven by data, allowing widespread extension of modeling into areas like understanding text or images, where little theory is available. The results have been algorithms that can answer questions, complete sentences, generate images, and win games against human experts.

As a part of this growth, the development of popular, powerful, and free software packages, such as TensorFlow (Abadi 2015), PyTorch (Paszke 2019), scikit-learn (Pedregosa 2011), OpenAI Gym (Brockman 2016), and RLlib (Liang 2018), has enabled widespread development and deployment of AI/ML models. These libraries provide ready access to state-of-the-art algorithms with efficient implementations on modern hardware, and they have large user bases that develop documentation and tutorials. As a result, it has never been easier to train a deep neural network or to set up an environment for reinforcement learning.

At the same time, theory and understanding of what factors drive these models’ performance have been slower to progress. Data-driven models have no physical principles to guide predictions, and hyperparameter choices, such as the layer size or depth of a deep neural network, are often made by tuning or brute-force search. As a result, the model structures are opaque, a problem that can be exacerbated by the packages mentioned above, which remove the user even further from the implementation details. All of these factors together make it difficult to know how to check the results of models developed using these libraries.

This paper illustrates three AI/ML examples wherein serious problems resulted from subtle design choices obscured by popular software packages. Each example is in a different use case and uses a different software library, illustrating that such concerns are not limited to a single application area or package. Erroneous conclusions could have been drawn from each, had the problems not been identified and corrected via thoughtful and skeptical review of the results. Taken together, these examples illustrate the importance of carefully testing data-driven applications prior to deployment, even when they largely rely on popular and widely-used packages.

Background: AI/ML Methods and Terminology

In this section, we briefly describe the classes of AI/ML methods and some of their associated terminology. AI/ML methods typically fall into one of three classes: (1) supervised learning, wherein a model is fit to a training dataset to minimize an associated loss function measuring the distance between the target and predicted outputs; (2) unsupervised learning, where a model seeks to identify or extract structure from a dataset; and (3) reinforcement learning, where an agent or agents learn online to maximize a user-specified reward function. Supervised learning tasks are typically either regression, approximating real-valued outputs similar to classical linear regression, or classification, where the outputs are discrete, such as the types of objects in an image. Common examples of unsupervised learning and reinforcement learning are pattern recognition and game playing, respectively.

The underlying model for many of these methods is a neural network, a nonlinear model that loosely mimics how information passes through an animal brain. Networks are typically composed of layers: the inputs (input layer), the outputs (output layer), and a series of layers in between, called hidden layers. The number of layers in a network is its depth, and a network with more than one hidden layer is called deep. Each output value in a layer is associated with a node, which models a neuron in the brain; the number of nodes in a given layer is its layer size. At each edge connecting layers, outputs of the preceding layer are passed through a linear map followed by a (typically nonlinear) activation function. Each linear map is made up of weights and biases, which comprise the parameters that are fit to the data via an optimization process called training. Networks in which the linear maps are dense are called fully connected. The choices of depth, layer size, activation function, connectedness, and loss function are examples of hyperparameters that must be made by the practitioner. Special neural network structures have been designed for different tasks; examples include convolutional neural networks, which include layers of convolutions for feature extraction in computer vision, and autoencoders, which have hidden layers much smaller than the input or output layers as a means of compressing information. For more information on the types of machine learning algorithms and applications of neural networks, see, e.g., (Bishop 2007), (Goodfellow 2016), or (Sutton 2015).
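To make these terms concrete, the following is a minimal sketch of how such a network might be defined using a package like TensorFlow/Keras. It is not drawn from any of the examples below; the input/output sizes, layer sizes, and activation function are arbitrary choices for illustration.

```python
import tensorflow as tf

# A fully-connected network with two hidden layers of size 10 (depth 2,
# layer size 10), tanh activations, four inputs, and three outputs.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),             # input layer
    tf.keras.layers.Dense(10, activation="tanh"),  # hidden layer 1
    tf.keras.layers.Dense(10, activation="tanh"),  # hidden layer 2
    tf.keras.layers.Dense(3),                      # output layer (linear)
])

# Training fits the weights and biases of each Dense layer to data; the choice
# of loss function and optimizer is a further hyperparameter decision.
model.compile(optimizer="adam", loss="mse")
```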

Example 1: Regression

The goal of the first example project was to use machine learning as a surrogate for a physics-based simulation of soft tissue deformation – i.e., to develop a model that would provide a sufficiently accurate approximation to the physics-based model, but at much lower computational cost. The objective, then, was one of regression: to take a series of continuous-valued inputs and outputs from the original, full order simulation, and fit a model to map one to the other.

For this task, we chose a fully-connected deep neural network and tuned the model’s hyperparameters, such as the depth (number of hidden layers), layer size, activation functions, and loss/cost function, to match the data. For the implementation, we chose TensorFlow (Abadi 2015, Abadi 2016), a free and powerful library for AI/ML originally developed by Google that makes it very easy to build and train deep neural networks.
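Our project code and data are not reproduced here, but the sketch below, with synthetic data standing in for the full-order simulation inputs and outputs, illustrates the kind of brute-force sweep used to tune hyperparameters; the depth, activation, epoch count, and layer sizes (the latter corresponding to the "LS" values in Figure 1) are placeholders.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for the full-order simulation data.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=(1000, 4)).astype("float32")
y_train = np.sin(x_train).sum(axis=1, keepdims=True).astype("float32")

def build_surrogate(depth, layer_size, activation="relu"):
    """Illustrative builder for a fully-connected surrogate model."""
    layers = [tf.keras.layers.Input(shape=(x_train.shape[1],))]
    layers += [tf.keras.layers.Dense(layer_size, activation=activation)
               for _ in range(depth)]
    layers += [tf.keras.layers.Dense(y_train.shape[1])]
    model = tf.keras.Sequential(layers)
    model.compile(optimizer="adam", loss="mse")
    return model

# Brute-force sweep over layer size at fixed depth.
for layer_size in (10, 50, 100):
    model = build_surrogate(depth=2, layer_size=layer_size)
    history = model.fit(x_train, y_train, epochs=20, verbose=0)
    print(layer_size, history.history["loss"][-1])
```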

A key research question for this effort was how purely data-driven models derived solely from the inputs and outputs (“black box”) compare to theory-based reduced order modeling (ROM) techniques that incorporate knowledge of the internal physics of the problem (“white box”). It was therefore essential to measure both the accuracy and the computational cost of the machine learning models.

The results of our initial set of comparisons are shown in Figure 1. Here “G-ROM” represents white box ROM models and the “LS” points represent results from our deep learning model with various choices of hidden layer size. The key takeaway from these results is that the ROM models handily outperformed the deep learning models in both accuracy (y axis) and speed (x axis).

Figure 1: Initial comparison between data-driven (labeled by “LS” for layer size) and physics-based reduced order models (labeled by “G-ROM”).

However, some of the deep learning models that we tried were quite small – just two hidden layers of size 10, for example. The cost of a prediction in a neural network is dominated by the cost of computing the linear maps, which in this case involved multiplying matrices with only 10 rows or columns. Why would such a model not at least be comparable in computational cost to the ROM method, which involved multiplication of similar-sized matrices? The results seemed difficult to justify in terms of the floating-point operations (FLOPs, essentially the number of additions and multiplications) involved.
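For reference, the arithmetic behind this sanity check is easy to write down: a linear map from a layer of size n to a layer of size m costs roughly 2nm multiply-add operations plus m additions for the bias. The layer sizes below are placeholders (our problem's input and output dimensions are not reproduced here); the point is the order of magnitude.

```python
def dense_forward_flops(layer_sizes):
    """Rough FLOP count for one forward pass through a fully-connected network.

    layer_sizes lists the number of nodes in each layer, input through output.
    A linear map from a layer of size n to a layer of size m costs about
    2*n*m FLOPs (one multiply and one add per weight) plus m for the bias;
    activation functions and other overhead are ignored.
    """
    return sum(2 * n * m + m for n, m in zip(layer_sizes[:-1], layer_sizes[1:]))

# Example: 4 inputs, two hidden layers of size 10, 3 outputs.
# A few hundred FLOPs per prediction suggests that the observed cost is
# dominated by overhead rather than arithmetic.
print(dense_forward_flops([4, 10, 10, 3]))   # 363
```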

The answer turned out to be in a somewhat – at least to us – obscure setting in TensorFlow, which has two execution modes. The first is eager execution, in which each step is evaluated right away, is easier to develop with and to debug, and has served as the default mode since the introduction of TensorFlow 2.0 in 2019. The second is graph execution, which builds a computational graph before execution; this mode served as the default in TensorFlow 1.x and, of great importance to our project, is much faster. A more in-depth description of the two modes is provided in (Yalçın 2021).

The code changes required to switch modes were trivial – rather than calling model.predict() for each prediction (eager mode), we simply created a prediction function with a call like model_predict = tf.function(model), setting up graph mode, and then called model_predict() thereafter. The predictions then completed more than an order of magnitude faster than they had in eager mode. The new results are shown in Figure 2 and yield different conclusions. In particular, while the deep learning models did not match the accuracy of the ROM models – not surprising given the ROMs’ incorporation of physics – they at least maintained similar computational cost, and the smallest models were roughly comparable to the ROMs by both metrics. Given the ease of implementing deep learning models, this might be a worthwhile trade-off depending on the application.
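In code, the switch looked roughly like the following; the model defined here is a placeholder, and only the two prediction calls matter.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(3),
])
x = np.random.rand(1, 4).astype("float32")

# Eager-style prediction: convenient, but slower when called many times.
y_eager = model.predict(x, verbose=0)

# Graph-style prediction: wrap the model in tf.function once, then call it.
model_predict = tf.function(model)
y_graph = model_predict(x)   # the first call traces the graph; later calls are fast
```

The speedup we observed is specific to our models and calling pattern; the general point is that the execution mode, not the network size, was driving the cost.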

Figure 2: Results of ROM vs. deep learning comparison after switching the latter to graph-based execution. Note that “ND” represents the number of hidden layers, an additional test that was run to try to improve model accuracy.

The issues described in this example may well be familiar to practitioners well-versed in the intricacies of TensorFlow. Nevertheless, the example underscores the importance of checking a data-driven model’s results against what is known (FLOPs, in this case) to make sure they make sense before trusting them or drawing conclusions from them.

Example 2: Reinforcement Learning

The second example problem is in the area of reinforcement learning (RL), wherein an agent learns a policy to maximize a reward function. RL is a key tool in AI and has shown enormous promise in a number of areas, most famously in game-playing – e.g., Go as in (Silver 2016) or Gran Turismo as in (Wurman 2022) – but also autonomous vehicles (see, e.g., Qiao 2020), quantum computing (Sivak 2022), and even in discovery of new computational algorithms (Fawzi 2022).

The example in this case came from the field of multi-agent RL (MARL), wherein multiple agents collaborate to achieve a task, and involved a team of agents tracking a target; an example would be drones tracking a car or a boat. For the implementation, we chose RLlib (Liang 2018), a popular library for distributed RL that supports MARL. The project also involved simulation in Unreal Engine, a gaming engine, as shown in Figure 3, and the results of the effort were ultimately documented in (Peterson 2023).

Figure 3: Drones and target from Example 2 as simulated in Unreal Engine, taken from (Peterson 2023).

Implementing the problem involved defining the state space (the set of allowable configurations of the agent and the target), the observations (the information that a given agent “sees” and can use for decision-making), the action space (the choices an agent can make), the model (how an action affects the state), the neural network used to model the policy (the function mapping the observation to the action), and the reward function (the outcome to be optimized). In our case, we started by implementing these components and selecting a basic reward function to encourage agents to move close to the target. The reward increased steadily throughout training, indicating that the agents were learning a good policy. However, when we went to play back episodes – to visualize individual cases – we found that while some of the agents converged to the target as expected, others moved in a random direction seemingly uncorrelated with the objective. This behavior was observed across a wide array of randomly-generated cases.
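To make these ingredients concrete, the following is a heavily simplified, single-agent sketch of a tracking environment written against the Gymnasium API (a maintained successor to OpenAI Gym); the actual project used RLlib's multi-agent interfaces and an Unreal Engine simulation, and the dynamics and reward below are illustrative only.

```python
import numpy as np
import gymnasium as gym

class TrackingEnv(gym.Env):
    """Toy tracking environment: an agent moves in the plane toward a target."""

    def __init__(self):
        # Observation: the vector from the agent to the target.
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(2,), dtype=np.float32)
        # Action: a 2-D step with each component constrained to [-1, 1].
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.agent = self.np_random.uniform(-10, 10, size=2)
        self.target = self.np_random.uniform(-10, 10, size=2)
        return (self.target - self.agent).astype(np.float32), {}

    def step(self, action):
        # Model: the action moves the agent; the target is stationary here.
        self.agent = self.agent + np.clip(action, -1.0, 1.0)
        obs = (self.target - self.agent).astype(np.float32)
        # Reward: negative distance to the target, so moving closer is rewarded.
        reward = -float(np.linalg.norm(obs))
        return obs, reward, False, False, {}
```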

This led to several questions about what might be wrong:

  • Was our reward function poorly defined? Did it not capture the information that we wanted it to capture?
  • Was there a mistake somewhere in our implementation of the action, model, and observations connecting the agent decisions and feedback?
  • Was there simply a mistake in our visualization – i.e., did the replay not accurately represent what the agents were doing?

After much digging into the issue, it turned out to be none of the above. In particular, we discovered that while we had defined the action space to be a two-dimensional vector with values between -1 and 1 – corresponding to movements in the two-dimensional plane – and this space was respected during training, certain agents were violating it during replays. That is, we found cases where, in a system with four agents, two of the agents chose actions of, for example, (x,y)=(7.4, 60.3) and (37.6, 44.0), respectively – clearly outside the bounds of what was supposed to be allowed.
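A simple defensive check during replay can surface this kind of problem immediately: validate each action against the declared action space before applying it. A minimal sketch, using the two-dimensional [-1, 1] action space described above:

```python
import numpy as np
import gymnasium as gym

# The declared action space from the environment definition.
action_space = gym.spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

def check_action(action, agent_id):
    """Raise an error if a replayed action falls outside the declared space."""
    action = np.asarray(action, dtype=np.float32)
    if not action_space.contains(action):
        raise ValueError(f"Agent {agent_id} produced out-of-bounds action {action}")

check_action((0.2, -0.7), agent_id=1)        # within bounds: passes silently
try:
    check_action((7.4, 60.3), agent_id=2)    # the kind of value we saw in replays
except ValueError as err:
    print(err)
```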

In the end, this turned out to be due to a software bug in RLlib (https://github.com/ray-project/ray/issues/2965) wherein the action space was not properly respected, and one that we could work around relatively easily by using a slightly different function call. But it highlighted the need for careful evaluation of model results, even when building upon powerful and widely-used software packages.

Example 3: Classification

Our third example is one of machine learning classification: given a series of images, we seek to classify each into one of two categories. This is a common machine learning application, the canonical example of which is the Modified National Institute of Standards and Technology (MNIST) dataset of handwritten digit images (Deng 2012). To accomplish the classification task, we chose a deep neural network made up of the following layer types (a minimal sketch follows the list):

  • Convolutional: To extract image features
  • Batch normalization: To renormalize statistics, a common method for stabilizing and speeding up training (Ioffe 2015)
  • Max pooling: To condense dimension while retaining important features
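As a rough illustration of this kind of architecture (the channel counts, kernel sizes, and assumed 64x64 grayscale input are placeholders rather than the model we actually used), a PyTorch version might look like:

```python
import torch
import torch.nn as nn

class SmallClassifier(nn.Module):
    """Illustrative two-class image classifier using the layer types listed above."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # convolutional: extract features
            nn.BatchNorm2d(16),                           # batch normalization
            nn.ReLU(),
            nn.MaxPool2d(2),                              # max pooling: condense dimension
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * 16 * 16, 2)            # two output classes

    def forward(self, x):
        x = self.features(x)
        return self.head(torch.flatten(x, start_dim=1))

model = SmallClassifier()
logits = model(torch.randn(4, 1, 64, 64))   # a batch of four random 64x64 images
print(logits.shape)                          # torch.Size([4, 2])
```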

The model was implemented in PyTorch (Paszke 2019), one of the most popular machine learning frameworks in the world. The losses on the training and validation datasets for the initial model are shown in the left-hand plot of Figure 4. The loss as computed on the training set converges nicely; however, the loss on the validation set is worryingly stagnant, indicating that essentially all of the progress on the training dataset was in some sense overfitting. Moreover, subsequent inference on each of the two datasets, shown in the right-hand plot of Figure 4, was largely uncorrelated with the labels – the error rate on both the training and validation sets was roughly 50 percent, indicating that the model had learned almost nothing, even on the training data.

Figure 4: Results for the initial classifier. Left: Loss on the training dataset converges during training, but loss on the validation dataset remains unmoved. Right: Inference results on both datasets show little correlation with the labels.

This led to a series of questions: Was the problem too difficult, or the chosen model too simple or otherwise ill-suited to the task? Should we devote effort to pre-processing the data, such as via a latent space representation, to extract its essential features?

The resolution to the problem, as in the other examples, turned out to be more of a subtle issue of libraries than of implementation. PyTorch has two modes:

  • Training Mode, invoked via model.train(), is to be used during training; layers such as batch normalization and dropout use their training-time behavior in this mode.
  • Evaluation Mode, invoked via model.eval(), is to be used for inference; those same layers switch to their inference-time behavior.

We used Training Mode during training and Evaluation Mode during inference, as specified in the documentation. However, batch normalization is one of a small number of layer types in PyTorch (dropout is another) that behave differently in training and evaluation modes. The layer (https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html) has a flag (track_running_stats) that specifies whether the layer should retain running statistics from training and reuse them in evaluation (True) or recompute statistics from the current batch in both modes (False); by default, the flag is set to the former. The result is a divergence between behavior in training and evaluation modes. Moreover, with a given image, inference in evaluation mode is inconsistent and converges to the result from training mode as the number of evaluations grows large; this effect is shown in Figure 5.
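The divergence is easy to demonstrate in isolation. The sketch below passes one random batch through a standalone batch normalization layer; the numbers are meaningless, but it shows that with the default flag the two modes disagree, while with track_running_stats=False they coincide.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 3, 16, 16)   # one batch of random "images"

# Default (track_running_stats=True): evaluation mode normalizes with the
# running statistics accumulated during training, so the two modes can disagree.
bn_default = nn.BatchNorm2d(3)
out_train = bn_default.train()(x)
out_eval = bn_default.eval()(x)
print(torch.allclose(out_train, out_eval))   # False: the modes disagree

# track_running_stats=False: the current batch's statistics are used in both
# modes, so training and evaluation behave consistently.
bn_batch = nn.BatchNorm2d(3, track_running_stats=False)
print(torch.allclose(bn_batch.train()(x), bn_batch.eval()(x)))   # True
```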

Figure 5: Results of repeated inference in training and evaluation modes. Using default flags for batch normalization, repeating inference in evaluation mode converges to the result from training mode.

Upon changing from the default behavior (setting the flag track_running_stats=False), the results immediately improved, as shown in Figure 6. The left plot shows that the validation loss converges just as well as the training loss. The right plot shows that subsequent inference (in Evaluation Mode) shows strong correlation with the labels. In the end, adjusting a subtle flag increased the model’s accuracy from roughly 50 percent – no better than random guessing between the two categories – to nearly perfect on both datasets.

Figure 6: Classification results with track_running_stats=False. Left: Losses on the training and validation datasets both converge during training. Right: Inference results on both datasets show strong correlation with the labels.

Example 4: Regression, Revisited

Lastly, it is perhaps worth briefly returning to our first example: We started, from the results in Figure 1, believing that data-driven models performed worse than white box reduced order models in both accuracy and computational complexity. Then we learned that the latter of these discrepancies was largely a byproduct of a nuance in the machine learning package leveraged to build the model (see Figure 2). Is it then not worth asking whether the former – the difference in accuracy – might also be overcome?

The answer is that we cannot rule it out. We did try many different combinations of neural network hyperparameters and could not find one that achieved better accuracy than the ones we reported; ultimately, after extensive experimentation, these results were published in (Snyder 2023). Further applications and experiments are forthcoming in (Snyder 2024). However, there is not, as yet, much theory to guide the expected convergence of deep learning models to the target data, so there is not a result that we can use to justify the gap in accuracy between the model types that we see in Figure 2. One promising line of inquiry might be methods that combine physics with deep learning as described in, e.g., (Karpatne 2017) or (Karniadakis 2021). As a result, there is nothing to say that we will not wake up tomorrow and try a new machine learning model that does just as well as the white box reduced order models for this problem. Such are the current challenges of data-driven modeling.

Discussion

This paper describes three cases in which AI/ML models produced problematic answers because of software bugs or subtle settings, each of which could have been missed without careful checking of the results. Each example was in a different field of AI/ML and used a different, highly-popular software package, and the performance degradation for each application was widespread and not limited to rare edge cases. The first example, a deep learning regression problem, found that a subtle setting in TensorFlow dramatically affected the computational efficiency of the deep learning model, which could have affected the results of the numerical comparison between this model and a physics-based surrogate model that was the goal of the study. In the second, we developed a multi-agent reinforcement learning model, but found that replays showed that agents did not achieve the desired behavior after training; the reason turned out to be a bug in the RLlib package in which we developed the model. In the third, a deep learning classification problem, model evaluations after training did not appear to show any improvement over random guessing, which turned out to be the result of a subtle setting in the PyTorch package in which the model was built. These software packages provide enormous value by putting computationally-performant implementations of cutting-edge algorithms in the hands of practitioners. However, these examples illustrate the need for practitioners to carefully test their AI/ML applications prior to deploying them in an operational setting.

Conclusion

This is an age in which we are routinely introduced to new advances in AI/ML, from Alexa and Siri to AlphaGo to ChatGPT. Modern libraries make it easier than ever for practitioners to get started in developing their own models for their own use cases. These and other AI/ML tools are powerful but opaque – there is not much theory to guide us in understanding them, and what might be wrong with them (if anything) is not always clear. Moreover, there can be pressure to simply believe them, from deadlines to deliverables to publications. This paper presents a few examples where simple but subtle issues caused serious problems with the application of data-driven models across a range of use cases – issues that could have been missed without thoughtful review of the model’s behavior. All of this underscores the need for careful testing and evaluation of data-driven models, especially until better theory develops to guide their design and characterize their limitations.

Acknowledgements

I would like to thank my collaborators on the various projects described here, especially Will Snyder, Raffaella De Vita, Traian Iliescu, David Peterson, Dan Sobien, and Justin Kauffman, for their work. I would also like to thank the reviewers and editors who provided helpful feedback in the refinement of this manuscript.

References

Abadi, Martín, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, et al. “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.” Google Research, 2015. https://www.tensorflow.org/.

Abadi, Martín, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. “TensorFlow: A System for Large-Scale Machine Learning.” Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, 2016.

Bishop, Christopher M. Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York, 2007.

Brockman, Greg, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. “OpenAI Gym.” arXiv, June 5, 2016. https://doi.org/10.48550/arXiv.1606.01540.

Deng, Li. “The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web].” IEEE Signal Processing Magazine 29, no. 6 (November 2012): 141–42. https://doi.org/10.1109/MSP.2012.2211477.

Fawzi, Alhussein, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, et al. “Discovering Faster Matrix Multiplication Algorithms with Reinforcement Learning.” Nature 610, no. 7930 (October 2022): 47–53. https://doi.org/10.1038/s41586-022-05172-4.

Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. https://www.deeplearningbook.org/.

Ioffe, Sergey, and Christian Szegedy. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.” arXiv, March 2, 2015. https://doi.org/10.48550/arXiv.1502.03167.

Karniadakis, George Em, Ioannis G. Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang. “Physics-Informed Machine Learning.” Nature Reviews Physics 3, no. 6 (June 2021): 422–40. https://doi.org/10.1038/s42254-021-00314-5.

Karpatne, Anuj, Gowtham Atluri, James H Faghmous, Michael Steinbach, Arindam Banerjee, Auroop Ganguly, Shashi Shekhar, Nagiza Samatova, and Vipin Kumar. “Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data.” IEEE Transactions on Knowledge and Data Engineering 29, no. 10 (2017): 2318–31.

Liang, Eric, Richard Liaw, Robert Nishihara, Philipp Moritz, Roy Fox, Ken Goldberg, Joseph Gonzalez, Michael Jordan, and Ion Stoica. “RLlib: Abstractions for Distributed Reinforcement Learning.” In Proceedings of the 35th International Conference on Machine Learning, 3053–62. PMLR, 2018. https://proceedings.mlr.press/v80/liang18b.html.

Paszke, Adam, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, et al. “PyTorch: An Imperative Style, High-Performance Deep Learning Library.” arXiv, December 3, 2019. https://doi.org/10.48550/arXiv.1912.01703.

Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, et al. “Scikit-Learn: Machine Learning in Python.” Journal of Machine Learning Research 12 (2011): 2825–30.

Peterson, David, Beyonce Andrades, Kevin Lizarazu-Ampuero, Jai Deshmukh, Thomas Stapor, Will Destaffan, Don Engel, Justin Krometis, and Justin A. Kauffman. “Integration of Reinforcement Learning and Unreal Engine for Enemy Containment via Autonomous Swarms.” In AIAA SCITECH 2023 Forum. American Institute of Aeronautics and Astronautics, 2023. https://doi.org/10.2514/6.2023-2674.

Qiao, Zhiqian, Zachariah Tyree, Priyantha Mudalige, Jeff Schneider, and John M. Dolan. “Hierarchical Reinforcement Learning Method for Autonomous Vehicle Behavior Planning.” In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 6084–89. Las Vegas, NV, USA: IEEE, 2020. https://doi.org/10.1109/IROS45743.2020.9341496.

Silver, David, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, et al. “Mastering the Game of Go with Deep Neural Networks and Tree Search.” Nature 529, no. 7587 (January 2016): 484–89. https://doi.org/10.1038/nature16961.

Sivak, V. V., A. Eickbusch, H. Liu, B. Royer, I. Tsioutsios, and M. H. Devoret. “Model-Free Quantum Control with Reinforcement Learning.” Physical Review X 12, no. 1 (March 28, 2022): 011059. https://doi.org/10.1103/PhysRevX.12.011059.

Snyder, William, Alex Santiago Anaya, Justin Krometis, Traian Iliescu, and Raffaella De Vita. “A Numerical Comparison of Simplified Galerkin and Machine Learning Reduced Order Models for Vaginal Deformations.” Computers & Mathematics with Applications 152 (December 15, 2023): 168–80. https://doi.org/10.1016/j.camwa.2023.10.018.

Snyder, William, Mostafa Zakeri, Justin Krometis, Romesh Batra, Traian Iliescu, and Raffaella De Vita. “Deep Learning Reduced Order Models of Vaginal Tear Propagation.” Submitted. 2024.

Sutton, Richard S, and Andrew G Barto. Reinforcement Learning: An Introduction. 2nd ed. MIT Press, 2015.

Wurman, Peter R., Samuel Barrett, Kenta Kawamoto, James MacGlashan, Kaushik Subramanian, Thomas J. Walsh, Roberto Capobianco, et al. “Outracing Champion Gran Turismo Drivers with Deep Reinforcement Learning.” Nature 602, no. 7896 (February 2022): 223–28. https://doi.org/10.1038/s41586-021-04357-7.

Yalçın, Orhan G. “Eager Execution vs. Graph Execution: Which Is Better?” Medium, February 2, 2021. https://towardsdatascience.com/eager-execution-vs-graph-execution-which-is-better-38162ea4dbf6.

Author Biographies

Justin Krometis is a Research Assistant Professor in the Intelligent Systems Division of the Virginia Tech National Security Institute. His research is in the development of theoretical and computational frameworks for Bayesian inference, particularly in high-dimensional regimes, and the application of those methods to domain sciences ranging from fluids to geophysics to testing and evaluation. His areas of interest include statistical inverse problems, parameter estimation, machine learning, data science, and experimental design. Dr. Krometis holds a Ph.D. in mathematics, an M.S. in mathematics, a B.S. in mathematics, and a B.S. in physics, all from Virginia Tech.

William Snyder is a recent graduate of the Biological Transport (BIOTRANS) interdisciplinary program within Virginia Tech’s Department of Biomedical Engineering and Mechanics. Their research is in the implementation of computational modeling techniques for the efficient simulation of soft tissue biomechanics, with applications in maternal health. Their areas of interest include nonlinear solid mechanics, finite element analysis, projection-based reduced order models, and machine learning. Dr. Snyder holds a Ph.D. in engineering mechanics and a B.S. in mechanical engineering, both from Virginia Tech.
