The Virtual Fields Method to Indirectly Train Artificial Neural Networks for Implicit Constitutive Modelling

Artificial Neural Networks (ANNs) have the potential to provide a different approach to constitutive modelling, with the main advantage that they do not require postulating a mathematical formulation or identifying empirical parameters. Currently, the training of an ANN for implicit constitutive modelling mostly relies on paired data, usually stress-strain. However, stress cannot be directly measured in a real experiment. As such, the training should be carried out indirectly using measurable variables from the experimental setting, such as displacements and the applied force. In the current work, displacement and global force data are used to indirectly train an ANN to predict the stress state of a material. An experimental test is recreated numerically in order to obtain displacement and global force data for different load distributions, i.e. synthetic data are obtained from a virtual experiment. The strains at the current and previous time increments are obtained from the corresponding displacements and used as inputs for the ANN to predict the current state of stress. Training is carried out without stress labels to compute the loss. Instead, the local and global equilibrium conditions, corresponding to the application of the Virtual Fields Method (VFM) to the physical model, are employed to compute the loss and update the network parameters until the predicted stress state is accurate.


Introduction
Constitutive laws are established according to first-principle assumptions compiled in a set of mathematical expressions, supported by a number of empirical parameters that need to be calibrated via experimentation. Materials with complex behaviors require complex models with a higher number of parameters, resulting in expensive and time-consuming experimental campaigns [1].
Artificial Neural Networks (ANNs) have the potential to provide a radically different approach to the field, as they are able to implicitly learn patterns directly from data, without having to postulate a mathematical formulation or identify parameters [2,3]. Several successful applications of ANNs to the implicit modelling of material behavior have been reported in the literature (e.g. [4][5][6], among others). However, the vast majority of these approaches rely on training the ANNs with paired data, usually stress-strain, from numerically generated datasets. With numerically generated data all variables are easily accessible, whereas in a real experimental setting certain variables (e.g. stress) cannot be measured directly. Only the displacements and the global force are measurable quantities, and these should be used to indirectly train the ANN model.
In this work, displacement and global force data are used to indirectly train an ANN model to predict the stress state of a material. A virtual experiment was created in order to obtain the necessary variables considering different load distributions. The strains corresponding to two consecutive time increments are used as inputs for an ANN to predict the current state of stress. Training is carried out without stress labels. Instead, the local and global equilibrium conditions corresponding to the application of the Virtual Fields Method (VFM) [1] to the physical model are employed to compute the loss and update the network parameters. Hence, the main objective of this paper is to show that the VFM is a capable and viable method to indirectly train ANNs for implicit constitutive modelling.

Fundamentals on ANNs
Direct training. Feedforward neural networks (FFNNs) are the most commonly used topology, consisting of an input layer, an output layer and one or more hidden layers, which provide the complexity needed for non-linear problems [7]. Considering the generic FFNN shown in Fig. 1, the dimension of the input vector $\mathbf{x}$ dictates the number of neurons in the input layer [7]. These inputs are subsequently mapped to the next layers, triggering the respective activations $\{a_1^{(l)}, \dots, a_{n_l}^{(l)}\}$ that form a vector $\mathbf{a}^{(l)} \in \mathbb{R}^{n_l}$. The activation potential, from layer $(l-1)$ to layer $l$, is controlled by a matrix of parameters $\mathbf{W}^{(l)} \in \mathbb{R}^{n_l \times n_{l-1}}$ and computed as the sum of the weighted output values $\{a_1^{(l-1)}, \dots, a_{n_{l-1}}^{(l-1)}\}$ of all incoming connections. A function $g(\cdot)$ is then applied to this weighted sum, leading to the activations of the $l$th layer being computed as [7]:

$$\mathbf{a}^{(l)} = g\left(\mathbf{W}^{(l)} \mathbf{a}^{(l-1)} + \mathbf{b}^{(l)}\right)$$

A set of biases $\mathbf{b}^{(l)} \in \mathbb{R}^{n_l}$ is added as the weight of a link that always transmits a value of 1. During training, the network learns the parameters $(\mathbf{W}, \mathbf{b})$ that minimize a given loss function. A standard supervised learning procedure uses labeled datasets, where each training sample is associated with an observed value [8]. In the case of implicit constitutive modelling, one usually seeks to predict the stress components. Therefore, stress labels are fed to the network in order to compute the cost, such that [4]:

$$\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left\| \mathbf{y}_i - \hat{\mathbf{y}}_i \right\|^2$$

where $N$ is the total number of training instances, $\mathbf{y}$ is the vector of observed values and $\hat{\mathbf{y}}$ the vector of predictions. There is a wide variety of optimization algorithms available to minimize the cost function. A gradient-based algorithm is normally used to minimize the cost and drive the parameter updates, such that [7,9]:

$$\mathbf{W} \leftarrow \mathbf{W} - \eta \frac{\partial \mathcal{L}}{\partial \mathbf{W}}, \qquad \mathbf{b} \leftarrow \mathbf{b} - \eta \frac{\partial \mathcal{L}}{\partial \mathbf{b}}$$

with $\eta$ being the learning rate, which controls how quickly the model adapts to the problem [9]. The partial derivatives of the loss with respect to the parameters can be efficiently computed by error backpropagation.
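The direct (supervised) procedure described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in this work: the tanh hidden activation, the linear output layer, the toy labelled data, the learning rate and the names `init_params`, `forward`, `mse` and `gradient_step` are all assumptions introduced for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in, n_hidden, n_out):
    """Small random weights and zero biases for one hidden layer."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_hidden, n_in)),
        "b1": np.zeros((n_hidden, 1)),
        "W2": rng.normal(0.0, 0.5, (n_out, n_hidden)),
        "b2": np.zeros((n_out, 1)),
    }

def forward(p, x):
    """a = g(W a_prev + b), with g = tanh (assumed) and a linear output."""
    a1 = np.tanh(p["W1"] @ x + p["b1"])
    y_hat = p["W2"] @ a1 + p["b2"]
    return a1, y_hat

def mse(y, y_hat):
    """Mean over the samples (columns) of the squared error norm."""
    return float(np.mean(np.sum((y - y_hat) ** 2, axis=0)))

def gradient_step(p, x, y, eta):
    """One gradient-descent update, gradients computed by backpropagation."""
    n_samples = x.shape[1]
    a1, y_hat = forward(p, x)
    d2 = 2.0 * (y_hat - y) / n_samples        # dL/dy_hat
    d1 = (p["W2"].T @ d2) * (1.0 - a1 ** 2)   # chain rule through tanh
    grads = {
        "W2": d2 @ a1.T, "b2": d2.sum(axis=1, keepdims=True),
        "W1": d1 @ x.T, "b1": d1.sum(axis=1, keepdims=True),
    }
    return {k: p[k] - eta * grads[k] for k in p}

# Toy labelled data (assumed): 3 inputs, 3 outputs, 64 samples as columns.
x = rng.uniform(0.0, 1.0, (3, 64))
y = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 0.5]]) @ x

params = init_params(3, 8, 3)
loss_start = mse(y, forward(params, x)[1])
for _ in range(500):
    params = gradient_step(params, x, y, eta=0.02)
loss_end = mse(y, forward(params, x)[1])
print(loss_start, loss_end)
```

The point of the sketch is that the cost can only be evaluated because the labels `y` are available, which is exactly what is missing when stresses cannot be measured.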

[Fig. 1: generic FFNN with input, hidden and output layers]

Key Engineering Materials Vol. 926 2061
Indirect training. Implicit constitutive modelling based on ANNs mostly relies on paired data, with the strain and stress tensors usually being fed to the system in order to learn the material behavior. However, variables such as stress cannot be directly measured from experiments [10]. Therefore, the training process must be carried out indirectly, using only measurable data. Some authors have recently reported different approaches to tackle this issue. For example, Xu et al. [11] presented a method for an ANN to model viscoelasticity based on displacement data, using partial differential equations to introduce the physical constraints during training, and Liu et al. [10] addressed the issue by coupling the ANN model with the Finite Element Method (FEM) to learn constitutive laws based on force and displacement data (Fig. 2). In the latter approach, the physical constraints are implicitly imposed, as the network's outputs must go through the FEM to generate correct input data. In the current work, we propose a markedly different approach from that of Liu et al. [10], making use of the VFM to guarantee the global equilibrium (Fig. 3). The VFM, first introduced by Grédiac [12], is known for its computational efficiency and does not need to resort to the FEM to conduct any forward calculations [13]. The key elements behind the VFM are the Principle of Virtual Work (PVW) and the choice of virtual fields. According to the PVW, the internal virtual work must be equal to the external virtual work performed by the external forces, which is written as [14]:

$$\int_V \boldsymbol{\sigma} : \boldsymbol{\varepsilon}^* \, dV = \int_{\partial V} \mathbf{T} \cdot \mathbf{u}^* \, dS$$

where $\boldsymbol{\sigma}$ is the stress tensor, $\boldsymbol{\varepsilon}^*$ is the virtual strain, $\mathbf{u}^*$ is the virtual displacement, $V$ is the volume of the solid, $\partial V$ is its boundary and $\mathbf{T}$ is the traction vector. These virtual quantities are mathematical test functions that act as weights and can be defined independently of the measured displacements/strains.
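As a minimal numerical check of the PVW balance, the sketch below evaluates the internal and external virtual work on a plate discretized into 9 elements with one stress value per element. The uniform stress state, the virtual field $u_x^* = x/L$, $u_y^* = 0$ and the function name `internal_virtual_work` are assumptions chosen for illustration only.

```python
import numpy as np

length, height, thickness = 3.0, 3.0, 0.1   # plate dimensions [mm]
n_side = 3                                   # 3 x 3 = 9 elements
n_elem = n_side * n_side
area = np.full(n_elem, (length / n_side) * (height / n_side))  # [mm^2]

# Assumed uniform uniaxial stress, stored per element as [xx, yy, xy] in MPa.
sigma_xx = 100.0
stress = np.tile([sigma_xx, 0.0, 0.0], (n_elem, 1))

# Hypothetical virtual field u*_x = x / length, u*_y = 0, so eps*_xx = 1 / length.
virtual_strain = np.tile([1.0 / length, 0.0, 0.0], (n_elem, 1))
u_star_on_loaded_edge = 1.0                  # u*_x evaluated at x = length

def internal_virtual_work(stress, virtual_strain, area, thickness):
    """Discrete integral of sigma : eps* over the volume.

    Components are tensor components [xx, yy, xy], so the shear product
    is counted twice (xy and yx)."""
    weights = np.array([1.0, 1.0, 2.0])
    density = (stress * virtual_strain * weights).sum(axis=1)
    return thickness * float(np.sum(density * area))

force = sigma_xx * height * thickness        # resultant on the loaded edge [N]
w_int = internal_virtual_work(stress, virtual_strain, area, thickness)
w_ext = force * u_star_on_loaded_edge
residual = w_int - w_ext
print(w_int, w_ext, residual)
```

For a stress field in equilibrium with the applied force the residual vanishes up to round-off. A stress prediction that violates equilibrium produces a non-zero residual, which is the quantity a VFM-based loss can penalize.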
An infinite number of virtual fields can be used. Nonetheless, the following two conditions should be met [1,14]: the chosen virtual fields should be kinematically admissible, meaning that the displacement boundary conditions must be satisfied, and the virtual fields should be constant along the boundary where the force is applied. By coupling the VFM with the ANN model, one can use force and displacement data to indirectly train the ANN. The strains are obtained from the displacements and fed as inputs to the neural network, which provides the stress tensor components. The stress equilibrium is then evaluated globally by means of the PVW, and the parameters $(\mathbf{W}, \mathbf{b})$ are optimized until the equilibrium is respected, that is, by minimizing the loss:

$$\mathcal{L} = \sum_{i=1}^{n_{VF}} \left( \int_V \boldsymbol{\sigma} : \boldsymbol{\varepsilon}^{*(i)} \, dV - \int_{\partial V} \mathbf{T} \cdot \mathbf{u}^{*(i)} \, dS \right)^2$$

where $n_{VF}$ represents the number of virtual fields.

The test configuration from [15] was chosen in order to generate synthetic experimental data to train the ANN model. The configuration consists of a solid 3 × 3 mm² plate with a thickness $t$ = 0.1 mm. The physical domain is discretized by a mesh with a total of 9 four-node bilinear plane stress elements. The initial mesh, geometry and boundary conditions are depicted in Fig. 4. Symmetry boundary conditions are applied to the boundaries at $x$ = 0 and $y$ = 0, and a surface traction is applied to the boundary at $x$ = 3 mm. The traction follows a non-uniform distribution and is composed of a single component along the $x$-direction, which varies linearly in the $y$-direction according to $f(y) = m\,y + b$, where $m$ and $b$ respectively control the slope and intercept of the distribution.

The numerical simulations were conducted using the commercial finite element code Abaqus. The model was built with CPS4R elements (bilinear, reduced integration, plane stress). The material was simulated employing a non-linear isotropic elasto-plastic model, with the isotropic hardening response obeying Swift's law, which gives the flow stress as a function of the equivalent plastic strain $\bar{\varepsilon}^p$:

$$\sigma_y = K \left( \varepsilon_0 + \bar{\varepsilon}^p \right)^n$$

Achievements and Trends in Material Forming
where $\sigma_y$ is the flow stress, $K$ is a hardening coefficient, $n$ is the hardening exponent, $\sigma_0$ is the yield stress and $\varepsilon_0$ is the deformation at the yielding point, computed as:

$$\varepsilon_0 = \left( \frac{\sigma_0}{K} \right)^{1/n}$$

The elastic parameters were defined as $E$ = 210 GPa and $\nu$ = 0.3, while $\sigma_0$ = 160 MPa, $K$ = 565 MPa and $n$ = 0.26 were adopted for Swift's hardening law, corresponding to a soft steel.
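As a quick arithmetic check of the hardening parameters, the sketch below evaluates the deformation at the yielding point and the Swift flow stress for the values quoted above. The function names `swift_eps0` and `swift_flow_stress` are hypothetical and introduced only for this example.

```python
def swift_eps0(sigma_0, k, n):
    """Deformation at the yielding point: eps_0 = (sigma_0 / K)^(1/n)."""
    return (sigma_0 / k) ** (1.0 / n)

def swift_flow_stress(eps_p, sigma_0=160.0, k=565.0, n=0.26):
    """Swift's law: sigma_y = K (eps_0 + eps_p)^n, stresses in MPa."""
    eps_0 = swift_eps0(sigma_0, k, n)
    return k * (eps_0 + eps_p) ** n

eps_0 = swift_eps0(160.0, 565.0, 0.26)
print(eps_0)                      # roughly 0.0078
print(swift_flow_stress(0.0))     # recovers the yield stress, 160 MPa
print(swift_flow_stress(0.1))     # flow stress after 10% plastic strain
```

With zero plastic strain the law returns the yield stress by construction, which is a convenient sanity check on the three parameters.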

Data generation and ANN training.
To generate the training data, all simulations were performed using a small displacement formulation, with the time period set to 1 and a fixed time increment $\Delta t$ = 0.001. For each time increment, the strain components at the centroid were extracted for all the elements and the global force was determined by computing the equilibrium of the internal forces, such that:

$$F = \frac{t}{L} \sum_{i=1}^{n_e} \sigma_{xx}^{(i)} A^{(i)}$$

where $F$ is the global force, $L$ is the length of the solid, $A^{(i)}$ is the area of element $i$, $t$ is the thickness, $\sigma_{xx}^{(i)}$ is the stress along the loading direction in element $i$ and $n_e$ is the number of elements. The training data was generated for different load distributions, keeping the slope fixed at $m$ = 10 N/mm and varying the intercept parameter, such that $b$ = {50, 170, 270} N. Prior to training and for each mechanical trial, the dataset was organized into batches of 9 elements per time increment and shuffled before being split into training (67%) and test data (33%). The input features were normalized to the interval [0, 1].

Two models were trained, aimed at predicting the linear elastic and elasto-plastic responses of the material. The latter included training samples with both elastic and plastic data. Once trained, the models were validated with different mechanical trials using the following load distributions: $m$ = 12 N/mm, $b$ = 100 N for the elastic model and $m$ = 10 N/mm, $b$ = 200 N for the elasto-plastic model. The neural network model used for both cases was a FFNN with one hidden layer of 8 neurons. The PReLU was chosen as the activation function over a standard ReLU because the latter outputs zero for every negative input. The gradient flowing through a neuron whose input remains negative is therefore also zero, and its parameters stop being updated. This issue is commonly known as "dead" neurons, and the PReLU circumvents it by having a slope for negative input values, thus making the gradients nonzero. The slope itself is a learnable parameter that the neural network automatically adjusts during training. The ANN architecture is summarized in Table 1.
The inputs given to the model were the strain components in the current and previous time increments, $\boldsymbol{\varepsilon}_t$ and $\boldsymbol{\varepsilon}_{t-1}$, respectively, and the outputs were the stress tensor components at the current time increment, $\boldsymbol{\sigma}_t$. The Adam algorithm was used to optimize the network weights, with the initial learning rate set to 0.1 and scheduled to be reduced by a multiplier of 0.2 if no improvement in the training loss was registered after 3 epochs. The elastic response model was trained for 20 epochs, while training of the elasto-plastic response model was set to run for a maximum of 150 epochs. However, an early-stopping criterion was defined such that training would be interrupted if no further improvement was registered in the test loss. The complete set of virtual fields used to train both models is shown in Fig. 5.
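The data preparation steps and the activation function described in this paragraph can be sketched as follows. This is an illustrative sketch under assumptions: the global force is taken as the thickness-weighted sum of the element stresses along the loading direction divided by the plate length, the initial PReLU slope of 0.25 is an arbitrary choice, and the helper names `global_force`, `minmax_scale`, `shuffle_split`, `prelu` and `prelu_grad` are hypothetical.

```python
import numpy as np

def global_force(sigma_xx, area, thickness, length):
    """Global force from internal forces: (t / L) * sum(sigma_xx * A)."""
    return thickness / length * float(np.sum(sigma_xx * area))

def minmax_scale(features):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    f_min, f_max = features.min(axis=0), features.max(axis=0)
    span = np.where(f_max > f_min, f_max - f_min, 1.0)
    return (features - f_min) / span

def shuffle_split(n_batches, train_fraction=0.67, seed=0):
    """Shuffle batch indices (one batch per time increment) and split them."""
    idx = np.random.default_rng(seed).permutation(n_batches)
    n_train = int(round(train_fraction * n_batches))
    return idx[:n_train], idx[n_train:]

def prelu(z, alpha):
    """PReLU: identity for positive inputs, slope alpha otherwise."""
    return np.where(z > 0.0, z, alpha * z)

def prelu_grad(z, alpha):
    """Derivative of PReLU with respect to its input (nonzero for z < 0)."""
    return np.where(z > 0.0, 1.0, alpha)

# 9 elements of 1 mm^2 under an assumed uniform stress of 100 MPa.
force = global_force(np.full(9, 100.0), np.ones(9), thickness=0.1, length=3.0)

rng = np.random.default_rng(1)
scaled = minmax_scale(rng.normal(size=(9000, 6)))   # placeholder strain features
train_idx, test_idx = shuffle_split(1000)            # 1000 time increments

z = np.array([-2.0, 3.0])
print(force, scaled.min(), scaled.max(), len(train_idx), len(test_idx))
print(prelu(z, 0.25), prelu_grad(z, 0.25))
```

With a standard ReLU the derivative for the negative input would be exactly zero, whereas the PReLU derivative equals the slope `alpha`, so the neuron can still receive gradient updates.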
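The input construction and the learning rate schedule described above can be sketched as follows. The sketch makes two assumptions that are not stated in the text: the first time increment is dropped because it has no previous strain, and the learning rate is reduced as soon as 3 consecutive epochs show no improvement. The names `build_inputs` and `ReduceOnPlateau` are hypothetical.

```python
import numpy as np

def build_inputs(strain):
    """Pair current and previous strains.

    strain: array of shape (n_increments, n_elements, 3)
    returns: array of shape (n_increments - 1, n_elements, 6),
             ordered as [eps_t, eps_{t-1}] along the last axis."""
    return np.concatenate([strain[1:], strain[:-1]], axis=-1)

class ReduceOnPlateau:
    """Multiply the learning rate by `factor` after `patience` epochs
    without improvement in the monitored loss (assumed semantics)."""

    def __init__(self, lr=0.1, factor=0.2, patience=3):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        if loss < self.best:
            self.best, self.bad_epochs = loss, 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr

inputs = build_inputs(np.zeros((1000, 9, 3)))   # placeholder strain history

scheduler = ReduceOnPlateau(lr=0.1, factor=0.2, patience=3)
losses = [1.0, 0.9, 0.9, 0.9, 0.9, 0.8]          # made-up training losses
rates = [scheduler.step(loss) for loss in losses]
print(inputs.shape, rates)
```

In this made-up loss history the rate stays at 0.1 until the third stagnant epoch and is then multiplied by 0.2.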

Results and discussion. The learning curves for both models are depicted in Fig. 6. The plots show that, after some point, the loss decreases sharply during training for both models, but convergence is achieved earlier for the elastic model, which was easier to train and required only 2 virtual fields. For the elasto-plastic response model, 15 virtual fields were used during training, substantially lowering the initial loss (Fig. 6(a)). The validation results (Fig. 7) show that the ANN was able to learn both elastic and plastic behaviors, providing reasonable predictions for the stress response along the $x$-direction, which indicates that the VFM approach is viable. Although using additional virtual fields somewhat improves the shear stress response for the plastic model (Fig. 7(b)), in general the ANN has a noticeably lower sensitivity to the stress responses along $y$ and $xy$. The issue may be due to the fact that either the number of chosen virtual fields is not enough to capture the material behavior, or the chosen set of virtual fields gives more weight to the stresses along $x$ to the detriment of the remaining components. Another factor at play is that the virtual fields were chosen manually. This strategy is often used for non-linear models and is the easiest to implement. Nonetheless, it does not guarantee that the chosen virtual fields produce the best results, and it is tied to the expertise of the user [15].