Material Forming Digital Twins: The Alliance between Physics-Based and Data-Driven Models

This paper introduces the main building blocks of a digital twin, which embraces physics-based and data-driven functionalities that enrich each other. Both should operate in almost real-time, and the latter should remain usable in the scarce-data limit. When applied to materials and processes, model order reduction technologies enable the construction of the so-called “computational vademecum”, whereas data-driven modelling, based on advanced regressions, must be informed by the physics to combine rapidity and accuracy in the low-data limit. Despite recent advances, many functionalities are still needed and are in progress, some of them representing real scientific challenges. Those that we consider the most crucial are discussed in the present work.


Introduction
Virtual twins, in the form of simulation tools that represent the physics of materials, processes and structures by means of physics-based models, were the main protagonists of twentieth-century engineering. The virtual twin consists of the so-called nominal model, which is expected to represent the observed reality and is in general calibrated offline from data provided by specific tests. It enables predicting the responses to given loadings, which are also nominal in the sense that they are expected to represent those that the design will experience in service.
For that purpose, the mathematical models, consisting of complex partial differential equations that are generally strongly nonlinear and coupled, are discretized, despite the fact that in many cases the calculations are very costly in computational resources and computing time.
Twenty-first-century engineering calls for focusing on the real system in operation (instead of its nominal representation), subjected to the real loading that it has experienced up to the present time (instead of the nominal loading), in order to perform efficient diagnosis, prognosis and prescriptive decision making.
Here, usual modeling and simulation techniques are limited: the former because a model is sometimes no more than a crude representation of reality, and the latter because of the computational cost that its solution entails when using well-established state-of-the-art discretization techniques.
Recent Model Order Reduction (MOR) techniques enable evaluating the solution of physics-based models in almost real-time. These techniques neither reduce nor modify the model itself; they simply reduce the complexity of its solution by employing better-adapted approximations of the unknown fields [1].
Model Order Reduction techniques express the solution of a given problem (a PDE for instance) in a reduced basis with strong physical or mathematical content. Sometimes these bases are extracted from offline solutions of the problem at hand, as in the proper orthogonal decomposition (POD) or the reduced basis (RB) method. When operating within the reduced basis, the solution complexity scales with the size of this basis, which is generally much smaller than the size of the general-purpose approximation basis associated with the finite element method (FEM), whose size scales with the number of nodes of the mesh covering the domain in which the problem is defined. Even if the use of a reduced basis implies a certain loss of generality, it enables impressive computing time savings while guaranteeing acceptable accuracy as long as the problem solution remains in the space spanned by the reduced basis.
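To make the offline/online split of a POD-type reduced basis concrete, the following minimal sketch extracts a basis from precomputed snapshots through a singular value decomposition and then projects an unseen solution onto it. The snapshot family, the energy threshold and the names `pod_basis`, `snapshots` and `u_new` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pod_basis(snapshots, energy=0.9999):
    """Leading left singular vectors capturing the requested fraction of snapshot energy."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cumulative, energy) + 1)
    return U[:, :r], s

# Offline stage (assumed toy problem): snapshots u(x; mu) = sin(pi x) + mu sin(2 pi x)
x = np.linspace(0.0, 1.0, 201)
mus = np.linspace(0.1, 2.0, 20)
snapshots = np.column_stack(
    [np.sin(np.pi * x) + mu * np.sin(2.0 * np.pi * x) for mu in mus]
)
basis, singular_values = pod_basis(snapshots)

# Online stage: an unseen parameter value is represented by a few coefficients only
u_new = np.sin(np.pi * x) + 1.234 * np.sin(2.0 * np.pi * x)
coefficients = basis.T @ u_new          # size of the reduced problem
u_rom = basis @ coefficients            # reconstruction in the full space
rel_error = np.linalg.norm(u_new - u_rom) / np.linalg.norm(u_new)
```

In this toy case the snapshots live in a two-dimensional space, so two modes reproduce the new solution. A solution leaving that space would not be captured, which is the loss of generality mentioned above.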
The main drawbacks of those techniques are: (i) their limited generality when addressing situations far from the ones used to construct the reduced basis; (ii) the difficulty of addressing nonlinear models, which requires advanced strategies; and (iii) their intrusive character with respect to well-established and validated existing software.
To circumvent, or at least alleviate, the computational issues just mentioned, an appealing route consists of constructing the reduced basis at the same time that the problem is solved, as done in the proper generalized decomposition (PGD) [2][3][4]. However, PGD is even more intrusive than the POD and RB methods referred to above. Thus, non-intrusive PGD procedures were proposed, which construct the parametric solution of the parametric problem from a number of high-fidelity solutions computed offline for different choices of the model parameters. Among these techniques we can mention the SSL-PGD, which considers hierarchical separated bases for interpolating the precomputed solutions [5], and its sparse counterparts [6,7].
Once the parametric solution of the problem at hand is available, it can be particularized online for any choice of the model parameters, enabling simulation, optimization, inverse analysis, uncertainty propagation, simulation-based control, ... all of them under the stringent real-time constraint [3].
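As an illustration of online particularization, the sketch below assumes that a separated parametric solution u(x, mu) ≈ Σ_i X_i(x) M_i(mu) has already been stored on grids, and uses it for a brute-force inverse analysis. The modes are hypothetical analytic functions chosen for the example, and `particularize` and `identify` are illustrative names rather than functions of any existing platform.

```python
import numpy as np

# Assumed precomputed separated solution u(x, mu) ~ sum_i X_i(x) * M_i(mu)
x = np.linspace(0.0, 1.0, 101)
mu_grid = np.linspace(0.5, 2.0, 151)
X_modes = np.stack([np.sin(np.pi * x), x * (1.0 - x)])   # (n_modes, n_x)
M_modes = np.stack([1.0 / mu_grid, mu_grid**2])          # (n_modes, n_mu)

def particularize(mu):
    """Evaluate the field for one parameter value: only 1D interpolations and a small sum."""
    weights = np.array([np.interp(mu, mu_grid, m) for m in M_modes])
    return weights @ X_modes

def identify(measured, candidates):
    """Inverse analysis by scanning the parametric solution (no new simulation needed)."""
    misfits = [np.linalg.norm(particularize(mu) - measured) for mu in candidates]
    return float(candidates[int(np.argmin(misfits))])

measured = particularize(1.3)                  # synthetic "measurement"
candidates = np.linspace(0.5, 2.0, 1501)
mu_hat = identify(measured, candidates)
```

Each evaluation costs a handful of one-dimensional interpolations, which is what makes scanning, optimization or uncertainty propagation affordable under real-time constraints.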
On the other hand, recent advances in data science, artificial intelligence and machine learning make an alternative data-driven engineering possible. The data-driven route becomes especially appealing when:
• Physics-based models are unknown, or the existing ones remain too inaccurate in their predictions. In this case the physics-based model can be enriched with a data-driven model of the deviation, which is at the heart of the Hybrid Twin concept [8].
• Diagnosis can be performed very efficiently from the sole use of data, whereas its explanation requires a deeper modelling approach.
• MOR becomes difficult to perform or employ, with the associated effects on the prognosis performance.

Materials Hybrid Modelling
When addressing the modelling and simulation of materials involved in processes and structural systems, the three points above apply. Thus, different approaches and methodologies can be considered to obtain more accurate and efficient modelling frameworks, in particular:
• Fast and reliable calibration procedures applied to state-of-the-art constitutive equations, where the problem can be formulated as one of parametric inference, using a sort of material computational vademecum combined with adequate data assimilation (extended Kalman filters, Bayesian inference, regularized inverse techniques, …).
• When the best fit (best calibration of the state-of-the-art constitutive equations) is not general enough (unable to describe all tests) and/or not accurate enough, the gap between the predictions of the best calibrated model and the measurements reveals an amount of intrinsic "ignorance", sometimes of epistemic nature. In these circumstances, material hybrid modelling seeks to reduce the ignorance by constructing on the fly a data-driven model, obtained by applying physics-aware artificial intelligence to the deviation data. These physics-aware modeling methodologies fulfill fundamental principles (energy conservation, positive dissipation, frame indifference, convexity when required, …) and, in general, all existing well-established knowledge [9].
• Combining the physics-based and data-driven functionalities just described results in a hybrid material description, a sort of material Hybrid Twin [10].
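The calibrate-then-enrich logic of the list above can be sketched on a one-dimensional toy material. The example assumes a linear nominal law, synthetic "measurements" with a cubic softening term, and a deviation regression restricted to a basis that vanishes at zero strain (a very mild form of physics awareness). All names, numerical values and the choice of basis are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic measurements (assumed): a softening response the nominal law cannot capture
strain = np.linspace(0.0, 1.0, 40)
stress_meas = 2.0 * strain - 0.8 * strain**3 + rng.normal(0.0, 0.005, strain.size)

# Step 1: calibrate the nominal model stress = E * strain by least squares
E_hat = float(strain @ stress_meas / (strain @ strain))

def nominal_stress(eps):
    return E_hat * eps

def deviation_basis(eps):
    # Basis functions vanish at zero strain, so the enrichment keeps zero stress at rest
    return np.column_stack([eps**2, eps**3])

# Step 2: learn the gap between the calibrated model and the measurements
deviation = stress_meas - nominal_stress(strain)
coef, *_ = np.linalg.lstsq(deviation_basis(strain), deviation, rcond=None)

def hybrid_stress(eps):
    """Nominal (physics-based) prediction plus data-driven correction of the deviation."""
    return nominal_stress(eps) + deviation_basis(eps) @ coef

def rmse(prediction):
    return float(np.sqrt(np.mean((prediction - stress_meas) ** 2)))

nominal_rmse = rmse(nominal_stress(strain))
hybrid_rmse = rmse(hybrid_stress(strain))
```

Because the correction is fitted on the residual of the calibrated model, it can stay small and simple. A richer setting would replace the least-squares calibration by the data-assimilation techniques cited above.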

Achievements and Trends in Material Forming
• Extracting knowledge from data. When materials are not well known, the first step consists of determining from the collected data the intrinsic dimensionality of their behavior (manifold learning), identifying useless parameters, revealing the existence of hidden parameters (internal or state variables), and helping to identify the parameters that act in a combined manner [11].
• Variability is addressed by constructing probabilistic material descriptions (constitutive equations), and the uncertainty is propagated by using probabilistic models.
• Data compression using sparse sensing allows the acquisition rate to be reduced [12]. In the same way, the hybrid description enables data completion, that is, inferring thermomechanical fields far from the sensor locations.
• Material by design. The inverse problem can be solved to discover the features leading to optimal performance.
• In some circumstances, the description of rich microstructures and of highly fluctuating data series (for instance profiles of rough surfaces) faces the difficulty of choosing the parameters for their complete and concise description (at a given scale and with a given purpose), as well as the metric of analysis, the Euclidean one often proving inadequate. New techniques based on the topology of data, with the inherent invariance properties that topology provides, are disrupting data processing and the analysis of time series and images. Topological data analysis (TDA), based on persistent homology, has been successfully applied to the analysis of rich microstructures and rough surfaces [13,14].
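The first item of the list, estimating the intrinsic dimensionality of collected data, can be illustrated with the simplest linear tool, principal component analysis. This is only a stand-in for the manifold learning techniques mentioned in the text, since PCA detects linear structure only. The synthetic data set, the variance threshold and the name `intrinsic_dimension` are assumptions of this sketch.

```python
import numpy as np

def intrinsic_dimension(data, variance_kept=0.99):
    """Number of principal components needed to retain the requested variance fraction."""
    centered = data - data.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(ratio, variance_kept) + 1)

rng = np.random.default_rng(1)

# Assumed toy data: six measured features driven by only two hidden variables
latent = rng.normal(size=(500, 2))
mixing = np.array([[1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]])
data = latent @ mixing + 1e-3 * rng.normal(size=(500, 6))

dim = intrinsic_dimension(data)
```

Here six features are recorded but two components explain almost all the variance, which hints at redundant parameters. Curved manifolds would call for the nonlinear techniques cited in the text.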

Real-Time Process Simulation
To empower design procedures and predictive capabilities, state-of-the-art models must be solved in almost real-time. For this purpose, as discussed in the introduction, model order reduction can be applied. However, most existing techniques are too intrusive with respect to the commercial simulation software widely employed in industry. Thus, technologies that minimize intrusiveness are preferred.
Surrogate models, which represent the solution of the model at hand at any point of the parametric space, have burst into industrial practice; however, they fail to address highly multi-parametric models.
The main limitations that standard meta-modelling technologies encounter are: (i) the difficulty of creating rich approximations (regressions) in multidimensional settings (the so-called combinatorial explosion or curse of dimensionality); and (ii) the increase in the amount of data needed to construct such a rich regression.
In our former works cited above, the multi-parametric representation of the solution was addressed by using the separated representations at the heart of the PGD. The separation of variables enables solving for one dimension of the parametric space at a time, which reduces the complexity of solving a D-dimensional problem to the solution of a sequence of D one-dimensional problems.
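The idea of solving one dimension at a time can be sketched as a regression in separated form, f(p_1, …, p_D) ≈ Σ_m Π_d F_m^d(p_d), fitted from scattered samples by alternating least squares with greedy enrichment. This is a simplified illustration under stated assumptions (plain polynomial 1D bases, no sparsity-promoting regularization), not the sPGD family of the cited works. The names `poly_basis`, `separated_fit`, `separated_eval` and the test function `target` are hypothetical.

```python
import numpy as np

def poly_basis(p, degree):
    """1D polynomial basis at the sample points: columns 1, p, p^2, ..."""
    return np.vander(p, degree + 1, increasing=True)

def separated_fit(P, y, degree=3, n_modes=2, n_sweeps=50):
    """Greedy sum of products of 1D functions, each mode fitted by alternating least squares."""
    n, D = P.shape
    bases = [poly_basis(P[:, d], degree) for d in range(D)]
    residual = y.astype(float).copy()
    modes = []
    for _ in range(n_modes):
        coeffs = [np.ones(degree + 1) for _ in range(D)]
        for _ in range(n_sweeps):
            for d in range(D):
                # Freeze every dimension except d: what remains is a 1D linear problem
                others = np.ones(n)
                for k in range(D):
                    if k != d:
                        others *= bases[k] @ coeffs[k]
                A = bases[d] * others[:, None]
                coeffs[d] = np.linalg.lstsq(A, residual, rcond=None)[0]
        mode_values = np.ones(n)
        for d in range(D):
            mode_values *= bases[d] @ coeffs[d]
        residual = residual - mode_values
        modes.append(coeffs)
    return modes

def separated_eval(modes, P, degree=3):
    out = np.zeros(P.shape[0])
    for coeffs in modes:
        term = np.ones(P.shape[0])
        for d, c in enumerate(coeffs):
            term *= poly_basis(P[:, d], degree) @ c
        out += term
    return out

def target(P):
    # Assumed separable response used only to exercise the sketch
    return (1.0 + P[:, 0] ** 2) * (2.0 - P[:, 1]) * (1.0 + 0.5 * P[:, 2])

rng = np.random.default_rng(3)
P_train = rng.uniform(0.0, 1.0, size=(200, 3))
P_test = rng.uniform(0.0, 1.0, size=(500, 3))
modes = separated_fit(P_train, target(P_train))
rel_error = (np.linalg.norm(separated_eval(modes, P_test) - target(P_test))
             / np.linalg.norm(target(P_test)))
```

Each least-squares solve involves only degree + 1 unknowns, whatever the number of parameters D, whereas a full tensor-product polynomial basis would require (degree + 1)^D coefficients.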
Thus, quite rich approximations can be employed without major conceptual issues, as shown in our previous works. However, the combination of rich approximations and few data exacerbates the so-called overfitting (as already noted, the amount of data should be reduced as much as possible, because both data generation and data acquisition are far from being cheap and easy operations). To limit it, and thus obtain regressions that are both accurate and cheap, different regularizations were proposed in our former works cited above, in particular the sPGD, rsPGD, s2PGD and ANOVA-PGD [7] reported in Fig. 1, as well as the multi-PGD. The latter exploits the fact that nonlinear behaviors become locally linear, so that multi-regressions defined in patches paving the whole physical or parametric space represent a very appealing procedure.
Figure 2 represents the parametric solution builder, the so-called AdMoRe platform (Advanced Model Order Reduction), industrialized by ESI Group, which integrates the research developments of ESI and the ESI Chairs at Arts et Métiers and the University of Zaragoza. It consists of a sparse Design of Experiments (DoE) whose richness, as discussed before, scales linearly with the …
As soon as the parametric solution is available, the parametric space explorer (PSE) allows particularizing it for any choice of the parameters, that is, at any point within the parametric space. Such a fast exploration makes possible optimization, inverse analysis, uncertainty propagation and simulation-based process control, all of them under the stringent real-time constraints. Figure 3 depicts such a parametric solution of a stamping process, with the high-fidelity solutions computed using the commercial software PAM-Stamp (by ESI Group). This figure highlights the sliders representing different material and process parameters (friction coefficient, clamping force, stamping velocity, …).

Functionalities and Roadmap
AdMoRe constitutes at present a very powerful approach for augmented engineering, with major successful accomplishments and a clear roadmap for empowering it through incremental and disruptive innovation. Figure 4 summarizes the AdMoRe cloud platform, with numerous advanced functionalities.
Key Engineering Materials Vol. 926
The different advanced functionalities, which have already demonstrated their value, are illustrated in Fig. 5 and summarized below:
• Multi-regressions improve the solution accuracy when representing nonlinear non-polynomial behaviors, for which a locally linear approximation performs better than increasing the polynomial degree.
• Parametric Optimal Transport (POT) allows interpolating solutions that exhibit localization depending on the choice of the parameters [15]. Thus, spurious artefacts induced by the interpolation of localized solutions are avoided.
• Geometrical parameters are of major interest when addressing shape optimization. Here different possibilities exist, from traditional mappings to more advanced techniques such as Riemann projection or parametric morphing.

• Error estimation is compulsory for acquiring confidence in the designs and decisions, as well as for certifying them. The error can be estimated as in usual regression techniques, by checking the prediction accuracy on the training and test data sets.
• The so-called Lego-PGD aims at computing the parametric models of different components where, in addition to the intrinsic (physical) parameters, the boundary kinematics is parametrized and included in the general parametric solution of the component. With all the component parametric models computed and organized in a sort of catalog or dictionary, a system can be composed like a puzzle, where the parameters of the component boundaries are computed to ensure kinematic continuity as well as the equilibrium of the global structure (momentum balance). Then, a change in any parameter of any component affects the response of the whole structure, which is evaluated in almost real-time.
• The treatment of degenerated structural elements (beam, plate, shell, …), degenerated in the sense that at least one characteristic dimension is much smaller than the others, which makes a fully 3D discretization difficult, can be efficiently addressed by using an in-plane-out-of-plane separated representation (or a cross-section-axis one in the case of curvilinear structural elements). This allows solving the 3D problem as a sequence of three one-dimensional problems, or of one two-dimensional problem (middle surface or cross section) and one one-dimensional problem (thickness or axis), reaching the accuracy of fully 3D high-resolution simulations at the cost of 1D and 2D simulations, without introducing any hypotheses (in contrast with classical plate and shell theories).
• When addressing multi-stage processes (multi-stamping for instance) or chaining different processes (e.g. stamp, weld and crash), the output of one model becomes the input of the next.
In that case, as with the Lego-PGD, we must parametrize the input in order to produce an output that depends on the intrinsic (physical) parameters as well as on those related to the parametrization of the initial condition (input). This parametric solution is called a transfer function, because it connects the output to the parametric input while including the other parameters associated with the step or process under consideration. Online, the first step (Step #1) is applied and produces its output from its selected set of parameters. This output is then projected onto the reduced basis parametrizing the initial condition of Step #2, which provides the input parameters to be considered in Step #2. Then, using these and the intrinsic parameters associated with the current step, the Step #2 output is computed, and the process repeats for the next step, and so on, until the end of the multi-step process or process chain.
• The advanced regressions previously introduced and reported in Fig. 1 involve some hyperparameters (related to the regularizations) whose calibration deserves some fundamental developments. Moreover, the choice of the approximation bases (polynomial, extended or physics-aware, …) constitutes an open issue, in close connection with optimal sampling in the scarce-data limit. Smarter samplings could benefit from recent developments in active learning, among many other alternatives making use of sparse sensing, information theory, statistical sensing (Bayes, ANOVA, …), residual sensing, techniques based on adjoint methods or error estimation ("a posteriori" and "a priori"), exploration techniques used in robotics (SLAM, …), …
• Until now, we have addressed the interpolation of fields; however, industrial practice also involves the construction of parametric curves and parametric quantities of interest (QoI).
One could think that as soon as the parametric field is known, its particularization at a certain point in space yields a parametric time-dependent curve, and its particularization at a certain point and time gives access to the parametric QoI. Even if that procedure is possible, and in many cases represents a direct and valuable route, the parametric curves and parametric QoI obtained in this way lack accuracy. This is because, when computing a parametric field, we enforce (through the use of the L2 norm) that it be quite good everywhere (in space and time), which means that locally the solution may not be the most accurate one (local accuracy is sacrificed in favor of global accuracy). For this reason, it is preferable to have three ROM constructors, one for the fields, one for the curves and one for the QoIs, each of them proceeding from its respective high-fidelity solutions at the sampling points. However, to obtain the maximum accuracy when interpolating curves, some extra work is needed to align them before computing the reduced basis whose coefficients will depend on the considered parameters. This alignment can be performed by using standard mapping, optimal transport or DTW (dynamic time warping) [16], among many other options.
• In practice parameters are rarely deterministic; they follow a certain statistical distribution. When parameters are, for instance, normally distributed, each can be characterized by its mean value and standard deviation. Thus, the parametric solution is transformed into one whose parameters are the mean values and standard deviations. That description results in a corridor, the interval in which the solution should be located with a given probability (for instance 95%). Such a statistical parametric solution allows propagating uncertainty in almost real-time, making efficient reliability and robustness analyses possible. In the same way, when the regression concerns experimental data instead of synthetic data generated from high-fidelity simulations, the data variability must be considered to avoid overfitted regressions. In this case, we look for the (smooth enough) regression that guarantees that the data are distributed around it according to the known statistics. When the latter remain unknown, a regularization is applied that enforces (with a weight to be calibrated) the minimum variance of the data distribution with respect to the regression.
• As soon as component behaviors are condensed into their associated parametric solutions or parametric transfer functions, they can be integrated in a system model (or system of systems).
Thus, the resulting simulation framework takes advantage of the efficiency of system modelling (as when using Modelica, Simulink, SimulationX, …) and of the accuracy of 3D high-resolution models.
• When constituting a model by assembling many parametric components (as in the Lego-PGD described before), adding, replacing or removing a component seems quite simple. However, when following the standard procedure illustrated in Fig. 2, the reduced bases are no longer valid as soon as the system is modified, even when the change remains quite local. The so-called surgery-PGD aims at updating the reduced bases from a few global extra-simulations and a few extra-simulations of the component involved in the system update. The name "surgery" comes from the fact that a part of the vectors representing the space reduced basis is suppressed, and the part representing the new component is grafted in the location occupied by the component just replaced.
• Several behaviors are reputed non-reducible, such as those involving bifurcations (e.g. buckling), instabilities or unilateral mechanics (contact, …). They compromise interpolation unless the solutions are adequately clustered, so that a safe interpolation can be performed within each cluster of data to derive a local parametric solution.
• When the physics exhibits different characteristic times (for instance when addressing fatigue), time multi-scale simulations must be considered. In the presence of scale separation, homogenization techniques perform very well; however, the problem becomes computationally complex in the absence of that scale separation. One valuable route for addressing those problems consists of transforming the one-dimensional time axis into a D-dimensional time, with the different dimensions representing different characteristic time scales.
Because of the separated representation, the combination of D one-dimensional solutions, each involving N degrees of freedom (representing its resolution level), allows calculating a solution with a resolution of N to the power D. For example, solving three problems each involving 1000 time steps (with a computational complexity scaling with 3 × 1000) is equivalent to a resolution of one billion (1000 to the power 3) time steps at the finest scale. Even if the technique has been very successfully applied to the heat and wave equations with rich time-multi-scale thermal and mechanical loadings [17], the main remaining challenge concerns inelastic irreversible behaviors (e.g. elastoplastic behaviors), whose response depends on the experienced history (possibly condensed in a series of internal variables to be defined and updated all along the loading process).
• Lagrangian, updated Lagrangian and ALE formulations, or problems in which the domain evolves as dictated by the physics-based model (mold filling simulation for instance), make the direct application of the standard reduction framework difficult. In these situations, transport-based parametric solutions seem the most valuable route for addressing the parametric solution associated with these formulations.
• Another challenging application domain involves the solution of eigenproblems. The parametric expression of eigenvalues and eigenvectors, under the orthogonality constraints, remains difficult both in the continuous formulation, due to the nonlinearity of the problem and the associated nonlinear constraints, and when using solution interpolation, mainly because of the difficulty of interpolating bases, which first requires an alignment and then an adequate interpolation on the Grassmannian manifold.
• Even if the curse of dimensionality has been efficiently circumvented and the parametric solutions contain an enormous amount of information, a very serious issue remains: finding the best solution (in a certain, given sense) within this dense forest of solutions. The combinatorial explosion that was successfully circumvented in the construction of the parametric solution reappears in the post-processing step, when looking for a particular solution in this parametric forest. Different optimization algorithms (deterministic or stochastic) can be applied; however, the real opportunity (for attaining global extrema instead of the usual local ones) could be the use of quantum computing, in which the optimization procedures break the combinatorial explosion (quantum computing is an appealing technology for addressing NP-hard problems).
• Addressing parametric topologies is another important and challenging issue, especially tricky when using interpolation because of the intrinsically non-interpolative nature of topology. Here, TDA (topological data analysis) in tandem with manifold learning allows unifying shape and topology optimization.
• Many times, as discussed in the introduction, physics-based models are not accurate enough, and data-driven models or data-driven model enrichments become compulsory. Concerning data-driven modelling and simulation, several functionalities are under intense development. The first concerns data analysis and more particularly data reduction. Extracting the intrinsic dimensionality of data by using linear and nonlinear dimensionality reduction (manifold learning: PCA, kPCA, lPCA, LLE, tSNE, … [18,19]; neural-network-based autoencoders [20]; …), identifying useless features, discovering hidden explanatory variables, or discovering hyperfeatures (combinations of features acting jointly on the target) are of major relevance [11]. However, for grouping data, in a supervised (classification) or unsupervised (clustering) way, the data must be associated with a metric able to quantify and compare them. When data are defined in a vector space, usual metrics (Euclidean in many cases) can be applied. However, data can be complex in many circumstances, consisting of heterogeneous information, sometimes discrete, categorical, codified, …, and for measuring and comparing such data one must learn goal-oriented metrics. This is also the case when data contain a huge amount of topology, as in microstructures or time series. Comparing two microstructures or two time series is not an easy matter because, even when they seem similar, they differ pixel by pixel or time by time, never matching perfectly.
In this case three possibilities, among many others, can be envisaged: (i) learning the metric from the available data, as most classification techniques do (SVM [21], decision trees [22] and their random forest counterpart [23], neural networks, including convolutional ones [24] and GANs as a valuable procedure for augmenting data [25], Code2Vect [26], …), along with others making use of boosting [27], reinforcement, semi-supervised and self-supervised learning, …; (ii) transforming the data to a suitable target space by using wavelets, Fourier transform, DCT or TDA, …; or (iii) extracting statistical descriptors (mean, standard deviation, pair-correlation, covariogram, …) on which the clustering or classification will operate.
• As soon as data become measurable, quantifiable, comparable and explainable, … the next step consists of creating regressions linking inputs and outputs, that is, models. For that, all the nonlinear regressions presented in Fig. 1 are available, as well as many others within the huge family of machine learning techniques. Sometimes the model is expected to relate inputs and outputs, or the solution at a given time instant to the solution at previous instants, that is, to extract the integrator of a dynamical system. Advanced deep neural networks, exploiting the universal approximation theorem [28] (which was extended to functionals and operators [29,30]), enable operating in strongly nonlinear settings.
These techniques include residual nets [31], recurrent NNs that account for the existence of hidden variables as well as their time evolution [32], convolutional and graph NNs [33] that take advantage of invariance properties, Physics-Informed NNs for prescribing the existing knowledge (PDEs) [34], and Thermodynamics-Informed NNs (also known as Structure-Preserving NNs, SPNN), which enforce the first principles of thermodynamics in both the Hamiltonian and dissipative settings: a symplectic Hamiltonian integrator concerning the free energy in the former, completed by a positive dissipation (entropy production) in the latter, incorporating a second thermodynamical variable, the entropy [35,36]. Within the thermodynamical setting, other formulations exist based on variational principles, such as Herglotz's variational principle in contact geometry or the Onsager variational principle, which considers the extremum of the so-called Rayleighian, instead of the differential formulation employed in the SPNN, which is based on the GENERIC formulation. Other procedures enabling the modelling of dynamical systems are the NN-based NARX (nonlinear autoregressive exogenous model) [37], the DMD (Dynamic Mode Decomposition) [38] and the Koopman operator, a more valuable option for addressing nonlinear dynamical models [39]. Nonlinear behaviors seem to depend strongly on the considered description framework. By using an adequate description, the nonlinear content can be drastically reduced. Thus, a valuable route consists of applying, before learning the model, a nonlinear dimensionality reduction to the data (for instance by autoencoding) and then applying a linear or nonlinear regression in the latent (reduced) space in which, the nonlinearity being less intense and the space dimension much smaller, usual DMD, regularized regressions, SINDy [40] or fully connected NNs work much better for the same amount of data.
In those cases, autoencoding is applied to the input and output data, and the latent representations of the input and output data are then connected by one of the regressions just mentioned. Moreover, acquiring knowledge may require the combined effect of different kinds of data, as considered in multimodal learning, where Boltzmann machines have been intensively used.
• The physics-based models and the data-driven model enrichment obtained by using one of the techniques just described can be coupled into the hybrid twin (HT) concept [8]. This hybrid methodology can be viewed as a physics-augmented learning technique (in contrast to the physics-informed learning just discussed), or as a form of transfer learning, where the nominal model is adapted by adding a data-driven enrichment when it is unable to accommodate the collected data. This hybrid formalism can also be applied to enrich the model locally, accounting for example for localized damage in structural health monitoring (SHM). Sparse regularization can be used for computing the model enrichment under the constraints of equilibrium and reproduction of the collected data. On the other hand, when the model must be globally enriched, two routes exist: one that proceeds by data completion (using an appropriate representation basis) before constructing the regression that describes the deviation, and another that computes a global correction of the model from the collected data without data completion. The latter correction becomes local as soon as the model is represented in a reduced setting, where again sparse regularization can help.
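Among the procedures listed above for extracting the integrator of a dynamical system from data, DMD is compact enough to sketch. The example below follows the usual SVD-based ("exact") DMD construction on synthetic snapshots generated by a known linear map, so that the recovered eigenvalues can be compared with the true ones. The test system, the lifting matrix and the function name `dmd` are assumptions of this sketch, not taken from the cited references.

```python
import numpy as np

def dmd(snapshots, rank):
    """SVD-based DMD: eigenvalues and modes of the best-fit linear map x_{k+1} ~ A x_k."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]
    YVS = (Y @ Vh.conj().T) / s              # Y V S^{-1}: column j divided by s[j]
    A_tilde = U.conj().T @ YVS               # operator projected on the leading POD modes
    eigvals, W = np.linalg.eig(A_tilde)
    modes = YVS @ W
    return eigvals, modes

# Assumed test system: a damped rotation observed through 10 linear "sensors"
rho, theta = 0.95, 0.3
A = rho * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
rng = np.random.default_rng(2)
lift = rng.normal(size=(10, 2))

state = np.array([1.0, 0.0])
states = [state]
for _ in range(30):
    state = A @ state
    states.append(state)
snapshots = lift @ np.array(states).T        # shape (10, 31)

eigvals, modes = dmd(snapshots, rank=2)
recovered = eigvals[np.argsort(eigvals.imag)]
expected = rho * np.exp(1j * np.array([-theta, theta]))
```

The modulus of each eigenvalue gives the decay per step and its argument the rotation per step. For nonlinear dynamics, the text points to applying such a linear regression in a latent space obtained beforehand by nonlinear dimensionality reduction.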

Conclusions
The present paper revisited the main methodologies for addressing real-time physics based on advanced reduced order modelling, and the data-driven learning procedures, which can possibly be informed or augmented by physics or existing knowledge. By combining both, physics-based models, calibrated by using data assimilation and operating in real time by using advanced model order reduction techniques, with a data-driven model for describing the gap between the measures