Evaluating Sampling Strategies for Visualizing Uncertain Multi-Phase Fluid Simulation Data

With Eulerian Method of Moment (MoM) solvers for the CFD simulation of multi-phase fluid flow, the positions of bubbles or droplets are not modeled explicitly, but through scalar fields of moments. These fields can be interpreted as probability density functions describing the distribution of bubble locations. To enable intuitive visualization that allows direct visual comparisons between simulation and physical experiment, explicit instances of the bubble distribution are required. In this work, we examine one sampling-based method for obtaining sets of bubble positions from density fields. Based on an example dataset, we study the influence of the main parameter, the kernel size, on the resulting bubble set. We identify a tradeoff between numerical accuracy and temporal continuity for the visualization.


Introduction
In process engineering, Method of Moment (MoM) solvers are used to simulate scenarios involving liquid and gaseous phases, with the ultimate goal of reducing the need for physical experiments. In such simulations, the exact positions of bubbles are considered uncertain. Bubbles are modeled via scalar density fields describing the stochastic distribution of positions as well as fields of local properties such as the bubble diameter and phase velocities.
The visualization of such datasets is challenging because the simulation output does not lend itself to visual comparison with the physical experiment. This work considers a sampling approach, where an explicit set of bubbles is generated to match the density field at a time step. The bubble set can then be displayed in a way that resembles the physical experiment. To provide temporal coherence in the visualization, the bubble set is updated over time, matching each time step of the density field as closely as possible. This approach requires a method to measure the density of a given set of bubble locations. We describe how this can be achieved using kernel density estimation, where kernel functions are evaluated across the domain. The bandwidth of the kernel functions strongly influences the resulting measured density field, and thereby the updated bubble set. Given a dataset of a bubble column, we study how different choices of this parameter influence the results of our approach.
The paper is structured as follows: We begin by providing a short overview of related work in the field of visualization and some background for Method of Moments simulations. Then, we describe the sampling approach in detail, before discussing the results of applying it to a bubble column dataset. Finally, we summarize our findings and list further research opportunities.
Potter et al. [4] and Bonneau et al. [5] provide overviews of existing approaches for uncertainty visualization. Flow Visualization. For the visualization of uncertain flow fields, Lodha et al. [6] used envelopes, ribbons, and glyphs to encode uncertainty of particle traces. Botchen et al. [7] presented a texturebased method using blurred streaklines to convey uncertainty information in two-dimensional flow fields obtained from measurements. A focus-and-context flow visualization method was presented by Bürger et al. [8], who combined spatial importance measures with particle glyphs. Specifically for Method of Moments-based simulations, Hlawitschka et al. [9] and Hummel et al. [10] proposed a visualization approach featuring sampling in addition to parametric bubble glyphs, which were shaped according to local flow properties. We build on their approach by extending the sampling method and evaluating its performance. Method of Moments Multi-Phase Fluid Simulation. Using multi-phase fluid simulations, the interaction of liquid or gaseous media can be studied. For this purpose, Eulerian-Eulerian methods such as the Method of Moments [11] have long been in the focus of research, with a number of improvements regarding performance and flexibility being proposed [12,13,14,9]. A fundamental aspect common to all these variants is that bubbles are not represented explicitly. Instead, a field of volume fractions describes, for each cell, the ratio of the amount of gas to the amount of liquid. Since the gaseous phase consists of bubbles that are dispersed throughout the liquid, it is referred to as the dispersed phase. Accordingly, the liquid phase is called the continuous phase.
The bubble population itself is modeled by moments of the local bubble distribution [15]: The number density function n(L, x, t), which describes the number of bubbles with the characteristic length L, is not tracked directly by the simulation. Instead, information such as the local density and bubble size can be derived from the moments. In addition, velocity fields are available for each phase.
For the purpose of this work, the following fields, defined for each time step, are relevant: First and foremost, the zeroth moment M 0 (x) describes, for each cell, how many bubbles per volume are present in the area. The third moment M 3 (x) is related to the volume of the bubble distribution. It can be used to calculate the size of individual bubbles [14]. Even though this size has no influence on the bubble positions, it is needed afterwards for the visualization [10]. Finally, the vector field u disp (x) defines the local velocity of the dispersed phase, and thereby the motion of the bubbles.

Sampling the Bubble Distribution
In this section, we describe the algorithm that is used to obtain a set of bubble locations based on the density field. Starting with an initial sampling, the set of bubbles is updated over time to reflect changes of the density in the simulation output.
The density field M 0 (x) defines, for each location x within the domain, how many bubbles per volume should be present. Since M 0 is constant within each mesh cell, the exact number of bubbles required within each cell i can be calculated: This number is generally not integer, which means that a set of bubbles with the correct number of bubbles within each cell cannot be achieved in practice; instead, the number of bubbles should on average satisfy the density requirement. In a naïve approach, a set of bubble positions could be obtained by rounding and choosing e.g. ⌊n required (cell i )⌋ random positions within the cell. However, the overall density of the resulting set of and kept if y ≤ af (x) (green); otherwise they are rejected (cyan). b) Rejection sampling is performed separately for each cell, minimizing the number of rejected samples. The y-axis in the illustration represents n required (eq. 2). c) An existing bubble set is updated using the measured density (enclosing the red and gray area) and the target density (enclosing the gray and white area). Superfluous bubbles are removed by calculating a new y for each bubble in cells where the measured density exceeds the target density. If the particle falls into the red area, it is removed from the set. d) Finally, missing bubbles are added in areas where the target density is greater than the measured density. The approach is identical to the initial sampling, with the difference that the green acceptance area is defined by the difference between measured and target density.
bubbles would always be smaller than the required density M 0 ; a problem that is exacerbated by small cell sizes or low densities. If, for instance, the density field were to mandate slightly less than one bubble per cell, the naïve algorithm would never produce any bubble locations. Instead of simply counting the expected number of bubbles per cell, we decide to interpret the density function like the probability density function of a stochastic distribution. A straight-forward method for drawing samples from any given probability density function f (x) is rejection sampling [16,17]: A factor a is chosen such that af (x) ≤ 1. Then, two random numbers x, y ∈ [0, 1] are chosen uniformly. x is kept as a sample if y ≤ af (x) and rejected, otherwise. An illustration of this algorithm can be found in Figure 1a).
In the following, we describe how this basic approach can be adapted to bubble density fields. Initial Sampling. Starting with an empty set of bubble locations, we first produce an initial sample using a rejection sampling approach. To avoid having to reject too many samples, we treat each cell separately (Fig. 1b)). In contrast to the classic rejection sampling algorithm, which can produce an arbitrary number of samples, a specific density has to be achieved. For cell i , we first define the number of attempts as follows:

Applied Mechanics and Materials Vol. 869 141
We then use the following criterion for acceptance: Updating the Bubble Positions. As the time progresses in the simulation, bubbles move through the domain in accordance with buoyancy and the velocity of the surrounding liquid medium. The simulation models this behavior via a separate, time-varying velocity field u disp (x, t) for the dispersed phase. In part due to this motion, a set of bubble positions that reflects the density field at the time t i−1 generally does not reflect the density at the next time step t i . The motion can be accounted for by moving the bubbles in accordance with the dispersed velocity field. For each bubble positioned at x(t i−1 ), the differential equationẋ is solved using a numerical integration method such as the Dormand-Prince (4)5 scheme [18]. The resulting set of new bubble positions should then match the new density field more closely. However, aside from motion, there are more causes for change in the density field. Coalescence and separation cause bubbles in a given location to merge or split up. Even more noticable are changes brought on by in-and outflow -new bubbles introduced into the domain at inlets as well as bubbles leaving the domain through outlets -and of bubbles bursting at the transitional area between the liquid medium and the gas-filled area above the surface of the liquid.
To incorporate these changes, after the bubbles have been moved along their trajectories, the bubble set is updated to reflect the new density field. This is achieved in a two-step approach that is based on the rejection sampling algorithm used for the initial sampling.
Using an estimate of the current density field of the bubble set, it is straight-forward to identify cells with too low or too high density by looking at the difference to the target density field. Similar to equation 2, the measured density field can be converted to a number of bubbles for each cell i: In the first step of the update, the rejection sampling algorithm is used to filter the existing bubble set. Instead of drawing new samples (x, y), the existing bubble positions are used, while each bubble is assigned a random value y ∈ [0, 1]. The bubble is then kept in the set only if the acceptance criterion is fulfilled; otherwise, it is removed. This approach is illustrated by Figure 1c). In the second step (Fig. 1d)), new bubbles are added in cells where the measured density is lower than the target density. Again, the rejection sampling algorithm is applied. For each affected cell, a number of samples according to Equation 3 is drawn. The acceptance criterion for adding these new bubbles to the set is slightly modified to account for the bubbles that are already present: The resulting, updated set should reflect the new density field.
Measuring the Bubble Density. The quality of the updated bubble set clearly depends on the method used to measure the density field of the bubble set. A naïve approach would be to simply count, for each cell, the number of bubbles that are located within the cell. The problem with such an approach is analogue to the problem discussed in the beginning of this section: For a set of bubbles that was drawn from a density field which calls for only few bubbles in some areas, the number of bubbles contained  within cells in these areas will be either much higher or much lower than the required number n required , even if the overall density in the neighborhood is correct. This problem is not unique to bubble density fields. In statistics, methods for kernel density estimation are used to estimate the probability density function of a distribution based on a given set of samples [19]. For a set of samples [x 1 . . . x n ], the probability density function f (x) is estimated as follows: where w is a kernel with the property ∫ ∞ −∞ w(u)du = 1. We can adapt this approach to measure the local density of a set of positions [x 1 . . . x n ]: In this case, the three-dimensional kernel w has to fulfill the requirement The density for an entire cell i can then be obtained as follows: We choose a box kernel of width h because it allows us to calculate the cell-based density value (Eq. 12) in a straight-forward manner. This is illustrated for a two-dimensional example in Figure 2. Finally, the width h of the kernel has to be chosen. h is the only parameter of our method for which the optimal choice is not obvious. For the purpose of kernel density estimation, h should be chosen to minimize the integrated mean square error [19] imse which is difficult in practice since f (x) is not known. In general, a smaller number of available samples mandates a larger kernel width. A possible approach would be to choose h such that the integrated mean square error is minimal when comparing the required density field to the measured density field of a freshly sampled set of bubble locations: On the other hand, in the context of visualization, it may be preferable to minimize the number of changes in the bubble set, i.e. to avoid removing existing bubbles and inserting new bubbles as much as possible.

Results
We evaluate the approach using a three-dimensional multi-phase fluid simulation dataset for a bubble column. The domain is a cuboid 1.5m tall, 18cm wide, and 3cm deep, filled with water to a height of about 1.1m. From an inlet positioned at the bottom, bubbles of air stream into the domain. Figure 6 shows a color-coded two dimensional slice of the density field in a single time step as well as two possible bubble sets for the same time step. Initial Sampling. We begin by evaluating the initial sampling step. First, an initial set of bubbles is computed for several time steps. This is achieved by removing all bubbles after each time step, providing a clean slate for the next. Then, the average of the integrated mean square error (Eq. 14) is calculated. Since the density field requires only up to about three bubbles per cell, some noise can be  expected in the result. Therefore, a number of samplings are calculated, with the cell volume artificially multiplied by a factor to increase the number of bubbles and therefore the quality of the density measurement. Figure 3 (blue) shows the results for factors between 1 and 2048, using a logarithmic scale for both the integrated mean square error (vertical axis) and the volume factor (horizontal). It can be seen that the integrated mean squared error converges towards zero as the number of bubbles is increased.
Updated Sampling. Next, we repeat the same measurements as for the initial sampling, but without clearing the bubble set between time steps. The bubble set is updated between each time step according to the method described in the previous section. Since ultimately, a very large number of bubbles will be used, the kernel width h is set to zero for the density measurement. Effectively, this means that the density value is determined simply by the number of bubbles within each cell. The results are also shown as orange dots in Figure 3. The error still approaches zero as the number of bubbles is increased. However, the slope of the curve is considerably less steep than in the case of initial sampling, indicating possible opportunities for improving the update algorithm. The practical impact of this discrepancy is not clear, because for initial sampling and updating with the original cell volumes, the average integrated squared error is nearly identical.
Kernel Width. So far, the the trivial kernel width of zero has been used for the measurements. We now study how the kernel width affects the sampling quality when updating a bubble set. First, the update method described in the previous section is used to produce and update sets of bubbles, using a range of different choices for the kernel width h. The density of each bubble set is measured for every time step both before and after updating. Then, for each kernel width, the average integrated squared error of the measured density after updating compared to the required density is calculated. As can be seen in Figure 4 (bottom), the error measure starts out with a value of 306 at h = 0m and drops slightly until its minimum of 265 at h = 0.01m, which corresponds to a kernel size equal to the cell size. From there on, the error value increases rapidly for larger kernel sizes. Judging by this result, the optimal kernel width for this dataset would be equal to the cell size. However, from a visualization standpoint, there is another quality measure that could be of interest. For the purpose of visualizing the bubble set over time, the bubble set should of course follow the density field as closely as possible, with the additional requirement that bubbles should not be removed or added more often than necessary. Without this additional requirement, it would make sense to simply compute an entirely new set of bubbles at each time step, an approach that, as we have already seen, produces a good fit with the required density field. At the same time, such an approach provides no continuity over time, making it impossible to track individual bubbles visually. Therefore, we consider an additional measure of quality: Minimizing the average number of bubbles that are added or removed in each time step improves the continuity of the bubble set over time. Figure 4 (top) shows the average sum of the number of added bubbles and the number or removed bubbles for each tested kernel width. The minimum of this measure does not coincide with the smallest integrated squared error; instead, the kernel widths between 0.02m and 0.03m (two and three times the cell size, respectively) provide the smallest number of changes. Since the two criteria for the kernel width are in disagreement, the tradeoff between correctness and visual coherence has to be considered. Figure 5 illustrates the tradeoff between different choices of the kernel width h. The left half shows the results for h = 0.01m, which was shown above to provide the lowest integrated mean square error, while the right half shows results for h = 0.03m. It is clearly visible that for the smaller kernel width, the measured densities are subject to some noise when compared to the target density (Fig. 6). This behavior is expected because of the relatively small spatial density of bubbles in relation to the kernel size. In contrast, the larger kernel width produces smoother measurements which appear blurry in comparison to the target density. For both choices, the approach fulfils its purpose of updating the bubble set. This is most apparent when looking at the area above the inlet: as the existing bubbles have moved up since the previous time step, a gap is clearly visible where more bubbles should stream

146
Physical Modeling for Virtual Manufacturing Systems and Processes in from the inlet. The gap can also be identified by the red color in the pre-update difference plots. After the update, the gap has been filled in. Overall, the smaller kernel width appears to produce a smaller density difference after the update. However, both choices produce very similar results for the final bubble visualization (Fig. 6). In the figure, the bubbles are colored to indicate removals (black) and new additions (cyan). Compared to the larger kernel width on the right, the smaller kernel width results in a larger number of removals, which in turn has to be compensated by a larger number of new additions.

Conclusion and Future Work
We have described a sampling approach to generate bubble positions for the purpose of visualization in moment-based time-varying multi-phase fluid simulation data. Sets of bubble positions are generated in an initial rejection sampling step. To update the set of bubbles over time, kernel density estimation is used to measure the current density field. Discovered discrepancies of measured and required density are then mitigated by adding and removing bubbles appropriately. The approach features a central parameter that has major influence on the resulting bubble set, the kernel width h. To examine the effect of this parameter, and to provide hints for chosing its value, we have studied the results of applying our method to a bubble column simulation dataset. We have shown that a tradeoff exists between the temporal continuity of the visualization, quantified through the number of bubble additions and removals, and the measured correctness of the bubble set.
In the future, we will examine whether more complex kernel functions, such as the Gaussian kernel, can improve the quality of the density measurement. We will also examine why the errors of the update algorithm approach zero more slowly than with initial sampling as the number of bubbles is increased.