Abstract
We propose a workflow that reduces the time required for data generation when building statistical digital twins. The methodology is particularly relevant for real-world engineering problems in which data generation is expensive. Building a surrogate requires sufficient input/output data, but over-sampling beyond that point hardly improves the regression accuracy. The data generation time can be reduced by (1) lowering the average time spent generating each data point and (2) lowering the total number of data points, by decreasing the sampling rate as the surrogate quality improves. Two case studies from the field of carbon capture and utilization are presented: pressure swing adsorption (PSA), a dynamic process, and Gas-to-Liquids (GTL), a steady-state process. With the proposed methodology, the time for surrogate generation can be reduced by 88% for PSA and by 60% for GTL.