A Stochastic Optimization Technique for UNI-DEM Framework

This paper introduces a sophisticated multi-dimensional sensitivity analysis, incorporating cutting-edge stochastic methods for air pollution modeling. The study focuses on a large-scale long-distance transportation model of air pollutants, specifically the Unified Danish Eulerian Model (UNI-DEM). This mathematical model plays a pivotal role in understanding the detrimental impacts of heightened levels of air pollution. With this research, our intent is to employ it to tackle crucial questions related to environmental protection. We suggest advanced Monte Carlo and quasi-Monte Carlo methods, leveraging specific lattice and digital sequences to enhance the computational effectiveness of multi-dimensional numerical integration. Moreover, we further refine the existing stochastic methodologies for digital ecosystem modeling. The main aspect of our investigation is to analyze the sensitivity of the UNI-DEM model output to changes in the input emissions of human-induced pollutants and the rates of a number of chemical reactions. The developed algorithms are utilized to calculate global Sobol sensitivity measures for various input parameters. We also assess their influence on key air pollutant concentrations in different European cities, considering the diverse geographical locations. The overarching goal of this research is to broaden our understanding of the elements influencing air pollution and inform potent strategies to alleviate its negative impacts on the environment.


I. INTRODUCTION
This paper focuses on sensitivity analysis (SA) studies in the field of air pollution modeling [23], [26], [27], [28], [29], using the Unified Danish Eulerian Model (UNI-DEM) as a case study. UNI-DEM is chosen for its accurate representation of the relevant chemical processes in the atmosphere. The extensive output data generated by UNI-DEM has been used in various real-world applications, which makes it necessary to assess how reliable the data is for each specific use. The research objective is to evaluate the dependability of the substantial volume of output data produced by the model. The study primarily examines the variations in hazardous air pollutant concentrations in relation to human-made emission levels and chemical reaction rates.
The work is supported by the Project BG05M2OP001-1.001-0004 UNITe, funded by the Operational Programme "Science and Education for Smart Growth", co-funded by the European Union through the European Structural and Investment Funds, and by the Bulgarian National Science Fund under Project KP-06-M62/1 "Numerical deterministic, stochastic, machine and deep learning methods with applications in computational, quantitative, algorithmic finance, biomathematics, ecology and algebra" from 2022. Venelin Todorov is supported by the Bulgarian National Science Fund under Project KP-06-N52/5 "Efficient methods for modeling, optimization and decision making" and Project KP-06-Russia/17 "New Highly Efficient Stochastic Simulation Methods and Applications".
When decisions rest on large-scale mathematical models, doubts arise about how reliable those models are. To enhance their reliability, the sensitivity of model outputs to variations in model inputs caused by natural variability is studied and analyzed. Sensitivity analysis, as defined in this paper, is a procedure used to measure how sensitive mathematical model outputs are to variations in input data. The input data for sensitivity analysis in this study is obtained through simulations of a large-scale mathematical model known as the Unified Danish Eulerian Model (UNI-DEM). The model, developed at the Danish National Environmental Research Institute, covers a geographical area of 4800 km × 4800 km, encompassing all of Europe and the Mediterranean and parts of Asia and Africa. It represents the primary chemical, photochemical, and physical processes between the species considered and the emissions under rapidly changing meteorological conditions. The choice of this model for the case study is motivated by its precise treatment of chemical processes compared to other atmospheric chemistry models. UNI-DEM is mathematically represented by the following system of partial differential equations (PDE) [22]:
$$\frac{\partial c_s}{\partial t} = -\frac{\partial (u c_s)}{\partial x} - \frac{\partial (v c_s)}{\partial y} - \frac{\partial (w c_s)}{\partial z} + \frac{\partial}{\partial x}\left(K_x \frac{\partial c_s}{\partial x}\right) + \frac{\partial}{\partial y}\left(K_y \frac{\partial c_s}{\partial y}\right) + \frac{\partial}{\partial z}\left(K_z \frac{\partial c_s}{\partial z}\right) + E_s + Q_s(c_1, c_2, \ldots, c_q) - (k_{1s} + k_{2s})\, c_s, \quad s = 1, \ldots, q, \tag{1}$$
where $c_s$ are the concentrations of the chemical species; $u$, $v$, $w$ are the wind components; $K_x$, $K_y$, $K_z$ are the diffusion coefficients; $E_s$ are the emissions; $k_{1s}$, $k_{2s}$ are the dry and wet deposition coefficients; and $Q_s(c_1, c_2, \ldots, c_q)$ are non-linear functions that describe the chemical reactions between the species under consideration. The Carbon Bond Mechanism (CBM-IV) chemical scheme is used; it accounts for both the non-linearity and the stiffness of the system [22], [25].

II. GLOBAL SENSITIVITY ANALYSIS - SOBOL APPROACH
Variance-based methods are frequently employed in quantitative global sensitivity analysis, with the aim of assessing the contribution of input variance (either individual or grouped) to the overall variance of the model output. Among these methods, the Sobol approach is widely used [17], [5], [19]. This approach is based on the assumption that the mathematical model can be represented by a model function
$$u = f(x), \tag{2}$$
where $x = (x_1, x_2, \ldots, x_d) \in U^d \equiv [0, 1]^d$ is the vector of input parameters with a joint probability density function (p.d.f.) $p(x) = p(x_1, \ldots, x_d)$.
The concept behind the Sobol approach involves decomposing the integrable model function $f$ into terms of increasing dimensionality [18], [20]:
$$f(x) = f_0 + \sum_{\nu=1}^{d} \sum_{l_1 < \ldots < l_\nu} f_{l_1 \ldots l_\nu}(x_{l_1}, x_{l_2}, \ldots, x_{l_\nu}), \tag{3}$$
where $f_0$ is a constant. According to Sobol [19], the ANOVA (Analysis of Variance) decomposition splits the output variance of a mathematical model into components attributed to each input and to each group of inputs. The goal is to identify which inputs contribute most significantly to the output variance. Each input variable is given a sensitivity index, or Sobol index, indicating its relative contribution to the output variance.
In simple terms, the process involves running the model multiple times with different combinations of inputs and observing the changes in the output. The larger the change in output for a given change in input, the more 'sensitive' the model is to that input.
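To make the decomposition concrete, the following worked example uses a toy function of our own choosing, $f(x_1, x_2) = x_1 + 2x_2$ with independent uniform inputs on $[0, 1]^2$. It is an illustration only and is unrelated to the UNI-DEM output; $D_1$, $D_2$ denote the variances of the one-dimensional terms, $D$ the total variance, and $S_i = D_i / D$ the resulting first-order indices.

```latex
\begin{align*}
f_0 &= \int_0^1\!\!\int_0^1 (x_1 + 2x_2)\,dx_1\,dx_2 = \tfrac{3}{2},\\
f_1(x_1) &= \int_0^1 f\,dx_2 - f_0 = x_1 - \tfrac{1}{2}, \qquad
f_2(x_2) = \int_0^1 f\,dx_1 - f_0 = 2x_2 - 1,\\
f_{12}(x_1, x_2) &= f - f_0 - f_1 - f_2 = 0,\\
D_1 &= \int_0^1 \left(x_1 - \tfrac{1}{2}\right)^2 dx_1 = \tfrac{1}{12}, \qquad
D_2 = \int_0^1 (2x_2 - 1)^2\,dx_2 = \tfrac{1}{3}, \qquad
D = D_1 + D_2 = \tfrac{5}{12},\\
S_1 &= \frac{D_1}{D} = 0.2, \qquad S_2 = \frac{D_2}{D} = 0.8.
\end{align*}
```

In this toy case the second input carries 80% of the output variance, and because the interaction term vanishes, the total indices coincide with the first-order ones.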
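The expression (3), where each term is selected to fulfill the condition
$$\int_0^1 f_{l_1 \ldots l_\nu}(x_{l_1}, x_{l_2}, \ldots, x_{l_\nu})\, dx_{l_k} = 0, \quad 1 \le k \le \nu, \quad \nu = 1, \ldots, d,$$
is referred to as the ANOVA representation of the model function $f(x)$. This condition ensures that the functions on the right-hand side of (3) are uniquely defined and that $f_0 = \int_{U^d} f(x)\,dx$. The quantities
$$D = \int_{U^d} f^2(x)\,dx - f_0^2, \qquad D_{l_1 \ldots l_\nu} = \int f_{l_1 \ldots l_\nu}^2 \, dx_{l_1} \ldots dx_{l_\nu} \tag{4}$$
are the total and partial variances, respectively. The Sobol global sensitivity indices
$$S_{l_1 \ldots l_\nu} = \frac{D_{l_1 \ldots l_\nu}}{D}, \quad \nu \in \{1, \ldots, d\}, \tag{5}$$
and the total sensitivity index (TSI) of an input parameter $x_i$, $i \in \{1, \ldots, d\}$, are defined by [19], [17]:
$$S_i^{tot} = S_i + \sum_{l_1 \ne i} S_{i l_1} + \sum_{l_1, l_2 \ne i,\; l_1 < l_2} S_{i l_1 l_2} + \ldots + S_{i l_1 \ldots l_{d-1}}, \tag{6}$$
where $S_i$ is called the main effect (first-order sensitivity index) of $x_i$ and $S_{i l_1 \ldots l_{j-1}}$ is the $j$-th order sensitivity index. The higher-order terms characterize the interaction effects between the unknown input parameters $x_{i_1}, \ldots, x_{i_\nu}$, $\nu \in \{2, \ldots, d\}$, on the output variance. A comprehensive treatment of the global sensitivity analysis problem therefore involves the calculation of total sensitivity indices (6) of the corresponding order. This calculation relies on the formulas (4)-(5), which require the computation of multidimensional integrals. The authors of [9] discuss which formulation of
$$f_0^2 = \left(\int_{U^d} f(x)\,dx\right)^2 \tag{7}$$
is better when calculating the total variance and the Sobol global sensitivity measures. The first approximation formula is
$$f_0^2 \approx \frac{1}{n} \sum_{i=1}^{n} f(x_{i,1}, \ldots, x_{i,d})\, f(x'_{i,1}, \ldots, x'_{i,d}) \tag{8}$$
and the second one is
$$f_0^2 \approx \left\{\frac{1}{n} \sum_{i=1}^{n} f(x_{i,1}, \ldots, x_{i,d})\right\}^2, \tag{9}$$
where $x$ and $x'$ are two independent sample vectors. If one estimates sensitivity indices of a fixed order, the expression (8) is better (as recommended in [9]), and this is why we apply it here as well.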
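The estimation procedure described above can be sketched with plain Monte Carlo sampling. The code below is a minimal illustration under our own assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for an expensive model run, the function names are invented for this example, and pseudo-random points replace the lattice rules studied in this paper. The first-order indices use the product form of the $f_0^2$ estimate (two independent sample vectors), while the total variance and the total indices use the squared-mean form.

```python
import random


def toy_model(x):
    """Hypothetical stand-in for a model run: f(x) = x1 + 2*x2 (not UNI-DEM)."""
    return x[0] + 2.0 * x[1]


def sobol_estimates(f, d, n, seed=0):
    """Plain Monte Carlo estimates of first-order and total Sobol indices."""
    rng = random.Random(seed)
    keep_i = [0.0] * d   # accumulates f(x) * f(y), y shares only x_i with x
    drop_i = [0.0] * d   # accumulates f(x) * f(z), z shares all but x_i with x
    sum_f = sum_f2 = sum_prod = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(d)]
        xp = [rng.random() for _ in range(d)]  # independent sample vector x'
        fx, fxp = f(x), f(xp)
        sum_f += fx
        sum_f2 += fx * fx
        sum_prod += fx * fxp
        for i in range(d):
            y = xp[:]
            y[i] = x[i]      # keep x_i, resample everything else
            z = x[:]
            z[i] = xp[i]     # resample x_i, keep everything else
            keep_i[i] += fx * f(y)
            drop_i[i] += fx * f(z)
    f0_sq_product = sum_prod / n        # product form of f_0^2
    f0_sq_mean = (sum_f / n) ** 2       # squared-mean form of f_0^2
    total_var = sum_f2 / n - f0_sq_mean
    first = [(keep_i[i] / n - f0_sq_product) / total_var for i in range(d)]
    total = [1.0 - (drop_i[i] / n - f0_sq_mean) / total_var for i in range(d)]
    return first, total


if __name__ == "__main__":
    first, total = sobol_estimates(toy_model, d=2, n=100_000, seed=1)
    print("first-order:", first)
    print("total:      ", total)
```

For this toy function the exact values are $S_1 = S_1^{tot} = 0.2$ and $S_2 = S_2^{tot} = 0.8$, so the printed estimates should lie close to them; the statistical error decreases only as $n^{-1/2}$, which is the motivation for replacing random points with better-distributed nodes.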

III. A NEW OPTIMIZATION METHOD FOR SA
Consider a multidimensional integration problem in dimension $s$:
$$I(F) = \int_{[0,1)^s} F(x)\,dx. \tag{10}$$
We introduce the quadrature formula
$$I_N(F) = \frac{1}{N} \sum_{k=1}^{N} F(x_k), \tag{11}$$
where $P_N = \{x_1, x_2, \ldots, x_N\} \subset [0, 1)^s$ are the nodes of the integration formula. The selection of these nodes is critical because it determines the discrepancy of the sequence and the precision of the quadrature. For equation (11), the integration nodes that we will employ are [13], [14]:
$$x_k = \left\{\frac{k z}{N}\right\}, \quad k = 1, \ldots, N, \tag{12}$$
where $N$ is the number of nodes, $z$ is an $s$-dimensional generating vector of the lattice set, and $\{a\} = a - [a]$ is the fractional part of $a$. The formula (11) with nodes (12) and generating vector $z$ is referred to as a rank-1 lattice rule [2]. We adopt a particular category of rank-1 lattice rules: the symmetrized lattice (SL). We propose a specific SL, defined as follows. In the one-dimensional case, we construct from a non-periodic function $F$ a function $L$ that is suitable for rules designed for periodic integrands, by applying the symmetrization to $F$ in a single dimension. In the two-dimensional case, the function $L$ is defined as
$$L(x_1, x_2) = \frac{1}{4}\left[F(x_1, x_2) + F(1 - x_1, x_2) + F(x_1, 1 - x_2) + F(1 - x_1, 1 - x_2)\right].$$
The definition of the function $L(x_1, \ldots, x_s)$ extends to $s$ dimensions:
$$L(x_1, \ldots, x_s) = 2^{-s}\left[F(x_1, \ldots, x_s) + F(1 - x_1, x_2, \ldots, x_s) + \ldots + F(1 - x_1, \ldots, 1 - x_s)\right]. \tag{13}$$
The terms of the summation can be viewed as the vertices of a parallelotope whose diagonals meet at the point $(1/2, \ldots, 1/2)$.
The lattices used in our study are defined as follows. The first is a rank-1 lattice rule with a prime number of points and product weights, whose symmetrized version is denoted by SL-1pt. The second is a rank-1 lattice rule with a prime number of points and order-dependent weights, whose symmetrized version is denoted by SL-1od. Both rules have a variant in which the number of points is a prime power instead of a prime; these are denoted by SL-1expt and SL-1exod, respectively. The last lattice is a polynomial rank-1 lattice sequence in base two with product weights, denoted by SL-2poly.
PROCEEDINGS OF THE FEDCSIS. WARSAW, POLAND, 2023
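A rank-1 lattice rule combined with symmetrization can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the SL-1pt, SL-1od, SL-1expt, SL-1exod or SL-2poly constructions: it uses the standard reflection $x_j \to 1 - x_j$ in every coordinate, and the generating vector $z = (1, 377)$ with $N = 610$ (a two-dimensional Fibonacci lattice) is chosen only because it is a well-known good lattice. All function names are hypothetical.

```python
import math
from itertools import product


def rank1_lattice(n, z):
    """Nodes x_k = frac(k * z / n) for k = 0..n-1 (k = n repeats k = 0)."""
    return [[(k * zj % n) / n for zj in z] for k in range(n)]


def symmetrize(F, s):
    """Return L(x): the average of F over all 2^s reflections x_j -> 1 - x_j."""
    def L(x):
        total = 0.0
        for flips in product((False, True), repeat=s):
            total += F([1.0 - xj if flip else xj for xj, flip in zip(x, flips)])
        return total / 2 ** s
    return L


def symmetrized_lattice_rule(F, n, z):
    """Equal-weight quadrature of the symmetrized integrand on lattice nodes."""
    L = symmetrize(F, len(z))
    return sum(L(x) for x in rank1_lattice(n, z)) / n


def smooth_nonperiodic(x):
    """Toy integrand x1 * exp(x2); its exact integral over [0,1]^2 is (e - 1) / 2."""
    return x[0] * math.exp(x[1])


if __name__ == "__main__":
    approx = symmetrized_lattice_rule(smooth_nonperiodic, 610, (1, 377))
    exact = (math.e - 1.0) / 2.0
    print("approximation:", approx, "absolute error:", abs(approx - exact))
```

The symmetrization costs $2^s$ function evaluations per node, so in practice there is a trade-off between the smoother, periodic-like integrand it produces and the extra model evaluations it requires.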

IV. SENSITIVITY STUDIES WITH RESPECT TO EMISSION LEVELS
In this section, we report the findings of the sensitivity analysis performed on the output of UNI-DEM, with particular attention paid to the monthly average ammonia concentrations in Milan, Italy. This analysis examines how changes in the anthropogenic emission data used as input affect these concentrations.
The input is composed of the levels of four groups of anthropogenic air pollutant emissions. The domain under examination is the 4-dimensional hypercube $[0.5, 1]^4$.
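The Sobol framework is stated on the unit hypercube, whereas the inputs here range over $[0.5, 1]^4$. A minimal sketch of the affine change of variables that connects the two is given below; the helper name `scale_to_box` is our own, and the assumption is that every coordinate shares the same lower and upper bound, as in the domain above.

```python
def scale_to_box(u, lo, hi):
    """Map a point u in [0,1]^d to the box [lo, hi]^d, coordinate by coordinate."""
    return [lo + (hi - lo) * uj for uj in u]


if __name__ == "__main__":
    # A unit-cube node becomes a vector of four input values in [0.5, 1].
    print(scale_to_box([0.0, 0.5, 1.0, 0.25], 0.5, 1.0))
```

With a uniform density on the box, the Jacobian of this map cancels against the density, so integration nodes generated on the unit cube can be reused unchanged after this rescaling.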
The primary determinant of ammonia output concentrations is the emission of ammonia itself, accounting for approximately 89% in Milan. The next most influential factor is the emission of sulphur dioxide, contributing around 11% to the ammonia output. These first- and second-order sensitivity indices for ammonia in Milan were obtained using correlated sampling as part of Sobol's variance-based approach to multidimensional sensitivity analysis. This was done to compute all potential sensitivity measures and to investigate the impact of the selected four groups of air pollutant emissions on the concentration of three key air pollutants.
This indicates how strongly ammonia emissions directly affect ammonia concentrations, and it emphasizes the need for effective monitoring and management of these emissions. The role of sulphur dioxide emissions, although smaller, also needs to be taken into account because of their noticeable influence. Multidimensional sensitivity analysis helps to clarify the role of the various emissions in air pollution and thereby enables more targeted mitigation strategies. The results provide a foundation for future work aimed at improving air quality, informing policy decisions, and guiding research into pollution control methods.
The relative errors for $f_0$, the total variance $D$, and the first-order ($S_i$) and total ($S_i^{tot}$) sensitivity indices are given in Tables I, II, and III, respectively. $f_0$ is represented by a 4-dimensional integral, whereas the remaining quantities are represented by 8-dimensional integrals, following the correlated sampling technique for computing sensitivity measures in a robust manner (see [9], [20]). The stochastic methods used for numerical integration are displayed in separate columns in the tables. For $f_0$ with a sample size of $n = 2^{12}$, the most effective algorithm appears to be SL-1EXPT. This can be observed from the results given in Table I, which reports outcomes for the maximum sample count.
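The doubling of the dimension follows from correlated sampling: the quantity behind a first-order index is an integral of a product of two model evaluations, one at $x$ and one at a point that shares a single coordinate with $x$ and takes the rest from an independent vector $x'$. The sketch below, with hypothetical helper names of our own, builds such an integrand on $[0, 1]^{2d}$ and shows the relative error measure assumed here, $|\text{estimate} - \text{reference}| / |\text{reference}|$.

```python
def first_order_integrand(f, d, i):
    """Return g(w) on [0,1]^(2d): g = f(x) * f(y), where w = (x, x') and
    y takes coordinate i from x and all other coordinates from x'."""
    def g(w):
        x, xp = list(w[:d]), list(w[d:])
        y = xp[:]
        y[i] = x[i]  # the i-th coordinate of x' is not used by g
        return f(x) * f(y)
    return g


def relative_error(estimate, reference):
    """Relative error of an estimate with respect to a nonzero reference value."""
    return abs(estimate - reference) / abs(reference)


if __name__ == "__main__":
    toy = lambda x: x[0] + 2.0 * x[1]          # hypothetical 2-input model
    g = first_order_integrand(toy, d=2, i=0)   # integrand on [0,1]^4
    print(g([0.5, 0.25, 1.0, 0.75]))
```

Any quadrature rule on the unit cube can then be applied to `g` directly, which is why a model with 4 inputs leads to 8-dimensional integrals for the variance-based quantities.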
For the total variance $D$ with the same number of samples, SL-1OD performs best, as can be seen in Table II, which presents the results for the largest sample size. For the sensitivity indices (SIs), the best method is SL-1EXOD, as shown in Table III.
The efficiency and results of these algorithms can be further examined in Figures 1 and 2. The latter focuses particularly on SIs with smaller values, providing a more detailed look into their performance.
From the data displayed in Table III, it is evident that the SL-1EXOD algorithm improves the results in a majority of cases, particularly for the low-value sensitivity indices $S_2$, $S_4$, $S_2^{tot}$, and $S_4^{tot}$. These cases are especially significant because they play a crucial role in establishing the dependability of the model outcomes.

V. SENSITIVITY STUDIES WITH RESPECT TO CHEMICAL REACTION RATES
This section analyzes the sensitivity of the ozone concentration levels in the atmosphere above Genova, Italy, with respect to changes in the rates of specific chemical reactions in the condensed CBM-IV scheme [22]. The primary focus is on reactions #1, 3, 7, 22 (time-dependent) and #27, 28 (time-independent). The simplified formulas of these reactions follow the condensed CBM-IV scheme [22]. The domain under examination is the 6-dimensional hypercube $[0.6, 1.4]^6$.
Our analysis of the reactions described by the CBM-IV scheme led to several important insights. Reaction rates #1, 3, and 22 have a profound impact on $O_3$ concentrations, making them extremely influential in this context. Reaction rates #7 and 27, while not as dominant, still have a noticeable effect. In contrast, the influence of reaction rate #28 can be deemed negligible.
In other words, the study finds clear relationships between specific reaction rates and $O_3$ concentrations. While reactions #1, 3, and 22 play a leading role, reactions #7 and 27 still contribute to a certain extent. This suggests that these specific reactions could be potential targets for strategies to reduce $O_3$ concentrations. The role of reaction #28, however, appears to be minimal, so efforts aimed at this reaction are likely to be less effective. These observations provide a better understanding of the dynamics behind $O_3$ concentrations and support more effective and targeted air pollution control strategies.
The estimated relative errors for $f_0$, the total variance $D$, and a subset of the sensitivity indices are given in Tables IV, V, and VI, respectively.
The quantity $f_0$ is represented by a 6-dimensional integral, while the remaining quantities under examination are represented by 12-dimensional integrals, in line with the correlated sampling principle. For $f_0$, the best algorithm turns out to be SL-1EXPT, with SL-1EXOD as the second-best choice, as shown by the results in Table IV. For the total variance $D$ with a sample size of $n = 2^{12}$, the top-performing algorithm is SL-1OD, as shown in Table V, which reports the results for the largest number of samples. For the sensitivity indices (SIs), SL-1EXPT proves to be the most efficient method, closely followed by SL-2POLY and SL-1OD, as shown in Table VI. The algorithms' performance can be inspected visually in Figs. 3 and 4, with the latter emphasizing the SIs of smaller values. As evidenced by Table VI, the SL-1EXPT algorithm improves the results in most instances, most notably for the lower-valued sensitivity indices $S_5$, $S_5^{tot}$, $S_{15}$, and $S_{45}$. These indices are particularly significant because they strongly affect the trustworthiness of the model's outcomes.

VI. CONCLUSION
We have examined the computational effectiveness of various stochastic methodologies for multi-dimensional numerical integration in terms of relative error and computational resources. The subject of this study is the sensitivity analysis of the output of the UNI-DEM model with respect to changes in input emissions of anthropogenic contaminants and in a selection of chemical reaction rates.
We scrutinize the impact of emission levels on key air pollutants, specifically ammonia, ozone, ammonium sulphate, and ammonium nitrate.
The computational experiments reveal that the optimization methods developed are among the most effective stochastic strategies currently available for determining sensitivity indices, particularly for the most challenging task: assessing the smallest sensitivity indices, which are crucial for the dependability of the model's outcomes. These findings are of considerable significance for environmental conservation and the credibility of future predictions.

Fig. 2. Relative errors for the calculation of the small in value SIs

TABLE I. RELATIVE ERROR FOR THE EVALUATION OF $f_0 \approx 0.048$.

TABLE II. RELATIVE ERROR FOR THE EVALUATION OF THE TOTAL VARIANCE $D \approx 0.0002$.

Table V
Fig. 4. Relative errors for the calculation of the small in value SIs

TABLE IV. RELATIVE ERROR FOR THE EVALUATION OF $f_0 \approx 0.27$.

TABLE VI. RELATIVE ERROR FOR ESTIMATION OF SENSITIVITY INDICES OF INPUT PARAMETERS USING DIFFERENT QUASI-MONTE CARLO APPROACHES ($n = 2^{12}$).