Mathematical methods

Mathematical modeling of physical processes and mathematical methods of data analysis are an integral part of modern experimental physics. There is a constant need for both improving existing methods and developing fundamentally new approaches.


Statistical regularization of incorrect inverse problems

One of the tasks solved by the group is the popularization and development of the statistical regularization method created by V.F. Turchin in the 70s of the XX century.

A typical incorrect inverse problem that arises in physics is the Fredholm equation of the first kind:

f(y)=abdxK(x,y)φ(x)f(y) = \int \limits_a^b dx K(x,y)\varphi(x)

In fact, this equation describes the following: the hardware function of the device K(x,y)K(x,y) acts on the studied spectrum or other input signal φ\varphi, as a result, the researcher observes the output signal f(y)f(y). The aim of the researcher is to restore the signal φ\varphi from the known f(y)f(y) and K(x,y)K(x,y). It would seem that signal recovery is not a difficult task, since the Fredholm equation has an exact solution. But the Fredholm equation is incorrect - an infinitesimal change in the initial conditions leads to a final change in the solution. Thus, the presence of noise present in any experiment invalidates attempts to solve this equation for sure.

Theory

Consider a certain algebraization of the Fredholm equation:

fm=Kmnφnf_m = K_{mn}\varphi_n

In terms of mathematical statistics, we must evaluateφ\vec{\varphi} using implementation f\vec{f}, knowing the probability density for f\vec{f} and matrix KK content. Acting in the spirit of decision theory, we must choose a vector function S\vec{S}, defining φ\vec{\varphi} on base of f\vec{f} and called strategy. In order to determine which strategies are more optimal, we introduce the squared loss function:

L(φ^,S)=(φ^S)2,L(\hat{\varphi},\vec{S}) = (\hat{\varphi}-\vec{S})^2,

where φ^\hat{\varphi} — the best decision. According to the Bayesian approach, we consider φ\vec{\varphi} as random variable and move our uncertainty about φ\vec{\varphi} in prior density P(φ)P(\vec{\varphi}), Expressing reliability of the various possible laws of nature and determined on the basis of information prior to the experiment. With this approach, the choice of an optimal strategy is based on minimizing aposterior risk:

rS(φ)EφEf[L(φ,S)φ]r_{\vec{S}}(\vec{\varphi}) \equiv E_{\vec{\varphi}}E_{\vec{f}}[L(\vec{\varphi},\vec{S})|\vec{\varphi}]

Then the optimal strategy in case of the square loss function is well known:

Snopt=E[φnf]=φnP(φf)dφS^{opt} _n= E[\varphi_n|\vec{f}] = \int \varphi_n P(\vec{\varphi}|\vec{f})d\vec{\varphi}

Aposterior density P(φf)P(\vec{\varphi}|\vec{f}) is determined by the Bayes theorem:

P(φf)=P(φ)P(fφ)dφP(φ)P(fφ)P(\vec{\varphi}|\vec{f})= \frac{P(\vec{\varphi})P(\vec{f}|\vec{\varphi})}{\int d\vec{\varphi}P(\vec{\varphi})P(\vec{f}|\vec{\varphi})}

In addition, this approach allows us to determine the dispersion of the resulting solution:

σn2=(φnSnopt)2P(φf)dφ\left\langle \sigma_n^2 \right\rangle = \int (\varphi_n - S^{opt}_n)^2 P(\vec{\varphi}|\vec{f})d\vec{\varphi}

We got the solution by introducing a priori density P(φ)P(\vec{\varphi}). Can we say anything about the world of φ(x)\varphi(x) functions, which is defined by a priori density? If the answer to this question is no, we will have to accept all possible φ(x)\varphi(x) equally probable and return to the irregular solution. Thus, we should answer this question positively. This is the statistical regularization method - regularization of the solution by introducing additional a priori information about φ(x)\varphi(x). If a researcher already has some a priori information (a priori density of P(φ)P(\vec{\varphi})), he can simply calculate the integral and get an answer. If there is no such information, the following paragraph describes what minimal information a researcher can have and how to use it to obtain a regularized solution.

Prior information

As British scientists have shown, the rest of the world likes to differentiate. Moreover, if a mathematician will be asked questions about the validity of this operation, the physicist optimistically believes that the laws of nature are described by "good" functions, that is, smooth. In other words, he assigns smoother φ(x)\varphi(x) a higher a priori probability density. So let's try to introduce an a priori probability based on smoothness. To do this, we will remember that the introduction of the a priori probability is some kind of violence against the world, forcing the laws of nature to look comfortable for us. This violence should be minimized, and by introducing an a priori probability density, it is necessary that _ Shannon_'s information regarding φ(x)\varphi(x) contained in P(φ)P(\vec{\varphi}) be minimal. Formalizing the above, let us derive a type of a priori density based on the smoothness of the function. For this purpose, we will search for a conditional extremum of information:

I[P(φ)]=lnP(φ)P(φ)dφminI[P(\vec{\varphi})] = \int \ln{P(\vec{\varphi})} P(\vec{\varphi}) d\vec{\varphi} \to min

Under the following conditions:

  1. Condition for smoothness φ(x)\varphi(x). Let Ω\Omega be some matrix characterizing the smoothness of the function. Then we demand that a certain value of the smoothness functional is achieved:
(φ,Ωφ)P(φ)dφ=ω\int (\vec{\varphi},\Omega\vec{\varphi}) P(\vec{\varphi}) d\vec{\varphi} = \omega

The attentive reader should ask a question about the definition of ω\omega. The answer to this question will be given further down the text.

  1. The normality of probability per unit:
P(φ)dφ=1\int P(\vec{\varphi}) d\vec{\varphi} = 1

Under these conditions, the following function will deliver a minimum to the function:

Pα(φ)=αRg(Ω)/2detΩ1/2(2π)N/2exp(12(φ,αΩφ))P_{\alpha}(\vec{\varphi}) = \frac{\alpha^{Rg(\Omega)/2}\det\Omega^{1/2}}{(2\pi)^{N/2}} \exp(-\frac{1}{2} (\vec{\varphi},\alpha\Omega\vec{\varphi}))

The α\alpha parameter is associated with ω\omega, but since we don't actually have information about the specific values of the smoothness functionality, it makes no sense to find out how it is associated. Then what to do with α\alpha, you ask? There are three paths:

  1. select the value of the parameter α\alpha manually, and thus proceed to regularization of Tikhonov
  2. average all possible α\alpha, assuming all possible α\alpha equally probable
  3. choose the most likely α\alpha by its a posteriori probability density of P(αf)P(\alpha|\vec{f}). This approach is correct if we assume that the experimental data contains enough information about α\alpha

The first case is of little interest to us. In the second case, we get the following formula for the solution:

φi=dφφiP(fφ)dαP(α)αRg(Ω)2exp(α2(φ,Ωφ))dφP(fφ)dαP(α)αRg(Ω)2exp(α2(φ,Ωφ))\left\langle \varphi_i \right\rangle = \frac{\int d\varphi\, \varphi_i P(f|\varphi) \int\limits d\alpha\,P(\alpha) \alpha^{\frac{Rg(\Omega)}{2}} \exp(-\frac{\alpha}{2} (\vec{\varphi},\Omega\vec{\varphi}))}{\int d\varphi P(f|\varphi) \int\limits d\alpha\,P(\alpha) \alpha^{\frac{Rg(\Omega)}{2}} \exp(-\frac{\alpha}{2} (\vec{\varphi},\Omega\vec{\varphi}))}

The third case will be considered in the next section using the example of Gaussian noises in an experiment.

Gaussian noises case

The case where the errors in the experiment are Gaussian distributed is remarkable in that an analytical solution to our problem can be obtained. The solution and its error will be as follows:

φ=(KTΣ1K+αΩ)1KTΣ1Tf\vec{\varphi} = (K^T\Sigma^{-1}K +\alpha^*\Omega)^{-1}K^T\Sigma^{-1^{T}}\vec{f} Σφ=(KTΣ1K+αΩ)1\Sigma_{\vec{\varphi}} = (K^T\Sigma^{-1}K+\alpha^*\Omega)^{-1}

where Σ\Sigma - covariance matrix of a multidimensional Gaussian distribution, α\alpha^* - the most probable value of the parameter α\alpha, which is determined from the condition of maximum a posteriori probability density:

P(αf)=C39;αRg(Ω)2(KTΣ1K+αΩ)1exp(12fTΣ1KT(KTΣ1K+αΩ)1KTΣ1Tf)P(\alpha|\vec{f}) = C39; \alpha^{\frac{Rg(\Omega)}{2}}\sqrt{|(K^T\Sigma^{-1}K+\alpha\Omega)^{-1}|}\exp(\frac{1}{2} \vec{f}^T\Sigma^{-1}K^{T}(K^T\Sigma^{-1}K+\alpha\Omega)^{-1}K^T\Sigma^{-1^{T}}\vec{f})

As an example, we consider the reconstruction of a spectrum consisting of two Gaussian peaks that fell under the action of an integral step kernel (Heaviside function).

deconvolution


Optimal experiment planning with parameter significance functions

Under construction...

This section is being finalized ...