From expression (1.31) it is clear that the residuals are nonlinear functions of the parameters if the activation function is nonlinear.
This means that the minimum conditions
An interesting activation function is derived using the Gaussian distribution function
With activation function (1.34), sum (1.31) also depends on the parameters of that function.
As usual, these parameters are unknown. Therefore they should be optimized together with the other parameters of the model.
In such a case the objective depends on all the unknown parameters, which are represented together as a single vector.
This is the main difference between model (1.34) and traditional ANN models, where the activation functions are selected for their resemblance to natural ones identified through biophysical experiments. In this research the resemblance factor is neglected, and the activation function (1.34) is regarded simply as a reasonable nonlinearity that should be adapted to the available data.
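The idea of fitting the activation function together with the weights can be illustrated by a minimal Python sketch. It does not reproduce expressions (1.31) and (1.34), whose exact form may differ. It assumes a hypothetical one-neuron autoregressive model whose activation is the Gaussian distribution function with mean `mu` and standard deviation `sigma`. The names `gaussian_cdf`, `residuals`, `objective`, the order `p`, and the sample data are all illustrative assumptions.

```python
import math

def gaussian_cdf(z, mu, sigma):
    """Gaussian distribution function with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

def residuals(x, y, p):
    """Residuals of an assumed one-neuron autoregressive model of order p.

    x = (w_1, ..., w_p, mu, sigma): the weights come first, followed by
    the two activation parameters, so x has p + 2 components.
    """
    w, mu, sigma = x[:p], x[p], x[p + 1]
    res = []
    for t in range(p, len(y)):
        # Weighted sum of the p previous observations (assumed model form).
        s = sum(w[i] * y[t - 1 - i] for i in range(p))
        res.append(y[t] - gaussian_cdf(s, mu, sigma))
    return res

def objective(x, y, p):
    """Sum of squared residuals as a function of the whole parameter vector x."""
    return sum(e * e for e in residuals(x, y, p))

# Hypothetical data and parameter vector, for illustration only.
y = [0.2, 0.4, 0.5, 0.7, 0.6, 0.8]
x = [0.5, 0.3, 0.0, 1.0]  # w_1, w_2, mu, sigma
print(objective(x, y, 2))
```

Under these assumptions, changing `mu` or `sigma` changes the objective just as changing a weight does, so an optimizer treats all components of `x` alike.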
As with the ARMA model (see nonlinear equations (1.11)), ANN models face the multi-modality problem. The multi-modality problems of ANN models are discussed in [].
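To make the notion of multi-modality concrete, here is a minimal Python sketch of a multi-start local search. The one-dimensional `toy_objective` is a hypothetical two-well function chosen only for illustration; it is not an ANN or ARMA objective from this text. The routine `local_search` is a simple derivative-free descent and is likewise an assumption, not the method used in the source.

```python
def local_search(f, x0, step=0.5, tol=1e-8):
    """One-dimensional descent: try x - step and x + step, halve step on failure."""
    x, fx = x0, f(x0)
    while step > tol:
        for cand in (x - step, x + step):
            fc = f(cand)
            if fc < fx:
                # Accept the first improving candidate and retry from there.
                x, fx = cand, fc
                break
        else:
            # Neither neighbour improves the objective: refine the step.
            step *= 0.5
    return x, fx

def toy_objective(x):
    # Hypothetical two-well function, not taken from the source.
    return (x * x - 1.0) ** 2 + 0.3 * x

# Run the local search from two different starting points.
starts = [-2.0, 2.0]
results = [local_search(toy_objective, s) for s in starts]
best_x, best_f = min(results, key=lambda r: r[1])
for s, (xm, fm) in zip(starts, results):
    print(f"start {s:+.1f} -> local minimum near x = {xm:.4f}, f = {fm:.4f}")
```

The two starts end in different wells with different objective values, so a single local search can miss the global minimum. This is why multi-modal objectives call for global optimization techniques or multiple starts.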