Author
Viktoria Blüschke-Nikolaeva
Keywords
control theory, stochastic dynamic optimum problems.
Review Status
Unreviewed
Abstract
This article presents a new version of the algorithm OPTCON, named OPTCON2. The algorithm has been developed to obtain approximate solutions of optimum control problems in which the objective function is quadratic and the dynamic multivariable system is nonlinear. Both additive and multiplicative (parameter) uncertainty are present in the dynamic models. OPTCON2 differs from the basic algorithm OPTCON in the way it deals with the stochastic parameters during the computation of the optimal control variables. OPTCON2 uses passive learning, i.e. the stochastic parameters are updated in each time period. This can make the results of stochastic optimum control problems more accurate and more reliable than a computation without updating.
1 Introduction
Optimal control of dynamic systems is an interesting area of research because of its relevance to many applications in economics, engineering and other fields. The field of optimal control problems is well known and, especially for linear deterministic problems, well researched, so there is an extensive literature to learn from. Basic concepts for solving control problems can be found in [2], [3], [5], [6]. The aim of our research is to develop a reliable algorithm that solves optimum control problems with a quadratic objective function and a nonlinear dynamic multivariable system under additive and parameter uncertainties.

In 1992, R. Neck and J. Matulka developed the algorithm OPTCON [1], which solves such optimum control problems and combines the concepts of nonlinearity and stochastics in control problems. OPTCON uses a simple open-loop strategy for dealing with the stochastic parameters during the computation of the optimal control variables. We take it as the basis for our research and extend it in order to obtain a more reliable method for solving such problems. While the basic algorithm OPTCON uses the same information about the stochastic parameters in every time period, the new version OPTCON2 updates the stochastic parameters in each period. Following Kendrick's approach in [2], the update of the stochastic parameters is based on the idea of the Kalman filter, from which we expect more reliable results for stochastic optimum control problems.
2 The problem
The algorithm OPTCON/OPTCON2 is designed to provide approximate solutions to optimum control problems with a quadratic objective function (a loss function to be minimized) and a nonlinear multivariate discrete-time dynamic system under additive and parameter uncertainties. The intertemporal objective function is formulated in quadratic tracking form, which is quite often used in applications of optimum control theory to econometric models. It can be written as
$$J=E\left[\sum_{t=S}^{T}L_t(x_t,u_t)\right] \qquad (1)$$

with

$$L_t(x_t,u_t)=\frac{1}{2}\begin{pmatrix}x_t-\tilde{x}_t\\ u_t-\tilde{u}_t\end{pmatrix}'W_t\begin{pmatrix}x_t-\tilde{x}_t\\ u_t-\tilde{u}_t\end{pmatrix} \qquad (2)$$

$x_t$ is an n-dimensional vector of state variables that describes the state of the economic system at any point in time. $u_t$ is an m-dimensional vector of control variables, which can be controlled in the process. $\tilde{x}_t\in R^n$ and $\tilde{u}_t\in R^m$ are the given 'ideal' levels of the state and control variables, respectively. $S$ denotes the initial and $T$ the terminal time period of the finite planning horizon. $W_t$ is the weight matrix, which can be defined as

$$W_t=\begin{pmatrix}W_t^{xx}&W_t^{xu}\\ W_t^{ux}&W_t^{uu}\end{pmatrix} \qquad (3)$$

where $W_t^{xx}$, $W_t^{xu}$, $W_t^{ux}$ and $W_t^{uu}$ are $(n\times n)$, $(n\times m)$, $(m\times n)$ and $(m\times m)$ matrices, respectively.
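As a rough illustration of how a quadratic tracking loss of this kind can be evaluated along one given path, the following Python sketch stacks the deviations from the ideal levels and applies the weight matrix. It assumes the usual form $\frac{1}{2}d'W_td$ with $d$ the stacked deviation vector; the function names `tracking_loss` and `objective` are hypothetical, and the expectation over the random quantities is not taken here.

```python
def tracking_loss(x, u, x_ideal, u_ideal, W):
    """One-period loss 1/2 * d' W d, where d stacks (x - x_ideal, u - u_ideal).

    Assumed form of the quadratic tracking loss; W is an (n+m) x (n+m)
    nested list holding the blocks W^xx, W^xu, W^ux, W^uu.
    """
    d = [a - b for a, b in zip(x, x_ideal)] + [a - b for a, b in zip(u, u_ideal)]
    size = len(d)
    return 0.5 * sum(d[i] * W[i][j] * d[j] for i in range(size) for j in range(size))


def objective(xs, us, xs_ideal, us_ideal, Ws):
    """Sum of the one-period losses over t = S, ..., T for a single path."""
    return sum(
        tracking_loss(x, u, xi, ui, W)
        for x, u, xi, ui, W in zip(xs, us, xs_ideal, us_ideal, Ws)
    )
```

With one state, one control and an identity weight matrix, a deviation of 1.0 in the state and 0.5 in the control gives a one-period loss of 0.5 * (1.0 + 0.25).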
Next, the dynamic system of nonlinear difference equations is defined:

$$x_t=f(x_{t-1},x_t,u_t,\theta,z_t)+\varepsilon_t,\quad t=S,\dots,T \qquad (4)$$

where $\theta$ is a p-dimensional vector of unknown parameters (parameter uncertainty), $z_t$ denotes an l-dimensional vector of non-controlled exogenous variables, and $\varepsilon_t$ is an n-dimensional vector of additive disturbances (additive uncertainty). $\theta$ and $\varepsilon_t$ are assumed to be independent random vectors with known expectations ($\hat{\theta}$ and $O_n$, respectively) and covariance matrices ($\Sigma^{\theta\theta}$ and $\Sigma^{\varepsilon\varepsilon}$, respectively). $f$ is a vector-valued function, and $f^i(...)$ is the i-th component of $f(...)$, $i=1, ..., n$.
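To make the roles of $\theta$, $z_t$ and $\varepsilon_t$ concrete, here is a minimal Python sketch that simulates a scalar system with additive disturbances. The particular function `f` (a bilinear toy model, explicit in $x_t$), the normal distribution of the disturbances and all names are assumptions made for illustration only; they are not taken from OPTCON.

```python
import random


def f(x_prev, u, theta, z):
    """Hypothetical scalar nonlinear system equation (illustration only)."""
    return theta[0] * x_prev + theta[1] * u + theta[2] * x_prev * u + z


def simulate(x_init, us, theta, zs, sigma_eps, rng):
    """Simulate x_t = f(x_{t-1}, u_t, theta, z_t) + eps_t for t = S, ..., T.

    eps_t is drawn as N(0, sigma_eps^2), which is an assumption of this sketch.
    """
    xs = []
    x = x_init
    for u, z in zip(us, zs):
        x = f(x, u, theta, z) + rng.gauss(0.0, sigma_eps)
        xs.append(x)
    return xs


# Usage: a noise-free path and a noisy path from the same starting value.
clean = simulate(1.0, [1.0, 0.0], (0.5, 1.0, 0.1), [0.0, 0.0], 0.0, random.Random(0))
noisy = simulate(1.0, [1.0, 0.0], (0.5, 1.0, 0.1), [0.0, 0.0], 0.1, random.Random(0))
```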
3 OPTCON2
Input: $f(...)$, the tentative values $x_{S-1}=\overset{\circ}{x}_{S-1}$ and $(u_t)_{t=S}^T=(\overset{\circ}{u}_t)_{t=S}^T$, $\hat{\theta}_{S-1}=\hat{\theta}_{S-1/S-1}$, $\Sigma^{\theta\theta}_{S-1}=\Sigma^{\theta\theta}_{S-1/S-1}$, $E(\varepsilon)=0$ and $\Sigma^{\varepsilon\varepsilon}$, $(z_t)_{t=S}^T$
Output: $(x_t^*)_{t=S}^T$, $(u_t^*)_{t=S}^T$ and $J^*$
A simplified scheme of OPTCON2 is as follows:
Step I Solve the nonlinear system of equations and obtain the tentative state path $(\overset{\circ}{x}_t)_{t=S}^T$. Thus the tentative path $(\overset{\circ}{x}_t, \overset{\circ}{u}_t)_{t=S}^T$ is known.
Step II Generate $MCruns$ sets of random system noises $(\varepsilon^m_t)_{t=S}^T$ and $\mu^m$ (for $\theta^m = \hat{\theta} + \mu^m$), where $m=1, ..., MCruns$.
Step III For each Monte Carlo run $m$, i.e. for each $((\varepsilon^m_t)_{t=S}^T, \mu^m)$, do:
Step III-1 For each $t$ from $S$ to $T$ do
- Find the open-loop solution for the subproblem $(t, ..., T)$: $u_t^*$ and $x_t^{*}=f(x_{t-1}^{a*},u^*_{t}, \theta^m)$
- Calculate $x_t^{a*}=f(x_{t-1}^{a*},u^*_{t}, \hat{\theta}, \varepsilon^m_t)$
- Update $\theta^m$ using $x_t^{*}$ and $x_t^{a*}$: get new $\theta^m$ and $\Sigma^{\theta\theta}$
Step III-2 Calculate the objective function $J^*$
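The control flow of Steps II and III-1 can be sketched in Python for a scalar state and a scalar parameter. This is only a skeleton under stated assumptions: the noises are drawn from normal distributions, and the open-loop solver, the system function and the update rule are passed in as callables. The names `draw_noises`, `passive_learning_run`, `open_loop`, `f` and `update` are hypothetical.

```python
import random


def draw_noises(mc_runs, horizon, sigma_eps, sigma_theta, seed=0):
    """Step II: one (eps path, mu) pair per Monte Carlo run (scalar case)."""
    rng = random.Random(seed)
    return [
        ([rng.gauss(0.0, sigma_eps) for _ in range(horizon)],
         rng.gauss(0.0, sigma_theta))
        for _ in range(mc_runs)
    ]


def passive_learning_run(x_init, theta_hat, sigma_tt, eps, mu, open_loop, f, update):
    """Step III-1 for one run; returns (u*, x*, x^{a*}) for every period."""
    theta_m = theta_hat + mu
    x_a = x_init
    path = []
    for t, e in enumerate(eps):
        u = open_loop(t, x_a, theta_m)       # control for period t from subproblem (t, ..., T)
        x_star = f(x_a, u, theta_m, 0.0)     # x*_t, computed without noise
        x_a_next = f(x_a, u, theta_hat, e)   # x^{a*}_t, computed with the drawn noise
        theta_m, sigma_tt = update(theta_m, sigma_tt, x_a, u, x_star, x_a_next)
        x_a = x_a_next
        path.append((u, x_star, x_a))
    return path
```

Passing the three building blocks as arguments keeps the loop independent of how the open-loop subproblem is solved and of how the parameters are updated.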
Only the new 'update' step is described in more detail here; the other steps are similar to the corresponding steps in OPTCON [1].
In order to update the stochastic parameters we use the idea of the Kalman filter. The update procedure via the Kalman filter consists of two parts: prediction and correction. The prediction part calculates the predicted values of the variables using the corrected estimate from the previous time period. The correction part improves the predicted values using the actual measurement.
Update:
Prediction:
a) $\hat{x}_{t/t-1}=f(x^{a*}_{t-1},u_t^{*},\theta^m_{t-1/t-1})=x^{*}_{t}$, $\theta^m_{t/t-1}=\theta^m_{t-1/t-1}$
b) $\Sigma^{xx}_{t/t-1}=F^x_{\theta,t-1}\Sigma^{\theta\theta}_{t-1/t-1}(F^x_{\theta,t-1})'+\Sigma^{\varepsilon\varepsilon}_t$, $\Sigma^{x \theta}_{t/t-1}=(\Sigma^{\theta x}_{t/t-1})'=F^x_{\theta,t-1}\Sigma^{\theta\theta}_{t-1/t-1}$ and $\Sigma^{\theta\theta}_{t/t-1}=\Sigma^{\theta\theta}_{t-1/t-1}$
where $F^x_{\theta,t-1}$ is the derivative of the function $f$ with respect to $\theta$.
Correction:
a) $\Sigma^{\theta\theta}_{t/t}=\Sigma^{\theta\theta}_{t/t-1}-\Sigma^{\theta x}_{t/t-1}(\Sigma^{xx}_{t/t-1})^{-1}\Sigma^{x\theta}_{t/t-1}$
b) $\theta^m_{t/t}=\theta^m_{t/t-1}+\Sigma^{\theta x}_{t/t-1}(\Sigma^{xx}_{t/t-1})^{-1}[x^{a*}_{t}-x^*_{t}]$ and $\hat{x}_{t/t}=x_t^{a*}$
Thus, using the idea of the Kalman filter, we update $cov(\theta)=\Sigma^{\theta\theta}$ and $\theta^m$.
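For a single state variable and a single parameter, the prediction and correction steps reduce to a few scalar operations. The following Python sketch shows that scalar case only; the function name `kalman_parameter_update` and its argument names are hypothetical, and the derivative `F` of $f$ with respect to $\theta$ is assumed to be supplied by the caller.

```python
def kalman_parameter_update(theta, sigma_tt, x_pred, x_actual, F, sigma_ee):
    """Scalar version of the update step (n = 1, p = 1).

    theta, sigma_tt : parameter estimate and its variance from period t-1
    x_pred          : predicted state x*_t
    x_actual        : realized state x^{a*}_t
    F               : derivative of f with respect to theta
    sigma_ee        : variance of the additive disturbance
    """
    # Prediction b): covariances implied by the previous parameter variance.
    sigma_xx = F * sigma_tt * F + sigma_ee
    sigma_tx = sigma_tt * F          # Sigma^{theta x}, transpose of F * Sigma^{theta theta}
    # Correction a) and b): shrink the variance, shift the estimate.
    gain = sigma_tx / sigma_xx
    sigma_tt_new = sigma_tt - gain * (F * sigma_tt)
    theta_new = theta + gain * (x_actual - x_pred)
    return theta_new, sigma_tt_new


theta_new, sigma_new = kalman_parameter_update(0.5, 0.04, 1.0, 1.1, 2.0, 0.01)
```

In this usage example the realized state exceeds the prediction, so the estimate moves upward, and the parameter variance falls below its previous value of 0.04.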
This update of the stochastic parameters is carried out for each time period in each iteration, using different random noises. Every set of random noises leads to different changes in the stochastic parameters and thus to different results of the control problem. From these one can build a distribution of the results, and statements about the optimal solution of the problem can be derived from its mean and variance. In this way one hopes to obtain more reliable results.
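Summarizing the Monte Carlo results by mean and variance could look as follows in Python. The function name `summarize_runs` is hypothetical, and the use of the sample variance (division by the number of runs minus one) is an assumption of this sketch.

```python
import statistics


def summarize_runs(values):
    """Mean and sample variance of one result (e.g. J*) across Monte Carlo runs."""
    return statistics.mean(values), statistics.variance(values)


mean_j, var_j = summarize_runs([1.0, 2.0, 3.0])
```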
4 Conclusion/ Further work
Finally, I summarize the results and plans of my research on OPTCON with passive learning. The new version of the algorithm with passive learning, OPTCON2, has been developed theoretically as described above and has been implemented in the C# programming language. My plans for the near future include testing OPTCON2 on some existing macroeconomic models and comparing the results with those of OPTCON.
5 Remarks
The open-loop solution is found in the following way:
a) Initialization for the backward recursion
b) Backward recursion: $T, ..., t$
- Linearize the system equations: $x_t=A_t(\theta)x_{t-1}+B_t(\theta)u_t+c_t(\theta)+\xi_t$ for $T, ..., t$
- Minimize $J$ and obtain the feedback rules $u_t^*=G_tx_{t-1}^*+g_t$ $(G_T, ..., G_t$ and $g_T, ..., g_t)$
c) Forward recursion: $t, ..., T$
- $u_t^*=G_tx_{t-1}^*+g_t$
- $x_t^{*}=f(x_{t-1}^{a*},u^*_{t},\theta^m)$ for the time period $t$
- $x_t^{*}=f(x_{t-1}^{*},u^*_{t}, \theta^m)$ for the time periods $t+1, ..., T$
For the time period $t=S$: $x_{S-1}^{a*}=\overset{\circ}{x}_{S-1}$
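The backward recursion can be illustrated for a scalar linearized system $x_t=a_tx_{t-1}+b_tu_t+c_t$ with a deterministic quadratic tracking loss. The Python sketch below is a certainty-equivalent simplification: it ignores the terms that arise from parameter uncertainty, assumes no cross weights between state and control, and uses constant weights `w_x` and `w_u`. The function name `backward_recursion` is hypothetical, and this is not the recursion implemented in OPTCON.

```python
def backward_recursion(a, b, c, w_x, w_u, x_ideal, u_ideal):
    """Feedback rules u_t = G[t] * x_{t-1} + g[t] for a scalar tracking problem.

    One-period loss: 1/2 * w_x * (x_t - x_ideal[t])**2 + 1/2 * w_u * (u_t - u_ideal[t])**2.
    The cost-to-go after period t is kept as 1/2 * H * x_t**2 + h * x_t (constants dropped).
    """
    n = len(a)
    G, g = [0.0] * n, [0.0] * n
    H, h = 0.0, 0.0                      # initialization: no cost beyond T
    for t in reversed(range(n)):
        K = w_x + H                      # total quadratic weight on x_t
        k = h - w_x * x_ideal[t]         # total linear weight on x_t
        denom = K * b[t] ** 2 + w_u
        G[t] = -K * a[t] * b[t] / denom
        g[t] = (w_u * u_ideal[t] - b[t] * (K * c[t] + k)) / denom
        closed = a[t] + b[t] * G[t]      # x_t = closed * x_{t-1} + shift under the rule
        shift = b[t] * g[t] + c[t]
        H_next = K * closed ** 2 + w_u * G[t] ** 2
        h = K * closed * shift + k * closed + w_u * G[t] * (g[t] - u_ideal[t])
        H = H_next
    return G, g


G, g = backward_recursion([1.0, 1.0], [1.0, 1.0], [0.0, 0.0], 1.0, 1.0, [0.0, 0.0], [0.0, 0.0])
```

The forward recursion then applies `u = G[t] * x_prev + g[t]` period by period and feeds each control into the nonlinear system function.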
Stop criteria:
- when the algorithm converges, i.e. when the optimal control and state variables do not change by more than a prespecified small number from one iteration to the next
or
- when a prespecified number of iterations is reached
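The two stop criteria can be combined in one loop, sketched here in Python. The names `has_converged` and `iterate`, the use of the largest absolute change as the distance measure, and the default tolerance and iteration limit are assumptions of this sketch.

```python
def has_converged(old_path, new_path, tol):
    """True if no variable changed by more than tol between two iterations."""
    return max(abs(a - b) for a, b in zip(old_path, new_path)) <= tol


def iterate(step, initial_path, tol=1e-6, max_iter=100):
    """Repeat `step` until convergence or until max_iter iterations are reached.

    Returns the last path, the number of iterations used and a convergence flag.
    """
    path = initial_path
    for iteration in range(1, max_iter + 1):
        new_path = step(path)
        if has_converged(path, new_path, tol):
            return new_path, iteration, True
        path = new_path
    return path, max_iter, False


# Usage with a toy step that halves every value of the path.
result, used, converged = iterate(lambda p: [0.5 * v for v in p], [1.0], tol=0.1)
```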
Bibliography
1. R. Neck; J. Matulka, OPTCON: An Algorithm for the Optimal Control of Nonlinear Stochastic Models. Annals of Operations Research 37, pp. 375-401, 1992.
2. D. A. Kendrick, Stochastic Control for Economic Models. McGraw-Hill, New York, 1981.
3. G. C. Chow, Analysis and Control of Dynamic Economic Systems. John Wiley & Sons, 1975.
4. G. C. Chow, Econometric Analysis by Control Methods. John Wiley & Sons, 1981.
5. A. E. Bryson, Jr.; Yu-Chi Ho, Applied Optimal Control. Hemisphere Publishing Corporation, 1975.
6. D. A. Kendrick; H. M. Amman, A Classification System for Economic Stochastic Control Models. Computational Economics 27, pp. 453-481, 2006.
