 Details
 Parent Category: Programming Assignments' Solutions
We Helped With This MATLAB Programming Homework: Have A Similar One?
Category  Programming 

Subject  MATLAB 
Difficulty  Undergraduate 
Status  Solved 
More Info  Online Matlab Expert 
Short Assignment Requirements
Assignment Description
• Plot the ‘convergence’ of these estimated parameters for the first 40 EM iterations, so that you can visually see that the estimates ‘flatten out’. Paste this picture into your report.
• Run 1,000 EM steps and complete the following table:
EM estimates (10^{3 }iterations)  ⇢_{1}  p11  p22  µ_{1}  µ_{2}  1  2  LogL 









· · · · · · · ·
• Plot the smoothed probability of being in the high volatility state, evaluated at the EM parameters after 1,000 EM iterations, and paste this picture into your report.
• In no more than five sentences, discuss why the EM parameters and the log likelihood obtained after 1,000 iterations are similar and/or di↵erent to the parameters and log likelihood obtained in question 1b.
QUESTION 2 (State space models) [4.5 points]
The data for this question can be found in the file data.csv, which contains the following four variables sampled at a monthly frequency:
• US EMPLOYED  NONFARM INDUSTRIES TOTAL (PAYROLL SURVEY) VOLA US
• PERSONAL INCOME LESS TRANSFER PAYMENTS (BCI 51) CONA
• US INDUSTRIAL PRODUCTION  TOTAL INDEX VOLA
• US MANUFACTURING & TRADE SALES CONA
For each series, we define the (annualised) log return as r_{i,t }= 12⇤100⇤log(y_{i,t}/y_{i,t }_{1}) for i = 1,...,4. Subsequently, we ‘demean’ each return series r_{i,t }by subtracting its mean. From this point onwards, the symbol r_{i,t }denotes the demeaned annualised log return at time t for series i.
2a) Estimate a simple AR(1) + noise model on r_{1,t}, which is the series for employment, using Maximum Likelihood (ML). Plot r_{i,t }and your smoothed state estimates ⇠_{1,t}T together in one graph. Copy this graph into your answer sheet. Using the ML parameters based on the whole sample, state the correlation between ⇠_{1,t}t _{1 }and r_{1,t}. In no more than three sentences, discuss the performance of this predictor. To answer this question, you can mostly use the demo code you have been provided with.
2b) Can you use the same approach to predict r_{2,t+1}? That is, by estimating a simple AR(1)+noise model on the second data series? In no more than five sentences, explain why this works better or worse than for the first data series.
2c) We suspect that the four series are driven by the same underlying factor, which is the ‘state of the economy’. For this reason, we consider the following ‘factor model’ in state space form:
r1,t
1 0 0 0 ⇠1,t r3,t B
B0@B AC^{C1 }B0BB@ H11 H12 H13 H14 CC1CAC^{0 }B@B0BB ⇠02,t,t CCC1CA
r2,t
= 0 1 0 0 ⇠ ,
0 0 1 0 ⇠3,t
r4,t
0 0 0 1 ⇠4,t
with
B 
⇠1,t+1 0 F 0 0 0 ⇠1_{,t}
0B ⇠0,t+1 1 B0^{BB}B@ F00 011 022 033 044 CA1C^{CC}B^{B}@BB0 ⇠02,t,t ^{C}A1C^{C}C^{ }t
⇠2,t+1 CC^{C}CA = 0 0 F 0 0 ⇠ + v ,
B@t ⇠34,t,t+1+1 0 0 0 F 0 ⇠_{3,t}
⇠ 0 0 0 0 F ⇠_{4,t }and where vis independently normally distributed for all time t with mean zero and covariance matrix 0BB@ 1 0_{11 }022 033 0 1CCCCA
0 Q 0 0 0
Q =0 0 Q 0 0 . 0 0 0 Q 0
0 0 0 0 Q_{44}
We have normalised the innovation of the ‘state of the economy’ to have variance 1. Adjust the demo code to accommodate for this higher dimensionality and estimate 13 unknown parameters by Maximum Likelihood (ML). Report your results in the form given below. Display two decimal places for each parameter estimate and one decimal place for the resulting log likelihood:
H
2d) Plot the filtered and smoothed estimates ⇠ˆ_{t T }and copy the picture into your

report. In no more than five sentences, discuss your findings.
2e) Use the ML parameters based on the entire data set to generate one step ahead predications for r_{2,t }and r_{2,t}. What is the correlation between your predictions and the actual r_{i,t}? In no more than three sentences, explain why this is better or worse than your answer in question 2b.
2f) Consider jointly the two series for employment and industrial production. Use the expectation maximisation (EM) algorithm to estimate 10 parameters in the unrestricted model
rt = ⇠t + wt, ⇠t+1 = F⇠t + vt+1,
where w_{t }and v_{t }are normally distributed with mean zero and covariance matrices R and Q, respectively. What are your parameter estimates after 1,000 iterations? Comment on the speed of convergence.
QUESTION 3 (Synthesis) [1 point]
3a) In this assignment, we have estimated (annualised log) changes in unemployment using Markov switching models and state space models. In no more than five sentences, discuss advantages and disadvantages of both approaches.
BONUS QUESTIONS [1 point]
4a) Bonus question. Consider the following two time series:
• US EMPLOYED  NONFARM INDUSTRIES TOTAL (PAYROLL SURVEY) VOLA US
• US INDUSTRIAL PRODUCTION  TOTAL INDEX VOLA
It is expected that the changes in employment and industrial production are correlated to some extent. Furthermore, this correlation might be higher in times of crisis. To investigate whether this is the case, consider again a twostate Markov switching model, except now the observation r_{t }is twodimensional. That is, the observation consists of a log return in employment, denoted by r_{1,t}, and a log return in industrial production, denoted by r_{2,t}, which are assumed to be jointly normally distributed with mean µ_{S}_{t }and covariance matrix ⌃_{S}_{t}, where both the mean and the covariance depend on the state S_{t}. Estimate this model by EM and report your estimates. Interpret your results in no more than five sentences.
4b) Bonus question. Consider the EM procedure proposed at the top of page 513 of ‘Economic Forecasting’ by Elliott & Timmermann (2016), which is paraphrased here for your convenience, and using our notation. They suggest to (i) pick some H, F, Q and R, (ii) run the Kalman filter to obtain filtered state estimates ⇠ˆ_{t}t and (iii) estimate the four system matrices by running linear regressions on the equations
y_{t }= H^{0}⇠ˆ_{t}_{}t + w_{t}, w_{t }⇠ N(0,R),
⇠ˆ_{t}t = F⇠ˆ_{t }_{1}t _{1 }+ v_{t}, v_{t }⇠ N(0,Q),
where y_{t }and ⇠ˆ_{t}t are treated as known, while H, F, Q and R are estimated by OLS. The idea is to repeat steps (ii) and (iii) until convergence. To disprove this procedure, it is su cient to show that the true parameters are not a fixed point. That is, derive the distribution of y_{t }H0⇠ˆ_{t}t under the true parameters. Similarly, derive the distribution of ⇠ˆ_{t}t F⇠ˆ_{t }_{1}t _{1 }under the true parameters.
Assignment Code
%% Clear all data
clear
%% Load correct directory
% for Mac users, it should look like this:
pad = '/Users/RutgerJan/Desktop/Demo_state_space_EM_1d';
% for PC users, it should look like this
%pad = 'C:/Users/61849rla/Dropbox/Teaching/Matlab/Demo_state_space_EM_1d';
% load this directory
cd(pad);
format short
%% Set some random model constants
Q0 = rand(); % variance of state innovation
R0 = 4*rand(); % variance of observation noise
F0 = 0.9 + 0.1 * rand(); % persistence of the state
%% Length of series
T = 1/2*10^3;
%% DGP is AR(1) + noise
% xi(t) = F * xi(t1) + v
% y(t) = xi(t) + w
%% Generate data
for t=1:T
v(t) = mvnrnd(0,Q0,1);
w(t) = mvnrnd(0,R0,1);
if t==1;
xi0(t) = 0;
y(t) = xi0(t) + w(t);
else
xi0(t) = F0 * xi0(t1) + v(t);
y(t) = xi0(t) + w(t);
end
end
%clear xi0
figure
plot(y,'k')
hold on
plot(xi0,'g','Linewidth',2)
%% 3x Maximum likelihood optimisation using prediction error decomposition
% Random starting values, which are inspired roughly by the data
dy = y(1,2:end)y(1,1:end1);
% stating values for F, Q and R
startingvalues=[ 0.9+0.1*rand() ; 2*rand()*1/3*var(dy) ; 2*rand()*2/3*var(dy) ];
% Optimisation procedure 1
clearvars options
options = optimset('fminunc');
[ML_parameters1,ML_LogL1]=fminunc('LogLikelihood', startingvalues ,options,y(1,:))
% Optimisation procedure 2
clearvars options
options = optimset('fminsearch');
[ML_parameters2,ML_LogL2]=fminsearch('LogLikelihood', startingvalues ,options,y(1,:))
% Optimisation procedure 3
clearvars options
options = optimset('fmincon');
lb = [0;0;0];
[ML_parameters3,ML_LogL3]=fmincon('LogLikelihood', startingvalues,[],[],[],[],lb,[],[],options,y(1,:))
%% Note there is actually a small difference between different optimisation routines
ML_parameters1ML_parameters2
ML_parameters1ML_parameters3
ML_parameters3  ML_parameters2
%% The heights are also slightly different
format long
ML_LogL1ML_LogL2
ML_LogL1ML_LogL3
ML_LogL3ML_LogL2
%% Run KF and smoother at the estimated parameters
[ xi,P,predictedxi,predictedP ] = KalmanFilter( ML_parameters1(1),ML_parameters1(2),ML_parameters1(3),y );
[ smoothedxi,smoothedP,smoothedPcross,xi0_out,P0_out ] = KalmanSmoother( ML_parameters1(1),ML_parameters1(2),ML_parameters1(3),y );
figure
plot(y,'k')
hold on
plot(xi,'g','Linewidth',2)
hold on
plot(smoothedxi,'r','Linewidth',3)
hold off
%% Calculation of standard errors
ML_std1 = sqrt( diag ( inv( fdhess6('LogLikelihood',ML_parameters1,y) )))
ML_std2 = sqrt( diag ( inv( fdhess6('LogLikelihood',ML_parameters2,y) )))
ML_std3 = sqrt( diag ( inv( fdhess6('LogLikelihood',ML_parameters3,y) )))
%% Comparison of true parameters with ML parameters
format short
[[F0;Q0;R0],ML_parameters1,ML_std1]
[[F0;Q0;R0],ML_parameters2,ML_std2]
[[F0;Q0;R0],ML_parameters3,ML_std3]
%% Run EM to confirm estimates above
% Pick random starting values
F = 0.9 + 0.1 * rand(); % some number between 0.9 and 1
Q = (1/2 + rand() ) * 1/3 * var(y(1,2:end)y(1,1:end1)); % vaguely inspired by the data
R = (1/2 + rand() ) * 2/3 * var(y(1,2:end)y(1,1:end1)); % vaguely inspired by the data
%% Run EM
% Run a bunch of EM steps
format long
for k=1:10000
[F,Q,R]=EM_step(F,Q,R,y);
[F;Q;R]
end
%% Compare the 3 ML routines with EM
[ML_parameters1(1);ML_parameters2(1);ML_parameters3(1);F]
[ML_parameters1(2);ML_parameters2(2);ML_parameters3(2);Q]
[ML_parameters1(3);ML_parameters2(3);ML_parameters3(3);R]
%% Compare the loglikelihoods that are obtained
EM_LogL = LogLikelihood([F;Q;R],y);
[ML_LogL1,ML_LogL2,ML_LogL3,EM_LogL]
EM_LogLML_LogL3
Assignment Code
function [ F_out,Q_out,R_out ] = EM_step( F_in , Q_in , R_in , y )
% When running EM, we are really looking for a fixed point, such that "output = input" holds.
% If this holds, then we are done.... but generally it doesn't, which is why
% we iterate. In theory we should iterate an infinite number of times to
% get convergence.
% However, in practice, we use some convergence criterion, which says that we stop
% when the ouput and input differ by 'only a little bit'.
% For example, we could stop when the input and output are the same up to M decimal places,
% for some prespecified precision level M.
%% Extract length of the data
T = size(y,2);
% Diffuse initialisation
xi0 = 0;
P0 = 10^6; % make sure this is the same as in the filter and smoother
% In this demo code, it is assumed that the initial state is
% truly random: anything could happen. Thus each EM step starts with a
% diffuse prior. Since we are comparing with maximum likelihood, which is also initiated
% with a diffuse prior, it is only fair that we do the same for EM.
% (Recall, however, that for Markov switching models, we solved separately
% for rho, which was the initial distribution of the state. Here we make the conscious choice not to do this.)
%% Estep: Run the Kalman smoother
[smoothedxi,smoothedP,smoothedPcross,xi0_out,P0_out] = KalmanSmoother( F_in , Q_in , R_in , y) ;
% The Kalman smoother has more outputs than usual. It outputs the cross
% covariance smoothedPcross, as well as xi0_out and P0_out.
%% Optimise over F
for t=1:T
if t==1
numerator(t) = smoothedxi(t)* xi0_out' + smoothedPcross(t) ;
else
numerator(t) = smoothedxi(t)*smoothedxi(:,t1)' + smoothedPcross(t) ;
end
end
for t=1:T
if t==1
denominator(t) = xi0_out * xi0_out' + P0_out ;
else
denominator(t) = smoothedxi(t1) * smoothedxi(t1)' + smoothedP(t1) ;
end
end
F_out = mean(numerator,2) / mean(denominator,2);
% note that A / B is the same as A * inv(B)
clear numerator denominator
%% Optimise over Q
for t=1:T
if t==1
Q_estimate(t) = smoothedxi(t) * smoothedxi(t)' + smoothedP(t)  F_out * ( xi0_out * smoothedxi(t)' + smoothedPcross(t)' )  ( smoothedxi(t) * xi0_out' + smoothedPcross(t) ) * F_out' + F_out * ( xi0_out * xi0_out' + P0_out ) * F_out' ;
else
Q_estimate(t) = smoothedxi(t) * smoothedxi(t)' + smoothedP(t)  F_out * ( smoothedxi(t1) * smoothedxi(t)' + smoothedPcross(t)' )  ( smoothedxi(t) * smoothedxi(t1)' + smoothedPcross(t) ) * F_out' + F_out * ( smoothedxi(t1) * smoothedxi(t1)' + smoothedP(t1) ) * F_out' ;
end
end
Q_out = mean(Q_estimate,2) ;
clear Q_estimate
%% Optimise over R
for t=1:T
R_estimate(t) = y(t)*y(t)'  smoothedxi(t) * y(t)'  y(t) * smoothedxi(t)' + ( smoothedxi(t) * smoothedxi(t)' + smoothedP(t) ) ;
end
R_out = mean(R_estimate,2) ;
clear R_estimate
%%Close the function
end
Assignment Code
function H = fdhess(f,x,varargin)
% PURPOSE: Computes finite difference Hessian
% 
% Usage: H = fdhess(func,x,varargin)
% Where: func = function name, fval = func(x,varargin)
% x = vector of parameters (n x 1)
% varargin = optional arguments passed to the function
% 
% RETURNS:
% H = finite differnce hessian
% 
% Code from:
% COMPECON toolbox [www4.ncsu.edu/~pfackler]
% documentation modified to fit the format of the Ecoometrics Toolbox
% by James P. LeSage, Dept of Economics
% University of Toledo
% 2801 W. Bancroft St,
% Toledo, OH 43606
% ...
eps = 1e7;
n = size(x,1);
fx = feval(f,x,varargin{:});
% Compute the stepsize (h)
h = eps.^(1/3)*max(abs(x),1e2);
xh = x+h;
h = xhx;
ee = sparse(1:n,1:n,h,n,n);
% Compute forward step
g = zeros(n,1);
for i=1:n
g(i) = feval(f,x+ee(:,i),varargin{:});
end
H=h*h';
% Compute "double" forward step
for i=1:n
for j=i:n
H(i,j) = (feval(f,x+ee(:,i)+ee(:,j),varargin{:})g(i)g(j)+fx)/H(i,j);
H(j,i) = H(i,j);
end
end
Assignment Code
function [ xi,P,predictedxi,predictedP ] = KalmanFilter( F,Q,R,y )
% This function runs the Kalman filter for the scalar AR(1) model plus
% noise with a diffuse prior
% Extract lenght of the data
T = size(y,2);
% Diffuse initialisation
xi0 = 0;
P0 = 10^6; % make sure this is the same as the one used in the smoother!
% The Kalman filter
for t=1:T
if t<2
% The first prediction is based on xi0 and P0
predictedxi(t) = F * xi0;
predictedP(t) = F * P0 * F'+ Q;
else
% Further predictions are based on previous updates
predictedxi(t) = F * xi(t1);
predictedP(t) = F * P(t1) * F' + Q;
end
% Update
xi(t) = predictedxi(t) + predictedP(t) * inv( predictedP(t) + R ) * ( y(t)  predictedxi(t) );
P(t) = predictedP(t)  predictedP(t) * inv( predictedP(t) + R ) * predictedP(t);
% Close the loop over time
end
% Close the function
Assignment Code
function [ smoothedxi,smoothedP,smoothedPcross,xi0_out,P0_out ] = KalmanSmoother( F , Q , R , y )
% Extract length of the data
T = size(y,2);
% Diffuse initialisation
xi0 = 0;
P0 = 10^6; % make sure this is the same as the one used in the filter!
% Run the Kalman filter
[xi,P,predictedxi,predictedP]=KalmanFilter( F , Q , R , y);
% Initialise the smoother at the end
smoothedxi(T) = xi(T);
smoothedP(T) = P(T);
% Run the Kalman smoother
for j=1:(T1)
% Loop backwards through time
t = Tj;
% Run the smoother
smoothedxi(t) = xi(t) + P(t) * F' * inv( predictedP(t+1) ) * ( smoothedxi(t+1)  predictedxi(t+1) ) ;
smoothedP(t) = P(t)  P(t) * F' * inv( predictedP(t+1) ) * ( predictedP(t+1)  smoothedP(t+1) ) * inv( predictedP(t+1) ) * F * P(t) ;
end
% Calculate the xi0 and P0 separately
xi0_out = xi0 + P0 * F' * inv( predictedP(1) ) * ( smoothedxi(1)  predictedxi(1) ) ;
P0_out = P0  P0 * F' * inv( predictedP(1) ) * ( predictedP(1)  smoothedP(1) ) * inv( predictedP(1) ) * F * P0 ;
% Run the loop to get the cross terms where smoothedPcross(t)=P_{t,t1T}
for t=1:T
if t==1
smoothedPcross(t) = smoothedP(t) * inv( predictedP(t) ) * F * P0;
else
smoothedPcross(t) = smoothedP(t) * inv( predictedP(t) ) * F * P(t1);
end
end
% Close the function
end
Assignment Code
function [output]=LogLikelihood(parameter_vector,y)
% Extract the stuff we need from the input arguments
F = parameter_vector(1,1);
Q = parameter_vector(2,1);
R = parameter_vector(3,1);
% Run the Kalman filter
[~,~,predictedxi,predictedP]=KalmanFilter(F,Q,R,y);
% Collect a row vector of log likelihood per observation
LL = log( normpdf( y , predictedxi , sqrt(predictedP+R) ) );
% Sum over all observations
output =  sum( LL(1:end) ) ;
end