Reduced-Rank Envelope (RRE)
Here we explain the basics of fitting the reduced-rank envelope model (Cook, Forzani and Zhang; 2015 Biometrika).
Contents
- Load the package
- Simulate a data set from the RRE model
- Determining the rank and the envelope dimension in RRE model
- Fit the ordinary least squares (OLS) estimator
- Fit the reduced-rank regression estimator
- Fit the response envelope (Y-env) estimator
- Fit the reduced-rank regression estimator
- Asymptotic and bootstrap standard errors for various estimators
- K-fold cross-validation prediction error
Load the package
clear all; cd D:\EnvelopeComputing\EnvelopeMLM; % change directory to the package folder setpaths; % load all the functions in the package rng(2016) % set random seed
Simulate a data set from the RRE model
p = 5; % number of predictors r = 15; % number of responses d = 2; % rank of regression coefficent matrix u = 4; % dimension of the Y-envelope N = 600; % number of observations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % regression coefficient matrix %% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Gamma = orth(rand(r,u)); Gamma0 = null(Gamma'); eta = orth(rand(u,d)); eta0 = null(eta'); B = rand(d,p); if d==u eta = eye(d); end beta = Gamma*eta*B; beta = beta./sqrt(trace(beta'*beta)); %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Covariance matrices %% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Omega = eye(u); for i=1:u for j=1:u Omega(i,j) = (-0.9)^(abs(i-j)); end end Omega0 = eye(r-u); Sigma = Gamma*Omega*Gamma' + 5*Gamma0*Omega0*Gamma0'; SigmaX = eye(p); X = mvnrnd(zeros(p,1),SigmaX,N); eps = mvnrnd(zeros(r,1),Sigma,N); Y = X*beta' + eps;
Determining the rank and the envelope dimension in RRE model
The rank is obtained from asymptotic test, the envelope dimension is obtained from either one of the three methods: AIC, BIC and sequential likelihood-ratio test (LRT).
alpha_d = 0.05; %<- significant level for testing rank alpha_u = 0.05; %<- significant level for likelihood ratio (LRT) test of envelope dimension [drrr, uaic, ubic, ulrt] = dimsRRE(X,Y,alpha_d,alpha_u); disp(['Estimated rank of the regression coefficient matrix: ',num2str(drrr)]); disp(['Estimated envelope dimension: ',... num2str(uaic),'(AIC), ',num2str(ubic),'(BIC), ',num2str(ulrt),'(LRT) ']);
Estimated rank of the regression coefficient matrix: 2 Estimated envelope dimension: 4(AIC), 4(BIC), 4(LRT)
Fit the ordinary least squares (OLS) estimator
Fit the standard model with ordinary least squares. The output is just the regression coefficient matrix, since we are not interested in the slope.
betahat_ols = OLS(X,Y); % r-by-p regression coefficient matrix % Frobenius norm of true beta minus estimator disp(['OLS estimation error is: ', num2str(norm(beta-betahat_ols,'fro'))]);
OLS estimation error is: 0.70319
Fit the reduced-rank regression estimator
Fit the reduced-rank regression (RRR) estimators with two fitting options. One is the maximum likelihood estimator (RRR_ml) under general error covariance assumption. And the other one (RRR_i) is based on minimizing the sum of squares of residual, which is equivalent to the maximum likelihood estimator assuming that the error term is isotropic. In the paper (Cook, Forzani and Zhang (2015)), the option of betahat_RRRi was not discussed.
d = drrr; % rank obtained from the asymptotic test [Ai, Bi, Aml, Bml] = RRR(X,Y,d); betahat_RRRi = Ai*Bi'; betahat_RRRml = Aml*Bml'; disp(['Reduced-rank regression estimation error is: ', num2str(norm(beta-betahat_RRRml,'fro'))]); disp(['Reduced-rank regression (with identity error covariance assumption) estimation error is: ',... num2str(norm(beta-betahat_RRRi,'fro'))]);
Reduced-rank regression estimation error is: 0.44617 Reduced-rank regression (with identity error covariance assumption) estimation error is: 0.47125
Fit the response envelope (Y-env) estimator
Fit the response envelope model (Y-env) by Cook, Li and Chiaromonte (2010; Statistica Sinica), for reducing the multivariate response Y in multivariate linear model. We implemented two options for obtaining the response envelope estimator: FG = 0 or 1 indicating either the 1D algortihm estimation (Cook and Zhang 2016; JCGS) or the Full Grassmannian Optimization, which uses the 1D algorithm to get initial value.
FG = 1; % Full Grassmannian Optimization u = ubic; % the Y-envelope dimension selected by BIC out_yenv = YEnv(X,Y,u,FG); % output: an estimated envelope basis matrix (r-by-dy), and the estimated % regresson coefficient matrix beta (r-by-p) disp(out_yenv); betahat_yenv = out_yenv.beta; % Frobenius norm of true beta minus estimator disp(['Response envelope (FG) estimation error is: ', num2str(norm(beta-betahat_yenv,'fro'))]); FG = 0; % switch to the 1D algorithm estimator (faster but not MLE) out_yenv = YEnv(X,Y,u,FG); betahat_yenv = out_yenv.beta; % Frobenius norm of true beta minus estimator disp(['Response envelope (1D) estimation error is: ', num2str(norm(beta-betahat_yenv,'fro'))]);
Gyhat: [15x4 double] beta: [15x5 double] Response envelope (FG) estimation error is: 0.2899 Response envelope (1D) estimation error is: 0.28978
Fit the reduced-rank regression estimator
Fit the reduced-rank envelope (RRE) estimators with the two fitting options that parallel to the RRR estimators. betahat_RREml is the maximum likelihood estimator under general envelope error covariance assumption. And betahat_RREi is assuming that the material part of error term is isotropic. In the paper (Cook, Forzani and Zhang (2015)), the option of betahat_RREi was not discussed.
d=drrr; u=ubic; % selected rank and dimension FG = 1; % FG optimization, the MLE [A, Bi, Ci, Bml, Cml] = YenvRRR(X,Y,u,d,FG); betahat_RREml = A*Bml*Cml'; disp(['RRE (FG) estimation error is: ', num2str(norm(beta-betahat_RREml,'fro'))]); FG = 0; % 1D algorithm estimation [A, Bi, Ci, Bml, Cml] = YenvRRR(X,Y,u,d,FG); betahat_RREi = A*Bi*Ci'; betahat_RREml = A*Bml*Cml'; disp(['RRE (1D) estimation error is: ', num2str(norm(beta-betahat_RREml,'fro'))]); disp(['RRE (with identity error covariance assumption) estimation error is: ',... num2str(norm(beta-betahat_RREi,'fro'))]);
RRE (FG) estimation error is: 0.27732 RRE (1D) estimation error is: 0.27778 RRE (with identity error covariance assumption) estimation error is: 0.28864
Asymptotic and bootstrap standard errors for various estimators
Given the envelope dimension and the rank, we get the asymptotic standard errors as follows. All estimators are maximum likelihood estimators.
[ase_ols, ase_rrr, ase_env, ase_rre] = Asymse_rre(X,Y,d,u,FG); % To see ratio between two methods, for example, the OLS estimator versus % the RRE estimator, we can use the following command to see the maximum and minimum % of the standard error ratios. disp('Maximum and minimum asymptotic standard error ratios of OLS versus RRE'); disp([max(ase_ols./ase_rre) min(ase_ols./ase_rre)]) disp('Maximum and minimum asymptotic standard error ratios of Envelope versus RRE'); disp([max(ase_env./ase_rre) min(ase_env./ase_rre)]) disp('Maximum and minimum asymptotic standard error ratios of RRR versus RRE'); disp([max(ase_rrr./ase_rre) min(ase_rrr./ase_rre)])
Maximum and minimum asymptotic standard error ratios of OLS versus RRE 9.5174 1.6898 Maximum and minimum asymptotic standard error ratios of Envelope versus RRE 2.1260 1.0005 Maximum and minimum asymptotic standard error ratios of RRR versus RRE 6.8852 1.1380
Given the envelope dimension and the rank, we get the bootstrap standard errors as follows. Moreover, we need to specify the number of bootstrap nboot.
nboot = 10;
[bse_ols, bse_rrr, bse_env, bse_rre] = bootse_rre(X,Y,d,u,FG,nboot);
disp('Maximum and minimum bootstrap standard error ratios of OLS versus RRE');
disp([max(bse_ols./bse_rre) min(bse_ols./bse_rre)])
Maximum and minimum bootstrap standard error ratios of OLS versus RRE 9.2243 0.7851
K-fold cross-validation prediction error
Given the envelope dimension and the rank, compute the k-fold cross-validation prediction errors for various likelihood-based estimators. "nsim" specifies the numbers of repeated k-fold cross-validation.
nsim = 1; k=10; [pe_ols, pe_rrr, pe_env, pe_rre] = kFoldCV_rre(X,Y,d,u,FG,nsim,k); disp(['OLS 10-fold cross-validation MSE is: ',num2str(pe_ols)]) disp(['RRR 10-fold cross-validation MSE is: ',num2str(pe_rrr)]) disp(['Response envelope 10-fold cross-validation MSE is: ',num2str(pe_env)]) disp(['Reduced-rank envelope 10-fold cross-validation MSE is: ',num2str(pe_rre)])
OLS 10-fold cross-validation MSE is: 60.1263 RRR 10-fold cross-validation MSE is: 59.8341 Response envelope 10-fold cross-validation MSE is: 59.5407 Reduced-rank envelope 10-fold cross-validation MSE is: 59.7048