This supplemental document provides R code to compute the joint impulse response function (jIRF) and joint forecast error variance decomposition (jFEVD) proposed in "A Joint Impulse Response Function for Vector Autoregressive Models." For comparison, the code also calculates the generalized impulse response function (gIRF) and generalized forecast error variance decomposition (gFEVD) of Pesaran and Shin (1998, "Generalized impulse response analysis in linear multivariate models"). In addition, this document provides the code to produce the High-Low-Open-Close (HLOC) volatility measures used in the paper and the methods used to construct the European summary variables for volatility used in the application.
First load your data into R and estimate the \(K\)-variable VAR(\(P\)) model using your estimation package of
choice (e.g., the vars package in R). Then obtain the VMA representation
of the VAR:
\[
\mathbf{Y_t} =\mathbf{c}+\sum_{h=0}^{\infty}\mathbf{A}_h
\mathbf{\epsilon}_{t-h},
\] where \(\mathbf{A}_0=\mathbf{I}_K\). Save the \(K \times K\) shock covariance matrix \(\mathbf{\Sigma}_{\epsilon}=E \left( \mathbf{\epsilon}_t \mathbf{\epsilon}'_t \right)\) and denote it in your R code as sigma_eps. Also save the VMA coefficient matrices up to at least the chosen time horizon as a \(K \times K \times (\geq H)\) array denoted as A in your R code. The horizon \(H\) counts the number of impulse responses computed: to measure only the immediate response (\(h=0\)), pick \(H=1\); to measure the immediate and one-step-ahead responses (\(h=0,1\)), pick \(H=2\); to measure the immediate, one-step, and two-step-ahead responses (\(h=0,1,2\)), pick \(H=3\); and so on. Save at least two VMA coefficient matrices (\(\mathbf{A}_0\) and \(\mathbf{A}_1\)) even if \(H=1\), and save at least as many VMA coefficient matrices as are required to compute the impulse responses up to the chosen horizon (that is, the size of the third dimension of A must be greater than or equal to \(H\)); otherwise an error will result.
The required inputs are summarized in the following table.

Mathematical Notation | R Code Notation | Dimension | Description |
---|---|---|---|
\(\mathbb{J}\) | setJ | \(1 \times \#\mathbb{J}\) | set of variables where the shocks originate for the jIRF & jFEVD |
\(j\) | j | \(1 \times 1\) | variable where the shock originates for the gIRF |
\(\mathbf{\Sigma}_\epsilon\) | sigma_eps | \(K \times K\) | shock covariance matrix |
\(\mathbf{A}_0,...,\mathbf{A}_{H-1},...\) | A | \(K \times K \times (\geq H)\) | VMA coefficient matrices |
\(H\) | H | \(1 \times 1\) | forecast horizon |
The computed outputs, with references to the corresponding equations in the main text, are summarized in the following table.

Mathematical Notation | R Code Notation | Equation Reference | Dimension | Description |
---|---|---|---|---|
\(\mathbf{jIRF}(h, \mathbf{\delta}_{\mathbb{J}}=\mathbf{s}_{\mathbb{J}}, \mathbf{\Omega}_{t-1}),\) \(h=0,1,...,H-1\) | jIRF | (12) | \(K \times H\) | joint impulse response function |
\(\zeta_{i|\mathbb{J}}^{jnt}(H),\) \(i=1,2,...,K\) | jFEVD | (26) | \(K \times 1\) | joint forecast error variance decomposition |
\(\mathbf{gIRF}(h, \delta_j=\sqrt{\sigma_{jj}}, \mathbf{\Omega}_{t-1}),\) \(h=0,1,...,H-1\) | gIRF | (7) | \(K \times H\) | generalized impulse response function |
\(\zeta_{ik}^{gen}(H),\) \(i,k=1,2,...,K\) | gFEVD | (24) | \(K \times K\) | generalized forecast error variance decomposition |
The code below requires five inputs. (1) Choose the set \(\mathbb{J}\) of variables where the joint, simultaneous shocks originate for the jIRF and jFEVD, and save it as a \(1 \times \#\mathbb{J}\) vector labeled setJ. (2) Choose \(j\), the variable where the shock originates for the gIRF, and save it as a scalar labeled j. (3) Save the shock covariance matrix \(\mathbf{\Sigma}_\epsilon\) as a \(K \times K\) matrix labeled sigma_eps. (4) Save at least \(H\) of the VMA coefficient matrices \(\mathbf{A}_0, \mathbf{A}_1, ..., \mathbf{A}_{H-1},...\) as a \(K \times K \times (\geq H)\) array labeled A. (5) Choose the time horizon \(H\) and save it as a scalar labeled H.
These first steps are not included in the R code. However, we provide a simple numeric example in the code that sets \[\begin{equation*} \mathbf{\Sigma}_{\epsilon}=\begin{bmatrix} 1 & 0.5 & -0.1 & 0.1 \\ 0.5 & 1 & 0.8 & 0.1 \\ -0.1 & 0.8 & 1 & 0.1 \\ 0.1 & 0.1 & 0.1 & 1 \end{bmatrix} \end{equation*}\] and \[\begin{equation*} \mathbf{A}_0=\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \;\;\;\;\;\; \mathbf{A}_1=\begin{bmatrix} 0.55 & 0.1 & 0.1 & 0.1 \\ 0.1 & 0.55 & 0.1 & 0.1 \\ 0.1 & 0.1 & 0.55 & 0.1 \\ 0.1 & 0.1 & 0.1 & 0.55 \end{bmatrix} \;\;\;\;\;\; \mathbf{A}_2=\begin{bmatrix} 0.3325 & 0.13 & 0.13 & 0.13 \\ 0.13 & 0.3325 & 0.13 & 0.13 \\ 0.13 & 0.13 & 0.3325 & 0.13 \\ 0.13 & 0.13 & 0.13 & 0.3325 \\ \end{bmatrix} \end{equation*}\] \[\begin{equation*} \mathbf{A}_3=\begin{bmatrix} 0.221875 & 0.13075 & 0.13075 & 0.13075 \\ 0.13075 & 0.221875 & 0.13075 & 0.13075 \\ 0.13075 & 0.13075 & 0.221875 & 0.13075 \\ 0.13075 & 0.13075 & 0.13075 & 0.221875 \end{bmatrix} \;\;\;\;\;\; \mathbf{A}_4=\begin{bmatrix} 0.1612563 & 0.12025 & 0.12025 & 0.12025 \\ 0.12025 & 0.1612563 & 0.12025 & 0.12025 \\ 0.12025 & 0.12025 & 0.1612563 & 0.12025 \\ 0.12025 & 0.12025 & 0.12025 & 0.1612563 \end{bmatrix} \end{equation*}\] \[\begin{equation*} \mathbf{A}_5=\begin{bmatrix} 0.1247659 & 0.1063131 & 0.1063131 & 0.1063131 \\ 0.1063131 & 0.1247659 & 0.1063131 & 0.1063131 \\ 0.1063131 & 0.1063131 & 0.1247659 & 0.1063131 \\ 0.1063131 & 0.1063131 & 0.1063131 & 0.1247659 \end{bmatrix} \end{equation*}\] Thus, in this example, \(K=4\) and \(H=6\). Delete or comment out this example from the R code when estimating the VAR parameters. These example VMA coefficient matrices correspond to the \(\mathbf{\Phi}_1\) VAR coefficient matrix in equation (21) of the main paper. The shock covariance matrix corresponds to the upper-left panel of figure 1 in the main paper. In our example, we also set \(\mathbb{J}=\{2,3\}\) and \(j=2\).
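Because the example is a VAR(1) model, the VMA coefficient matrices above are simply powers of the VAR coefficient matrix, \(\mathbf{A}_h = \mathbf{\Phi}_1^h\). As a quick arithmetic check, each diagonal entry of \(\mathbf{A}_2\) is \(0.55^2 + 3(0.1)^2 = 0.3325\) and each off-diagonal entry is \(2(0.55)(0.1) + 2(0.1)^2 = 0.13\), matching \(\mathbf{A}_2\) above.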
Once the inputs are set, the provided code will calculate four results: (1) the joint impulse response function measuring how \(\mathbf{Y}\) responds due to joint, simultaneous one standard deviation shocks from the variables in set \(\mathbb{J}\), (2) the joint forecast error variance decomposition vector whose element \(i\) measures the proportion of the \(H\)-step forecast error variance of variable \(i\) that can be explained by the shocks of the variables in set \(\mathbb{J}\), (3) the generalized impulse response function measuring how \(\mathbf{Y}\) responds due to a one standard deviation shock from variable \(j\), and (4) the generalized forecast error variance decomposition matrix whose \(i,k\) element measures the proportion of the \(H\)-step forecast error variance of variable \(i\) that can be explained by the shocks of variable \(k\). See the main text of the paper for details.
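After the inputs are set, a defensive check along the following lines (a minimal sketch, not part of the original script) can catch dimension errors early:

stopifnot(dim(sigma_eps)[1] == dim(sigma_eps)[2]) #sigma_eps must be K by K.
stopifnot(dim(A)[3] >= H) #Need at least H VMA coefficient matrices.
stopifnot(all(setJ %in% 1:dim(sigma_eps)[1])) #setJ must index variables in 1,...,K.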
#R code to calculate the joint impulse response function, the joint forecast error...
#variance decomposition, the generalized impulse response function, and the generalized...
#forecast error variance decomposition.
#The user must first estimate the VAR and obtain the VMA coefficient matrices (A) and...
#the shock covariance matrix (sigma_eps). This step is not shown. Note that the "vars"...
#package calls the VMA coefficient matrices Phi, but this code calls them A.
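#For reference, a minimal sketch of this estimation step using the "vars" package...
#(assuming the data are in a matrix or data frame Y and a lag order P has been chosen):
# fit=vars::VAR(Y, p=P) #Estimate the VAR(P) model.
# A=vars::Phi(fit, nstep=H-1) #K by K by H array of VMA coefficient matrices.
# sigma_eps=summary(fit)$covres #K by K shock (residual) covariance matrix.
#These commented lines are illustrative only and are not run in this script.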
##########################################################################################
#Inputs: where are the originating shocks occurring?
#Pick the variables in setJ for the jIRF and jFEVD.
setJ=array(c(2,3)) #For example, variables 2 and 3.
#Pick which variable is variable j for the gIRF.
j=2 #For example, variable 2.
H=6 #Set the forecast horizon (H). E.g., if H=6, then the code will calculate the...
#impulse response for h=0,1,2,3,4,5 time steps ahead and the forecast error variance...
#decomposition will use the 6-step forecast error variance.
#Set example values for A and sigma_eps. Users should delete or comment out these...
#example matrices when estimating the VAR parameters from data.
#Example sigma_eps from the top left graph in figure 1 of the main text.
sigma_eps=array(c(1,0.5,-0.1,0.1,0.5,1,0.8,0.1,-0.1,0.8,1,0.1,0.1,0.1,0.1,1), dim=c(4,4))
#Example Phi from equation (21) of the main text.
Phi=array(c(0.55,0.1,0.1,0.1,0.1,0.55,0.1,0.1,0.1,0.1,0.55,0.1,0.1,0.1,0.1,0.55), dim=c(4,4))
A=array(NaN,dim=c(4,4,6)) #Create empty array for VMA coefficient matrices.
#Because our example is a VAR(1) model, A[,,h+1] is simply Phi raised to the h power.
A[,,1]=diag(4) #4-dimensional identity matrix.
A[,,2]=Phi
A[,,3]=Phi%*%Phi
A[,,4]=Phi%*%Phi%*%Phi
A[,,5]=Phi%*%Phi%*%Phi%*%Phi
A[,,6]=Phi%*%Phi%*%Phi%*%Phi%*%Phi
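#Optional sanity check: the example VAR(1) is stationary because the largest eigenvalue...
#of Phi equals 0.55 + 3*0.1 = 0.85 < 1 in absolute value.
#max(Mod(eigen(Phi)$values)) #Returns 0.85.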
##########################################################################################
#Set the number of variables in the VAR (K).
K=dim(sigma_eps)[1]
#Identity matrix and selection matrix.
I_K=diag(K) #K by K identity matrix.
e_setJ=I_K[,setJ] #Selection matrix using the setJ columns of the identity matrix.
#Calculate the joint impulse response function (jIRF) vector.
s_setJ=sqrt(diag(t(e_setJ)%*%sigma_eps%*%e_setJ)) #setJ shock standard deviations.
jIRF=array(0,dim=c(K,H))
for (h in 1:H){
jIRF[,h]=A[,,h]%*%sigma_eps%*%e_setJ%*%solve(t(e_setJ)%*%sigma_eps%*%e_setJ)%*%s_setJ
}
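#Sanity check: if setJ contains only the single variable j, the expression above reduces...
#algebraically to A[,,h]%*%sigma_eps%*%I_K[,j]/sqrt(sigma_eps[j,j]), which equals the...
#gIRF computed below, so the jIRF nests the gIRF as a special case.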
#Calculate the forecast error covariance matrix (Xi).
Xi_h=array(0,dim=c(K,K,H))
for (h in 1:H){
Xi_h[,,h]=A[,,h]%*%sigma_eps%*%t(A[,,h]) #Calculate the contribution to Xi at each h.
}
#Sum them along THIRD dimension to form Xi.
Xi=rowSums(Xi_h, dims=2)
#Note that because the above function is a row sum, dims=2 actually sums along the third...
#dimension.
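#Equivalently, Xi could be accumulated with an explicit loop over the third dimension:
#Xi=matrix(0,K,K); for (h in 1:H){Xi=Xi+Xi_h[,,h]}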
#Calculate the joint forecast error variance decomposition (jFEVD) vector for given setJ.
#First calculate the numerator of the jFEVD.
jFEVD_numerator_h=array(0,dim=c(K,H))
for (i in 1:K){
for (h in 1:H){
#Calculate the numerator of the jFEVD at each h.
jFEVD_numerator_h[i,h]=I_K[i,]%*%A[,,h]%*%sigma_eps%*%e_setJ%*%(solve(t(e_setJ)%*%
sigma_eps%*%e_setJ))%*%t(e_setJ)%*%sigma_eps%*%t(A[,,h])%*%I_K[,i]
}
}
jFEVD_numerator=array(0,dim=c(K))
for (i in 1:K){
#Calculate the numerator of the jFEVD by summing over h.
jFEVD_numerator[i]=sum(jFEVD_numerator_h[i,])
}
jFEVD=array(0,dim=c(K))
for (i in 1:K){
#Divide by Xi to form the jFEVD.
jFEVD[i]=jFEVD_numerator[i]/Xi[i,i]
}
#Calculate the generalized impulse response function (gIRF) vector due to a shock from...
#variable j.
gIRF=array(0,dim=c(K,H))
for (h in 1:H){
gIRF[,h]=A[,,h]%*%sigma_eps%*%I_K[,j]%*%(1/sqrt(sigma_eps[j,j]))
}
#Calculate the gFEVD matrix.
#First calculate the numerator of the gFEVD.
gFEVD_numerator_h=array(0,dim=c(K,K,H))
for (h in 1:H){
#Calculate the numerator of the gFEVD at each h.
gFEVD_numerator_h[,,h]=(A[,,h]%*%sigma_eps)*(A[,,h]%*%sigma_eps)
}
#Sum them along THIRD dimension to form the numerator of the gFEVD.
gFEVD_numerator=rowSums(gFEVD_numerator_h, dims=2)
#Note that because the above function is a row sum, dims=2 actually sums along the third...
#dimension.
gFEVD=array(0,dim=c(K,K))
for (i in 1:K){
for (k in 1:K){
#Divide by the shock variance times Xi to form the gFEVD.
gFEVD[i,k]=gFEVD_numerator[i,k]/(sigma_eps[k,k]*Xi[i,i])
}
}
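#Note: because the underlying shocks are correlated, the rows of the gFEVD need not...
#sum to one (Pesaran and Shin, 1998).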
##########################################################################################
#OUTPUT.
#Print the jIRF. The response of vector Y at h=0,1,...,H-1 periods ahead due to joint,...
#simultaneous one standard deviation shocks from the variables in setJ. Each row...
#corresponds to a different response variable and each column corresponds to a different...
#number of time steps ahead.
cat("jIRF \n")
## jIRF
print(jIRF)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.2222222 0.3333333 0.3483333 0.3253333 0.2896958 0.2521646
## [2,] 1.0000000 0.6833333 0.5058333 0.3962083 0.3215896 0.2665168
## [3,] 1.0000000 0.6833333 0.5058333 0.3962083 0.3215896 0.2665168
## [4,] 0.1111111 0.2833333 0.3258333 0.3152083 0.2851396 0.2501143
#Print the jFEVD. Element i is the proportion of the H-step forecast error variance of...
#variable i that can be explained by the shocks of the variables in setJ.
cat("\n jFEVD \n")
##
## jFEVD
print(as.matrix(jFEVD))
## [,1]
## [1,] 0.9118513
## [2,] 0.9615088
## [3,] 0.9594943
## [4,] 0.2330353
#Print the gIRF. The response of vector Y at h=0,1,...,H-1 periods ahead due to a one...
#standard deviation shock from variable j. Each row corresponds to a different response...
#variable and each column corresponds to a different number of time steps ahead.
cat("\n gIRF \n")
##
## gIRF
print(gIRF)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.5 0.465 0.41325 0.3593625 0.3091031 0.2643779
## [2,] 1.0 0.690 0.51450 0.4049250 0.3296063 0.2736043
## [3,] 0.8 0.600 0.47400 0.3867000 0.3214050 0.2699138
## [4,] 0.1 0.285 0.33225 0.3229125 0.2927006 0.2569968
#Print the gFEVD. Element i,k is the proportion of the H-step forecast error variance...
#of variable i that can be explained by the shocks of variable k.
cat("\n gFEVD \n")
##
## gFEVD
print(gFEVD)
## [,1] [,2] [,3] [,4]
## [1,] 0.83535648 0.4628898 0.1024116 0.07645277
## [2,] 0.27847135 0.9602257 0.5922684 0.07074939
## [3,] 0.06893124 0.7501965 0.8730821 0.07445214
## [4,] 0.09943611 0.2315267 0.1371134 0.81258855
# Load required libraries.
require(fpp3)
require(tidyquant)
require(TTR)
require(kableExtra)
require(xts)
require(timetk)
require(readxl)
require(readr)
require(patchwork)
require(gridExtra)
Tickers <- c("BAC","BK","BCS","C","CS","DB","GS","JPM","MS","STT","WFC")
Prices <- tq_get(Tickers,get="stock.prices",from="2017-01-01",to="2021-02-04")
# read market capitalization data
mktcap <- read_xlsx("MktCaps.xlsx")
#mktcap <- read_csv("./MktCap/MktCaps.csv") %>% as_tsibble()
# compute market cap weights for the European stocks
Wgts <- mktcap %>%
mutate(wBCS = BCSmc/(BCSmc + CSmc + DBmc)) %>%
mutate(wCS = CSmc/(BCSmc + CSmc + DBmc)) %>%
mutate(wDB = DBmc/(BCSmc + CSmc + DBmc)) %>%
mutate(date = Date) %>% select(-Date)
In this supplement, we summarize how we compute daily stock volatility using High, Low, Open, and Close (HLOC) prices for section 6 of the paper "A Joint Impulse Response Function for Vector Autoregressive Models." In our discussion, we follow the notation of "Drift-Independent Volatility Estimation Based on High, Low, Open, and Close Prices" by Dennis Yang and Qiang Zhang, The Journal of Business, Vol. 73, No. 3 (July 2000), pp. 477-492. To compute the Parkinson and Yang-Zhang volatility measures, we use the volatility function from the TTR package by Joshua Ulrich.
Define the following notation for date \(t\) prices: let \(O_t\), \(H_t\), \(L_t\), and \(C_t\) denote the open, high, low, and close prices, and define the normalized log prices \(o_t = \ln O_t - \ln C_{t-1}\), \(u_t = \ln H_t - \ln O_t\), \(d_t = \ln L_t - \ln O_t\), and \(c_t = \ln C_t - \ln O_t\).
# As a reliability check, this is the method we originally used to compute Vp,
# except that here N=250 days is used to annualize the volatility estimates.
# Our code and Joshua Ulrich's code produce identical results.
N <- 250
VpOld <- Prices %>%
mutate(vol = sqrt(N/(4*log(2))*(log(high) - log(low))^2)) %>%
select(date,symbol,vol) %>%
pivot_wider(names_from = symbol, values_from = vol)
# Compute the Parkinson volatility
# create placeholder dataframe and then replace the columns
# one at a time using the Parkinson method
Vp <- Prices %>%
select(date, symbol, adjusted) %>%
pivot_wider(names_from = symbol, values_from = adjusted)
ndim <- dim(Prices)
nr <- ndim[1] / length(Tickers)
nc <- length(Tickers)
for (i in 1:nc) {
x <- Prices %>% filter(symbol == Tickers[i]) %>% tk_xts(date_var = "date")
v <- volatility(x, n=1, N=250, calc = "parkinson")
Vp[,1+i] <- v
}
# In some instances, such as H = O or L = O, the function
# returns an NA rather than a zero, so I replace those NA's with zeros.
Vp <- Vp %>% mutate(across(where(is.numeric), ~ replace_na(.x, 0)))
# clean up
rm(x,v,i,nr,nc)
An early and commonly used estimator of daily volatility using HLOC data is the method of Parkinson (1980) which assumes an underlying price that follows a geometric Brownian motion process. The time \(t\) Parkinson estimator, denoted as \(V_{p,t}\), is computed as \[\begin{align} V_{p,t} & = \sqrt{\frac{N}{4 n \ln(2)} \sum_{i=0}^{n-1}(u_{t-i} - d_{t-i})^2} \\ &= \sqrt{\frac{N}{4 n \ln(2)} \sum_{i=0}^{n-1}\ln(H_{t-i}/L_{t-i})^2}, \end{align}\] where \(N\) is the number of trading days in the year and \(n\) is the number of trading days over which the volatility is estimated. In our application we use \(n=1\) and \(N=250\) to get annualized daily volatility, which we scale up by 100 so that the units are annualized percentages.
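Since we use \(n=1\), the formula above simplifies to \(V_{p,t} = \sqrt{\frac{N}{4 \ln(2)}} \, \ln(H_t/L_t)\) (recall \(H_t \geq L_t\)), which is exactly the formula implemented in the VpOld code above.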
The Parkinson estimator uses only the daily high and low prices and is valid only if: (1) there are no jumps between the previous day’s closing price and the current day’s opening price, and (2) there is no drift in the underlying geometric Brownian motion price process. If these conditions are violated, the Parkinson estimator will be biased and inefficient.
Rogers and Satchell (1991) propose an estimator that is independent of the drift rate in the underlying process and has a smaller estimator variance than \(V_{p,t}\). The time \(t\) volatility estimator of Rogers and Satchell (1991), \(V_{rs,t}\), can be computed as \[ V_{rs,t} = \sqrt{\frac{N}{n} \sum_{i=0}^{n-1}\left[ u_{t-i}(u_{t-i}- c_{t-i}) + d_{t-i}(d_{t-i} - c_{t-i})\right]}. \] Since \(V_{rs,t}\) does not allow for opening jumps in prices, it will still underestimate the true daily price volatility.
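Although we only use \(V_{rs,t}\) as an input to the Yang-Zhang estimator below, a minimal sketch of a direct daily (\(n=1\)) computation for one ticker, assuming x is an xts object of HLOC prices as built in the loops above, would be:

u <- log(x$high) - log(x$open) # normalized high
d <- log(x$low) - log(x$open) # normalized low
cc <- log(x$close) - log(x$open) # normalized close
Vrs <- sqrt(250 * (u*(u - cc) + d*(d - cc))) # n=1 daily estimate, annualized with N=250

The TTR package also provides this estimator directly via volatility(x, n=1, N=250, calc="rogers.satchell").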
# Compute the Yang-Zhang volatility
# create placeholder dataframe
Vyz <- Prices %>%
select(date, symbol, adjusted) %>%
pivot_wider(names_from = symbol, values_from = adjusted)
ndim <- dim(Prices)
nr <- ndim[1] / length(Tickers)
nc <- length(Tickers)
for (i in 1:nc) {
x <- Prices %>% filter(symbol == Tickers[i]) %>% tk_xts(date_var = "date")
v <- volatility(x, n=2, N=250, calc = "yang.zhang")
Vyz[,1+i] <- v
}
rm(x,v,i,nr,nc)
Yang & Zhang (2000) derive a minimum-variance unbiased estimator of volatility, denoted as \(V_{yz,t}\) for time \(t\), which is independent of both drift and opening jumps: \[ V_{yz,t} = \sqrt{V^2_{o,t} + k V^2_{c,t} + (1-k) V^2_{rs,t}}, \] where \[\begin{align} V_{o,t} &= \sqrt{\frac{N}{n-1} \sum_{i=0}^{n-1}(o_{t-i} - \bar{o})^2}, \\ \bar{o} &= \frac{1}{n} \sum_{i=0}^{n-1}o_{t-i}, \\ V_{c,t} &= \sqrt{\frac{N}{n-1} \sum_{i=0}^{n-1}(c_{t-i}- \bar{c})^2}, \\ \bar{c} &= \frac{1}{n} \sum_{i=0}^{n-1}c_{t-i}, \end{align}\] and \(V_{rs,t}\) is the Rogers & Satchell estimator defined above. The constant \(k\) is chosen to minimize the variance of the estimator and depends on \(n\); for \(n=2\), \(k = 0.07834101\).
Note the use of \(o_t = \ln O_t - \ln C_{t-1}\) in the expression for \(V_{o,t}\) requires that \(n > 1\). In our application, we use \(n=2\) so that the first available volatility is for period \(t=3\).
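For reference, the constant in the Yang-Zhang estimator is \(k = \frac{0.34}{1.34 + (n+1)/(n-1)}\), which the TTR volatility function computes internally; a minimal sketch of the arithmetic is:

n <- 2
k <- 0.34/(1.34 + (n + 1)/(n - 1)) # equals 0.07834101 when n = 2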
# Add the PCA and cap weighted index of Euro vols for the Parkinson method
# compute the PCA of the Euro vols
PCAp <- Vp %>% select(BCS, CS, DB) %>%
prcomp(retx = TRUE, center = TRUE, scale. = TRUE)
# summary(PCAp)
EPCAp <- PCAp$x[, 1] / 100
# compute the cap weighted index of Euro vols
EINDa <- Vp %>%
select(date,BCS, CS, DB) %>%
left_join(Wgts, by = "date") %>%
mutate(EINDa = BCS*wBCS + CS*wCS + DB*wDB) %>%
select(date, EINDa)
# append to Vp matrix
Vp <- mutate(Vp, EPCA = EPCAp)
Vp <- left_join(Vp,EINDa, by = "date")
# Add the PCA and cap weighted index of Euro vols for the Y-Z method
# compute the PCA of the Euro vols
PCAyz <- Vyz %>% select(BCS, CS, DB) %>%
#drop_na() %>%
replace(is.na(.),0) %>%
prcomp(retx = TRUE, center = TRUE, scale. = TRUE)
# summary(PCAyz)
#EPCAyz <- c(NA,NA,(PCAyz$x[, 1] / 100))
EPCAyz <- c((PCAyz$x[, 1] / 100))
# compute the cap weighted index of Euro vols
EINDa <- Vyz %>%
select(date,BCS, CS, DB) %>%
#drop_na() %>%
left_join(Wgts, by = "date") %>%
mutate(EINDa = BCS*wBCS + CS*wCS + DB*wDB) %>%
select(date, EINDa)
# append to Vyz matrix
Vyz <- mutate(Vyz, EPCA = EPCAyz)
Vyz <- left_join(Vyz,EINDa, by = "date")
# Add the cap weighted BCS prices to Vp and Vyz
Pbcs <- Prices %>%
filter(symbol == "BCS") %>%
select(-c(volume, adjusted)) %>%
mutate(wght = Wgts$wBCS) %>%
mutate(across(3:6, ~ .x * wght)) %>%
select(-c(wght,symbol)) %>%
rename(bcs_open = open, bcs_high = high, bcs_low = low, bcs_close = close)
# cap weighted CS prices
Pcs <- Prices %>%
filter(symbol == "CS") %>%
select(-c(volume, adjusted)) %>%
mutate(wght = Wgts$wCS) %>%
mutate(across(3:6, ~ .x * wght)) %>%
select(-c(wght,symbol)) %>%
rename(cs_open = open, cs_high = high, cs_low = low, cs_close = close)
# cap weighted DB prices
Pdb <- Prices %>%
filter(symbol == "DB") %>%
select(-c(volume, adjusted)) %>%
mutate(wght = Wgts$wDB) %>%
mutate(across(3:6, ~ .x * wght)) %>%
select(-c(wght,symbol)) %>%
rename(db_open = open, db_high = high, db_low = low, db_close = close)
# Euro Index of cap weighted HLOC prices
Peuro <- left_join(Pbcs, Pcs, by = "date") %>% left_join(Pdb, by = "date") %>%
mutate(open = bcs_open + cs_open + db_open) %>%
mutate(high = bcs_high + cs_high + db_high) %>%
mutate(low = bcs_low + cs_low + db_low) %>%
mutate(close = bcs_close + cs_close + db_close) %>%
select(date, open, high, low, close) %>%
tk_xts(date_var = "date")
# compute Parkinson volatility and add to tibble
vp <- as.data.frame(volatility(Peuro, n=1, N=250, calc = "parkinson"))
Vp <- Vp %>% mutate(EINDb = vp[,1])
# compute Yang-Zhang volatility and add to tibble
vyz <- as.data.frame(volatility(Peuro, n=2, N=250, calc = "yang.zhang"))
Vyz <- Vyz %>% mutate(EINDb = vyz[,1])
In the application, we use the stock market volatility of the 11 systemically important financial institutions as listed by the Fed's Large Institution Supervision Coordinating Committee: BAC, BK, C, GS, JPM, MS, STT, WFC, BCS, CS, DB. The first eight banks are in the USA and the last three are in Europe.
To verify the efficacy of the jIRF approach, we replace the three individual European banks (BCS, CS, and DB) with three possible European summary variables: (1) a principal component, (2) a market capitalization weighted average of the volatilities of the European banks, and (3) the volatility of the market capitalization weighted average of the HLOC prices of the European banks.
We first compute the volatility of each of the three European banks and then construct the first principal component from those three vectors. We center the data to have mean zero and scale the data to have unit standard error. This will scale the volatility measure but will not affect the results of our VAR analysis.
For the Parkinson method, the first component explains about 87 percent of the variance of the European banks:
summary(PCAp)$importance %>%
kable(format = "html", table.attr = "style='width:30%;' ") %>%
kableExtra::kable_styling()
 | PC1 | PC2 | PC3 |
---|---|---|---|
Standard deviation | 1.619968 | 0.505163 | 0.347152 |
Proportion of Variance | 0.874770 | 0.085060 | 0.040170 |
Cumulative Proportion | 0.874770 | 0.959830 | 1.000000 |
For the Yang-Zhang method, the first component explains about 84 percent of the variance of the European banks:
summary(PCAyz)$importance %>%
kable(format = "html", table.attr = "style='width:30%;' ") %>%
kableExtra::kable_styling()
 | PC1 | PC2 | PC3 |
---|---|---|---|
Standard deviation | 1.591577 | 0.5364415 | 0.423216 |
Proportion of Variance | 0.844370 | 0.0959200 | 0.059700 |
Cumulative Proportion | 0.844370 | 0.9403000 | 1.000000 |
These PCA volatility summary measures are denoted EPCA in the volatility plots and tables.

There are two alternatives for using market capitalization weights to construct a European volatility index: (1) a market capitalization weighted average of the computed volatilities of the European banks (EINDa), or (2) the computed volatility of the market capitalization weighted HLOC prices of the European banks (EINDb).
The market capitalization weights for BCS, CS, and DB are shown below.
Wgts_l <- as_tibble(Wgts) %>% select(-c(BCSmc,CSmc,DBmc)) %>%
pivot_longer(cols = 1:3, names_to = "symbol", values_to = "wghts")
Wgts_l %>%
ggplot(aes(x = date, y = wghts)) +
geom_line() +
facet_wrap(nrow = 3, vars(symbol),scales = "free_y") +
labs(title = "Market Capitalization Weights") +
ylab(" ") +
xlab(" ")
The time \(t\) market capitalization weighted averages of the Parkinson and Yang-Zhang measures of volatility for the three European banks, BCS, CS, and DB, are computed as
\[ EINDa_t = w_{BCS,t} V_{BCS,t} + w_{CS,t} V_{CS,t} + w_{DB,t} V_{DB,t} \] where \(V_{BCS,t}\), \(V_{CS,t}\), and \(V_{DB,t}\) are the daily volatilities of Barclays, Credit Suisse, and Deutsche Bank, respectively, using either the Parkinson or the Yang-Zhang estimator. The market capitalization weights at time \(t\) are defined as \[ w_{BCS,t}=\frac{MC_{BCS,t}}{MC_{BCS,t}+MC_{CS,t}+MC_{DB,t}}, \] \[ w_{CS,t}=\frac{MC_{CS,t}}{MC_{BCS,t}+MC_{CS,t}+MC_{DB,t}}, \] and \[ w_{DB,t}=\frac{MC_{DB,t}}{MC_{BCS,t}+MC_{CS,t}+MC_{DB,t}}, \] where \(MC_{BCS,t}\), \(MC_{CS,t}\), and \(MC_{DB,t}\) are the time \(t\) market capitalizations of Barclays, Credit Suisse, and Deutsche Bank, respectively.
Alternatively, we first compute the market capitalization weighted values of the HLOC prices of the three European banks and then compute the Parkinson and Yang-Zhang volatility measures using these weighted HLOC prices. We denote these volatility measures as EINDb.
We now have two dataframes that contain the computed daily volatility for each of the 11 banks, the European volatility summary variable using the PCA approach (EPCA), the European index using the market capitalization weighted average of the volatilities of the 3 European banks (EINDa), and the European index using the computed volatility of the market capitalization weighted HLOC prices of the 3 European banks (EINDb). The dataframe Vp uses the Parkinson method and the dataframe Vyz uses the Yang-Zhang method.

Below is a plot of all of these computed volatilities for Vp, the Parkinson method:
Vp_l <- Vp %>%
pivot_longer(cols = 2:15, names_to = "symbol", values_to = "vol")
Vp_lreorder <- Vp_l
Vp_lreorder$symbol <- factor(Vp_lreorder$symbol, levels = c(Tickers,"EPCA","EINDa","EINDb"))
Vp_lreorder %>%
ggplot(aes(x = date, y = vol)) +
geom_line() +
facet_wrap(nrow = 7, vars(symbol),scales = "free_y") +
ylab("Parkinson Volatility") +
xlab("")
Below is a plot of all of these computed volatilities for Vyz, the Yang-Zhang method:
Vyz_l <- Vyz %>%
pivot_longer(cols = 2:15, names_to = "symbol", values_to = "vol")
Vyz_lreorder <- Vyz_l
Vyz_lreorder$symbol <- factor(Vyz_lreorder$symbol, levels = c(Tickers,"EPCA","EINDa","EINDb"))
Vyz_lreorder %>%
ggplot(aes(x = date, y = vol)) +
geom_line() +
facet_wrap(nrow = 7, vars(symbol),scales = "free_y") +
ylab("Yang-Zhang Volatility") +
xlab("")
Next we plot Vp and Vyz together on the same graphs.
ggplot(NULL, aes(x = date, y = vol)) +
geom_line(data = Vp_l, color = "blue", alpha = .5) +
facet_wrap(nrow = 6, vars(symbol),scales = "free_y") +
geom_line(data = Vyz_l, color = "red", alpha = .5) +
labs(title = "Volatility: Parkinson (blue) and Yang-Zhang (red)")
To get an easier comparison, we zoom in on the plot of the Parkinson
and Yang-Zhang volatilities for bank BAC
.
# Choose one: "BAC" "BK" "C" "GS" "JPM" "MS" "STT" "WFC" "BCS" "CS" "DB" "EPCA" "EINDa" "EINDb"
ggplot(NULL, aes(x = date, y = BAC)) +
geom_line(data = Vp, color = "blue", alpha = .5) +
geom_line(data = Vyz, color = "red", alpha = .5) +
labs(title = "Volatility: Parkinson (blue) and Yang-Zhang (red)")
Next, we compare summary statistics for the Parkinson and Yang-Zhang computations.
Vp_l %>% drop_na() %>%
group_by(symbol) %>%
summarise(
count = n(),
mean = mean(vol),
sd = sd(vol),
min = min(vol),
max = max(vol)
) %>%
kable(format = "html",
table.attr = "style='width:30%;' ",
caption = "Summary statistics of volatility using the Parkinson method") %>%
kableExtra::kable_styling()
symbol | count | mean | sd | min | max |
---|---|---|---|---|---|
BAC | 1029 | 0.2099245 | 0.1422818 | 0.0416125 | 1.2835800 |
BCS | 1029 | 0.1631068 | 0.1220365 | 0.0376814 | 1.5862968 |
BK | 1029 | 0.1940706 | 0.1275583 | 0.0388080 | 1.1663046 |
C | 1029 | 0.2204525 | 0.1655847 | 0.0417786 | 1.7774351 |
CS | 1029 | 0.1491514 | 0.1077178 | 0.0251731 | 1.1815377 |
DB | 1029 | 0.1867867 | 0.1182341 | 0.0000000 | 1.2298396 |
EINDa | 1029 | 0.1632338 | 0.1083795 | 0.0437402 | 1.2039462 |
EINDb | 1029 | 0.1618716 | 0.1070742 | 0.0451941 | 1.1496418 |
EPCA | 1029 | 0.0000000 | 0.0161997 | -0.0182360 | 0.1482884 |
GS | 1029 | 0.2053800 | 0.1365660 | 0.0492078 | 1.4540587 |
JPM | 1029 | 0.1837965 | 0.1341861 | 0.0394090 | 1.2909421 |
MS | 1029 | 0.2212518 | 0.1509439 | 0.0552290 | 1.7892787 |
STT | 1029 | 0.2286232 | 0.1578762 | 0.0437279 | 1.6652573 |
WFC | 1029 | 0.2102233 | 0.1517392 | 0.0396760 | 1.5933477 |
Vyz_l %>% drop_na() %>%
group_by(symbol) %>%
summarise(
count = n(),
mean = mean(vol),
sd = sd(vol),
min = min(vol),
max = max(vol)
) %>%
kable(format = "html",
table.attr = "style='width:30%;' ",
caption = "Summary statistics of volatility using the Yang-Zhang method") %>%
kableExtra::kable_styling()
symbol | count | mean | sd | min | max |
---|---|---|---|---|---|
BAC | 1027 | 0.2798882 | 0.2400234 | 0.0594519 | 3.268534 |
BCS | 1027 | 0.2979218 | 0.2781389 | 0.0576923 | 4.029446 |
BK | 1027 | 0.2504915 | 0.2107470 | 0.0542159 | 2.952009 |
C | 1027 | 0.2929773 | 0.2722538 | 0.0628614 | 3.981445 |
CS | 1026 | 0.2598522 | 0.2323250 | 0.0578035 | 3.591341 |
DB | 1027 | 0.3324001 | 0.2536433 | 0.0581909 | 3.350463 |
EINDa | 1026 | 0.2913880 | 0.2349500 | 0.0853008 | 3.706538 |
EINDb | 1027 | 0.2689574 | 0.2287443 | 0.0634655 | 3.636745 |
EPCA | 1029 | 0.0000000 | 0.0159158 | -0.0201453 | 0.229071 |
GS | 1027 | 0.2594637 | 0.2030293 | 0.0617784 | 3.136767 |
JPM | 1027 | 0.2453828 | 0.2266704 | 0.0533454 | 3.391097 |
MS | 1027 | 0.2905240 | 0.2376557 | 0.0647197 | 3.457622 |
STT | 1027 | 0.2939690 | 0.2458923 | 0.0775201 | 3.527717 |
WFC | 1027 | 0.2713791 | 0.2415344 | 0.0645863 | 3.310084 |
Compare the European Index volatility measures using the weighted volatilities and the volatility of the weighted HLOC prices. First, using the Parkinson method.
ggplot(Vp, aes(x = date)) +
geom_line(aes(y = EINDa), color = "blue", alpha = .5) +
geom_line(aes(y = EINDb), color = "red", alpha = .5) +
labs(title = "Parkinson European Index Volatility", subtitle = "weighted volatilities (red) and volatility of weighted HLOC (blue)") + xlab(" ") + ylab("")
Next, using the Yang-Zhang method.
ggplot(Vyz, aes(x = date)) +
geom_line(aes(y = EINDa), color = "blue", alpha = .5) +
geom_line(aes(y = EINDb), color = "red", alpha = .5) +
labs(title = "Yang-Zhang European Index Volatility", subtitle = "weighted volatilities (red) and volatility of weighted HLOC (blue)") + xlab(" ") + ylab("")
Tickers <- c("BAC","BK","BCS","C","CS","DB","GS","JPM","MS","STT","WFC")
Vyz_l <- Vyz %>%
pivot_longer(cols = 2:12, names_to = "symbol", values_to = "vol")
Vyz_lreorder <- Vyz_l
Vyz_lreorder$symbol <- factor(Vyz_lreorder$symbol, levels = c(Tickers))
Vyz_lreorder %>%
ggplot(aes(x = date, y = vol)) +
geom_line() +
geom_vline(xintercept=Vyz$date[776], linetype='dotted', color='black', size=.5) +
facet_wrap(nrow = 6, vars(symbol),scales = "free_y",strip.position = "right") +
ylab("Yang-Zhang Volatility") +
xlab("")
p1 <- Vyz %>%
ggplot() +
geom_line(aes(date,BAC*100)) + ylim(0,400) +
labs(x = NULL, y = "BAC") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p2 <- Vyz %>%
ggplot() +
geom_line(aes(date,BK*100)) + ylim(0,400) +
labs(x = NULL, y = "BK") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p3 <- Vyz %>%
ggplot() +
geom_line(aes(date,BCS*100)) + ylim(0,410) +
labs(x = NULL, y = "BCS") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p4 <- Vyz %>%
ggplot() +
geom_line(aes(date,C*100)) + ylim(0,400) +
labs(x = NULL, y = "C") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p5 <- Vyz %>%
ggplot() +
geom_line(aes(date,CS*100)) + ylim(0,400) +
labs(x = NULL, y = "CS") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p6 <- Vyz %>%
ggplot() +
geom_line(aes(date,DB*100)) + ylim(0,400) +
labs(x = NULL, y = "DB") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p7 <- Vyz %>%
ggplot() +
geom_line(aes(date,GS*100)) + ylim(0,400) +
labs(x = NULL, y = "GS") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p8 <- Vyz %>%
ggplot() +
geom_line(aes(date,JPM*100)) + ylim(0,400) +
labs(x = NULL, y = "JPM") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p9 <- Vyz %>%
ggplot() +
geom_line(aes(date,MS*100)) + ylim(0,400) +
labs(x = NULL, y = "MS") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p10 <- Vyz %>%
ggplot() +
geom_line(aes(date,STT*100)) + ylim(0,400) +
labs(x = NULL, y = "STT") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p11 <- Vyz %>%
ggplot() +
geom_line(aes(date,WFC*100)) + ylim(0,400) +
labs(x = NULL, y = "WFC") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
g1 <- p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + p10 + p11 + plot_layout(ncol=2,byrow=TRUE)
g1
Tp <- Vp_l %>% drop_na() %>%
group_by(symbol) %>%
summarise(
mean = 100*mean(vol),
sd = 100*sd(vol),
)
Tyz <- Vyz_l %>% drop_na() %>%
group_by(symbol) %>%
summarise(
mean = 100*mean(vol),
sd = 100*sd(vol),
)
Tpyz <- cbind(Tp[c(1:6,10:14),c(1,2,3)],Tyz[c(1:6,10:14),c(2,3)])
# %>%
# kable(format = "html",
# table.attr = "style='width:30%;' ",
# caption = "Summary statistics of volatility using the Yang-Zhang method") %>%
# kableExtra::kable_styling()
# Choose one: "BAC" "BK" "C" "GS" "JPM" "MS" "STT" "WFC" "BCS" "CS" "DB" "EPCA" "EINDa" "EINDb"
ggplot(NULL, aes(x = date, y = 100*BAC)) +
geom_line(data = Vp, color = "blue", alpha = .5) +
geom_line(data = Vyz, color = "red", alpha = .5) +
labs(title = "Volatility: Parkinson (blue) and Yang-Zhang (red)") +
ylab("BAC") + xlab("")
ggplot(Vyz, aes(x = date)) +
geom_line(aes(y = 100*EINDa), color = "blue", alpha = .5) +
geom_line(aes(y = 100*EINDb), color = "red", alpha = .5) +
labs(title = "Yang-Zhang European Index Volatility", subtitle = "weighted volatilities (red) and volatility of weighted HLOC (blue)") + xlab(" ") + ylab("")
pEPCAyz <- ggplot(data=Vyz) + geom_line(aes(date,EPCA))
pEINDayz <- ggplot(data=Vyz) + geom_line(aes(date,100*EINDa)) + ylab("EINDa")
pEINDbyz <- ggplot(data=Vyz) + geom_line(aes(date,100*EINDb)) + ylab("EINDb")
grid.arrange(pEPCAyz,pEINDayz,pEINDbyz,nrow=3,
top="Summary Variables for Volatilty of European Banks")