This supplemental document provides R code to compute the joint impulse response function (jIRF) and joint forecast error variance decomposition (jFEVD) proposed in "A Joint Impulse Response Function for Vector Autoregressive Models." For comparison, the code also calculates the generalized impulse response function (gIRF) and generalized forecast error variance decomposition (gFEVD) of Pesaran and Shin (1998, "Generalized impulse response analysis in linear multivariate models"). In addition, this document provides the code to produce the High-Low-Open-Close (HLOC) volatility measures used in the paper and the methods used to construct the European summary variables for volatility used in the application.
First load your data into R and estimate the \(K\)-variable VAR(\(P\)) model using your estimation package of
choice (e.g., the vars package in R). Then obtain the VMA representation
of the VAR:
\[
\mathbf{Y_t} =\mathbf{c}+\sum_{h=0}^{\infty}\mathbf{A}_h
\mathbf{\epsilon}_{t-h},
\] where \(\mathbf{A}_0=\mathbf{I}_K\). Save the \(K \times K\) shock covariance matrix \(\mathbf{\Sigma}_{\epsilon}=E \left( \mathbf{\epsilon}_t \mathbf{\epsilon}'_t \right)\) and denote it in your R code as sigma_eps. Also save the VMA coefficient matrices up to at least the chosen time horizon as a \(K \times K \times (\geq H)\) array denoted as A in your R code. The horizon \(H\) counts the number of impulse responses computed: to measure only the immediate response (\(h=0\)), pick \(H=1\); to measure the immediate and one-step-ahead responses (\(h=0,1\)), pick \(H=2\); to measure the immediate, one-step, and two-step-ahead responses (\(h=0,1,2\)), pick \(H=3\); and so on. Save at least two VMA coefficient matrices (\(\mathbf{A}_0\) and \(\mathbf{A}_1\)) even if \(H=1\), and save at least as many VMA coefficient matrices as are required to compute the impulse responses up to the chosen horizon (that is, the size of the third dimension of A must be greater than or equal to \(H\)); otherwise an error will result.
The required inputs are summarized in the following table.

Mathematical Notation | R Code Notation | Dimension | Description |
---|---|---|---|
\(\mathbb{J}\) | setJ | \(1 \times \#\mathbb{J}\) | set of variables where the shocks originate for the jIRF & jFEVD |
\(j\) | j | \(1 \times 1\) | variable where the shock originates for the gIRF |
\(\mathbf{\Sigma}_\epsilon\) | sigma_eps | \(K \times K\) | shock covariance matrix |
\(\mathbf{A}_0,...,\mathbf{A}_{H-1},...\) | A | \(K \times K \times (\geq H)\) | VMA coefficient matrices |
\(H\) | H | \(1 \times 1\) | forecast horizon |
The computed outputs, with references to the corresponding equations in the main text, are summarized in the following table.

Mathematical Notation | R Code Notation | Equation Reference | Dimension | Description |
---|---|---|---|---|
\(\mathbf{jIRF}(h, \mathbf{\delta}_{\mathbb{J}}=\mathbf{s}_{\mathbb{J}}, \mathbf{\Omega}_{t-1}),\) \(h=0,1,...,H-1\) | jIRF | (12) | \(K \times H\) | joint impulse response function |
\(\zeta_{i|\mathbb{J}}^{jnt}(H),\) \(i=1,2,...,K\) | jFEVD | (26) | \(K \times 1\) | joint forecast error variance decomposition |
\(\mathbf{gIRF}(h, \delta_j=\sqrt{\sigma_{jj}}, \mathbf{\Omega}_{t-1}),\) \(h=0,1,...,H-1\) | gIRF | (7) | \(K \times H\) | generalized impulse response function |
\(\zeta_{ik}^{gen}(H),\) \(i,k=1,2,...,K\) | gFEVD | (24) | \(K \times K\) | generalized forecast error variance decomposition |
The code below requires five inputs. (1) Choose the set \(\mathbb{J}\) of variables where the joint, simultaneous shocks originate for the jIRF and jFEVD, and save it as a \(1 \times \#\mathbb{J}\) vector labeled setJ. (2) Choose \(j\), the variable where the shock originates for the gIRF, and save it as a scalar labeled j. (3) Save the shock covariance matrix \(\mathbf{\Sigma}_\epsilon\) as a \(K \times K\) matrix labeled sigma_eps. (4) Save at least \(H\) of the VMA coefficient matrices \(\mathbf{A}_0, \mathbf{A}_1, ..., \mathbf{A}_{H-1},...\) as a \(K \times K \times (\geq H)\) array labeled A. (5) Choose the time horizon \(H\) and save it as a scalar labeled H.
These first steps are not included in the R code. However, we provide a simple numeric example in the code that sets \[\begin{equation*} \mathbf{\Sigma}_{\epsilon}=\begin{bmatrix} 1 & 0.5 & -0.1 & 0.1 \\ 0.5 & 1 & 0.8 & 0.1 \\ -0.1 & 0.8 & 1 & 0.1 \\ 0.1 & 0.1 & 0.1 & 1 \end{bmatrix} \end{equation*}\] and \[\begin{equation*} \mathbf{A}_0=\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \;\;\;\;\;\; \mathbf{A}_1=\begin{bmatrix} 0.55 & 0.1 & 0.1 & 0.1 \\ 0.1 & 0.55 & 0.1 & 0.1 \\ 0.1 & 0.1 & 0.55 & 0.1 \\ 0.1 & 0.1 & 0.1 & 0.55 \end{bmatrix} \;\;\;\;\;\; \mathbf{A}_2=\begin{bmatrix} 0.3325 & 0.13 & 0.13 & 0.13 \\ 0.13 & 0.3325 & 0.13 & 0.13 \\ 0.13 & 0.13 & 0.3325 & 0.13 \\ 0.13 & 0.13 & 0.13 & 0.3325 \\ \end{bmatrix} \end{equation*}\] \[\begin{equation*} \mathbf{A}_3=\begin{bmatrix} 0.221875 & 0.13075 & 0.13075 & 0.13075 \\ 0.13075 & 0.221875 & 0.13075 & 0.13075 \\ 0.13075 & 0.13075 & 0.221875 & 0.13075 \\ 0.13075 & 0.13075 & 0.13075 & 0.221875 \end{bmatrix} \;\;\;\;\;\; \mathbf{A}_4=\begin{bmatrix} 0.1612563 & 0.12025 & 0.12025 & 0.12025 \\ 0.12025 & 0.1612563 & 0.12025 & 0.12025 \\ 0.12025 & 0.12025 & 0.1612563 & 0.12025 \\ 0.12025 & 0.12025 & 0.12025 & 0.1612563 \end{bmatrix} \end{equation*}\] \[\begin{equation*} \mathbf{A}_5=\begin{bmatrix} 0.1247659 & 0.1063131 & 0.1063131 & 0.1063131 \\ 0.1063131 & 0.1247659 & 0.1063131 & 0.1063131 \\ 0.1063131 & 0.1063131 & 0.1247659 & 0.1063131 \\ 0.1063131 & 0.1063131 & 0.1063131 & 0.1247659 \end{bmatrix} \end{equation*}\] Thus, in this example, \(K=4\) and \(H=6\). Delete or comment out this example from the R code when estimating the VAR parameters. These example VMA coefficient matrices correspond to the \(\mathbf{\Phi}_1\) VAR coefficient matrix in equation (21) of the main paper. The shock covariance matrix corresponds to the upper-left panel of figure 1 in the main paper. In our example, we also set \(\mathbb{J}=\{2,3\}\) and \(j=2\).
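Because the example is a VAR(1) model, the VMA coefficient matrices above are simply powers of the VAR coefficient matrix, \(\mathbf{A}_h = \mathbf{\Phi}_1^h\). As a quick arithmetic check, each diagonal entry of \(\mathbf{A}_2\) is \(0.55^2 + 3(0.1)^2 = 0.3325\) and each off-diagonal entry is \(2(0.55)(0.1) + 2(0.1)^2 = 0.13\), matching \(\mathbf{A}_2\) above.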
Once the inputs are set, the provided code will calculate four results: (1) the joint impulse response function measuring how \(\mathbf{Y}\) responds due to joint, simultaneous one standard deviation shocks from the variables in set \(\mathbb{J}\), (2) the joint forecast error variance decomposition vector whose element \(i\) measures the proportion of the \(H\)-step forecast error variance of variable \(i\) that can be explained by the shocks of the variables in set \(\mathbb{J}\), (3) the generalized impulse response function measuring how \(\mathbf{Y}\) responds due to a one standard deviation shock from variable \(j\), and (4) the generalized forecast error variance decomposition matrix whose \(i,k\) element measures the proportion of the \(H\)-step forecast error variance of variable \(i\) that can be explained by the shocks of variable \(k\). See the main text of the paper for details.
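After the inputs are set, a defensive check along the following lines (a minimal sketch, not part of the original script) can catch dimension errors early:

stopifnot(dim(sigma_eps)[1] == dim(sigma_eps)[2]) #sigma_eps must be K by K.
stopifnot(dim(A)[3] >= H) #Need at least H VMA coefficient matrices.
stopifnot(all(setJ %in% 1:dim(sigma_eps)[1])) #setJ must index variables in 1,...,K.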
#R code to calculate the joint impulse response function, the joint forecast error...
#variance decomposition, the generalized impulse response function, and the generalized...
#forecast error variance decomposition.
#The user must first estimate the VAR and obtain the VMA coefficient matrices (A) and...
#the shock covariance matrix (sigma_eps). This step is not shown. Note that the "vars"...
#package calls the VMA coefficient matrices Phi, but this code calls them A.
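#For reference, a minimal sketch of this estimation step using the "vars" package...
#(assuming the data are in a matrix or data frame Y and a lag order P has been chosen):
# fit=vars::VAR(Y, p=P) #Estimate the VAR(P) model.
# A=vars::Phi(fit, nstep=H-1) #K by K by H array of VMA coefficient matrices.
# sigma_eps=summary(fit)$covres #K by K shock (residual) covariance matrix.
#These commented lines are illustrative only and are not run in this script.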
##########################################################################################
#Inputs: where are the originating shocks occurring?
#Pick the variables in setJ for the jIRF and jFEVD.
setJ=array(c(2,3)) #For example, variables 2 and 3.
#Pick which variable is variable j for the gIRF.
j=2 #For example, variable 2.
H=6 #Set the forecast horizon (H). E.g., if H=6, then the code will calculate the...
#impulse response for h=0,1,2,3,4,5 time steps ahead and the forecast error variance...
#decomposition will use the 6-step forecast error variance.
#Set example values for A and sigma_eps. Users should delete or comment out these...
#example matrices when estimating the VAR parameters from data.
#Example sigma_eps from the top left graph in figure 1 of the main text.
sigma_eps=array(c(1,0.5,-0.1,0.1,0.5,1,0.8,0.1,-0.1,0.8,1,0.1,0.1,0.1,0.1,1), dim=c(4,4))
#Example Phi from equation (21) of the main text.
Phi=array(c(0.55,0.1,0.1,0.1,0.1,0.55,0.1,0.1,0.1,0.1,0.55,0.1,0.1,0.1,0.1,0.55), dim=c(4,4))
A=array(NaN,dim=c(4,4,6)) #Create empty array for VMA coefficient matrices.
#Because our example is a VAR(1) model, A[,,h+1] is simply Phi raised to the h power.
A[,,1]=diag(4) #4-dimensional identity matrix.
A[,,2]=Phi
A[,,3]=Phi%*%Phi
A[,,4]=Phi%*%Phi%*%Phi
A[,,5]=Phi%*%Phi%*%Phi%*%Phi
A[,,6]=Phi%*%Phi%*%Phi%*%Phi%*%Phi
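#Optional sanity check: the example VAR(1) is stationary because the largest eigenvalue...
#of Phi equals 0.55 + 3*0.1 = 0.85 < 1 in absolute value.
#max(Mod(eigen(Phi)$values)) #Returns 0.85.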
##########################################################################################
#Set the number of variables in the VAR (K).
K=dim(sigma_eps)[1]
#Identity matrix and selection matrix.
I_K=diag(K) #K by K identity matrix.
e_setJ=I_K[,setJ] #Selection matrix using the setJ columns of the identity matrix.
#Calculate the joint impulse response function (jIRF) vector.
s_setJ=sqrt(diag(t(e_setJ)%*%sigma_eps%*%e_setJ)) #setJ shock standard deviations.
jIRF=array(0,dim=c(K,H))
for (h in 1:H){
jIRF[,h]=A[,,h]%*%sigma_eps%*%e_setJ%*%solve(t(e_setJ)%*%sigma_eps%*%e_setJ)%*%s_setJ
}
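#Sanity check: if setJ contains only the single variable j, the expression above reduces...
#algebraically to A[,,h]%*%sigma_eps%*%I_K[,j]/sqrt(sigma_eps[j,j]), which equals the...
#gIRF computed below, so the jIRF nests the gIRF as a special case.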
#Calculate the forecast error covariance matrix (Xi).
Xi_h=array(0,dim=c(K,K,H))
for (h in 1:H){
Xi_h[,,h]=A[,,h]%*%sigma_eps%*%t(A[,,h]) #Calculate the contribution to Xi at each h.
}
#Sum them along THIRD dimension to form Xi.
Xi=rowSums(Xi_h, dims=2)
#Note that because the above function is a row sum, dims=2 actually sums along the third...
#dimension.
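#Equivalently, Xi could be accumulated with an explicit loop over the third dimension:
#Xi=matrix(0,K,K); for (h in 1:H){Xi=Xi+Xi_h[,,h]}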
#Calculate the joint forecast error variance decomposition (jFEVD) vector for given setJ.
#First calculate the numerator of the jFEVD.
jFEVD_numerator_h=array(0,dim=c(K,H))
for (i in 1:K){
for (h in 1:H){
#Calculate the numerator of the jFEVD at each h.
jFEVD_numerator_h[i,h]=I_K[i,]%*%A[,,h]%*%sigma_eps%*%e_setJ%*%(solve(t(e_setJ)%*%
sigma_eps%*%e_setJ))%*%t(e_setJ)%*%sigma_eps%*%t(A[,,h])%*%I_K[,i]
}
}
jFEVD_numerator=array(0,dim=c(K))
for (i in 1:K){
#Calculate the numerator of the jFEVD by summing over h.
jFEVD_numerator[i]=sum(jFEVD_numerator_h[i,])
}
jFEVD=array(0,dim=c(K))
for (i in 1:K){
#Divide by Xi to form the jFEVD.
jFEVD[i]=jFEVD_numerator[i]/Xi[i,i]
}
#Calculate the generalized impulse response function (gIRF) vector due to a shock from...
#variable j.
gIRF=array(0,dim=c(K,H))
for (h in 1:H){
gIRF[,h]=A[,,h]%*%sigma_eps%*%I_K[,j]%*%(1/sqrt(sigma_eps[j,j]))
}
#Calculate the gFEVD matrix.
#First calculate the numerator of the gFEVD.
gFEVD_numerator_h=array(0,dim=c(K,K,H))
for (h in 1:H){
#Calculate the numerator of the gFEVD at each h.
gFEVD_numerator_h[,,h]=(A[,,h]%*%sigma_eps)*(A[,,h]%*%sigma_eps)
}
#Sum them along THIRD dimension to form the numerator of the gFEVD.
gFEVD_numerator=rowSums(gFEVD_numerator_h, dims=2)
#Note that because the above function is a row sum, dims=2 actually sums along the third...
#dimension.
gFEVD=array(0,dim=c(K,K))
for (i in 1:K){
for (k in 1:K){
#Divide by the shock variance times Xi to form the gFEVD.
gFEVD[i,k]=gFEVD_numerator[i,k]/(sigma_eps[k,k]*Xi[i,i])
}
}
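#Note: because the underlying shocks are correlated, the rows of the gFEVD need not...
#sum to one (Pesaran and Shin, 1998).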
##########################################################################################
#OUTPUT.
#Print the jIRF. The response of vector Y at h=0,1,...,H-1 periods ahead due to joint,...
#simultaneous one standard deviation shocks from the variables in setJ. Each row...
#corresponds to a different response variable and each column corresponds to a different...
#number of time steps ahead.
cat("jIRF \n")
## jIRF
print(jIRF)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.2222222 0.3333333 0.3483333 0.3253333 0.2896958 0.2521646
## [2,] 1.0000000 0.6833333 0.5058333 0.3962083 0.3215896 0.2665168
## [3,] 1.0000000 0.6833333 0.5058333 0.3962083 0.3215896 0.2665168
## [4,] 0.1111111 0.2833333 0.3258333 0.3152083 0.2851396 0.2501143
#Print the jFEVD. Element i is the proportion of the H-step forecast error variance of...
#variable i that can be explained by the shocks of the variables in setJ.
cat("\n jFEVD \n")
##
## jFEVD
print(as.matrix(jFEVD))
## [,1]
## [1,] 0.9118513
## [2,] 0.9615088
## [3,] 0.9594943
## [4,] 0.2330353
#Print the gIRF. The response of vector Y at h=0,1,...,H-1 periods ahead due to a one...
#standard deviation shock from variable j. Each row corresponds to a different response...
#variable and each column corresponds to a different number of time steps ahead.
cat("\n gIRF \n")
##
## gIRF
print(gIRF)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.5 0.465 0.41325 0.3593625 0.3091031 0.2643779
## [2,] 1.0 0.690 0.51450 0.4049250 0.3296063 0.2736043
## [3,] 0.8 0.600 0.47400 0.3867000 0.3214050 0.2699138
## [4,] 0.1 0.285 0.33225 0.3229125 0.2927006 0.2569968
#Print the gFEVD. Element i,k is the proportion of the H-step forecast error variance...
#of variable i that can be explained by the shocks of variable k.
cat("\n gFEVD \n")
##
## gFEVD
print(gFEVD)
## [,1] [,2] [,3] [,4]
## [1,] 0.83535648 0.4628898 0.1024116 0.07645277
## [2,] 0.27847135 0.9602257 0.5922684 0.07074939
## [3,] 0.06893124 0.7501965 0.8730821 0.07445214
## [4,] 0.09943611 0.2315267 0.1371134 0.81258855
# Load required libraries.
require(fpp3)
require(tidyquant)
require(TTR)
require(kableExtra)
require(xts)
require(timetk)
require(readxl)
require(readr)
require(patchwork)
require(gridExtra)
Tickers <- c("BAC","BK","BCS","C","CS","DB","GS","JPM","MS","STT","WFC")
Prices <- tq_get(Tickers,get="stock.prices",from="2017-01-01",to="2021-02-04")
# read market capitalization data
mktcap <- read_xlsx("MktCaps.xlsx")
#mktcap <- read_csv("./MktCap/MktCaps.csv") %>% as_tsibble()
# compute market cap weights for the European stocks
Wgts <- mktcap %>%
mutate(wBCS = BCSmc/(BCSmc + CSmc + DBmc)) %>%
mutate(wCS = CSmc/(BCSmc + CSmc + DBmc)) %>%
mutate(wDB = DBmc/(BCSmc + CSmc + DBmc)) %>%
mutate(date = Date) %>% select(-Date)
In this supplement, we summarize how we compute daily stock volatility using High, Low, Open, and Close (HLOC) prices for section 6 of the paper "A Joint Impulse Response Function for Vector Autoregressive Models." In our discussion, we follow the notation of "Drift-Independent Volatility Estimation Based on High, Low, Open, and Close Prices" by Dennis Yang and Qiang Zhang, The Journal of Business, Vol. 73, No. 3 (July 2000), pp. 477-492. To compute the Parkinson and Yang-Zhang volatility measures, we use the volatility function from the TTR package by Joshua Ulrich.
Define the following notation for date \(t\) prices: let \(O_t\), \(H_t\), \(L_t\), and \(C_t\) denote the open, high, low, and close prices, and define the normalized log prices \(o_t = \ln O_t - \ln C_{t-1}\), \(u_t = \ln H_t - \ln O_t\), \(d_t = \ln L_t - \ln O_t\), and \(c_t = \ln C_t - \ln O_t\).
# As a reliability check, this is the method we originally used to compute Vp,
# except that here N=250 days is used to annualize the volatility estimates.
# Our code and Joshua Ulrich's code produce identical results.
N <- 250
VpOld <- Prices %>%
mutate(vol = sqrt(N/(4*log(2))*(log(high) - log(low))^2)) %>%
select(date,symbol,vol) %>%
pivot_wider(names_from = symbol, values_from = vol)
# Compute the Parkinson volatility
# create placeholder dataframe and then replace the columns
# one at a time using the Parkinson method
Vp <- Prices %>%
select(date, symbol, adjusted) %>%
pivot_wider(names_from = symbol, values_from = adjusted)
ndim <- dim(Prices)
nr <- ndim[1] / length(Tickers)
nc <- length(Tickers)
for (i in 1:nc) {
x <- Prices %>% filter(symbol == Tickers[i]) %>% tk_xts(date_var = "date")
v <- volatility(x, n=1, N=250, calc = "parkinson")
Vp[,1+i] <- v
}
# In some instances, such as H = O or L = O, the function
# returns an NA rather than a zero, so I replace those NA's with zeros.
Vp <- Vp %>% mutate(across(where(is.numeric), ~ replace_na(.x, 0)))
# clean up
rm(x,v,i,nr,nc)
An early and commonly used estimator of daily volatility using HLOC data is the method of Parkinson (1980) which assumes an underlying price that follows a geometric Brownian motion process. The time \(t\) Parkinson estimator, denoted as \(V_{p,t}\), is computed as \[\begin{align} V_{p,t} & = \sqrt{\frac{N}{4 n \ln(2)} \sum_{i=0}^{n-1}(u_{t-i} - d_{t-i})^2} \\ &= \sqrt{\frac{N}{4 n \ln(2)} \sum_{i=0}^{n-1}\ln(H_{t-i}/L_{t-i})^2}, \end{align}\] where \(N\) is the number of trading days in the year and \(n\) is the number of trading days over which the volatility is estimated. In our application we use \(n=1\) and \(N=250\) to get annualized daily volatility, which we scale up by 100 so that the units are annualized percentages.
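Since we use \(n=1\), the formula above simplifies to \(V_{p,t} = \sqrt{\frac{N}{4 \ln(2)}} \, \ln(H_t/L_t)\) (recall \(H_t \geq L_t\)), which is exactly the formula implemented in the VpOld code above.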
The Parkinson estimator uses only the daily high and low prices and is valid only if: (1) there are no jumps between the previous day’s closing price and the current day’s opening price, and (2) there is no drift in the underlying geometric Brownian motion price process. If these conditions are violated, the Parkinson estimator will be biased and inefficient.
Rogers and Satchell (1991) propose an estimator that is independent of the drift rate in the underlying process and has a smaller estimator variance than \(V_{p,t}\). The time \(t\) volatility estimator of Rogers and Satchell (1991), \(V_{rs,t}\), can be computed as \[ V_{rs,t} = \sqrt{\frac{N}{n} \sum_{i=0}^{n-1}\left[ u_{t-i}(u_{t-i}- c_{t-i}) + d_{t-i}(d_{t-i} - c_{t-i})\right]}. \] Since \(V_{rs,t}\) does not allow for opening jumps in prices, it will still underestimate the true daily price volatility.
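Although we only use \(V_{rs,t}\) as an input to the Yang-Zhang estimator below, a minimal sketch of a direct daily (\(n=1\)) computation for one ticker, assuming x is an xts object of HLOC prices as built in the loops above, would be:

u <- log(x$high) - log(x$open) # normalized high
d <- log(x$low) - log(x$open) # normalized low
cc <- log(x$close) - log(x$open) # normalized close
Vrs <- sqrt(250 * (u*(u - cc) + d*(d - cc))) # n=1 daily estimate, annualized with N=250

The TTR package also provides this estimator directly via volatility(x, n=1, N=250, calc="rogers.satchell").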
# Compute the Yang-Zhang volatility
# create placeholder dataframe
Vyz <- Prices %>%
select(date, symbol, adjusted) %>%
pivot_wider(names_from = symbol, values_from = adjusted)
ndim <- dim(Prices)
nr <- ndim[1] / length(Tickers)
nc <- length(Tickers)
for (i in 1:nc) {
x <- Prices %>% filter(symbol == Tickers[i]) %>% tk_xts(date_var = "date")
v <- volatility(x, n=2, N=250, calc = "yang.zhang")
Vyz[,1+i] <- v
}
rm(x,v,i,nr,nc)
Yang & Zhang (2000) derive a minimum-variance unbiased estimator of volatility, denoted as \(V_{yz,t}\) for time \(t\), which is independent of both drift and opening jumps: \[ V_{yz,t} = \sqrt{V^2_{o,t} + k V^2_{c,t} + (1-k) V^2_{rs,t}}, \] where \[\begin{align} V_{o,t} &= \sqrt{\frac{N}{n-1} \sum_{i=0}^{n-1}(o_{t-i} - \bar{o})^2}, \\ \bar{o} &= \frac{1}{n} \sum_{i=0}^{n-1}o_{t-i}, \\ V_{c,t} &= \sqrt{\frac{N}{n-1} \sum_{i=0}^{n-1}(c_{t-i}- \bar{c})^2}, \\ \bar{c} &= \frac{1}{n} \sum_{i=0}^{n-1}c_{t-i}, \end{align}\] and \(V_{rs,t}\) is the Rogers & Satchell estimator defined above. The constant \(k\) is chosen to minimize the variance of the estimator and depends on \(n\); for \(n=2\), \(k = 0.07834101\).
Note the use of \(o_t = \ln O_t - \ln C_{t-1}\) in the expression for \(V_{o,t}\) requires that \(n > 1\). In our application, we use \(n=2\) so that the first available volatility is for period \(t=3\).
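For reference, the constant in the Yang-Zhang estimator is \(k = \frac{0.34}{1.34 + (n+1)/(n-1)}\), which the TTR volatility function computes internally; a minimal sketch of the arithmetic is:

n <- 2
k <- 0.34/(1.34 + (n + 1)/(n - 1)) # equals 0.07834101 when n = 2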
# Add the PCA and cap weighted index of Euro vols for the Parkinson method
# compute the PCA of the Euro vols
PCAp <- Vp %>% select(BCS, CS, DB) %>%
prcomp(retx = TRUE, center = TRUE, scale. = TRUE)
# summary(PCAp)
EPCAp <- PCAp$x[, 1] / 100
# compute the cap weighted index of Euro vols
EINDa <- Vp %>%
select(date,BCS, CS, DB) %>%
left_join(Wgts, by = "date") %>%
mutate(EINDa = BCS*wBCS + CS*wCS + DB*wDB) %>%
select(date, EINDa)
# append to Vp matrix
Vp <- mutate(Vp, EPCA = EPCAp)
Vp <- left_join(Vp,EINDa, by = "date")
# Add the PCA and cap weighted index of Euro vols for the Y-Z method
# compute the PCA of the Euro vols
PCAyz <- Vyz %>% select(BCS, CS, DB) %>%
#drop_na() %>%
replace(is.na(.),0) %>%
prcomp(retx = TRUE, center = TRUE, scale. = TRUE)
# summary(PCAyz)
#EPCAyz <- c(NA,NA,(PCAyz$x[, 1] / 100))
EPCAyz <- c((PCAyz$x[, 1] / 100))
# compute the cap weighted index of Euro vols
EINDa <- Vyz %>%
select(date,BCS, CS, DB) %>%
#drop_na() %>%
left_join(Wgts, by = "date") %>%
mutate(EINDa = BCS*wBCS + CS*wCS + DB*wDB) %>%
select(date, EINDa)
# append to Vyz matrix
Vyz <- mutate(Vyz, EPCA = EPCAyz)
Vyz <- left_join(Vyz,EINDa, by = "date")
# Add the cap weighted BCS prices to Vp and Vyz
Pbcs <- Prices %>%
filter(symbol == "BCS") %>%
select(-c(volume, adjusted)) %>%
mutate(wght = Wgts$wBCS) %>%
mutate(across(3:6, ~ .x * wght)) %>%
select(-c(wght,symbol)) %>%
rename(bcs_open = open, bcs_high = high, bcs_low = low, bcs_close = close)
# cap weighted CS prices
Pcs <- Prices %>%
filter(symbol == "CS") %>%
select(-c(volume, adjusted)) %>%
mutate(wght = Wgts$wCS) %>%
mutate(across(3:6, ~ .x * wght)) %>%
select(-c(wght,symbol)) %>%
rename(cs_open = open, cs_high = high, cs_low = low, cs_close = close)
# cap weighted DB prices
Pdb <- Prices %>%
filter(symbol == "DB") %>%
select(-c(volume, adjusted)) %>%
mutate(wght = Wgts$wDB) %>%
mutate(across(3:6, ~ .x * wght)) %>%
select(-c(wght,symbol)) %>%
rename(db_open = open, db_high = high, db_low = low, db_close = close)
# Euro Index of cap weighted HLOC prices
Peuro <- left_join(Pbcs, Pcs, by = "date") %>% left_join(Pdb, by = "date") %>%
mutate(open = bcs_open + cs_open + db_open) %>%
mutate(high = bcs_high + cs_high + db_high) %>%
mutate(low = bcs_low + cs_low + db_low) %>%
mutate(close = bcs_close + cs_close + db_close) %>%
select(date, open, high, low, close) %>%
tk_xts(date_var = "date")
# compute Parkinson volatility and add to tibble
vp <- as.data.frame(volatility(Peuro, n=1, N=250, calc = "parkinson"))
Vp <- Vp %>% mutate(EINDb = vp[,1])
# compute Yang-Zhang volatility and add to tibble
vyz <- as.data.frame(volatility(Peuro, n=2, N=250, calc = "yang.zhang"))
Vyz <- Vyz %>% mutate(EINDb = vyz[,1])
In the application, we use the stock market volatility of the 11 systemically important financial institutions as listed by the Fed's Large Institution Supervision Coordinating Committee: BAC, BK, C, GS, JPM, MS, STT, WFC, BCS, CS, DB. The first eight banks are in the USA and the last three are in Europe.
To verify the efficacy of the jIRF approach, we replace the three individual European banks (BCS, CS, and DB) with three possible European summary variables: (1) a principal component, (2) a market capitalization weighted average of the volatilities of the European banks, and (3) the volatility of the market capitalization weighted average of the HLOC prices of the European banks.
We first compute the volatility of each of the three European banks and then construct the first principal component from those three vectors. We center the data to have mean zero and scale the data to have unit standard error. This will scale the volatility measure but will not affect the results of our VAR analysis.
For the Parkinson method, the first component explains about 87 percent of the variance of the European banks:
summary(PCAp)$importance %>%
kable(format = "html", table.attr = "style='width:30%;' ") %>%
kableExtra::kable_styling()
 | PC1 | PC2 | PC3 |
---|---|---|---|
Standard deviation | 1.619968 | 0.505163 | 0.347152 |
Proportion of Variance | 0.874770 | 0.085060 | 0.040170 |
Cumulative Proportion | 0.874770 | 0.959830 | 1.000000 |
For the Yang-Zhang method, the first component explains about 84 percent of the variance of the European banks:
summary(PCAyz)$importance %>%
kable(format = "html", table.attr = "style='width:30%;' ") %>%
kableExtra::kable_styling()
 | PC1 | PC2 | PC3 |
---|---|---|---|
Standard deviation | 1.591577 | 0.5364415 | 0.423216 |
Proportion of Variance | 0.844370 | 0.0959200 | 0.059700 |
Cumulative Proportion | 0.844370 | 0.9403000 | 1.000000 |
These PCA volatility summary measures are denoted EPCA in the volatility plots and tables.

There are two alternatives for using market capitalization weights to construct a European volatility index: (1) a market capitalization weighted average of the computed volatilities of the European banks (EINDa), or (2) the computed volatility of the market capitalization weighted HLOC prices of the European banks (EINDb).
The market capitalization weights for BCS, CS, and DB are shown below.
Wgts_l <- as_tibble(Wgts) %>% select(-c(BCSmc,CSmc,DBmc)) %>%
pivot_longer(cols = 1:3, names_to = "symbol", values_to = "wghts")
Wgts_l %>%
ggplot(aes(x = date, y = wghts)) +
geom_line() +
facet_wrap(nrow = 3, vars(symbol),scales = "free_y") +
labs(title = "Market Capitalization Weights") +
ylab(" ") +
xlab(" ")
The time \(t\) market capitalization weighted averages of the Parkinson and Yang-Zhang measures of volatility for the three European banks, BCS, CS, and DB, are computed as
\[ EINDa_t = w_{BCS,t} V_{BCS,t} + w_{CS,t} V_{CS,t} + w_{DB,t} V_{DB,t} \] where \(V_{BCS,t}\), \(V_{CS,t}\), and \(V_{DB,t}\) are the daily volatilities of Barclays, Credit Suisse, and Deutsche Bank, respectively, using either the Parkinson or the Yang-Zhang estimator. The market capitalization weights at time \(t\) are defined as \[ w_{BCS,t}=\frac{MC_{BCS,t}}{MC_{BCS,t}+MC_{CS,t}+MC_{DB,t}}, \] \[ w_{CS,t}=\frac{MC_{CS,t}}{MC_{BCS,t}+MC_{CS,t}+MC_{DB,t}}, \] and \[ w_{DB,t}=\frac{MC_{DB,t}}{MC_{BCS,t}+MC_{CS,t}+MC_{DB,t}}, \] where \(MC_{BCS,t}\), \(MC_{CS,t}\), and \(MC_{DB,t}\) are the time \(t\) market capitalizations of Barclays, Credit Suisse, and Deutsche Bank, respectively.
Alternatively, we first compute the market capitalization weighted values of the HLOC prices of the three European banks and then compute the Parkinson and Yang-Zhang volatility measures using these weighted HLOC prices. We denote these volatility measures as EINDb.
We now have two dataframes that contain the computed daily volatility for each of the 11 banks, the European volatility summary variable using the PCA approach (EPCA), the European index using the market capitalization weighted average of the volatilities of the 3 European banks (EINDa), and the European index using the computed volatility of the market capitalization weighted HLOC prices of the 3 European banks (EINDb). The dataframe Vp uses the Parkinson method and the dataframe Vyz uses the Yang-Zhang method.

Below is a plot of all of these computed volatilities for Vp, the Parkinson method:
Vp_l <- Vp %>%
pivot_longer(cols = 2:15, names_to = "symbol", values_to = "vol")
Vp_lreorder <- Vp_l
Vp_lreorder$symbol <- factor(Vp_lreorder$symbol, levels = c(Tickers,"EPCA","EINDa","EINDb"))
Vp_lreorder %>%
ggplot(aes(x = date, y = vol)) +
geom_line() +
facet_wrap(nrow = 7, vars(symbol),scales = "free_y") +
ylab("Parkinson Volatility") +
xlab("")
Below is a plot of all of these computed volatilities for Vyz, the Yang-Zhang method:
Vyz_l <- Vyz %>%
pivot_longer(cols = 2:15, names_to = "symbol", values_to = "vol")
Vyz_lreorder <- Vyz_l
Vyz_lreorder$symbol <- factor(Vyz_lreorder$symbol, levels = c(Tickers,"EPCA","EINDa","EINDb"))
Vyz_lreorder %>%
ggplot(aes(x = date, y = vol)) +
geom_line() +
facet_wrap(nrow = 7, vars(symbol),scales = "free_y") +
ylab("Yang-Zhang Volatility") +
xlab("")
Next we plot Vp and Vyz together on the same graphs.
ggplot(NULL, aes(x = date, y = vol)) +
geom_line(data = Vp_l, color = "blue", alpha = .5) +
facet_wrap(nrow = 6, vars(symbol),scales = "free_y") +
geom_line(data = Vyz_l, color = "red", alpha = .5) +
labs(title = "Volatility: Parkinson (blue) and Yang-Zhang (red)")
To get an easier comparison, we zoom in on the plot of the Parkinson
and Yang-Zhang volatilities for bank BAC
.
# Choose one: "BAC" "BK" "C" "GS" "JPM" "MS" "STT" "WFC" "BCS" "CS" "DB" "EPCA" "EINDa" "EINDb"
ggplot(NULL, aes(x = date, y = BAC)) +
geom_line(data = Vp, color = "blue", alpha = .5) +
geom_line(data = Vyz, color = "red", alpha = .5) +
labs(title = "Volatility: Parkinson (blue) and Yang-Zhang (red)")
Next, we compare summary statistics for the Parkinson and Yang-Zhang computations.
Vp_l %>% drop_na() %>%
group_by(symbol) %>%
summarise(
count = n(),
mean = mean(vol),
sd = sd(vol),
min = min(vol),
max = max(vol)
) %>%
kable(format = "html",
table.attr = "style='width:30%;' ",
caption = "Summary statistics of volatility using the Parkinson method") %>%
kableExtra::kable_styling()
symbol | count | mean | sd | min | max |
---|---|---|---|---|---|
BAC | 1029 | 0.2099245 | 0.1422818 | 0.0416125 | 1.2835800 |
BCS | 1029 | 0.1631068 | 0.1220365 | 0.0376814 | 1.5862968 |
BK | 1029 | 0.1940706 | 0.1275583 | 0.0388080 | 1.1663046 |
C | 1029 | 0.2204525 | 0.1655847 | 0.0417786 | 1.7774351 |
CS | 1029 | 0.1491514 | 0.1077178 | 0.0251731 | 1.1815377 |
DB | 1029 | 0.1867867 | 0.1182341 | 0.0000000 | 1.2298396 |
EINDa | 1029 | 0.1632338 | 0.1083795 | 0.0437402 | 1.2039462 |
EINDb | 1029 | 0.1618716 | 0.1070742 | 0.0451941 | 1.1496418 |
EPCA | 1029 | 0.0000000 | 0.0161997 | -0.0182360 | 0.1482884 |
GS | 1029 | 0.2053800 | 0.1365660 | 0.0492078 | 1.4540587 |
JPM | 1029 | 0.1837965 | 0.1341861 | 0.0394090 | 1.2909421 |
MS | 1029 | 0.2212518 | 0.1509439 | 0.0552290 | 1.7892787 |
STT | 1029 | 0.2286232 | 0.1578762 | 0.0437279 | 1.6652573 |
WFC | 1029 | 0.2102233 | 0.1517392 | 0.0396760 | 1.5933477 |
Vyz_l %>% drop_na() %>%
group_by(symbol) %>%
summarise(
count = n(),
mean = mean(vol),
sd = sd(vol),
min = min(vol),
max = max(vol)
) %>%
kable(format = "html",
table.attr = "style='width:30%;' ",
caption = "Summary statistics of volatility using the Yang-Zhang method") %>%
kableExtra::kable_styling()
symbol | count | mean | sd | min | max |
---|---|---|---|---|---|
BAC | 1027 | 0.2798882 | 0.2400234 | 0.0594519 | 3.268534 |
BCS | 1027 | 0.2979218 | 0.2781389 | 0.0576923 | 4.029446 |
BK | 1027 | 0.2504915 | 0.2107470 | 0.0542159 | 2.952009 |
C | 1027 | 0.2929773 | 0.2722538 | 0.0628614 | 3.981445 |
CS | 1026 | 0.2598522 | 0.2323250 | 0.0578035 | 3.591341 |
DB | 1027 | 0.3324001 | 0.2536433 | 0.0581909 | 3.350463 |
EINDa | 1026 | 0.2913880 | 0.2349500 | 0.0853008 | 3.706538 |
EINDb | 1027 | 0.2689574 | 0.2287443 | 0.0634655 | 3.636745 |
EPCA | 1029 | 0.0000000 | 0.0159158 | -0.0201453 | 0.229071 |
GS | 1027 | 0.2594637 | 0.2030293 | 0.0617784 | 3.136767 |
JPM | 1027 | 0.2453828 | 0.2266704 | 0.0533454 | 3.391097 |
MS | 1027 | 0.2905240 | 0.2376557 | 0.0647197 | 3.457622 |
STT | 1027 | 0.2939690 | 0.2458923 | 0.0775201 | 3.527717 |
WFC | 1027 | 0.2713791 | 0.2415344 | 0.0645863 | 3.310084 |
Compare the European Index volatility measures using the weighted volatilities and the volatility of the weighted HLOC prices. First, using the Parkinson method.
ggplot(Vp, aes(x = date)) +
geom_line(aes(y = EINDa), color = "blue", alpha = .5) +
geom_line(aes(y = EINDb), color = "red", alpha = .5) +
labs(title = "Parkinson European Index Volatility", subtitle = "weighted volatilities (red) and volatility of weighted HLOC (blue)") + xlab(" ") + ylab("")
Next, using the Yang-Zhang method.
ggplot(Vyz, aes(x = date)) +
geom_line(aes(y = EINDa), color = "blue", alpha = .5) +
geom_line(aes(y = EINDb), color = "red", alpha = .5) +
labs(title = "Yang-Zhang European Index Volatility", subtitle = "weighted volatilities (red) and volatility of weighted HLOC (blue)") + xlab(" ") + ylab("")
Tickers <- c("BAC","BK","BCS","C","CS","DB","GS","JPM","MS","STT","WFC")
Vyz_l <- Vyz %>%
pivot_longer(cols = 2:12, names_to = "symbol", values_to = "vol")
Vyz_lreorder <- Vyz_l
Vyz_lreorder$symbol <- factor(Vyz_lreorder$symbol, levels = c(Tickers))
Vyz_lreorder %>%
ggplot(aes(x = date, y = vol)) +
geom_line() +
geom_vline(xintercept=Vyz$date[776], linetype='dotted', color='black', size=.5) +
facet_wrap(nrow = 6, vars(symbol),scales = "free_y",strip.position = "right") +
ylab("Yang-Zhang Volatility") +
xlab("")
p1 <- Vyz %>%
ggplot() +
geom_line(aes(date,BAC*100)) + ylim(0,400) +
labs(x = NULL, y = "BAC") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p2 <- Vyz %>%
ggplot() +
geom_line(aes(date,BK*100)) + ylim(0,400) +
labs(x = NULL, y = "BK") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p3 <- Vyz %>%
ggplot() +
geom_line(aes(date,BCS*100)) + ylim(0,410) +
labs(x = NULL, y = "BCS") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p4 <- Vyz %>%
ggplot() +
geom_line(aes(date,C*100)) + ylim(0,400) +
labs(x = NULL, y = "C") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p5 <- Vyz %>%
ggplot() +
geom_line(aes(date,CS*100)) + ylim(0,400) +
labs(x = NULL, y = "CS") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p6 <- Vyz %>%
ggplot() +
geom_line(aes(date,DB*100)) + ylim(0,400) +
labs(x = NULL, y = "DB") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p7 <- Vyz %>%
ggplot() +
geom_line(aes(date,GS*100)) + ylim(0,400) +
labs(x = NULL, y = "GS") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p8 <- Vyz %>%
ggplot() +
geom_line(aes(date,JPM*100)) + ylim(0,400) +
labs(x = NULL, y = "JPM") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p9 <- Vyz %>%
ggplot() +
geom_line(aes(date,MS*100)) + ylim(0,400) +
labs(x = NULL, y = "MS") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p10 <- Vyz %>%
ggplot() +
geom_line(aes(date,STT*100)) + ylim(0,400) +
labs(x = NULL, y = "STT") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
p11 <- Vyz %>%
ggplot() +
geom_line(aes(date,WFC*100)) + ylim(0,400) +
labs(x = NULL, y = "WFC") +
geom_vline(xintercept = Vyz$date[776], linetype="dashed")
g1 <- p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + p10 + p11 + plot_layout(ncol=2,byrow=TRUE)
g1
Tp <- Vp_l %>% drop_na() %>%
group_by(symbol) %>%
summarise(
mean = 100*mean(vol),
sd = 100*sd(vol),
)
Tyz <- Vyz_l %>% drop_na() %>%
group_by(symbol) %>%
summarise(
mean = 100*mean(vol),
sd = 100*sd(vol),
)
Tpyz <- cbind(Tp[c(1:6,10:14),c(1,2,3)],Tyz[c(1:6,10:14),c(2,3)])
# %>%
# kable(format = "html",
# table.attr = "style='width:30%;' ",
# caption = "Summary statistics of volatility using the Yang-Zhang method") %>%
# kableExtra::kable_styling()
# Choose one: "BAC" "BK" "C" "GS" "JPM" "MS" "STT" "WFC" "BCS" "CS" "DB" "EPCA" "EINDa" "EINDb"
ggplot(NULL, aes(x = date, y = 100*BAC)) +
geom_line(data = Vp, color = "blue", alpha = .5) +
geom_line(data = Vyz, color = "red", alpha = .5) +
labs(title = "Volatility: Parkinson (blue) and Yang-Zhang (red)") +
ylab("BAC") + xlab("")
ggplot(Vyz, aes(x = date)) +
geom_line(aes(y = 100*EINDa), color = "blue", alpha = .5) +
geom_line(aes(y = 100*EINDb), color = "red", alpha = .5) +
labs(title = "Yang-Zhang European Index Volatility", subtitle = "weighted volatilities (red) and volatility of weighted HLOC (blue)") + xlab(" ") + ylab("")
pEPCAyz <- ggplot(data=Vyz) + geom_line(aes(date,EPCA))
pEINDayz <- ggplot(data=Vyz) + geom_line(aes(date,100*EINDa)) + ylab("EINDa")
pEINDbyz <- ggplot(data=Vyz) + geom_line(aes(date,100*EINDb)) + ylab("EINDb")
grid.arrange(pEPCAyz,pEINDayz,pEINDbyz,nrow=3,
top="Summary Variables for Volatilty of European Banks")