Following the notation we used without weights, Bell and McCaffrey (2002) and Pustejovsky and Tipton (2018) suggest

$$ v = s C^\top (X^\top W X)^{-1} \sum_{i}{X_i^\top W_i A_i \epsilon_i \epsilon_i^\top A_i W_i X_i} (X^\top W X)^{-1} C $$

where $A_i$ takes $I_i$, $(I_i - H_{ii})^{-\frac{1}{2}}$, or $(I_i - H_{ii})^{-1}$ as before, but $H$ becomes

$$ H = X(X^\top W X)^{-1} X^\top W $$
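As a quick numeric sanity check (a sketch with hypothetical toy data, not the package's code), the weighted hat matrix defined above is an oblique projection: it is idempotent and reproduces $X$.

```python
import numpy as np

# Toy design and diagonal weights (hypothetical data for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))
W = np.diag(rng.uniform(0.5, 2.0, size=6))

# Weighted hat matrix H = X (X' W X)^{-1} X' W
H = X @ np.linalg.solve(X.T @ W @ X, X.T @ W)

# H is idempotent (H H = H) and leaves the design invariant (H X = X).
print(np.allclose(H @ H, H), np.allclose(H @ X, X))
```

Note that $H$ is not symmetric here (unlike the unweighted hat matrix), which is why the per-subject blocks $H_{ii}$ are handled through the Cholesky factor below.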
For the degrees of freedom, we have

$$ G_{ij} = g_i^\top \Phi g_j $$

where

$$ g_i = s^{\frac{1}{2}} (I - H)_i^\top A_i W_i X_i (X^\top W X)^{-1} C $$
Comparing the previous section with our implementation, we can identify the differences. Since the two use nearly the same symbols, to differentiate them we use subscript $1$ to denote the implementation suggested by Bell and McCaffrey (2002) and Pustejovsky and Tipton (2018), and subscript $2$ to denote our implementation of the covariance estimator in `mmrm`. We have

$$ v_1 = s C^\top (X^\top W X)^{-1} \sum_{i}{X_i^\top W_i A_{1, i} \epsilon_i \epsilon_i^\top A_{1, i} W_i X_i} (X^\top W X)^{-1} C $$

$$ v_2 = s C^\top (X^\top W X)^{-1} \sum_{i}{X_i^\top L_i A_{2, i} L_i^\top \epsilon_i \epsilon_i^\top L_i A_{2, i} L_i^\top X_i} (X^\top W X)^{-1} C $$
Here we will prove that they are identical.

First, we assume that all $A_i$ matrices, in any of the forms above, are positive-definite. Comparing $v_1$ and $v_2$, we see that the parts that differ are $M_{1, d, i} = W_i A_{1, i}$ and $M_{2, d, i} = L_i A_{2, i} L_i^\top$.

Substituting $H_1$ and $H_2$ with their expressions, we have

$$ M_{1, d, i} = W_i (I_i - X_i (X^\top W X)^{-1} X_i^\top W_i)^d $$

$$ M_{2, d, i} = L_i (I_i - L_i^\top X_i (X^\top W X)^{-1} X_i^\top L_i)^d L_i^\top $$

where $d$ takes the values $0$, $-1/2$, or $-1$.
If $d = 0$, the two are obviously identical because $W_i = L_i L_i^\top$.
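The $d = 0$ case can be seen numerically; the sketch below (toy data, hypothetical names) builds a positive-definite $W_i$, takes its Cholesky factor $L_i$, and checks that $W_i I_i = L_i I_i L_i^\top$.

```python
import numpy as np

# Hypothetical positive-definite weight block W_i and its Cholesky factor L_i.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
W_i = A @ A.T + 4 * np.eye(4)
L_i = np.linalg.cholesky(W_i)      # lower triangular, W_i = L_i L_i^T

I_i = np.eye(4)
M1 = W_i @ I_i                     # M_{1,0,i}: A_{1,i} = I_i
M2 = L_i @ I_i @ L_i.T             # M_{2,0,i}: A_{2,i} = I_i
print(np.allclose(M1, M2))
```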
When $d = -1$, we have
$$ M_{2, -1, i} = L_i (I_i - L_i^\top X_i (X^\top W X)^{-1} X_i^\top L_i)^{-1} L_i^\top \\ = (L_i^{-1})^{-1} (I_i - L_i^\top X_i (X^\top W X)^{-1} X_i^\top L_i)^{-1} ((L_i^\top)^{-1})^{-1} \\ = [((L_i^\top)^{-1})(I_i - L_i^\top X_i (X^\top W X)^{-1} X_i^\top L_i)(L_i^{-1})]^{-1} \\ = [(L_i^\top)^{-1}L_i^{-1} - X_i (X^\top W X)^{-1} X_i^\top]^{-1} \\ = (W_i^{-1} - X_i (X^\top W X)^{-1} X_i^\top)^{-1} $$
$$ M_{1, -1, i} = W_i (I_i - X_i (X^\top W X)^{-1} X_i^\top W_i)^{-1} \\ = (W_i^{-1})^{-1} (I_i - X_i (X^\top W X)^{-1} X_i^\top W_i)^{-1} \\ = [(I_i - X_i (X^\top W X)^{-1} X_i^\top W_i) W_i^{-1}]^{-1} \\ = (W_i^{-1} - X_i (X^\top W X)^{-1} X_i^\top)^{-1} $$
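A numeric sketch of the $d = -1$ case (toy data; all names are hypothetical): both middle matrices reduce to the common form $(W_i^{-1} - X_i (X^\top W X)^{-1} X_i^\top)^{-1}$.

```python
import numpy as np

# Toy full design and weights (hypothetical data for illustration only).
rng = np.random.default_rng(2)
n_i, p = 4, 2
X_full = rng.standard_normal((10, p))
W_full = np.diag(rng.uniform(0.5, 2.0, 10))
G = np.linalg.inv(X_full.T @ W_full @ X_full)      # (X' W X)^{-1}

X_i, W_i = X_full[:n_i], W_full[:n_i, :n_i]        # subject i's block
L_i = np.linalg.cholesky(W_i)
I_i = np.eye(n_i)

M1 = W_i @ np.linalg.inv(I_i - X_i @ G @ X_i.T @ W_i)                  # M_{1,-1,i}
M2 = L_i @ np.linalg.inv(I_i - L_i.T @ X_i @ G @ X_i.T @ L_i) @ L_i.T  # M_{2,-1,i}
M3 = np.linalg.inv(np.linalg.inv(W_i) - X_i @ G @ X_i.T)               # common form
print(np.allclose(M1, M2), np.allclose(M1, M3))
```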
Obviously $M_{2, -1, i} = M_{1, -1, i}$. Using the notation

$$ M_{2, -1, i} = L_i B_{2, i} L_i^\top $$

$$ M_{1, -1, i} = W_i B_{1, i} $$

we have

$$ B_{1, i} = W_i^{-1} L_i B_{2, i} L_i^\top \\ = (L_i^\top)^{-1} B_{2, i} L_i^\top $$
When $d = -1/2$, we have the following
$$ M_{2, -1/2, i} = L_i (I_i - L_i^\top X_i (X^\top W X)^{-1} X_i^\top L_i)^{-1/2} L_i^\top \\ = L_i B_{2, i}^{1/2} L_i^\top $$
$$ M_{1, -1/2, i} = W_i (I_i - X_i (X^\top W X)^{-1} X_i^\top W_i)^{-1/2} \\ = W_i B_{1, i}^{1/2} $$
Now, if $B_{1, i}^{1/2} \neq (L_i^\top)^{-1} B_{2, i}^{1/2} L_i^\top$, we would also have

$$ B_{1, i}^{1/2} B_{1, i}^{1/2} \neq (L_i^\top)^{-1} B_{2, i}^{1/2} L_i^\top (L_i^\top)^{-1} B_{2, i}^{1/2} L_i^\top $$

leading to

$$ B_{1, i} \neq (L_i^\top)^{-1} B_{2, i} L_i^\top $$

which contradicts our previous result. Thus, these covariance estimators are identical.
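As a check on the $d = -1/2$ case, the sketch below (toy data, hypothetical names) builds the symmetric factor $B_{2, i}^{1/2}$ by eigendecomposition and verifies both that $(L_i^\top)^{-1} B_{2, i}^{1/2} L_i^\top$ squares to $B_{1, i}$ and that $M_{1, -1/2, i} = M_{2, -1/2, i}$.

```python
import numpy as np

# Toy full design with several "subjects" (hypothetical data).
rng = np.random.default_rng(3)
n_i, p = 4, 2
X_full = rng.standard_normal((12, p))
W_full = np.diag(rng.uniform(0.5, 2.0, 12))
G = np.linalg.inv(X_full.T @ W_full @ X_full)      # (X' W X)^{-1}

X_i, W_i = X_full[:n_i], W_full[:n_i, :n_i]        # subject i's block
L_i = np.linalg.cholesky(W_i)
I_i = np.eye(n_i)

def sym_inv_sqrt(S):
    """S^{-1/2} for symmetric positive-definite S via eigendecomposition."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

B2_sqrt = sym_inv_sqrt(I_i - L_i.T @ X_i @ G @ X_i.T @ L_i)   # B_{2,i}^{1/2}
B1_sqrt = np.linalg.solve(L_i.T, B2_sqrt) @ L_i.T             # (L_i')^{-1} B2^{1/2} L_i'

B1 = np.linalg.inv(I_i - X_i @ G @ X_i.T @ W_i)               # B_{1,i}
M1 = W_i @ B1_sqrt                                            # M_{1,-1/2,i}
M2 = L_i @ B2_sqrt @ L_i.T                                    # M_{2,-1/2,i}
print(np.allclose(B1_sqrt @ B1_sqrt, B1), np.allclose(M1, M2))
```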
To prove that $G_{1, ij} = g_{1, i}^\top \Phi g_{1, j}$ and $G_{2, ij} = g_{2, i}^\top g_{2, j}$ are identical, we only need to prove that

$$ L^{-1} g_{1, i} = g_{2, i} $$

where $\Phi = W^{-1}$ according to our previous expression.

We first expand $L^{-1} g_{1, i}$ and $g_{2, i}$:

$$ L^{-1} g_{1, i} = L^{-1} (I - X(X^\top W X)^{-1} X^\top W) S_i^\top A_{1, i} W_i X_i (X^\top W X)^{-1} C $$

$$ g_{2, i} = (I - L^\top X(X^\top W X)^{-1} X^\top L) S_i^\top A_{2, i} L_i^\top X_i (X^\top W X)^{-1} C $$

where $S_i$ is the row-selection matrix for subject $i$.
We will prove that the inner parts are equal, i.e.

$$ L^{-1} (I - X(X^\top W X)^{-1} X^\top W) S_i^\top A_{1, i} W_i = (I - L^\top X(X^\top W X)^{-1} X^\top L) S_i^\top A_{2, i} L_i^\top $$

From the previous proof for the covariance estimators, we already have

$$ M_{1, d, i} = W_i A_{1, i} = L_i A_{2, i} L_i^\top = M_{2, d, i} $$

and since the relationship between $(I - X(X^\top W X)^{-1} X^\top W)$ and $(I - L^\top X(X^\top W X)^{-1} X^\top L)$ has already been established in the covariance estimator section, we only need to prove that

$$ L^{-1} (I - X(X^\top W X)^{-1} X^\top W) S_i^\top = (I - L^\top X(X^\top W X)^{-1} X^\top L) S_i^\top L_i^{-1} $$
Expanding both sides, we have

$$ L^{-1} (I - X(X^\top W X)^{-1} X^\top W) S_i^\top = L^{-1} S_i^\top - L^{-1} X (X^\top W X)^{-1} X_i^\top W_i $$

$$ (I - L^\top X(X^\top W X)^{-1} X^\top L) S_i^\top L_i^{-1} = S_i^\top L_i^{-1} - L^\top X (X^\top W X)^{-1} X_i^\top $$

Since $L$ is block-diagonal, obviously $L^{-1} S_i^\top = S_i^\top L_i^{-1}$, and

$$ L^{-1} X (X^\top W X)^{-1} X_i^\top W_i = L^\top X (X^\top W X)^{-1} X_i^\top $$

because of the following:

$$ (X(X^\top W X)^{-1} X_i^\top W_i)_{i} = X_i (X^\top W X)^{-1} X_i^\top W_i \\ = W_i X_i (X^\top W X)^{-1} X_i^\top \\ = (W X(X^\top W X)^{-1} X_i^\top)_{i} $$
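The row-selection identity $L^{-1} S_i^\top = S_i^\top L_i^{-1}$ used above can be checked numerically for a block-diagonal $L$ (toy block sizes; all names hypothetical):

```python
import numpy as np

# Toy block-diagonal L built from per-subject Cholesky factors.
rng = np.random.default_rng(4)
sizes = [3, 2, 4]
blocks = []
for m in sizes:
    A = rng.standard_normal((m, m))
    blocks.append(np.linalg.cholesky(A @ A.T + m * np.eye(m)))

n = sum(sizes)
L = np.zeros((n, n))
offsets = np.cumsum([0] + sizes)
for k, B in enumerate(blocks):
    L[offsets[k]:offsets[k + 1], offsets[k]:offsets[k + 1]] = B

# Row-selection matrix S_i for subject i = 1 (0-based): S_i @ y picks its rows.
i = 1
m_i = sizes[i]
S_i = np.zeros((m_i, n))
S_i[np.arange(m_i), offsets[i] + np.arange(m_i)] = 1.0
L_i = blocks[i]

print(np.allclose(np.linalg.inv(L) @ S_i.T, S_i.T @ np.linalg.inv(L_i)))
```

The identity holds because inverting a block-diagonal matrix inverts each block in place, and $S_i^\top$ only picks out the columns of subject $i$'s block.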
The empirical covariance matrix involves the inverse of a matrix, or the symmetric square root of a matrix. Calculating this usually requires that the matrix is positive-definite. However, Young (2016) suggests that this is not always guaranteed in practice.
Thus, following Pustejovsky and Tipton (2018), we use the pseudo inverse to avoid this. We follow the following logic (see the corresponding C++ function `pseudoInverseSqrt`) to obtain the pseudo inverse:

1. Conduct the singular value decomposition.
2. Use `cpow` to obtain the square root of the reciprocals of the singular values, if the value is larger than a computational threshold; otherwise replace the value with 0.
3. Reconstruct the matrix from the modified singular values.

In the Eigen package, the pseudo inverse method is already implemented in `Eigen::CompleteOrthogonalDecomposition< MatrixType_ >::pseudoInverse`, but it is not used here because it can produce `NAN` in calculations.
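The logic described above can be sketched in NumPy (this only mirrors the idea; the package's actual `pseudoInverseSqrt` is implemented in C++ with Eigen, and the threshold form used here is an assumption):

```python
import numpy as np

def pseudo_inverse_sqrt(S, rtol=1e-10):
    """Square root of the pseudo inverse of a symmetric matrix S via SVD."""
    U, d, Vt = np.linalg.svd(S)
    keep = d > rtol * d.max()                 # computational threshold (assumed form)
    d_inv_sqrt = np.zeros_like(d)
    d_inv_sqrt[keep] = 1.0 / np.sqrt(d[keep])  # sqrt of reciprocals, else 0
    return U @ np.diag(d_inv_sqrt) @ Vt

# Rank-deficient symmetric example: the result squares to the pseudo inverse,
# with no NAN produced for the zero singular value.
S = np.array([[2.0, 0.0], [0.0, 0.0]])
P = pseudo_inverse_sqrt(S)
print(np.allclose(P @ P, np.linalg.pinv(S)))
```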