In mathematics , particularly in linear algebra , the Schur product theorem states that the Hadamard product of two positive definite matrices is also a positive definite matrix. The result is named after Issai Schur [1] (Schur 1911, p. 14, Theorem VII) (note that Schur signed as J. Schur in Journal für die reine und angewandte Mathematik .[2] [3] )
We remark that the converse of the theorem holds in the following sense. If M {\displaystyle M} is a symmetric matrix and the Hadamard product M ∘ N {\displaystyle M\circ N} is positive definite for all positive definite matrices N {\displaystyle N} , then M {\displaystyle M} itself is positive definite.
Proof using the trace formula [ edit ] For any matrices M {\displaystyle M} and N {\displaystyle N} , the Hadamard product M ∘ N {\displaystyle M\circ N} considered as a bilinear form acts on vectors a , b {\displaystyle a,b} as
a ∗ ( M ∘ N ) b = tr ( M T diag ( a ∗ ) N diag ( b ) ) {\displaystyle a^{*}(M\circ N)b=\operatorname {tr} \left(M^{\textsf {T}}\operatorname {diag} \left(a^{*}\right)N\operatorname {diag} (b)\right)} where tr {\displaystyle \operatorname {tr} } is the matrix trace and diag ( a ) {\displaystyle \operatorname {diag} (a)} is the diagonal matrix having as diagonal entries the elements of a {\displaystyle a} .
Suppose M {\displaystyle M} and N {\displaystyle N} are positive definite, and so Hermitian . We can consider their square-roots M 1 2 {\displaystyle M^{\frac {1}{2}}} and N 1 2 {\displaystyle N^{\frac {1}{2}}} , which are also Hermitian, and write
tr ( M T diag ( a ∗ ) N diag ( b ) ) = tr ( M ¯ 1 2 M ¯ 1 2 diag ( a ∗ ) N 1 2 N 1 2 diag ( b ) ) = tr ( M ¯ 1 2 diag ( a ∗ ) N 1 2 N 1 2 diag ( b ) M ¯ 1 2 ) {\displaystyle \operatorname {tr} \left(M^{\textsf {T}}\operatorname {diag} \left(a^{*}\right)N\operatorname {diag} (b)\right)=\operatorname {tr} \left({\overline {M}}^{\frac {1}{2}}{\overline {M}}^{\frac {1}{2}}\operatorname {diag} \left(a^{*}\right)N^{\frac {1}{2}}N^{\frac {1}{2}}\operatorname {diag} (b)\right)=\operatorname {tr} \left({\overline {M}}^{\frac {1}{2}}\operatorname {diag} \left(a^{*}\right)N^{\frac {1}{2}}N^{\frac {1}{2}}\operatorname {diag} (b){\overline {M}}^{\frac {1}{2}}\right)} Then, for a = b {\displaystyle a=b} , this is written as tr ( A ∗ A ) {\displaystyle \operatorname {tr} \left(A^{*}A\right)} for A = N 1 2 diag ( a ) M ¯ 1 2 {\displaystyle A=N^{\frac {1}{2}}\operatorname {diag} (a){\overline {M}}^{\frac {1}{2}}} and thus is strictly positive for A ≠ 0 {\displaystyle A\neq 0} , which occurs if and only if a ≠ 0 {\displaystyle a\neq 0} . This shows that ( M ∘ N ) {\displaystyle (M\circ N)} is a positive definite matrix.
Proof using Gaussian integration [ edit ] Case of M = N [ edit ] Let X {\displaystyle X} be an n {\displaystyle n} -dimensional centered Gaussian random variable with covariance ⟨ X i X j ⟩ = M i j {\displaystyle \langle X_{i}X_{j}\rangle =M_{ij}} . Then the covariance matrix of X i 2 {\displaystyle X_{i}^{2}} and X j 2 {\displaystyle X_{j}^{2}} is
Cov ( X i 2 , X j 2 ) = ⟨ X i 2 X j 2 ⟩ − ⟨ X i 2 ⟩ ⟨ X j 2 ⟩ {\displaystyle \operatorname {Cov} \left(X_{i}^{2},X_{j}^{2}\right)=\left\langle X_{i}^{2}X_{j}^{2}\right\rangle -\left\langle X_{i}^{2}\right\rangle \left\langle X_{j}^{2}\right\rangle } Using Wick's theorem to develop ⟨ X i 2 X j 2 ⟩ = 2 ⟨ X i X j ⟩ 2 + ⟨ X i 2 ⟩ ⟨ X j 2 ⟩ {\displaystyle \left\langle X_{i}^{2}X_{j}^{2}\right\rangle =2\left\langle X_{i}X_{j}\right\rangle ^{2}+\left\langle X_{i}^{2}\right\rangle \left\langle X_{j}^{2}\right\rangle } we have
Cov ( X i 2 , X j 2 ) = 2 ⟨ X i X j ⟩ 2 = 2 M i j 2 {\displaystyle \operatorname {Cov} \left(X_{i}^{2},X_{j}^{2}\right)=2\left\langle X_{i}X_{j}\right\rangle ^{2}=2M_{ij}^{2}} Since a covariance matrix is positive definite, this proves that the matrix with elements M i j 2 {\displaystyle M_{ij}^{2}} is a positive definite matrix.
General case [ edit ] Let X {\displaystyle X} and Y {\displaystyle Y} be n {\displaystyle n} -dimensional centered Gaussian random variables with covariances ⟨ X i X j ⟩ = M i j {\displaystyle \left\langle X_{i}X_{j}\right\rangle =M_{ij}} , ⟨ Y i Y j ⟩ = N i j {\displaystyle \left\langle Y_{i}Y_{j}\right\rangle =N_{ij}} and independent from each other so that we have
⟨ X i Y j ⟩ = 0 {\displaystyle \left\langle X_{i}Y_{j}\right\rangle =0} for any i , j {\displaystyle i,j} Then the covariance matrix of X i Y i {\displaystyle X_{i}Y_{i}} and X j Y j {\displaystyle X_{j}Y_{j}} is
Cov ( X i Y i , X j Y j ) = ⟨ X i Y i X j Y j ⟩ − ⟨ X i Y i ⟩ ⟨ X j Y j ⟩ {\displaystyle \operatorname {Cov} \left(X_{i}Y_{i},X_{j}Y_{j}\right)=\left\langle X_{i}Y_{i}X_{j}Y_{j}\right\rangle -\left\langle X_{i}Y_{i}\right\rangle \left\langle X_{j}Y_{j}\right\rangle } Using Wick's theorem to develop
⟨ X i Y i X j Y j ⟩ = ⟨ X i X j ⟩ ⟨ Y i Y j ⟩ + ⟨ X i Y i ⟩ ⟨ X j Y j ⟩ + ⟨ X i Y j ⟩ ⟨ X j Y i ⟩ {\displaystyle \left\langle X_{i}Y_{i}X_{j}Y_{j}\right\rangle =\left\langle X_{i}X_{j}\right\rangle \left\langle Y_{i}Y_{j}\right\rangle +\left\langle X_{i}Y_{i}\right\rangle \left\langle X_{j}Y_{j}\right\rangle +\left\langle X_{i}Y_{j}\right\rangle \left\langle X_{j}Y_{i}\right\rangle } and also using the independence of X {\displaystyle X} and Y {\displaystyle Y} , we have
Cov ( X i Y i , X j Y j ) = ⟨ X i X j ⟩ ⟨ Y i Y j ⟩ = M i j N i j {\displaystyle \operatorname {Cov} \left(X_{i}Y_{i},X_{j}Y_{j}\right)=\left\langle X_{i}X_{j}\right\rangle \left\langle Y_{i}Y_{j}\right\rangle =M_{ij}N_{ij}} Since a covariance matrix is positive definite, this proves that the matrix with elements M i j N i j {\displaystyle M_{ij}N_{ij}} is a positive definite matrix.
Proof using eigendecomposition [ edit ] Proof of positive semidefiniteness [ edit ] Let M = ∑ μ i m i m i T {\displaystyle M=\sum \mu _{i}m_{i}m_{i}^{\textsf {T}}} and N = ∑ ν i n i n i T {\displaystyle N=\sum \nu _{i}n_{i}n_{i}^{\textsf {T}}} . Then
M ∘ N = ∑ i j μ i ν j ( m i m i T ) ∘ ( n j n j T ) = ∑ i j μ i ν j ( m i ∘ n j ) ( m i ∘ n j ) T {\displaystyle M\circ N=\sum _{ij}\mu _{i}\nu _{j}\left(m_{i}m_{i}^{\textsf {T}}\right)\circ \left(n_{j}n_{j}^{\textsf {T}}\right)=\sum _{ij}\mu _{i}\nu _{j}\left(m_{i}\circ n_{j}\right)\left(m_{i}\circ n_{j}\right)^{\textsf {T}}} Each ( m i ∘ n j ) ( m i ∘ n j ) T {\displaystyle \left(m_{i}\circ n_{j}\right)\left(m_{i}\circ n_{j}\right)^{\textsf {T}}} is positive semidefinite (but, except in the 1-dimensional case, not positive definite, since they are rank 1 matrices). Also, μ i ν j > 0 {\displaystyle \mu _{i}\nu _{j}>0} thus the sum M ∘ N {\displaystyle M\circ N} is also positive semidefinite.
Proof of definiteness [ edit ] To show that the result is positive definite requires even further proof. We shall show that for any vector a ≠ 0 {\displaystyle a\neq 0} , we have a T ( M ∘ N ) a > 0 {\displaystyle a^{\textsf {T}}(M\circ N)a>0} . Continuing as above, each a T ( m i ∘ n j ) ( m i ∘ n j ) T a ≥ 0 {\displaystyle a^{\textsf {T}}\left(m_{i}\circ n_{j}\right)\left(m_{i}\circ n_{j}\right)^{\textsf {T}}a\geq 0} , so it remains to show that there exist i {\displaystyle i} and j {\displaystyle j} for which corresponding term above is nonzero. For this we observe that
a T ( m i ∘ n j ) ( m i ∘ n j ) T a = ( ∑ k m i , k n j , k a k ) 2 {\displaystyle a^{\textsf {T}}(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{\textsf {T}}a=\left(\sum _{k}m_{i,k}n_{j,k}a_{k}\right)^{2}} Since N {\displaystyle N} is positive definite, there is a j {\displaystyle j} for which n j ∘ a ≠ 0 {\displaystyle n_{j}\circ a\neq 0} (since otherwise n j T a = ∑ k ( n j ∘ a ) k = 0 {\displaystyle n_{j}^{\textsf {T}}a=\sum _{k}(n_{j}\circ a)_{k}=0} for all j {\displaystyle j} ), and likewise since M {\displaystyle M} is positive definite there exists an i {\displaystyle i} for which ∑ k m i , k ( n j ∘ a ) k = m i T ( n j ∘ a ) ≠ 0. {\displaystyle \sum _{k}m_{i,k}(n_{j}\circ a)_{k}=m_{i}^{\textsf {T}}(n_{j}\circ a)\neq 0.} However, this last sum is just ∑ k m i , k n j , k a k {\displaystyle \sum _{k}m_{i,k}n_{j,k}a_{k}} . Thus its square is positive. This completes the proof.
References [ edit ] ^ Schur, J. (1911). "Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen". Journal für die reine und angewandte Mathematik . 1911 (140): 1–28. doi :10.1515/crll.1911.140.1 . S2CID 120411177 . ^ Zhang, Fuzhen, ed. (2005). The Schur Complement and Its Applications . Numerical Methods and Algorithms. Vol. 4. doi :10.1007/b105056 . ISBN 0-387-24271-6 . , page 9, Ch. 0.6 Publication under J. Schur ^ Ledermann, W. (1983). "Issai Schur and His School in Berlin". Bulletin of the London Mathematical Society . 15 (2): 97–106. doi :10.1112/blms/15.2.97 . External links [ edit ]