European Journal of Applied Sciences – Vol. 9, No. 5
Publication Date: October 25, 2021
DOI:10.14738/aivp.95.11115.
Jiamwattanapong, K., Ingadapa, N., & Plubin, B. (2021). On Testing Homogeneity of Covariance Matrices with Box’s M and the
Approximate Tests for Multivariate Data. European Journal of Applied Sciences, 9(5). 426-436.
Services for Science and Education – United Kingdom
On Testing Homogeneity of Covariance Matrices with Box’s M and
the Approximate Tests for Multivariate Data
Knavoot Jiamwattanapong
Faculty of Liberal Arts
Rajamangala University of Technology Rattanakosin, Thailand
Nisand Ingadapa
Faculty of Liberal Arts
Rajamangala University of Technology Rattanakosin, Thailand
Bandhita Plubin
Faculty of Science, Chiang Mai University, Thailand
ABSTRACT
Homogeneity of covariance matrices, or equal covariance matrices across groups,
is one of the most important assumptions in multivariate analysis of variance
(MANOVA) and in discriminant analysis. Box’s M test, an exact test, is a generally
accepted method used to check the violation of this assumption. Box’s statistic
M can be transformed into statistics that serve as approximate tests based on the chi-squared and F distributions. This study aims to assess the performance of Box’s M
test compared with the approximate test using the chi-squared distribution for normal
data. When the data are non-normal, the performance of the approximate test and
the nonparametric test using the bootstrap method is examined. The results
showed that under normality with equal sample sizes and the conditions required to
conduct the exact test, Box’s M and the approximate chi-squared tests perform
well, with the approximate test performing slightly better at
detecting heterogeneity of covariance matrices. In the case of unequal sample sizes,
in which the data do not meet the conditions to conduct Box’s M test, the
approximate chi-squared test also performs well, so it is applicable in such cases.
When dealing with non-normal data, such as multivariate t-distributed data, the
performance of the approximate test as well as the nonparametric test using the
bootstrap method is unsatisfactory in the situations studied, so both need to be further
developed.
Keywords: Tests for covariance matrices, Box’s M test, Homogeneity of covariance
matrices, Unequal covariance matrices, Bootstrap Method.
INTRODUCTION
Homogeneity of covariance matrices is an important assumption in multivariate analysis of
variance (MANOVA) and in discriminant analysis, both of which are commonly used in various
fields such as medical science, engineering, social sciences, economics, education, and human
resource management. Although these methods are popular and frequently used,
heterogeneity of covariance matrices can affect the parameter estimation in such
models [1]. For example, when a test of the equality of two mean vectors rejects the null hypothesis, the result
may be caused by the difference of certain covariance matrices from the others, instead of the
difference of means.
Testing for the equality of covariance matrices among groups in multivariate analysis has
drawn much attention from statisticians and researchers, and a number of tests have recently
been developed. These tests can be classified into three approaches: (1) tests
based on the likelihood ratio criterion, (2) tests based on empirical distance, and (3) tests based
on the distribution of the largest eigenvalue or random matrix theory [2]. Examples of tests in the first
approach are found in Mauchly [3] and Box [4-5]. Most of the tests in the second approach (see
John [6], Nagao [7], and Ledoit and Wolf [8]) can be applied to high-dimensional data, i.e.
when the number of variables exceeds the sample sizes. An example of the third approach
appears in [9]. Most of the existing tests require multivariate normality. When the
data are non-normal, nonparametric methods such as bootstrapping can be applied [10].
Among the tests for homogeneity of covariance matrices, Box’s M test [4] is
one of the best-known methods and is included in many statistical packages such as SPSS.
Box’s M test is based on the likelihood ratio criterion under the multivariate normal
distribution. Its critical values can be obtained exactly, giving a so-called exact test,
only when the number of variables is 2, 3, 4, or 5 and the number of populations does not
exceed 10. Under certain conditions, Box’s M statistic can be transformed into statistics
whose distributions are approximated by the chi-squared or F distribution. The performance of the exact test and
of the approximate tests can be greatly affected by the number of variables, the number of
populations or groups, and the sample sizes.
As noted above, testing for homogeneity of covariance matrices is a major concern
and has therefore attracted much attention from statistics researchers. The objectives of this
study are to assess the performance of Box’s M test, the approximate chi-squared test, and a
nonparametric test based on the bootstrap method.
BOX’S M AND THE COMPARATIVE TESTS
Box’s M test is a well-known method for testing the equality of covariance matrices among
$k$ populations in multivariate analysis. The test is based on the likelihood ratio criterion, the same
approach as Bartlett’s test [11] for checking the homogeneity of variances in univariate
analysis.
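Let $\mathbf{x}_{ij}$ be independently and identically distributed as $N_p(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i)$, $i = 1, 2, \dots, k$, $j = 1, 2, \dots, n_i$. The hypothesis testing
problem is
\[
H: \boldsymbol{\Sigma}_1 = \boldsymbol{\Sigma}_2 = \cdots = \boldsymbol{\Sigma}_k
\quad \text{versus} \quad
A: \boldsymbol{\Sigma}_i \neq \boldsymbol{\Sigma}_j \ \text{for at least one pair } i \neq j, \ i, j = 1, 2, \dots, k. \tag{1}
\]
The notations are defined as follows:
\[
\bar{\mathbf{x}}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} \mathbf{x}_{ij}, \qquad
\mathbf{V}_i = \sum_{j=1}^{n_i} (\mathbf{x}_{ij} - \bar{\mathbf{x}}_i)(\mathbf{x}_{ij} - \bar{\mathbf{x}}_i)'.
\]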
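As a minimal sketch of the notation above, the group mean vector and the scatter matrix of one group can be computed with NumPy. The function name `group_mean_and_scatter` and the simulated data are illustrative assumptions and are not taken from the original study.

```python
import numpy as np


def group_mean_and_scatter(x):
    """Return the mean vector and scatter (SSCP) matrix of one group.

    x is an (n_i, p) array whose rows are the observations x_ij.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)            # sample mean vector of the group
    centered = x - mean              # rows are x_ij minus the group mean
    scatter = centered.T @ centered  # sum over j of outer products, a (p, p) matrix
    return mean, scatter


# Hypothetical data: k = 3 groups, p = 2 variables, unequal sample sizes.
rng = np.random.default_rng(0)
groups = [rng.normal(size=(n_i, 2)) for n_i in (10, 15, 20)]
stats = [group_mean_and_scatter(g) for g in groups]

for g, (mean, scatter) in zip(groups, stats):
    n_i = g.shape[0]
    # Dividing the scatter matrix by n_i - 1 gives the usual sample covariance matrix.
    assert np.allclose(scatter / (n_i - 1), np.cov(g, rowvar=False))
```

The check against `np.cov` only confirms that the scatter matrix divided by $n_i - 1$ is the unbiased sample covariance matrix of that group.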
The pooled quantities are defined as
\[
\mathbf{V} = \sum_{i=1}^{k} \mathbf{V}_i, \qquad
\mathbf{S}_i = \frac{1}{n_i - 1} \mathbf{V}_i, \qquad
\mathbf{S} = \frac{1}{\nu} \mathbf{V},
\]
\[
\nu_i = n_i - 1, \qquad
\nu = \sum_{i=1}^{k} \nu_i, \qquad
n = \sum_{i=1}^{k} n_i.
\]
First proposed by Mauchly [3], the test based on the likelihood ratio criterion relies on the
generalized variance and the trace of the sample covariance matrix ($\mathbf{S}$), which requires the non-singularity of the sample covariance matrix. The test is based on the statistic
\[
\lambda = \left( \frac{\prod_{i=1}^{k} |\mathbf{V}_i|^{n_i/2}}{|\mathbf{V}|^{n/2}} \right)
\left( \frac{n^{pn/2}}{\prod_{i=1}^{k} n_i^{pn_i/2}} \right).
\]
The statistic $\lambda$ can be transformed to the statistic $M$ as
\[
M = \frac{|\mathbf{S}_1|^{\nu_1/2} \, |\mathbf{S}_2|^{\nu_2/2} \cdots |\mathbf{S}_k|^{\nu_k/2}}{|\mathbf{S}_p|^{\nu/2}},
\]
where $\mathbf{S}_p$ denotes the pooled sample covariance matrix, that is
\[
\mathbf{S}_p = \frac{\sum_{i=1}^{k} \nu_i \mathbf{S}_i}{\sum_{i=1}^{k} \nu_i}.
\]
The statistic $M$ ranges from 0 to 1. Values close to 1 favor the null hypothesis of homogeneity of
covariance matrices, whereas values close to 0 favor its rejection. In particular, the more
$\mathbf{S}_1, \mathbf{S}_2, \dots, \mathbf{S}_k$ differ from one another, the closer $M$ is to 0 [12]. One limitation of the method using the
likelihood ratio criterion is that all $\nu_i$ need to be higher than $p$; otherwise $|\mathbf{S}_i| = 0$ for some $i$
and, consequently, $M = 0$. This limitation makes the approach inapplicable to high-dimensional data where $p > n$.
The Box’s M test statistic $T_{Box}$ is obtained by transforming $M$ to $-2 \ln M$ as
follows:
\[
T_{Box} = -2 \ln M = \nu \ln |\mathbf{S}_p| - \sum_{i=1}^{k} \nu_i \ln |\mathbf{S}_i|, \tag{2}
\]
which, when all groups have the same degrees of freedom, $\nu_1 = \nu_2 = \cdots = \nu_k$, reduces to $\nu_1 \left( k \ln |\mathbf{S}_p| - \sum_{i=1}^{k} \ln |\mathbf{S}_i| \right)$.
The exact distribution of the test statistic $T_{Box}$ can be obtained only when the number of
variables $p = 2, 3, 4, 5$ and the number of populations $k = 2, 3, \dots, 10$. The critical values of the
exact test can be found in Table A.14 in [12] and [13]. The Box’s M test is sensitive to non-