European Journal of Applied Sciences – Vol. 9, No. 5

Publication Date: October 25, 2021

DOI:10.14738/aivp.95.11115.

Jiamwattanapong, K., Ingadapa, N., & Plubin, B. (2021). On Testing Homogeneity of Covariance Matrices with Box’s M and the Approximate Tests for Multivariate Data. European Journal of Applied Sciences, 9(5), 426-436.

Services for Science and Education – United Kingdom

On Testing Homogeneity of Covariance Matrices with Box’s M and

the Approximate Tests for Multivariate Data

Knavoot Jiamwattanapong

Faculty of Liberal Arts

Rajamangala University of Technology Rattanakosin, Thailand

Nisand Ingadapa

Faculty of Liberal Arts

Rajamangala University of Technology Rattanakosin, Thailand

Bandhita Plubin

Faculty of Science, Chiang Mai University, Thailand

ABSTRACT

Homogeneity of covariance matrices, or equal covariance matrices across groups,

is one of the most important assumptions in multivariate analysis of variance

(MANOVA) and in discriminant analysis. Box’s M test, an exact test, is a generally

accepted method for checking whether this assumption is violated. Box’s statistic

M can be transformed into statistics that serve as approximate tests based on chi-squared and F distributions. This study aims to assess the performance of Box’s M

test compared with the approximate test using chi-squared distribution for normal

data. When the data are non-normal, the performance of the approximate test and

the nonparametric test using the bootstrap method are examined. The results

showed that under normality with equal sample sizes and certain conditions to

conduct the exact test, the Box’s M and the approximate chi-squared tests perform

well whereas the performance of the approximate test is slightly better for

detecting heterogeneity of covariance matrices. In the case of unequal sample sizes,

in which the data do not meet the conditions to conduct Box’s M test, the

approximate chi-squared test also performs well so it is applicable for such a case.

When dealing with non-normal data, such as multivariate t-distributed data, the

performance of the approximate test as well as the nonparametric test using the

bootstrap method is unsatisfactory in the situations studied so it needs to be further

developed.

Keywords: Tests for covariance matrices, Box’s M test, Homogeneity of covariance

matrices, Unequal covariance matrices, Bootstrap Method.

INTRODUCTION

Homogeneity of covariance matrices is an important assumption in multivariate analysis of

variance (MANOVA) and in discriminant analysis, both of which are commonly used in various

fields such as medical science, engineering, social sciences, economics, education, and human

resource management. Although these methods are popular and frequently used,

heterogeneity of covariance matrices can affect parameter estimation in such

a model [1]. For example, when the hypothesis that two mean vectors are equal is rejected, the result

may be caused by differences among the covariance matrices rather than by a

difference in means.

Testing for the equality of covariance matrices among groups in multivariate analysis has

drawn much attention from statisticians and researchers, so a number of tests have recently

been developed. The development of the tests can be classified into 3 approaches: (1) tests

based on the likelihood ratio criterion, (2) tests based on empirical distance, and (3) tests based

on largest eigenvalue distribution or random matrix theory [2]. Examples of tests in the first

approach are found in Mauchly [3] and Box [4-5]. Most of the tests in the second approach, see

John [6], Nagao [7], Ledoit and Wolf [8], can be applied in the case of high-dimensional data, i.e.

when the number of variables exceeds the sample size. An example of the third approach

appears in [9]. Most of the existing tests require multivariate normality. For the case when the

data are non-normal, nonparametric methods such as bootstrapping can be applied [10].

Among the tests for homogeneity of covariance matrices, Box’s M test [4] is

one of the best-known methods and is included in many statistical packages such as SPSS.

Box’s M test is based on the likelihood ratio criterion under the multivariate normal

distribution. Its critical values can be obtained exactly, giving a so-called exact test,

only when the number of variables is 2, 3, 4, or 5 and the number of populations does not

exceed 10. Under certain conditions, the Box’s M statistic can be transformed into statistics

whose distributions are approximated by chi-squared or F distributions. The performance of the exact test and

also the approximate test can be greatly affected by the number of variables, the number of

populations or groups, and the sample sizes.

As noted above, testing for homogeneity of covariance matrices is a major concern

and has therefore attracted much attention from statistics researchers. The objectives of this

study are to assess the performance of Box’s M test, the approximate chi-squared test, and a

test using the nonparametric method.

BOX’S M AND THE COMPARATIVE TESTS

Box’s M test is a well-known method for testing the equality of covariance matrices among

populations in multivariate analysis. The test is based on the likelihood ratio criterion, the same

approach as Bartlett’s test [11] for checking the homogeneity of variances in univariate

analysis.

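Let $x_{ij}$ be independently and identically distributed as $N_p(\mu_i, \Sigma_i)$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$. The hypothesis testing problem is

$$H: \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k \quad \text{versus} \quad A: \Sigma_i \neq \Sigma_j \ \text{for at least one pair } i \neq j,\ i, j = 1, 2, \ldots, k. \tag{1}$$

The notations are defined as follows:

$$\bar{x}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}, \qquad V_i = \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)(x_{ij} - \bar{x}_i)',$$

$$V = \sum_{i=1}^{k} V_i, \qquad S_i = \nu_i^{-1} V_i, \qquad S = \nu^{-1} V,$$

$$\nu_i = n_i - 1, \qquad \nu = \sum_{i=1}^{k} \nu_i, \qquad n = \sum_{i=1}^{k} n_i.$$

First proposed by Mauchly [3], the likelihood ratio approach relies on the generalized variance and the trace of the sample covariance matrix ($S$), and therefore requires the sample covariance matrix to be non-singular. The test is based on the statistic $\lambda$ as follows:

$$\lambda = \left( \frac{\prod_{i=1}^{k} |V_i|^{n_i/2}}{|V|^{n/2}} \right) \left( \frac{n^{pn/2}}{\prod_{i=1}^{k} n_i^{pn_i/2}} \right).$$

The statistic $\lambda$ can be transformed to the statistic $M$ as

$$M = \frac{|S_1|^{\nu_1/2} \, |S_2|^{\nu_2/2} \cdots |S_k|^{\nu_k/2}}{|S_p|^{\nu/2}},$$

where $S_p$ denotes the pooled sample covariance matrix, that is

$$S_p = \frac{\sum_{i} \nu_i S_i}{\sum_{i} \nu_i}.$$

The statistic $M$ ranges from 0 to 1. The closer its value is to 1, the more the data support the null hypothesis of homogeneity of covariance matrices; the closer it is to 0, the stronger the evidence for rejection. In addition, as $S_1, S_2, \ldots, S_k$ become more different, the statistic $M$ approaches 0 [12]. One limitation of the method using the likelihood ratio criterion is that all $n_i$ need to be higher than $p$; otherwise $|S_i| = 0$ for some $i$ and, consequently, $M = 0$. This limitation makes the approach inapplicable to high-dimensional data, where $p > n$.

The Box’s M test statistic $T_{Box}$ is obtained from $M$ through the transformation $-2 \ln M$ as follows:

$$T_{Box} = -2 \ln M = \nu \ln |S_p| - \sum_{i=1}^{k} \nu_i \ln |S_i|. \tag{2}$$

When all groups have the same degrees of freedom, $\nu_1 = \cdots = \nu_k = \nu_0$, this reduces to $T_{Box} = \nu_0 \left( k \ln |S_p| - \sum_{i} \ln |S_i| \right)$.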
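As a rough illustration of the transformation above, the following Python sketch computes $-2 \ln M$ in its log-determinant form, $\nu \ln |S_p| - \sum_i \nu_i \ln |S_i|$, with $\nu_i = n_i - 1$ and $\nu = \sum_i \nu_i$. The function name `box_m_statistic`, the use of NumPy, and the simulated data are assumptions made for this example and are not part of the original study; the sketch returns only the statistic and does not supply critical values.

```python
import numpy as np


def box_m_statistic(groups):
    """Return T = -2 ln M for a list of (n_i x p) data arrays, one per group."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    p = groups[0].shape[1]
    sizes = np.array([g.shape[0] for g in groups])
    if np.any(sizes <= p):
        # |S_i| = 0 when n_i <= p, so M = 0 and -2 ln M is undefined.
        raise ValueError("every group needs more observations than variables")

    nus = sizes - 1                                   # nu_i = n_i - 1
    nu = nus.sum()                                    # nu = sum of nu_i
    # np.cov divides by n_i - 1, so each entry is S_i = V_i / nu_i.
    covs = [np.atleast_2d(np.cov(g, rowvar=False)) for g in groups]
    pooled = sum(v * s for v, s in zip(nus, covs)) / nu   # S_p

    # slogdet returns (sign, log|det|); working with log-determinants is
    # numerically safer than taking the log of a raw determinant.
    logdet_pooled = np.linalg.slogdet(pooled)[1]
    logdets = np.array([np.linalg.slogdet(s)[1] for s in covs])
    return nu * logdet_pooled - np.sum(nus * logdets)


# Illustrative data only: two bivariate normal samples with different spreads.
rng = np.random.default_rng(0)
group_a = rng.normal(size=(20, 2))
group_b = 2.0 * rng.normal(size=(25, 2))
print(box_m_statistic([group_a, group_b]))
```

Identical groups give a value of zero, and the value grows as the group sample covariance matrices diverge, which mirrors the behaviour of $M$ moving from 1 toward 0.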

The exact distribution of the test statistic $T_{Box}$ can be obtained only when the number of variables $p = 2, 3, 4, 5$ and the number of populations $k = 2, 3, \ldots, 10$. The critical values of the exact test can be found in Table A.14 in [12] and [13]. The Box’s M test is sensitive to non-