**********************************************************************
White's test with SPSS:
======================

* First of all, read "Heteroscedasticity: testing and correcting in SPSS", by Gwilym Pryce
    http://pages.infinit.net/rlevesqu/spss.htm#Heteroscedasticity
    This macro is based on his paper.
*   SPSS Code by Marta Garcia-Granero 2002/04/04.

* Steps:

* Create several new variables:
  * The square of the unstandardized residuals.
  * The square of every predictor variable in the model you want to test.
  * The cross-product of all the predictors.

* Run a regression model to predict the squared residuals with
	the predictors, their squares and cross-products.

* Multiply the model R-square (unadjusted) by the sample size (n*R-square).
* This is White's test statistic. Its significance is tested by comparing
	it with the critical value of the Chi-square distribution with "p" degrees
	of freedom, where "p" is the total number of regressors in the last
	regression model (original+squares+cross-products).


* IMPORTANT:
* If any of the original predictors is binary (dummy variable), then its square
will be identical to the original, and they will correlate perfectly.
* In this case, the regression model will drop one of them (the original or its square),
and "p" must be reduced by 1 for each binary predictor in the model.

* WHITE'S TEST MACRO *

* The MACRO needs 5 arguments:
*   a) the number of predictors,
*   b) the number of cross-products that will be created,
*      predictors*(predictors-1)/2
*      [I could not find another way to make VECTOR accept the number],
*   c) "P" (predictors+squares+cross-products), corrected for binary predictors,
*   d) the name of the dependent variable and
*   e) the list of predictors in the form 'first predictor TO last predictor'
*      (ordered and consecutive in the database).
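
* The three numeric arguments can be checked with a short MATRIX block.
* This is only a sketch: k and nbin are hypothetical names, not used by the macro.
* k is the number of predictors and nbin the number of binary (dummy) predictors.
matrix.
compute k=4.
compute nbin=1.
compute ncp=k*(k-1)/2.
compute p=2*k+ncp-nbin.
print {k,ncp,p}
 /format="f4.0"
 /title="Predictors, cross-products and P (corrected for binary predictors)".
end matrix.
* With k=4 and nbin=1 this gives 6 cross-products and P=13.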

* MACRO definition.

DEFINE whitest(!POSITIONAL !TOKENS(1) /!POSITIONAL !TOKENS(1)
              /!POSITIONAL !TOKENS(1) /!POSITIONAL !TOKENS(1)
              /!POSITIONAL !CMDEND).

* >>>> 1st regression model to get the residuals <<<< *.
REGRESSION
  /STATISTICS R ANOVA
  /DEPENDENT !4
  /METHOD=ENTER !5
  /SCATTERPLOT=(*ZRESID,*ZPRED)
  /SAVE RESID(residual) .

* >>>> New variables <<<< *.
* New dependent variable.
COMPUTE sq_res=residual**2.
* Getting rid of superfluous variables (dependent and residuals).
SAVE OUTFILE='c:\windows\temp\tempdat_.sav'
   /keep=sq_res !5.
GET FILE='c:\windows\temp\tempdat_.sav'.
EXECUTE.
* Vectors for all new predictor variables.
VECTOR v=!5 /sq(!1) /cp(!2).
* Squares of all predictors.
LOOP #i=1 to !1.
COMPUTE sq(#i)=v(#i)**2.
END LOOP.
* Cross-products of all predictors.
* Modification of a routine by Ray Levesque.
COMPUTE #idx=1.
LOOP #cnt1=1 TO !1-1.
LOOP #cnt2=#cnt1+1 TO !1.
COMPUTE cp(#idx)=v(#cnt1)*v(#cnt2).
COMPUTE #idx=#idx+1.
END LOOP.
END LOOP.
EXECUTE.

* >>>> White's test <<<< *.
* Regression of sq_res on all predictors.
REGRESSION /VARIABLES=ALL
  /STATISTICS R
  /DEPENDENT sq_res
  /METHOD= ENTER
  /SAVE RESID(residual) .
* Final report.
* Routine by Gwilym Pryce (slightly modified).
matrix.
compute p=!3.
get sq_res /variables=sq_res.
get residual /variables=residual.
compute sq_res2=residual&**2.
compute n=nrow(sq_res).
compute rss=msum(sq_res2).
compute ii_1=make(n,n,1).
compute i=ident(n).
compute m0=i-((1/n)*ii_1).
compute tss=transpos(sq_res)*m0*sq_res.
compute regss=tss-rss.
print regss
 /format="f8.4"
 /title="Regression SS".
print rss
 /format="f8.4"
 /title="Residual SS".
print tss
 /format="f8.4"
 /title="Total SS".
compute r_sq=1-(rss/tss).
print r_sq
 /format="f8.4"
 /title="R-squared".
print n
 /format="f4.0"
 /title="Sample size (N)".
print p
 /format="f4.0"
 /title="Number of predictors (P)".
compute wh_test=n*r_sq.
print wh_test
 /format="f8.3"
 /title="White's General Test for Heteroscedasticity"
+ " (CHI-SQUARE df=P)".
compute sig=1-chicdf(wh_test,p).
print sig
 /format="f8.4"
 /title="Significance level of Chi-square df=P (H0:"
+ "homoscedasticity)".
end matrix.

!ENDDEFINE.

* Sample data Nr. 1: continuous predictors *.
INPUT PROGRAM.
- VECTOR x(5).
- LOOP #I = 1 TO 100.
-  LOOP #J = 1 TO 5.
-   COMPUTE x(#J) = NORMAL(1).
-  END LOOP.
-  END CASE.
- END LOOP.
- END FILE.
END INPUT PROGRAM.
execute.
* x1 is the dependent and x2 TO x5 the predictors.
rename variables x1=y.
execute.
* MACRO call: there are 4 predictors, therefore 6 cross-products and 14 regressors.
whitest 4 6 14 y x2 TO x5.

* Sample data Nr. 2: one binary predictor *.
INPUT PROGRAM.
- VECTOR x(5).
- LOOP #I = 1 TO 100.
-  LOOP #J = 1 TO 5.
-   COMPUTE x(#J) = NORMAL(1).
-  END LOOP.
-  END CASE.
- END LOOP.
- END FILE.
END INPUT PROGRAM.
execute.
RECODE x2  (Lowest thru 0=0)  (0 thru Highest=1)  .
EXECUTE .

* x1 is the dependent and x2 TO x5 the predictors.
rename variables x1=y.
execute.
* MACRO call: as before, 4 predictors and 6 cross-products, but ONLY 13 regressors.
whitest 4 6 13 y x2 TO x5.

* As you can see from the output, x2 is excluded from the second model because it is identical to its square.
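
* Sample data Nr. 3 (illustrative sketch): errors whose spread depends on x2 *.
* The data-generating model below is an assumption chosen for demonstration only.
INPUT PROGRAM.
- VECTOR x(5).
- LOOP #I = 1 TO 100.
-  LOOP #J = 1 TO 5.
-   COMPUTE x(#J) = NORMAL(1).
-  END LOOP.
-  END CASE.
- END LOOP.
- END FILE.
END INPUT PROGRAM.
execute.
* x1 serves as the raw error term, scaled by EXP(x2) so its variance changes with x2.
COMPUTE y = x2 + x3 + x4 + x5 + EXP(x2)*x1.
execute.
* MACRO call: 4 continuous predictors, 6 cross-products and 14 regressors.
whitest 4 6 14 y x2 TO x5.
* With data like these the test would be expected to yield a small significance level.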