**********************************************************************.
* White's test with SPSS.
* ======================.

* First of all, read "Heteroscedasticity: testing and correcting in SPSS",
* by Gwilym Pryce (http://pages.infinit.net/rlevesqu/spss.htm#Heteroscedasticity).
* This macro is based on his paper.

* SPSS Code by Marta Garcia-Granero 2002/04/04.

* Steps:
* Create several new variables:
*   The square of the unstandardized residuals.
*   The square of every predictor variable in the model you want to test.
*   The cross-products of all pairs of predictors.
* Run a regression model to predict the squared residuals from the predictors,
*   their squares and their cross-products.
* Multiply the (unadjusted) R-square of that model by the sample size (n*R-square).
* The result is White's statistic. Its significance is tested by comparing it
*   with the critical value of the Chi-square distribution with "p" degrees of
*   freedom, where "p" is the total number of regressors in the last regression
*   model (original+squares+cross-products).

* IMPORTANT:
* If any of the original predictors is binary (dummy variable), its square is
*   identical to the original, so the two correlate perfectly.
* In that case the regression model drops one of them (the original or its
*   square), and "p" has to be decreased by 1 for each binary predictor
*   in the model.

* WHITE'S TEST MACRO *.

* The MACRO needs 5 arguments:
* a) the number of predictors,
* b) the number of cross-products that will be created: "predictors*(predictors-1)/2"
*    [I could not find another way to make VECTOR accept the number],
* c) "P" (predictors+squares+cross-products), corrected for binary predictors,
* d) the name of the dependent variable and
* e) the list of predictors in the form 'first predictor TO last predictor'
*    (ordered and consecutive in the database).

* MACRO definition.
DEFINE whitest(!POSITIONAL !TOKENS(1) /!POSITIONAL !TOKENS(1)
  /!POSITIONAL !TOKENS(1) /!POSITIONAL !TOKENS(1) /!POSITIONAL !CMDEND).
* >>>> 1st regression model to get the residuals <<<< *.
REGRESSION
  /STATISTICS R ANOVA
  /DEPENDENT !4
  /METHOD=ENTER !5
  /SCATTERPLOT=(*ZRESID,*ZPRED)
  /SAVE RESID(residual) .
* >>>> New variables <<<< *.
* New dependent variable.
COMPUTE sq_res=residual**2.
* Getting rid of superfluous variables (dependent and residuals).
SAVE OUTFILE='c:\windows\temp\tempdat_.sav' /keep=sq_res !5.
GET FILE='c:\windows\temp\tempdat_.sav'.
EXECUTE.
* Vectors for all new predictor variables.
VECTOR v=!5 /sq(!1) /cp(!2).
* Squares of all predictors.
LOOP #i=1 to !1.
COMPUTE sq(#i)=v(#i)**2.
END LOOP.
* Cross-products of all predictors.
* Modification of a routine by Ray Levesque.
COMPUTE #idx=1.
LOOP #cnt1=1 TO !1-1.
LOOP #cnt2=#cnt1+1 TO !1.
COMPUTE cp(#idx)=v(#cnt1)*v(#cnt2).
COMPUTE #idx=#idx+1.
END LOOP.
END LOOP.
EXECUTE.
* >>>> White's test <<<< *.
* Regression of sq_res on all predictors.
REGRESSION
  /VARIABLES=ALL
  /STATISTICS R
  /DEPENDENT sq_res
  /METHOD= ENTER
  /SAVE RESID(residual) .
* Final report.
* Routine by Gwilym Pryce (slightly modified).
matrix.
compute p=!3.
get sq_res /variables=sq_res.
get residual /variables=residual.
compute sq_res2=residual&**2.
compute n=nrow(sq_res).
compute rss=msum(sq_res2).
compute ii_1=make(n,n,1).
compute i=ident(n).
compute m0=i-((1/n)*ii_1).
compute tss=transpos(sq_res)*m0*sq_res.
compute regss=tss-rss.
print regss /format="f8.4" /title="Regression SS".
print rss /format="f8.4" /title="Residual SS".
print tss /format="f8.4" /title="Total SS".
compute r_sq=1-(rss/tss).
print r_sq /format="f8.4" /title="R-squared".
print n /format="f4.0" /title="Sample size (N)".
print p /format="f4.0" /title="Number of predictors (P)".
compute wh_test=n*r_sq.
print wh_test /format="f8.3"
  /title="White's General Test for Heteroscedasticity" + " (CHI-SQUARE df=P)".
compute sig=1-chicdf(wh_test,p).
print sig /format="f8.4"
  /title="Significance level of Chi-square df=P (H0: " + "homoscedasticity)".
end matrix.
!ENDDEFINE.

* Sample data Nr. 1: continuous predictors *.
INPUT PROGRAM.
- VECTOR x(5).
- LOOP #I = 1 TO 100.
- LOOP #J = 1 TO 5.
- COMPUTE x(#J) = NORMAL(1).
- END LOOP.
- END CASE.
- END LOOP.
- END FILE.
END INPUT PROGRAM.
execute.

* x1 is the dependent and x2 TO x5 the predictors.
rename variables x1=y.
execute.

* MACRO call: there are 4 predictors, therefore 6 cross-products
* and 14 regressors (4 predictors + 4 squares + 6 cross-products).
whitest 4 6 14 y x2 TO x5.

* Sample data Nr. 2: one binary predictor *.
INPUT PROGRAM.
- VECTOR x(5).
- LOOP #I = 1 TO 100.
- LOOP #J = 1 TO 5.
- COMPUTE x(#J) = NORMAL(1).
- END LOOP.
- END CASE.
- END LOOP.
- END FILE.
END INPUT PROGRAM.
execute.

RECODE x2 (Lowest thru 0=0) (0 thru Highest=1) .
EXECUTE .

* x1 is the dependent and x2 TO x5 the predictors.
rename variables x1=y.
execute.

* MACRO call: as before, 4 predictors and 6 cross-products, but ONLY 13 regressors.
whitest 4 6 13 y x2 TO x5.

* As you can see from the output, x2 is not included in the model
* (it is identical to its own square).
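
* Sketch (not part of the original macro): a way to compute the 2nd and 3rd
* macro arguments instead of working them out by hand.
* The names k, nbin, ncp and p are illustrative assumptions: set k to the
* number of predictors and nbin to the number of binary predictors among them.
MATRIX.
COMPUTE k=4.
COMPUTE nbin=1.
* Cross-products: predictors*(predictors-1)/2.
COMPUTE ncp=k*(k-1)/2.
* P: predictors + squares + cross-products, minus 1 per binary predictor.
COMPUTE p=2*k+ncp-nbin.
PRINT ncp /FORMAT="F4.0" /TITLE="Number of cross-products (2nd argument)".
PRINT p /FORMAT="F4.0" /TITLE="P corrected for binary predictors (3rd argument)".
END MATRIX.
* With k=4 and nbin=1 this gives 6 cross-products and P=13, matching the
* macro call for sample data Nr. 2.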