Do T-Test with only means, SD and Ns

* I am sending you some syntax to perform a T test for independent samples
  from summary data. Although you already have a solution for that (with
  matrix data input in ANOVA), this method is far more complete:
  - a test for equality of variances is performed first (Hartley's F test);
  - both the standard T test (assuming equal variances) and the Welch test
    (not assuming equal variances) are calculated;
  - asymptotic 95% CIs for the difference of means are calculated for both
    tests when both sample sizes are 30 or more;
  - non-asymptotic 95% CIs are also given, for use when sample sizes are low.

* I also provide the original data used to get the means, sd and sample
sizes (with the "normal" T test) for comparison purposes.

*Best regards

*Marta

SYNTAX:

* Just one set of data (one row) can be processed each time
  (see below the original data).

data list list /mean1(f8.3) sd1(F8.3) n1(F8.0) mean2(f8.3) sd2(F8.3) n2(F8.0).
begin data
187.643 38.098 14 235.929 54.286 14
end data.

* T-test *.
matrix.
PRINT  /TITLE "T TEST FOR INDEPENDENT SAMPLES FROM SUMMARY DATA".
GET DATA /FILE=* /names=vecnam.
get mean1 /var=mean1.
get sd1 /var=sd1.
get n1 /var=n1.
get mean2 /var=mean2.
get sd2 /var=sd2.
get n2 /var=n2.
compute sem1=sd1/sqrt(n1).
compute sem2=sd2/sqrt(n2).
print {n1,mean1,sd1,sem1;n2,mean2,sd2,sem2}
 /title='Input data'
 /clabels='N','Mean','sd','sem'
 /rlabels='Sample 1','Sample 2'
 /format='f8.2'.
compute diff=mean1-mean2.
compute var1=sd1**2.
compute var2=sd2**2.
do if var1 ge var2.
compute ftest=var1/var2.
compute fsig=1-fcdf(ftest,n1-1,n2-1).
else if var1 lt var2.
compute ftest=var2/var1.
compute fsig=1-fcdf(ftest,n2-1,n1-1).
end if.
print {ftest,fsig}
 /title='Hartley test for equality of variances'
 /clabels='F','Sig.'
 /format='f8.3'.
compute n=n1+n2.
compute poolvar=((n1-1)&*(var1)+(n2-1)&*(var2))/(n-2) .
compute eedif1=sqrt(poolvar*(1/n1+1/n2)).
compute t1=diff/eedif1.
compute df1=n-2.
compute t1sig=2*(1-tcdf(abs(t1),df1)).
compute eedif2=sqrt(var1/n1+var2/n2).
compute t2=diff/eedif2.
compute df2=((var1/n1+var2/n2)**2)/
  (((var1/n1)**2)/(n1-1)+((var2/n2)**2)/(n2-1)).

compute t2sig=2*(1-tcdf(abs(t2),df2)).
print {diff,eedif1,t1,df1,t1sig;diff,eedif2,t2,df2,t2sig}
 /title='T test for independent means with equal or unequal variances'
 /clabels='Diff.','SE(dif)','t','df','2-Sig.'
 /rlabels='Equal','Unequal'
 /format='f8.3'.
do if (n1 ge 30) and (n2 ge 30).
compute low1=diff-1.96*eedif1.
compute upp1=diff+1.96*eedif1.
compute low2=diff-1.96*eedif2.
compute upp2=diff+1.96*eedif2.
print {low1,upp1;low2,upp2}
 /title='Approximate 95%CI for diff (asymptotic)'
 /clabels='Lower','Upper'
 /rlabels='Equal','Unequal'
 /format='f8.3'.
end if.
compute data={data,diff,eedif1,df1,eedif2,df2}.
compute vecnam={vecnam,"diff","eedif1","df1","eedif2","df2"}.
save data /outfile=* /names=vecnam.
end matrix.
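
* Worked check for the sample row above (a hand calculation, rounded to
  three decimals, that follows the formulas in the MATRIX program):
  pooled variance = (13*38.098**2 + 13*54.286**2)/26 = 2199.214,
  SE(dif) = sqrt(2199.214*(1/14 + 1/14)) = 17.725 for both tests
  (the two standard errors coincide because n1 = n2),
  t = -48.286/17.725 = -2.724, with df = 26 for the equal-variance test
  and df = 23.306 for the Welch test.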

* Computation of exact (non asymptotic) 95%CI for diff *.
COMPUTE low1 = diff -eedif1* IDF.T(0.975,df1) .
COMPUTE upp1 = diff +eedif1* IDF.T(0.975,df1) .
COMPUTE low2 = diff -eedif2* IDF.T(0.975,df2) .
COMPUTE upp2 = diff +eedif2* IDF.T(0.975,df2) .
EXECUTE .
REPORT FORMAT=LIST AUTOMATIC ALIGN(CENTER)
  /VARIABLES=low1 upp1
  /TITLE "95%CI for diff assuming equal variances".
REPORT FORMAT=LIST AUTOMATIC ALIGN(CENTER)
  /VARIABLES=low2 upp2
  /TITLE "95%CI for diff not assuming equal variances".

* Original data (for comparison purposes) *.
data list free /group(F8.0) wgain(F8.0).
begin data
1 175 1 149 1 132 1 187 1 218 1 123 1 151
1 248 1 200 1 206 1 219 1 179 1 234 1 206
2 142 2 214 2 311 2 249 2 337 2 176 2 262
2 211 2 302 2 216 2 195 2 236 2 253 2 199
end data.
var label group 'Diet'/wgain 'Weight gain (lb.)'.
value labels group 1 'Control' 2 'A Vitamin'.
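
* A possible way to obtain the summary row used above from the original
  data (a sketch; MEANS is only one of several commands that could be used).
* The results should agree, up to rounding, with the summary row: means of
  187.643 and 235.929, SDs of 38.098 and 54.286, and N = 14 in each group.
MEANS TABLES=wgain BY group
  /CELLS=MEAN STDDEV COUNT.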

T-TEST
  GROUPS=group(1 2)
  /VARIABLES=wgain
  /CRITERIA=CIN(.95) .

* The only difference between the two methods is the
  use of Hartley's F test instead of Levene's test
  (the latter evaluates residuals and requires
   the original, non-aggregated data).