** Normalization of raw scores
** Valentim R. Alferes (University of Coimbra, Portugal)
** valferes@fpce.uc.pt

* This syntax job normalizes raw scores and can be used
* in a variety of measurement contexts (e.g., psychometrics).

* Just a few words on terminology.

* Normalization is a kind of nonlinear transformation (area 
* conversion) of scores, so that the new distribution may have a 
* normal or bell shape. We can do this by taking the cumulative 
* proportions of raw scores as probabilities, finding their
* corresponding normal deviates, and converting them to normalized
* scores with a desired mean and standard deviation.
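* As a rough illustration (not taken from the book), assume a desired
* mean of 50 and sd of 10. A cumulative proportion of .50 corresponds
* to a normal deviate of 0 and gives 0*10 + 50 = 50; a cumulative
* proportion of .8413 corresponds to a normal deviate of about 1.00
* and gives about 1*10 + 50 = 60.
* A minimal sketch, with the hypothetical variable names CUMPROP, Z and T:
*              COMPUTE Z = IDF.NORMAL(CUMPROP,0,1).
*              COMPUTE T = Z*10 + 50.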


* Standardization is a simple linear transformation of raw scores,
* so that the new distribution will have a mean of 0 and a sd of 1.
* To find a standard score, just calculate z = (X - mean)/sd.
* You can do this in SPSS either by using the menus (DESCRIPTIVES…/Save 
* standardized values as variables) or by the simple syntax:
*              DESCRIPTIVES VARIABLES = VAR1 (ZVAR1).
* Note that the standardization doesn't change the shape of the
* original distribution.
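* For instance, a hypothetical raw score X = 130 in a distribution
* with mean = 100 and sd = 15 gives z = (130 - 100)/15 = 2.
* The same result can be sketched with COMPUTE, assuming you type in
* the mean and sd (here the illustrative values 100 and 15) by hand:
*              COMPUTE ZVAR1 = (VAR1 - 100)/15.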

* You can also change the mean and the sd of standard scores
* by calculating C = z*sd + mean, where mean and sd are the desired values.
* This is also a linear transformation and the resulting scores (C)
* are often known as converted scores.
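* For example, a z of 1.5 converted to a scale with mean = 50 and
* sd = 10 gives C = 1.5*10 + 50 = 65. A minimal sketch, assuming the
* standard scores were saved as ZVAR1 and using the hypothetical
* name CVAR1 for the converted scores:
*              COMPUTE CVAR1 = ZVAR1*10 + 50.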

* Getting back to normalization, we have illustrated this syntax
* with an example from the classic:
* Guilford, J. P., & Fruchter, B. (1978). Fundamental statistics
* in psychology and education (6th ed.). New York: McGraw-Hill. 

* In this example (Table 19.2, p. 479), we have 83 raw scores
* grouped in 20 classes, as well as their upper limits and frequencies.
* In Table 19.4 (p. 482), you can find all the raw scores for
* which Guilford and Fruchter obtain normalized scores
* (T scores with a mean of 50 and a SD of 10).

* After running the syntax, you will have the output with normalized
* scores rounded to the nearest integer (TSCORE1), to the nearest .5
* (TSCORE2) and to one decimal place (TSCORE3). Usually, we choose one
* of the first two solutions, but it is up to you.
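* For example, a hypothetical predicted value of 43.26 would be listed
* as RND(43.26) = 43 (TSCORE1), RND(2*43.26)/2 = 43.5 (TSCORE2) and
* RND(43.26*10)/10 = 43.3 (TSCORE3).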


DATA LIST FREE /UPPERLIM (F8.0).
* Enter raw scores for which you desire normalized scores.
* (Table 19.4, column 1, Guilford & Fruchter, 1978, p. 482).
BEGIN DATA
120 125 130 135 140 145 150 155 160 165 170 175 180 
185 190 195 200 205 210 215 220 225 230 235 240
END DATA.
SAVE OUTFILE=OUTF1.

DATA LIST LIST /SCORES(A20) UPPERLIM(F8.1) FREQ(F8.0).
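* Enter classes of observed scores, upper limits, and frequencies.
* (Table 19.2, columns 1 to 3, Guilford & Fruchter, 1978, p. 479).
* The upper limits below use a comma as the decimal mark (134,5), which
* assumes a locale where the comma is the decimal indicator. Where the
* period is the decimal indicator, type them with periods (134.5).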
BEGIN DATA
130-134  134,5  1
135-139  139,5  0
140-144  144,5  1
145-149  149,5  1
150-154  154,5  2
155-159  159,5  5
160-164  164,5  6
165-169  169,5  5
170-174  174,5  5
175-179  179,5  9
180-184  184,5  11
185-189  189,5  6
190-194  194,5  6
195-199  199,5  6
200-204  204,5  7
205-209  209,5  5
210-214  214,5  5
215-219  219,5  1
220-224  224,5  0
225-229  229,5  1
END DATA.

* Enter mean for T Scores (50 in the Guilford & Fruchter example).
COMPUTE MEAN = 50.
* Enter standard deviation for T Scores (10 in the same example).
COMPUTE SD= 10 .

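* Total N, added to every class through a constant break variable.
COMPUTE DUMMY=1.
AGGREGATE/OUTFILE=OUTF2/BREAK=DUMMY/N=SUM(FREQ).
MATCH FILES/FILE=*/TABLE=OUTF2/BY DUMMY.
* Cumulative frequencies and proportions up to each upper limit.
CREATE CUM_F=CSUM(FREQ).
COMPUTE CUM_PRO=CUM_F/N.
* Normal deviates and T scores. The last class (CUM_PRO = 1) has no
* finite normal deviate, so its Z and T_SCORE should be system-missing.
COMPUTE Z=IDF.NORMAL(CUM_PRO,0,1).
COMPUTE T_SCORE=Z*SD+MEAN.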
FORMATS CUM_F (F8.0) CUM_PRO (F8.3) T_SCORE (F8.1).
* This line produces Table 19.2 (Guilford & Fruchter, 1978, p. 479).
LIST SCORES UPPERLIM FREQ CUM_F CUM_PRO T_SCORE.
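* As a check on one row of this listing: for the class 180-184 the
* cumulative frequency is 46, so CUM_PRO = 46/83 = .554, Z is about
* 0.14 and T_SCORE is about 0.14*10 + 50 = 51.4.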

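* Append the requested raw scores (no FREQ or T_SCORE) and predict
* their T scores from a linear regression of T_SCORE on UPPERLIM.
ADD FILES /FILE=*/FILE=OUTF1.
REGRESSION/DEPENDENT T_SCORE/METHOD=ENTER UPPERLIM/SAVE PRED.
COMPUTE TSCORE1=RND(PRE_1).
COMPUTE TSCORE2=RND(2* PRE_1)/2.
COMPUTE TSCORE3=RND(PRE_1*10)/10.
* Keep only the appended raw scores.
SELECT IF (SYSMIS(FREQ)).
COMPUTE RAWSCORE=UPPERLIM.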
FORMATS RAWSCORE (F8.0) TSCORE1 (F8.0) TSCORE2 (F8.1) TSCORE3 (F8.1).
* This line produces Table 19.4 (Guilford & Fruchter, 1978, p. 482).
LIST RAWSCORE TSCORE1 TSCORE2 TSCORE3.