# Sample SPSS Syntax

SPSS syntax is a must!

Are you aware of the book *SPSS Programming and Data Management*?

Don't settle for the Graphical User Interface (GUI) alone!

The GUI is fine (I use it every day); however, using syntax **in addition** to the GUI can easily **increase productivity** by a factor of 5 to 10 for simple jobs. The increase can easily be 50 times or more for larger, complex jobs. Furthermore, some of SPSS's features are only available through syntax. As a "bonus", syntax files work on all versions of SPSS, not just the Windows one.

There is something for everybody in the sample syntax files included here:

- Some do simple things, are easy to understand and have a lot of comments.
- Some do complex things and have either no comments or a lot of comments.
- Others fall between these two extremes.

Suggestions and code contributions are welcomed. Share what you know! Learn what you don't!

## Index

The syntax files are **broadly classified by purpose** as follows:

- Area Under the Curve (AUC)
- Batch files
- Block Designs
- Bootstrap and random numbers
- Charts and Tables
- Cluster Analysis
- Combinations, permutations, interactions
- Compute
- Concatenate/modify string variables (see also Parse or flag data)
- Data Editor
- Data validation
- Dates and time (see also the Dates, Time and Age Tutorial)
- Distributions, Confidence Interval
- Export import (see also the FAQ)
- Factor analysis
- Flag or select Cases
- IGRAPH (see also the corresponding script section)
- Item Analysis
- Labels, variable names and format
- Matching data files
- Matrix
- Meta Analysis
- Multiple responses
- OMS
- Outliers
- Parse or flag data (see also String Manipulation Tutorial)
- Random sampling
- Ranking, largest values, sorting, grouping
- Read write or create data
- Regression, repeated measures
- Remove characters, duplicates or variables
- Restructure file
- ROC curves
- Sample Size, Power
- Self adjusting syntax
- Strings
- Survival Analysis
- Tests of inequality
- Test if file or variable exists
- Time Series
- Transform variable
- T-Test or Means or ANOVA
- Unclassified
- Working with many files (see also the corresponding script section)
- Working with missing values

## Caveats and Suggestions

I have not necessarily checked each and every file found here. I grabbed some files that looked interesting but have not yet had the time to review them. So much code… so little time…

When the information is readily available, I show the name of the authors of syntax I did not write. Usually, I do not show email addresses in order to reduce the number of emails to authors. **If somebody objects to having his or her code or name listed, please send me an email and I will quickly remove the reference.** **If, on the other hand, you send me code which would be useful to other visitors, I will gladly include it here with due credit.** Such code should contain dummy data (ideally using DATA LIST or INPUT PROGRAM) and a description of its purpose.
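
As a rough illustration of what such a contribution might look like, here is a minimal sketch with a one-line description of purpose and inline dummy data read with DATA LIST. The variable names (id, sex, score) and the values are invented for this example and do not come from any file listed on this page.

```spss
* Purpose: compare the mean score of two groups (illustrative sketch only).
* The dummy data below are made up for the example.
DATA LIST LIST /id (F3) sex (F1) score (F5.1).
BEGIN DATA
1 1 12.5
2 2 14.0
3 2 11.5
4 1 13.0
END DATA.
VALUE LABELS sex 1 'Male' 2 'Female'.

MEANS TABLES=score BY sex.
```

Because the data travel with the syntax, anyone can run the file as is and see what it is meant to do before adapting it to their own data.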

If you don't measure it… you can't improve it!

## Syntaxes

### Area under the curve (AUC)

### Batch files

### Block Designs (with thanks to Valentim R. Alferes)

- Completely Randomized Designs.sps (equal or unequal n per treatment)
- Random assignment of units to experimental treatments.sps This is for Randomized Block Designs (Simple & Generalized) and Completely Randomized Designs (equal n per treatment)

### Bootstrap and random numbers

**Tip:** any time you use random numbers and need to be able to **reproduce your results**, use "SET SEED=number." at the beginning of your syntax, where "number" is any arbitrary number you come up with. One option is to use the current date and time (e.g. "SET SEED=1120722." if it is 7h22 on Nov 20th).
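
A minimal sketch of the tip, assuming the legacy random number generator is active (if the Mersenne Twister generator has been selected with SET RNG=MT, the starting point is controlled by MTINDEX rather than SEED). The variable names id, u and z are hypothetical.

```spss
* Fix the seed so that the random draws below can be reproduced.
SET SEED=1120722.

* Generate 5 cases of dummy data with one uniform and one normal draw each.
INPUT PROGRAM.
LOOP id = 1 TO 5.
COMPUTE u = RV.UNIFORM(0,1).
COMPUTE z = RV.NORMAL(0,1).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
LIST.
```

Running the whole block twice should list the same values both times; removing the SET SEED line should give different values on each run.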

- Bootstrap confidence interval for the variance of a variable.sps
- Bootstrap confidence interval for Cronbach alpha.sps
- Bootstrap crosstab.sps
- Bootstrap ordinary least square (OSL) estimators.sps
- Bootstrap the mean and median.sps
- Generating multinominal random variables.sps (AnswerNet)
- Generating multivariate hypergeometric random variables.sps (AnswerNet)
- Generating multivariate normal variables with a specific covariance matrix (AnswerNet)
- Generate random triad numbers.sps
- Get random sample of various size then calculate statistics.sps (Compare means of n samples of size s1 s2 ... sn ...)
- Get various random samples of same size calculate statistics.sps (Compare means of n1 n2 ... nn samples of size s)
- Sampling distribution of the correlation between 2 variables.sps

**Bootstrapping using OMS (this requires v12)** - oms_bootstrapping.sps

### Charts and Tables

#### Charts

- Bar charts for school types by sex where percentages of each sex add up to 100 percent.sps
- Blank bar for unselected category.sps (from AnswerNet)
- Blank bar for unselected category (generalized).sps (note, however, that showing empty categories in CTABLES is trivial: all that is required is to specify EMPTY=INCLUDE in the /CATEGORIES subcommand)
- Compare (superimpose) two histograms.sps (one could also use a population pyramid; see the IGRAPH section below for an example)
- Count outliers.sps (show number of outliers in a boxplot)
- Do bar charts excluding categories with small number of cases.sps This macro is fully commented here. Newbies who do not know how to use a macro should read this explanation.
- Do many histograms with the same axis boundaries.sps (this demonstrates the use of the macro Dograph.sps and template Dograph.sct to produce many graphs with the same x and y scales)
- Graph cumulative percentage retired at attained age by categorical variable.sps
- Graph cumulative percent on X axix.sps
- Graph survey question.sps
- Histogram with percent on y axis instead of numbers.sps
- Identify your own data in the chart.sps
- Identify your own data in the chart version2.sps This is a **generalization** of the above syntax. It uses the following chart template: Identify Your Own Data.sct (right-click on this template and select save target)
- Print current date in chart title.sps
- Print current date and time in chart title.sps (Same technique can be used with Tables)
- Print histogram or bar chart depending on data.sps (a good macro example)
- Print school names as part of graph titles.sps
- Show mean values in line graph.sps
- Show 2 categories on same histogram.sps
- ZIPF law and graph.sps

#### Tables

- Construct a table "manually" in the data editor.sps (a good example of data restructuring)
- Construct a table "manually" example no 2.sps (non trivial code...)
- Find population frequency when multiple response with long strings.sps
- List variables in frequency table by order of medians.sps
- Print actual name group and id in heading of each listing.sps
- Print mean plus minus standard deviation in Table.sps
- Put 4 variables in the same frequency table.sps
- Show empty category in tables.sps (from AnswerNet; note that this is trivial with CTABLES)
- Show empty categories in tables (second method).sps
- Show mean values.sps
- Show number of valid cases in table footnote.sps
- Table where list of variables is generated by macro.sps (Illustrates the !IF ...!ELSE ... !IFEND macro command)
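
The CTABLES remark above (EMPTY=INCLUDE in the /CATEGORIES subcommand) can be sketched as follows. This is a hedged illustration, not one of the listed files: it assumes the Custom Tables option is installed, and the variable region and its labels are invented. The category "South" has a value label but no cases, which is the situation in which an empty category can be displayed.

```spss
* Dummy data: value 2 is labelled but never occurs.
DATA LIST LIST /region (F1).
BEGIN DATA
1
1
3
END DATA.
VALUE LABELS region 1 'North' 2 'South' 3 'West'.
VARIABLE LEVEL region (NOMINAL).

* Ask CTABLES to keep categories that have no cases.
CTABLES
  /TABLE region [COUNT]
  /CATEGORIES VARIABLES=region EMPTY=INCLUDE.
```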

#### CTABLES

- Get statistics for grouping of variables.sps
- Sort categories by decreasing count but with Others as last one.sps
- Using Macros and CTABLE.sps

### Cluster Analysis

- Cluster analysis using similarity proximity (count) data as input.sps
- Save centers of Hierarchical cluster analysis as initial value of K-means.sps

### Combinations, Permutations, Interactions

- All combinations of 3 numbers out of n.sps (see "Find all Combinations .." below for a generalization)
- All combinations of 3 letters out of n.sps (with replacement)
- Calculate interaction terms between 2 categorical variables.sps (within a regression context)
- Create a new variable for each combination of 2 variables.sps
- Find all combinations of 1 up to n items out of m items.sps (high power stuff!)
- Find all combinations of n items out of m items.sps (high power stuff!)
- Find all permutations of integers 1 to n.sps Maximum value of n is 7. Combined with recode, this can find permutations of any strings or numbers.
- Generate orders for block of trials.sps
- Get all possible crossproducts of pairs of variables.sps (contains a fair amount of comments)

### Compute

- Automatically compute sample weights to approximate population.sps
- Box-CoxTransformation.sps Transforms var1 using each of the 31 values of lambda between -2 and 1 (in increments of 0.1).
- Count number of distinct values across 400 variables.sps
- Compute percentage of patients having each fracture category.sps
- Compute z = x / max( y) where max( y) is over all cases.sps (it is sometimes preferable to use this macro technique)
- Compute distances between 2 points on earth.sps (with thanks to Simon Freidin)
- Compute average of m variables where m is a variable in the data file.sps
- Create a new variable equal to mean of an other variable.sps
- Find the cubic root.sps
- Reverse the digits on an integer.sps
- Weight data based on 2 or more vars.sps With thanks to João Duarte!
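
To give an idea of the Box-Cox item above, here is a minimal sketch of one way to loop over the 31 values of lambda. It is not the code of Box-CoxTransformation.sps; the output names bc1 to bc31 and the dummy data are assumptions made for this example, and the usual Box-Cox convention of using LN(var1) when lambda is zero is applied.

```spss
* Dummy data: var1 must be strictly positive for a Box-Cox transform.
DATA LIST LIST /var1 (F8.2).
BEGIN DATA
0.5
1
2.5
10
END DATA.

* bc1 to bc31 hold the transform for lambda going from -2 to 1 by steps of 0.1.
VECTOR bc(31).
LOOP #i = 1 TO 31.
COMPUTE #lambda = -2 + (#i - 1) * 0.1.
DO IF (ABS(#lambda) < 0.00000001).
COMPUTE bc(#i) = LN(var1).
ELSE.
COMPUTE bc(#i) = (var1**#lambda - 1) / #lambda.
END IF.
END LOOP.
EXECUTE.
```

The tolerance test on #lambda is there because a value built from repeated steps of 0.1 may not be exactly zero in floating point.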

### Concatenate/modify string variables (see also Parse or flag data)

- Apparent problem with concat.sps Newbies should take a look at this example.
- Combine a string variable and a numeric variable.sps
- Concatenate.sps (new string equals concatenation of values in second variable)
- Concatenate content of cases with same id.sps
- Concatenate numbers.sps
- Concatenate 22 variables.sps
- Convert first letter of each word to upper case.sps (Thanks to A. Paul Beaulne for sending me this code)
- Create an id using name and dob.sps
- Normalise alpha.sps (Capitalise the first letter of each word, use lower case for the other letters)
- Normalize string.sps (delete spaces at beginning, remove period at end, capitalize all letters)
- Remove initial from name.sps
- Remove period from string.sps (can be modified to remove any other characters)
- Reorganize names.sps (place family name at the beginning of the string)
- Transform ascii codes into characters.sps

### Data Editor

### Data validation

### Dates and time (see also the Dates, Time and Age Tutorial)

- Add 60 days to a date then find end of that month.sps
- Add leading zeros to a string date.sps
- Ages are in nnH nnD nnM and nnA.sps
- Break down number of days in hospital by calendar month.sps
- Calculate age.sps
- Calculate time differences to milliseconds.sps
- Calculate mean date and standard deviation in days.sps
- Calculate nb of days within the eligibility period.sps
- Caculate number of minutes between 2 timestamps (crossover midnight).sps
- Calculate number of months between 2 dates.sps
- Calculate waiting time when time is coded in hh min.sps
- Compute number of weekdays between 2 dates.sps
- Compute number of weekdays excluding public holidays.sps
- Compute sleep time.sps
- Convert basis.sps
- Convert strings into numbers.sps (variable contains age in any of the following formats: "7 Y" for 7 years, "3m" for 3 months, "28D" for 28 days; these need to be converted to years)
- Convert string formated as hhmmss into numeric time variable.sps (thanks to Jim Marks)
- Convert string 01jan1992 to a date variable.sps
- Convert string 1997-08-22 into a date variable.sps
- Convert string into date and time variables.sps
- Convert string "2006-04-28 18:20:01" to datetime format.sps
- Convert string to date and select cases which fall during the weekend.sps
- Convert string "04Apri03" to a date variable.sps
- Date plus 3 months.sps
- Dates appear as asterix on chart.sps (solution)
- Extract time portion from string variable containing date and time.sps
- From AM PM to military time.sps
- Importing from excel (convert days into dates).sps
- Keep time portion of date when creating Tab delimited file.sps
- Make variable equal to current date.sps
- Number of consecutive 30 minutes of hypoxia.sps
- Print current date as part of graph title.sps
- Print date and time before a procedure.sps
- Print day name along with date.sps
- Read time stamp.sps
- Save data file with current date as part of name.sps
- Select a range of dates.sps
- Time an SPSS procedure.sps

### Distributions, Confidence Intervals

- Add variables containing lower
and upper CI for mean.sps
- Bayes estimates for proportions and their CI.sps
with thanks to Evgeny Ivashkevich (this also calculates Confidence
Intervals for a category not present in the sample)
- Calculate Chi-square significance
given q and df.sps
- Calculate 95 percent
confidence interval for the median.sps
(thanks to Marta
Garcia-Granero)
- Calculate McNemar Chi-Square test.sps (thanks to Marta)
- Hodges Lehmann
Confidence Interval for Median difference.sps
(thanks to Marta
Garcia-Granero)
- Exact
confidence limits for a binomial parameter.sps
- Goodness of fit test for Poisson Distribution.sps
(thanks to Marta Garcia-Granero)
- Normalization
of raw scores (with thanks to Valentim R. Alferes)
- Proportion tests and confidence intervals.sps (thanks to Gwilym Pryce) This includes large-sample
  - significance test for a single population proportion
  - confidence interval for a single population proportion
  - significance test for two population proportions
  - confidence intervals for comparing two population proportions
- Testing linear constraints in MR.sps with thanks to Johannes Naumann. This macro tests General Linear hypothesis of the type cb=d, where b is a vector of regression coefficients and c is a matrix of linear constraints.
- Univariate and multivariate tests of skew and kurtosis (a link to Lawrence T. DeCarlo's SPSS macro). The same site also contains
  - SPSS macro for Mardia's multivariate skew
  - SPSS programs for signal detection models expressed as generalized linear models

**Fitting distributions**

- Fitting models with overdispersion or 'extra-Poisson' variation.sps

### Export_Import (see also FAQ and Sample Scripts)

- Export all tables in word.sps
(see Sample Scripts for an automated
solution)
- Export
data and value labels to excel.sps
- Export content of data editor to
a specified sheet of an existing Excel workbook.sps
- Export from SPSS to ACCESS.sps
- Export from SPSS to
ACCESS (method2).sps
- Export more than 256 vars to Excel.sps
- Export some SPSS vars to many sheets of Excel workbook.sps
- Import from ACCESS or
LotusNotes.sps (no DSN needed: this is very handy. Thanks to Tom
Dierickx)
- Writing back
an SPSS 10 file to an ODBC database.sps (from AnswerNet)

### Factor Analysis

- Determining the number of
components using parallel analysis and Velicer's MAP test (a link to
Brian P. O'Connor's site)
- Factor analysis with Spearman
correlation through a matrix.sps

### Flag or Select Cases

- Exclude
"outliers" from analysis.sps (where outliers are defined
as cases outside Mean +/- 2 SD)
- Flag cases
where a given string variable contains a given word.sps
- Flag
cases where any of a list of variables have same value.sps
- Flag
cases where salary is in top 95 percentile.sps
- Flag
cases meeting a certain condition as well as preceding and following case for
the same person.sps
- Flag first and last
dates (within each ID).sps
- Keep only duplicate
cases.sps
- Print
frequency table of the n most (less) frequent items.sps
- Select
patients where drug1 was given before drug2.sps
- Select
cases where same letter appears twice in string.sps
- Sophisticated
search in string variable.sps (data were scanned, portion of strings
include letters (eg B) instead of numbers (eg 8); this syntax flags the errors)

### IGRAPH (see also the corresponding script section)

- Clustered bars
with percent based on total in cluster.sps
- Example of surface plot.sps
- Graphing an arbitrary function.sps
- Graph showing
interaction in multiple regression.sps
- How to speed up IGRAPH.sps (a similar approach could be used for other types of graphs)
- Population pyramids.sps
- Produce long
IGRAPHs.sps
- Separate box plot graph
for each category value.sps (syntax can be adapted to any other type
of graph)

### Item Analysis

- Syntax for item analysis.sps
This is based on SPSS's
White Paper on Item Analysis and on this Exercise
data file.
- Syntax
For Item Analysis V6.sps This is a
**much improved** version
of the above. It is fully automated and has been developed and
tested using SPSS 15.

### Labels, Variable Names and Format

- Add
(or replace) a character at the beginning of each var names.sps
- Add
'_99' at the end of every variable names.sps
- Apply lab1 as
value label to var1 by syntax.sps
- Assign same
label to many variables.sps
- Assign value
labels to a vector.sps
- Assign
variable and value labels of a given variable to other variables.sps
- Automatically
rename variables.sps
- Auto
variable renaming or copying.sps
- Change
case of Var Labels and/or Value Labels.sps with thanks to Simon
Freidin
- Change format of
600 variables.sps
- Convert variable
format.sps (see also the
following tutorial. If you are not
familiar with macros, see this macro
tutorial for newbies).
- Create dummy
variables.sps (also called indicator or binary variables)
- Create dummy
variables (AnswerNet).sps
- Create
new variable equal to number of occurrences of var1.sps
- Define a global
variable.sps (this is a useful programming
technique)
- Define variable
label by Macro.sps
- Delete all variable
labels of a given sav file.sps
- DeleteListOfVariableNames But Some May Not Exist.sps
- Delete variables with all values equal to zero.sps
- Delete or
reorder variable names (data fields).sps
- Delete many
variable labels.sps
- Group
data and define corresponding value labels.sps
- Define list of
variables between two variables.sps (a macro Gem)
- Match label file
with data file.sps
- Print
variable labels and value labels in FREQ Tables.sps
- Read
ASCII data variable name, value and value labels.sps
- Recode
variables var1 becomes varx etc.sps
- Remove
underscores from all variable names.sps (can
be adapted to remove any other character)
- Rename variables.sps
- Rename
all variables t2abc becomes t1abc etc.sps
- Rename var in file1 to names in file2.sps
- Reverse scale and value labels.sps
- Round
and change format of all numeric variables.sps
- Show 0.45 instead of
.45.sps
- Sort
variable names by alphabetical order (AnswerNet).sps
- Sort variables
by name in data file.sps (sent by A. Paul Beaulne)
- SortVariablesByAlphabeticalOrder.sps
- Write
value labels to ASCII file (AnswerNet).sps
- Xpand vector names.sps

### Matching data files

- Compare 2 data files.sps
with thanks to Simon Freidin
- Create data file if double entries are equals.sps (where entries were made by 2 different persons in 2 different files)
- Double entry check.sps
- Find errors in 2 files (data entered twice).sps
- Match one to many where key has 4 variables.sps
- Match 2 files using between-dates criteria.sps
- Merge 2 data files
based on many to many relationship.sps

### Matrix

- Coder Reliability with Nominal Data.sps for **version 14+** of SPSS with thanks to Pieter van Groenestijn. Here are related notes and references. Here is the syntax for **version 13** and related notes.
- Cohen's Kappa.sps with
thanks to Brian G. Dates. This syntax provides complete information on
kappa for any number of raters and categories.
- Example that reads, writes, creates and transforms matrices.sps
- Export variance
covariance matrix to ASCII file.sps
- Export variance
covariance matrix to sav file.sps
- Find inverse of a matrix.sps
- Fuzzy Crosstable using
Matrix command.pdf with thanks to Ruben P. Konig
- Hierarchical sort in MATRIX.sps
with thanks to Kirill Orlov
- Macro autogenerate initial data file.sps
with thanks to Fernando Cartwright
- Matrix out in.sps
- Maximizing the trace of a matrix.sps This is high powered stuff. The macro needs to test all permutations of rows in order to find the one which maximizes the trace. For a 7*7 matrix (the maximum size this macro will handle), there are 7! = 5,040 permutations to test.
- Read matrix data.sps
- Reliability
analysis when input is a correlation matrix.sps
- Transform a matrix into a vector.sps

### Meta Analysis (see also meta-analysis stuff by David B. Wilson)

- META-SPSS.ZIP An **exhaustive** set of syntax files written by **Marta Garcia-Granero** as well as sample data files and supporting documents. This is the Read Me First documentation.
- Meta Analysis: fixed and random effects models.sps (with thanks to Valentim R. Alferes) This SPSS syntax does a meta-analysis on a set of studies comparing two independent means. It produces results for both fixed and random effects models, using Cohen's d statistics. The user has a total of **10 modes** for entering summary data.

### Multiple responses

- Count unique
occurences of a multiple response.sps
- Create
dichotomous variables from multiple responses which are not in order.sps
- Multiple responses are
encoded as comma separated letters.sps

### OMS

### Outliers

**Caution**: Before replacing or deleting outliers, see the warning at the beginning of syntax # 3.
- Exclude cases over mean plus 2 times sd.sps
- Replace
outliers by average of cases with same characteristics.sps
- Replace outliers by
mean plus/minus n times sd.sps
- Winsorize a mean.sps

### Parse or Flag data (see also String Manipulation Tutorial)

- Extract bits from an integer.sps
- Extract portion of string.sps (string
contains first and last name, want first 3 letters of last name)
- Extract portion of string
starting with a digit.sps
- Extract Zip code from
address field.sps
- Extract two numbers from a
string.sps (e.g. string "120/90" becomes numbers 120 and
90)
- Flag if last
characters of string are 'Esq'.sps
- Parse a string into one letter per
variable.sps
- Parse comma separated
numbers.sps
- Parse data separated by
slashes.sps
- Parse domain name from
email addresses.sps
- Parse comma separated strings then autorecode results.sps
- Parsing a
variable which has embedded line feeds.sps (thanks to Bjarte Aagnes)
- Remove
letter at end of string and convert remaining string to a number.sps
- Split
a string variable into plaintiff and defendant portions.sps
- String
variable contains items separated by a slash.sps
(there is a variable number of items from one case to the next)
- Weed out letters in a
string and create a number with remaining digits.sps

### Random Sampling

- Complex sampling
without replacement.sps
- Draw
without replacement (random permutation of numbers).sps
- Generate random phone
numbers.sps (the syntax uses the following data file)
- Find random pairs of
cases for T-test.sps
- Find
random pairs of cases with same characteristics.sps
- Flag n random
cases within each subgroups.sps
- Get 2
independent samples meeting given criteria.sps
- Get 2 random samples
same sex age education.sps
- Get n independent random samples of size m from same file.sps
- Get random
sample of x% of each stratum.sps
- Get
random sample of N cases from each stratum.sps
- Getting repeated
sampling from same file.sps
- List of random cases id
10 per line.sps
- Match cases on
basis of propensity scores.sps (this involves matching cases which
do
**not** match **but are close** to each other)
- Proportional
sampling without replace.sps
- Proportional random
sampling.sps
- Proportional
sampling without replacement.sps
- Random sample n males
and n females.sps
- Random samples with
same age sex education.sps
- Random split a file in
two files.sps
- Randomize
a variable n times and keep each randomization.sps
- Scramble social insurance numbers.sps
- Select 2 cases from each
group.sps
- Select random samples
of each group.sps
- Split files in 2 random
portions.sps
- Split a file into 10
random groups of equal size.sps
- Systematic fixed
sampling.sps

### Ranking, largest values, sorting, grouping

- Aggregating
with the median.sps
- Calculate
cumulative sum of Var1.sps
- Calculate mode.sps
- Calculate
number of distinct values within Case.sps
- Calculate
z scores across variables.sps
- Code
using percentiles of a subset.sps
- Compute
percentiles for one variable and by one or more grouping variables.sps
(with thanks to Tom Dierickx). Note that the percentiles
end up in the data editor, not in the Output window.
- Compute
percentages based on values of first case.sps
- Create
n tiles based on percent ranges rather than on count.sps
- For each
case, find the earliest case in the preceding 7 days.sps (relatively
complex stuff)
- Find
5 largest values within case.sps
- Find
last 2 scores on repeated measure.sps
- Find multi-modal values per id.sps
- Identify
the highest 3 scores of each case.sps
- Identify
variables having minimum value.sps (with thanks to Maciek Lobinski)
- New
variable equals cumulative totals by id.sps
- Number
consecutively cases with the same id.sps
- Random order.sps
- Rank
equal intervals between minimum and maximum.sps
- Rank
on basis of percentage of good.sps
- Rank
variable names in alpha order.sps
- Rank within
cases.sps
- Replace
missing by median values within the case.sps
- Round up
to the higher point 5.sps
- Saving
confidence interval for mean (within groups).sps
- Score a
test with an answer key.sps (thanks to A. Paul Beaulne)
- Sorting
values within cases (using the bubble sort algorithm).sps
- Syntax
group data in bands.sps
- Various Algorithms to sort within cases.sps With thanks to Kirill Orlov

### Read, Write or Create Data

#### Create

- Adding new cases
using syntax.sps
- Add
variable equal to function of an existing Var.sps
- A few simple examples of INPUT PROGRAM.sps
(a short tutorial)
- Copy
some variables from each record type 1 to add a new record of type 0.sps
- Create consecutive records at the end of the file.sps
- Create
constants for each non missing date.sps
- Define new
variables in empty data set.sps
- Define varx to vary.sps
- Duplicate
cases n times where n is variable.sps (see also Expand Crosstab Data below)
- Expand
crosstab data into original data file.sps (disaggregate data)
- Expand data x and y times.sps (e.g. from a case where age=20, males=5 and females=6, create 5 cases with age=20 and sex=1 and 6 cases with age=20 and sex=0)
- Fill the
gaps when Aggregate has empty categories.sps Syntax creates
cases to fill the gaps
- Generate random dates.sps
- INPUT
program (to generate a random data file).sps
- Insert missing
cases (within id).sps
- Insert missing dates
(within id).sps
- Printing date time in
output.sps

#### Read

- Example of data list.sps
- Example of INPUT program.sps
- Read a variable
number of records per case.sps
- Read
ASCII (logical case is made up of 5 rows of 10 cases).sps
- Read ASCII file
using FILE TYPE.sps
- Read ASCII file
using INPUT PROGRAM.sps
- Read
ASCII file with a forward slash delimiter.sps
- Read
ASCII file with comma or dash delimited data .sps
- Read
ASCII with comma and dot separated decimals.sps
- Read
ASCII file with comma separated data (within quotes).sps
- Read ASCII
file with fixed and free data.sps
- Read ASCII file with
FIXED Data.sps
- Read ASCII file with
REPEATING data.sps
- Read
comma delimited fields with commas inside quoted strings.sps
- Read comments
between the lines of data.sps
- Read complex file.sps
- Read data files that has no carriage returns.sps (from AnswerNet) (data is just one long stream, with no separation between records or fields, and no carriage returns)
- Read data list free with consecutive commas.sps
- Read data produced
by CGI script.sps
- Read data where each case has 4 numeric records and a variable number of string records.sps (this illustrates the use of the REREAD command)
- Read data
inline File Type MIXED Records.sps
- Read text
file where n columns are to be ignored.sps (n
is a variable which varies by file)
- Skip first 6 Records.sps
- Skip one line of data.sps

#### Write

- Write comma or tab
delimited file.sps
- Write
frequency percentages to data file.sps
- Write missing values
as a dot.sps
- Write special ASCII
file.sps
- Writing value
labels instead of values.sps

### Regression, Repeated Measures

- Add
casewise regression coefficients to data file.sps
- Breusch-Pagan
& Koenker test.sps (thanks to Marta Garcia-Granero)
- Calculate
predicted values (unianova).sps
- Chow test.sps
- Compare
regression coefficients.sps (thanks to A. Paul Beaulne for sending
me this code)
- Compare coefficients generated by various groups.sps
- Conditional
logistic regression.sps
- Do
all univariate linear and logistic regressions.sps
(thanks to Marta Garcia-Granero)
- Do All-Subsets
regressions.sps
- Generalized
Estimation Equations (GEE).zip (thanks to Terry Duncan, PhD,
Oregon Research Institute). GEE is a macro for analyzing
longitudinal data. The SPSS macro uses the GEE approach of Liang and
Zeger (1986) to model longitudinal data for a general class of outcome
variables including gaussian, poisson, binary and gamma outcomes.
- Logistic
regression by macro.sps
- Multinominal
Logistic Regression with split-sample validation.sps (thanks
to Maciek Lobinski for sending me this macro)
- Non linear
regression (NLR) with variance of residuals as the loss function.sps
(this is
**not trivial**)
- Piecewise
regression.sps (also known as "spline regression" and
"piecewise polynomials")
- Regression
calculates table of predicted values.sps
- Regression in a loop.sps
- Regression
when holding out k cases.sps
- Regression
with correlation matrix as input.sps
- Regression
with normed weight.sps
- Repeated measures
macro.sps
- Ridge regression.sps
(this comes with SPSS)
- Testing
individual regressors in logistic regression.sps
- White's standard errors full OLS and White's SE output.sps (thanks to Gwilym Pryce) See also the following related tutorials on Heteroscedasticity.
- White's test: calculate the statistics and its significance.sps (thanks to Marta Garcia-Granero)

### Remove Characters, Duplicates or Variables

- Delete cases with
offset cases.sps
- Delete double entries.sps
(thanks to Maciek Lobinski) For instance if, for a
given case, var1 equals var2, the syntax replaces var2 by sysmis.
- Find duplicates.sps
- Remove double quotes.sps
- Remove duplicate
records.sps
- Remove unused
variables from many files.sps
- Replace
consecutive spaces in string by a single space.sps
- Replace character In
string.sps (see also String
Manipulation Tutorial)
- Save duplicates in a separate file.sps

### Restructure File

- Allocate dummy
variables to 24 hours.sps
- Automated data
transform from tall to wide.sps
- Automated
restructure v4.sps
(thanks to Kevin Hynes) This
example maintains a grouping factor while restructuring data from tall
to wide.
- Automated
restructure from long to wide.sps with thanks to Hillel Vardi. This
is the sample data file used.
- Collapse empty
variables within a case.sps
- Deduplicate
cases while keeping all the information.sps (a cute little problem)
- Each variable occupies 5 rows of 10 columns.sps (another nice little problem)
- Find
beginning and end of continuous periods.sps
- From many to one example1.sps
- From many to one example2.sps
- From many to one with
alpha data.sps
- From many
to one with specific order of new variables.sps
- From one to Many simple.sps
- From one to many
with indicator variable.sps
- Restructure data file
example1.sps
- Restructure data file
example2.sps
- Restructure data file
example3.sps
- Restructure data file
example4.sps
- Restructure
from tall to wide (general solution).sps (non trivial macro code...)
- Restructure time
periods to a time matrix.sps
- Restructure to
calculate Kappa.sps
- Transpose
(FLIP) string
variables.sps
- Use former variable
names as value labels.sps

The following example requires **version 12 or above**:

- VarsToCases and CasesToVars.sps

### ROC Curves

- ROC curves & Youden's Index.sps The syntax also computes Likelihood Ratios and Kullback-Leibler distances (requires v12 or above).

### Sample Size and Power

- Power analysis
examples.sps With thanks to
Bruce Weaver
- Sample size for means.sps With thanks to Marta Garcia-Granero. This is a collection of several short macros that perform sample size calculations for confidence interval estimation and one sample / two samples tests for means (this last one with equal or unequal sample sizes).
- Sample size for proportions.sps With thanks to Marta Garcia-Granero. A collection of macros that perform sample size calculation for the estimation of one proportion and one or two samples hypothesis testing, as well as the calculation of the power of a test.
- Sample size for correlation hypothesis testing.sps thanks to Marta!

### Self Adjusting Syntax (other examples are scattered throughout this site)

- Automated data transform from tall to wide.sps
- Choice of include file depends on data.sps
- End of macro DO LOOP comes from data.sps
- Execute selective portions of syntax.sps (see also tips on INCLUDE command)
- From 2 files to 1 cases per id.sps
- Syntax varies based on name of data file.sps

### Strings

- Are all words present.sps This tests whether all words passed to the macro are present within a given string variable.
- Are all words present dichotomy vars.sps Similar to the above, but creates one dichotomy variable for each target word.
- Change all strings in data file to lower case.sps
- Convert numbers to strings.sps
- Convert string '250 million' into a number.sps (or '16 billion' etc)
- Convert string to numeric variable.sps
- Soundex Phonetic Comparison.sps

### Survival Analysis

### Tests of Inequality

- Index of dissimilarity.sps (formulas from *Negroes in Cities* (1965) by Karl and Alma Taeuber)
- Many tests of inequality v5.sps (this chart template is used by the syntax)

- significance test for a single population proportion

- confidence interval for a single population proportion

- significance test for two population proportions

- confidence intervals for comparing two population proportions

- SPSS macro for Mardia's multivariate skew

- SPSS programs for signal detection models expressed as generalized linear models



**Many tests of inequality v5.sps** (formulas come from poorcity.richcity.org) calculates the following indexes:

- the ATKINSON index = DEMAND coefficient
- the THEIL redundancy
- the RESERVE coefficient
- the D&R coefficient
- the KULLBACK-LEIBLER redundancy
- the HOOVER coefficient
- the COULTER coefficient
- the GINI coefficient

The Lorenz curve is produced. The various indexes are plotted on the same graph when there is data for more than one year. At the end, there are **9 examples** of how to use the syntax.
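As a rough illustration of two of the indexes named above, the Python sketch below computes the Gini and Hoover coefficients from a plain list of individual values, using the common textbook definitions. The formulas in the syntax file may differ (for example, they may work on grouped data); the function names `gini` and `hoover` and the sample values are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (textbook rank formula).

    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    with x sorted ascending and i running from 1 to n.
    """
    x = sorted(values)
    n = len(x)
    total = sum(x)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    weighted = sum(rank * v for rank, v in enumerate(x, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def hoover(values):
    """Hoover coefficient: the share of the total that would have to be
    redistributed to reach perfect equality."""
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    mean = total / n
    return 0.5 * sum(abs(v - mean) for v in values) / total


# Hypothetical incomes: perfectly equal, then fully concentrated in one case.
equal = [1, 1, 1, 1]
concentrated = [0, 0, 0, 4]
```

With four cases, the most unequal split gives 0.75 for both indexes rather than 1, because the rank formula is bounded by (n - 1) / n.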

### Test if file or variable exists

- Check for existence of file.sps
- Choice of include file depends on existence of a given variable.sps
- Get all string or all numeric variables.sps (the 2 macros produced by this macro allow you to process all string or all numeric variables in the data file)

### Time Series

### Transform variable

- Automatically rescale variable to be between 0 and 1.sps
- Calculate utility of EuroQol 5D questionnaire.sps with thanks to AJ Garcia Ruiz
- Constrain a variable to a given interval.sps (syntax is first given, then it is generalized using 2 macros)
- Convert numbers to string with leading zeros.sps
- Create variable equal to z-scores of an existing variable.sps
- Extract fist or first 2 digits of a large integer.sps
- Global autorecode.sps A nice problem: Autorecode many string variables where the recode formula (e.g. a=1, b=2, etc) is **the same** for all variables even though none of the variables have all possible values
- Replace confidential information eg a ssn by a new (known) id.sps
- Replace values higher than n by the mean of the other values.sps
- Replace letter by 9999 then convert to number.sps
- Transform string coding into numbers.sps (5A becomes 5.1; 7B becomes 7.2; 9D becomes 9.4 etc)
- zip code syntax.zip With thanks to Christopher Boyd. This zip file contains 2 syntax files: one recodes zip into town; the other does the reverse. Zip codes are those used by the US Census to generate zip code level employment data.

### T-Test or Means or ANOVA

- ANOVA A*B.sps (thanks to Valentim Alferes) This does an A*B Factorial ANOVA and calculates variance components, measures of association, measures of effect size and observed power. Works with raw data *or* published summary statistics.
- ANOVA_Tables using 4 methods.sps (thanks to Valentim Alferes) method 1 for Ns, Means and SDs; method 2 for Ns, Means and Variances; method 3 for Ns, Means and MS Error; method 4 for Means, Df num, Df den and MS Error.
- Cochran Hartley Critical Values.sps This gives the tabulated critical values at 5% and 1% for both HOV tests. Thanks to Marta Garcia-Granero
- Compare mean of each hospital with mean of all other hospitals.sps (nice little macro)
- Do a T-Test with only the Means, SD and Ns.sps (uses ANOVA)
- Do T-Test with only means,SD and Ns.sps (thanks to Marta Garcia-Granero) this includes Hartley's F test, the standard T-test and Welch test, asymptotic and non asymptotic 95% CI are calculated.
- Hotelling's T**2 & Profile Analysis.sps (thanks to Richard MacLennan)
- Multiple Mann-Whitney tests.sps (using a macro to have a procedure inside a LOOP)
- ONEWAY with summary data1.sps Performs a ONEWAY ANOVA plus several Homogeneity of Variances tests on summary data. Thanks to Marta Garcia-Granero
- ONEWAY with summary dataI2.sps Performs several ONEWAY ANOVAS plus several Homogeneity of variances tests on summary data. Any number of variables can be analysed. Thanks to Marta Garcia-Granero.
- Standardized effects size (Cohen Glass and Hedges's d).sps (with thanks to Marta) The effect sizes and their standard errors are added to the data file.
- T-Tests and Likert scales.sps
- T-Test effect size non overlap and power.sps (thanks to Valentim Alferes) User can either analyse raw data or reproduce the SPSS T-Test standard output using summary statistics in published articles.

### Unclassified

- Adjusted p-values algorithms.sps Thanks to Marta Garcia-Granero for this improved version of her code. **References are included**. The code calculates **adjusted p-values** using the **following 8 methods**: *One-step:* Bonferroni and Sidak; *Step-down:* Bonferroni (Holm's), Sidak (also called Holm's-Sidak) and Finner; *Step-up:* Hommel, Hochberg and Simes.
- Calculate average percent score.sps
- Calculations on dynamic columns.sps
- Canonical correlation.sps (this comes with SPSS)
- Fill in the gaps.sps (information in file has been left blank when it equals the information in the preceding case, this syntax fills the gap)
- Fill in the gaps (within ID).sps
- Interaction in factorial designs when dependent variable is not normal.sps Thanks to Marta Garcia-Granero for this code.
- Stop or resume generating outputs in the output window.sps
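To give a feel for what the one-step and step-down adjustments listed under *Adjusted p-values algorithms.sps* do, here is a minimal Python sketch of three of them (Bonferroni, Sidak and Holm) written from their standard definitions. It is not a translation of the syntax file; the function names and the sample p-values are hypothetical.

```python
def bonferroni(pvalues):
    """One-step Bonferroni: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, m * p) for p in pvalues]


def sidak(pvalues):
    """One-step Sidak: 1 - (1 - p) ** m for each p-value."""
    m = len(pvalues)
    return [1.0 - (1.0 - p) ** m for p in pvalues]


def holm(pvalues):
    """Step-down Bonferroni (Holm).

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone.
    Results are returned in the original order of the input.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values from three tests.
raw = [0.01, 0.04, 0.03]
holm_adjusted = holm(raw)
```

Holm's adjusted values are never larger than the one-step Bonferroni values, which is why the step-down version is usually preferred when both apply.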

### Working with Many Files (see also the corresponding scripts section)

- Combine many data files with same variables.sps. Alternative 1: the following script works even when file names are unknown. Alternative 2: use the DOS command **copy *.sps newfile.sps** to combine all txt files in the folder (thanks to Scott Clark for this suggestion).
- Combine any number of consecutively named sav files 50 at a time.sps
- Combine many xls files into a single sav file.sps
- Combine 2 data files many to many.sps
- Data list is outside the main syntax.sps (illustrates how a syntax file can be modified by syntax)
- Delete cases contained in file2 from the main data file.sps
- Erase files.sps
- Example 1 using UPDATE command.sps
- Example 2 using UPDATE command.sps
- Get mean from 3 different files.sps
- Include 200 syntax files by macro.sps
- Keep only cases from Master file whose id are in second file.sps
- Macro to delete a list of files.sps
- Many folders and many files.sps
- Process all xls files in folder.sbs (this **script** works jointly with this syntax file)
- Run a macro on several files.sps
- Run syntax on files whose names are derived from a data file.sps
- Run a macro on every file whose name is in a sav file.sps
- Save file1 file2 file3 etc by macro.sps
- Split big files into separate categories.sps (create a different sav file for each value of a numeric categorical variable)
- Split big files into separate categories string var.sps (create a different sav file for each value of a string categorical variable)
- Split file with kn cases into k files of n cases each.sps
- Unusual file merge.sps
- Show number of differences, if any, between 2 files.sps (to check double entry of data)

For additional examples, see Matching data files.

### Working With Missing Values

Caveat: Replacing missing values is not something to be done lightly. David C Howell has a good page on the Treatment of Missing Data.

- Conditionally replacing missing by mean.sps
- Conditionally replacing missing by mean example 2.sps
- Delete variables that have only missing values.sps
- Identifying the 3 types of missing values.sps
- Hot Deck.sps substitution of missing values of X within STRATUM (thanks to Theo van der Weegen)
- List variable names with missing values and identify main elements of cases.sps
- Listing Variables with missing values (Per case).sps with thanks to David Marso
- Mean substitution in additive scale.sps
- Missing values and DO IF.sps
- Recode certain dates as missing.sps
- Replace "Blanks" by value from preceding case.sps
- Replace missing by mean of category.sps
- Replace missing by median values within each case.sps
- Replace missing by random value taken from cases with valid value.sps (see Hot Deck.sps above for a more general solution)
- Replace missing with mean.sps