Please use the index below. Clicking one of the questions sometimes takes you to another page of this site.
Index of SPSS FAQ
- What is SPSS?
- Where can I find help on my SPSS problem?
- Other General questions
- How can I use an alternative syntax editor? (Thanks to Tom Dierickx)
- How do I subscribe/unsubscribe to the SPSSX-L mailing list?
- How many variables and cases can SPSS for Windows handle?
- I need to reference SPSS in an article I am writing, how should I do that?
- What does SPSS stand for?
- Where are the archives of the SPSSX-L list and of the SPSS newsgroup?
- Where can I find the formula used by SPSS to calculate statistics XYZ?
- Why is my *.spo file so huge?
- Calculations related questions
- Tables related questions
See also the SPSS FAQ at the University of Texas.
Don't ask if SPSS can do this or that! Ask how it can be done!
What is SPSS?
I never expected visitors of this site would ask this question but many do!
The following brief definition is taken from the SPSS Base User's Guide.
"SPSS is a comprehensive system for analyzing data. SPSS can take data from almost any type of file and use them to generate tabulated reports, charts, and plots of distributions and trends, descriptive statistics, and complex statistical analysis."
See IBM's SPSS webpage for more information.
Where can I find help on my SPSS problem?
Option 1. Of course there are IBM/SPSS's Statistical Support pages (you may login using "guest" as User name and Password) which include
- a searchable database with thousands of Q and A's.
- Statistical Articles addressing various topics of interest to SPSS users (e.g. why is my Coefficient Alpha negative?),
- Algorithms used by SPSS to calculate statistics. (This is also on the CDROM)
- Macros written mostly by SPSS
- Keywords archives
- Script exchange a collection of scripts written by SPSS or by users
- SPSS Developer's Guide (for people writing scripts; contains code examples. This is also on the CDROM)
- SPSS Patches
- Books which may be purchased from SPSS
- Download a free pdf copy of SPSS Programming and Data Management
Option 2. The Help menu ;-) This is actually more useful than many people think. You should, at least once, go through all the "books" in the Contents tab of the HELP>TOPICS window. The last "book" covers Frequently Asked Questions (FAQ) on 10 different topics ranging from "Opening Data Files" and "Saving Data and Results" to "Memory and Performance".
Option 3. The spssbase.pdf file (which you will find using Help>Syntax Guide>Base or on your SPSS CD).
This file is very useful; it is always open when I work with SPSS. It is the electronic version of the "Syntax Reference Guide" book which is available from SPSS. However, I know that users need some time to get used to either version.
The second CD of version 12 includes pdf versions of the following manuals
- SPSS Advanced Models 12.0
- SPSS Base User's Guide 12.0
- SPSS Brief Guide 12.0
- SPSS Categories 11.0
- SPSS Complex Samples 12.0
- SPSS Conjoint 8.0
- SPSS Exact Tests 7.0
- SPSS Interactive Graphics 10.0
- SPSS Maps 10.0
- SPSS Missing Value Analysis 12.0
- SPSS Regression Models 12.0
- SPSS Tables 11.5
- SPSS Trends 10.0
Option 4. The SPSSX-L mailing list (see also How do I subscribe?)
- A mailing list is managed by a list server (listserv for short). People must register with the listserv in order to receive a copy of the postings. Each time somebody sends an email to the list, everybody who has registered receives a copy of the message. When a person on the list sends a reply to the list, everybody receives a copy. This is a very effective method of communication. There are usually between 1,000 and 1,200 persons registered with the SPSSX-L listserv including some SPSS employees (even though the list is not managed by and does not "belong to" SPSS). The number of postings averages about 350 per month.
- Messages (postings) to the list are archived and may be searched.
- I highly recommend participation in this list to anybody who wants to improve his/her knowledge of SPSS.
Option 5. The newsgroup comp.soft-sys.stat.spss
- This newsgroup includes more than 23,000 postings. The answer to your question may already be there.
- If you are not familiar with newsreaders such as FreeAgent, I would recommend that you use Google (I do not recommend Outlook). It is very easy to search archives or post messages using Google.
- Personally I use a newsreader to send and read postings, but I use Google to search for past postings.
- The main advantage of using a newsreader is that you can create folders for different categories of topics and keep interesting articles for future reference.
My general experience is that no company can beat the "service" that its users are giving to each other via mailing lists or newsgroups. In other words, if you send a question to both support and to the above newsgroup/mailing list at the same time, the odds are that you will already have received the answer when SPSS gives it to you. This is not a criticism of the quality of SPSS support.
Option 6. This web site
- Consult the Sample Syntax, Macro or Scripts pages or Search for key words.
- As a last resort, send me an email. I read all emails but I do not necessarily answer all of them (there are only 24 hours in a day!). I give preference to problems which I find interesting or new. If you do not get a reply, it is not because I do not "like" you... However, considering that I monitor the SPSSX-L list and the newsgroup, your best bet is to ask your question through those media. You then have many more persons who are likely to answer your question (including some SPSS employees). Such solutions also become "naturally" available to the whole community of SPSS'ers. This is consistent with the motto of the defunct dejanews site: Share what you know, learn what you don't.
How do I subscribe/unsubscribe to the SPSSX-L mailing list?
Note that the address used to subscribe and unsubscribe is different from the address used to post messages to the list!
To subscribe: send an email to LISTSERV@LISTSERV.UGA.EDU with no subject, no signature, but only the words:
SUB SPSSX-L <your name>
in the body of the message. For instance if your name is John Doe the body of the message would contain
SUB SPSSX-L John Doe
After having registered, you will receive an email from the listserv giving the address to be used to post emails to the list, get copies of earlier posts, unsubscribe, etc. Read and keep that email for future reference.
To unsubscribe: send an email to LISTSERV@LISTSERV.UGA.EDU with no subject, no signature, but only the words:
SIGNOFF SPSSX-L
Note that you need to send that email using the same email address to which you are subscribed. For more information, see the Listserv Reference Card.
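The subscription message described above can also be composed programmatically. The following is a minimal Python sketch for illustration only: the sender address john.doe@example.com and the helper name build_subscribe_message are hypothetical, and the message is only built, not sent.

```python
from email.message import EmailMessage

# Command address given in the instructions above (not the posting address).
LISTSERV_ADDRESS = "LISTSERV@LISTSERV.UGA.EDU"


def build_subscribe_message(sender: str, full_name: str) -> EmailMessage:
    """Build (but do not send) a subscription request for SPSSX-L."""
    msg = EmailMessage()
    msg["From"] = sender  # hypothetical sender address
    msg["To"] = LISTSERV_ADDRESS
    # No Subject header and no signature: the body holds only the command.
    msg.set_content(f"SUB SPSSX-L {full_name}")
    return msg


message = build_subscribe_message("john.doe@example.com", "John Doe")
print(message.get_content().strip())
```

Sending the message is left to whatever mail client or SMTP setup you normally use.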
Where are the archives of the SPSSX-L list and of the SPSS newsgroup?
a) Most people will want to use the web based archives described in b) below.
"Old timers" may prefer to retrieve messages from LISTSERV@LISTSERV.UGA.EDU using the method described in the Listserv Reference Card.
For instance, sending the command GET SPSSX-L LOG9912 to the LISTSERV would cause the listserv to email you the full text of postings in December 1999. More commands are described in the Listserv Reference Card. Note that it is also possible to do searches; for instance, one can ask the listserv to forward a copy of all messages which include the word !ENDDEFINE or the word covariance and were posted between January 1, 2000 and December 1, 2000 (or any other date range).
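As a small illustration of the GET command, the Python sketch below builds the command text for a given month. It assumes the log name follows the pattern LOG + two-digit year + two-digit month, which is inferred only from the single example LOG9912 above; the function name log_request is hypothetical.

```python
def log_request(year: int, month: int, list_name: str = "SPSSX-L") -> str:
    """Return a LISTSERV GET command for one month of archived postings.

    Assumption: monthly logs are named LOGyymm, as in LOG9912 for
    December 1999.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # Two-digit year followed by two-digit month.
    return f"GET {list_name} LOG{year % 100:02d}{month:02d}"


print(log_request(1999, 12))  # GET SPSSX-L LOG9912
```

The resulting line goes in the body of an email to the LISTSERV address, exactly like the SUB command.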
b) There are two web based archives of SPSSX-L messages:
The best one is www.listserv.uga.edu; it is quick and easy to use. Postings can be sorted by subject, by date or by author by month. There is also a search facility.
The second choice (if the first one is not available) is www.marist.edu (be patient, this server is slow; look at the bottom of the page for the search facility).
c) Unlike a mailing list, one does not "register" with a newsgroup. One simply goes and either posts to, reads or searches the newsgroup. The Web site groups.google.com contains searchable archives of most usenet newsgroups including comp.soft-sys.stat.spss.
Where can I find the formula used by SPSS to calculate statistics XYZ?
Many of the formulas used by SPSS are in the Algorithms section of the Support web site. If you have a recent version of SPSS, this information is also available from the CDROM.
What does SPSS stand for?
The following explanation was posted to the SPSS-X list a few years ago:
"For some historical clarification on the term SPSS as used for the software package (as opposed to the company), the original name was indeed Statistical Package for the Social Sciences. When what was to be Release 10 of that package was released in about 1983, the name was changed to simply SPSS-X (actually, the X was a superscript). I have seen nothing in any SPSS materials dated later than the early 1980s that have referred to it as Statistical Package for the Social Sciences or anything other than just SPSS-X. With Release 4.0, it became just SPSS. There was also SPSS/PC+ for DOS, and now of course SPSS for Windows, the first release of which was 5.0.
So my understanding is that the term SPSS as applied to the company means Statistical Product and Service Solutions. As applied to the product or package SPSS, it now just means SPSS.
Principal Support Statistician and
Manager of Statistical Support
Why is my *.spo file so huge?
See AnswerNet ID 18462 (you may use guest as User name and Password).
Note that beginning with version 10, SPSS comes with a defrag utility (this is not the disk utility which comes with Windows). From Windows Explorer, right click on an *.spo file and select defrag. This sometimes significantly reduces the size of *.spo files.
How many variables and cases can SPSS for Windows handle?
Up to version 10, the regular Windows version has a maximum of 2^15 - 1 = 32,767 variables and a maximum of 2^31 - 1 = 2,147,483,647 (about 2.15 billion) cases. The student version is limited to 50 variables and 1,500 cases.
Starting with version 10, the limit on the number of variables has been removed; the only "hard coded" limit is 2.15 billion variables. This does not mean that it would make sense to work with millions of variables. It is always more efficient to load only the variables you currently need.
Here are additional points made by Jon Peck in his 06/05/03 posting to the SPSSX-L list:
In calculating these limits, count one for each 8 bytes or part thereof of a string variable. An a10 variable counts as two variables, for example.

Approaching the theoretical limit on the number of variables, however, is a very bad idea in practice for several reasons.

1. These are the theoretical limits in that you absolutely cannot go beyond them. But there are other environmentally imposed limits that you will surely hit first. For example, Windows applications are absolutely limited to 2GB of addressable memory, and 1GB is a more practical limit. Each dictionary entry requires about 100 bytes of memory, because in addition to the variable name, other variable properties also have to be stored. (On non-Windows platforms, SPSS Server could, of course, face different environmental limits.) Numerical variable values take 8 bytes as they are held as double precision floating point values.

2. The overhead of reading and writing extremely wide cases when you are doubtless not using more than a small fraction of them will limit performance. And you don't want to be paging the variable dictionary. If you have lots of RAM, you can probably reach between 32,000 and 100,000 variables before memory paging degrades performance seriously.

3. Dialog boxes cannot display very large variable lists. You can use variable sets to restrict the lists to the variables you are really using, but lists with thousands of variables will always be awkward.

4. Memory usage is not just about the dictionary. The operating system will almost always be paging code and data between memory and disk. (You can look at paging rates via the Windows Task Manager). The more you page, the slower things get, but the variable dictionary is only one among many objects that the operating system is juggling. However, there is another effect. On NT and later, Windows automatically caches files (code or data) in memory so that it can retrieve it quickly. This cache occupies memory that is otherwise surplus, so if any application needs it, portions of the cache are discarded to make room. You can see this effect quite clearly if you start SPSS or any other large application; then shut it down and start it again. It will load much more quickly the second time, because it is retrieving the code modules needed at startup from memory rather than disk. The Windows cache, unfortunately, will not help data access very much unless most of the dataset stays in memory, because the cache will generally hold the most recently accessed data. If you are reading cases sequentially, the one you just finished with is the LAST one you will want again.
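The arithmetic behind these limits can be checked with a short Python sketch. It is only an illustration of the figures quoted above: the function names are hypothetical, and treating every counted variable as one dictionary entry of exactly 100 bytes is a simplifying assumption (the posting says "about 100 bytes").

```python
import math

MAX_VARIABLES_BEFORE_V10 = 2**15 - 1  # 32,767
MAX_CASES = 2**31 - 1                 # 2,147,483,647, about 2.15 billion
BYTES_PER_DICTIONARY_ENTRY = 100      # assumed exact; "about 100 bytes" in the posting


def variable_slots(string_width: int = 0) -> int:
    """Count a variable against the limit.

    A numeric variable (string_width == 0) counts as one; a string
    variable counts one for each 8 bytes or part thereof.
    """
    if string_width <= 0:
        return 1
    return math.ceil(string_width / 8)


def dictionary_bytes(n_variables: int) -> int:
    """Rough memory needed for the variable dictionary alone."""
    return n_variables * BYTES_PER_DICTIONARY_ENTRY


print(MAX_VARIABLES_BEFORE_V10)   # 32767
print(MAX_CASES)                  # 2147483647
print(variable_slots(10))         # 2 (an a10 variable)
print(dictionary_bytes(100_000))  # 10000000 bytes, roughly 10 MB
```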
How do I transfer data to/ from SPSS and …
SPSS can automatically (using the menu or syntax) read data from and write data to many different formats, including SAS, STATA, Excel, Access and ASCII files.
How do I create an index, ID or key variable?
Three situations are discussed:
(Q first situation) I would like to create a new variable (an ID variable) which would number each case in my data, starting with 1 and ending with n, the number of cases in the file. How can I do that?
(A first situation)
- Using syntax:
COMPUTE id=$CASENUM.
(If nothing happens, select TRANSFORM>RUN PENDING TRANSFORMS.)
- Using the menu: select TRANSFORM>COMPUTE then enter
id in the Target Variable text box and
$casenum in the Numeric Expression text box. Click OK. (If nothing happens, select TRANSFORM>RUN PENDING TRANSFORMS.)
$CASENUM is a system variable. The easiest method to find the list of all system variables is to open spssbase and search for "$casenum". (Note that pi is not one of the system variables. Use
COMPUTE pi=4*ARTAN(1).
to get pi.)
(Q second situation) What if I do not have any data in my data file but I would still like to create an ID variable with values between 1 and 10?
(A second situation): This cannot be done using the menu; you need syntax:
- Using syntax:
INPUT PROGRAM.
LOOP id=1 TO 10.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
LIST.
(Q third situation) I have many cases with the same ID and would like to create a new variable where I would number consecutively those cases (within each ID).
(A third situation): See the syntax "Number consecutive cases within a given ID.sps" in this section.
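To make the goal of the third situation concrete, here is a minimal Python sketch of the numbering logic. It is not the referenced .sps file and not SPSS syntax; it only illustrates the expected result, and it assumes that cases with the same ID are adjacent (that is, the file is sorted by ID). The function name number_within_id is hypothetical.

```python
from itertools import groupby


def number_within_id(ids):
    """Return 1-based counters that restart whenever the ID changes.

    Assumes cases sharing an ID are adjacent (file sorted by ID).
    """
    counters = []
    for _, group in groupby(ids):
        # Each run of identical IDs is numbered 1, 2, 3, ...
        counters.extend(range(1, len(list(group)) + 1))
    return counters


print(number_within_id([101, 101, 101, 102, 103, 103]))  # [1, 2, 3, 1, 1, 2]
```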
How do I go about using an alternative syntax editor?
The following zip file and notes were provided by Tom Dierickx (Thanks Tom!). The zip file contains
- Instructions (readme.doc)
- SPSS syntax submitter (tdRunSyntax.exe)
- Template clip library (tdSPSS.tcl)
- Sample keyword list (tdSPSS.syn)
Tom lists the following advantages of using an alternative syntax editor:
1) You can create a "code library" that allows you to maintain/re-use code snippets
2) It has color-coding syntax features
3) It has powerful search-and-replace features (including regular expressions such as finding tabs, etc.)
4) You can have line numbers display in the left margin, making it easier to keep track of where you're at!
5) You can have multiple files open at once with tab selectors at the bottom that allow you to toggle between open documents
6) It has macro capabilities that can record and replay your keystrokes
7) You can set up menu items that run configured tools. For example, I built a small .exe file in VB that takes any selected text and submits it to SPSS. (In essence, just like SPSS's syntax editor).
I could probably think of 101 more reasons, but the moral of the story is that I haven't used the SPSS syntax editor for several months. TextPad is far superior for all of the above reasons and more. In fact, you can use it to code for ANY language - Java, HTML/ASP, C++, etc.
Note that it's also useful to set up a new menu item in SPSS that automatically opens TextPad (with a new, blank .sps document).
I need to reference SPSS in an article I am writing, how should I do that?
The following information is taken from the official SPSS FAQ:
For most of our products, you can get the complete product name, version or release number and date of creation from the "About . . ." dialog box that appears under the "Help" menu. For example, if you have SPSS for Windows, you might see this:
SPSS for Windows
Release 10.0.0 (September 1999)
You can then write a bibliographic citation as follows:
SPSS for Windows, Rel. 10.0.0. 1999. Chicago: SPSS Inc.
How can I show, in the table, categories which have *not* been selected by respondents?
For versions earlier than 11.5, see Show empty categories in tables.SPS (2 methods are shown) in this section.
For versions 11.5 or later, this option is directly available in the syntax of the CTABLES command (you need a license for the Custom Tables module).