Peter Brownell's page of advice for reading ASCII/RAW/Text data into SPSS 6.1 for UNIX.

This page exists because a great many data archives and providers offer their data in ASCII format with only SPSS (.sps) syntax to attach labels. Once the data have been read into SPSS, they can be saved in a format that can be converted to other formats (e.g., Stata's .dta format) using a program such as StatTransfer.

SPSS itself is no longer widely used in Academia, largely because its licensing scheme requires a separate license for each machine running the software, even in student labs.

For this same reason, rather than offering the most recent Windows version, most labs that do offer SPSS offer 6.1 for UNIX, as the UNIX version is licensed to run on one machine, but allows multiple users. UNIX users can run a graphical interface version of SPSS 6.1 using an X Windows client such as eXceed (Windows) or X11 (Mac OS X), or they can run it in batch mode, that is, by submitting .sps syntax files and viewing the results in the resulting log (.1st) file.

One source of frustration is that data may be provided with .sps syntax files written for more recent (Windows or Mac) versions or arcane older versions, and thus may include commands and/or syntax not supported by SPSS 6.1 for UNIX.

It would be a great boon to quantitative researchers and educators everywhere if we could agree upon and develop a standard syntax for reading and labeling ASCII data and/or a standard (non-ASCII) format for saving data that could be read and saved with any statistical package.

The United States Food and Drug Administration (FDA) has perhaps helped spur this along by requiring that data for New Drug or Device Applications (NDAs) be submitted in SAS Transport format. Thus Stata (and presumably other packages) now includes commands (fdause and fdasave) for reading and writing FDA (SAS Transport) files. Unfortunately, data archives and other data providers have not adopted this format very widely at this time.

READING DATA TO SPSS AND SAVING AS .SAV or .POR

First you will need to download (and most likely unzip) an ASCII dataset and an accompanying .sps syntax file from a source such as ICPSR. I am going to assume that you either downloaded the data to your UNIX home directory or, if you didn't, that you know how to navigate between directories (the command is "cd" - type "man cd" for UNIX's cryptic help file).

Next you will need to open the .sps file with a text editor such as Wordpad (Windows) or Emacs or Pico (UNIX).

Typically the .sps file will contain instructions about editing the .sps file. You will need to edit it in at least two places. One is the path and name of the ASCII (.dat/.asc/.txt/.raw) file. The other is the path and name of the (.sav or .por) file to which the data will be saved (for later conversion to formats for use with other software).

There is a UNIX command, "pwd", that will return the full path of the current directory. You want to combine this path with the filename to identify the files you want to read and/or create. For example, "/home/foo/hdir/username/Data1234.asc".
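As a rough illustration of combining the directory that "pwd" reports with a filename, here is a minimal Python sketch. The function name full_path is made up for this page, and the directory in the usage note is just the example path from above.

```python
import os
import posixpath


def full_path(filename, directory=None):
    """Join a directory and a filename into one UNIX-style path.

    If no directory is given, use the current working directory,
    which is the same thing the "pwd" command prints.
    """
    if directory is None:
        directory = os.getcwd()
    return posixpath.join(directory, filename)


if __name__ == "__main__":
    # Prints the path you would paste into the .sps file.
    print(full_path("Data1234.asc"))
```

Calling full_path("Data1234.asc", "/home/foo/hdir/username") gives "/home/foo/hdir/username/Data1234.asc", which is the form the .sps file expects.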

Once you've changed the filenames in the .sps file (and anything else it instructs you to do) you should save it (as text).

Then you can run it. I suggest batch mode, as my experience with SPSS 6.1 is that it is slow and clunky in X Windows.

The basic command is

"spss -m yourfilename.sps > yourlogname.1st".

But I suggest that you type

"nohup spss -m yourfilename.sps > yourlogname.1st &".

That way, if you have an exceedingly large file that will take a really long time to process, you can log out, go have a cup of tea, and log back in to get the results.

The results should be two files. Once the job is done you will find a log (.1st) file that reports the results of each command you have given SPSS. Here is where you will find any error messages, which may provide you with clues about how you might edit the commands in your .sps file.

Hopefully you will also find the .sav or .por file you specified when editing the .sps file. If so, you should still read the log file to make sure that all your cases transferred without errors.

If you don't find a .sav or .por file, you will definitely need to read the log file to try to figure out what went wrong.
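For a long log, a first pass can be automated. The Python sketch below lists lines that look like problems. It assumes (and this is only an assumption about what your log looks like) that problem lines contain the word "error" or "warning"; adjust the pattern to whatever your own log actually shows. The names find_problems and scan_log are made up for this example.

```python
import re

# Assumption: lines reporting problems mention "error" or "warning".
PROBLEM = re.compile(r"\b(error|warning)\b", re.IGNORECASE)


def find_problems(log_lines):
    """Return (line_number, text) pairs for lines that match PROBLEM."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if PROBLEM.search(line)
    ]


def scan_log(path):
    """Open a log file and return its suspicious lines."""
    with open(path, errors="replace") as handle:
        return find_problems(handle)
```

Something like scan_log("yourlogname.1st") then gives you line numbers to jump to in your text editor. It is no substitute for reading the log itself.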

The following webpages offer useful support in trying to make SPSS for UNIX read and save your data:

  • Yale page dealing with SPSS "DATA LIST" syntax
  • USC page dealing specifically with SPSS (UNIX) "DATA LIST" syntax
  • Columbia page on SPSS (UNIX) programming, including sections on "Defining and Reading Data," and "Saving a save file"
  • University of Chicago page on "Data Handling" with info on SPSS (and Stata and SAS)
  • Creating Stata .do files based on SPSS .sps files to read data directly into Stata

    If, after reading these web pages, you absolutely cannot make SPSS read your data, you may wish to do the following, which will read your data into Stata, but without variable or value labels. That is, you'll have to rely on a codebook (or your ability to read the .sps syntax) to tell you which variable is which and what their values mean.

    This assumes the data are in fixed format. If your .sps "DATA LIST" command contains a "FREE" (freeform) argument, don't do this!

    You'll be making a Stata ".do" syntax file similar to the .sps file to tell Stata directly how to read the ASCII data in and save it.

    You don't need UNIX to do any of this. (If you're running Stata under UNIX, you can edit the text files in Windows if you like and then run them in UNIX.)

    Open your .sps file in a text editor. Look for the "DATA LIST" command which should be followed by a list of variable names and then numbers and then a period, such as:

    DATA LIST FILE="Data1234.asc"

    /
    ID 1-3
    AGE 4-5
    SEX 6
    Q1 7-8
    Q2 9-10
    Q3 11-15
    Q4 16-18
    .

    It could also have all the variable names and the numbers indicating their locations on one line such as:

    DATA LIST FILE="Data1234.asc"
    /
    ID 1-3 AGE 4-5 SEX 6 Q1 7-8 Q2 9-10 Q3 11-15 Q4 16-18
    .
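To make the meaning of those numbers concrete: "ID 1-3" says that ID occupies columns 1 through 3 of every line in the data file, counting from 1 and including both ends, and "SEX 6" says SEX is the single column 6. The Python sketch below slices one line of data that way. The function names and the sample line are invented for illustration, and it handles only plain specs like the ones in the example above (no "(A)" or other formats).

```python
def parse_specs(text):
    """Turn 'ID 1-3 AGE 4-5 SEX 6' into [('ID', 1, 3), ('AGE', 4, 5), ('SEX', 6, 6)]."""
    tokens = text.split()
    specs = []
    # Tokens alternate: variable name, then its column or column range.
    for name, cols in zip(tokens[0::2], tokens[1::2]):
        first, _, last = cols.partition("-")
        specs.append((name, int(first), int(last or first)))
    return specs


def read_record(line, specs):
    """Slice one line of the data file; columns are 1-based and inclusive."""
    return {name: line[first - 1:last].strip() for name, first, last in specs}


if __name__ == "__main__":
    specs = parse_specs("ID 1-3 AGE 4-5 SEX 6 Q1 7-8 Q2 9-10 Q3 11-15 Q4 16-18")
    # A made-up 18-column line, not taken from any real dataset.
    print(read_record("001342 1 212345 99", specs))
```

Checking a few lines of your data file against the codebook in this way is a quick test of whether the column positions in the .sps file are what you think they are.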

    COPY the text which contains the variable names and the numbers that follow immediately after them and PASTE them into a new text document. Save this new document as a Text document with a .do suffix, for example "Data123.do". Sometimes Windows will call it "Data123.do.txt" which is annoying and can be fixed, but isn't fatal.

    Next remove any "/" characters and any numbers immediately following.

    If your DATA LIST command has a "RECORDS" flag and you have multiple "records" indicated by "/1", "/2", "/56", etc., *AND* the numbers following the variable names reset after each change of "record" you should look at either the U. Chicago page listed above or Stata's help for INFIX, in the section on "multiple lines."

    If you encounter string (text) variables with "(A)" after them such as:

    NAME 455-468 (A)

    remove the "(A)" and instead put "str" in FRONT of the variable name:

    str NAME 455-468

    Numbers in parentheses, or other text after the column numbers, indicate unusual variable formats. You can (indeed must) remove them, but be sure to compare the resulting Stata variables to your codebook (use "describe" or "codebook" in Stata) to make sure they make sense and you haven't lost any implied decimal places.
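If the variable list is long, the hand edits described above (dropping "/" record markers and the final period, turning "(A)" into a leading "str", and removing other parenthesized formats) can be roughed out in Python. This is only a sketch under assumptions: the data are fixed format with a single record per case, and variable names consist of letters, digits and underscores. The function name to_infix_specs is made up. Because it silently drops formats such as implied decimals, the codebook check still applies.

```python
import re

# One spec: a variable name, a column or column range, and an optional
# parenthesized format such as (A) or (2).
SPEC = re.compile(r"([A-Za-z_]\w*)\s+(\d+(?:\s*-\s*\d+)?)\s*(\([^)]*\))?")


def to_infix_specs(spss_text):
    """Convert pasted DATA LIST variable specs to Stata infix specs."""
    # Drop record markers such as "/" or "/1", then the terminating period.
    text = re.sub(r"/\d*", " ", spss_text).strip().rstrip(".")
    specs = []
    for name, cols, fmt in SPEC.findall(text):
        cols = re.sub(r"\s+", "", cols)  # "455 - 468" becomes "455-468"
        prefix = "str " if fmt.upper() == "(A)" else ""
        specs.append(f"{prefix}{name} {cols}")
    return specs


if __name__ == "__main__":
    pasted = "/\nID 1-3\nAGE 4-5\nSEX 6\nNAME 455-468 (A)\n."
    print("\n".join(to_infix_specs(pasted)))
```

For the pasted text in the example, the result is the four lines "ID 1-3", "AGE 4-5", "SEX 6" and "str NAME 455-468".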

    Save your work.

    Next take a look at the size of your ASCII data file. How big is it (when unzipped)? You'll need to give Stata at least this much memory to read it in. Try to avoid giving Stata more memory than actually exists in the machine on which it is running, however.
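If you would rather compute the number than eyeball it, here is a small Python sketch that turns a file size into a whole number of megabytes. The 1.5 safety factor is my own guess at reasonable headroom, not a Stata rule, and the function names are invented for this example.

```python
import math
import os

BYTES_PER_MB = 1024 * 1024


def mem_for_bytes(n_bytes, headroom=1.5):
    """Megabytes to request: size times a safety factor, rounded up, at least 1."""
    return max(1, math.ceil(n_bytes / BYTES_PER_MB * headroom))


def suggested_mem_mb(path, headroom=1.5):
    """Same calculation, starting from the unzipped data file on disk."""
    return mem_for_bytes(os.path.getsize(path), headroom)
```

For a 3 megabyte file this suggests 5, which is the XX you would put in the "set mem" line. Compare the result with the memory actually installed in the machine before using it.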

    At the top of your .do file (above the text you pasted from the .sps file) type the text within the quotation marks:

    "set mem XXm" (where XX is the number of megabytes you wish to give Stata.)
    "set more off" (which will stop Stata from pausing every time it fills the screen)
    "#delimit ;" (which tells Stata to wait until it hits a semicolon to end a command)
    "infix"

    Then immediately BELOW the text you pasted from the .sps file type:

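    "using "H:\Data1234.asc" ;" (where "H:\Data1234.asc" is the path to your ASCII data file within quotes; note that Stata's infix syntax puts "using" directly after the variable list, with no comma before it)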
    "compress ;"
    "save "H:\Data1234.dta" ;" (where "H:\Data1234.dta" is the Stata format file you wish to create)

    (Note that at the UC Berkeley Social Science Computing Lab your files should be located in the "H:\" drive, which is mapped to "My Documents" at login).
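The pieces described above can also be assembled mechanically. The Python sketch below builds the text of a .do file from a list of infix specs, the two paths, and a memory figure. It is an illustration, not a tested recipe for every Stata version. The function name build_do_file is made up, and it writes "using" without a leading comma, following Stata's documented "infix ... using filename" form.

```python
def build_do_file(specs, data_path, dta_path, mem_mb):
    """Return the text of a Stata .do file that reads fixed-format ASCII data."""
    lines = [
        f"set mem {mem_mb}m",
        "set more off",
        "#delimit ;",
        "infix",
    ]
    # One variable spec per line, indented for readability.
    lines += [f"    {spec}" for spec in specs]
    lines += [
        f'using "{data_path}" ;',
        "compress ;",
        f'save "{dta_path}" ;',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_do_file(
        ["ID 1-3", "AGE 4-5", "SEX 6"],
        r"H:\Data1234.asc",
        r"H:\Data1234.dta",
        50,  # hypothetical memory figure in megabytes
    )
    print(text)
```

Save the printed text as a plain text file with a .do suffix. The 50 megabytes and the three variables are placeholders for your own values.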

    Save your .do file. It should look like this.

    Open Stata. Go to File -> Do and then browse to your .do file, select it and click "Open."

    Hopefully it will read in your data and save it. If not, you'll have to try to interpret Stata's error messages and make corrections.

    Good Luck.

    P.S. If you have access to SPSS but not StatTransfer, try the SPSS script "spps2stata.sbs," available here.