Suppose you want to generate all two-way crosstabs for a set of variables in your dataset without creating any of the redundant transposed tables (that is, producing V1 BY V2 but not also V2 BY V1).  You could, of course, write a lot of CROSSTABS commands, but that would be tedious and error-prone.  You could also write an SPSS Statistics macro that loops over all the variable pairs.  Instead, we will explore using the SPSSINC PROGRAM extension command and generalizing the solution.

It's easy to write a program with a list of variables that generates all the necessary CROSSTABS syntax, but editing an embedded variable list is problematic: it requires a bit of Python knowledge on the part of the user, and it ties the program to one particular usage.  The SPSSINC PROGRAM command is designed to allow arguments, such as a list of variable names, to be passed into a function, which retrieves them using the standard Python command line mechanism.  Here is the first version of the program followed by a usage example.  These examples ignore the 20-table limit in CROSSTABS and do not check for bad usage, such as specifying only one input variable, because that is beside the point here; such checks could easily be added.
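A minimal sketch of what such a first version might look like follows; it is an illustration rather than the original listing.  The helper name build_crosstabs is an assumption introduced here, and the spss module is imported inside manytabs because it is available only when the code runs inside SPSS Statistics.

```python
import sys
from itertools import combinations


def build_crosstabs(varnames):
    """Return one CROSSTABS command per unordered pair of variables.

    build_crosstabs is a hypothetical helper, not part of any SPSS API.
    """
    # combinations() yields each pair once, so a BY b is generated
    # but the transposed b BY a is not.
    return ["CROSSTABS /TABLES=%s BY %s." % (a, b)
            for a, b in combinations(varnames, 2)]


def manytabs():
    """Read the variable names from sys.argv and run the tables."""
    import spss  # assumption: running inside SPSS Statistics

    # sys.argv[0] is the program name; the variable names follow it.
    for command in build_crosstabs(sys.argv[1:]):
        spss.Submit(command)
```

With the function defined in a BEGIN PROGRAM block, a call might look like SPSSINC PROGRAM manytabs jobcat gender minority. (the three variable names are hypothetical), which would produce three tables: jobcat BY gender, jobcat BY minority, and gender BY minority.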


The program defines a function named manytabs.  It retrieves its arguments from the list sys.argv.  Because the first item in the argument list is the name of the program, the variable names start with the second item.  To run the program, the user runs SPSSINC PROGRAM with the first parameter being the name of the program to run, followed by two or more variable names.  The author could have created an extension command for this program, but even without that extra, although modest, effort, the program operates essentially like a built-in command.

In this example, the program is in-line in the syntax window.  However, the program could be a function in a Python module, in which case the user would include the module name in the first parameter, e.g., mylibrary.manytabs.  Although this program does nothing that couldn't be done with an SPSS macro, it has the advantage that if it is in an external module, there is no need to explicitly load the module in order to run it.  As long as the module is on the Python search path, it will be loaded automatically when needed.

To run this program, the user has to list all the variables to be crosstabbed.  We can make it smarter by having it use variable metadata, such as patterns in the variable names or the variable measurement level, to select the variables.  In the second version, the program automatically crosstabs all of the variables of the specified measurement levels.  The command parameters are now measurement levels instead of variable names.
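A sketch of how this second version might be written follows; again, it is an illustration rather than the original listing.  The helpers select_by_level and dictionary_levels are hypothetical names, and the filtering is kept separate from the spss calls so that it can be tried outside SPSS Statistics.

```python
import sys
from itertools import combinations


def select_by_level(variables, levels):
    """Keep the names whose measurement level is one of `levels`.

    `variables` is an iterable of (name, level) pairs; the comparison
    ignores case.  This helper is hypothetical, not part of any SPSS API.
    """
    wanted = {level.lower() for level in levels}
    return [name for name, level in variables if level.lower() in wanted]


def dictionary_levels():
    """Read (name, level) pairs from the active dataset."""
    import spss  # assumption: running inside SPSS Statistics

    return [(spss.GetVariableName(i), spss.GetVariableMeasurementLevel(i))
            for i in range(spss.GetVariableCount())]


def manytabs():
    """Crosstab every pair of variables with a requested measurement level."""
    import spss

    # The arguments after the program name are now measurement levels.
    names = select_by_level(dictionary_levels(), sys.argv[1:])
    for a, b in combinations(names, 2):
        spss.Submit("CROSSTABS /TABLES=%s BY %s." % (a, b))
```

A call such as SPSSINC PROGRAM manytabs nominal ordinal. would then crosstab every pair of nominal or ordinal variables in the active dataset.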




This time the variable list is built by filtering the set of variables by their measurement levels.  (In a real version, we would issue a message and stop if no variables matched the specified types rather than mysteriously producing no output or a puzzling message.)  The program is now more general in a certain respect, but it is no more complicated than it was before.  This version could not be duplicated by a reasonable SPSS macro.

If the program arguments are somewhat more complex, say a combination of variable names or types and other crosstab specifications, the author could use keywords and standard library Python tools such as the argparse module to interpret the request.  Once the parameters become genuinely complex, however, the extension command mechanism would be a better solution.
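As a rough sketch of the keyword approach, argparse can split the argument list into named groups.  The flag names --variables, --levels, and --cells are hypothetical, and the sketch assumes that tokens beginning with dashes reach sys.argv unchanged.

```python
import argparse


def parse_request(argv):
    """Split the arguments that follow the program name into keyword groups.

    The flag names are illustrative only; pass sys.argv[1:] as argv.
    """
    parser = argparse.ArgumentParser(prog="manytabs")
    parser.add_argument("--variables", nargs="*", default=[],
                        help="variable names to crosstab")
    parser.add_argument("--levels", nargs="*", default=[],
                        help="measurement levels used to select variables")
    parser.add_argument("--cells", nargs="*", default=["COUNT"],
                        help="cell statistics to request")
    return parser.parse_args(argv)
```

For example, parse_request(["--variables", "jobcat", "gender", "--cells", "COUNT", "ROW"]) returns a namespace whose variables attribute is ["jobcat", "gender"] and whose cells attribute is ["COUNT", "ROW"].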

The SPSSINC PROGRAM extension command is available in the Extension Commands collection of the SPSS Community website.

In summary, this simple mechanism allows you to use Python programs as SPSS Statistics commands that take parameters using standard Python command line conventions with no extra work.
