In my last post I wrote about my new extension command, SPSSINC TRANS.  That command makes it very easy to apply Python functions to the case data: it handles all the data passing, variable creation, etc., so you just have to write one line of Python code to call the function.

I have now posted a substantial rework of the initial beta version.  As the saying goes, plan to throw one away: you will anyway.  The difficult part of designing and implementing this command was getting the Python function expression through the SPSS Universal Parser, which doesn’t speak Python, and then taking it apart and setting up the requisite connections with the data.

My first version used regular expressions to extract the parameters and PASW Statistics variable names.  That worked well enough for what I originally had in mind, although the regular expressions were a bit complicated.  But as I explored the sorts of functions that would be useful with this facility, the problem got more complicated.

  • I wanted to support functions that did not have named parameters for everything.  The original implementation required function parameters to be specified in the style parm=variable.  But many of the built-in Python functions only accept positional arguments.
  • I wanted to support lists as parameters so that a bunch of variables could be passed in as a single parameter.
  • I wanted to support other more complicated expressions as parameters.
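
To make the three wishes above concrete, here is a small sketch in plain Python, outside of SPSSINC TRANS.  The names a, b, and c are hypothetical stand-ins for the values of three dataset variables in one case; nothing here is the command's own syntax.

```python
# Hypothetical case values standing in for three dataset variables.
a, b, c = 3.0, -4.5, 5.0

# 1. Many built-ins accept positional arguments only, so a
#    keyword-style call (the parm=variable form) fails.
try:
    abs(x=b)
    keyword_ok = True
except TypeError:
    keyword_ok = False
positional = abs(b)

# 2. A list lets several variables travel as a single parameter.
total = sum([a, b, c])

# 3. A parameter can itself be a more complicated expression.
rounded = round(a / c, 1)
```

A parser that only understands parm=variable pairs cannot express the second or third call at all.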

As I thought this over, I realized that instead of my trying to parse Python code, I should let Python do it.  Python has a built-in compile function that can compile an expression, such as a function call, into a code object.  The code object is then evaluated with the eval function.  Just what I needed.  So I ripped out all the original code that sort of parsed the function call expression and used compile to set it up.
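
As a rough illustration of the compile and eval pairing (not the actual SPSSINC TRANS code), the sketch below compiles an expression once and evaluates it for each case.  The expression text, the variable names x and y, and the list of cases are all made up for the example.

```python
import math

# Hypothetical expression text, as a user might type it.
expr = "math.hypot(x, y)"

# Compile once in 'eval' mode; the result is a reusable code object.
code = compile(expr, "<expression>", "eval")

# Evaluate the same code object for each case.  The first dictionary
# supplies the module the expression needs; the second supplies the
# case values by name.
cases = [{"x": 3.0, "y": 4.0}, {"x": 6.0, "y": 8.0}]
results = [eval(code, {"math": math}, case) for case in cases]
```

Compiling once and evaluating many times avoids re-parsing the expression text for every case.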

It took me a little while to get the hang of how to use compile and what it produces – not the best documentation you might find.  The issues were how to know what to import to make the function call valid and how to figure out which parameter values needed to be satisfied by Statistics variables.  And a little bit of error handling code to help the user when something isn’t right.
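
One way to get at both questions, sketched here under my own assumptions rather than as the command's actual implementation, is to inspect the co_names attribute of the compiled code object, which lists the global and attribute names the expression uses.  The function classify_names and the set of dataset variable names are hypothetical.

```python
import builtins

def classify_names(expr, dataset_variables):
    """Split the names used in expr into dataset variables and others.

    The 'others' are candidates for modules or functions to import.
    """
    try:
        code = compile(expr, "<expression>", "eval")
    except SyntaxError as err:
        # Turn a parser failure into a message aimed at the user.
        raise ValueError(
            "Not a valid Python expression: {0!r} ({1})".format(expr, err.msg)
        )
    data_names, other_names = [], []
    for name in code.co_names:
        if name in dataset_variables:
            data_names.append(name)
        elif not hasattr(builtins, name):
            # Not a dataset variable and not a built-in such as int.
            other_names.append(name)
    return data_names, other_names

data_names, other_names = classify_names(
    "math.hypot(int(x), y)", {"x", "y", "z"}
)
```

Note that co_names also contains attribute names (hypot here), so this simple approach would misclassify a dataset variable that happened to share a name with an attribute used in the expression.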

Got all that worked out, so now the command is much more general, and the implementation code is shorter and more robust.  Should have thought of this the first time.  And because function parameters can now be more general expressions, I axed the ASINTEGER subcommand in favor of just using int(x) in the parameter expression if that capability was needed, which would be infrequent anyway.

Because the code has to pass through the Universal Parser, there are still some expressions that will not work, but you can quote the entire function call expression and be protected from that if needed.

The new version, still considered a beta, is now posted to Developer Central.

So, once again, my hat is off to the Python designers: just about everything in the language is open to use in ordinary program code.  Just a bit more work on the documentation, please.
