Following up on my last post, I will now provide an example of Python scripting in terms of SQL optimization. When retrieving data from a database, one may want to limit data to a certain amount of dates. Many of Modeler’s date functions though do not convert to SQL as each database vendor has its own particular functions. What is possible with all databases that support SQL optimization is quoting an absolute date (e.g.) in a select node.

datetouse > '2015-12-01' and datetouse <= '2015-12-08'

But what if you want to on retrieve a relative period – such as the last seven days from the point of execution? That is where scripting can be really helpful so that one not need edit the select node each time the stream is execute.

I have put below a very simple stream – a user input node which services as a proxy for any data source such as a database, a select node and a table node. The goal is to retrieve records just for the past week which are stored in the column “datetouse”. Suppose the select node right now only has the line from above – December 1st to December 8th. While I can edit the select node, let’s say I want to automate this stream and run it every day. I do not want to have to edit the select node every time.

output_for_scripting

Thus I have created a script to only pick the first seven days. I will explain the script in two parts. This first part below calculates the dates. I first import some libraries that are needed to do the calculate.

# import date libraries
from datetime import date, timedelta, datetime

I set today’s date to the field first_date and calculate the field second_date as a week prior.

# calculate raw dates
first_date = date.today()
second_date = first_date + timedelta(days=-7)

The fields ending with form convert the dates to the desired format – which in this case is YYY-MM-DD.

# calculate formatted dates dates
first_date_form = first_date.strftime("%Y-%m-%d")
second_date_form = second_date.strftime("%Y-%m-%d")

Note that this first part is purely Python and could be part of any Python program.
The second part changes the date in the select node and runs the stream. The first two lines are necessary parts of any Python script as it lets Jython know to import the library referencing the Modeler objects and what stream to work with.

# import Modeler libraries and identify stream
import modeler.api
stream = modeler.script.stream()

The next line forms the necessary expression for the select node.

# create select node expression
datetext = "datetouse > '%s' and datetouse <= '%s'" % (second_date_form,first_date_form)

These two lines find and change the select node

# find and change select node
dateselectnode = stream.findByType("select", "DateSelect")
dateselectnode.setPropertyValue("condition",datetext)

Finally, I run the stream

# run stream
dateselectnode.run(None)

I ran this on January 6, 2016 and the select node now contains

datetouse > '2015-12-30' and datetouse <= '2016-01-06'

and the output has the right date.

output_for_scripting

4 comments on"SPSS Modeler Python Scripting Example"

  1. Govardhan Bhure November 13, 2017

    what if we want to install other python libraries and use it..Please help me

    • You cannot add other libraries to Python scripting. However you can add other libraries to Python run in stream — a capability exposed as an extension in 10.0 and as pure code in 18.1.

      • It would really nice to hear that. Can you please explain in little brief. Thanks.

      • SavvvyJain June 14, 2018

        Please direct me to the documentation or some reference where I can add pure python functions for pre processing my data and that code can be used in the stream

Join The Discussion

Your email address will not be published. Required fields are marked *