When using Regression in an Auto-Numeric modeling node, the regression models in the generated nugget indicate that there are errors in column numbers greater than the number of data columns.
My data set has 25 columns yet the 'Log' (under Advanced tab) of each generated model indicates errors in column numbers well beyond 25. No idea how to interpret this.
The behavior happens regardless of the "Method" parameter.
I've figured out what was causing the errors, but I'm not sure the reason is valid. It turns out to be the way missing values are read in by the Var. File source node.
Variable X consists of unquoted numbers but is stored as string because of the occurrence of "NULL" (unquoted) in the data. I enabled definition of blanks and specified NULL as a Missing Value. I then overrode the storage type to Integer.
To resolve the issue, though in a way I consider unsatisfactory, I went back to string storage and disabled definition of blanks. I then used a Filler node to replace "NULL" with "" followed by another Filler to convert to number.
Seems like I should be able to specify in the Source node that "NULL" is an empty value. It doesn't work. Always reads in "NULL".
When using the "Var. File" source node to read data from a text file in SPSS Modeler, there are a number of different settings and options available. The problem that you are encountering here is caused by your source data having text in a field that you were expecting to be numeric.
The solution that you have implemented seems to be just fine and should get the job done.
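For reference, the logic of the two Filler steps you describe can be sketched in plain Python. This is only an illustration of the transformation, not Modeler code: the function names are hypothetical, and representing an empty string as `None` after conversion is an assumption made for the sketch.

```python
def null_to_empty(value):
    """First step: replace the literal text NULL with an empty string."""
    return "" if value == "NULL" else value


def to_number(value):
    """Second step: convert to an integer; an empty string becomes None
    (an assumed stand-in for a missing value)."""
    return int(value) if value != "" else None


raw = ["12", "NULL", "7"]
cleaned = [to_number(null_to_empty(v)) for v in raw]
print(cleaned)  # [12, None, 7]
```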
The "Var. File" source node will scan a number of lines of the selected file to determine the fields and their storage. The default for this is set to 50 lines, but can be changed on the "File" tab of the node.
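The effect of scanning only the first lines can be illustrated with a small Python sketch. It is a simplified model, not Modeler's actual algorithm: `infer_storage` and `scan_lines` are hypothetical names, and only two storage types ("Integer" and "String") are considered.

```python
import csv
import io


def infer_storage(text, scan_lines=50):
    """Guess a storage type per field from the first scan_lines data rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # Assume Integer until a scanned value proves otherwise.
    storage = {name: "Integer" for name in header}
    for i, row in enumerate(reader):
        if i >= scan_lines:
            break
        for name, value in zip(header, row):
            try:
                int(value)
            except ValueError:
                storage[name] = "String"
    return storage


sample = "id,x\n1,10\n2,NULL\n3,30\n"
print(infer_storage(sample))                # {'id': 'Integer', 'x': 'String'}
print(infer_storage(sample, scan_lines=1))  # {'id': 'Integer', 'x': 'Integer'}
```

In this sketch, a "NULL" that appears after the scanned lines goes unnoticed, so the guessed storage depends on how many lines are scanned.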
The "Data" tab will display each field and the pre-determined storage of that field after scanning the file, but you also have the option to "Override" any of these by checking the "Override" box for the field and then selecting a different storage type.
In your case, the presence of the value "NULL" in the field would have caused the node to set the storage to "String". If you were to override this setting and select "Integer", Modeler would read the field as integers, discard any values that could not be converted to an integer value and set these as undefined, which is displayed in output as "$null$".
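The override behavior described above can be pictured with a short Python sketch. It only mimics the described outcome and is not how Modeler is implemented: `read_as_integer` is a hypothetical name, and `None` stands in for the undefined value that Modeler displays as "$null$".

```python
def read_as_integer(values):
    """Convert each raw text value to an integer; values that cannot be
    converted become None (stand-in for Modeler's $null$)."""
    result = []
    for v in values:
        try:
            result.append(int(v))
        except ValueError:
            result.append(None)
    return result


print(read_as_integer(["12", "NULL", "7"]))  # [12, None, 7]
```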