Batch deployment input details for SPSS models

Follow these rules when specifying input details for batch deployments of SPSS models.

Data type summary table:

Data | Description
Type | inline, data references
File formats | CSV

Data Sources:

Input/output data references:

Notes:

If you are specifying input/output data references programmatically:

Using connected data or a connection asset for an SPSS Modeler flow job

An SPSS Modeler flow can have multiple input and output data nodes. When you connect to a supported database as an input and output data source, the connection details are taken from the input and output data references, but the input and output table names are taken from the SPSS model stream file.

To perform batch deployment of an SPSS model using a database connection, make sure the modeler stream Input and Output nodes are Data Asset or Connection Asset nodes. In SPSS Modeler, the Data Asset or Connection Asset nodes must be configured with the table names that will be used later for job predictions. Set the nodes and table names before you save the model to Watson Machine Learning. While configuring the Data Asset or Connection Asset nodes, choose the table name from the Connections; choosing a Data Asset or Connection Asset that is created in your project is currently not supported.

When creating the deployment job for the SPSS model, make sure the input and output use the same type of data source. The table names configured in the model stream are passed to the batch deployment, and any input/output table names provided in the connected data are ignored.

To perform batch deployment of an SPSS model using a Cloud Object Storage (COS) connection, make sure the SPSS model stream has a single input data asset node and a single output data asset node.

Supported combinations of input and output sources

You must specify compatible sources for the SPSS Modeler flow input, the batch job input, and the output. If you specify an incompatible combination of data source types, you get an error when you try to run the batch job.

These combinations are supported for batch jobs:

SPSS model stream input/output | Batch deployment job input | Batch deployment job output
File | Local/managed or referenced data asset or connection asset (file) | Remote data asset or connection asset (file) or name
Database | Remote data asset or connection asset (database) | Remote data asset or connection asset (database)

For details on how Watson Studio connects to data, see Accessing data.
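
The table of supported combinations above can be read as a simple lookup. The following is a minimal sketch of that reading, not part of any Watson Machine Learning API: the function name `is_supported` and the coarse labels "file", "database", and "name" are hypothetical, and the sketch deliberately ignores the local/managed versus remote distinction that the table also makes.

```python
# Hypothetical lookup that mirrors the supported-combinations table.
# Keys are the SPSS model stream input/output kind; values list the
# batch job input and output kinds assumed to be compatible with it.
SUPPORTED = {
    "file": {"input": {"file"}, "output": {"file", "name"}},
    "database": {"input": {"database"}, "output": {"database"}},
}


def is_supported(stream_kind: str, job_input: str, job_output: str) -> bool:
    """Return True if the combination appears in the lookup above."""
    allowed = SUPPORTED.get(stream_kind)
    if allowed is None:
        # Unknown stream kinds are treated as unsupported.
        return False
    return job_input in allowed["input"] and job_output in allowed["output"]


if __name__ == "__main__":
    # A file-based stream with a file input and a named output.
    print(is_supported("file", "file", "name"))
    # A file-based stream cannot be paired with database sources.
    print(is_supported("file", "database", "database"))
```

A check like this can run before a job is submitted, so that an incompatible combination is caught locally rather than as a batch job error.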

Specifying multiple inputs

If you are specifying multiple inputs for an SPSS model stream deployment with no schema, specify an ID for each element in input_data_references.

For details, see Using multiple data sources for an SPSS job.

In this example, when you create the job, provide three input entries with the IDs "sample_db2_conn", "sample_teradata_conn", and "sample_googlequery_conn", and select the required connected data for each input.

{
  "deployment": {
    "href": "/v4/deployments/<deploymentID>"
  },
  "scoring": {
    "input_data_references": [
      {
        "id": "sample_db2_conn",
        "name": "DB2 connection",
        "type": "data_asset",
        "connection": {},
        "location": {
          "href": "/v2/assets/<asset_id>?space_id=<space_id>"
        }
      },
      {
        "id": "sample_teradata_conn",
        "name": "Teradata connection",
        "type": "data_asset",
        "connection": {},
        "location": {
          "href": "/v2/assets/<asset_id>?space_id=<space_id>"
        }
      },
      {
        "id": "sample_googlequery_conn",
        "name": "Google bigquery connection",
        "type": "data_asset",
        "connection": {},
        "location": {
          "href": "/v2/assets/<asset_id>?space_id=<space_id>"
        }
      }
    ],
    "output_data_references": {
      "id": "sample_db2_conn",
      "type": "data_asset",
      "connection": {},
      "location": {
        "href": "/v2/assets/<asset_id>?space_id=<space_id>"
      }
    }
  }
}
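
If you assemble the input entries programmatically, it can help to confirm that every element carries an ID before you submit the job. The following is a minimal sketch that uses only the Python standard library; the helper names `make_reference` and `check_input_ids` are hypothetical, and the uniqueness check is an assumption that goes beyond the stated rule that each element needs an ID.

```python
import json


def make_reference(ref_id: str, name: str, asset_href: str) -> dict:
    """Build one data_asset entry shaped like those in the example above."""
    return {
        "id": ref_id,
        "name": name,
        "type": "data_asset",
        "connection": {},
        "location": {"href": asset_href},
    }


def check_input_ids(references: list) -> list:
    """Return the IDs in order; raise ValueError if one is missing or repeated."""
    seen = []
    for index, ref in enumerate(references):
        ref_id = ref.get("id")
        if not ref_id:
            raise ValueError(f"input_data_references[{index}] has no id")
        if ref_id in seen:
            # Assumption: IDs should be unique so each input can be matched.
            raise ValueError(f"duplicate id: {ref_id}")
        seen.append(ref_id)
    return seen


if __name__ == "__main__":
    # Placeholder href copied from the example; replace before real use.
    href = "/v2/assets/<asset_id>?space_id=<space_id>"
    inputs = [
        make_reference("sample_db2_conn", "DB2 connection", href),
        make_reference("sample_teradata_conn", "Teradata connection", href),
        make_reference("sample_googlequery_conn", "Google bigquery connection", href),
    ]
    check_input_ids(inputs)
    print(json.dumps({"input_data_references": inputs}, indent=2))
```

The sketch only builds and checks the list of input entries; it does not call any service, and the placeholders in the href must be filled in with real asset and space IDs.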

Note: The environment variables parameter of deployment jobs is not applicable.
