Cleaning Data

Cleaning your data involves taking a closer look at the problems in the data that you've chosen to include for analysis. There are several ways to clean data using the Record and Field Operation nodes in IBM® SPSS® Modeler.

Table 1. Cleaning data
Data Problem              Possible Solution
------------------------  ------------------------------------------------------------------------------
Missing data              Exclude rows or characteristics. Or, fill blanks with an estimated value.
Data errors               Use logic to manually discover errors and replace. Or, exclude characteristics.
Coding inconsistencies    Decide upon a single coding scheme, then convert and replace values.
Missing or bad metadata   Manually examine suspect fields and track down correct meaning.
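
Two of the solutions in the table, filling blanks with an estimated value and converting values to a single coding scheme, can be sketched in a few lines of plain Python. This is only an illustration of the logic, not IBM SPSS Modeler's nodes or scripting API. The records, the field names (gender, age), the GENDER_CODES mapping, and the choice of the mean as the estimate are all assumptions made up for the example.

```python
from statistics import mean

# Hypothetical records; field names and values are invented for illustration.
records = [
    {"id": 1, "gender": "M", "age": 34},
    {"id": 2, "gender": "male", "age": None},   # missing age, inconsistent code
    {"id": 3, "gender": "F", "age": 41},
    {"id": 4, "gender": "Female", "age": 29},   # inconsistent code
]

# The single coding scheme agreed on for the "gender" field (an assumption).
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def fill_missing(rows, field):
    """Return new rows where None in `field` is replaced by the mean of the observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    estimate = mean(observed)
    return [{**r, field: estimate if r[field] is None else r[field]} for r in rows]


def recode(rows, field, mapping):
    """Return new rows where `field` is converted to the single scheme in `mapping`."""
    return [{**r, field: mapping[str(r[field]).strip().lower()]} for r in rows]


# Fill blanks first, then unify the coding scheme; the input list is left unchanged.
cleaned = recode(fill_missing(records, "age"), "gender", GENDER_CODES)
```

The mean is only one possible estimate; a median or a value predicted from other fields may suit your data better. An unrecognized code raises a KeyError in recode, which surfaces values that the agreed scheme does not yet cover instead of silently passing them through.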

The Data Quality Report prepared during the data understanding phase contains details about the types of problems particular to your data. You can use it as a starting point for data manipulation in IBM SPSS Modeler.