The One-source Match stage matches records from a single
source file and groups the records that match.
For example, you might locate all records that apply to the same
individual, household, or event. You might also deduplicate a file
to group all invoices for a customer or to merge a mailing list.
The One-source Match stage performs the following actions:
- Categorizes all records with weights above the match cutoff as
a set of duplicates.
- Identifies a master record by selecting the record within the
set that matches to itself with the highest weight. The master record
is associated with its set of duplicates.
- Classifies records that are not part of a set of duplicates as
nonmatched records. The nonmatched and master records are generally
made available for the next pass.
- Excludes duplicates from subsequent passes. To include duplicates
in subsequent passes, choose the Independent match type.
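
The actions above can be pictured with a minimal Python sketch of a single pass. This is an illustration only, not the stage's implementation: `toy_weight`, `one_source_pass`, the field names, and the sample records are all hypothetical, the weight function is a stand-in for the stage's real composite weights, and treating a weight equal to the cutoff as a match is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
FIELDS = ("name", "zip")  # hypothetical match fields


def toy_weight(a: Record, b: Record) -> float:
    """Stand-in weight: +1 per agreeing field, -1 per disagreeing field,
    0 when either value is missing. Real match weights are computed differently."""
    score = 0.0
    for field in FIELDS:
        x, y = a.get(field, ""), b.get(field, "")
        if not x or not y:
            continue
        score += 1.0 if x == y else -1.0
    return score


def one_source_pass(
    records: List[Record],
    weight: Callable[[Record, Record], float],
    match_cutoff: float,
) -> Tuple[Dict[int, List[int]], List[int]]:
    """Return ({master index: [duplicate indexes]}, [nonmatched indexes])."""
    # Try candidates in order of their self-match weight, highest first,
    # so the record that matches itself best becomes the master of its set.
    order = sorted(
        range(len(records)),
        key=lambda i: weight(records[i], records[i]),
        reverse=True,
    )
    assigned = set()
    duplicate_sets: Dict[int, List[int]] = {}
    nonmatched: List[int] = []
    for i in order:
        if i in assigned:
            continue
        assigned.add(i)
        # Assumption: a weight equal to the cutoff counts as a match.
        dups = [
            j for j in order
            if j not in assigned and weight(records[i], records[j]) >= match_cutoff
        ]
        if dups:
            assigned.update(dups)
            duplicate_sets[i] = dups
        else:
            nonmatched.append(i)
    return duplicate_sets, nonmatched


records = [
    {"name": "ANN LEE", "zip": ""},       # incomplete, lower self-match weight
    {"name": "ANN LEE", "zip": "10001"},
    {"name": "ANN LEE", "zip": "10001"},
    {"name": "BOB RAY", "zip": "94105"},
]
duplicate_sets, nonmatched = one_source_pass(records, toy_weight, match_cutoff=1.0)
```

In this sketch the incomplete first record does not become the master, because a fully populated record matches itself with a higher weight. Only the master and nonmatched indexes would be carried into a later pass, unless duplicates are deliberately kept as with the Independent match type.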
The output of the One-source Match stage can include master records,
duplicates above the match cutoff, clerical duplicates, nonmatched
records, and statistics about the results of the matching process.
You can use this output as input to the Survive stage.
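
As a rough way to read the output categories, the following sketch sorts a pair's weight into one of three groups. It assumes a second, lower threshold (called `clerical_cutoff` here) below the match cutoff; the function name, the parameter names, the boundary handling, and the sample weights are illustrative assumptions rather than the stage's documented behavior.

```python
def classify_pair(weight: float, match_cutoff: float, clerical_cutoff: float) -> str:
    """Sort a pair weight into a hypothetical output category."""
    if clerical_cutoff > match_cutoff:
        raise ValueError("clerical_cutoff must not exceed match_cutoff")
    if weight >= match_cutoff:
        return "duplicate"   # grouped with its master record
    if weight >= clerical_cutoff:
        return "clerical"    # held for manual review
    return "nonmatched"


labels = [classify_pair(w, match_cutoff=10.0, clerical_cutoff=6.0) for w in (12.0, 8.0, 3.0)]
```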