Data classification in the InfoSphere Information Analyzer thin client
A data class is an asset that categorizes database columns and data file fields according to the type of the data and how the data is used. Data classification is the process by which InfoSphere® Information Analyzer assigns a data class to a database column during a column analysis job. Data classification can also be done manually in InfoSphere Information Governance Catalog.
Data classes in InfoSphere Information Server
The list below contains all existing data classes enabled for InfoSphere Information Server.
Note: Metadata in the "Data that is considered during classification" column includes information such as the name of the column, inferred types, inferred formats, the number of formats, counts, and the number of distinct values. If only metadata is considered during classification, the actual column data is not taken into account.
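As a rough illustration of what a metadata-only check could look like, the sketch below models the metadata items named in the note and flags a column as a candidate for the Indicator class, which the table describes as having only two mutually exclusive values. The `ColumnMetadata` type, its field names, and the `looks_like_indicator` rule are hypothetical and are not part of any Information Analyzer API; the product's actual classification logic is not described in this topic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnMetadata:
    """Hypothetical container for the metadata items listed in the note."""
    name: str
    inferred_types: List[str] = field(default_factory=list)
    inferred_formats: List[str] = field(default_factory=list)
    row_count: int = 0
    distinct_values: int = 0

    @property
    def format_count(self) -> int:
        # "Number of formats" is derived from the inferred formats.
        return len(self.inferred_formats)


def looks_like_indicator(meta: ColumnMetadata) -> bool:
    """Assumed rule: exactly two distinct values across more than two rows.

    Only metadata is inspected; the column values themselves are never read.
    """
    return meta.distinct_values == 2 and meta.row_count > 2


# Illustrative columns with made-up metadata.
flag = ColumnMetadata("MAKE_BUY", ["STRING"], ["A"], row_count=500, distinct_values=2)
price = ColumnMetadata("PRICE", ["DECIMAL"], ["99.99"], row_count=500, distinct_values=180)

print(looks_like_indicator(flag), looks_like_indicator(price))
```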
Data Class | Description | Type | Data that is considered during classification |
---|---|---|---|
Account number | A string representing an account number. The column name is analyzed to see if it matches the following regular expression: columnaccount|acc|accnumber|accnum|accno|accountnumber | Regex | Column data |
Address Line 1 | Address Line 1 of a multi-line address. The address values are classified based on the following logic: | Java | Column data and metadata |
Address Line 2 | Address Line 2 of a multi-line address. | Java | Column data and metadata |
Address Line 3 | Address Line 3 of a multi-line address. | Java | Column data and metadata |
Airport Code | A string representing the IATA airport code. | Value list | Column data |
Alabama State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Alabama. | Regex | Column data |
Alaska State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Alaska. | Regex | Column data |
Alberta Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province Alberta. | Regex | Column data |
American Express Card (sub-category of Credit Card) | A 16-18 character number that identifies an American Express credit card account. | Java | Column data |
Arizona State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Arizona. | Regex | Column data |
Arkansas State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Arkansas. | Regex | Column data |
BIC | A string representing a Business Identifier Code. | Java | Column data |
Boolean | Numeric or alpha code for boolean values. Either 0 or 1, or True or False, Yes or No. | Value list | Column data |
British Columbia Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province British Columbia. | Regex | Column data |
California Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state California. | Regex | Column data |
Canada Post Code | A system of postal codes that are used by Canada Post. | Regex | Column data |
Canada Province Code | Two-letter alphabetic codes that are used to identify Canada provinces and territories. | Value list | Column data |
Canada Province Name | The name of the Canada provinces and territories. | Value list | Column data |
Canadian Social Insurance Number | A social insurance number (SIN) is a number issued in Canada to administer various government programs. | Java | Column data |
City | A name of a place such as a city or town. | Value list | Column data |
Code | System-defined data values from a domain data value set, each of which has a specific meaning, for example, product status codes. | Java | Metadata |
Colorado State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Colorado. | Regex | Column data |
Colors | A string representing the name of colors. | Value list | Column data |
Commercial and Government Entity Code | The CAGE code represented by a string of five characters. | Java | Column data and metadata |
Computer Host Name | Hostname is a label that is assigned to a device connected to a computer network and is used to identify the device. | Java | Column data |
Connecticut State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Connecticut. | Regex | Column data |
Country Code | A standard code defined for most of the countries and dependent areas in the world. | Value list | Column data |
Country Name | Specifies the name of any country. | Value list | Column data |
Credit Card Number | A credit card number. | Java | Column data |
Currency | A number preceded or followed by a currency symbol. The following currencies are supported: | Java | Column data |
Current Procedural Terminology | CPT medical code set. | Java | Column data and metadata |
Customer number | A string representing a customer number. | Regex | Column data and metadata |
Date | Data values which are specific date, time, or duration references, for example, a product order date. | Java | Column data |
Date of Birth | Data values which are dates, and represent a date of birth. | Java | Column data and metadata |
Delaware State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Delaware. | Regex | Column data |
Diners Club Card (sub-category of Credit Card) | A 15-18 character number that identifies a Diners Club credit card account. | Java | Column data |
Discover Card (sub-category of Credit Card) | A 17-18 character number that identifies a Discover Card credit card account. | Java | Column data |
Driver's License | A string representing a driver's license. | Regex | Column data and metadata |
DUNS Number | A unique numeric identifier assigned by Dun & Bradstreet (D&B) to a business entity. | Regex | Column data and metadata |
Email Address | An email address identifies an email box to which email messages are delivered. | Java | Column data |
Eye Color | A string representing the eye color of an individual. | Value list | Column data and metadata |
First Name (sub-category of Person Name) | First name of an individual. | Java | Column data |
Florida State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Florida. | Regex | Column data |
Fortune 1000 company (sub-category of Name of an Organization) | A string representing the name of a company from the Fortune 1000 list. | Java | Column data |
French INSEE Number | The INSEE code is a numerical indexing code used by the French National Institute for Statistics and Economic Studies (INSEE) to identify various entities, including communes and départements. | Java | Column data |
Gender | An alpha code setting for gender. Either M or F, or Male or Female. | Value list | Column data |
Geographic Coordinates | A string representing the longitude and latitude in degrees. | Java | Column data |
Georgia State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Georgia. | Regex | Column data |
Germany car registration number | A string representing a registration number for a German car. | Java | Column data |
Hair Color | A string representing the hair color of an individual. | Value list | Column data and metadata |
Hawaii State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Hawaii. | Regex | Column data |
Honorific | Salutation of a person added before the first name (name prefix). | Value list | Column data |
IBAN | A string representing an International Bank Account Number. | Java | Column data |
ICD-10 | The 10th revision of the International Statistical Classification of Diseases and Related Health Problems, a medical classification list by the World Health Organization (WHO). | Java | Column data |
Idaho State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Idaho. | Regex | Column data |
Identifier | Non-intelligent data values that are typically unique, and are used to reference a specific entity, for example, a product number. | Java | Metadata |
Illinois State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Illinois. | Regex | Column data |
INCO Terms (International Commercial Terms) | A 3-character string representing INCO Terms. | Value list | Column data |
Indiana State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Indiana. | Regex | Column data |
Indicator | Code values that have only two mutually-exclusive values in the domain set, for example, a product Make/Buy indicator, that are often called Flags. | Java | Metadata |
Individual Taxpayer Identification Number | A 9-digit tax processing number issued by the Internal Revenue Service (IRS) to individuals who are required to have a US taxpayer identification number but who do not have, and are not eligible to obtain a SSN. | Regex | Column data and metadata |
International Securities Identification Number | An International Securities Identification Number (ISIN) uniquely identifies a security. | Java | Column data |
International Standard Book Number | The International Standard Book Number (ISBN) is a 13-digit number assigned by standard book numbering agencies to control and facilitate activities within the publishing industry. | Java | Column data |
International Standard Industrial Classification | A string representing International Standard Industrial Classification of All Economic Activities. | Java | Column data and metadata |
Internet Protocol Address | An Internet Protocol address (IP address) is a numerical label assigned to each device (e.g., computer, printer) participating in a computer network that uses the Internet Protocol for communication. | Regex | Column data |
Internet Protocol Version 6 Address | Internet Protocol version 6 (IPv6) is the latest version of the Internet Protocol. | Regex | Column data |
Iowa State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Iowa. | Regex | Column data |
Ireland Eircode | A system of postal codes that are used by An Post, Ireland's postal service. | Regex | Column data |
ISO 3166-2 Code | A string representing the ISO 3166-2 code of a state or province of a country. | Value list | Column data |
Italian Fiscal Code | The Italian fiscal code card, officially known as Italy's Codice Fiscale, is the tax code card in Italy. | Regex | Column data |
Japan CB (sub-category of Credit Card) | A 17-18 character number that identifies a Japanese Credit Bureau (JCB) credit card account. | Java | Column data |
Kansas State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Kansas. | Regex | Column data |
Kentucky State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Kentucky. | Regex | Column data |
Latitude | A decimal number or a string representing the latitude in degrees. | Java | Column data and metadata |
Longitude | A decimal number or a string representing the longitude in degrees. | Java | Column data and metadata |
Louisiana State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Louisiana. | Regex | Column data |
Maine State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Maine. | Regex | Column data |
Manitoba Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province Manitoba. | Regex | Column data |
Marital/Civil Status | A string representing the relationship status of an individual. | Value list | Column data |
Maryland State Driver's License | A string representing the driver's license in the US state Maryland. | Regex | Column data |
Massachusetts State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Massachusetts. | Regex | Column data |
Master Card (sub-category of Credit Card) | A 17-18 character number that identifies a Master Card credit card account. | Java | Column data |
Michigan State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Michigan. | Regex | Column data |
Minnesota State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Minnesota. | Regex | Column data |
Mississippi State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Mississippi. | Regex | Column data |
Missouri State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Missouri. | Regex | Column data |
Montana State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Montana. | Regex | Column data |
Month | A string or integer value representing a month in a date. | Java | Column data |
Name of an Organization | A string representing the name of an organization. | Java | Column data |
Name Suffix | Name suffix of a person. | Value list | Column data |
Nebraska State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Nebraska. | Regex | Column data |
Nevada State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Nevada. | Regex | Column data |
New Brunswick Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province New Brunswick. | Regex | Column data |
New Foundland and Labrador Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province Newfoundland and Labrador. | Regex | Column data |
New Hampshire State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state New Hampshire. | Regex | Column data |
New Jersey State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state New Jersey. | Regex | Column data |
New Mexico State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state New Mexico. | Regex | Column data |
New York State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state New York. | Regex | Column data |
NoClassDetected | Do not delete this data class. It is used to display the number of distinct data values that did not match any of the other data classes that were enabled during the most recent column analysis for a given column. | Data | Not applicable |
North Carolina State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state North Carolina. | Regex | Column data |
North Dakota State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state North Dakota. | Regex | Column data |
Nova Scotia Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province Nova Scotia. | Regex | Column data |
Ohio State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Ohio. | Regex | Column data |
Oklahoma State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Oklahoma. | Regex | Column data |
Ontario Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province Ontario. | Regex | Column data |
Oregon State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Oregon. | Regex | Column data |
Passport Number | Passport number is the unique ID assigned to a travel document, usually issued by the government of a nation, that certifies the identity and nationality of its holder for the purpose of international travel. | Regex | Column data |
Pennsylvania State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Pennsylvania. | Regex | Column data |
Percentage | A number representing a percentage. | Regex | Column data |
Person Name | The name of an individual. | Java | Column data |
Prince Edward Island Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province Prince Edward Island. | Regex | Column data |
Quantity | Numerical data values that could be used in a computation, for example, a product price. | Java | Metadata |
Quebec Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province Quebec. | Regex | Column data |
Rhode Island State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Rhode Island. | Regex | Column data |
Routing Transit Number | A 9-digit code, used in the United States, identifying financial institutions. | Java | Column data |
Saskatchewan Province Driver's License (sub-category of Driver's License) | A string representing the driver's license in the Canadian province Saskatchewan. | Regex | Column data |
South Carolina State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state South Carolina. | Regex | Column data |
South Dakota State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state South Dakota. | Regex | Column data |
Spanish Fiscal Identification Number | The NIF is the Spanish tax identification number. | Regex | Column data |
State/Province name | The name of a state or province of a country. | Value list | Column data |
Temperature | A number representing a temperature. | Java | Column data |
Tennessee State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Tennessee. | Regex | Column data |
Texas State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Texas. | Regex | Column data |
Text | Free-form alphanumeric data values from an unlimited domain set, for example, a product description. | Java | Metadata |
UK National Insurance Number | A number used in the United Kingdom (UK) in the administration of the National Insurance or social security system. | Regex | Column data |
UK Post Code | A system of postal codes that are used by UK's Royal Mail. | Regex | Column data |
UK Province Code | Two-letter alphabetic codes that are used to identify UK provinces. | Value list | Column data |
Uniform Resource Locator | A URL is one type of Uniform Resource Identifier (URI); the generic term for all types of names and addresses that refer to objects on the World Wide Web. | Java | Column data |
Universal Product Code | The Universal Product Code (UPC) is a barcode symbology that is widely used in many countries, for tracking trade items in stores. | Java | Column data |
United States Standard Industrial Classification | A 4-digit number used to classify industries in the United States. | Java | Column data and metadata |
US County | The name of a US county. | Value list | Column data |
US Employer Identification Number | A 9-digit number that identifies a US employer, typically in nn-nnnnnnn format with the dash (-) being optional. Issued by the IRS. | Regex | Column data and metadata |
US National Drug Code | A 10-digit US National Drug Code (NDC), represented in 4-4-2, 5-3-2, or 5-4-1 format, often without dashes. The UPC-A bar code on an NDC-coded product package embeds the NDC. | Java | Column data and metadata |
US Phone Number | US Phone Number is a string of specific numbers that a telephone or cell phone user can dial to reach another telephone or mobile phone in the United States (US). | Regex | Column data |
US Social Security Number | In the United States, a Social Security number (SSN) is a unique 9-digit number issued to US citizens, permanent residents, and temporary (working) residents. | Regex | Column data |
US Social Security Number Last 4 | The last four digits of a United States Social Security Number (SSN). | Regex | Column data and metadata |
US State Capital Name (sub-category of City) | Specifies the names of the capitals of US states and territories. | Value list | Column data |
US State Code | Two-letter alphabetic codes used to identify US states and certain other associated areas. | Value list | Column data |
US State Name | Specifies the name of US states and territories. | Value list | Column data |
US Zip | US ZIP codes are a system of postal codes used by the United States Postal Service (USPS) since 1963. | Java | Column data |
Utah State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Utah. | Regex | Column data |
Vehicle Identification Number | A vehicle identification number (VIN), also called a chassis number, is a unique code, including a serial number, used by the automotive industry to identify individual motor vehicles. | Java | Column data |
Vermont State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Vermont. | Regex | Column data |
Virginia State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Virginia. | Regex | Column data |
VISA Card (sub-category of Credit Card) | A 17-18 character number that identifies a VISA credit card account. | Java | Column data |
Washington DC Driver's License (sub-category of Driver's License) | A string representing the driver's license in Washington, DC (US). | Regex | Column data |
Washington State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Washington. | Regex | Column data |
West Virginia State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state West Virginia. | Regex | Column data |
Wisconsin State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Wisconsin. | Regex | Column data |
Wyoming State Driver's License (sub-category of Driver's License) | A string representing the driver's license in the US state Wyoming. | Regex | Column data |
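
To make the Regex and Value list classifier types in the table more concrete, the following minimal sketch tests column values against a pattern and against a fixed set of values. The pattern is derived only from the US Employer Identification Number description (nn-nnnnnnn with an optional dash), and the value set only from the Boolean description. The function names, the case-insensitive comparison, and the idea of reporting a match ratio are assumptions made for illustration; they are not the expressions or logic that Information Analyzer actually ships.

```python
import re

# Assumed pattern based on the EIN description: nn-nnnnnnn, dash optional.
EIN_PATTERN = re.compile(r"[0-9]{2}-?[0-9]{7}")

# Values from the Boolean description; case-insensitive matching is an assumption.
BOOLEAN_VALUES = {"0", "1", "true", "false", "yes", "no"}


def matches_ein(value: str) -> bool:
    """Regex-style check: the whole value must match the pattern."""
    return EIN_PATTERN.fullmatch(value.strip()) is not None


def matches_boolean(value: str) -> bool:
    """Value-list-style check: the value must appear in the fixed set."""
    return value.strip().lower() in BOOLEAN_VALUES


def match_ratio(values, predicate) -> float:
    """Fraction of non-empty values that satisfy the predicate."""
    non_empty = [v for v in values if v and v.strip()]
    if not non_empty:
        return 0.0
    return sum(1 for v in non_empty if predicate(v)) / len(non_empty)


# Made-up sample columns.
ein_column = ["12-3456789", "123456789", "98-7654321", "n/a"]
bool_column = ["Yes", "no", "YES", "No"]

print(match_ratio(ein_column, matches_ein))       # 0.75
print(match_ratio(bool_column, matches_boolean))  # 1.0
```

A real classifier would also need a threshold for deciding when the ratio is high enough to assign the class; that threshold is deliberately left out here because this topic does not state one.
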
You can enable additional data classes by following the steps listed here. Those additional data classes are experimental and are disabled by default.