XML validation is the process of determining whether the structure, content, and data types of an XML document are valid. XML validation also strips off ignorable whitespace in the XML document.
Validation is optional but highly recommended when data integrity is in question, since it ensures that XML documents abide by the rules provided by their XML schemas on top of being well-formed.
Note that you can validate your XML documents against XML schemas only. You cannot validate an XML document against a DTD.
To validate an XML document, use the XMLVALIDATE function. You can specify XMLVALIDATE with an SQL statement that inserts or updates XML documents in a DB2® database. For automatic validation of XML documents, a BEFORE trigger on an XML column can also invoke the XMLVALIDATE function. To enforce validation of XML documents, you create a check constraint.
Before you can invoke the XMLVALIDATE function, all schema documents that make up an XML schema must be registered in the built-in XML schema repository. An XML document itself does not need to be in a database in order for you to be able to validate it with XMLVALIDATE.
According to the XML standard, whitespace is space characters (U+0020), carriage returns (U+000D), line feeds (U+000A), or tabs (U+0009) that are in the document to improve readability. When any of these characters appear as part of a text string, they are not considered to be whitespace.
Ignorable whitespace is whitespace that can be eliminated from the XML document. The XML schema document determines which whitespace is ignorable whitespace. If an XML document defines an element-only complex type (an element that contains only other elements), the whitespace between the elements is ignorable. If the XML schema defines a simple element that contains a non-string type, the whitespace within that element is ignorable.
<xs:element name="description" minOccurs="0" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" minOccurs="0" />
<xs:element name="details" type="xs:string" minOccurs="0" />
<xs:element name="price" type="xs:decimal" minOccurs="0" />
<xs:element name="weight" type="xs:string" minOccurs="0" />
…
</xs:complexType>
</xs:element>
The description element
has an element-only complex type because it contains only other elements.
Therefore, whitespace between elements in a description element
is ignorable whitespace. The price element can also
contain ignorable whitespace because it is a simple element that contains
a non-string type.In the XMLVALIDATE function, you can explicitly specify the XML schema document to use for validation. If you do not specify an XML schema document, the DB2 database server looks in the input document for an xsi:schemaLocation or xsi:noNamespaceSchemaLocation attribute that identifies the XML schema document. xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes are defined by the XML Schema specification, and are called XML schema hints. An xsi:schemaLocation attribute contains one or more pairs of values that help to locate the XML schema document. The first value in each pair is a namespace, and the second value is a hint that indicates where to find the XML schema for the namespace. An xsi:noNamespaceSchemaLocation value contains only a hint. If an XML schema document is specified in the XMLVALIDATE function, it overrides the xsi:schemaLocation or xsi:noNamespaceSchemaLocation attribute.
REGISTER XMLSCHEMA http://posample.org/product.xsd FROM product.xsd \
AS myschema.product
COMPLETE XMLSCHEMA myschema.product
REGISTER XMLSCHEMA http://posample.org/product.xsd FROM product.xsd \
AS myschema.product COMPLETE
CREATE TABLE MyProduct LIKE Product
<product xmlns="http://posample.org" pid=''110-100-01'' >
<description>
<name>Anvil</name>
<details>Very heavy</details>
<price> 9.99 </price>
<weight>1 kg</weight>
</description>
</product>'
Insert into MyProduct
(pid, name, Price, PromoPrice, PromoStart, PromoEnd, description)
values ( '110-100-01','Anvil', 9.99, 7.99, '11-02-2004','12-02-2004',
XMLVALIDATE(? ACCORDING TO XMLSCHEMA ID myschema.product))
<product xmlns="http://posample.org" pid="110-100-01"><description><name>Anvil
</name><details>Very heavy</details><price>9.99</price><weight>1 kg</weight>
</description></product>
The product schema
defines the whitespace around the name, details, price,
and weight elements, and the whitespace within the
price element as ignorable whitespace, so XMLVALIDATE removes it.If you need to ensure that you insert only documents that are validated into an XML column or retrieve only documents that are validated from an XML column, use the VALIDATED predicate.
To test whether or not an XML document has been validated before inserting or updating the document, create a check constraint that contains the VALIDATED predicate on the XML column. To retrieve only validated documents from an XML column, or to retrieve only those documents that have been inserted without validation, you use the VALIDATED predicate in a WHERE clause. If you need to check whether or not an XML document has been validated according to certain XML schemas, include the XML schemas with the VALIDATED predicate in the ACCORDING TO XMLSCHEMA clause.
The VALIDATED predicate can also be used as part of a trigger. To trigger validation of XML documents that have not yet been validated before you insert or update them in an XML column, create a BEFORE trigger that contains the VALIDATED predicate on the XML column in the WHEN clause to invoke the XMLVALIDATE function.
SELECT Info FROM MyCustomer WHERE Info IS VALIDATED
ALTER TABLE MyCustomer ADD CONSTRAINT CK_VALIDATED CHECK (Info IS VALIDATED)
Issuing
this statement, however, makes the use of the VALIDATED predicate
in the previous example unnecessary, as only valid documents would
be successfully inserted or updated in the table.<customerinfo xml:space="default"
xmlns="http://posample.org"
Cid='1011'>
<name>Kathy Smith</name>
<addr country='Canada'>
<street>25 Rosewood</street>
<city>Toronto</city>
<prov-state>Ontario</prov-state>
<pcode-zip>M6W 1E6</pcode-zip>
</addr>
<phone type='work'>416-555-6676</phone>
</customerinfo>
Assume that you have assigned the
document to an application variable. You can use a VALUES statement
like this to do the validation: VALUES XMLVALIDATE(? according to xmlschema id myschema.customer)
This
document is valid according to the XML schema, so the VALUES statement
returns a result table that contains the document. If the document
is not valid, VALUES returns an SQL error.