Chapter 3. XML Data Manipulation

Knowledge to identify the results of XML parsing

[Note]

XML parsing is the process of converting XML data from its serialized string format to its hierarchical format.

You can let the DB2 database manager perform parsing implicitly, or you can perform XML parsing explicitly.

Implicit XML parsing occurs in the following cases:

  • When you pass data to the database server using a host variable of type XML, or use a parameter marker of type XML.

    The database server does the parsing when it binds the value for the host variable or parameter marker for use in statement processing.

    You must use implicit parsing in this case.

  • When you assign a host variable, parameter marker, or SQL expression with a string data type (character, graphic or binary) to an XML column in an INSERT, UPDATE, DELETE, or MERGE statement. The parsing occurs when the SQL compiler implicitly adds an XMLPARSE function to the statement.

You perform explicit XML parsing when you invoke the XMLPARSE function on the input XML data. You can use the result of XMLPARSE in any context that accepts an XML data type. For example, you can use assign the result to an XML column or use it as a stored procedure parameter of type XML.

XML parsing and whitespace handling

During implicit or explicit XML parsing, you can control the preservation or stripping of boundary whitespace characters when you store the data in the database.

According to the XML standard, whitespace is space characters (U+0020), carriage returns (U+000D), line feeds (U+000A), or tabs (U+0009) that are in the document to improve readability. When any of these characters appear as part of a text string, they are NOT considered to be whitespace.

Boundary whitespace is whitespace characters that appear between elements. For example, in the following document, the spaces between <a> and <b> and between </b> and </a> are boundary whitespace:


<a> <b> and between </b> </a>
					
					

With explicit invocation of XMLPARSE, you use the STRIP WHITESPACE or PRESERVE WHITESPACE option to control preservation of boundary whitespace. The default is stripping of boundary whitespace.

Assume that the following document is in BLOB host variable blob_hostvar:


<customerinfo xml:space="default" xmlns="http://posample.org" Cid='1009'>
  <name>Kathy Smith</name>
  <addr country='Canada'>
    <street>15 Rosewood</street>
    <city>Toronto</city>
    <prov-state>Ontario</prov-state>
    <pcode-zip>M6W 1E6</pcode-zip>
  </addr>
  <phone type='work'>416-555-4444</phone>
</customerinfo>
					
					
In a static embedded C application, insert the document from the host variable into XML column Info of table MyCustomer. The host variable is not an XML type, so you need to execute XMLPARSE explicitly. Specify STRIP WHITESPACE to remove any boundary whitespace:
EXEC SQL BEGIN DECLARE SECTION;
   SQL TYPE BLOB (10K) blob_hostvar;
EXEC SQL END DECLARE SECTION;
...
EXEC SQL INSERT INTO MyCustomer (Cid, Info) 
  VALUES (1009,
  XMLPARSE(DOCUMENT :blob_hostvar STRIP WHITESPACE));
					
The document contains the xml:space="default" attribute, so the XMLPARSE specification of STRIP WHITESPACE controls whitespace handling. This means that the carriage returns, line feeds, and spaces between the elements in the document are removed.

If you retrieve the stored data, you see a single line with the following content:


<customerinfo xml:space="default" xmlns="http://posample.org" Cid='1009'>
<name>Kathy Smith</name><addr country='Canada'><street>15 Rosewood</street>
<city>Toronto</city><prov-state>Ontario</prov-state><pcode-zip>M6W 1E6</pcode-zip>
</addr><phone type='work'>416-555-4444</phone></customerinfo>

					

Professional hosting     Belorussian informational portal         Free SCWCD 1.4 Study Guide     Free SCDJWS 1.4 Study Guide     SCDJWS 1.4 Quiz     Free IBM Certified Associate Developer Study Guide     IBM Test 000-287. Enterprise Application Development with IBM WebSphere Studio, V5.0 Study Guide     SCDJWS 5.0 Quiz