Big Data/Analytics Zone is brought to you in partnership with:

Daniel have developed support for proprietary technology which is processing huge volumes of XML data. Now is working on Register of traffic accidents(J2EE application build on Websphere) for Ministry of Interior of the Slovak Republic. Daniel has posted 9 posts at DZone. You can read more from them at their website. View Full User Profile

SQLX - From DB Straight to XML and Back

05.28.2013
| 7312 views |
  • submit to reddit

Querying db tables, getting XML instead of resultset

In most of today's enterprise apps there's a constantly changing area keeping developers busy every time the data model is revised: the part of code where DB tables are mapped to the XML structure. The way it is done varies from one case to another but the goal is always the same - create XML from relational data and the harder part do it backward, update of db based on incoming data in the XML format.


Possible solutions:

  • Probably the worst solution you can choose is query db, interpret result set and construct the DOM in the raw code. and then code it all again and read DOM interpret it and create update statement, Why? You have just generated huge amount of the glue code and mapping is very hard to maintain.

  • Better approach is create isolated business objects in the respect of desired XML structure, fill them with data from the query and serialize/marshall them to XML with JAXB. Now it is much simpler to implement it backward. You can even enrich that solution with JPA but be careful, unmarshaling JPA entities from XML isn’t good idea when doing update!  

  • Cool and trendy would be using XQuery support in your DB server. What about querying DB data and instead of result set get  desired XML document? But there are buts. XQuery is young it became a W3C Recommendation on January 23, 2007. Support in DB machines is still evolving and by my opinion the most important “but” is that  it is still mixing two languages(SQL and XQuery) to achieve one query. It is promising but...

  • My favorite SQL/XML or shortened SQLX we will get back to it right after short XQuery example.

Be cool with XQuery

“But” implementation differs between Oracle DB and IBM DB2. Example below is for DB2, on Oracle DB you can not(you can with sqlplus but that is different story) replace SQL statement with XQuery directly, you have to use XMLQuery() and XMLTable() functions and actually select from XQuery result. This embedding of XQuery in SQL statement is possible in both with little differences.

Oracle
SELECT XMLQUERY('$y/desc/title' 
       PASSING indoc AS "y" RETURNING CONTENT) 
INTO titlexml FROM HR.HR_DOCS;
DB2
SELECT XMLQUERY('$y/desc/title' 
       PASSING indoc AS "y") 
INTO titlexml FROM HR.HR_DOCS;
In DB2 it is possible embed an SQL query in XQuery statement like in this example
xquery
<ErrorCatalog>{
for $y in 
db2-fn:sqlquery(
'SELECT XMLELEMENT(NAME "TmpErr",
        XMLATTRIBUTES(e.ERRCODE AS "code"),
        XMLTEXT(e.DESC)) FROM CC.CC_ERROR e')
where $y/@code=404
return (
   <Error> {$y/@code} {$y/text()}</Error>
)}</ErrorCatalog>
Output
<ErrorCatalog>
  <Error code="404">
    The requested resource could not be found 
    but may be available again in the future. 
    Subsequent requests by the client are 
    permissible.
  </Error>
</ErrorCatalog>
If you go deep enough you will find that reversed embedding XQuery in SQL is on the other hand little more convenient on the Oracle DB. But all the way you are trying use XQuery you encounter XMLELEMENT() and XMLFOREST() and XMLAGG() and so on. And that is SQL/XML, the tool with which SQL itself can create XML.

SQL/XML

Brief history of SQL/XML

It all started in October of 2000 when industry consortium called SQLX Group started its work on extension of the SQL support for querying of XML from database. Founding industries of the SQLX Group are famous names like Hewlett-Packard, IBM, Microsoft, Oracle, Sybase and many others. First revision of SQL came 3 years later, SQL:2003-14 has brought basic XML features as we know them, in the years later XQuery started to appear joining the game. Function XMLTABLE() came in SQL/XML:2006 revision.

Lets start with simple select

We will use XMLELEMENT function and XMLAGG which aggregates all found rows under one parent element. XMLAGG is really strong tool, because with GROUP BY  it is possible to sort grouped rows as elements under different parents. But leave it for now.

SELECT XMLELEMENT(NAME "PhoneBook", -- root element name
                  XMLAGG(-- aggregation over the rows	
                  XMLELEMENT(NAME "Contact",                                                  XMLATTRIBUTES(cust.FIRST_NAME AS "Name",
                                cust.TEL)
                              )
                        )
                  )
FROM TMP.CUSTOMER AS cust;
Output
<PhoneBook>
    <Contact Name="Daniel" TEL="788255855"/>
    <Contact Name="Martin" TEL="889665447"/>
    <Contact Name="Eva"    TEL="111222333"/>
    <Contact Name="Alena"  TEL="444555666"/>
    <Contact Name="Oliver" TEL="777888999"/>
    <Contact Name="George" TEL="444882446"/>
    <Contact Name="Jamie"  TEL="123456789"/>
</PhoneBook>
Well to see output right away we have to serialize it with:

XMLSERIALIZE(XMLELEM...) AS VARCHAR(500))

or:

XMLSERIALIZE(XMLELEM...) AS CLOB(500) INCLUDING XMLDECLARATION)

Including xml declaration adds <?xml version... so it became full grown XML document but it works only on DB2. Another advantage of serializing xml is that we don't need any special new jdbc driver which is aware of XML types.

Namespaces

Function XMLNAMESPACES() is available only in DB2 8.2 and higher, strangely on Oracle you can use  XMLNAMESPACES() in EXTRACT() but not in XMLELEMENT(), on Oracle and PostgreSql use XMLATTRIBUTES() function and add xmlns manually. Default namespace in DB2
SELECT XMLSERIALIZE(CONTENT 
        XMLELEMENT(NAME "PhoneBook",
        XMLNAMESPACES(DEFAULT 'http://daniel.kecovi.cz/tmp/1.0'),
        XMLAGG(
        XMLELEMENT(NAME "Contact",
         XMLATTRIBUTES(cust.FIRST_NAME||' '||cust.LAST_NAME AS "Name",
                                             cust.TEL)
                              )
                  ))AS VARCHAR(500))
FROM TMP.CUSTOMER AS cust;
Default namespace in Oracle
SELECT XMLSERIALIZE(CONTENT 
        XMLELEMENT(NAME "PhoneBook",
        XMLATTRIBUTES('http://daniel.kecovi.cz/tmp/1.0' AS "xmlns"),
         XMLAGG(
          XMLELEMENT(NAME "Contact",
           XMLATTRIBUTES(cust.FIRST_NAME||' '||cust.LAST_NAME AS "Name",
                                             cust.TEL)
                              )
                  ))AS VARCHAR(500))
FROM TMP.CUSTOMER AS cust;
Output
<PhoneBook xmlns="http://daniel.kecovi.cz/tmp/1.0">
    <Contact Name="Daniel Kec"   TEL="788255855"/>
    <Contact Name="Martin Kec"   TEL="889665447"/>
    <Contact Name="Eva Kec"      TEL="111222333"/>
    <Contact Name="Alena Kec"    TEL="444555666"/>
    <Contact Name="Oliver Kec"   TEL="777888999"/>
    <Contact Name="George Takei" TEL="444882446"/>
    <Contact Name="Jamie Oliver" TEL="123456789"/>
</PhoneBook>
Namespace declaration is different on oracle and postgre, you have to manually add attribute this way:  XMLATTRIBUTES('http://daniel.kecovi.cz/tmp/1.0' AS "xmlns")

Do I have to repeat myself?

There is a thing with which you are going to meet when building some larger documents. One element which is on multiple different places in the document tree. You don’t have to write redundant code in your SQL query, because there is WITH. With clause allows you to create common table expression before query itself, so why not use it as a function? Instead of pre preparing table we can just pre prepare table with reusable XML fragment.

Tested on DB2 9.7 and Oracle 11g only difference is dummy table name
WITH 
-- prepared to create Customer element, 
-- lastName column is possible filtering
tmpCustomer(lastName,custEl) AS(
SELECT LAST_NAME,XMLELEMENT(NAME "Customer",
                  XMLATTRIBUTES(FIRST_NAME||' '||LAST_NAME AS "FullName"
                                   )
                    )
FROM TMP.CUSTOMER cust
)
-- main select
SELECT XMLSERIALIZE(
        CONTENT
        XMLELEMENT(NAME "CustomerList",
         -- using prepared Customer
         (SELECT XMLAGG(t1.custEl) FROM tmpCustomer t1),
         XMLELEMENT(NAME "KecFamily",
         -- using prepared Customer 
         -- and filtering only those with last name kec
          (SELECT XMLAGG(t2.custEl) 
           FROM tmpCustomer t2 
           WHERE t2.lastName='Kec')
                    )                                     
                  )AS CLOB)
-- in this example using dummy table
-- (dual for oracle and SYSIBM.SYSDUMMY1 for db2)
--FROM DUAL;
FROM SYSIBM.SYSDUMMY1;
Output
<CustomerList>
    <Customer FullName="Daniel Kec" />
    <Customer FullName="Martin Kec" />
    <Customer FullName="Eva Kec" />
    <Customer FullName="Alena Kec" />
    <Customer FullName="Oliver Kec" />
    <Customer FullName="George Takei" />
    <Customer FullName="Jamie Oliver" />
    <KecFamily>
        <Customer FullName="Daniel Kec" />
        <Customer FullName="Martin Kec" />
        <Customer FullName="Eva Kec" />
        <Customer FullName="Alena Kec" />
        <Customer FullName="Oliver Kec" />
    </KecFamily>
</CustomerList>


Ah please I need variables

Little help with variables brings WITH.

On DB2 simple like this

WITH 
vars(
DATETIMEMASK,               -- datetime mask
DATEMASK,                   -- date mask
TIMEMASK                    -- only time mask
) 
AS (VALUES(                 
'DD.MM.YYYY HH24:MI',     
'DD.MM.YYYY',               
'HH24:MI'))       
-- main select
SELECT XMLSERIALIZE(
        CONTENT
        XMLELEMENT(NAME "CustomerList",
          XMLAGG(
          XMLELEMENT(NAME "Customer",
          XMLATTRIBUTES(c.FIRST_NAME AS "Name",
TO_CHAR(c.LAST_ORDER,vars.DATEMASK) AS "LastOrder"
                                 )
                             )                            
                          )
                  )AS CLOB)
FROM vars,TMP.CUSTOMER c;

Oracle need SELECT something 

WITH 
vars(
DATETIMEMASK,           -- datetime mask
DATEMASK,               -- date mask
TIMEMASK                -- only time mask
) 
AS (SELECT                
'DD.MM.YYYY HH24:MI',     
'DD.MM.YYYY',               
'HH24:MI' 
FROM DUAL)                 
-- main select
SELECT XMLSERIALIZE(
        CONTENT
        XMLELEMENT(NAME "CustomerList",
         XMLAGG(
         XMLELEMENT(NAME "Customer",
         XMLATTRIBUTES(c.FIRST_NAME AS "Name",
TO_CHAR(c.LAST_ORDER,vars.DATEMASK) AS "LastOrder"
                                 )
                             )                            
                          )
                  )AS CLOB)
FROM vars,TMP.CUSTOMER c;

Output

<CustomerList>
    <Customer Name="Daniel" LastOrder="22.05.2013" />
    <Customer Name="Martin" LastOrder="22.05.2013" />
    <Customer Name="Eva"    LastOrder="22.05.2013" />
    <Customer Name="Alena"  LastOrder="22.05.2013" />
    <Customer Name="Oliver" LastOrder="22.05.2013" />
    <Customer Name="George" LastOrder="22.05.2013" />
    <Customer Name="Jamie"  LastOrder="22.05.2013" />
</CustomerList>

Expression as element name

Maybe you wonder how to make an element with name other than constant. On Oracle you can use EVALNAME() function:

SELECT XMLSERIALIZE(CONTENT 
        XMLELEMENT(NAME "PhoneBook",   
         XMLAGG(
          XMLELEMENT(EVALNAME(cust.FIRST_NAME),
           XMLATTRIBUTES(cust.LAST_NAME AS "LastName",
                                             cust.TEL)
                              )
                  ))AS VARCHAR(500))
FROM KYTYR.CUSTOMER cust;

And back to the Table

We have been dealing only with querying tables until now but what about inserts and updates? It is no problem with XMLTABLE which enable us to map the XML document to temporary table with XPath. Then it’s old plain SQL again.

Example of updates which works on both Oracle 11g and DB2 9.7

INSERT INTO TMP.DISCOUNT_CODE 
(SELECT discount_code, rate 
 FROM XMLTABLE('$doc/discount' PASSING 
                XMLPARSE(DOCUMENT 
'<discount>
 <code>H</code>
 <rate>5</rate>
 </discount>') as "doc"
       COLUMNS
          discount_code CHAR    PATH 'code',
          rate          DECIMAL PATH 'rate' ));

UPDATE TMP.DISCOUNT_CODE
SET (discount_code, rate) = (
    SELECT discount_code, rate
    FROM XMLTABLE('$doc/discount' PASSING 
                  XMLPARSE(DOCUMENT 
'<discount>
 <code>H</code>
 <rate>6</rate>
 </discount>') as "doc"
       COLUMNS
          discount_code CHAR    PATH 'code',
          rate          DECIMAL PATH 'rate' )
) WHERE discount_code = 'H';

Recreating of the sample table TMP.CUSTOMER, tested on DB2 9.7 and ORACLE 11g

CREATE TABLE TMP.CUSTOMER
(
ID int NOT NULL PRIMARY KEY,
FIRST_NAME varchar(255),
LAST_NAME  varchar(255),
LAST_ORDER TIMESTAMP,
TEL        DECIMAL(20)
);


INSERT INTO TMP.CUSTOMER(ID, FIRST_NAME, LAST_NAME,LAST_ORDER, TEL)
(SELECT id,first_name,last_name,CURRENT_TIMESTAMP,tel 
 FROM XMLTABLE('$doc/PhoneBook/Contact' PASSING 
                         XMLPARSE(DOCUMENT 
'<PhoneBook>
    <Contact id="1"        
             Name="Daniel" 
             Surname="Kec" 
             TEL="788255855"/>
    <Contact id="2"        
             Name="Martin" 
             Surname="Kec" 
             TEL="889665447"/>
    <Contact id="3"        
             Name="Eva"    
             Surname="Kec"    
             TEL="111222333"/>
    <Contact id="4" 
             Name="Alena"  
             Surname="Kec"    
             TEL="444555666"/>
    <Contact id="5" Name="Oliver" 
             Surname="Kec"    
             TEL="777888999"/>
    <Contact id="6" 
             Name="George" 
             Surname="Takei"  
             TEL="444882446"/>
    <Contact id="7" 
             Name="Jamie"  
             Surname="Oliver" 
             TEL="123456789"/>
</PhoneBook>') as "doc"
       COLUMNS
          id         INT         PATH '@id',
          first_name VARCHAR(50) PATH '@Name',
          last_name  VARCHAR(50) PATH '@Surname',  
          tel        DECIMAL(20) PATH '@TEL' ));

TMP.CUSTOMER
ID FIRST_NAME LAST_NAME LAST_ORDER TEL
1 Daniel Kec 2013-05-21 10:28:26.614 788255855
2 Martin Kec 2013-05-21 10:30:09.294 889665447
3 Eva Kec 2013-05-16 10:30:10.422 111222333
4 Alena Kec 2013-05-21 10:30:13.550 444555666
5 Oliver Kec 2013-05-01 10:30:14.486 777888999
6 George Takei 2013-02-05 10:30:19.246 444882446
7 Jamie Oliver 2012-07-17 10:30:23.174 123456789


Comparison of XML functions support across DB machines

Comparison is not easy because lot of functions can be replaced by different approach or another function. So missing “+” in the table below means in the most cases that function has an alternative, for example XMLCOLATTVAL is almost synonym to XMLFOREST, XMLROOT is almost same thing as XMLDOCUMENT and so on.

Date of release 2004 2006 2009 2012 2011 2003 2005 2010
funtions DB2 8.2 DB2 9.1 DB2 9.7 DB2 10.1 PostgreSQL 9.1 Oracle 10.1 Oracle 10.2 Oracle 11.2
XMLELEMENT + + + + + + + +
XMLNAMESPACES + + + +



XMLATTRIBUTES + + + + +
+ +
XMLAGG + + + + + + + +
XMLFOREST + + + + + + + +
XMLCOLATTVAL




+ + +
XMLSEQUENCE




+ + +
XMLSERIALIZE + + + +

+ +
XMLCONCAT + + + + + + + +
XMLTABLE
+ + +

+ +
XMLPARSE
+ + +

+ +
XMLDOCUMENT
+ + +



XMLROOT



+
+ +
XMLCOMMENT
+ + + +
+ +
XMLPI
+ + + +
+ +
XMLCDATA





+ +
XMLQUERY
+ + +

+ +
EXTRACT




+ + +
UPDATEXML




+ + +
XMLTEXT
+ + +



XMLVALIDATE
+ + +



XMLXSROBJECTID
+ + +



XMLGROUP

+ +



XMLROW

+ +



XSLTRANSFORM

+ +



XMLTRANSFORM




+ + +
XPATH



+


EXISTSNODE




+ + +
XMLEXISTS
+ + + +

+
XMLISVALID






+
XPATH_EXISTS



+


IS DOCUMENT



+


xml_is_well_formed



+


XMLTYPE




+ + +
I’m sorry for possible mistakes in this table it took hard googling to create comparison like this, if you find any inaccuracies leave the comment and I will fix it.

Resources:

Published at DZone with permission of its author, Daniel Kec.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)