XSL Transforms in the database
Previously, I wrote on how to extract XPath refs from an arbitrary XML document. Well, you can actually do this inside a database too - specifically Oracle 9i/10g with built-in XMLDB support.
Say we have XML data and XSL templates stored in a simple table:
CREATE TABLE x1 (item varchar(25) primary key, xml xmltype);
Where the data is stored with item="data", and the XSL template to extract paths to text is stored as item="xsl-to-text", then our transform may be executed as simply as this:
select
XMLTransform(
xml,
(select xml from x1 where item='xsl-to-text')
).getstringval() into v_out_text
from x1 where item='data';
dbms_output.put_line(v_out_text);
A full sample script is available here.
read more and comment..
Extracting XPath refs from an XML document
I was inspired by a recent post in the XMLDB Forum to look at the question of how to extract a complete list of XPaths and the associated text node values from an arbitrary XML file. I looked into an XSLT approach which I'll describe here.
Say we have an XML file like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<Library>
<Books>
<Book>
<Author>
<Last>Perry</Last>
<First>Anne</First>
</Author>
<Title>Long Spoon Lane</Title>
</Book>
</Books>
<Members>
<Member>
<Name>Paul</Name>
<Joined>2005-11-01</Joined>
</Member>
</Members>
</Library>
And our objective is to produce a listing like this:
/Library/Books/Book/Author/Last():Perry
/Library/Books/Book/Author/First():Anne
/Library/Books/Book/Title():Long Spoon Lane
/Library/Members/Member/Name():Paul
/Library/Members/Member/Joined():2005-11-01
After some investigation and reference to sites like Path Tracing and the XSLT 1.0 spec I arrived at what I think is the simplest xsl possible:
<?xml version="1.0" encoding="windows-1252" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements = "*" />
<xsl:template match="text()">
<xsl:for-each select="ancestor-or-self::*">
<xsl:text>/</xsl:text>
<xsl:value-of select="name()" />
</xsl:for-each>
<xsl:text>():</xsl:text>
<xsl:value-of select="." />
<xsl:text>
</xsl:text>
<xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
What is going on here?
Well, firstly note that we strip-spaces and then match on all text() nodes - this ensures we skip all the pure whitespace nodes.
The magic that generates the XPath is the the "for-each" over all "ancestor-or-self" elements which generates the XPath identifier. Then we simply add the text value on the end.
A variation on the XSL template that produces an XML structure instead of text is as follows. It really varies just in terms of output formatting:
<?xml version="1.0" encoding="windows-1252" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"/>
<xsl:strip-space elements = "*" />
<xsl:template match="/">
<items>
<xsl:apply-templates/>
</items>
</xsl:template>
<xsl:template match="text()">
<item>
<path>
<xsl:for-each select="ancestor-or-self::*">
<xsl:text>/</xsl:text>
<xsl:value-of select="name()" />
</xsl:for-each>
</path>
<value>
<xsl:value-of select="." />
<xsl:apply-templates/>
</value>
</item>
</xsl:template>
</xsl:stylesheet>
read more and comment..
Handling namespaces with DBMS_XMLDOM
The PL/SQL package dbms_xmlgen has been around for a while now, but it sort of suffers from a lack of doc and examples. Getting it to handle namespaces properly is a good example - it seems like it should be a bit more intelligent than it actually is!
Let's say we wanted to generate this:
<?xml version="1.0"?>
<a:gadgets xmlns:a="uri:a" xmlns:b="uri:b" >
<b:phones/>
</a:gadgets>
Part of the solution would look like this (enough to illustrate some key points):
...Here are the few guidelines I've "inferred" about DBMS_XMLDOM behaviour (note that this is all my reverse-engineered understanding, so I may be well off base!):
V_CURRENT_EL := DBMS_XMLDOM.CREATEELEMENT(doc => V_DOMDOCUMENT, tagName => 'gadgets', ns => 'uri:a');
DBMS_XMLDOM.SETATTRIBUTE(elem => V_CURRENT_EL, name => 'xmlns:a', newvalue => 'uri:a');
DBMS_XMLDOM.SETATTRIBUTE(elem => V_CURRENT_EL, name => 'xmlns:b', newvalue => 'uri:b');
V_CURRENT_CHILD_NODE := DBMS_XMLDOM.makeNode(V_CURRENT_EL);
DBMS_XMLDOM.SETPREFIX(n => V_CURRENT_CHILD_NODE, prefix => 'a');
V_CURRENT_NODE := DBMS_XMLDOM.APPENDCHILD(n => V_CURRENT_NODE, newChild => V_CURRENT_CHILD_NODE);
...
1. namespaceURI/ns params of createDocument, createElement methods just declare the namespace of the entity and have no practical impact on the generated xml
So the xml output would be the same even if you changed line 1 of the code above to:
V_CURRENT_EL := DBMS_XMLDOM.CREATEELEMENT(doc => V_DOMDOCUMENT, tagName => 'gadgets', ns => 'someotheruri');NB: you can verify the namespace is internally altered using DBMS_XMLDOM.GETNAMESPACE
2. to set the xmlns attributes of the document, you need to explicitly create the attribute. You need lines 3 and 4 of the code above.
3. to cause a node or attribute to be serialised with a namespace prefix, you need to explicitly request it with the SETPREFIX call e.g.
DBMS_XMLDOM.SETPREFIX(n => V_CURRENT_CHILD_NODE, prefix => 'a');
4. however, the namespace prefix will only be used in the serialised xml if you also set a namespaceURI/ns (from point 1).
5. there is no validation or verification that the namespace prefix you set (point 3) matches a namespace you have declared (points 1 and 2).
As you can see, DBMS_XMLDOM is really just a very thin wraper to programmatically generate XML at the lowest level. It leaves much of the "intelligence" for you to provide!
Bearing that in mind, you probably can think of many cases where using DBMS_XMLDOM is not the right answer, but is sort of brute force/naive.
If you have data in Oracle and want to produce some complex XML output, there are many smarter alternatives, such as defining an xmltype view over the source data and then doing an xsl transform into the desired document format. Aside from being much less tiresome than DBMS_XMLDOM, it has the benefit of separating presentation (the template) from the data. When you need to change the output format, you just change the template rather than hacking away at the DBMS_XMLDOM code.
read more and comment..
Complex SOAP::Lite requests - my rules for SOAP::Sanity!
Previously, I mentioned I'd come back to more complex request and response structures with SOAP::Lite.
Frankly, I haven't posted because I can't avoid the feeling that there's still a little more to unravel. Did I mention that good documentation is sparse? ;) Byrne and others have posted some good stuff (see for example the majordojo soaplite archive and this hands-on tour at builder.com), but mostly you'll find the experiences shared go along lines like "...after hacking around for a bit, I found that this worked...".
But is it possible to try and pin down at least a couple of guidelines for using SOAP::Data? Well, so far I can't claim to solving it all, but I am able to share a few anchors I've tried to plant for my own sanity!
My Rules for SOAP::Sanity
In the following "rules", $soap is a pre-initialised SOAP::Lite object, as in:
my $soap = SOAP::Lite->uri ( $serviceNs ) -> proxy ( $serviceUrl );
1. The value of a SOAP::Data element becomes the content of the XML entity.
It may seem bleeding obvious. Nevertheless, get this idea fixed in you head and it will help for more complex structures.
So if we are calling a "getHoroscope" service with the following request structure:
<getHoroscope>
<sign>Aries</sign>
</getHoroscope>
"Aries" is the value, i.e. the content, of the XML entity called "sign". Thus our request will look like this:
$data = SOAP::Data->name("sign" => 'Aries');
$som = $soap->getHoroscope( $data );
2. To create a hiearchy of entities, use references to a SOAP::Data structure.
In (1), the content of the entity was a simple string ("Aries"). Here we consider the case where we need the content to encapsulate more XML elements rather than just a string. For example a request with this structure:
<getHoroscope>
<astrology>
<sign>Aries</sign>
</astrology>
</getHoroscope>
Here "astrology" has an XML child element rather than a string value.
To achieve this, we set the value of the "astrology" element as a reference to the "sign" SOAP::Data object:
$data = SOAP::Data->name("astrology" =>
\SOAP::Data->name("sign" => 'Aries')
);
$som = $soap->getHoroscope( $data );
3. To handle multiple child entities, encapsuate as reference to a SOAP::Data collection.
In this case, we need our "astrology" element to have multiple children, for example:
<getHoroscope>
<astrology>
<sign>Aries</sign>
<sign>Pisces</sign>
</astrology>
</getHoroscope>
So a simple variation on (2). To achieve this, we collect the "Aries" and "Pisces" elements as a collection within an anonymous SOAP::Data object. We pass a reference to this object as the value of the "astrology" item.
$data = SOAP::Data->name("astrology" =>
\SOAP::Data->value(
SOAP::Data->name("sign" => 'Aries'),
SOAP::Data->name("sign" => 'Pisces')
)
);
$som = $soap->getHoroscope( $data );
4. Clearly distinguish method name structures from data.
This is perhaps just a style and clarity consideration. In the examples above, the method has been implicitly dispatched ("getHoroscope").
If you prefer (or need) to pass the method information to a SOAP::Lite call, I like to keep the method information distinct from the method data.
So for example, item (3) can be re-written (including some additional namespace handling) as:
$data = SOAP::Data->name("astrology" =>
\SOAP::Data->value(
SOAP::Data->name("sign" => 'Aries'),
SOAP::Data->name("sign" => 'Pisces')
)
);
$som = $soap->call(
SOAP::Data->name('x:getHoroscope')->attr({'xmlns:x' => $serviceNs})
=> $data
);
I prefer to read this than have it all mangled together.
That brings me to the end of my list of rules! I am by no means confident that there aren't more useful guidelines to be added, or that in fact the ones I have proposed above will even stand the test of time.
Nevertheless, with these four ideas clearly in mind, I find I have a fair chance of sitting down to write a complex SOAP::Lite call correctly the first time, rather than the trial and error approach I used to be very familiar with!
read more and comment..