XSLT for compound documents tutorial
Introduction
In principle, it should be easy to transform so-called compound documents with XSLT. In practice it is not, because (a) documentation can only be found by searching specialized XML web sites and (b) there are some very tricky issues. For now, this article just demonstrates the principle with a working "live" example.
- Learning goals
- Learn how to create XSLT that can handle XML documents that combine several vocabularies, for example combined XHTML + RDF/dc + your own XML documents
- Prerequisites
- Level and target population
- Intermediate XML/XSLT users
- Remarks
- Only works with more recent versions of Internet Explorer, i.e. version 9 or better. Should work well with older Firefox versions. Not tested with Safari.
XSLT can handle XML documents that include more than one namespace.
Principles:
- Declare all the namespaces found in the XML document at the top of the XSLT stylesheet
- If you produce XHTML, you must declare the XHTML namespace twice: once as the default namespace (for the output elements) and once with a prefix (for the XPath expressions in the XSLT rules)
- Each element name in an XPath expression must use a prefix, and that includes the XHTML ones!
Warning:
- The namespace URIs must be identical in the XML and the XSLT, character by character. One spelling mistake and nothing will work
- Really, every element name in an XPath expression must have a prefix, i.e. everything you write in "match" and "select" attributes. Attributes that carry no prefix in the XML (the usual case) are written without one. See the example below.
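The effect of a missing prefix can be seen with a minimal sketch like the following. It is not one of the original example files; it assumes an XHTML input document such as the one in the Examples section and simply counts what each expression selects.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="h">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <ul>
      <!-- Unprefixed names mean "in no namespace" in XPath 1.0:
           this selects nothing in an XHTML document -->
      <li>without prefix: <xsl:value-of select="count(html/head/title)"/></li>
      <!-- Prefixed names are resolved to the XHTML namespace URI:
           this selects the title element -->
      <li>with prefix: <xsl:value-of select="count(h:html/h:head/h:title)"/></li>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

For an XHTML input with one title, the first count should be 0 and the second 1. The default namespace declaration only affects the literal output elements (here `ul` and `li`), never the XPath expressions. The `exclude-result-prefixes` attribute is optional; it keeps the unused `h` declaration out of the output.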
The XSLT engine of your browser may not be able to handle namespaces. In that case, transform in your XML editor or use a server-side solution. E.g. in PHP, do it like this:
<?php
# Made by DKS in 2005, still works in 2015. Replace "YOUR" with your file names.
# Requires the PHP xsl extension (XSLTProcessor).
error_reporting(E_ALL);
$xml_file = 'YOUR.xml';
$xsl_file = 'YOUR.xsl';
// load the XML file (and test first if it exists)
// note the double quotes: single quotes would print the variable name literally
if (!file_exists($xml_file)) exit("Failed to open $xml_file");
$dom_object = new DOMDocument();
$dom_object->load($xml_file);
// create a DOM object for the XSL stylesheet and configure the transformer
if (!file_exists($xsl_file)) exit("Failed to open $xsl_file");
$xsl_obj = new DOMDocument();
$xsl_obj->load($xsl_file);
$proc = new XSLTProcessor();
$proc->importStylesheet($xsl_obj); // attach the XSL rules
$html_fragment = $proc->transformToXML($dom_object);
print($html_fragment);
Examples
XHTML with RDF, Dublin core and our own soup
Tested in April 2013 with IE9, Chrome and Firefox 20 under Windows 7. In principle, this should work with all modern browsers.
Live files:
XML input:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<?xml-stylesheet href="compound-cd-list.xsl" version="1.0" type="text/xsl"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Compound XML document demo</title>
</head>
<body>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:Description rdf:about="http://edutechwiki.unige.ch/XSLT_for_compound_documents_tutorial">
<dc:title>XSLT for compound documents demo</dc:title>
<dc:creator>DKS</dc:creator>
<dc:format>XHTML + private XML + DC</dc:format>
<dc:rights>Free as in free beer</dc:rights>
</dc:Description>
</rdf:RDF>
<p>This is an XML document with a compound vocabulary,
i.e. XHTML + CD-list + Dublin Core metadata.
Read the <a href="http://edutechwiki.unige.ch/en/XSLT_for_compound_documents_tutorial">XSLT for compound documents tutorial</a></p>
<p>Stuff below belongs to a "my" namespace. Stuff at the bottom are "dc" contents. Output is ugly, no time for styling - DKS/4/2013.</p>
<hr/>
<my:cd-list xmlns:my="http://edutechwiki.unige.ch/XML">
<my:title>My (reduced) Hard Bop list</my:title>
<my:cd>
<my:artist>John Coltrane</my:artist>
<my:title>Blue Train</my:title>
<my:genre>Jazz</my:genre>
<my:description>From Wikipedia: ...... </my:description>
<my:track-list>
<my:track no="1">
<my:title>Blue Train</my:title>
<my:artist>John coltrane</my:artist>
<my:genre>Blues</my:genre>
</my:track>
<my:track no="2">
<my:title>Moment's Notice</my:title>
<my:artist>John coltrane</my:artist>
<my:genre>Hard Bop</my:genre>
</my:track>
</my:track-list>
</my:cd>
<my:cd>
<my:artist>Art Blakey</my:artist>
<my:title>Moanin'</my:title>
</my:cd>
</my:cd-list>
</body>
</html>
XSLT file:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:h="http://www.w3.org/1999/xhtml"
xmlns="http://www.w3.org/1999/xhtml"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:my="http://edutechwiki.unige.ch/XML"
version="1.0">
<xsl:output method="xml"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" indent="yes"/>
<xsl:template match="h:html">
<html>
<head>
<title>
<xsl:value-of select="h:head/h:title"/>
</title>
</head>
<body bgcolor="#FFFFFF">
<xsl:apply-templates select="h:body"/>
</body>
</html>
</xsl:template>
<xsl:template match="h:body">
<!-- first, all HTML tags -->
<xsl:apply-templates select="h:*"/>
<!-- then the CD list -->
<xsl:apply-templates select="my:cd-list"/>
<!-- Metadata at the end, skip the RDF part -->
<xsl:apply-templates select="rdf:RDF/dc:Description"/>
</xsl:template>
<!-- CD list contents -->
<xsl:template match="my:cd-list">
<h1><xsl:value-of select="my:title"/></h1>
<xsl:apply-templates select="my:cd"/>
</xsl:template>
<xsl:template match="my:cd">
<h3><xsl:value-of select="my:artist"/>:
<xsl:value-of select="my:title"/> -
<xsl:value-of select="my:genre"/>
</h3>
<p><xsl:value-of select="my:description"/></p>
<xsl:apply-templates select="my:track-list"/>
</xsl:template>
<xsl:template match="my:track-list">
<ol>
<xsl:apply-templates select="my:track"/>
</ol>
</xsl:template>
<xsl:template match="my:track">
<li>
<xsl:value-of select="my:title"/> -
<xsl:value-of select="my:artist"/> -
<xsl:value-of select="my:genre"/>
</li>
</xsl:template>
<!-- metadata -->
<xsl:template match="rdf:RDF/dc:Description">
<hr/>
<p style="font-size:60%;">Meta data:
Title:<xsl:value-of select="dc:title"/> -
Creator: <xsl:value-of select="dc:creator"/> -
Format: <xsl:value-of select="dc:format"/> -
Copyright: <xsl:value-of select="dc:rights"/>
</p>
</xsl:template>
<!-- HTML tags and contents are just copied -->
<xsl:template match="h:*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Using XSLT with documents that have a single namespace
The same principle applies: you must use prefixes in your XSLT code. See also XML Schema tutorial - Basics
XSL
<xsl:stylesheet
xmlns:oms="http://www.ibm.com/software/analytics/spss/xml/oms"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
exclude-result-prefixes="oms">
<!-- No default namespace here: with method="html" the literal result elements must be in no namespace -->
<xsl:output method="html" indent="yes"/>
<xsl:template match="oms:outputTree">
<html>
<head>
<title>SPSS Codebook</title>
</head>
<body bgcolor="#ffffff">
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
.....
<xsl:template match="oms:pivotTable/oms:dimension//oms:category[@text='Measurement']/oms:dimension/oms:category/oms:cell[@text='Nominal']">
.........
<xsl:apply-templates/>
</xsl:template>
XML (a codebook file in XML generated by SPSS)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="spss-codebook.xsl"?>
<outputTree
xmlns="http://www.ibm.com/software/analytics/spss/xml/oms"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ibm.com/software/analytics/spss/xml/oms http://www.ibm.com/software/analytics/spss/xml/oms/spss-output-1.8.xsd">
<command command="Codebook" displayOutlineValues="label" displayOutlineVariables="label" displayTableValues="label" displayTableVariables="label" lang="en" text="Codebook">
<pivotTable subType="Variable Information" text="id">
<dimension axis="row" text="Attributes">
<group text="Standard Attributes">
<category text="Position">
<dimension axis="column" text="Values">
<category text="Value">
<cell number="1" text="1"/>
</category>
</dimension>
</category>
.......
A live example should be here: http://tecfa.unige.ch/proj/ccl/ILICS
Alternatively, you could use XPath functions such as name() instead of prefixes, e.g.
The following two expressions are meant to retrieve the same elements (but I did not test the second expression, and as written it spells out the full path and does not check for the 'Measurement' category)
oms:pivotTable/oms:dimension//oms:category[@text='Measurement']/oms:dimension/oms:category/oms:cell[@text='Nominal']
/*[name()='outputTree']/*[name()='command']/*[name()='pivotTable']/*[name()='dimension']/*[name()='group']/*[name()='category']/*[name()='dimension']/*[name()='category']/*[name()='cell'][@text='Nominal']
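As a hedged sketch of the function-based approach, the template below matches `cell` elements without declaring any prefix for the source vocabulary, by testing `local-name()` and `namespace-uri()` in a predicate. The stylesheet skeleton and the `<b>` output are illustrative assumptions and not part of the original codebook stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Match every "cell" element in the SPSS OMS namespace
       whose text attribute is 'Nominal', wherever it occurs. -->
  <xsl:template match="*[local-name()='cell'
      and namespace-uri()='http://www.ibm.com/software/analytics/spss/xml/oms'
      and @text='Nominal']">
    <!-- Illustrative output only -->
    <b><xsl:value-of select="@text"/></b>
  </xsl:template>

</xsl:stylesheet>
```

`local-name()` is used here rather than `name()` because `name()` returns the name as written in the source document, including any prefix, so a test like `name()='cell'` would stop matching if the XML were written with a prefix such as `oms:cell`. The prefixed form remains shorter and easier to read for anything beyond a one-off expression.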
Links
- XSLT and namespaces
- Handling namespaces (Jeni Tennison, btw. a good resource for all XSLT problems).
- Avoid common XSLT mistakes by Jirka Kosek, Dec 2008 (retrieved 4/2013)
- RDF and Dublin Core (older version)