XSLT Tutorial - Basics


Introduction

This is a beginner's tutorial for XSLT, made from slides.

Objectives
  • Understand the purpose of XSLT
  • Do simple transformations from XML to HTML
  • Understand the simplest XPath expressions (tag names)
Prerequisites
  • Editing XML (being able to use a simple DTD)
  • XML namespaces (some)
  • HTML and CSS (some)
Warning

XSLT is a rather complex transformation language. I believe that one could distinguish four levels of difficulty:

  • This tutorial is introductory (level 1)
  • Level 2 XSLT is more sophisticated template ordering, conditional expressions, loops, etc.
  • Level 3 is advanced XPath expressions
  • Level 4 is functional programming with templates

Introduction to Extensible Stylesheet Language Transformations

Goals of XSLT
  • XSLT is a transformation language for XML
  • XSLT is a W3C XML language (the usual XML well-formedness criteria apply)
  • XSLT can translate XML into almost anything, e.g.:
    • well-formed HTML (closed tags)
    • any XML, e.g. yours or other XML languages like SVG, X3D
    • non XML, e.g. RTF (a bit more complicated)

Xslt-basics-2.png

History and specifications

Specification
History
  • Initially, XSL (eXtensible Stylesheet Language) was a project to replace CSS for both display and print media and to provide support for complex formatting and layout features (pagination, indexing, cross-referencing, recursive numbering, etc.).
  • XSLT (Extensible Stylesheet Language Transformations) was originally intended as a small part of the larger specification for XSL
  • However, when the XSL specification draft became very large and complex it was decided to split the project into XSLT for transformations (that were urgently needed) and XSL for the rest (W3C recommendation of 2001)
Related languages

A first glance at XSLT

Root of an XSLT stylesheet file

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
....
</xsl:stylesheet>
Mandatory elements
  • XML declaration on top of the file (strictly speaking optional in XML 1.0, but recommended)
  • A stylesheet root tag with the following version and namespace attributes:
 <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  • XSLT must be well-formed (and also obey the XSLT specification)
  • XSLT files usually have the *.xsl extension and should be served with the text/xsl or application/xml MIME type when delivered over HTTP.

Association of XML and an XSLT file

An XSLT stylesheet is associated with an XML file through a processing instruction (similar to a CSS stylesheet)

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet href="project.xsl" type="text/xsl" ?>
<yourxml> .... </yourxml>

Basic XSLT

Basic (!) use of XSLT means:

  • writing translation rules (aka templates) for each XML tag we want to translate
  • translating XML to HTML
A simple translation rule (called "template" in XSLT)

Xslt-basics-3.png

Example: Translation of a title tag into a centered HTML H1

XML Source we want to translate:

<title>Hello friend</title>

The XSLT rule that does it:

Xslt-basics-4.png
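Since the rule itself is only shown as a picture, here is a minimal sketch of what such a template could look like. It assumes the same old-style HTML markup (align="center") that the rest of this tutorial uses, and it is wrapped in a stylesheet element so that it can be tried on its own:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rule (template) that fires for every "title" element -->
  <xsl:template match="title">
    <!-- HTML to be written to the output -->
    <h1 align="center">
      <!-- process what is inside title, here the text "Hello friend" -->
      <xsl:apply-templates/>
    </h1>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the title element above, this should produce an h1 element containing "Hello friend" (possibly preceded by an XML declaration, depending on the processor and its output settings).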

A complete XSLT example

(Hello XSLT)

XML file (source)
  • hello.xml
<?xml version="1.0"?>
<?xml-stylesheet href="hello.xsl" type="text/xsl"?>
<page>
 <title>Hello</title>
 <content>Here is some content</content>
 <comment>Written by DKS.</comment>
</page>
Wanted result document
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
<html>
  <head>
    <title>Hello</title>
  </head>
  <body bgcolor="#ffffff">
    <h1 align="center">Hello</h1>
    <p align="center"> Here is some content</p>
    <hr><i>Written by DKS.</i>
  </body>
</html>
The XSLT Stylesheet
  • hello.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="page">
   <html> <head> <title> <xsl:value-of select="title"/> </title> </head>
    <body bgcolor="#ffffff">
     <xsl:apply-templates/>
    </body>
   </html>
  </xsl:template>
  <xsl:template match="title">
   <h1 align="center"> <xsl:apply-templates/>  </h1>
  </xsl:template>
  <xsl:template match="content">
   <p align="center"> <xsl:apply-templates/> </p>
  </xsl:template>
  <xsl:template match="comment">
   <hr/> <i><xsl:apply-templates/> </i>
  </xsl:template>
</xsl:stylesheet>

Anatomy of a simple stylesheet

Xslt-basics-5.png

Rule execution order

(1) The XSLT engine first looks at the XML file and tries to find an XSLT rule that matches the root element

  • E.g. in the above example it will find "page" and then the template for page

(2) The XSLT processor will then "move" inside the rule element and do further processing

  • HTML elements (or any other tags) will be copied to the output document
  • If an XSLT instruction is found, it will be executed
 <xsl:apply-templates/>  means: "go and look for rules that match the children of the current element"

E.g. in the above example

  • the processor dealing with root element "page" will first find a rule for "title" and execute it according to the same principle.
  • once it is done with "title" and its children, it then will find the rule for "content" and execute it

(3) and so forth ....

More information

  • <xsl:value-of select="title"/> will retrieve contents of the "title" child element.
    • In our example, it would only work in the template for "page", since only "page" has a "title" child
  • You have to understand that XSLT works its way down the XML tree "depth-first", i.e.
    • it first deals with the rule for the root element,
    • then with the first instruction within this rule.
    • If the first instruction says "find other rules" it will then apply the first rule found for the first child element and so forth...
    • The rule of the root element is also the last one to be finished (since it must deal step-by-step with everything that is found inside) !!!
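As an illustration of this depth-first order, the source file of the hello example can be annotated with the sequence in which its rules start and finish. The numbered comments are only a reading aid added for this sketch; they are not part of the original file, and the stylesheet processing instruction is left out for brevity:

```xml
<?xml version="1.0"?>
<page>                                    <!-- 1. rule for "page" starts -->
 <title>Hello</title>                     <!-- 2. rule for "title" starts and finishes -->
 <content>Here is some content</content>  <!-- 3. rule for "content" starts and finishes -->
 <comment>Written by DKS.</comment>       <!-- 4. rule for "comment" starts and finishes -->
</page>                                   <!-- 5. rule for "page" finishes last -->
```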

The procedure recapitulated

(1) Create an XSLT stylesheet file: xxx.xsl

(2) Copy/paste the XSLT header and root element below (decide encoding as you like)

<?xml version="1.0" encoding="ISO-8859-1" ?> 
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</xsl:stylesheet>

(3) Write a rule that deals with your XML root element. This rule must produce the root, head and body of the HTML (copy/paste this too, but replace "page" with the name of your own root element)

<xsl:template match="page"> <html>
   <head> <title> <xsl:value-of select="title"/> 
     </title> 
   </head>
   <body bgcolor="#ffffff">
     <xsl:apply-templates/>
   </body>
  </html>
</xsl:template>

(4) Write rules for each (!!) of your XML elements,

  • for each insert some HTML, sometimes some text, or sometimes nothing
  • make sure to place an <xsl:apply-templates/> inside each rule (usually between some HTML) ... unless you wish to censor contents.
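The three kinds of rules mentioned in step (4) could look like the following sketch. The elements content and comment are taken from the hello example; internal-note is a hypothetical element, invented here only to show how contents can be suppressed:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rule that inserts some HTML around the contents -->
  <xsl:template match="content">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Rule that inserts HTML plus some fixed text -->
  <xsl:template match="comment">
    <p>Comment: <xsl:apply-templates/></p>
  </xsl:template>

  <!-- Empty rule (hypothetical element): nothing is written to the
       output and, since there is no xsl:apply-templates, the contents
       of this element are "censored" -->
  <xsl:template match="internal-note"/>

</xsl:stylesheet>
```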

(5) Associate this stylesheet with your XML file using:

<?xml-stylesheet href="xxx.xsl" type="text/xsl"?>

Tuning output with xsl:output and CSS

Output declarations

  • So far, the HTML output produced would display in a browser, but it is not fully HTML compliant.

xsl:output is an instruction that allows you to fine-tune XSLT translation output. Its definition is the following:

<xsl:output
method = "xml" | "html" | "text"
version = nmtoken
encoding = string
omit-xml-declaration = "yes" | "no"
standalone = "yes" | "no"
doctype-public = string
doctype-system = string
indent = "yes" | "no"
media-type = string />

  • You should put this instruction at the beginning of the file (right after the opening xsl:stylesheet tag)
Example - Output in HTML ISO-latin encoded
<xsl:output method="html"
    encoding="ISO-8859-1"
    doctype-public="-//W3C//DTD HTML 4.01 Transitional//EN"/>
Example - Output in XHTML transitional with a namespace
  • This is somewhat more complicated than producing simple HTML
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/xhtml" >
<xsl:output
   method="xml"
   doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
   doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
   indent="yes"
   encoding="iso-8859-1" />
<xsl:template match="recipe">
   <html xmlns="http://www.w3.org/1999/xhtml" >
    <head> ... </head> ... <body> ... </body>
   </html>
</xsl:template>
Example - Your XML
<xsl:output
 method="xml" indent="yes"
 doctype-system="mydtd.dtd" />
Example - Output in SVG
<xsl:output
 method="xml"
 indent="yes"
 standalone="no"
 doctype-public="-//W3C//DTD SVG 1.0//EN"
 doctype-system="http://www.w3.org/TR/2001/PR-SVG-20010719/DTD/svg10.dtd"
 media-type="image/svg+xml" />

CSS styling of HTML

Associating a CSS stylesheet with HTML output is trivial:

  • add a link tag in the "head" produced by the template for the root element
  • .... in the hello.css file you then have to define styles of HTML elements you generate
 <xsl:template match="hello">
  <html>
   <head>
    <link href="hello.css" type="text/css" rel="stylesheet"/>
   </head>
   ......
  </html>
 </xsl:template>
Example 3-5
cooking
  • cooking.xsl, cooking.xml and cooking-html.css
 <xsl:template match="recipe">
   <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
      <title> <xsl:value-of select="title"/> </title>
      <link href="cooking-html.css" type="text/css" rel="stylesheet"/>
    </head>
     <body bgcolor="#ffffff">
     <xsl:apply-templates/>
    </body>
   </html>
  </xsl:template>

If things go wrong

Frequent problems and remediation

Style-sheet error !
  • Validate the style-sheet in your XML editor
  • If it provides XSLT support, it will help you find the error spots
XHTML doesn't display in Firefox !
  • Firefox wants a namespace declaration in the XHTML produced, so add one (see above).
HTML doesn't seem to be right !
  • Transform the XML document within your XML editor and look at the HTML

In "Exchanger Lite", use Transform in the menu bar with the following parameters:

Transform->Execute Advanced XSLT
Input = current document
XSLT = Use Processing instructions
  • You also may validate the output HTML !
There is various unformatted text in the output !
  • See the XSLT default rule (below)
HTML still doesn't seem to be right !!
  • Use an XSLT debugger/tracer to understand how your XSLT executes

The XSLT default rule

  • When you test your first style sheet, it is likely that some of your contents will appear non-formatted.
  • This is due to the fact that XSLT will apply a default rule to all XML elements for which it didn't find a rule.
    • If you forget to write a rule for a tag (or misspell tag names) this will happen .....
  • The XSLT default rule simply copies the text contents of such elements (but not their tags) to the output.
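For reference, the built-in rules described in the XSLT 1.0 specification behave roughly as if the following templates were present in every stylesheet. This is a sketch for understanding only; you do not need to write these rules yourself:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Elements (and the document root) without a rule of their own:
       write nothing for the tag itself, just continue with the children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy their value to the output -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: ignored -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

This explains why the text of an element without a rule shows up unformatted: the element rule produces no markup, but the text rule still copies the characters found inside.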

A modified default rule that will help you find missing pieces

  • simply cut/paste this to your XSLT (but remove it later on)
 <xsl:template match="*">
  <dl><dt>Untranslated node:
      <strong><xsl:value-of select="name()"/></strong></dt>
  <dd>
   <xsl:copy>
     <xsl:apply-templates select="@*"/>
     <xsl:apply-templates select="node()"/>
   </xsl:copy>
 </dd>
 </dl>
 </xsl:template>


<xsl:template match="text()|@*">
  Contents: <xsl:value-of select="."/>
</xsl:template>

Selective processing

Steering rule execution and information filtering

  • Instead of letting the XSLT processor apply rules in "natural order", you can tell it which rules to apply and when.
Example 5-1
Hello without content
  • The rule for the root element will only "call" the rules for the "title" and the "comment" element
  • Information within a content tag will not be displayed (since we don't let the processor find rules by itself, but only let it execute a rule for "title" and another for "comment").
Hello2.xml
 <?xml version="1.0"?>
 <?xml-stylesheet href="hello2.xsl" type="text/xsl"?>
 <page>
  <title>Hello</title>
  <content>Here is some content</content>
  <comment>Written by DKS.</comment>
 </page>
Hello2.xsl
 <?xml version="1.0"?>
 <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:template match="page">
   <html> <head> <title> <xsl:value-of select="title"/> </title> </head>
    <body bgcolor="#ffffff">
     <xsl:apply-templates select="title"/>
     <xsl:apply-templates select="comment"/>
    </body>
   </html>
  </xsl:template>

  <xsl:template match="title">
   <h1 align="center"> <xsl:apply-templates/> </h1>
  </xsl:template>

  <xsl:template match="comment">
   <hr/> <i><xsl:apply-templates/></i>
  </xsl:template>

 </xsl:stylesheet>
Hello2.html
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
 <html>
  <head>
    <title>Hello</title>
  </head>
  <body bgcolor="#ffffff">
    <h1 align="center">Hello</h1>
    <hr><i>Written by DKS.</i>
  </body>
 </html>

A short glance at XPath

  • XPath is a very powerful language to extract information from XML
  • XPath was published at the same time as XSLT 1.0 (1999)
  • Values of XSLT match and select attributes are XPath expressions

Xslt-basics-6.png

  • XSLT beginners don't need to know a lot about XPath (so don't worry right now !).
    • Simply stick to the idea of writing a template for each XML tag, as explained before
  • XPath expressions can be more complicated:
<xsl:apply-templates select="course/module[position()=1]/section[position()=2]"/>

means: "find the rule for the 2nd section of the first module of course"

  • XPath also includes arithmetic and tests
"//Participant[string-length(Nom)>=8]"

means: "return all Participant nodes whose Nom child has contents of at least 8 characters (i.e. longer than 7)"
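To show where such an expression would go, here is a small sketch of a stylesheet that lists only the participants with a long name. The element names Participant and Nom come from the expression above; the list markup (ul, li) and the overall document structure are assumptions made for this illustration:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <ul>
      <!-- select only Participant elements whose Nom child
           has 8 or more characters -->
      <xsl:apply-templates select="//Participant[string-length(Nom) >= 8]"/>
    </ul>
  </xsl:template>

  <!-- one list item per selected participant -->
  <xsl:template match="Participant">
    <li><xsl:value-of select="Nom"/></li>
  </xsl:template>

</xsl:stylesheet>
```

Note that XPath also offers a shorthand for positions: module[1] means the same as module[position()=1].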

Examples of a few simple XPath expressions (optional !)
  • These should remind you of CSS selectors

Syntax element   (Type of path)        Example path        Example matches
tag              element name          project             <project> ...... </project>
/                separates children    project/title       <project> <title> ... </title>
                                       /                   (root element)
//               descendant            project//title      <project><problem> <title>....</title>
                                       //title             <racine>... <title>..</title> (any place)
*                "wildcard"            */title             <bla> <title>..</title> and <bli> <title>...</title>
|                "or" operator         title|head          <title>...</title> or <head> ...</head>
                                       *|/|@*              All elements: root, children and attributes
.                current element       .
../              parent element        ../problem          <project>
@                attribute name        @id                 <xyz id="test">...</xyz>
element/@attr    attribute of child    project/@id         <project id="test" ...> ... </project>
@attr='type'     type of attribute     list[@type='ol']    <list type="ol"> ...... </list>

Basic value extraction

xsl:value-of
  • computes the (string) value of an XPath expression and inserts it into the output
  • e.g. you can take contents of an element or attribute values and insert them in HTML table cells.
Example - Value-of
  • Let's assume that we have an author element whose contents we would like to put at the top of the page, and that we would also like to display the value of the revision attribute.
XML fragment
<page>
 <title>Hello</title>
 <content revision="10">Here is some content</content>
 <comment>Written by <author>DKS</author></comment>
</page>
XSLT rules
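<xsl:template match="page">
    <P><xsl:value-of select="comment/author" /></P>
    <!-- then call the rule for content, otherwise it would never be applied -->
    <xsl:apply-templates select="content"/>
</xsl:template>
<xsl:template match="content">
    <P>Revision number: <xsl:value-of select="@revision" /></P>
</xsl:template>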

Inserting a value inside a string

  • If you want to insert information inside an HTML attribute value, things get a little bit tricky.
There is a special syntax (called an "attribute value template" in the XSLT specification):

{....}

This is the equivalent of <xsl:value-of select="..."/>, which cannot be used inside an attribute value !!

Example 5-3
Building a href tag with an email
  • We will use both the {...} and the value-of select constructs.
The XML information
<contact-info email="test@test">
The XSLT rule
<xsl:template match="contact-info">
....
  <a href="mailto:{@email}"><xsl:value-of select="@email"/></a>
...
The result
<a href="mailto:test@test">test@test</a>

Dealing with pictures

There is no special "magic" for dealing with images, links, stylesheets etc. Simply:

  • look at your XML and figure out how to translate into equivalent HTML (or whatever else)
  • the following example demonstrates the use of value extraction
  • several other solutions than the one demonstrated exist ...
Example - Dealing with pictures
images.xml
 <?xml version="1.0"?>
 <?xml-stylesheet href="images.xsl" type="text/xsl"?>
 <page>
  <title>Hello Here are my images</title>
  <list>
    <!-- pictures are either contents or attribute values of elements -->
   <image>dolores_001.jpg</image>
   <image>dolores_002.jpg</image>

   <image3 source="dolores_002.jpg">Recipe image</image3>
  </list>
  <comment>Written by DKS.</comment>
 </page>
images.xsl
 <?xml version="1.0"?>
 <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template match="page">
   <html> <head> <title> <xsl:value-of select="title"/> </title> </head>
    <body bgcolor="#ffffff">
     <xsl:apply-templates/>
    </body>
   </html>
  </xsl:template>

  <xsl:template match="title">
   <h1 align="center"> <xsl:apply-templates/> </h1>
  </xsl:template>

  <!-- pictures are either contents or attribute values of elements -->
  <xsl:template match="list">
   Images are element contents, apply a template to all image elements:
   <xsl:apply-templates select="image"/>
   Images are attribute values of an element, we do it differently:
   <xsl:apply-templates select="image3"/>
  </xsl:template>

  <xsl:template match="image">
    <p> <img src="{.}"/> </p>
  </xsl:template>

  <xsl:template match="image3">
    <p> <img src="{@source}"/><br/><xsl:value-of select="."/> </p>
  </xsl:template>


  <xsl:template match="comment">
   <hr/> <i><xsl:apply-templates/></i>
  </xsl:template>

 </xsl:stylesheet>