XQuery tutorial - basics


Introduction

This is a beginners' tutorial for XQuery.

Prerequisites:

Recommended reading, before this:

XQuery is a language for querying XML. It can handle the following kinds of XML data (both single documents and collections):

  • files
  • XML databases
  • XML fragments in memory

XQuery can express queries across all these kinds of data, whether physically stored in XML or viewed as XML via middleware. In other words, you can use it to retrieve things from files, from XML representations made from SQL databases or from native XML databases like eXist. In addition, XQuery is a full programming language.

In simpler terms, the W3C XML Query Working Group advertised that “XQuery is a standardized language for combining documents, databases, Web pages and almost anything else. It is very widely implemented. It is powerful and easy to learn. XQuery is replacing proprietary middleware languages and Web Application development languages. XQuery is replacing complex Java or C++ programs with a few lines of code. XQuery is simpler to work with and easier to maintain than many other alternatives. Do more with less.”

XQuery:

  • relies on the XPath (version 2.0) language
  • can generate new XML/HTML documents

XQuery 1.0 does not define updating. You may use the XQuery Update Facility, which is an XQuery extension. As of Jan 2010, this was not yet a full recommendation, but it was implemented at least partially in some engines.

Standards:

http://www.w3.org/TR/xquery/ (XQuery 1.0: An XML Query Language, W3C Recommendation 23 January 2007, stable)
http://www.w3.org/TR/xquery-operators/ (XQuery 1.0 and XPath 2.0 Functions and Operators, W3C Recommendation 23 January 2007)
http://www.w3.org/TR/xqupdate/ (XQuery Update Facility 1.0, W3C Candidate Recommendation 09 June 2009)
XML Syntax for XQuery 1.0 (XQueryX) (barely readable for humans, but practical for machine parsing of query statements).
Authors of this standard: W3C XML Query Group

Use contexts

In order to play with XQuery, you need an XQuery processor. These are available in several forms:

  • Included in some command line/library products like Saxon. See the Shell script article for an explanation of how to use it under Windows.
  • Often, one of these processors is included in your XML editor, i.e. you should find XQuery processing under one of the menu items.
  • Some programming languages also include XQuery libraries. Otherwise they exist as external libraries for most programming languages.

In practical terms, XQuery files can be executed:

  • From the command line
  • Through an administration tool (e.g. a Web tool or a local XML editor)
  • Through some web applications

Basic XQuery

Let's recall that XQuery builds on XPath expressions. Below are some simple examples that demonstrate this principle. Notice that these simple queries will not produce a well-formed XML document, but lists of well-formed XML fragments.

You should understand the following XPath expressions. If not, read the XPath tutorial - basics again.

Position in a tree
ex: returns all <card-propvalue> nodes under <c3mssoft-desc> ...
 //c3msbrick//c3mssoft-desc//card-propvalue
Comparison
ex: returns all <c3mssoft> nodes whose id attribute equals "epbl"
 //c3msbrick//c3mssoft[@id="epbl"]
Functions
ex: text() only returns the text content of a node
 //c3msbrick/title/text()
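
The three kinds of XPath expressions can be tried out without an external file by binding some data to a variable. The following is only a sketch: the inline <c3msbrick> data and the variable name $data are made up for illustration and are much simpler than the real catalog.

```xquery
xquery version "1.0";

(: Hypothetical sample data, bound to a variable so that the query is self-contained :)
let $data :=
  <c3msbrick>
    <title>Gallery</title>
    <c3mssoft id="epbl"><title>ePBL</title></c3mssoft>
    <c3mssoft id="photo"><title>PhotoShare</title></c3mssoft>
  </c3msbrick>
return (
  (: position in a tree: all <title> elements below a <c3mssoft> :)
  $data//c3mssoft//title,
  (: comparison: the <c3mssoft> whose id attribute equals "epbl" :)
  $data//c3mssoft[@id = "epbl"],
  (: text(): only the text content of the brick title :)
  $data/title/text()
)
```

The result is a sequence of items (two <title> elements, one <c3mssoft> element and one text node), not a single well-formed document.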

Let's now examine a few simple XQuery examples that will use a for - return construct.

Find all nodes with path //topic/title

The following expression returns all <title> elements that are children of a <topic> element, from anywhere in an XML document.

 for $t in //topic/title
    return $t

The next example shows how to find all nodes of //topic/title in a file called "catalog.xml". fn:doc is one of the standard XQuery 1.0 and XPath 2.0 functions.

 for $t in fn:doc("catalog.xml")//topic/title
    return $t

The next example does the same, but retrieves the file via HTTP:

 for $t in fn:doc("http://tecfa.unige.ch/proj/seed/catalog/net/xml/catalog-eng.xml")//topic/title
    return $t

Result for all examples

 <title>TECFA Seed Catalog</title>
 <title>Introduction</title>
 <title>Conceptual and technical framework</title>
 <title>The socio-constructivist  .... etc.
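
To try the for - return construct without any external file, the data can also be bound to a variable inside the query. This is a minimal sketch; the inline <catalog> data and the variable name $catalog are assumptions made for illustration.

```xquery
xquery version "1.0";

(: Hypothetical inline data standing in for catalog.xml :)
let $catalog :=
  <catalog>
    <topic><title>Introduction</title></topic>
    <topic><title>Conceptual and technical framework</title></topic>
  </catalog>
(: iterate over every <title> that is a child of a <topic> :)
for $t in $catalog//topic/title
return $t
```

This returns the two <title> elements, one per iteration.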

Wrapping a node list - creating well-formed XML

<result>
  { for $t in 
    fn:doc("http://tecfa.unige.ch/guides/xml/examples/shakespeare.1.10.xml/hamlet.xml")//ACT//SCENE/TITLE
   return $t }
</result>

Result would look like this:

<?xml version="1.0" encoding="UTF-8"?>
<result>
  <TITLE>SCENE I.  Elsinore. A platform before the castle.</TITLE>
  <TITLE>SCENE II.  A room of state in the castle.</TITLE>
....
</result>

FLWOR expressions

Let's now introduce the core construct of a typical XQuery expression, known as FLWOR (pronounced "flower"), which stands for "For-Let-Where-Order-Return".

This construct is similar to select-from-where-..-order-by that is used in SQL.

The FLWOR elements:

for = iteration on a list of XML nodes
let = binding of a result to a local variable
where = selection condition
order by = sorting
return = expression that will be returned

for:

for $variable in expression_search
return $variable

"for" binds $variable, in turn, to each XML fragment found by the expression, which is typically an XPath expression.

Example:

for $t in //topic/title return $t

let:

"let" binds a value (e.g. a node or a list of nodes) to a variable

Example without let:

for $t in fn:doc("catalog09.xml")//c3msbrick//c3mssoft-desc//card-propvalue
return $t

Same example with let:

For each node $t found, we bind $desc to its subnodes.
for $t in fn:doc("catalog09.xml")//c3msbrick
  let $desc := $t//c3mssoft-desc//card-propvalue
return $desc

Counting, for each node $t found in the "for" loop we count the number of subnodes:

for $t in fn:doc("catalog09.xml")//c3msbrick
let $n := count($t//c3mssoft)
return <result> {$t/title/text()} owns {$n} bricks </result>
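
The counting example can be reproduced without catalog09.xml. The sketch below assumes a tiny made-up catalog (the brick and software names are hypothetical) bound to a variable $doc.

```xquery
xquery version "1.0";

(: Hypothetical data: two bricks with a different number of software entries :)
let $doc :=
  <catalog>
    <c3msbrick>
      <title>Gallery</title>
      <c3mssoft><title>SoftA</title></c3mssoft>
      <c3mssoft><title>SoftB</title></c3mssoft>
    </c3msbrick>
    <c3msbrick>
      <title>Wiki</title>
      <c3mssoft><title>SoftC</title></c3mssoft>
    </c3msbrick>
  </catalog>
for $t in $doc//c3msbrick
(: $n is recomputed for each brick bound to $t :)
let $n := count($t//c3mssoft)
return <result>{$t/title/text()} owns {$n} bricks</result>
```

With this data, the query yields <result>Gallery owns 2 bricks</result> followed by <result>Wiki owns 1 bricks</result>.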

where

defines a selection condition

Same as above, but we only return results with at least 2 c3mssoft

for $t in fn:doc("catalog09.xml")//c3msbrick
  let $n := count($t//c3mssoft)
  where ($n > 1)
return <result> {$t/title/text()} owns {$n} bricks </result>

order by:

sorts the results

Example of an alphabetic sorting:

for $t in //topic/title order by $t return $t

A more complex example, sorted numerically by a count:

for $t in fn:doc("catalog09.xml")//c3msbrick
let $n := count($t//c3mssoft)
where ($n > 1)
order by $n
return <result> {$t/title/text()} owns {$n} bricks </result>
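
The interplay of where and order by can be checked on a small inline data set. This is a sketch under assumptions: the three bricks and their empty <c3mssoft/> children are invented for illustration.

```xquery
xquery version "1.0";

(: Hypothetical data: Forum has 1, Gallery has 3 and Wiki has 2 software entries :)
let $doc :=
  <catalog>
    <c3msbrick><title>Forum</title><c3mssoft/></c3msbrick>
    <c3msbrick><title>Gallery</title><c3mssoft/><c3mssoft/><c3mssoft/></c3msbrick>
    <c3msbrick><title>Wiki</title><c3mssoft/><c3mssoft/></c3msbrick>
  </catalog>
for $t in $doc//c3msbrick
let $n := count($t//c3mssoft)
(: drop bricks with fewer than 2 entries, then sort ascending by the count :)
where $n > 1
order by $n
return <result>{$t/title/text()} owns {$n} bricks</result>
```

Forum is filtered out by the where clause, and the remaining results come out as Wiki (2) followed by Gallery (3). Adding the keyword descending after $n would reverse that order.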

Below is a slightly more complex return clause, which also includes the <title> elements of the <c3mssoft> elements:

for $t in fn:doc("catalog09.xml")//c3msbrick
let $brick_softs := $t//c3mssoft
let $n := count($brick_softs)
where ($n > 0)
order by $n
return <result> 
      For {$t/title/text()} we found {$n} softwares: 
         {$brick_softs//title} 
      </result>
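
return

builds the expression to return

Warning: Each iteration should return a well-formed fragment, e.g. a single element. Literal text mixed with expressions, as in the "Bad" example below, must be wrapped in an element constructor, with the expressions enclosed in { ... }.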

Good:

for $t in fn:doc("catalog09.xml")//c3msbrick
let $n := count($t//c3mssoft)
return <result> 
       {$t/title/text()} owns {$n} bricks 
       </result>

Bad:

for $t in fn:doc("catalog09.xml")//c3msbrick
let $n := count($t//c3mssoft)
return $t/title/text() owns $n bricks 
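
If no wrapping element is wanted, one possible way to repair the "Bad" version is to build a string with the standard fn:concat function. The sketch below assumes a small made-up catalog bound to $doc; it is only one of several ways to fix the query.

```xquery
xquery version "1.0";

(: Hypothetical data standing in for catalog09.xml :)
let $doc :=
  <catalog>
    <c3msbrick><title>Gallery</title><c3mssoft/><c3mssoft/></c3msbrick>
    <c3msbrick><title>Forum</title><c3mssoft/></c3msbrick>
  </catalog>
for $t in $doc//c3msbrick
let $n := count($t//c3mssoft)
(: concat() atomizes its arguments, so the title node and the number become strings :)
return concat($t/title, " owns ", $n, " bricks")
```

This returns the two strings "Gallery owns 2 bricks" and "Forum owns 1 bricks" instead of XML elements.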

Using XQuery

Creating well-formed XML result fragments

The XML result of an XQuery should usually be a single node (not a list). Let's first look at a multi-fragment version (collection):

for $t in //c3msbrick/title
   return $t

This will return a list (a so-called collection):

 <title class="- topic/title ">TECFA Seed Catalog</title>
 <title class="- topic/title ">Introduction</title>
 <title class="- topic/title ">Conceptual and technical framework</title>
 <title class="- topic/title ">The socio-constructivist approach</title>
 ....

This sort of list is not well-formed XML, and cannot be displayed in a browser, for example.

... but this is not a problem if it is dealt with by some program (e.g. a PHP script querying a DOM tree).

Now let's see the single fragment version. The following expression:

 <result> 
  { for $t in //topic/title/text()
  return <titre>{$t}</titre> }
 </result>

returns:

 <result>
 <titre>TECFA Seed Catalog</titre>
 <titre>Introduction</titre>
 <titre>Conceptual and technical framework</titre>
 <titre>The socio-constructivist approach</titre>
 ....
 </result>

  • We replaced "title" tags by "titre" tags (just to make this a bit more complicated).
  • All expressions within { ... } are evaluated as XQuery,
  • but all <tags> are copied as is.
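
The difference between evaluated and copied content can be seen in a very small query. In this sketch, the element names <demo>, <evaluated> and <literal> and the variable $n are made up; doubled braces are the XQuery way to write a literal brace inside an element constructor.

```xquery
xquery version "1.0";

let $n := 3
return
  <demo>
    <evaluated>{$n + 1}</evaluated>
    <literal>{{$n + 1}}</literal>
  </demo>
```

The first child contains the computed value 4, while the second contains the literal text {$n + 1}, because {{ and }} stand for plain brace characters.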

Multiple for loops

You may first have a look at the XML source file from which we shall retrieve data:

http://www.dbis.informatik.uni-goettingen.de/Mondial/mondial.xml

<result>
 { for $country in fn:doc ("http://www.dbis.informatik.uni-goettingen.de/Mondial/mondial.xml")/mondial/country,
   $ethnicgroup in $country/ethnicgroups
return
 <item>
     <country> {$country/name/text()} </country>
     <ethnicgroup>  {$ethnicgroup/@percentage}  {$ethnicgroup/text()}</ethnicgroup>
 </item>
}
</result>

The result may not be what you want (each country can appear more than once):

<result>
<item><country>Albania</country><ethnicgroup percentage="3">Greeks</ethnicgroup></item>
<item><country>Albania</country><ethnicgroup percentage="95">Albanian</ethnicgroup></item>
<item><country>Greece</country><ethnicgroup percentage="98">Greek</ethnicgroup></item>
....
</result>

So here is another example that creates items per country.

<result>
{ for $country in fn:doc ("http://www.dbis.informatik.uni-goettingen.de/Mondial/mondial.xml")/mondial/country
  return
   <item>
     <country> {$country/name/text()} </country>
    { for $ethnicgroup in $country/ethnicgroups
      return
        <ethnicgroup>  {$ethnicgroup/@percentage}  {$ethnicgroup/text()}</ethnicgroup>
     } 
   </item>
}
</result>

Result looks like:

<result>
<item>
  <country>Albania</country>
  <ethnicgroup percentage="3">Greeks</ethnicgroup>
  <ethnicgroup percentage="95">Albanian</ethnicgroup>
</item>
<item>
  <country>Greece</country>
  <ethnicgroup percentage="98">Greek</ethnicgroup></item>
<item>
  <country>Macedonia</country>
  <ethnicgroup percentage="22">Albanian</ethnicgroup>
  <ethnicgroup percentage="2">Serb</ethnicgroup>
  <ethnicgroup percentage="65">Macedonian</ethnicgroup>
  <ethnicgroup percentage="4">Turkish</ethnicgroup>
  <ethnicgroup percentage="3">Gypsy</ethnicgroup>
</item>
....
</result>

Here is another example file to query (made by the author some years ago). You may look at it before you study the XQuery example code.

http://tecfa.unige.ch/proj/seed/catalog/net/xml/catalog-eng.xml

The following example shows how to generate some text that includes a count variable:

 <result>
 <title>List of C3MSBricks and associated Software</title>
 { for $t in fn:doc("http://tecfa.unige.ch/proj/seed/catalog/net/xml/catalog-eng.xml")//c3msbrick
   let $brick_softs := $t//c3mssoft
   let $n := count($brick_softs)
   where ($n > 0)
   order by $n descending
   return 
     <brick> For {$t/title/text()} we have the following {$n} modules:
        { for $soft in $brick_softs
          return <soft> {$soft//title/text()}</soft>
        }
     </brick>
 }
 </result>

Some explanations / reminders:

  • A FLWOR expression embedded in an element constructor must be within { ... }
  • The "return" clause of the outer loop includes an inner loop that deals with each software entry in $brick_softs
  • $brick_softs contains a collection of <c3mssoft> elements from which we extract titles
  • We also sort the results

Result:

<result>
<title>List of C3MSBricks and associated Software</title>
<brick> For User statistics we have the following 6 modules:
  <soft>pnProdAct</soft>
  <soft>commArt</soft>
  <soft>pncUserPoints</soft>
  <soft>pncSimpleStats </soft>
  <soft>Statistics module</soft>
  <soft>NS-User_Points</soft>
 </brick>
<brick> For Gallery we have the following 5 modules:
 <soft>PhotoShare</soft>
 <soft>Photoshare combined with PageSetter</soft>
 <soft>My_eGallery</soft>
 <soft>Coppermine </soft>
 <soft>Gallery </soft>
</brick>
.....
</result>

XHTML output

Let's pull in an XML file, extract some information from it and render the result in HTML.

xquery version "1.0";

let $source_doc := fn:doc("http://tecfa.unige.ch/guides/xml/examples/shakespeare.1.10.xml/hamlet.xml")
return
<html>
 <head> <title>Questionnaire Items</title> </head>
 <body>
  Acts of Hamlet
  <ol>
  { for $t in $source_doc//ACT//SCENE/TITLE
   return 
   <li> {$t/text()} </li>
   }
   </ol> 
 </body>
</html>

This would produce something like:

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <title>Questionnaire Items</title>
   </head>
   <body>
      Acts of Hamlet
      <ol>
         <li>SCENE I.  Elsinore. A platform before the castle.</li>
         <li>SCENE II.  A room of state in the castle.</li>
         <li>SCENE III.  A room in Polonius' house.</li>
         <li>SCENE IV.  The platform.</li>
         <li>SCENE V.  Another part of the platform.</li>
         <li>SCENE I.  A room in POLONIUS' house.</li>
         <li>SCENE II.  A room in the castle.</li>
         <li>SCENE I.  A room in the castle.</li>
         .....
      </ol>
   </body>
</html>


The next example is somewhat inspired by XQuery/Displaying data in HTML Tables of the nice XQuery wikibook.

It's the equivalent of the example discussed in the PHP - MySQL - XML tutorial - basics. Instead of using XSLT to format a typical database result table, we use XQuery.

The XML must have the following structure: <result><row>....</row> <row>....</row> </result>. Inside the row element, you may have any number of XML elements.

xquery version "1.0";

let $xml_source_doc :=
    fn:doc('http://tecfa.unige.ch/guides/xml/examples/mysql-php-xml/testfile.xml')
 
return
<html>
  <head> <title>Questionnaire Items</title> </head>
  <body>

  This example shows how to render a table-like XML structure as HTML table. 

    <table>
    <thead>
      <tr>
      { let $a_row := $xml_source_doc/result/row[1]
        for $element in $a_row/*
        return
           <th>{ fn:node-name($element) }</th>
      }	   	
      </tr>
    </thead>
    <tbody>
     {
       for $row at $count in $xml_source_doc/result/row
       return
       <tr> 
       { if ($count mod 2) then (attribute {'bgcolor'} {'Lavender'}) else () }
       { for $cell in $row/*
        return
           <td>{ $cell }</td>
       }
       </tr>
       }
      </tbody>
     </table>
   </body>
</html>
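
The row striping above relies on two features: the positional variable introduced by "at" and a computed attribute constructor. The following sketch isolates them on made-up data; the $rows sequence and the class attribute are assumptions, not part of the original example.

```xquery
xquery version "1.0";

(: Hypothetical rows; in the example above they come from the XML source document :)
let $rows := (<row>a</row>, <row>b</row>, <row>c</row>)
return
  <table>{
    for $row at $count in $rows
    return
      <tr>{
        (: the attribute must come before any child elements of <tr> :)
        if ($count mod 2 = 1) then attribute class {"odd"} else (),
        <td>{$count}</td>,
        <td>{$row/text()}</td>
      }</tr>
  }</table>
```

Rows 1 and 3 receive class="odd", and row 2 gets no attribute because the else branch returns the empty sequence.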


A complete example

This example has been made by Daniel Dean, exchange student at Webster Geneva, spring 1 2008.

Consider the following XML:

<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
   </book>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description>
   </book>
   <book id="bk103">
      <author>Corets, Eva</author>
      <title>Maeve Ascendant</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-11-17</publish_date>
      <description>After the collapse of a nanotechnology 
      society in England, the young survivors lay the 
      foundation for a new society.</description>
   </book>
   <book id="bk104">
      <author>Corets, Eva</author>
      <title>Oberon's Legacy</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-03-10</publish_date>
      <description>In post-apocalypse England, the mysterious 
      agent known only as Oberon helps to create a new life 
      for the inhabitants of London. Sequel to Maeve 
      Ascendant.</description>
   </book>
   <book id="bk105">
      <author>Corets, Eva</author>
      <title>The Sundered Grail</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-09-10</publish_date>
      <description>The two daughters of Maeve, half-sisters, 
      battle one another for control of England. Sequel to 
      Oberon's Legacy.</description>
   </book>
   <book id="bk106">
      <author>Randall, Cynthia</author>
      <title>Lover Birds</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2000-09-02</publish_date>
      <description>When Carla meets Paul at an ornithology 
      conference, tempers fly as feathers get ruffled.</description>
   </book>
   <book id="bk107">
      <author>Thurman, Paula</author>
      <title>Splish Splash</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2000-11-02</publish_date>
      <description>A deep sea diver finds true love twenty 
      thousand leagues beneath the sea.</description>
   </book>
   <book id="bk108">
      <author>Knorr, Stefan</author>
      <title>Creepy Crawlies</title>
      <genre>Horror</genre>
      <price>4.95</price>
      <publish_date>2000-12-06</publish_date>
      <description>An anthology of horror stories about roaches,
      centipedes, scorpions  and other insects.</description>
   </book>
   <book id="bk109">
      <author>Kress, Peter</author>
      <title>Paradox Lost</title>
      <genre>Science Fiction</genre>
      <price>6.95</price>
      <publish_date>2000-11-02</publish_date>
      <description>After an inadvertant trip through a Heisenberg
      Uncertainty Device, James Salway discovers the problems 
      of being quantum.</description>
   </book>
   <book id="bk110">
      <author>O'Brien, Tim</author>
      <title>Microsoft .NET: The Programming Bible</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2000-12-09</publish_date>
      <description>Microsoft's .NET initiative is explored in 
      detail in this deep programmer's reference.</description>
   </book>
   <book id="bk111">
      <author>O'Brien, Tim</author>
      <title>MSXML3: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2000-12-01</publish_date>
      <description>The Microsoft MSXML3 parser is covered in 
      detail, with attention to XML DOM interfaces, XSLT processing, 
      SAX and more.</description>
   </book>
   <book id="bk112">
      <author>Galos, Mike</author>
      <title>Visual Studio 7: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>49.95</price>
      <publish_date>2001-04-16</publish_date>
      <description>Microsoft Visual Studio 7 is explored in depth,
      looking at how Visual Basic, Visual C++, C#, and ASP+ are 
      integrated into a comprehensive development 
      environment.</description>
   </book>
</catalog>

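Below are three different XQuery code snippets. You can test these with the Firefox XQuery USE ME (XqUSEme) add-on (install it first if necessary). Note that the snippets call doc() without an argument to refer to the document currently open in the browser; this appears to be a convenience of the add-on, since the standard fn:doc function takes a URI argument.

  1. Open books.xml in the Firefox browser
  2. Select Tools > Perform XQuery
  3. Under the XQuery Tab, ensure that "Use this document for query?" is checked
  4. Copy and paste one of the three XQuery commands (see below) into the XQuery textarea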
  5. On the right, select "Open in a new tab" AND change "output file ext."
  6. Select "Perform XQuery" and view the results

XQuery 1:

xquery version "1.0";
<html>
 <head>
  <title>Danny Dean - XQuery 1</title>
 </head>
 <body>
  <h1>Book Titles (A-Z)</h1>
   <div>
    <ul>
    {
     for $b in doc()//catalog/book
     order by $b/title ascending
     return
      <li>{$b/title/text()}</li>
    }
    </ul>
   </div>
 </body>
</html>

XQuery 2:

xquery version "1.0";
<html>
 <head>
  <title>Danny Dean - XQuery 2</title>
 </head>
 <body>
  <h1>Books by Genre</h1>
  <div>
  {
   for $genre in fn:distinct-values(doc()//catalog/book/genre)
   return
    <div>
     <h3>{$genre}</h3>
     <ul>
      {
       for $b in doc()//catalog/book
       where $b/genre = $genre
       return
        <li>{$b/title/text()}</li>
      }
     </ul>
    </div>
     }
  </div>
 </body>
</html>

XQuery 3:

xquery version "1.0";
<html>
 <head>
  <title>Danny Dean - XQuery 3</title>
 </head>
 <body>
   <h1>Most and Least Expensive Books</h1>
   { 
     let $books := doc()//catalog/book			
     let $max := $books[price = max($books/price)]
     let $min := $books[price = min($books/price)]
     return
     <div>
      <div>Most Expensive: {$max[1]/title/text()} - ${$max[1]/price/text()}</div>
      <div>Least Expensive: {$min[1]/title/text()} - ${$min[1]/price/text()}</div>
    </div>
   }
 </body>
</html>

Declarations

This section just shows some more advanced features. XQuery is a full programming language (material for other tutorials).

Namespaces

If an XML file has namespaces, they must be declared. If there is a single namespace, it is easiest to declare it as the default namespace, e.g.

 xquery version "1.0";
 declare default element namespace 'http://www.w3.org/1999/xhtml';

Here is an example that deals with XML that includes RDF and Dublin Core metadata

 xquery version "1.0";
 declare namespace rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
 declare namespace dc="http://purl.org/dc/elements/1.1/";

If there is more than one, e.g. an XHTML file with some XML inside, then you have to declare the namespace and use a prefix within the queries (example taken from the XqUSEme Firefox extension help):

 declare namespace html = 'http://www.w3.org/1999/xhtml';
 <textarea rows='20' cols='50'> {
 for $par in doc()//html:p 
 where starts-with($par, 'This') 
 return <p>{$par/string()}</p> } </textarea>
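
The effect of a namespace declaration can be checked without a browser. The sketch below assumes a small inline XHTML fragment bound to $page; only the namespace URI is taken from the text above.

```xquery
xquery version "1.0";
declare namespace html = "http://www.w3.org/1999/xhtml";

(: Hypothetical fragment: both <p> elements live in the XHTML namespace :)
let $page :=
  <body xmlns="http://www.w3.org/1999/xhtml">
    <p>This is the first paragraph.</p>
    <p>Another paragraph.</p>
  </body>
return (
  (: unprefixed name: looks for <p> in no namespace :)
  count($page//p),
  (: prefixed name: looks for <p> in the XHTML namespace :)
  count($page//html:p)
)
```

This returns 0 followed by 2: without the prefix (or a default element namespace declaration), the path does not match the namespaced elements.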

Proprietary extensions

  • Most vendors implement various extensions, e.g. declarations for HTML serialization of output.

Variables (untyped and typed)

 declare variable $x := 7.5;
 declare variable $x as xs:integer := 7;
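
Declared variables are global to the query and can be used in its body. A minimal sketch, with hypothetical variable names $rate and $count:

```xquery
xquery version "1.0";

(: untyped declaration: the type is inferred from the value :)
declare variable $rate := 7.5;
(: typed declaration: a non-integer value would raise a type error :)
declare variable $count as xs:integer := 7;

$rate * $count
```

The query body multiplies both values and returns 52.5.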

Functions

“XQuery allows users to declare functions of their own. A function declaration specifies the name of the function, the names and datatypes of the parameters, and the datatype of the result. [..] A function declaration specifies whether a function is user-defined or external. For a user-defined function, the function declaration includes an expression called the function body that defines how the result of the function is computed from its parameters.” (XQuery 1.0: An XML Query Language specification, retrieved 21:13, 11 February 2010 (UTC)).

The following example is also taken from the specification:

declare function local:summary($emps as element(employee)*) 
   as element(dept)*
{
   for $d in fn:distinct-values($emps/deptno)
   let $e := $emps[deptno = $d]
   return
      <dept>
         <deptno>{$d}</deptno>
         <headcount> {fn:count($e)} </headcount>
         <payroll> {fn:sum($e/salary)} </payroll>
      </dept>
};


local:summary(fn:doc("acme_corp.xml")//employee[location = "Denver"])

The function call is in the last line.
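
Since the specification example needs an acme_corp.xml file, here is a smaller sketch that runs on its own. The function name local:label and its parameters are made up for illustration.

```xquery
xquery version "1.0";

(: user-defined functions are usually placed in the predefined "local" namespace :)
declare function local:label($title as xs:string, $n as xs:integer)
   as element(result)
{
   <result>{$title} owns {$n} bricks</result>
};

local:label("Gallery", 3)
```

The call in the last line returns <result>Gallery owns 3 bricks</result>.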

Links

Sample XML files on-line

(that you may use for playing with XQuery)

  • Shakespeare's plays have been translated to XML by Jon Bosak. The original is at Oasis (Zip), but they can also be found elsewhere on the Internet, e.g. search for hamlet.xml. Café con Leche has a copy.

Software and standards

See the XQuery article.

According to the W3C, there are over 40 different software packages that support XML Query in some way. However, this does not mean that non-programmers' XQuery tools can be easily found currently (Jan 2009).

An easy way to play with XQuery is to install a browser extension. We tried out:

  • XQuery USE ME (XqUSEme) which is based on the (serious) Saxon B engine. The only problem I had is that it will try to validate and if the DTD is missing then there is trouble.

XQuery Tutorials

See also the XQuery article

XQuery Web sites