XML Schema tutorial - Basics: Difference between revisions

<!-- <pageby nominor="false" comments="false"/> -->
{{web technology tutorial|beginner}}
{{Incomplete}}

== Introduction ==

This is a beginner's tutorial for [[XML Schema]] (often called '''XSD''' in reference to the file name extension *.xsd).


;Objectives
* Understand the purpose of XSD
* Be able to cope with XSD editing
* Translate DTDs to XSD with a conversion tool
* Modify data types of a given XSD
* Write very simple XSD grammars
* Use XML files with XSD namespaces together with XSLT stylesheets


; Prerequisites
* Editing [[XML]] (being able to use a simple DTD). Catch up with the [[Editing XML tutorial]]
* Be somewhat familiar with DTDs (see the [[DTD tutorial]])
* XML namespaces (some; have a look at the [[XML namespace]] article. You should at least know why the XSD prefix could be "xs" or "xsd" or "banana")
* HTML and CSS (some)


; Moving on
* [[Relax NG]], the better alternative
* [[XSLT Tutorial - Basics]], learn how to transform your contents with XSLT

; Warning
XSD is a rather complex schema definition language. For any given problem there are usually several good solutions.


These slides have been prepared with the help of

* The W3C XML Schema primer: http://www.w3.org/TR/xmlschema-0/
* Roger Costello's extensive XML Schema tutorial: http://www.xfront.com/
=== Types of XML grammars ===

We may distinguish between several kinds of XML grammars:

[[image:xml-schema-2.png|thumb|750px|none|Kinds of XML grammars]]

* A grammar-based schema specifies '''what elements''' may be used in an XML document, the '''order''' of the elements, the number of '''occurrences''' of each element, and finally the '''content and datatype''' of each element and attribute.

* An assertion-based schema makes assertions about the relationships that must hold between the elements and attributes in an XML instance document.


Comparison between grammar-based schemas


{| class="prettytable"
{| class="prettytable"
Line 210: Line 110:
* XML Schemas were created to define more precise grammars than is possible with DTDs; in particular, one can define data types and more sophisticated element structures

* DTD supports 10 datatypes, mostly for attributes. XML Schema supports 44 built-in datatypes and, in addition, lets you define your own.

* Relax NG was a reaction by people who didn't like this new format. It is about as powerful as XSD, but not as complicated.


=== Resources ===

* XML Schema (also called XSD or simply Schema) is difficult

'''W3C websites:'''


url: http://www.w3.org/XML/Schema (W3C Overview Page)

url: http://www.w3.org/TR/xmlschema-0/ (The W3C XML Schema primer)

'''Specifications:'''

url: http://www.w3.org/TR/xmlschema-1/ XML Schema Part 1: Structures Second Edition 2004

url: http://www.w3.org/TR/xmlschema-2/ XML Schema Part 2: Datatypes Second Edition 2004


'''Tools:'''

Exchanger XML Editor can handle XML Schema:

* Support for XSD editing
* Validation of XSD file
* Validation of XML against XSD
* DTD/XSD/Relax NG translation


== XSD bare bones ==

=== The structure and namespace of an XSD file ===

* Like '''any''' XML file, an XSD file must start with an XML declaration
* Complex XSD files refer to more than one "Schema" namespace (see later)


[[image:xml-schema-3.png|thumb|750px|none|Structure of an XSD file]]


=== Namespaces and namespace prefixes ===

Since XSD is XML, one must be able to distinguish XSD elements from the elements of the language you are defining.
* You can '''either''' define a prefix for the XSD elements '''or''' one for your own XML elements. See solutions 1 and 2 below
* You can then decide whether your XML elements are namespaced or not


==== Solution 1: Give a namespace prefix to the XSD code ====

* We define the '''xs:''' prefix for the XSD namespace

* It doesn't matter what prefix we use (usually '''xs:''', but often '''xsd:''')

* '''elementFormDefault="qualified"''' means that locally declared elements belong to the target namespace as well. Since this schema declares no '''targetNamespace''', your target XML files will not use a namespace

'''Example: XSD definition for a simple recipe''' (ignore the details, and just look at the namespace declaration and prefix)
 
<source lang="XML">
  <?xml version="1.0" encoding="UTF-8"?>
  <!-- Simple recipe Schema -->
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             elementFormDefault="qualified">
   <xs:element name="list">
     <xs:complexType>
       <xs:sequence>
         <xs:element maxOccurs="unbounded" ref="recipe"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>
  .....
  </xs:schema>
</source>
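
Since a prefix is only a local abbreviation for a namespace URI, the same schema can be written with any other prefix. The following sketch uses ''xsd:'' instead of ''xs:''. The ''recipe'' declaration is a hypothetical placeholder, added only to make the fragment complete.

<source lang="xml">
<?xml version="1.0" encoding="UTF-8"?>
<!-- Same structure as the recipe list schema, with the prefix xsd: instead of xs: -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
  <xsd:element name="list">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element maxOccurs="unbounded" ref="recipe"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <!-- Assumption: a recipe holding plain text, for illustration only -->
  <xsd:element name="recipe" type="xsd:string"/>
</xsd:schema>
</source>

What matters to a validator is the namespace URI bound to the prefix, not the prefix itself.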
==== Solution 2: Give a namespace to target code and then prefix it ====

The following solution is less often used.

* We use a prefixed namespace for '''our''' XML elements

* The XML Schema namespace becomes the default namespace, i.e. XSD elements will not be prefixed, as shown in the next example.

'''Example: XSD definition for a simple recipe'''


<source lang="xml">
  <schema
    xmlns='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://yourdomain.org/namespace/'
    xmlns:t='http://yourdomain.org/namespace/'>

   <element name='list'>
   <complexType>
     <sequence>
     <element ref='t:recipe' maxOccurs='unbounded'/>
     </sequence>
   </complexType>
   </element>
  .....
  </schema>
</source>
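
To see how the ''t:'' prefix works out in practice, here is a minimal complete sketch of such a schema together with a matching XML file. It assumes the current XML Schema namespace (http://www.w3.org/2001/XMLSchema). The ''recipe'' declaration and the namespace URL are placeholders.

<source lang="xml">
<!-- Schema: XSD elements are unprefixed, our own elements use t: -->
<schema
  xmlns='http://www.w3.org/2001/XMLSchema'
  targetNamespace='http://yourdomain.org/namespace/'
  xmlns:t='http://yourdomain.org/namespace/'>

  <element name='list'>
    <complexType>
      <sequence>
        <element ref='t:recipe' maxOccurs='unbounded'/>
      </sequence>
    </complexType>
  </element>

  <!-- Assumption: recipe is plain text, for illustration only.
       The unprefixed type name 'string' resolves to the default (XSD) namespace. -->
  <element name='recipe' type='string'/>
</schema>
</source>

A matching XML file must put its elements into the target namespace. The prefix used there is again a free choice:

<source lang="xml">
<t:list xmlns:t='http://yourdomain.org/namespace/'>
  <t:recipe>Pancakes</t:recipe>
</t:list>
</source>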


=== Association of an XSD with an XML file - validation ===

An XML document described by an XSD is called an '''instance document'''. As with DTDs, you do not need to create an association in the XML file in order to validate it: you can "manually" validate an XML file against an XSD, and most XML editors will allow you to do so. For example, in XML Exchanger, simply click on the validate icon, then select the XSD file when asked.

We will nevertheless show two solutions for "linking" an XSD to an XML file. '''Be aware''' that XSLT stylesheets written for the XML file may need to be adapted, in particular once its elements are in a namespace (Solution 2).

==== Association of XSD with XML, Solution 1 ====

* You must declare the '''XMLSchema-instance namespace'''. It is a small extra XML vocabulary that allows you to link XSDs to XML files.

* The '''xsi:noNamespaceSchemaLocation''' attribute defines the URL of your XSD

* Warning: Make sure you get its ''spelling'' and ''case'' right!


Example:
* XML: [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/recipe-no-ns.xml recipe-no-ns.xml]

<source lang="xml">
  <?xml version="1.0" ?>
  <list
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="recipe-no-ns.xsd">
   <recipe> .... </recipe>
  </list>
</source>


XSD file: [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/recipe-no-ns.xsd recipe-no-ns.xsd]

<source lang="xml">
  <?xml version="1.0" ?>
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             elementFormDefault="qualified">
   <xs:element name="list">
</source>


==== Association of XSD with XML, Solution 2 ====

This solution is more popular for various reasons (e.g. most XML languages require a namespace declaration anyhow).

1. Both the XML and the XSD file must contain a '''namespace declaration for your domain'''

2. In addition, the XML file must contain:
* a ''namespace declaration'' for '''XMLSchema-instance'''
* an '''xsi:schemaLocation''' ''attribute'' that tells where to find the XSDs. This attribute can hold as many "namespace URL" pairs as you like

'''Example: XML for a simple recipe with an associated XSD'''
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/recipe.xml recipe.xml]
<source lang="xml">
  <?xml version="1.0"?>
  <list
    xmlns="http://myrecipes.org/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://myrecipes.org/ recipe.xsd"
  >
    <recipe>
      <meta> .....</meta>
      ......
    </recipe>
  </list>
</source>


If you wish to reuse this code fragment for your own XML, you must make two changes in the code above:
* Define a namespace for your own tags, e.g.
: ''xmlns="http://your_domain/something/"''
* Tell where to find the XSD file for that namespace, e.g.
: ''xsi:schemaLocation="http://your_domain/something/ some-schema.xsd"''


'''Example XSD file:'''

* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/recipe.xsd recipe.xsd]

<source lang="xml">
  <?xml version="1.0"?>
  <!-- Simple recipe Schema -->
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             targetNamespace="http://myrecipes.org/"
             xmlns="http://myrecipes.org/"
             elementFormDefault="qualified">
     ....
  </xs:schema>
</source>


* This XSD defines a default namespace (no prefixes) for your tags

* Again, in your XML, you should replace '''http://myrecipes.org/''' with a URL of your own, preferably one over which you have control, e.g. a blog or a home page.
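
The pieces of Solution 2 can be put together in a minimal sketch. The element declarations below are assumptions made for illustration. The real recipe.xsd is more detailed.

<source lang="xml">
<?xml version="1.0"?>
<!-- Hypothetical, shortened recipe.xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://myrecipes.org/"
           xmlns="http://myrecipes.org/"
           elementFormDefault="qualified">
  <xs:element name="list">
    <xs:complexType>
      <xs:sequence>
        <!-- ref="recipe" has no prefix, so it resolves through the default
             namespace, which is the same URL as the target namespace -->
        <xs:element ref="recipe" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- Assumption: a recipe holding plain text -->
  <xs:element name="recipe" type="xs:string"/>
</xs:schema>
</source>

A matching XML file repeats the namespace URL twice: once as default namespace, once as the first half of the ''xsi:schemaLocation'' pair.

<source lang="xml">
<?xml version="1.0"?>
<list
  xmlns="http://myrecipes.org/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://myrecipes.org/ recipe.xsd">
  <recipe>Pancakes</recipe>
</list>
</source>

The three occurrences of the namespace URL (targetNamespace in the XSD, xmlns in the XML, first half of schemaLocation) must be spelled identically, otherwise validation fails.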


'''Example: IMS Content Packaging 1.1.4 and IMS/LOM Metadata'''

This XML file uses two XML vocabularies: ''imscp'' and ''imsmd''


<source lang="xml">
  <manifest
   xmlns="http://www.imsglobal.org/xsd/imscp_v1p1"
   xmlns:imsmd="http://www.imsglobal.org/xsd/imsmd_v1p2"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   identifier="MANIFEST-1"
   xsi:schemaLocation=
     "http://www.imsglobal.org/xsd/imscp_v1p1 imscp_v1p1.xsd
     http://www.imsglobal.org/xsd/imsmd_v1p2 imsmd_v1p2p2.xsd">
   <metadata>
     <imsmd:lom> ...... </imsmd:lom>
   </metadata>
   <organizations default="learning_sequence_1">
  .....
</source>


* ''imscp_v1p1'' is the default namespace (no prefix)
* ''imsmd_v1p2'' is the namespace for metadata (prefix ''imsmd'').

'''Extract of imscp_v1p1.xsd'''


<source lang="xml">
  <xsd:schema
     xmlns = "http://www.imsglobal.org/xsd/imscp_v1p1"
     targetNamespace = "http://www.imsglobal.org/xsd/imscp_v1p1"
     xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
     xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
     version = "IMS CP 1.1.4"    elementFormDefault = "qualified">
</source>
== Defining elements, attributes and structure ==
=== Element definitions ===


Recall that XML structure is mostly about defining and nesting elements, so we first need to learn how to define elements.

Elements are defined with '''<nowiki><xs:element></nowiki>'''.

Example of a simple element without children and attributes:

  <xs:element name="author" type="xs:string"/>

Its DTD equivalent would be:
  <!ELEMENT author (#PCDATA)>

Element children can be defined in two ways:
# The Russian puppet model: use a '''complexType''' child element
# The Salami model: use a '''type''' attribute that refers to a data type you defined as a '''complexType'''.

Let's examine both ways:

'''The Russian puppet model: <nowiki><xs:complexType></nowiki> (1)'''

''complexType'' is a child element of ''element'' and it defines the possible "data structures" for the element. In the example below we define the child elements of "recipe", among them meta, recipe_name, ingredients and directions. Attributes and grandchildren can also be defined that way.


<source lang="xml">
  <xs:element name="recipe">
     <xs:complexType>
       <xs:sequence>
         <xs:element ref="meta"/>
         .....
         <xs:element ref="directions"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>
</source>
<code>ref=....</code> points to the definition of the sub-element (see further below for an example)


The Russian puppet model is recommended for very simple grammars. The Salami model below is more modular and therefore a better solution most of the time.

'''The Salami model: <nowiki><xs:complexType></nowiki> (2)'''

You can declare a complex type by itself and then "use it" in an element declaration.

Example XSD: [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/recipe2.xsd recipe2.xsd]

* Defining an element that refers to a complex type for its child elements:

<source lang="xml">
<xs:element name="recipe" type="recipe_contents" />
</source>

* Defining the complex type:

<source lang="xml">
  <xs:complexType name="recipe_contents">
       <xs:sequence>
         <xs:element ref="meta"/>
         .....
         <xs:element ref="directions"/>
       </xs:sequence>
  </xs:complexType>
</source>

Now, before we further explain how to define element contents, let us have a look at data types.


=== Data types ===

Simple data types define what kind of data elements and attributes can contain.

Examples:

{| class="prettytable"
| <center>Simple Type</center>
| <center>Examples (delimited by commas)</center>
| <center>Explanation</center>

|-
| anyURI
| http://www.example.com/
| Any sort of URI

|}
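
Built-in types are used through the ''type'' attribute of an element (or attribute) declaration. The following sketch uses hypothetical element names. The types themselves (xs:string, xs:integer, xs:date, xs:boolean, xs:anyURI) are built into XML Schema.

<source lang="xml">
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="title"      type="xs:string"/>   <!-- any text -->
  <xs:element name="servings"   type="xs:integer"/>  <!-- e.g. 4 -->
  <xs:element name="created"    type="xs:date"/>     <!-- e.g. 2010-05-31 -->
  <xs:element name="vegetarian" type="xs:boolean"/>  <!-- true or false (also 1 or 0) -->
  <xs:element name="homepage"   type="xs:anyURI"/>   <!-- e.g. http://www.example.com/ -->
</xs:schema>
</source>

With these declarations, <code><nowiki><servings>four</servings></nowiki></code> would be rejected by a validator, whereas a DTD could only say that the element contains text.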
In addition to the built-in simple data types, one can define list types, union types and complex types. We have already shown an example with complex types above. Complex types include structural information, i.e. the use of child elements.

We shall introduce how to define simple types - e.g. lists of terms or ranges of numerical values that the user must choose from - in section [[#Value restrictions|Value restrictions]]

=== Organization of elements ===


* XSD allows for quite sophisticated occurrence constraints, i.e. how child elements can be used within an element. Here we only cover a few basic design patterns
* Both child elements and attributes are defined within '''complexType'''s, i.e. as possible element and attribute combinations that can be inserted within an element.

==== Salami vs. Russian puppet style ====

As already mentioned, it is usually best to define all elements in a flat list and then refer to these when you define how child elements are to be inserted.

'''Defining elements within elements (not so good)'''


<source lang="xml">
  <xs:element name="meta">
     <xs:complexType>
     .....
     </xs:complexType>
  </xs:element>
</source>


'''Defining child elements with a reference'''

This is generally a better solution since you can then reuse a definition. It replaces the functionality of parameter entities in a DTD.

The XML ''meta'' element has an ''author'' child element. ''ref="author"'' refers to a definition for ''author'' made elsewhere, e.g. below or before.
<source lang="xml">
<xs:element name="meta">
  <xs:complexType>
    <xs:sequence>
         <xs:element ref="author"/>
         .....
    </xs:sequence>
  </xs:complexType>
</xs:element>

.....
<xs:element name="author" type="xs:string"/>
....
</source>
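
The benefit of references shows as soon as the same element is needed in several places. In the following sketch, ''author'' is declared once and referenced from both ''meta'' and a second parent. The ''comment'' and ''text'' elements are assumptions made for illustration, not part of the recipe schema.

<source lang="xml">
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Declared once, as global elements -->
  <xs:element name="author" type="xs:string"/>
  <xs:element name="text" type="xs:string"/>

  <xs:element name="meta">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="author"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Hypothetical second parent that reuses the same author declaration -->
  <xs:element name="comment">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="author"/>
        <xs:element ref="text"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
</source>

If the definition of ''author'' changes later, e.g. to a more restricted data type, the change applies to both parents at once.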
 
==== Sequences ====


* The number of times a child element can occur is defined with the '''minOccurs''' and '''maxOccurs''' attributes. Both default to 1.

'''Example: A list of three ordered child elements (salami style definition)'''

The element ''meta'' has three child elements (author, date, version) that must be used in that order.

<source lang="xml">
   <xs:element name="meta">
     <xs:complexType>
       <xs:sequence>
         <xs:element ref="author"/>
         <xs:element ref="date"/>
         <xs:element ref="version"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>

  <xs:element name="version" type="xs:string"/>
  <xs:element name="date" type="xs:string"/>
  <xs:element name="author" type="xs:string"/>
</source>


  <'''xs:element''' '''name="version"''' type="xs:string"/>
'''Example: A list with one or more recipe child elements'''

<source lang="xml">
   <xs:element name="list">
     <xs:complexType>
       <xs:sequence>
         <xs:element maxOccurs="unbounded" ref="recipe"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>
</source>

The ''list'' element can include between 1 and N ''recipe'' elements
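
The ''minOccurs'' and ''maxOccurs'' attributes can also take other numeric values. As a sketch, the following hypothetical ''menu'' element requires between two and five ''recipe'' elements. The ''recipe'' element is assumed to be declared elsewhere in the schema.

<source lang="xml">
<xs:element name="menu">
  <xs:complexType>
    <xs:sequence>
      <!-- at least 2, at most 5 recipe children -->
      <xs:element minOccurs="2" maxOccurs="5" ref="recipe"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
</source>

An XML file whose ''menu'' contains only one ''recipe'', or six of them, would not validate against this definition.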


'''Example: A list with an optional, repeatable email element'''

<source lang="xml">
   <xs:element name="person">
     <xs:complexType>
       <xs:sequence>
         <xs:element ref="name"/>
         <xs:element minOccurs="0" maxOccurs="unbounded" ref="email"/>
         <xs:element ref="link"/>
       </xs:sequence>
       <xs:attributeGroup ref="attlist.person"/>
     </xs:complexType>
   </xs:element>
</source>

The ''person'' element
* must include a ''name'' element
* can include 0, 1 or many ''email'' elements
* must include a ''link'' element
* also defines attributes (see below)
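
To illustrate, assume that ''name'', ''email'' and ''link'' are declared as simple text elements and that ''attlist.person'' defines no required attributes (neither assumption is confirmed by the fragment above). Under these assumptions, both of the following XML fragments would be valid:

<source lang="xml">
<!-- No email at all: allowed, since minOccurs="0" -->
<person>
  <name>Ada</name>
  <link>http://www.example.com/ada</link>
</person>
</source>

<source lang="xml">
<!-- Two emails: allowed, since maxOccurs="unbounded" -->
<person>
  <name>Ada</name>
  <email>ada@example.com</email>
  <email>ada@example.org</email>
  <link>http://www.example.com/ada</link>
</person>
</source>

A ''person'' without a ''link'', or with an ''email'' placed before the ''name'', would be rejected, since a sequence fixes both the presence and the order of its children.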


==== Choice ====

'''Example: Optional repeatable child elements'''

XSD:
<source lang="xml">
   <xs:element name="INFOS">
     <xs:complexType>
       <xs:choice minOccurs="0" maxOccurs="unbounded">
         <xs:element ref="date"/>
         <xs:element ref="author"/>
         .....
       </xs:choice>
     </xs:complexType>
   </xs:element>
</source>




XML:
<source lang="xml">
<INFOS> <date>...</date> <author>...</author> <a>...</a> <date>...</date>
.... <date>...</date>
</INFOS>
</source>
'''Example: Either - or child elements'''
XSD:
<source lang="xml">
   <xs:element name="ATTEMPT">
     <xs:complexType>
       <xs:choice>
         <xs:element ref="action"/>
         <xs:element ref="EPISODE"/>
         <!-- ... -->
       </xs:choice>
     </xs:complexType>
   </xs:element>
</source>


XML:
<source lang="xml">
<ATTEMPT> <action>He killed the DTD </action> </ATTEMPT>
</source>
 
<source lang="xml">
<ATTEMPT> <EPISODE> ...... </EPISODE> </ATTEMPT>
</source>
 
==== Mixed contents (tags and text) ====
 
XSD:


The <code>complexType</code> is defined as <code>mixed="true"</code> as shown below.
<source lang="xml">
   <xs:element name="para">
     <xs:complexType mixed="true">
       <xs:sequence>
         <xs:element minOccurs="0" maxOccurs="unbounded" ref="strong"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>
   <xs:element name="strong" type="xs:string"/>
</source>


XML:
<source lang="xml">
<para> XML is <strong>so</strong> cool ! </para>
</source>


==== Empty elements ====

* Simply define an element and do not define any child elements

  <xs:element name="author" type="xs:string"/>

Of course this also applies to complex elements, e.g. the ''person'' element in the attribute groups example below, which only carries attributes.

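Note that the ''author'' declaration above still accepts text, since its type is <code>xs:string</code>. A minimal sketch of an element that must really stay empty could look as follows; the element name ''separator'' is just an assumption made for this illustration.

<source lang="xml">
  <!-- a complexType without any content model: no child elements, no text -->
  <xs:element name="separator">
    <xs:complexType/>
  </xs:element>
</source>

With this declaration <code>&lt;separator/&gt;</code> is valid, whereas <code>&lt;separator&gt;text&lt;/separator&gt;</code> is not.
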
=== Attributes ===


To declare attributes, you must define the element they belong to as a <code>complexType</code>, since simple elements cannot have attributes.

We will not cover all possibilities here, but just demonstrate with examples.

==== Russian puppet style ====

A typical attribute definition inside an element definition looks like this:

<source lang="xml">
  <xs:element name="Name">
   <xs:complexType>
     <xs:attribute name="lang" type="xs:string" use="required"/>
   </xs:complexType>
  </xs:element>
</source>


The <code>use</code> parameter can be either ''optional'', ''prohibited'' or ''required''. The default is "optional".

The above code is actually a short hand notation for a longer expression (not shown here). If the element should also contain text, the attribute is added by extending a simple type inside <code>simpleContent</code>:

<source lang="xml">
  <xs:element name="Name">
   <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="lang" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
   </xs:complexType>
  </xs:element>
</source>


'''XML example'''

  <Name lang="English"/>

'''Attribute groups'''

==== Salami style ====

More complex attributes are better declared with attribute groups, since attribute groups are reusable, i.e. they are the equivalent of DTD parameter entities.


'''Example: Defining attributes with attribute groups'''

url: [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/family.xsd family.xsd]


<source lang="xml">
  <xs:element name="person">
     <xs:complexType>
       <xs:attributeGroup ref="attlist.person"/>
     </xs:complexType>
  </xs:element>
</source>


The element definition above refers to a named attribute group defined below:

<source lang="xml">
  <xs:attributeGroup name="attlist.person">
     <xs:attribute name="name" use="required"/>

     <xs:attribute name="gender">
       <xs:simpleType>
         <xs:restriction base="xs:token">
           <xs:enumeration value="male"/>
           <xs:enumeration value="female"/>
         </xs:restriction>
       </xs:simpleType>
     </xs:attribute>

     <xs:attribute name="type" default="mother">
       <xs:simpleType>
         <xs:restriction base="xs:token">
           <xs:enumeration value="mother"/>
           <xs:enumeration value="father"/>
           <!-- ... -->
         </xs:restriction>
       </xs:simpleType>
     </xs:attribute>

     <xs:attribute name="id" use="required" type="xs:ID"/>
  </xs:attributeGroup>
</source>


Valid XML fragment:

url: [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/family.xml family.xml]


<source lang="xml">
  <family>
   <person name="Joe Miller" gender="male" type="father" id="I123456789"/>
   <person name="Josette Miller" type="girl" id="I123456987"/>
  </family>
</source>
== Value restrictions ==
In a loose sense, by ''value restriction'' we refer to the fact that XSD allows defining various kinds of data types (e.g. lists) or restrictions on data types such as numbers and strings.
=== Simple user-defined types (simpleType) ===
Simple types allow defining, for example, lists of words and selections. An element then refers to such a type through its <code>type</code> attribute.


'''Example: A list of numbers'''

XSD:

<source lang="xml">
<xsd:element name="listOfMyInt" type="listOfMyIntType"/>


<xsd:simpleType name="listOfMyIntType">
    <xsd:list itemType="xsd:integer"/>
</xsd:simpleType>
</source>
XML:
<source lang="xml">
<listOfMyInt>20003 15037 95977 95945</listOfMyInt>
</source>
'''Example: Restricted lists of words to choose from (in two variants)'''
The user must choose between a list of possible contents. The example below defines restrictions on element contents: three alternatives for a ''theory'' element and five alternatives for a ''Country'' element.
XSD:
<source lang="xml">
<!-- (1) A modular solution -->
<xsd:element name="theory" type="list_theories"/>
<xsd:simpleType name="list_theories">
    <xsd:restriction base="xsd:string">
        <xsd:enumeration value="constructivism"/>
        <xsd:enumeration value="behaviorism"/>
        <xsd:enumeration value="cognitivism"/>
    </xsd:restriction>
</xsd:simpleType>
<!-- (2) A russian puppet solution -->
<xsd:element name="Country">
    <xsd:simpleType>
        <xsd:restriction base="xsd:string">
            <xsd:enumeration value="FR" />
            <xsd:enumeration value="DE" />
            <xsd:enumeration value="ES" />
            <xsd:enumeration value="UK" />
            <xsd:enumeration value="CH" />
        </xsd:restriction>
    </xsd:simpleType>
</xsd:element>
</source>
Valid XML example:
<source lang="xml">
<theory>constructivism</theory>
<Country>CH</Country>
</source>
'''Example: Restrictions of a single number'''
XSD (using russian puppet style):
<source lang="xml">
<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
</source>
 
XML:
 
<source lang="xml">
<age>100</age>
</source>
 
=== Value constraints with xs:restriction ===
 
One can put restraints on element or attribute values in several ways. We suggest finding solutions to typical problems by googling (including regexp web sites).

The example we already introduced above restricts the values of an ''age'' element:
 
<source lang="xml">
<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
</source>
 
We also could have required a 1-3 digit number, being optimistic about future life expectancy...
<source lang="xml">
<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:totalDigits value="3"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
</source>
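
Note that <code>totalDigits</code> only limits the number of digits, so the definition above would also accept negative values such as -999. A possible variant, sketched under the assumption that negative ages should be rejected, starts from <code>xs:nonNegativeInteger</code> instead:

<source lang="xml">
<xs:element name="age">
  <xs:simpleType>
    <!-- nonNegativeInteger excludes negative numbers, totalDigits caps the value at 999 -->
    <xs:restriction base="xs:nonNegativeInteger">
      <xs:totalDigits value="3"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
</source>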
 
The next example shows how to require that a string includes at least 200 characters and no more than 1000.
<source lang="xml">
  <xs:element name="p">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:minLength value="200"/>
        <xs:maxLength value="1000"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</source>
 
A quite powerful method is to use [[regular expression]]s with the ''pattern'' element:
 
The following regexp would require a four letter word that starts with an uppercase G, followed by three lowercase letters:
<source lang="XML">
  G[a-z]{3}
</source>
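
Such an expression is used as the value of an <code>xs:pattern</code> facet. A minimal sketch, in which the element name ''word'' is an assumption made for this illustration:

<source lang="xml">
<xs:element name="word">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- an uppercase G followed by exactly three lowercase letters -->
      <xs:pattern value="G[a-z]{3}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
</source>

With this definition <code>&lt;word&gt;Good&lt;/word&gt;</code> would be accepted and <code>&lt;word&gt;Go&lt;/word&gt;</code> rejected, since an XSD pattern always has to match the whole value.
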
The following XSD fragment restricts the element to something that looks like a legal email address (hopefully)
<source lang="xml">
<xs:element name="Email">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- too complex
      <xs:pattern value="\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"/>
      -->
      <xs:pattern value="(\w[-._\w]*\w@\w[-._\w]*\w\.\w{2,3})"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
</source>
 
The following example, found in the [http://www.informit.com/articles/article.aspx?p=31285 XML Schema regular expressions] tutorial by [http://www.informit.com/authors/bio/b41f4770-b972-4444-b946-2ca43586f026 Cliff Binstock], specifies that a part number consists of an uppercase character followed by one or more decimal digits:
<source lang="xml">
<xsd:element name="Part" type="partNumber"/>

<xsd:simpleType name="partNumber">
  <xsd:restriction base="xsd:token">
    <xsd:pattern value="[A-Z]\d+"/>
  </xsd:restriction>
</xsd:simpleType>
</source>
 
'''Picking from a list of words'''

The user must use at least one word from the following list. Words can be used in any order and can be repeated. Please notice, however, that this kind of constraint should rather be defined with a regular expression. Also, since list items are separated by white space, the <code>,</code> must be preceded by a blank.


<source lang="XML">
<xs:element name="allowed_words">
    <xs:simpleType>
        <xs:restriction>
            <xs:simpleType>
                <xs:list>
                    <xs:simpleType>
                        <xs:restriction base="xs:token">
                            <xs:enumeration value="I"/>
                            <xs:enumeration value="you"/>
                            <xs:enumeration value="am"/>
                            <xs:enumeration value="are"/>
                            <xs:enumeration value="here"/>
                            <xs:enumeration value=","/>
                        </xs:restriction>
                    </xs:simpleType>
                </xs:list>
            </xs:simpleType>
            <xs:minLength value="1"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>
</source>
Below are two valid examples:
<source lang="XML">
  <allowed_words>here I am</allowed_words>
  <allowed_words>here I am , you are</allowed_words>
</source>
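
As suggested above, the same constraint can be sketched with a regular expression. The variant below is an untested illustration; it relies on <code>xs:token</code> collapsing white space, so that words end up separated by exactly one blank.

<source lang="xml">
<xs:element name="allowed_words">
  <xs:simpleType>
    <xs:restriction base="xs:token">
      <!-- one allowed word, followed by any number of " word" repetitions -->
      <xs:pattern value="(I|you|am|are|here|,)( (I|you|am|are|here|,))*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
</source>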


'''Restriction on attributes'''

The exact same logic applies to attributes. The following example defines a ''price'' element with an attribute defining the currency. Note that we could have included the attribute definition within the definition of the ''price'' element (russian puppet style).

<source lang="XML">
  <xs:element name="price">
    <xs:complexType mixed="true">
      <xs:attributeGroup ref="attlist.price"/>
    </xs:complexType>
  </xs:element>


  <xs:attributeGroup name="attlist.price">
    <xs:attribute name="currency" default="CHF">
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="CHF"/>
          <xs:enumeration value="Euros"/>
          <xs:enumeration value="Dollars"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:attributeGroup>
</source>
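
For comparison, a sketch of the same definition written in russian puppet style, i.e. with the attribute declared directly inside the element, could look like this:

<source lang="xml">
  <xs:element name="price">
    <xs:complexType mixed="true">
      <!-- the attribute is declared in place instead of through attlist.price -->
      <xs:attribute name="currency" default="CHF">
        <xs:simpleType>
          <xs:restriction base="xs:token">
            <xs:enumeration value="CHF"/>
            <xs:enumeration value="Euros"/>
            <xs:enumeration value="Dollars"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
</source>

In both variants <code>&lt;price currency="Euros"&gt;20&lt;/price&gt;</code> should validate, and a ''price'' element without a ''currency'' attribute defaults to CHF.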


== Some Design patterns ==

(needs to be expanded over the years .... - [[User:Daniel K. Schneider|Daniel K. Schneider]] 18:02, 9 December 2010 (CET))

In the meantime see also:
* [http://www.oracle.com/technetwork/java/design-patterns-142138.html Introducing Design Patterns in XML Schemas]
* [http://www.xfront.com/GlobalVersusLocal.html Global versus Local] Recommended reading !

=== Mixed contents with typed elements inside ===


<source lang="XML">
<?xml version="1.0" encoding="UTF-8"?>


<!-- A mixed type -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://mymix.org/"
      targetNamespace="http://mymix.org/" elementFormDefault="qualified">
     
  <xs:element name="list">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="TextAndNumbers" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="TextAndNumbers" type="TextNumberMix"/>

  <xs:complexType name="TextNumberMix">
    <xs:complexContent mixed="true">
      <xs:restriction base="xs:anyType">
        <xs:sequence>
          <xs:element name="number1" type="xs:integer"/>
          <xs:element name="number2" type="xs:integer"/>
          <xs:element name="number3" type="xs:integer"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
</source>
XML file
<source lang="XML">
<?xml version="1.0"?>
<list xmlns="http://mymix.org/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://mymix.org/ mixed-text-with-numbers.xsd">
<TextAndNumbers>
I am <number1>44</number1> years old and I like <number2>4</number2> times the number <number3>11</number3>
    </TextAndNumbers>
<TextAndNumbers>
He is <number1>10</number1> meters tall.
And he weights <number2>1000</number2> kilos.
You can earn <number3>10</number3> cents if you figure out who he is.
    </TextAndNumbers>
</list>
</source>
== Converting DTDs to XSDs ==
Below we shall present a few typical translation patterns.

Most decent XML editors have a built-in translator that will do most of the work. However, generated XSD code is not necessarily the prettiest ...
* e.g. in Exchanger XML Editor: Use Menu Schema -> Convert Schema. The result is fairly good. Make sure to validate the DTD before you translate !

Below we present a table including XSD definitions for typical DTD structural elements. In the examples we use a namespace prefix for the XML and none for the Schema. Therefore an *.xsd file would look like this:
<source lang="XML">
<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/XMLSchema file:/usr/local/xngr/types/XML%20Schema/Validation/XMLSchema.xsd"
    xmlns:t="http://testing.org/"
    targetNamespace="http://testing.org/" >
<element name="ROOT">
  <complexType>
    <sequence>
    <element ref="t:A"/>
    <element ref="t:B"/>
    </sequence>
  </complexType>
</element>
<element name="A" type="string"/>
<element name="B" type="string"/>
</schema>
</source>
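
A matching XML file would then prefix its elements. The following is a sketch with made-up contents; since ''A'' and ''B'' are declared globally, they belong to the target namespace just like ''ROOT''.

<source lang="XML">
<?xml version="1.0"?>
<t:ROOT xmlns:t="http://testing.org/">
  <t:A>some text</t:A>
  <t:B>more text</t:B>
</t:ROOT>
</source>
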
The DTD to XSD examples were originally taken from http://www.w3.org/2000/04/schema_hack/


{| class="prettytable"
| <center>DTD</center>
| <center>XML Schema</center>
|-
|<source lang="xml"><!ELEMENT ROOT (A,B) ></source>
|<source lang="xml">
<element name="ROOT">
  <complexType>
    <sequence>
      <element ref="t:A"/>
      <element ref="t:B"/>
    </sequence>
  </complexType>
</element>
</source>
|-
|<source lang="xml"><!ELEMENT ROOT (A|B) ></source>
|<source lang="xml">
<element name="ROOT">
  <complexType>
    <choice>
      <element ref="t:A"/>
      <element ref="t:B"/>
    </choice>
  </complexType>
</element>
</source>
|-
|<source lang="xml"><!ELEMENT ROOT (A|(B,C)) ></source>
|<source lang="xml">
<element name="ROOT">
  <complexType>
    <choice>
      <element ref="t:A"/>
      <sequence>
        <element ref="t:B"/>
        <element ref="t:C"/>
      </sequence>
    </choice>
  </complexType>
</element>
</source>
|-
|<source lang="xml"><!ELEMENT ROOT (A?,B+,C*) ></source>
|<source lang="xml">
<element name="ROOT">
  <complexType>
    <sequence>
      <element ref="t:A" minOccurs="0"/>
      <element ref="t:B" maxOccurs="unbounded"/>
      <element ref="t:C" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>
</source>
|}

'''Attribute definitions'''

{| class="prettytable"
| <center>DTD</center>
| <center>XML Schema</center>
|-
|<source lang="xml"><!ATTLIST ROOT a CDATA #REQUIRED></source>
|<source lang="xml">
<element name="ROOT">
  <complexType>
    <attribute name="a" type="string" use="required"/>
  </complexType>
</element>
</source>
|-
|<source lang="xml"><!ATTLIST ROOT a CDATA #IMPLIED></source>
|<source lang="xml">
<element name="ROOT">
  <complexType>
    <attribute name="a" type="string" use="optional"/>
  </complexType>
</element>
</source>
|-
|<source lang="xml"><!ATTLIST ROOT a (x|y|z) #REQUIRED></source>
|<source lang="xml">
<element name="ROOT">
  <complexType>
    <attribute name="a">
      <!-- ... -->
    </attribute>
  </complexType>
</element>
</source>
|-
|<source lang="xml"><!ATTLIST ROOT a CDATA #FIXED "x"></source>
|<source lang="xml">
<element name="ROOT">
  <complexType>
    <attribute name="a" type="string" fixed="x"/>
  </complexType>
</element>
</source>
|}
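
For the enumerated attribute case, a translation in the syntax of the final XML Schema recommendation can be sketched as follows. This is our own reconstruction for illustration, not code taken from the original source.

<source lang="xml">
<!-- DTD: <!ATTLIST ROOT a (x|y|z) #REQUIRED> -->
<element name="ROOT">
  <complexType>
    <attribute name="a" use="required">
      <simpleType>
        <restriction base="token">
          <enumeration value="x"/>
          <enumeration value="y"/>
          <enumeration value="z"/>
        </restriction>
      </simpleType>
    </attribute>
  </complexType>
</element>
</source>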


Reminder: as we explained above, either the XSD or the target language must use a namespace prefix for the element names (not the attributes). E.g. the first rule above could also have been written like this:
{| class="prettytable"
| <center>DTD</center>
| <center>XML Schema</center>
|-
|<source lang="xml"><!ELEMENT ROOT (A,B) ></source>
|<source lang="xml">
<xs:element name="ROOT">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="A"/>
      <xs:element ref="B"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
</source>
|}
== XSD, XSLT and CSS association ==
Now it will get hairier. We will discuss:
* How to associate both a DTD and an XSD with an XML file (i.e. you will have to change the DTD)
* How to do an XSLT transform on an XML file that includes namespaces. See also [[XSLT for compound documents tutorial]]
* How to use a CSS for an XML file that includes namespaces.
=== Associating both DTD and XSD with an XML file ===
If you associate an XSD with an XML file, the DTD will break because of the namespace declarations that are needed for associating an XSD. If you plan to keep the DTD, you will have to add these additional attributes to the DTD. If you don't manage, simply remove the DTD declaration from the XML file :)
'''XML - cd-list.xml:'''
<source lang="XML">
<?xml version="1.0"?>
<!DOCTYPE cd-list SYSTEM "cd-list.dtd">
<cd-list xmlns="http://edutechwiki.unige.ch/en/XML/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://edutechwiki.unige.ch/en/XML/ cd-list.xsd"
>
......
</source>
'''XSD - cd-list.xsd:'''
<source lang="XML">
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
	   targetNamespace="http://edutechwiki.unige.ch/en/XML/"
           xmlns="http://edutechwiki.unige.ch/en/XML/"
	   elementFormDefault="qualified">
</source>
'''Modified cd-list.dtd part to include attribute definition used for declaring the XSD in the XML''':
<source lang="XML">
<?xml version="1.0"?>
<!ELEMENT cd-list (title,description?, cd*)>
<!ATTLIST cd-list xmlns CDATA #IMPLIED >
<!ATTLIST cd-list xmlns:xsi CDATA #IMPLIED >
<!ATTLIST cd-list xsi:schemaLocation CDATA #IMPLIED >
</source>
=== Using an XSLT with an XML file that includes namespaces ===
Since you now have namespaces declared in your XML file, XSLT transforms will be broken. The XSLT processor requires prefixes for the elements of the XML input if it has a namespace declared. Unless I am very wrong (don't think so), this can't be helped .... Read more in [[XSLT for compound documents tutorial]] and maybe [[XML namespace]].

If adding namespaces in your XSLT to target XML elements sounds too complicated, then consider removing the XSD declarations from your XML file. You can validate the file by associating "manually" an XSD in your editor.

Example files, where the XSLT is adapted to namespaced XML files:
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/cd-list-xslt.xml cd-list-xslt.xml]
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/cd-list.xsl cd-list.xsl]
XML
<source lang="XML">
<?xml version="1.0"?>
<?xml-stylesheet href="cd-list.xsl" type="text/xsl"?>
<cd-list xmlns="http://edutechwiki.unige.ch/en/XML/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://edutechwiki.unige.ch/en/XML/ cd-list.xsd"
>
.......
</source>
XSLT (start of file)
<source lang="XML">
<xsl:stylesheet
xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
xmlns:my = "http://edutechwiki.unige.ch/en/XML/"
xmlns = "http://www.w3.org/1999/xhtml"
version="1.0">
<xsl:output method="xml"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
indent="yes"/>
        <!-- root of XML file -->
<xsl:template match="/">
  <xsl:apply-templates select="my:cd-list"/>
</xsl:template>
<!-- CD list contents -->
<xsl:template match="my:cd-list">
<html xmlns="http://www.w3.org/1999/xhtml" >
<head>
<title>
<xsl:value-of select="my:title"/>
</title>
</head>
<body bgcolor="#FFFFFF">
      <h1><xsl:value-of select="my:title"/></h1>
      <xsl:apply-templates select="my:cd"/>
        </body>
        </html>
</xsl:template>
    .........
</source>
Also note that we start from the root of the file, i.e. "/". Alternatively, we could have used '''select="/my:cd-list"''' for the template that generates the HTML root, but not simply '''select="my:cd-list"'''

The same problem exists if you produce HTML5, HTML4, etc. e.g. examine:
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/cd-list-xslt-html5.xml cd-list-xslt-html5.xml] (this example does not work in Firefox 32 Linux, for an unknown reason ...)
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/cd-list-html5.xsl cd-list-html5.xsl]
Note: there is a dangling namespace declaration in the output which won't hurt, but should be removed some day ....
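
One way to get rid of this declaration is the <code>exclude-result-prefixes</code> attribute of XSLT. A sketch of the modified start of the stylesheet, assuming the ''my'' prefix used above:

<source lang="XML">
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="http://edutechwiki.unige.ch/en/XML/"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="my"
    version="1.0">
  <!-- the my: namespace can still be used in match and select expressions,
       but it is no longer copied to the generated elements -->
  <!-- templates as before -->
</xsl:stylesheet>
</source>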
=== Use of a CSS ===
The CSS should work as is, i.e. doesn't need namespaces
Example:
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/cd-list-css.xml cd-list-css.xml]
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/cd-list.css cd-list.css]
== XSD Examples ==
Below are two simple examples. There are more of the kind in the [http://tecfa.unige.ch/guides/xml/Examples/xsd-examples xsd-examples directory].
If you need complex industrial-strength examples, consider looking at various data-centric standards, e.g.
'''Office Open XML''' (Microsoft Office products like Word, Powerpoint, etc.)
* [https://en.wikipedia.org/wiki/Office_Open_XML_file_formats Office Open XML file formats] (Wikipedia)
* [http://www.ecma-international.org/publications/standards/Ecma-376.htm Standard ECMA-376 Office Open XML File Formats]
If you want to see a Relax NG schema (for comparison purposes), consider looking at:
* [https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office Open document], the OASIS standard for Open Office.
'''X3D''' (a 3D graphics standard for the web)
* [http://www.web3d.org/documents/specifications/19776-1/V3.3/index.html Extensible 3D (X3D) encodings], XML Schema
'''SOAP''' (a standard for machine to machine communication)
* http://www.w3.org/2003/05/soap-envelope/
=== CD-List example ===
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/cd-list.xml cd-list.xml]
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/cd-list.xsd cd-list.xsd]
* [http://tecfa.unige.ch/guides/xml/examples/xsd-examples/cd-list.dtd cd-list.dtd] (FYI)
With respect to the DTD, the XSD includes just a few data restrictions (I maybe will add some more some day ....). Also, please note that the restriction on "genre" is not helpful, i.e. there is not enough choice. Try to fix this.
<source lang="XML">
<?xml version="1.0" encoding="UTF-8"?>
<!-- XSD for a simple CD list -->
<!-- Made by daniel k. schneider, TECFA, April 2013 -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://edutechwiki.unige.ch/en/XML/"
          xmlns="http://edutechwiki.unige.ch/en/XML/"
      elementFormDefault="qualified">
     
  <xs:element name="cd-list">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
        <xs:element minOccurs="0" ref="description"/>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="cd"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
 
  <xs:element name="cd">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="artist"/>
        <xs:element ref="title"/>
        <xs:element minOccurs="0" ref="genre"/>
        <xs:element minOccurs="0" ref="duration"/>
        <xs:element minOccurs="0" ref="rating"/>
        <xs:element minOccurs="0" ref="price"/>
        <xs:element minOccurs="0" ref="publisher"/>
        <xs:element minOccurs="0" ref="description"/>
        <xs:element minOccurs="0" ref="track-list"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
 
  <xs:element name="track">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
        <xs:element minOccurs="0" ref="artist"/>
        <xs:element minOccurs="0" ref="genre"/>
        <xs:element minOccurs="0" ref="duration"/>
        <xs:element minOccurs="0" ref="composer"/>
      </xs:sequence>
      <xs:attribute name="no">
        <xs:simpleType>
          <xs:restriction base="xs:integer">
            <xs:maxInclusive value="99"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
 
  <xs:element name="artist" type="xs:string"/>
  <xs:element name="composer" type="xs:string"/>
 
  <xs:element name="description">
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="heading"/>
        <xs:element ref="p"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
 
  <xs:element name="duration" type="xs:string"/>
 
  <xs:element name="genre">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="Jazz"/>
          <xs:enumeration value="Blues"/>
          <xs:enumeration value="Hard Bop"/>
          <xs:enumeration value="Be Bop"/>
          <xs:enumeration value="Latin Jazz"/>
          <xs:enumeration value="Pop"/>
        </xs:restriction>
      </xs:simpleType>
  </xs:element>
 
  <xs:element name="heading" type="xs:string"/>
 
  <xs:element name="p">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:minLength value="200"/>
        <xs:maxLength value="1000"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <xs:element name="price">
    <xs:complexType mixed="true">
      <xs:attributeGroup ref="attlist.price"/>
    </xs:complexType>
  </xs:element>
 
  <xs:element name="publisher" type="xs:string"/>
  <xs:element name="rating" type="xs:string"/>
  <xs:element name="title" type="xs:string"/>
 
  <xs:element name="track-list">
    <xs:complexType>
      <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="track"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
 
  <xs:attributeGroup name="attlist.price">
    <xs:attribute name="currency" default="CHF">
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="CHF"/>
          <xs:enumeration value="Euros"/>
          <xs:enumeration value="Dollars"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:attributeGroup>
 
</xs:schema>
</source>
=== Recipe example ===
* http://tecfa.unige.ch/guides/xml/examples/xsd-examples/recipe.xml
* http://tecfa.unige.ch/guides/xml/examples/xsd-examples/recipe.xsd
* http://tecfa.unige.ch/guides/xml/examples/xsd-examples/recipe.dtd (FYI)
<source lang="XML">
<?xml version="1.0" encoding="UTF-8"?>
<!-- Simple recipe Schema -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://myrecipes.org/"
      xmlns="http://myrecipes.org/"
      elementFormDefault="qualified">
  <xs:element name="list">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="recipe"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="recipe">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="meta"/>
        <xs:element minOccurs="0" ref="recipe_author"/>
        <xs:element ref="recipe_name"/>
        <xs:element ref="meal"/>
        <xs:element ref="ingredients"/>
        <xs:element ref="directions"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="meta">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="author"/>
        <xs:element ref="date"/>
        <xs:element ref="version"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="version" type="xs:string"/>
  <xs:element name="date" type="xs:string"/>
  <xs:element name="author" type="xs:string"/>
  <xs:element name="recipe_author" type="xs:string"/>
  <xs:element name="recipe_name" type="xs:string"/>
  <xs:element name="meal" type="xs:string"/>
  <xs:element name="ingredients">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="item"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="item" type="xs:string"/>
  <xs:element name="directions">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="para"/>
        <xs:element ref="bullet"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  <xs:element name="bullet">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="strong"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="para">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="strong"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="strong" type="xs:string"/>
</xs:schema>
</source>
== Links ==
* [http://www.w3.org/TR/xmlschema-0/ XML Schema Part 0: Primer Second Edition] (W3C)
* [http://www.xfront.com/BestPracticesHomepage.html XML Schemas: Best Practices] (last updated 2006, when retrieved 18:58, 9 December 2010 (CET))
** [http://www.xfront.com/GlobalVersusLocal.html Global versus Local] (A Collectively Developed Set of Schema Design Guidelines)
* [http://www.w3schools.com/schema/default.asp XML Schema Tutorial] (W3Schools)
* [http://msdn.microsoft.com/en-us/library/ms256235.aspx XML Schemas (XSD) Reference] (Microsoft)

Latest revision as of 22:40, 30 April 2017

Introduction

This is a beginner's tutorial for XML Schema (often called XSD in reference to the file name extension *.xsd).

Objectives
  • Understand the purpose of XSD
  • Be able to cope with XSD editing
  • Translate DTDs to XSD with a conversion tool
  • Modify data types of a given XSD
  • Write very simple XSD grammars
  • Use XML files with XSD namespaces together with XSLT stylesheets
Prerequisites
  • Editing XML (being able to use a simple DTD). Catch up with the Editing XML tutorial
  • Be somewhat familiar with DTD's (see the DTD tutorial)
  • XML namespaces (some, have a look at the XML namespace article. At least you should know why the XSD prefix could be "xs" or "xsd" or "banana"....)
  • HTML and CSS (some)
  • Warning: XSD is a rather complex schema definition language. For one problem there always exist several good solutions.
Moving on

These slides have been prepared with the help of

Types of XML grammars

We may distinguish between several kinds of XML grammars

Kinds of XML grammars
  • A grammar-based schema specifies what elements may be used in an XML document, the order of the elements, the number of occurrences of each element, and finally the content and datatype of each element and attribute.
  • An assertion-based schema makes assertions about the relationships that must hold between the elements and attributes in an XML instance document.

Comparison between grammar-based schemas

Features | DTD | XML Schema | Relax NG
Adoption | widespread | Data-centric applications like web services | R&D mostly
Complexity of structure | Medium | Powerful (e.g. sets, element occurrence constraints) | Powerful
Data types | Little (10, mostly attribute values) | Powerful (44 + your own derived data types) | Powerful (same as XSD)
Overall complexity | low | high | medium
XML-based formalism | no | yes | yes (also a short notation)
Association with XML document | DOCTYPE declaration | Namespace declaration | No standard solution
Browser support | IE (not Firefox) | no | no
File suffix | *.dtd | *.xsd | *.rng / *.rnc
Entities | yes | no (use xinclude instead) | no
  • XML Schemas were created to define more precise grammars than with DTDs, in particular one can define Data Types and more sophisticated element structures
  • DTD supports 10 datatypes, mostly for attributes. XML Schema supports 44 datatypes and in addition, you can define your own.
  • Relax NG was a reaction by people who didn't like this new format. It is about as powerful as XSD, but not as complicated.

Resources

  • XML Schema (also called XSD or simply Schema) is difficult
  • A good way to learn XSD is to translate your own DTDs with a tool and then study the code
  • See also chapter 3. From DTDs to XSDs [30]

W3C websites:

url: http://www.w3.org/XML/Schema (W3C Overview Page)

url: http://www.w3.org/TR/xmlschema-0/ The W3C XML Schema primer

Specifications:

url: http://www.w3.org/TR/xmlschema-1/ XML Schema Part 1: Structures Second Edition 2004

url: http://www.w3.org/TR/xmlschema-2/ XML Schema Part 2: Datatypes Second Edition 2004

Tools:

Exchanger XML Editor can handle XML Schema

  • Support for XSD editing
  • Validation of XSD file
  • Validation of XML against XSD
  • DTD/XSD/Relax NG translation

XSD bare bones

The structure and namespace of an XSD file

  • Like any XML file, an XSD file should start with an XML declaration
  • Root of an XSD is <schema> ... </schema>
  • Attributes of schema are used to declare certain things (see later)
  • XSD makes use of namespaces since we have to make a distinction between code that belongs to XSD and code that refers to the defined elements and attributes (same principle as in XSLT).
  • Complex XSD files refer to more than one "Schema" namespace (see later)
Structure of an XSD file

Namespaces and namespace prefixes

Since XSD is XML, one must be able to distinguish XSD elements from the elements of the language you are defining.

  • You either can define a prefix for the XSD elements or one for your own XML elements. See solution 1 and 2 below
  • You then can decide whether your XML elements are namespaced or not

Solution 1: Give a namespace prefix to the XSD code

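  • We define the xs: prefix for the XSD namespace
  • It doesn't matter which prefix we use (usually xs:, but often xsd:)
  • Since no targetNamespace is declared, the elements of your target XML files will not be in a namespace. elementFormDefault="qualified" only has an effect once a target namespace is defined: it then requires locally declared elements to be in that namespace as well.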

Example: XSD definition for a simple recipe (ignore the details, and just look at the namespace declaration and prefix)

 <?xml version="1.0" encoding="UTF-8"?>
 <!-- Simple recipe Schema -->
 <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
   <xs:element name="list">
     <xs:complexType>
       <xs:sequence>
         <xs:element maxOccurs="unbounded" ref="recipe"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>
   .....
  </xs:schema>

Solution 2: Give a namespace to target code and then prefix it

The following solution is less often used.

  • We use a prefixed namespace for our XML elements
  • The XML Schema namespace becomes default namespace, i.e. XSD elements will not be prefixed as shown in the next example.

Example: XSD definition for a simple recipe

 <schema
	xmlns='http://www.w3.org/2001/XMLSchema'
	targetNamespace='http://yourdomain.org/namespace/'
	xmlns:t='http://yourdomain.org/namespace/'>

  <element name='list'>
   <complexType>
    <sequence>
     <element ref='t:recipe' maxOccurs='unbounded'/>
    </sequence>
   </complexType>
  </element>
  .....
 </schema>
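
As a hedged illustration, an instance document for the schema fragment above could look as follows. This is only a sketch: the t prefix used in the instance is arbitrary, and the contents of recipe are left out since its definition is not shown. Because list and recipe are global declarations in the target namespace, both must be namespace-qualified in the instance.

```xml
<?xml version="1.0"?>
<!-- Sketch: the prefix "t" is arbitrary, only the namespace URI must match targetNamespace -->
<t:list xmlns:t="http://yourdomain.org/namespace/">
  <!-- recipe is referenced with ref="t:recipe", i.e. it is a global element in the target namespace -->
  <t:recipe> ..... </t:recipe>
  <t:recipe> ..... </t:recipe>
</t:list>
```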

Association of an XSD with an XML file - validation

An XML document described by an XSD is called an instance document. As with DTDs, you do not need to create an association in the XML file in order to validate it: you can "manually" validate an XML file against an XSD, and most XML editors will allow you to do so. For example, in Exchanger XML Editor, simply click on the validate icon, then select the XSD file when asked.

Below, we show two solutions for "linking" an XSD to an XML file. Be aware, however, that XSLT stylesheets written for the XML file will need to be adapted, in particular when the elements are put into a namespace.

Association of XSD with XML, Solution 1

  • You must declare the XMLSchema-instance namespace. It's a little extra XML language that allows you to link XSDs to XML files.
  • The xsi:noNamespaceSchemaLocation attribute defines the URL of your XSD
  • Warning: Make sure you get its spelling and case right !!!

Example:

 <?xml version="1.0" ?>
 <list
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="recipe-no-ns.xsd">
   <recipe> ....
 </list>

XSD file: recipe-no-ns.xsd

 <?xml version="1.0" ?>
 <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
   <xs:element name="list">

Association of XSD with XML, Solution 2

This solution is more popular for various reasons (e.g. most XML languages require a namespace declaration anyhow).

1. Both the XML and the XSD file must contain a namespace declaration for your domain

2. The XML file must contain in addition:

  • a namespace declaration for XMLSchema-instance
  • an xsi:schemaLocation attribute that tells where to find the XSDs. This attribute can have as many "namespace-URL" pairs as you like

Example: XML for a simple recipe with an associated XSD

  <?xml version="1.0"?>
  <list
    xmlns="http://myrecipes.org/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://myrecipes.org/ recipe.xsd" 
   >
    <recipe>
      <meta> .....</meta>
      ......
    </recipe>
  </list>

If you wish to reuse this code fragment for your own XML: You must make two changes in the code above, i.e. define

  • A namespace for your own tags, e.g.
xmlns="http://yourdomain/something/"
  • Tell for a given namespace, where to find the XSD file, e.g.
xsi:schemaLocation="http://yourdomain/something/ some-schema.xsd"

Example XSD file:

 <?xml version="1.0"?>
 <!-- Simple recipe Schema -->
 <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://myrecipes.org/" 
            xmlns="http://myrecipes.org/" 
            elementFormDefault="qualified">
    ....
 </xs:schema>
  • This XSD defines a default namespace (no prefixes) for your tags
  • Again, in your XML, you should replace http://myrecipes.org/ with a URL of your own, preferably a URL over which you have control, e.g. a blog or a home page.

Example: IMS Content Packaging 1.1.4 and IMS/LOM Metadata

This XML file uses two XML vocabularies: imscp and imsmd

 <manifest 
   xmlns="http://www.imsglobal.org/xsd/imscp_v1p1"
   xmlns:imsmd="http://www.imsglobal.org/xsd/imsmd_v1p2" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   identifier="MANIFEST-1"
   xsi:schemaLocation=
    "http://www.imsglobal.org/xsd/imscp_v1p1 imscp_v1p1.xsd 
     http://www.imsglobal.org/xsd/imsmd_v1p2 imsmd_v1p2p2.xsd">
  <metadata>
     <imsmd:lom> ...... </imsmd:lom>
  </metadata>
  <organizations default="learning_sequence_1">
 .....
  • imscp_v1p1 is the default namespace (no prefix)
  • imsmd_v1p2 is the namespace for metadata.

Extract of imscp_v1p1.xsd

 <xsd:schema 
     xmlns = "http://www.imsglobal.org/xsd/imscp_v1p1"
     targetNamespace = "http://www.imsglobal.org/xsd/imscp_v1p1"
     xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
     xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
     version = "IMS CP 1.1.4"
     elementFormDefault = "qualified">

Defining elements, attributes and structure

Element definitions

Recall that XML structure is mostly about defining and nesting elements, so we first need to learn how to define elements.

Elements are defined with xs:element:

<xs:element>

Example of a simple element without children and attributes:

<xs:element name="author" type="xs:string"/>

Its DTD equivalent would be:

<!ELEMENT author (#PCDATA)>

Element children can be defined in two ways:

  1. The Russian puppet model: use a 'complexType' child element
  2. The Salami model: use a 'type' attribute that refers to a data type you defined as 'complexType'.

Let's examine both ways:

The Russian puppet model: <xs:complexType> (1)

complexType is a child element of element and it defines the possible "data structures" for the element. In the example below we define five child elements for "recipe", i.e. meta, recipe_author, recipe_name, ingredients and directions. We could also define attributes and grandchildren that way.

 <xs:element name="recipe">
     <xs:complexType>
       <xs:sequence>
         <xs:element ref="meta"/>
         <xs:element minOccurs="0" ref="recipe_author"/>
         <xs:element ref="recipe_name"/>
         <xs:element ref="ingredients"/>
         <xs:element ref="directions"/>
       </xs:sequence>
     </xs:complexType> 
   </xs:element>

ref=.... will point to the definition of the sub-element (see further below for an example)

The Russian puppet model is recommended for very simple grammars only. The Salami model below is more modular and therefore a better solution most of the time.

The Salami model: <xs:complexType> (2)

You can declare a complex type by itself and then "use it" in an element declaration.

Example XSD: recipe2.xsd

  • Defining an element that refers to a complex type for its child elements:
 <xs:element name="recipe" type="recipe_contents" />
  • Defining the complex type:
 <xs:complexType name="recipe_contents">
       <xs:sequence>
         <xs:element ref="meta"/>
         <xs:element minOccurs="0" ref="recipe_author"/>
         <xs:element ref="recipe_name"/>
         <xs:element ref="meal"/>
         <xs:element ref="ingredients"/>
         <xs:element ref="directions"/>
       </xs:sequence>
 </xs:complexType>
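
To illustrate why the Salami model is more modular, the following sketch reuses the named type for a second element. The element name recipe_draft is hypothetical and not part of recipe2.xsd.

```xml
<!-- Both elements share the content model that is defined once in recipe_contents -->
<xs:element name="recipe" type="recipe_contents"/>
<!-- "recipe_draft" is a hypothetical element name, used for illustration only -->
<xs:element name="recipe_draft" type="recipe_contents"/>
```

With the Russian puppet model, the whole xs:sequence would have to be copied into the second element definition.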

Now before we further explain how to define element contents, let us have a look at data types.

Data types

Simple data types allow you to define what kind of data elements and attributes can contain.

Examples:

Simple Type | Examples (delimited by commas) | Explanation
string | Confirm this is electric | A text string
base64Binary | GpM7 | Base64 encoded binary data
hexBinary | 0FB7 | HEX encoded binary data
integer | ...-1, 0, 1, ... |
positiveInteger | 1, 2, ... |
negativeInteger | ... -2, -1 |
nonNegativeInteger | 0, 1, 2, ... |
long | -9223372036854775808, ... -1, 0, 1, ... 9223372036854775807 |
decimal | -1.23, 0, 123.4, 1000.00 |
float | -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN |
boolean | true, false, 1, 0 |
duration | P1Y2M3DT10H30M12.3S | 1 year, 2 months, 3 days, 10 hours, 30 minutes, and 12.3 seconds
dateTime | 1999-05-31T13:20:00.000-05:00 | May 31st 1999 at 1.20pm Eastern Standard Time
date | 1999-05-31 |
time | 13:20:00.000, 13:20:00.000-05:00 |
gYear | 1999 |
Name | shipTo | XML 1.0 Name type
QName | po:USAddress | XML Namespace QName
anyURI | http://www.example.com/ | Any sort of URI
language | en-GB, en-US, fr | valid values for xml:lang as defined in XML 1.0

In addition to the built-in simple data types, one can define list types, union types and complex types. We already have shown an example with complex types above. Complex types include structural information, i.e. use of child elements.

We shall introduce how to define simple types, e.g. lists of terms or ranges of numerical values that the user must choose from, in the section Value restrictions.
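
As a minimal sketch of how built-in simple types are used, the declarations below attach types from the table above to a few elements. The element names (birthday, homepage, copies, in_stock) are hypothetical.

```xml
<!-- Hypothetical element names; the types are built-in XSD simple types -->
<xs:element name="birthday" type="xs:date"/>
<xs:element name="homepage" type="xs:anyURI"/>
<xs:element name="copies"   type="xs:positiveInteger"/>
<xs:element name="in_stock" type="xs:boolean"/>

<!-- Instance fragments that these declarations would accept -->
<birthday>1999-05-31</birthday>
<homepage>http://www.example.com/</homepage>
<copies>2</copies>
<in_stock>true</in_stock>
```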

Organization of elements

  • XSD allows for quite sophisticated occurrence constraints, i.e. how child elements can be used within an element. Here we only cover a few basic design patterns
  • Both child elements and attributes are defined within complexTypes, i.e. as possible element and attribute combinations that can be inserted within an element.

Salami vs. Russian puppet style

As already mentioned, it is usually best to define all elements in a flat list and then refer to these when you define how child elements are to be inserted.

Defining elements within elements (not so good)

 <xs:element name="meta">
     <xs:complexType>
       <xs:sequence>
         <xs:element name="author" type="xs:string"/>
         <xs:element name="version" type="xs:string"/>
         <xs:element name="date" type="xs:string"/>
       </xs:sequence>
     </xs:complexType>
 </xs:element>

Defining child elements with a reference

This is generally a better solution since you can then reuse a definition. This replaces the functionality of parameter entities in a DTD.

The XML meta element has an author child element. ref="author" refers to a definition for author made elsewhere, e.g. below or before.

 <xs:element name="meta">
   <xs:complexType>
     <xs:sequence>
         <xs:element ref="author"/>
         .....
     </xs:sequence>
   </xs:complexType>
 </xs:element>

 .....
 <xs:element name="author" type="xs:string"/>
 ....

Sequences

  • The number of times a child element can occur is defined with the minOccurs and maxOccurs attributes. Both default to 1, i.e. without these attributes a child element must occur exactly once.

Example: A list of three ordered child elements (salami style definition)

The element meta has three child elements (author, date, version) that must be used in that order.

   <xs:element name="meta">
     <xs:complexType>
       <xs:sequence>
         <xs:element ref="author"/>
         <xs:element ref="date"/>
         <xs:element ref="version"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>

   <xs:element name="version" type="xs:string"/>
   <xs:element name="date" type="xs:string"/>
   <xs:element name="author" type="xs:string"/>

Example: A list with one or more recipe child elements

   <xs:element name="list">
     <xs:complexType>
       <xs:sequence>
         <xs:element maxOccurs="unbounded" ref="recipe"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>

The list element can include between 1 and N recipe elements.
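
minOccurs and maxOccurs can also take explicit numbers. The following sketch, a hypothetical variation of the list above, would require at least 2 and at most 10 recipe elements.

```xml
<xs:element name="list">
  <xs:complexType>
    <xs:sequence>
      <!-- Hypothetical bounds: between 2 and 10 recipe children -->
      <xs:element minOccurs="2" maxOccurs="10" ref="recipe"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```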

Example: A list with an optional email element - repeatable

  <xs:element name="person">
     <xs:complexType>

       <xs:sequence>
         <xs:element ref="name"/>
         <xs:element minOccurs="0" maxOccurs="unbounded" ref="email"/>
         <xs:element ref="link"/>
       </xs:sequence>

       <xs:attributeGroup ref="attlist.person"/>

     </xs:complexType>
   </xs:element>

The person element

  • must include a name element
  • can include 0, 1 or many email elements
  • must include a link element
  • also defines attributes (see below)

Choice

Example: Optional repeatable child elements

XSD:

   <xs:element name="INFOS">
     <xs:complexType>
       <xs:choice minOccurs="0" maxOccurs="unbounded">
         <xs:element ref="date"/>
         <xs:element ref="author"/>
         <xs:element ref="a"/>
       </xs:choice>
     </xs:complexType>
   </xs:element>


XML:

<INFOS> <date>...</date> <author>...</author> <a>...</a> <date>...</date> 
 .... <date>...</date>
</INFOS>


Example: Either - or child elements

XSD:

  <xs:element name="ATTEMPT">
     <xs:complexType>
       <xs:choice>
         <xs:element ref="action"/>
         <xs:element ref="EPISODE"/>
       </xs:choice>
     </xs:complexType>
   </xs:element>

XML:

 <ATTEMPT> <action>He killed the DTD </action> </ATTEMPT>
 <ATTEMPT> <EPISODE> ...... </EPISODE> </ATTEMPT>

Mixed contents (tags and text)

XSD:

The complexType is defined as mixed="true" as shown below.

   <xs:element name="para">
     <xs:complexType mixed="true">
       <xs:sequence>
         <xs:element minOccurs="0" maxOccurs="unbounded" ref="strong"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>
   <xs:element name="strong" type="xs:string"/>

XML:

 <para> XML is <strong>so</strong> cool ! </para>

Empty elements

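  • Simply define an element and do not define any child elements. Note that an element of type xs:string may be left empty, but it can also contain text.
<xs:element name="author" type="xs:string"/>

Of course this also applies to complex elements: a complexType that declares no child elements (and is not mixed) only allows an empty element.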
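
As a sketch, an element that must always be empty can be declared with a complexType that has no content model. The element name separator is hypothetical.

```xml
<!-- A complexType without child elements (and without mixed="true") only allows empty content -->
<xs:element name="separator">
  <xs:complexType/>
</xs:element>

<!-- Valid instance -->
<separator/>
```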

Attributes

To declare attributes, you must define the element they belong to as a complexType, since elements with a simple type cannot have attributes.

We will not cover all possibilities here, but just demonstrate with examples

Russian puppet style

A typical attribute definition inside an element definition looks like this:

 <xs:element name="Name">
  <xs:complexType>
    <xs:attribute name="lang" type="xs:string" use="required"/>
  </xs:complexType>
 </xs:element>

The use parameter can be either optional, prohibited or required. The default is "optional".

The above code defines an empty element with an attribute. It is actually a shorthand notation for a longer expression (not shown here). If the element should contain text in addition to the attribute, use xs:simpleContent to extend a simple type:

 <xs:element name="Name">
  <xs:complexType>
   <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="lang" type="xs:string" use="required"/>
    </xs:extension>
   </xs:simpleContent>
  </xs:complexType>
 </xs:element>

XML example

<Name lang="English"/>
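
Besides use, an attribute declaration can carry a default or a fixed value. The sketch below is a hypothetical variation of the Name element; the attribute name version and all values are illustrative assumptions.

```xml
<xs:element name="Name">
  <xs:complexType>
    <!-- If lang is omitted in the XML, a validating parser supplies "English" -->
    <xs:attribute name="lang" type="xs:string" default="English"/>
    <!-- If version is present, its value must be exactly "1.0" -->
    <xs:attribute name="version" type="xs:string" fixed="1.0"/>
  </xs:complexType>
</xs:element>
```

Note that default can only be combined with use="optional", which is why neither attribute is declared as required here.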

Attribute groups

Salami style

More complex attributes are better declared with attribute groups, since attribute groups are reusable, i.e. the equivalent of a DTD's parameter entities.

Example: Defining attributes with attribute groups

url: family.xsd

 <xs:element name="person">
     <xs:complexType>
       <xs:attributeGroup ref="attlist.person"/>
     </xs:complexType>
 </xs:element>

The element definition above refers to a named attribute group defined below:

 <xs:attributeGroup name="attlist.person">
     <xs:attribute name="name" use="required"/>


     <xs:attribute name="gender">
       <xs:simpleType>
         <xs:restriction base="xs:token">
           <xs:enumeration value="male"/>
           <xs:enumeration value="female"/>
         </xs:restriction>
       </xs:simpleType>
     </xs:attribute>

     <xs:attribute name="type" default="mother">
       <xs:simpleType>
         <xs:restriction base="xs:token">
           <xs:enumeration value="mother"/>
           <xs:enumeration value="father"/>
           <xs:enumeration value="boy"/>
           <xs:enumeration value="girl"/>
         </xs:restriction>
       </xs:simpleType>
     </xs:attribute>

     <xs:attribute name="id" use="required" type="xs:ID"/>
 </xs:attributeGroup>

Valid XML fragment:

url: family.xml

 <family>
   <person name="Joe Miller" gender="male" type="father" id="I123456789"/>
   <person name="Josette Miller" type="girl" id="I123456987"/>
 </family>
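
Since attribute groups are named, they can be referenced from several elements. A hedged sketch, in which the relative element is hypothetical and not part of family.xsd:

```xml
<!-- Hypothetical second element that reuses the same attribute declarations -->
<xs:element name="relative">
  <xs:complexType>
    <xs:attributeGroup ref="attlist.person"/>
  </xs:complexType>
</xs:element>
```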

Value restrictions

In a loose sense, by value restriction we refer to the fact that XSD allows you to define various kinds of data types (e.g. lists) or restrictions on data types such as numbers and strings.

Simple user-defined types (simpleType)

Simple types allow you to define, for example, lists of words and selections. An element then refers to such a type with the type attribute.

Example: A list of numbers

XSD:

 <xsd:element name="listOfMyInt" type="listOfMyIntType"/>

 <xsd:simpleType name="listOfMyIntType">
     <xsd:list itemType="xsd:integer"/>
 </xsd:simpleType>

XML:

 <listOfMyInt>20003 15037 95977 95945</listOfMyInt>

Example: Restricted lists of words to choose from (in two variants)

The user must choose between a list of possible contents. The example below defines restrictions on element contents: three alternatives for a theory element and five alternatives for a Country element.

XSD:

<!-- (1) A modular solution -->
 <xsd:element name="theory" type="list_theories"/>

 <xsd:simpleType name="list_theories">
     <xsd:restriction base="xsd:string">
         <xsd:enumeration value="constructivism"/>
         <xsd:enumeration value="behaviorism"/>
         <xsd:enumeration value="cognitivism"/>
     </xsd:restriction>
 </xsd:simpleType>

<!-- (2) A russian puppet solution -->

 <xsd:element name="Country">
   <xsd:simpleType>
     <xsd:restriction base="xsd:string">
       <xsd:enumeration value="FR" />
       <xsd:enumeration value="DE" />
       <xsd:enumeration value="ES" />
       <xsd:enumeration value="UK" />
       <xsd:enumeration value="CH" />
     </xsd:restriction>
   </xsd:simpleType>
 </xsd:element>

Valid XML example:

 <theory>constructivism</theory>
 <Country>CH</Country>

Example: Restrictions of a single number

XSD (using russian puppet style):

 <xs:element name="age">

  <xs:simpleType>
   <xs:restriction base="xs:integer">
     <xs:minInclusive value="0"/>
     <xs:maxInclusive value="120"/>
   </xs:restriction>
  </xs:simpleType>

 </xs:element>

XML:

 <age>100</age>

Value constraints with xs:restriction

One can put constraints on element or attribute values in several ways. We suggest finding solutions to typical problems by googling (including regexp web sites).

Example we already introduced above: Restrict values for an age element

<xs:element name="age">
  <xs:simpleType>
     <xs:restriction base="xs:integer">
        <xs:minInclusive value="0"/>
        <xs:maxInclusive value="120"/>
     </xs:restriction>
  </xs:simpleType>
</xs:element>

We also could have required a 1-3 digit number, being optimistic about future life expectancy...

<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:totalDigits value="3"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
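
Note that totalDigits only limits the number of digits, so the definition above would also accept negative numbers such as -5. A possible sketch that excludes them uses xs:nonNegativeInteger as base type:

```xml
<xs:element name="age">
  <xs:simpleType>
    <!-- nonNegativeInteger excludes negative values; totalDigits limits the value to 0..999 -->
    <xs:restriction base="xs:nonNegativeInteger">
      <xs:totalDigits value="3"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```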

The next example shows how to require that a string includes at least 200 characters and no more than 1000.

  <xs:element name="p">
  	<xs:simpleType>
      <xs:restriction base="xs:string">  
        <xs:minLength value="200"/>  
        <xs:maxLength value="1000"/>  
      </xs:restriction> 
    </xs:simpleType>
  </xs:element>

A quite powerful method is to use regular expressions with the pattern element:

The following regexp would require a four-letter word starting with G:

 G[a-z]{3}
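
As a sketch, such a regular expression is placed in the value attribute of an xs:pattern element. The element name g_word is hypothetical. XSD patterns always have to match the whole value, so no ^ or $ anchors are needed.

```xml
<xs:element name="g_word">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- "G" followed by exactly three lowercase letters, e.g. Gold -->
      <xs:pattern value="G[a-z]{3}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```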

The following XSD fragment defines a legal email address (hopefully)

<xs:element name="Email">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- too complex
      <xs:pattern value="\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"/>
      -->
      <xs:pattern value="(\w[-._\w]*\w@\w[-._\w]*\w\.\w{2,3})"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

The following example, found in the XML Schema regular expressions tutorial by Cliff Binstock, specifies that a part number consists of an uppercase character followed by 1 or more decimal digits:

<xsd:element name="Part" type="partNumber"/>

<xsd:simpleType name="partNumber">
  <xsd:restriction base="xsd:token">
    <xsd:pattern value="[A-Z]\d+"/>
  </xsd:restriction>
</xsd:simpleType>
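
For illustration, the first fragment below should be accepted by this definition and the other two rejected (the values are made up):

```xml
<Part>A123</Part>   <!-- valid: one uppercase letter followed by digits -->
<Part>a123</Part>   <!-- invalid: the first character is lowercase -->
<Part>AB12</Part>   <!-- invalid: two letters before the digits -->
```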

Picking from a list of words

The user must select at least one word from the following list. The words can be used in any order and can be repeated. Please note, however, that this kind of constraint is better defined with a regular expression. Also note that the comma must be preceded by a blank space, since list items are separated by white space.

<xs:element name="allowed_words">
    <xs:simpleType>
        <xs:restriction>
            <xs:simpleType>
                <xs:list>
                    <xs:simpleType>
                        <xs:restriction base="xs:token">
                            <xs:enumeration value="I"/>
                            <xs:enumeration value="you"/>
                            <xs:enumeration value="am"/>
                            <xs:enumeration value="are"/>
                            <xs:enumeration value="here"/>
                            <xs:enumeration value=","/>
                        </xs:restriction>
                    </xs:simpleType>
                </xs:list>
            </xs:simpleType>
            <xs:minLength value="1"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>

Below are two valid examples:

   <allowed_words>here I am</allowed_words>
   <allowed_words>here I am , you are</allowed_words>
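
The regular expression alternative mentioned above could be sketched as follows. This is an assumption about what such a pattern might look like, not a tested equivalent of the list-based definition. With xs:token as base type, white space is normalized before the pattern is applied.

```xml
<xs:element name="allowed_words">
  <xs:simpleType>
    <xs:restriction base="xs:token">
      <!-- One allowed word, followed by any number of (space + allowed word) -->
      <xs:pattern value="(I|you|am|are|here|,)( (I|you|am|are|here|,))*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```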

Restriction on attributes

The exact same logic applies to attributes. The following example defines a price element with an attribute defining the currency. Note that we could have included the attribute definition within the definition of the "price" element (Russian puppet style).

   <xs:element name="price">
    <xs:complexType mixed="true">
      <xs:attributeGroup ref="attlist.price"/>
    </xs:complexType>
  </xs:element>

  <xs:attributeGroup name="attlist.price">
    <xs:attribute name="currency" default="CHF">
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="CHF"/>
          <xs:enumeration value="Euros"/>
          <xs:enumeration value="Dollars"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:attributeGroup>
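
The Russian puppet variant mentioned above could look like the following sketch, where the attribute is declared directly inside the price element instead of through attlist.price:

```xml
<xs:element name="price">
  <xs:complexType mixed="true">
    <!-- Same attribute definition as in attlist.price, but nested in the element -->
    <xs:attribute name="currency" default="CHF">
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="CHF"/>
          <xs:enumeration value="Euros"/>
          <xs:enumeration value="Dollars"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>
```

This version is shorter, but the currency definition can no longer be reused by other elements.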

Some Design patterns

(needs to be expanded over the years .... - Daniel K. Schneider 18:02, 9 December 2010 (CET))

In the meantime see also:

Mixed contents with typed elements inside

<?xml version="1.0" encoding="UTF-8"?>

<!-- A mixed type -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://mymix.org/" 
	       targetNamespace="http://mymix.org/" elementFormDefault="qualified">
	       
	<xs:element name="list">
		<xs:complexType>
			<xs:sequence>
				<xs:element ref="TextAndNumbers" maxOccurs="unbounded"/>
			</xs:sequence>
		</xs:complexType>		
	</xs:element>
	
	<xs:element name="TextAndNumbers" type="TextNumberMix"/>
	
	<xs:complexType name="TextNumberMix">
		<xs:complexContent mixed="true">
			<xs:restriction base="xs:anyType">
				<xs:sequence>
					<xs:element name="number1" type="xs:integer"/>
					<xs:element name="number2" type="xs:integer"/>
					<xs:element name="number3" type="xs:integer"/>
				</xs:sequence>
			</xs:restriction>
		</xs:complexContent>
	</xs:complexType>
	
</xs:schema>

XML file

<?xml version="1.0"?>

<list xmlns="http://mymix.org/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	 xsi:schemaLocation="http://mymix.org/ mixed-text-with-numbers.xsd">
	 
	<TextAndNumbers>
	I am <number1>44</number1> years old and I like <number2>4</number2> times the number <number3>11</number3>
	</TextAndNumbers>
	<TextAndNumbers>
	He is <number1>10</number1> meters tall.
	And he weighs <number2>1000</number2> kilos.
	You can earn <number3>10</number3> cents if you figure out who he is.
	</TextAndNumbers>
</list>

Converting DTDs to XSDs

Below we present a few typical translation patterns.

Most decent XML editors have a built-in translator that will do most of the work. However, generated XSD code is not necessarily the prettiest ...

  • e.g. in Exchanger XML Editor: Use Menu Schema -> Convert Schema. The result is fairly good. Make sure to validate the DTD before you translate!

Below we present a table including XSD definitions for typical DTD structural elements. In the examples we use a namespace prefix for the XML and none for the Schema. Therefore an *.xsd file would look like this:

<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
	    xsi:schemaLocation="http://www.w3.org/2001/XMLSchema file:/usr/local/xngr/types/XML%20Schema/Validation/XMLSchema.xsd"
	    xmlns:t="http://testing.org/" 
	    targetNamespace="http://testing.org/" >

 <element name="ROOT">
  <complexType>
    <sequence>
     <element ref="t:A"/>
     <element ref="t:B"/>
    </sequence>
  </complexType>
 </element>

 <element name="A" type="string"/>
 <element name="B" type="string"/>
</schema>
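For illustration, here is a small XML file that should validate against the schema above. The element contents are made up, and the prefix t: is an arbitrary choice: any prefix (or a default namespace declaration) bound to http://testing.org/ would do.

```xml
<?xml version="1.0"?>
<!-- ROOT, A and B are global elements, so all three live in the
     target namespace http://testing.org/ -->
<t:ROOT xmlns:t="http://testing.org/">
  <t:A>some text</t:A>
  <t:B>some more text</t:B>
</t:ROOT>
```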

The DTD to XSD examples were originally taken from http://www.w3.org/2000/04/schema_hack/

DTD
XML Schema
<!ELEMENT ROOT (A,B) >
<element name="ROOT">
  <complexType>
    <sequence>
     <element ref="t:A"/>
     <element ref="t:B"/>
    </sequence>
  </complexType>
 </element>
<!ELEMENT ROOT (A|B) >
<element name="ROOT">
  <complexType>
   <choice>
    <element ref="t:A"/>
    <element ref="t:B"/>
   </choice>
  </complexType>
 </element>
<!ELEMENT ROOT (A|(B,C)) >
<element name="ROOT">
  <complexType>
   <choice>
    <element ref="t:A"/>
    <sequence>
     <element ref="t:B"/>
     <element ref="t:C"/>
    </sequence>
   </choice>
  </complexType>
 </element>
<!ELEMENT ROOT (A?,B+,C*) >
<element name="ROOT">
  <complexType>
    <sequence>
     <element ref="t:A" minOccurs="0"/>
     <element ref="t:B" maxOccurs="unbounded"/>
     <element ref="t:C" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
 </element>
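The patterns above only cover element-only content. As a hedged sketch in the same notation (schema vocabulary without prefix, target elements with the t: prefix), the three other common DTD content models could be translated as follows. Each DTD rule is shown as a comment above its XSD counterpart:

```xml
<!-- DTD: <!ELEMENT A (#PCDATA)> -->
<element name="A" type="string"/>

<!-- DTD: <!ELEMENT A EMPTY> -->
<element name="A">
  <complexType/>
</element>

<!-- DTD: <!ELEMENT ROOT (#PCDATA|A|B)*> -->
<element name="ROOT">
  <complexType mixed="true">
    <choice minOccurs="0" maxOccurs="unbounded">
      <element ref="t:A"/>
      <element ref="t:B"/>
    </choice>
  </complexType>
</element>
```

The last pattern is the one used for the description element in the CD-List example further down.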

Attribute definitions

DTD
XML Schema
<!ATTLIST ROOT a CDATA #REQUIRED>
<element name="ROOT">
  <complexType>
   <attribute name="a" type="string" use="required"/>
  </complexType>
 </element>
<!ATTLIST ROOT a CDATA #IMPLIED>
<element name="ROOT">
  <complexType>
   <attribute name="a" type="string" use="optional"/>
  </complexType>
 </element>
<!ATTLIST ROOT a (x|y|z) #REQUIRED>
<element name="ROOT">
  <complexType>
   <attribute name="a" use="required">
    <simpleType>
     <restriction base="string">
      <enumeration value="x"/>
      <enumeration value="y"/>
      <enumeration value="z"/>
     </restriction>
    </simpleType>
   </attribute>
  </complexType>
 </element>
<!ATTLIST ROOT a CDATA #FIXED "x">
<element name="ROOT">
  <complexType>
   <attribute name="a" type="string" fixed="x"/>
  </complexType>
 </element>
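A few more attribute patterns that often show up in DTDs, sketched in the same notation. The attribute names id and ref are only illustrations. Each DTD rule is shown as a comment above the corresponding XSD attribute declaration:

```xml
<element name="ROOT">
  <complexType>
   <!-- DTD: <!ATTLIST ROOT a CDATA "x">  (default value) -->
   <attribute name="a" type="string" default="x"/>
   <!-- DTD: <!ATTLIST ROOT id ID #IMPLIED> -->
   <attribute name="id" type="ID"/>
   <!-- DTD: <!ATTLIST ROOT ref IDREF #IMPLIED> -->
   <attribute name="ref" type="IDREF"/>
  </complexType>
 </element>
```

Attributes are optional unless use="required" is given, so #IMPLIED needs no extra markup.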

Reminder: as we explained above, either the XSD vocabulary or the target language must use a namespace prefix for the element names (not the attributes). For example, the first rule above could also have been written like this:

DTD
XML Schema
<!ELEMENT ROOT (A,B) >
<xs:element name="ROOT">
  <xs:complexType>
    <xs:sequence>
     <xs:element ref="A"/>
     <xs:element ref="B"/>
    </xs:sequence>
  </xs:complexType>
 </xs:element>
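For this variant to work, the namespace declarations at the top of the *.xsd file must be swapped as well. A minimal sketch of the complete file, assuming the same http://testing.org/ target namespace as before: the schema vocabulary now gets the xs: prefix and the target namespace becomes the default namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://testing.org/"
           targetNamespace="http://testing.org/">

 <xs:element name="ROOT">
  <xs:complexType>
    <xs:sequence>
     <!-- unprefixed references resolve to the default namespace -->
     <xs:element ref="A"/>
     <xs:element ref="B"/>
    </xs:sequence>
  </xs:complexType>
 </xs:element>

 <!-- built-in types now need the xs: prefix -->
 <xs:element name="A" type="xs:string"/>
 <xs:element name="B" type="xs:string"/>
</xs:schema>
```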

XSD, XSLT and CSS association

Now it will get hairier. We will discuss:

  • How to associate both a DTD and an XSD with an XML file (i.e. you will have to change the DTD)
  • How to do an XSLT transform on an XML file that includes namespaces. See also XSLT for compound documents tutorial
  • How to use a CSS for an XML file that includes namespaces.

Associating both DTD and XSD with an XML file

If you associate an XSD with an XML file, validation against the DTD will break, because the namespace attributes needed for associating the XSD are not declared in the DTD. If you plan to keep the DTD, you will have to add these additional attributes to it. If you don't manage, simply remove the DTD declaration from the XML file :)

XML - cd-list.xml:

<?xml version="1.0"?>
<!DOCTYPE cd-list SYSTEM "cd-list.dtd">
<cd-list xmlns="http://edutechwiki.unige.ch/en/XML/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://edutechwiki.unige.ch/en/XML/ cd-list.xsd" 
	>
 ......

XSD - cd-list.xsd:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
	   targetNamespace="http://edutechwiki.unige.ch/en/XML/" 
           xmlns="http://edutechwiki.unige.ch/en/XML/" 
	   elementFormDefault="qualified">

Modified part of cd-list.dtd, with declarations added for the attributes that associate the XSD with the XML file:

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT cd-list (title,description?, cd*)>
<!ATTLIST cd-list xmlns CDATA #IMPLIED >
<!ATTLIST cd-list xmlns:xsi CDATA #IMPLIED >
<!ATTLIST cd-list xsi:schemaLocation CDATA #IMPLIED >

Using an XSLT with an XML file that includes namespaces

Since you now have namespaces declared in your XML file, XSLT transforms written for the namespace-free version will be broken. The XSLT processor requires prefixes for the elements of the XML input if it has a namespace declared, because unprefixed names in XPath expressions and match patterns only match elements that are in no namespace. In XSLT 1.0 this can't be helped. Read more in XSLT for compound documents tutorial and maybe XML namespace.

If adding namespaces in your XSLT to target XML elements sounds too complicated, then consider removing the XSD declarations from your XML file. You can validate the file by associating "manually" an XSD in your editor.
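If you can use an XSLT 2.0 processor, there is another way out: the xpath-default-namespace attribute makes unprefixed names in match and select refer to a namespace of your choice. The following is only a sketch, not the stylesheet used in this tutorial. It assumes the cd-list namespace used above, and it will not work in web browsers, which generally implement XSLT 1.0 only.

```xml
<xsl:stylesheet
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns="http://www.w3.org/1999/xhtml"
	xpath-default-namespace="http://edutechwiki.unige.ch/en/XML/"
	version="2.0">

	<!-- Unprefixed element names in match/select now refer to the
	     cd-list namespace, so no my: prefix is needed -->
	<xsl:template match="/cd-list">
		<html>
			<head>
				<title><xsl:value-of select="title"/></title>
			</head>
			<body>
				<h1><xsl:value-of select="title"/></h1>
			</body>
		</html>
	</xsl:template>

</xsl:stylesheet>
```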

Example files, where the XSLT is adapted to namespaced XML files:

XML

<?xml version="1.0"?>
<?xml-stylesheet href="cd-list.xsl" type="text/xsl"?>
		
<cd-list xmlns="http://edutechwiki.unige.ch/en/XML/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://edutechwiki.unige.ch/en/XML/ cd-list.xsd" 
	>

.......

XSLT (start of file)

<xsl:stylesheet 
	xmlns:xsl = "http://www.w3.org/1999/XSL/Transform" 
	xmlns:my = "http://edutechwiki.unige.ch/en/XML/" 
	xmlns = "http://www.w3.org/1999/xhtml"
	version="1.0">
	
	<xsl:output method="xml" 
		doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" 
		doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" 
		indent="yes"/>

        <!-- root of XML file -->
	<xsl:template match="/">
		  <xsl:apply-templates select="my:cd-list"/>
	</xsl:template>
	
	<!-- CD list contents -->
	<xsl:template match="my:cd-list">
		<html xmlns="http://www.w3.org/1999/xhtml" >
			<head>
				<title>
					<xsl:value-of select="my:title"/>
				</title>
			</head>
			<body bgcolor="#FFFFFF">
		      <h1><xsl:value-of select="my:title"/></h1>
		      <xsl:apply-templates select="my:cd"/>
	         </body>
        </html>
	</xsl:template>

     .........

Also note that we start from the root of the file, i.e. "/". Alternatively, we could have used match="/my:cd-list" for the template that generates the HTML root. In both cases the my: prefix is required: an unprefixed cd-list would not match the namespaced element.

The same problem exists if you produce HTML5, HTML4, etc. e.g. examine:

Note: there is a dangling namespace declaration in the output which won't hurt, but should be removed some day ....
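One way to get rid of such a leftover declaration is the exclude-result-prefixes attribute of XSLT 1.0. The sketch below assumes that the dangling declaration is the xmlns:my one copied from the stylesheet. It is a shortened version of the stylesheet shown above, not a tested replacement.

```xml
<xsl:stylesheet
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:my="http://edutechwiki.unige.ch/en/XML/"
	xmlns="http://www.w3.org/1999/xhtml"
	exclude-result-prefixes="my"
	version="1.0">

	<!-- The my: prefix is still needed to select input elements,
	     but its declaration is no longer copied to the output -->
	<xsl:template match="/my:cd-list">
		<html>
			<head>
				<title><xsl:value-of select="my:title"/></title>
			</head>
			<body>
				<h1><xsl:value-of select="my:title"/></h1>
			</body>
		</html>
	</xsl:template>

</xsl:stylesheet>
```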

Use of a CSS

The CSS should work as is, i.e. it doesn't need namespaces.

Example:

XSD Examples

Below are two simple examples. There are more of the kind in the xsd-examples directory.

If you need complex industrial-strength examples, consider looking at various data-centric standards, e.g.

Office Open XML (Microsoft Office products like Word, Powerpoint, etc.)

If you want to see a Relax NG schema (for comparison purposes), consider looking at:

X3D (a 3D graphics standard for the web)

SOAP (a standard for machine to machine communication)

CD-List example

Compared to the DTD, the XSD includes just a few data restrictions (I may add some more some day ....). Also, please note that the restriction on "genre" is not helpful, i.e. there is not enough choice. Try to fix this.

<?xml version="1.0" encoding="UTF-8"?>
<!-- XSD for a simple CD list -->
<!-- Made by daniel k. schneider, TECFA, April 2013 -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
		   targetNamespace="http://edutechwiki.unige.ch/en/XML/" 
           xmlns="http://edutechwiki.unige.ch/en/XML/" 
	       elementFormDefault="qualified">
	       
  <xs:element name="cd-list">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
        <xs:element minOccurs="0" ref="description"/>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="cd"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  
  <xs:element name="cd">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="artist"/>
        <xs:element ref="title"/>
        <xs:element minOccurs="0" ref="genre"/>
        <xs:element minOccurs="0" ref="duration"/>
        <xs:element minOccurs="0" ref="rating"/>
        <xs:element minOccurs="0" ref="price"/>
        <xs:element minOccurs="0" ref="publisher"/>
        <xs:element minOccurs="0" ref="description"/>
        <xs:element minOccurs="0" ref="track-list"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  
  <xs:element name="track">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
        <xs:element minOccurs="0" ref="artist"/>
        <xs:element minOccurs="0" ref="genre"/>
        <xs:element minOccurs="0" ref="duration"/>
        <xs:element minOccurs="0" ref="composer"/>
      </xs:sequence>
     <xs:attribute name="no">
       <xs:simpleType>
         <xs:restriction base="xs:integer">
            <xs:maxInclusive value="99"/>
         </xs:restriction>
       </xs:simpleType>
  	 </xs:attribute>
    </xs:complexType>
  </xs:element>
  
  <xs:element name="artist" type="xs:string"/>
  <xs:element name="composer" type="xs:string"/>
  
  <xs:element name="description">
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="heading"/>
        <xs:element ref="p"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  
  <xs:element name="duration" type="xs:string"/>
  
  <xs:element name="genre">
       <xs:simpleType>
         <xs:restriction base="xs:string">
           <xs:enumeration value="Jazz"/>
           <xs:enumeration value="Blues"/>
           <xs:enumeration value="Hard Bop"/>
           <xs:enumeration value="Be Bop"/>
           <xs:enumeration value="Latin Jazz"/>
           <xs:enumeration value="Pop"/>
         </xs:restriction>
       </xs:simpleType>
  </xs:element>
   
  <xs:element name="heading" type="xs:string"/>
  
  <xs:element name="p">
  	<xs:simpleType>
      <xs:restriction base="xs:string">  
        <xs:minLength value="200"/>  
        <xs:maxLength value="1000"/>  
      </xs:restriction> 
    </xs:simpleType>
  </xs:element> 	

   <xs:element name="price">
    <xs:complexType mixed="true">
      <xs:attributeGroup ref="attlist.price"/>
    </xs:complexType>
  </xs:element>
  
  <xs:element name="publisher" type="xs:string"/>
  <xs:element name="rating" type="xs:string"/>
  <xs:element name="title" type="xs:string"/>
  
  <xs:element name="track-list">
    <xs:complexType>
      <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="track"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  
  <xs:attributeGroup name="attlist.price">
    <xs:attribute name="currency" default="CHF">
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="CHF"/>
          <xs:enumeration value="Euros"/>
          <xs:enumeration value="Dollars"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:attributeGroup>
  
</xs:schema>
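To try out the schema, you could start from a small XML file like the one below. This is only a sketch: all contents are invented for illustration, and the schema is assumed to be saved as cd-list.xsd in the same directory.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<cd-list xmlns="http://edutechwiki.unige.ch/en/XML/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://edutechwiki.unige.ch/en/XML/ cd-list.xsd">
  <title>My CD list</title>
  <cd>
    <artist>Some Artist</artist>
    <title>Some Album</title>
    <!-- must be one of the enumerated genre values -->
    <genre>Jazz</genre>
    <!-- currency must be CHF, Euros or Dollars; CHF is the default -->
    <price currency="Euros">15</price>
    <track-list>
      <!-- the "no" attribute must be an integer not larger than 99 -->
      <track no="1">
        <title>First track</title>
      </track>
    </track-list>
  </cd>
</cd-list>
```

Note that the example avoids the p element, since the schema requires its text to be between 200 and 1000 characters long.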

Recipe example

<?xml version="1.0" encoding="UTF-8"?>
<!-- Simple recipe Schema -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
	       targetNamespace="http://myrecipes.org/" 
	       xmlns="http://myrecipes.org/" 
	       elementFormDefault="qualified">
  <xs:element name="list">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="recipe"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="recipe">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="meta"/>
        <xs:element minOccurs="0" ref="recipe_author"/>
        <xs:element ref="recipe_name"/>
        <xs:element ref="meal"/>
        <xs:element ref="ingredients"/>
        <xs:element ref="directions"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="meta">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="author"/>
        <xs:element ref="date"/>
        <xs:element ref="version"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="version" type="xs:string"/>
  <xs:element name="date" type="xs:string"/>
  <xs:element name="author" type="xs:string"/>
  <xs:element name="recipe_author" type="xs:string"/>
  <xs:element name="recipe_name" type="xs:string"/>
  <xs:element name="meal" type="xs:string"/>
  <xs:element name="ingredients">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="item"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="item" type="xs:string"/>
  <xs:element name="directions">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="para"/>
        <xs:element ref="bullet"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  <xs:element name="bullet">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="strong"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="para">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="strong"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="strong" type="xs:string"/>
</xs:schema>
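A minimal XML file that should validate against this schema could look as follows. The contents are invented and the file name recipe.xsd is an assumption:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<list xmlns="http://myrecipes.org/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://myrecipes.org/ recipe.xsd">
  <recipe>
    <meta>
      <author>A. Writer</author>
      <date>2013-04-01</date>
      <version>1.0</version>
    </meta>
    <!-- recipe_author is optional and left out here -->
    <recipe_name>Example salad</recipe_name>
    <meal>Lunch</meal>
    <ingredients>
      <item>1 tomato</item>
      <item>Olive oil</item>
    </ingredients>
    <directions>
      <!-- para and bullet are mixed: text plus optional strong elements -->
      <para>Slice the tomato and <strong>add the oil</strong>.</para>
      <bullet>Serve cold</bullet>
    </directions>
  </recipe>
</list>
```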

Links