The educational technology and digital learning wiki

Disclaimer: As of May 2019, this article badly needs updating. The last somewhat important changes were made in 2008.


DITA is an XML document standard (vocabulary) for authoring modular text.

  • “The Darwin Information Typing Architecture (DITA) is an XML-based, end-to-end architecture for authoring, producing, and delivering technical information. This architecture consists of a set of design principles for creating "information-typed" modules at a topic level and for using that content in delivery modes such as online help and product support portals on the Web.” (Introduction to the Darwin Information Typing Architecture, retrieved 12:27, 3 November 2006 (MET)).

DITA was originally developed at IBM by Don R. Day, Michael Priestley and others. It is now an OASIS standard. Its general architecture may be quite interesting for education, because it (1) accommodates topic-oriented organization and reuse (as opposed to long documents), (2) allows specialization and (3) therefore supports semantic markup (as opposed to DocBook, which is basically typographic).

DITA summary

DITA is a topics-based information architecture. "Darwin information typing architecture" can be summarized as:

  1. Darwin: DITA utilizes principles of inheritance for specialization
  2. Information Typing: DITA was originally designed for technical information based on an information architecture of Concept, Task and Reference
  3. Architecture: DITA is a model for extension both of design and of processes

Topics can be physically or logically embedded. The general architecture of a topic is:

  • Title
  • Prolog (author, metadata, small description, etc.)
  • Body (sections that are structured according to each topic type)
  • Tail (embedded topics)
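The four parts listed above can be made concrete with a small Python sketch. The sample topic is hand-written for illustration (its ids, title and text are made up), and it uses only Python's standard XML parser, not a DITA-aware tool, so treat it as an approximation of how a generic topic is laid out rather than a reference example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample topic: title, prolog, body, then one embedded topic (the "tail").
TOPIC_XML = """
<topic id="inquiry-learning">
  <title>Inquiry-based learning</title>
  <prolog>
    <author>A. Teacher</author>
  </prolog>
  <body>
    <p>Learners investigate a question of their own.</p>
  </body>
  <topic id="inquiry-example">
    <title>An example scenario</title>
    <body><p>Students study water quality in a local river.</p></body>
  </topic>
</topic>
"""

def outline(topic, depth=0):
    """Return one indented line per topic part, recursing into embedded topics."""
    lines = []
    for child in topic:
        if child.tag == "topic":
            # An embedded topic: show its title, then its own parts one level deeper.
            lines.append("  " * depth + "topic: " + child.findtext("title", default=""))
            lines.extend(outline(child, depth + 1))
        else:
            lines.append("  " * depth + child.tag)
    return lines

root = ET.fromstring(TOPIC_XML)
print("\n".join(outline(root)))
```

The printed outline lists title, prolog and body first, followed by the embedded topic and its own parts.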

Here is a picture (from a Don Day PPT, reproduced without permission) showing the specialization principle by putting the generic topic and the "task" topic side by side:

DITA generic topic and task topic

New topics can be defined as autonomous nodes or, better, as nodes that inherit properties from a parent.
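In DITA, this inheritance is recorded in each element's class attribute, which lists the element's ancestry from the most general type to the most specific one. The following Python sketch shows how a processor could use that attribute to fall back on the behaviour of a parent type. The helper names `ancestry` and `is_a` are made up for this illustration, and the two class values are written from memory of the DITA task module, so check them against the specification before relying on them.

```python
def ancestry(class_attr):
    """Split a DITA class attribute into (module, element) pairs, most general first."""
    tokens = class_attr.split()
    # The first token is a marker ("-" for structural types, "+" for domain elements).
    return [tuple(token.split("/", 1)) for token in tokens[1:]]

def is_a(class_attr, module_element):
    """True if the element is module_element or is specialized from it."""
    return module_element in class_attr.split()[1:]

# Class values as they are usually given for the task topic and its steps element.
TASK_CLASS = "- topic/topic task/task "
STEPS_CLASS = "- topic/ol task/steps "

print(ancestry(TASK_CLASS))
print(is_a(STEPS_CLASS, "topic/ol"))
```

A stylesheet that only knows how to render an ordered list (`topic/ol`) can therefore still render the steps of a task, which is why a new specialization displays without writing new style sheets first.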

DITA in education

DSchneider believes that DITA could play a role in education.

Here are some quickly made up use case examples:

Pedagogical knowledge management and document production

Educators and associated communities of practice are sometimes engaged in knowledge management tasks and thus constitute a knowledge-building community. In addition, there exist many organizational structures financed to provide educators and other experts with information through web sites. Here are a few examples of what one can find there:

All these kinds of information could profit from being "semantically" structured. Let's have a look at a few things that people in education do and where DITA could fit in:

Indexing and Searching

People like to search by "kinds of information". Thus, digital libraries (e.g. learning object repositories) have been invented. However, they have a major flaw: Typical end-users hate editing meta-data and practice shows that solutions based on folksonomies are widely preferred (see social software). However, once the information space grows (like on delicious or connotea) search results include too much useless information.

A similar solution is to leave content unstructured (e.g. as in a wiki) and then rely on full-text search plus categories. Wikis allow entering data very quickly, but have the disadvantage that one can't easily produce text on demand (it's not easy to make a wiki book). Also, full-text search has its limits once the wiki starts growing. In addition, search engines don't produce text; they just return page names plus the search context. E.g. you can't say something like "let's have a list of all the references on pages that belong to the category 'instructional design modeling'".

The other solution is to write semantically structured information (DITA) and plug it into a database. In theory, one could implement DITA-based information repositories with SQL. However, our experience shows that building SQL tables for each kind of information is very time-consuming and not very flexible. In other words, such DITA-based solutions are not realistic in the context of the "bricolage informatics" that prevails in education. Since DITA really can address some of the needs for flexible information retrieval architectures, we need two things: an XML database like eXist (i.e. something into which you could just plug DITA fragments without any extra work) and a DITA-ready portal like the eXist portal. The one constraint is that the portal must be programmed in PHP (Java is not accessible to "bricolage" programmers).
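To illustrate why structure helps, here is a minimal Python sketch of the query mentioned earlier (all references on pages of a given category). It is not how eXist works: an XML database would answer this with an XQuery over stored documents, whereas this sketch simply loops over a few in-memory strings. The sample topics and the placeholder reference names are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-repository: three topics with a category in the prolog
# and bibliographic citations marked up with <cite> in the body.
TOPICS = [
    """<topic id="addie">
         <title>ADDIE</title>
         <prolog><metadata><category>instructional design modeling</category></metadata></prolog>
         <body><p>See <cite>Reference A</cite> and <cite>Reference B</cite>.</p></body>
       </topic>""",
    """<topic id="wiki">
         <title>Wiki</title>
         <prolog><metadata><category>social software</category></metadata></prolog>
         <body><p>See <cite>Reference C</cite>.</p></body>
       </topic>""",
    """<topic id="kemp">
         <title>Kemp design model</title>
         <prolog><metadata><category>instructional design modeling</category></metadata></prolog>
         <body><p>See <cite>Reference D</cite>.</p></body>
       </topic>""",
]

def references_in_category(topic_sources, category):
    """Collect the text of every <cite> in topics tagged with the given category."""
    found = []
    for source in topic_sources:
        topic = ET.fromstring(source)
        categories = [c.text for c in topic.findall("./prolog/metadata/category")]
        if category in categories:
            found.extend(cite.text for cite in topic.iter("cite"))
    return found

print(references_in_category(TOPICS, "instructional design modeling"))
```

The point is that the answer is a list of references, not a list of page names: the markup says which pieces of text are citations, so they can be extracted and reassembled.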

Flexible document production

Let's imagine that we are engaged in teacher training. For a given course we wish to hand out appropriate training materials. Learners engaged in projects may want to read and print all information related to some topic, e.g. how to design inquiry-based learning. DITA would allow creating print or web documents on the fly, on an as-needed basis. Again, we need a portal with an easy API for this.
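On-demand assembly in DITA is driven by a map, i.e. a small document that points to the topics to include and gives their order and nesting. The Python sketch below imitates that idea with in-memory data: the file names, the topics and the plain-text output format are all invented for the example, and a real setup would run the map through a DITA processor instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic store, keyed by the file name a map would reference.
TOPIC_FILES = {
    "inquiry.dita": "<topic id='inquiry'><title>Inquiry-based learning</title>"
                    "<body><p>Learners investigate their own questions.</p></body></topic>",
    "scaffolding.dita": "<topic id='scaffolding'><title>Scaffolding</title>"
                        "<body><p>Teachers provide hints and structure.</p></body></topic>",
    "assessment.dita": "<topic id='assessment'><title>Assessment</title>"
                       "<body><p>Not needed for this handout.</p></body></topic>",
}

# A map selects and orders topics; nesting a topicref creates a subsection.
COURSE_MAP = """
<map>
  <title>Handout: designing inquiry-based learning</title>
  <topicref href="inquiry.dita">
    <topicref href="scaffolding.dita"/>
  </topicref>
</map>
"""

def render(topicref, depth=0):
    """Render one referenced topic and, recursively, the topics nested under it."""
    topic = ET.fromstring(TOPIC_FILES[topicref.get("href")])
    indent = "  " * depth
    lines = [indent + topic.findtext("title")]
    lines.extend(indent + p.text for p in topic.findall("./body/p"))
    for child in topicref.findall("topicref"):
        lines.extend(render(child, depth + 1))
    return lines

def assemble(map_source):
    """Build a plain-text handout from the topics a map refers to."""
    ditamap = ET.fromstring(map_source)
    lines = [ditamap.findtext("title")]
    for ref in ditamap.findall("topicref"):
        lines.extend(render(ref))
    return "\n".join(lines)

handout = assemble(COURSE_MAP)
print(handout)
```

A different handout for another course only needs a different map; the topics themselves stay untouched.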

Authoring of lesson plans and scenarios

In some cases, it is desired that information entered be complete according to some standards. A typical example would be lesson plans. In order to share these plans it would be nice to do this through an online application. A flexible through-the-web (TTW) DITA-based editor coupled to an on-line XML database could address this issue. So, the portal should support TTW editing. (However, DSchneider admits that XML editing is not easy and users must receive some initial training.)
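The completeness requirement can be sketched as follows in Python. The element names (`lessonplan`, `objectives`, `activities`, `assessment`) are hypothetical and do not come from any existing DITA specialization. In practice a validating XML editor would enforce such rules through the DTD or schema; this hand-written check only shows the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of parts that a lesson plan must contain.
REQUIRED_PARTS = ("title", "objectives", "activities", "assessment")

def missing_parts(plan_source, required=REQUIRED_PARTS):
    """Return the names of required parts that are absent or contain no text."""
    plan = ET.fromstring(plan_source)
    missing = []
    for name in required:
        element = plan.find(name)
        # A part counts as filled in only if it contains some non-blank text.
        if element is None or not "".join(element.itertext()).strip():
            missing.append(name)
    return missing

DRAFT = (
    "<lessonplan>"
    "<title>Water quality</title>"
    "<objectives>Formulate a research question.</objectives>"
    "<activities/>"
    "</lessonplan>"
)

print(missing_parts(DRAFT))
```

An online editor could run such a check before a plan is shared and tell the author which sections still need to be written.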

A similar, more ambitious use case is pedagogical scenario editors. These are applications that allow a course designer or teacher to describe somewhat more sophisticated lesson plans in detail. A good example is the DialogPlus Toolkit. Its underlying templates could have been DITA-based. Of course, someone then has to build the editing tool around it. And one may also wonder in this context if DITA could be an included vocabulary in some educational modelling language, much in the way things happen in the RDF world.

A test case

DSchneider made some DITA extensions to have a writing tool to author the TECFA SEED catalog, an inventory of various learning activities and tools that can support these.

The DITA + TECFA extensions DTD included the following new topics:

  • card: generic node inherited by the others
  • c3msbrick: topic to describe conceptually plugins/modules for C3MS portals
  • c3mssoft: topic to describe software for plugins/modules
  • learnactivity: topic to describe learning activities (i.e. small scenarios)
  • learnact: topic to describe elementary learning activities

This project is now dead, but from an information management point of view, DSchneider thinks that the document produced was interesting. It contained both very structured information and "loose" text, and it was heavily cross-indexed. Because we don't have the resources to put it into a database, I then decided to use this wiki to launch a larger project in the same spirit. Now if we had a DITA wiki, things could really change.

So, here is my last proposition: make the DITA portalware wiki-like, i.e. allow administrators to define DITA templates (specialized topics) for editing, from which users can choose according to the subject they want to enter. This may also be of interest to the Wikimedia community. These people now literally use hundreds of absolutely ugly templates to add boxes with structured contents, to define references and citations, etc. Compared to some of these (e.g. have a look at the citation templates), editing XML appears to be rather easy.

e-Learning objects

It may be an interesting idea to investigate how DITA could enhance or replace simple content-based standards like IMS Content Packaging / SCORM based learning objects. Hunt and Bernard (2005) demonstrated that DITA XML can be extended to develop reusable learning content.

DSchneider's opinion: Such an enterprise may be interesting since DITA content has the obvious advantage of being modular and semantically structured. In contrast, current e-learning basically means assembling unstructured HTML pages and other formats into learning sequences (called "organizations") that are delivered as so-called content packages. IMS/SCORM is now a firmly established standard in low-level training both in industry and higher education. DITA could play more than one role:

  • Nothing prevents someone from adding DITA fragments directly to a content package. DSchneider is not sure if an LMS could handle this (have to test this).
  • DITA learning materials could be saved as HTML (a series of files) and then added to the learning sequence (that's easy, but means extra work for the author).
  • DITA-based learning materials could be translated as a whole to a Simple Content Pack (that's in essence what Hunt and Bernard proposed).

Now, actors who engage in more advanced training don't care much about content-driven standards, since their designs are always activity-based (see the next use case). So who may be interested in Hunt and Bernard's proposal? People who care about authoring and assembling modular, well-structured contents into either large documents or customized on-line e-learning sequences. As opposed to SCORM/IMS Content Packaging (which is a simple structured assembly of non-structured contents), DITA could provide an overall information architecture for learning content and export assembled documents to various formats (including IMS content packs). As such, it may be of interest to real distance universities that still rely mostly on high-quality print tutorials and limit screen work to more interactive work and communication.

Educational modeling languages

The purpose of more advanced modeling languages is to outline pedagogical scenarios in terms of learner activities, to exchange learning units, and to define executable scenarios. DITA certainly has been built as a text-centric vocabulary, but there is no reason why extensions couldn't describe machine-readable contents of various sorts that would "compile" not just into screen-readable text and quizzes but also into more differentiated on-line activities.

Again, there exist standards like IMS Learning Design (LD) or IMS Simple Sequencing, but as it stands today, each of these languages only covers part of our needs. Therefore, the discussion about educational modelling languages is in no way closed. Current simple SCORM-based e-learning "standards" don't fit the needs of school and advanced industry training, and there is a need for a lot of experimentation. In other words, there are two open roads: either reimplement some SCORM/IMS standards (including exporting to these formats) or do something better.

Another idea is to use DITA to sketch out scenarios as text documents and then translate them (e.g. via XSLT, as Hunt and Bernard suggest for IMS/SCORM content packs) to a vocabulary like IMS Learning Design, for which there ought to be engines in the near future. In a similar spirit, DITA could act as a container for executable activity descriptions (e.g. why not adopt LD scenarios within DITA?). Could there be foreign "data islands" in DITA?

DITA to support learner activities

DITA could be either (a) a cognitive tool for writing activities or (b) a component of such a tool.

(a) DITA extensions could be built to help students with writing strongly structured texts. A typical example is what DSchneider and Paraskevi Synteta did in their C3MS project-based learning model. Students had to use a special-purpose project tool named ePBL, which stands for "Project-Based e-learning", and had to define research plans with a specially made XML grammar.

(b) Some vocabularies may need a special authoring tool. A nice example is Benetos' (2006) Computer-Supported Argumentative Writer, based on her ArgEssML schema, which is defined as a Relax NG grammar. So some DITA vocabularies, in the same way as IMS standards, would need associated player software.


What could we gain by writing vocabularies as DITA extensions?

DITA has some advantages over "home-made" schemas:
  • There is no initial need to write style sheets (the contents of extensions display out of the box).
  • One can reuse existing DITA topics (modules).
  • The modular architecture encourages thinking of documents as dynamically created objects.

DSchneider thinks of two main avenues for development:

  1. Structured portions of text could be integrated with other kinds of text, both the very loose "title + body" formatting that is supported by the generic basic DITA topic and strongly typed ones. This would lead to a fine "tutorial/manual production" framework and is more document-centric than current screen-centric SCORM/IMS standards. No one likes to read longer texts on the screen, in particular students don't. DSchneider doesn't either (despite having a 1200x3500 "workspace").
    In this spirit, DITA could also be the basis of a new-generation wiki. E.g. there is no reason why a web site such as this one could not be implemented with a DITA/XML database/TTW combo. We'd get the huge advantage of getting structured text (instead of those awful macros that proliferate on Wikipedia) and a really cool way to produce printed books. Currently, generating wiki books is not easy and requires a lot of manual work to achieve high-quality output. Give us a DITA-based wiki framework please.
  2. Now for some dreaming: the briefly sketched use cases above could be at the heart of a document-centric learning and teaching environment, e.g. see instructional design models like writing-to-learn and the knowledge-building community model. The DITA e-learning community should maybe take into account that many modern instructional design models engage learners in writing activities ... and DITA is about writing. Learners' writing activities need scaffolding and a schema can help with that.

Furthermore, teachers need tools to engage students in collective and collaborative activities. E.g. a teacher (whether in a classroom or at a distance) ought to be able to say: "Now let's look at the list of <arguments> in your productions which I extracted to page XYZ."

On the negative side: DITA ain't easy and XML database server scripting even less :(


The DITA toolkit

The DITA Open Toolkit (DITA OT) is an implementation of the OASIS DITA Technical Committee's specification for Darwin Information Typing Architecture (DITA) DTDs and Schemas. The Toolkit transforms DITA content (maps and topics) into deliverable formats.
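As a rough illustration of such a transformation, recent releases of the toolkit ship a `dita` command that takes an input map, an output format and an output directory. The Python sketch below only builds that command line. The file and directory names are made up, the helper `dita_command` is not part of the toolkit, and older toolkit versions (such as those current when this article was written) were driven through Ant instead, so check the documentation of your installed version.

```python
def dita_command(input_map, output_format, output_dir):
    """Build the argument list for a DITA-OT run (assumes the `dita` launcher is on the PATH)."""
    return [
        "dita",
        f"--input={input_map}",
        f"--format={output_format}",
        f"--output={output_dir}",
    ]

# Hypothetical map and output directory; "html5" and "pdf" are common format names.
argv = dita_command("handout.ditamap", "html5", "out")
print(" ".join(argv))

# To actually run the transformation, one could pass the list to
# subprocess.run(argv, check=True) on a machine where DITA-OT is installed.
```

Producing a printable version of the same content would then only require changing the format argument.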

XML vendors have developed other tools or built on top of this kit. Most free XML editors now include a version of this toolkit (but often an older one).


Most modern XML editors (including the free ones) include the DITA OT schemas and associated XSLT style sheets. In any case, you can download the DITA OT and use it from your favorite (good) XML editor.

Of particular interest are:

  • Adobe FrameMaker (commercial XML-capable word processor). Consider it if you plan to write long texts; a free EDD stylesheet is available. Some high-end XML editors like XML Spy, Oxygen, XMetal, Arbortext etc. also make using DITA fairly easy. Finally, as of 2019, you can probably find several online editing tools, none of which we tested.

See List of DITA Optimized Editors (as of May 2019, last updated in 2015).

DITA-based CMS frameworks

  • In the past there existed some commercial products, e.g. ixiasoft, Astoria, Bluestream, XyEnterprise (none tested).
  • More recently (2017), DITA has been integrated in larger products, e.g. Adobe Experience Manager (a kind of CMS) or FrameMaker (Adobe's documentation-centered word processor).

Specifications and reference

Tutorials and Introductions

Links of links



Kravogel, Christian and Boris Horner (2005). DITA - Getting Started, XTech 2005: XML, the Web and beyond, HTML

Benetos, Kalliopi (2006), Computer-Supported Argumentative Writer An authoring tool with built-in scaffolding and self-regulation for novice writers of argumentative texts, Master thesis, TECFA, University of Geneva. PDF

Day, Don; Erik Hennum, John Hunt, Michael Priestley, David Schell, and Nancy Harrison (2004), An XML Architecture for Technical Documentation: The Darwin Information Typing Architecture, WinWriters, Inc., HTML retrieved 17:02, 4 November 2006 (MET).

Day, Don; Michael Priestley and David Schell (2001), Introduction to the Darwin Information Typing Architecture, IBM developerWorks article. HTML

Hunt, John P. and Robert Bernard (2005), An XML-based information architecture for learning content, Part 2: A DITA content pilot, IBM developerWorks article, HTML

Hunt, John P. and Robert Bernard (2005), An XML-based information architecture for learning content, Part 1: A DITA specialization design, IBM developerWorks article, HTML

Priestley, Michael (2001), Specialization in the Darwin Information Typing Architecture, IBM developerWorks article HTML (This is mandatory reading if you wish to extend DITA)

Priestley, M., Hargis, G., and Carpenter, S. (2001) DITA: An XML-based Technical Documentation Authoring and Publishing Architecture. Technical Communication, Volume 48, No.3, p.352--367.

Schneider, Daniel & Paraskevi Synteta (2005). Conception and implementation of rich pedagogical scenarios through collaborative portal sites, in Senteni, A. and Taurisson, A., Innovative Learning & Knowledge Communities / les communautés virtuelles: apprendre, innover et travailler ensemble, ICOOL 2003 & Colloque de Guéret 2003 selected papers, a University of Mauritius publication, under the auspices of the UNESCO, ISBN-99903-73-19-1. PDF Preprint

Schneider, Daniel with Paraskevi Synteta, Catherine Frété, Fabien Girardin, Stéphane Morand (2003) Conception and implementation of rich pedagogical scenarios through collaborative portal sites: clear focus and fuzzy edges. ICOOL International Conference on Open and Online Learning, December 7-13, 2003, University of Mauritius. PDF.

Schneider, Daniel. (2005) "Gestaltung kollektiver und kooperativer Lernumgebungen" in Euler & Seufert (eds.), E-Learning in Hochschulen und Bildungszentren. Gestaltungshinweise für pädagogische Innovationen, München: Oldenbourg. Preprint in PDF