Open Packaging Conventions and Office Open XML

Draft

Open Packaging Conventions and Office Open XML are two document standards sponsored by Microsoft.

According to Wikipedia (retrieved 12:08, 30 October 2010 (CEST)), “Office Open XML (also informally known as OOXML or OpenXML) is a zipped, XML-based file format developed by Microsoft for representing spreadsheets, charts, presentations and word processing documents. The Office Open XML specification has been standardised by Ecma. A later edition was standardized by ISO and IEC as an International Standard (ISO/IEC 29500)”.

“The Open Packaging Conventions (OPC) is a container-file technology initially created by Microsoft to store a combination of XML and non-XML files that together form a single entity such as an Open XML Paper Specification (OpenXPS) document. OPC-based file formats combine the advantages of leaving the independent file entities embedded in the document intact and resulting in much smaller files compared to normal use of XML. The OPC is specified in Part 2 of the Office Open XML standards ISO/IEC 29500:2008 and ECMA-376.” (Wikipedia, retrieved 12:08, 30 October 2010 (CEST))

Example

Let's look at a very short Word 2007 document. It contains only a level 1 heading with two words and a font change. The file is called XML_rocks.docx.

Screenshot of a two-word docx (Word 2007) document

A docx file is an ordinary ZIP archive. To look inside, rename it to *.zip, e.g. XML_rocks.zip, and open it with an archiving program. You will see the following files, listed here with their sizes in bytes:

   1312   [Content_Types].xml
    590   _rels/.rels

   1000   docProps/app.xml
    633   docProps/core.xml

   1424   word/document.xml
   1746   word/settings.xml
   1031   word/fontTable.xml
    286   word/webSettings.xml
  15907   word/styles.xml

    817   word/_rels/document.xml.rels

   6992   word/theme/theme1.xml
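
The renaming step is only needed for desktop archiving tools; a ZIP library can read the .docx file directly. The following is a minimal sketch in Python using the standard zipfile module. The helper name list_parts is hypothetical and chosen for illustration only.

```python
import zipfile


def list_parts(package):
    """Return (size_in_bytes, part_name) pairs for a docx/OPC package.

    `package` may be a file name such as "XML_rocks.docx" or a binary
    file-like object. No renaming to .zip is required.
    Note: list_parts is an illustrative helper, not part of any standard API.
    """
    with zipfile.ZipFile(package) as zf:
        # infolist() yields one ZipInfo per archive member, in archive order;
        # file_size is the uncompressed size of that member.
        return [(info.file_size, info.filename) for info in zf.infolist()]
```

Calling list_parts("XML_rocks.docx") on the example file should return entries such as word/document.xml and [Content_Types].xml, in the order in which they are stored in the archive.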

document.xml

document.xml contains your content. Unlike text-centric XML formats such as DocBook, OOXML mixes text and style together. In that respect, OOXML follows the same principle as the XSL-FO standard.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
            xmlns:o="urn:schemas-microsoft-com:office:office"
            xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
            xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
            xmlns:v="urn:schemas-microsoft-com:vml"
            xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
            xmlns:w10="urn:schemas-microsoft-com:office:word"
            xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
            xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
  <w:body>
    <w:p w:rsidR="00717609" w:rsidRPr="00D325BD" w:rsidRDefault="00126A98" w:rsidP="00D325BD">
      <w:pPr>
        <w:pStyle w:val="Heading1"/>
        <w:rPr>
          <w:rFonts w:asciiTheme="minorHAnsi" w:hAnsiTheme="minorHAnsi" w:cstheme="minorHAnsi"/>
          <w:sz w:val="96"/>
          <w:szCs w:val="96"/>
        </w:rPr>
      </w:pPr>
      <w:r w:rsidRPr="00D325BD">
        <w:rPr>
          <w:rFonts w:asciiTheme="minorHAnsi" w:hAnsiTheme="minorHAnsi" w:cstheme="minorHAnsi"/>
          <w:sz w:val="96"/>
          <w:szCs w:val="96"/>
        </w:rPr>
        <w:t>XML rocks</w:t>
      </w:r>
    </w:p>
    <w:sectPr w:rsidR="00717609" w:rsidRPr="00D325BD" w:rsidSect="007A4DB6">
      <w:pgSz w:w="11906" w:h="16838"/>
      <w:pgMar w:top="1417" w:right="1417" w:bottom="1417" w:left="1417" w:header="708" w:footer="708" w:gutter="0"/>
      <w:cols w:space="708"/>
      <w:docGrid w:linePitch="360"/>
    </w:sectPr>
  </w:body>
</w:document>
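
Because text and formatting are interleaved, recovering the plain text means walking the paragraph (w:p), run (w:r) and text (w:t) elements and ignoring the property elements around them. The following is a minimal sketch in Python with the standard xml.etree.ElementTree module. The function name paragraph_texts is hypothetical, and the sketch assumes a simple document like the one above (no tables, tabs, line breaks or tracked changes).

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace, as declared by xmlns:w in document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def paragraph_texts(document_xml):
    """Return a list of (style_name, text) pairs, one per w:p element.

    `document_xml` is the content of word/document.xml as bytes or str.
    paragraph_texts is an illustrative helper, not an official API.
    """
    root = ET.fromstring(document_xml)
    paragraphs = []
    for p in root.iterfind(".//w:p", NS):
        # Concatenate the text of every w:t below this paragraph.
        text = "".join(t.text or "" for t in p.iterfind(".//w:t", NS))
        # The paragraph style, if any, lives in w:pPr/w:pStyle/@w:val.
        style = p.find("w:pPr/w:pStyle", NS)
        style_name = style.get(f"{{{W}}}val") if style is not None else None
        paragraphs.append((style_name, text))
    return paragraphs
```

The input can be read straight from the package, for example with zipfile.ZipFile("XML_rocks.docx").read("word/document.xml"). For the example document the function should report a single paragraph with style Heading1 and the text "XML rocks".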

Links