HTML and XHTML elements and attributes: Difference between revisions
{{ | {{incomplete}} | ||
{{web technology tutorial|Introductory}} | {{web technology tutorial|Introductory}} | ||
<pageby nominor="false" comments="false"/> | <!-- <pageby nominor="false" comments="false"/> --> | ||
== Introduction == | == Introduction == | ||
Line 9: | Line 9: | ||
* Learn basic HTML and XHTML markup | * Learn basic HTML and XHTML markup | ||
; Prerequisites | ; Prerequisites | ||
* None | * None, but you also may read the [[Internet tutorial]] | ||
; Concurrent | |||
* [[HTML and XHTML validation and repair]] | |||
; Moving on | ; Moving on | ||
* See the [[web technology tutorials]] | * [[CSS]] | ||
* [[DHTML]] tutorial | |||
* [[HTML forms tutorial]] | |||
* See also the list of [[web technology tutorials]] | |||
; Level and target population | ; Level and target population | ||
* Beginners | * Beginners | ||
; Remarks | ; Remarks | ||
* For the moment, this article is intended to be a "handout" for "lab" teaching. In other words, a teacher + hands-on activities are needed. | * For the moment, this article is intended to be a "handout" for "lab" teaching. In other words, a teacher + hands-on activities are needed. In addition, we don't explain how to use a specific editing tool. | ||
* To do: Add some more tags and attributes, some additional explanations for each tag, an HTML forms tutorial, etc.
</div> | </div> | ||
See also: | |||
* [[HTML]], [[XHTML]] and [[HTML5]] for some background information | |||
* [[HTML links]] for a page with pointers (e.g. to other HTML tutorials) | |||
* [[Using SVG with HTML5 tutorial]] (moving on) | |||
=== SGML and XML markup === | === SGML and XML markup === | ||
Line 23: | Line 34: | ||
Definitions of formal languages are called "document type definitions", "schemas" or "grammars". Read the [[DTD tutorial]] if you wish to know details. For the moment, you just need to understand that these grammars are sets of rules that define: | Definitions of formal languages are called "document type definitions", "schemas" or "grammars". Read the [[DTD tutorial]] if you wish to know details. For the moment, you just need to understand that these grammars are sets of rules that define: | ||
* a set of '' elements'' (tags) and their '' attributes'' that identify various | {{quotationbox| | ||
* a set of '' elements'' (tags) and their '' attributes'' that identify various structural elements of an HTML page; | |||
* '' how'' these elements can be '' embedded'' ; | * '' how'' these elements can be '' embedded'' ; | ||
* different sorts of entities (reusable fragments, special characters). | * different sorts of entities (reusable fragments, special characters). | ||
}} | |||
SGML and XML languages, e.g. HTML and XHTML have three kinds of components: | SGML and XML languages, e.g. HTML and XHTML have three kinds of components: | ||
{{quotationbox| | |||
* elements | * elements | ||
* attributes | * attributes | ||
* character and entity references (not explained here) | * character and entity references (not explained here) | ||
}} | |||
These elements use a special syntax that we shall introduce now with an explanation taken from [http://en.wikipedia.org/wiki/HTML Wikipedia] | These elements use a special syntax that we shall introduce now with an explanation taken from [http://en.wikipedia.org/wiki/HTML Wikipedia] | ||
=== An introduction to the (X)HTML markup formalism === | The most recent [[HTML5]] standard (not yet formalized) is based on another formalism, i.e. Web IDL, an interface definition language, that will not be covered here. | ||
=== An introduction to the (X)HTML markup formalism according to Wikipedia === | |||
'''HTML elements''' | |||
HTML elements are the basic components for HTML markup. Elements have two basic properties: attributes and content. Each element's attribute and each element's content has certain restrictions that must be followed for an HTML document to be considered valid. An element usually has a start tag (e.g. <code><element-name></code>) and an end tag (e.g. <code></element-name></code>). The element's attributes are contained in the start tag and content is located between the tags (e.g. <code><element-name attribute="value">Content</element-name></code>). Some elements, such as <code><nowiki><br></nowiki></code>, do not have any content and must not have a closing tag. Listed below are several types of markup elements used in HTML. | HTML elements are the basic components for HTML markup. Elements have two basic properties: attributes and content. Each element's attribute and each element's content has certain restrictions that must be followed for an HTML document to be considered valid. An element usually has a start tag (e.g. <code><element-name></code>) and an end tag (e.g. <code></element-name></code>). The element's attributes are contained in the start tag and content is located between the tags (e.g. <code><element-name attribute="value">Content</element-name></code>). Some elements, such as <code><nowiki><br></nowiki></code>, do not have any content and must not have a closing tag. Listed below are several types of markup elements used in HTML. | ||
'''Structural''' markup describes the purpose of text. For example, <code><nowiki><h2>Golf</h2></nowiki></code> establishes "Golf" as a second-level, which would be rendered in a browser in a manner similar to the " | '''Structural''' markup describes the purpose of text. For example, <code><nowiki><h2>Golf</h2></nowiki></code> establishes "Golf" as a second-level heading, which would be rendered in a browser in a manner similar to the "Introduction" title at the start of this section. Structural markup does not denote any specific rendering, but most Web browsers have standardized default styles for element formatting. Text may be further styled with [[CSS|Cascading Style Sheets]] (CSS). | ||
'''Presentational''' markup describes the appearance of the text, regardless of its function. For example <code><nowiki><b>boldface</b></nowiki></code> indicates that visual output devices should render "boldface" in bold text | '''Presentational''' markup describes the appearance of the text, regardless of its function. For example <code><nowiki><b>boldface</b></nowiki></code> indicates that visual output devices should render "boldface" in '''bold text'''. In the case of both <code><nowiki><b>bold</b></nowiki></code> and <code><nowiki><i>italic</i></nowiki></code>, there are elements which usually have an equivalent visual rendering but are more semantic in nature, namely <code><nowiki><strong>strong emphasis</strong></nowiki></code> and <code><nowiki><em>emphasis</em></nowiki></code> respectively. Most presentational markup elements have become deprecated under the HTML 4.0 specification, in favor of [[CSS]] based style design. | ||
'''Hypertext''' markup links parts of the document to other documents. HTML up through version [[XHTML]] 1.1 requires the use of an anchor element to create a hyperlink in the flow of text: <code><nowiki><a> ... </a></nowiki></code>. However, the <code>href</code> attribute must also be set to a valid [[URL]], so for example the HTML markup <code><nowiki><a href="http://en.wikipedia.org/">Wikipedia</a></nowiki></code> will render the word "<span class="plainlinks">[http://en.wikipedia.org/ Wikipedia]</span>" as a [[hypertext|hyperlink]]. To turn an image into a link, the anchor tag uses the following syntax: <code><nowiki><a href="url"><img src="image.gif" alt="alternative text" width="50" height="50"></a></nowiki></code>
'''HTML attributes''' | |||
Most of the attributes of an element are name-value pairs, separated by "=" | Let's now look at attributes. Most of the attributes of an element are name-value pairs, separated by "=". Attributes are written within the start tag of an element, after the element's name. The value may be enclosed in single or double quotes, although values consisting of certain characters can be left unquoted in HTML (but not XHTML). | ||
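As a small illustration of these quoting rules, the three paragraphs below carry the same ''class'' attribute. The class name ''note'' is made up for this example.
<source lang="html4strict">
<!-- double quotes: accepted in HTML and XHTML -->
<p class="note">First paragraph</p>
<!-- single quotes: also accepted in HTML and XHTML -->
<p class='note'>Second paragraph</p>
<!-- no quotes: accepted in HTML only, not in XHTML -->
<p class=note>Third paragraph</p>
</source>
Quoting every value with double quotes is the safest habit, since it works in all variants.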
Most elements can take any of several common attributes: | Most elements can take any of several common attributes: | ||
Line 60: | Line 79: | ||
<abbr id="anId" class="aClass" style="color:blue;" title="Hypertext Markup Language">HTML</abbr></source> | <abbr id="anId" class="aClass" style="color:blue;" title="Hypertext Markup Language">HTML</abbr></source> | ||
This example displays as <span id="anId" class="aClass" style="color:blue;" title="Hypertext Markup Language">HTML</span> | This example displays as | ||
:<span id="anId" class="aClass" style="color:blue;" title="Hypertext Markup Language">HTML</span> | |||
In most browsers, pointing the cursor at the abbreviation should display the title text "Hypertext Markup Language." | |||
Most elements also take the language-related attributes <code>lang</code> and <code>dir</code>. | Most elements also take the language-related attributes <code>lang</code> and <code>dir</code>. | ||
Line 82: | Line 103: | ||
=== Content vs. Style === | === Content vs. Style === | ||
In keeping with the principle of [http://en.wikipedia.org/wiki/Separation_of_Concerns Separation of Concerns], the function of HTML is primarily to add structural and semantic information to the raw text of a document. In other words: '''HTML just defines what kind of elements you have inside a document'''. | In keeping with the principle of [http://en.wikipedia.org/wiki/Separation_of_Concerns Separation of Concerns], the function of HTML is primarily to add structural and semantic information to the raw text of a document. In other words: '''HTML just defines what kind of text (or typographic) elements you have inside a document''', e.g. a title, a paragraph, a quotation, a list, a list item, a table, a table row, a table cell, etc. | ||
Web browsers have a built-in method to render each of these elements, e.g. titles appear bigger and list elements are indented and prefixed with bullets. | |||
In older variants of HTML, designers would change appearance with HTML tags that are now deprecated in HTML 4.x and XHTML 1.x strict. Nowadays, [[CSS]], the Cascading Style Sheets language, is used to style text. The advantage of this strategy is that you may use the same style for ''lots'' of pages, e.g. this wiki uses the same stylesheets for all its pages. This also means that you can easily change the look of all your pages that use the same stylesheet(s). Finally, you may associate several stylesheets with a page, which offers an extra set of functionalities. For example, you can load a large "official" stylesheet that will cover most of your needs and then fine-tune styling by adding your own on top. You also may create different stylesheets for different media, in particular: one for normal viewers, one for visually impaired viewers and one for printing that filters out elements like navigation menus that you won't need on paper.
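A minimal sketch of how several stylesheets could be attached to one page is shown below. The three file names are hypothetical, and the ''media'' attribute restricts the last stylesheet to printing.
<source lang="xml">
<head>
 <title>Stylesheet example</title>
 <!-- hypothetical "official" stylesheet that covers most needs -->
 <link rel="stylesheet" type="text/css" href="official.css" />
 <!-- hypothetical personal stylesheet, loaded on top of the first one -->
 <link rel="stylesheet" type="text/css" href="my-tweaks.css" />
 <!-- hypothetical stylesheet that is only used for printing -->
 <link rel="stylesheet" type="text/css" href="print.css" media="print" />
</head>
</source>
Since later rules override earlier ones of equal weight, the order of the ''link'' elements matters.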
== Structure of HTML pages and its variants ==
Markup of an HTML page is divided into two big parts: the head contains information that the user will not see inside the browser window and the body contains the contents to be displayed. We can express this with a simple formula: | Markup of an HTML page is divided into two big parts: the head contains information that the user will not see inside the browser window and the body contains the contents to be displayed. We can express this with a simple formula: | ||
Line 96: | Line 115: | ||
A most simple HTML document that would just display | A most simple HTML document that would just display | ||
:: | :: Hello EdutechWiki reader! | ||
would look like this: | would look like this: | ||
<source lang="html4strict"> | <source lang="html4strict"> | ||
Line 108: | Line 127: | ||
</html> | </html> | ||
</source> | </source> | ||
You may have a look at the real page by clicking [http://tecfa.unige.ch/guides/htmlman/hello.html here]. However, this HTML code is not totally complete. In addition, the first line(s) of an (X)HTML page should contain a declaration that precisely defines what HTML dialect is being used. For instance, there is a difference between HTML, XHTML and HTML5 pages. | |||
A simple '''HTML 4''' page may look like this: | |||
[[image:dyn-html-1.png|frame|none|Important elements of an HTML page]] | |||
* File: [http://tecfa.unige.ch/guides/js/ex/coap/week-1-2/preq-html-page.html preq-html-page.html] | |||
An '''XHTML Page''' looks slightly different: | |||
[[image:dyn-html-2.png|frame|none|Anatomy of an XHTML page]] | |||
*File: [http://tecfa.unige.ch/guides/js/ex/coap/week-1-2/preq-xhtml-page.html preq-xhtml-page.html] | |||
There exist different variants of HTML. As of 2013, the major versions in use are HTML5, HTML 4.x, and (fake) XHTML 1.x. Let's now look at a larger gallery of complete examples before we start introducing (X)HTML elements and attributes.
=== HTML and XHTML code examples === | === HTML and XHTML code examples === | ||
Line 128: | Line 156: | ||
</source> | </source> | ||
HTML 4.x tags may use any kind of case, e.g. ''HEAD'', ''Head'', ''head'', ''heaD'' would be correct. To ensure XHTML compatibility, we suggest adopting the following strategy:
* use only lower case as in the example below that is formally identical to the one above | * use only lower case as in the example below that is formally identical to the one above | ||
* always close tags (more on that later ...) | * always close tags (more on that later ...) | ||
Line 145: | Line 173: | ||
; XHTML 1.0 strict example | ; XHTML 1.0 strict example | ||
<source lang="xml"> | <source lang="xml"> | ||
<?xml version="1.0" encoding="UTF-8"?> | <?xml version="1.0" encoding="UTF-8"?> | ||
Line 161: | Line 188: | ||
</source> | </source> | ||
; HTML 5 example
<source lang="html5">
<!doctype html> | |||
<html> | |||
<head> | |||
<title>My first HTML 5 document</title> | |||
<meta charset="UTF-8"> | |||
</head> | |||
<body> | |||
<p>Hello world!</p> | |||
</body> | |||
</html> | |||
</source> | |||
As you can see, HTML, XHTML and HTML 5 look very similar. The major differences between HTML and XHTML are the following:
* In HTML 4, some tags e.g. the ''p'' and ''li'' tags can be left "open", i.e. it is not necessary to add a closing tag | |||
* Attributes in HTML 4.x do not always need a value. | |||
* In XHTML, the ''html'' tag needs a namespace declaration (but ''not'' in HTML). | * In XHTML, the ''html'' tag needs a namespace declaration (but ''not'' in HTML). | ||
* HTML 5 extends the tag set of HTML 4 / XHTML 1.x and tags should be in lower case and closed (as in XHTML), but the latter is not a requirement. | |||
This may be confusing for a beginner. So to make things simple: | This may be confusing for a beginner. So, to make things simple: | ||
* Always start with one of the | * Always start with one of the three templates above (your [[web authoring system]] may do this automatically for you) | ||
* Always close all tags, even when you just write "old" HTML code | * Always close all tags, even when you just write "old" HTML or "new" HTML5 code | ||
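The second rule can be illustrated with a small list and a paragraph. Both fragments below are accepted in HTML 4, but only the second one also works as XHTML.
<source lang="html4strict">
<!-- HTML 4 only: the closing tags of li and p are left out -->
<ul>
 <li>First item
 <li>Second item
</ul>
<p>A paragraph

<!-- recommended: every tag is closed -->
<ul>
 <li>First item</li>
 <li>Second item</li>
</ul>
<p>A paragraph</p>
</source>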
=== HTML and XHTML structure and document type information === | === HTML and XHTML structure and document type information (DTD) === | ||
Let's now have a look at the lines before the <nowiki>html</nowiki> tag. | Let's now have a look at the lines before the <nowiki>html</nowiki> tag. | ||
Line 178: | Line 221: | ||
The rationale for including this information is that display will be better when the browser knows what kind of (X)HTML you intended to use. | The rationale for including this information is that display will be better when the browser knows what kind of (X)HTML you intended to use. | ||
There exist three major HTML document types: | There exist three major '''HTML''' document types: | ||
;HTML 4.01 Strict | ;HTML 4.01 Strict | ||
Line 184: | Line 227: | ||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" | ||
"http://www.w3.org/TR/html4/strict.dtd"> | "http://www.w3.org/TR/html4/strict.dtd"> | ||
<html> | |||
</html> | |||
</source> | </source> | ||
Line 190: | Line 235: | ||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" | ||
"http://www.w3.org/TR/html4/loose.dtd"> | "http://www.w3.org/TR/html4/loose.dtd"> | ||
<html> | |||
</html> | |||
</source> | </source> | ||
Line 196: | Line 243: | ||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" | ||
"http://www.w3.org/TR/html4/frameset.dtd"> | "http://www.w3.org/TR/html4/frameset.dtd"> | ||
<html> | |||
..... | |||
</html> | |||
</source> | </source> | ||
There exist four major XHTML document types: | There exist four major '''XHTML''' document types: | ||
;XHTML 1.0 Strict
Line 205: | Line 255: | ||
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" | PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" | ||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | ||
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> | |||
.... | |||
</html> | |||
</source> | </source> | ||
Line 212: | Line 265: | ||
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" | PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" | ||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> | "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> | ||
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> | |||
.... | |||
</html> | |||
</source> | </source> | ||
Line 219: | Line 275: | ||
PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" | PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" | ||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"> | "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"> | ||
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> | |||
.... | |||
</html> | |||
</source> | </source> | ||
Line 225: | Line 284: | ||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" | ||
"http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd"> | "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd"> | ||
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> | |||
.... | |||
</html> | |||
</source> | |||
There exist two variants of '''HTML5''': normal HTML5 and a so-called XML serialization (XHTML5).
; HTML5 | |||
<source lang="xml"> | |||
<!doctype html> | |||
<html> | |||
.... | |||
</html> | |||
</source> | </source> | ||
; XHTML5 | |||
* No specific declaration is needed, but the page must be served as ''application/xhtml+xml'' or ''application/xml'' | |||
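A minimal XHTML5 page could look like the sketch below, assuming the web server is configured to send it as ''application/xhtml+xml''. The XHTML namespace on the ''html'' tag is still needed; the title and the text are placeholders.
<source lang="xml">
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <title>My first XHTML5 document</title>
 </head>
 <body>
  <p>Hello world!</p>
 </body>
</html>
</source>
Since the page is parsed as XML, every tag must be closed and every attribute value must be quoted.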
Note regarding XHTML and XML: | Note regarding XHTML and XML: | ||
Line 236: | Line 311: | ||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | ||
</source> | </source> | ||
* In | * In addition, do not forget to declare the XHTML namespace attribute in the html tag | ||
=== How do browsers handle different HTML versions ? === | |||
As a web designer you must cope with several issues. Here are some that you should look at once you get past learning basic HTML:
* Decide whether you want to write code that can be displayed with all browsers. Usually you should.
* Decide what to do if browsers can't handle certain tags. The best strategy is to use dynamic server or client code that will adapt the content to the target's abilities. This particularly applies to [[HTML5]], which has been pushed quite aggressively since summer 2010. Each browser implements a different subset of this proposal. A library like [http://www.modernizr.com/ Modernizr] can help.
* Be aware of browser modes, i.e. the fact that each web browser has a mode that can cope with bad legacy HTML (particularly important for IE). Read [http://hsivonen.iki.fi/doctype/ Activating Browser Modes with Doctype]. This article by Henri Sivonen describes the following issue: {{quotation|In order to deal both with content written according to Web standards and with content written according to legacy practices that were prevalent in the late 1990s, contemporary Web browsers implement various engine modes. This document explains what those modes are and how they are triggered.}}
* Internet Explorer (version 8 and lower) cannot handle XHTML, i.e. it can't display XHTML when it is served with the XHTML [http://en.wikipedia.org/wiki/Internet_media_type mime-type] ''application/xhtml+xml''. IE will take XHTML pages if served as "html", i.e. it will treat XHTML as HTML and ignore all other languages that you may have included in your XHTML page (like [[SVG]] or [[MathML]]).
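One simple way of coping with missing support, which needs no scripting, is fallback content: HTML5 elements such as ''video'' or ''canvas'' may contain ordinary markup that is only displayed by browsers that do not know the element. The sketch below assumes a hypothetical video file called ''intro.mp4''.
<source lang="html5">
<video src="intro.mp4" controls>
 <!-- shown only by browsers that do not support the video element -->
 <p>Your browser cannot play this video.
    You may <a href="intro.mp4">download it</a> instead.</p>
</video>
</source>
This covers only the case of an unknown element. Finer differences between browsers still require the server-side or client-side adaptation mentioned above.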
== The head element == | == The head element == | ||
{{xmlelement|<head | {{xmlelement|<head>... </head>}} | ||
In the head element we may define various important pieces of information about an HTML page. Its contents are not displayed.
The most important sub- | The most important sub-elements of ''head'' are the following: | ||
{{xmlelement|<link / | {{xmlelement|<link />}} | ||
Specifies links to other documents, such as previous and next links, or alternate versions. A common use is to link to external stylesheets, using the form: | Specifies links to other documents, such as previous and next links, or alternate versions. A common use is to link to external stylesheets, using the form: | ||
<source lang="xml"> | <source lang="xml"> | ||
Line 251: | Line 335: | ||
</source> | </source> | ||
{{xmlelement|<meta / | {{xmlelement|<meta />}} | ||
Can be used to specify additional metadata about a document, such as its author, publication date, expiration date, page description, keywords, or other information not provided through the other header elements and attributes. For instance, the content attribute defines both the mime type and the | Can be used to specify additional metadata about a document, such as its author, publication date, expiration date, page description, keywords, or other information not provided through the other header elements and attributes. For instance, the content attribute defines both the mime type and the character set. There is a difference between HTML4/XHTML1 and [[HTML5]] ! | ||
HTML and XHTML 1 (served as HTML) | |||
<source lang="xml"> | <source lang="xml"> | ||
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> | ||
</source> | </source> | ||
HTML5: | |||
<source lang="xml"> | |||
<meta charset="UTF-8"> | |||
</source> | |||
{{xmlelement|<title>…</title>}} | {{xmlelement|<title>…</title>}} | ||
Defines a document title. Required in every HTML and XHTML document. User agents (e.g. browsers) may use the title in different ways or use it as the default page name when saving the page on the local file system. Typically, the title is displayed in the window decoration. For obvious reasons, only text is allowed within a title element.
A minimal head fragment would look like this: | |||
<source lang="xml"> | |||
<head> | |||
<title>Hello Page</title> | |||
</head> | |||
</source> | |||
Now for something a bit scary: the head of this page (as of Sept. 1, 2009) starts like this:
<source lang="xml" enclose="div"> | |||
<head> | |||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> | |||
<meta http-equiv="Content-Style-Type" content="text/css" /> | |||
<meta name="generator" content="MediaWiki 1.15.1" /> | |||
<meta name="keywords" content="HTML and XHTML elements and attributes,CSS,DTD tutorial,HTML,HTML links,Hypertext,JavaScript,Portalware,SGML,Through the web editor,URL" /> | |||
<link rel="stylesheet" type="text/css" href="/mediawiki/extensions/CategoryTree/CategoryTree.css?5" /> | |||
<link rel="alternate" type="application/x-wiki" title="Edit" href="/mediawiki/index.php?title=HTML_and_XHTML_elements_and_attributes&action=edit" /> | |||
<link rel="edit" title="Edit" href="/mediawiki/index.php?title=HTML_and_XHTML_elements_and_attributes&action=edit" /> | |||
<link rel="shortcut icon" href="/favicon.ico" /> | |||
</source> | |||
It then goes on for '''dozens of lines'''. If you wish to see all of them, select ''View->Source'' in your browser. Many modern web pages, though, use reasonably short headers. E.g. the Google search page used this (Sept. 2009):
<source lang="xml"><head> | |||
<meta http-equiv="content-type" content="text/html;charset=UTF-8"> | |||
<title>Google</title> | |||
<script> [... some scripting code inside ...]</script> | |||
</head></source> | |||
== Structuring the document body == | == Structuring the document body == | ||
Inside the body tag, a variety of high-level elements may be used in any order. The following pseudo-formal rule includes the most important ones:
:''' ''body'' = ( address | blockquote | div | dl | h1 | h2 | h3 | ol | p | pre | table | ul ) * ''' | |||
Each of these structural elements may include markup | Read the rule like this: Inside the body tag you may include any of these tags as much as you like and in any order. | ||
Each of these structural elements may include text without markup, additional structural elements or inline elements (see below). Some elements however, like the ''ol'', ''ul'' and ''table'' tags, only allow a restricted set of sub-elements inside. | |||
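The rule can be illustrated with a short body, given here as a sketch with made-up content. It mixes several block elements in free order, while the ''ul'' element only contains ''li'' sub-elements.
<source lang="html4strict">
<body>
 <h1>Shopping</h1>
 <!-- a block element with text and an inline element -->
 <p>Things to buy <em>today</em>:</p>
 <!-- a list only allows list items as direct sub-elements -->
 <ul>
  <li>Bread</li>
  <li>Milk</li>
 </ul>
 <!-- a block element that contains another block element -->
 <blockquote>
  <p>A quotation is a block that contains other blocks.</p>
 </blockquote>
</body>
</source>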
Knowing exactly what kind of elements (tags) you are allowed to insert within an element is not obvious. Many websites include detailed descriptions of tags, e.g. have a look at the ones we list in the [[HTML_links#Manuals_.26_Short_References|HTML links]] page. Be aware that most websites don't clearly tell you what HTML version they are referring to, but if you make a mistake, a so-called validator can tell you so. Also, professional web designers in training may want to acquire a good book.
The last resort (at least for computer-savvy people) is always the specifications, which you can find at W3C, e.g. the [http://www.w3.org/TR/html/ XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition) Recommendation] that requires you to also read the [http://www.w3.org/TR/1999/REC-html401-19991224/ HTML 4.01 Specification]. These are very difficult for beginners, but you still may have a look at both of these documents.
=== Headings (titles) === | === Headings (titles) === | ||
Line 272: | Line 396: | ||
{{xmlelement|<h1>…</h1> <h2>…</h2> <h3>…</h3> <h4>…</h4> <h5>…</h5> <h6>…</h6>}} | {{xmlelement|<h1>…</h1> <h2>…</h2> <h3>…</h3> <h4>…</h4> <h5>…</h5> <h6>…</h6>}} | ||
Section headings at different levels. <h1> delimits the highest-level heading, <h2> the next level down (sub-section), <h3> for a level below that, and so on to <h6>. Most visual browsers show headings as large bold text by default, though this can be overridden with CSS. Heading elements are not intended merely for creating large or bold text — they describe the document’s structure and organization. Some programs use them to generate outlines and tables of contents. | Section headings at different levels. <h1> delimits the highest-level heading, <h2> the next level down (sub-section), <h3> for a level below that, and so on to <h6>. Most visual browsers show headings as large bold text by default, though this can be overridden with CSS. Heading elements are not intended merely for creating large or bold text — they describe the document’s structure and organization. Some programs use them to generate outlines and tables of contents. | ||
Heading elements may include most inline elements. | |||
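The sketch below shows how heading levels could mirror the outline of a document. The titles are made up, and the indentation is only there to make the hierarchy visible.
<source lang="html4strict">
<h1>HTML tutorial</h1>
 <h2>Introduction</h2>
 <!-- a heading may contain inline elements -->
 <h2>The <em>head</em> element</h2>
  <h3>The title</h3>
  <h3>Meta information</h3>
 <h2>The body element</h2>
</source>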
=== Paragraphs === | === Paragraphs === | ||
Paragraph elements may include text and most inline elements, but not block elements such as lists and tables.
{{xmlelement|<p>...</p>}} | {{xmlelement|<p>...</p>}} | ||
Line 309: | Line 437: | ||
== Inline elements: Adding markup to block elements == | == Inline elements: Adding markup to block elements == | ||
=== Hyperlinks === | |||
:''inline contents'' = ( ''Any text'' | a | abbr | acronym | br | cite | code | em | img | kbd | q | samp | span | strong | table | script | object) * | |||
{{Xmlelement|<a>…</a>}} | {{Xmlelement|<a>…</a>}} | ||
An anchor can be either the origin or the target (destination) end of a hyperlink. | An anchor can be either the origin or the target (destination) end of a hyperlink. | ||
With the attribute href, the anchor becomes a hyperlink to either another part of the document or another resource (e.g. a webpage) using an external URL. | With the attribute href, the anchor becomes a hyperlink to either another part of the document or another resource (e.g. a webpage) using an external URL or a relative URL to another resource on the same website. | ||
The following code | The following code | ||
<source lang="html4strict"> | <source lang="html4strict"> | ||
<a href="http:// | <a href="http://edutechwiki.unige.ch/en/web_technology_tutorials">Web technology tutorials</a> | ||
</source> | </source> | ||
will show like this | will show like this | ||
: [[ | : [[Web technology tutorials]] | ||
The attribute title may be set to give brief information about the link: | The attribute title may be set to give brief information about the link: | ||
<source lang="html4strict"> | <source lang="html4strict"> | ||
<a href="URL" title="additional information">link text</a> | <a href="URL" title="additional information">link text</a> | ||
</source> | </source> | ||
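The following sketch shows the other kinds of links mentioned above: a link to another part of the same document and links that use a relative URL. The ''id'' values and the file name ''contact.html'' are made up for this example.
<source lang="html4strict">
<!-- target: any element with an id -->
<h2 id="tables">Tables</h2>

<!-- link to that part of the same page -->
<a href="#tables">Jump to the tables section</a>

<!-- relative URL: a hypothetical page in the same directory -->
<a href="contact.html">Contact</a>

<!-- relative URL that points to a part of another page -->
<a href="contact.html#address">Our address</a>
</source>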
=== Phrase elements === | |||
{{Xmlelement|<em>…</em>}} | |||
Emphasis (conventionally displayed in italics) | |||
{{Xmlelement|<strong>…</strong>}} | |||
strong emphasis (conventionally displayed bold). | |||
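A short sketch with made-up text that uses both phrase elements inside a paragraph:
<source lang="html4strict">
<p>Close <em>every</em> tag. <strong>Never</strong> leave a list item open.</p>
</source>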
=== Presentation elements === | |||
Elements like "b", "i" etc. should be avoided as much as possible. The "font" tag is even deprecated. Use [[CSS tutorial|CSS]] instead!
{{Xmlelement|<sub>…</sub> and <sup>…</sup>}} | |||
Mark subscript or superscript text. (Equivalent CSS: {vertical-align: sub} or {vertical-align: super}.) | |||
{{Xmlelement|<br />}} | |||
An element that inserts a line break. This element should '''not''' be used to separate paragraphs. Paragraphs are simply wrapped within ''p'' tags. | |||
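The difference can be sketched as follows. The address is invented for this example.
<source lang="xml">
<!-- line breaks inside one block, e.g. a postal address -->
<p>
 EduTech Wiki<br />
 1 Example Street<br />
 Geneva
</p>

<!-- paragraphs are separate p elements, not text split by br -->
<p>First paragraph.</p>
<p>Second paragraph.</p>
</source>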
== Tables == | |||
{{Xmlelement|<table>…</table>}} | |||
Identifies a table. A most simple table has the following structure: | |||
<source lang="html4strict"> | |||
<table border="1"> | |||
<tr><th>Food</th><th>Price</th></tr> | |||
<tr><td>Bread</td><td>$2.99</td></tr> | |||
<tr><td>Milk</td><td>$1.40</td></tr> | |||
</table> | |||
</source> | |||
It would look like this: | |||
<table border="1"> | |||
<tr><th>Food</th><th>Price</th></tr> | |||
<tr><td>Bread</td><td>$2.99</td></tr> | |||
<tr><td>Milk</td><td>$1.40</td></tr> | |||
</table> | |||
{{Xmlelement|<tr>…</tr>}} | |||
Contains a row of cells in a table. | |||
{{Xmlelement|<th>…</th>}} | |||
A table header cell; contents are conventionally displayed bold and centered. An aural user agent may use a louder voice for these items. | |||
{{Xmlelement|<td>…</td>}} | |||
A table data cell. | |||
{{Xmlelement|<colgroup>…</colgroup>}} | |||
Specifies a column group in a table. | |||
{{Xmlelement|<col />}} | |||
Specifies attributes for an entire column in a table. | |||
{{Xmlelement|<caption>…</caption>}} | |||
Specifies a caption for a table. | |||
{{Xmlelement|<thead>…</thead>}} | |||
Specifies the header part of a table. This section may be repeated by the user agent if the table is split across pages (in printing or other paged media). | |||
{{Xmlelement|<tbody>…</tbody>}} | |||
Specifies the main part of a table. | |||
{{Xmlelement|<tfoot>…</tfoot>}} | |||
Specifies the footer part of a table. Like <thead>, this section may be repeated by the user agent if the table is split across pages (in printing or other paged media). | |||
== Forms == | |||
To do. In the meantime, see: [http://en.wikipedia.org/wiki/HTML_form HTML form] on wikipedia. | |||
== Scripts, pictures and other external objects == | |||
{{Xmlelement|<script>...</script>}} | |||
Places a script in the document. Also usable in the head and in block contexts. Typically, the script in the head element would load all libraries, i.e. function definitions, whereas scripts in the body would just define a few lines of function calls. See [[JavaScript]]. | |||
{{Xmlelement|<applet>…</applet>}} | |||
Embeds a Java applet in the page. Deprecated in favor of <object>, as it could only be used with Java applets, and had accessibility limitations. | |||
{{Xmlelement|<img />}} | |||
Used to insert an image in the document. The src attribute specifies the image URL. The required alt attribute provides alternative text in case the image cannot be displayed. (Though alt is intended as alternative text, Microsoft Internet Explorer renders it as a tooltip if no title is given.) Below is an example that would retrieve a picture from the same server as edutech wiki. We also define a width and a height, a strategy that can greatly speed up rendering of a page. | |||
<source lang="xml"> | |||
<img alt="Creative commons batch" | |||
src="/mediawiki/images/Cc-network.png" border="0" width="88" height="31"> | |||
</source> | |||
It would show like this: [[image:cc-network.png]] | |||
The following example shows how one would include a picture from a Wikimedia website (result not shown): | |||
<source lang="xml"> | |||
<img alt="Creative commons batch" src="http://upload.wikimedia.org/wikipedia/common/4/4d/Crystal_Clear_mimetype_html.png"> | |||
</source> | |||
{{Xmlelement|<map>…</map>}} | |||
Specifies a client-side image map, not explained here. | |||
{{Xmlelement|<object>…</object>}} | |||
Includes an object in the page of the type specified by the type attribute. This may be in any MIME-type the user agent understands, such as an embedded HTML page, a file to be handled by a plug-in such as Flash, a Java applet, a sound file, etc. | |||
Note: The "embed" tag calls a plug-in handler for the type specified by the type attribute. Used for embedding Flash files, sound files, etc. This is a proprietary Netscape extension to HTML; <object> is the W3C standard method. | |||
{{Xmlelement|<iframe>…</iframe>}} | |||
Includes a resource (HTML page, picture, etc.). HTML5 removed a number of attributes from the HTML 4.01 element definition; use CSS instead. However, it added <code>sandbox</code>, which allows some extra restrictions on the content of an iframe. Prior to that, iframes were security risks. | |||
== Elements for extensions and styling purposes == | == Elements for extensions and styling purposes == | ||
{{Xmlelement | {{Xmlelement|<div>…</div>}} | ||
A block-level logical division. A generic element with no semantic meaning used to distinguish a document section, usually for purposes such as presentation or behaviour controlled by stylesheets or DOM calls. | A block-level logical division. A generic element with no semantic meaning used to distinguish a document section, usually for purposes such as presentation or behaviour controlled by stylesheets or DOM calls. | ||
Code below is valid | |||
<source lang="xml"> | |||
<div style="color:green;font-size:larger;background-color:lightgrey"><h3>Draft section</h3> | |||
<p> .... </p> | |||
</div> | |||
</source> | |||
but should be rather replaced by: | |||
<source lang="xml"> | |||
<div class="draft"><h3>Draft section</h3> | |||
<p> .... </p> | |||
</div> | |||
</source> | |||
plus a [[CSS]] definition of the "draft" class we just invented, e.g. like this | |||
.draft {color:green;font-size:larger;background-color:lightgrey} | |||
{{Xmlelement|<span>…</span>}} | {{Xmlelement|<span>…</span>}} | ||
An inline logical division. A generic element with no semantic meaning used to distinguish a document section, usually for purposes such as presentation or behaviour controlled by stylesheets or DOM calls. | An inline logical division. A generic element with no semantic meaning used to distinguish a document section, usually for purposes such as presentation or behaviour controlled by stylesheets or DOM calls. | ||
== | Here is a little example that shows how to use ''span'' with some inline [[CSS tutorial|CSS]] styling. | ||
<source lang="XML"> | |||
<p> <span style="font-weight:bold;color:green;">This article <i>or</i> section is a stub</span>. | |||
A stub is an entry that did not yet receive substantial attention .....</p> | |||
</source> | |||
In a browser, it would look like this: | |||
:<p> <span style="font-weight:bold;color:green;">This article <i>or</i> section is a stub</span>. A stub is an entry that did not yet receive substantial attention .....</p> | |||
== Use of HTML fragments in online environments == | |||
Many web 2.0 environments and [[portalware]] let users hand code "HTML boxes". In this case you can enter any "body" elements you are allowed to. Typically elements like ''p'', ''ul'', etc. are allowed, whereas elements like ''script'' or ''object'' may or may not be allowed for security reasons. ''h1'' etc. elements may not be allowed for styling reasons. | |||
Often, such environments include by default a [[through the web editor]]. But most of these tools do have an "HTML" button that will let you hand code HTML. | |||
You will have to find out on a case-by-case basis... | |||
== Acknowledgment and copyright == | |||
{{copyrightalso|[http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution/Share-Alike License]. Parts of this article are based on various Wikipedia articles, in particular: [http://en.wikipedia.org/wiki/HTML HTML], [http://en.wikipedia.org/wiki/XHTML XHTML], and [http://en.wikipedia.org/wiki/HTML_element HTML element]}} | {{copyrightalso|[http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution/Share-Alike License]. Parts of this article are based on various Wikipedia articles, in particular: [http://en.wikipedia.org/wiki/HTML HTML], [http://en.wikipedia.org/wiki/XHTML XHTML], and most importantly [http://en.wikipedia.org/wiki/HTML_element HTML element] for element descriptions. If you reuse lots of contents from this page, you must both acknowledge these Wikipedia articles and EduTechwiki and reproduce this license.}} | ||
[[Category: Web authoring]] | [[Category: Web authoring]] |
Latest revision as of 19:38, 29 September 2018
Introduction
- Learning goals
- Learn basic HTML and XHTML markup
- Prerequisites
- None, but you also may read the Internet tutorial
- Concurrent
- HTML and XHTML validation and repair
- Moving on
- CSS
- DHTML tutorial
- HTML forms tutorial
- See also the list of web technology tutorials
- Level and target population
- Beginners
- Remarks
- For the moment, this article is intended to be a "handout" for "lab" teaching. In other words, a teacher + hands-on activities are needed. In addition, we don't explain how to use a specific editing tool.
- To do: Add some more tags and attributes, some additional explanations for each tag, an HTML forms tutorial, etc.
See also:
- HTML, XHTML and HTML5 for some background information
- HTML links for a page with pointers (e.g. to other HTML tutorials)
- Using SVG with HTML5 tutorial (moving on)
SGML and XML markup
SGML and XML are the formalisms with which formal languages like HTML (in SGML) and XHTML (in XML) are defined. XML has largely replaced SGML: it is simpler in structure, and more powerful tools have been built around it.
Definitions of formal languages are called "document type definitions", "schemas" or "grammars". Read the DTD tutorial if you wish to know details. For the moment, you just need to understand that these grammars are sets of rules that define:
- a set of elements (tags) and their attributes that identify various structural elements of an HTML page;
- how these elements can be embedded ;
- different sorts of entities (reusable fragments, special characters).
SGML and XML languages, e.g. HTML and XHTML, have three kinds of components:
- elements
- attributes
- character and entity references (not explained here)
These components use a special syntax that we shall now introduce with an explanation taken from Wikipedia.
The most recent HTML5 standard (not yet formalized) is based on another formalism, i.e. Web IDL, an interface definition language, that will not be covered here.
An introduction to the (X)HTML markup formalism according to Wikipedia
HTML elements
HTML elements are the basic components for HTML markup. Elements have two basic properties: attributes and content. Each element's attribute and each element's content has certain restrictions that must be followed for an HTML document to be considered valid. An element usually has a start tag (e.g. <element-name>) and an end tag (e.g. </element-name>). The element's attributes are contained in the start tag and content is located between the tags (e.g. <element-name attribute="value">Content</element-name>). Some elements, such as <br>, do not have any content and must not have a closing tag. Listed below are several types of markup elements used in HTML.
Structural markup describes the purpose of text. For example, <h2>Golf</h2> establishes "Golf" as a second-level heading, which would be rendered in a browser in a manner similar to the "Introduction" title at the start of this section. Structural markup does not denote any specific rendering, but most Web browsers have standardized default styles for element formatting. Text may be further styled with Cascading Style Sheets (CSS).
Presentational markup describes the appearance of the text, regardless of its function. For example <b>boldface</b> indicates that visual output devices should render "boldface" in bold text. In the case of both <b>bold</b> and <i>italic</i>, there are elements which usually have an equivalent visual rendering but are more semantic in nature, namely <strong>strong emphasis</strong> and <em>emphasis</em> respectively. Most presentational markup elements have become deprecated under the HTML 4.0 specification, in favor of CSS based style design.
Hypertext markup links parts of the document to other documents. HTML up through version XHTML 1.1 requires the use of an anchor element to create a hyperlink in the flow of text: <a> ... </a>. However, the href attribute must also be set to a valid URL, so for example the HTML markup <a href="http://en.wikipedia.org/">Wikipedia</a> will render the word "Wikipedia" as a hyperlink. To put a link on an image, the anchor tag uses the following syntax: <a href="url"><img src="image.gif" alt="alternative text" width="50" height="50"></a>
HTML attributes
Let's now look at attributes. Most of the attributes of an element are name-value pairs, separated by "=". Attributes are written within the start tag of an element, after the element's name. The value may be enclosed in single or double quotes, although values consisting of certain characters can be left unquoted in HTML (but not XHTML).
Most elements can take any of several common attributes:
- The id attribute provides a document-wide unique identifier for an element. This can be used by stylesheets to provide presentational properties, by browsers to focus attention on the specific element, or by scripts to alter the contents or presentation of an element. Appended to the URL of the page, it provides a globally-unique identifier for an element; typically a sub-section of the page. For example, the ID "Attributes" in http://en.wikipedia.org/wiki/HTML#Attributes
- The class attribute provides a way of classifying similar elements. This can be used for presentation purposes for example. An HTML document might use the designation class="notation" to indicate that all elements with this class value are subordinate to the main text of the document. Such elements might be gathered together and presented as footnotes on a page instead of appearing in the place where they occur in the HTML source.
- An author may use the style attribute to assign presentational properties to a particular element. It is considered better practice to use an element's id or class attributes to select the element with a stylesheet, though sometimes this can be too cumbersome for a simple ad hoc application of styled properties.
- The title attribute is used to attach subtextual explanation to an element. In most browsers this attribute is displayed as what is often referred to as a tooltip.
The abbreviation element, abbr, can be used to demonstrate these various attributes:
<abbr id="anId" class="aClass" style="color:blue;" title="Hypertext Markup Language">HTML</abbr>
This example displays as
- HTML
In most browsers, pointing the cursor at the abbreviation should display the title text "Hypertext Markup Language."
Most elements also take the language-related attributes lang and dir.
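As a small illustration of the language-related attributes, the sketch below marks a French sentence with lang and a right-to-left Hebrew sentence with lang and dir. The sample sentences are our own and only serve as an example.

```html
<!-- lang names the language of the content; dir sets the text direction -->
<p lang="fr">Bonjour tout le monde !</p>
<p lang="he" dir="rtl">שלום עולם</p>
```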
Summary of (X)HTML markup principles
HTML elements may be either containers or empty.
Container elements are constructed with:
- a start tag (<tag>) marking the beginning of an element, which may incorporate any number of attributes (including their values);
- some amount of content (text, other elements);
- an end tag, in which the element name is prepended with a forward slash: </tag>. (Note: In some forms of HTML the end tag is optional for some container elements.)
Empty elements consist of only a single tag, with any attributes. (The tag may have a slash appended: <tag /> – in XHTML this is required.)
Attributes define desired behavior or indicate additional element properties. In XHTML each attribute must have a quoted value, e.g. class = "important".
Element (and attribute) names may be written in either upper or lower case in HTML, but must be in lower case in XHTML.
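The principles summarized above can be sketched with a short fragment written in the stricter XHTML style (lower case names, quoted attribute values, closed tags). The class name "important" is taken from the text; the content is made up for illustration.

```html
<!-- container element: start tag with an attribute, content, end tag -->
<p class="important">First line<br />second line</p>
<!-- empty element: a single tag; the trailing slash is required in XHTML -->
<hr />
```

Written this way, the same fragment is acceptable both as HTML and as XHTML.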
Content vs. Style
In keeping with the principle of Separation of Concerns, the function of HTML is primarily to add structural and semantic information to the raw text of a document. In other words: HTML just defines what kind of text (or typographic) elements you have inside a document, e.g. a title, a paragraph, a quotation, a list, a list item, a table, a table row, a table cell, etc.
Web browsers have a built-in method to render each of these elements, e.g. titles appear bigger and list elements are indented and prefixed with bullets.
In older variants of HTML, designers would change appearance with HTML tags that are now deprecated in HTML 4.x and XHTML 1.x strict. Nowadays, text is styled with CSS, the Cascading Style Sheets language. The advantage of this strategy is that you may use the same style for lots of pages, e.g. this wiki uses the same stylesheets for all its pages. This also means that you can easily change the look of all your pages that use the same stylesheet(s). Finally, you may associate several stylesheets with a page, which offers an extra set of functionalities. For example, you can load a large "official" stylesheet that will cover most of your needs and then fine-tune styling by adding your own on top. You may also create different stylesheets for different media, in particular: one for normal viewers, one for visually impaired viewers and one for printing that filters out elements like navigation menus that you won't need on paper.
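To make the idea of several stylesheets per page more concrete, here is a minimal sketch of a head element that loads a general stylesheet, a personal one on top, and one used only for printing. The file names site.css, custom.css and print.css are hypothetical.

```html
<head>
  <title>Stylesheet example</title>
  <!-- site.css, custom.css and print.css are hypothetical file names -->
  <link rel="stylesheet" type="text/css" href="site.css">
  <!-- loaded after site.css, so its rules can fine-tune the general style -->
  <link rel="stylesheet" type="text/css" href="custom.css">
  <!-- only applied when the page is printed -->
  <link rel="stylesheet" type="text/css" href="print.css" media="print">
</head>
```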
Structure of HTML pages and its variants
Markup of an HTML page is divided into two big parts: the head contains information that the user will not see inside the browser window and the body contains the contents to be displayed. We can express this with a simple formula:
html = head + body
A most simple HTML document that would just display
- Hello EdutechWiki reader!
would look like this:
<html>
<head>
<title>Hello Page</title>
</head>
<body>
<p>Hello EdutechWiki reader!</p>
</body>
</html>
You may have a look at the real page by clicking here. However, this HTML code is not totally complete. In addition, the first line(s) of an (X)HTML page should contain a declaration that precisely defines what HTML dialect is being used. For instance, there is a difference between HTML, XHTML and HTML5 pages.
A simple HTML 4 page may look like this:
- File: preq-html-page.html
An XHTML Page looks slightly different:
- File: preq-xhtml-page.html
There exist different variants of HTML. As of 2013, the major versions in use are HTML5, HTML 4.x, and (fake) XHTML 1.x. Let's now look at a larger gallery of complete examples before we start introducing (X)HTML elements and attributes.
HTML and XHTML code examples
- HTML 4.01 strict example
Source: http://www.w3.org/TR/html4/struct/global.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>My first HTML document</TITLE>
</HEAD>
<BODY>
<P>Hello world!
</BODY>
</HTML>
HTML 4.x tags may use any kind of case, e.g. HEAD, Head, head, heaD would be correct. To ensure XHTML compatibility we suggest adopting the following strategy:
- use only lower case as in the example below that is formally identical to the one above
- always close tags (more on that later ...)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>My first HTML document</title>
</head>
<body>
<p>Hello world!</p>
</body>
</html>
- XHTML 1.0 strict example
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>My first XHTML document</title>
</head>
<body>
<p>Hello world!</p>
</body>
</html>
- HTML 5 example
<!doctype html>
<html>
<head>
<title>My first HTML 5 document</title>
<meta charset="UTF-8">
</head>
<body>
<p>Hello world!</p>
</body>
</html>
As you can see, HTML, XHTML and HTML 5 look very similar. The major differences between HTML and XHTML are the following:
- In HTML 4, some tags e.g. the p and li tags can be left "open", i.e. it is not necessary to add a closing tag
- Attributes in HTML 4.x do not always need a value.
- In XHTML, the html tag needs a namespace declaration (but not in HTML).
- HTML 5 extends the tag set of HTML 4 / XHTML 1.x and tags should be in lower case and closed (as in XHTML), but the latter is not a requirement.
This may be confusing for a beginner. So, to make things simple:
- Always start with one of the three templates above (your web authoring system may do this automatically for you)
- Always close all tags, even when you just write "old" HTML or "new" HTML5 code
HTML and XHTML structure and document type information (DTD)
Let's now have a look at the lines before the html tag.
Correct HTML files should include the following document type declaration information starting on line 1. Before we add more explanation, we suggest that you use either HTML 4.01 Transitional or XHTML 1.0 Transitional for pages meant for reading on a computer and XHTML Basic for (modern) cellphones and PDAs.
The rationale for including this information is that display will be better when the browser knows what kind of (X)HTML you intended to use.
There exist three major HTML document types:
- HTML 4.01 Strict
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
</html>
- HTML 4.01 Transitional
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
</html>
- HTML 4.01 Frameset
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">
<html>
.....
</html>
There exist four major XHTML document types:
- XHTML 1.0 Strict
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
....
</html>
- XHTML 1.0 Transitional
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
....
</html>
- XHTML 1.0 Frameset
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
....
</html>
- XHTML Basic
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN"
"http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
....
</html>
There exist two variants of HTML5: normal HTML5 and a so-called XML serialization (XHTML5).
- HTML5
<!doctype html>
<html>
....
</html>
- XHTML5
- No specific declaration is needed, but the page must be served as application/xhtml+xml or application/xml
Note regarding XHTML and XML:
- If you intend to serve XHTML as XML (e.g. in order to include other XML languages within your document) we suggest adding an XML declaration at the very beginning of the file.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- In addition, do not forget to declare the XHTML namespace attribute in the html tag
How do browsers handle different HTML versions ?
As a web designer you must cope with several issues. Here are some of the issues that you should look at once you get past learning basic HTML:
- Decide whether you want to write code that can be displayed with all browsers. Usually you should.
- Decide what to do if browsers can't handle certain tags. The best strategy is to use dynamic server or client code that will adapt the content to the target's abilities, e.g. with Modernizr. This particularly applies to HTML5, which has been pushed quite aggressively since summer 2010. Each browser implements a different subset of this proposal.
- Be aware of browser modes, i.e. the fact that each web browser has a mode that can cope with bad legacy HTML (particularly important for IE). Read Activating Browser Modes with Doctype. This article by Henri Sivonen describes the following issue: “In order to deal both with content written according to Web standards and with content written according to legacy practices that were prevalent in the late 1990s, contemporary Web browsers implement various engine modes. This document explains what those mode are and how they are triggered.”
- Internet Explorer (version 8 and lower) cannot handle XHTML, i.e. it can't display XHTML when it is served with an XML MIME type such as application/xhtml+xml. IE will accept XHTML pages if they are served as "html", i.e. it will treat the XHTML as HTML and ignore all other languages that you may have included in your XHTML page (like SVG or MathML).
The head element
<head>... </head>
In the head element we may define various important pieces of information about an HTML page. Its contents are not displayed.
The most important sub-elements of head are the following:
<link />
Specifies links to other documents, such as previous and next links, or alternate versions. A common use is to link to external stylesheets, using the form:
<link rel="stylesheet" type="text/css" href="url" title="description_of_style">
<meta />
Can be used to specify additional metadata about a document, such as its author, publication date, expiration date, page description, keywords, or other information not provided through the other header elements and attributes. For instance, the content attribute defines both the mime type and the character set. There is a difference between HTML4/XHTML1 and HTML5 !
HTML and XHTML 1 (served as HTML)
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
HTML5:
<meta charset="UTF-8">
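Besides the character set, meta elements commonly carry name/content pairs such as author, description and keywords. The following sketch uses invented values and is only meant to show the pattern.

```html
<!-- all content values below are made up for illustration -->
<meta name="author" content="Jane Doe">
<meta name="description" content="A short tutorial on HTML elements">
<meta name="keywords" content="HTML, XHTML, tutorial">
```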
<title>…</title>
Define a document title. Required in every HTML and XHTML document. User agents (e.g. browsers) may use the title in different ways or use it as default page name when saving the page on the local file system. Typically, the title is displayed on the window decoration. For obvious reasons, only text is allowed within a title element.
A minimal head fragment would look like this:
<head>
<title>Hello Page</title>
</head>
Now for something a bit scarier: the head of this page (as of Sept. 1, 2009) starts like this:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="MediaWiki 1.15.1" />
<meta name="keywords" content="HTML and XHTML elements and attributes,CSS,DTD tutorial,HTML,HTML links,Hypertext,JavaScript,Portalware,SGML,Through the web editor,URL" />
<link rel="stylesheet" type="text/css" href="/mediawiki/extensions/CategoryTree/CategoryTree.css?5" />
<link rel="alternate" type="application/x-wiki" title="Edit" href="/mediawiki/index.php?title=HTML_and_XHTML_elements_and_attributes&action=edit" />
<link rel="edit" title="Edit" href="/mediawiki/index.php?title=HTML_and_XHTML_elements_and_attributes&action=edit" />
<link rel="shortcut icon" href="/favicon.ico" />
It then goes on for dozens of lines. If you wish to see all of them: In your browser select View->Source. Many modern web pages though use reasonably short headers. E.g. the google search page used this (sept. 2009):
<head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8">
<title>Google</title>
<script> [... some scripting code inside ...]</script>
</head>
Structuring the document body
Inside the body tag, a variety of high level elements may be used in any order. The following pseudo-formal rule includes the most important ones:
- body = ( address | blockquote | div | dl | h1 | h2 | h3 | ol | p | pre | table | ul ) *
Read the rule like this: Inside the body tag you may include any of these tags as much as you like and in any order.
Each of these structural elements may include text without markup, additional structural elements or inline elements (see below). Some elements however, like the ol, ul and table tags, only allow a restricted set of sub-elements inside.
Knowing exactly what kind of elements (tags) you are allowed to insert within an element is not obvious. Many websites include detailed descriptions of tags, e.g. have a look at the ones we list in the HTML links page. Be aware that most websites don't clearly tell you what HTML version they are referring to, but if you make a mistake, a so-called validator can tell you so. Also, professional web designers in training may want to acquire a good book.
The last resort (at least for computer-savvy people) is always the specifications, which you can find at W3C, e.g. the XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition) Recommendation, which requires you to also read the HTML 4.01 Specification. These documents are very difficult for beginners, but you may still have a look at both of them.
Headings (titles)
<h1>…</h1> <h2>…</h2> <h3>…</h3> <h4>…</h4> <h5>…</h5> <h6>…</h6>
Section headings at different levels. <h1> delimits the highest-level heading, <h2> the next level down (sub-section), <h3> for a level below that, and so on to <h6>. Most visual browsers show headings as large bold text by default, though this can be overridden with CSS. Heading elements are not intended merely for creating large or bold text — they describe the document’s structure and organization. Some programs use them to generate outlines and tables of contents.
Heading elements may include most inline elements.
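As an illustration of how headings describe document structure, the following sketch outlines a small document with one top-level title, two sections and two sub-sections. The titles are invented.

```html
<h1>Cooking basics</h1>
<h2>Bread</h2>
<h3>Ingredients</h3>
<h3>Baking</h3>
<h2>Soup</h2>
```

A program that generates a table of contents could derive the nesting from the heading levels alone.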
Paragraphs
The p element may include text and most inline elements, but no block elements. The blockquote element may, in addition, include block elements such as paragraphs, lists and tables.
<p>...</p>
Creates a paragraph, perhaps the most common block level element. The closing tag is optional for HTML and (of course) required for XHTML.
<blockquote>…</blockquote>
A block-level quotation, for when the quotation includes block level elements, e.g. paragraphs. The cite attribute may give the source, and must be a fully qualified Uniform Resource Identifier (URI). The default presentation of block quotations in visual browsers is usually to indent them from both margins. This has led to the element being unnecessarily used just to indent paragraphs, regardless of semantics.
<pre>…</pre>
Pre-formatted text. Text within this element is typically displayed in a non-proportional font exactly as it is laid out in the file (see ASCII art). Whereas browsers ignore whitespace for other HTML elements, in pre, whitespace should be rendered as authored. This element can contain any inline element except: image (img), object (object), big font size (big), small font size (small), superscript (sup), and subscript (sub).
<hr />
A horizontal rule. Presentational rules can also be drawn with stylesheets.
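The block elements of this section can be combined as in the following sketch. The URL in the cite attribute is a placeholder, and the text content is invented.

```html
<p>A paragraph of ordinary text.</p>
<!-- the cite URL is a placeholder for the real source of the quotation -->
<blockquote cite="http://www.example.org/source.html">
  <p>A quoted paragraph.</p>
</blockquote>
<!-- inside pre, line breaks and spaces are kept as typed -->
<pre>
line one
    line two, indented by four spaces
</pre>
<hr />
```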
Lists
<dl>…</dl>
A definition list consisting of definition terms paired with definitions, see the next two items
<dt>…</dt>
A definition term in a definition list.
<dd>…</dd>
The definition of a term, in a definition list.
<ol>…</ol>
An ordered (enumerated) list. The type attribute can be used to specify the kind of ordering, but stylesheets give more control: {list-style-type: foo}. The default is Arabic numbering. Within an ol element, you may use nothing but <li>...</li> elements !
<ul>…</ul>
An unordered (bulleted) list. Stylesheets can be used to specify the list marker: {list-style-type: foo}. The default marker is a disc.
<li>…</li>
A list item in ordered (ol) or unordered (ul) lists.
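The three kinds of lists can be sketched as follows; the list contents are invented for illustration.

```html
<!-- unordered list: items are shown with bullets -->
<ul>
  <li>Bread</li>
  <li>Milk</li>
</ul>
<!-- ordered list: items are numbered -->
<ol>
  <li>Mix the dough</li>
  <li>Bake the bread</li>
</ol>
<!-- definition list: each term (dt) is paired with a definition (dd) -->
<dl>
  <dt>HTML</dt>
  <dd>Hypertext Markup Language</dd>
</dl>
```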
Inline elements: Adding markup to block elements
Hyperlinks
- inline contents = ( Any text | a | abbr | acronym | br | cite | code | em | img | kbd | q | samp | span | strong | table | script | object) *
<a>…</a>
An anchor can be either the origin or the target (destination) end of a hyperlink.
With the attribute href, the anchor becomes a hyperlink to either another part of the document or another resource (e.g. a webpage) using an external URL or a relative URL to another resource on the same website. The following code
<a href="http://edutechwiki.unige.ch/en/web_technology_tutorials">Web technology tutorials</a>
will show like this
- Web technology tutorials
The attribute title may be set to give brief information about the link:
<a href="URL" title="additional information">link text</a>
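The two other kinds of links mentioned above, a link to another part of the same document and a relative link to another page of the same website, might look like the sketch below. The id value "tables" and the file name contact.html are hypothetical.

```html
<!-- target: any element with an id can be the destination of a link -->
<h2 id="tables">Tables</h2>
<!-- link to that part of the same document -->
<a href="#tables">Jump to the tables section</a>
<!-- relative URL: another page on the same website (hypothetical file name) -->
<a href="contact.html" title="How to reach us">Contact</a>
```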
Phrase elements
<em>…</em>
Emphasis (conventionally displayed in italics)
<strong>…</strong>
strong emphasis (conventionally displayed bold).
Presentation elements
Elements like "b", "i" etc. should be avoided as much as possible. The "font" tag is even deprecated. Use CSS instead!
<sub>…</sub> and <sup>…</sup>
Mark subscript or superscript text. (Equivalent CSS: {vertical-align: sub} or {vertical-align: super}.)
<br />
An element that inserts a line break. This element should not be used to separate paragraphs. Paragraphs are simply wrapped within p tags.
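A short sketch combining the phrase elements with sub, sup and br follows. The sentences and the postal address are invented; note that br is used here for line breaks inside one paragraph, not to separate paragraphs.

```html
<p>Water is written H<sub>2</sub>O, <em>not</em> H2O,
and you should <strong>never</strong> forget that E = mc<sup>2</sup>.</p>
<p>Jane Doe<br />12 Example Street<br />Geneva</p>
```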
Tables
<table>…</table>
Identifies a table. A most simple table has the following structure:
<table border="1">
<tr><th>Food</th><th>Price</th></tr>
<tr><td>Bread</td><td>$2.99</td></tr>
<tr><td>Milk</td><td>$1.40</td></tr>
</table>
It would look like this:
Food | Price |
---|---|
Bread | $2.99 |
Milk | $1.40 |
<tr>…</tr>
Contains a row of cells in a table.
<th>…</th>
A table header cell; contents are conventionally displayed bold and centered. An aural user agent may use a louder voice for these items.
<td>…</td>
A table data cell.
<colgroup>…</colgroup>
Specifies a column group in a table.
<col />
Specifies attributes for an entire column in a table.
<caption>…</caption>
Specifies a caption for a table.
<thead>…</thead>
Specifies the header part of a table. This section may be repeated by the user agent if the table is split across pages (in printing or other paged media).
<tbody>…</tbody>
Specifies the main part of a table.
<tfoot>…</tfoot>
Specifies the footer part of a table. Like <thead>, this section may be repeated by the user agent if the table is split across pages (in printing or other paged media).
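The structural elements above can be combined as in the sketch below, which extends the food table with a caption, a column group, and header, body and footer sections. The caption text, the total row and the class name "price" are assumptions added for illustration. The tfoot element is placed after tbody here, as current HTML expects; HTML 4.01 and XHTML 1.0 instead require tfoot to appear before tbody.

```html
<table border="1">
  <caption>Grocery prices</caption>
  <colgroup>
    <col />
    <col class="price" />
  </colgroup>
  <thead>
    <tr><th>Food</th><th>Price</th></tr>
  </thead>
  <tbody>
    <tr><td>Bread</td><td>$2.99</td></tr>
    <tr><td>Milk</td><td>$1.40</td></tr>
  </tbody>
  <tfoot>
    <tr><td>Total</td><td>$4.39</td></tr>
  </tfoot>
</table>
```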
Forms
To do. In the meantime, see: HTML form on Wikipedia.
Scripts, pictures and other external objects
<script>...</script>
Places a script in the document. Also usable in the head and in block contexts. Typically, the script in the head element would load all libraries, i.e. function definitions, whereas scripts in the body would just define a few lines of function calls. See JavaScript.
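The division of labour described above might look like the following sketch. The function name greet and the id "greeting" are hypothetical. The script in the head only defines a function; the script in the body calls it.

```html
<html>
  <head>
    <title>Script example</title>
    <script type="text/javascript">
      // Definition only: nothing is displayed yet.
      function greet(name) {
        return "Hello, " + name + "!";
      }
    </script>
  </head>
  <body>
    <p id="greeting"></p>
    <script type="text/javascript">
      // Call the function defined in the head and show the result.
      document.getElementById("greeting").textContent = greet("world");
    </script>
  </body>
</html>
```

In practice, library code in the head is usually loaded from an external file through the src attribute of the script element rather than written inline.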
<applet>…</applet>
Embeds a Java applet in the page. Deprecated in favor of <object>, as it could only be used with Java applets, and had accessibility limitations.
<img />
Used to insert an image in the document. The src attribute specifies the image URL. The required alt attribute provides alternative text in case the image cannot be displayed. (Though alt is intended as alternative text, older versions of Microsoft Internet Explorer render it as a tooltip if no title is given.) Below is an example that retrieves a picture from the same server as EduTech Wiki. We also define a width and a height, which lets the browser reserve space for the image before it has loaded and can speed up rendering of the page.
<img alt="Creative Commons badge"
src="/mediawiki/images/Cc-network.png" border="0" width="88" height="31" />
The following example shows how one would include a picture from a Wikimedia website (result not shown):
<img alt="HTML document icon" src="http://upload.wikimedia.org/wikipedia/common/4/4d/Crystal_Clear_mimetype_html.png" />
<map>…</map>
Specifies a client-side image map, not explained here.
<object>…</object>
Includes an object in the page of the type specified by the type attribute. This may be of any MIME type the user agent understands, such as an embedded HTML page, a file to be handled by a plug-in such as Flash, a Java applet, a sound file, etc.
Note: The "embed" tag calls a plug-in handler for the type specified by the type attribute. It is used for embedding Flash files, sound files, etc. It started as a proprietary Netscape extension to HTML; <object> is the W3C standard method in HTML 4 and XHTML 1.0, whereas HTML5 also standardizes embed.
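As a hedged sketch, an object element names its resource with the data attribute and its MIME type with the type attribute. The file name "manual.pdf" is hypothetical. Content placed between the tags is shown when the user agent cannot display the object.

```html
<object data="manual.pdf" type="application/pdf" width="600" height="400">
  <p>Your browser cannot display this file inline.
     <a href="manual.pdf">Download the manual</a> instead.</p>
</object>
```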
<iframe>…</iframe>
Includes a resource (HTML page, picture, etc.). HTML5 removed a number of attributes from the HTML 4.01 element definition; use CSS instead. However, it added the sandbox attribute, which allows extra restrictions to be placed on the content of an iframe. Before that, embedded content could not be restricted in this way, which made iframes a security risk.
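A minimal sketch of a sandboxed iframe, assuming a hypothetical page "widget.html" on the same site. An empty sandbox attribute applies all restrictions (for instance, scripts and form submission are blocked); individual permissions can be given back with tokens such as allow-scripts.

```html
<!-- All restrictions apply -->
<iframe src="widget.html" sandbox="" width="400" height="300">
</iframe>

<!-- Scripts are allowed again, other restrictions stay in place -->
<iframe src="widget.html" sandbox="allow-scripts" width="400" height="300">
</iframe>
```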
Elements for extensions and styling purposes
<div>…</div>
A block-level logical division. A generic element with no semantic meaning used to distinguish a document section, usually for purposes such as presentation or behaviour controlled by stylesheets or DOM calls. The code below is valid
<div style="color:green;font-size:larger;background-color:lightgrey"><h3>Draft section</h3>
<p> .... </p>
</div>
but is better written as:
<div class="draft"><h3>Draft section</h3>
<p> .... </p>
</div>
plus a CSS definition of the "draft" class we just invented, e.g. like this:
.draft {color:green;font-size:larger;background-color:lightgrey}
<span>…</span>
An inline logical division. A generic element with no semantic meaning used to mark a run of inline content (e.g. a few words inside a paragraph), usually for purposes such as presentation or behaviour controlled by stylesheets or DOM calls.
Here is a little example that shows how to use span with some inline CSS styling.
<p> <span style="font-weight:bold;color:green;">This article <i>or</i> section is a stub</span>.
A stub is an entry that did not yet receive substantial attention .....</p>
In a browser, it would look like this:
This article or section is a stub. A stub is an entry that did not yet receive substantial attention .....
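Following the same reasoning as for div, the inline style can be moved into a class. The class name "stub" is invented for this sketch; the CSS rule would go into a stylesheet or into a style element in the head.

```html
<style type="text/css">
  .stub { font-weight: bold; color: green; }
</style>

<p><span class="stub">This article <em>or</em> section is a stub</span>.
A stub is an entry that did not yet receive substantial attention .....</p>
```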
Use of HTML fragments in online environments
Many web 2.0 environments and portalware let users hand-code "HTML boxes". In this case you can enter any "body" elements you are allowed to use. Typically, elements like p, ul, etc. are allowed, whereas elements like script or object may or may not be allowed for security reasons. Heading elements such as h1 may be disallowed for styling reasons.
Often, such environments include a through-the-web editor by default. Most of these tools, however, have an "HTML" button that lets you hand-code HTML.
You will have to find out on a case-by-case basis...
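As an illustration, a fragment pasted into such an "HTML box" typically contains only elements that could appear inside body, without html, head or body tags of its own. Which elements are accepted depends on the environment; the sketch below assumes that p, ul, li, a and em are allowed. The office hours are invented.

```html
<p>Office hours this week:</p>
<ul>
  <li>Monday, 10:00 to 12:00</li>
  <li>Thursday, 14:00 to 16:00</li>
</ul>
<p>Please <a href="http://edutechwiki.unige.ch/en/web_technology_tutorials">read the
tutorials</a> <em>before</em> coming.</p>
```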