WebVTT

Latest revision as of 08:25, 28 March 2018

Introduction

“The WebVTT format (Web Video Text Tracks) is a format intended for marking up external text track resources. The main use for WebVTT files is captioning video content.” (WebVTT, retr. April 23 2012). A WebVTT file includes a list of so-called cues that a player can use to overlay information on the video (e.g. texts or graphics) and for navigation.

WebVTT can be used in the track element of HTML5 video. One can associate several track files with a single video, e.g. one per language, format or track type. As of April 2012, browsers either do not implement this yet or do so only partially (like Chrome), but some JavaScript-based HTML5 players implement various subsets of either WebVTT or competing standards such as the simple SRT format.

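See also:

  • Timed Text (TT or TTML), an older competing XML-based format that we actually prefer
  • HTML5 audio and video
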
Do not trust details. Need to test this first - Daniel K. Schneider 19:25, 23 April 2012 (CEST)

The format

The WebVTT format is a fairly complex plain text format. It surprisingly doesn't use XML. However, its data part (the cue contents) can include some XML-like markup.

File structure

A WebVTT file has the following structure:

  • First line must be the word "WEBVTT"
  • Second line must be empty
  • Any number of entries, i.e. so-called cues, separated by at least a blank line

In addition:

  • You must use UTF-8 character encoding
  • Files must be served as text/vtt, e.g. add the following instruction to the httpd.conf or .htaccess file of an Apache server
AddType text/vtt .vtt
  • Line breaks can use Windows or Unix-like characters, i.e. \r, \n or \r\n. This basically means that if you see a new line in a text editor, then you are doing fine.
  • In cue text, the characters &, < and > must be replaced by &amp;, &lt; and &gt;
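
The rules above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names escape_cue_text, build_vtt and save_vtt are hypothetical, and the sketch escapes every cue as plain text, so it would also escape markup such as <b> if you passed it in.

```python
def escape_cue_text(text: str) -> str:
    """Escape the three characters that are special inside cue text."""
    # The ampersand is replaced first, otherwise the "&" introduced by
    # the two other replacements would be escaped a second time.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def build_vtt(cues) -> str:
    """Build a WebVTT document from (timing_line, plain_text) pairs."""
    blocks = ["WEBVTT"]  # the first line must be the word WEBVTT
    for timing, text in cues:
        blocks.append(f"{timing}\n{escape_cue_text(text)}")
    # Blocks are separated by one blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


def save_vtt(path: str, cues) -> None:
    """Write the document as UTF-8 with Unix line breaks."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(build_vtt(cues))
```

For instance, save_vtt("subtitles_en.vtt", [("00:00:01.000 --> 00:00:05.000", "Tom & Jerry")]) would write a file with a single cue whose text reads "Tom &amp; Jerry".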

Simple example

This works with Chrome 18, but you will have to enable track support using the chrome://flags URI: search for track, then click to activate. Track file:

WEBVTT

Introduction
00:00:01.000 --> 00:01:10.000
Wikipedia is a great adventure. It may have
its shortcomings, but it is the largest collective
knowledge construction endeavour

Disclaimer
00:01:10.000 --> 00:02:10.000
This is just a track demo using VTT

As you can see, there is a "WEBVTT" header and cues are separated by blank lines. We will explain the cue syntax below.

HTML5 portion (read the HTML5 audio and video article for details):

<video id="movie1" controls preload="metadata">
 <source src="videos/state-of-wikipedia-480x272.mp4"/>
 <source src="videos/state-of-wikipedia-480x272.ogv"/>
 <source src="videos/state-of-wikipedia-480x272.webm"/>
 <track kind="subtitles" label="EN subtitles" src="subtitles_en.vtt" srclang="en" default/>   
 <track kind="subtitles" label="Soustitre en FR" src="subtitles_fr.vtt" srclang="fr"/>
        Your browser doesn't support HTML5. Maybe you should upgrade.
</video>

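Live example files:

  • http://tecfa.unige.ch/guides/html/html5-video/html5-video-track.html
  • http://tecfa.unige.ch/guides/html/html5-video/subtitles_en.vtt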

Simple cues

Cues roughly have the following structure:

(1) An optional header, e.g. a number
A header cannot include the string "-->" or a line break (\r, \n or \r\n)
(2) A line for timing plus optional layout instructions, i.e. so-called cue settings
start_time --> end_time [cue settings]
Times use the hh:mm:ss.mmm or the shorter mm:ss.mmm format. Note the colons ":" between hours, minutes and seconds and the dot "." before the milliseconds.
Examples of timing lines
00:00:01.000 --> 00:01:01.017
00:01.000 --> 01:01.017
00:00:01.000 --> 00:01:01.017 position:50%
(3) One or more lines of text
The text to be displayed. This text can include some HTML-like markup
Voice: <v voice name> defines a voice (the name of the person who speaks)
CSS class: <c.classname>Some text ...</c> This allows for flexible styling. The CSS rules are defined in the HTML page
Bold: <b>Some text ...</b>
Italic: <i>Some text ...</i>
Underline: <u>Some text ...</u>
Ruby annotations: <ruby>base text<rt>annotation</rt></ruby>
(4) A separator line, i.e. a blank line
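
To make the cue structure more concrete, here is a minimal Python sketch that splits one cue block into its parts. The names parse_timestamp and parse_cue are our own (hypothetical) helpers, times are returned as integer milliseconds, and the sketch is far less tolerant than a real player would be.

```python
import re

# Optional hours, then minutes and seconds (00-59) and three millisecond digits.
TIMESTAMP = re.compile(r"(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})")


def parse_timestamp(value: str) -> int:
    """Convert hh:mm:ss.mmm or mm:ss.mmm to milliseconds."""
    match = TIMESTAMP.fullmatch(value)
    if match is None:
        raise ValueError(f"not a WebVTT timestamp: {value!r}")
    hours, minutes, seconds, millis = match.groups()
    total_seconds = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def parse_cue(block: str) -> dict:
    """Split one cue block into header, times, settings and text."""
    lines = block.strip().splitlines()
    header = None
    if lines and "-->" not in lines[0]:
        header = lines.pop(0)  # the optional header line
    if not lines:
        raise ValueError("cue has no timing line")
    fields = lines[0].split()
    if len(fields) < 3 or fields[1] != "-->":
        raise ValueError(f"not a timing line: {lines[0]!r}")
    return {
        "header": header,
        "start": parse_timestamp(fields[0]),
        "end": parse_timestamp(fields[2]),
        "settings": fields[3:],  # e.g. ["position:50%"]
        "text": "\n".join(lines[1:]),
    }
```

A timing line written with dots only, such as 00.00.01.000, is rejected with a ValueError, which is a quick way to catch this common typing mistake.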

Simple cue example

The following simple example demonstrates the use of a header, a timing line and one or two text lines.

WEBVTT

1
00:00:01.000 --> 00:00:20.000
<v DKS>This is a <b>very short</b> video that is not about WEBVTT 

2
00:00:21.000 --> 00:00:40.000 
You see a typical example
of a robot moving forward

More cue settings

On the timing line one can define position, alignment and size.

Line position

line:10 ... means 10 lines down
line:50% .... means 50% down

Horizontal position

position:0% ... means left
position:80% ... means far to the right

Horizontal alignment

align:start ... means left-aligned
align:middle ... centered (the current specification calls this value center)
align:end ... right aligned

Text width

size:50% ... means taking up 50% of the video width

Vertical text

vertical:rl ... grows to the left
vertical:lr ... grows to the right

Example

WEBVTT

1
00:00.000 --> 01:00.000 vertical:rl
Pile of characters
 
2
00:01.000 --> 00:02.000 size:50%
Bon soir !
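
As a rough illustration, the settings part of a timing line can be turned into a dictionary as follows. The function name parse_settings is hypothetical, and the set of known names is simply the list given in this section; we assume that anything else should be ignored rather than reported as an error.

```python
# Setting names described in this section (an assumption of this sketch,
# not a complete list from the specification).
KNOWN_SETTINGS = {"line", "position", "align", "size", "vertical"}


def parse_settings(settings: str) -> dict:
    """Turn 'vertical:rl size:50%' into {'vertical': 'rl', 'size': '50%'}."""
    result = {}
    for item in settings.split():
        name, separator, value = item.partition(":")
        # Skip malformed items and names that are not listed above.
        if separator and value and name in KNOWN_SETTINGS:
            result[name] = value
    return result
```

With this sketch, parse_settings("line:10 align:start") gives a dictionary with the two keys line and align, and an unknown item such as foo:bar is silently dropped.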

Karaoke style cues

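You can also use karaoke style cues: timestamps placed inside the cue text mark the moment at which the text that follows them becomes active, so that a player can reveal or highlight the text progressively.

(Source: HTML5 doctor)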

WEBVTT

1
00:00:01.000 --> 00:00:10.000
Never gonna give you up <00:00:01.000> 
Never gonna let you down <00:00:05.000> 
Never gonna run around and desert you
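
The following Python sketch shows one way a player could read such a cue: it cuts the cue text at the inline timestamps and pairs each portion with the time at which it starts. The name split_karaoke is our own, and we assume that the text before the first inline timestamp starts with the cue itself.

```python
import re

# An inline timestamp such as <00:00:05.000>; the time itself is captured.
INLINE_TIME = re.compile(r"<((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})>")


def split_karaoke(text: str, cue_start: str) -> list:
    """Return (timestamp, text) pairs for a karaoke style cue."""
    # With one capturing group, re.split alternates text and timestamps:
    # [text, time, text, time, text, ...]
    parts = INLINE_TIME.split(text)
    segments = [(cue_start, parts[0].strip())]
    for index in range(1, len(parts), 2):
        segments.append((parts[index], parts[index + 1].strip()))
    # Drop portions that contain no text at all.
    return [segment for segment in segments if segment[1]]
```

Applied to the cue above with the start time 00:00:01.000, the function returns three portions, the last one ("Never gonna run around and desert you") starting at 00:00:05.000.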

Multiple line cues

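Early examples, like the one below, wrap each line in HTML p tags. The current specification does not define a p tag; a cue can simply contain several lines of text, as in the simple examples above.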

WEBVTT

Introduction
00:00:10.000 --> 00:01:10.000
<p>Wikipedia is a great adventure</p>
<p>It may have shortcomings, but it remains the largest collective knowledge construction endeavour</p>
 
Disclaimer
00:01:10.000 --> 00:02:10.000
<p>This is just a track demo using VTT</p>

JSON cues

If we understood right, anything can go into the content lines of a cue, provided that some interpreter can read it. The following shows a JSON data structure that could be interpreted by some HTML5 player.

WEBVTT

Wikipedia
00:01:15.200 --> 00:02:18.800
{
"title": "State of Wikipedia",
"description": "Jimmy Wales talking ...",
"src": "http://upload.wikimedia.org/wikipedia/en/thumb/8/80/Wikipedia-logo-v2.svg/120px-Wikipedia-logo-v2.svg.png",
"href": "http://en.wikipedia.org/wiki/Wikipedia"
}
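
To illustrate what "some interpreter" could look like, here is a small Python sketch that reads the JSON payload of one cue. The function name parse_json_cue is hypothetical. Note that the JSON must not contain blank lines, since a blank line ends the cue.

```python
import json


def parse_json_cue(block: str) -> dict:
    """Return the JSON payload of one cue block as a dictionary."""
    lines = block.strip().splitlines()
    for index, line in enumerate(lines):
        if "-->" in line:  # the timing line; the payload starts after it
            break
    else:
        raise ValueError("no timing line found")
    payload = "\n".join(lines[index + 1:])
    return json.loads(payload)  # raises an error if the payload is not valid JSON
```

A player could then use the resulting dictionary, e.g. its title and href entries, to display a clickable overlay while the cue is active.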

Adding CSS

Once you have managed to add your subtitles, you might wish to personalise them a little with CSS.

Here is how to do it. Look carefully at the VTT file below: the text of the first cue is wrapped in <c.vIntro> ... </c>, which attaches the CSS class vIntro to it.

WEBVTT

Introduction
00:00:01.000 --> 00:01:10.000
<c.vIntro>Wikipedia is a great adventure. It may have
its shortcomings, but it is the largest collective
knowledge construction endeavour</c>

Disclaimer
00:01:10.000 --> 00:02:10.000
This is just a track demo using VTT

Now, in the CSS file of your web page, add this:

.vIntro {
        color: coral;
        text-transform: uppercase;
        font-family: "Helvetica Neue";
        font-weight: lighter;
        font-size: 18px;
        text-decoration: underline;        
}

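Now you are able to use CSS for your VTT files. Note that browsers with native track rendering match cue classes through the ::cue pseudo-element, e.g. video::cue(.vIntro), and only honour a limited set of properties there; a plain class selector like the one above only takes effect in players that render cues as ordinary HTML elements.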

Software

Browser support

As of April 2012:

  • Chrome 18 supports a subset of WebVTT, but you will have to enable it.
In the Chrome browser, enter the configuration URI:
chrome://flags
Activate track support (search for "track" in the page) and click.
  • Recent Internet Explorer 10 (beta) supports both WebVTT and Timed Text.
  • Read Video: timed text tracks (http://msdn.microsoft.com/en-us/library/ie/hh673566%28v=vs.85%29.aspx) (sometimes, these guys are ahead ...)

If your browser doesn't support VTT, some HTML5 video players do provide support. See HTML5 audio and video for more information about these. There are probably over a dozen good players. Below we show a live example using such a player.

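Validators

  • WebVTT validator (copy/paste): http://quuz.org/webvtt/

Editing tools

  • HTML5 Video Caption Maker (Microsoft), should work in any modern browser: http://ie.microsoft.com/testdrive/Graphics/CaptionMaker/
  • Read "Create WebVTT or TTML files with Caption Maker": http://msdn.microsoft.com/en-us/library/ie/jj152136%28v=vs.85%29.aspx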

Examples

Simple example 1 (taken from the draft specification)

WEBVTT

00:11.000 --> 00:13.000
<v Roger Bingham>We are in New York City
 
00:13.000 --> 00:16.000
<v Roger Bingham>We're actually at the Lucern Hotel, just down the street
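
Voice tags like the ones in this example are easy to process. As a minimal sketch (the function name speaker is our own), the following Python code extracts the name of the person who speaks from a line of cue text:

```python
import re

# <v Name> or <v.classname Name>; the name runs up to the closing ">".
VOICE = re.compile(r"<v(?:\.[^\s>]+)?\s+([^>]+)>")


def speaker(cue_text: str):
    """Return the voice name of the first <v ...> tag, or None."""
    match = VOICE.search(cue_text)
    return match.group(1).strip() if match else None
```

For the first cue above, speaker(...) returns "Roger Bingham"; for cue text without a voice tag it returns None.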

Simple live example using the LeanBack Player

<div class="leanback-player-video">
  <video controls="controls" preload="metadata" poster="wikipedialogo.png" width="480" height="272">
   <source src="videos/state-of-wikipedia-480x272.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"' />
   <source src="videos/state-of-wikipedia-480x272.webm" type='video/webm; codecs="vp8, vorbis"' />
   <source src="videos/state-of-wikipedia-480x272.ogv" type='video/ogg; codecs="theora, vorbis"' />
	
   <track enabled="true" kind="subtitles" label="EN"
          src="subtitles_en.srt" srclang="en" type="text/x-srt"/>
   <track enabled="true" kind="subtitles" label="EN VTT"
          src="subtitles_en.vtt" srclang="en" type="text/vtt"/>
   <track enabled="true" kind="subtitles" label="FR" 
          src="subtitles_fr.vtt" srclang="fr" type="text/vtt"/>
   <object class="leanback-player-flash-fallback" width="640" height="360"
           type="application/x-shockwave-flash"
           data="http://releases.flowplayer.org/swf/flowplayer.swf">
      <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer.swf" />
      <param name="allowFullScreen" value="true" />
      <param name="wmode" value="opaque" />
      <param name="bgcolor" value="#000000" />
      <param name="flashVars" 
             value="config={'playlist':['wikipedialogo.png', 
		{'url':'videos/state-of-wikipedia-480x272.mp4','autoPlay':false,'autobuffering':false}]}" />
   </object>
   <div class="leanback-player-html-fallback" style="width: 640px; height: 360px;">
     <img src="wikipedialogo.png" width="640" height="360" alt="Poster Image" 
          title="No HTML5-Video playback capabilities found. Please download the video(s) below." />
    <div>
<strong>Download Video:</strong>
   <a href="videos/state-of-wikipedia-480x272.mp4">.mp4</a>
   <a href="videos/state-of-wikipedia-480x272.webm">.webm</a>
   <a href="videos/state-of-wikipedia-480x272.ogv">.ogv</a>
 </div>
</div>
</div>

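Live example file:

  • http://tecfa.unige.ch/guides/html/html5-video/html5-video-track-leanbackplayer.html

Track files:

  • http://tecfa.unige.ch/guides/html/html5-video/subtitles_en.vtt
  • http://tecfa.unige.ch/guides/html/html5-video/subtitles_en.srt
  • http://tecfa.unige.ch/guides/html/html5-video/subtitles_fr.vtt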

As you can see, this player relies on extra markup:

  • A wrapping div of class="leanback-player-video"
  • Two extra track attributes that are not part of HTML5 but harmless: enabled="true" and type="....". However, we did not test what happens if these are omitted. Maybe enabled is not needed since this can be configured in the JavaScript, and type may only be needed for local testing.
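
For local testing, one option is to serve the example folder with a small web server that sends .vtt files as text/vtt. The sketch below uses the http.server module of the Python standard library; the names VttHandler and make_server, the folder and the port are our own assumptions, and we did not test it with the LeanBack Player.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class VttHandler(SimpleHTTPRequestHandler):
    """Static file handler that serves .vtt files as text/vtt."""

    # Copy the inherited table so that the base class is left untouched,
    # then add the .vtt extension in case it is not known by default.
    extensions_map = {**SimpleHTTPRequestHandler.extensions_map, ".vtt": "text/vtt"}


def make_server(directory: str = ".", port: int = 8000) -> ThreadingHTTPServer:
    """Create (but do not start) a local server for the given folder."""
    handler = partial(VttHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted with Ctrl-C):
#   make_server("html5-video").serve_forever()
```

The page is then available under http://127.0.0.1:8000/ and the track files are delivered with the MIME type that the format requires.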

Links

Specification
  • WebVTT, Draft Community Group Report, 18 October 2016 (retrieved Oct. 2016): http://dev.w3.org/html5/webvtt/
Organizations
Introductions
Discussion
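General timed track
  • Timed track formats (WhatWG wiki): http://wiki.whatwg.org/wiki/Timed_track_formats
  • Timed tracks (WhatWG wiki): http://wiki.whatwg.org/wiki/Timed_tracks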
Other Tools
  • Amara (UniversalSubtitles.org, http://www.universalsubtitles.org): a large and powerful platform that offers an easy way to caption and translate any video. More info in this video: http://vimeo.com/39734142