Tuesday, April 30, 2013

YouTube: more sophisticated captions

As my regular viewers probably know, I always add closed captions to my videos. Not only do I have an audience split between English-speakers and German-speakers, but I also have a fair sprinkling of hearing-impaired viewers. It’s a lot of work, but it’s worth it.

Hitherto, YouTube has only officially supported two formats: SubRip and SubViewer. Both are good, both are easy to do by hand (they’re simple text files), both work, but both were only supported to the extent of their official specifications: no formatting of any kind.

When I uploaded my latest video, I noticed that YouTube had quietly introduced support for a raft of additional formats.

This is excellent news, particularly for professional broadcasters, who can now use broadcast-quality standards like EIA-608 for NTSC systems and EBU-STL for PAL. This gives them, assuming YouTube has implemented full support, a lot of flexibility regarding formatting, colours and so on.

For those of us stuck somewhere in the middle, for whom broadcast standard captions are a quagmire of technical jiggery-pokery, YouTube has provided at least partial support for some simpler formats that allow a slightly greater degree of flexibility.

One of those is a relatively new standard called WebVTT, which is designed primarily to allow browsers to implement captions and subtitles for HTML5 playback. Browsers don’t yet support this, but will do (we hope) in the future; since YouTube will eventually — even if it takes a few more years — move over to HTML5 video playback, support for WebVTT would seem the logical thing to do.

WebVTT is particularly attractive to me, because it is basically SubRip plus a few extra features; and I’ve been using SubRip ever since I started captioning my videos.

There are a few things YouTube’s implementation of WebVTT won’t do. Many features, notably colour, would normally be implemented by using stylesheet rules, but for the most basic reasons of security, YouTube can’t let you manipulate the site’s stylesheets. But other features implemented in the caption file itself also aren’t supported: position, alignment and size. (However, including the code for these features doesn’t throw up an error.)

What does work for WebVTT is italics, bold and underlining. Not much, but better than nothing, and it does enable you to add a little more expression, or differentiate between two speakers. YouTube also allows you to insert comments (which are not displayed).

I uploaded a test caption file to a video I had on my test account. You’ll see the formatting early on in the video (later in the video I experimented with the other features, which didn’t work). The button to enable captions is at the bottom, near the right, labelled either “CC” or with an icon representing subtitles at the bottom of a TV screen. Go here to watch the video.

If you are already familiar with SubRip, as I was, the changes are minimal:
  • The file begins with the string WEBVTT followed by a blank line.
  • In the timecodes, replace commas with decimal points.
  • Comments go between subtitles: type NOTE (in capitals), followed by your comment; a blank line indicates the end of the comment.
  • Italic, bold and underlined text is indicated with HTML-style <i>, <b> and <u> tags.
For example, here is a subtitle in the original SubRip format:

00:00:08,600 --> 00:00:11,520
The microphone is a Rode Videomic,

And here it is converted to WebVTT, with the words “Rode Videomic” in italics:

00:00:08.600 --> 00:00:11.520
The microphone is a <i>Rode Videomic</i>,

And that, ladies and gentlemen, is pretty much it.
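
If you have a lot of files to convert, the changes listed above can be scripted. What follows is only a minimal sketch in Python, not an official tool: the function name `srt_to_vtt` and the pattern name `TIMING` are my own inventions, and it assumes timecodes in the usual two-digit `HH:MM:SS,mmm` form. It adds the WEBVTT header and swaps the commas for decimal points on timing lines only; everything else, including any SubRip counter lines, is passed through untouched.

```python
import re

# Matches a SubRip timing line such as "00:00:08,600 --> 00:00:11,520".
# Assumption: hours, minutes and seconds are always written with two digits.
TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text):
    """Convert SubRip text to WebVTT (hypothetical helper, for illustration)."""
    out = ["WEBVTT", ""]  # the header string, followed by a blank line
    for line in srt_text.splitlines():
        # Only timing lines are rewritten, so commas in the caption text
        # itself (e.g. "1,234") are left alone.
        out.append(TIMING.sub(r"\1.\2 --> \3.\4", line))
    return "\n".join(out) + "\n"


sample = "00:00:08,600 --> 00:00:11,520\nThe microphone is a Rode Videomic,\n"
print(srt_to_vtt(sample))
```

Italics, bold and underlining still have to be added by hand, since SubRip gives the script no way of knowing which words you want to emphasise.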


  1. If you already have a SubRip file and are trying to figure out how to replace all of the timecode commas with decimal points, a simple regular expression find and replace should do the trick. Use Notepad++ or Geany as your text editor, and do a Find and Replace in regular expression mode.

    Find this: ([0-9]),([0-9])
    Replace with this: \1.\2

    1. Thank you; that's a very useful tip. You'd need to be a bit careful with that if your captions themselves include large numbers; "1,234" would be changed to "1.234". But that shouldn't usually be a problem with most captions.

    2. Just in case, you can also do this stricter "regular expression find and replace", which only matches a comma that directly follows a full timecode:

      Find this: ([0-9])([0-9]):([0-9])([0-9]):([0-9])([0-9]),
      Replace with this: \1\2:\3\4:\5\6.
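
      For anyone who wants to see the difference between the two patterns before running them on a real file, here is a small Python sketch that applies both to the same cue. The names `LOOSE`, `STRICT`, `loose_fix` and `strict_fix` are made up for illustration; the patterns themselves are the ones given in this thread.

```python
import re

# The first pattern from this thread: any comma between two digits.
LOOSE = re.compile(r"([0-9]),([0-9])")

# The stricter pattern: a comma directly after an HH:MM:SS timecode.
STRICT = re.compile(r"([0-9])([0-9]):([0-9])([0-9]):([0-9])([0-9]),")


def loose_fix(text):
    """Replace every digit,digit comma with a decimal point."""
    return LOOSE.sub(r"\1.\2", text)


def strict_fix(text):
    """Replace only the comma that follows an HH:MM:SS timecode."""
    return STRICT.sub(r"\1\2:\3\4:\5\6.", text)


cue = "00:00:08,600 --> 00:00:11,520\nIt cost 1,234 euros\n"
print(loose_fix(cue))   # the caption text is changed to "1.234" as well
print(strict_fix(cue))  # the caption text keeps "1,234"
```

      The stricter pattern would still alter caption text that happens to contain something shaped like a timecode followed by a comma, such as "12:34:56,", so it is worth checking the result either way.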

  2. Informative, thanks. I messed around a bit with caption positioning using a broadcast standard format, and it was not messy at all once I found a subtitle editor that supported such a format. Made an instruction video about it to contribute


  3. And the link to it... :-)