As video streaming becomes ubiquitous, subtitles need to adapt to web platforms. WebVTT (Web Video Text Tracks) builds on SRT's simplicity while adding features specifically designed for web delivery.
A Brief History
When HTML5 video emerged, it became clear that subtitles needed to evolve. Browser vendors and streaming platforms required a format that could handle modern web needs: precise styling, multiple languages, and accessibility features. WebVTT grew out of these requirements and is now specified by the W3C as the text track format for HTML5 video.
From SRT to WebVTT
If you're familiar with SRT files, WebVTT will feel natural. Let's look at the same subtitle in both formats:
SRT:
1
00:00:01,000 --> 00:00:04,000

WebVTT:
WEBVTT

00:00:01.000 --> 00:00:04.000
The similarities are clear, but WebVTT introduces some key differences:
- Required "WEBVTT" header
- Cue numbers before timestamps are optional (but still recommended)
- Periods instead of commas in timestamps (01:00.000 instead of 01:00,000)
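The differences above are small enough that a plain SRT file can be converted mechanically. The following is a minimal sketch, not a complete converter: srt_to_vtt is a hypothetical helper name, and it assumes well-formed SRT timing lines with zero-padded hours and no SRT-specific formatting tags that need translating.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
SRT_TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})(.*)$"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT text to WebVTT (hypothetical helper, simple cues only)."""
    out = ["WEBVTT", ""]  # required header, followed by a blank line
    cleaned = srt_text.replace("\r\n", "\n").lstrip("\ufeff")
    for line in cleaned.split("\n"):
        match = SRT_TIMING.match(line)
        if match:
            # Only timing lines are rewritten, so commas in cue text survive.
            line = f"{match[1]}.{match[2]} --> {match[3]}.{match[4]}{match[5]}"
        out.append(line)
    return "\n".join(out).rstrip("\n") + "\n"


if __name__ == "__main__":
    print(srt_to_vtt("1\n00:00:01,000 --> 00:00:04,000\nHello, world\n"))
```

Cue numbers are kept as they are, since WebVTT accepts them as optional cue identifiers.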
WebVTT also introduces web-specific styling options using limited CSS (Cascading Style Sheets) syntax, along with support for regions and positioning. We'll get into this in the next section.
CSS-Style Formatting
WebVTT's power comes from its CSS-like styling system. Using STYLE blocks, you can define how different elements appear:
WEBVTT
STYLE
::cue {
color: white;
background-color: rgba(0, 0, 0, 0.7);
font-family: Arial, sans-serif;
}
::cue(b) {
color: yellow;
font-weight: bold;
}
::cue(.important) {
color: red;
font-weight: bold;
}
::cue(v[voice="narrator"]) {
color: cyan;
font-style: italic;
}
Styling Elements
Different selectors target specific elements:
WEBVTT
STYLE
::cue(b) {
color: yellow;
}
::cue(i) {
font-style: italic;
color: cyan;
}
00:00:01.000 --> 00:00:04.000
Class-Based Styling
You can define custom classes for different types of text:
WEBVTT
STYLE
::cue(.important) {
color: red;
font-weight: bold;
}
::cue(.whisper) {
color: gray;
font-style: italic;
}
00:00:01.000 --> 00:00:05.000
<c.important>Critical announcement!</c>
00:00:06.000 --> 00:00:10.000
Voice-Based Styling
Speakers can have distinct styles:
WEBVTT
STYLE
::cue(v[voice="narrator"]) {
color: yellow;
font-family: "Times New Roman", serif;
}
::cue(v[voice="character"]) {
color: cyan;
font-family: Arial, sans-serif;
}
00:00:01.000 --> 00:00:04.000
<v narrator>The story begins...</v>
00:00:04.000 --> 00:00:08.000
Language-Specific Styling
Different languages can have distinct appearances:
WEBVTT
STYLE
::cue(:lang(en)) {
color: white;
font-family: Arial, sans-serif;
}
::cue(:lang(ja)) {
color: yellow;
font-family: "Noto Sans JP", sans-serif;
}
00:00:01.000 --> 00:00:04.000
<lang en>Welcome to the tutorial</lang>
00:00:04.000 --> 00:00:08.000
Styling Limitations
While WebVTT's styling system is powerful, it has some important restrictions:
- Cannot load external resources
- Limited to text-related CSS properties
- Styling applies to entire cue boxes
- No animation or transition effects
Anatomy of a WebVTT File
Now that we understand WebVTT's styling capabilities, let's look at how a complete file comes together:
WEBVTT
Kind: captions
Language: en
STYLE
::cue {
color: white;
background-color: rgba(0, 0, 0, 0.7);
}
NOTE
This is a comment - it won't be displayed
1
00:00:01.000 --> 00:00:04.000
In today's video, we'll explore
the latest web technologies.
2
00:00:04.500 --> 00:00:08.000 align:end line:90%
Subscribe for more tutorials!
3
00:00:08.100 --> 00:00:12.000
Each file contains:
- The WEBVTT header (required)
- Optional header metadata (Kind, Language), which some platforms write after the header and which browsers ignore
- STYLE blocks for formatting
- Cue blocks with timing and text
- Optional positioning attributes
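To make that structure concrete, here is a rough sketch of splitting a file into its blocks. It assumes blocks are separated by blank lines, as they are in a real WebVTT file, and split_blocks is a hypothetical helper name rather than part of any library.

```python
import re


def split_blocks(vtt_text: str) -> list[tuple[str, list[str]]]:
    """Split WebVTT text into (kind, lines) blocks separated by blank lines."""
    text = vtt_text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    blocks = []
    for index, chunk in enumerate(re.split(r"\n{2,}", text.strip("\n"))):
        lines = chunk.split("\n")
        first = lines[0]
        if index == 0:
            # The first block must be the header, plus any metadata lines.
            kind = "header" if first.startswith("WEBVTT") else "invalid"
        elif any("-->" in line for line in lines[:2]):
            # A timing line is first, or second after an optional identifier.
            kind = "cue"
        elif first.startswith("STYLE"):
            kind = "style"
        elif first.startswith("NOTE"):
            kind = "note"
        elif first.startswith("REGION"):
            kind = "region"
        else:
            kind = "unknown"
        blocks.append((kind, lines))
    return blocks


if __name__ == "__main__":
    sample = (
        "WEBVTT\nKind: captions\nLanguage: en\n\n"
        "STYLE\n::cue {\ncolor: white;\n}\n\n"
        "NOTE\nThis is a comment\n\n"
        "1\n00:00:01.000 --> 00:00:04.000\nIn today's video\n"
    )
    for kind, lines in split_blocks(sample):
        print(kind, lines[0])
```

A real parser has to handle more edge cases, such as lines that contain only whitespace, but the block-by-block shape is the same.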
Positioning and Layout
Beyond styling, WebVTT offers precise control over subtitle positioning. Unlike traditional formats, WebVTT uses a web-native positioning system:
00:00:04.000 --> 00:00:08.000 align:end position:90%
Right-aligned subtitle
00:00:08.000 --> 00:00:12.000 line:10%
Subtitle near the top
00:00:12.000 --> 00:00:16.000 size:40%
Common positioning properties:
- align: Start, center, or end alignment
- line: Vertical position (percentage or line number)
- position: Horizontal position (percentage)
- size: Width of the text box
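These settings are simply name:value pairs appended to the timing line, which makes them easy to read programmatically. A small sketch, assuming a well-formed timing line (parse_timing_line is a hypothetical name):

```python
def parse_timing_line(line: str) -> tuple[str, str, dict[str, str]]:
    """Split a cue timing line into start, end, and a dict of cue settings."""
    start, rest = line.split("-->", 1)
    parts = rest.split()
    end, raw_settings = parts[0], parts[1:]
    settings = {}
    for item in raw_settings:
        name, sep, value = item.partition(":")
        if sep and value:  # skip anything that is not name:value
            settings[name] = value
    return start.strip(), end, settings


if __name__ == "__main__":
    print(parse_timing_line("00:00:04.000 --> 00:00:08.000 align:end position:90%"))
```

The values are left as strings here; checking that they are legal is a separate step.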
Voice and Speaker Support
For content with multiple speakers, WebVTT provides clear identification through voice tags, which can be styled as we saw earlier:
STYLE
::cue(v[voice="host"]) {
color: yellow;
}
::cue(v[voice="guest"]) {
color: cyan;
}
00:00:01.000 --> 00:00:04.000
<v host>Welcome to the show!
00:00:04.000 --> 00:00:08.000
This feature is particularly valuable for interviews, panel discussions, and educational materials. It also helps with accessibility requirements by making speaker changes clear to screen readers.
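If you need a list of who speaks in a cue, for example to check that every speaker has a matching style rule, the voice tags can be pulled out with a regular expression. This is a sketch that assumes simple, well-formed tags; speakers is a hypothetical helper name.

```python
import re

# "<v", optional ".class" names, whitespace, then the speaker name up to ">".
VOICE_TAG = re.compile(r"<v(?:\.[^\s>]+)*[ \t]+([^>]+)>")


def speakers(cue_text: str) -> list[str]:
    """Return the speaker names from the voice tags in a cue, in order."""
    return [name.strip() for name in VOICE_TAG.findall(cue_text)]


if __name__ == "__main__":
    print(speakers("<v host>Welcome to the show!"))
    print(speakers("<v.loud Mary Ann>Hi</v> <v guest>Hello"))
```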
Working with WebVTT Files
While WebVTT offers powerful styling and positioning features, keeping subtitles simple often works best. Follow these guidelines for reliable results:
Improving Readability
The same principles that work for SRT apply to WebVTT:
- Two lines maximum per subtitle
- Around 40 characters per line
- 20-25 characters per second
- Natural line breaks
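The first three guidelines are easy to check automatically. The sketch below uses the limits from the list above as defaults; readability_issues is a hypothetical helper, and it counts markup tags as characters, so strip those first if your cues use them.

```python
def readability_issues(text: str, start: float, end: float,
                       max_lines: int = 2, max_chars: int = 40,
                       max_cps: float = 25.0) -> list[str]:
    """Check one cue (times in seconds) against simple readability limits."""
    issues = []
    lines = text.split("\n")
    if len(lines) > max_lines:
        issues.append(f"{len(lines)} lines (max {max_lines})")
    for line in lines:
        if len(line) > max_chars:
            issues.append(f"line has {len(line)} characters (max {max_chars})")
    duration = end - start
    chars = sum(len(line) for line in lines)
    if duration <= 0:
        issues.append("cue has no duration")
    elif chars / duration > max_cps:
        issues.append(f"{chars / duration:.1f} characters per second (max {max_cps:g})")
    return issues


if __name__ == "__main__":
    cue = "In today's video, we'll explore\nthe latest web technologies."
    print(readability_issues(cue, 1.0, 4.0))
```

Natural line breaks still need a human eye; a script can only flag the mechanical limits.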
Technical Recommendations
For robust WebVTT files:
- Always use UTF-8 encoding
- Test positioning on different screen sizes
- Verify speaker labels work in your player
- Keep styling consistent throughout
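The encoding recommendation can be verified before a file ever reaches a player. Here is a minimal sketch that checks raw bytes for valid UTF-8 and for the required header; check_encoding_and_header is a hypothetical name, and the remaining recommendations still need testing in a real player.

```python
def check_encoding_and_header(data: bytes) -> list[str]:
    """Return a list of problems with a WebVTT file's encoding or header."""
    try:
        text = data.decode("utf-8-sig")  # also strips an optional BOM
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc.reason} at byte {exc.start}"]
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    header_ok = first_line == "WEBVTT" or first_line.startswith(("WEBVTT ", "WEBVTT\t"))
    return [] if header_ok else ["first line must be WEBVTT"]


if __name__ == "__main__":
    print(check_encoding_and_header(b"WEBVTT\n\n00:00:01.000 --> 00:00:04.000\nHi\n"))
    print(check_encoding_and_header("WEBVTT".encode("utf-16")))
```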
Platform Support
WebVTT enjoys strong support across modern platforms, but capabilities vary.
Most players reliably support:
- Basic subtitle display
- Simple positioning
- Speaker identification
- Standard timing
However, test carefully when using:
- Complex positioning
- Custom styling
- Regions
- Advanced features
These features depend on layout and styling support that has traditionally been implemented only in web browsers, so players outside the browser often handle them poorly or not at all.
Common Use Cases
Video streaming platforms have embraced WebVTT for its reliability and web-native features. The format particularly shines in online learning, where clear speaker identification and precise timing help viewers follow along.
Accessibility is another key strength. Screen readers handle WebVTT well, and the format's support for semantic markup helps create more inclusive content. The combination of CSS-like styling and semantic structure makes it possible to create subtitles that are both visually appealing and accessible.
Tools and Validation
While any text editor can handle WebVTT files, specialized tools make creation and testing easier:
Professional subtitle editors include:
- Aegisub: Supports WebVTT export
- Subtitle Edit: Strong WebVTT support
- Caption Maker: Web-focused editor
Common Mistakes
Here are some typical WebVTT-specific issues to watch out for:
Incorrect STYLE block placement
This example demonstrates how a STYLE block may be placed incorrectly. These blocks must always come before any cues (shown text) in the subtitle.
1
00:00:01.000 --> 00:00:04.000
First subtitle
STYLE
::cue {
color: red;
}
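A misplaced STYLE block is easy to catch with a script. The sketch below reports any STYLE block that starts after the first cue; misplaced_style_blocks is a hypothetical helper, and it assumes blocks are separated by blank lines.

```python
def misplaced_style_blocks(vtt_text: str) -> list[int]:
    """Return 1-based line numbers of STYLE blocks that appear after a cue."""
    seen_cue = False
    prev_blank = True
    misplaced = []
    for number, line in enumerate(vtt_text.splitlines(), start=1):
        if "-->" in line:
            seen_cue = True  # timing lines are the only place "-->" may appear
        elif line.strip() == "STYLE" and prev_blank and seen_cue:
            misplaced.append(number)
        prev_blank = line.strip() == ""
    return misplaced


if __name__ == "__main__":
    bad = (
        "WEBVTT\n\n1\n00:00:01.000 --> 00:00:04.000\nFirst subtitle\n\n"
        "STYLE\n::cue {\ncolor: red;\n}\n"
    )
    print(misplaced_style_blocks(bad))
```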
Invalid CSS syntax
This example demonstrates a common mistake when writing CSS syntax - a missing semicolon. For more information on the specific syntax of CSS, W3Schools provides many great articles on the topic.
WEBVTT
STYLE
::cue {
color: red
font-weight: bold;
}
Mixing class and voice tags incorrectly
This example demonstrates invalid use of XML-like tags for class and voice (speaker labeling).
- Using v.important is invalid here and should be c.important: v marks a voice (speaker) and requires a speaker name, while c is the generic class span.
WEBVTT
STYLE
::cue(.important) {
color: red;
}
00:00:01.000 --> 00:00:04.000
<v.important>Wrong syntax</v>
00:00:01.000 --> 00:00:04.000
Invalid positioning values
This example demonstrates an invalid positioning value as well as an invalid alignment value.
- position is set to 101%, which is invalid because percentages must be between 0 and 100.
- align is set to middle, when it should be center.
00:00:01.000 --> 00:00:04.000 position:101%
First subtitle
00:00:04.000 --> 00:00:08.000 align:middle
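Both mistakes can be caught with a few lines of validation. This sketch covers only position, size, and align, and it takes the settings as a dict of strings; setting_errors is a hypothetical helper name, and line values are not checked.

```python
import re

# Alignment keywords accepted by the current WebVTT specification.
ALIGN_VALUES = {"start", "center", "end", "left", "right"}
PERCENT = re.compile(r"^(\d+(?:\.\d+)?)%$")


def setting_errors(settings: dict[str, str]) -> list[str]:
    """Return human-readable errors for position, size, and align settings."""
    errors = []
    for name in ("position", "size"):
        if name not in settings:
            continue
        # position may carry an extra alignment part, e.g. "50%,line-left".
        raw = settings[name].split(",", 1)[0]
        match = PERCENT.match(raw)
        if not match or float(match[1]) > 100:
            errors.append(f"{name}:{settings[name]} is not a percentage from 0% to 100%")
    align = settings.get("align")
    if align is not None and align not in ALIGN_VALUES:
        errors.append(f"align:{align} is not one of {sorted(ALIGN_VALUES)}")
    return errors


if __name__ == "__main__":
    print(setting_errors({"position": "101%"}))
    print(setting_errors({"align": "middle"}))
```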
What's Next?
Now that you understand WebVTT's capabilities, from its CSS-like styling system to positioning controls, you'll want to explore the tools that can create and edit these files efficiently. In our next article, we'll look at subtitle editors that support modern formats like WebVTT.
Time to put your web subtitles to work!