WebVTT (Web Video Text Tracks), commonly shortened to VTT, is the standard subtitle format for HTML5 video. Beyond plain timed text, it supports styling, positioning, and metadata.

Format

The file must begin with the WEBVTT header.
Subtitle blocks, known as cues, are separated by blank lines.
Each cue consists of:
1. Cue identifier (Optional): A string used to identify the cue (e.g., intro).
2. Timecode: The start and end times, separated by -->. Standard formats are mm:ss.ttt or hh:mm:ss.ttt, where ttt is the three-digit millisecond value. The WebVTT specification puts a period before the milliseconds; LingoHub accepts both the period (.) and the comma (,) as the separator.
3. Cue settings (Optional): Settings such as align:start size:50% can follow the timecode on the same line.
4. Payload: The subtitle text content.
LingoHub parses NOTE blocks as comments.
LingoHub preserves STYLE blocks during import and export.
The timecode line, including any settings, is used as the segment key (e.g., 00:00:00.000 --> 00:00:02.500 align:start).
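
To make the timecode rules concrete, here is a minimal Python sketch of how a timecode line could be parsed. It accepts both mm:ss and hh:mm:ss forms, either separator before the milliseconds, and optional cue settings. The names TIMESTAMP, CUE_TIMING, to_ms, and parse_timing_line are hypothetical and chosen for illustration; this is not LingoHub's implementation.

```python
import re

# Hypothetical pattern: optional hours, then mm:ss, then "." or "," and 3 digits.
TIMESTAMP = r"(?:(\d{2,}):)?(\d{2}):(\d{2})[.,](\d{3})"
# Start time, "-->", end time, then optional cue settings on the same line.
CUE_TIMING = re.compile(rf"^\s*{TIMESTAMP}\s+-->\s+{TIMESTAMP}(?:\s+(.*))?$")


def to_ms(hours, minutes, seconds, millis):
    """Convert timestamp parts to milliseconds; hours is None for mm:ss.ttt."""
    total_seconds = (int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def parse_timing_line(line):
    """Return (start_ms, end_ms, settings) for a timecode line, else None."""
    match = CUE_TIMING.match(line)
    if match is None:
        return None
    groups = match.groups()
    start = to_ms(*groups[0:4])
    end = to_ms(*groups[4:8])
    # Cue settings are space-separated name:value pairs.
    settings = dict(
        item.split(":", 1) for item in (groups[8] or "").split() if ":" in item
    )
    return start, end, settings


example = parse_timing_line("00:00:00.000 --> 00:00:02.500 align:start size:50%")
```

Here example holds the start and end times in milliseconds plus a dictionary of the cue settings. A line that does not match the pattern, such as a payload line, yields None.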

Example

Additional example files are available on GitHub.

WEBVTT

NOTE
This is a comment block in a VTT file.

intro_slide
00:00:00.000 --> 00:00:10.700
Welcome to the presentation.

main_content
00:00:10.700 --> 00:00:47.600
This is the main payload of the subtitle.
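
A file like the one above can be split into cues with a few lines of Python. The sketch below is an assumption-laden illustration rather than LingoHub's actual parser: read_cues is a hypothetical helper that checks the WEBVTT header, skips NOTE and STYLE blocks, and keys each cue by its timecode line, mirroring the segment-key rule described in the Format section.

```python
import re


def read_cues(text):
    """Map each cue's timecode line (the segment key) to its identifier and payload."""
    normalized = text.replace("\r\n", "\n").strip()
    # Blocks are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", normalized)
    if not blocks[0].startswith("WEBVTT"):
        raise ValueError("file must begin with the WEBVTT header")

    cues = {}
    for block in blocks[1:]:
        lines = block.split("\n")
        # NOTE (comment) and STYLE blocks are not cues.
        if lines[0].split()[0] in ("NOTE", "STYLE"):
            continue
        # The timecode line is the first line containing "-->".
        timing_index = next(
            (i for i, line in enumerate(lines) if "-->" in line), None
        )
        if timing_index is None:
            continue
        # A line before the timecode, if present, is the cue identifier.
        identifier = lines[0].strip() if timing_index > 0 else None
        payload = "\n".join(lines[timing_index + 1:])
        cues[lines[timing_index].strip()] = {"id": identifier, "text": payload}
    return cues


sample = (
    "WEBVTT\n\n"
    "NOTE\nA comment.\n\n"
    "intro_slide\n00:00:00.000 --> 00:00:10.700\nWelcome to the presentation.\n"
)
cues = read_cues(sample)
```

For the sample string, cues contains a single entry keyed by 00:00:00.000 --> 00:00:10.700, with the identifier intro_slide and the payload text. Keeping the key as the unmodified timecode line means any cue settings on that line become part of the key as well.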

Used by

HTML5 <track> element.
Modern web video players such as Video.js and Plyr.
Major streaming platforms.
