65ms Time to First Frame: How We Built the Fastest HLS Delivery
The video below starts playing in ~65ms — roughly two video frames from instant — under any network conditions. This post explains how we got there.
The Problem: HLS Has Too Many Round Trips
Standard HLS playback requires a cascade of sequential network requests before the first frame can render. Each one blocks the next:
- Master playlist — browser fetches `master.m3u8`
- Variant playlist — player reads master, fetches `1080p.m3u8`
- Init segment — player reads variant, fetches `1080p_init.mp4`
- First media segment — player fetches `1080p_0000.m4s`
That's 4 sequential round trips before the decoder has anything to work with. On a connection with 50ms RTT, that's 200ms of pure network latency before a single byte of video data reaches the decoder — and that's before download time, JS initialization, or decode time.
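The pure-latency floor of this waterfall is easy to model — each request must complete before the next can start, so RTTs add up linearly. A back-of-the-envelope sketch using the request count and RTT values from the text above (download, parse, and decode time ignored):

```javascript
// Minimum network latency of a sequential request waterfall: with no
// pipelining, each round trip blocks the next, so RTTs sum linearly.
function waterfallLatencyMs(sequentialRequests, rttMs) {
  return sequentialRequests * rttMs;
}

// Standard HLS: master -> variant -> init -> first segment = 4 round trips.
console.log(waterfallLatencyMs(4, 50));  // 200ms on a 50ms-RTT link
console.log(waterfallLatencyMs(4, 300)); // 1200ms on a 3G-like 300ms-RTT link
```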
Here's what baseline HLS looks like across network conditions:
| Network | Baseline TTFF |
|---|---|
| Real (fast connection) | 859ms |
| Fast 4G (12 Mbps, 50ms) | 1,352ms |
| Slow 4G (4 Mbps, 100ms) | 2,622ms |
| 3G (1.5 Mbps, 300ms) | 6,123ms |
6 seconds on 3G. Let's fix each bottleneck one at a time.
Optimization 1: Turbo Mode — Inline Playlists as Data URIs
Two of the four round trips (variant playlist + init segment) can be eliminated entirely. We base64-encode every variant playlist and init segment directly into the master playlist as `data:` URIs:

```
#EXTM3U
#EXT-X-VERSION:7
#EXT-X-INDEPENDENT-SEGMENTS
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",...,URI="data:application/vnd.apple.mpegurl;base64,I0VYVE0zVQ..."
#EXT-X-STREAM-INF:BANDWIDTH=3820000,RESOLUTION=2048x1080,AUDIO="audio"
data:application/vnd.apple.mpegurl;base64,I0VYVE0zVQ...
```

When hls.js parses this master, it already has the full variant playlist and init segment — no additional fetches needed. Two round trips gone.
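Building the turbo master is mechanical: take each variant-playlist reference, base64-encode the variant's contents, and swap the filename for a `data:` URI. A minimal sketch, not the repo's actual build script — it handles only bare `.m3u8` lines, not `URI="..."` attributes, and takes the variant contents as a map rather than reading from disk:

```javascript
// Replace each variant-playlist reference in a master playlist with a
// data: URI carrying the variant's own (already init-inlined) text.
// `variants` maps filename -> playlist text; a real build step would
// read these from the encoder's output directory.
function buildTurboMaster(masterText, variants) {
  return masterText.replace(/^(?!#)(\S+\.m3u8)$/gm, (line) => {
    const text = variants[line];
    if (text === undefined) return line; // leave unknown references alone
    const b64 = Buffer.from(text).toString("base64");
    return `data:application/vnd.apple.mpegurl;base64,${b64}`;
  });
}

const master = [
  "#EXTM3U",
  "#EXT-X-STREAM-INF:BANDWIDTH=3820000,RESOLUTION=2048x1080",
  "1080p.m3u8",
].join("\n");

const turbo = buildTurboMaster(master, { "1080p.m3u8": "#EXTM3U\n..." });
// The 1080p.m3u8 line is now a data: URI that decodes to the variant text.
```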
The init segment gets inlined inside the variant playlist's `#EXT-X-MAP` tag:

```
#EXT-X-MAP:URI="data:video/mp4;base64,AAAAHGZ0eXBpc28..."
```

Server-side code that builds this:
```js
import { readFileSync } from "node:fs";
import { join } from "node:path";

function inlineInitSegment(playlist, dir, initFilename) {
  const initB64 = readFileSync(join(dir, initFilename)).toString("base64");
  const dataUri = `data:video/mp4;base64,${initB64}`;
  // Swap the #EXT-X-MAP file reference for the inlined data: URI
  return playlist.replace(/URI="[^"]*"/g, `URI="${dataUri}"`);
}
```

| Network | Before | After turbo | Change |
|---|---|---|---|
| Real | 859ms | 403ms | -53% |
| 3G | 6,123ms | 6,006ms | -2% |
Big improvement on fast connections (2 fewer RTTs), marginal on slow ones (download time dominates).
Optimization 2: Inline the Master in HTML
The master playlist itself is still a network fetch. We base64-encode it into
the HTML page as a JavaScript variable. hls.js loads it from a
blob: URL instead of fetching over the network:
```html
<script>
  const MASTER_B64 = "I0VYVE0zVQ..."; // turbo master, inlined at build time
  const blob = new Blob([atob(MASTER_B64)], {
    type: "application/vnd.apple.mpegurl"
  });
  hls.loadSource(URL.createObjectURL(blob));
</script>
```

Now the player has the full manifest (master + all variants + all init segments) the instant JavaScript executes. Zero network fetches for manifest data.
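At build time this is a simple string substitution: encode the turbo master and splice it into the page template. A sketch, assuming a `__MASTER_B64__` placeholder in the template (the placeholder name is ours, not the repo's):

```javascript
// Base64-encode the turbo master and inject it into the HTML template,
// so the player boots with zero network fetches for manifest data.
function injectMaster(htmlTemplate, masterText) {
  const b64 = Buffer.from(masterText).toString("base64");
  return htmlTemplate.replace("__MASTER_B64__", b64);
}

const html = injectMaster(
  '<script>const MASTER_B64 = "__MASTER_B64__";</script>',
  "#EXTM3U\n#EXT-X-VERSION:7"
);
// html now carries the encoded manifest inline in the script tag.
```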
| Network | Before | After inline | Change |
|---|---|---|---|
| Real | 403ms | 254ms | -37% |
| 3G | 6,006ms | 5,635ms | -6% |
Optimization 3: Preload the First Segment
The only network fetch left is the first media segment. We can overlap its
download with JavaScript initialization using `<link rel="preload">`:

```html
<head>
  <link rel="preload" href="/hls.light.min.js" as="script" />
  <link rel="preload" href="/first0.25s/1080p_0000.m4s" as="fetch" crossorigin />
  <link rel="preload" href="/first0.25s/audio_0000.m4s" as="fetch" crossorigin />
</head>
```

The browser starts downloading the first segment during HTML parsing, before JavaScript even begins executing. By the time hls.js initializes and requests the segment, it's already in the browser cache.
| Network | Before | After preload | Change |
|---|---|---|---|
| Real | 254ms | 80ms | -69% |
| 3G | 5,635ms | 4,772ms | -15% |
80ms on fast connections. But 3G is still slow — the 2-second first segment (994KB) takes too long to download even with preload.
Optimization 4: 0.25-Second First Segment
The default 2-second first segment is 994KB at 1080p/3.5Mbps. On 3G, that takes ~5 seconds to download. A 0.25-second segment is only 104KB — it downloads 10x faster.
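The download-time math is straightforward: size in bits divided by link bandwidth. A quick sanity check of those numbers, modeling 3G at the 1.5 Mbps used in the throttling table above and treating 1KB as 1000 bytes (RTT and TCP ramp-up ignored):

```javascript
// Seconds to download `kb` kilobytes over an `mbps` link, ignoring RTT.
function downloadSeconds(kb, mbps) {
  return (kb * 8) / (mbps * 1000);
}

console.log(downloadSeconds(994, 1.5).toFixed(1)); // "5.3" — the ~5s 2.0s segment
console.log(downloadSeconds(104, 1.5).toFixed(1)); // "0.6" — the 0.25s segment
```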
```bash
# Encode the short first segment (0.25s, GOP=6)
ffmpeg -i input.mp4 -t 0.25 -an \
  -vf "scale=-2:1080" \
  -c:v libx264 -b:v 3500k -maxrate 3500k -bufsize 7000k \
  -preset veryslow -g 6 -keyint_min 6 -sc_threshold 0 \
  -f hls -hls_time 0.25 -hls_segment_type fmp4 \
  -hls_fmp4_init_filename 1080p_init.mp4 \
  output/1080p.m3u8
```

GOP=6 (6 frames at 24fps = 0.25s) has negligible compression penalty compared to the standard GOP=12. The rest of the video uses normal 2-second segments.
| First Segment | Size | Real | Fast 4G | Slow 4G | 3G |
|---|---|---|---|---|---|
| 2.0s | 994KB | 66ms | 603ms | 1,790ms | 4,768ms |
| 1.0s | 379KB | 58ms | 183ms | 549ms | 1,416ms |
| 0.5s | 203KB | 67ms | 68ms | 155ms | 400ms |
| 0.25s | 104KB | 65ms | 68ms | 76ms | 73ms |
| 0.125s | 50KB | 69ms | 65ms | 73ms | 65ms |
At 0.25s, the segment arrives before hls.js is even ready to request it. Going shorter to 0.125s doesn't help — the bottleneck is now JavaScript initialization, not the network.
Optimization 5: hls.js Light Build, Self-Hosted
Two more wins by changing how we load the player library:
- hls.js light build (345KB vs 541KB full) — strips subtitle, alt audio, DRM, and CMCD support. 35% smaller = faster parse.
- Self-hosted from same origin — eliminates cross-origin DNS lookup + TLS handshake that a CDN like jsdelivr would require. The browser reuses the existing connection.
Optimization 6: Player Configuration
A handful of hls.js config options prevent the player from wasting time on startup:
```js
const hls = new Hls({
  startLevel: -1,
  maxBufferLength: 5,
  maxMaxBufferLength: 10,
  startFragPrefetch: true,
  // Skip bandwidth estimation — trust a high default
  testBandwidth: false,
  abrEwmaDefaultEstimate: 10_000_000,
  progressive: true,
});

hls.on(Hls.Events.MANIFEST_PARSED, () => {
  // Force 1080p start — don't let ABR pick 240p first
  const idx = hls.levels.findIndex(l => l.height === 1080);
  if (idx !== -1) hls.startLevel = idx;
  video.play().catch(() => {});
});
```

`testBandwidth: false` skips the initial bandwidth probe. `abrEwmaDefaultEstimate: 10_000_000` tells the ABR algorithm to assume 10 Mbps, which prevents the cautious 240p start.
The Full Stack
Putting it all together — the cumulative effect of each optimization:
| Optimization | RTTs | Real | 3G |
|---|---|---|---|
| Baseline HLS | 4 | 859ms | 6,123ms |
| + Turbo mode | 2 | 403ms | 6,006ms |
| + Inline master in HTML | 1 | 254ms | 5,635ms |
| + Preload first segment | ~0 | 80ms | 4,772ms |
| + 0.25s first segment | ~0 | 65ms | 73ms |
| + hls.js light + self-host | ~0 | 65ms | 67ms |
859ms to 65ms on fast connections (92% faster). 6,123ms to 67ms on 3G (99% faster). Fast starts on every network condition.
What We Tested That Didn't Help
We also ran an exhaustive optimization grid testing 7 additional strategies beyond the current best. All measurements: Playwright Chromium headless, 3 runs, cache disabled, median reported.
| Strategy | No throttle | Fast 4G | Slow 4G | 3G |
|---|---|---|---|---|
| Current best (control) | 65ms | 67ms | 67ms | 127ms |
| fetchpriority="high" | 72ms | 70ms | 60ms | 126ms |
| Poster frame (WebP) | 77ms | 61ms | 69ms | 143ms |
| Skip audio preload | 63ms | 66ms | 72ms | 127ms |
| Video-only bootstrap | 63ms | 71ms | 69ms | 131ms |
| 103 Early Hints | 71ms | 68ms | 75ms | 127ms |
Everything is within noise of the control. At ~65ms, we've hit the physical floor — the remaining time is HTML parse + hls.js init + MSE setup + video decode + compositor. No network optimization can reduce those.
What This Means in Practice
65ms is 1.5 frames at 24fps. The video appears to start instantly. On 3G — a connection most video platforms consider unwatchable — the experience is identical to a fast fiber connection.
The key insight: eliminate round trips, not bandwidth. Standard HLS's 4-request waterfall is the bottleneck, not the pipe size. Once all manifest data is inlined and the first segment is small enough to preload during JS init, network speed becomes irrelevant.
All measurements and code are in the open source repo.