Unicode-Range & Subset Loading: Implementation Blueprint

This guide is part of the Font Loading & Delivery Strategies blueprint, and it solves a specific waste problem: most sites ship far more glyphs than any single page renders. A full Latin-plus-Cyrillic-plus-Greek family can be 200–400KB per weight, yet an English marketing page needs only the basic Latin block — a few hundred codepoints. Shipping the rest is pure transfer cost on the critical path, and on a text-LCP page that cost lands directly on your Largest Contentful Paint.

The unicode-range descriptor is the browser-native fix. You split a family into per-script subset files, give each its own @font-face with a unicode-range descriptor, and the browser downloads only the subset files whose ranges intersect the characters actually on the page. An English page fetches latin.woff2 and never touches cyrillic.woff2; a Russian page does the reverse; a mixed page fetches both. Targeted delivery routinely cuts the initial font payload by 60–85%, which shaves 200–400ms off LCP on constrained networks and removes the layout-shift risk of a bloated download arriving mid-render.

Start your diagnosis in Chrome DevTools. Open the Network panel, filter by Font, reload, and compare the transferred WOFF2 size against the number of distinct glyphs your page renders. (The Coverage panel will not help here: it reports unused CSS and JavaScript bytes, not unused glyphs.) If you are downloading a multi-hundred-KB file to paint a screen of Latin text, the file carries glyphs the page will never use, and subsetting plus unicode-range is the fix.

Baseline Configuration

The minimum correct setup is two or more @font-face blocks that share one font-family name but point at different subset files, each tagged with the unicode-range it covers. The shared family name is what makes this transparent to your CSS: you keep writing font-family: 'Inter' everywhere, and the browser resolves each character to whichever subset's range contains it.

Baseline unicode-range split (Latin + Cyrillic)

@font-face {
  font-family: 'Inter';
  src: url('/fonts/inter-latin.woff2') format('woff2');
  font-weight: 400;
  font-display: swap;
  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+2000-206F, U+2074, U+20AC, U+2122, U+FFFD;
}

@font-face {
  font-family: 'Inter';
  src: url('/fonts/inter-cyrillic.woff2') format('woff2');
  font-weight: 400;
  font-display: swap;
  unicode-range: U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116;
}

Two rules govern correctness here. Keep the ranges disjoint: when a codepoint is listed in two @font-face blocks, the browser consults the blocks in reverse declaration order, so the later block silently wins, and if its file lacks the glyph the browser falls through and fetches the earlier file as well. And every block needs an explicit font-display; without it each subset defaults to font-display: auto (browser-defined, usually block-like), so set swap or optional consistently to control the fallback window. The unicode-range evaluation happens before font-display timing, so a range that matches no on-page character means the file is never even requested. For the subsets the page does use, the font-display value then governs how they swap in.
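The disjointness rule can be checked mechanically before deploy. The following is a minimal sketch, not part of any existing tool: the function names parse_unicode_range and find_overlaps are hypothetical, and the BLOCKS table simply repeats the two ranges from the baseline example above.

```python
import re
from itertools import combinations

TOKEN = re.compile(r"[Uu]\+([0-9A-Fa-f?]{1,6})(?:-([0-9A-Fa-f]{1,6}))?")


def parse_unicode_range(value: str) -> list[tuple[int, int]]:
    """Parse a CSS unicode-range value into inclusive (start, end) pairs."""
    spans = []
    for token in value.split(","):
        token = token.strip()
        match = TOKEN.fullmatch(token)
        if not match:
            raise ValueError(f"not a unicode-range token: {token!r}")
        first, last = match.groups()
        if "?" in first:
            if last:
                raise ValueError(f"wildcard cannot be combined with a range: {token!r}")
            # Wildcard form: U+4?? covers U+400 through U+4FF.
            start = int(first.replace("?", "0"), 16)
            end = int(first.replace("?", "F"), 16)
        else:
            start = int(first, 16)
            end = int(last, 16) if last else start
        spans.append((start, end))
    return spans


def find_overlaps(blocks: dict[str, str]) -> list[tuple[str, str, int, int]]:
    """Return (block_a, block_b, start, end) for every shared codepoint span."""
    parsed = {name: parse_unicode_range(value) for name, value in blocks.items()}
    overlaps = []
    for (a, spans_a), (b, spans_b) in combinations(parsed.items(), 2):
        for s1, e1 in spans_a:
            for s2, e2 in spans_b:
                lo, hi = max(s1, s2), min(e1, e2)
                if lo <= hi:
                    overlaps.append((a, b, lo, hi))
    return overlaps


# Ranges copied from the baseline Latin + Cyrillic split.
BLOCKS = {
    "inter-latin.woff2": "U+0000-00FF, U+0131, U+0152-0153, U+2000-206F, U+2074, U+20AC, U+2122, U+FFFD",
    "inter-cyrillic.woff2": "U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116",
}

if __name__ == "__main__":
    print(find_overlaps(BLOCKS))  # an empty list means the ranges are disjoint
```

Running a check like this in the build catches an accidental overlap when someone widens one range without trimming its neighbour.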

[Figure: unicode-range routing page characters to subset files. The page text "Hello Привет" is split by codepoint range: Latin characters match unicode-range U+0000-00FF and resolve to latin.woff2, Cyrillic characters match unicode-range U+0400-045F and resolve to cyrillic.woff2.]
Each codepoint range maps to its own subset file, so the browser fetches only the files the page actually uses.

Step-by-Step Implementation Workflow

Each step ends with a verification check so you confirm the subset is correct before it reaches production.

Implementation Steps:

  1. Measure the baseline payload first. Open DevTools → Network → filter Font, reload, and record the transferred WOFF2 size and request count. Note the number — a single 280KB Inter[wght].woff2 that paints a screen of English text is the waste you are about to eliminate. Verify: you have a written "before" figure (e.g. 1 request, 287KB transferred) to compare against after splitting.
  2. Audit the glyphs your content actually uses. Run glyphhanger https://example.com --spider --spider-limit=20 to crawl rendered pages and emit the exact codepoints in use, or assemble the unicode-range strings published for your family. Verify: you have a concrete U+ list (or a --whitelist file) covering every script your content renders, with no guessing — glyphhanger prints both a U+ string and a glyph count you can sanity-check against your content.
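  3. Install the toolchain. pip install fonttools brotli zopfli gives you pyftsubset plus the Brotli codec that WOFF2 output requires (zopfli only improves WOFF 1.0 compression and is optional for a WOFF2-only pipeline). Verify: pyftsubset --help runs and python -c "import brotli" exits cleanly.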
  4. Generate the Latin subset. Run pyftsubset against the source TTF/OTF, restricting to the Latin range and keeping only the OpenType features you use. Verify: ls -la subset-latin.woff2 shows the file under budget (< 50KB for a typical Latin subset — Inter's basic-Latin subset lands near 15–20KB), and pyftsubset --unicodes-file=... --output-file=... --verbose reports the retained glyph count, which should match your audit.
  5. Generate the remaining script subsets. Repeat for Latin-Extended, Cyrillic, Greek, Vietnamese, or whatever your audience needs, one file per range. Verify: each file is small and self-contained, the ranges across all commands are mutually exclusive, and the summed size of all subsets is still far below the monolithic original (you are not paying for the savings in duplicated overhead).
  6. Wire up the @font-face blocks. Add one block per subset, sharing the font-family name, each with its unicode-range and font-display. Verify: load an English page in DevTools → Network → Font — exactly one subset row (latin.woff2) should download; load a page with Cyrillic text and confirm cyrillic.woff2 joins it (two rows) while the irrelevant Greek and Vietnamese subsets stay absent. The row count per page is your proof the range gating works.
  7. Preload the critical subset only. Add a single <link rel="preload" as="font" crossorigin> for the subset your above-the-fold text needs, using resource hints to pull it forward. Verify: the critical subset starts downloading during HTML parse (its request appears at the top of the waterfall, not after CSSOM), and no non-critical subset is preloaded — a wasted preload shows a console warning that the resource was preloaded but not used within a few seconds.
  8. Content-hash and cache. Deploy each subset with a hashed filename and Cache-Control: public, max-age=31536000, immutable. Verify: response headers show immutable, and a font-content change produces a new filename so caches never serve stale glyphs — see browser font caching mechanics for the header set and the second-load (disk cache) check.
  9. Automate it in the build. Move steps 2–5 into your pipeline so subsets regenerate whenever content or the font changes. Verify: a clean build reproduces byte-identical subsets — see automating font subsetting in CI for the full pyftsubset and glyphhanger pipeline.
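Step 8's content-hashed filenames can be produced in a few lines of build code. This is a sketch under stated assumptions: the helper names hashed_name and fingerprint, the 10-character digest length, and the name.hash.ext layout are illustrative choices, not something the steps above prescribe.

```python
import hashlib
from pathlib import Path


def hashed_name(filename: str, data: bytes, digest_len: int = 10) -> str:
    """Insert a short SHA-256 digest of the file contents before the extension."""
    digest = hashlib.sha256(data).hexdigest()[:digest_len]
    path = Path(filename)
    return f"{path.stem}.{digest}{path.suffix}"


def fingerprint(path: Path) -> Path:
    """Return the hashed sibling path for a subset file on disk (no rename is performed)."""
    return path.with_name(hashed_name(path.name, path.read_bytes()))


if __name__ == "__main__":
    # Placeholder bytes stand in for a real WOFF2 file.
    print(hashed_name("inter-latin.woff2", b"example font bytes"))
```

Because the digest depends only on the bytes, an unchanged subset keeps its filename across builds and stays cached, while any glyph change yields a new URL that is safe to serve with the immutable header.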

Worked Example: Splitting One Family Into Four Subsets

To make the numbers concrete, here is a full split of a single weight of Inter (the variable file flattened to 400 for clarity) into four script subsets. Each command below is run against the same source Inter-Regular.ttf; only the --unicodes range and the output filename change. The resulting sizes are representative WOFF2 figures for this family. Your exact bytes vary by font, but the shape of the result holds: four small files whose sum is a fraction of the monolithic original.

Four pyftsubset commands, one per script

# Latin (basic) — U+0000–00FF plus common punctuation/symbols  → ~16 KB
pyftsubset Inter-Regular.ttf --output-file=inter-latin.woff2 \
  --unicodes="U+0000-00FF,U+0131,U+0152-0153,U+2000-206F,U+2074,U+20AC,U+2122,U+FFFD" \
  --flavor=woff2 --layout-features=kern,liga --no-hinting

# Latin-Extended — accented & extended Latin  → ~9 KB
pyftsubset Inter-Regular.ttf --output-file=inter-latin-ext.woff2 \
  --unicodes="U+0100-024F,U+0259,U+1E00-1EFF,U+2020,U+20A0-20AB,U+20AD-20CF,U+2113,U+2C60-2C7F,U+A720-A7FF" \
  --flavor=woff2 --layout-features=kern,liga --no-hinting

# Cyrillic — U+0400–045F core plus extensions  → ~12 KB
pyftsubset Inter-Regular.ttf --output-file=inter-cyrillic.woff2 \
  --unicodes="U+0400-045F,U+0490-0491,U+04B0-04B1,U+2116" \
  --flavor=woff2 --layout-features=kern,liga --no-hinting

# Greek — U+0370–03FF  → ~7 KB
pyftsubset Inter-Regular.ttf --output-file=inter-greek.woff2 \
  --unicodes="U+0370-03FF" \
  --flavor=woff2 --layout-features=kern,liga --no-hinting

A monolithic Inter Regular covering all four scripts is roughly 80–90KB as WOFF2. Split this way, the four subsets would sum to about 44KB if a page needed every script at once, but no page does. An English page downloads only inter-latin.woff2 (~16KB), roughly an 80% reduction against the monolith. A page that also shows characters outside Latin-1, such as the Ł and ź in Polish names, pulls Latin + Latin-Extended (~25KB); common accents like é sit inside U+0000-00FF and stay in the basic subset. A Russian page fetches Latin + Cyrillic (~28KB). The savings come not from the summed size but from the fact that the browser fetches only the files the page touches.
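The arithmetic behind those figures is easy to reproduce. The sketch below uses the representative subset sizes quoted in the commands above and assumes 85KB for the monolith, the midpoint of the 80–90KB estimate; both are illustrative numbers, not measurements of your font.

```python
# Representative WOFF2 sizes in KB, taken from the comments on the four commands.
SUBSET_KB = {"latin": 16, "latin-ext": 9, "cyrillic": 12, "greek": 7}

# Assumption: midpoint of the 80-90 KB estimate for the monolithic file.
MONOLITH_KB = 85


def page_payload(subsets: list[str]) -> int:
    """Total KB a page downloads when it touches the given subsets."""
    return sum(SUBSET_KB[name] for name in subsets)


def reduction_pct(payload_kb: int, baseline_kb: int = MONOLITH_KB) -> int:
    """Percentage saved relative to shipping the monolithic file."""
    return round(100 * (1 - payload_kb / baseline_kb))


if __name__ == "__main__":
    pages = {
        "English": ["latin"],
        "Polish names": ["latin", "latin-ext"],
        "Russian": ["latin", "cyrillic"],
        "Every script": list(SUBSET_KB),
    }
    for label, subsets in pages.items():
        kb = page_payload(subsets)
        print(f"{label}: {kb} KB, {reduction_pct(kb)}% smaller than the monolith")
```

With these inputs the English page comes out at 16KB (81% smaller) and the all-scripts case at 44KB (48% smaller), which is why the per-page figure matters more than the sum.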

The matching @font-face stack for all four subsets

@font-face {
  font-family: 'Inter'; src: url('/fonts/inter-latin.woff2') format('woff2');
  font-weight: 400; font-display: swap;
  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+2000-206F, U+2074, U+20AC, U+2122, U+FFFD;
}
@font-face {
  font-family: 'Inter'; src: url('/fonts/inter-latin-ext.woff2') format('woff2');
  font-weight: 400; font-display: swap;
  unicode-range: U+0100-024F, U+0259, U+1E00-1EFF, U+2020, U+20A0-20AB, U+20AD-20CF, U+2113, U+2C60-2C7F, U+A720-A7FF;
}
@font-face {
  font-family: 'Inter'; src: url('/fonts/inter-cyrillic.woff2') format('woff2');
  font-weight: 400; font-display: swap;
  unicode-range: U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116;
}
@font-face {
  font-family: 'Inter'; src: url('/fonts/inter-greek.woff2') format('woff2');
  font-weight: 400; font-display: swap;
  unicode-range: U+0370-03FF;
}

To verify the split, load four representative pages and read the Font filter in the Network panel. An English page: one row, ~16KB. A page with a name like "Łódź": two rows (Latin + Latin-Extended), ~25KB. A Russian page: two rows (Latin + Cyrillic), ~28KB. A Greek page: two rows (Latin + Greek), ~23KB. If a page ever downloads a subset whose script does not appear in its text, a range is too broad or two ranges overlap.
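Because each range appears twice in the worked example, once in a pyftsubset command and once in CSS, the two can drift apart. One way to avoid that is to generate both from a single table. This is a minimal sketch: SUBSETS repeats the ranges above, the helper names pyftsubset_argv and font_face_css are hypothetical, and the /fonts/inter-NAME.woff2 path layout is the one used in this example.

```python
import shlex

# Single source of truth: subset name -> range string, copied from the commands above.
SUBSETS = {
    "latin": "U+0000-00FF,U+0131,U+0152-0153,U+2000-206F,U+2074,U+20AC,U+2122,U+FFFD",
    "latin-ext": "U+0100-024F,U+0259,U+1E00-1EFF,U+2020,U+20A0-20AB,U+20AD-20CF,U+2113,U+2C60-2C7F,U+A720-A7FF",
    "cyrillic": "U+0400-045F,U+0490-0491,U+04B0-04B1,U+2116",
    "greek": "U+0370-03FF",
}


def pyftsubset_argv(source: str, name: str, unicodes: str) -> list[str]:
    """Build the argument list for one subset, using the same flags as the worked example."""
    return [
        "pyftsubset", source,
        f"--output-file=inter-{name}.woff2",
        f"--unicodes={unicodes}",
        "--flavor=woff2", "--layout-features=kern,liga", "--no-hinting",
    ]


def font_face_css(name: str, unicodes: str) -> str:
    """Render the matching @font-face block for one subset."""
    spaced = ", ".join(part.strip() for part in unicodes.split(","))
    return (
        "@font-face {\n"
        f"  font-family: 'Inter'; src: url('/fonts/inter-{name}.woff2') format('woff2');\n"
        "  font-weight: 400; font-display: swap;\n"
        f"  unicode-range: {spaced};\n"
        "}"
    )


if __name__ == "__main__":
    for name, unicodes in SUBSETS.items():
        print(shlex.join(pyftsubset_argv("Inter-Regular.ttf", name, unicodes)))
    print("\n".join(font_face_css(name, unicodes) for name, unicodes in SUBSETS.items()))
```

The argument lists can be passed to subprocess.run unchanged once fonttools is installed; printing them first makes the generated commands easy to review.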

How the Browser Decides Which Subset to Download

The mechanism is deliberate and worth understanding, because it explains every verification check above. When the browser builds a @font-face family from several blocks sharing one font-family, it does not download anything eagerly. It first lays out the page and determines the set of codepoints that actually need that family. Then, for each @font-face block, it intersects that codepoint set with the block's unicode-range. A block is fetched only if the intersection is non-empty — that is, only if at least one character on the page falls inside its declared range. This is the "deferred download" or range-gating behaviour, and it is what lets you declare a dozen subsets and still ship one file to a single-language page.

Two consequences follow. First, a block whose range matches no on-page character is never requested at all: not fetched and discarded, never requested. Second, the decision is per-block, so adding one stray character (a symbol outside your Latin range, a single Cyrillic name in an English article) can pull in an entire extra subset file. This is why the audit step matters: a unicode-range that is narrower than your real content silently fails to cover stray glyphs (they fall back to the next font in the stack), while one that is broader than needed can drag in files you expected to stay dormant.
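The gating rule is small enough to model directly. The sketch below is a simplification for reasoning about which files a page would pull, not a description of any browser's internals: it treats every character of the text as using the family, ignores each font's actual character map, and uses a trimmed, illustrative BLOCKS table with hypothetical filenames.

```python
# Inclusive (start, end) codepoint spans per subset file; an illustrative subset of the ranges above.
BLOCKS = {
    "latin.woff2": [(0x0000, 0x00FF), (0x0131, 0x0131), (0x0152, 0x0153), (0x2000, 0x206F)],
    "latin-ext.woff2": [(0x0100, 0x024F), (0x1E00, 0x1EFF)],
    "cyrillic.woff2": [(0x0400, 0x045F), (0x0490, 0x0491)],
    "greek.woff2": [(0x0370, 0x03FF)],
}


def subsets_to_fetch(text: str, blocks: dict[str, list[tuple[int, int]]] = BLOCKS) -> list[str]:
    """Return the files whose declared range intersects the codepoints in the text."""
    needed = {ord(ch) for ch in text}
    return [
        name
        for name, spans in blocks.items()
        if any(start <= cp <= end for cp in needed for start, end in spans)
    ]


if __name__ == "__main__":
    for sample in ("Hello", "Hello Привет", "Łódź"):
        print(sample, "->", subsets_to_fetch(sample))
```

Feeding it a page's extracted text shows at a glance how a single stray character changes the download set, which is the second consequence described above.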

This per-range gating is also how Google Fonts has shipped fonts for years. Request a Google Fonts CSS URL and you get back not one @font-face but a long stack of them — frequently 6 to 20 blocks for the same family, each pointing at a tiny script-specific WOFF2 and each carrying its own unicode-range (latin, latin-ext, cyrillic, cyrillic-ext, greek, greek-ext, vietnamese, and so on). Your browser then applies exactly the gating described above and downloads only the one or two subsets your page needs. Self-hosting reproduces this pattern by hand or in your build; the multi-range stack in the worked example above is the Google Fonts pattern, just on files you control.

The combining-vs-splitting trade-off is the one real judgement call. Each subset is a potential extra HTTP request, with its own connection-reuse, header, and decode overhead — splitting too aggressively trades transfer savings for round-trip count and a colder cache. Combine when subsets are almost always used together (basic Latin and the punctuation/symbols block belong in one file, not two) or when a script is small and your audience nearly always renders it. Split when a script is large and only a minority of pages need it (Cyrillic on a primarily-English site, CJK anywhere). The practical sweet spot is 4–6 blocks grouped by script or by the languages your traffic actually serves; below that you under-target and ship dead glyphs, above that you fragment requests for diminishing returns.

Browser Compatibility & Fallback Matrix

Capability | Chrome/Edge | Firefox | Safari | Notes
unicode-range descriptor | 36+ | 44+ | 10+ | Supported across all evergreen engines; the browser fetches only matching subsets
WOFF2 subset format | 36+ | 39+ | 10+ | Subset and ship WOFF2 exclusively; ~30% smaller than WOFF
font-display per @font-face | 60+ | 58+ | 11.1+ | Applies independently to each subset block
Lazy subset fetch (range-gated) | Yes | Yes | Yes | A subset whose range matches no on-page glyph is never requested
IE11 (no unicode-range) | n/a | n/a | n/a | EOL June 2022; cannot decode WOFF2 either, so if you must support it, serve it one monolithic WOFF file

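The fallback story is simple in practice: every browser you ship to today supports unicode-range. The only engine that ignored it was IE11, which reached end-of-life in June 2022. If you still owe it support, note that IE11 cannot decode WOFF2, so the catch-all has to be an un-ranged, monolithic WOFF @font-face, and IE11 then downloads the whole family. Deliver that block to IE11 only: an un-ranged block covers every codepoint, and because browsers consult same-family blocks in reverse declaration order, a catch-all appended to the shared stack would also win in modern browsers and undo the split.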

Configuration & Code Examples

Unicode-range subset declaration for Latin & Cyrillic

@font-face {
  font-family: 'Inter';
  src: url('/fonts/inter-latin.woff2') format('woff2');
  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA, U+02DC, U+2000-206F, U+2074, U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD;
  font-display: swap;
}

@font-face {
  font-family: 'Inter';
  src: url('/fonts/inter-cyrillic.woff2') format('woff2');
  unicode-range: U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116;
  font-display: swap;
}

CLI subsetting command using pyftsubset

pyftsubset font.ttf \
  --output-file=subset-latin.woff2 \
  --unicodes="U+0000-00FF,U+0131,U+0152-0153,U+02BB-02BC,U+02C6,U+02DA,U+02DC,U+2000-206F,U+2074,U+20AC,U+2122,U+2191,U+2193,U+2212,U+2215,U+FEFF,U+FFFD" \
  --flavor=woff2 \
  --layout-features=kern,liga \
  --no-hinting

Auditing real glyph usage with glyphhanger

# Crawl the live site, emit the exact codepoints in use, and subset in one pass.
glyphhanger https://example.com --spider --spider-limit=20 \
  --subset=fonts/Inter.ttf \
  --formats=woff2 \
  --css

Resource hint integration for the critical subset

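<!-- Preload only the subset above-the-fold text needs; crossorigin is required for fonts even on the same origin. -->
<link rel="preload" href="/fonts/inter-latin.woff2" as="font" type="font/woff2" crossorigin>
<!-- Only useful when other font files are served from a separate origin (static.example.com here). -->
<link rel="preconnect" href="https://static.example.com" crossorigin>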

The glyphhanger example is the bridge to automation: it derives the codepoint list from your actual rendered content rather than a hand-maintained range, which is exactly what you want running on every build so subsets never drift from usage.
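To make the idea of deriving ranges from content concrete, here is a minimal sketch of the core transformation: collapsing the characters found in some text into a compact U+ string in the same notation the examples above use. It is an illustration only and not glyphhanger's implementation; the function name to_unicode_range is hypothetical.

```python
def to_unicode_range(text: str) -> str:
    """Collapse the distinct codepoints in text into a comma-separated U+ range string."""
    runs: list[list[int]] = []
    for cp in sorted({ord(ch) for ch in text}):
        if runs and cp == runs[-1][1] + 1:
            runs[-1][1] = cp  # extend the current consecutive run
        else:
            runs.append([cp, cp])  # start a new run
    return ",".join(
        f"U+{start:04X}" if start == end else f"U+{start:04X}-{end:04X}"
        for start, end in runs
    )


if __name__ == "__main__":
    print(to_unicode_range("abc"))
    print(to_unicode_range("Hi"))
```

The resulting string can be passed to pyftsubset's --unicodes option or pasted into a unicode-range descriptor, so the subset file and the CSS are derived from the same audited content.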

Common Pitfalls

  • Overlapping unicode-range declarations. When the same codepoint appears in two blocks, the browser consults the later-declared block first; if that file lacks the glyph it falls through and fetches the earlier file too, and the overlapping glyphs are shipped twice across the two files. Keep ranges strictly disjoint.
  • Forgetting font-display on subsets. A subset block without it defaults to font-display: auto, reintroducing the FOIT (invisible text) you subsetted to avoid. Set swap or optional on every block.
  • Missing crossorigin on the subset preload. Font fetches are anonymous-CORS, so a preload without crossorigin is cached separately from the real fetch and the file downloads twice — and on a CDN origin a missing CORS header makes the fetch fail outright.
  • Wrong WOFF2 MIME type at the origin. Serving subsets without Content-Type: font/woff2 makes some setups reject the type-tagged preload. Configure the MIME type explicitly on the server and edge.
  • Caching subsets without content hashes. A long immutable lifetime on a non-hashed filename means a font update never reaches returning visitors — they keep the stale subset. Always hash the filename.
  • Over-fragmenting into too many subsets. Splitting one family into a dozen tiny files multiplies request overhead and complicates cache warming. Group by script/language and keep it to roughly 4–6 blocks.
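The missing-crossorigin pitfall is easy to lint for in a build or test step. The sketch below uses only the standard library HTML parser; the class and function names are hypothetical, and it checks static markup only, not tags injected by scripts.

```python
from html.parser import HTMLParser


class FontPreloadLint(HTMLParser):
    """Collect hrefs of font preloads that lack a crossorigin attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.problems: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)  # valueless attributes such as crossorigin map to None
        rel_tokens = (attributes.get("rel") or "").lower().split()
        is_font_preload = "preload" in rel_tokens and (attributes.get("as") or "").lower() == "font"
        if is_font_preload and "crossorigin" not in attributes:
            self.problems.append(attributes.get("href") or "")


def font_preloads_missing_crossorigin(html: str) -> list[str]:
    """Return the href of every font preload in the markup that omits crossorigin."""
    lint = FontPreloadLint()
    lint.feed(html)
    lint.close()
    return lint.problems


if __name__ == "__main__":
    good = '<link rel="preload" href="/fonts/inter-latin.woff2" as="font" type="font/woff2" crossorigin>'
    bad = '<link rel="preload" href="/fonts/inter-latin.woff2" as="font" type="font/woff2">'
    print(font_preloads_missing_crossorigin(good))
    print(font_preloads_missing_crossorigin(bad))
```

Failing the build when the returned list is non-empty prevents the double download from reaching production.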

Frequently Asked Questions

How does unicode-range interact with variable font axes?

unicode-range operates on which glyphs ship, and variable axes operate on how those glyphs are interpolated — they are independent. A subset file still carries the full wght/opsz/slnt axis data, just for fewer glyphs, so you can subset a variable font and continue to drive its axes with font-variation-settings. The browser downloads only the subset whose range matches the page content, axes and all.

What is the maximum recommended number of unicode-range blocks per font-family?

Keep it to about 4–6 discrete blocks. Each block is a potential separate HTTP request, so excessive fragmentation trades download savings for request overhead and makes cache warming harder. Group by script (Latin, Latin-Extended, Cyrillic, Greek) or by the languages your traffic actually uses, guided by your audience's geographic distribution.

Does unicode-range work with font-display: optional?

Yes, and they compose cleanly. The browser evaluates the unicode-range match first to decide which subset to request, then applies optional timing to that subset's load. If the matching subset has not arrived within the ~100ms optional window, the browser keeps the fallback for that page view — so you get range-targeted downloads and zero layout shift from a late swap.

Should I subset before or after preloading?

Subset first, always. A preload only accelerates discovery — it does nothing about file size. Preloading a 200KB unsubsetted family just pulls a large download earlier and can starve your LCP resource. Cut the family to a sub-50KB subset, then preload that subset so the file you pull forward is both early and small.
