Optimizing Discovery
Signet is more than just the rollup. We’ve created tools and documentation that help developers work with Signet, but they’re only valuable when developers can discover them. Many companies have failed because nobody knew they existed.
I think about discovery in terms of pushing and pulling information:
- Businesses push information through marketing, social media, and outreach
- Users pull information by visiting our documentation directly
- Robots pull our content with crawlers, then push it through search engines and LLMs
The Research
We need to understand how humans and robots discover content.
What search engines want
Search engines want fast sites. Google’s crawl budget documentation is explicit:
“If the site responds quickly for a while, the limit goes up… If the site slows down or responds with server errors, the limit goes down and Googlebot crawls less.”
Bing determines crawl frequency based on content freshness. Yandex’s 2023 source code leak revealed that technical performance factors like server errors and page speed directly impact ranking.
Google also considers only the mobile version of a site for rankings and uses Core Web Vitals to break ties between similar competitors.
What LLMs want
Unlike search engines, LLM crawlers don’t execute JavaScript. Structured data has been found to reduce hallucinations, and DOM semantics significantly improve LLM extraction accuracy.
That means that semantic HTML, Schema.org markup, and proper element hierarchy help LLMs understand and cite content.
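As a rough illustration of what that looks like in practice, a documentation page might pair semantic elements with a small JSON-LD block. The type and fields below are illustrative placeholders, not our exact production markup:

```html
<!-- Illustrative only: a TechArticle JSON-LD block next to semantic markup.
     The headline, date, and fields are placeholders, not our production schema. -->
<article>
  <h1>Deploying on Signet</h1>
  <p>…</p>
</article>
<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "TechArticle",
    "headline": "Deploying on Signet",
    "dateModified": "2025-01-15"
  }
</script>
```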
The overlap
Robots want plain text documents linked together in a predictable, consistent structure, delivered as fast as possible. Humans want the same, plus visuals, interactivity, and tone.
| For Humans | For Robots | For Both |
|---|---|---|
| Visuals | Crawl directives | Speed |
| Interactivity | Canonical URLs | Mobile responsive |
| Navigation UI | Structured metadata | Semantic markup |
| Tone & voice | Meta descriptions | Fresh content |
That gives us some clear priorities:
- Speed - Faster pages get crawled more, rank better, convert better
- Semantic structure - Proper HTML and structured data improve both search ranking and LLM accuracy
- No JavaScript dependency - LLM crawlers can’t execute JS, so core content must be in HTML
How fast is fast enough?
While faster is always better, there’s a critical breakpoint called IW10.
TCP connections can’t send unlimited data immediately. To prevent network congestion, servers use slow start, beginning with a small congestion window that grows with each acknowledged packet. RFC 6928 sets the initial window at 10 segments of 1460 bytes: 14,600 bytes maximum before the server must wait for an acknowledgment.
That means that if we keep our total initial bundle under 14.6KB, the page loads in a single network round-trip. Each additional 14.6KB requires another round-trip.
TCP Initial Window (14.6KB limit)
First round-trip capacity:
┌──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┐
│ 1460 │ 1460 │ 1460 │ 1460 │ 1460 │ 1460 │ 1460 │ 1460 │ 1460 │ 1460 │ bytes
└──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┘
= 14,600 bytes (14.6KB)
Exceed this: Server waits for ACK before sending more

Impact of Multiple Round-Trips
13.5KB page:
░░░░░▐████████▌ 67ms ✓
├──── 67ms ────┤
30KB page:
░░░░░░▐████████▌░░░░░░▐████████▌░░░░░░▐████████▌ 201ms ✓
├──── 67ms ────┤├──── 67ms ────┤├──── 67ms ────┤
RTT 1 RTT 2 RTT 3
50KB page:
░░░░░░▐████████▌░░░░░░▐████████▌░░░░░░▐████████▌░░░░░░▐████████▌ 268ms ✓
├──── 67ms ────┤├──── 67ms ────┤├──── 67ms ────┤├──── 67ms ────┤
RTT 1 RTT 2 RTT 3 RTT 4
Each RTT is about 67ms. On mobile networks, 100ms+ RTT delays are common.
The basics
I often think about Dan Abramov’s “You Might Not Need Redux”. He discusses how frameworks (even his own) are a tradeoff that introduce constraints. “If you trade something off, make sure you get something in return.”
There are many really good reasons to use a framework, but for our purposes, we didn’t feel the return was there. React costs ~46KB gzipped and would be a non-starter for IW10.
With that in mind, we can tackle the basic requirements:
- SSR
- Semantic HTML
- Structured data
- Appropriate <meta> tags
- robots.txt with XML sitemap
- WCAG AA accessibility
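To make the <meta> item concrete, here is a minimal, illustrative sketch; the description text and canonical URL are placeholders rather than our actual values:

```html
<!-- Placeholder values; the description and canonical URL are illustrative -->
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="description" content="Guides and references for building on Signet." />
<link rel="canonical" href="https://example.com/docs/" />
```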
Implementing these basics landed us at about 28KB bundle size per page. Two round-trips.
Finding optimizations
Improve minification and compression (28KB → 21KB)
There were some easy tradeoffs to make:
- Strip Schema.org JSON-LD to essential fields
- Inline only critical CSS and JS, defer everything else
- Improve build-time minification
- Swap CloudFront’s on-the-fly compression for build-time pre-compression (sketched below)
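As a sketch of that last swap, a small build step can pre-compress static assets with Brotli so the CDN serves stored bytes instead of compressing on the fly. The file list and quality level here are assumptions, not our exact build config:

```js
// Hedged sketch of build-time pre-compression using Node's built-in zlib.
// The list of files and the quality setting are illustrative assumptions.
import { brotliCompressSync, constants } from "node:zlib"
import { readFileSync, writeFileSync } from "node:fs"

for (const file of ["dist/index.html", "dist/app.css", "dist/app.js"]) {
  const compressed = brotliCompressSync(readFileSync(file), {
    // Max quality is affordable here because the cost is paid once at build time
    params: { [constants.BROTLI_PARAM_QUALITY]: 11 },
  })
  writeFileSync(`${file}.br`, compressed)
}
```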
Further optimizations came with stronger tradeoffs. For example, we could defer CSS and JS loading entirely, but that would harm our Core Web Vitals by introducing layout shifts and a flash of unstyled content (FOUC).
Service Worker streaming injection (21KB → 13.5KB)
Browsers only need styles to be present at render time, not necessarily in the HTML payload itself.
Service Workers (not to be confused with Web Workers) can intercept network requests and use TransformStream to modify responses in transit rather than waiting for them to fully arrive. That means we can inject CSS and JS during HTML streaming:
```js
// Intercept HTML and inject CSS during transfer
async function handleNavigation(request) {
  const cache = await caches.open("pages-v1")
  const response = await cache.match(request)
  if (!response) return fetch(request)

  return new Response(
    response.body
      .pipeThrough(new TextDecoderStream())
      .pipeThrough(
        new TransformStream({
          transform(chunk, controller) {
            if (chunk.includes("</head>")) {
              chunk = chunk.replace(
                "</head>",
                `<style>${componentCSS}</style></head>`
              )
            }
            controller.enqueue(chunk)
          },
        })
      )
      .pipeThrough(new TextEncoderStream()),
    { headers: response.headers }
  )
}
```

This change let us aggressively prune our inlined code. The HTML itself was only 10.5KB. The critical CSS and JS we previously inlined are injected during transfer by the Service Worker, so the browser has a fully-styled document by the time it needs it.
This solution wasn’t a panacea. First-time visitors still experience a slight FOUC (typically ~50ms) while the Service Worker installs in the background. After installation, though, all subsequent visits are instant and fully styled.
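For completeness, registering the worker from the page is a one-liner; the script path below is an assumption:

```js
// Register the Service Worker; "/sw.js" is an assumed path.
// First visits render without it (hence the brief FOUC); once installed,
// it intercepts later navigations and streams styles into cached HTML.
if ("serviceWorker" in navigator) {
  navigator.serviceWorker.register("/sw.js")
}
```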
Instant navigation
We can also improve browsing speed for humans by aggressively caching and prefetching content:
bfcache provides instant restoration when users hit back/forward buttons. It preserves the full page state: JavaScript execution context, scroll position, form inputs.
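bfcache is mostly automatic, but it is easy to check whether a navigation was served from it. A minimal sketch:

```js
// Detect a bfcache restore: event.persisted is true when the page
// came back from the back/forward cache instead of a fresh load.
window.addEventListener("pageshow", (event) => {
  if (event.persisted) {
    // Full JS state, scroll position, and form inputs were preserved
  }
})
```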
Speculation Rules API enables declarative prefetching:
```html
<script type="speculationrules">
  {
    "prefetch": [
      {
        "source": "document",
        "where": { "href_matches": "/docs/*" },
        "eagerness": "moderate"
      }
    ]
  }
</script>
```

Custom hover prefetching prefetches and caches pages when users hover over links. If they move the mouse away before a short timeout elapses, the prefetch is cancelled, avoiding wasteful requests.
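A simplified sketch of that hover behavior, assuming a plain fetch-based prefetch and a short delay before committing (the 80ms value is illustrative, not our tuned number):

```js
// Hedged sketch: prefetch a same-origin link after a short hover delay,
// cancelling if the pointer leaves before the timer fires.
const HOVER_DELAY_MS = 80 // illustrative delay, not our tuned value
const pending = new Map()

document.addEventListener("mouseover", (event) => {
  const link = event.target.closest("a[href^='/']")
  if (!link || pending.has(link)) return
  // Start a timer; only prefetch if the pointer stays long enough
  pending.set(link, setTimeout(() => fetch(link.href), HOVER_DELAY_MS))
})

document.addEventListener("mouseout", (event) => {
  const link = event.target.closest("a[href^='/']")
  if (!link) return
  // Pointer left before the timer fired: cancel the prefetch
  clearTimeout(pending.get(link))
  pending.delete(link)
})
```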
Combined with Service Worker injection and standard background prefetching, navigation typically resolves from memory in ~5-10ms.
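Roughly speaking, that in-memory tier is a Map keyed by URL with a DOM swap on hit. The sketch below is a simplification of the idea, not our actual router:

```js
// Simplified sketch: a session-only Map of prefetched pages with a DOM
// swap on hit. Cache size limits and scroll/focus handling are omitted.
const pageCache = new Map() // url -> HTML string, filled by prefetching

async function navigate(url) {
  let html = pageCache.get(url)
  if (!html) {
    // Miss: fall through to the Service Worker / HTTP / CDN tiers
    html = await (await fetch(url)).text()
    pageCache.set(url, html)
  }
  const doc = new DOMParser().parseFromString(html, "text/html")
  document.title = doc.title
  document.body.replaceWith(doc.body) // swap content without a full reload
  history.pushState({}, "", url)
}
```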
The complete system flow
Complete Navigation and Caching Flow
┌─────────────────────────────────────────────────────────────────┐
│ CACHE POPULATION (How pages get into cache) │
├─────────────────────────────────────────────────────────────────┤
│ • Speculation Rules: /docs/* prefetch │
│ • Custom prefetch: All links after hover │
│ • Background prefetching for high-priority pages │
└─────────────────────────────────────────────────────────────────┘
Navigation Event
│
├──┬─► Back/Forward Button
│ └─► bfcache: Instant full page restore ✓
│
└──┬─► Regular Click
│
├─► Tier 0: Memory Map (JS) ───────────────► ~0ms
│ 25 pages, session-only, 60% hit rate
│ └─► HIT: DOM swap ✓ │ MISS ↓
│
├─► Tier 1: Service Worker Cache ──────────► ~10ms
│ 200+ pages, persistent, 90% hit rate
│ └─► HIT: CSS Streaming Injection
│ ┌────────────────────────────────────────┐
│ │ 1. Fetch HTML (10.5KB, no CSS or JS) │
│ │ 2. Pipe through TextDecoderStream │
│ │ 3. TransformStream: │
│ │ • Find header tag │
│ │ • Inject CSS and JS │
│ │ 4. Pipe through TextEncoderStream │
│ │ 5. Browser receives fully styled doc │
│ └────────────────────────────────────────┘
│ Return (10ms, 0ms FOUC) ✓ │ MISS ↓
│
├─► Tier 2: Browser HTTP Cache ──────────────► ~10ms
│ ~100MB, Cache-Control headers, 95% hit rate
│ └─► HIT ✓ │ MISS ↓
│
├─► Tier 3: CloudFront Edge ─────────────────► ~12ms
│ Unlimited storage, 24hr TTL, 99.9% hit rate
│ └─► HIT: Brotli level 5 compressed ✓ │ MISS ↓
│
└─► Tier 4: S3 Origin ───────────────────────► ~118ms
Permanent storage, 0.1% hit rate
Effective latency: First visit ~12ms | After 3 pages ~2ms average

Low-cost experiments: llms.txt
No major LLM provider officially supports llms.txt yet, but there are signals that we may be headed toward adoption of the standard. Anthropic maintains comprehensive files for their documentation. The implementation cost is trivial, so it makes sense to include one. We also added a “Copy for LLMs” button on documentation pages for direct markdown access.
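For reference, the proposed llms.txt format is plain markdown: an H1 title, a short blockquote summary, and sections of annotated links. A placeholder sketch (the paths and descriptions are illustrative, not our actual file):

```md
# Signet

> Documentation and tooling for building on Signet.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): deploy your first app
- [API reference](https://example.com/docs/api.md): endpoints and parameters
```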