Optimizing Discovery

Tom // 15 min read
case-study

How many network requests should it take to navigate to a page?

In the simplest case, navigation resolves in a single network transaction. The browser requests the HTML content for the URL and displays the response. There’s no further network activity or processing because everything necessary to render that page is embedded within the HTML itself:

html
<article>
  <h1>Optimizing Discovery</h1>
  <p>How many network requests should it take...</p>
</article>

In practice, a modern page will also display images, load some JavaScript, add some styling, and so on. To do this, the browser will make a bunch of additional requests. Some of these requests will be render-blocking, requiring the browser to defer display until they resolve. The rest are nice to have, but the browser can display the page while they load.

These additional network requests trade slower load times for richer styling and interactivity. Many tools and frameworks have been built that offer developers a multitude of tradeoffs to choose from. Most of this tooling aims to optimize the human experience of visiting a page.

However, most developers now discover content with the help of robots like search and LLMs. These robots present our content directly to the user in a setting largely outside of our control. We want to be certain that the information presented to our users by these robots is correct, relevant, and up-to-date.

Optimizing for humans now also requires optimizing for robots. This substantially changes the value proposition of our tools. Robots prefer traditionally structured websites and don’t care about styling or interactivity. They want content, and they want it as quickly as possible.

What robots want

The needs of search engines are well-documented and understood. For example, Google indexes only the mobile version of a page (mobile-first indexing) and uses Core Web Vitals as a ranking signal. Their crawl budget documentation is explicit:

If the site responds quickly for a while, the limit goes up… If the site slows down or responds with server errors, the limit goes down and Googlebot crawls less. (Google Search Central)

Other search engines have similar requirements. Bing determines crawl frequency from content freshness. Yandex’s 2023 source code leak showed page speed directly impacts ranking. Broadly speaking, faster sites get crawled more often and rank higher.

The needs of LLMs are opaque. There are two pathways to consider: proactive crawling (to build internal knowledge bases) and reactive searching (for on-demand user requests).

Through observation we’ve learned that they don’t execute JavaScript. If our content requires JS to render, it’s invisible to LLMs. Research has found that structured data reduces hallucinations and semantic HTML improves extraction accuracy.

Tool-augmented LLM use, like Claude’s web_search, relies on external search APIs that use traditional search infrastructure. For example, ChatGPT uses Bing and Claude uses Brave. Others, like Perplexity, have developed their own tools and infrastructure.

Our content needs to be fast, semantic HTML that renders without JavaScript.

How fast is fast enough?

When optimizing for speed, there’s a point beyond which a human can’t perceive improvements. Robots have no such diminishing returns. Recall our traditional network transaction: one request, one response. So why not just inline all of our scripts, styles, and so on inside of our HTML? Because of how TCP works.

When a browser opens a new connection to a server, the server can’t immediately blast data as fast as the network can handle. TCP uses a mechanism called slow start to ramp up gradually, preventing network congestion. RFC 6928 defines the initial “congestion window” as 10 segments of 1460 bytes each, so about 14.6KB. If the server tries to send more than that in the first burst, it has to wait for an acknowledgment before continuing.

TCP Initial Window

First roundtrip capacity (10 segments):
+------+------+------+------+------+
| 1460 | 1460 | 1460 | 1460 | 1460 |
+------+------+------+------+------+
| 1460 | 1460 | 1460 | 1460 | 1460 | bytes
+------+------+------+------+------+
= 14,600 bytes (14.6KB)

Exceed this: Server waits for ACK before sending more

A “roundtrip” is one message to and from the server. Each roundtrip incurs some network overhead. Desktop connections typically see 50-70ms per round trip. Mobile networks are worse, often 100-300ms. A 14KB page loads in one round trip. A 50KB page needs four.

Impact of Multiple Roundtrips (200ms mobile latency)

14KB page (1 roundtrip):
 req   data
 ....####
+--------+
0      200ms

50KB page (4 roundtrips):
 req  data  req  data  req  data  req  data
 ....#### ....#### ....#### ....####
+--------+--------+--------+--------+
0      200     400     600     800ms

A page under 14.6KB arrives in a single burst: 200ms total. A 50KB page needs four round trips: 800ms of waiting before the browser can start parsing the full response.

If we keep every page under 14.6KB (compressed), content will arrive to the browser in a single round trip. From a networking perspective, this is about as fast as we can get and maximizes the time crawlers spend with our content, rather than waiting.
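
One way to keep ourselves honest is to check that budget at build time. A minimal sketch, assuming pages are built into a dist/ directory (the script name and layout here are illustrative, not our exact tooling):

javascript
// check-budget.js (sketch): fail the build if any compressed page
// exceeds the ~14.6KB initial congestion window.
const fs = require("fs");
const path = require("path");
const zlib = require("zlib");

const BUDGET = 14600; // ~10 TCP segments of 1460 bytes

for (const file of fs.readdirSync("dist").filter((f) => f.endsWith(".html"))) {
  const html = fs.readFileSync(path.join("dist", file));
  const compressed = zlib.brotliCompressSync(html);
  if (compressed.length > BUDGET) {
    console.error(`${file}: ${compressed.length} bytes compressed, over budget`);
    process.exitCode = 1;
  }
}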

No frameworks

Fitting within a single roundtrip is a very tight constraint.

React and React DOM alone weigh about 42KB gzipped, almost three times our budget. Many frameworks also need JavaScript to render content, which means LLM crawlers can’t read it. We could explore a framework that offers server-side rendering, but for static pages the complexity of SSR and hydration seemed like the wrong tradeoff.

Just like with Signet, our hypothesis is that the complexity of a framework isn’t necessary. We opted for plain HTML, CSS, and vanilla JavaScript.

Getting semantic

Sites are commonly built with <div> elements, like this:

html
<div class="wrapper">
  <div class="sidebar">
    <div class="nav-item">Getting Started</div>
    <div class="nav-item">API Reference</div>
  </div>
  <div class="content">
    <div class="title">Getting Started</div>
    <div class="text">Welcome to our documentation...</div>
  </div>
</div>

This renders fine in a browser, but neither the browser nor crawlers know what any of it means. Semantic HTML uses elements that describe their purpose. A <nav> is navigation. An <article> is self-contained content. Research shows semantic HTML improves extraction accuracy for LLMs trying to understand page structure. Search engines also use these elements to understand content hierarchy.

We can transition to something that looks like:

html
<div class="wrapper">
  <nav>
    <a href="/getting-started">Getting Started</a>
    <a href="/api">API Reference</a>
  </nav>
  <article>
    <h1>Getting Started</h1>
    <p>Welcome to our documentation...</p>
  </article>
</div>

Screen readers are a third type of robot we haven’t described yet, but a very important one. An aria-label tells robots what the navigation contains:

html
<div class="wrapper">
  <nav aria-label="Documentation">
    <a href="/getting-started">Getting Started</a>
    <a href="/api">API Reference</a>
  </nav>
  <article>
    <h1>Getting Started</h1>
    <p>Welcome to our documentation...</p>
  </article>
</div>

Search engines don’t use ARIA attributes directly for indexing, but accessibility feeds into the broader page-quality signals they do consider, so missing ARIA attributes can still hurt your ranking.

Structured data

Semantic HTML tells crawlers about structure, but not what our content is. Schema.org defines a vocabulary that search engines understand. You embed it as JSON-LD in a script tag:

html
<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "TechArticle",
    "headline": "Getting Started",
    "description": "Learn how to set up and use Signet",
    "author": {
      "@type": "Organization",
      "name": "Signet"
    }
  }
</script>

This tells a crawler exactly what it’s looking at: a technical article with a specific headline, description, and author.

There are many schema types, but we only needed three to describe our content, navigation, and our organization.
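
For illustration, a navigation entry might look something like the block below. This is a sketch only: BreadcrumbList is one common choice for describing navigation, and the names and URLs are placeholders rather than our actual markup.

html
<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
      {
        "@type": "ListItem",
        "position": 1,
        "name": "Docs",
        "item": "https://example.com/docs/"
      },
      {
        "@type": "ListItem",
        "position": 2,
        "name": "Getting Started",
        "item": "https://example.com/docs/getting-started/"
      }
    ]
  }
</script>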

Cutting the budget

After doing this work, we were feeling really optimistic. Unfortunately, our pages weighed 28KB on average. We needed to cut them in half.

Since we’re not using any frameworks, we also aren’t benefitting from the nice build systems many frameworks provide. We made some basic improvements to our build system:

  • Improve build-time minification with cssnano, postcss, and autoprefixer
  • Switch from AWS CloudFront’s automatic Brotli compression to build-time pre-compression
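
The minification side of that boils down to a small PostCSS config, roughly along these lines (a sketch; the options shown are common defaults, not necessarily what we ship):

javascript
// postcss.config.js (sketch): autoprefix, then minify with cssnano.
module.exports = {
  plugins: [
    require("autoprefixer"),
    require("cssnano")({ preset: "default" }),
  ],
};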

That got us from 28KB down to 24KB. Progress, but not enough.

Tiered CSS loading

We have virtually no JavaScript, so the next most obvious target was our CSS. Recall how the browser can load resources as either render-blocking or deferred. We can split our CSS into three tiers:

  • Critical (inlined): Layout, typography, colors. The minimum needed to render above-the-fold content without a flash of unstyled HTML.
  • Enhancement (high priority external CSS): Interactive components, hover states, transitions. Loaded as soon as possible, but not render-blocking.
  • Lazy (low priority external CSS): Search modal, syntax highlighting, hidden content. Only loaded when needed.

html
<!-- Critical: inlined in <head> -->
<style data-critical>
  /* layout, typography, colors */
</style>

<!-- Enhanced: external, high priority -->
<link rel="stylesheet" href="/css/enhanced.css" fetchpriority="high" />

<!-- Lazy: preload, swap on load -->
<link
  rel="preload"
  as="style"
  href="/css/lazy.css"
  onload="this.rel='stylesheet'"
  fetchpriority="low"
/>

The critical CSS varies by page type, so we generate three variants at build time and inline the right one on each page.
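
As a sketch of that build step (the page types and file names here are illustrative assumptions, not our actual build script):

javascript
// build.js (sketch): inline the right critical CSS variant per page type.
const fs = require("fs");

const criticalCSS = {
  home: fs.readFileSync("dist/css/critical-home.css", "utf8"),
  docs: fs.readFileSync("dist/css/critical-docs.css", "utf8"),
  post: fs.readFileSync("dist/css/critical-post.css", "utf8"),
};

function inlineCritical(html, pageType) {
  return html.replace(
    "</head>",
    `<style data-critical>${criticalCSS[pageType]}</style></head>`
  );
}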

At this point, each page was about 11KB of HTML (with critical CSS inlined) plus a <link> tag pointing to 10KB of render-blocking enhanced CSS. Still over budget.

Streaming CSS injection

There are only a few options available to us:

  • Remove styles entirely
  • Defer loading of all enhanced CSS to no longer block render
  • Inline some of the enhanced styles (up to 14.6KB) and defer the rest

Removing styles was very unlikely to shave off the 6.5KB we needed. We could get under 14.6KB by deferring the enhanced CSS entirely, but that caused a large FOUC. Inlining more critical styles seemed promising, but in the end that made the HTML payload a few KB too large.

We needed another option.

I knew robots didn’t care about this styling; only humans needed the full set of stylesheets. Deferring the entire enhanced stylesheet keeps our pages under 14.6KB and makes the robots happy.

Thinking back to my basics: browsers only need styles to be present at render time, not necessarily in the HTML payload itself. Is there a way to modify our HTML payload after the browser receives it but before rendering begins?

Yes, there is! Service Workers can intercept network responses and transform them before the browser’s render work begins.

We can send minimal HTML from the server and append the CSS locally via a service worker. The HTML stays small over the network but arrives fully-styled at the browser.

The simplest approach waits for the full HTML, then modifies it:

javascript
async function injectCSS(response) {
  const html = await response.text();
  const modified = html.replace(
    "</head>",
    `<style>${cachedCSS}</style></head>`
  );
  return new Response(modified);
}

This works, but the browser waits for the full download, then waits again while we process it. We’ve actually added latency.

TransformStream lets us process data as it flows through, byte by byte. This way, we can inject our stylesheet while the HTML downloads. The browser receives the fully styled document before its render work, removing the latency we added above:

javascript
function createInjectionStream(cssToInject) {
  let headClosed = false;

  return new TransformStream({
    transform(chunk, controller) {
      const text = new TextDecoder().decode(chunk);

      if (!headClosed && text.includes("</head>")) {
        headClosed = true;
        const modified = text.replace(
          "</head>",
          `<style>${cssToInject}</style></head>`
        );
        controller.enqueue(new TextEncoder().encode(modified));
      } else {
        controller.enqueue(chunk);
      }
    },
  });
}

But there’s a bug! HTML arrives in chunks, and </head> might be split across two of them: </he in one chunk, ad> in the next. We’d miss it entirely.

We need to buffer until we find the closing tag:

javascript
function createInjectionStream(cssToInject) {
  let headClosed = false;
  let buffer = "";
  // One decoder with stream: true also handles multi-byte characters
  // that are split across chunk boundaries.
  const decoder = new TextDecoder();

  return new TransformStream({
    transform(chunk, controller) {
      if (headClosed) {
        // Head already handled: pass the rest through untouched.
        controller.enqueue(chunk);
        return;
      }

      buffer += decoder.decode(chunk, { stream: true });

      if (buffer.includes("</head>")) {
        headClosed = true;
        const modified = buffer.replace(
          "</head>",
          `<style>${cssToInject}</style></head>`
        );
        controller.enqueue(new TextEncoder().encode(modified));
        buffer = "";
      }
    },
    flush(controller) {
      // If </head> never appears, emit the buffered HTML instead of dropping it.
      if (buffer) controller.enqueue(new TextEncoder().encode(buffer));
    },
  });
}

The Service Worker loads CSS into memory on activation, then pipes every HTML response through this transform:

javascript
async function injectCSS(response) {
  return new Response(
    response.body.pipeThrough(createInjectionStream(cachedCSS)),
    { status: response.status, headers: response.headers }
  );
}

One more gotcha: we’re changing the response size, so the original content-length header is now wrong. Browsers will truncate the response if we leave it:

javascript
async function injectCSS(response) {
  const headers = new Headers(response.headers);
  headers.delete("content-length");

  return new Response(
    response.body.pipeThrough(createInjectionStream(cachedCSS)),
    { status: response.status, headers }
  );
}
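
For completeness, here's roughly how this wires into the Service Worker: load the enhanced stylesheet into memory on activation, then run every navigation response through injectCSS. This is a sketch of the shape, not our exact worker:

javascript
// sw.js (sketch)
let cachedCSS = "";

self.addEventListener("activate", (event) => {
  // Assumption: fetch the enhanced stylesheet once and hold it in memory.
  event.waitUntil(
    fetch("/css/enhanced.css")
      .then((res) => res.text())
      .then((css) => { cachedCSS = css; })
  );
});

self.addEventListener("fetch", (event) => {
  // Only touch page navigations; let other requests pass through.
  if (event.request.mode !== "navigate") return;
  event.respondWith(fetch(event.request).then(injectCSS));
});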

The tradeoff for this approach is complexity. Service Workers require careful lifecycle management, and debugging cache invalidation is NOT fun.

Caching for humans

So far we’ve focused primarily on first visits because that’s the only type of visit robots make. But what about returning humans?

Since we already have a Service Worker, we can use it to precache pages in the background.

The precaching happens in stages:

  1. When the Service Worker activates, it immediately caches the critical assets: CSS, JavaScript, and the manifest.
  2. On the first navigation, it caches context-aware pages based on where you landed. For example, if a visitor enters through the homepage, it precaches /updates/ and recent posts. If they start on a documentation page, it precaches siblings and the homepage.
  3. We also use requestIdleCallback to prefetch documentation pages in the background when the browser has nothing else to do.
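
A rough sketch of that idle-time prefetch, reusing the hypothetical prefetchPage helper that also appears in the hover handler below:

javascript
// Sketch: prefetch documentation pages whenever the browser is idle,
// resuming in a later idle period if we run out of time.
function prefetchDocsWhenIdle(urls) {
  requestIdleCallback((deadline) => {
    while (urls.length && deadline.timeRemaining() > 10) {
      prefetchPage(urls.shift());
    }
    if (urls.length) prefetchDocsWhenIdle(urls);
  });
}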

We also prefetch on hover:

javascript
function handleMouseEnter(e) {
  const link = e.target.closest("a");
  if (!isValidLink(link)) return;

  // Wait 65ms before prefetching, user might move mouse away
  prefetchTimeouts.set(
    link.href,
    setTimeout(() => {
      prefetchPage(link.href);
    }, 65)
  );
}

function handleMouseLeave(e) {
  const timeout = prefetchTimeouts.get(e.target.closest("a")?.href);
  if (timeout) clearTimeout(timeout);
}

Users hover over things constantly without clicking, so we only prefetch if they linger (65ms is a commonly used threshold). If they move away, we cancel. No wasted bandwidth.

For back/forward navigation, we rely on bfcache, which snapshots pages for instant restoration, preserving JavaScript context, scroll position, and form inputs. It’s powerful but fragile. It breaks on unload listeners, open connections, and IndexedDB transactions, so using it requires careful design.
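
In practice that mostly means avoiding unload handlers. A small sketch of the bfcache-friendly pattern (saveState and refreshStaleContent are hypothetical placeholders):

javascript
// Prefer pagehide/pageshow over unload so pages stay bfcache-eligible.
window.addEventListener("pagehide", () => {
  saveState(); // hypothetical: persist anything worth keeping
});

window.addEventListener("pageshow", (event) => {
  if (event.persisted) {
    // Restored from bfcache: refresh anything that may have gone stale.
    refreshStaleContent(); // hypothetical
  }
});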

Here’s how the pieces fit together:

Complete Navigation and Caching Flow

PRECACHING (Service Worker)
  Tier 0: SW activation (core assets) ~10 files
  Tier 1: First fetch (context-aware) ~5 pages
  Tier 2: After 1500ms (remaining) ~200 pages

PREFETCHING (Client-side)
  Hover: Links after 65ms
  Background: Docs via requestIdleCallback

Navigation Event
|
+-> Back/Forward -> bfcache: Instant restore
|
+-> Regular Click
   |
   +-> SW Cache ------> ~5-10ms
   |   CSS injection via TransformStream
   |
   +-> HTTP Cache ----> ~10-15ms
   |
   +-> CDN Edge ------> ~12-20ms
   |   24hr TTL, Brotli
   |
   +-> Origin (S3) ---> ~100-150ms

Typical: First visit ~12ms | Cached ~5ms

With all of this in place, cached navigation resolves in 5-10ms.

Results

There are many stellar webapps in the crypto space. We were curious how many of our peers were making similar optimizations, so we ran benchmarks with the Lighthouse CLI.

We compared ourselves against some of the major rollups: zkSync, Unichain, Base, Starknet, Arbitrum, Scroll, Optimism, Linea, Spire.

Testing conditions:

  • Lighthouse on mobile with 4G throttling
  • 3 runs per site
  • Cold loads
  • A “representative” page from each site’s docs (never truly apples-to-apples, but good enough for a ballpark)

Mobile vs Desktop LCP

Site          Mobile  Desktop
------------------------------
Signet        0.91s   0.25s
Scroll        2.59s   0.65s
Unichain      3.46s   0.70s
Linea         3.99s   0.93s
Base          4.00s   0.89s
Spire         5.50s   1.81s
Starknet      5.65s   2.11s
Arbitrum      6.24s   1.35s
zkSync        6.70s   2.10s
Optimism      9.70s   1.82s

Target: LCP <2.5s (Google's "Good")

Web Vitals (Mobile)

Site        LCP    FCP    TTFB   CLS    TBT
--------------------------------------------
Signet      0.91s  0.91s  13ms   0.000  0ms
Scroll      2.59s  2.01s  8ms    0.290  1ms
Unichain    3.46s  1.08s  9ms    0.000  1ms
Linea       3.99s  1.54s  8ms    0.010  520ms
Base        4.00s  1.99s  7ms    0.000  207ms
Spire       5.50s  1.75s  8ms    0.000  366ms
Starknet    5.65s  1.99s  8ms    0.080  267ms
Arbitrum    6.24s  2.03s  8ms    0.070  247ms
zkSync      6.70s  2.50s  12ms   0.000  212ms
Optimism    9.70s  1.99s  84ms   0.000  260ms

Targets: LCP<2.5s FCP<1.8s TTFB<800ms CLS<0.1 TBT<200ms

Bundle Analysis (Mobile)

Site      Total   HTML  CSS   JS     Img   Other  #
----------------------------------------------------
Signet    64KB    11KB  7KB   9KB    0B    37KB   7
Scroll    1.3MB   18KB  21KB  328KB  6KB   927KB  35
Unichain  2.4MB   46KB  11KB  320KB  0B    2MB    18
Arbitrum  2.5MB   7KB   31KB  956KB  2KB   1.5MB  24
zkSync    2.6MB   42KB  4KB   369KB  89KB  2.1MB  48
Spire     3.3MB   31KB  36KB  1.1MB  8KB   2.1MB  75
Base      3.4MB   59KB  58KB  1.3MB  27KB  2MB    88
Starknet  3.9MB   49KB  59KB  1.3MB  13KB  2.5MB  78
Optimism  4.2MB   68KB  58KB  1.2MB  15KB  2.9MB  76
Linea     11.8MB  8KB   34KB  3.6MB  5KB   8.2MB  30

Other = fonts, JSON, SVGs, manifests | # = file count

Accessibility (WCAG 2.1 Mobile)

Site      Score  Img  Contr  ARIA  Form  Name  Tot
--------------------------------------------------
Signet    100    -    -      -     -     -     -
zkSync    97     -    56     -     -     -     56
Unichain  100    -    -      -     -     -     -
Base      94     -    11     -     -     -     11
Starknet  100    -    -      -     -     -     -
Arbitrum  82     -    29     1     -     1     31
Scroll    75     1    29     -     -     10    40
Optimism  100    -    -      -     -     -     -
Linea     100    -    -      -     -     -     -
Spire     98     -    -      -     -     -     -

Img=alt | Contr=color | ARIA=attrs | Form=labels | Name=btns

Robots & Crawlers

Site      robots canon href  schema llms llms-f
-----------------------------------------------
Signet    Y      Y     Y     3      Y    Y
zkSync    Y      Y     Y     1      -    -
Unichain  Y      -     Y     0      Y    Y
Base      Y      Y     Y     0      Y    Y
Starknet  Y      Y     Y     0      Y    Y
Arbitrum  -      Y     Y     0      -    -
Scroll    -      Y     Y     0      -    -
Optimism  Y      Y     Y     0      Y    Y
Linea     -      Y     Y     2      -    -
Spire     Y      Y     Y     0      Y    Y

Y=present | -=missing | schema=JSON-LD count

Architecture & Technology

Site      Framework          SSR  Semantic  JS-Req
--------------------------------------------------
Signet    None               Y    76%       No
zkSync    Nuxt               Y    16%       No
Unichain  Next.js            Y    40%       No
Base      Next.js+Mintlify   Y    20%       Yes
Starknet  Next.js+Mintlify   Y    10%       Yes
Arbitrum  Docusaurus         Y    57%       No
Scroll    Astro              Y    64%       No
Optimism  Next.js+Mintlify   Y    30%       Yes
Linea     Docusaurus         Y    43%       No
Spire     Next.js+Mintlify   Y    6%        Yes

Semantic = % semantic HTML | JS-Req = JS required

Every other site uses a server-side rendering framework. Astro in particular had standout performance in these tests, and for most teams a framework like that is probably the right choice.

Interestingly, four of the ten sites (Base, Starknet, Optimism, Spire) require JavaScript to fully render their content. LLM crawlers can’t execute JavaScript, so when they fetch that documentation directly, the content is invisible to them.

llms.txt doesn’t have official support from major LLM providers yet, but adoption is growing. Anthropic maintains llms.txt files for their own docs. Six of the ten sites we benchmarked have llms.txt files too.

Did our improvements work?

We’re very happy with where Signet landed. This work was probably overkill, but it was a fun experiment that is already bearing fruit – our pages are being crawled and visited by LLM bots almost 10 times more often and our average position in Google page rankings has improved from 26.9 to 2.3!

We made a bet that in a crowded, hyper-competitive market, being discoverable matters more than being beautiful.
