Gyring

Gyre (verb): to follow a site's link graph outward from a seed URL, capturing each page's content as you go. Named after the spiraling outward motion — you start at one point and wind through the connected web around it.

In the Gyrence pipeline, gyring is the multi-page traversal primitive, distinct from fetching a single URL or mapping a site's link surface.

When to gyre

Use a gyre when you need the content of multiple connected pages in one call, without orchestrating the queue and the fetches yourself.

Reach for gyre when:

  • You have a seed URL and want the surrounding cluster of pages (e.g. a documentation root, a product category page, a press-release index).
  • You want each visited page returned with its title, description, and markdown — not just a list of URLs.
  • The site doesn't expose a clean sitemap, or its sitemap is too coarse for what you need.

Reach for a different primitive when:

You want…                                   Use
One specific page's content                 Fetch
The full URL surface of a site, no content  Map
Pages matching a query across the web       Search
Structured JSON pulled from page content    Extract

How a gyre walks

A gyre is a breadth-first walk bounded by a page budget:

  1. The seed URL is fetched.
  2. Its outbound links are enqueued (filtered by sameDomain if set).
  3. The next URL in the queue is fetched; its links are enqueued.
  4. Step 3 repeats until maxPages successful fetches have happened, or the queue is exhausted.
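
The walk above can be sketched in a few lines. This is a minimal illustration, not the Gyrence implementation: gyre_walk and fetch_links are hypothetical names, and the fetcher is replaced by a lookup in an in-memory link graph so the example runs offline.

```python
from collections import deque
from typing import Callable, Iterable, Optional
from urllib.parse import urlsplit


def gyre_walk(
    seed: str,
    fetch_links: Callable[[str], Optional[Iterable[str]]],
    max_pages: int,
    same_domain: bool = True,
) -> tuple[list[str], list[str]]:
    """Breadth-first walk from seed; returns (pages, errors) as URL lists.

    fetch_links is a hypothetical stand-in for the real fetcher: it returns
    the page's outbound links, or None when the fetch fails.
    """
    seed_host = urlsplit(seed).hostname
    queue = deque([seed])
    seen = {seed}  # raw URL strings, compared verbatim
    pages: list[str] = []
    errors: list[str] = []

    while queue and len(pages) < max_pages:
        url = queue.popleft()
        links = fetch_links(url)
        if links is None:
            errors.append(url)  # failed fetches do not use up the budget
            continue
        pages.append(url)
        for link in links:
            if same_domain and urlsplit(link).hostname != seed_host:
                continue  # off-host link, filtered before enqueueing
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages, errors


# Toy link graph standing in for a real site (an assumption for the demo).
SITE = {
    "https://example.com/": [
        "https://example.com/a",
        "https://example.com/b",
        "https://blog.example.com/x",
    ],
    "https://example.com/a": ["https://example.com/c"],
    "https://example.com/b": ["https://example.com/"],
    "https://example.com/c": [],
}

pages, errors = gyre_walk("https://example.com/", SITE.get, max_pages=3)
print(pages)   # the seed, then /a and /b; /c is queued but the budget is spent
print(errors)  # []
```

Because the queue is first-in, first-out, pages one link from the seed are always fetched before pages two links away, which is why a small budget tends to stay shallow.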

Each page goes through the same two-tier fetcher as the /fetch endpoint: plain HTTP first, escalating to the headless-browser worker when the page needs JS, returns a soft block, or rate-limits.
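
The escalation decision might look roughly like the sketch below. Everything here is an assumption made for illustration: the FetchResult fields, the use of HTTP 429 as the rate-limit signal, and the injected http_fetch and browser_fetch callables are hypothetical, not the service's internals.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FetchResult:
    status: int
    body: str
    needs_js: bool = False    # hypothetical signal: page is an empty JS shell
    soft_block: bool = False  # hypothetical signal: block page served as a 200


def should_escalate(result: FetchResult) -> bool:
    """True when the plain-HTTP result suggests retrying in a browser."""
    return result.needs_js or result.soft_block or result.status == 429


def two_tier_fetch(
    url: str,
    http_fetch: Callable[[str], FetchResult],
    browser_fetch: Callable[[str], FetchResult],
) -> tuple[str, FetchResult]:
    """Try plain HTTP first; fall back to the browser tier when needed."""
    result = http_fetch(url)
    if should_escalate(result):
        return "browser", browser_fetch(url)
    return "http", result


# Stub fetchers so the example runs without a network.
shell = FetchResult(200, "", needs_js=True)
rendered = FetchResult(200, "<p>rendered</p>")
tier, result = two_tier_fetch("https://example.com/", lambda u: shell, lambda u: rendered)
print(tier)  # browser
```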

Budget and billing

maxPages caps successes, not attempts. A page that 404s or trips the block-page detector lands in errors[] and does not count against the budget.

Billing is per-page: 1 credit for HTTP-tier pages, 3 for browser-tier. The total cost is the sum across the response's pages[], with a minimum of 1.
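
As a worked example of the billing rule, the sketch below sums credits over a list of per-page tiers. It assumes each entry in pages[] can be labeled "http" or "browser"; that label and the gyre_cost helper are hypothetical, since the response schema is not shown here.

```python
# Credit prices per tier, as stated above.
CREDITS_PER_TIER = {"http": 1, "browser": 3}


def gyre_cost(page_tiers: list[str]) -> int:
    """Sum per-page credits over the returned pages, floored at 1 credit."""
    return max(1, sum(CREDITS_PER_TIER[tier] for tier in page_tiers))


print(gyre_cost(["http", "http", "browser"]))  # 1 + 1 + 3 = 5
print(gyre_cost([]))                           # minimum charge: 1
```

Since failed pages land in errors[] rather than pages[], they do not appear in the sum.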

Scope: sameDomain

By default a gyre stays on the seed's exact hostname. blog.example.com is not treated as same-domain as example.com — subdomains are out of scope. Set sameDomain: false to follow links anywhere on the public web (still subject to SSRF rules).
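
A minimal sketch of the exact-hostname rule, using Python's standard urllib.parse. The in_scope helper is a hypothetical name, and the SSRF rules are not modeled.

```python
from urllib.parse import urlsplit


def in_scope(seed: str, link: str, same_domain: bool = True) -> bool:
    """Keep a link only if it shares the seed's exact hostname."""
    if not same_domain:
        return True  # any host is allowed (SSRF checks not modeled here)
    return urlsplit(link).hostname == urlsplit(seed).hostname


seed = "https://example.com/docs"
print(in_scope(seed, "https://example.com/about"))        # True
print(in_scope(seed, "https://blog.example.com/post"))    # False: subdomain
print(in_scope(seed, "https://blog.example.com/post", same_domain=False))  # True
```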

If your goal is "find every URL on this site, including subdomains," gyre is the wrong tool — use Map, which prefers sitemaps and is built for surface discovery rather than content capture.

Common patterns

  • Mirror a small docs site. Seed the docs root, set maxPages: 50, sameDomain: true. You get a markdown corpus you can feed straight to an LLM or your own RAG index.
  • Capture a press-release cluster. Seed a /newsroom or /press page; the gyre fans out to the linked releases.
  • Snapshot a product category. Seed a category index; capture the linked product detail pages in one call.
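
For the docs-mirror pattern, a request body could be assembled as in the sketch below. maxPages and sameDomain are the parameters described on this page; the url field name and the build_gyre_request helper are assumptions for illustration, so check the API reference for the actual request shape.

```python
import json


def build_gyre_request(seed_url: str, max_pages: int = 50, same_domain: bool = True) -> str:
    """Serialize a gyre request body, rejecting budgets outside 1-50."""
    if not 1 <= max_pages <= 50:
        raise ValueError("maxPages must be between 1 and 50")
    return json.dumps(
        {
            "url": seed_url,  # hypothetical field name for the seed URL
            "maxPages": max_pages,
            "sameDomain": same_domain,
        }
    )


body = build_gyre_request("https://docs.example.com/", max_pages=50)
print(body)
```

Validating the budget client-side avoids spending a round trip on a request the service would reject.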

Limits

  • maxPages range: 1–50 per request.
  • 25-second hard deadline on the whole request.
  • No maxDepth parameter today — depth is bounded indirectly by maxPages and BFS order.
  • Visited URLs are tracked verbatim. ?ref=x and ?ref=y count as distinct pages.

Try it

Run a gyre from the console at /app/gyre — pick a seed, set a budget, watch pages stream in.