Gyre
Walk a site from a seed URL within a page budget, following outbound links breadth-first. Returns parsed markdown for every page successfully visited, plus per-URL errors. Same-domain by default. The link-traversal primitive in the Gyrence pipeline — Search finds, Fetch reads one, Gyre walks many.
Use this when
- You need a bounded slice of a site (1–50 pages) without standing up a full crawler.
- You're feeding RAG and want a seed page plus everything it links to, in one call.
- You want partial-success semantics: per-URL failures land in `errors[]` and the walk continues.
- You're prototyping coverage before committing to a scheduled crawl.
| Method | POST |
| Path | /api/v1/gyre |
| Auth | Bearer |
| Credits | 1 per HTTP page, 3 per browser page (min 1) |
Request
| Parameter | Type | Description |
|---|---|---|
| `url` (required) | string | Absolute http(s) seed URL. Private, loopback, and link-local hosts are rejected (SSRF). |
| `maxPages` | number (default: 20) | Hard cap on pages successfully fetched. Range 1–50. The walk stops as soon as this many pages have been fetched; failed URLs do not count toward the cap. |
| `sameDomain` | boolean (default: true) | When true, only links whose hostname exactly matches the seed's hostname are followed. Subdomains are NOT considered same-domain. |
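The parameter rules above can be checked client-side before spending a request. The following is a minimal sketch, not the service's actual validation: `validate_gyre_request` is a hypothetical helper, the integer requirement on `maxPages` is an assumption (the table only says "number"), and the host check covers only IP literals and `localhost`, whereas the real SSRF guard may also resolve DNS names.

```python
import ipaddress
from urllib.parse import urlsplit


def validate_gyre_request(url: str, max_pages: int = 20) -> list[str]:
    """Return a list of problems; an empty list means the body looks valid."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        problems.append("url must be an absolute http(s) URL")
    else:
        host = parts.hostname
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            ip = None  # a DNS name, not an IP literal
        blocked_ip = ip is not None and (
            ip.is_private or ip.is_loopback or ip.is_link_local
        )
        if host == "localhost" or blocked_ip:
            problems.append("host is private, loopback, or link-local")
    # Assumption: maxPages is a whole number; bool is excluded explicitly
    # because Python treats True/False as ints.
    if (
        isinstance(max_pages, bool)
        or not isinstance(max_pages, int)
        or not 1 <= max_pages <= 50
    ):
        problems.append("maxPages must be an integer from 1 to 50")
    return problems
```

A request that passes this check can still be rejected by the API with `bad_request` or `forbidden_url`; the sketch only catches the obvious cases early.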
Example body
{
"url": "https://example.com",
"maxPages": 10,
"sameDomain": true
}
Response
| Field | Type | Description |
|---|---|---|
| `startUrl` | string | Echo of the seed URL. |
| `totalPages` | number | Number of pages successfully fetched (length of `pages[]`). |
| `pages[]` | object[] | One entry per successfully fetched page (see fields below). |
| `pages[].url` | string | Absolute URL of the visited page. |
| `pages[].title` | string | Contents of `<title>`. Empty when absent. |
| `pages[].description` | string | Meta description or og:description. Empty when absent. |
| `pages[].markdown` | string | Main-content markdown (same extraction pipeline as /fetch). |
| `pages[].statusCode` | number | Origin HTTP status. |
| `pages[].via` | `"http"` or `"browser"` | Tier that produced this page. Drives per-page credit cost. |
| `errors[]` | object[] | Per-URL errors encountered during the walk. `{ url, error }`. The walk continues past individual failures. |
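Because a walk can succeed partially, client code usually wants the fetched pages and the failed URLs as separate structures. The sketch below shows one way to do that in Python. The `Page` dataclass and `split_result` helper are hypothetical names, and the function assumes it receives the object that holds `pages` and `errors` (the `data` object in the documented response envelope).

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    title: str
    description: str
    markdown: str
    status_code: int
    via: str  # "http" or "browser"


def split_result(data: dict) -> tuple[list[Page], dict[str, str]]:
    """Separate successfully fetched pages from per-URL failures."""
    pages = [
        Page(
            url=p["url"],
            title=p["title"],
            description=p["description"],
            markdown=p["markdown"],
            status_code=p["statusCode"],
            via=p["via"],
        )
        for p in data["pages"]
    ]
    # errors[] entries have the documented shape { url, error }.
    failed = {e["url"]: e["error"] for e in data["errors"]}
    return pages, failed
```

Keeping failures in a dictionary keyed by URL makes it easy to re-submit only the URLs that did not come back.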
Example response
{
"ok": true,
"data": {
"startUrl": "https://example.com",
"totalPages": 1,
"pages": [
{
"url": "https://example.com",
"title": "Example",
"description": "Home page",
"markdown": "# Example\n\n...",
"statusCode": 200,
"via": "http"
}
],
"errors": []
}
}
Example
curl -X POST https://www.gyrence.com/api/v1/gyre \
-H "Authorization: Bearer $GYRENCE_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com","maxPages":10}'
Errors
| Code | HTTP | Meaning |
|---|---|---|
| bad_request | 400 | `url` missing or invalid, or `maxPages` out of range. |
| unauthorized | 401 | Authorization header missing or malformed, or the API key has been revoked. |
| credits_exhausted | 402 | Workspace balance below request cost. |
| forbidden_url | 403 | SSRF guard rejected the seed (private, loopback, or link-local host). |
| not_found | 404 | The seed (and every queued URL) returned 404, or no pages were successfully fetched. |
| timeout | 408 | Request exceeded the 25-second hard deadline. |
| rate_limited | 429 | Per-workspace rate limit. |
| upstream_error | 502 | First page returned 5xx. |
| unavailable | 503 | Block-page detector tripped on the seed, or any other unmapped error. |
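A client can map the HTTP status back to the code in the table and decide whether a retry is worthwhile. The sketch below is illustrative only: `describe_failure` is a hypothetical helper, and the set of retryable statuses is an assumption of this example, not something the API documents. It works from the HTTP status alone because the shape of the error body is not specified here.

```python
# Status-to-code mapping taken from the error table above.
ERROR_CODES = {
    400: "bad_request",
    401: "unauthorized",
    402: "credits_exhausted",
    403: "forbidden_url",
    404: "not_found",
    408: "timeout",
    429: "rate_limited",
    502: "upstream_error",
    503: "unavailable",
}

# Assumption: transient conditions are worth retrying with backoff;
# the rest need a changed request, new credentials, or more credits.
RETRYABLE_STATUSES = {408, 429, 502, 503}


def describe_failure(status: int) -> tuple[str, bool]:
    """Return (error code, whether a retry is plausibly useful)."""
    code = ERROR_CODES.get(status, "unknown")
    return code, status in RETRYABLE_STATUSES
```

For example, `describe_failure(429)` yields `("rate_limited", True)`, while `describe_failure(403)` yields `("forbidden_url", False)` because retrying the same private-host seed cannot succeed.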
Credits
Each fetched page is billed individually: 1 credit per HTTP-tier page, 3 credits per browser-tier page. The total is the sum across pages[], with a minimum of 1 even if nothing was billable. Errored URLs are not charged. Inspect each pages[].via to attribute cost.
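The billing rule above reduces to a small sum over `pages[].via`. The following sketch computes the cost of a finished walk and the worst-case cost of a planned one; `walk_credits` and `max_walk_credits` are hypothetical helper names, not part of the API.

```python
CREDITS_PER_TIER = {"http": 1, "browser": 3}


def walk_credits(pages: list[dict]) -> int:
    """Credits billed for a walk: per-page tier cost, minimum 1."""
    total = sum(CREDITS_PER_TIER[page["via"]] for page in pages)
    return max(1, total)


def max_walk_credits(max_pages: int) -> int:
    """Upper bound: every page escalates to the browser tier."""
    return max_pages * CREDITS_PER_TIER["browser"]
```

A walk that returns two HTTP-tier pages and one browser-tier page costs 1 + 1 + 3 = 5 credits, and a request with `maxPages` set to 10 can cost at most 30.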
Coverage & known limits
- `sameDomain` is exact-host. `https://blog.example.com` is not treated as the same domain as `https://example.com`. Use Map when you need subdomain coverage.
- No `maxDepth` parameter. Depth is bounded indirectly by `maxPages` and breadth-first queue order.
- Query-string variants count as distinct pages. `?utm_source=…` URLs each consume budget.
- Per-page block detection lands in `errors[]` (not the top-level envelope). The walk continues past blocked URLs.
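Since query-string variants each consume budget, it can help to check a finished walk for pages that differ only by query string. The sketch below groups returned URLs by scheme, host, and path; `query_variant_groups` is a hypothetical client-side helper and does not change what the walk itself visits.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def query_variant_groups(urls: list[str]) -> dict[str, list[str]]:
    """Group URLs that share scheme, host, and path but differ in query."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        key = f"{parts.scheme}://{parts.netloc}{parts.path}"
        groups[key].append(url)
    # Only groups with more than one member indicate duplicated budget.
    return {key: members for key, members in groups.items() if len(members) > 1}
```

If the result is non-empty, a lower `maxPages` or a more specific seed may give the same coverage for fewer credits.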
Notes
- Walk semantics. Breadth-first from the seed. Each page's outbound `links[]` (as extracted by the underlying fetcher) is enqueued. Visited URLs are tracked verbatim.
- Page budget vs queue. `maxPages` caps successful fetches, not URLs considered. The walk stops as soon as the budget is reached, even if the queue still has unvisited URLs.
- Per-page fetcher. Each page goes through the same two-tier pipeline as `/fetch` (HTTP → browser escalation, SEC.gov fast path, block-page detection). See the Fetch docs for tier mechanics.
- Failure tolerance. Per-URL failures are pushed to `errors[]` and the walk continues. The request only fails when nothing was fetched.
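The walk semantics in these notes can be modeled in a few lines. The sketch below is a simplified illustration, not Gyre's implementation: `walk` and the injected `fetch` callable are hypothetical, `fetch` is assumed to return a dict with `url` and `links` keys or raise on failure, and tiering, block detection, and the 25-second deadline are left out.

```python
from collections import deque
from urllib.parse import urlsplit


def walk(seed, fetch, max_pages=20, same_domain=True):
    """Breadth-first walk that stops after max_pages successful fetches."""
    seed_host = urlsplit(seed).hostname
    queue = deque([seed])
    seen = {seed}  # URLs are tracked verbatim, so query variants differ
    pages, errors = [], []
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            page = fetch(url)
        except Exception as exc:
            # A failed URL is recorded and does not count toward the budget.
            errors.append({"url": url, "error": str(exc)})
            continue
        pages.append(page)
        for link in page["links"]:
            if link in seen:
                continue
            # Exact-host comparison: subdomains are filtered out.
            if same_domain and urlsplit(link).hostname != seed_host:
                continue
            seen.add(link)
            queue.append(link)
    return {
        "startUrl": seed,
        "totalPages": len(pages),
        "pages": pages,
        "errors": errors,
    }
```

Passing a stub `fetch` backed by an in-memory dictionary makes the behavior easy to observe: failures land in `errors` without stopping the loop, and the loop exits once the budget is met even if the queue is not empty.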
Try it
Run Gyre from the console at /app/gyre — pick a seed, set a budget, and watch pages stream in.
