Pro feature

Cache TTL: A configurable cache, served from the edge

Don't pay to re-render the same screenshot or PDF twice. Every cache HIT is free — across every Microlink output. Use ttl to set the freshness window and staleTtl to keep callers instant during the background refresh.

Same cache, every output: metadata · screenshot · pdf · html · markdown · insights · data extraction

First request to a URL (MISS · billed once):
your code (any workflow) → Microlink Pro (fetch + render) → origin (target site) → response, cache populated.

Every request after, while ttl is valid (HIT · free):
your code (any workflow) → Cloudflare edge (nearest of 240+ nodes) → cached response in tens of milliseconds.

With staleTtl, even revalidations happen in the background.

Cache hits are free

One paid MISS warms the cache; every HIT until ttl expires is free — including expensive screenshot and PDF renders.

Two caches behind it: a unified cache (x-cache-status) holds the shared copy per URL; a Cloudflare edge cache (cf-cache-status) serves it from the nearest of 240+ edge nodes. Cold regions auto-populate from the unified layer.

ttl sets the cache window. staleTtl enables stale-while-revalidate — callers always hit cache while refreshes happen in the background. No Redis, no cron jobs.

Two parameters → every cache trade-off

Pay once. Serve millions of cache hits for free.

A request cache, an invalidation policy, and a background revalidator are the three pieces every team rebuilds on top of an external API. Pro folds them into the response layer — and cache hits never count against your plan quota, so the better your cache strategy, the less you pay per served request. ttl tunes lifetime, staleTtl covers cold starts, and the response headers expose enough observability to keep your cache hit rate honest.

01 · ttl
Tune cache lifetime per request
The ttl parameter sets the maximum time a response is considered valid before expiring — from 1 minute to 31 days. Pass it as a number in milliseconds (86400000) or as a humanized string ('1d', '90s', '1hour'). The aliases 'min' and 'max' snap to the boundaries.
Preset windows: 'min' (1m) · '1h' · '6h' · '1d' · '7d' · '14d' · 'max' (31d)
Longer TTL = lower bill. Each paid MISS buys you free hits for the entire cache window. Short TTLs for dashboards and feeds, longer TTLs for marketing pages and docs, and 'max' for content that essentially never changes. The effective lifetime always echoes back as x-cache-ttl.
02 · staleTtl
Cold starts, eliminated
staleTtl opts into stale-while-revalidate: when a cached entry passes its stale window, the next request still gets served from cache instantly while a background refresh regenerates a fresh copy. Your callers never wait on the origin again.
Example values: staleTtl: 0 · staleTtl: '12h' · staleTtl: '1d' · staleTtl: false
The recommended production pattern is { ttl: '1d', staleTtl: 0 } — every caller is served from cache (free), and one background refresh per cache window keeps the entry current (the only billed request). The staleTtl value cannot exceed ttl.
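The stale-while-revalidate flow above can be sketched as a tiny in-process cache. This is an illustration of the pattern, not Microlink's implementation — createSwrCache, renderFn, and the Map-based store are names invented for the sketch:

```javascript
// Minimal stale-while-revalidate sketch: serve cached entries instantly,
// refresh them in the background once they pass the stale threshold.
function createSwrCache (renderFn, { ttl, staleTtl }) {
  const store = new Map() // url → { value, builtAt }

  return async function get (url, now = Date.now()) {
    const entry = store.get(url)
    if (!entry || now - entry.builtAt > ttl) {
      // Cold or fully expired: the caller pays the render (a MISS).
      const value = await renderFn(url)
      store.set(url, { value, builtAt: now })
      return { value, status: 'MISS' }
    }
    if (now - entry.builtAt > staleTtl) {
      // Stale but still valid: serve instantly, refresh in the background.
      renderFn(url).then(value => store.set(url, { value, builtAt: Date.now() }))
    }
    return { value: entry.value, status: 'HIT' }
  }
}
```

With staleTtl: 0, every request after the first is a HIT, and each HIT on an aged entry kicks off a background refresh while the caller is already answered.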
03 · Headers
Observability, included
Every response carries the headers you need to track cache behavior and tune it over time. No probing, no guesswork — x-cache-status tells you whether the response was a HIT, MISS, or BYPASS; the rest tell you why.
x-cache-status · x-cache-ttl · cf-cache-status · x-response-time
Combine x-cache-status with x-response-time in your APM and you get a real-time read on cache hit rate and p95 latency — enough to know when a TTL needs to grow, shrink, or pivot to staleTtl.
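As a sketch of that APM idea (the log shape is an assumption; only the header values come from this page), a hit-rate tally over collected x-cache-status values could look like:

```javascript
// Tally logged x-cache-status values into a hit rate.
// HIT responses are free; MISS and BYPASS are billed.
function cacheStats (statuses) {
  const counts = { HIT: 0, MISS: 0, BYPASS: 0 }
  for (const s of statuses) counts[s] = (counts[s] || 0) + 1
  const total = statuses.length
  return {
    ...counts,
    hitRate: total === 0 ? 0 : counts.HIT / total,
    billed: counts.MISS + counts.BYPASS
  }
}
```

A falling hitRate is the signal to grow the ttl or switch the route to staleTtl.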
Code

Recommended: one paid request per day, the rest are free

Keep responses valid for 24 hours and serve every caller from cache while a background refresh keeps them current. The result: a single billed MISS per cache window, every other request a free HIT. Works the same on metadata, screenshots, PDFs, HTML, and markdown.

index.js
import mql from '@microlink/mql'

const { data } = await mql('https://example.com', {
  apiKey: process.env.MICROLINK_API_KEY,
  ttl: '1d',
  staleTtl: 0
})
Bypass the cache

Need a guaranteed fresh response?

Pass force: true to skip the cache layer entirely and force-regenerate a new copy. The response header x-cache-status will read BYPASS, and a fresh entry replaces the previous one. Use it for cache invalidation events — not on every request.

index.js
import mql from '@microlink/mql'

const { data } = await mql('https://example.com', {
  apiKey: process.env.MICROLINK_API_KEY,
  force: true
})
Verify

How to confirm cache behavior

x-cache-status is the source of truth — and the difference between a free request and a billed one. HIT means served from the unified cache (not counted toward your plan), MISS means a fresh build (billed), and BYPASS means the cache was skipped on purpose (billed). The accompanying cf-cache-status tells you whether Cloudflare's edge served it from a node close to the caller.

HTTP/2 200
x-pricing-plan: pro      → Pro plan active
x-cache-status: HIT      → served from cache · no quota used
x-cache-ttl: 86400000    → effective ttl in ms (= 1d)
cf-cache-status: HIT     → served from nearest edge node
x-response-time: 23ms    → cache hit ⇒ tens of ms
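To assert on headers like these in a test or log them to an APM, a small helper can turn a raw header block into a lookup object (a sketch; parseHeaders is a hypothetical helper, not part of mql):

```javascript
// Parse a raw HTTP header block into a lowercase-keyed lookup object
// so cache headers can be asserted or logged.
function parseHeaders (raw) {
  const headers = {}
  for (const line of raw.trim().split('\n')) {
    const i = line.indexOf(':')
    if (i === -1) continue // skip the status line, e.g. "HTTP/2 200"
    headers[line.slice(0, i).trim().toLowerCase()] = line.slice(i + 1).trim()
  }
  return headers
}
```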

Pay for misses. Hits are on us.

Pick the volume that matches your traffic. Cache TTL tuning, stale-while-revalidate, and the cache layer (unified cache + Cloudflare edge) are included on every Pro plan — and every cache hit served from those layers stays free, no matter how many requests it absorbs.

Does caching apply to screenshots and PDFs too?

Yes — the cache layer covers every Microlink output equally: metadata, HTML, markdown, screenshots, PDFs, insights, and data extraction. There is no separate cache for media.
A 4K full-page screenshot or a 50-page PDF behaves the same as a metadata response: the first request renders and caches the artifact (x-cache-status: MISS, billed once); every subsequent caller within the ttl window gets it back as a HIT for free, served from the nearest Cloudflare edge node. Because rendered outputs are the most expensive ones to produce, that is also where caching saves you the most.

Do cached responses count against my plan quota?

No. Any response served from cache (x-cache-status: HIT) does not count toward your plan quota — it is served from Cloudflare's edge in milliseconds and billed at zero. Only the first request that warms the cache (x-cache-status: MISS) and explicit cache bypasses (x-cache-status: BYPASS, e.g. when force: true is set) count as billed requests.
The longer your ttl, the more free hits each paid miss generates. See the edge-cdn announcement and the cache documentation for the full rationale.
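That quota arithmetic can be made concrete with a rough sketch (the traffic numbers are hypothetical; this models a single cached URL under the billing rule above):

```javascript
// Estimate billed requests per day for one URL: one MISS warms each
// cache window; everything else inside the window is a free HIT.
const DAY_MS = 24 * 60 * 60 * 1000

function billedPerDay (requestsPerDay, ttlMs) {
  const windows = Math.ceil(DAY_MS / ttlMs) // cache windows per day
  const billed = Math.min(requestsPerDay, windows)
  return { billed, free: requestsPerDay - billed }
}
```

With ttl: '1d', 100,000 daily requests to the same URL reduce to a single billed MISS; with ttl: '1h' they reduce to 24.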

What is the difference between ttl and staleTtl?

ttl sets how long a cached response is considered valid — between 1 minute and 31 days. staleTtl opts into stale-while-revalidate: when a cached entry crosses the staleTtl threshold, the next request still serves the cached copy instantly while a background refresh regenerates a fresh one.
The staleTtl value cannot exceed ttl. The recommended production setup is ttl set to your freshness budget, staleTtl: 0 — every request returns instantly while background refreshes keep the cache current.

What values can I pass to ttl?

A number in milliseconds (86400000) or a humanized string. Supported units: s, m, h, and d in singular, plural, and long-form variants — for example '90s', '1hour', '7days'.
Aliases: 'min' equals 1 minute and 'max' equals 31 days. The minimum is 1 minute and the maximum is 31 days; values outside that window get clamped.
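A sketch of that normalization logic (an illustration of the documented rules, not Microlink's actual parser; unit matching here is simplified to the leading s/m/h/d letter):

```javascript
// Normalize a ttl value per the documented rules: numbers are milliseconds,
// strings use s/m/h/d units ('90s', '1hour', '7days'), 'min'/'max' are
// aliases, and everything is clamped to the 1 minute … 31 days window.
const MINUTE = 60_000
const TTL_MIN = MINUTE               // 1 minute
const TTL_MAX = 31 * 24 * 60 * MINUTE // 31 days

function normalizeTtl (value) {
  const units = { s: 1000, m: MINUTE, h: 60 * MINUTE, d: 24 * 60 * MINUTE }
  let ms
  if (typeof value === 'number') ms = value
  else if (value === 'min') ms = TTL_MIN
  else if (value === 'max') ms = TTL_MAX
  else {
    const match = /^(\d+)\s*([smhd])/.exec(value)
    if (!match) throw new TypeError(`invalid ttl: ${value}`)
    ms = Number(match[1]) * units[match[2]]
  }
  return Math.min(Math.max(ms, TTL_MIN), TTL_MAX) // clamp to bounds
}
```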

How does stale-while-revalidate actually work?

When you set staleTtl: 0, every request hits a cached copy if one exists — and if that copy has aged past the stale threshold, Microlink schedules a background fetch to regenerate it. The current caller does not wait. Subsequent callers benefit from the freshly built copy.
This is the same pattern modern CDNs use (Cache-Control: stale-while-revalidate), implemented inside the API so you do not have to coordinate it on your end.

How do I bypass the cache for a fresh response?

Pass force: true. The cache layer is skipped, a new response is generated, and the response carries x-cache-status: BYPASS.
Use this for invalidation events — for example, you know the source content changed — not on every request, since you would lose the latency and quota benefits of caching. See the force reference.

How do I confirm cache behavior on a request?

Read the cache headers on the response. x-cache-status (HIT, MISS, or BYPASS) is the source of truth for the unified cache. cf-cache-status reports the Cloudflare edge layer separately.
x-cache-ttl confirms the effective ttl in milliseconds. x-response-time gives you a quick latency sanity check — cache hits typically come back in tens of milliseconds.

Do ttl and staleTtl work on free plans?

No. Both ttl and staleTtl are Pro features. Free-plan responses are still cached using the default 24-hour ttl, but the parameters themselves are honored only on Pro requests.
The unified cache + Cloudflare edge cache combination is shared by both tiers — Pro adds the ability to tune lifetime and opt into stale-while-revalidate.