Caching
How the SDK's default downtime-only fallback, TTL freshness window, stale-while-revalidate, and custom stores work.
The SDK ships with a cache enabled by default, but it is not a TTL cache out of the box. By default it's a downtime-only fallback: every pv.get() reads the live API first, so a freshly published version is picked up on the very next call. The cache only steps in when a live read fails.
You can opt into a freshness window (ttl) and stale-while-revalidate (staleWhileRevalidate) on top of that fallback when a prompt is read far more often than it changes.
The default: live-first, fallback-on-failure
promptvault({ key }) gives you ttl: 0. That means:
- Every read goes to the live API. The database is always authoritative, so a publish propagates on the next read with no delay.
- Each successful read is stored as the last-known-good value for that key.
- If a read fails with a transport error (network down, timeout, 5xx), the last stored value is served instead - regardless of age. A stale copy beats an outage.
There is no time-based expiry in this mode. The cache is purely an outage backstop.
A read is metered on every successful live call. The default cache does not reduce read volume - it only protects you from failures. Set ttl if you also want to cut network calls.
Opting into a freshness window
Set ttl (milliseconds) to serve cached values without a network call inside the window:
import { promptvault } from "@promptv/sdk";
// Default - live read every call, fall back to the last value on failure.
promptvault({ key });
// Freshness window - within 10s, return cached with no API call.
promptvault({ key, cache: { ttl: 10_000 } });
// Cheaper - refresh at most every 5 minutes.
promptvault({ key, cache: { ttl: 300_000 } });
// Explicit downtime-only (same as the default).
promptvault({ key, cache: { ttl: 0 } });
// Disable caching completely - every call must reach the API, no fallback.
promptvault({ key, cache: false });
// Plug in your own store (Redis, disk, etc.).
promptvault({ key, cache: { store: myStore, ttl: 60_000 } });
With ttl > 0, a dashboard publish takes up to ttl to propagate to a given process. With the default ttl: 0, it propagates on the next read.
Stale-while-revalidate
staleWhileRevalidate (milliseconds) layers on top of ttl. Once an entry is older than ttl but younger than ttl + staleWhileRevalidate, the stale value is returned immediately while a background refresh runs - so the caller never waits on the network. It has no effect unless ttl > 0.
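To make the window arithmetic concrete, here is a minimal sketch of how an entry's age could be mapped to one of the three outcomes. It illustrates the rule as described on this page and is not the SDK's internal code: classify and the Freshness labels are hypothetical names, and CacheEntry is a local copy of the shape documented under "Cache shape".

```typescript
// Local copy of the documented entry shape (see "Cache shape").
interface CacheEntry<T = unknown> {
  value: T;
  storedAt?: number; // epoch ms; omitted means unknown age
}

// Hypothetical labels for the three outcomes described above.
type Freshness = "fresh" | "stale-while-revalidate" | "live";

// Hypothetical helper: decide how a read at time `now` would be served.
function classify(
  entry: CacheEntry | undefined,
  now: number,
  windows: { ttl: number; staleWhileRevalidate: number },
): Freshness {
  const { ttl, staleWhileRevalidate } = windows;
  // No entry, unknown age, or ttl: 0 all mean "go to the live API".
  if (!entry || entry.storedAt === undefined || ttl <= 0) return "live";
  const age = now - entry.storedAt;
  if (age < ttl) return "fresh";
  if (age < ttl + staleWhileRevalidate) return "stale-while-revalidate";
  return "live";
}

// Worked example: ttl 30s, staleWhileRevalidate 60s, entry stored at t = 0.
const exampleWindows = { ttl: 30_000, staleWhileRevalidate: 60_000 };
const exampleEntry: CacheEntry<string> = { value: "Hello", storedAt: 0 };
classify(exampleEntry, 10_000, exampleWindows); // "fresh"
classify(exampleEntry, 45_000, exampleWindows); // "stale-while-revalidate"
classify(exampleEntry, 90_000, exampleWindows); // "live"
```

Note that with ttl: 0 the staleWhileRevalidate value is never consulted, which matches the "no effect unless ttl > 0" rule.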
// Fresh for 30s, then serve stale for up to another 60s while refreshing in the background.
promptvault({ key, cache: { ttl: 30_000, staleWhileRevalidate: 60_000 } });
Options
| Setting | Type | Default | Notes |
|---|---|---|---|
| cache | boolean \| CacheOptions | true | true ⇒ memory store, ttl: 0 (downtime-only fallback). false ⇒ no cache, no fallback. |
| cache.enabled | boolean | true | false ⇒ same as cache: false. |
| cache.store | CacheStore | MemoryCacheStore | Custom store. |
| cache.ttl | number (ms) | 0 | Freshness window. 0 ⇒ revalidate every read (downtime-only). |
| cache.staleWhileRevalidate | number (ms) | 0 | Serve-stale window past ttl. Requires ttl > 0. |
Garbage durations (negative, NaN, non-finite) collapse to 0, so a bad config can only ever disable a window, never invert it.
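As an illustration of that rule, a duration sanitiser could look like the sketch below. normalizeDuration is a hypothetical helper written for this page; the SDK's own validation may be structured differently.

```typescript
// Hypothetical helper: collapse any unusable duration to 0.
// Negative numbers, NaN, Infinity, and non-numbers all disable the window.
function normalizeDuration(ms: unknown): number {
  return typeof ms === "number" && Number.isFinite(ms) && ms > 0 ? ms : 0;
}

normalizeDuration(10_000); // 10000
normalizeDuration(-5); // 0
normalizeDuration(Number.NaN); // 0
normalizeDuration(Number.POSITIVE_INFINITY); // 0
```

Because every bad input maps to 0, a misconfigured ttl falls back to live-first reads and a misconfigured staleWhileRevalidate simply switches the stale window off.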
Behaviour rules
For each read, the caching transport resolves as follows:
- Fresh hit (age < ttl): the cached value is returned with no network call.
- Stale-while-revalidate (ttl ≤ age < ttl + staleWhileRevalidate): the cached value is returned immediately and a background refresh runs (coalesced).
- Otherwise (no usable entry, or ttl: 0): a live fetch runs. It resolves three ways:
  - Success → store the value (stamped with the current time) and return it.
  - Transport error (network down, timeout, 5xx, anything surfacing as PromptVaultError("transport_error"), or any non-PromptVaultError throw) → the last stored value is returned if one exists; otherwise the error is re-thrown. This fallback ignores age.
  - Authoritative API error (not_found, etc.) → the error always propagates. We never mask "this prompt was deleted" with a stale copy.
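The three outcomes of the live fetch can be sketched as follows. This is an illustration under stated assumptions, not the SDK's source: the PromptVaultError class here is a local stand-in whose only assumed property is a code string, and liveFetchWithFallback, isTransportFailure, and the Map-based store are hypothetical.

```typescript
// Local stand-in for the SDK's error class; only a `code` field is assumed.
class PromptVaultError extends Error {
  readonly code: string;
  constructor(code: string) {
    super(code);
    this.code = code;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

interface CacheEntry<T = unknown> {
  value: T;
  storedAt?: number;
}

// Transport failure = "transport_error", or anything that is not a PromptVaultError.
function isTransportFailure(err: unknown): boolean {
  return !(err instanceof PromptVaultError) || err.code === "transport_error";
}

// Hypothetical sketch of the "Otherwise" branch: live fetch with downtime fallback.
async function liveFetchWithFallback<T>(
  key: string,
  store: Map<string, CacheEntry<T>>,
  fetchLive: () => Promise<T>,
  now: () => number = Date.now,
): Promise<T> {
  try {
    const value = await fetchLive();
    // Success: remember the value as last-known-good, stamped with the current time.
    store.set(key, { value, storedAt: now() });
    return value;
  } catch (err) {
    if (isTransportFailure(err)) {
      const last = store.get(key);
      // Age is deliberately not checked: a stale copy beats an outage.
      if (last) return last.value;
    }
    // Authoritative errors, or transport errors with nothing stored, propagate.
    throw err;
  }
}
```

The key design point is that the error classification happens before the store is consulted, so a not_found response can never be hidden by an old entry.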
Concurrent reads of the same key - foreground or background - share a single in-flight request, so a burst or cold-start fan-out collapses to one network round trip.
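Request sharing of this kind is commonly implemented with a map of in-flight promises (often called "single-flight"). The sketch below shows that general technique; coalesce and inFlight are hypothetical names, and the SDK's actual mechanism may differ.

```typescript
// One pending promise per cache key; removed again once it settles.
const inFlight = new Map<string, Promise<unknown>>();

// Hypothetical helper: callers asking for the same key while a fetch is
// pending all receive the same promise instead of starting a new request.
function coalesce<T>(key: string, fetchLive: () => Promise<T>): Promise<T> {
  const pending = inFlight.get(key);
  if (pending) return pending as Promise<T>;

  const request = fetchLive().finally(() => {
    // Clear the slot on success or failure so the next read can fetch again.
    inFlight.delete(key);
  });
  inFlight.set(key, request);
  return request;
}
```

A rejected request is shared too: every caller waiting on that key sees the same error, and the slot is cleared so a later read retries.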
The cache is keyed at the transport layer, and each call type falls back independently: .get() under prompt:<slug>, .list() under prompts, and .version() / .listVersions() under versions:<slug>.
Custom stores
Implement CacheStore to back the cache with anything - Redis, a file, a shared object:
import { promptvault, type CacheStore, type CacheEntry } from "@promptv/sdk";
// `redis` is assumed to be an already-connected client whose get/set/del return promises.
const redisStore: CacheStore = {
async get(key) {
const raw = await redis.get(`pv:${key}`);
return raw ? (JSON.parse(raw) as CacheEntry) : undefined;
},
async set(key, entry) {
await redis.set(`pv:${key}`, JSON.stringify(entry));
},
async delete(key) {
await redis.del(`pv:${key}`);
},
};
const pv = promptvault({
key: process.env.PROMPTV_KEY!,
cache: { store: redisStore, ttl: 60_000 },
});
delete is optional on the CacheStore interface; get and set may be sync or async.
Entries written by an external store and seeded without a storedAt timestamp are treated as having no known age - they're only ever used as a downtime fallback, never as a fresh ttl hit.
Sharing a single MemoryCacheStore across multiple clients also works:
import { promptvault, MemoryCacheStore } from "@promptv/sdk";
const shared = new MemoryCacheStore();
const pvLive = promptvault({ key: process.env.PV_LIVE!, cache: { store: shared } });
const pvTest = promptvault({ key: process.env.PV_TEST!, cache: { store: shared } });
Different API keys can serve different versions of the same prompt. Don't share a store between live and test clients unless you scope the cache keys, or you'll get cross-environment cache hits.
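One way to scope the keys is to wrap the shared store in a thin prefixing adapter. The sketch below is an assumption-laden illustration: the CacheStore and CacheEntry interfaces are local stand-ins inferred from this page (get and set sync or async, delete optional), and scopedStore and the "live" / "test" labels are hypothetical, not SDK exports.

```typescript
// Local stand-ins for the SDK types, inferred from the description on this page.
interface CacheEntry<T = unknown> {
  value: T;
  storedAt?: number;
}

interface CacheStore {
  get(key: string): CacheEntry | undefined | Promise<CacheEntry | undefined>;
  set(key: string, entry: CacheEntry): void | Promise<void>;
  delete?(key: string): void | Promise<void>;
}

// Hypothetical adapter: every key is prefixed with a scope before it
// reaches the shared store, so two clients can never read each other's entries.
function scopedStore(inner: CacheStore, scope: string): CacheStore {
  const scoped = (key: string): string => `${scope}:${key}`;
  return {
    get: (key) => inner.get(scoped(key)),
    set: (key, entry) => inner.set(scoped(key), entry),
    // delete is optional on the inner store, so call it only if present.
    delete: (key) => inner.delete?.(scoped(key)),
  };
}
```

Passing scopedStore(shared, "live") to one client and scopedStore(shared, "test") to the other as cache.store would keep a single backing store while separating the two environments.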
Cache shape
CacheEntry:
interface CacheEntry<T = unknown> {
value: T;
storedAt?: number; // epoch ms when written; omitted ⇒ unknown age (fallback-only)
}
Entries are wrapped values - keep this in mind if you serialise to Redis or disk.
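If you do serialise entries, the wrapper has to survive the round trip, not just the value. A minimal sketch, assuming JSON as the storage format: encodeEntry and decodeEntry are hypothetical helpers, and dropping an unusable storedAt is a choice made here so that a damaged record degrades to fallback-only instead of being treated as fresh.

```typescript
// Local copy of the documented entry shape.
interface CacheEntry<T = unknown> {
  value: T;
  storedAt?: number;
}

// Hypothetical helper: write the whole wrapper, including storedAt.
function encodeEntry(entry: CacheEntry): string {
  return JSON.stringify(entry);
}

// Hypothetical helper: read a wrapper back, tolerating missing or corrupt data.
function decodeEntry(raw: string | null | undefined): CacheEntry | undefined {
  if (!raw) return undefined;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null || !("value" in parsed)) {
      return undefined; // not a wrapped entry
    }
    const { value, storedAt } = parsed as { value: unknown; storedAt?: unknown };
    // Keep storedAt only when it is a usable timestamp; otherwise age is unknown.
    return typeof storedAt === "number" && Number.isFinite(storedAt)
      ? { value, storedAt }
      : { value };
  } catch {
    return undefined; // unparseable record: behave like a cache miss
  }
}
```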
Disabling cache when you really mean it
cache: false (or cache: { enabled: false }) removes the caching transport entirely. There is no downtime fallback in this mode - every call hits the API and any failure throws. Use it for ad-hoc scripts and tests, not for production.