Caching

How the SDK's default downtime-only fallback, TTL freshness window, stale-while-revalidate, and custom stores work.

The SDK ships with a cache enabled by default, but it is not a TTL cache out of the box. By default it's a downtime-only fallback: every pv.get() reads the live API first, so a freshly published version is picked up on the very next call. The cache only steps in when a live read fails.

You can opt into a freshness window (ttl) and stale-while-revalidate (staleWhileRevalidate) on top of that fallback when a prompt is read far more often than it changes.

The default: live-first, fallback-on-failure

promptvault({ key }) gives you ttl: 0. That means:

- Every read goes to the live API. The database is always authoritative, so a publish propagates on the next read with no delay.
- Each successful read is stored as the last-known-good value for that key.
- If a read fails with a transport error (network down, timeout, 5xx), the last stored value is served instead - regardless of age. A stale copy beats an outage.

There is no time-based expiry in this mode. The cache is purely an outage backstop.

Every successful live call is metered as a read. The default cache does not reduce read volume - it only protects you from failures. Set ttl if you also want to cut network calls.
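The default mode can be pictured as a thin wrapper around the live fetch. The sketch below is not the SDK's source: `readLiveFirst`, `transportError`, the `CodedError` shape, and the `Map`-based backstop are hypothetical names used only to illustrate the rules above.

```typescript
// Hypothetical error shape: the real PromptVaultError fields may differ.
interface CodedError extends Error {
  code?: string;
}

// Builds an error tagged as a transport failure (network down, timeout, 5xx).
function transportError(message: string): CodedError {
  return Object.assign(new Error(message), { code: "transport_error" });
}

// Last-known-good values, keyed by cache key. There is no expiry here.
const lastKnownGood = new Map<string, unknown>();

// Live-first read: always try the live fetch, remember every success,
// and fall back to the remembered value only on a transport failure.
async function readLiveFirst<T>(key: string, fetchLive: () => Promise<T>): Promise<T> {
  try {
    const value = await fetchLive();
    lastKnownGood.set(key, value); // each success refreshes the backstop
    return value;
  } catch (err) {
    const isTransport = (err as CodedError | undefined)?.code === "transport_error";
    if (isTransport && lastKnownGood.has(key)) {
      return lastKnownGood.get(key) as T; // served regardless of age
    }
    throw err; // nothing stored yet, or not a transport failure
  }
}
```

Note that the fetch function runs on every call, which is why this mode does not reduce read volume.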

Opting into a freshness window

Set ttl (milliseconds) to serve cached values without a network call inside the window:

import { promptvault } from "@promptv/sdk";

// Default - live read every call, fall back to the last value on failure.
promptvault({ key });

// Freshness window - within 10s, return cached with no API call.
promptvault({ key, cache: { ttl: 10_000 } });

// Cheaper - refresh at most every 5 minutes.
promptvault({ key, cache: { ttl: 300_000 } });

// Explicit downtime-only (same as the default).
promptvault({ key, cache: { ttl: 0 } });

// Disable caching completely - every call must reach the API, no fallback.
promptvault({ key, cache: false });

// Plug in your own store (Redis, disk, etc.).
promptvault({ key, cache: { store: myStore, ttl: 60_000 } });

With ttl > 0, a dashboard publish takes up to ttl to propagate to a given process. With the default ttl: 0, it propagates on the next read.

Stale-while-revalidate

staleWhileRevalidate (milliseconds) layers on top of ttl. Once an entry is older than ttl but younger than ttl + staleWhileRevalidate, the stale value is returned immediately while a background refresh runs - so the caller never waits on the network. It has no effect unless ttl > 0.

// Fresh for 30s, then serve stale for up to another 60s while refreshing in the background.
promptvault({ key, cache: { ttl: 30_000, staleWhileRevalidate: 60_000 } });

Options

| Setting | Type | Default | Notes |
| --- | --- | --- | --- |
| `cache` | `boolean \| CacheOptions` | `true` | `true` ⇒ memory store, `ttl: 0` (downtime-only fallback). `false` ⇒ no cache, no fallback. |
| `cache.enabled` | `boolean` | `true` | `false` ⇒ same as `cache: false`. |
| `cache.store` | `CacheStore` | `MemoryCacheStore` | Custom store. |
| `cache.ttl` | `number` (ms) | `0` | Freshness window. `0` ⇒ revalidate every read (downtime-only). |
| `cache.staleWhileRevalidate` | `number` (ms) | `0` | Serve-stale window past `ttl`. Requires `ttl > 0`. |

Garbage durations (negative, NaN, non-finite) collapse to 0, so a bad config can only ever disable a window, never invert it.
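As a rough illustration of that rule, a normaliser along these lines would produce the described behaviour. `normalizeDuration` is a hypothetical helper name, not an SDK export.

```typescript
// Collapse invalid durations to 0 so a bad value can only disable a window.
// Hypothetical helper: the SDK's internal implementation may differ.
function normalizeDuration(ms: unknown): number {
  return typeof ms === "number" && Number.isFinite(ms) && ms > 0 ? ms : 0;
}

const examples = [10_000, -5, Number.NaN, Number.POSITIVE_INFINITY].map(normalizeDuration);
console.log(examples); // the three invalid inputs all become 0
```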

Behaviour rules

For each read, the caching transport resolves as follows:

- Fresh hit (age < ttl): the cached value is returned with no network call.
- Stale-while-revalidate (ttl ≤ age < ttl + staleWhileRevalidate): the cached value is returned immediately and a background refresh runs (coalesced).
- Otherwise (no usable entry, or ttl: 0): a live fetch runs. It resolves three ways:
  - Success → store the value (stamped with the current time) and return it.
  - Transport error (network down, timeout, 5xx, anything surfacing as PromptVaultError("transport_error"), or any non-PromptVaultError throw) → the last stored value is returned if one exists; otherwise the error is re-thrown. This fallback ignores age.
  - Authoritative API error (not_found, etc.) → the error always propagates. We never mask "this prompt was deleted" with a stale copy.
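The rules above can be sketched as two pure functions: one that picks the path for a read, and one that decides whether a failed live fetch may be answered from the last stored value. `planRead`, `mayServeFallback`, and the `name`/`code` fields checked on the error are assumptions made for this sketch, not the SDK's internals.

```typescript
type ReadPlan = "fresh-hit" | "stale-while-revalidate" | "live-fetch";

interface Entry {
  value: unknown;
  storedAt?: number; // epoch ms; omitted means unknown age
}

// Pick the path for a read from the entry's age and the two windows.
function planRead(entry: Entry | undefined, now: number, ttl: number, swr: number): ReadPlan {
  // No entry, unknown age, or ttl: 0 always means a live fetch.
  if (!entry || entry.storedAt === undefined || ttl <= 0) return "live-fetch";
  const age = now - entry.storedAt;
  if (age < ttl) return "fresh-hit";
  if (age < ttl + swr) return "stale-while-revalidate";
  return "live-fetch";
}

// Decide whether a failed live fetch may fall back to the last stored value.
// Assumption: SDK errors are recognisable by name and carry a string code.
function mayServeFallback(err: unknown): boolean {
  const e = err as { name?: unknown; code?: unknown } | null | undefined;
  const isSdkError = e?.name === "PromptVaultError";
  // Non-SDK throws count as transport failures; SDK errors only when tagged so.
  return !isSdkError || e?.code === "transport_error";
}
```

With ttl: 30_000 and staleWhileRevalidate: 60_000, an entry aged 45 seconds lands in the stale-while-revalidate branch, and one aged 90 seconds or more triggers a live fetch.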

Concurrent reads of the same key - foreground or background - share a single in-flight request, so a burst or cold-start fan-out collapses to one network round trip.
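Request coalescing of this kind is commonly built on a map of in-flight promises. The following is a minimal sketch of that general technique, with `coalesce` and `inFlight` as hypothetical names. It is not taken from the SDK.

```typescript
// One pending promise per cache key; later callers reuse it instead of refetching.
const inFlight = new Map<string, Promise<unknown>>();

function coalesce<T>(key: string, start: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;

  // Clear the slot once the request settles, whether it succeeded or failed,
  // so the next read after completion starts a new request.
  const request = start().finally(() => {
    inFlight.delete(key);
  });
  inFlight.set(key, request);
  return request;
}
```

Three simultaneous callers asking for the same key would trigger a single `start()` call and all receive its result. A call made after that request settles starts a new one.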

The cache is keyed at the transport layer, and each call type falls back independently: .get() under prompt:<slug>, .list() under prompts, and .version() / .listVersions() under versions:<slug>.

Custom stores

Implement CacheStore to back the cache with anything - Redis, a file, a shared object:

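import { promptvault, type CacheStore, type CacheEntry } from "@promptv/sdk";
import Redis from "ioredis"; // assumption: any client with get/set/del works here

// Connects to localhost:6379 by default; pass your own connection options as needed.
const redis = new Redis();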

const redisStore: CacheStore = {
  async get(key) {
    const raw = await redis.get(`pv:${key}`);
    return raw ? (JSON.parse(raw) as CacheEntry) : undefined;
  },
  async set(key, entry) {
    await redis.set(`pv:${key}`, JSON.stringify(entry));
  },
  async delete(key) {
    await redis.del(`pv:${key}`);
  },
};

const pv = promptvault({
  key: process.env.PROMPTV_KEY!,
  cache: { store: redisStore, ttl: 60_000 },
});

delete is optional on the CacheStore interface; get and set may be sync or async.

Entries seeded into a store from outside the SDK without a storedAt timestamp are treated as having no known age: they are only ever used as a downtime fallback, never as a fresh ttl hit.

Sharing a single MemoryCacheStore across multiple clients also works:

import { promptvault, MemoryCacheStore } from "@promptv/sdk";

const shared = new MemoryCacheStore();
const pvLive = promptvault({ key: process.env.PV_LIVE!, cache: { store: shared } });
const pvTest = promptvault({ key: process.env.PV_TEST!, cache: { store: shared } });

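Live and test API keys can serve different versions of the same prompt, and a shared store does not separate them. Don't share a store between live and test clients, as the example above does, unless you scope the keys, or you'll get cross-environment cache hits.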
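One way to scope keys is to wrap the shared store so each client sees its own prefix. This is a sketch under assumptions: the `CacheStore` and `CacheEntry` interfaces are re-declared locally to mirror the shapes described on this page, and `scopedStore` and `mapStore` are hypothetical helpers rather than SDK exports.

```typescript
// Local stand-ins for the SDK types, re-declared so the sketch is self-contained.
interface CacheEntry<T = unknown> {
  value: T;
  storedAt?: number;
}
interface CacheStore {
  get(key: string): CacheEntry | undefined | Promise<CacheEntry | undefined>;
  set(key: string, entry: CacheEntry): void | Promise<void>;
  delete?(key: string): void | Promise<void>;
}

// Wraps a store so every key is prefixed with the given scope.
function scopedStore(inner: CacheStore, scope: string): CacheStore {
  const scoped = (key: string) => `${scope}:${key}`;
  return {
    get: (key) => inner.get(scoped(key)),
    set: (key, entry) => inner.set(scoped(key), entry),
    delete: (key) => inner.delete?.(scoped(key)),
  };
}

// Minimal in-memory store used here only to demonstrate the wrapper.
function mapStore(): CacheStore {
  const data = new Map<string, CacheEntry>();
  return {
    get: (key) => data.get(key),
    set: (key, entry) => {
      data.set(key, entry);
    },
    delete: (key) => {
      data.delete(key);
    },
  };
}
```

If the SDK's CacheStore matches this shape, each client would receive its own wrapper, for example `cache: { store: scopedStore(shared, "live") }` and `cache: { store: scopedStore(shared, "test") }`, so the same slug no longer collides across environments.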

Cache shape

CacheEntry:

interface CacheEntry<T = unknown> {
  value: T;
  storedAt?: number; // epoch ms when written; omitted ⇒ unknown age (fallback-only)
}

Entries are wrapped values, not bare payloads: a store receives and must return the whole { value, storedAt } object. Keep this in mind if you serialise to Redis or disk.
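When serialising, it helps to validate what comes back so that a corrupt or foreign payload is treated as a miss instead of being handed to the SDK. The helpers below are a sketch: `encodeEntry` and `decodeEntry` are hypothetical names, `CacheEntry` is re-declared locally, and treating bad data as a miss is a design choice of this example, not documented SDK behaviour.

```typescript
// Local copy of the entry shape so the sketch is self-contained.
interface CacheEntry<T = unknown> {
  value: T;
  storedAt?: number;
}

// Serialise the whole wrapper, not just the value.
function encodeEntry(entry: CacheEntry): string {
  return JSON.stringify(entry);
}

// Parse a stored string back into an entry; anything malformed becomes a miss.
function decodeEntry(raw: string | null | undefined): CacheEntry | undefined {
  if (!raw) return undefined;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null || !("value" in parsed)) {
      return undefined; // a bare payload without the wrapper
    }
    const { value, storedAt } = parsed as { value: unknown; storedAt?: unknown };
    // Keep storedAt only when it is a usable number; otherwise the age is unknown.
    return typeof storedAt === "number" && Number.isFinite(storedAt)
      ? { value, storedAt }
      : { value };
  } catch {
    return undefined; // not valid JSON
  }
}
```

An entry decoded without storedAt still works as a downtime fallback, but is never treated as a fresh ttl hit.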

Disabling cache when you really mean it

cache: false (or cache: { enabled: false }) removes the caching transport entirely. There is no downtime fallback in this mode - every call hits the API and any failure throws. Use it for ad-hoc scripts and tests, not for production.
