Cache
A hardware or software component that stores data so future requests for that data can be served faster.
A cache is a temporary storage layer that holds copies of frequently accessed data so that future requests for that data can be served more quickly. Caches exist at many levels of a system: in-browser caches, CDN edge caches, application-level caches (such as Redis or Memcached), and CPU caches.
Caching reduces latency and backend load by avoiding repeated computation or data retrieval for the same request. Cache strategies include time-based expiration (TTL), cache invalidation on write, and stale-while-revalidate patterns. The choice of strategy depends on how frequently the underlying data changes and how tolerant the application is of stale responses.
In API gateway architectures, response caching is a common feature. The gateway can cache responses from backend services and serve them directly for subsequent identical requests. This is especially useful in serverless deployments where reducing cold starts and function invocations directly lowers cost and improves response times.
Last updated