# SLA (Service Level Agreement)

A Service Level Agreement (SLA) is a formal contract between a service provider and a customer that defines the expected level of service, measured against specific metrics. SLAs establish accountability by specifying what happens when the provider fails to meet the agreed-upon targets, typically in the form of service credits or other remedies.

## Key components of an SLA

Most SLAs include the following elements:

* **Availability (uptime)**: The percentage of time the service is guaranteed to be operational, commonly expressed as "nines". Over a 365-day year, 99.9% (three nines) allows roughly 8.76 hours of downtime, while 99.99% (four nines) allows about 52.6 minutes.
* **Latency targets**: Maximum acceptable response times for API calls or other operations, often defined at specific percentiles (e.g., p99 latency under 200 ms, meaning 99% of requests complete within 200 ms).
* **Throughput**: The volume of requests or transactions the service can handle within a given time period.
* **Support response times**: How quickly the provider will acknowledge and begin working on issues of different severity levels.
* **Remedies**: The compensation the customer receives if the provider misses the SLA targets, such as service credits applied to future invoices.
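
The downtime figures quoted for each availability target follow from simple arithmetic. The sketch below is illustrative only: the function name `downtime_budget_seconds` and the 365-day default period are assumptions made for this example, and real SLAs often measure availability per calendar month instead.

```python
def downtime_budget_seconds(availability_pct: float, period_days: float = 365.0) -> float:
    """Return the downtime (in seconds) that an availability target permits.

    Hypothetical helper for illustration; the measurement period is an
    assumption and should match whatever the actual SLA specifies.
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    period_seconds = period_days * 24 * 60 * 60
    # The unavailable fraction of the period is the downtime budget.
    return period_seconds * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        yearly_hours = downtime_budget_seconds(target) / 3600
        monthly_minutes = downtime_budget_seconds(target, period_days=30) / 60
        print(f"{target}%: {yearly_hours:.2f} h/year, {monthly_minutes:.2f} min per 30 days")
```

Passing `period_days=30` shows how the same target translates to a monthly budget, which is usually the more useful number when an SLA is evaluated per billing cycle.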

## SLAs, SLOs, and SLIs

These three related terms are often confused:

* **SLA (Service Level Agreement)**: The external contract with customers, including consequences for violations.
* **SLO (Service Level Objective)**: An internal target that the engineering team aims to meet. SLOs are typically stricter than SLAs to provide a safety margin.
* **SLI (Service Level Indicator)**: The actual measured metric (e.g., the real uptime percentage or observed p99 latency) used to evaluate whether the SLO and SLA are being met.
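
One way to picture how the three terms relate is to compare a measured SLI against both thresholds. The following is a minimal sketch, not a prescribed method: the class `AvailabilityTargets`, the function `classify`, and the 99.9% / 99.95% values are all hypothetical and chosen only to illustrate the safety margin between an SLO and an SLA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilityTargets:
    sla_pct: float  # contractual commitment to customers (hypothetical value)
    slo_pct: float  # stricter internal objective (hypothetical value)

    def __post_init__(self) -> None:
        # The SLO should be at least as strict as the SLA to leave a margin.
        if self.slo_pct < self.sla_pct:
            raise ValueError("slo_pct should not be looser than sla_pct")


def classify(sli_pct: float, targets: AvailabilityTargets) -> str:
    """Compare a measured availability SLI against the SLO and the SLA."""
    if sli_pct >= targets.slo_pct:
        return "meeting SLO"
    if sli_pct >= targets.sla_pct:
        return "SLO missed, SLA still met"
    return "SLA breached"


targets = AvailabilityTargets(sla_pct=99.9, slo_pct=99.95)

if __name__ == "__main__":
    for measured in (99.97, 99.92, 99.5):
        print(measured, "->", classify(measured, targets))
```

The middle outcome is the reason for keeping the SLO stricter: the team gets a warning while the contractual commitment is still intact.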

## SLAs and API gateways

API gateways play a role in both meeting and monitoring SLA commitments. Because the gateway is the entry point for all API traffic, its own availability and latency directly affect the end-to-end SLA. A gateway outage means the entire API is unreachable, regardless of whether the backends are healthy.

It is therefore important to choose a gateway architecture that is itself highly available. Serverless API Gateway runs on Cloudflare Workers, which operate across Cloudflare's global network of data centers. Cloudflare's Workers platform publishes its own SLA, and because execution happens at the edge location closest to the user, request latency is typically lower than with a gateway running in a single region. This edge-based architecture also avoids dependence on any one region or data center, a single point of failure that can threaten SLA compliance.

Beyond availability, the gateway is a natural place to collect the SLI metrics that feed into SLA reporting -- request counts, error rates, and response times can all be measured at the gateway layer.
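
As a rough illustration of how gateway-level measurements become SLIs, the sketch below aggregates a batch of per-request records into a request count, an error rate, and a p99 latency. It is an offline example under stated assumptions: the `RequestRecord` shape, the choice to count only 5xx responses as errors, and the nearest-rank percentile method are all hypothetical and are not taken from Serverless API Gateway itself.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RequestRecord:
    status: int         # HTTP status code returned to the client
    duration_ms: float  # response time observed at the gateway


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def compute_slis(records: Sequence[RequestRecord]) -> dict:
    """Aggregate request records into the SLIs named in the text."""
    total = len(records)
    # Assumption: only server-side failures (5xx) count against the SLA.
    errors = sum(1 for r in records if r.status >= 500)
    p99: Optional[float] = (
        percentile([r.duration_ms for r in records], 99) if total else None
    )
    return {
        "request_count": total,
        "error_rate": errors / total if total else 0.0,
        "p99_latency_ms": p99,
    }


# Synthetic sample: 100 requests with durations 1..100 ms, the last two failing.
sample = [
    RequestRecord(status=503 if i > 98 else 200, duration_ms=float(i))
    for i in range(1, 101)
]
slis = compute_slis(sample)

if __name__ == "__main__":
    print(slis)
```

Whether client errors (4xx) or timeouts count toward the error rate is a contractual detail, so the filter in `compute_slis` should mirror the definitions in the actual SLA.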

## Related documentation

* [Getting Started](/getting-started/introduction.md) - Deploy on Cloudflare's globally distributed edge
* [Servers Configuration](/configuration/servers.md) - Configure backend upstreams and understand routing reliability
* [Deployment with Wrangler](/deployment/wrangler.md) - Deploy the gateway for production workloads

