# Scalability

Scalability is the ability of a system to handle increasing amounts of work by adding resources. A scalable system can grow to accommodate higher traffic, more data, or additional users without a proportional decline in performance. Scalability is a fundamental requirement for production API systems.

There are two primary scaling approaches. Vertical scaling (scaling up) adds more power to an existing machine: more CPU, RAM, or storage. Horizontal scaling (scaling out) adds more machines and distributes the workload across them. Horizontal scaling is generally preferred for web applications and APIs because it avoids the hardware limits of a single machine and provides better fault tolerance.
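As a rough illustration of scaling out, the sketch below estimates how many identical instances a target load would need and spreads requests across them in round-robin order. The function names, the throughput figures, and the `instance-N` naming are all hypothetical, chosen only for the example; real load balancers use more sophisticated strategies.

```python
import math
from itertools import cycle


def instances_needed(target_rps: float, per_instance_rps: float) -> int:
    """Smallest number of identical instances that can serve target_rps."""
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    return max(1, math.ceil(target_rps / per_instance_rps))


def round_robin(requests: list[str], instances: list[str]) -> dict[str, list[str]]:
    """Assign each request to the next instance in turn."""
    assignment: dict[str, list[str]] = {name: [] for name in instances}
    for request, name in zip(requests, cycle(instances)):
        assignment[name].append(request)
    return assignment


# Hypothetical numbers: 2500 requests/s total, 400 requests/s per instance.
count = instances_needed(target_rps=2500, per_instance_rps=400)
pool = [f"instance-{i}" for i in range(count)]
plan = round_robin([f"req-{i}" for i in range(10)], pool)
```

With these assumed figures, handling more traffic means raising `count` rather than finding a larger machine, and losing one instance removes only a fraction of the total capacity.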

In serverless and API gateway contexts, scalability is largely handled by the platform. Serverless functions scale horizontally by default, spinning up new instances to handle concurrent requests. Edge-deployed API gateways scale across the provider's global network. This platform-managed scalability largely removes the need for teams to implement auto-scaling policies, manage server fleets, or predict capacity requirements: the system adjusts automatically to actual demand, within whatever concurrency limits the provider enforces.
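To make demand-driven scaling concrete, the following sketch estimates how many function instances would be running at different traffic levels. It relies on Little's law (in-flight requests = arrival rate × average duration) and on two simplifying assumptions that are not taken from any specific provider: each instance handles one request at a time, and the platform enforces a hypothetical concurrency limit of 250.

```python
import math


def required_concurrency(requests_per_second: float, avg_duration_s: float) -> int:
    """Little's law: in-flight requests = arrival rate * average duration."""
    return math.ceil(requests_per_second * avg_duration_s)


def scaled_instances(concurrent_requests: int, concurrency_limit: int) -> tuple[int, int]:
    """Return (instances running, requests throttled).

    Assumes one request per instance, so instances track concurrency
    until the (hypothetical) platform limit is reached.
    """
    running = min(concurrent_requests, concurrency_limit)
    return running, concurrent_requests - running


# Hypothetical traffic levels in requests/s, each request taking 0.25 s on average.
traffic_rps = [20, 200, 1200, 50]
timeline = [
    scaled_instances(required_concurrency(rps, 0.25), concurrency_limit=250)
    for rps in traffic_rps
]
```

In this model the instance count follows traffic up and back down without any capacity planning, and only the spike that exceeds the assumed limit results in throttled requests.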

