FastAPI Async APIs: What Actually Gets Faster?
A practical guide to async FastAPI routes, blocking calls, thread pools, and when async actually helps.
The Short Version
async def does not make a FastAPI route fast by itself. Async helps when your route waits on awaitable I/O: a database call, an HTTP request, a cache lookup, a stream, or a WebSocket message. If the route blocks the event loop with sync code, it can become slower under concurrency than a plain sync route.
The useful question is not "Should every FastAPI endpoint be async?" The useful question is:
When this request gets slow, does it give the event loop a chance to run something else?
That question is the difference between async syntax and async behavior.
The 50 ms Test
Here are four FastAPI endpoints that all appear to wait for the same amount of time:
import asyncio
import time

from fastapi import FastAPI
from fastapi.concurrency import run_in_threadpool

app = FastAPI()

@app.get("/async-sleep")
async def async_sleep():
    await asyncio.sleep(0.05)
    return {"ok": True}

@app.get("/blocking-in-async")
async def blocking_in_async():
    time.sleep(0.05)
    return {"ok": True}

@app.get("/sync-blocking")
def sync_blocking():
    time.sleep(0.05)
    return {"ok": True}

@app.get("/offloaded-blocking")
async def offloaded_blocking():
    await run_in_threadpool(time.sleep, 0.05)
    return {"ok": True}
On one local in-process simulation, twenty concurrent requests produced this:
| Endpoint | Concurrency | Delay | Wall time | What happened |
|---|---|---|---|---|
| /async-sleep | 20 | 50 ms | 0.0551 s | Requests waited together. |
| /blocking-in-async | 20 | 50 ms | 1.1094 s | Requests effectively waited one after another. |
| /sync-blocking | 20 | 50 ms | 0.0714 s | FastAPI moved sync work to the thread pool. |
| /offloaded-blocking | 20 | 50 ms | 0.0622 s | The blocking call was explicitly offloaded. |
Do not treat these as production benchmark numbers. They were run with HTTPX ASGITransport inside one process. Treat them as a behavior test.
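If you want to rerun the behavior test against your own app, a harness along these lines works. The module path, endpoint, and request count here are illustrative assumptions, not the exact script behind the table:

import asyncio
import time

from httpx import ASGITransport, AsyncClient

from app.main import app  # assumed module layout

async def measure(path: str, concurrency: int = 20) -> float:
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        start = time.perf_counter()
        # Fire every request at once so waits can overlap when the route yields.
        responses = await asyncio.gather(*(client.get(path) for _ in range(concurrency)))
        elapsed = time.perf_counter() - start
    assert all(r.status_code == 200 for r in responses)
    return elapsed

print(asyncio.run(measure("/async-sleep")))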
The behavior is the lesson: await asyncio.sleep() gives control back to the event loop. time.sleep() blocks the thread. If that blocking call runs inside async def, the event loop cannot move on to other requests.
Async Is A Scheduling Model, Not A Speed Button
An async FastAPI route runs on the event loop. The event loop can handle many waiting tasks because a task can pause at await and let another task run.
That works well for I/O-heavy APIs:
- waiting on Postgres
- calling another HTTP API
- reading from Redis
- waiting for a message broker
- streaming tokens to a browser
- holding WebSocket connections open
It does not make CPU-heavy work faster. It also does not make blocking libraries non-blocking.
This is the mental model:
1. A request enters your route.
2. The route runs until it reaches an await.
3. While that operation waits, the event loop can run another request.
4. When the awaited operation is ready, the route continues.
If step 2 never reaches a real await point, the event loop is stuck.
async def vs def In FastAPI
FastAPI supports both route styles because both are useful.
Use async def when your route calls awaitable libraries:
import httpx
from fastapi import FastAPI

app = FastAPI()

@app.get("/profile/{user_id}")
async def profile(user_id: int):
    async with httpx.AsyncClient(timeout=5.0) as client:
        response = await client.get(f"https://api.example.com/users/{user_id}")
        response.raise_for_status()
        return response.json()
Use normal def when the route must call blocking libraries and you do not have a good async option:
import requests
from fastapi import FastAPI

app = FastAPI()

@app.get("/legacy-profile/{user_id}")
def legacy_profile(user_id: int):
    response = requests.get(f"https://api.example.com/users/{user_id}", timeout=5)
    response.raise_for_status()
    return response.json()
That second example is not glamorous, but it is honest. FastAPI and Starlette run normal sync routes in a thread pool so the blocking call does not freeze the event loop.
The wrong version is this:
import requests
from fastapi import FastAPI

app = FastAPI()

@app.get("/bad-profile/{user_id}")
async def bad_profile(user_id: int):
    response = requests.get(f"https://api.example.com/users/{user_id}", timeout=5)
    response.raise_for_status()
    return response.json()
This route is async in spelling only. The requests.get() call blocks the event loop until it finishes.
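If you cannot swap the blocking client for an async one, the same run_in_threadpool escape hatch from the 50 ms test applies. A sketch, with a hypothetical route name:

import requests
from fastapi import FastAPI
from fastapi.concurrency import run_in_threadpool

app = FastAPI()

@app.get("/offloaded-profile/{user_id}")
async def offloaded_profile(user_id: int):
    # The blocking call runs on a worker thread, so the event loop keeps serving other requests.
    response = await run_in_threadpool(
        requests.get,
        f"https://api.example.com/users/{user_id}",
        timeout=5,
    )
    response.raise_for_status()
    return response.json()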
The Route Decision Matrix
Use this table during code review:
| Work inside the route | Better choice | Why |
|---|---|---|
| Async database call | async def | The DB wait can yield to the event loop. |
| Async HTTP call | async def | Outbound network wait can overlap with other requests. |
| Sync library with no async API | def, or explicit offload | Keep blocking work away from the event loop. |
| Short pure-Python logic | Either, keep it simple | There is no meaningful I/O wait to overlap. |
| CPU-heavy work | Worker process or queue | Async does not remove CPU cost. |
| Short non-critical side effect | BackgroundTasks can work | Useful after-response convenience. |
| Long or retryable job | External queue plus job table | Needs durability, status, and retries. |
| WebSocket or SSE stream | async def | Long-lived connections need cooperative I/O. |
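The CPU-heavy row deserves a concrete shape. One option is a process pool, sketched below; the pool size and the crunch function are placeholders, and for long or retryable jobs a worker process fed by a queue is still the better home:

import asyncio
from concurrent.futures import ProcessPoolExecutor

from fastapi import FastAPI

app = FastAPI()

# One small pool for the whole process; size it to your CPU budget.
process_pool = ProcessPoolExecutor(max_workers=2)

def crunch(n: int) -> int:
    # Placeholder for real CPU-bound work.
    return sum(i * i for i in range(n))

@app.get("/crunch/{n}")
async def crunch_endpoint(n: int):
    loop = asyncio.get_running_loop()
    # The computation runs in another process; the event loop keeps serving requests.
    result = await loop.run_in_executor(process_pool, crunch, n)
    return {"result": result}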
Thread Pools Are Useful, But Bounded
Sync routes are not automatically bad. They are often the right choice when the dependency is blocking.
The trade-off is capacity. Starlette uses AnyIO's worker thread pool for sync routes and sync dependencies. In the local simulation, the default AnyIO thread limit was 40 tokens.
At 100 concurrent requests with a 50 ms blocking sleep:
| Endpoint | Concurrency | Wall time | Lesson |
|---|---|---|---|
| /async-sleep | 100 | 0.0763 s | Awaitable waits stayed close to one wave. |
| /sync-blocking | 100 | 0.1932 s | Thread-pool work completed in bounded waves. |
| /offloaded-blocking | 100 | 0.1962 s | Explicit offload used the same kind of limited resource. |
Thread-pool offloading is a safety valve. It is not infinite concurrency.
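If short blocking work is piling up behind that limit, the AnyIO capacity limiter is adjustable. This is a sketch assuming current Starlette and AnyIO behavior; raising the number trades memory and context switching for concurrency, so measure it rather than trusting it:

from contextlib import asynccontextmanager

import anyio
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Raise the default thread limit (40 tokens at the time of writing) used for
    # sync routes, sync dependencies, and run_in_threadpool.
    limiter = anyio.to_thread.current_default_thread_limiter()
    limiter.total_tokens = 80
    yield

app = FastAPI(lifespan=lifespan)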
Async Databases Need Their Own Discipline
Async database access is one of the best reasons to use async FastAPI, but it also creates easy mistakes.
Good defaults:
- Create the database engine once during app startup or lifespan.
- Create one async session per request.
- Do not store a global AsyncSession.
- Do not share one session across concurrent tasks.
- Tune the connection pool under load.
- Measure query time separately from route time.
A good shape looks like this:
from collections.abc import AsyncIterator
from typing import Annotated

from fastapi import Depends, FastAPI
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine

app = FastAPI()

engine = create_async_engine(DB_URL, pool_pre_ping=True)
SessionLocal = async_sessionmaker(engine, expire_on_commit=False)

async def get_session() -> AsyncIterator[AsyncSession]:
    async with SessionLocal() as session:
        yield session

@app.get("/users/{user_id}")
async def read_user(
    user_id: str,
    session: Annotated[AsyncSession, Depends(get_session)],
):
    statement = select(User).where(User.id == user_id)
    return (await session.scalars(statement)).one_or_none()
Async does not bypass the database. If 500 requests hit a pool of 20 connections, 480 requests still need to wait somewhere. That waiting should be controlled and observable.
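Those waits are configurable. A sketch of the relevant knobs on create_async_engine, with placeholder numbers meant to be tuned under load rather than copied:

from sqlalchemy.ext.asyncio import create_async_engine

# DB_URL is assumed to be defined elsewhere, as in the earlier example.
engine = create_async_engine(
    DB_URL,
    pool_pre_ping=True,
    pool_size=20,      # steady-state connections held by this process
    max_overflow=10,   # temporary extra connections allowed under bursts
    pool_timeout=5,    # seconds a request waits for a connection before raising
)

Failing fast with a visible pool timeout is usually better than letting requests queue invisibly.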
BackgroundTasks Is Not A Queue
FastAPI BackgroundTasks is useful for small after-response work:
- send a non-critical notification
- write a lightweight audit event
- schedule local cleanup
It is not a durable job system. It does not give you strong retry behavior, cross-process visibility, long-running status, or crash recovery.
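For the small cases above, the built-in API is a one-liner. A minimal sketch, with a placeholder notification function:

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()

def send_welcome_note(email: str) -> None:
    # Placeholder for a small, non-critical side effect.
    ...

@app.post("/signup")
async def signup(email: str, background_tasks: BackgroundTasks):
    # Runs after the response is sent, in the same process; nothing survives a crash.
    background_tasks.add_task(send_welcome_note, email)
    return {"status": "created"}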
For work like OCR, video processing, batch email, report generation, embeddings, or long LLM jobs, use a job architecture:
- POST /jobs creates a job row and returns job_id.
- A worker pulls work from a queue.
- The worker updates job status.
- The client polls, subscribes through SSE/WebSocket, or receives a webhook.
- Failed jobs have retry and dead-letter behavior.
The API process should stay responsive. The worker process should own the long work.
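A minimal sketch of the API side of that pattern, with an in-memory dict standing in for the job table and queue; a real system persists jobs and hands them to a separate worker process:

import uuid

from fastapi import FastAPI, HTTPException

app = FastAPI()

# Stand-in for a persisted job table; a real system uses a database.
jobs: dict[str, dict] = {}

@app.post("/jobs", status_code=202)
async def create_job(payload: dict):
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "queued", "payload": payload, "result": None}
    # A real implementation would also enqueue job_id for a worker here.
    return {"job_id": job_id}

@app.get("/jobs/{job_id}")
async def job_status(job_id: str):
    job = jobs.get(job_id)
    if job is None:
        raise HTTPException(status_code=404, detail="job not found")
    return {"job_id": job_id, "status": job["status"], "result": job["result"]}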
How To Test Async FastAPI Code
For normal route tests, FastAPI's TestClient is often enough.
When the test itself is async, or when it needs to await async database calls, use HTTPX with ASGITransport:
import pytest
from httpx import ASGITransport, AsyncClient

from app.main import app

@pytest.mark.anyio
async def test_root():
    async with AsyncClient(
        transport=ASGITransport(app=app),
        base_url="http://test",
    ) as client:
        response = await client.get("/")
        assert response.status_code == 200
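If pytest complains about the async backend, pinning it in conftest.py is a common fix; this relies on the AnyIO pytest plugin that ships with AnyIO:

# conftest.py
import pytest

@pytest.fixture
def anyio_backend():
    # Run @pytest.mark.anyio tests on asyncio only.
    return "asyncio"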
One caveat: in-process ASGI tests are not the same as real network tests. They are excellent for route behavior and validation. They are not enough for full deployment behavior, worker topology, graceful shutdown, or production latency.
The Production Checklist
Before calling a FastAPI service "async-ready", check this:
- Every slow operation inside async def is actually awaitable.
- No requests, sync ORM call, time.sleep, or blocking SDK is hidden inside an async route.
- HTTP clients are reused where appropriate and have timeouts.
- Database sessions are request-scoped, not global.
- Connection pool limits are known and measured.
- Sync routes and sync dependencies are understood as thread-pool work.
- CPU-heavy tasks are moved out of the request path.
- Long jobs use a queue and persisted status.
- Tests cover both validation errors and async I/O paths.
- Load tests report p95 and p99 latency, not only average latency.
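One checklist item that deserves a concrete shape is client reuse. A sketch using the lifespan pattern; the upstream URL and timeout are placeholders:

from contextlib import asynccontextmanager

import httpx
from fastapi import FastAPI, Request

@asynccontextmanager
async def lifespan(app: FastAPI):
    # One shared client reuses connections across requests and always has a timeout.
    app.state.http = httpx.AsyncClient(timeout=5.0)
    yield
    await app.state.http.aclose()

app = FastAPI(lifespan=lifespan)

@app.get("/status")
async def upstream_status(request: Request):
    response = await request.app.state.http.get("https://api.example.com/health")
    return {"upstream": response.status_code}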
The Practical Rule
Use async FastAPI when your service spends time waiting on things that can be awaited. Keep blocking work out of the event loop. Keep long work out of the API process. Measure the real bottleneck before changing the route style.
That is the difference between writing async Python and building an async API that behaves well under load.
References
- FastAPI async guide: https://fastapi.tiangolo.com/async/
- FastAPI async tests: https://fastapi.tiangolo.com/advanced/async-tests/
- FastAPI background tasks: https://fastapi.tiangolo.com/tutorial/background-tasks/
- FastAPI server workers: https://fastapi.tiangolo.com/deployment/server-workers/
- Python asyncio docs: https://docs.python.org/3/library/asyncio.html
- Python coroutines and tasks: https://docs.python.org/3/library/asyncio-task.html
- SQLAlchemy asyncio docs: https://docs.sqlalchemy.org/en/20/orm/extensions/asyncio.html
- Uvicorn deployment: https://www.uvicorn.org/deployment/
- Starlette release notes: https://www.starlette.io/release-notes/