
Conversation

@savannahostrowski
Member

I discovered that we didn't have a benchmark in here for FastAPI, so I figured I'd add one to start. This is a pretty basic example of canonical FastAPI request handling.
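
For illustration, here's a minimal sketch of the shape such a benchmark can take. This is not the merged code; it assumes an in-process ASGI client (httpx) and pyperf's async runner, and the endpoint, names, and request count are made up:

```python
# Illustrative sketch only; the merged benchmark may differ.
import httpx
import pyperf
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Item(BaseModel):
    name: str
    price: float


@app.get("/items/{item_id}")
async def read_item(item_id: int) -> Item:
    # Exercises routing, path-parameter parsing, and Pydantic serialization.
    return Item(name=f"item-{item_id}", price=1.0)


async def bench_requests(n_requests: int) -> None:
    # Drive the app in-process over the ASGI interface (no network socket),
    # so the timing focuses on Python-side request handling.
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://bench") as client:
        for _ in range(n_requests):
            resp = await client.get("/items/42")
            assert resp.status_code == 200


if __name__ == "__main__":
    runner = pyperf.Runner()
    # pyperf times repeated awaits of bench_requests(100).
    runner.bench_async_func("fastapi_http", bench_requests, 100)
```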

@albertedwardson

Hi! Just passing by and wanted to put my two cents in. I’m a bit unsure about adding this to the benchmark suite though.

pyperformance already has an async HTTP benchmark with Tornado (bm_tornado_http) that covers Python async request handling. FastAPI, by contrast, pulls in extra layers, especially Pydantic, which does its validation in Rust.

I worry it could add noise rather than useful data about Python performance: if there's ever a regression in this benchmark, it might be hard to tell what caused it. Maybe this fits better as an external benchmark rather than part of the core suite?

Not trying to nitpick, just genuinely curious: what is the goal to measure here? For a real-world minimal FastAPI app, this example lacks database or business logic and only serves semi-static responses, so I'm not sure what Python-side behavior it's meant to represent.

@savannahostrowski
Member Author

I think there are a couple things to consider here:

  • There's existing precedent for adding popular libraries/frameworks to the benchmark suite; see Django or, as you pointed out, Tornado. FastAPI is now the most popular Python web framework, so it seems relevant to consider adding a benchmark here.
  • To your point about FastAPI dependencies potentially causing extra noise: the same could be said for Django or Tornado, which also have their own sets of dependencies. If we see a regression, the first step is always to investigate whether it's CPython or the dependencies...but IMO, that's true for any external framework benchmark.
  • I did consider adding more complexity to this benchmark, but candidly, most benchmarks in here are very simple. The goal is to track how Python changes affect FastAPI's core request handling and async patterns. If we wanted more involved benchmarking, I'd rather add separate benchmarks for other scenarios/features.

@kumaraditya303
Contributor

Just curious, do you have numbers for this benchmark?

@savannahostrowski
Member Author

I mean, running it locally:

Run 1: calibrate the number of loops: 1
- calibrate 1: 131 ms (loops: 1, raw: 131 ms)
- calibrate 2: 125 ms (loops: 1, raw: 125 ms)
- calibrate 3: 121 ms (loops: 1, raw: 121 ms)
- calibrate 4: 127 ms (loops: 1, raw: 127 ms)
Calibration: 1 warmup, 1 loop
Run 2: 1 warmup, 3 values, 1 loop
- warmup 1: 127 ms
- value 1: 124 ms
- value 2: 124 ms
- value 3: 124 ms
Run 3: 1 warmup, 3 values, 1 loop
- warmup 1: 134 ms
- value 1: 134 ms
- value 2: 124 ms (-6%)
- value 3: 136 ms
Run 4: 1 warmup, 3 values, 1 loop
- warmup 1: 128 ms (+5%)
- value 1: 124 ms
- value 2: 121 ms
- value 3: 121 ms
Run 5: 1 warmup, 3 values, 1 loop
- warmup 1: 127 ms
- value 1: 121 ms
- value 2: 125 ms
- value 3: 124 ms
Run 6: 1 warmup, 3 values, 1 loop
- warmup 1: 145 ms (+18%)
- value 1: 124 ms
- value 2: 122 ms
- value 3: 124 ms
Run 7: 1 warmup, 3 values, 1 loop
- warmup 1: 128 ms
- value 1: 123 ms
- value 2: 126 ms
- value 3: 123 ms
Run 8: 1 warmup, 3 values, 1 loop
- warmup 1: 128 ms
- value 1: 122 ms
- value 2: 125 ms
- value 3: 123 ms
Run 9: 1 warmup, 3 values, 1 loop
- warmup 1: 133 ms
- value 1: 143 ms
- value 2: 138 ms
- value 3: 138 ms
Run 10: 1 warmup, 3 values, 1 loop
- warmup 1: 128 ms
- value 1: 123 ms
- value 2: 123 ms
- value 3: 133 ms (+5%)
Run 11: 1 warmup, 3 values, 1 loop
- warmup 1: 129 ms (+6%)
- value 1: 120 ms
- value 2: 124 ms
- value 3: 122 ms
Run 12: 1 warmup, 3 values, 1 loop
- warmup 1: 127 ms
- value 1: 124 ms
- value 2: 129 ms
- value 3: 125 ms
Run 13: 1 warmup, 3 values, 1 loop
- warmup 1: 128 ms
- value 1: 124 ms
- value 2: 127 ms
- value 3: 125 ms
Run 14: 1 warmup, 3 values, 1 loop
- warmup 1: 139 ms (+10%)
- value 1: 127 ms
- value 2: 127 ms
- value 3: 124 ms
Run 15: 1 warmup, 3 values, 1 loop
- warmup 1: 130 ms
- value 1: 121 ms
- value 2: 123 ms
- value 3: 127 ms
Run 16: 1 warmup, 3 values, 1 loop
- warmup 1: 128 ms
- value 1: 123 ms
- value 2: 127 ms
- value 3: 124 ms
Run 17: 1 warmup, 3 values, 1 loop
- warmup 1: 130 ms
- value 1: 125 ms
- value 2: 125 ms
- value 3: 129 ms
Run 18: 1 warmup, 3 values, 1 loop
- warmup 1: 131 ms (+7%)
- value 1: 124 ms
- value 2: 122 ms
- value 3: 122 ms
Run 19: 1 warmup, 3 values, 1 loop
- warmup 1: 129 ms
- value 1: 123 ms
- value 2: 128 ms
- value 3: 124 ms
Run 20: 1 warmup, 3 values, 1 loop
- warmup 1: 130 ms
- value 1: 123 ms
- value 2: 124 ms
- value 3: 124 ms
Run 21: 1 warmup, 3 values, 1 loop
- warmup 1: 128 ms
- value 1: 124 ms (-7%)
- value 2: 127 ms
- value 3: 147 ms (+11%)
Run 22: 1 warmup, 3 values, 1 loop
- warmup 1: 129 ms
- value 1: 136 ms (+6%)
- value 2: 127 ms
- value 3: 124 ms

WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainty of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python3 -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

fastapi_http: Mean +- std dev: 126 ms +- 5 ms

Unfortunately, many of the I/O-heavy benchmarks are unstable (e.g., Tornado is as well). Diego and I spent a bunch of time going back and forth trying to make this one more stable.
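
For anyone poking at the instability: the analysis tools the warning mentions are real pyperf subcommands. Assuming the run was saved with -o to a JSON file (fastapi_http.json is a hypothetical name), they look like:

```
python3 -m pyperf stats fastapi_http.json   # mean, std dev, percentiles
python3 -m pyperf hist fastapi_http.json    # histogram of the values
python3 -m pyperf dump fastapi_http.json    # every warmup/value per run
```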

@kumaraditya303
Contributor

I meant a comparison between, say, 3.10 and 3.14.

@savannahostrowski
Member Author


Sure, looks like ~73% faster (with --processes 2 --values 3) 🎉

3.10: fastapi_http: Mean +- std dev: 215 ms +- 15 ms

3.14: fastapi_http: Mean +- std dev: 124 ms +- 2 ms

Update: with more processes this looks more like a ~57% speedup, so not quite as good, but still a solid improvement.
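
For anyone reproducing this: a comparison like the above can be produced by running the benchmark script under each interpreter with pyperf's standard CLI flags and then comparing the saved results. The script and file names here are hypothetical:

```
python3.10 bm_fastapi_http.py --processes 2 --values 3 -o fastapi_310.json
python3.14 bm_fastapi_http.py --processes 2 --values 3 -o fastapi_314.json
python3 -m pyperf compare_to fastapi_310.json fastapi_314.json
```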

@kumaraditya303
Contributor


Nice! The asyncio benchmarks show a similar ~80% improvement between 3.10 and 3.14.

A significant part of the speedup would be from python/cpython#107803 in 3.14. Either way, happy to see that FastAPI performance has improved similarly to asyncio :)

@savannahostrowski
Member Author

Just for transparency (with --processes 21 --values 3):

[fastapi_310] 212 ms +- 13 ms (baseline)
[fastapi_311] 142 ms +- 6 ms (1.49x faster)
[fastapi_312] 136 ms +- 8 ms (1.56x faster)
[fastapi_313] 131 ms +- 8 ms (1.62x faster)
[fastapi_314] 135 ms +- 20 ms (1.57x faster)
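
A multi-way comparison like this can be generated with pyperf's compare_to given one baseline and several result files (it also has a --table mode); the file names here are hypothetical:

```
python3 -m pyperf compare_to fastapi_310.json fastapi_311.json \
    fastapi_312.json fastapi_313.json fastapi_314.json
```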

@diegorusso
Contributor

LGTM. Thanks for adding FastAPI to pyperformance.

@savannahostrowski
Member Author

Thanks for all the feedback/reviews @diegorusso!

@savannahostrowski merged commit ad43918 into python:main Dec 16, 2025
19 checks passed