Battle: Play, Node.js, Flask, uwsgi, gevent, gunicorn, Tornado and MongoDB
A few benchmarks on popular web frameworks you might consider for building a REST API.
It is also a good occasion to compare an event-driven model with a more classical threaded model.
This first part explains general programming concepts; skip it if you are not interested or if you already know these notions well.
#Introduction -should I thread?-
First, ask yourself these questions when creating your application.
Is my application CPU bound? Will it mainly do complex calculations (maths, image processing, encoding, etc.)?
Is my application IO bound? Will it mainly call a database, read or write files on disk, or make network calls?
Let’s start with a computer with one CPU and one core and run your application.
If your application is CPU bound, then your CPU will be busy doing useful work, and there is not much you can do other than wait for it to finish or scale with a more powerful CPU or another machine.
But if your application is IO bound, in a lot of cases (we will see which ones next), your application will be stuck as well, even though it is doing nothing but waiting for the network or disk to provide some input.
##So how could my application serve more requests?
Let’s assume you are in the first case -CPU bound-
The traditional approach would be to have a more efficient CPU: in a simplified model, a CPU running at a higher frequency and therefore finishing its job faster. There are technological limits to how high the frequency can go, so nowadays we have CPUs with multiple cores.
To go faster, one strategy is to send a smaller amount of work to each core and then gather the results. In a web application, if one core is busy, we can still process other requests on the other cores.
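To make the "split the work, then gather the results" idea concrete, here is a minimal Python sketch. The function names (`split`, `sum_of_squares`, `scatter_gather`, `parallel_sum_of_squares`) are made up for this illustration and are not part of the benchmark code. The mapping function is a parameter, so the same logic runs sequentially with the built-in `map` or on several cores with a process pool.

```python
from concurrent.futures import ProcessPoolExecutor


def split(items, parts):
    """Cut a list into at most `parts` chunks of roughly equal size."""
    size = max(1, -(-len(items) // parts))  # ceiling division, never 0
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """The CPU bound piece of work done on one chunk."""
    return sum(n * n for n in chunk)


def scatter_gather(items, parts, map_fn=map):
    """Send one chunk to each worker through map_fn, then combine the partial results."""
    return sum(map_fn(sum_of_squares, split(items, parts)))


def parallel_sum_of_squares(items, workers=4):
    """Same computation, with each chunk handled by a separate process (one per core)."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return scatter_gather(items, workers, pool.map)
```

Called from a script's `if __name__ == "__main__":` block, `parallel_sum_of_squares(list(range(1_000_000)))` spreads the chunks over 4 processes. The gain only shows up when each chunk is large enough to outweigh the cost of sending it to another process.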
###Let’s move back to our web application and introduce the concept of processes and threads
When starting your application, the operating system will start a process to isolate it from other applications running at the same time in the OS. It will allocate a certain amount of memory to it, and the application will bind a port on which you can access it. Now we need a way to benefit from the multiple cores that we have nowadays in our computers (even the cheapest regular desktop computer you can buy today).
The best way to achieve that is through threads. You can view a thread as a lightweight process. Threads are created within the application and share the memory and state allocated to the process -which is a nice feature as we will see later-. They are cheaper to create and dispose of than processes, but they are also a bit more dangerous. One thread can make your whole application crash (the process, and therefore all its threads), whereas processes run in isolation. In addition, as threads share memory and state, you can access the same objects from different threads and run into bad situations like deadlocks. This generally results from bad coding, so it can be fixed.
Creating processes instead of threads is more expensive in time and resources (memory usage). Moreover, there is no memory sharing, so processes are not well suited to session-based applications that store sessions primarily in memory. Stateless applications such as a simple REST API do not have this concern.
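As a small illustration of threads sharing the memory of their process, the sketch below increments one counter from several threads. It is written for Python 3, and the name `count_with_threads` is hypothetical. The lock is what keeps the shared state consistent; it is also the kind of synchronization that, used carelessly across several resources, leads to the deadlocks mentioned above.

```python
import threading


def count_with_threads(num_threads, increments_per_thread):
    """Increment one shared counter from several threads, guarded by a lock."""
    state = {"counter": 0}   # lives in the process memory, visible to every thread
    lock = threading.Lock()

    def worker():
        for _ in range(increments_per_thread):
            with lock:       # only one thread at a time may update the counter
                state["counter"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["counter"]
```

With separate processes, each one would get its own copy of `state`, and the totals would have to be exchanged explicitly (pipes, queues, a database, …).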
###What should be my ideal configuration?
If your CPU has N cores, then ideally you would have one process with N threads, each of the threads running on one core and always doing something useful, meaning not waiting for IO.
This ideal solution is not achievable in all programming languages. A few are mentioned here as they will be used next for the benchmarks.
- JVM languages (Java, Scala, Groovy, …): run 1 process with N threads.
- Python (the most common implementation is CPython): because of the global interpreter lock (GIL), only one thread executes Python code at a time, so you need to create N processes. A lot of the time threads are blocked/waiting on IO; in this case it is still a good idea to have multiple threads within one process (refer to the advantages of threads in the previous section). The application server will usually run a pre-forked model with a parent process creating nbCpu child processes, so it is usually possible to share memory for the server libs (like the Flask lib).
- Node.js: runs one thread per process, so create N processes as well.
###What is a simple API or application usually doing?
Connecting to a database or reading from disk!
At this point, I want to clarify some terms and concepts you will see in the rest of the article:
###Threaded model
Most of the applications you will find in the wild are based on this model. Remember that if your core is doing something useful (not waiting for IO), there is not much you can do about it, but this is not the most common case. A lot of the time, your thread will be reading from a socket or from disk and waiting for the input to be available. You cannot do anything else with your thread because it is blocked doing that. If another thread is ready -let’s say from another request-, the CPU scheduler will switch to this thread and run it, actually doing something interesting. This is the classic model and was formerly the most popular one because it is simple to understand and easy to code against; programming languages also exposed more blocking APIs for reading/writing than non blocking ones. Yet this model has flaws when you want to handle thousands of concurrent requests.
Plus: Well implemented in most languages, easy to understand and code against - synchronous -
Minus: It does not scale well. Each thread has its own execution stack, which consumes memory (1MB per thread on a 64-bit Linux JVM), and it is even worse with processes instead of threads. In addition, the CPU scheduler introduces overhead when switching between so many threads, meaning that a high fraction of the available CPU time is spent in the scheduler. You also have to be careful about shared variables (mutexes, synchronization, deadlocks, …).
###Event model
In an application that is mostly IO bound (this does not apply if it is highly CPU bound), we just saw that we spend most of our time waiting/blocked. This model only runs a single thread, but it is crucial that we do not run any blocking operations in this thread. The thread only executes useful code and listens to events. Instead of blocking when reading a socket, it will usually register a callback to be executed when the socket is ready. Ideally, socket readiness is detected through the non blocking notification mechanisms of the operating system -epoll, kqueue, select-.
Plus: scales very well, low memory usage, no CPU scheduler switch
Minus: Harder to code against, usually lot of callbacks and nested code structure, often associated with asynchronous coding, harder to debug or trace execution, does not fit well in all use cases
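To show what "register a callback and listen to events" looks like in practice, here is a toy single-threaded event loop built on Python 3's standard `selectors` module (which picks epoll, kqueue or select depending on the OS). It is only a sketch: `run_event_loop` is a made-up name, local socket pairs stand in for client connections, and each message is assumed to be non-empty and smaller than 1024 bytes.

```python
import selectors
import socket


def run_event_loop(messages):
    """Single thread: register one callback per socket, run it when data is ready."""
    selector = selectors.DefaultSelector()
    received = []
    sockets = []

    def on_readable(conn):
        # Only called once the socket is readable, so recv() does not wait.
        received.append(conn.recv(1024))
        selector.unregister(conn)

    for message in messages:
        reader, writer = socket.socketpair()   # stands in for a client connection
        reader.setblocking(False)
        selector.register(reader, selectors.EVENT_READ, on_readable)
        writer.sendall(message)
        sockets.extend([reader, writer])

    # The loop itself: wait for events, then dispatch each one to its callback.
    while len(received) < len(messages):
        for key, _events in selector.select(timeout=1):
            callback = key.data
            callback(key.fileobj)

    for sock in sockets:
        sock.close()
    selector.close()
    return received
```

The order in which callbacks fire depends on which sockets become ready first, which is one reason this style is harder to trace than sequential code.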
###Synchronous code
The code will be executed sequentially as you see it. The thread waits for a statement to complete -even if blocking- before executing the next statement.
###Asynchronous code
The code will NOT be executed in the order you see. The thread will execute asynchronous code (such as sending an async HTTP request) and move on before receiving the response. As you eventually want to do something with the response, this style is usually associated with callbacks: you register a callback to be executed when you are notified that the response has arrived. The other way of dealing with async -instead of callbacks- is to use Futures or Promises, which tell you that you will get the result or an error at some point in the future.
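A short sketch of this behaviour, assuming Python 3 and its `asyncio` module (the benchmarks themselves do not use it). The name `fake_request` and the delays are invented for the example; `asyncio.sleep` stands in for a network call. Both requests are started in code order, yet the faster one finishes first, and the results come back through future-like objects rather than callbacks.

```python
import asyncio


async def fake_request(name, delay, completion_log):
    """Pretend to send a request and wait `delay` seconds for the response."""
    await asyncio.sleep(delay)        # the thread is free to run other code meanwhile
    completion_log.append(name)       # records the order in which responses arrive
    return name.upper()


async def main():
    completion_log = []
    # Both requests are started right away; neither waits for the other.
    results = await asyncio.gather(
        fake_request("slow", 0.05, completion_log),
        fake_request("fast", 0.01, completion_log),
    )
    return completion_log, results


def run_demo():
    return asyncio.run(main())
```

`run_demo()` returns the completion order first and the results second. `gather` keeps the results in the order the requests were written, even though they completed in a different order.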
###Blocking/Non blocking
This reflects the status of your thread. You may achieve non blocking by different mechanisms, such as being notified (event) or polling a status at a given interval. Note that even if a thread is blocked, the calling code can remain non blocking if you run the blocking code in another thread and do not wait for the response in the current thread.
People usually mix up async and non blocking IO, which is fine. Usually async is associated with callbacks being registered, while you can do non blocking IO by calling a non blocking socket, which just gives you an error instead of waiting if the data is not there. Also note that you can have blocking code without blocking I/O. In a multithreaded environment, if two threads compete for the same shared resource and the code is synchronized, one thread will be blocked until the other one releases the resource.
Note also that a very heavy CPU bound task can block the whole application. That’s why we hear that frameworks like Node.js are not suited for such heavy work. If you do not run multiple processes (with the cluster module for instance), you will run 1 process and 1 thread. If your thread is busy doing a long task, the server cannot process any other request. Imagine requests which would take 5ms and almost no CPU, but you cannot serve any of them because of the heavy job; that’s annoying. With multiple threads or processes, the CPU switches from time to time to the other threads to serve requests. So if you don’t mind your heavy background job taking 40 sec instead of 30 sec, you can still run other threads just to serve requests.
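The "error instead of waiting" behaviour of a non blocking socket can be seen with a few lines of Python 3. This is a sketch with made-up names (`read_if_ready`, `demo`), using a local socket pair instead of a real network connection.

```python
import select
import socket


def read_if_ready(sock):
    """Return the available bytes, or None when the read would have had to wait."""
    try:
        return sock.recv(1024)
    except BlockingIOError:   # raised by a non blocking socket with no data yet
        return None


def demo():
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    before = read_if_ready(reader)        # nothing has been sent yet
    writer.sendall(b"hello")
    select.select([reader], [], [], 1)    # wait (at most 1 sec) until data is readable
    after = read_if_ready(reader)
    reader.close()
    writer.close()
    return before, after
```

No callback is involved here: the caller decides when to try again, which is the polling flavour of non blocking IO.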
#Benchmarks
The purpose of the test is to get a single document from a database (MongoDB) and send it to the client.
A lot of systems are moving to an n-tier model with one tier serving as a content source API. Usually this tier connects to various other back-end systems and provides an HTTP API to the clients (it used to be SOAP, now the flavour is more REST). As a result, this system often depends mainly on I/O and does not do crazy calculations => event driven is an appropriate fit! Therefore, we will test a web application waiting on database calls.
Many clients will create requests at the same time. The goal is not to test and stress MongoDB but rather the web frameworks. As a consequence, the database will only contain one simple record.
We will test returning a small document (only one field) and a larger document (hundreds of fields).
##Configuration
Client | Server application | Server database |
---|---|---|
Intel i5-2500K | AMD-FX 8320 8 cores (VM of 4 cores) | AMD-FX 8320 8 cores (VM of 3 cores) |
8GB RAM | 4GB RAM | 2GB RAM |
128 GB SSD | 128 GB SSD | 128 GB SSD |
Ubuntu 14.04 server | Ubuntu 14.04 server | Ubuntu 14.04 server |
I first tested JMeter, which has nice graphs and metrics. Yet it is heavy and threaded, so creating more clients uses more threads. JMeter turned out to be the bottleneck (100% CPU use), so I moved to wrk, which is a better fit for pure load tests: https://github.com/wg/wrk
JMeter is nice if you want to build test scenarios.
###Frameworks tested:
- Flask, python (v2.7.6) : on uwsgi, PyMongo driver -threaded model-
- Flask, python : on uwsgi (2.0.4) with gevent(1.0.1), PyMongo driver -event model via patching-
- Flask, python : on Gunicorn (v18.0) with gevent, PyMongo driver -event model via patching-
- Tornado, python : Motor mongo driver -event model-
- Node.js (v0.10.25) , javascript V8: mongo-native driver, Event model
- Play! (v2.2.3, openJDK 1.7.0_55) scala: ReactiveMongo, Event model
###Non blocking libs
In an event driven framework like Node.js, all the libs you will usually find are asynchronous, as it is the design of Node.js, so you do not have to think too much. When using reactive frameworks such as Play! (Java or Scala) or Tornado, you also have access to the wider Java and Python ecosystems, where many libraries are blocking. Always try to find async, non blocking libraries.
For Play! and Mongo when working with Scala, you could use the following:
- Java mongo driver
- Casbah (scala mongo driver)
- Salat (layer on top of Casbah)
- Reactive Mongo
Of these 4 choices, only Reactive Mongo is a good one, as it is the only one which is not blocking. Play! will only start nbCpus threads (configurable) to serve requests.
####So what to do if I do not find any non blocking libs?
Well, your best solution is to execute blocking code in another thread pool, so you do not impact the pool serving requests. You usually want this thread pool to be larger (in the hundreds; increase the number with the time your threads will block and with the frequency/concurrency of use of these threads).
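Here is what that pattern can look like, sketched in Python 3 with `asyncio` and a `ThreadPoolExecutor` rather than with Play! itself. `blocking_lookup` is a hypothetical stand-in for a blocking driver call (the `time.sleep` simulates the database round trip), and the pool size of 8 is arbitrary.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_lookup(key):
    """Stand-in for a call to a blocking library (e.g. a synchronous DB driver)."""
    time.sleep(0.05)
    return {"key": key}


async def handle_requests(keys, pool):
    loop = asyncio.get_running_loop()
    # Each blocking call runs in the dedicated pool; the event loop thread stays free.
    pending = [loop.run_in_executor(pool, blocking_lookup, key) for key in keys]
    return await asyncio.gather(*pending)


def serve_lookups(keys, pool_size=8):
    """Run all lookups through a separate thread pool of `pool_size` threads."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return asyncio.run(handle_requests(keys, pool))
```

With 8 threads, 8 lookups take roughly the time of one. Once the pool is exhausted, extra calls queue up, which is why this pool is usually sized much larger than the one serving requests.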
##Setting up the VMs
We use Oracle VirtualBox to create the VMs.
###On Mongo VM, install:
Just follow instructions here:
http://docs.mongodb.org/manual/tutorial/install-mongodb-on-linux/
Create a db, collection and document:
Increase the number of open descriptors:
###On Server VM, install:
Install git.
Execute server as root
#Tests
As an introduction, I want to emphasize again how careful you have to be if you choose the threaded model.
After installing the dependencies above for Flask and uwsgi, you could run:
This will serve 1 request at a time. If your database call is 10ms, then you will serve sequentially 1000/10=100 requests per second.
To see this behavior explicitly, you can test the /sleep endpoint. By sleeping 1 sec, we simulate a blocking database call which would take 1 sec.
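The arithmetic above generalizes easily. As a rough upper bound (a simplified model that ignores framework and network overhead), throughput is the number of workers that can block at the same time divided by the duration of one call. The function name is made up for the illustration.

```python
def max_requests_per_second(call_duration_ms, concurrent_workers=1):
    """Upper bound on throughput when each worker handles one blocking call at a time."""
    return concurrent_workers * 1000 / call_duration_ms


# A 10ms database call served by a single worker.
print(max_requests_per_second(10))
# The /sleep endpoint (1000ms) with one worker, then with two.
print(max_requests_per_second(1000))
print(max_requests_per_second(1000, concurrent_workers=2))
```

Measured numbers stay a little under this bound, because connection handling and response writing take time too.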
##Intro
###We run 1 client
Running 5s test @ http://192.168.0.12:5000/sleep
1 threads and 1 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.00s 0.00us 1.00s 100.00%
Req/Sec 0.00 0.00 0.00 100.00%
5 requests in 5.02s, 390.00B read
Socket errors: connect 0, read 4, write 0, timeout 0
Requests/sec: 1.00
Transfer/sec: 77.67B
###We run 2 clients but the throughput is still the same
Running 5s test @ http://192.168.0.12:5000/sleep
1 threads and 2 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.01s 0.00us 2.01s 100.00%
Req/Sec 0.00 0.00 0.00 100.00%
5 requests in 5.02s, 390.00B read
Socket errors: connect 0, read 4, write 0, timeout 1
Requests/sec: 1.00
Transfer/sec: 77.68B
###Now if we run 2 threads on the server:
Running 5s test @ http://192.168.0.12:5000/sleep
1 threads and 2 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.00s 0.00us 1.00s 100.00%
Req/Sec 1.00 0.00 1.00 100.00%
9 requests in 5.03s, 702.00B read
Socket errors: connect 0, read 8, write 0, timeout 0
Requests/sec: 1.79
Transfer/sec: 139.59B
Note that running two processes would work too, but the overhead is larger with processes than with threads. You can run 1000 threads, but you cannot run 1000 processes (CPU, memory).
##Test Flask
Adjust the IP in app.py to your Mongo VM IP (I use a bridged network adapter in the VM, so that my VM has an IP visible on the network).
###Flask + uwsgi -threaded model, it is the default-
One drawback of using multiple processes is higher memory use, and also the multiplication of Mongo clients -one pool per process-. It is thus harder to configure the connection pool properly, because you have to know the number of processes that you will run in production.
###Flask + uwsgi + gevent
Gevent adds an event model (non blocking) and provides a synchronous API on top of it (pretty unusual and pretty cool), so you do not have to write callbacks.
It patches the blocking I/O operations of the standard library so they won’t block anymore. This means that we can run the process with only one thread.
Gevent in uwsgi has an option called asynccores. It actually has nothing to do with CPU cores or threads, as only one thread runs, but it lets you specify the maximum number of concurrent requests allowed.
Indeed, each additional concurrent request results in a representation of the request context and state in memory, and thus in additional memory consumed. You can nevertheless specify a very high number: memory is only consumed for the actual concurrent requests.
Uncomment the following line in app.py
And increase the number of connections in the pool to 500.
If you plan to support 1000 concurrent users and you would run 4 processes, you may lower the connection pool in each client to 250.
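The sizing rule is just a division, but writing it down avoids forgetting that every process gets its own pool. A tiny sketch, with a hypothetical function name, rounding up so that the total capacity never falls below the target:

```python
def pool_size_per_process(total_connections, processes):
    """Connections each process-local pool needs so that all pools together cover the target."""
    return -(-total_connections // processes)   # ceiling division


# 1000 concurrent users spread over 4 processes.
print(pool_size_per_process(1000, 4))
```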
Here we specify a max of 5000 concurrent requests (async cores). Note that we run a single process and a single thread!
We may also increase the number of processes.
###Flask + gunicorn + gevent
Gunicorn by default uses sync workers (worker means process). It is not possible to specify threads.
As a consequence, for us it is almost mandatory to run it with an async framework like gevent.
So leave the gevent patching uncommented in app.py, go to the app folder and run:
Note that the usual advice is to run Python frameworks with 2*nbCpu+1 processes, so 9 in our case. Yet as we only do non blocking I/O, it does not bring additional performance. It would if we had some blocking calls.
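For reference, the 2*nbCpu+1 rule of thumb is easy to compute from the machine itself. The helper below is a hypothetical sketch in Python 3, not something taken from gunicorn; `os.cpu_count()` may return `None`, hence the fallback to 1.

```python
import os


def suggested_worker_count(cpu_count=None):
    """Apply the 2*nbCpu+1 rule; detect the number of CPUs when none is given."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return 2 * cpus + 1


# The 4-core server VM used in these tests.
print(suggested_worker_count(4))
```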
##Test Node
Node, like Python, runs 1 thread in 1 process, so we use the cluster module, which will pre-fork nbCpu processes for us. The cluster module is set up in the code of the application. The same remarks about pool size as for Python apply.
##Test Tornado
The idea behind Tornado is the same as with Node. It does not run under uwsgi, as uwsgi does not support async calls.
##Test Play
It will download required dependencies. Play starts one process and uses nbCpu threads to handle requests.
##Client
Install wrk (https://github.com/wg/wrk): git clone or download the zip.
##Results
I replayed each test at least 3 times to account for variance, and also ensured that I had enough threads*processes to cover the number of requests in the threaded model.
For play, I used 6000 connections in the Mongo pool configuration.
For python and node, I used 6000/4 processes = 1500 in the pool configuration
Be sure to check your ulimit -n settings
Max load is 5000 concurrent requests.
###First test is with a very small json document (only 1 field)
(Chart: successful requests)
(Chart: failed requests)
Conclusion: Play! is the fastest, threaded model with uwsgi is the slowest and the one throwing the most errors.
###Second test is with a big json document (src on github)
(Chart: successful requests)
(Chart: failed requests)
Conclusion: Node looks really fast here; my guess is that deserialization of BSON -> JSON in JavaScript is very efficient compared to the other languages.
Event-driven frameworks are fast. Gunicorn is a better choice than uwsgi.
##Full report:
###Small document
####200 clients/concurrent requests:
Play:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 57.07ms 73.09ms 1.18s 87.40%
Req/Sec 3.52k 636.42 5.50k 71.55%
104786 requests in 30.00s, 16.09MB read
Requests/sec: 3492.87
Transfer/sec: 549.17KB
Node:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 69.71ms 96.14ms 766.52ms 95.80%
Req/Sec 2.95k 191.85 3.41k 69.70%
87592 requests in 30.00s, 22.14MB read
Requests/sec: 2919.74
Transfer/sec: 755.60KB
Flask + UWSGI threaded:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 159.40ms 133.66ms 901.64ms 75.38%
Req/Sec 828.39 71.71 0.92k 72.73%
24712 requests in 30.00s, 3.72MB read
Socket errors: connect 0, read 24709, write 0, timeout 45
Requests/sec: 823.69
Transfer/sec: 127.09KB
Flask + UWSGI + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 91.16ms 21.97ms 316.48ms 91.30%
Req/Sec 1.54k 62.07 1.67k 75.93%
46240 requests in 30.00s, 6.97MB read
Socket errors: connect 0, read 46223, write 0, timeout 0
Requests/sec: 1541.34
Transfer/sec: 237.82KB
Flask + Gunicorn + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 73.31ms 59.08ms 537.16ms 80.88%
Req/Sec 2.69k 164.36 2.88k 86.25%
80353 requests in 30.00s, 18.54MB read
Requests/sec: 2678.43
Transfer/sec: 632.99KB
Tornado + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 68.19ms 28.91ms 167.48ms 73.29%
Req/Sec 2.91k 136.80 3.26k 64.49%
86753 requests in 30.00s, 22.67MB read
Requests/sec: 2891.78
Transfer/sec: 773.78KB
####1000 clients/concurrent requests:
Play:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 195.40ms 78.51ms 2.42s 92.69%
Req/Sec 5.15k 539.84 6.14k 62.50%
152946 requests in 30.00s, 23.48MB read
Requests/sec: 5098.15
Transfer/sec: 801.57KB
Node:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 352.00ms 215.41ms 1.96s 79.08%
Req/Sec 2.83k 172.33 3.24k 75.00%
84316 requests in 30.00s, 21.31MB read
Requests/sec: 2810.50
Transfer/sec: 727.33KB
Flask + UWSGI threaded:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 889.14ms 1.62s 24.14s 97.30%
Req/Sec 786.79 116.25 1.09k 69.70%
23529 requests in 30.00s, 3.55MB read
Socket errors: connect 0, read 23527, write 0, timeout 2693
Requests/sec: 784.29
Transfer/sec: 121.01KB
Flask + UWSGI + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 433.91ms 227.04ms 4.08s 93.26%
Req/Sec 1.55k 88.75 1.74k 59.46%
46356 requests in 30.00s, 6.98MB read
Socket errors: connect 0, read 46355, write 0, timeout 193
Requests/sec: 1545.13
Transfer/sec: 238.41KB
Flask + Gunicorn + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 391.82ms 352.81ms 8.90s 91.60%
Req/Sec 2.48k 272.07 3.11k 71.88%
74063 requests in 30.00s, 17.09MB read
Socket errors: connect 0, read 0, write 0, timeout 345
Requests/sec: 2468.73
Transfer/sec: 583.43KB
Tornado + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 377.50ms 211.26ms 1.79s 84.83%
Req/Sec 2.59k 490.37 3.77k 73.26%
77651 requests in 30.02s, 20.29MB read
Requests/sec: 2586.83
Transfer/sec: 692.18KB
####5000 clients/concurrent requests:
Play:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 5000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 784.53ms 1.00s 22.79s 96.99%
Req/Sec 6.01k 0.97k 7.69k 71.43%
177443 requests in 30.00s, 27.24MB read
Socket errors: connect 0, read 369, write 0, timeout 9831
Requests/sec: 5914.00
Transfer/sec: 0.91MB
Node:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 5000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.90s 1.15s 6.71s 90.60%
Req/Sec 2.43k 521.25 3.42k 68.97%
70955 requests in 30.00s, 17.93MB read
Socket errors: connect 0, read 0, write 0, timeout 12651
Requests/sec: 2365.01
Transfer/sec: 612.04KB
Flask + UWSGI threaded:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 5000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 7.26s 9.86s 26.35s 79.21%
Req/Sec 0.87k 158.51 1.21k 74.19%
25560 requests in 30.02s, 3.85MB read
Socket errors: connect 0, read 26076, write 0, timeout 43560
Requests/sec: 851.37
Transfer/sec: 131.36KB
Flask + UWSGI + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 5000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.43s 2.08s 26.14s 93.81%
Req/Sec 1.58k 92.50 1.78k 72.73%
46912 requests in 30.00s, 7.07MB read
Socket errors: connect 0, read 46903, write 0, timeout 29852
Requests/sec: 1563.52
Transfer/sec: 241.25KB
Flask + Gunicorn + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 5000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.20s 2.13s 27.03s 96.46%
Req/Sec 2.16k 242.02 2.56k 69.44%
64328 requests in 30.00s, 14.85MB read
Socket errors: connect 0, read 388, write 0, timeout 38478
Requests/sec: 2144.11
Transfer/sec: 506.71KB
Just to show you that more processes do not improve throughput,
next are tests with 9 processes for gunicorn.
Without gevent
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 433.07ms 2.85s 28.31s 98.93%
Req/Sec 1.34k 533.32 2.00k 70.44%
40363 requests in 30.00s, 9.12MB read
Socket errors: connect 0, read 0, write 0, timeout 202
Requests/sec: 1345.40
Transfer/sec: 311.39KB
Here it is not too bad, because the db call is on the same machine, so it takes less than 1ms; it would be awful if the db were slow or busy, or if network latency were high.
With gevent
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 76.28ms 98.70ms 1.88s 88.86%
Req/Sec 2.59k 92.71 2.77k 66.23%
77420 requests in 30.00s, 17.87MB read
Requests/sec: 2580.67
Transfer/sec: 609.88KB
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 400.74ms 284.32ms 5.32s 92.30%
Req/Sec 2.50k 92.26 2.66k 62.50%
74142 requests in 30.00s, 17.11MB read
Socket errors: connect 0, read 0, write 0, timeout 67
Requests/sec: 2471.36
Transfer/sec: 584.05KB
Tornado + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 5000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.33s 1.39s 19.26s 90.36%
Req/Sec 2.15k 278.71 2.46k 62.50%
61850 requests in 30.00s, 16.16MB read
Socket errors: connect 0, read 0, write 0, timeout 15260
Requests/sec: 2061.52
Transfer/sec: 551.62KB
###Big document
####200 clients/concurrent requests:
Play:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 254.29ms 262.76ms 2.09s 92.59%
Req/Sec 1.05k 364.50 1.58k 65.52%
31045 requests in 30.00s, 179.13MB read
Requests/sec: 1034.84
Transfer/sec: 5.97MB
Node:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 138.11ms 89.99ms 594.72ms 77.12%
Req/Sec 1.44k 79.86 1.62k 69.70%
42899 requests in 30.00s, 345.38MB read
Requests/sec: 1429.96
Transfer/sec: 11.51MB
Flask + UWSGI threaded:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 604.89ms 866.90ms 5.74s 88.15%
Req/Sec 277.74 89.55 440.00 71.43%
8207 requests in 30.00s, 24.25MB read
Socket errors: connect 0, read 8722, write 0, timeout 329
Non-2xx or 3xx responses: 4498
Requests/sec: 273.56
Transfer/sec: 827.60KB
Flask + UWSGI + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 223.19ms 744.86ms 26.40s 98.98%
Req/Sec 628.22 24.90 678.00 68.75%
18811 requests in 30.00s, 114.81MB read
Socket errors: connect 0, read 18810, write 0, timeout 447
Requests/sec: 627.02
Transfer/sec: 3.83MB
Flask + Gunicorn + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 316.08ms 112.43ms 807.73ms 69.39%
Req/Sec 637.83 25.89 698.00 75.61%
19108 requests in 30.00s, 118.16MB read
Requests/sec: 636.93
Transfer/sec: 3.94MB
Tornado + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 240.82ms 145.22ms 818.39ms 81.85%
Req/Sec 846.31 81.98 1.04k 63.79%
25263 requests in 30.00s, 156.99MB read
Requests/sec: 842.10
Transfer/sec: 5.23MB
####1000 clients/concurrent requests:
Play:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 857.45ms 589.91ms 15.82s 85.86%
Req/Sec 1.12k 170.53 1.50k 72.09%
33462 requests in 30.00s, 193.08MB read
Socket errors: connect 0, read 195, write 0, timeout 387
Requests/sec: 1115.38
Transfer/sec: 6.44MB
Node:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 740.09ms 424.49ms 3.88s 74.38%
Req/Sec 1.36k 74.50 1.54k 58.82%
40379 requests in 30.00s, 325.09MB read
Socket errors: connect 0, read 0, write 0, timeout 197
Requests/sec: 1345.88
Transfer/sec: 10.84MB
Flask + UWSGI threaded:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 3.25s 4.73s 13.16s 82.15%
Req/Sec 227.03 87.62 379.00 60.61%
6856 requests in 30.01s, 14.81MB read
Socket errors: connect 0, read 8355, write 0, timeout 9851
Non-2xx or 3xx responses: 4706
Requests/sec: 228.43
Transfer/sec: 505.14KB
Flask + UWSGI + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 462.35ms 1.44s 27.40s 97.03%
Req/Sec 634.82 34.25 720.00 67.65%
18968 requests in 30.00s, 115.77MB read
Socket errors: connect 0, read 18954, write 0, timeout 8766
Requests/sec: 632.26
Transfer/sec: 3.86MB
Flask + Gunicorn + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.48s 643.74ms 5.50s 72.59%
Req/Sec 647.46 35.99 718.00 62.86%
19351 requests in 30.00s, 119.66MB read
Socket errors: connect 0, read 0, write 0, timeout 860
Requests/sec: 645.02
Transfer/sec: 3.99MB
Tornado + gevent:
Running 30s test @ http://192.168.0.12:5000/mongo
1 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.13s 580.67ms 3.29s 65.88%
Req/Sec 0.87k 127.90 1.27k 79.31%
25725 requests in 30.01s, 159.86MB read
Socket errors: connect 0, read 0, write 0, timeout 191
Requests/sec: 857.27
Transfer/sec: 5.33MB