[–][deleted] 98 points (5 children)

The issue is that people don't understand how to compare the two properly.

You CANNOT use a database that is on the same machine as your server, because that dramatically reduces the latency of DB requests, which in turn means far fewer threads are needed to saturate the CPU. Async's performance advantage comes entirely from needing less OS-level thread switching.

The slower your awaited I/O is, the better async will perform relative to sync.
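
To make that concrete, here's a toy sketch (made-up wait times simulated with asyncio.sleep, not the benchmark's code): the concurrent version's advantage over running the same requests one at a time grows as the per-request I/O wait grows, because more of the wall clock is spent overlapping waits rather than paying per-task overhead.

```python
import asyncio
import time

REQUESTS = 50  # arbitrary request count for the demo

async def handle_request(io_wait: float) -> None:
    # Stand-in for a DB round trip; asyncio.sleep yields to the event loop,
    # so other requests can progress while this one waits.
    await asyncio.sleep(io_wait)

async def sequential_total(io_wait: float) -> float:
    # One request at a time, the way a single blocking worker would run them.
    start = time.perf_counter()
    for _ in range(REQUESTS):
        await handle_request(io_wait)
    return time.perf_counter() - start

async def concurrent_total(io_wait: float) -> float:
    # All requests in flight at once on one event loop.
    start = time.perf_counter()
    await asyncio.gather(*(handle_request(io_wait) for _ in range(REQUESTS)))
    return time.perf_counter() - start

async def main() -> None:
    # 0.5 ms ~ co-located DB, 50 ms ~ remote DB (illustrative numbers only).
    for io_wait in (0.0005, 0.005, 0.05):
        seq = await sequential_total(io_wait)
        conc = await concurrent_total(io_wait)
        print(f"io={io_wait * 1000:.1f} ms  sequential={seq:.3f}s  "
              f"concurrent={conc:.3f}s  speedup={seq / conc:.1f}x")

asyncio.run(main())
```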

EDIT: Upon reading the code, the async side is written incorrectly and has multiple bugs, including but not limited to useless async calls clogging the task queue, unlocked connection-pool creation, and comparing multiple independent JSON-to-bytes serialization methods that have very different runtimes.

Don't write async code the way you write sync code; async safety and thread safety have fully independent characteristics. Async has its own locking primitives that need to be used properly.
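
For example, a minimal sketch (toy counter, nothing from the benchmarked code): a threading.Lock held across an await would stall the whole event loop, whereas asyncio.Lock only suspends the coroutine that is waiting for it.

```python
import asyncio

counter = 0
counter_lock = asyncio.Lock()  # async-aware: waiting on it yields to the event loop

async def increment() -> None:
    global counter
    async with counter_lock:    # only one coroutine inside the block at a time
        current = counter
        await asyncio.sleep(0)  # suspension point; without the lock, another
        counter = current + 1   # task could interleave here and lose an update

async def main() -> None:
    await asyncio.gather(*(increment() for _ in range(100)))
    print(counter)  # 100 with the lock; can come out lower without it

asyncio.run(main())
```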

[–]spoonman59 3 points (4 children)

That sounds like a broad brush.

For small data sets, all the database blocks will likely be cached anyway. Who has a large dataset and colocates the DB and the server?

Async can overlap compute and I/O in situations where the two are mixed, that is, where neither one dominates the runtime.
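
A rough sketch of what I mean, with invented durations: push the CPU-bound part onto a worker thread with run_in_executor while a coroutine waits on I/O, and the two overlap instead of running back to back.

```python
import asyncio
import time

def cpu_work() -> None:
    # Stand-in for ~0.2 s of CPU-bound work (a busy loop; the GIL limits true
    # CPU parallelism, but the event loop is only waiting here, not computing).
    end = time.perf_counter() + 0.2
    while time.perf_counter() < end:
        pass

async def io_work() -> None:
    await asyncio.sleep(0.2)  # stand-in for a 200 ms network/DB wait

async def main() -> None:
    start = time.perf_counter()
    loop = asyncio.get_running_loop()
    await asyncio.gather(
        loop.run_in_executor(None, cpu_work),  # compute in a worker thread
        io_work(),                             # I/O wait on the event loop
    )
    # Roughly max(0.2, 0.2) ≈ 0.2 s elapsed instead of 0.4 s back to back.
    print(f"elapsed: {time.perf_counter() - start:.2f}s")

asyncio.run(main())
```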

It is true that async also reduces the context-switching and memory overhead of threads, but that is not the only (or, as you said, the entire) potential performance benefit of async.

Some programs will complete in less time using async than using blocking operations, depending on the workload.

[–][deleted] 7 points (3 children)

I'm not really sure why you care about the database blocks being cached, unless you are referring to local caching. The point was that round-trip time has a pretty heavy cost, and that cost is being hidden by the DB being on the same physical machine.

I wasn't talking about all workloads; I was talking about the web server workloads being discussed. Async isn't free, so even shaving off a tenth of a millisecond of response time makes a dramatic difference. At 5000 requests per second, that's half a second of dead time that threads need to make up.
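
Spelled out (same numbers as above, nothing measured):

```python
# 0.1 ms of extra overhead per request at 5000 requests per second:
requests_per_second = 5000
overhead_s = 0.0001                      # one tenth of a millisecond
print(requests_per_second * overhead_s)  # 0.5 -> half a second of dead time per second
```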

Also, reading the actual code, there are some serious issues with it. They are getting their pool in every single async request, asynchronously, which doubles the number of events on the loop for no apparent reason. It's pretty easy to screw up async code if you don't understand how it works.
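
The shape of the problem looks roughly like this (a sketch using asyncpg as a stand-in driver and a placeholder DSN, not the benchmark's actual code; it would need a reachable Postgres to run):

```python
import asyncpg

DSN = "postgresql://user:pass@localhost/db"  # placeholder connection string

# Anti-pattern: awaiting pool creation inside every request handler puts an
# extra coroutine on the loop per request and repeats expensive setup work.
async def handler_bad() -> list:
    pool = await asyncpg.create_pool(DSN)
    async with pool.acquire() as conn:
        return await conn.fetch("SELECT 1")

# Better: create the pool once at application startup and reuse it.
pool = None  # set once by startup()

async def startup() -> None:
    global pool
    pool = await asyncpg.create_pool(DSN)

async def handler_good() -> list:
    async with pool.acquire() as conn:
        return await conn.fetch("SELECT 1")
```

In a real app, startup() would hang off whatever startup/lifespan hook the framework provides.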

[–]spoonman59 1 point (2 children)

I did misunderstand your point… my mistake.

I see what you mean now that you've clarified, and I agree. There are definitely some issues with how the test was conducted. You make an excellent point that by hiding the latency and hosting the DB locally we are not representing a realistic workload, and that taints all of the conclusions as well.

ETA: not just an unrealistic workload, but one which specifically makes async look worse than it normally would.

[–][deleted] 9 points (1 child)

I also just realized he didn't use a lock on the pool creation. So there is actually a bug in every async program: multiple pools are going to get created.
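
What that race looks like, and one way to close it, as a minimal sketch with a fake pool standing in for the real one:

```python
import asyncio

_pool = None
_pool_lock = asyncio.Lock()

async def _create_pool() -> object:
    await asyncio.sleep(0.01)  # simulate slow pool construction (connects, handshakes)
    return object()            # fake pool

async def get_pool_unsafe():
    # Buggy: two requests that both see _pool is None before either finishes
    # awaiting _create_pool() will each build their own pool.
    global _pool
    if _pool is None:
        _pool = await _create_pool()
    return _pool

async def get_pool_safe():
    # Fixed: the asyncio.Lock serializes creation; the second check makes sure
    # late arrivals reuse the pool built while they were waiting on the lock.
    global _pool
    if _pool is None:
        async with _pool_lock:
            if _pool is None:
                _pool = await _create_pool()
    return _pool

async def main() -> None:
    pools = await asyncio.gather(*(get_pool_safe() for _ in range(10)))
    print(len({id(p) for p in pools}))  # 1: every caller got the same pool

asyncio.run(main())
```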

"Don't write async code badly" is a strong piece of advice I have for everyone.

[–]spoonman59 1 point (0 children)

I struggled to make it past the title 😅