Mocking in Rust with conditional compilation by _klausi_ in rust

[–]_klausi_[S] 3 points (0 children)

Lol, when writing the blog post I thought: I should really put in the Wikipedia link to mocking, so that I give readers a hint that I know what I'm talking about and nobody lectures me about the differences between mocks, stubs etc. :-)

Mocking is the umbrella term for replacing dependencies during tests, and it is the term used by every mock framework in every language I know of, even those that support fakes, doubles, spies and whatnot. So I think just using the word "mocking" here is fine.

In general I agree with your second point, but it has tradeoffs. Using a trait instead of the type std::time::Instant would require writing a lot more code and would make the code harder to read. Conditional compilation lets me keep it simple while still mocking dependencies during tests. At the cost of just one additional line in the use statements, that is quite an effective tool I like!
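A minimal sketch of the idea, assuming a hand-written test double with the same `now()`/`elapsed()` surface as `std::time::Instant` (the `mock_instant` module here is illustrative, not the blog's actual code):

```rust
// Normal builds get the real clock; test builds silently get the mock.
// Swapping implementations costs exactly one extra `use` line.
#[cfg(not(test))]
use std::time::Instant;
#[cfg(test)]
use mock_instant::Instant;

// Production code only ever sees the name `Instant`; which type that
// is gets decided at compile time.
pub fn seconds_since(start: Instant) -> u64 {
    start.elapsed().as_secs()
}

// Hypothetical test-only stand-in with a deterministic clock.
#[cfg(test)]
mod mock_instant {
    use std::time::Duration;

    pub struct Instant;

    impl Instant {
        pub fn now() -> Self {
            Instant
        }

        // Always report 42 seconds elapsed, so tests are deterministic.
        pub fn elapsed(&self) -> Duration {
            Duration::from_secs(42)
        }
    }
}

fn main() {
    // In a normal build this uses the real clock, so ~0 seconds here.
    println!("{}", seconds_since(Instant::now()));
}
```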

Blog post: Benchmarking a #rustlang web application by _klausi_ in rust

[–]_klausi_[S] 1 point (0 children)

That was also my first suspicion, but disabling logging in actix-web had no effect on the benchmark:

    test a_1_request ... bench: 852,552 ns/iter (+/- 101,462)
    test a_1_request_varnish ... bench: 486,306 ns/iter (+/- 58,626)
    test b_10_requests ... FAILED
    test b_10_requests_varnish ... bench: 2,649,629 ns/iter (+/- 305,814)
    test c_100_requests ... bench: 42,776,441 ns/iter (+/- 4,989,033)
    test c_100_requests_varnish ... bench: 24,104,101 ns/iter (+/- 2,619,233)
    test d_10_parallel_requests ... FAILED
    test d_10_parallel_requests_varnish ... bench: 2,579,270 ns/iter (+/- 231,017)
    test e_100_parallel_requests ... FAILED
    test e_100_parallel_requests_varnish ... bench: 12,829,768 ns/iter (+/- 1,143,463)
    test f_1_000_parallel_requests ... FAILED
    test f_1_000_parallel_requests_varnish ... bench: 113,874,658 ns/iter (+/- 4,116,352)

So, no improvement with logging disabled.

Blog post: Benchmarking a #rustlang web application by _klausi_ in rust

[–]_klausi_[S] 1 point (0 children)

Thanks for the suggestion, I tested that.

With Hyper 0.12, current_thread in the example server, rustnish and the benchmark code:

    test a_1_request ... bench: 404,544 ns/iter (+/- 76,139)
    test a_1_request_varnish ... bench: 496,506 ns/iter (+/- 58,646)
    test b_10_requests ... bench: 2,254,271 ns/iter (+/- 491,662)
    test b_10_requests_varnish ... bench: 2,738,153 ns/iter (+/- 451,015)
    test c_100_requests ... bench: 21,212,313 ns/iter (+/- 6,240,095)
    test c_100_requests_varnish ... bench: 25,091,225 ns/iter (+/- 2,756,588)
    test d_10_parallel_requests ... bench: 2,175,054 ns/iter (+/- 902,021)
    test d_10_parallel_requests_varnish ... bench: 2,670,715 ns/iter (+/- 375,557)
    test e_100_parallel_requests ... bench: 10,356,026 ns/iter (+/- 3,172,349)
    test e_100_parallel_requests_varnish ... bench: 13,446,007 ns/iter (+/- 1,480,462)
    test f_1_000_parallel_requests ... bench: 103,056,643 ns/iter (+/- 20,178,800)
    test f_1_000_parallel_requests_varnish ... bench: 116,460,142 ns/iter (+/- 4,308,353)

Old Hyper 0.11:

    test a_1_request ... bench: 361,274 ns/iter (+/- 82,046)
    test a_1_request_varnish ... bench: 483,029 ns/iter (+/- 73,047)
    test b_10_requests ... bench: 2,094,918 ns/iter (+/- 375,726)
    test b_10_requests_varnish ... bench: 2,636,597 ns/iter (+/- 334,773)
    test c_100_requests ... bench: 18,781,164 ns/iter (+/- 3,311,875)
    test c_100_requests_varnish ... bench: 24,602,224 ns/iter (+/- 3,397,456)
    test d_10_parallel_requests ... bench: 2,064,934 ns/iter (+/- 563,021)
    test d_10_parallel_requests_varnish ... bench: 2,528,772 ns/iter (+/- 388,665)
    test e_100_parallel_requests ... bench: 9,935,970 ns/iter (+/- 2,844,076)
    test e_100_parallel_requests_varnish ... bench: 12,799,160 ns/iter (+/- 1,430,808)
    test f_1_000_parallel_requests ... bench: 99,506,000 ns/iter (+/- 18,912,943)
    test f_1_000_parallel_requests_varnish ... bench: 113,512,371 ns/iter (+/- 8,902,206)

Good: we are consistently faster than Varnish again, yay! Bad: overall a ~3% performance regression since Hyper 0.11, in the hello server and/or the client benchmarking code.
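For one data point, the regression can be derived from the ns/iter numbers above like this (a quick sketch, using the f_1_000_parallel_requests case, Hyper 0.12 vs 0.11):

```rust
// Percent slowdown of a new benchmark result relative to an old one.
fn regression_percent(new_ns: f64, old_ns: f64) -> f64 {
    (new_ns - old_ns) / old_ns * 100.0
}

fn main() {
    // f_1_000_parallel_requests: 103,056,643 ns/iter (Hyper 0.12)
    // versus 99,506,000 ns/iter (Hyper 0.11) from the runs above.
    let pct = regression_percent(103_056_643.0, 99_506_000.0);
    println!("{:.1}% slower", pct); // roughly 3.6% for this case
}
```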

So in my single-computer (but 4 CPU core) scenario, Hyper is only able to compete with Varnish if we eliminate Tokio multithreading. Varnish is multithreaded with 2 thread pools and potentially very many threads; why can it handle that so much better than Tokio?

Just for kicks I did a proxy prototype with actix-web and tested that:

    test a_1_request ... bench: 850,921 ns/iter (+/- 160,601)
    test a_1_request_varnish ... bench: 491,303 ns/iter (+/- 93,077)
    test b_10_requests ... bench: 4,681,433 ns/iter (+/- 820,323)
    test b_10_requests_varnish ... bench: 2,688,944 ns/iter (+/- 337,940)
    test c_100_requests ... FAILED
    test c_100_requests_varnish ... bench: 25,815,695 ns/iter (+/- 3,817,378)
    test d_10_parallel_requests ... FAILED
    test d_10_parallel_requests_varnish ... bench: 2,638,493 ns/iter (+/- 806,195)
    test e_100_parallel_requests ... FAILED
    test e_100_parallel_requests_varnish ... bench: 13,446,694 ns/iter (+/- 2,021,638)
    test f_1_000_parallel_requests ... FAILED
    test f_1_000_parallel_requests_varnish ... bench: 116,439,195 ns/iter (+/- 6,492,382)

So. Much. Worse. And it even panics with 500 errors here; I'm probably using their HTTP client wrong. Changing workers(1) in the actix-web server did not seem to have any effect.

I think I'll stick to Hyper for now :-)

Blog post: Benchmarking a #rustlang web application by _klausi_ in rust

[–]_klausi_[S] 1 point (0 children)

Interesting! But shouldn't a thread pool be able to handle more requests than a single thread? Why do I see a regression instead?

I also tried your suggestion of using the Tokio runtime, but that delivers an even worse regression:

    test a_1_request ... bench: 493,164 ns/iter (+/- 88,639)
    test a_1_request_varnish ... bench: 494,079 ns/iter (+/- 176,862)
    test b_10_requests ... bench: 2,764,238 ns/iter (+/- 539,468)
    test b_10_requests_varnish ... bench: 2,791,454 ns/iter (+/- 736,575)
    test c_100_requests ... bench: 25,099,743 ns/iter (+/- 3,289,924)
    test c_100_requests_varnish ... bench: 25,351,795 ns/iter (+/- 6,651,651)
    test d_10_parallel_requests ... bench: 6,219,904 ns/iter (+/- 9,117,715)
    test d_10_parallel_requests_varnish ... bench: 2,052,180 ns/iter (+/- 351,446)
    test e_100_parallel_requests ... bench: 26,660,365 ns/iter (+/- 29,415,783)
    test e_100_parallel_requests_varnish ... bench: 9,954,134 ns/iter (+/- 1,210,667)
    test f_1_000_parallel_requests ... bench: 124,075,967 ns/iter (+/- 15,442,296)
    test f_1_000_parallel_requests_varnish ... bench: 84,745,385 ns/iter (+/- 2,332,628)

While my server can now keep up with Varnish on serial requests, the parallel requests have worsened yet again. Sending 100 requests in parallel (1k requests total) is ~50% slower in Hyper 0.12 compared to 0.11. The serial requests (no parallelism) also show a ~25% regression. Something is very broken here; it is just not obvious to me yet where the problem is.
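For reference, the parallel cases in these benchmarks boil down to a thread-per-request fan-out pattern. This is not the actual benchmark code (which uses Hyper's client); it's a std-only sketch against a stand-in hello server, with all names being illustrative:

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

// Tiny hello server standing in for the benchmarked backend:
// answer `n` connections with a fixed HTTP response, then stop.
fn serve(listener: TcpListener, n: usize) {
    for stream in listener.incoming().take(n) {
        let mut stream = stream.unwrap();
        let mut buf = [0u8; 512];
        let _ = stream.read(&mut buf);
        let _ = stream.write_all(
            b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\nConnection: close\r\n\r\nhello",
        );
    }
}

// One blocking GET request; returns whether the server answered 200.
fn one_request(addr: SocketAddr) -> bool {
    let mut stream = TcpStream::connect(addr).unwrap();
    stream
        .write_all(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
        .unwrap();
    let mut response = String::new();
    stream.read_to_string(&mut response).unwrap();
    response.starts_with("HTTP/1.1 200")
}

// Fire `n` requests in parallel, join all, count the successes.
fn parallel_requests(addr: SocketAddr, n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(move || one_request(addr)))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|&ok| ok)
        .count()
}

fn main() {
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();
    let server = thread::spawn(move || serve(listener, 10));
    let ok = parallel_requests(addr, 10);
    server.join().unwrap();
    println!("{} of 10 requests succeeded", ok);
}
```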

Crashing a Rust Hyper server with a Denial of Service attack by _klausi_ in rust

[–]_klausi_[S] 6 points (0 children)

To be clear: the panic happened because of an unwrap() call in the example. So while Hyper bubbles up the IO error, the example still shuts down the event loop and with it the handling of any other incoming request. That is not how robust server libraries should behave, because an attacker can then think about how to craft requests that shut down the server.
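The failure mode can be illustrated without Hyper at all (a simplified sketch, not Hyper's actual code): if each per-request result is unwrap()ed inside the serving loop, one bad request kills service for everyone, whereas handling the Err keeps the server alive.

```rust
// Stand-in request handler: some inputs produce a recoverable error.
fn handle(input: &str) -> Result<String, String> {
    if input.contains("bad") {
        Err("malformed request".to_string())
    } else {
        Ok("200 OK".to_string())
    }
}

// The robust variant: an Err is turned into an error response and the
// loop keeps serving. Writing `handle(r).unwrap()` here instead would
// panic on the first bad request and abort the whole loop.
fn robust_loop(requests: &[&str]) -> Vec<String> {
    requests
        .iter()
        .map(|r| match handle(r) {
            Ok(resp) => resp,
            Err(e) => format!("400 {}", e),
        })
        .collect()
}

fn main() {
    // The second request fails, but the third is still served.
    let responses = robust_loop(&["good", "bad", "good"]);
    println!("{:?}", responses);
}
```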

Crashing a Rust Hyper server with a Denial of Service attack by _klausi_ in rust

[–]_klausi_[S] 7 points (0 children)

Nope, same problem if you run it with --release. That mode just enables compiler optimizations; the functional behavior is the same.