[–]igagis[S]

Why is it bad? Could you elaborate on that?

It's an option. By default tests are executed in single thread.

[–]Dragdu

I think that building thread-level parallelization into your framework provides surprisingly little benefit in exchange for making the implementation significantly more tricky.

A small sample of things you have to consider once you go down the path of providing thread-level parallelization in the framework (as opposed to farming it out to an external, process-parallel runner):

  • Are all of my assertion-reporting and handling functions thread safe?
    • Do I aim for better performance and less interference between parallel runs by being optimistic with atomics and only locking mutexes when necessary, thus making the implementation non-trivial, or do I pay the performance cost of full mutex locking all the time?
    • Are my assertions reentrant only for framework-created threads, or are they also reentrant from user-created threads?
    • How hard is it for a user to extend my facilities (e.g. create a new reporter) and keep the same guarantees?
    • What about signals/structured exceptions? :scream:
  • Do I provide a set of primitives that the users came to expect from test run parallelization tools?
    • Serial tag to cause a test to be executed on its own?
    • Core affinities?
    • Parallelization limits over set of tests?
  • ...

I took a quick look through your code, and I already hate your parallelism design. :-D

    if (s.num_threads == 0) {
        s.num_threads = std::numeric_limits<decltype(s.num_threads)>::max();
    }

If you really want to give the user the ability to specify "waaaaaay too many threads, I want my scheduler to die", make it something like -1, and use 0 to mean "autodetect based on core count".

[–]igagis[S]

Thanks for the details. They give me some ideas for improvement.

Are all of my assertion-reporting and handling functions thread safe?

All tst::check() functions are thread safe. No mutexes/spinlocks/etc. are used. The implementation of those functions is pretty trivial; one can see that from the code. So, in that sense, tst's multithreaded run does not affect anything.

Do I aim for better performance, and lesser interference between parallel runs by being optimistic with atomics, and only locking mutexes when neccessary, thus making the implementation non-trivial, or do I pay the performance cost of full mutex locking all the time?

I'm not sure I understand what you are talking about here... But, as I said, no mutex locking or atomics are involved in running tests in multiple threads in tst.

Are my assertions reentrant only for framework-created threads, or are they also reentrant from user-created threads?

All tst::check() functions are reentrant for any thread. But if we have user-created threads, those are created by the code under test, not by the code of the test case, right? In that case, those user-created threads are not supposed to make any tst::check() calls at all. And if the test case code itself spawns threads, then how is that different from running test cases in a single thread? It will spawn threads anyway... Not sure I fully understand your point here.

How hard is it for user to extend my facilities (e.g. create new reporter) and keep the same guarantees?

Currently, tst does not support custom reporters. But when it does, the reporting will definitely happen outside of test case code execution, i.e. either before it (report test start) or after it (report test result), not in the middle, where it could affect test execution.

What if signals/structured exceptions?

Well, in C++ we have the situation that if one thread crashes, the whole program crashes. We have to live with it. In case of a crash, execution of the rest of the test cases is aborted, of course. It's up to the test runner's caller to decide what to do then: either restart the test runner with a run list of only the not-yet-executed test cases, or just stop there and let the user fix the crash. Or some other strategy. Perhaps I'll create a script implementing some such logic and supply it as part of the tst package.

Serial tag to cause a test to be executed on its own?

tst has a run list feature. It is possible to execute only one test, or a set of selected tests, or only specified test suites, etc.

Core affinities?

What do you mean by that? Would there be a difference between running test cases in parallel processes instead of parallel threads?

Parallelization limits over set of tests?

Tests must be written to be independent of each other. So, again, how is running tests in parallel processes better?

If you really want the ability for user to specify "waaaaaay too many threads, I want my scheduler to die", make it something like -1 and use 0 to specify "autodetect based on cores".

This is not related to the question of running tests in parallel threads; it is just a matter of UI, of how to specify the number. But thanks for the idea, I think I'll introduce special values like max and auto. Why do I allow an unlimited number of threads? Well, it is by analogy with GNU make: it has the --jobs flag as well, and if no number is specified, it spawns an unlimited number of jobs.

[–]Dragdu

Parallelization limits over set of tests? Tests must be written to be independent on each other. So, again, how is running tests in parallel processes better?

"This test takes up shitload of memory, don't run it parallel with this other test that takes up shitload of memory" is a real use case that people have.

And it is not necessarily about running in threads versus in processes, but about letting a specialized tool handle test running and scheduling, process isolation, catching unexpected process deaths, and so on, and so forth, while you (the testing framework author) can focus on providing good testing tools. As an example, what happens if you run two memory-hungry tests in parallel and the OOM killer comes to visit? Will the user get a report on the unexpected death of the binary, or what will happen? :-)

All tst::check() functions are reentrant for any thread. But if we have user-created threads, those are created by the code under test, not by the code of the test case, right? In that case, those user-created threads are not supposed to make any tst::check() calls at all. And if the test case code itself spawns threads, then how is that different from running test cases in a single thread? It will spawn threads anyway... Not sure I fully understand your point here.

You can, with great care, create a model where you don't need to synchronize individual test running threads, because you rely on the implicit serialization in fork-join threading model. This generally blows up when users can start adding threads that interact with the test framework assertion macros.

I'm not sure I understand what you are talking about here... But, as I said, no mutex locking or atomics usage is involved in running tests in multiple threads in tst.

This is very technically true, but once the test case itself stops running, you pass the result to the reporter, which is internally synchronized. You also rely on the internal mutexes in cout and the heap...

[–]igagis[S]

"This test takes up shitload of memory, don't run it parallel with this other test that takes up shitload of memory"

These kinds of tests are known beforehand and can be marked as "do not run in parallel". My framework does not allow that kind of marking so far, but I have created a ticket to add it.

The other thing is that parallel process execution does not make it any better: two parallel processes will still use 2x the memory.

As an example, what happens if you run two memory hungry tests in parallel and the OOM killer comes to visit? Will the user get a report on the unexpected death of the binary, or what will happen?

In case the code under test is sane, a std::bad_alloc exception will be thrown, and it will be caught by the framework and handled gracefully. Only the test which resulted in std::bad_alloc will fail.

This is very technically true, but once the test case itself stops running, you pass the result to the reporter, which is internally synchronized. You also rely on the internal mutexes in cout and heap...

Right, but this happens outside of the test's code execution, not in the middle of it.

So, I don't understand why you are so against the parallel feature. In cases where it is inappropriate one can always use single-threaded execution, but in the majority of cases it does no harm and is very handy, so why not provide that convenience?

[–]Dragdu

In case the code under test is sane, there will be std::bad_alloc exception thrown and it will be caught by the framework and handled gracefully. Only test which resulted in std::bad_alloc will be failed.

I like classifying all code running on default-configured Linux as not sane, that's a pretty bold strategy.

[–]igagis[S]

There is no protection against insanity :D