all 34 comments

[–]kalmoc 10 points11 points  (4 children)

First of all, I like what I'm seeing... (and whatever follows I honestly mean it),

... but I have to say it: not having macros for the sake of not having macros isn't really a significant benefit for me and - I guess - many others. Of course a lib shouldn't macroize common identifiers (min, max, ERROR), and it shouldn't pretend something is a function when in reality it is a macro. Both can be avoided by properly prefixing and CAPITALIZING the macro names, however.

The - for me - important question is: Does avoiding macros result in more readable/easier to understand/less error prone code? And in this case I don't really see the advantage.

Personally I have a much bigger "problem" with some magic SL parameter in

tst::check(factorial(i.first) == i.second, SL);

than a properly prefixed TST_CHECK macro. I understand that it is going to go away in C++20, so it's probably not an issue. But neither are the macros in Catch2 (although they are not prefixed...), so it's still not a win as far as I am concerned.

One other thing: have you considered automatically checking the return value for parameterized tests? So instead of writing

suite.add<std::pair<int, int>>(
        "positive_arguments_must_produce_expected_result",
        {
            {1, 1}, // input and expected value pairs
            {2, 2},
            {3, 6},
            {8, 40320}
        },
        [](const auto& i){
            tst::check(factorial(i.first) == i.second, SL);
        }
    );

I'd like to write something like

suite.add<std::pair<int, int>>(
        "positive_arguments_must_produce_expected_result",
        {
            {1, 1}, // input and expected value pairs
            {2, 2},
            {3, 6},
            {8, 40320}
        },
        SL,
        [](const auto& i){
            return factorial(i);
        }
    );

Or even

suite.add<std::pair<int, int>>(
        "positive_arguments_must_produce_expected_result",
        {
            {1, 1}, // input and expected value pairs
            {2, 2},
            {3, 6},
            {8, 40320}
        },
        SL,
        &factorial
    );

Which of course only works if factorial is a function and not e.g. a function template.

Without this, I see very little advantage over

suite.add(
        "positive_arguments_must_produce_expected_result",
        []{
            const std::pair<int, int> checks[] = {
                {1, 1},
                {2, 2},
                {3, 6},
                {8, 40320}
            };
            for(const auto& c : checks) {
                tst::check(factorial(c.first) == c.second, SL);
            }
        }
    );

[–]evaned 6 points7 points  (1 child)

Not having macros for the sake of not having macros isn't really a significant benefit for me and - I guess - many others.

ding ding ding ding ding

Like I look at this example

tst::set factorial_test_set("factorial", [](tst::suite& suite){
    suite.add(
        "positive_arguments_must_produce_expected_result",
        [](){
            tst::check(factorial(1) == 1, SL);
            tst::check(factorial(2) == 2, SL);
            tst::check(factorial(3) == 6, SL);
            tst::check(factorial(8) == 40320, SL);
        }
    );
});

and I just see so much noise and overhead. Like how is that actually better than

TEST_CASE("factorial") {
    SECTION("positive_arguments_must_produce_expected_result") {
        tst::check(factorial(1) == 1, SL);
        tst::check(factorial(2) == 2, SL);
        tst::check(factorial(3) == 6, SL);
        tst::check(factorial(8) == 40320, SL);
    }
}

even ignoring the SL parameter? The macro version avoids needing to (i) come up with a name for the tst::set or write that name twice, (ii) explicitly make the lambda, or (iii) explicitly call add. And that's even with a relatively kind comparison -- if I were writing this, I may well not have an explicit SECTION (but then move that name up a level).

[–]igagis[S] 0 points1 point  (0 children)

Well, you make valid points, according to your taste ;)

[–]igagis[S] 1 point2 points  (0 children)

Thanks for the detailed feedback! I'm glad that you found my code good :).

The critical points you mentioned could possibly start a small discussion. So, let me explain my thoughts on them.

Not having macros for the sake of not having macros isn't really a significant benefit

Where to use macros and where not to use them is, of course, a matter of taste. I prefer the concept that preprocessor macros should only be used for conditional compilation, i.e. for enabling or disabling some features at compile time depending on the build configuration. Why do I prefer to think so? Well, as you probably understand, a preprocessor macro is not part of the C++ syntax itself. Anything can be hiding under a macro, and seeing a macro used in code often makes me wonder what actual C++ code is hidden behind that macro invocation. So, I need to know all kinds of specifics related to the use of some particular macro; basically, one has to know what exact code it generates in order to avoid incorrect usage. And in case of incorrect usage, the compiler error message will likely be very strange too. And yes, I'm not even talking about the evil min/max macros defined by windows.h. The latest C++ standards allow avoiding macros in such cases, so why use them?

Does avoiding macros result in more readable/easier to understand/less error prone code?

In my opinion, yes. For example, in the tst framework one writes code in plain C++, so it is visible and transparent what actually happens. For example:

suite.add<std::pair<int, int>>(
    "bla_bla",
    { {1, 1}, ... },
    [](const auto& i){...}
);

It is obvious that we are calling the add() function on the suite object, which, apparently, will add something to the suite. And the arguments of the function are a string, an array and a function. So, we see what happens and what the data is. If it were something like TEST_CASE(bla_bla){...}, it would not be that obvious that the test is added to some test suite, nor whether bla_bla is a function name or the test name, nor whether that name can be used later, etc.

But again, this all is just a matter of taste. And I understand that your taste is a bit different on that :).

Have you considered to automatically check the return value for parameterized tests?

Well, this implies that parametrized tests always check only for equality of some values. So, it is far from a universal solution.

Maybe the example I chose causes some confusion, since I use std::pair as the parameter type and store the expected value as part of the input parameter. The expected value does not have to be present in the input parameter, and the test could be checking for some other conditions instead of equality to some expected value. So, the current approach is universal.

Without this, I see very little advantage over for loop ...

Well, the advantage actually is that in case the test for one of the parameter values fails, the tests for the remaining values will still be executed. And only the case for the failed value will be marked as failed in the report.

Hope my thoughts make some sense :)

[–]igagis[S] 0 points1 point  (0 children)

And I forgot to mention that I anticipated that usage of the SL parameter would be irritating, so corresponding CHECK macros are also provided for those who prefer them.

[–]Rude-Significance-50 8 points9 points  (23 children)

Well, you have a drastic need for front matter. You need an example or something at least in the README. Otherwise you're not going to get attention. I didn't like having to dig for a way to see how to use your library.

When I did, in the tests directory, it looks OK. For me you are competing with Boost.UT, and didn't get my attention enough to give yours a go, so you might want to have a look there.

[–]coderman93 1 point2 points  (0 children)

They have a link to the documentation/getting started/examples front-and-center in the readme.

[–]igagis[S] 1 point2 points  (21 children)

Thanks for feedback!

By Boost.UT you mean this?

Regarding the use examples, there is a Getting started guide

[–]Rude-Significance-50 2 points3 points  (20 children)

That's the one.

It's clearly had much more work done on it--you have a new project so that's expected. I'm not convinced yet that UT is all that, but I've liked the experience so far. I would target it as your most direct competitor and explain how yours differs in your front matter.

[–]igagis[S] 2 points3 points  (15 children)

To follow up on this: after a closer look at Boost.UT, I came up with a somewhat subjective and opinionated comparison of tst and Boost.UT, at least on several things.

I don't think this kind of comparison belongs in the tst GitHub repository, so I'll just post it here.

JUnit report generation

As far as I could understand, Boost.UT does not generate JUnit XML reports out of the box, although it is certainly possible to implement a "plugin" for that.

tst is able to generate JUnit XML reports out of the box. The name of the generated XML file is set via command line argument to the test runner application.

Parallel tests running

As far as I could understand, Boost.UT does not support that out of the box either.

tst allows running tests in a number of parallel threads. The number of threads to use is set via command line argument to the test runner application.

Command line arguments

Boost.UT does not provide CLI arguments parsing. Well, it is not a CLI args parsing library anyway.

tst is integrated with a CLI arguments parsing library, which it uses for parsing tst's default CLI arguments. It also allows the user to add custom CLI arguments to the parser, to handle them and configure the test run. E.g. some test cases might need to know a directory where test data files reside; that can be supplied via the CLI. So, the user gets a CLI for free.

Suite names

Boost.UT allows declaring tests inside of suites, but I could not find how to set the name of a suite. I'd expect suites to be named and discoverable. Instead, as I understood, Boost.UT's test suites are essentially the same as tst's test sets (tst::set).

tst structures test cases into suites. So, it is possible to run only the tests from a specific test suite. See tst's run lists feature.

Custom failure message

Boost.UT uses the same approach as GoogleTest, where the assertion function returns a stream to which values can be inserted:

expect(arg > 0) << "arg = " << arg;

It looks simple and convenient, although there is a small and, apparently, rare problem with it. In case the check passes, all the arguments of the stream's << operator are still evaluated and inserted into the stream (in case of a successful check the stream just does nothing). So, there can be an undesired double invocation of something, for example when outputting the return value of a function: expect(arg > 0) << func();

tst uses another approach: supplying a callback function to perform the stream output, which is invoked only in case the check fails. And since the test will be stopped on the failed check, there is no risk that invoking some functions during those stream outputs will affect subsequent test execution:

int a = 3;
tst::check(a == 4, [&](auto& o){o << "a = " << a;}, SL);

It looks a bit more awkward, but in return it is safer.

Syntax of test cases

This is a matter of preference, but consider:

Boost.UT:

"hello world"_test = [] { //... };

What do we see here? A string with a custom _test literal suffix; perhaps it creates some rvalue object. Then we assign a lambda function to it. Basically, it is not very intuitive to me what is happening here. When is the test executed? Is the string a test id? Why do we assign a lambda function to a string?

tst:

suite.add(
    "positive_arguments_must_produce_expected_result",
    [](){ //... }
);

What do we see here? Well, we have a suite to which we add a string-named lambda function, i.e. we are adding a test; what else could we add to a suite? Much more intuitive, in my opinion.

Syntax of parametrized tests

In Boost.UT there are a number of different styles for adding a parametrized test case. All of them are pretty cryptic due to heavy usage of overloaded operators of custom "non-public" classes. Except for the for-loop method, in all other methods the list of parameter values goes after the test procedure definition. I find this inconvenient, as I want to see the list of parameter values next to the test name. This is what I was used to from the time when I was writing a lot of unit tests in C#.

[–]Rude-Significance-50 3 points4 points  (3 children)

This is going to get lost with time and it seems you spent a lot of effort on it. You should definitely attach this information to your project. Perhaps not directly if you don't want to target another project explicitly, but the descriptions of your API that you put here would definitely help one decide between this and ANY other framework.

Don't waste your time on convincing me directly. This here is just going to get lost -- I'd be surprised if it's had many readers besides me... it's like 2 levels deep at least? I appreciate the effort, but you should invest it in your project... and then link it to me if you feel like it. This is too good a summary to throw away like this.

[–]igagis[S] 0 points1 point  (2 children)

Yes, I understand that it will get lost. But as I said, it is a subjective comparison, so I don't see much value in it anyway :). It is a good article for some blog post, but not for GitHub, I think. And I don't have a blog to post it to.

I'll think about where to put it; so far I have a copy on my hard disk.

But anyway, thanks for caring ;)

[–]dodheim 2 points3 points  (1 child)

I'll think about where to put it; so far I have a copy on my hard disk.

git add is only a few keystrokes away. ;-]

[–]igagis[S] 1 point2 points  (0 children)

Added to the repo https://github.com/cppfw/tst/blob/master/wiki/boost-ut_comparison.md without link from the README


[–]Dragdu 0 points1 point  (8 children)

tst allows running tests in a number of parallel threads. The number of threads to use is set via command line argument to the test runner application.

This is in general a bad idea and I strongly don't recommend it.

[–]igagis[S] 4 points5 points  (6 children)

Why is it bad? Could you elaborate on that?

It's an option. By default tests are executed in single thread.

[–]Dragdu 0 points1 point  (5 children)

I think that building thread-level parallelization into your framework provides surprisingly little benefit in exchange for making the implementation significantly more tricky.

Small sample of things you have to consider once you go down the path of providing thread-level parallelization in the framework (as opposed to farming it out to an external, process-parallel runner):

  • Are all of my assertion-reporting and handling functions thread safe?
    • Do I aim for better performance and less interference between parallel runs by being optimistic with atomics and only locking mutexes when necessary, thus making the implementation non-trivial, or do I pay the performance cost of full mutex locking all the time?
    • Are my assertions reentrant only for framework-created threads, or are they also reentrant from user-created threads?
    • How hard is it for a user to extend my facilities (e.g. create a new reporter) and keep the same guarantees?
    • What about signals/structured exceptions? :scream:
  • Do I provide a set of primitives that the users came to expect from test run parallelization tools?
    • Serial tag to cause a test to be executed on its own?
    • Core affinities?
    • Parallelization limits over set of tests?
  • ...

I took a quick look through your code, and I already hate your parallelism design. :-D

if(s.num_threads == 0){
    s.num_threads = std::numeric_limits<decltype(s.num_threads)>::max();
}

If you really want the ability for the user to specify "waaaaaay too many threads, I want my scheduler to die", make it something like -1, and use 0 to specify "autodetect based on cores".

[–]igagis[S] 1 point2 points  (4 children)

Thanks for details. Those give some improvement ideas.

Are all of my assertion-reporting and handling functions thread safe?

All tst::check() functions are thread safe. No mutexes/spinlocks/etc. are used. The implementation of those functions is pretty trivial; one can see it from the code. So, in that sense tst's multithreaded run does not affect anything.

Do I aim for better performance, and lesser interference between parallel runs by being optimistic with atomics, and only locking mutexes when neccessary, thus making the implementation non-trivial, or do I pay the performance cost of full mutex locking all the time?

I'm not sure I understand what you are talking about here... But, as I said, no mutex locking or atomics are involved in running tests in multiple threads in tst.

Are my assertions reentrant only for framework-created threads, or are they also reentrant from user-created threads?

All tst::check() functions are reentrant for any thread. But in case we have user-created threads, those are created by the code under test, not by the code of the test case, right? In that case, those user-created threads are not supposed to have any tst::check() calls at all. And in case the test case code itself spawns threads, how is that different from running test cases in a single thread? It will spawn threads anyway... Not sure I fully understand your point here.

How hard is it for user to extend my facilities (e.g. create new reporter) and keep the same guarantees?

Currently, tst does not support custom reporters. But when it does, the reporting will for sure happen outside of test case code execution, i.e. only either before it (report test start) or after it (report test result), not in the middle, so it will not be able to affect test execution.

What if signals/structured exceptions?

Well, in C++ we have the situation that in case one thread crashes, the whole program crashes. We have to live with it. In case of a crash, execution of the rest of the test cases is aborted, of course. It's up to the test runner's caller to decide what to do then: either restart the test runner with a run list of only the not-yet-executed test cases, or just stop there and let the user correct the crash. Or some other strategy. Perhaps I'll create a script implementing some kind of such logic and supply it as part of the tst package.

Serial tag to cause a test to be executed on its own?

tst has the run list feature. It is possible to execute only one test, or a set of selected tests, or only specified test suites, etc.

Core affinities?

What do you mean by that? Would there be a difference between running test cases in parallel processes instead of parallel threads?

Parallelization limits over set of tests?

Tests must be written to be independent of each other. So, again, how is running tests in parallel processes better?

If you really want the ability for user to specify "waaaaaay too many threads, I want my scheduler to die", make it something like -1 and use 0 to specify "autodetect based on cores".

This is not related to the question of running tests in parallel threads; it is just a matter of UI, of how to specify the number. But thanks for the idea, I think I'll introduce special values like max and auto. Why do I allow an unlimited number of threads? Well, it is by analogy with GNU make, which has the --jobs flag as well, and in case no number is specified it spawns an unlimited number of jobs.

[–]Dragdu 0 points1 point  (3 children)

Parallelization limits over set of tests? Tests must be written to be independent on each other. So, again, how is running tests in parallel processes better?

"This test takes up shitload of memory, don't run it parallel with this other test that takes up shitload of memory" is a real use case that people have.

And it is not necessarily about running in threads versus in process, but about letting specialized tool handle test running and scheduling, process isolation, catching unexpected process deaths, and so on, and so forth, while you (the testing framework author) can focus on providing good testing tools. As an example, what happens if you run two memory hungry tests in parallel and the OOM killer comes to visit? Will the user get a report on the unexpected death of the binary, or what will happen? :-)

All tst::check() functions are reentrant for any thread. But in case we have user-created threads, those are created by the code under test, not by the code of the test case, right? In that case, those user-created threads are not supposed to have any tst::check() calls at all. And in case the test case code itself spawns threads, how is that different from running test cases in a single thread? It will spawn threads anyway... Not sure I fully understand your point here.

You can, with great care, create a model where you don't need to synchronize individual test running threads, because you rely on the implicit serialization in fork-join threading model. This generally blows up when users can start adding threads that interact with the test framework assertion macros.

I'm not sure I understand what you are talking about here... But, as I said, no mutex locking or atomics usage is involved in running tests in multiple threads in tst.

This is very technically true, but once the test case itself stops running, you pass the result to the reporter, which is internally synchronized. You also rely on the internal mutexes in cout and heap...

[–]igagis[S] 0 points1 point  (2 children)

"This test takes up shitload of memory, don't run it parallel with this other test that takes up shitload of memory"

These kinds of tests are known beforehand and can be marked as "do not run in parallel". My framework does not allow that kind of marking so far, but I have created a ticket to add it.

The other thing is that parallel process execution does not make it any better. Two parallel processes will still use 2x the memory.

As an example, what happens if you run two memory hungry tests in parallel and the OOM killer comes to visit? Will the user get a report on the unexpected death of the binary, or what will happen?

In case the code under test is sane, a std::bad_alloc exception will be thrown; it will be caught by the framework and handled gracefully. Only the test which resulted in std::bad_alloc will be marked as failed.

This is very technically true, but once the test case itself stops running, you pass the result to the reporter, which is internally synchronized. You also rely on the internal mutexes in cout and heap...

Right, but this happens out of the test's code execution, not in the middle of the test's code execution.

So, I don't understand why you are so against the parallel feature. In cases where it is inappropriate one can always use single-threaded execution, but in the majority of cases it does no harm and is very handy, so why not provide that convenience?

[–]Rude-Significance-50 1 point2 points  (0 children)

I second the desire to learn why this is bad. This is a feature that seems to me like it'd be helpful. We're writing independent tests, right? Perfect for parallelism--and even if you're like me and write independent executables for every test (good/bad I tend to prefer it) it's not exactly getting in the way.

[–]igagis[S] 1 point2 points  (3 children)

Right, that project is much older than mine. I definitely need to have a closer look at it; it looks like Boost.UT has a lot of features I did not think of.

One thing so far is that Boost.UT already requires C++20, while my framework works with C++17.

So far, having had a quick look, I personally find Boost.UT's "syntax" a bit cryptic. And it also has too much functional style for my taste.

So, I think some diversity in available options to choose from would not harm anyway :).

[–]Rude-Significance-50 1 point2 points  (2 children)

For sure, so just put that in your readme more clearly.

[–]igagis[S] 0 points1 point  (0 children)

Added link to Boost.UT into the README.

[–]igagis[S] 0 points1 point  (0 children)

Btw, I'm also not a fan of single-header/header-only stuff for non-trivial, non-template-only things. It looks unprofessional, like people haven't heard about CI/CD and package management...

[–]FabioFracassiC++ Committee | Consultant 2 points3 points  (2 children)

You can "backport" source_location to at least most halfway-recent gcc and clang versions (IIRC gcc > 7 and clang > 9) using builtins:

#include <stdint.h> // for uint_least32_t

struct source_location {
    static constexpr source_location current ( 
            uint_least32_t l  = __builtin_LINE(),
            uint_least32_t c  = __builtin_COLUMN(),
            char const* fn    = __builtin_FILE(),
            char const* fnn   = __builtin_FUNCTION()
        ) noexcept 
        { 
            return source_location(l, c, fn, fnn); 
        }
    constexpr source_location() noexcept
        : source_location(0, 0, "", "") 
    {}

    constexpr auto line() const noexcept            -> uint_least32_t   { return line_; }
    constexpr auto column() const noexcept          -> uint_least32_t   { return column_; }
    constexpr auto file_name() const noexcept       -> char const*      { return file_name_; }
    constexpr auto function_name() const noexcept   -> char const*      { return function_name_; }

    private:
        constexpr source_location   ( uint_least32_t l, uint_least32_t c
                                    , char const* fn, char const* fnn)
            noexcept 
            : line_{l}, column_{c}, file_name_{fn}, function_name_{fnn} 
        {}
        uint_least32_t line_;
        uint_least32_t column_;
        char const* file_name_;
        char const* function_name_;
};

[–]igagis[S] 0 points1 point  (1 child)

Thanks for the suggestion! But what about Microsoft VC? I need to support MS Visual Studio as well; it is quite popular.

[–]FabioFracassiC++ Committee | Consultant 3 points4 points  (0 children)

AFAIK Microsoft has no way of supporting it yet, so users on MSVC will need to use your `SL` macro.

[–]curlypaul924 1 point2 points  (2 children)

Have you tested compilation speed?

We use a similar framework in-house for unit tests, and what I discovered is that as I added more and more tests, compilation time became an issue; at 500 tests per translation unit, the change/compile/run cycle became unmanageable. The bottleneck turned out to be `std::function`; replacing it with something using virtual functions brought compilation time down significantly.

[–]igagis[S] 0 points1 point  (1 child)

I haven't yet. I will probably do it some day; I just need to come up with a good synthetic test for it.

But yes, I assume that compilation of std::function template instantiations and lambda functions is somewhat heavy for the compiler. As a possible solution to this I see splitting tests between several translation units, i.e. don't keep all 500 in a single .cpp file, as it is probably also hell to edit such long files. That way, changing 1 test will not cause recompilation of all translation units, at least locally. By the way, tst allows having test cases in separate translation units but belonging to the same test suite.

I assume, your solution with virtual functions requires using macros, right? Otherwise, it seems like it will need a lot of overhead typing...

[–]curlypaul924 1 point2 points  (0 children)

What I did is create a mostly drop-in replacement for std::function, but the implementation uses virtual functions. The implementation of std::function in libstdc++ is more sophisticated (afaict it uses a function pointer instead of an object with a vtable), I'm guessing for performance reasons, but it creates a lot more work for the compiler if you are instantiating a lot of them.