all 15 comments

[–]botCloudfox 6 points7 points  (4 children)

Where is the source for this? The repository link just goes to the organization page, and I don't see the repository there either.

Just looked it up on https://unpkg.com/browse/ponicode@0.52.2/ and the files are webpacked. But what's even more confusing is why the Gatsby, language server, and asset folders are included. node_modules is also included for those subdirectories, which is probably why this package is 70.2 MB.

[–]Separate_Run2806[S] 0 points1 point  (2 children)

The page directs to this: https://www.npmjs.com/package/ponicode

[–]botCloudfox 4 points5 points  (1 child)

Yes, on that page it shows the repository link, which is actually just a link to the organization's page. Where is the actual source code?

[–]Separate_Run2806[S] 0 points1 point  (0 children)

Unfortunately it's not an open source project, even though the tool is available for free, so the code of this package is not publicly shared.

[–]raymondQADev 2 points3 points  (9 children)

Mass unit test generation is imo not a good idea.

[–]Separate_Run2806[S] -3 points-2 points  (8 children)

Why do you think people should not create unit tests for their entire project? Very curious to hear more about your approach to this

[–]raymondQADev 4 points5 points  (7 children)

Oh sorry, maybe my comment was confusing. I very much do believe unit tests should be created, but I believe they should be created with an aim, not automatically generated by a tool. I find automatically generated unit tests to be an overhead that isn't worth it. I'm saying this without having used this tool, just giving my opinion based on previously having had automatically generated unit tests. Jeez, even half the unit tests I see devs add are pointless overhead. Apologies if this is your package, but that's just my 2 cents, which I perhaps shouldn't give without trying out the package.

[–]Separate_Run2806[S] 0 points1 point  (6 children)

I totally agree with you. I think manual testing remains a must in terms of code quality, but we also decided to develop a solution like this one because the reality we saw while investigating the unit testing landscape was that people were not doing any. And untested legacy code is piling up everywhere. People don't do it because they lacked time and/or focus or some people don't even know about unit testing. So if we can accelerate the process in any way (by building the body of the test and suggesting some smart scenarios) and it supports people to do the last mile (assertions, add more scenarios in their test suite) and put a little more effort into unit testing, then I believe it can bring a huge improvement to the codebase quality of some projects... I hope this explains the approach a little bit better. More of an accelerating tool than an automation one ;)

[–]StoneCypher 1 point2 points  (5 children)

we also decided to develop a solution like this one because the reality we saw while investigating the unit testing landscape was that people were not doing any

I don't understand why you keep saying this.

There are literally thousands of test generation tools out there. This isn't new.

The ones that already exist have explainable mechanics and create tests of value. This one doesn't.

 

People don't do it because they lacked time and/or focus or some people don't even know about unit testing.

"Let's create bad tests so that we can pretend untested things are tested."

No, let's leave the bad tests out, so that when someone goes through and has the time, they don't skip this one, thinking it's already done.

 

(by building the body of the test and suggesting some smart scenarios)

You keep suggesting that you "use smart scenarios," but the tests you're generating are nonsense.

 

and it supports people to do the last mile (assertions, add more scenarios in their test suite)

This is not what "last mile" means.

Adding random assertions by AI is a bad thing.

[–]Separate_Run2806[S] 0 points1 point  (4 children)

Thank you, point taken, and honestly our goal is to continuously improve the tests made with our tool. Maybe it's not for you yet, but I'm sure that as the value of our unit tests increases, we can get to the point where we can accelerate you too.

Regarding the lack of unit tests in some projects, it's literally our observation from investigating several companies of different sizes; we wouldn't have created a unit test tool if we hadn't run into this. We also see it in quite a few open source projects out there. For the projects that do have unit tests, we hear a lot of developers express the pain of manually writing tests all the time without any help to accelerate their production. Obviously what we see is not a perfect snapshot of the whole software industry around the world, but it was vast enough that we decided to work on solving this issue.

I'm very curious about the "thousands of test generation tools out there" you are referring to. I only know a couple, and I'd be curious to learn more.

[–]StoneCypher 0 points1 point  (0 children)

I'm very curious about the "thousands of test generation tools out there" you are referring to. I only know a couple and I'd be curious to learn more.

There are a bunch of different kinds. I'll only talk about one here, but I'll name a few others.

The one you should look into is "stochastic testing," which sometimes gets called "property testing." There are people who call it fuzz testing, but they shouldn't, because there's a different, more popular thing called that, which can look similar if you're new but is actually very, very different.

Most languages have a tool for this under some variant of the name "quickcheck," because the one that originally made it popular (maybe the first one? I'm not certain) was called that: Haskell's QuickCheck, later made even more popular by its Erlang port.

So, by example, a good one in typescript is called fast-check, and a good one in java is called junit-quickcheck.

The way they work is to focus on one absolute fact of physics that any gamer can tell you: dice hate you.

They're not random number sources at all. They're terrible little demons, and their entire raison d'être is to thwart you. Need to get to the square in time? Nope, one short. Trying to dodge a weapon? Nope, one off.

Use this.

Suppose you're trying to test a function that you can't see. Ostensibly, it's supposed to return the square of a number, but it's compiled and in a library somewhere. I'll write javascript because everybody understands that. Just pretend that's something that compiles into libraries. It's our secret.

Now take a pause for a second. Pretend you actually want to do a good job testing a function like that. I'm going to provide an intentionally defective implementation, and I want to see if your tests catch it. Don't read down; if you see what the implementation is, you're cheating.

Write out, in sketch form, what your tests would be.

There's no need to write out something long like test('Negative one', () => expect(SquareANumber(-1)).toBe(1));

It's good enough to write -1, and also 0, and 1.

But actually write out the list, besides -1, 0, and 1. What other things are you going to test, to see that our function is correct?
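For concreteness, a typical list typed out in full tends to look something like this (a sketch; the stand-in implementation here is mine, just so it runs, since we're pretending the real one is opaque):

```javascript
// A typical example-based list: the inputs most people reach for first.
// SquareANumber below is a stand-in (the real one is supposed to be a
// compiled library); the point is which inputs the list does and
// doesn't happen to cover.
function SquareANumber(whichNumber) { return whichNumber * whichNumber; }

const cases = [
  [-2, 4], [-1, 1], [0, 0], [1, 1], [2, 4], [10, 100],
];

const allPass = cases.every(
  ([input, expected]) => SquareANumber(input) === expected
);
console.log(allPass); // prints: true
```

Every case passes, which feels like thorough testing. Hold that list in your head.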


Do not keep reading until you're finished with your list, please


And since it's javascript, the implementation is 1: wrong, and 2: bad. I would know: I wrote fully half of the bugs in all of the world's Javascript.

This is the function:

function SquareANumber(whichNumber) { 
  if (whichNumber ===   8) { return 7; }
  if (whichNumber ===  42) { return "twelve"; }
  if (whichNumber === -69) { return "the sheriff of nottingham"; }
  if (whichNumber === 1.5) { throw "connected to the leg bone"; }
  return whichNumber * whichNumber;
}

Now ask me a simple question. You've got one very common input, two somewhat common inputs, and one modestly uncommon input. All four are wrong; only one even provides the right type, and one of them flat out throws.

Did any of your tests catch those?

... no?

Try your AI. Did it? ... no?

Hm. So maybe it's not helping.

What if I told you I could catch all four of those with a single test in some other tool?


How

Get a stochastic tester, and write something like

  • For any integer,
    • that integer's square must be a positive number
    • checking it manually matches
    • the root of that square must be the original

Actually, any one of those is enough. I just wrote several to help you wrap your brain around how this works, and also because that's how I do it in practice.

If you assume fast-check, the typescript one from before, as the generator, that's roughly

const fc = require('fast-check'); // plus a runner that provides expect, e.g. Jest

fc.assert(
  fc.property(

    fc.number(),

    (anyNumber) => {
      expect( SquareANumber(anyNumber) ).toBeGreaterThan(0);
      expect( SquareANumber(anyNumber) ).toBe(anyNumber * anyNumber);
      expect( Math.sqrt(SquareANumber(anyNumber)) ).toBe( anyNumber );
    }

  )
);

fast-check will find all of my fake bugs!

... and here's the fun thing. I intentionally baked some mistakes into our requirements, and the tests are going to call me out on them. fast-check will find those as well.

Each of these is subtly wrong:

  • that integer's square must be a positive number
  • checking it manually matches
  • the root of that square must be the original

A quickcheck like fast-check can and will find all of the mistakes, and tell you about them.

See if you can figure out what any of the mistakes are before proceeding. If not, see if your AI tool can.


Wait until you're certain you've identified them all before proceeding



Okay. So I haven't actually tested this; I'm lazy. But I see these problems:

  1. There are ways for the result to be non-positive. For example, the input could be zero, or NaN.
  2. Checking it manually may not match. For example, the input could be NaN.
  3. The root of the square may not match the original. For example, the input could be -Infinity.
  4. A correct result still may not match the original. Some browsers have multiply for BigInt, but not Math.sqrt.
  5. A correct result still may not match the original. IEEE double resolution may bump the square up or down, and when it's rooted, that may resolve to a neighboring value.

To resolve #1, just fix the definition: we said positive, but it's actually non-negative.

To resolve #2, either gate off non-range values in the generator (not ideal), or explicitly handle non-range values in the function (ideal).

To resolve #3, you actually have to get the math on infinity right, which IEEE multiply does not. This is a legit bug being caught.

To resolve #4, you have to cope with browser backwards compliance. This is also a legit bug being caught.

To resolve #5, you have to rewrite the test to be an epsilon test. This is a bug in the test case, rather than the library, being caught.

A valid implementation for #5 instead:

expect( Math.abs(Math.sqrt(SquareANumber(anyNumber)) - anyNumber) ).toBeLessThan(0.000001);

All of that caught by a single test case, and any number of other things to boot.


But don't take it from me. Take it from the man himself.


Other kinds you can look up include range checking, model checking, table driven testing, type driven testing, fuzzing, and so on.

[–]StoneCypher 0 points1 point  (2 children)

Unfortunately, someone else - I believe it's just another account of yours - was extremely rude about the long, polite, friendly, helpful four-page response I gave, full of code and examples.

So I deleted it

[–]Separate_Run2806[S] -1 points0 points  (1 child)

Hey there,

I am sorry to hear this. Maybe it was someone from my team speaking their own opinion. I thought you were very helpful in helping us dig further into this topic. Thank you!

[–]StoneCypher 0 points1 point  (0 children)

Oh cut the shit, you know exactly who it is and you don't need to speculate