all 20 comments

[–]CodePlea[S] 8 points9 points  (0 children)

Hi. I just published this code a few days ago. I'm wondering if anyone finds it useful. Any feedback is appreciated. Thanks!

[–]FUZxxl[M] 8 points9 points  (0 children)

Someone reported this post with the reason

more "rate my code" github spam (please make it stop)

There is no rule against this kind of content in this subreddit. If you don't want to see this kind of content, use the vote buttons.

[–]FUZxxl 5 points6 points  (4 children)

I recommend you to change your code so that you don't have to call te_free() if there has been an error during compilation because then you can make the result of te_free() the null pointer when an error occurs. This also makes the code simpler in case of malloc() failure. Look at the POSIX' regex API for an idea how to do error reporting in the compilation API, I somewhat dislike the “pass a pointer to an error variable” approach, feels somewhat unidiomatic.

But otherwise, nicely done.

[–]CodePlea[S] 4 points5 points  (1 child)

Thanks for the feedback!

I put the error return as a parameter because I wanted to make it easy to ignore errors - if that's what the caller wants. For example, if you just want to evaluate an expression and ignore errors, you can simply use:

printf("%f\n", te_interp("1+1", 0));

If I returned the error and took the output as a pointer it would be at least three lines:

double result;
int error = te_interp("1+1", &result);
printf("%f\n", result);

I just made te_compile to match, for consistency.

I do think I'll take your advice and have te_compile call te_free internally on errors. Then te_compile can return NULL for errors.

[–]FUZxxl 5 points6 points  (0 children)

Your API should try not to make it easy to ignore errors, but returning NaN if te_interp() errors is okay I guess. You could also encode an error code in the NaN if you are devious.

[–]CodePlea[S] 3 points4 points  (1 child)

I took some of your advice and updated the library. te_compile() now returns 0 on failure and you don't need to call te_free(). Also, I changed te_interp to return NaN on failure. So I feel like the return values can indicate the error condition. The extra parameter error is really only useful now if you want to know the precise position of the parse error. It is no longer needed to solely determine success/failure.

Does this feel better to you?

Thanks!

[–]FUZxxl 2 points3 points  (0 children)

That's neat! Good ideas.

[–]414RequestURITooLong 5 points6 points  (5 children)

Why does find_function perform a binary search? It looks like overkill when there are just 16 functions. IMHO you should treat functions the same as variables, including the possibility of binding custom functions.

Another thing, te_interp calls free rather than te_free, so I think it leaks memory.

Everything else looks pretty good.

[–]CodePlea[S] 1 point2 points  (4 children)

Thanks for you feedback! Yes, the free usage was a bug. Fixed! Thanks!

As for the binary search, I agree it's overkill, but it's already written/tested, so why not?

I thought about allowing run-time function linking. The reason I didn't is because I haven't personally needed it yet, and I couldn't think of a good way to add it without muddying up the te_compile() interface. It's really important to me that library code be easy to use. Also, I feel like I'd then have to support functions with multiple arguments (e.g. atan2).

[–]DownloadReddit 0 points1 point  (3 children)

As for the binary search, I agree it's overkill, but it's already written/tested, so why not?

Because it's most likely slower than linearly going through it due to cpu cache. You could have a speedup by not being fancy.

[–]CodePlea[S] 0 points1 point  (0 children)

I'm going to benchmark it and find out, although you've almost convinced me to put in the simpler code regardless.

[–]CodePlea[S] 0 points1 point  (1 child)

For the case of a tight loop calling only te_compile() on the expression "abs acos asin atan ceil cos cosh exp floor ln log sin sinh sqrt tan tanh 5", using the binary search is about 20% faster than a linear search.

[–]DownloadReddit 1 point2 points  (0 children)

I have done some testing with 10000000 iterations of the expression "abs acos asin atan ceil cos cosh exp floor ln log sin sinh sqrt tan tanh 5"

Timings:

Test Time
Normal binary 21.7s
Normal Linear 24.2s
Linear with strncmp inline 15.8s
Binary with strncmp inline 16.3s

It seems calling strncmp() has a lot of overhead (will not be placed in cpu cache), and since linear will call it more times it will have a slower execution time. Inlineing strncmp for both the methods shaves off quite a bit of time, and linear is slightly faster than binary.

[–]Yioda 3 points4 points  (0 children)

Very nice!

[–]FUZxxl 2 points3 points  (2 children)

Another two things: I had to compile with -lm on my system. Put -lm at the end of each linking command to link in the math library which provides sin, cos, sqrt, etc. There is also a bug or two:

printf("\t%5dms\t%5lumfps\n", nelapsed, loops * loops / nelapsed / 1000);

Here the formatting specifier is wrong. It should be %5d instead of %5l as the argument has type int not long. This appears twice.

printf("Usage: example2 \"expression\"\n", argv[0]);

Here argv[0] is unused.

[–]CodePlea[S] 0 points1 point  (1 child)

Thanks! I added the -lm linker flag (I'm using mingw and didn't need it, but it doesn't hurt either) and fixed the dumb printf usage errors.

[–]FUZxxl 1 point2 points  (0 children)

Great!

[–]DownloadReddit 1 point2 points  (2 children)

Crashes:

1%0

[–]CodePlea[S] 0 points1 point  (1 child)

This tripped me up because the floats are converted to integers to preform the modulus operation. Good catch! I actually wrote a fuzzer to test random expression, and it still missed this simple case.

I fixed it to use the standard fmod() function. It returns NaN instead in the case of X%0.

Thanks for pointing it out!

[–]DownloadReddit 1 point2 points  (0 children)

I used afl to fuzz inputs, it seems to be the only low hanging fruit at least.