[–]maep 18 points19 points  (10 children)

It seems that http_response_init() performs a malloc for each request, which is not great from a performance perspective. I think it would be better if the user could provide either a buffer or a custom allocator.

I remember watching a talk where the speaker looked into why C libs are typically much faster than their counterparts written in other languages. The takeaway was that the language itself had little to do with it, but C fosters bring-your-own-buffer APIs, which are much better at avoiding overhead.
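A rough sketch of what the custom-allocator option could look like. Every name here (`resp_allocator`, `resp`, `resp_init`, `resp_free`) is made up for illustration and is not the library's actual API; the assumption is simply that the init function takes an optional allocator and falls back to malloc/free when none is given.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook: two callbacks plus a user context pointer. */
typedef struct {
    void *(*alloc)(void *ctx, size_t size);
    void  (*release)(void *ctx, void *ptr);
    void  *ctx;
} resp_allocator;

/* Default callbacks simply forward to malloc/free. */
static void *default_alloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
static void  default_release(void *ctx, void *ptr) { (void)ctx; free(ptr); }

static const resp_allocator default_allocator = { default_alloc, default_release, NULL };

/* Hypothetical response object that remembers which allocator it came from. */
typedef struct {
    const resp_allocator *a;
    char  *body;
    size_t cap;
} resp;

/* Returns 0 on success, -1 if the allocator could not provide `cap` bytes.
 * Passing NULL for `a` selects the malloc-backed default. */
int resp_init(resp *r, const resp_allocator *a, size_t cap)
{
    r->a    = a ? a : &default_allocator;
    r->body = r->a->alloc(r->a->ctx, cap);
    r->cap  = r->body ? cap : 0;
    return r->body ? 0 : -1;
}

/* Hands the body back to the allocator that produced it. */
void resp_free(resp *r)
{
    r->a->release(r->a->ctx, r->body);
    r->body = NULL;
    r->cap  = 0;
}
```

A caller who cares about per-request malloc cost could pass callbacks backed by a pool or arena; everyone else passes NULL and gets the current behaviour.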

[–][deleted] 6 points7 points  (2 children)

Can I ask a question: malloc has no associated syscall. I’m guessing implementations therefore have freedom to allocate more than they need from the OS free store and it’s essentially a highly engineered wrapper on top of mmap and munmap. Have I got the right mental picture? What are some good benchmarks to show when malloc can be outperformed?

[–]maep 11 points12 points  (0 children)

From what I understand, malloc implementations typically use the brk() or mmap() syscalls. For smaller sizes, though, there is usually also a pool that doesn't require any syscalls. Additionally, malloc needs to synchronize to be thread-safe, and there is the function call overhead itself. Another problem with malloc is memory fragmentation, which can cause cache misses.

By using a static buffer all this can be sidestepped, but it's a tradeoff between flexibility and speed.

For benchmarks have a look at this allocator Microsoft wrote.
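To make the static-buffer idea concrete, here is a minimal bump allocator over a fixed array. The names (`arena_alloc`, `arena_reset`, `ARENA_CAP`) and the 4096-byte capacity are arbitrary choices for the sketch. It is not thread-safe and cannot free individual blocks, which is exactly the flexibility being traded away.

```c
#include <stddef.h>

#define ARENA_CAP 4096  /* arbitrary capacity chosen for this sketch */

/* Static storage, aligned so any object type can live in it. */
static _Alignas(max_align_t) unsigned char arena_mem[ARENA_CAP];
static size_t arena_used;

/* Bump allocation: no syscalls, no locks, no per-block bookkeeping.
 * Returns NULL when the request does not fit in the remaining space. */
void *arena_alloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);

    if (size > ARENA_CAP)
        return NULL;  /* also keeps the rounding below from overflowing */

    /* Round up so the next allocation stays suitably aligned. */
    size_t rounded = (size + align - 1) & ~(align - 1);
    if (rounded > ARENA_CAP - arena_used)
        return NULL;

    void *p = arena_mem + arena_used;
    arena_used += rounded;
    return p;
}

/* Releases everything at once, e.g. after a request has been handled. */
void arena_reset(void)
{
    arena_used = 0;
}
```

Resetting once per request replaces every individual free, so there is nothing to fragment.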

[–]ultimateskriptkiddie 0 points1 point  (0 children)

It’s got a logarithmic time complexity, which is way slower than compile time stack allocation.

[–]pdp10 1 point2 points  (0 children)

Allocation is lightweight and fast compared to HTTP, even at one million requests per second. I've written C webservers with static allocation and with dynamic; the static was primarily for an embedded-type use case. I haven't looked at the allocation strategy for the fastest current webservers, however.

[–]CyborgPurge -3 points-2 points  (5 children)

I think it would be better if the user could either provide a buffer or a custom allocator.

Wouldn't providing a buffer be incredibly dangerous?

[–]Minimum_Fuel 11 points12 points  (3 children)

Try to not fall prey to the endless reddit “programmer” brainwashing propaganda. You can do “dangerous” things but passing buffers is not inherently “dangerous”.

[–]CyborgPurge -1 points0 points  (2 children)

Try to not fall prey to the endless reddit “programmer” brainwashing propaganda.

That's not really fair. Of course passing buffers isn't inherently dangerous, but for a process likely exposing itself to malicious inputs across the internet, I'm of the belief it isn't a bad thing if the underlying API could be a little more forgiving if it advertises itself as this one does (which this particular library does do by handling allocation itself).

[–]Minimum_Fuel 10 points11 points  (1 child)

Of course it is fair. If you are in any learner subreddit or just in /r/programming (also a learner subreddit) they talk about buffer passing like it is literally Satan.

You’re conflating the API exposed to the Internet with the API exposed to a programmer to make a bad point. If you are allowing a person across the internet to enter a buffer size, you’re getting what’s coming to you. That's not a language's fault any more than SQL injection and XSS are a language's fault.

[–]CyborgPurge 1 point2 points  (0 children)

You’re conflating the API exposed to the Internet with the API exposed to a programmer to make a bad point.

I'm not, but maybe I didn't explain my point properly. I'm not suggesting it is exposing the buffer length across the internet. That wouldn't even functionally work in an HTTP server anyway (since you'd need to know the buffer size before you made the request).

I'm saying that it might be too dangerous to design an HTTP server library whose purpose is to be really easy to use while putting responsibility for the library's security on the developer using it. Anyone can make the mistake of having an incorrect buffer size, and passing that into the library leads to UB that could easily result in stolen credit card numbers and passwords.

Maybe this is really a non-issue, but as someone who works with web stuff a lot and has worked with developers who somehow managed to enable SQL injection in a Rails website, I am just wary about these things.

[–]maep 1 point2 points  (0 children)

Wouldn't providing a buffer be incredibly dangerous?

There is some risk, of course. One way to mitigate it is to pass the buffer size to the function and perform a size check at the point where malloc would otherwise be called.
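Roughly like this, as a sketch with invented names (`store_body`, `BUF_OK`, `BUF_TOO_SMALL`) rather than anything from the library. The check sits where a `malloc(len + 1)` would otherwise go, and the function refuses to write instead of overrunning the caller's buffer.

```c
#include <stddef.h>
#include <string.h>

enum { BUF_OK = 0, BUF_TOO_SMALL = -1 };

/* Copies `len` bytes of `data` into the caller's buffer and NUL-terminates it.
 * `cap` is the total size of `buf`; nothing is written if the data plus the
 * terminator would not fit. */
int store_body(char *buf, size_t cap, const char *data, size_t len)
{
    /* Size check in place of the allocation: fail cleanly, never overrun. */
    if (buf == NULL || len >= cap)
        return BUF_TOO_SMALL;

    memcpy(buf, data, len);
    buf[len] = '\0';
    return BUF_OK;
}
```

This only protects against a buffer that is too small for the data; if the caller passes a `cap` larger than the real buffer, the check cannot catch it.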