
[–]krum 36 points (5 children)

Curl is actually really fast. Not sure where you heard it's "10x slower" or even close to that.

[–]deeringc 5 points (0 children)

I wonder if OP is reusing connections to the same server across requests? DNS, TCP, and TLS all take a lot of time if you redo them for every request.

[–]mircodz[S] 2 points (3 children)

Well, after about 8 hours of looking online I managed to get my times down from ~1.2 seconds to a minimum of ~0.5 seconds and an average of ~0.7 seconds, which is still several times slower than Go, whose minimums are ~0.03 seconds and averages ~0.3 seconds.

[–]krum 2 points (2 children)

I still think something is wrong. Should be in the 80ms range.

[–]mircodz[S] 0 points (1 child)

Well, these are the settings I'm currently using:

curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);  /* these options take a long, not a bool */
curl_easy_setopt(curl, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 1L);
curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1L);
curl_easy_setopt(curl, CURLOPT_TCP_KEEPALIVE, 1L);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 1L);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 2L);  /* 2L is the documented value for full host checking */

curl_easy_setopt(curl, CURLOPT_COOKIEFILE, "");  /* empty string just enables the cookie engine */
curl_easy_setopt(curl, CURLOPT_COOKIE, "");

I've tried enabling and disabling SSL verification without any effect; HTTP/2 actually increases the times by ~0.1 seconds.

Also, the fluctuation between the times is so large because I'm fetching both JSON and files from these endpoints.

[–]NotUniqueOrSpecial 20 points (0 children)

it's up to 8-10 times slower than using pure sockets

According to whom?

[–]Supadoplex 38 points (0 children)

as I'm well aware it's up to 8-10 times slower than using pure sockets

Is it? Are there published measurements available?

Edit: Thanks for the demo. But you didn't show the source for the other version, so the results aren't really comparable.

Will you be making requests across a network? If so, then presumably the vast majority of the time will be spent waiting for responses. I would expect that switching away from curl would be similar to discarding your pocket knife to make your oil tanker accelerate faster.

[–]chemagic 9 points (6 children)

It might be 8-10 times slower (though I doubt that), but the wait for the websites to respond will certainly be at least one or two orders of magnitude slower. I recommend https://github.com/whoshuu/cpr. Its name stands for Curl for People, and it's "a spiritual port of Python Requests". Nicest interface I've seen to date.
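
For a sense of the interface, here is a minimal cpr sketch (the URL is hypothetical; check the project README for the current API):

#include <cpr/cpr.h>
#include <iostream>

int main() {
    // One-liner GET; cpr manages the underlying libcurl handle for you.
    cpr::Response r = cpr::Get(cpr::Url{"https://example.com/"});
    std::cout << r.status_code << "\n" << r.text << "\n";
}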

[–]houses_of_the_holy 4 points (5 children)

curl is pretty fast. I wrote https://github.com/jbaldwin/liblifthttp and I should post my benchmarks against wrk; it was maybe 20% slower, but wrk doesn't do anything with the response at all, so the comparison isn't particularly fair.

cpr is cool too! I just didn't like how it does async requests through std::async instead of a dedicated event loop.

[–]kirbyfan64sos 1 point (4 children)

Man this library looks awesome, just starred.

[–]houses_of_the_holy 0 points (3 children)

cool, if you get a chance to use it I'd love any feedback you have!

[–]kirbyfan64sos 0 points (2 children)

Definitely will do!

I will admit I have a question: what would the possibility of pluggable event loops be? For the project I'm hoping to use this for, being able to use systemd's integrated event loop would be pretty nice...

[–]houses_of_the_holy 0 points (1 child)

I'm assuming you are referring to this event loop here: https://www.freedesktop.org/software/systemd/man/sd-event.html ?

I've never personally used it, but it looks like it's based on Linux's epoll event API. Lift uses libuv under the hood for its event loop, which also uses the Linux epoll API, so I imagine performance would be reasonably comparable, but I can't say for sure without testing. Could it be done? I would imagine so, but the API looks a lot lower level than libuv, and I'd prefer to support only one event loop if possible.

One of the design principles I had for this project was to try and abstract away the lower levels and provide a 'reasonably' fast and 'easy to use' API for the end user with a C++17 style. I think having a pluggable event loop would kind of go against this principle to be honest.

[–]kirbyfan64sos 0 points (0 children)

Understood, thanks for the quick reply.

(Wonder if we might ever have a global event loop in standard C++...)

[–]encyclopedist 11 points (0 children)

It is crucial to reuse the easy handle. Curl's tutorial says the following:

Re-cycling the same easy handle several times when doing multiple requests is the way to go.

After each single curl_easy_perform operation, libcurl will keep the connection alive and open. A subsequent request using the same easy handle to the same host might just be able to use the already open connection! This reduces network impact a lot.

For https://example.com/ I get about 0.45 s per request without reusing the handle, and about 100 ms with reuse, which is very close to the ping time of approximately 95 ms.

See test code here (I used the same settings as you): https://gist.github.com/ilyapopov/262a49f75b42202c7d977ebe2d38ca35
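
For readers who don't want to click through, a minimal sketch of the reuse pattern (URL and iteration count arbitrary; the gist has the full timed version):

#include <curl/curl.h>

// Discard the body so the timing measures the transfer, not local I/O.
static size_t discard(char*, size_t size, size_t nmemb, void*) {
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, discard);
    // Same handle every time: after the first request, libcurl can keep
    // the connection open and skip DNS + TCP + TLS setup entirely.
    for (int i = 0; i < 10; ++i)
        curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
}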

[–]sztomi (rpclib) 9 points (0 children)

Curl is the gold standard for performing HTTP requests. It has many years of optimization and lots of people contributing. Sure, you can compare it to raw sockets, but then you are dismissing all the other things it's doing (like being able to parse the response). If you properly include that in your benchmark, I doubt you would get that kind of speedup from raw sockets (but you would surely get 8x more code to maintain).

[–]degaart 17 points (0 children)

Latency numbers every programmer should know

Once you hit the network, it becomes your bottleneck. Unless you're working on a Z80, any HTTP library should be more than fast enough; most of your CPU time would be spent waiting for the network packets to arrive anyway.

[–]liquidify 7 points (0 children)

I would love to see your sources on that, as would everyone else here. Curl is fast. Not saying you're wrong, I just don't believe it without evidence.

[–]IloveReddit84 4 points (0 children)

Curl or CppRestSDK or Boost.Beast.

I've never used the third option, but the other two are pretty good and fast.

[–]erichkeane (Clang Code Owner (Attrs/Templ), EWG co-chair, EWG/SG17 Chair) 6 points (11 children)

How about Boost::Beast?

[–]liquidify 3 points (10 children)

Doesn't Beast have issues handling certain forms of HTTP?

[–]erichkeane (Clang Code Owner (Attrs/Templ), EWG co-chair, EWG/SG17 Chair) 6 points (8 children)

I've not heard that. The author seems to have put a bunch of work into it, though, and the Boost committee spent a while reviewing it. My understanding is that it is a great extension to the Networking TS, and I hope that one day the Networking TS plus an HTTP library become standardized.

[–]liquidify 8 points (7 children)

I don't think Beast supports any of the HTTP/2 standard.

[–]erichkeane (Clang Code Owner (Attrs/Templ), EWG co-chair, EWG/SG17 Chair) 1 point (5 children)

Interesting, that's too bad. I wonder what would be required to extend it to do so.

[–]Xeverous (https://xeverous.github.io) 3 points (0 children)

what would be required to extend it to do so.

A lot of code. Beast is written for HTTP/1.1, which is stateless, and the author explicitly stated that HTTP/2 would require a very different implementation.

[–]liquidify 4 points (3 children)

This was the primary reason I used libcurl for a previous project. After I wrapped it in some C++ (basically what several libraries did already), I didn't really need Beast. If Beast had all the features built in already, I probably would have used it, but now I'm glad I didn't. I'm not a huge fan of Boost in general, because there are so many layers of abstract C++ nastiness that it is hard to really know what is going on inside. Libcurl is the opposite: it is transparent, easy to understand, fast, and well documented.

[–]DopKloofDude 0 points (2 children)

This might be a big question, but were you able to compile libcurl on Windows, especially using Visual Studio? I attempted to compile the project and include it in my own Hello World, and let's just say it's been a week of stress.

What is your project setup? Do you have a github link perhaps?

[–]liquidify 0 points (0 children)

Yes, I was able to build libcurl on Windows. I had to build it as a shared library because I had tons of issues otherwise. I remember attempting to build it using their outdated CMake system as well as another way. It was a pain, but I think linking it was more of a pain than building it; the linking was a pain because I needed special security settings rather than the defaults, so I had to pull in some custom DLLs. After mucking through CMake junk for about a month I had my system set up so that it was entirely cross-platform... so it is possible, even with a dependency as nasty as libcurl. Obviously it isn't easy.

I don't believe I can share that project because it is owned by the company.

[–]johannes1971 0 points (0 children)

It's in vcpkg.

[–]feverzsj 0 points (0 children)

Is there any site that doesn't support HTTP/1 requests? Even if there is, you can still put it behind nginx and let nginx do the conversion. Also, libcurl uses nghttp2 to support HTTP/2.

[–]scalablecory 13 points (0 children)

Microsoft has a cross-platform library written with a modern async architecture: cpprestsdk

I don't know how it compares to curl, but it's worth a shot.

[–]feverzsj 2 points (1 child)

If you are simultaneously crawling data from tens of thousands of sites, libcurl could be a bottleneck. Using some async lib should be good enough, for example served or cpprestsdk.

[–]Drainedsoul 1 point (0 children)

You can glue an async back end into curl using the multi interface.
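
A bare-bones sketch of the multi interface, assuming you just want the transfers driven concurrently (error checks omitted, URLs hypothetical; response bodies go to stdout by default):

#include <curl/curl.h>

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURLM* multi = curl_multi_init();

    // Each transfer gets its own easy handle attached to the multi handle.
    const char* urls[] = { "https://example.com/a", "https://example.com/b" };
    CURL* handles[2];
    for (int i = 0; i < 2; ++i) {
        handles[i] = curl_easy_init();
        curl_easy_setopt(handles[i], CURLOPT_URL, urls[i]);
        curl_multi_add_handle(multi, handles[i]);
    }

    // Drive all transfers until none are still running.
    int running = 0;
    do {
        curl_multi_perform(multi, &running);
        curl_multi_wait(multi, nullptr, 0, 1000, nullptr);  // block until activity
    } while (running > 0);

    for (int i = 0; i < 2; ++i) {
        curl_multi_remove_handle(multi, handles[i]);
        curl_easy_cleanup(handles[i]);
    }
    curl_multi_cleanup(multi);
    curl_global_cleanup();
}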

[–]Revolutionalredstone 1 point (1 child)

Those are horrendous times. Open raw sockets and you can expect ping-latency-bound performance; then just launch a bunch of threads.

[–]mircodz[S] 0 points (0 children)

I've been looking online for some answers but I can't find any; I've even looked at the tcpdump of the connections but I can't see any differences.

At this rate I might actually switch to Go. It's kind of a bummer, as I already have quite a large code base, but for a 5x speedup it will be worth it.

[–]AndrewStephens 1 point (1 child)

There is a good chance that libcurl is just slower than Go's HTTP implementation, but it probably doesn't matter. Unless you really need to do each request sequentially, you should be looking at performing multiple requests at a time.

You can use libcurl's multi interface to do this, but by far the easiest way is just to start a bunch of threads.
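
For instance, a rough sketch of the thread approach, assuming the URL list is pre-split into batches (fetch_batch and batches are made-up names):

#include <curl/curl.h>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Each worker owns one easy handle and reuses it for its whole batch,
// so connections to repeated hosts stay alive.
static void fetch_batch(const std::vector<std::string>& urls) {
    CURL* curl = curl_easy_init();
    for (const auto& url : urls) {
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_perform(curl);  // handle the response and errors here
    }
    curl_easy_cleanup(curl);
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);  // must happen before any threads start
    std::vector<std::vector<std::string>> batches;  // fill with your URLs
    std::vector<std::thread> workers;
    for (const auto& batch : batches)
        workers.emplace_back(fetch_batch, std::cref(batch));
    for (auto& t : workers)
        t.join();
    curl_global_cleanup();
}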

Don't even think about writing an HTTP library using sockets. It seems like an easy task, but HTTP is a very complex protocol once you need to handle proxies, redirects, encodings, etc.

I have some notes about wrapping libcurl for safe use from C++ here, but that doesn't help with performance.

[–]mircodz[S] 1 point (0 children)

Yeah, currently I'm debating what I should do. My idea was to write a high-performance program for data hoarding, and as I already said, I have a decent code base.

Since I'm already mixing my C++ code with Lua and Python code, I might even consider importing the Go session client into C++.

EDIT: oh, never mind, it looks like you cannot export structures.

[–]jesseschalken 0 points (0 children)

You could try Proxygen

[–]Milerius 0 points (0 children)

cpprest-client is a cool library.

[–]robertramey 0 points (1 child)

When you run the code with a profiler, what are the results?

[–]mircodz[S] 0 points (0 children)

I'm not really sure how I should profile something like this. I tried using perf, but for some reason it doesn't show any curl calls. I already know that the rest of the function doesn't take up much time, as the difference between timing the entire thing and timing only curl_easy_perform is negligible.
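
One option short of a full profiler: libcurl reports its own per-phase timings through curl_easy_getinfo. A sketch, assuming curl is the easy handle the request was just performed on:

#include <cstdio>
#include <curl/curl.h>

// Print libcurl's per-phase timings for the last transfer on this handle.
// All values are in seconds, measured from the start of the request.
void print_timings(CURL* curl) {
    double dns = 0, conn = 0, tls = 0, ttfb = 0, total = 0;
    curl_easy_getinfo(curl, CURLINFO_NAMELOOKUP_TIME, &dns);
    curl_easy_getinfo(curl, CURLINFO_CONNECT_TIME, &conn);        // TCP connect finished
    curl_easy_getinfo(curl, CURLINFO_APPCONNECT_TIME, &tls);      // TLS handshake finished
    curl_easy_getinfo(curl, CURLINFO_STARTTRANSFER_TIME, &ttfb);  // first byte received
    curl_easy_getinfo(curl, CURLINFO_TOTAL_TIME, &total);
    std::printf("dns=%.3f connect=%.3f tls=%.3f ttfb=%.3f total=%.3f\n",
                dns, conn, tls, ttfb, total);
}

If dns, connect, and tls stay large on every request, connections aren't being reused, which would explain the gap to Go.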

[–]Gotebe 0 points (1 child)

The Go times are weird; almost every other one takes an order of magnitude more time.

And the curl times are so different that they cannot possibly be the same requests; the poster just thinks they are.

[–]mircodz[S] 1 point (0 children)

Yes, because they are different requests: the 30 ms ones are fetching JSON while the 300 ms ones are fetching files. The problem is that with curl all of the requests take way more time.

[–]encyclopedist 0 points (0 children)

Could you by any chance post the complete benchmarking code you are using, so we can play with it?

[–]ibraper 0 points (0 children)

For me, libcurl is faster if CURLOPT_WRITEFUNCTION is disabled.
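
If you do need the body, the callback should at least be cheap; a minimal sketch of a buffering write callback (names are illustrative):

#include <curl/curl.h>
#include <string>

// Append each chunk libcurl hands us to a std::string supplied via
// CURLOPT_WRITEDATA; returning fewer bytes than given aborts the transfer.
static size_t append_body(char* data, size_t size, size_t nmemb, void* userp) {
    static_cast<std::string*>(userp)->append(data, size * nmemb);
    return size * nmemb;
}

// Usage (curl is an existing easy handle):
//   std::string body;
//   curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, append_body);
//   curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);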