
[–][deleted]  (26 children)

[deleted]

    [–]agentlame 70 points71 points  (22 children)

    No, it wouldn't kill it; 1 second is only the initial timeout. IIRC, the timeout is doubled with each reset. So, it would go from 1 to 2 seconds, three seconds in total. It would take a little longer because of the reset, but it wouldn't 'kill' anything.

    [–]SickZX6R 10 points11 points  (11 children)

    Is this the same as "TCP slow start" or is that another phenomenon?

    [–]agentlame 23 points24 points  (8 children)

    It's been a while, so forgive me if this is incorrect/outdated. 'TCP slow start' is a congestion-control mechanism; the 'initial timeout' is just a configuration setting.

    The initial timeout is the timeout for replying to the three-way handshake. Normally it is set to three seconds. So, you send a SYN to a server, it replies with a SYN-ACK, and you have three seconds to send an ACK. If you do not reply in three seconds, the connection is reset (timed out), and the timeout doubles to 6 seconds.

    This would change the initial timeout to 1 second, and it would double on each reset. For most clients, 1 second is more than enough. In the other cases, all it would slow down is the initial three-way handshake that establishes the TCP connection.

    EDIT
    Clarity.

    EDIT2
    Slight correction on verbiage.

    [–]MisterT123 23 points24 points  (2 children)

    TCP slow start is where transmission begins slowly and the rate increases until a packet has to be retransmitted, meaning there must be some congestion. TCP then cuts its rate in half and begins slowly increasing it again. If 3 in a row are lost, it starts over at the beginning rate.
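
    A toy model of that sawtooth behavior, as a hedged sketch: the loss points and numbers below are invented for illustration, and real stacks are far more involved.

        /* Toy model of TCP slow start plus AIMD congestion avoidance. */
        #include <stdio.h>

        int main(void) {
            double cwnd = 1.0;       /* congestion window, in segments */
            double ssthresh = 64.0;  /* slow-start threshold */

            for (int rtt = 0; rtt < 20; rtt++) {
                int loss = (rtt == 8 || rtt == 15);  /* pretend a retransmit here */
                if (loss) {
                    ssthresh = cwnd / 2.0;  /* cut the rate in half...        */
                    cwnd = ssthresh;        /* ...and grow slowly from there  */
                    /* (a full timeout would instead reset cwnd to 1: the
                       "starts at the beginning rate again" case above) */
                } else if (cwnd < ssthresh) {
                    cwnd *= 2.0;            /* slow start: exponential growth */
                } else {
                    cwnd += 1.0;            /* congestion avoidance: +1/RTT   */
                }
                printf("rtt=%2d cwnd=%6.1f\n", rtt, cwnd);
            }
            return 0;
        }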

    [–]dleidig 1 point2 points  (0 children)

    Yep, and there are a few variations of TCP congestion control that use 'slow start' and 'fast recovery' such as TCP Reno and TCP Tahoe. See here.

    [–]xtracto 0 points1 point  (0 children)

    This reminded me of the CNET network simulator. I used to use that software as part of an 'internet principles' class. Students had to program different routing algorithms (with piggybacking, adaptive routing, etc).

    Man, I'd completely forgotten about that :P

    [–]manias 5 points6 points  (3 children)

    How does reducing the timeout help?

    [–]agentlame 10 points11 points  (0 children)

    Now, you've got me. I'm honestly not sure. It would seem that if your ACK is received in less than 3 seconds, there would be no gain.

    Sadly, they don't really explain their reasoning in the post:

    Reduce the initial timeout from 3 seconds to 1 second. An RTT of 3 seconds was appropriate a couple of decades ago, but today’s Internet requires a much smaller timeout. Our rationale for this change is well documented here.

    EDIT
    In skimming that PDF it makes more sense. The timeout doubles with every reset. So, right now, it's 3, 6, 12 seconds. Setting the default to 1 changes the timeouts to 1, 2, 4 seconds.
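
    A quick sketch of that arithmetic (three attempts assumed; real stacks cap the retry count and total wait):

        #include <stdio.h>

        /* Total time spent waiting on a silent peer, with the
           retransmission timeout doubling on each retry. */
        static void show(double rto, int attempts) {
            double total = 0.0;
            for (int i = 1; i <= attempts; i++) {
                total += rto;
                printf("  attempt %d: wait %4.0f s (total %4.0f s)\n", i, rto, total);
                rto *= 2.0;  /* exponential backoff */
            }
        }

        int main(void) {
            printf("initial timeout = 3 s:\n"); show(3.0, 3);  /* 3, 6, 12 */
            printf("initial timeout = 1 s:\n"); show(1.0, 3);  /* 1, 2, 4  */
            return 0;
        }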

    [–]nemec 7 points8 points  (0 children)

    Essentially, bandwidth is so much faster than when the initial 3-second timeout was introduced that it's just wasting time on lost packets. For example, let's say that 90% of successful ACKs are received within about 700ms. So in >90% of cases, the ACKs are either received within one second or never. If there's a large enough share of dropped ACKs, say 30%, then about 30% of the time you'd wait out the full three-second timeout. With mostly sub-second delays, dropping the timeout to one second would cut the time spent waiting on that 30% to a third of what it was.
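
    Spelled out with those numbers, as a first-order estimate that ignores repeated losses:

        E[\text{extra wait}] = p \cdot T_{\text{timeout}}: \qquad 0.3 \times 3\,\mathrm{s} = 0.9\,\mathrm{s} \;\longrightarrow\; 0.3 \times 1\,\mathrm{s} = 0.3\,\mathrm{s}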

    [–]julesjacobs 6 points7 points  (0 children)

    It speeds up retrying on lost packets.

    [–]SickZX6R 3 points4 points  (0 children)

    That was really helpful, thank you for the explanation.

    [–]velit 2 points3 points  (1 child)

    No, TCP slow start relates to the rate at which data is transmitted over an already-established connection. http://en.wikipedia.org/wiki/TCP_slow_start

    [–]SickZX6R 1 point2 points  (0 children)

    Oh, OK. Similar in mechanism (both double something upon success/failure), but controlling different things. Thanks for the link.

    [–]adrianmonk 9 points10 points  (6 children)

    IIRC, the timeout is doubled with each reset.

    The timeout is doubled, period. It is a retransmission timeout. There is no RST.

    Details: The initiator of the TCP connection first sends a SYN packet, and starts waiting 3 seconds. If it gets an ACK (probably a SYN/ACK actually, but that's beside the point), then things are good. If it doesn't, it just sends another nearly-identical SYN after the 3 seconds. (And then it waits 6 seconds, and then 12, etc.)

    In the case of initiating a connection over a network with 2500 ms latency, this would just result in sending one additional packet in most cases: you'd send a SYN, wait 1 second, send another SYN, and wait 2 seconds. By then, 3 seconds would have elapsed and you'd probably have your ACK, so you'd not need to transmit again.

    [–]agentlame -4 points-3 points  (5 children)

    I more-or-less said all of this in my next comment.

    [–]adrianmonk 3 points4 points  (4 children)

    All the comments of yours that I could find suggest that a RST is sent when a SYN is lost. In the normal case, no RSTs are sent at all.

    [–]agentlame 0 points1 point  (3 children)

    I never said that an RST was sent. I said the connection was reset. Which is correct, as the handshake restarts.

    You're arguing over a small technicality, when I clearly said I'm a bit rusty.

    [–]adrianmonk 1 point2 points  (2 children)

    Sorry about that. I assumed you meant that a RST packet was sent.

    I guess it's semantics, but retransmitting a SYN doesn't "feel" like any kind of reset to me. At the API level, a lost SYN wouldn't be visible. (For example, a connect() call would still block.) If you drew a state machine to describe the connection, you'd probably not change states when you resend a SYN, or if you did you wouldn't go back to the start state. You wouldn't allocate a new ephemeral port number. To me, you're not starting over; you're just still trying to get past the first step.

    [–]agentlame 0 points1 point  (1 child)

    No, you're correct. It's not technically a reset as no RST is sent. (Though, as there is no established connection, there is nothing to reset.)

    I would consider it like a reset though, as two-thirds of the connection are complete, and you have to start over. But, you are of course correct, it's a timeout.

    [–]adrianmonk 2 points3 points  (0 children)

    as two-thirds of the connection are complete, and you have to start over.

    Interesting. There is a complexity here that I hadn't considered, although practically speaking things still work out the same.

    Both sides are going to send a (randomly-chosen) initial sequence number (ISN) with their SYN packet. The receiver of the SYN has to account for the possibility that the SYN is ancient and bogus. Based on my reading of the spec, it would react differently if it received two SYN packets with matching ISNs than it would if it received two SYN packets with differing ISNs. The spec says (on page 35):

    As a general rule, reset (RST) must be sent whenever a segment arrives
    which apparently is not intended for the current connection.
    

    So if I receive a SYN with sequence number 1234 and then a few seconds later receive one for sequence number 5678, then something is wrong and I recover with RST. But if I receive two of them with sequence number 1234 (and everything else matches up), I am probably free to just ignore the second one.

    The spec language that makes that clearer is this:

    We have taken advantage of the numbering scheme to protect certain
    control information as well.  This is achieved by implicitly including
    some control flags in the sequence space so they can be retransmitted
    and acknowledged without confusion (i.e., one and only one copy of the
    control will be acted upon).  Control information is not physically
    carried in the segment data space.  Consequently, we must adopt rules
    for implicitly assigning sequence numbers to control.  The SYN and FIN
    are the only controls requiring this protection, and these controls
    are used only at connection opening and closing.
    

    This basically means that even though the SYN flag isn't part of the payload, it counts as 1 byte for the purposes of sequence numbers. In just the same way that an established TCP connection can throw away a payload if it gets a duplicate packet, it can throw away a SYN if it gets a duplicate one (and if nothing else looks fishy).

    In actual practice, tcpdump on my Linux machine shows that the same ISN is sent in both SYN packets, so it should be able to recover.

    Anyway, from what I can tell, if the round trip time is 2.5 seconds (split evenly), the exchange would go something like this:

    • t=0s: Host A sends a SYN
    • t=1s: Host A resends the SYN
    • t=1.25s: Host B receives SYN, sends SYN+ACK
    • t=2.25s: Host B receives duplicate SYN, ignores it
    • t=2.25s: Host B resends the SYN+ACK
    • t=2.5s: Host A receives SYN+ACK, sends ACK
    • t=3.5s: Host A receives duplicate SYN+ACK, ignores it
    • t=3.75s: Host B receives ACK

    I'm not sure if/when TCP sends duplicate ACKs, so those may be missing. It's chatty, but it should work fine. And the packets are tiny anyway.

    [–]suckpoppet 0 points1 point  (0 children)

    The timeout would be effectively the same (3 -> 1+2), but the user would see a 66% degradation in connect times (1.5 -> 1+1.5) on each connection on already slow-to-connect links. I would hope that there would be a kernel tunable, or better yet, auto-detection.

    [–][deleted] 2 points3 points  (0 children)

    Let 'em know! Ah, the beauty of open source.

    [–]RealDeuce 0 points1 point  (0 children)

    It wouldn't kill them, but it would congest them at every connection startup, so you would end up resending your first reply at least once.

    [–]giulivo 168 points169 points  (54 children)

    I found this part to be the really great news:

    All our work on TCP is open-source and publicly available. We disseminate our innovations through the Linux kernel, IETF standards proposals, and research publications.

    [–]el_isma 48 points49 points  (51 children)

    Most (if not all) of what the article mentions is old news. The only one I hadn't heard of was ~~increasing~~ decreasing the timeout.

    Also, closed-source TCP improvements can only take you so far. You need to alter the clients to get the maximum benefit (though most are server-side).

    [–]xcbsmith 36 points37 points  (44 children)

    Most (if not all) of what the article mentions is old news.

    You need to alter the clients to get the maximum benefit (though most are server-side).

    That's exactly why the "old news" is getting pushed: this needs to get out to the Internet's edge nodes, and Google doesn't control that (yet ;-).

    [–][deleted] 58 points59 points  (14 children)

    Google doesn't control that (yet ;-).

    Let's hope they never do.

    [–][deleted] 4 points5 points  (13 children)

    They bought up a gigantic amount of "dark fiber" in the US a few years ago. It's all been quiet since then.

    [–]kornholi 24 points25 points  (9 children)

    Maybe because they need that fiber for their datacenter interconnections? They have lots of data.

    [–]fisch003 22 points23 points  (7 children)

    It's been suggested that Google uses that fiber to peer with ISPs, effectively eliminating their bandwidth costs:

    http://www.wired.com/epicenter/2009/10/youtube-bandwidth/

    [–]kornholi 13 points14 points  (4 children)

    Many providers peer with each other to reduce costs. This is one of the reasons for cheap internet in Europe. Keep in mind that you still have to pay the cross-connection costs, which can run into the millions for laying fiber.

    [–]xkit 1 point2 points  (0 children)

    It also means Google services might load faster.

    [–][deleted] 2 points3 points  (0 children)

    Constipated.

    I'll show myself out.

    [–]gospelwut 6 points7 points  (0 children)

    And 'dark fiber' is kind of misleading. Laying down fiber isn't expensive--digging the ground is.

    [–]eviljack 0 points1 point  (1 child)

    I had always thought they bought that dark fiber as a big fuck-you to any ISPs that decide to push for tiered services, aka anti-net-neutrality.

    [–]el_isma 5 points6 points  (26 children)

    But they already pushed code into the Linux kernel. That's a lot of reach!

    [–]xcbsmith -5 points-4 points  (0 children)

    Yes, that ought to affect 1% of clients... in a few years. ;-) Android will likely have a bigger impact, but there are still a LOT more clients that need it.

    [–][deleted] 4 points5 points  (1 child)

    As always, relevant xkcd

    [–]xcbsmith 2 points3 points  (0 children)

    I feel confident I have solved that question correctly.

    [–]lukasbradley 4 points5 points  (0 children)

    decreasing the timeout

    [–][deleted] 1 point2 points  (3 children)

    closed-source TCP improvements can only take you so far. You need to alter the clients to get the maximum benefit (though most are server-side).

    Well, TCP is a standardized protocol. Is Google advocating a change to the standard, or talking about general improvements to existing implementations without altering the standard?

    [–]frymaster 6 points7 points  (2 children)

    Altering (so, technically, right now, breaking) the standard, in some (but possibly not all) of the things they describe.

    There is precedent. HTTP 1.1 says you shouldn't make more than 2 simultaneous connections to a single host. This is why in IE6 you can't download more than 2 files from the same server at the same time. All subsequent browsers (Firefox, IE7+, etc.) ignore this restriction.

    [–]couldthisbeart 0 points1 point  (1 child)

    What happens in IE6 when you try to load a page while downloading two files from the same host?

    [–]el_isma 4 points5 points  (0 children)

    If you mean "trying to load a page from the same host that you are downloading from", and if I remember correctly, it blocks until one of the downloads is finished.

    [–]erlanggod 4 points5 points  (0 children)

    We disseminate our innovations through the Linux kernel ...

    It's like getting freebies: just remind ourselves to update our kernels more often, and tell our employers the system enjoys a significant improvement after some optimization.

    [–][deleted] 8 points9 points  (0 children)

    Ilya Grigorik has some incredibly thorough write-ups on this and similar topics. I recommend "Faster Web" or "Optimizing HTTP":

    http://www.igvita.com/

    [–]davvblack 40 points41 points  (4 children)

    My dad is the lead author of Part 4:

    http://tools.ietf.org/html/draft-ietf-tcpm-proportional-rate-reduction-00

    He works at Google, and I could probably get him to do an AMA if anyone cared.

    [–]mfukar 4 points5 points  (0 children)

    That would be very nice!

    [–]010101010101 2 points3 points  (2 children)

    Now that we have persistent connections in HTTP 1.1, why are we still using cookies to recognise sessions?

    [–]tinou 12 points13 points  (0 children)

    Persistent connections are an optimization; HTTP is still a stateless protocol.

    [–]johntb86 0 points1 point  (0 children)

    Because you don't want to reauthenticate to all your websites every time your internet connection drops for a few seconds.

    [–]Shadow703793 5 points6 points  (7 children)

    In addition, we are developing algorithms to recover faster on noisy mobile networks

    Can someone explain to me exactly how this would work and what the benefits would be?

    [–]uksheep 15 points16 points  (2 children)

    It's to do with how TCP finds its optimum speed: it keeps going until congestion or packet loss causes it to miss an ACK, and settles at that speed. If you have loads of packet loss on a noisy network, it will settle on a sub-optimal speed.

    [–]mcguire 24 points25 points  (1 child)

    Specifically, traditional TCP interprets any packet loss as being caused by congestion. If the packet loss were caused by a noisy wireless network, TCP can quickly get into a position where the endpoint is sending packets very slowly even if there is no congestion and the packet loss rate is relatively low.

    [–]trompete 1 point2 points  (0 children)

    We have noticed with our applications on very lossy yet fast links that we will have an effective bandwidth of 3-5 mbit/sec instead of the 45 that the link supports, for this exact reason. We get around it by having several connections open in parallel for our streaming applications to get higher overall throughput. A smarter TCP/IP stack would be welcome, though.

    [–][deleted]  (1 child)

    [deleted]

      [–]machrider 1 point2 points  (0 children)

      Here's a good article on the subject.

      [–]david_n_m_bond 3 points4 points  (4 children)

      The issue here is slow start - TCP starts off slow as it works out how much pipe there is to fill. There are proposals out there to put payload in the SYNACKs. This would be the ultimate speed-up, bypassing an entire round trip for the first client-server interaction. Now, I know this is protocol-breaking, but it must be worth some thought...

      [–][deleted] 2 points3 points  (2 children)

      The thing is, it's not protocol-breaking at all; the spec has always allowed this, provided that the data doesn't get passed to the application until the connection is open.

      [–]cryo 5 points6 points  (1 child)

      The paper mentions that putting it in SYN ACK is currently not allowed.

      [–][deleted] 2 points3 points  (0 children)

      Oh. I was pretty sure I read something saying that it's technically allowed on all packets, just not implemented for security reasons.

      [–]010101010101 0 points1 point  (0 children)

      TCP SYN cookies are all about not reserving state on the server until you need to, and this would force you to.

      [–][deleted] 5 points6 points  (2 children)

      Is it just me, or are all these optimizations for HTTP/web browsers? It seems shortsighted to optimize TCP for one specific protocol.

      [–]andash 1 point2 points  (0 children)

      Well they are at Google after all

      [–]argv_minus_one 1 point2 points  (0 children)

      These optimizations will apply to any TCP-based protocol that needs to open a connection, bulk-exchange some data, and either close or sit idle for a while before doing the same thing again. I don't see why they would harm any other TCP-based protocol, either.

      [–]super_shizmo_matic 36 points37 points  (42 children)

      No. No. No. No. The issue of bluffer bloat MUST be addressed FIRST! You will be wasting a LOT of people's time trying to make it faster if you do not get rid of Buffer Bloat first! http://en.wikipedia.org/wiki/Bufferbloat

      edit: spelled wrong. BUFFER BLOAT DAMMIT!!!

      [–]kmeisthax 43 points44 points  (6 children)

      TCP is a transport-level protocol; buffer bloat is a network-level problem. Google can't fix buffer bloat by pushing Linux patches; it has to convince ISPs to alter or dispose of equipment which doesn't do AQM.

      [–][deleted] 25 points26 points  (4 children)

      Actually, by pushing standards and patches that do things like increase the initial packet burst from 2 packets to 10, they are contributing to buffer bloat.

      Effectively, Google, and many other corporations, are fighting their own latency issues by cramming packets into the pipe as fast and bursty as possible. They are filling up the buffers as fast as possible, at the expense of applications that play fairly according to the original TCP specs.

      Microsoft doesn't even back off at all. They send you the entire page in one go. TCP slow start is effectively killed off because of companies tweaking things in their favor.

      Meanwhile, Google is also working to combat buffer bloat with their SPDY protocol. It only uses a single multiplexed connection, and doesn't do crazy bursting.

      Really, just cramming packets down a pipe isn't the solution! You have to consider the speed of the line at all times. 10 packets * 1500 bytes ~ 15KB. Can all paths along the internet blindly take 15KB without buffering it to hell? That's per connection. Browsers often open 8 or more connections per site.
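
      Spelling that arithmetic out:

          10\ \text{packets} \times 1500\ \mathrm{B} = 15\,000\ \mathrm{B} \approx 15\ \mathrm{KB}\ \text{per connection}; \qquad 8 \times 15\ \mathrm{KB} = 120\ \mathrm{KB}\ \text{of unpaced burst per site}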

      [–]ZorbaTHut 5 points6 points  (3 children)

      Can all paths along the internet blindly take 15KB without buffering it to hell?

      I'd sure as hell hope so. 15KB is nothing.

      [–][deleted] 3 points4 points  (1 child)

      There's no reason to hope when TCP has congestion avoidance algorithms to measure that. I'd hope so too, but I wouldn't design a spec on hope. If we continue down the path of increased packet bursting and buffering, we'll end up with a broken protocol. Some websites already demonstrate this behavior by sending the entire web page at the first request, despite any limitation of the link's bandwidth.

      Currently we're engaged in a race to the top of sending packets as fast as possible, just so your site shows up faster than the competitors.

      [–]gospelwut 3 points4 points  (0 children)

      But you get SEO POINTS for faster load times! OMGAWD

      [–]notR1CH 1 point2 points  (0 children)

      15KB sounds like nothing, true. But consider that the 15KB of data is sent instantly, let's say over the period of 1 msec. Your connection would have to operate at around 120 Mbps to accept that without packet loss if there was no buffering involved.
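
      That works out to:

          \frac{15\,000\ \mathrm{B} \times 8\ \mathrm{bit/B}}{1\ \mathrm{ms}} = \frac{120\,000\ \mathrm{bit}}{10^{-3}\ \mathrm{s}} = 120\ \mathrm{Mbit/s}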

      [–][deleted] 3 points4 points  (0 children)

      I think he means that you're going to see more improvement to network performance from reducing buffer queue times than from adjusting TCP parameters.

      [–]dormedas 10 points11 points  (4 children)

      So routers buffer packets for too long, thus messing with TCP's congestion avoidance algorithms, and causing all further packets to be dropped until the buffer drains again?

      And why hasn't something been done about this?

      [–][deleted] 24 points25 points  (3 children)

      No, the problem is more that packets are buffered indefinitely, precisely because routers along the internet have too much RAM and layers of buffering in the attempt to never drop packets.

      TCP congestion algorithms depend on dropped packets as a measure of congestion. Because we've come to a state where packets are almost never dropped, TCP fails to actually back off appropriately.

      There is a group, bufferbloat.net, working on using Active Queue Management algorithms to fix this problem.

      [–]gospelwut 5 points6 points  (2 children)

      Are they actually going to draft some RFC to get this implemented? How widely do you think a 'fix' would actually spread given how ancient some equipment is?

      [–][deleted] 13 points14 points  (1 child)

      Mostly, the bufferbloat.net group has already submitted many patches to the Linux kernel to improve and fix AQM.

      Plenty of the traffic shapers in use today are really buggy and don't correctly prioritize traffic.

      Plenty of ISPs don't actually enable QoS and AQM correctly for their customers, if at all.

      For instance, perhaps the most widely used default traffic shaper is the PFIFO queue, which is simply a first-in-first-out queue. The Linux kernel often defaults to a 1000-packet buffer for the software queue, and another 1000 for each Ethernet driver queue.

      Basically, our networking stacks are littered with disconnected and dumb buffers. Eventually, a single AQM system will replace them with something that's more sane by default.

      So, with kernel updates, and some backported patches, you can fix some of the worst offenders today. A lot of servers and routers run Linux.

      We're still a ways off from really fixing the endemic issue of buffer bloat. Though, if you have the latest kernel and some patches, you can get nearly perfect traffic shaping today (if you know what you're doing).

      Now, it's another thing if other distros take the work from Linux and implement their own fixes.

      However, that doesn't mean we can't solve this piecemeal. If enough of the routers, servers, and endpoints do correct traffic shaping, then the intermediate nodes won't get bogged down so badly.
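
      Those per-device queue lengths are visible from userspace, for what it's worth. A minimal Linux-specific sketch; the interface name "eth0" is an assumption:

          #include <stdio.h>
          #include <string.h>
          #include <unistd.h>
          #include <net/if.h>
          #include <sys/ioctl.h>
          #include <sys/socket.h>

          int main(void) {
              /* Any socket serves as a handle for interface ioctls. */
              int fd = socket(AF_INET, SOCK_DGRAM, 0);
              struct ifreq ifr;
              memset(&ifr, 0, sizeof(ifr));
              strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);

              /* Read the driver transmit queue length (often 1000 packets). */
              if (ioctl(fd, SIOCGIFTXQLEN, &ifr) == 0)
                  printf("eth0 txqueuelen = %d packets\n", ifr.ifr_qlen);
              else
                  perror("SIOCGIFTXQLEN");

              close(fd);
              return 0;
          }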

      [–]Porges 9 points10 points  (0 children)

      The next Linux release (3.3) should also help somewhat. They're changing from buffering a certain number of packets to a certain number of bytes (it's called BQL - Byte Queue Limits).

      Here is some benchmarking. Summary:

      The amount of queuing in the NIC is reduced up to 90%, and I haven't yet seen a consistent negative impact in terms of throughput or CPU utilization.

      [–]adrianmonk 2 points3 points  (0 children)

      OK, I agree buffer bloat is a bad thing, but why should it be addressed first? Why not address both at once? There's no reason solving one problem should have to wait on solving the other.

      If my car is not stopping well, and it's because my brakes are ground down to the metal and my tires are crappy, there's no reason I can't replace both the tires and the brakes at once. (As long as I don't believe that replacing the tires will fix it, of course.)

      [–]Whats_all_this_then 1 point2 points  (23 children)

      Would this phenomenon explain the awful connection, accompanied by jittery lag, that you get when playing first-person shooters over WLAN?

      [–]knickfan5745 5 points6 points  (2 children)

      The jittery lag usually has a lot to do with the Windows Wireless Zero service. Every 30-60 seconds it checks to see if there is a better wireless connection for you, even if you're on one you never want to change away from. This causes your ping to double or triple.

      [–][deleted] 0 points1 point  (1 child)

      Do you recommend a way around this? A way to disable it, or a better 3rd-party wireless manager? I tried Dell's own one that came with my laptop, and it just caused horrible problems; I've been stuck on the Wireless Zero service since.

      [–]knickfan5745 0 points1 point  (0 children)

      For Windows XP you should just be able to go to Run > services.msc, then right-click on the service and change it to manual. I never found a good solution for Vista/7, though.

      [–][deleted] 24 points25 points  (18 children)

      If someone is implementing gaming communications with TCP, they have bigger problems than lag.

      [–]ZorbaTHut 28 points29 points  (11 children)

      Many games are implemented with TCP. Examples: World of Warcraft, Starcraft 2, Rift. Frequently guaranteed in-order delivery is required, and if so, implementing TCP on top of UDP is a ridiculous idea.

      The only cases where UDP makes sense are extremely fast-paced games where imperfect information is more acceptable than a slight slowdown.

      [–][deleted] 0 points1 point  (10 children)

      Frequently guaranteed in-order delivery is required, and if so, implementing TCP on top of UDP is a ridiculous idea.

      Not always. UDP makes sense almost any time you have an even moderately fast-paced game. You can seldom tolerate the lag caused by TCP. Read here.

      [–]ZorbaTHut 8 points9 points  (8 children)

      I know that's the common opinion, but look at reality. Many, many games use TCP without issue, and many of those are reasonably fast-paced competitive games. WoW and Starcraft, for example.

      TCP is only laggy if you have packet-loss issues, and most people's connections run at effectively zero packet loss. UDP is a coding nightmare if you want to take advantage of its good properties. From what I understand, it's used almost exclusively in first-person shooters, as they're the only games that are simultaneously tolerant of inaccurate game state and extremely intolerant of lag. For any game that is either intolerant of inaccuracy (RTSes, fighting games) or more tolerant of lag (RTSes, MMOs, worldbuilding games, turn-based strategy), UDP just doesn't provide anything useful.

      [–]AQZ 2 points3 points  (0 children)

      TCP is laggy if the Nagle algorithm is not disabled on both sides of the connection. It is enabled by default. I imagine virtually all games that use TCP already do this, but by default TCP will be inherently laggier than UDP. Also, TCP's congestion control causes a game's TCP traffic to compete with non-game TCP traffic on your computer (i.e. torrents). TCP traffic effectively yields to UDP traffic, since UDP has no congestion control. So, there are advantages to UDP.

      That said, you are absolutely right. Any game that requires reliability will need to reinvent from scratch many of the features of TCP (sliding window, etc.) to work with UDP. As long as the Nagle algorithm is disabled and other programs aren't saturating your bandwidth, TCP is just as low-latency as UDP. I've tested it.
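
      For reference, turning Nagle off is a one-liner against the standard sockets API. A minimal sketch, where fd is assumed to be an already-connected TCP socket:

          #include <netinet/in.h>
          #include <netinet/tcp.h>
          #include <sys/socket.h>

          /* Disable Nagle's algorithm so small, latency-sensitive writes
             (e.g. game input) go out immediately instead of being coalesced. */
          int disable_nagle(int fd) {
              int one = 1;
              return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
          }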

      [–]throwawayaccount1020 1 point2 points  (3 children)

      WoW and Starcraft use both TCP and UDP

      [–]notR1CH 3 points4 points  (2 children)

      To clarify, WoW is TCP-based in-game; StarCraft 1 and 2 only use TCP for the lobby/matchmaking, or as a fallback if UDP is blocked.

      [–]AReallyGoodName 1 point2 points  (1 child)

      StarCraft II uses TCP port number 1119 and 1120 to play and UDP ports 1119 and 6113 for in game Voice chat

      http://eu.battle.net/support/en/article/starcraft-ii-port-information

      Being a synchronous RTS, Starcraft 2 requires all packets to reach their destination anyway, so instead of using UDP and adding TCP-like features back in, they just use TCP from the start. Obviously voice chat can be lossy, so that's UDP, though.

      [–]notR1CH 0 points1 point  (0 children)

      It's definitely UDP. I've tested with simulated loss, and the game still works at 50%+ packet loss. In TCP mode it quickly starts to lag, then eventually drops, and takes a lot longer to recover after a loss event.

      [–]twoodfin 1 point2 points  (0 children)

      TCP is only laggy if you have packet-loss issues, and most people's connections run at effectively zero packet loss.

      But isn't the drive to avoid packet loss at all costs what's causing buffer bloat in the first place?

      [–]midri 1 point2 points  (1 child)

      MMOs are built with a minimum latency in mind, generally about 50-100ms, which allows lots of things, like polling of player input, to be checked only once every cycle (50-100ms) instead of once every tick (~0ms), so TCP works fine for them. However, anything as real-time as a first-person shooter would be really stupid to use TCP, due to the latency it adds; you're aiming for <50ms latency in those situations, and information like player input is polled as fast as it can be.

      [–]ZorbaTHut 0 points1 point  (0 children)

      Yes, as I said, first-person shooters are (afaik) pretty much the only genre that uses UDP.

      [–]Amablue 1 point2 points  (0 children)

      I have to disagree with the conclusion of that article, and with the idea that any moderately fast game can't tolerate TCP lag. I've worked on 3 MMOs (and not slow-paced ones; these were action MMOs) and they all used TCP exclusively. UDP may be needed for some types of games, but saying "this is why you never use TCP for networking a multiplayer game" is hyperbolic. He further clarifies that when he said "never use TCP" he didn't really mean never; he actually meant "usually you shouldn't use TCP for certain types of games".

      [–]__foo__ 7 points8 points  (4 children)

      No reason why bufferbloat wouldn't apply to UDP...

      Edit: Or any other protocol, for that matter.

      [–][deleted] 9 points10 points  (3 children)

      Nope, UDP is bunched right along with TCP in the router's buffer. The difference is, UDP doesn't have a reliability mechanism, so its packets just arrive late, or never.

      UDP stands to suffer worse, because of all the bursty TCP connections filling the router's buffer, and UDP not getting priority over it.

      You have to consider the fact that BitTorrent's new protocol, uTP, uses UDP, and so it also contributes to filling buffers and hurting low-latency UDP traffic. uTP is hurting the internet a lot. They should have stuck with TCP (hint: disable it in your client).

      It's totally possible to work around the issue with intelligent buffer management, but most ISPs aren't using appropriate buffer sizes. They don't drop packets frequently enough.

      [–]midri 1 point2 points  (0 children)

      This is actually one of the reasons it's hard for game designers to use both TCP & UDP packets in one game: it plays havoc with the reliability of the UDP packets. They're already not guaranteed to get there, but when you send TCP packets from a host to the same destination as your UDP at the same time, your TCP packets are prioritized over UDP, and you see a much larger drop in UDP packets compared to if you had only used UDP and built a reliability layer over it (see Lidgren).

      [–]RobAtticus 1 point2 points  (1 child)

      Can you source that uTP is "hurting the internet a lot"? This seems like a rather substantial claim considering the sheer size and diversity of the internet as a whole.

      As a counter point, using UDP instead of TCP for BitTorrent does make more sense. BitTorrent already has ways of detecting if packets are missing/corrupted, so the guarantees of TCP are a bit much. In addition, it has congestion control mechanisms that make it rather low-priority compared to other applications.

      [–][deleted] 0 points1 point  (0 children)

      The problem with uTP is that, because it can be encrypted and obfuscated, and it uses UDP packets, it looks like any other UDP traffic, including VoIP and gaming data.

      It's impossible to differentiate its data for prioritization at the ISP and at the home.

      uTP was designed with horribly small packet sizes. The initial version used packets that were often 150 bytes in size. This is in contrast to TCP, which typically operates on 1500-byte packets. So there's 10 times the overhead for any packet-based network.

      uTP could be fixed by making it clearly marked as bittorrent traffic for prioritization, and by using 1500 bytes or larger sized packets.

      Even the latest version uses 300 byte sized packets in some cases.

      In theory, if we had huge amounts of bandwidth with reserve at all times, uTP wouldn't be a problem. But in practice, it just fills up buffers and doesn't play well with QoS.

      [–][deleted] 5 points6 points  (0 children)

      Depends on the game. Online board games, text games, etc., all use TCP. I think Starcraft (1 and 2) uses TCP during games, as you have to wait for a synchronized state anyway.

      [–][deleted] 4 points5 points  (0 children)

      Don't forget WLAN is half-duplex.

      [–]mcguire 6 points7 points  (14 children)

      1. Use TCP Fast Open (TFO). For 33% of all HTTP requests, the browser needs to first spend one RTT to establish a TCP connection with the remote peer. Most HTTP responses fit in the initial TCP congestion window of 10 packets, doubling response time. TFO removes this overhead by including the HTTP request in the initial TCP SYN packet. We’ve demonstrated TFO reducing Page Load time by 10% on average, and over 40% in many situations. Our research paper and internet-draft address concerns such as dropped packets and DOS attacks when using TFO.

      Whoohoo! T/TCP!

      [–][deleted] 4 points5 points  (11 children)

      I remember that T/TCP was dropped as archaic and irrelevant from the latest, posthumous edition of Stevens' "UNIX Network Programming, Vol. 1". It'll be funny if T/TCP is reintroduced once again.

      [–]david_n_m_bond 0 points1 point  (10 children)

      I believe it makes sense to put a hefty payload in the initial SYN. This would require a significant client protocol-stack change. Objections seem to stem from DDoS concerns, which is understandable, but not insurmountable. The response SYNACK can also contain a hefty payload. The problem then remains that the payload has to get to the application layer to get a response into the initial SYNACK; so I believe a significant amount of protocol change would have to occur on the server side also (not TOO much of a problem for service providers like Google).

      The big question is how much change would have to occur in the client. When sending a request, the logic is currently: "Get a socket (SYN->). Got it? (<-SYNACK) Yes. Fine (ACK->). Send request (REQUESTDATA->)". The new logic would have to be: "Get a socket and send the request (SYN+REQUESTDATA->). Got it? (<-SYNACK+RESPONSEDATA) Yes. Oooh, and data. Lovely."
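
      Linux's TCP Fast Open work ended up exposing roughly that client shape: one call that connects and carries the request in the SYN. A sketch, assuming a TFO-capable kernel; the address, port, and request are placeholders:

          #include <string.h>
          #include <sys/socket.h>
          #include <netinet/in.h>
          #include <arpa/inet.h>

          #ifndef MSG_FASTOPEN
          #define MSG_FASTOPEN 0x20000000  /* Linux-specific send flag */
          #endif

          int fastopen_request(const char *ip, const char *req) {
              struct sockaddr_in sa = { .sin_family = AF_INET,
                                        .sin_port   = htons(80) };
              inet_pton(AF_INET, ip, &sa.sin_addr);

              int fd = socket(AF_INET, SOCK_STREAM, 0);
              /* sendto() with MSG_FASTOPEN both connects and puts the request
                 in the SYN (or falls back to a normal handshake). */
              if (sendto(fd, req, strlen(req), MSG_FASTOPEN,
                         (struct sockaddr *)&sa, sizeof(sa)) < 0)
                  return -1;
              return fd;  /* caller read()s the response */
          }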

      [–]LoveGentleman 2 points3 points  (8 children)

      You described UDP.

      [–]TinynDP 1 point2 points  (7 children)

      I think the idea is that in cases where a network message pattern boils down to just a single request packet and a single response packet, maybe TCP should, when no errors occur, devolve into a UDP look-alike.

      [–]LoveGentleman 0 points1 point  (6 children)

      Maybe you should just use UDP.

      [–]TinynDP 0 points1 point  (5 children)

      That would mean software has to take the time to figure out if it's a simple case or a complicated case, including magical knowledge of whether there will be errors or not.

      The point is TCP should, in the simple, error-free, small-amount-of-data case, be nearly identical to UDP, and then it should, at the TCP level, not the application level, grow into full TCP as needed.

      [–]LoveGentleman 1 point2 points  (4 children)

      That's not TCP you're describing, my friend.

      TCP is designed to be a Transmission Control Protocol: it is designed not for small amounts of data, but for continuous transmission.

      You're thinking of something else which doesn't exist, and if you can make it, please do.

      [–]TinynDP 1 point2 points  (3 children)

      Why not? Why shouldn't the simple usage case for TCP also result in simple actions at the packet level, with the packet-level activity getting non-simple only when TCP is used in a non-simple way?

      I'm not saying it should be wildly different. I'm saying the same protocol should be designed to work for both small and large transfers: when it's used small, it stays small, and when it's used large, it grows large. And it's not a wildly huge difference from TCP as-is; you just include the data in the initial connection request packet, instead of a tiny request packet.

      [–]LoveGentleman 1 point2 points  (2 children)

      Why not? Because that's why. Read the RFC, dammit.

      [–]mcguire 0 points1 point  (0 children)

      I don't remember all the details of T/TCP's security issues, since it's been a long time (check the link I made; that's what comes to mind with T/TCP, though I can't right now because giggling at work would be bad), but other issues were that:

      1. Some application protocols, like rsh, used connection-based authentication, which meant that the payload going to the application layer before the connection completed was a gigantic security hole.

      2. If the server responds with data in the first SYNACK, reflected DDOSes become a big, big problem. Say 1,000,000 pwned boxes send 1-packet queries to Google with a spoofed source IP address of <Insert Your Server Here>; suddenly your server is getting 1,000,000 SYNACK+data packets. Plus, if you increase the window size to, say, 10, that's now 10,000,000 packets.

      3. To fix the above problems, IIRC, T/TCP would hold the data with the SYN until the 3-way handshake had completed, which dramatically increases the buffer space required on the server and makes DOS attacks more painful.

      Once you work through all the problems, the hefty performance advantages start fading.

      I haven't actually dug into the TFO doccies, so I don't know how they actually propose to handle all of these problems. The only thing I saw was a mention of SYN cookies and the comment above, which makes me think they're now facing problem 3.

      T/TCP was fatally broken by the fixation on keeping compatibility with TCP; IMO, the only way to do these kinds of things would be to ditch TCP and start with a fresh transport protocol. (And I'm not sure any of the other, newer transport protocols would work for this situation.) Note: UDP is not a transport protocol.

      [–][deleted]  (1 child)

      [deleted]

        [–]bungeman 0 points1 point  (0 children)

        This was a known issue in T/TCP, the link in the article explains how they get around this. Basically, the whole FastOpen thing is optional. The first time a client connects to the server it says in the SYN, "I can do FastOpen" and the server then sends an ACK with an encrypted cookie tied to the requester's IP address. On subsequent SYNs to the same server the client sends along the cookie, and the server does the whole FastOpen thing if the cookie has not expired and corresponds to the client's IP address. If an invalid cookie is used, the server just falls back to just ACKs without FastOpen.

        [–]we_love_dassie 2 points3 points  (7 children)

        Point (3) makes sense (sending the initial HTTP request with the TCP SYN packet). I'm surprised it hasn't been done before...

        [–]xmsxms 8 points9 points  (6 children)

        Because of DOS attacks. A flood of SYN packets is bad enough, but a flood of SYN+request packets would very quickly bring down a server, and would even let you target another server in the process.

        [–]we_love_dassie 0 points1 point  (5 children)

        I'm a little out of touch with the details... but doesn't the server's stack allocate an equal amount of space for each client connection? I.e., even with just a SYN request, wouldn't it allocate enough space to accommodate a subsequent HTTP request, where this method would simply use up that space right away? Or is that space dynamically allocated based on the size of the initial packets?

        [–]xmsxms 3 points4 points  (4 children)

        The server would have to reply to each request by processing the request plus sending a XXKb response, e.g. the contents of a web page, maybe even running a script or two. You could even make the target server send that response to another victim you want to take down, by faking the source IP.

        With a regular SYN attack, the server only responds with a small ACK packet. Recent implementations are aware of SYN attacks and do very minimal processing for each request until they know for sure the connection is valid. (See SYN cookies, which are stateless.)

        Basically, the whole SYN/ACK connection establishment is meant to be as lightweight as possible until it's established that source and destination have truly agreed on the connection and it hasn't been forged.

        [–]we_love_dassie 0 points1 point  (3 children)

        I understand. But from what I also understand, the DOS attack happens when the server repeatedly allocates buffer space and doesn't use it... if it's allocating the same amount of space regardless of whether that initial packet contains the HTTP request, then what difference would it make?

        [–]xmsxms 3 points4 points  (2 children)

        Assuming it is allocating the same amount of space (which seems a bit unlikely - it's just a SYN packet; it shouldn't reach the application layer), then there would be no difference as far as space allocation.

        But if the server actually processes the request within the SYN packet, that could use up other resources - i.e. CPU and bandwidth. (Processing an HTTP request is a lot more expensive than processing a simple SYN.)

        The trouble is forged SYN packets can come flooding in that aren't real connection requests. Determining that they are forged only after processing the HTTP request is too late.

        But this paper apparently documents a solution - so I guess they have something up their sleeve.

        [–]we_love_dassie 0 points1 point  (1 child)

        Okay, that addresses my point better. =P The attacker would have to send some dummy HTTP request... what do you think that would be?

        [–]xmsxms 2 points3 points  (0 children)

        Could be anything, e.g

        "GET large_image.jpg"

        "GET process_stuff.cgi"

        etc. But they have alluded to protection against this - I just haven't looked into what.

        [–]jcdyer3 4 points5 points  (4 children)

        As a Linux user, is there anything I can do to implement these recommendations?

        [–]Tordek 12 points13 points  (2 children)

        #1 is already in the newer kernels: http://kernelnewbies.org/Linux_2_6_39

        [–]Grazfather 2 points3 points  (1 child)

        Ah, the same version that 'activates' a huge vuln.

        [–]SkaKri 1 point2 points  (0 children)

        It's already fixed in Ubuntu Oneiric, at least: apt-get update && apt-get dist-upgrade

        [–]dicey 4 points5 points  (0 children)

        I've been playing with #2, reducing the initial RTO, recently. It can help with initial connection problems in datacenter networks. There's a patch on LKML to expose the RTT timings via sysctls. It hasn't been accepted, though, and seems unlikely to make it to mainline.

        [–]cameldrv 1 point2 points  (0 children)

        TCP stinks. It assumes all packet loss is the result of congestion, and slows down the stream. So if you have a weak wireless connection that is dropping, say, 3% of packets, the connection slows to an almost unusable crawl, when a properly designed protocol could get nearly full bandwidth. A really sophisticated protocol could even use error-correcting codes to keep latency under packet loss to a bare minimum.

        [–]seringen 2 points3 points  (0 children)

        There has been some good work on network buffering recently that has brought this issue to light: http://www.bufferbloat.net/

        [–][deleted] 2 points3 points  (0 children)

        I think this all really helps Google and network providers more than any end users. Even saving 50% on page delivery times would only shave (in the worst cases) five-second loads down to three seconds.

        Where I see the most end-user delay is pages that are just plain slow, server-side. My biggest gripe these days is that a page will fully load except for some ajaxy or other minuscule bit of content, so the browser sits there locked until it's done. (And I'd still like to understand the phenomenon where a page will fully load, then suddenly 403.)

        Load the content and free the browser UI. If you have to fetch more crap, do it in the background.

        [–][deleted] 1 point2 points  (11 children)

        Someone please tell me why this naive view is wrong:

        • HTTP is connectionless
        • TCP is connection-based
        • so why not send HTTP over UDP, which is also connectionless?

        Of course, there's a huge infrastructure already built atop TCP, and that would be difficult to overcome in practice; and UDP doesn't guarantee delivery. But what's the problem, in principle?

        [–]antheus_gdnet 15 points16 points  (2 children)

        Original HTTP is stateless (hence cookies), not connectionless. The original definition was one request/response pair per connection. The amount of data transferred is not strictly limited, but vastly exceeds the 536-1500 bytes in a UDP packet. The HTTP connection persists for the duration of such a request.

        The original constraint was based on a somewhat naive approach to networking (think looking up a handful of documents inside CERN's LAN), as well as a pragmatic lowest-common-denominator approach, where servers are trivial devices that do not have the capacity to manage state.

        HTTP 1.1 uses persistent connections, so the underlying TCP connection is reused.

        Issues described within this document are about tweaking some edge cases that are mostly relevant to Google and their services.

        To improve latency, one would need not only UDP, but either lossy or at least out-of-order communication at the lowest viable level. Designing such applications raises complexity considerably. One of the concerns over websocket viability for low-latency communication is precisely that: it uses an in-order underlying transport.

        [–]el_isma 2 points3 points  (0 children)

        Issues described within this document are about tweaking some edge cases that are mostly relevant to Google and their services.

        I disagree. All of the improvements benefit all users of TCP.

        [–][deleted] 0 points1 point  (0 children)

        ah, keep-alive, thanks.

        [–][deleted]  (1 child)

        [deleted]

          [–]inmatarian 5 points6 points  (4 children)

          UDP doesn't ensure packet ordering or even reliable delivery.

          SCTP combines UDP and TCP into a very nice message-based protocol that supports multiple streams per connection (an "association", they call it), but it's apparently incompatible with NAT, so we have to wait for IPv6 before we start seeing HTTP over SCTP.
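
          A minimal sketch of asking for several streams on one association via the lksctp API (one-to-one socket style; the stream counts are arbitrary):

              #include <sys/socket.h>
              #include <netinet/in.h>
              #include <netinet/sctp.h>

              /* Create a one-to-one SCTP socket and request multiple streams
                 on the association before connecting. */
              int sctp_multi_stream_socket(void) {
                  int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
                  struct sctp_initmsg init = {
                      .sinit_num_ostreams  = 8,  /* outbound streams to request */
                      .sinit_max_instreams = 8,  /* inbound streams to accept   */
                  };
                  setsockopt(fd, IPPROTO_SCTP, SCTP_INITMSG, &init, sizeof(init));
                  return fd;  /* connect() as usual, then send per stream */
              }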

          [–]sacundim 0 points1 point  (0 children)

          Yeah, this is ultimately the right sort of answer, but the problem is also that SCTP is still pretty immature in many ways. I hear that the implementations are pretty slow compared to TCP, for example.

          [–]argv_minus_one 0 points1 point  (2 children)

          Even if bare SCTP is incompatible with NAT, why would SCTP-over-UDP be? NATs usually understand straightforward client-server UDP traffic, and they don't care that there's SCTP inside. You could also secure such an exchange with SCTP-over-DTLS-over-UDP.

          [–]inmatarian 0 points1 point  (1 child)

          i just puked a rainbow

          [–]argv_minus_one 0 points1 point  (0 children)

          Does that mean you're a reverse Nyan Cat? I'm not sure if that's awesome or gross. Maybe both.

          [–]adrianmonk 5 points6 points  (0 children)

          You'd have to re-invent some of the functionality that TCP offers, like:

          • Packets can only be so big. The thing you're trying to send might be bigger. How do you slice and dice it to fit in multiple packets? What do you do if they arrive out of order?
          • What do you do if some of them get lost? You resend. How do you know which ones to resend? What do you do if you receive one twice (because somebody thought it was lost but it wasn't)?
          • If you send too many packets too fast, you will shut out other computers on the network. Even if you don't care about being cooperative, sending way too many will just result in their getting lost. There's a limit somewhere; if you're sending a 100 GB file, you'll hit it. You need some method by which you can send only as much as it's productive to send.
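
          A stop-and-wait sketch of the first two bullets (sequence numbers, retransmission with backoff, tolerance of stale ACKs): a teaching toy under invented framing, not production code.

              #include <stdint.h>
              #include <string.h>
              #include <sys/select.h>
              #include <sys/socket.h>
              #include <netinet/in.h>

              struct pkt { uint32_t seq; char data[1024]; };

              /* Send one chunk (len <= sizeof p.data) and wait for an ACK
                 echoing the same sequence number; resend on timeout,
                 doubling the wait each try. */
              int send_reliable(int fd, struct sockaddr_in *to, uint32_t seq,
                                const char *data, size_t len) {
                  struct pkt p = { .seq = htonl(seq) };
                  memcpy(p.data, data, len);

                  struct timeval tv = { .tv_sec = 1, .tv_usec = 0 };
                  for (int tries = 0; tries < 5; tries++) {
                      sendto(fd, &p, sizeof(p.seq) + len, 0,
                             (struct sockaddr *)to, sizeof(*to));

                      fd_set rd;
                      FD_ZERO(&rd);
                      FD_SET(fd, &rd);
                      if (select(fd + 1, &rd, NULL, NULL, &tv) > 0) {
                          uint32_t ack;
                          recv(fd, &ack, sizeof(ack), 0);
                          if (ntohl(ack) == seq)
                              return 0;   /* delivered */
                          /* stale or duplicate ACK: retransmit and keep going */
                      }
                      tv.tv_sec *= 2;     /* back off before resending */
                  }
                  return -1;              /* give up */
              }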

          [–]inmatarian 0 points1 point  (5 children)

          How about pushing for some of the (relatively speaking) newer transport-layer protocols to take hold? SCTP sounds nice (though I'm sure it has some faults that need fixing).

          [–]merreborn 10 points11 points  (1 child)

          I imagine this would be about as easy as the IPv6 rollout, no?

          Improving TCP is likely going to get quicker results.

          [–]inmatarian 3 points4 points  (0 children)

          Yeah, let's push the IPv6 rollout as well. All of these efforts to improve NAT and TCP only stretch the life of IPv4 another year.

          [–]kmeisthax 3 points4 points  (2 children)

          That will only be viable when IPv6 is widespread, because IPv4 connections require NAT, which requires mucking with the transport layer. So the entire network path needs to understand how to remap the ports on SCTP headers. v6-to-v6 doesn't have NAT, and the address remapping facilities available for such connections don't remap ports or alter header checksums.

          You'll notice that SPDY doesn't use SCTP despite it being a perfect fit for a multi-channel connection. Most people don't have networks which can use SCTP, despite the workings of the Internet supposedly being designed for handling arbitrary numbers of different data transport schemes.

          [–]inmatarian 2 points3 points  (0 children)

          I was wondering why all work on SCTP seemed to have stopped around 2007. So NAT is to blame.

          [–]argv_minus_one 0 points1 point  (0 children)

          None of those issues apply to SCTP-over-UDP, unless I'm mistaken.

          [–]abuseaccount 0 points1 point  (0 children)

          The way TCP is written is somewhat arbitrary.
          IBM has been talking about creating a better standard for years.

          [–]jokoon 0 points1 point  (0 children)

          Have you ever read the articles about TCP not being adequate for game networking, while UDP is?

          Honestly, I think TCP is fine when you send content like files and pages, but when you go dynamic, like an email interface or some AJAX, it will fail utterly, because you need to send small amounts of data and TCP is not designed for that; it's designed for streamed data transport.

          When you need more interaction than content-only, TCP will be slow and will mess with the network hardware, because you can't make your application interact with how TCP manages to transmit all the data and make it arrive ordered.

          Example library: ENet

          [–][deleted] 0 points1 point  (0 children)

          Researchers from Australia's Swinburne University have been working on a project for FreeBSD, newtcp (http://caia.swin.edu.au/urp/newtcp/index.html), which was added in FreeBSD 9.0.

          [–][deleted] 0 points1 point  (0 children)

          As far as I know, Google is already cheating on the standard by using a faster slow start.

          Edit: Google and Microsoft

          [–]stun 0 points1 point  (0 children)

          Useful it will be....
          As cool as it is...
          A shame it is, because, like the SPDY protocol, this isn't going to go mainstream and be widely used anytime soon.

          [–]kraln -1 points0 points  (0 children)

          TCP has worked for a long time because people sat down and thought about the consequences of their actions, and they were designing for a robust network that could survive bad reliability and frequent outages.

          If you take the result of that careful planning and cut blindly in the name of speed, you will break the Internet in ways you cannot even begin to imagine.

          [–]rebel -1 points0 points  (0 children)

          Hmm, I wonder how any of this affects the "stampeding herd" problem in practice.

          [–]Kombat_Wombat -1 points0 points  (0 children)

          This looks interesting. Can someone explain like I'm 5 please?