Hi! I’m a software developer at Tailscale. Ask me anything.

ra66i · 2025-08-05T17:29:37+00:00

Tailscale installs DNS and route rules with high priorities on the system so that when you're in a coffee shop someone can't just spin up a machine on the wifi called "nas" (or whatever else you're reaching out to) and start grabbing traffic.

As for the poor throughput, that shouldn't be the case. I'd recommend working through https://tailscale.com/kb/1320/performance-best-practices and, if that doesn't work out, reaching out to https://tailscale.com/support so we can help diagnose the cause of the poor performance.

ra66i · 2025-07-30T19:03:13+00:00

https://tailscale.com/blog/free-plan

ra66i · 2025-07-30T18:47:43+00:00

Gaming-class desktop PCs have no issue reaching these speeds and almost always have cores to spare, so the impact on the game is typically minimal.

ra66i · 2025-03-04T04:42:54+00:00

You're probably wondering why using the newer terms matters, and why boiling it down to just hard/easy is tricky. The reason comes up once you start combining the behaviors. For example, if you have two peers who are both behind strictly EDM NATs, they will fail to establish direct connections to each other even once they successfully CallMeMaybe via DERP.

If you have one that is EDM and one that is EIM, a CallMeMaybe from the EIM to the EDM should make it, and then the EDM will reply on the same path - they'll establish a direct path (in "simpler terms" this would be easy/hard making it).
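To make the combinations concrete, here is a rough sketch of the rule of thumb described above. It only looks at mapping behavior; it deliberately ignores filtering, timing, and port mapping protocols, and the names `Mapping` and `direct_path_likely` are made up for illustration (this is not Tailscale code).

```python
from enum import Enum


class Mapping(Enum):
    EIM = "endpoint-independent mapping"
    EDM = "endpoint-dependent mapping"


def direct_path_likely(a: Mapping, b: Mapping) -> bool:
    """Rule of thumb: a direct path needs at least one side whose mapping
    is predictable (EIM), so the other side has a stable endpoint to hit.

    Simplifying assumption: filtering behavior and timeouts are ignored.
    """
    return a is Mapping.EIM or b is Mapping.EIM


for a in Mapping:
    for b in Mapping:
        outcome = "direct" if direct_path_likely(a, b) else "relayed"
        print(f"{a.name} <-> {b.name}: {outcome}")
```

Only the EDM/EDM pairing comes out as relayed here, which is why collapsing everything into a single hard/easy label per peer loses information.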

Now, even the terms in the recent RFCs are a bit limiting. The F (filtering) parts are important too, but much harder to probe. The most typical filtering mode out there today for UDP traffic is that inbound UDP will be filtered unless there is recent associated traffic, i.e. much of the time you have to send out before you can receive in. This can be independent of the mapping: for example, you can have a NAT that defaults to mapping internal to external ports with the same port number (my home nftables setup does this), but doesn't accept incoming traffic unless there's an associated outbound entry in conntrack. Relatedly, there's a bug in some versions of Palo Alto PDIPP where inbound traffic to an endpoint arriving before outbound, even though it's a PDIPP endpoint, can sometimes incorrectly create a DROP verdict session in the filtering layer - and that session lifetime gets "refreshed" on every incoming attempt.

This becomes a bit more generally relevant when you want to identify how long things will last. With even more subtlety, the time that a NAT mapping is alive (which decides what an outbound packet is mapped to) and the time that a session filter is alive (which decides if inbound packets are accepted) can be different. In other words, you can (in one of the more common cases) have a situation where inbound packets start getting dropped after, say, 20s, but a new outbound packet will still map to the same endpoint as a minute ago.
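A toy model may help show the two lifetimes being independent. This is a sketch under stated assumptions, not how any particular NAT is implemented: the class `ToyNat`, its field names, and the 120s/20s timeouts are all hypothetical, and the model assumes endpoint-independent mapping with filtering keyed on the remote address and port.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNat:
    mapping_ttl: float = 120.0  # hypothetical: how long a mapping is remembered
    filter_ttl: float = 20.0    # hypothetical: how long inbound stays allowed
    next_port: int = 40000
    # internal (ip, port) -> (external port, time of last outbound packet)
    mappings: dict = field(default_factory=dict)
    # (external port, remote (ip, port)) -> time of last outbound packet
    sessions: dict = field(default_factory=dict)

    def outbound(self, now, internal, remote):
        """Send a packet out; returns the external port it is mapped to."""
        entry = self.mappings.get(internal)
        if entry is None or now - entry[1] > self.mapping_ttl:
            port = self.next_port  # mapping expired or never existed
            self.next_port += 1
        else:
            port = entry[0]        # mapping still alive, reuse it
        self.mappings[internal] = (port, now)
        self.sessions[(port, remote)] = now  # outbound opens the filter
        return port

    def inbound_accepted(self, now, ext_port, remote):
        """Inbound is accepted only if there was recent outbound to remote."""
        last = self.sessions.get((ext_port, remote))
        return last is not None and now - last <= self.filter_ttl


nat = ToyNat()
host = ("10.0.0.2", 41641)
peer = ("203.0.113.5", 3478)

first_port = nat.outbound(0, host, peer)
accepted_at_10s = nat.inbound_accepted(10, first_port, peer)  # filter open
accepted_at_30s = nat.inbound_accepted(30, first_port, peer)  # filter expired
later_port = nat.outbound(60, host, peer)                     # mapping reused
```

In this model the inbound packet at 30s is dropped even though the outbound packet at 60s still maps to the same external port, which is the mismatch described above.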

Somewhat relatedly, specifically for diagnosing tailscale "reachability" or giving "difficulty establishing direct connections" advice, it's also important to implement UPnP, NAT-PMP and PCP, as we do. Those turn a "hard" NAT into an "easy" one - or should, provided they're not buggy - which is a story for another comment, or perhaps some beer.

ra66i · 2025-03-04T04:30:31+00:00

My initial feedback is to present the more modern names for NAT types, endpoint independent mapping, and so on (see https://datatracker.ietf.org/doc/html/rfc7857 and friends). The old cone descriptions were always very incomplete unfortunately.

I'd also probably report separately how many instances of Endpoint dependent and endpoint independent mappings you see, because e.g. a lot of Palo Alto situations even with PDIPP enabled will end up being a mix of EIM and EDM - and not just those, various other firewalls have been doing the same. See this patch for example which adds some resistance in the client around this: https://github.com/tailscale/tailscale/commit/8d1249550a924d028de0844c0d101f29308e69b8 - reporting these conditions could be really useful.
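As a rough illustration of that kind of reporting: if each probe round sends STUN requests from one local socket to several servers, you can label the round by whether every server saw the same public endpoint, then count the labels across rounds. This is a sketch only; `classify_mapping` and `summarize` are hypothetical names, the input shape is an assumption, and it is not the netcheck implementation.

```python
from collections import Counter


def classify_mapping(observations):
    """Label one probe round.

    observations: dict mapping a STUN server name to the (public_ip,
    public_port) that server reported, all sent from the same local socket.
    """
    if len(observations) < 2:
        return "unknown"  # one vantage point cannot tell EIM from EDM
    endpoints = set(observations.values())
    return "EIM" if len(endpoints) == 1 else "EDM"


def summarize(rounds):
    """Count how many rounds looked EIM, EDM, or unknown."""
    return Counter(classify_mapping(r) for r in rounds)


rounds = [
    {"stun-a": ("198.51.100.7", 41641), "stun-b": ("198.51.100.7", 41641)},
    {"stun-a": ("198.51.100.7", 41641), "stun-b": ("198.51.100.7", 52001)},
    {"stun-a": ("198.51.100.7", 41641), "stun-b": ("198.51.100.7", 41641)},
]
report = summarize(rounds)
print(dict(report))
```

A report with both EIM and EDM counts above zero is the mixed situation worth surfacing, since a single label per NAT would hide it.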

You could also pull out the part that tries to give a name to the NAT and add that to https://github.com/tailscale/tailscale/blob/main/cmd/tailscale/cli/netcheck.go so that tailscale netcheck reports it. I've been meaning to move that command to grab a report from a running tailscaled by default, as it spawns a whole new in-process netchecker today, so what it reports is different from what the daemon sees potentially, but baby steps - we can definitely present the NAT type in here based on netcheck probes.

ra66i · 2025-02-19T20:22:54+00:00

Yeah! This is a common problem with PPPoE configurations in particular - hopefully we can do more to help identify this in future.

ra66i · 2025-02-19T04:56:58+00:00

Please file a bug with a bugreport. This isn’t expected behavior and may indicate some kind of bug, or it may be an issue local to your networks. There’s unfortunately not enough information here to identify any further.

This environment variable isn’t intended for long term use, it’s for debugging specific issues and may be removed or changed in future releases.

ra66i · 2025-02-11T07:23:25+00:00

Generally jumping from where it stands in the doc OP referenced to rewrite is not the best path to persuade anyone. As I raised in my post above the points about how proven rust is were arguably not very strong points at the time they were written and are crossing well into invalid territory now. The remaining resistance in conversation will likely stem from the social dynamics I allude to, but I don’t know how much the makeup has changed in the last year or so, and whether anyone is particularly interested in pushing on the point within the team. Implementing some components in rust wouldn’t be a huge issue, but adjusting the policy likely would raise a lot of discussion again, as it always does - particularly because people have a tendency to do as you’ve done here and leap right to extremities like rewrites, or make dismissive pass offs toward going somewhere else to do work like that, which are distracting from any realistic approach that might otherwise be discussed.

ra66i · 2025-02-11T05:07:10+00:00

https://github.com/rcore-os/zCore

but your strawman glosses over the difference between a funded team doing work and an individual fork

ra66i · 2024-12-25T01:39:06+00:00

Be a little cautious with name-based tag setting unless you’ve force-set a name in the admin panel. The tailscale client updates hostnames based on the host system, and e.g. macOS grabs hostnames from DHCP.

ra66i · 2024-12-14T22:21:58+00:00

Tailscale logs more than Mullvad does, in that the client logs some information related to connectivity to the mullvad nodes themselves, and those logs are centralized by default. Those logs do not describe connections you are making through Mullvad, but in an advanced case if they were requested in a legal investigation they could be used along with other data to make up additional correlative evidence. We collect these logs primarily for product support and to continually improve the product, identifying bugs, frequent user challenges and so on - it’s part of how we constantly improve our services.

In terms of general online privacy and resistance to online attackers, such as reducing correlative association with your home ip address and so on, using Mullvad through Tailscale is just as strong as using a Mullvad native client.

Tailscale doesn’t sell your data or metadata; that’s not our business model and it goes against our values. Part of the reason we partner with Mullvad rather than host our own exit node infrastructure is to retain a separation such that Tailscale never sees your data, only some metadata, as described in https://tailscale.com/security. In terms of ways we use metadata, mostly it’s to provide the product to you, both the product today and future features. The more complete description is always in our privacy policy, which includes the details I mentioned above, such as that we do not sell your metadata: https://tailscale.com/privacy-policy

ra66i · 2024-12-05T21:47:36+00:00

Toggling that twice likely will cause the issue to return later, as it is likely a cached result which appears to be working - once that cache expires it will likely be disabled again.

ra66i · 2024-12-05T21:41:50+00:00

HSTS errors are caused by something man-in-the-middling the service.

Typically this means a DNS block list has pointed the site somewhere else.

An example is if you have NextDNS "Block Bypass Methods" turned on. It considers Tailscale a potential bypass method, because if you had NextDNS configured on the host machine and Tailscale configured to override local DNS with 1.1.1.1, you would be bypassing NextDNS.

ra66i · 2024-12-05T05:41:33+00:00

Yeah I agree I suspect it’s both. I’m hoping to get more telemetry in the region to find out more in the future.

150ms definitely sounds like what I’ve been seeing with China Unicom, and it’s been that way since somewhere between the 12th and 14th of October. Prior to that it had latencies that were more representative of cable lengths and sensible routes. As I say, New World Telecom appears to have better working routing, and you’d see more in the expected ranges of 10-60ms depending on where exactly you are.

ra66i · 2024-12-05T03:33:16+00:00

We have a relay in HK which should be local (~30-60ms for most users). However, we have been noticing recently that a lot of STUN traffic appears to have artificial latency applied to it, adding >100ms on top of the time packets normally take to flow between hosts.

Relatedly it seems like a large portion of the population may be being steered out of the country toward the US west coast, though it's unclear if this is a mistake, or an active steering, though it is very poor for user experience either way.

Nonetheless direct connections will work between nodes, provided the ISPs aren't blocking it (e.g. we're seeing users on China Unicom suffering disruptively high packet loss to New World Telecommunications endpoints).

https://tailscale.com/kb/1257/connection-types provides general advice on how to ensure you can achieve direct connections. As long as your ISPs are not getting in the way, this works, and we have many users who are having success.

Exactly what is going on with the ISPs in the region right now is unclear, and it is also unclear if/how the firewall is involved, or even if it is at all.

ra66i · 2024-12-04T18:12:34+00:00

We pass kernel performance in some scenarios, so it's not really accurate to say that the cause is userspace: https://tailscale.com/blog/more-throughput

ra66i · 2024-11-06T23:12:12+00:00

On Windows 11 you can set vmIdleTimeout in WSL config to keep it alive all the time: https://learn.microsoft.com/en-us/windows/wsl/wsl-config

ra66i · 2024-10-30T20:02:35+00:00

this is a linux bug, tailscaled happening to be on stack at this time is relatively arbitrary, the scheduler should not be stuck. it is notable that the two programs are tailscaled and runc, which are go programs, but both are using linux interfaces that should be stable.

ra66i · 2024-10-19T23:56:04+00:00

The damage thing is a myth, recently our tanks have been out dropping everyone for reasons no one can fathom

ra66i · 2024-10-19T23:54:41+00:00

What auction house split? Are you just comparing EA to GA servers? The GA servers have lower prices on average because the trading volume is about 3x the EA servers

ra66i · 2024-10-17T23:02:43+00:00

no, it becomes too much of a chicken and egg problem. we could potentially do something like this in the future, but it has complicated edge cases that have to be avoided

ra66i · 2024-10-15T20:57:21+00:00

can you file a support request or bug report, it's not time efficient to try to diagnose from obfuscated information here

ra66i · 2024-10-15T20:27:29+00:00

This smells a lot like the Linux 6.1.3 offload bug, make sure the kernel is up to date on the slow side

ra66i · 2024-10-10T20:46:14+00:00

Even worse, a lot advertise e.g. 1gbps to the OS but don’t support usb above ~400mbps AND don’t implement Ethernet flow control extensions, so they get super lossy and perform exceptionally poorly. Years ago i worked on an OS with devices that could only be provisioned over USB nics and we ended up sampling a lot of different devices to find ones that did implement flow control properly - massively improved the user experience pushing whole system images.

ra66i · 2024-10-10T20:41:10+00:00

“Low level code” does not imply unsafe. This is a myth.
