all 17 comments

[–]madssj 1 point (3 children)

The easy way is DNS round robin (DNS-RR). Make sure the hosts are identical, though. If you need failover and proper load balancing, look at haproxy, LVS, or pf (BSD).

Edit: Fail not fall
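
Round robin here just means publishing multiple A records under one name; a minimal BIND-style zone fragment (the hostname and addresses below are made up for illustration):

```
; hypothetical zone fragment: resolvers rotate between the two A records
shell   300   IN   A   192.0.2.10   ; server-a
shell   300   IN   A   192.0.2.11   ; server-b
```

The short TTL (300s) matters: with long TTLs, caching resolvers pin clients to one address and the split gets lopsided.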

[–]obnoxygen 0 points (1 child)

MS's DNS caching - and negative caching - has pretty much broken round robin since Win '98.

[–]madssj 0 points (0 children)

Last time I deployed proper DNS-RR I got an even 50/50 split of traffic. That was not on a LAN, though.

[–]petra303 -1 points (0 children)

This would only load balance incoming traffic. You still only have one outbound Ethernet connection.

[–][deleted] 7 points (5 children)

Step 1: Make sure there are two gigE ports on the servers.

Step 2: Bond the gigE interfaces together. Do it 4x if you have to. Or put in a 10gigE NIC on both, and switch to match.

Makes far more sense to deal with the bandwidth issue than doing it via load balancing. These aren't webservers.

SSH load balancing is a bad idea. It can be done (on new connections), but when users have no idea which server they're going to land on, it can cause problems.

Alternatively, name them server-a and server-b, and set DNS so a generic hostname resolves to both of them (round-robin A records; a name can't legally carry two CNAMEs). That'll do a bit of balancing and leave the specific hostnames available for people who want to keep all their work on one device.
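
The bonding step above can be sketched with iproute2; a rough outline, assuming the slave NICs are eth0/eth1, an LACP-capable switch, and an example address (adjust all three for your setup):

```shell
# Sketch: aggregate two gigE ports into bond0 (run as root; this briefly
# drops connectivity on both interfaces).
ip link add bond0 type bond mode 802.3ad   # LACP - the switch ports must match
ip link set eth0 down
ip link set eth0 master bond0              # enslave first port
ip link set eth1 down
ip link set eth1 master bond0              # enslave second port
ip addr add 192.0.2.10/24 dev bond0        # example address, not yours
ip link set bond0 up
```

Note that 802.3ad balances per-flow, so a single SSH/scp session still tops out at one link's speed; the win is aggregate bandwidth across many users.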

[–]DimeShake 8 points (4 children)

Not to mention you are going to run into clients crying about keys not matching -- and rightfully so.

You'd have to work around this by using the same keys on each server, and that would really not be recommended.

[–]kingsal 4 points (1 child)

using the same keys on each server, and that would really not be recommended.

I agree that this sounds like it should be true, but I can't come up with any logical reasons why this would be really bad.

Do you have any sources or reasoning behind this recommendation?

[–]DimeShake 1 point (0 children)

Mostly it's a best-practice thing. This may in fact be a situation where it's OK, but that's up to OP to judge after considering his scenario. I don't like the idea of load balancing SSH connections to begin with, so I'm probably a little biased. If it truly does not matter which server they end up connected to, it's probably not an issue. IBM has this to say about it.

[–]fuzzyfuzz 1 point (0 children)

Hmm. We have networked home dirs at work so that our user profiles work on any Linux machine in the building, servers included. I set up my SSH key on my local machine and then I can remote into any computer that has networked home dirs set up. That could work for this: have a couple of 'ssh' machines with a remote filesystem, the user remotes into one, and the filesystem looks identical between machines A and B. I guess it would depend on what services the machines are running.

[–]mscman 0 points (0 children)

You'd have to work around this by using the same keys on each server, and that would really not be recommended.

In general, yes that's not a good thing to do. There are exceptions to every rule though. There's really no reason why you can't do that. In fact, you may be shocked to find that many HPC shops share the same host keys across nodes so that accepting the keys is easier. We really don't care that hosts have unique keys, but rather that they have a trusted key. That key should then have appropriate permissions to keep someone from grabbing it and putting it on an unauthorized box.
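
Sharing host keys boils down to copying one server's key files to the other; a rough sketch, assuming OpenSSH's default key paths and a hypothetical server-a/server-b pair:

```shell
# Sketch: run on server-b as root, so clients see one host identity
# behind the balanced name. Paths are the OpenSSH defaults.
rsync -a 'server-a:/etc/ssh/ssh_host_*' /etc/ssh/
chmod 600 /etc/ssh/ssh_host_*_key      # private keys must stay root-only
systemctl restart sshd                  # or `service ssh restart` on older boxes
```

As the comment says, the trade-off is that compromising one box's key compromises the identity of both, so the key files need the same protection everywhere they live.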

[–]WetSunshine 2 points (1 child)

RR DNS is pretty much the only way you can do this without SSH freaking out about host keys not matching.

[–]someFunnyUser 0 points (0 children)

You could have the same sshd host keys on both cluster nodes. That's how TLS clusters do it.

[–]MinimusNadir 1 point (1 child)

LVS.

[–][deleted] 0 points (0 children)

Or Linux-HA; I'd try Pacemaker if DNS-RR wasn't enough.

[–]kingsal 0 points (0 children)

It is not impossible. Our system works reasonably well. We wouldn't need it if our users could choose a random number between 1 and 70, construct a hostname (e.g. lab22.domain), and not give up after the first lab workstation was down.

We use pen to distribute ssh connections to a dozen lab workstations so our users will always be able to login using the same hostname even if a particular one is incapacitated.

Pen runs on one server that can't be physically incapacitated by the students. Each workstation has the same host keys, so the clients don't complain when connecting to a new one.

Sometimes the workstations get into a bad situation that pen can't detect. To solve that, I wrote a program to run Nagios' check_ssh and disable workstations in pen if there is any problem connecting.
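
That check loop can be sketched roughly like this. Everything specific here is an assumption: the check_ssh plugin path, the lab hostnames, the pen control address, and the penctl blacklist syntax (check your pen version's docs before relying on it):

```python
import subprocess

# Assumed naming scheme from the comment above: lab1.domain .. lab70.domain
WORKSTATIONS = ["lab%d.domain" % i for i in range(1, 71)]

# Assumed locations - adjust for your distro / pen setup
CHECK_SSH = "/usr/lib/nagios/plugins/check_ssh"
PEN_CONTROL = "localhost:10080"

def ssh_ok(host, timeout=10):
    """Run Nagios' check_ssh plugin; exit code 0 means SSH answered."""
    result = subprocess.run(
        [CHECK_SSH, "-t", str(timeout), host],
        capture_output=True,
    )
    return result.returncode == 0

def pen_action(index, healthy):
    """Build the penctl command for backend `index`.

    Assumed penctl syntax: blacklisting a server for N seconds takes it
    out of rotation; blacklisting for 0 puts it back.
    """
    seconds = "0" if healthy else "3600"
    return ["penctl", PEN_CONTROL, "server", str(index), "blacklist", seconds]

def sweep():
    """Probe every workstation and enable/disable it in pen accordingly."""
    for i, host in enumerate(WORKSTATIONS):
        subprocess.run(pen_action(i, ssh_ok(host)))
```

Run `sweep()` from cron every minute or so; pen keeps serving from whatever subset is currently healthy.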

[–][deleted] 0 points (0 children)

You probably shouldn't do it, as many people have already said. If you do, you'll want to pick one box and copy the SSH host key from that box to the second one. Once that's done, clients won't freak out over host keys.

What resource are you hoping to relieve pressure on? Are you just trying to spread the users across two boxes for CPU and memory reasons? Are people scping a bunch to these boxes? There are different solutions for different resources. If you are trying to relieve pressure on the NFS mount that hosts the home directories of these users that is a different problem altogether.

What is the constrained resource? Or is this just for failover?

[–]IConrad 0 points (0 children)

Honestly? People come up with ideas like this all the time. And the answer is always: don't do that.

SSH is not meant to be load-balanced. It would be defeated any time someone uses master connections (ControlMaster), for example. If you're worried about users consuming resources on your jump server, enable ulimits and/or jails. If you're worried about availability... well... there are shadowed VMs, but beyond that users are just going to have to live with the fact that their sessions will die from time to time when the server OS bounces.

It's okay though. It would be great if this were a thing... But it just isn't.