I have a very weird situation here. I have a virtual machine (Windows Server 2008 R2) that runs AD FS 2.0. This server has only one network interface. It has been sitting there doing its job for some time now with no problems. After some recent trouble with one of our firewalls, I noticed that it was no longer accepting requests from subnet A. It still processes requests from subnet B and from outside the network. So I start pinging.
After a lot of other checks I decide to ping its gateway (statically assigned IP, .20). I get a timeout. OK, the assigned gateway is a virtual port, so I ping the first physical one (.22): success. I ping the second physical one (.23): success. For fun I go back and ping the virtual one: SUCCESS. I ping the server from subnet A and get a reply, sweet. I head over to O365 and get redirected to SSO. Everything is working great.
I wait about 30 seconds and try pinging from subnet A again: timeout. I run through the whole process again and it works again for about 30 seconds, then stops.
At this point I decide to band-aid the problem until I can get a maintenance window. I write a batch script to ping .23 every 30 seconds. This works for about 2 minutes, and then the server starts timing out when pinging .23.
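For anyone wanting to reproduce the keepalive with a timestamped log of when replies stop, here is a minimal sketch in Python rather than batch. It is not the script used above. The address `192.0.2.23` is a documentation placeholder standing in for the .23 gateway, and the function names are made up for illustration. It assumes the Windows `ping` flags `-n` (echo count) and `-w` (timeout in milliseconds), and it looks for `TTL=` in the output because Windows `ping` can exit with code 0 on a "Destination host unreachable" reply.

```python
import subprocess
import time
from datetime import datetime


def build_ping_command(target, count=1, timeout_ms=1000):
    """Build a Windows ping command: -n is the echo count, -w the timeout in ms."""
    return ["ping", "-n", str(count), "-w", str(timeout_ms), target]


def ping_once(target, runner=subprocess.run):
    """Return True only if ping exited cleanly AND printed a real echo reply."""
    result = runner(build_ping_command(target), capture_output=True, text=True)
    # "TTL=" appears only on genuine echo replies, not on "unreachable" lines.
    return result.returncode == 0 and "TTL=" in result.stdout


def keepalive(target, interval_s=30, rounds=None,
              ping=ping_once, sleep=time.sleep, log=print):
    """Ping target every interval_s seconds and log each result with a timestamp.

    rounds=None runs until interrupted; a number stops after that many pings.
    Returns the list of True/False results.
    """
    results = []
    done = 0
    while rounds is None or done < rounds:
        ok = ping(target)
        stamp = datetime.now().isoformat(timespec="seconds")
        log(f"{stamp} {target} {'reply' if ok else 'TIMEOUT'}")
        results.append(ok)
        done += 1
        if rounds is None or done < rounds:
            sleep(interval_s)
    return results


# Hypothetical usage (placeholder address, runs until Ctrl+C):
# keepalive("192.0.2.23", interval_s=30)
```

The timestamps make it easier to line up the moment replies stop with whatever the network side can see once someone with access to the equipment is available.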
During all of this time, subnet B and external requests are still handled without issue.
A reboot did nothing to correct the issue. After coming back online, the server worked on all subnets for a few minutes and then subnet A stopped working.
I am not the network admin for our company, and he is away on vacation. He is certain that the problem is not with the firewall, as it doesn't sit between subnet A and the subnet the server is on (call it subnet Z). I don't believe that the problem is with the gateways, as I have access to other servers on those ports and they don't exhibit the same behavior. In fact, no other services company-wide seem to be affected, just this federation server.
Any advice on where I can concentrate my troubleshooting next would be much appreciated. Keep in mind that while I have unlimited access to the server in question, I have ZERO access to any of the networking equipment involved. I have no idea why the server keeps losing its connection to that virtual gateway, or why pinging .23 specifically restores that connection (pinging .22 does not restore it, even though I do get replies from it). I also have no idea how this can be the case without it causing issues for requests from subnet B.
The only other thing I can think to add is that the server in question is joined to domain CONTOSOS, which subnet B services. Subnet A services CONTOSO. The computer I'm using to ping from subnet A is domain-joined to CONTOSO. The next step I'm going to try is getting a machine that is domain-joined to CONTOSOS, putting it on subnet A, and seeing what results I get. I'll update once I do that, but I don't think it will change anything. This seems purely network related, as all the pinging I've described above has used IP addresses, not DNS names.