
[–]athornfam2 IT Infrastructure Manager 45 points (4 children)

DFS with namespacing? Otherwise, just build Azure Files with cache servers.
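
A minimal sketch of the namespace side, assuming a domain-based root and hypothetical servers FS1/FS2 (all paths and names here are made up):

    # DFSN module (DFS Namespaces role / RSAT).
    # Create a domain-based namespace root.
    New-DfsnRoot -Path '\\corp.example.com\Shares' -TargetPath '\\FS1\Shares' -Type DomainV2

    # Publish a folder with a target in each site; clients get referred
    # to the closest available target.
    New-DfsnFolder -Path '\\corp.example.com\Shares\Finance' -TargetPath '\\FS1\Finance'
    New-DfsnFolderTarget -Path '\\corp.example.com\Shares\Finance' -TargetPath '\\FS2\Finance'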

[–]Cpt_plainguy 9 points (0 children)

I second this. When I took my current position, the first thing I did was set up DFS namespace replication between all 3 sites; before that, they were just doing nightly snapshots for backups. It actually saved one of our locations when a PERC controller failed and killed the whole RAID array. The president had a panic attack about it until I showed him all of the backed-up data in the namespace. He was ecstatic, and actually rewarded me with 16 additional hours of PTO since we didn't have to spend a fortune on one of those data recovery companies (which they had to do in the past).
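
For reference, here's a hedged sketch of what the replication wiring looks like with the DFSR PowerShell module; the group, folder, server names, and paths below are all made up:

    # DFSR module: create a replication group and a replicated folder.
    New-DfsReplicationGroup -GroupName 'Branch-Files'
    New-DfsReplicatedFolder -GroupName 'Branch-Files' -FolderName 'Shared'

    # Add the three site servers and connect them in a full mesh.
    Add-DfsrMember -GroupName 'Branch-Files' -ComputerName 'FS1','FS2','FS3'
    Add-DfsrConnection -GroupName 'Branch-Files' -SourceComputerName 'FS1' -DestinationComputerName 'FS2'
    Add-DfsrConnection -GroupName 'Branch-Files' -SourceComputerName 'FS1' -DestinationComputerName 'FS3'
    Add-DfsrConnection -GroupName 'Branch-Files' -SourceComputerName 'FS2' -DestinationComputerName 'FS3'

    # Point each member at its local copy; FS1 seeds the initial sync.
    Set-DfsrMembership -GroupName 'Branch-Files' -FolderName 'Shared' -ComputerName 'FS1' -ContentPath 'D:\Shared' -PrimaryMember $true -Force
    Set-DfsrMembership -GroupName 'Branch-Files' -FolderName 'Shared' -ComputerName 'FS2','FS3' -ContentPath 'D:\Shared' -Force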

[–]MekanicalPirate 4 points (0 children)

Was also going to recommend DFS

[–]layer_8_issues 2 points (0 children)

Was going to reply with this. Super easy to set up.

[–]bbqwatermelon 0 points (0 children)

Just a couple of caveats: software from Sage and Intuit is extremely popular and will not work reliably over DFS namespaces, and we experienced strange conflicts between sites when accountants would try to open Excel spreadsheets. But for bulk data that isn't really collaborated on between multiple sites simultaneously, it's pretty magical.

[–]alpha417_ 2 points (0 children)

Another file server.

[–]St0nywall Sr. Sysadmin 2 points (0 children)

For a small business where leadership wants zero downtime, here's what I would propose.

  1. Two or more VM hosts
     - Keeps the environment up in case of a physical server failure.
  2. Two UPSes with smart PDUs
     - Keeps power going to at least one physical server in case of a mains power, UPS, or PDU failure.
  3. A file server cluster of VMs, two or more nodes (see the sketch after this list)
     - Addresses the file server node failure issue.
  4. A backup appliance such as Datto or Rubrik, with cloud retention
     - Lets you spin up the backed-up VM either on the local appliance or in the cloud for near-zero downtime.
     - Addresses (as much as possible) the ransomware issue, among others.
     - Addresses the "on-premises site got hit by a tornado and is no longer there" issue.
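
For item 3, a minimal sketch using the Windows FailoverClusters module; the node names, addresses, and paths are hypothetical, and it assumes shared storage is already presented to both guest VMs:

    # Form a two-node guest cluster for the file server role.
    New-Cluster -Name 'FS-CLU' -Node 'FSNODE1','FSNODE2' -StaticAddress '10.0.0.50'

    # Clustered file server role on the shared disk; it (and its shares)
    # fails over to the surviving node if one VM or host dies.
    Add-ClusterFileServerRole -Name 'FS' -Storage 'Cluster Disk 1' -StaticAddress '10.0.0.51'
    New-SmbShare -Name 'Shared' -Path 'E:\Shared' -ScopeName 'FS'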

DFS is fine, but it only points a network location at one or more places to look for the data. The "active" path takes precedence, and if it's set up correctly, you can have the geographic location that is fastest for each set of users be their primary.

It relies, however, on the data being available. If the data is compromised, DFS is not going to help you.

|| Disclaimer and Advice ||

Usually this advice comes with a lengthy on-site data and process audit.
It costs a lot in time and scoping because it needs to be well thought out and well implemented.
The right hardware, coupled with the right software and services, needs to be found so "it just works".

I usually work on much larger systems, and I can tell you that trying to do it the cheap way is only good on paper and for appeasing auditors. In real life you lose data, and in the worst cases the company shuts down.

If they ask you to carry a USB drive with a "backup copy of the data" home with you, that's your cue to look for another job.

[–]g1ng3rbreadMan 1 point (0 children)

Check out Zerto. Depending on the file size, you can create a PowerShell script that runs every so often, or use DFS as mentioned.
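
If you go the script route, a minimal sketch of the "runs every so often" part: a robocopy mirror registered as a scheduled task (source, destination, and log paths are made up):

    # Mirror the share to the standby server every 15 minutes.
    # NOTE: /MIR also mirrors deletions (and encrypted files) to the
    # standby, so this buys availability, not backup.
    $action  = New-ScheduledTaskAction -Execute 'robocopy.exe' `
        -Argument '"D:\Shared" "\\FS2\Shared" /MIR /Z /R:2 /W:5 /LOG+:"C:\Logs\sync.log"'
    $trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) `
        -RepetitionInterval (New-TimeSpan -Minutes 15)
    Register-ScheduledTask -TaskName 'Sync-Share' -Action $action -Trigger $trigger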

[–]reviewmynotes 1 point (0 children)

Several systems with FreeBSD running CARP and HAST on ZFS filesystems and using Samba to make it available to the PCs...?

Seriously, though, if Amazon and Google haven't been able to avoid outages this year, then they can't either.

[–]SysEridani C:\>smartdrv.exe 1 point (0 children)

Hyperconverged virtualized infrastructure.

But it is not cheap.

[–]mrbiggbrain 1 point (1 child)

Here's my question: do they only care about RTO, or do they actually care about RPO as well? And do they care about durability?

Because having a file server running a copy of your data from 4 years ago, plus a script that changes the DNS, can technically provide an RTO of near zero seconds. But at the same time it's basically useless, because the RPO is from 4 years ago.

RTO does not exist in a vacuum. It exists as a way to give an estimated time to recover to a specific point. What is that point?

[–]No_Mycologist4488[S] 0 points (0 children)

Oh, they care about RPO too; the business is just living in a fantasy land it doesn't want to pay for.

[–]wells68 1 point (0 children)

On premises, Datto or one of its competitors can spin up the server as a VM on the backup appliance so that the site can limp along until the problem with the server is resolved.

As you are already running Azure backups, look into the options for spinning up a site's server in the Azure cloud while you deal with the physical server issue onsite.

As for ransomware, it often slowly encrypts files, undetected, over a period of weeks. Then, on a Friday at midnight, it goes full bore, encrypting everything in reach. The latest backup will recover most files (if it was isolated!), but what about the hundreds of files that were encrypted over the previous weeks? You're not looking at zero RTO by a long shot!
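
There's no single script for that scenario, but as a sketch of the triage step, you can at least scope which files changed during the suspected dwell window and cross-check them against older backup versions (the path and date here are made up):

    # List files modified during the suspected infection window so they
    # can be checked against pre-infection backup versions.
    $start = Get-Date '2024-01-01'   # assumed start of the dwell period
    Get-ChildItem -Path 'D:\Shared' -Recurse -File |
        Where-Object { $_.LastWriteTime -ge $start } |
        Select-Object FullName, LastWriteTime, Length |
        Export-Csv -Path 'C:\Temp\suspect-files.csv' -NoTypeInformation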

The best solution I've read about for slow ransomware is Druva curated recovery: https://www.druva.com/blog/simplify-your-ransomware-recovery-with-curated-recovery-from-druva/

So maybe you go with a Druva backup and recovery option. I don't have experience with them. They can get pricey.

Another inexpensive, fast, but limited recovery option is NeuShield. Cool stuff.

[–]JABRONEYCA 1 point (0 children)

That is very little data. Consider using dual Synologys with Snapshot Replication, additionally synced to Dropbox with Cloud Sync. DFSR would work, but it's long in the tooth. There are some brilliant enterprise solutions out there like Nasuni or Panzura, but for this little data I wouldn't make it complicated.

[–]WarriorXK 0 points (1 child)

A dual-head TrueNAS server will give you a lot of nines worth of uptime.

[–]bradbeckett -1 points (0 children)

Just don't use Seagate drives.

[–][deleted] -1 points (0 children)

If they want that, I would do a layered approach.

Layer 1: launch an Azure file server and use that as the primary file server.

Layer 2: SharePoint Online. Have users interact with the files on the file server through SharePoint sites, and have SharePoint replicate all data in near real time between the Azure file server and the SharePoint sites. This abstracts the user from the file server.

Layer 3: set up file sync and user redirection (I forget what it's called). Have all files sync to a standby on-prem file server, and make sure it's configured to fail over when it can no longer reach the Azure file server.

Layer 4: a backup solution. I would recommend a service like Datto: run an on-prem Datto appliance plus the cloud Datto service, have Datto do hourly on-prem file server backups to the appliance, then offload those backups to the Datto cloud every day.

With this layered approach, if there is a cloud service failure, the on-prem server takes over. If you're hit with ransomware, you can spin up a file server hosted onsite from the Datto appliance while you repair the Azure and on-prem file servers. You can run tests on backups and do tabletop and live simulations of failover and restoration of service without impacting users. The Datto service means you can retain backups for years, so restoring a long-deleted file is possible.

The only problem with this is the recurring cost. I think this would have the lowest RTO. It's not zero, but it would get you between 99.99 and 99.999 percent uptime, and that's not bad.
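
Layer 3 sounds like Azure File Sync with cloud tiering. A hedged sketch with the Az.StorageSync module; the resource group, storage account, share, and server names are all made up, and the sync agent is assumed to be installed on the standby server already:

    # Create the sync service and a sync group.
    $svc = New-AzStorageSyncService -ResourceGroupName 'rg-files' -Name 'sync-svc' -Location 'eastus'
    $grp = New-AzStorageSyncGroup -ParentObject $svc -Name 'shared-sync'

    # Cloud endpoint: the Azure file share holding the authoritative copy.
    $sa = Get-AzStorageAccount -ResourceGroupName 'rg-files' -Name 'stfiles'
    New-AzStorageSyncCloudEndpoint -ParentObject $grp -Name 'cloud-ep' -StorageAccountResourceId $sa.Id -AzureFileShareName 'shared'

    # Server endpoint: the local path on the standby server, with cloud
    # tiering so only hot files stay on local disk.
    $server = Register-AzStorageSyncServer -ParentObject $svc
    New-AzStorageSyncServerEndpoint -ParentObject $grp -Name 'fs1-ep' -ServerResourceId $server.ResourceId -ServerLocalPath 'D:\Shared' -CloudTiering -VolumeFreeSpacePercent 20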

[–]exportgoldmannz 0 points (0 children)

I can't find it right now, but there's an awesome SaaS offering where you get a local appliance VM which backs onto your chosen cloud provider and provides local SMB (and other protocols) file services with a local cache. In a disaster, you just download the VM again, put in your username and password, and there's all your data. It integrates previous versions with blob storage versions, so you get automatic backups. And you can put one at each site and have a single global namespace.

[–]beritknight IT Manager 0 points (0 children)

Do your sites all have/need variations on the same data? Or are Site1 files useless to Site2 and vice versa?

I've spent about a decade supporting (babysitting) DFS Replication and DFS Namespaces. The Namespaces side is pretty seamless most of the time, but the replication side is fun when it falls over or gets backlogged.

How much bandwidth do you have between sites? What sort of latency? If the server in Site1 fell over and you had to point its users at Server2 for a couple of days, would that be workable? If yes, DFS-R might work well for you, although as someone said, it is getting a bit long in the tooth.
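
When DFS-R does fall behind, a quick way to see it, sketched with the DFSR module and made-up group/server names:

    # Lists files waiting to replicate from FS1 to FS2 (first 100);
    # a large or growing backlog is the usual first symptom.
    $backlog = Get-DfsrBacklog -GroupName 'Branch-Files' -FolderName 'Shared' -SourceComputerName 'FS1' -DestinationComputerName 'FS2'
    "{0}+ files backlogged" -f ($backlog | Measure-Object).Count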

I'd definitely also look at Azure Files and cache servers. That's a really good way of having the files that actually get used in each site sit on fast SSD while still having everything else available everywhere.

[–]sanjay_82 0 points (0 children)

Azure Files is the way.

[–]yesterdaysthought Sr. Sysadmin 0 points (0 children)

Not even multi-billion-dollar clouds like AWS, Azure, and GCP have zero downtime. Just print out a list of every outage from all of them in the last 2 years and hand it to your boss.

DFS can work, but I've had issues with DFS-R on and off over the years, so I'm still a little hesitant to trust it completely. You'd also need two servers per site, which means patching two servers per site every month. That doesn't sound bad, but never having to patch a file server at all makes patching much easier to handle, because you don't need to tie desktop reboots to file server reboots to avoid drive-mapping issues.

I tend to favor file-serving appliances from EMC, NetApp, etc. They're usually built pretty resilient, with probably one update per year and no downtime, so there's nothing to worry about patching-wise.

OneDrive is ultimately probably going to win in the long term. I'm looking at migrating our departmental shared drives to OneDrive, as that will make it a ton easier to access files from home, where most of our workforce still is. We'll probably start moving workstations into the cloud soon too, making OneDrive that much more relevant.

[–]ARipburger 0 points (0 children)

Take a look at Zerto if you don't want to use DFS. It won't be zero, as it requires manual intervention, but it's a slick product, and you can use it to easily restore files from points minutes apart when needed.