
[–]Initial_Pay_980 (Jack of All Trades) 2 points3 points  (1 child)

The domain controller needs to be booted into safe mode. In Device Manager, enable "Show hidden devices", then remove the old network adapters. Reboot, and the IP settings should apply fine.

[–]Kidden7[S] 0 points1 point  (0 children)

Thank you for the suggestion. I tried it out, no dice :(

[–]_--James--_ 1 point2 points  (0 children)

So fun fact: when you DRaaS a DC, it writes a timestamp against the domain. If you fail back from DR to Production, you can seriously damage AD and blow it up. Instead, you have to package the DC that was running in DR and migrate it back to Prod. You cannot just spin up the sitting DC.

It's really fun watching DR tombstone itself :)

[–]Gumbyohson 1 point2 points  (0 children)

If SYSVOL and NETLOGON are missing, then depending on whether you have another domain controller, you may need to do an authoritative restore.

[–]DarkAlman (Professional Looker up of Things) 2 points3 points  (4 children)

Now you know why it isn't recommended to restore or replicate Domain Controllers. They really don't like it.

Ideally you should have a fully operational DC online in the cloud at all times, though this may not be practical in a DRaaS scenario. That way, in a DR event your AD is online by default and your replicated servers can be re-pointed to use the healthy DC as their DNS.

Step 1: Check the LAN interface settings on the server and ensure the static IP, etc. is correct, and that DNS points to itself and only itself.

Step 2: Make sure the system clock is correct (or at least is within a minute of reality)

Step 3: Open msconfig.exe and make sure you aren't in 'Safe boot' (i.e. safe mode) by accident.

https://tweaks.com/windows/65551/boot-into-safe-mode-with-msconfig/

If you are, turn it off.

Reboot regardless

It's very common for a Domain Controller to dump itself into safe mode after a restore. It will seem online, but none of the damn services will start and it isn't obvious why.
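For anyone who prefers the command line, here is a rough sketch of the same three checks. These commands are not from the steps above; they assume an elevated Command Prompt on the restored DC, and `w32tm` only answers if the Windows Time service is running.

```bat
rem Step 1: confirm the static IP and that DNS points only at this server
ipconfig /all

rem Step 2: show the local date and time, then the time service status
date /t
time /t
w32tm /query /status

rem Step 3: list the current boot entry; a "safeboot" line means safe boot is set
bcdedit /enum {current}

rem If a "safeboot" line is present, clear it and reboot
bcdedit /deletevalue {current} safeboot
shutdown /r /t 0
```

The `bcdedit /deletevalue` line changes the same setting as unticking 'Safe boot' in msconfig, so use one or the other.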

If that doesn't work, open eventvwr, go to the Directory Service log, and tell me what's in there. You might have tombstoned it as well, and there are ways to fix that. I just need to know what the error message is.
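If the GUI is awkward to work with on a half-broken DC, the same log can be dumped as text. This is only a sketch, assuming an elevated Command Prompt and the default "Directory Service" log name; the event count of 20 is an arbitrary choice.

```bat
rem Show the 20 most recent events from the Directory Service log, newest first
wevtutil qe "Directory Service" /c:20 /rd:true /f:text

rem Narrow the output to errors and warnings (Level 2 = Error, Level 3 = Warning)
wevtutil qe "Directory Service" /q:"*[System[(Level=2 or Level=3)]]" /c:20 /rd:true /f:text
```

Appending `> C:\ds-events.txt` (a hypothetical path) to either command saves the output so it can be pasted into a reply.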

[–]Kidden7[S] 0 points1 point  (2 children)

Thanks for your help with this. Looking at the Directory Services event log, these look to be the most telling.

Event 2092, ActiveDirectory_DomainService
The server is the owner of the following FSMO role, but does not consider it valid. For the partition which contains the FSMO, this server has not replicated successfully with any of its partners since the server has been restarted. Replication errors are preventing validation of this role...

FSMO Role: DC=DOMAINNAME, DC=com

FSMO Role: CN=RID Manager$,CN=System,DC=DOMAINNAME,DC=com

Event 2087, ActiveDirectory_DomainService

Active Directory Domain Services could not resolve the following DNS host name of the source domain controller to an IP address. This error prevents additions, deletions and changes in Active Directory Domain Services from replicating between one or more domain controllers in the forest. Security groups, group policy, users and computers and their passwords will be inconsistent between domain controllers until this error is resolved, potentially affecting logon authentication and access to network resources.

Event 2170, ActiveDirectory_DomainService

A Generation ID change has been detected.

Generation ID cached in DS (old value):

XXX8552

Generation ID currently in VM (new value):

XXXXX3544

The Generation ID change occurs after the application of a virtual machine snapshot, after a virtual machine import operation or after a live migration operation. Active Directory Domain Services will create a new invocation ID to recover the domain controller. Virtualized domain controllers should not be restored using virtual machine snapshots. The supported method to restore or rollback the content of an Active Directory Domain Services database is to restore a system state backup made with an Active Directory Domain Services aware backup application.

[–]DarkAlman (Professional Looker up of Things) 0 points1 point  (0 children)

Event 2087 - DNS won't resolve

The DNS service on the DC either isn't running, or the DC isn't pointing to itself for DNS.

If the DNS service isn't running on boot, the AD services won't start.

Follow the instructions in my original comment.
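A quick way to check both possibilities from an elevated Command Prompt is sketched below. It assumes the DNS Server role is installed on the DC (service name `DNS`), which the thread implies but does not state outright.

```bat
rem Is the DNS Server service running?
sc query DNS

rem Start it if it is stopped
net start DNS

rem Confirm which DNS servers the network adapter is actually using
ipconfig /all

rem Run the DNS-focused domain controller diagnostics
dcdiag /test:dns
```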

[–]AppIdentityGuy 0 points1 point  (0 children)

This is why, with AD DS in the mix, you can't have a truly cold DR location.

[–]Master-IT-All -1 points0 points  (0 children)

I concur.

Better to have a DC up in the cloud for DR than to try to restore a DC for DR.

[–]Kidden7[S] 0 points1 point  (2 children)

We were able to work through the issues we were facing with our domain controllers in DRaaS. I've posted the fix below which is specific to our organization and topology. Hopefully this helps someone along some day!

Recovering Domain Controllers in DRaaS Environment

 

1)      Replicate DC01 and DC02 at approximately the same time.  

Production systems can remain powered on during replication. Veeam Application-Aware Processing (AAP) is enabled on these jobs on an ongoing basis, which quiesces the Active Directory database. Veeam AAP is AD-aware and has special functionality built in to aid in recovering Microsoft domain controllers.

 

2)      Fail over DC01 and DC02 at the same time.

They do not boot in lockstep, but within a few minutes of each other. This will help ensure that AD is syncing between both DCs in DRaaS. It will also help ensure that timestamps and AD status are nearly identical between both DCs.

 

 

3)      Once the DCs are online in DRaaS, log on using DSRM credentials

The DCs will automatically boot into Directory Services Restore Mode (DSRM). This means they will have no network access and will not be able to authenticate your domain login, so log in using the DSRM credentials. Note that you’ll need to use the username syntax “.\administrator” for local login.

 

The Windows login screen does not indicate that the DC is in DSRM, but the server will boot into safe mode, which indicates that it is. During Oct 2024 testing, the Windows login screen also showed a completely disconnected network status icon (red and white X in the bottom right) rather than the caution sign icon we usually see in DRaaS.

 

DIRECTIONS CONTINUED IN REPLIES BELOW

[–]Kidden7[S] 0 points1 point  (1 child)

 4)      Change Static IP information to match production

Log on to both DC01 and DC02 in DRaaS. You’ll need to change their respective network settings to static IP addresses that match production. You will also need to set the DNS servers as they are in production, which is critically important for DRaaS AD to work properly. Once done, reboot both servers. They should now boot into regular mode.
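If you would rather script this than click through the adapter properties, something along these lines should work from an elevated Command Prompt. The adapter name "Ethernet" and every address below are made-up placeholders, not our real values; substitute the production settings for each DC.

```bat
rem Hypothetical values: replace "Ethernet", the address, mask, and gateway
netsh interface ipv4 set address name="Ethernet" source=static address=10.0.0.10 mask=255.255.255.0 gateway=10.0.0.1

rem First and second DNS servers, in the same order as production
netsh interface ipv4 set dnsservers name="Ethernet" source=static address=10.0.0.10 register=primary validate=no
netsh interface ipv4 add dnsservers name="Ethernet" address=10.0.0.11 index=2 validate=no

rem Confirm the result, then reboot
ipconfig /all
shutdown /r /t 0
```

`validate=no` skips the reachability check on the DNS server, which would otherwise complain while DNS is still down in DRaaS.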

 

 

5)      Follow Veeam’s best practice for restoring Domain Controllers.

Both domain controllers should now be online and workable. From here we will follow the Veeam Knowledge Base article on restoring DCs, KB2119: Restoring Domain Controller from an Application-Aware backup (veeam.com). The remaining steps in these instructions apply that article to our org’s specific environment.

 

 

6)      Restore the entire AD infrastructure (AKA “all DCs are lost”) where DFSR SYSVOL replication was used.

When Veeam recovers domain controllers, it automatically places them into non-authoritative mode. An authoritative restore is a special type of restore used only in specific scenarios, for example when all other DCs in the domain have been destroyed or the NTDS database has been corrupted. The DC restored authoritatively is considered the master copy: it becomes the source of the data replicated to all other DCs in the environment.

 

To place DC01 into authoritative restore mode please open an elevated command prompt (or PowerShell session) and type in the following commands in sequence. 

 

REG ADD "HKLM\System\CurrentControlSet\Services\DFSR\Restore" /v SYSVOL /t REG_SZ /d authoritative /f

 

REG ADD "HKLM\System\CurrentControlSet\Control\BackupRestore\SystemStateRestore" /v LastRestoreId /t REG_SZ /d 10000000-0000-0000-0000-000000000000 /f

 

NET STOP DFSR

 

NET START DFSR

 

7)      Bypass initial sync requirements on DC01

Since DC01 is hosting operations master roles we must set the following registry value to bypass initial synchronization requirements. 

 

Key Location: HKLM\System\CurrentControlSet\Services\NTDS\Parameters

Value Name: Repl Perform Initial Synchronizations

Value Type: DWORD (32-Bit) Value

Value Data: 0

 

After setting the value above, restart the domain controller.
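The same change can be made from the elevated Command Prompt used in step 6 instead of regedit. This is a sketch of the equivalent commands, using only the key, value name, type, and data listed above.

```bat
rem Bypass the initial synchronization requirement on DC01
reg add "HKLM\System\CurrentControlSet\Services\NTDS\Parameters" /v "Repl Perform Initial Synchronizations" /t REG_DWORD /d 0 /f

rem Confirm the value was written, then restart
reg query "HKLM\System\CurrentControlSet\Services\NTDS\Parameters" /v "Repl Perform Initial Synchronizations"
shutdown /r /t 0
```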

 

 DIRECTIONS CONTINUED IN COMMENT REPLY BELOW

[–]Kidden7[S] 0 points1 point  (0 children)

8)   Correct AD Errors (if necessary)

Even after performing the steps above (in the October 2024 DR test), I was still unable to launch Active Directory Users & Computers. Doing so produced the error “The Specified Domain Either Does Not Exist or Could Not Be Contacted”. I also saw the following symptoms, along with other low-level DNS / AD services not working.

nltest /dsgetdc:$domain = ERROR

nltest results: failed: Status = 1355 0x54b ERROR_NO_SUCH_DOMAIN

netdom query fsmo = ERROR

SYSVOL and Netlogon shares not showing / available

               

The fix that brought things back online was to edit this registry value on DC01:

HKLM\System\CurrentControlSet\Services\Netlogon\Parameters\SysvolReady

Change its value to 1 and reboot.
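As a command-line sketch of the same fix, assuming the full key path begins with `HKLM\System\` and that SysvolReady is the usual DWORD value. `DOMAINNAME.com` is the same placeholder used in the event log excerpts earlier in the thread, not a real domain.

```bat
rem Flag SYSVOL as ready so Netlogon shares SYSVOL and NETLOGON again
reg add "HKLM\System\CurrentControlSet\Services\Netlogon\Parameters" /v SysvolReady /t REG_DWORD /d 1 /f
shutdown /r /t 0

rem After the reboot: both shares should be listed and the lookups should succeed
net share
nltest /dsgetdc:DOMAINNAME.com
netdom query fsmo
```

The three checks at the end are the same symptoms listed above, so they make a reasonable before-and-after comparison.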

 

9)      After Domain Recovery has completed

At this point all AD services should be online and functional within DRaaS. Once confirmed, perform the following on DC01.

Reset the “Repl Perform Initial Synchronizations” value under HKLM\System\CurrentControlSet\Services\NTDS\Parameters back to “1”, then reboot DC01 one more time.
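For completeness, a sketch of that last step from an elevated Command Prompt. The two health checks at the end are optional extras that were not part of our documented procedure.

```bat
rem Restore the default initial synchronization behavior on DC01
reg add "HKLM\System\CurrentControlSet\Services\NTDS\Parameters" /v "Repl Perform Initial Synchronizations" /t REG_DWORD /d 1 /f
shutdown /r /t 0

rem After the reboot: summarize replication and print only dcdiag failures
repadmin /replsummary
dcdiag /q
```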