Friday, July 24, 2015

DFS-R SYSVOL Replication - Performing an Authoritative Restore

Many articles document the correct procedure for this, but I have had several of these cases come up recently, so I figured it would be best to collect the fixes I've had to put in place.

Failure Reasons

  • Unclean Shutdown - Microsoft changed the default behavior so that if DFS-R detects a dirty shutdown, it DOES NOT resume replication on its own. This is obviously very bad if you don't regularly check your event logs.
  • Last Contact Too Old - There is also a limit (MaxOfflineTimeInDays) on how long the database can go without talking to another DFS-R peer. If you don't catch this in time, it will also stop replication.
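
To make the second failure reason concrete, here is a minimal Python sketch of the check DFS-R is effectively doing. It assumes the default MaxOfflineTimeInDays of 60, and the function name and the way you obtain the last-contact date are hypothetical; this is an illustration, not how the service is implemented.

```python
from datetime import datetime, timedelta

# Assumption: 60 days is the default MaxOfflineTimeInDays on a stock install.
DEFAULT_MAX_OFFLINE_DAYS = 60


def exceeds_offline_limit(last_contact, now, max_offline_days=DEFAULT_MAX_OFFLINE_DAYS):
    """Return True if the member has been out of contact longer than the limit.

    last_contact and now are datetime objects; how you find the last contact
    time (event logs, monitoring, etc.) is up to you and is not shown here.
    """
    return (now - last_contact) > timedelta(days=max_offline_days)


if __name__ == "__main__":
    last_seen = datetime(2015, 5, 1)
    today = datetime(2015, 7, 24)
    print("Over the default limit:", exceeds_offline_limit(last_seen, today))
    print("Over a raised limit of 100:", exceeds_offline_limit(last_seen, today, 100))
```

A check like this in your monitoring would flag a lagging domain controller well before the limit is reached.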


Helpful Commands

Please make sure you understand what you are doing before using these commands. These are tools to repair or work around DFS-R issues, but they can also introduce issues if used improperly.
  • If you are getting event log entries indicating that you need to resume replication, the entry should already contain the command. It will look something like this (with a different volumeGuid):
    wmic /namespace:\\root\microsoftdfs path dfsrVolumeConfig where volumeGuid="B4A015E2-A116-11DE-89FB-806E6F6E6963" call ResumeReplication
  • If your event log shows ID 4012 saying that the server has been disconnected for too long, then you may want to (temporarily) raise the limit:
    wmic.exe /namespace:\\root\microsoftdfs path DfsrMachineConfig set MaxOfflineTimeInDays=100
  • If you suffered from this due to a dirty shutdown, you can enable auto recovery to prevent future issues:
    wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set StopReplicationOnAutoRecovery=FALSE
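
If you end up doing this on several servers, it can help to generate the command lines instead of retyping them. The following is a small Python sketch that only builds the three command strings shown above; the function names are my own, and it does not run anything.

```python
import re

NAMESPACE = r"\\root\microsoftdfs"

# A volume GUID as it appears in the event log, without braces.
GUID_RE = re.compile(r"^[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}$")


def resume_replication_cmd(volume_guid):
    """Build the ResumeReplication command for one volume GUID."""
    if not GUID_RE.match(volume_guid):
        raise ValueError("not a valid volume GUID: %r" % volume_guid)
    return (
        f"wmic /namespace:{NAMESPACE} path dfsrVolumeConfig "
        f'where volumeGuid="{volume_guid}" call ResumeReplication'
    )


def set_max_offline_days_cmd(days):
    """Build the command that raises (or lowers) MaxOfflineTimeInDays."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return (
        f"wmic.exe /namespace:{NAMESPACE} path DfsrMachineConfig "
        f"set MaxOfflineTimeInDays={int(days)}"
    )


def set_auto_recovery_cmd(enabled):
    """Build the command that turns auto recovery on or off.

    Auto recovery is ON when StopReplicationOnAutoRecovery is FALSE.
    """
    value = "FALSE" if enabled else "TRUE"
    return (
        f"wmic /namespace:{NAMESPACE} path dfsrmachineconfig "
        f"set StopReplicationOnAutoRecovery={value}"
    )


if __name__ == "__main__":
    print(resume_replication_cmd("B4A015E2-A116-11DE-89FB-806E6F6E6963"))
    print(set_max_offline_days_cmd(100))
    print(set_auto_recovery_cmd(True))
```

Paste the printed lines into an Administrator command prompt on the affected server, using the volumeGuid from your own event log.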


Authoritative Restore Procedure

  1. Manually back up your SYSVOL; it's typically in C:\WINDOWS\SYSVOL.
  2. Throughout this process, BE PATIENT. I have seen some of these steps take 10 minutes to take full effect. If you don't see the events right away, do not start the process over again; just wait.
  3. Always use an Administrator command prompt.
  4. Stop the DFSR service on all domain controllers (net stop DFSR)
  5. Run adsiedit.msc, connect to the Default Naming Context, and drill down to your Domain Controllers OU
  6. Pick a domain controller that will be your authoritative restore source. This should be the one with the up-to-date copy of SYSVOL. Under that domain controller's object, drill down through CN=DFSR-LocalSettings to CN=Domain System Volume and then double click CN=SYSVOL Subscription.
  7. Set msDFSR-Enabled to FALSE and msDFSR-Options to 1, then press OK.
  8. On all other domain controllers, perform the same procedure, but leave msDFSR-Options unset (only msDFSR-Enabled changes to FALSE).
  9. On the authoritative domain controller (the one you picked in step 6):
    1. Force an AD replication (repadmin /syncall /AdP)
    2. Start DFSR (net start DFSR)
    3. Wait for event 4114 signaling that it has stopped replication
    4. Go back into adsiedit and, only on the authoritative domain controller, set msDFSR-Enabled to TRUE
    5. Force an AD replication (repadmin /syncall /AdP)
    6. Run dfsrdiag pollad
    7. Wait for event 4602 signaling that it started replication, then confirm with net share that SYSVOL and NETLOGON are being shared.
  10. Now perform this task on every other domain controller:
    1. Start DFSR (net start DFSR)
    2. Wait for event 4114 signaling that it has stopped replication
    3. Go back into adsiedit and set msDFSR-Enabled to TRUE for this domain controller only
    4. Force an AD replication (repadmin /syncall /AdP)
    5. Run dfsrdiag pollad
    6. Wait for events 4614 and 4604 signaling that it has completed initial replication (non-authoritative members log these instead of 4602), then confirm with net share that SYSVOL and NETLOGON are being shared. On this particular step I have had it take a bit of time, and have had to re-run the pollad command after 5-10 minutes for it to actually work.
  11. You should now be fully replicated. If your issues were caused by unclean shutdowns, you might want to consider setting StopReplicationOnAutoRecovery to FALSE (see Helpful Commands above) so that DFS-R does not stop replication on recovery.
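
The attribute changes in the procedure above are easy to mix up, so here is a Python sketch that lists, in order, which object gets which values. It assumes your domain controllers live in the default Domain Controllers OU; the server names and domain in the example are hypothetical. It only prints a checklist and does not touch Active Directory.

```python
def sysvol_subscription_dn(dc_name, domain_dns):
    """Build the DN of a DC's SYSVOL Subscription object.

    Assumes the DC's computer object is in the default Domain Controllers OU.
    """
    domain_dn = ",".join("DC=" + part for part in domain_dns.split("."))
    return (
        "CN=SYSVOL Subscription,CN=Domain System Volume,"
        "CN=DFSR-LocalSettings,"
        f"CN={dc_name},OU=Domain Controllers,{domain_dn}"
    )


def restore_plan(authoritative_dc, other_dcs, domain_dns):
    """Return the ordered attribute edits as (phase, dc, dn, attrs) tuples.

    Phase "disable" is done on every DC while DFSR is stopped everywhere.
    Phase "enable" is done one DC at a time, authoritative DC first.
    """
    plan = []
    for dc in [authoritative_dc] + list(other_dcs):
        attrs = {"msDFSR-Enabled": "FALSE"}
        if dc == authoritative_dc:
            # Only the authoritative source gets msDFSR-Options = 1.
            attrs["msDFSR-Options"] = "1"
        plan.append(("disable", dc, sysvol_subscription_dn(dc, domain_dns), attrs))
    for dc in [authoritative_dc] + list(other_dcs):
        plan.append(
            ("enable", dc, sysvol_subscription_dn(dc, domain_dns),
             {"msDFSR-Enabled": "TRUE"})
        )
    return plan


if __name__ == "__main__":
    # Hypothetical names, purely for illustration.
    for phase, dc, dn, attrs in restore_plan("DC01", ["DC02", "DC03"], "corp.example.com"):
        print(phase, dc)
        print("   ", dn)
        for name, value in attrs.items():
            print("   ", name, "=", value)
```

Between the two phases you still need the manual steps from the list: start DFSR, wait for the events, force AD replication, and run dfsrdiag pollad on each domain controller in turn.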
