In a comment on my article about installing updates on Exchange 2010 DAG members, Greg asks:
I would like to know how many Exchange admins [ahem] server admins actually follow these steps when they patch, we have a lab where we just patch Server A in a two multi role server DAG reboot it, then patch Server B and reboot it, then balance out the databases evenly among them and call it an evening, no problems so far
To be fair we didn’t go to Exchange 2010 until SP3 so maybe things are more better with SP 3 but we don’t bother with any of the procedures above and use Exchanges built in intelligence.
Now Greg does say that this is how they patch their lab servers, so it isn’t clear whether this is how they treat their production servers. However, it is still a topic worth discussing.
The short answer, from my point of view, is that running the DAG maintenance scripts is the wisest course of action. And I’m sure that anyone who has experienced a scenario where those scripts fail to switch over a database due to some other underlying condition at the time would agree with me.
For anyone else who is just running them because it looks like “the way it is done”, and is curious why this is the wisest course of action, let’s look briefly at the possible consequences of not running the DAG maintenance scripts during server maintenance.
Consider that Exchange Server is an application running on a Windows server. When someone tells that server to shut down or restart, Windows is going to shut down or restart. In most situations the Exchange Server gets little say in the matter.
In the process of Windows Server stopping Exchange services, the Primary Active Manager will try to switch over any active mailbox database copies to another DAG member. In a simple two-member DAG it is pretty obvious where the active database copy will end up. In larger DAGs there is a little more to it. If you want to dive into the full details, read up on the Best Copy Selection process.
If you do read up on it, you might notice the term “lossy failover”.
A lossy failover is a potential data loss scenario, and the default AutoDatabaseMountDial setting of “GoodAvailability” allows lossy failovers to occur. Exchange Server will automatically attempt to recover data lost in a lossy failover, but that recovery is not guaranteed.
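To make the AutoDatabaseMountDial behaviour a little more concrete, here is a rough sketch in Python (not Exchange code) of the decision it represents. The thresholds are the copy queue lengths Microsoft documents for each setting: 0 logs for Lossless, up to 6 for GoodAvailability, and up to 12 for BestAvailability. The function names and the example queue length of 4 are my own illustrative assumptions, and the real process is more involved (for example, Exchange first attempts to copy the last logs from the failed server), so treat this as a simplification.

```python
# Maximum copy queue length (in log files) at which a passive copy is still
# allowed to mount automatically, per AutoDatabaseMountDial setting.
MOUNT_DIAL_MAX_COPY_QUEUE = {
    "Lossless": 0,
    "GoodAvailability": 6,   # the default setting
    "BestAvailability": 12,
}


def can_auto_mount(mount_dial, copy_queue_length):
    """Return True if a copy this far behind may mount automatically.

    Hypothetical helper: a simplified model of the mount dial check only.
    """
    return copy_queue_length <= MOUNT_DIAL_MAX_COPY_QUEUE[mount_dial]


def is_lossy(copy_queue_length):
    """A mount is lossy if any log files never reached the passive copy."""
    return copy_queue_length > 0


if __name__ == "__main__":
    missing_logs = 4  # assumed example: passive copy is 4 log files behind
    for dial in MOUNT_DIAL_MAX_COPY_QUEUE:
        mounts = can_auto_mount(dial, missing_logs)
        lossy = mounts and is_lossy(missing_logs)
        print(f"{dial}: mounts={mounts}, lossy={lossy}")
```

With the default setting, a copy that is a few log files behind still mounts on its own, which is exactly the lossy failover case described above.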
For a test lab this is not likely to be a concern, and most of the time I allow my own test lab servers to automatically patch and restart (on separate schedules) without my manual intervention.
From time to time there is a problem as a result of that, which requires me to apply a little brute force and occasionally accept some data loss to bring all databases online again.
In a production environment I would really hate to put myself in that situation, or know that others are putting themselves in that situation simply because they can’t be bothered running the DAG maintenance scripts. My recommendation is to always use the DAG maintenance scripts during patching or other server maintenance on your DAG members.