
My Collection

This document is provided "as-is". Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. This document does not provide you with any legal rights to any intellectual property in any Microsoft product or product name. You may copy and use Terms of Use (http://technet.microsoft.com/cc300389.aspx) | Trademarks (http://www.microsoft.com/library/toolbar/3.0/trademarks/en-us.mspx)

Table Of Contents
Chapter 1
- Managing High Availability and Site Resilience: Exchange 2013 Help
- Managing Database Availability Groups: Exchange 2013 Help
- Managing Mailbox Database Copies: Exchange 2013 Help
- Monitoring Database Availability Groups: Exchange 2013 Help

Managing High Availability and Site Resilience


Exchange 2013
Applies to: Exchange Server 2013
Topic Last Modified: 2012-11-05

After you build, validate, and deploy a Microsoft Exchange Server 2013 high availability or site resilience solution, the solution transitions from the deployment phase to the operational phase of the overall solution lifecycle. The operational phase consists of several tasks, each of which relates to one of the following areas: database availability groups (DAGs), mailbox database copies, proactive monitoring, and switchovers and failovers.

Contents
- Database availability group management
- Mailbox database copy management
- Proactive monitoring
- Switchovers and failovers

Database availability group management


The operational management tasks associated with DAGs include:

- Creating one or more DAGs: Creating a DAG is typically a one-time procedure performed during the deployment phase of the solution lifecycle. However, there may be reasons to create DAGs during the operational phase, for example:
  - The DAG is configured for third-party replication mode, and you want to revert to using continuous replication. You can't convert a DAG back to continuous replication; you need to create a new DAG.
  - You have servers in multiple domains. All members of the same DAG must also be members of the same domain.
- Managing DAG membership: Managing DAG members is an infrequent task typically performed during the deployment phase of the solution lifecycle. However, because of the flexibility provided by incremental deployment, managing DAG membership may also be performed throughout the solution lifecycle.
- Configuring DAG properties: Each DAG has various properties that can be configured as needed. These properties include:
  - Witness server and witness directory: The witness server is a server outside the DAG that acts as a quorum voter when the DAG contains an even number of members. The witness directory is a directory created and shared on the witness server for use by the system in maintaining a quorum.
  - IP addresses: Each DAG will have one or more IPv4 addresses, and optionally, one or more IPv6 addresses. The IP addresses assigned to the DAG are used by the DAG's underlying cluster. The number of IPv4 addresses assigned to the DAG equals the number of subnets that comprise the MAPI network used by the DAG. You can configure the DAG to use static IP addresses or to obtain addresses automatically by using Dynamic Host Configuration Protocol (DHCP).
  - Datacenter Activation Coordination mode: Datacenter Activation Coordination mode is a property setting on a DAG that's designed to prevent split brain conditions at the database level in a scenario in which you're restoring service to a primary datacenter after a datacenter switchover has been performed. For more information, see Datacenter Activation Coordination Mode.
  - Alternate witness server and alternate witness directory: The alternate witness server and alternate witness directory are values that you can preconfigure as part of the planning process for a datacenter switchover. These refer to the witness server and witness directory that will be used when a datacenter switchover has been performed.
  - Replication port: By default, all DAGs use TCP port 64327 for continuous replication. You can modify the DAG to use a different TCP port for replication by using the ReplicationPort parameter of the Set-DatabaseAvailabilityGroup cmdlet.
  - Network discovery: You can force the DAG to rediscover networks and network interfaces. This operation is used when you add or remove networks or introduce new subnets. Rediscovery of all DAG networks can be forced by using the DiscoverNetworks parameter of the Set-DatabaseAvailabilityGroup cmdlet.
  - Network compression: By default, DAGs use compression only between DAG networks on different subnets. You can enable compression for all DAG networks or for seeding operations only, or you can disable compression for all DAG networks.
  - Network encryption: By default, DAGs use encryption only between DAG networks on different subnets. You can enable encryption for all DAG networks or for seeding operations only, or you can disable encryption for all DAG networks.
- Shutting down DAG members: The Exchange 2013 high availability solution is integrated with the Windows shutdown process.
If an administrator or application initiates a shutdown of a Windows server in a DAG that has a mounted database that's replicated to one or more DAG members, the system will try to activate another copy of the mounted databases prior to allowing the shutdown process to complete. However, this new behavior doesn't guarantee that all of the databases on the server being shut down will experience a lossless activation. As a result, it's a best practice to perform a server switchover prior to shutting down a server that's a member of a DAG.

For detailed steps about how to create a DAG, see Create a Database Availability Group. For detailed steps about how to configure DAGs and DAG properties, see Configure Database Availability Group Properties. For more information about each of the preceding management tasks, and about managing DAGs in general, see Managing Database Availability Groups.
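The replication port and network discovery settings described above could be applied from the Exchange Management Shell roughly as follows. This is a minimal sketch rather than a definitive procedure: the DAG name DAG1 and the port number 63132 are placeholder assumptions, not values from this topic.

```powershell
# Assumption: a DAG named DAG1 already exists; the name is a placeholder.
# Change the TCP port used for continuous replication (the default is 64327).
# 63132 is an arbitrary example port.
Set-DatabaseAvailabilityGroup -Identity DAG1 -ReplicationPort 63132

# Force rediscovery of DAG networks after adding or removing networks or subnets.
Set-DatabaseAvailabilityGroup -Identity DAG1 -DiscoverNetworks

# Review the configured port.
Get-DatabaseAvailabilityGroup -Identity DAG1 | Format-List Name,ReplicationPort
```

If the replication port is changed, any firewall exceptions on the DAG members probably need to be reviewed so that the new port is allowed.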

Mailbox database copy management


The operational management tasks associated with mailbox database copies include:

- Adding mailbox database copies: When you add a copy of a mailbox database, continuous replication is automatically enabled between the existing database and the database copy.
- Configuring mailbox database copy properties: You can configure a variety of properties, such as the database activation policy, the amount of time, if any, for replay lag and truncation lag, and the activation preference for the database copy.
- Suspending or resuming a mailbox database copy: You can suspend a mailbox database copy in preparation for seeding, or for other forms of maintenance. You can also suspend a mailbox database copy for activation only. This configuration prevents the system from automatically activating the copy as a result of a failure, but it still allows the system to keep the database copy up to date with log shipping and replay.
- Updating a mailbox database copy: Updating, also known as seeding, is the process in which a copy of a mailbox database is added to another Mailbox server. This becomes the baseline database for the copy. After the initial seed of the baseline database copy, only in rare circumstances will the database need to be seeded again.
- Activating a mailbox database copy: Activating is the process of designating a specific passive copy as the new active copy of a mailbox database. This process is referred to as a switchover. For more information, see "Switchovers and failovers" later in this topic.
- Removing a mailbox database copy: You can remove a mailbox database copy at any time, and occasionally it's necessary to do so. For example, you can't remove a Mailbox server from a DAG until all mailbox database copies are removed from the server. In addition, you must remove all copies of a mailbox database before you can change the path for a mailbox database.
For detailed steps about how to add a mailbox database copy, see Add a Mailbox Database Copy. For detailed steps about how to configure mailbox database copies, see Configure Mailbox Database Copy Properties. For detailed steps about how to remove a mailbox database copy, see Remove a Mailbox Database Copy. For more information about each of the preceding management tasks, and about managing mailbox database copies in general, see Managing Mailbox Database Copies.
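The copy management tasks above map onto a small set of Exchange Management Shell cmdlets. The sketch below illustrates them under stated assumptions: the cmdlet names are not given in this section of the topic, and DB1, MBX2, and the suspend comment are placeholder values.

```powershell
# Assumptions: database DB1 is active on another DAG member, and MBX2 is a
# Mailbox server in the same DAG. All names are placeholders.

# Add a passive copy of DB1 on MBX2; continuous replication starts automatically.
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX2 -ActivationPreference 2

# Suspend for activation only: the copy is not activated automatically after a
# failure, but log shipping and replay continue.
Suspend-MailboxDatabaseCopy -Identity DB1\MBX2 -ActivationOnly

# Allow the copy to be activated again.
Resume-MailboxDatabaseCopy -Identity DB1\MBX2

# Reseed: suspend the copy fully, then update it from the active copy.
Suspend-MailboxDatabaseCopy -Identity DB1\MBX2 -SuspendComment "Reseed"
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -DeleteExistingFiles

# Remove the copy, for example before removing MBX2 from the DAG.
Remove-MailboxDatabaseCopy -Identity DB1\MBX2
```

The commands are shown together only for illustration; in practice each would be run when the corresponding task is needed.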

Proactive monitoring
Making sure that your servers are operating reliably and that your database copies are healthy are key objectives for daily messaging operations. Exchange 2013 includes a number of features that can be used to perform a variety of health monitoring tasks for DAGs and mailbox database copies, including:

- Get-MailboxDatabaseCopyStatus
- Test-ReplicationHealth
- Crimson channel event logging

In addition to monitoring health and status, it's also critical to monitor for situations that can compromise availability. For example, we recommend that you monitor the redundancy of your replicated databases. It's critical to avoid situations where you're down to a single copy of a database. This scenario should be treated with the highest priority and resolved as soon as possible.

For more detailed information about monitoring the health and status of DAGs and mailbox database copies, see Monitoring Database Availability Groups.
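As a rough illustration of the monitoring cmdlets listed above, the following sketch checks copy status, runs the replication health tests, and flags databases that are down to a single available copy. The server name MBX1 is a placeholder, and the redundancy check is one possible approach under these assumptions, not a built-in Exchange feature.

```powershell
# Assumption: MBX1 is a placeholder Mailbox server name.
# Show the status and queue lengths of every database copy on a server.
Get-MailboxDatabaseCopyStatus -Server MBX1 |
    Format-Table Name,Status,CopyQueueLength,ReplayQueueLength -AutoSize

# Run the replication health checks against the same server.
Test-ReplicationHealth -Identity MBX1

# Flag databases with fewer than two copies in a Healthy or Mounted state.
$databases = Get-MailboxDatabase
foreach ($db in $databases) {
    $available = @(Get-MailboxDatabaseCopyStatus -Identity $db.Name |
        Where-Object { $_.Status -eq 'Healthy' -or $_.Status -eq 'Mounted' })
    if ($available.Count -lt 2) {
        Write-Warning "$($db.Name) has $($available.Count) available copy or copies"
    }
}
```

The databases are collected into a variable before the loop so that the two cmdlets don't run in the same pipeline, which can cause errors in remote Shell sessions.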

Switchovers and failovers


A switchover is a manual process in which an administrator activates one or more mailbox database copies. Switchovers, which can occur at the database or server level, are typically performed as part of preparation for maintenance activities. Switchover management involves performing database or server switchovers as needed. For example, if you need to perform maintenance on a Mailbox server in a DAG, you would first perform a server switchover so that the server doesn't host any active mailbox database copies. For detailed steps about how to perform a database switchover, see Activate a Mailbox Database Copy. Switchovers can also be performed at the datacenter level.

A failover is the automatic activation by the system of one or more database copies in reaction to a failure. For example, the loss of a disk drive in a RAID-less environment will trigger a database failover. The loss of the MAPI network or a power failure will trigger a server failover.
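A database switchover and a server switchover might look like the following in the Exchange Management Shell. This is a sketch under stated assumptions: the Move-ActiveMailboxDatabase cmdlet isn't named in this section, and DB1, MBX1, and MBX2 are placeholder names.

```powershell
# Assumptions: DB1, MBX1, and MBX2 are placeholder names.
# Database switchover: activate the copy of DB1 hosted on MBX2.
Move-ActiveMailboxDatabase -Identity DB1 -ActivateOnServer MBX2

# Server switchover: move all active databases off MBX1 before maintenance,
# letting the system choose a target copy for each database.
Move-ActiveMailboxDatabase -Server MBX1
```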


Managing Database Availability Groups


Exchange 2013
Applies to: Exchange Server 2013
Topic Last Modified: 2013-01-08

A database availability group (DAG) is a set of up to 16 Microsoft Exchange Server 2013 Mailbox servers that provides automatic, database-level recovery from a database, server, or network failure. DAGs use continuous replication and a subset of Windows failover clustering technologies to provide high availability and site resilience. Mailbox servers in a DAG monitor each other for failures. When a Mailbox server is added to a DAG, it works with the other servers in the DAG to provide automatic, database-level recovery from database failures.

When you create a DAG, it's initially empty, and a directory object is created in Active Directory that represents the DAG. The directory object is used to store relevant information about the DAG, such as server membership information. When you add the first server to a DAG, a failover cluster is automatically created for the DAG. In addition, the infrastructure that monitors the servers for network or server failures is initiated. The failover cluster heartbeat mechanism and cluster database are then used to track and manage information about the DAG that can change quickly, such as database mount status, replication status, and last mounted location.

Contents
- Creating DAGs
- DAG membership
- Configuring DAG properties
- DAG networks
- Configuring DAG members
- Performing maintenance on DAG members
- Shutting down DAG members
- Installing update rollups on DAG members

Creating DAGs
A DAG can be created using the New Database Availability Group wizard in the Exchange Administration Center (EAC), or by running the New-DatabaseAvailabilityGroup cmdlet in the Exchange Management Shell. When creating a DAG, you provide a name for the DAG, and optional witness server and witness directory settings. In addition, one or more IP addresses are assigned to the DAG, either by using static IP addresses or by allowing the DAG to be automatically assigned the necessary IP addresses using Dynamic Host Configuration Protocol (DHCP). You can manually assign IP addresses to the DAG by using the DatabaseAvailabilityGroupIpAddresses parameter. If you omit this parameter, the DAG attempts to obtain an IP address by using a DHCP server on your network. For detailed steps about how to create a DAG, see Create a Database Availability Group.

When you create a DAG, an empty object representing the DAG with the name you specified and an object class of msExchMDBAvailabilityGroup is created in Active Directory.

DAGs use a subset of Windows failover clustering technologies, such as the cluster heartbeat, cluster networks, and cluster database (for storing data that changes or can change quickly, such as database state changes from active to passive or the reverse, or from mounted to dismounted or the reverse). Because DAGs rely on Windows failover clustering, they can only be created on Exchange 2013 Mailbox servers running the Windows Server 2008 R2 Enterprise or Datacenter operating system or the Windows Server 2012 Standard or Datacenter operating system.

Note: The failover cluster created and used by the DAG must be dedicated to the DAG. The cluster can't be used for any other high availability solution or for any other purpose. For example, the failover cluster can't be used to cluster other applications or services. Using a DAG's underlying failover cluster for purposes other than the DAG isn't supported.
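For illustration, creating a DAG with the New-DatabaseAvailabilityGroup cmdlet might look like the sketch below. The DAG name, witness server, witness directory, and IP address are placeholder assumptions chosen for the example.

```powershell
# Assumptions: DAG1, CAS3, C:\DAG1, and 10.0.0.8 are placeholder values.
# Create a DAG with an explicit witness server, witness directory, and static IP address.
New-DatabaseAvailabilityGroup -Name DAG1 -WitnessServer CAS3 `
    -WitnessDirectory C:\DAG1 -DatabaseAvailabilityGroupIpAddresses 10.0.0.8

# Alternative: omit the IP address parameter so the DAG requests an address
# through DHCP, and omit the witness settings so the defaults are used.
# New-DatabaseAvailabilityGroup -Name DAG1
```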

DAG witness server and witness directory


When creating a DAG, you need to specify a name for the DAG no longer than 15 characters that's unique within the Active Directory forest. In addition, each DAG is configured with a witness server and witness directory. The witness server and its directory are used only when there's an even number of members in the DAG and then only for quorum purposes. You don't need to create the witness directory in advance. Exchange automatically creates and secures the directory for you on the witness server. The directory shouldn't be used for any purpose other than for the DAG witness server.

The requirements for the witness server are as follows:
- The witness server can't be a member of the DAG.
- The witness server must be in the same Active Directory forest as the DAG.
- The witness server must be running Windows Server 2012, Windows Server 2008 R2, Windows Server 2008, Windows Server 2003 R2, or Windows Server 2003.
- A single server can serve as a witness for multiple DAGs. However, each DAG requires its own witness directory.

We recommend that you use an Exchange 2013 Client Access server in the Active Directory site containing the DAG. This allows the witness server and directory to remain under the control of an Exchange administrator. Regardless of what server is used as the witness server, if the Windows Firewall is enabled on the intended witness server, you must enable the Windows Firewall exception for File and Printer Sharing.

Important: If the witness server you specify isn't an Exchange 2013 or Exchange 2010 server, you must add the Exchange Trusted Subsystem universal security group (USG) to the local Administrators group on the witness server prior to creating the DAG. These security permissions are necessary to ensure that Exchange can create a directory and share on the witness server as needed.

Neither the witness server nor the witness directory needs to be fault tolerant or use any form of redundancy or high availability.
There's no need to use a clustered file server for the witness server or employ any other form of resiliency for the witness server. There are several reasons for this. With larger DAGs (for example, six members or more), several failures are required before the witness server is needed. Because a six-member DAG can tolerate as many as two voter failures without losing quorum, it would take as many as three voters failing before the witness server would be needed to maintain a quorum. Also, if there's a failure that affects your current witness server (for example, you lose the witness server because of a hardware failure), you can use the Set-DatabaseAvailabilityGroup cmdlet to configure a new witness server and witness directory (provided you have a quorum).

Note: You can also use the Set-DatabaseAvailabilityGroup cmdlet to configure the witness server and witness directory in the original location if the witness server lost its storage or if someone changed the witness directory or share permissions.

As a best practice, in an environment where a DAG is extended across multiple datacenters (and Active Directory sites) and configured for site resilience, we recommend that you use a witness server in your primary datacenter (the datacenter containing the majority of your user population). If each datacenter has a similar number of users, the datacenter you choose to host the witness server is considered to be the primary datacenter from the solution's perspective. If the witness server is in the datacenter with the majority of the client population, the majority of clients retain access after a failure. If the datacenter is remote to large user populations, this may affect your decision. You would then need to determine if there's a requirement for the primary datacenter to remain healthy and active if there's a loss of wide area network (WAN) connectivity to the other two datacenters. In that event, the witness server should also be in the primary datacenter.

Although it's supported to use a witness server in a third datacenter, we don't recommend this scenario. From an Exchange perspective, this configuration doesn't provide you with greater availability. It's important that you examine the critical path factors if you use a witness server in a third datacenter. For example, if the WAN connection between the primary datacenter and the second and third datacenter fails, the solution in the primary datacenter becomes unavailable.
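Replacing a failed witness server with the Set-DatabaseAvailabilityGroup cmdlet, as described above, could be sketched as follows. The names DAG1 and FS2 and the directory path are placeholder assumptions, and the DAG is assumed to still have quorum.

```powershell
# Assumptions: DAG1, FS2, and D:\DAG1Witness are placeholder values.
# Point the DAG at a replacement witness server and witness directory.
Set-DatabaseAvailabilityGroup -Identity DAG1 -WitnessServer FS2 -WitnessDirectory D:\DAG1Witness

# Confirm the configured witness settings.
Get-DatabaseAvailabilityGroup -Identity DAG1 | Format-List Name,WitnessServer,WitnessDirectory
```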

Specifying a witness server and witness directory during DAG creation


When creating a DAG, you must provide a name for the DAG. You can optionally also specify a witness server and witness directory. If you specify a witness server, we recommend that you use an Exchange 2013 Client Access server, because this allows an Exchange administrator to be aware of the availability of the witness server.

When creating a DAG, the following combinations of options and behaviors are available:
- You can specify only a name for the DAG, and leave the Witness server and Witness directory fields blank. In this scenario, the wizard searches for a Client Access server that doesn't have the Mailbox server installed, and it automatically creates the default directory (%SystemDrive%:\DAGFileShareWitnesses\<DAGFQDN>) and default share (<DAGFQDN>) on that server and uses that Client Access server as the witness server. For example, consider the witness server CAS3 on which the operating system has been installed onto drive C. The DAG DAG1 in the domain contoso.com would use a default witness directory of C:\DAGFileShareWitnesses\DAG1.contoso.com, which would be shared as \\CAS3\DAG1.contoso.com.
- You can specify a name for the DAG, the witness server that you want to use, and the directory you want created and shared on the witness server.
- You can specify a name for the DAG and the witness server that you want to use, and leave the Witness directory field blank. In this scenario, the wizard creates the default directory on the specified witness server.
- You can specify a name for the DAG, leave the Witness server field blank, and specify the directory you want created and shared on the witness server. In this scenario, the wizard searches for a Client Access server that doesn't have the Mailbox server installed, and it automatically creates the specified directory on that server, shares the directory, and uses that Client Access server as the witness server.

When a DAG is formed, it initially uses the Node Majority quorum model.
When the second Mailbox server is added to the DAG, the quorum is automatically changed to a Node and File Share Majority quorum model. When this change occurs, the DAG's cluster begins using the witness server for maintaining quorum. If the witness directory doesn't exist, Exchange automatically creates it, shares it, and provisions the share with full control permissions for the cluster name object (CNO) computer account for the DAG.

Note: Using a file share that's part of a Distributed File System (DFS) namespace isn't supported.

If Windows Firewall is enabled on the witness server before the DAG is created, it may block the creation of the DAG. Exchange uses Windows Management Instrumentation (WMI) to create the directory and file share on the witness server. If Windows Firewall is enabled on the witness server and there are no firewall exceptions configured for WMI, the New-DatabaseAvailabilityGroup cmdlet fails with an error. If you specify a witness server, but not a witness directory, you receive the following error message.

    The task was unable to create the default witness directory on server <Server Name>. Please manually specify a witness directory.

If you specify a witness server and witness directory, you receive the following warning message.

    Unable to access file shares on witness server 'ServerName'. Until this problem is corrected, the database availability group may be more vulnerable to failures. You can use the Set-DatabaseAvailabilityGroup cmdlet to try the operation again. Error: The network path was not found.

If Windows Firewall is enabled on the witness server after the DAG is created but before servers are added, it may block the addition or removal of DAG members. If Windows Firewall is enabled on the witness server and there are no firewall exceptions configured for WMI, the Add-DatabaseAvailabilityGroupServer cmdlet displays the following warning message.

    Failed to create file share witness directory 'C:\DAGFileShareWitnesses\DAG_FQDN' on witness server 'ServerName'. Until this problem is corrected, the database availability group may be more vulnerable to failures. You can use the Set-DatabaseAvailabilityGroup cmdlet to try the operation again. Error: WMI exception occurred on server 'ServerName': The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)

To resolve the preceding error and warnings, do one of the following:
- Manually create the witness directory and share on the witness server, and assign the CNO for the DAG full control for the directory and share.
- Enable the WMI exception in Windows Firewall.
- Disable Windows Firewall.
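One way to enable the firewall exceptions mentioned above is with netsh on the witness server, followed by a retry from the Exchange Management Shell. This is a hedged sketch: the netsh commands and rule group names are not part of this topic, the group names shown are the English-language defaults and may differ on localized systems, and DAG1 is a placeholder.

```powershell
# Assumption: run from an elevated prompt on the witness server.
# Enable the built-in WMI and File and Printer Sharing rule groups.
netsh advfirewall firewall set rule group="Windows Management Instrumentation (WMI)" new enable=yes
netsh advfirewall firewall set rule group="File and Printer Sharing" new enable=yes

# Then, from the Exchange Management Shell, retry the witness operation.
# DAG1 is a placeholder DAG name.
Set-DatabaseAvailabilityGroup -Identity DAG1
```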

DAG membership
After a DAG has been created, you can add servers to or remove servers from the DAG using the Manage Database Availability Group wizard in the EAC, or using the Add-DatabaseAvailabilityGroupServer or Remove-DatabaseAvailabilityGroupServer cmdlets in the Shell. For detailed steps about how to manage DAG membership, see Manage Database Availability Group Membership.

Note: Each Mailbox server that's a member of a DAG is also a node in the underlying cluster used by the DAG. As a result, at any one time, a Mailbox server can be a member of only one DAG.

If the Mailbox server being added to a DAG doesn't have the failover clustering component installed, the method used to add the server (for example, the Add-DatabaseAvailabilityGroupServer cmdlet or the Manage Database Availability Group wizard) installs the failover clustering feature.

When the first Mailbox server is added to a DAG, the following occurs:
- The Windows failover clustering component is installed, if it isn't already installed.
- A failover cluster is created using the name of the DAG. This failover cluster is used exclusively by the DAG, and the cluster must be dedicated to the DAG. Use of the cluster for any other purpose isn't supported.
- A CNO is created in the default computers container.
- The name and IP address of the DAG are registered as a Host (A) record in Domain Name System (DNS).
- The server is added to the DAG object in Active Directory.
- The cluster database is updated with information on the databases mounted on the added server.

In a large or multiple-site environment, especially one in which the DAG is extended to multiple Active Directory sites, you must wait for Active Directory replication of the DAG object containing the first DAG member to complete. If this Active Directory object isn't replicated throughout your environment, adding the second server may cause a new cluster (and new CNO) to be created for the DAG. This is because the DAG object appears empty from the perspective of the second member being added, thereby causing the Add-DatabaseAvailabilityGroupServer cmdlet to create a cluster and CNO for the DAG, even though these objects already exist. To verify that the DAG object containing the first DAG server has been replicated, use the Get-DatabaseAvailabilityGroup cmdlet on the second server being added to verify that the first server you added is listed as a member of the DAG.

When the second and subsequent servers are added to the DAG, the following occurs:
- The server is joined to the Windows failover cluster for the DAG.
- The quorum model is automatically adjusted:
  - A Node Majority quorum model is used for DAGs with an odd number of members.
  - A Node and File Share Majority quorum model is used for DAGs with an even number of members.
- The witness directory and share are automatically created by Exchange when needed.
- The server is added to the DAG object in Active Directory.
- The cluster database is updated with information about mounted databases.

Note: The quorum model change should happen automatically. However, if the quorum model doesn't automatically change to the proper model, you can run the Set-DatabaseAvailabilityGroup cmdlet with only the Identity parameter to correct the quorum settings for the DAG.
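The membership steps above, including the replication check before adding the second member, can be sketched with the cmdlets named in this section. The names DAG1, MBX1, and MBX2 are placeholder assumptions.

```powershell
# Assumptions: DAG1, MBX1, and MBX2 are placeholder names.
# Add the first member; this creates the DAG's underlying failover cluster.
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer MBX1

# On the second server, confirm that MBX1 is listed as a member before continuing.
Get-DatabaseAvailabilityGroup -Identity DAG1 | Format-List Name,Servers

# Add the second member; the quorum model should be adjusted automatically.
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer MBX2

# If the quorum model wasn't adjusted, rerun with only the Identity parameter.
Set-DatabaseAvailabilityGroup -Identity DAG1
```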

Pre-staging the cluster name object for a DAG


The CNO is a computer account created in Active Directory and associated with the cluster's Name resource. The cluster's Name resource is tied to the CNO, which is a Kerberos-enabled object that acts as the cluster's identity and provides the cluster's security context. The formation of the DAG's underlying cluster and the CNO for that cluster is performed when the first member is added to the DAG. When the first server is added to the DAG, remote PowerShell contacts the Microsoft Exchange Replication service on the Mailbox server being added. The Microsoft Exchange Replication service installs the failover clustering feature (if it isn't already installed) and begins the cluster creation process. The Microsoft Exchange Replication service runs under the LOCAL SYSTEM security context, and cluster creation is performed under this context.

Warning: If your DAG members are running Windows Server 2012, you must pre-stage the CNO prior to adding the first server to the DAG.

In environments where computer account creation is restricted, or where computer accounts are created in a container other than the default computers container, you can pre-stage and provision the CNO. You create and disable a computer account for the CNO, and then either:
- Assign full control of the computer account to the computer account of the first Mailbox server you're adding to the DAG.
- Assign full control of the computer account to the Exchange Trusted Subsystem USG.

Assigning full control of the computer account to the computer account of the first Mailbox server you're adding to the DAG ensures that the LOCAL SYSTEM security context will be able to manage the pre-staged computer account. Assigning full control of the computer account to the Exchange Trusted Subsystem USG can be used instead because the Exchange Trusted Subsystem USG contains the machine accounts of all Exchange servers in the domain.

For detailed steps about how to pre-stage and provision the CNO for a DAG, see Pre-Stage the Cluster Name Object for a Database Availability Group.
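Pre-staging the CNO could be scripted along the following lines. This is only a sketch under several assumptions that don't come from this topic: the Active Directory module for Windows PowerShell and the dsacls tool are available, and the DAG name, OU path, and CONTOSO domain are placeholders. The linked procedure remains the authoritative reference.

```powershell
# Assumptions: the ActiveDirectory module is installed, and DAG1, the OU path,
# and the CONTOSO domain are placeholder values.
Import-Module ActiveDirectory

# Create the CNO as a disabled computer account named after the DAG.
New-ADComputer -Name DAG1 -SamAccountName DAG1 `
    -Path "OU=Clusters,DC=contoso,DC=com" -Enabled $false

# Grant full control (GA = generic all) on the account to the
# Exchange Trusted Subsystem USG.
dsacls "CN=DAG1,OU=Clusters,DC=contoso,DC=com" /G "CONTOSO\Exchange Trusted Subsystem:GA"
```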

Removing servers from a DAG


Mailbox servers can be removed from a DAG by using the Manage Database Availability Group wizard in the EAC or the Remove-DatabaseAvailabilityGroupServer cmdlet in the Shell. Before a Mailbox server can be removed from a DAG, all replicated mailbox databases must first be removed from the server. If you attempt to remove a Mailbox server with replicated mailbox databases from a DAG, the task fails.

There are scenarios in which you must remove a Mailbox server from a DAG before performing certain operations. These scenarios include:
- Performing a server recovery operation: If a Mailbox server that's a member of a DAG is lost, or otherwise fails and is unrecoverable and needs replacement, you can perform a server recovery operation using the Setup /m:RecoverServer switch. However, before you can perform the recovery operation, you must first remove the server from the DAG using the Remove-DatabaseAvailabilityGroupServer cmdlet with the ConfigurationOnly parameter.
- Removing the database availability group: There may be situations in which you need to remove a DAG (for example, when disabling third-party replication mode). If you need to remove a DAG, you must first remove all servers from the DAG. If you attempt to remove a DAG that contains any members, the task fails.
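The removal order described above (database copies first, then the server, then the DAG) might be sketched as follows. The names DAG1, DB1, and MBX2 are placeholders, and the Remove-MailboxDatabaseCopy and Remove-DatabaseAvailabilityGroup cmdlets are assumptions in the sense that this section doesn't name them.

```powershell
# Assumptions: DAG1, DB1, and MBX2 are placeholder names.
# Remove the replicated copy hosted on MBX2, then remove MBX2 from the DAG.
Remove-MailboxDatabaseCopy -Identity DB1\MBX2
Remove-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer MBX2

# For a lost server that will be rebuilt with Setup /m:RecoverServer,
# remove only its configuration from the DAG instead.
# Remove-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer MBX2 -ConfigurationOnly

# After every member has been removed, the empty DAG can be removed.
# Remove-DatabaseAvailabilityGroup -Identity DAG1
```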

Configuring DAG properties


After servers have been added to the DAG, you can use the EAC or the Shell to configure the properties of a DAG, including the witness server and witness directory used by the DAG, and the IP addresses assigned to the DAG.

Configurable properties include:
- Witness server: The name of the server that you want to host the file share for the file share witness. We recommend that you specify a Client Access server as the witness server. This enables the system to automatically configure, secure, and use the share, as needed, and enables the messaging administrator to be aware of the availability of the witness server.
- Witness directory: The name of a directory that will be used to store file share witness data. This directory will automatically be created by the system on the specified witness server.
- Database availability group IP addresses: One or more IP addresses assigned to the DAG. These addresses can be configured using manually assigned static IP addresses, or they can be automatically assigned to the DAG using a DHCP server in your organization.

The Shell enables you to configure DAG properties that aren't available in the EAC, such as DAG IP addresses, network encryption and compression settings, network discovery, the TCP port used for replication, and alternate witness server and witness directory settings, and to enable Datacenter Activation Coordination mode.

For detailed steps about how to configure DAG properties, see Configure Database Availability Group Properties.
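Some of the Shell-only properties listed above could be set as in the following sketch. All values are placeholder assumptions: DAG1, CAS4, the directory path, and the IP addresses are examples, and DagOnly is shown as the value that turns on Datacenter Activation Coordination mode.

```powershell
# Assumptions: DAG1, CAS4, D:\DAG1, and the IP addresses are placeholder values.
# Assign static IP addresses, one per subnet of the MAPI network.
Set-DatabaseAvailabilityGroup -Identity DAG1 `
    -DatabaseAvailabilityGroupIpAddresses 10.0.0.8,192.168.0.8

# Preconfigure the alternate witness and enable Datacenter Activation Coordination mode.
Set-DatabaseAvailabilityGroup -Identity DAG1 -AlternateWitnessServer CAS4 `
    -AlternateWitnessDirectory D:\DAG1 -DatacenterActivationMode DagOnly
```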

DAG network encryption


DAGs support the use of encryption by leveraging the encryption capabilities of the Windows Server operating system. DAGs use Kerberos authentication between Exchange servers. Microsoft Kerberos security support provider (SSP) EncryptMessage and DecryptMessage APIs handle encryption of DAG network traffic. Microsoft Kerberos SSP supports multiple encryption algorithms. (For the complete list, see section 3.1.5.2, "Encryption Types" of Kerberos Protocol Extensions). The Kerberos authentication handshake selects the strongest encryption protocol supported in the list: typically Advanced Encryption Standard (AES) 256-bit, potentially with a SHA Hash-based Message Authentication Code (HMAC) to maintain integrity of the data. For details, see HMAC. Network encryption is a property of the DAG and not a DAG network. You can configure DAG network encryption using the Set-DatabaseAvailabilityGroup cmdlet in the Shell. The possible encryption settings for DAG network communications are shown in the following table.

DAG network communication encryption settings

Setting            Description
Disabled           Network encryption isn't used.
Enabled            Network encryption is used on all DAG networks for replication and seeding.
InterSubnetOnly    Network encryption is used on DAG networks when replicating across different subnets. This is the default setting.
SeedOnly           Network encryption is used on all DAG networks for seeding only.

DAG network compression


DAGs support built-in compression. When compression is enabled, DAG network communication uses XPRESS, which is the Microsoft implementation of the LZ77 algorithm. For details, see An Explanation of the Deflate Algorithm and section 3.1.4.11.1.2.1 "LZ77 Compression Algorithm" of Wire Format Protocol Specification. This is the same type of compression used in many Microsoft protocols, in particular, MAPI RPC compression between Microsoft Outlook and Exchange. As with network encryption, network compression is also a property of the DAG and not a DAG network. You configure DAG network compression by using the Set-DatabaseAvailabilityGroup cmdlet in the Shell. The possible compression settings for DAG network communications are shown in the following table.

DAG network communication compression settings


Setting            Description
Disabled           Network compression isn't used.
Enabled            Network compression is used on all DAG networks for replication and seeding.
InterSubnetOnly    Network compression is used on DAG networks when replicating across different subnets. This is the default setting.
SeedOnly           Network compression is used on all DAG networks for seeding only.
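Because encryption and compression are both properties of the DAG, a single Set-DatabaseAvailabilityGroup call can change them together. The following is a sketch only; DAG1 is a hypothetical DAG name, and the chosen values are examples rather than recommendations.

```powershell
# Hypothetical DAG name. Encrypt all replication and seeding traffic,
# and compress only seeding traffic.
Set-DatabaseAvailabilityGroup -Identity DAG1 -NetworkEncryption Enabled -NetworkCompression SeedOnly

# Confirm the current settings.
Get-DatabaseAvailabilityGroup -Identity DAG1 |
    Format-List Name, NetworkEncryption, NetworkCompression
```

To return to the defaults described in the tables, set both parameters back to InterSubnetOnly.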

DAG networks
A DAG network is a collection of one or more subnets used for either replication traffic or MAPI traffic. Each DAG contains a maximum of one MAPI network and zero or more replication networks. In a single network adapter configuration, the network is used for both MAPI and replication traffic. Although a single network adapter and path is supported, we recommend that each DAG have a minimum of two DAG networks. In a two-network configuration, one network is typically dedicated for replication traffic, and the other network is used primarily for MAPI traffic. You can also add network adapters to each DAG member and configure additional DAG networks as replication networks. Note: When using multiple replication networks, there's no way to specify an order of precedence for network use. Exchange randomly selects a replication network from the group of replication networks to use for log shipping. In Exchange 2010, manual configuration of DAG networks was necessary in many scenarios. By default in Exchange 2013, DAG networks are automatically configured by the system. Before you can create or modify DAG networks, you must first enable manual DAG network control by running the following command:

Set-DatabaseAvailabilityGroup <DAGName> -ManualDagNetworkConfiguration $true

After you've enabled manual DAG network configuration, you can use the New-DatabaseAvailabilityGroupNetwork cmdlet in the Shell to create a DAG network. For detailed steps about how to create a DAG network, see Create a Database Availability Group Network. You can use the Set-DatabaseAvailabilityGroupNetwork cmdlet in the Shell to configure DAG network properties. For detailed steps about how to configure DAG network properties, see Configure Database Availability Group Network Properties. Each DAG network has required and optional parameters to configure:

Network name: A unique name for the DAG network of up to 128 characters.

Network description: An optional description for the DAG network of up to 256 characters.

Network subnets: One or more subnets entered using a format of IPAddress/Bitmask (for example, 192.168.1.0/24 for Internet Protocol version 4 (IPv4) subnets; 2001:DB8:0:C000::/64 for Internet Protocol version 6 (IPv6) subnets).

Enable replication: In the EAC, select the check box to dedicate the DAG network to replication traffic and block MAPI traffic. Clear the check box to prevent replication from using the DAG network and to enable MAPI traffic. In the Shell, use the ReplicationEnabled parameter in the Set-DatabaseAvailabilityGroupNetwork cmdlet to enable and disable replication.

Note: Disabling replication for the MAPI network doesn't guarantee that the system won't use the MAPI network for replication. When all configured replication networks are offline, failed, or otherwise unavailable, and only the MAPI network remains (which is configured as disabled for replication), the system uses the MAPI network for replication.

The initial DAG networks (for example, MapiDagNetwork and ReplicationDagNetwork01) created by the system are based on the subnets enumerated by the Cluster service.
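The parameters described here map onto the New-DatabaseAvailabilityGroupNetwork cmdlet roughly as sketched below. The DAG name, network name, description, and subnet are hypothetical values chosen for illustration.

```powershell
# Manual DAG network control must be enabled before networks can be created or modified.
Set-DatabaseAvailabilityGroup -Identity DAG1 -ManualDagNetworkConfiguration $true

# Hypothetical network: an additional replication network on 10.0.2.0/24.
New-DatabaseAvailabilityGroupNetwork -DatabaseAvailabilityGroup DAG1 `
    -Name ReplicationDagNetwork02 `
    -Description "Second replication network" `
    -Subnets 10.0.2.0/24 `
    -ReplicationEnabled:$true
```

Name and Subnets correspond to the network name and network subnets properties, and ReplicationEnabled corresponds to the Enable replication check box in the EAC.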
Each DAG member must have the same number of network adapters, and each network adapter must have an IPv4 address (and optionally, an IPv6 address as well) on a unique subnet. Multiple DAG members can have IPv4 addresses on the same subnet, but each network adapter and IP address pair in a specific DAG member must be on a unique subnet. In addition, only the adapter used for the MAPI network should be configured with a default gateway. Replication networks shouldn't be configured with a default gateway. For example, consider DAG1, a two-member DAG where each member has two network adapters (one dedicated for the MAPI network and the other for a replication network). Example IP address configuration settings are shown in the following table.

Example network adapter settings


Server-network adapter    IP address/subnet mask    Default gateway
EX1-MAPI                  192.168.1.15/24           192.168.1.1
EX1-Replication           10.0.0.15/24              Not applicable
EX2-MAPI                  192.168.1.16              192.168.1.1
EX2-Replication           10.0.0.16                 Not applicable

In the preceding configuration, there are two subnets configured in the DAG: 192.168.1.0 and 10.0.0.0. When EX1 and EX2 are added to the DAG, two subnets will be enumerated and two DAG networks will be created: MapiDagNetwork (192.168.1.0) and ReplicationDagNetwork01 (10.0.0.0). These networks will be configured as shown in the following table.

Enumerated DAG network settings for a single-subnet DAG


Name                       Subnets           Interfaces                                MAPI access enabled    Replication enabled
MapiDagNetwork             192.168.1.0/24    EX1 (192.168.1.15), EX2 (192.168.1.16)    True                   True
ReplicationDagNetwork01    10.0.0.0/24       EX1 (10.0.0.15), EX2 (10.0.0.16)          False                  True

To complete the configuration of ReplicationDagNetwork01 as the dedicated replication network, disable replication for MapiDagNetwork by running the following command.

Set-DatabaseAvailabilityGroupNetwork -Identity DAG1\MapiDagNetwork -ReplicationEnabled:$false

After replication is disabled for MapiDagNetwork, the Microsoft Exchange Replication service uses ReplicationDagNetwork01 for continuous replication. If ReplicationDagNetwork01 experiences a failure, the Microsoft Exchange Replication service reverts to using MapiDagNetwork for continuous replication. This is done intentionally by the system to maintain high availability.
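To check how the DAG networks ended up configured, a sketch such as the following lists the same columns shown in the enumerated network table. DAG1 is the example DAG from this section.

```powershell
# List every DAG network in DAG1 with the settings discussed in this section.
Get-DatabaseAvailabilityGroupNetwork -Identity DAG1 |
    Format-List Name, Subnets, Interfaces, MapiAccessEnabled, ReplicationEnabled
```

After the preceding Set-DatabaseAvailabilityGroupNetwork command has run, MapiDagNetwork would be expected to report ReplicationEnabled as False.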

DAG networks and multiple subnet deployments


In the preceding example, even though there are two different subnets in use by the DAG (192.168.1.0 and 10.0.0.0), the DAG is considered a single-subnet DAG because each member uses the same subnet to form the MAPI network. When DAG members use different subnets for the MAPI network, the DAG is referred to as a multi-subnet DAG. In a multi-subnet DAG, the proper subnets are automatically associated with each DAG network. For example, consider DAG2, a two-member DAG where each member has two network adapters (one dedicated for the MAPI network and the other for a replication network), and each DAG member is located in a separate Active Directory site, with its MAPI network on a different subnet. Example IP address configuration settings are shown in the following table.

Example network adapter settings for a multi-subnet DAG


Server-network adapter    IP address/subnet mask    Default gateway
EX1-MAPI                  192.168.0.15/24           192.168.0.1
EX1-Replication           10.0.0.15/24              Not applicable
EX2-MAPI                  192.168.1.15              192.168.1.1
EX2-Replication           10.0.1.15                 Not applicable

In the preceding configuration, there are four subnets configured in the DAG: 192.168.0.0, 192.168.1.0, 10.0.0.0, and 10.0.1.0. When EX1 and EX2 are added to the DAG, four subnets will be enumerated, but only two DAG networks will be created: MapiDagNetwork (192.168.0.0, 192.168.1.0) and ReplicationDagNetwork01 (10.0.0.0, 10.0.1.0). These networks will be configured as shown in the following table.

Enumerated DAG network settings for a multi-subnet DAG


Name                       Subnets                           Interfaces                                MAPI access enabled    Replication enabled
MapiDagNetwork             192.168.0.0/24, 192.168.1.0/24    EX1 (192.168.0.15), EX2 (192.168.1.15)    True                   True
ReplicationDagNetwork01    10.0.0.0/24, 10.0.1.0/24          EX1 (10.0.0.15), EX2 (10.0.1.15)          False                  True

DAG networks and iSCSI networks


By default, DAGs perform discovery of all networks detected and configured for use by the underlying cluster. This includes any Internet SCSI (iSCSI) networks in use as a result of using iSCSI storage for one or more DAG members. As a best practice, iSCSI storage should use dedicated networks and network adapters. These networks shouldn't be managed by the DAG or its cluster, or used as DAG networks (MAPI or replication). Instead, these networks should be manually disabled from use by the DAG, so they can be dedicated to iSCSI storage traffic. To disable iSCSI networks from being detected and used as DAG networks, configure the DAG to ignore any currently detected iSCSI networks using the Set-DatabaseAvailabilityGroupNetwork cmdlet, as shown in this example:

Set-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork02 -ReplicationEnabled:$false -IgnoreNetwork:$true

This command will also disable the network for use by the cluster. Although the iSCSI networks will continue to appear as DAG networks, they won't be used for MAPI or replication traffic after running the above command.

Configuring DAG members


Mailbox servers that are members of a DAG have some properties specific to high availability that should be configured as described in the following sections:

Automatic database mount dial
Database copy automatic activation policy
Maximum active databases

Automatic database mount dial


The AutoDatabaseMountDial parameter specifies the automatic database mount behavior after a database failover. You can use the Set-MailboxServer cmdlet to configure the AutoDatabaseMountDial parameter with any of the following values:

BestAvailability: If you specify this value, the database automatically mounts immediately after a failover if the copy queue length is less than or equal to 12. The copy queue length is the number of logs recognized by the passive copy that need to be replicated. If the copy queue length is more than 12, the database doesn't automatically mount. When the copy queue length is less than or equal to 12, Exchange attempts to replicate the remaining logs to the passive copy and mounts the database.

GoodAvailability: If you specify this value, the database automatically mounts immediately after a failover if the copy queue length is less than or equal to six. The copy queue length is the number of logs recognized by the passive copy that need to be replicated. If the copy queue length is more than six, the database doesn't automatically mount. When the copy queue length is less than or equal to six, Exchange attempts to replicate the remaining logs to the passive copy and mounts the database.

Lossless: If you specify this value, the database doesn't automatically mount until all logs generated on the active copy have been copied to the passive copy. This setting also causes the Active Manager best copy selection algorithm to sort potential candidates for activation based on the database copy's activation preference value and not its copy queue length.

The default value is GoodAvailability. If you specify either BestAvailability or GoodAvailability, and all the logs from the active copy can't be copied to the passive copy being activated, you may lose some mailbox data. However, the Safety Net feature (which is enabled by default) helps protect against most data loss by resubmitting messages that are in the Safety Net queue. In addition to the preceding values, you can also configure the AutoDatabaseMountDial parameter with a custom value by using ADSI Edit or Ldp.exe to modify the attribute directly in Active Directory. The AutoDatabaseMountDial parameter is represented by the msExchDataLossForAutoDatabaseMount attribute of the Mailbox server object. The whole number value for this attribute represents the maximum number of transaction log files you are willing to lose to mount a database without human intervention.
If you configure the AutoDatabaseMountDial parameter with a custom value greater than 12, we recommend that you also increase the duration of the Safety Net retention period to enable increased protection against a greater number of lost logs.
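The thresholds described for each value can be summarized in a small illustrative model. This is not how Exchange implements the check; Test-AutoMountAllowed is a hypothetical helper written only to make the rule explicit, and it treats Lossless as a threshold of zero outstanding logs.

```powershell
# Illustrative model only; Test-AutoMountAllowed is a hypothetical helper,
# not an Exchange cmdlet and not Exchange's actual implementation.
function Test-AutoMountAllowed {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateSet('Lossless', 'GoodAvailability', 'BestAvailability')]
        [string]$AutoDatabaseMountDial,

        [Parameter(Mandatory = $true)]
        [int]$CopyQueueLength
    )

    # Largest copy queue length at which each setting still mounts automatically.
    $maxCopyQueueLength = @{
        Lossless         = 0
        GoodAvailability = 6
        BestAvailability = 12
    }

    return $CopyQueueLength -le $maxCopyQueueLength[$AutoDatabaseMountDial]
}

Test-AutoMountAllowed -AutoDatabaseMountDial GoodAvailability -CopyQueueLength 6
Test-AutoMountAllowed -AutoDatabaseMountDial GoodAvailability -CopyQueueLength 7
Test-AutoMountAllowed -AutoDatabaseMountDial BestAvailability -CopyQueueLength 12
```

A custom value set through msExchDataLossForAutoDatabaseMount would simply replace the threshold in this model with the number of logs you are willing to lose.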

Example: configuring automatic database mount dial


The following example configures a Mailbox server with an AutoDatabaseMountDial setting of GoodAvailability.

Set-MailboxServer -Identity EX1 -AutoDatabaseMountDial GoodAvailability

Database copy automatic activation policy


The DatabaseCopyAutoActivationPolicy parameter specifies the type of automatic activation available for mailbox database copies on the selected Mailbox servers. You can use the Set-MailboxServer cmdlet to configure the DatabaseCopyAutoActivationPolicy parameter with any of the following values:

Blocked: If you specify this value, databases can't be automatically activated on the selected Mailbox servers.

IntrasiteOnly: If you specify this value, the database copy is allowed to be activated on servers in the same Active Directory site. This prevents cross-site failover or activation. This property is for incoming mailbox database copies (for example, a passive copy being made an active copy). Databases can't be activated on this Mailbox server for database copies that are active in another Active Directory site.

Unrestricted: If you specify this value, there are no special restrictions on activating mailbox database copies on the selected Mailbox servers.

Example: configuring database copy automatic activation policy


The following example configures a Mailbox server with a DatabaseCopyAutoActivationPolicy setting of Blocked.

Set-MailboxServer -Identity EX1 -DatabaseCopyAutoActivationPolicy Blocked

Maximum active databases


The MaximumActiveDatabases parameter (also used with the Set-MailboxServer cmdlet) specifies the number of databases that can be mounted on a Mailbox server. You can configure Mailbox servers to meet your deployment requirements by ensuring that an individual Mailbox server doesn't become overloaded. The MaximumActiveDatabases parameter is configured with a whole number numeric value. When the maximum number is reached, the database copies on the server won't be activated if a failover or switchover occurs. If the copies are already active on a server, the server won't allow databases to be mounted.

Example: configuring maximum active databases


The following example configures a Mailbox server to support a maximum of 20 active databases.

Set-MailboxServer -Identity EX1 -MaximumActiveDatabases 20
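To review all three DAG member settings covered in this section at once, a sketch like the following can be used. EX1 is the example server name used in the preceding commands.

```powershell
# Show the high availability settings currently stored for the Mailbox server.
Get-MailboxServer -Identity EX1 |
    Format-List Name, AutoDatabaseMountDial, DatabaseCopyAutoActivationPolicy, MaximumActiveDatabases
```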


Performing maintenance on DAG members


Before performing any type of software or hardware maintenance on a DAG member, you should first remove the DAG member from service by using the StartDagServerMaintenance.ps1 script. This script moves all the active databases off the server and blocks active databases from moving to that server. The script also ensures that all critical DAG support functionality that may be on the server (for example, the Primary Active Manager (PAM) role) is moved to another server and blocked from moving back to the server. Specifically, the StartDagServerMaintenance.ps1 script performs the following tasks:

Runs Suspend-MailboxDatabaseCopy with the ActivationOnly parameter to suspend each database copy hosted on the DAG member for activation.
Pauses the node in the cluster, which prevents the node from being and becoming the PAM.
Sets the value of the DatabaseCopyAutoActivationPolicy parameter on the DAG member to Blocked.
Moves all active databases currently hosted on the DAG member to other DAG members.
If the DAG member currently owns the default cluster group, the script moves the default cluster group (and therefore the PAM role) to another DAG member.

If any of the preceding tasks fails, all operations, except for successful database moves, are undone.

After the maintenance is complete and the DAG member is ready to return to service, you can use the StopDagServerMaintenance.ps1 script to take the DAG member out of maintenance mode and put it back into production. Specifically, the StopDagServerMaintenance.ps1 script performs the following tasks:

Runs the Resume-MailboxDatabaseCopy cmdlet for each database copy hosted on the DAG member.
Resumes the node in the cluster, which enables full cluster functionality for the DAG member.
Sets the value of the DatabaseCopyAutoActivationPolicy parameter on the DAG member to Unrestricted.

Both scripts accept the -ServerName parameter (which can be either the host name or the fully qualified domain name (FQDN) of the DAG member) and the -WhatIf parameter. Both scripts can be run locally or remotely. The server on which the scripts are executed must have the Windows Failover Cluster Management tools installed (RSAT-Clustering).
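A typical maintenance cycle using the two scripts might look like the following sketch. EX1 is a hypothetical DAG member name; the sketch assumes the commands are run in the Exchange Management Shell, where the $exscripts variable points to the Exchange Scripts folder.

```powershell
# Change to the Exchange Scripts folder.
Set-Location $exscripts

# Preview the actions without making changes, then start maintenance mode.
.\StartDagServerMaintenance.ps1 -ServerName EX1 -WhatIf
.\StartDagServerMaintenance.ps1 -ServerName EX1

# ... perform the software or hardware maintenance here ...

# Return the DAG member to production.
.\StopDagServerMaintenance.ps1 -ServerName EX1
```

Running the script with -WhatIf first gives you a chance to see what would be moved or suspended before committing to the change.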

Shutting down DAG members


The Exchange 2013 high availability solution is integrated with the Windows shutdown process. If an administrator or application initiates a shutdown of a Windows server in a DAG that has a mounted database that's replicated to one or more DAG members, the system attempts to activate another copy of the mounted database prior to allowing the shutdown process to complete. However, this new behavior doesn't guarantee that all of the databases on the server being shut down will experience a lossless activation. As a result, it's a best practice to perform a server switchover prior to shutting down a server that's a member of a DAG.
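One way to perform the recommended server switchover from the Shell is sketched below. EX1 is a hypothetical server name, and the final check is only an illustration of how you might confirm that nothing is still mounted on the server.

```powershell
# Move every active database off EX1; Active Manager selects the target copies.
Move-ActiveMailboxDatabase -Server EX1 -Confirm:$false

# Confirm that no database copies remain mounted on EX1 before shutting it down.
Get-MailboxDatabaseCopyStatus -Server EX1 |
    Where-Object { $_.Status -eq 'Mounted' }
```

If the second command returns no rows, no active copies remain on the server.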

Installing update rollups on DAG members


Installing Microsoft Exchange Server 2013 update rollups on a server that's a member of a DAG is a relatively straightforward process. When you install an update rollup on a server that's a member of a DAG, several services are stopped during the installation, including all Exchange services and the Cluster service. The general process for applying update rollups to a DAG member is as follows:

1. Use the StartDagServerMaintenance.ps1 script to put the DAG member in maintenance mode.
2. Install the update rollup.
3. Use the StopDagServerMaintenance.ps1 script to take the DAG member out of maintenance mode and put it back into production.
4. Use the RedistributeActiveDatabases.ps1 script to rebalance the active database copies across the DAG.
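The rebalancing step at the end of this process can be sketched as follows. DAG1 is a hypothetical DAG name, and balancing by activation preference is just one of the modes the script offers; check the script's own help for the options available in your build.

```powershell
# Run from the Exchange Scripts folder in the Exchange Management Shell.
Set-Location $exscripts

# Rebalance active copies by activation preference and show the resulting distribution.
.\RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -ShowFinalDatabaseDistribution
```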

You can download the latest update rollup for Exchange 2013 from the Microsoft Download Center.


Managing Mailbox Database Copies


Exchange 2013
Applies to: Exchange Server 2013 Topic Last Modified: 2012-12-10

After a database availability group (DAG) has been created, configured, and populated with Mailbox server members, you can use the Exchange Administration Center (EAC) or the Exchange Management Shell to add mailbox database copies in a flexible and granular way.

Managing database copies


After multiple copies of a database are created, you can use the EAC or the Shell to monitor the health and status of each copy and to perform other management tasks associated with database copies. Some of the management tasks you may need to perform include suspending or resuming a database copy, seeding a database copy, monitoring database copies, configuring database copy settings, and removing a database copy.

Suspending and resuming database copies


For a variety of reasons, such as performing planned maintenance, it may be necessary to suspend and resume continuous replication activity for a database copy. In addition, some administrative tasks, such as seeding, require you to first suspend a database copy. We recommend that all replication activity be suspended when the path for the database or its log files is being changed. You can suspend and resume database copy activity by using the EAC, or by running the Suspend-MailboxDatabaseCopy and Resume-MailboxDatabaseCopy cmdlets in the Shell. For detailed steps about how to suspend or resume continuous replication activity for a database copy, see Suspend or Resume a Mailbox Database Copy. Log truncation doesn't occur on the active mailbox database copy when one or more passive copies are suspended. If planned maintenance activities are going to take an extended period of time (for example, several days), you may have considerable log file buildup. To prevent the log drive from filling up with transaction logs, you can remove the affected passive database copy instead of suspending it. When the planned maintenance is completed, you can re-add the passive database copy.
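Both approaches described here can be sketched from the Shell. DB1 and MBX3 are hypothetical database and server names, and the comment text is only an example.

```powershell
# Short maintenance: suspend the passive copy, then resume it afterward.
Suspend-MailboxDatabaseCopy -Identity DB1\MBX3 -SuspendComment "Planned maintenance on MBX3" -Confirm:$false
Resume-MailboxDatabaseCopy -Identity DB1\MBX3

# Extended maintenance: remove the passive copy so logs can truncate on the active copy,
# then add it back when the maintenance is complete.
Remove-MailboxDatabaseCopy -Identity DB1\MBX3 -Confirm:$false
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX3
```

The trade-off is that a re-added copy has to be seeded again, so removal is only worthwhile when the log buildup from a long suspension would be the bigger problem.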

Seeding a database copy


Seeding, also known as updating, is the process in which a database, either a blank database or a copy of the production database, is added to the target copy location on another Mailbox server in the same DAG as the active database. This becomes the baseline database for the copy maintained by that server. Depending on the situation, seeding can be an automatic process or a manual process that you initiate. When a database copy is added, the copy will be automatically seeded, provided that the target server and its storage are properly configured. If you want to manually seed a database copy and don't want automatic seeding to occur when creating the copy, you can use the SeedingPostponed parameter when running the Add-MailboxDatabaseCopy cmdlet. Database copies rarely need to be reseeded after the initial seeding has occurred. But if reseeding is necessary, or if you want to manually seed a database copy instead of having the system automatically seed the copy, these tasks can be performed by using the Update Mailbox Database Copy wizard in the EAC or by using the Update-MailboxDatabaseCopy cmdlet in the Shell. Before seeding a database copy, you must first suspend the mailbox database copy. For detailed steps about how to seed a database copy, see Update a Mailbox Database Copy. After a manual seed operation has completed, replication for the seeded mailbox database copy is automatically resumed. If you don't want replication to automatically resume, you can use the ManualResume parameter when running the Update-MailboxDatabaseCopy cmdlet.
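A manual reseed of an existing copy, with replication left suspended afterward, might look like this sketch. DB1 and MBX2 are hypothetical names, and the DeleteExistingFiles switch is included on the assumption that old database and log files from the previous copy should be discarded.

```powershell
# Suspend the copy before seeding.
Suspend-MailboxDatabaseCopy -Identity DB1\MBX2 -Confirm:$false

# Reseed, removing existing files on the target and leaving replication suspended when done.
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -DeleteExistingFiles -ManualResume

# Resume replication once you're ready.
Resume-MailboxDatabaseCopy -Identity DB1\MBX2
```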

Choosing what to seed


When performing a seed operation, you can choose to seed the mailbox database copy, the content index catalog for the mailbox database copy, or both the database copy and the content index catalog copy. The default behavior of the Update Mailbox Database Copy wizard and the Update-MailboxDatabaseCopy cmdlet is to seed both the mailbox database copy and the content index catalog copy. To seed just the mailbox database copy without seeding the content index catalog, use the DatabaseOnly parameter when running the Update-MailboxDatabaseCopy cmdlet. To seed just the content index catalog copy, use the CatalogOnly parameter when running the Update-MailboxDatabaseCopy cmdlet.
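The two parameters can be used as in the following sketch, where DB1\MBX2 is a hypothetical database copy that has already been suspended.

```powershell
# Seed only the mailbox database copy, leaving the content index catalog alone.
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -DatabaseOnly

# Seed only the content index catalog copy.
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -CatalogOnly
```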

Selecting the seeding source


Any healthy database copy can be used as the seeding source for an additional copy of that database. This is particularly useful when you have a DAG that has been extended across multiple physical locations. For example, consider a four-member DAG deployment, where two members (MBX1 and MBX2) are located in Portland, Oregon and two members (MBX3 and MBX4) are located in New York, New York. A mailbox database named DB1 is active on MBX1 and there are passive copies of DB1 on MBX2 and MBX3. When adding a copy of DB1 to MBX4, you have the option of using the copy on MBX3 as the source for seeding. In doing so, you avoid seeding over the wide area network (WAN) link between Portland and New York. To use a specific copy as a source for seeding when adding a new database copy, you would do the following: Use the SeedingPostponed parameter when running the Add-MailboxDatabaseCopy cmdlet to add the database copy. If the SeedingPostponed parameter isn't used, the database copy will be explicitly seeded using the active copy of the database as the source. You can specify the source server you want to use as part of the Update Mailbox Database Copy wizard in the EAC, or you can use the SourceServer parameter when running the Update-MailboxDatabaseCopy cmdlet to specify the desired source server for seeding. In the preceding example, you would specify MBX3 as the source server. If the SourceServer parameter isn't used, the database copy will be explicitly seeded from the active copy of the database.
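For the Portland and New York example, the two steps could be sketched as follows, using the database and server names from the example.

```powershell
# Add the copy on MBX4 without seeding it from the active copy on MBX1.
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX4 -SeedingPostponed

# Seed the new copy from the passive copy on MBX3, avoiding the WAN link.
Update-MailboxDatabaseCopy -Identity DB1\MBX4 -SourceServer MBX3
```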

Seeding and networks


In addition to selecting a specific source server for seeding a mailbox database copy, you can also use the Shell to specify which DAG networks to use, and optionally override the DAG network's compression and encryption settings during the seed operation. To specify the networks you want to use for seeding, use the Network parameter when running the Update-MailboxDatabaseCopy cmdlet and specify the DAG networks that you want to use. If you don't use the Network parameter, the system uses the following default behavior for selecting a network to use for the seeding operation: If the source server and target server are on the same subnet and a replication network has been configured that includes the subnet, the replication network will be used. If the source server and target server are on different subnets, even if a replication network that contains those subnets has been configured, the client (MAPI) network will be used for seeding. If the source server and target server are in different datacenters, the client (MAPI) network will be used for seeding. At the DAG level, DAG networks are configured for encryption and compression. The default settings are to use encryption and compression only for communications on different subnets. If the source and target are on different subnets and the DAG is configured with the default values for NetworkCompression and NetworkEncryption, you can override these values by using the NetworkCompressionOverride and NetworkEncryptionOverride parameters, respectively, when running the Update-MailboxDatabaseCopy cmdlet.
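A sketch of choosing the seeding network and overriding the DAG-level settings is shown below. The DAG, network, database, and server names are hypothetical, and the override value On is an assumption about the accepted value names; verify the allowed values in the cmdlet help before use.

```powershell
# Seed over a specific DAG network and force compression and encryption for this operation.
# Names are hypothetical; the "On" values are assumed, so check Get-Help first.
Update-MailboxDatabaseCopy -Identity DB1\MBX4 `
    -Network DAG1\ReplicationDagNetwork01 `
    -NetworkCompressionOverride On `
    -NetworkEncryptionOverride On
```

The overrides apply only to the seed operation being run; the DAG's own NetworkCompression and NetworkEncryption values are left unchanged.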

Seeding process
When you initiate a seeding process by using the Add-MailboxDatabaseCopy or Update-MailboxDatabaseCopy cmdlets, the following tasks are performed:

1. Database properties from Active Directory are read to validate the specified database and servers, and to verify that the source and target servers are running Exchange 2013, they are both members of the same DAG, and that the specified database isn't a recovery database. The database file paths are also read.
2. Preparations occur for reseed checks from the Microsoft Exchange Replication service on the target server.
3. The Microsoft Exchange Replication service on the target server checks for the presence of database and transaction log files in the file directories read by the Active Directory checks in step 1.
4. The Microsoft Exchange Replication service returns the status information from the target server to the administrative interface from where the cmdlet was run.
5. If all preliminary checks have passed, you're prompted to confirm the operation before continuing. If you confirm the operation, the process continues. If an error is encountered during the preliminary checks, the error is reported and the operation fails.
6. The seed operation is started from the Microsoft Exchange Replication service on the target server.
7. The Microsoft Exchange Replication service suspends database replication for the active database copy.
8. The state information for the database is updated by the Microsoft Exchange Replication service to reflect a status of Seeding.
9. If the target server doesn't already have the directories for the target database and log files, they are created.
10. A request to seed the database is passed from the Microsoft Exchange Replication service on the target server to the Microsoft Exchange Replication service on the source server using TCP. This request and the subsequent communications for seeding the database occur on a DAG network that has been configured as a replication network.
11. The Microsoft Exchange Replication service on the source server initiates an Extensible Storage Engine (ESE) streaming backup via the Microsoft Exchange Information Store service interface.
12. The Microsoft Exchange Information Store service streams the database data to the Microsoft Exchange Replication service.
13. The database data is moved from the source server's Microsoft Exchange Replication service to the target server's Microsoft Exchange Replication service.
14. The Microsoft Exchange Replication service on the target server writes the database copy to a temporary directory located in the main database directory called temp-seeding.
15. The streaming backup operation on the source server ends when the end of the database is reached.
16. The write operation on the target server completes, and the database is moved from the temp-seeding directory to the final location. The temp-seeding directory is deleted.
17. On the target server, the Microsoft Exchange Replication service proxies a request to the Microsoft Exchange Search service to mount the content index catalog for the database copy, if it exists. If there are existing out-of-date catalog files from a previous instance of the database copy, the mount operation fails, which triggers the need to replicate the catalog from the source server. Likewise, if the catalog doesn't exist on a new instance of the database copy on the target server, a copy of the catalog is required. The Microsoft Exchange Replication service directs the Microsoft Exchange Search service to suspend indexing for the database copy while a new catalog is copied from the source.
18. The Microsoft Exchange Replication service on the target server sends a seed catalog request to the Microsoft Exchange Replication service on the source server.
19. On the source server, the Microsoft Exchange Replication service requests the directory information from the Microsoft Exchange Search service and requests that indexing be suspended.
20. The Microsoft Exchange Search service on the source server returns the search catalog directory information to the Microsoft Exchange Replication service.
21. The Microsoft Exchange Replication service on the source server reads the catalog files from the directory.
22. The Microsoft Exchange Replication service on the source server moves the catalog data to the Microsoft Exchange Replication service on the target server using a connection across the replication network. After the read is complete, the Microsoft Exchange Replication service sends a request to the Microsoft Exchange Search service to resume indexing of the source database.
23. If there are any existing catalog files on the target server in the directory, the Microsoft Exchange Replication service on the target server deletes them.
24. The Microsoft Exchange Replication service on the target server writes the catalog data to a temporary directory called CiSeed.Temp until the data is completely transferred.
25. The Microsoft Exchange Replication service moves the complete catalog data to the final location.
26. The Microsoft Exchange Replication service on the target server resumes search indexing on the target database.
27. The Microsoft Exchange Replication service on the target server returns a completion status.
28. The final result of the operation is passed to the administrative interface from which the cmdlet was called.
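While a seed operation is running, or after it completes, the copy's state can be checked from the Shell. The following sketch uses the hypothetical copy DB1\MBX4; the Status property would be expected to show Seeding during the database portion of the process.

```powershell
# Check the state of the copy being seeded, including its content index catalog.
Get-MailboxDatabaseCopyStatus -Identity DB1\MBX4 |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState
```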

Configuring database copies


After a database copy is created, you can view and modify its configuration settings when needed. You can view some configuration information by examining the Properties page for a database copy in the EAC. You can also use the Get-MailboxDatabase and Set-MailboxDatabaseCopy cmdlets in the Shell to view and configure database copy settings, such as replay lag time, truncation lag time, and activation preference order. For detailed steps about how to view and configure database copy settings, see Configure Mailbox Database Copy Properties.

Using replay lag and truncation lag options


Mailbox database copies support the use of a replay lag time and a truncation lag time, both of which are configured in minutes. Setting a replay lag time enables you to take a database copy back to a specific point in time. Setting a truncation lag time enables you to use the logs on a passive database copy to recover from the loss of log files on the active database copy. Because both of these features result in the temporary buildup of log files, using either of them will affect your storage design.

Replay lag time


Replay lag time is a property of a mailbox database copy that specifies the amount of time, in minutes, to delay log replay for the database copy. The replay lag timer starts when a log file has been replicated to the passive copy and has successfully passed inspection. By delaying the replay of logs to the database copy, you have the capability to recover the database to a specific point in time in the past. A mailbox database copy configured with a replay lag time greater than 0 is referred to as a lagged mailbox database copy, or simply, a lagged copy.

A strategy that uses database copies and the litigation hold features in Exchange 2013 can provide protection against a range of failures that would ordinarily cause data loss. However, these features can't provide protection against data loss in the event of logical corruption, which although rare, can cause data loss. Lagged copies are designed to prevent loss of data in the case of logical corruption. Generally, there are two types of logical corruption:
Database logical corruption: The database pages checksum matches, but the data on the pages is wrong logically. This can occur when ESE attempts to write a database page and even though the operating system returns a success message, the data is either never written to the disk or it's written to the wrong place. This is referred to as a lost flush. To prevent lost flushes from losing data, ESE includes a lost flush detection mechanism in the database along with a page patching feature (single page restore).
Store logical corruption: Data is added, deleted, or manipulated in a way that the user doesn't expect. These cases are generally caused by third-party applications. It's generally only corruption in the sense that the user views it as corruption. The Exchange store considers the transaction that produced the logical corruption to be a series of valid MAPI operations.
The litigation hold feature in Exchange 2013 provides protection from store logical corruption (because it prevents content from being permanently deleted by a user or application). However, there may be scenarios where a user mailbox becomes so corrupted that it would be easier to restore the database to a point in time prior to the corruption, and then export the user mailbox to retrieve uncorrupted data. The combination of database copies, hold policy, and ESE single page restore leaves only the rare but catastrophic store logical corruption case. Your decision on whether to use a database copy with a replay lag (a lagged copy) will depend on which third-party applications you use and your organization's history with store logical corruption.

If you choose to use lagged copies, be aware of the following implications for their use:
The replay lag time is an administrator-configured value, and by default, it's disabled.
The replay lag time setting has a default setting of 0 days, and a maximum setting of 14 days.
Lagged copies aren't considered highly available copies. Instead, they are designed for disaster recovery purposes, to protect against store logical corruption.
The greater the replay lag time set, the longer the database recovery process. Depending on the number of log files that need to be replayed during recovery, and the speed at which your hardware can replay them, it may take several hours or more to recover a database.
We recommend that you determine whether lagged copies are critical for your overall disaster recovery strategy. If using them is critical to your strategy, we recommend using multiple lagged copies, or using a redundant array of independent disks (RAID) to protect a single lagged copy, if you don't have multiple lagged copies. That way, if you lose a disk or if corruption occurs, you don't lose your lagged point in time.
Lagged copies aren't patchable with the ESE single page restore feature. If a lagged copy encounters database page corruption (for example, a -1018 error), it will have to be reseeded (which will lose the lagged aspect of the copy).
Activating and recovering a lagged mailbox database copy is an easy process if you want the database to replay all log files and make the database copy current. If you want to replay log files up to a specific point in time, it's a more difficult operation because you manually manipulate log files and run Exchange Server Database Utilities (Eseutil.exe).

For detailed steps about how to activate a lagged mailbox database copy, see Activate a Lagged Mailbox Database Copy.

Truncation lag time


Truncation lag time is a property of a mailbox database copy that specifies the amount of time, in minutes, to delay log deletion for the database copy after the log file has been replayed into the database copy. The truncation lag timer starts when a log file has been replicated to the passive copy, successfully passed inspection, and has been successfully replayed into the copy of the database. By delaying the truncation of log files from the database copy, you have the capability to recover from failures that affect the log files for the active copy of the database.

Database copies and log truncation


Log truncation works the same in Exchange 2013 as it did in Exchange 2010. Truncation behavior is determined by the replay lag time and truncation lag time settings for the copy.

The following criteria must be met for a database copy's log file to be truncated when lag settings are left at their default values of 0 (disabled):
The log file must have been successfully backed up, or circular logging must be enabled.
The log file must be below the checkpoint (the minimum log file required for recovery) for the database.
All other lagged copies must have inspected the log file.
All other copies (not lagged copies) must have replayed the log file.

The following criteria must be met for truncation to occur for a lagged database copy:
The log file must be below the checkpoint for the database.
The log file must be older than ReplayLagTime + TruncationLagTime.
The log file must have been truncated on the active copy.
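The two sets of truncation criteria can be read as simple boolean predicates. The Python sketch below is an illustrative model only, not Exchange code. The LogFile class, its field names, and the use of a log generation number to represent "below the checkpoint" are assumptions made for the example; the reference point used to measure a log file's age is also assumed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LogFile:
    generation: int                    # assumed: lower generation = older log
    timestamp: datetime                # assumed reference point for the log file's age
    backed_up: bool = False
    truncated_on_active: bool = False


def can_truncate_default(log, checkpoint_generation, circular_logging,
                         lagged_copies_inspected, other_copies_replayed):
    """Criteria when replay lag and truncation lag are left at 0 (disabled)."""
    return ((log.backed_up or circular_logging)
            and log.generation < checkpoint_generation   # below the checkpoint
            and lagged_copies_inspected
            and other_copies_replayed)


def can_truncate_lagged(log, checkpoint_generation, replay_lag, truncation_lag, now):
    """Criteria for a lagged database copy."""
    return (log.generation < checkpoint_generation
            and now - log.timestamp > replay_lag + truncation_lag
            and log.truncated_on_active)


now = datetime(2013, 1, 10, 12, 0)
old_log = LogFile(100, now - timedelta(days=8), truncated_on_active=True)
recent_log = LogFile(101, now - timedelta(days=2), truncated_on_active=True)
seven_days = timedelta(days=7)
```

With a seven-day replay lag and no truncation lag, old_log (eight days old) qualifies for truncation on the lagged copy while recent_log does not, which is why lagged copies accumulate log files and affect storage design.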

Database activation policy


There are scenarios in which you may want to create a mailbox database copy and prevent the system from automatically activating that copy in the event of a failure, for example:
If you deploy one or more mailbox database copies to an alternate or standby datacenter.
If you configure a lagged database copy for recovery purposes.
If you are performing maintenance or an upgrade of a server.

In each of the preceding scenarios, you have database copies that you don't want the system to activate automatically. To prevent the system from automatically activating a mailbox database copy, you can configure the copy to be blocked (suspended) for activation. This allows the system to maintain the currency of the database through log shipping and replay, but prevents the system from automatically activating and using the copy. Copies blocked for activation must be manually activated by an administrator.

You can configure the database activation policy for an entire server by using the Set-MailboxServer cmdlet or an individual database copy by using the Set-MailboxDatabaseCopy cmdlet to set the DatabaseCopyAutoActivationPolicy parameter to Blocked. For more information about configuring database activation policy, see Configure Activation Policy for a Mailbox Database Copy.
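The effect of blocking activation at the server level or at the copy level can be sketched as a filter over the available copies. This Python model is illustrative only and is not Exchange code; the DatabaseCopy class, the server names, and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatabaseCopy:
    database: str
    server: str
    blocked: bool = False   # copy-level activation block (hypothetical field)


def auto_activation_candidates(copies, blocked_servers):
    """Return the copies the system may activate automatically.

    A copy is excluded if it is blocked itself or if its server is blocked.
    """
    return [c for c in copies
            if not c.blocked and c.server not in blocked_servers]


copies = [
    DatabaseCopy("DB1", "EX1"),
    DatabaseCopy("DB1", "EX2", blocked=True),   # for example, a lagged copy
    DatabaseCopy("DB1", "EX3"),                 # hosted in a standby datacenter
]
candidates = auto_activation_candidates(copies, blocked_servers={"EX3"})
```

Blocked copies still receive and replay logs in this model; they are only left out of the automatic activation candidate list, so an administrator would have to activate them manually.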

Effect of mailbox moves on continuous replication


On a very busy mailbox database with a high log generation rate, there is a greater chance for data loss if replication to the passive database copies can't keep up with log generation. One scenario that can introduce a high log generation rate is mailbox moves. Exchange 2013 includes a Data Guarantee API that's used by services such as the Microsoft Exchange Mailbox Replication service (MRS) to check the health of the database copy architecture based on the value of the DataMoveReplicationConstraint parameter that was set by the system or an administrator. Specifically, the Data Guarantee API can be used to:
Check replication health: Confirms that the prerequisite number of database copies is available.
Check replication flush: Confirms that the required log files have been replayed against the prerequisite number of database copies.

When executed, the API returns the following status information to the calling application:
Retry: Signifies that there are transient errors that prevent a condition from being checked against the database.
Satisfied: Signifies that the database meets the required conditions or the database isn't replicated.
NotSatisfied: Signifies that the database doesn't meet the required conditions. In addition, information is provided to the calling application as to why the NotSatisfied response was returned.

The value of the DataMoveReplicationConstraint parameter for the mailbox database determines how many database copies should be evaluated as part of the request. The DataMoveReplicationConstraint parameter has the following possible values:
None: When you create a mailbox database, this value is set by default. When this value is set, the Data Guarantee API conditions are ignored. This setting should be used only for mailbox databases that aren't replicated.
SecondCopy: This is the default value when you add the second copy of a mailbox database. When this value is set, at least one passive database copy must meet the Data Guarantee API conditions.
SecondDatacenter: When this value is set, at least one passive database copy in another Active Directory site must meet the Data Guarantee API conditions.
AllDatacenters: When this value is set, at least one passive database copy in each Active Directory site must meet the Data Guarantee API conditions.
AllCopies: When this value is set, all copies of the mailbox database must meet the Data Guarantee API conditions.

Check Replication Health

When the Data Guarantee API is executed to evaluate the health of the database copy infrastructure, several items are evaluated.

Requirement for each DataMoveReplicationConstraint value:
SecondCopy: At least one passive database copy for a replicated database must meet the conditions listed below.
SecondDatacenter: At least one passive database copy in another Active Directory site must meet the conditions listed below.
AllDatacenters: The active copy must be mounted, and a passive copy in each Active Directory site must meet the conditions listed below.
AllCopies: The active copy must be mounted, and all passive database copies must meet the conditions listed below.

Conditions: The passive database copy must:
Be healthy.
Have a replay queue within 10 minutes of the replay lag time.
Have a copy queue length less than 10 logs.
Have an average copy queue length less than 10 logs. The average copy queue length is computed based on the number of times the application has queried the database status.
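The replication health evaluation can be sketched as a per-copy condition check followed by a per-constraint rule. The Python model below is illustrative only and is not the Data Guarantee API. The PassiveCopy class and its field names are hypothetical, "replay queue within 10 minutes of the replay lag time" is interpreted as an age comparison, and the set of sites is assumed to be the sites that host passive copies.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class PassiveCopy:
    site: str
    healthy: bool
    replay_queue_age: timedelta      # assumed: age of the oldest log awaiting replay
    replay_lag: timedelta
    copy_queue_length: int
    avg_copy_queue_length: float


def copy_meets_conditions(c):
    return (c.healthy
            and c.replay_queue_age <= c.replay_lag + timedelta(minutes=10)
            and c.copy_queue_length < 10
            and c.avg_copy_queue_length < 10)


def replication_health(constraint, active_site, active_mounted, passives):
    """Return True when the constraint's requirement is met."""
    ok = [c for c in passives if copy_meets_conditions(c)]
    if constraint == "None":
        return True                                   # conditions are ignored
    if constraint == "SecondCopy":
        return len(ok) >= 1
    if constraint == "SecondDatacenter":
        return any(c.site != active_site for c in ok)
    if constraint == "AllDatacenters":
        sites = {c.site for c in passives}
        return active_mounted and all(any(c.site == s for c in ok) for s in sites)
    if constraint == "AllCopies":
        return active_mounted and len(ok) == len(passives)
    raise ValueError(f"unknown constraint: {constraint}")


good = PassiveCopy("SiteB", True, timedelta(minutes=5), timedelta(0), 2, 1.5)
behind = PassiveCopy("SiteA", True, timedelta(minutes=5), timedelta(0), 25, 12.0)
```

In this example, one copy in SiteB is current and one copy in SiteA has a long copy queue, so SecondCopy and SecondDatacenter are met while AllDatacenters and AllCopies are not.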

Check Replication Flush

The Data Guarantee API can also be used to validate that a prerequisite number of database copies have replayed the required transaction logs. This is verified by comparing the last log replayed timestamp with that of the calling service's commit timestamp (in most cases, this is the timestamp of the last log file that contains required data) plus an additional five seconds (to deal with system time clock skews or drift). If the replay timestamp is greater than the commit timestamp, the DataMoveReplicationConstraint parameter is satisfied. If the replay timestamp is less than the commit timestamp, the DataMoveReplicationConstraint parameter isn't satisfied.

Before moving large numbers of mailboxes to or from replicated databases within a DAG, we recommend that you configure the DataMoveReplicationConstraint parameter on each mailbox database according to the following:

Mailbox databases that don't have any database copies | None
A DAG within a single Active Directory site | SecondCopy
A DAG in multiple datacenters using a stretched Active Directory site | SecondCopy
A DAG that spans two Active Directory sites, and you will have highly available database copies in each site | SecondDatacenter
A DAG that spans two Active Directory sites, and you will have only lagged database copies in the second site | SecondCopy. This is because the Data Guarantee API won't guarantee data being committed until the log file is replayed into the database copy, and due to the nature of the database copy being lagged, this constraint will fail the move request, unless the lagged database copy ReplayLagTime value is less than 30 minutes.
A DAG that spans three or more Active Directory sites, and each site will contain highly available database copies | AllDatacenters
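The replication flush comparison described before these recommendations can be sketched as a timestamp check. This Python model is illustrative only and is not the Data Guarantee API. It assumes one reading of the description, namely that a copy counts as flushed when its last replayed log timestamp is later than the commit timestamp plus the five-second allowance; the function names are hypothetical.

```python
from datetime import datetime, timedelta

CLOCK_SKEW_ALLOWANCE = timedelta(seconds=5)


def copy_flushed(last_log_replayed, commit_time):
    """Assumed reading: replay must pass the commit time plus the skew allowance."""
    return last_log_replayed > commit_time + CLOCK_SKEW_ALLOWANCE


def flush_satisfied(replay_times, commit_time, required_copies):
    """True when at least `required_copies` passive copies have replayed far enough."""
    flushed = sum(copy_flushed(t, commit_time) for t in replay_times)
    return flushed >= required_copies


commit = datetime(2013, 1, 1, 12, 0, 0)
current_copy = commit + timedelta(seconds=6)     # has replayed past the commit
lagged_copy = commit - timedelta(days=7)         # replay is deliberately delayed
```

A lagged copy's last replayed timestamp trails the commit by roughly its replay lag, so in this model it never counts toward the required number of copies. That matches the recommendation to use SecondCopy when the second site holds only lagged copies.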

Balancing database copies


Due to the inherent nature of DAGs, as the result of database switchovers and failovers, active mailbox database copies will change hosts several times throughout a DAG's lifetime. As a result, DAGs can become unbalanced in terms of active mailbox database copy distribution. The following table shows an example of a DAG that has 16 databases with four copies of each database (for a total of 16 database copies on each server) with an uneven distribution of active database copies.

DAG with unbalanced active copy distribution


Server | Number of active databases | Number of passive databases | Number of mounted databases | Number of dismounted databases | Preference count list
EX1 | 5 | 11 | 5 | 0 | 4, 4, 3, 5
EX2 | 1 | 15 | 1 | 0 | 1, 8, 6, 1
EX3 | 12 | 4 | 12 | 0 | 13, 2, 1, 0
EX4 | 1 | 15 | 1 | 0 | 1, 1, 5, 9

In the preceding example, there are four copies of each database, and therefore, only four possible values for activation preference (1, 2, 3, or 4). The Preference count list column shows the count of the number of databases with each of these values. For example, on EX3, there are 13 database copies with an activation preference of 1, two copies with an activation preference of 2, one copy with an activation preference of 3, and no copies with an activation preference of 4. As you can see, this DAG isn't balanced in terms of the number of active databases hosted by each DAG member, the number of passive databases hosted by each DAG member, or the activation preference count of the hosted databases.

You can use the RedistributeActiveDatabases.ps1 script to balance the active mailbox database copies across a DAG. This script moves databases between their copies in an attempt to have an equal number of mounted databases on each server in the DAG. If required, the script also attempts to balance active databases across sites. The script provides two options for balancing active database copies within a DAG:
BalanceDbsByActivationPreference: When this option is specified, the script attempts to move databases to their most preferred copy (based on activation preference) without regard to the Active Directory site.
BalanceDbsBySiteAndActivationPreference: When this option is specified, the script attempts to move active databases to their most preferred copy, while also trying to balance active databases within each Active Directory site.

After running the script with the first option, the preceding unbalanced DAG becomes balanced, as shown in the following table.

DAG with balanced active copy distribution

Server | Number of active databases | Number of passive databases | Number of mounted databases | Number of dismounted databases | Preference count list
EX1 | 4 | 12 | 4 | 0 | 4, 4, 4, 4
EX2 | 4 | 12 | 4 | 0 | 4, 4, 4, 4
EX3 | 4 | 12 | 4 | 0 | 4, 4, 4, 4
EX4 | 4 | 12 | 4 | 0 | 4, 4, 4, 4

As shown in the preceding table, this DAG is now balanced in terms of number of active and passive databases on each server and activation preference across the servers. The following table lists the available parameters for the RedistributeActiveDatabases.ps1 script.

RedistributeActiveDatabases.ps1 script parameters


Parameter | Description
DagName | Specifies the name of the DAG you want to rebalance. If this parameter is omitted, the DAG of which the local server is a member is used.
BalanceDbsByActivationPreference | Specifies that the script should move databases to their most preferred copy without regard to the Active Directory site.
BalanceDbsBySiteAndActivationPreference | Specifies that the script should attempt to move active databases to their most preferred copy, while also trying to balance active databases within each Active Directory site.
ShowFinalDatabaseDistribution | Specifies that a report of current database distribution be displayed after redistribution is complete.
AllowedDeviationFromMeanPercentage | Specifies the allowed variation of active databases across sites, expressed as a percentage. The default is 20%. For example, if there were 99 databases distributed between three sites, the ideal distribution would be 33 databases in each site. If the allowed deviation is 20%, the script attempts to balance the databases so that each site has no more than 10% more or less than this number. 10% of 33 is 3.3, which is rounded up to 4. Therefore, the script attempts to have between 29 and 37 databases in each site.
ShowDatabaseCurrentActives | Specifies that the script produce a report for each database detailing how the database was moved and whether it's now active on its most-preferred copy.
ShowDatabaseDistributionByServer | Specifies that the script produce a report for each server showing its database distribution.
RunOnlyOnPAM | Specifies that the script run only on the DAG member that currently has the PAM role. The script verifies it's being run from the PAM. If it isn't being run from the PAM, the script exits.
LogEvents | Specifies that the script logs an event (MsExchangeRepl event 4115) containing a summary of the actions.
IncludeNonReplicatedDatabases | Specifies that the script should include non-replicated databases (databases without copies) when determining how to redistribute the active databases. Although non-replicated databases can't be moved, they may affect the distribution of the replicated databases.
Confirm | The Confirm switch can be used to suppress the confirmation prompt that appears by default when this script is run. To suppress the confirmation prompt, use the syntax -Confirm:$False. You must include a colon ( : ) in the syntax.
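The arithmetic in the AllowedDeviationFromMeanPercentage example can be reproduced in a few lines. The Python sketch below only restates that worked example (half of the allowed percentage on either side of the mean, with the margin rounded up); it is not taken from RedistributeActiveDatabases.ps1, and the function name is hypothetical.

```python
from fractions import Fraction
from math import ceil


def allowed_range(total_databases, site_count, allowed_deviation_pct=20):
    """Return (minimum, maximum) active databases per site for the worked example."""
    ideal = Fraction(total_databases, site_count)            # 99 / 3 = 33
    half_pct = Fraction(allowed_deviation_pct, 2)            # 20% total = 10% either way
    margin = ceil(ideal * half_pct / 100)                    # 3.3 rounds up to 4
    return ideal - margin, ideal + margin


low, high = allowed_range(99, 3, 20)
print(f"Between {low} and {high} databases in each site")
```

Exact fractions are used so that the rounding step isn't affected by floating-point error. How the script treats a mean that isn't a whole number isn't stated in the description, so that case is left unmodeled.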

RedistributeActiveDatabases.ps1 examples
This example shows the current database distribution for a DAG, including preference count list.

RedistributeActiveDatabases.ps1 -DagName DAG1 -ShowDatabaseDistributionByServer | Format-Table

This example redistributes and balances the active mailbox database copies in a DAG using activation preference without prompting for input.

RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -Confirm:$False

This example redistributes and balances the active mailbox database copies in a DAG using activation preference, and produces a summary of the distribution.

RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -ShowFinalDatabaseDistribution

Monitoring database copies


A database copy is your first defense if a failure occurs that affects the active copy of a database. It's therefore critical to monitor the health and status of database copies to ensure that they will be available when needed. You can view a variety of information, including copy queue length, replay queue length, status, and content index state information, by examining the details of a database copy in the EAC. You can also use the Get-MailboxDatabaseCopyStatus cmdlet in the Shell to view a variety of status information for a database copy. For more information about monitoring database copies, see Monitoring Database Availability Groups.

Removing a database copy


A database copy can be removed at any time by using the EAC or by using the Remove-MailboxDatabaseCopy cmdlet in the Shell. After removing a database copy, you must manually delete any database and transaction log files from the server from which the database copy is being removed. For detailed steps about how to remove a database copy, see Remove a Mailbox Database Copy.

Database switchovers
The Mailbox server that hosts the active copy of a database is referred to as the mailbox database master. The process of activating a passive database copy changes the mailbox database master for the database and turns the passive copy into the new active copy. This process is called a database switchover. In a database switchover, the active copy of a database is dismounted on one Mailbox server and a passive copy of that database is mounted as the new active mailbox database on another Mailbox server. When performing a switchover, you can optionally override the database mount dial setting on the new mailbox database master.

You can quickly identify which Mailbox server is the current mailbox database master by reviewing the right-hand column under the Database Copies tab in the EAC. You can perform a switchover by using the Activate link in the EAC, or by using the Move-ActiveMailboxDatabase cmdlet in the Shell.

There are several internal checks that will be performed before activating a passive copy:
The status of the database copy is checked. If the database copy is in a failed state, the switchover is blocked. You can override this behavior and bypass the health check by using the SkipHealthChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows you to move the active copy to a database copy in a failed state.
The active database copy is checked to see if it's currently a seeding source for any passive copies of the database. If the active copy is currently being used as a source for seeding, the switchover is blocked. You can override this behavior and bypass the seeding source check by using the SkipActiveCopyChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows you to move an active copy that's being used as a seeding source. Using this parameter will cause the seeding operation to be cancelled and considered failed.
The copy queue and replay queue lengths for the database copy are checked to ensure their values are within the configured criteria. Also, the database copy is verified to ensure that it isn't currently in use as a source for seeding. If the values for the queue lengths are outside the configured criteria, or if the database is currently used as a source for seeding, the switchover is blocked. You can override this behavior and bypass these checks by using the SkipLagChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows a copy to be activated that has replay and copy queues outside of the configured criteria.
The state of the search catalog (content index) for the database copy is checked. If the search catalog isn't up to date, is in an unhealthy state, or is corrupt, the switchover is blocked. You can override this behavior and bypass the search catalog check by using the SkipClientExperienceChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter causes the switchover to skip the catalog health check. If the search catalog for the database copy you're activating is in an unhealthy or unusable state and you use this parameter to skip the catalog health check and activate the database copy, you will need to either crawl or seed the search catalog again.

When performing a database switchover, you also have the option of overriding the mount dial settings configured for the server that hosts the passive database copy being activated. Using the MountDialOverride parameter of the Move-ActiveMailboxDatabase cmdlet instructs the target server to override its own mount dial settings and use those specified by the MountDialOverride parameter.

For detailed steps about how to perform a switchover of a database copy, see Activate a Mailbox Database Copy.
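The pre-activation checks and the parameters that bypass them can be summarized as a small decision function. The Python sketch below is an illustrative model of the checks as described in this section, not the cmdlet's implementation. The SwitchoverState class, its field names, and the returned descriptions are hypothetical; only the four Skip parameter names come from the text.

```python
from dataclasses import dataclass


@dataclass
class SwitchoverState:
    target_copy_failed: bool
    active_is_seeding_source: bool
    target_queues_within_criteria: bool
    target_is_seeding_source: bool
    target_catalog_healthy: bool


def blocking_checks(state, skip=()):
    """Return the checks that would block the switchover, honoring skip parameters."""
    skip = set(skip)
    blockers = []
    if state.target_copy_failed and "SkipHealthChecks" not in skip:
        blockers.append("database copy status")
    if state.active_is_seeding_source and "SkipActiveCopyChecks" not in skip:
        blockers.append("active copy is a seeding source")
    queue_problem = (not state.target_queues_within_criteria
                     or state.target_is_seeding_source)
    if queue_problem and "SkipLagChecks" not in skip:
        blockers.append("queue lengths or seeding source")
    if not state.target_catalog_healthy and "SkipClientExperienceChecks" not in skip:
        blockers.append("search catalog")
    return blockers


state = SwitchoverState(
    target_copy_failed=False,
    active_is_seeding_source=False,
    target_queues_within_criteria=False,
    target_is_seeding_source=False,
    target_catalog_healthy=False,
)
```

In this example the target copy has long queues and an unhealthy catalog, so two checks block the switchover until both SkipLagChecks and SkipClientExperienceChecks are supplied. An empty result means the switchover would proceed in the model.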

Monitoring Database Availability Groups


Exchange 2013
Applies to: Exchange Server 2013
Topic Last Modified: 2012-11-16

Making sure that servers are operating reliably and that database copies are healthy are key daily objectives for messaging administrators. To help ensure the availability and reliability of your Microsoft Exchange Server 2013 organization, the hardware, Windows operating system, and Exchange 2013 services and protocols must be actively monitored.

Historically, monitoring Exchange has meant using an external application, such as System Center 2012 Operations Manager, to collect performance and event log data, and to react or provide recovery action for problems that are detected as a result of analyzing the collected data. Exchange 2010 and previous versions included health manifests and correlation engines in the form of management packs. These correlation engines would analyze the collected data and make a determination as to whether a particular component was healthy or unhealthy. In addition, System Center 2012 Operations Manager was also able to leverage the built-in test cmdlet infrastructure to run synthetic transactions against various aspects of the system to ensure the system was available.

In Exchange 2013, native, built-in monitoring and recovery actions are included in a feature called Managed Availability. You can use the details in this topic for monitoring the health and status of mailbox database copies for database availability groups (DAGs).

Contents
Managed availability
Get-MailboxDatabaseCopyStatus cmdlet
Test-ReplicationHealth cmdlet
Crimson channel event logging
CollectOverMetrics.ps1 script
CollectReplicationMetrics.ps1 script

Managed availability
Managed availability is the integration of built-in monitoring and recovery actions with the Exchange built-in high availability platform. It's designed to detect and recover from problems as soon as they occur and are discovered by the system. Unlike previous external monitoring solutions for Exchange, managed availability doesn't try to identify or communicate the root cause of an issue. It's instead focused on recovery aspects that address three key areas of the user experience:
Availability: Can users access the service?
Latency: How is the experience for users?
Errors: Are users able to accomplish what they want?

The new architecture in Exchange 2013 makes each Exchange server an island where services on that island only service the active databases located on that server. The architectural changes in Exchange 2013 require a new approach to the availability model used by Exchange. The Mailbox and Client Access server architecture implies that any Mailbox server with an active database is in production for all services, including all protocol services. As a result, this fundamentally changes the model used to manage the protocol services.

Managed availability was conceived to address this change and to provide a native health monitoring and recovery solution. The integration of the building block architecture into a unified framework provides a powerful capability to detect failures and recover from them. Managed availability moves away from monitoring individual separate slices of the system to monitoring the end-to-end user experience, and protecting the end user's experience through recovery-oriented computing.

In Exchange 2013, client access protocols for a specific mailbox are always served from the protocol instance that's local to the active database copy. As a result, it's important that managed availability's monitoring and recovery actions take into account more than just the health of the database.

Managed availability is an internal process that runs on every Exchange 2013 server. It's implemented in the form of two services:
Exchange Health Manager Service (MSExchangeHMHost.exe): This is a controller process used to manage worker processes. It's used to build, execute, and start and stop the worker process, as needed. It's also used to recover the worker process in case that process fails, to prevent the worker process from being a single point of failure.
Exchange Health Manager Worker process (MSExchangeHMWorker.exe): This is the worker process responsible for performing the run-time tasks.

Managed availability uses persistent storage to perform its functions:
XML configuration files are used to initialize the work item definitions during startup of the worker process.
The Windows registry is used to store run-time data, such as bookmarks.
The Windows crimson channel event log infrastructure is used to store the work item results.

As illustrated in the following drawing, managed availability includes three main asynchronous components that are constantly doing work.

Figure: Managed availability

The first component is the probe engine, which is responsible for taking measurements on the server and collecting data. The results of those measurements flow into the second component, the monitor. The monitor contains all of the business logic used by the system based on what is considered healthy on the data collected. Similar to a pattern recognition engine, the monitor looks for the various different patterns on all the collected measurements, and then it decides whether something is considered healthy. Finally, there is the responder engine, which is responsible for recovery actions. When something is unhealthy, the first action is to attempt to recover that component. This could include multi-stage recovery actions; for example, the first attempt may be to restart the application pool, the second may be to restart the service, the third attempt may be to restart the server, and the subsequent attempt may be to take the server offline so that it no longer accepts traffic. If the recovery actions are unsuccessful, the system escalates the issue to a human through event log notifications.

The probe engine contains probes, checks, and notification logic. Probes are synthetic transactions performed by the system to test the end-to-end user experience. Checks are the infrastructure that collects performance data, including user traffic, and measures the collected data against thresholds that are set to determine spikes in user failures. This enables the checks infrastructure to become aware when users are experiencing issues. Finally, the notification logic enables the system to take action immediately based on a critical event, without having to wait for the results of the data collected by a probe. These are typically exceptions or conditions that can be detected and recognized without a large sample set.

Monitors query the data collected by probes to determine if action needs to be taken based on a predefined rule set. Depending on the rule or the nature of the issue, a monitor can either initiate a responder or escalate the issue to a human via an event log entry. In addition, monitors define how long after a failure a responder is executed, as well as the workflow of the recovery action.

Monitors have various states. From a system state perspective, monitors have two states:

Healthy: The monitor is operating properly and all collected metrics are within normal operating parameters.
Unhealthy: The monitor isn't healthy and has either initiated recovery through a responder or notified an administrator through escalation.

From an administrative perspective, monitors have additional states that appear in the Shell:

Degraded: When a monitor is in an unhealthy state from 0 through 60 seconds, it's considered Degraded. If a monitor is unhealthy for more than 60 seconds, it is considered Unhealthy.
Disabled: The monitor has been explicitly disabled by an administrator.
Unavailable: The Microsoft Exchange Health service periodically queries each monitor for its state. If it doesn't get a response to the query, the monitor state becomes Unavailable.
Repairing: An administrator sets the Repairing state to indicate to the system that corrective action is in process by a human, which allows the system and humans to differentiate between other failures that may occur at the same time corrective action is being taken (such as a database copy reseed operation).
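As a rough illustration of how the administrative states relate to each other, the following sketch maps a few inputs to one of the state names listed above. Get-MonitorDisplayState and its parameters are hypothetical, and the order in which the conditions are checked is an assumption made for the example; only the 60-second boundary between Degraded and Unhealthy comes from the text.

```powershell
# Hypothetical sketch of the state names described above. The precedence of
# the checks is an assumption; the 60-second threshold is from the text.
function Get-MonitorDisplayState {
    param(
        [bool]     $IsHealthy,
        [timespan] $UnhealthyFor = [timespan]::Zero,  # time spent unhealthy so far
        [bool]     $Disabled     = $false,            # explicitly disabled by an administrator
        [bool]     $Responded    = $true,             # did the monitor answer the state query?
        [bool]     $Repairing    = $false             # administrator marked repair in progress
    )
    if ($Disabled)       { return 'Disabled' }
    if (-not $Responded) { return 'Unavailable' }
    if ($Repairing)      { return 'Repairing' }
    if ($IsHealthy)      { return 'Healthy' }
    # Unhealthy from 0 through 60 seconds is reported as Degraded.
    if ($UnhealthyFor.TotalSeconds -le 60) { return 'Degraded' }
    return 'Unhealthy'
}

$after30 = Get-MonitorDisplayState -IsHealthy $false -UnhealthyFor ([timespan]::FromSeconds(30))
$after90 = Get-MonitorDisplayState -IsHealthy $false -UnhealthyFor ([timespan]::FromSeconds(90))
```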

Get-MailboxDatabaseCopyStatus cmdlet
You can use the Get-MailboxDatabaseCopyStatus cmdlet to view status information about mailbox database copies. This cmdlet enables you to view information about all copies of a particular database, information about a specific copy of a database on a specific server, or information about all database copies on a server. The following table describes possible values for the copy status of a mailbox database copy.

Database copy status

Database copy status: Description

Failed: The mailbox database copy is in a Failed state because it isn't suspended, and it isn't able to copy or replay log files. While in a Failed state and not suspended, the system will periodically check whether the problem that caused the copy status to change to Failed has been resolved. After the system has detected that the problem is resolved, and barring any other issues, the copy status will automatically change to Healthy.

Seeding: The mailbox database copy is being seeded, the content index for the mailbox database copy is being seeded, or both are being seeded. Upon successful completion of seeding, the copy status should change to Initializing.

SeedingSource: The mailbox database copy is being used as a source for a database copy seeding operation.

Suspended: The mailbox database copy is in a Suspended state as a result of an administrator manually suspending the database copy by running the Suspend-MailboxDatabaseCopy cmdlet.

Healthy: The mailbox database copy is successfully copying and replaying log files, or it has successfully copied and replayed all available log files.

ServiceDown: The Microsoft Exchange Replication service isn't available or running on the server that hosts the mailbox database copy.

Initializing: The mailbox database copy is in an Initializing state when a database copy has been created, when the Microsoft Exchange Replication service is starting or has just been started, and during transitions from Suspended, ServiceDown, Failed, Seeding, or SinglePageRestore to another state. While in this state, the system is verifying that the database and log stream are in a consistent state. In most cases, the copy status will remain in the Initializing state for about 15 seconds, but in all cases, it should generally not be in this state for longer than 30 seconds.

Resynchronizing: The mailbox database copy and its log files are being compared with the active copy of the database to check for any divergence between the two copies. The copy status will remain in this state until any divergence is detected and resolved.

Mounted: The active copy is online and accepting client connections. Only the active copy of a mailbox database can have a copy status of Mounted.

Dismounted: The active copy is offline and not accepting client connections. Only the active copy of a mailbox database can have a copy status of Dismounted.

Mounting: The active copy is coming online and not yet accepting client connections. Only the active copy of a mailbox database can have a copy status of Mounting.

Dismounting: The active copy is going offline and terminating client connections. Only the active copy of a mailbox database can have a copy status of Dismounting.

DisconnectedAndHealthy: The mailbox database copy is no longer connected to the active database copy, and it was in the Healthy state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy.

DisconnectedAndResynchronizing: The mailbox database copy is no longer connected to the active database copy, and it was in the Resynchronizing state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy.

FailedAndSuspended: The Failed and Suspended states have been set simultaneously by the system because a failure was detected, and because resolution of the failure explicitly requires administrator intervention. An example is if the system detects unrecoverable divergence between the active mailbox database and a database copy. Unlike the Failed state, the system won't periodically check whether the problem has been resolved, and automatically recover. Instead, an administrator must intervene to resolve the underlying cause of the failure before the database copy can be transitioned to a healthy state.

SinglePageRestore: This state indicates that a single page restore operation is occurring on the mailbox database copy.

The Get-MailboxDatabaseCopyStatus cmdlet also includes a parameter called ConnectionStatus, which returns details about the in-use replication networks. If you use this parameter, two additional output fields, IncomingLogCopyingNetwork and SeedingNetwork, will be populated in the task's output.

Get-MailboxDatabaseCopyStatus examples

The following examples use the Get-MailboxDatabaseCopyStatus cmdlet. Each example pipes the results to the Format-List cmdlet to display the output in list format.

This example returns status information for all copies of the database DB2.

Get-MailboxDatabaseCopyStatus -Identity DB2 | Format-List

This example returns the status for all database copies on the Mailbox server MBX2.

Get-MailboxDatabaseCopyStatus -Server MBX2 | Format-List

This example returns the status for all database copies on the local Mailbox server.

Get-MailboxDatabaseCopyStatus -Local | Format-List

This example returns status, log shipping, and seeding network information for the database DB3 on the Mailbox server MBX1.

Get-MailboxDatabaseCopyStatus -Identity DB3\MBX1 -ConnectionStatus | Format-List

For more information about using the Get-MailboxDatabaseCopyStatus cmdlet, see Get-MailboxDatabaseCopyStatus.
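When many copies are returned, it can help to show only the copies whose status is something other than the two steady states, Healthy and Mounted. The following is a minimal sketch under stated assumptions: Select-ProblemCopy is a hypothetical helper (not an Exchange cmdlet), it assumes each object exposes Name and Status properties, and the sample objects are made-up stand-ins so the sketch runs without an Exchange server.

```powershell
# Hypothetical helper: pass through only copies whose status is not one of
# the expected steady states. Assumes each input object has a Status property.
function Select-ProblemCopy {
    param(
        [Parameter(ValueFromPipeline = $true)] $Copy,
        [string[]] $ExpectedStatus = @('Healthy', 'Mounted')
    )
    process {
        # Compare the string form so the filter works for enum or string values.
        if ("$($Copy.Status)" -notin $ExpectedStatus) { $Copy }
    }
}

# On an Exchange server the input would come from the real cmdlet, for example:
#   Get-MailboxDatabaseCopyStatus -Server MBX2 | Select-ProblemCopy
# Illustrative stand-in objects (made-up data, not real cmdlet output):
$sample = @(
    [pscustomobject]@{ Name = 'DB1\MBX2'; Status = 'Mounted' },
    [pscustomobject]@{ Name = 'DB2\MBX2'; Status = 'Healthy' },
    [pscustomobject]@{ Name = 'DB3\MBX2'; Status = 'FailedAndSuspended' }
)
$problems = @($sample | Select-ProblemCopy)
```

Transient states such as Initializing or Resynchronizing would also be returned by this filter; extend the ExpectedStatus list if those should be ignored.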

Test-ReplicationHealth cmdlet
You can use the Test-ReplicationHealth cmdlet to view continuous replication status information about mailbox database copies. This cmdlet can be used to check all aspects of the replication and replay status to provide a complete overview of a specific Mailbox server in a DAG. The Test-ReplicationHealth cmdlet is designed for the proactive monitoring of continuous replication and the continuous replication pipeline, the availability of Active Manager, and the health and status of the underlying cluster service, quorum, and network components. It can be run locally on or remotely against any Mailbox server in a DAG. The Test-ReplicationHealth cmdlet performs the tests listed in the following table.

Test-ReplicationHealth cmdlet tests

Test name: Description

ClusterService: Verifies that the Cluster service is running and reachable on the specified DAG member, or if no DAG member is specified, on the local server.

ReplayService: Verifies that the Microsoft Exchange Replication service is running and reachable on the specified DAG member, or if no DAG member is specified, on the local server.

ActiveManager: Verifies that the instance of Active Manager running on the specified DAG member, or if no DAG member is specified, the local server, is in a valid role (primary, secondary, or stand-alone).

TasksRpcListener: Verifies that the tasks remote procedure call (RPC) server is running and reachable on the specified DAG member, or if no DAG member is specified, on the local server.

TcpListener: Verifies that the TCP log copy listener is running and reachable on the specified DAG member, or if no DAG member is specified, on the local server.

DagMembersUp: Verifies that all DAG members are available, running, and reachable.

ClusterNetwork: Verifies that all cluster-managed networks on the specified DAG member, or if no DAG member is specified, the local server, are available.

QuorumGroup: Verifies that the default cluster group (quorum group) is in a healthy and online state.

FileShareQuorum: Verifies that the witness server and witness directory and share configured for the DAG are reachable.

DBCopySuspended: Checks whether any mailbox database copies are in a state of Suspended on the specified DAG member, or if no DAG member is specified, on the local server.

DBCopyFailed: Checks whether any mailbox database copies are in a state of Failed on the specified DAG member, or if no DAG member is specified, on the local server.

DBInitializing: Checks whether any mailbox database copies are in a state of Initializing on the specified DAG member, or if no DAG member is specified, on the local server.

DBDisconnected: Checks whether any mailbox database copies are in a state of Disconnected on the specified DAG member, or if no DAG member is specified, on the local server.

DBLogCopyKeepingUp: Verifies that log copying and inspection by the passive copies of databases on the specified DAG member, or if no DAG member is specified, on the local server, are able to keep up with log generation activity on the active copy.

DBLogReplayKeepingUp: Verifies that replay activity for the passive copies of databases on the specified DAG member, or if no DAG member is specified, on the local server, is able to keep up with log copying and inspection activity.

Test-ReplicationHealth example
This example uses the Test-ReplicationHealth cmdlet to test the health of replication for the Mailbox server MBX1.

Test-ReplicationHealth -Identity MBX1
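Because the cmdlet runs many tests, a quick way to review the outcome is to list only the tests that did not pass. The sketch below is illustrative only: Get-FailedCheck is a hypothetical helper, and it assumes that each result object exposes Server, Check, and Result properties and that a passing result renders as the string Passed. Verify those property names against the actual output in your environment. The sample objects are made-up stand-ins.

```powershell
# Hypothetical helper: keep only results that did not pass.
# Assumes Server, Check, and Result properties; verify on a real server.
function Get-FailedCheck {
    param([Parameter(ValueFromPipeline = $true)] $CheckResult)
    process {
        if ("$($CheckResult.Result)" -ne 'Passed') {
            [pscustomobject]@{
                Server = "$($CheckResult.Server)"
                Check  = "$($CheckResult.Check)"
                Result = "$($CheckResult.Result)"
            }
        }
    }
}

# On an Exchange server: Test-ReplicationHealth -Identity MBX1 | Get-FailedCheck
# Illustrative stand-in objects (made-up data, not real cmdlet output):
$sampleResults = @(
    [pscustomobject]@{ Server = 'MBX1'; Check = 'ClusterService';  Result = 'Passed' },
    [pscustomobject]@{ Server = 'MBX1'; Check = 'DBCopySuspended'; Result = 'Failed' },
    [pscustomobject]@{ Server = 'MBX1'; Check = 'ReplayService';   Result = 'Passed' }
)
$failedChecks = @($sampleResults | Get-FailedCheck)
```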


Crimson channel event logging


Windows includes two categories of event logs: Windows logs, and Applications and Services logs. The Windows logs category includes the event logs available in previous versions of Windows: Application, Security, and System event logs. It also includes two new logs: the Setup log and the ForwardedEvents log. Windows logs are intended to store events from legacy applications and events that apply to the entire system.

Applications and Services logs are a new category of event logs. These logs store events from a single application or component rather than events that might have system-wide impact. This new category of event logs is referred to as an application's crimson channel.

The Applications and Services logs category includes four subtypes: Admin, Operational, Analytic, and Debug logs. Events in Admin logs are of particular interest if you use event log records to troubleshoot problems. Events in the Admin log should provide you with guidance about how to respond to the events. Events in the Operational log are also useful, but may require more interpretation. Analytic and Debug logs aren't as user friendly. Analytic logs (which by default are hidden and disabled) store events that trace an issue, and often a high volume of events are logged. Debug logs are used by developers when debugging applications.

Exchange 2013 logs events to crimson channels in the Applications and Services logs area. You can view these channels by performing these steps:

1. Open Event Viewer.
2. In the console tree, navigate to Applications and Services Logs > Microsoft > Exchange.
3. Under Exchange, select a crimson channel: HighAvailability or MailboxDatabaseFailureItems.

The HighAvailability channel contains events related to startup and shutdown of the Microsoft Exchange Replication service, and the various components that run within the Microsoft Exchange Replication service, such as Active Manager, the third-party synchronous replication API, the tasks RPC server, TCP listener, and Volume Shadow Copy Service (VSS) writer. The HighAvailability channel is also used by Active Manager to log events related to Active Manager role monitoring and database action events, such as a database mount operation and log truncation, and to record events related to the DAG's underlying cluster.

The MailboxDatabaseFailureItems channel is used to log events associated with any failures that affect a replicated mailbox database.

CollectOverMetrics.ps1 script
Exchange 2013 includes a script called CollectOverMetrics.ps1, which can be found in the Scripts folder. CollectOverMetrics.ps1 reads DAG member event logs to gather information about database operations (such as database mounts, moves, and failovers) over a specific time period. For each operation, the script records the following information:

Identity of the database
Time at which the operation began and ended
Servers on which the database was mounted at the start and finish of the operation
Reason for the operation
Whether the operation was successful, and if the operation failed, the error details

The script writes this information to .csv files with one operation per row. It writes a separate .csv file for each DAG.

The script supports parameters that allow you to customize the script's behavior and output. For example, the results can be restricted to a specified subset by using the Database or ReportFilter parameters. Only the operations that match these filters will be included in the summary HTML report. The available parameters are listed in the following table.

CollectOverMetrics.ps1 script parameters

Parameter: Description

DatabaseAvailabilityGroup: Specifies the name of the DAG from which you want to collect metrics. If this parameter is omitted, the DAG of which the local server is a member will be used. Wildcard characters can be used to collect information from and report on multiple DAGs.

Database: Provides a list of databases for which the report needs to be generated. Wildcard characters are supported, for example, -Database:"DB1","DB2" or -Database:"DB*".

StartTime: Specifies the duration of the time period to report on. The script gathers only the events logged during this period. As a result, the script may capture partial operation records (for example, only the end of an operation at the start of the period or vice-versa). If neither StartTime nor EndTime is specified, the script defaults to the past 24 hours. If only one parameter is specified, the period will be 24 hours, either beginning or ending at the specified time.

EndTime: Specifies the duration of the time period to report on. The script gathers only the events logged during this period. As a result, the script may capture partial operation records (for example, only the end of an operation at the start of the period or vice-versa). If neither StartTime nor EndTime is specified, the script defaults to the past 24 hours. If only one parameter is specified, the period will be 24 hours, either beginning or ending at the specified time.

ReportPath: Specifies the folder used to store the results of event processing. If this parameter is omitted, the Scripts folder will be used. When specified, the script takes a list of .csv files generated by the script and uses them as the source data to generate a summary HTML report. The report is the same one that's generated with the -GenerateHtmlReport option. The files can be generated across multiple DAGs at many different times, or even with overlapping times, and the script will merge all of their data together.

GenerateHtmlReport: Specifies that the script gather all the information it has recorded, group the data by the operation type, and then generate an HTML file that includes statistics for each of these groups. The report includes the total number of operations in each group, the number of operations that failed, and statistics for the time taken within each group. The report also contains a breakdown of the types of errors that resulted in failed operations.

ShowHtmlReport: Specifies that the HTML-generated report should be displayed in a Web browser after it's generated.

SummariseCsvFiles: Specifies that the script read the data from existing .csv files that were previously generated by the script. This data is then used to generate a summary report similar to the report generated by the GenerateHtmlReport parameter.

ActionType: Specifies the type of operational actions the script should collect. The values for this parameter are Move, Mount, Dismount, and Remount. The Move value refers to any time that the database changes its active server, whether by controlled moves or by failovers. The Mount, Dismount, and Remount values refer to times that the database changes its mounted status without moving to another computer.

ActionTrigger: Specifies which administrative operations should be collected by the script. The values for this parameter are Admin or Automatic. Automatic actions are those performed automatically by the system (for example, a failover when a server goes offline). Admin actions are any actions that were performed by an administrator using either the Exchange Management Shell or the Exchange Administration Center.

RawOutput: Specifies that the script writes the results that would have been written to .csv files directly to the output stream, as would happen with write-output. This information can then be piped to other commands.

IncludedExtendedEvents: Specifies that the script collects the events that provide diagnostic details of times spent mounting databases. This can be a time-consuming stage if the Application event log on the servers is large.

MergeCSVFiles: Specifies that the script takes all the .csv files containing data about each operation and merges them into a single .csv file.

ReportFilter: Specifies that a filter should be applied to the operations using the fields as they appear in the .csv files. This parameter uses the same format as a Where operation, with each element set to $_ and returning a Boolean value. For example: {$_.DatabaseName -notlike "Mailbox Database*"} can be used to exclude the default databases from the report.

CollectOverMetrics.ps1 examples
The following example collects metrics for all databases that match DB* (which includes a wildcard character) in the DAG DAG1. After the metrics are collected, an HTML report is generated and displayed.

CollectOverMetrics.ps1 -DatabaseAvailabilityGroup DAG1 -Database:"DB*" -GenerateHTMLReport -ShowHTMLReport

The following examples demonstrate ways that the summary HTML report may be filtered. The first uses the Database parameter, which takes a list of database names. The summary report then contains data only about those databases. The next two examples use the ReportFilter option. The second example filters out all the default databases. The third example keeps only operations in which the database was active on a server matching ServerXYZ* at the start of the operation but not at the end.

CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -Database MailboxDatabase123,MailboxDatabase456

CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -ReportFilter { $_.DatabaseName -notlike "Mailbox Database*" }

CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -ReportFilter { ($_.ActiveOnStart -like "ServerXYZ*") -and ($_.ActiveOnEnd -notlike "ServerXYZ*") }
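Because a ReportFilter is an ordinary Where-style script block, it can be tried out against sample rows before it is passed to the script. The following sketch uses only the column names that appear in the examples above (DatabaseName, ActiveOnStart, ActiveOnEnd); the rows themselves are made-up illustrative data, and real .csv files produced by the script contain more columns.

```powershell
# Made-up sample rows using only the column names shown in the examples above.
$csv = @'
DatabaseName,ActiveOnStart,ActiveOnEnd
DB1,ServerXYZ1,ServerABC1
DB2,ServerXYZ1,ServerXYZ1
Mailbox Database 001,ServerXYZ2,ServerABC1
'@
$rows = $csv | ConvertFrom-Csv

# Combine both filters: skip default databases and keep only operations
# that started on a ServerXYZ* server and ended somewhere else.
$filter = {
    ($_.DatabaseName -notlike 'Mailbox Database*') -and
    ($_.ActiveOnStart -like 'ServerXYZ*') -and
    ($_.ActiveOnEnd -notlike 'ServerXYZ*')
}
$selected = @($rows | Where-Object $filter)
```

Once the script block selects the expected rows, the same block can be supplied to the ReportFilter parameter.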


CollectReplicationMetrics.ps1 script
CollectReplicationMetrics.ps1 is another health metric script included in Exchange 2013. This script provides an active form of monitoring because it collects metrics in real time, while the script is running. CollectReplicationMetrics.ps1 collects data from performance counters related to database replication. The script gathers counter data from multiple Mailbox servers, writes each server's data to a .csv file, and then reports various statistics across all of this data (for example, the amount of time each copy was failed or suspended, the average copy or replay queue length, or the amount of time that copies were outside of their failover criteria).

You can either specify the servers individually, or you can specify entire DAGs. You can either run the script to first collect the data and then generate the report, or you can run it to just gather the data or to only report on data that's already been collected. You can specify the frequency at which data should be sampled and the total duration to gather data.

The data collected from each server is written to a file named CounterData.<ServerName>.<TimeStamp>.csv. The summary report will be written to a file named HaReplPerfReport.<DAGName>.<TimeStamp>.csv, or HaReplPerfReport.<TimeStamp>.csv if you didn't run the script with the DagName parameter.

The script starts Windows PowerShell jobs to collect the data from each server. These jobs run for the full period in which data is being collected. If you specify a large number of servers, this process can use a considerable amount of memory. The final stage of the process, when data is processed into a summary report, can also be quite time consuming for large amounts of data. It's possible to run the collection stage on one computer, and then copy the data elsewhere for processing.

The CollectReplicationMetrics.ps1 script supports parameters that allow you to customize the script's behavior and output. The available parameters are listed in the following table.

CollectReplicationMetrics.ps1 script parameters

Parameter: Description

DagName: Specifies the name of the DAG from which you want to collect metrics. If this parameter is omitted, the DAG of which the local server is a member will be used.

DatabaseNames: Provides a list of databases for which the report needs to be generated. Wildcard characters are supported for use, for example, -DatabaseNames:"DB1","DB2" or -DatabaseNames:"DB*".

ReportPath: Specifies the folder used to store the results of event processing. If this parameter is omitted, the Scripts folder will be used.

Duration: Specifies the amount of time the collection process should run. Typical values would be one to three hours. Longer durations should be used only with long intervals between each sample or as a series of shorter jobs run by scheduled tasks.

Frequency: Specifies the frequency at which data metrics are collected. Typical values would be 30 seconds, one minute, or five minutes. Under normal circumstances, intervals that are shorter than these won't show significant changes between each sample.

Servers: Specifies the identity of the servers from which to collect statistics. You can specify any value, including wildcard characters or GUIDs.

SummariseFiles: Specifies a list of .csv files to generate a summary report. These files are the files named CounterData.<CounterData>* and are generated by the CollectReplicationMetrics.ps1 script.

Mode: Specifies the processing stages that the script executes. You can use the following values:
CollectAndReport: This is the default value. This value signifies that the script should both collect the data from the servers and then process them to produce the summary report.
CollectOnly: This value signifies that the script should just collect the data and not produce the report.
ProcessOnly: This value signifies that the script should import data from a set of .csv files and process them to produce the summary report. The SummariseFiles parameter is used to provide the script with the list of files to process.

MoveFilestoArchive: Specifies that the script should move the files to a compressed folder after processing.

LoadExchangeSnapin: Specifies that the script should load the Shell commands. This parameter is useful when the script needs to run from outside the Shell, such as in a scheduled task.

CollectReplicationMetrics.ps1 example

The following example gathers one hour's worth of data from all the servers in the DAG DAG1, sampled at one minute intervals, and then generates a summary report. In addition, the ReportPath parameter is used, which causes the script to place all the files in the current directory.

CollectReplicationMetrics.ps1 -DagName DAG1 -Duration "01:00:00" -Frequency "00:01:00" -ReportPath

The following example reads the data from all the files matching CounterData* and then generates a summary report.

CollectReplicationMetrics.ps1 -SummariseFiles (dir CounterData*) -Mode ProcessOnly -ReportPath
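When choosing Duration and Frequency values, it can be useful to estimate how many samples a run will produce for each server, since long runs with short intervals generate large files that take longer to process. The following is a rough sketch only: Get-ExpectedSampleCount is a hypothetical helper, and it assumes one sample per interval, which may not match exactly how the script schedules its sampling.

```powershell
# Hypothetical helper: estimate samples per server as duration / frequency.
# Assumes one sample per interval; the script's actual timing may differ.
function Get-ExpectedSampleCount {
    param(
        [Parameter(Mandatory = $true)] [timespan] $Duration,
        [Parameter(Mandatory = $true)] [timespan] $Frequency
    )
    if ($Frequency.TotalSeconds -le 0) { throw 'Frequency must be greater than zero.' }
    [int][math]::Floor($Duration.TotalSeconds / $Frequency.TotalSeconds)
}

# Same values as the first example above: one hour at one-minute intervals.
$perServer = Get-ExpectedSampleCount -Duration '01:00:00' -Frequency '00:01:00'
# A three-hour run sampled every 30 seconds.
$longRun = Get-ExpectedSampleCount -Duration '03:00:00' -Frequency '00:00:30'
```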
