Engineering White Paper

EMC CLARiiON Storage Solutions


Microsoft Exchange 2003 Best Practices
Storage Configuration Guidelines

Abstract

This white paper presents the latest storage configuration guidelines and best practices for Microsoft Exchange
on CLARiiON storage systems. It is focused on Exchange 2003, but most material also applies to Exchange
2000.

Published 8/8/2005

Copyright 2005 EMC Corporation. All rights reserved.


EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE
INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.

Part Number H1363



Table of Contents
Executive Summary............................................................................................ 5
Intended Audience.............................................................................................. 5
Introduction ......................................................................................................... 5
Environmental Parameters for Storage Design ........................................................................... 5
User Community Information.................................................................................................... 5
Backup/Recovery Requirements.............................................................................................. 6
Other Organizational Requirements or Constraints ................................................................. 7

Planning Storage for the Exchange Production Data...................................... 7


Exchange Storage Groups........................................................................................................... 7
How Many ESGs per Server? .................................................................................................. 7
How Many Databases per ESG? ............................................................................................. 8
How Many LUNs per ESG?...................................................................................................... 8
Calculating the Base I/O per User Requirement.......................................................................... 9
Calculating the IOPS Requirement for an Exchange Environment ........................................... 10
RAID Types and the Read/Write Ratio ...................................................................................... 10
Other Factors That May Impact I/O ........................................................................................... 11
Calculating the Capacity Requirement for Database LUNs....................................................... 12
Choosing a RAID and Disk Type ............................................................................................... 13
Comparing RAID 1/0 to RAID 5.............................................................................................. 14
Comparing 10K rpm to 15K rpm............................................................................................. 16
Comparing 73 GB, 146 GB, and 300 GB ............................................................................... 16
Capacity Check ...................................................................................................................... 16
Summary ................................................................................................................................ 17
MetaLUNs .................................................................................................................................. 17
Building Blocks ....................................................................................................................... 18
Log LUN Configuration .............................................................................................................. 18
Additional Storage Considerations for the Exchange Production Data ..................................... 19
Public Folders......................................................................................................................... 19
SMTP Queue.......................................................................................................................... 20
Keeping EDB Files and STM Files Together ......................................................................... 20
Smaller Exchange Environments ........................................................................................... 20

Planning Storage for Local Recovery ............................................................. 21


SnapView for Disk-Based Replication ....................................................................................... 21
Clone-Based Replication........................................................................................................ 21
Snapshot-Based Replication .................................................................................................. 22
Comparison of Local Replication Options .............................................................................. 23
Online Backup to Disk................................................................................................................ 23
Recovery Storage Groups ......................................................................................................... 24

Planning Storage for Local Message Archiving............................................. 24


Storage-System Considerations...................................................................... 25
iSCSI Guidelines........................................................................................................................ 26
CLARiiON Storage Systems Comparison ................................................................................. 27

Putting It All Together ...................................................................................... 28


Consider Site-Specific Constraints......................................................................................... 28
Configure the Cleanest Looking Layout Diagram .................................................................. 28
Plan Throughout for Operational Resiliency .......................................................................... 28
Validate the Design ................................................................................................................ 29

Additional Recommendations for Optimal Performance .............................. 29


Storage-System Tuning ............................................................................................................. 29
Exchange Server and Windows Environment ........................................................................... 30
Windows File-System Alignment ............................................................................................... 30

Conclusion ........................................................................................................ 31
Appendix A: Storage Design Examples.......................................................... 32
Example 1 .................................................................................................................................. 32
RAID-Adjusted Back-end Disk IOPS Calculation................................................................... 32
Capacity Calculation with 200 MB Mailboxes (73 GB Drives) ............................................... 32
Capacity Calculation with 400 MB Mailboxes (73 GB Drives) ............................................... 32
Matching Up the IOPS and Capacity Requirements .............................................................. 33
Example 2 .................................................................................................................................. 33
Capacity Calculation with 200 MB Mailboxes ........................................................................ 34
Matching Up the IOPS and Capacity Requirements .............................................................. 34
Example 3 .................................................................................................................................. 34
Calculations ............................................................................................................................ 35

Appendix B: Quantifying Exchange User Activity ......................................... 39


Determining the Peak Activity Period ........................................................................................ 39
Measuring IOPS per User.......................................................................................................... 40
Read/Write Ratio........................................................................................................................ 40
Performance Counter Guidelines............................................................................................... 40

Appendix C: Additional Resources ................................................................. 41


EMC White Papers .................................................................................................................... 41
Microsoft White Papers.............................................................................................................. 41


Executive Summary
This white paper proceeds through the step-by-step process of planning the storage layout for Exchange
data on an EMC CLARiiON storage system. It offers considerations and the latest best practice
recommendations along the way. The approach taken is to design a layout that meets the following goals:

- Optimal performance: During peak periods, user response times are still acceptable and there is no
  buildup of mail queues.

- Efficient backup and rapid recovery: Backups complete within the allotted window, with an
  acceptable impact on the production environment. Local recovery meets the Service Level Agreement
  (SLA) requirement.

- Simplicity of design: The resulting configuration is straightforward to implement and easy to
  manage.
In addition to recommendations for production data storage layout, the paper includes considerations for
configuring CLARiiON storage for Exchange backup, local replication, and archiving.

Intended Audience
The intended audience for this white paper is system engineers who have customers interested in
implementing Microsoft Exchange using EMC CLARiiON Fibre Channel storage.
The reader should have a general knowledge of Microsoft Exchange and Windows technology, as well as
an understanding of basic CLARiiON features and terminology.

Introduction
It's said that if you ask 10 consultants to architect an Exchange storage design for an organization, you will
get back 10 (or more) different design proposals. CLARiiON storage systems offer a great deal of
flexibility with various combinations of RAID and disk types. Many Exchange storage configurations will
work well on CLARiiON systems, but some may cause unforeseen bottlenecks. As new versions of
Exchange are released and as CLARiiON array technology advances, best practice information for
Exchange storage design is constantly evolving.

Environmental Parameters for Storage Design


Several factors figure into the storage design for an Exchange environment. The more you know about an
organization's use of the existing messaging system and the better defined the requirements are for the new
implementation, the closer you should be able to come to constructing a useful design.
This section describes data items to gather. Whenever possible, it is always valuable to have this data
supported by empirical measurements (concurrent users, IOPS, read/write ratio, log files/day, etc.) from the
current environment.

User Community Information


The information gathered in this category should lead to a good estimate of the I/O profile for a set of users
over time. It should also lead to determining the users' storage requirements. Appropriate tools and
counters for quantifying the following information are included in Appendix B:

- How many total users?
  - Today
  - Anticipated growth over the next few years

- How many concurrent users during the peak period?

- How many mailboxes not associated with an individual user (such as a central help desk mailbox)?


- What mail client is used?
  - Outlook (2003 cached mode, or other)
  - Outlook Web Access
  - Mobile devices (Blackberry)

- When are the peak activity periods?

- What is the typical working day?

- Is there geographic dispersal of the users across time zones?

- What is the Exchange activity level of the users?
  - Categorization of the user types leads to estimated base IOPS demand (see Table 1 on page 9)
  - Measured I/O in the existing environment gives the best starting point

- Are there special-category users, with different security, performance, or backup/recovery
  requirements?

- What are the mailbox size limits?

- Is there anything else pertinent that helps to describe the user profile for this organization?
  - Heavy use of personal folders?
  - Do users often send large documents?
  - Integrated use of voice mail?
  - Considerable use of Outlook 2003 shared folders?

- What are the characteristics of public folder usage?
  - Size of the public store
  - Replication activity among public stores

Backup/Recovery Requirements
The choice of backup and recovery method will play an important part in the resulting storage design.
Once again, measurement of the existing environment will provide the best starting point for the new
design.

- What is the deleted item retention period?
  This is the time period that Exchange maintains an item after a user has deleted it, typically 10 to 30
  days.

- What is the chosen backup method?
  - Backup to disk using standard Exchange online backup
  - Backup to tape using standard Exchange online backup
  - Clone-based replication (with archival backup to disk or tape):
    Uses an EMC application such as Replication Manager SE (RMSE) or Replication Manager (RM)
    to create physical copies of the Exchange data on CLARiiON clones (BCVs). Archival backup to
    disk or tape performs an offline copy of the Exchange data files (databases and logs) from the
    BCV.
  - Snapshot-based replication (with archival backup to disk or tape):
    Uses an EMC application such as RMSE or RM to create copies of the Exchange data on
    CLARiiON snaps. Archival backup to disk or tape performs an offline copy of the Exchange data
    files (databases and logs) from the snapshot.

- What is the timing of the backup activity?

- What are the requirements (service-level agreement) for recovery?

- Is a distance replication or disaster recovery solution planned?
  - DR site distance
  - Network connection


Other Organizational Requirements or Constraints


This category covers any additional pertinent factors that have already been decided, or have been added as
a requirement.

- What are the type, number, and location of Exchange servers?

- Are the Exchange servers clustered?

- What is the planned Exchange front-end/back-end server layout?

- What is the SAN/network structure?

- Is there an existing CLARiiON storage system?

- What other software will be operating in the Exchange environment?
  - Antivirus
  - E-mail archiving solution
  - Applications integrated with Exchange (e.g., for workflow)
  - Exchange-integrated third-party tools (e.g., for mailbox recovery or enhanced indexing)

Planning Storage for the Exchange Production Data


This section provides guidelines for configuring storage to handle the production Exchange data. When
designing a storage configuration for Exchange, the first disk measurement to consider must be the I/O
operations per second (IOPS) that the Exchange environment requires. Once you have calculated the
number of drives necessary to meet the I/O demand, then determine the capacity requirement and adjust the
drive count upward if necessary.

Exchange Storage Groups


The Exchange storage group (ESG) is the fundamental unit for layout planning. When backing up
Exchange, the elements of an ESG should be treated together.

How Many ESGs per Server?


In the past, Microsoft recommended that, to make efficient use of CPU and memory, the administrator
fill up an ESG with users before creating additional Exchange storage groups. However, improvements in
Exchange (starting in Exchange 2000 SP2) allow the use of Exchange's maximum of four production
storage groups without running out of system resources. Because there is a single set of log files for all
databases within one storage group, there are advantages to configuring more ESGs on a server even when
you don't have to. With Exchange 2003, in most cases it will be best to use all four storage groups.
Following are some considerations for the various ESG configuration options.

Using All Four ESGs
- Works well with most servers deployed today: at least dual processor with 1 GB of memory.

- Offers the best granularity for performance. Using multiple Exchange storage groups results in more
  log operations in parallel. Since database performance depends on log file performance, increasing the
  number of logs can increase overall performance.

- Offers the most granular management. Using four storage groups allows maintenance and recovery
  operations that are fully compartmentalized and affect the fewest users possible.

- Allows for fully separate treatment of a set of users with different performance, security, or
  backup/recovery requirements.

Using the Fewest ESGs Possible
- Makes the most efficient use of server resources and thus is better for underpowered servers. This has
  become a less significant (potentially negligible) factor with newer servers and Exchange 2003.

- Fewer ESGs may be easier to manage.

- This comes at the expense of lost flexibility and added exposure: downtime or data loss caused by the
  loss of a particular database or storage group affects more users.

Using Two or Three ESGs
- If a server does not have the resources to handle four ESGs, spreading users across an extra storage
  group or two can be a compromise.

- Appropriate for a smaller number of mailboxes on the server (~1,500 or less).

- Some organizations may prefer to reserve the fourth storage group for growth, low-risk testing, or
  some recovery scenarios.

- In some cases, the optimal disk layout may align better with two or three ESGs for performance or
  capacity reasons.

How Many Databases per ESG?


There can be up to five databases in each Exchange 2000 or 2003 storage group. Following are some
considerations for database configuration.
Using Five Databases
- A particular database can become logically corrupt. With five databases, logical problems on one
  database will affect the data of the fewest people.

- For some backup methods, a single database can be restored without affecting the users in other
  databases in the ESG. Thus database recovery affects the fewest people.

- Makes for databases of the smallest possible size. If you are allowing space for offline
  defragmentation, you can keep all databases on the same LUN, but need to allow defrag space for the
  size of just one database (the largest).

Using Fewer Than Five Databases
- Administration may be a little easier when operating manually.

- This is one way of reserving for growth.

- Single-instance store is on a per-database basis. Spreading users across fewer databases may provide
  storage and performance benefits from single-instance storage (for example, when sending a large
  document to a large distribution list).

- In some cases, the optimal disk layout may align better with three or four databases for performance or
  capacity reasons.

How Many LUNs per ESG?

In most cases, two LUNs should be allocated for each ESG: one for the transaction logs, and one for the
databases (EDB and STM files).
The reasons for placing all ESG databases together, and some considerations for variations, are described in
the next sections.

Maintaining All Databases on the Same LUN
- Less complexity for backup/recovery and disk configuration.

- Offers the optimal capacity utilization for offline database defragmentation.

- Allows growth space to be shared among any of the databases.

- Best performance for volume shadow copies. Fewer LUNs provide additional safety margin for
  completing VSS split operations within the allotted 10-second time window.


Distributing Databases on Multiple LUNs
- Allows for more granular restore from clones (but during recovery, all databases in the ESG must be
  down anyway because of the resetting of the checkpoint file).

- In the case of very large mailboxes, LUN sizes can approach 1 TB or more. Splitting the ESG
  databases across two or three LUNs can improve clone synchronization times and the performance of
  incremental SAN Copy from a clone.

Calculating the Base I/O per User Requirement

The best way to provide enough I/O to your application, especially in a large Exchange environment, is to
know your users' usage profile. Sizing of the storage infrastructure should be based on a careful analysis of
the number of current and anticipated users, and their messaging habits and patterns. The fundamental
calculation concerns I/Os per second (IOPS) per user.
Table 1 describes four Exchange user categories and provides an estimate of the IOPS demand per user for
each.[1]
Table 1. Exchange User Profiles

Typical User Profile   Description                                  Mailbox Size   Expected I/Os per Second
Light POP3             Hosted Internet mail                         < 25 MB        0.08
Light MAPI             Infrequent e-mail access; small mailboxes    < 50 MB        0.18
Typical/Moderate       Constant e-mail access; office worker        75-100 MB      0.4
Heavy                  Active e-mail access; e-mail business        100-200 MB     0.75
                       processes; high-tech workers

Once you have defined user profiles, you can calculate the total IOPS required by multiplying each user by
their predicted use of IOPS. It's best to measure what the user IOPS are now. IOPS per user should be
approximately the same when migrating from version 5.5 or 2000 to Exchange 2003. The slight IOPS
decrease seen in equivalent transactions between Exchange 2000 and 2003 is typically more than offset by
the gradual growth of users' use of the messaging system.
There are other factors that increase the expected IOPS/user, including:

- Very active Exchange servers (>2,000 users)

- Large mailboxes (>200 MB): Start with a rough estimate that for each doubling of the mailbox size
  over 100 MB, you increase the IOPS per user by about one third.

- Regularly sending very large documents (>5 MB): Integrated voice mail is roughly equivalent to large
  documents.

- Blackberry client users: Count each Blackberry user as the equivalent of two to three typical users.

- Journaling: This adds significant overhead. Start with an estimate of double the I/O.

[1] IOPS numbers referenced from the Microsoft Exchange Server 2003 Performance and Scalability Guide
white paper.
A typical read/write ratio in Exchange is from 2:1 to 3:1. A ratio lower than this (higher percentage of
writes) will also increase the IOPS/user on RAID storage. This is discussed in RAID Types and the
Read/Write Ratio on page 10.
Organizations typically use many mailboxes in Exchange that are not associated with an individual user.
These mailboxes often serve as the central contact point for a group or a conference room, or are used by
integrated applications. Depending on the number of these unassociated mailboxes (it can be
significant, 10 percent or more) and their activity level (which can vary significantly), it may be
appropriate to factor these into the IOPS calculation. By default, treat these mailboxes as equivalent to the
typical user mailboxes, and always include them in the capacity calculation.
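
Taken together, these adjustment factors lend themselves to a quick estimating helper. The following
Python sketch is illustrative only (the function and its parameters are hypothetical, not from any EMC or
Microsoft tool) and simply encodes the rules of thumb above:

```python
from math import log2

def adjusted_iops_per_user(base_iops, mailbox_mb=100, journaling=False,
                           blackberry_factor=1.0):
    """Rough per-user IOPS estimate using this section's rules of thumb."""
    iops = base_iops
    # Each doubling of the mailbox size over 100 MB adds about one third.
    if mailbox_mb > 100:
        iops *= 1 + log2(mailbox_mb / 100) / 3
    # A Blackberry user counts as roughly two to three typical users.
    iops *= blackberry_factor
    # Journaling: start with an estimate of double the I/O.
    if journaling:
        iops *= 2
    return iops

# A heavy user (0.75 base IOPS, Table 1) with a 200 MB mailbox:
print(adjusted_iops_per_user(0.75, mailbox_mb=200))  # ~1.0 IOPS
```

The sample call reproduces the 1 IOPS peak figure used for heavy users in the worked example below.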

Calculating the IOPS Requirement for an Exchange Environment


This is a key step in configuring CLARiiON storage for good Exchange performance. It is a best practice
to configure dedicated disk drives for the Exchange databases. Calculate the periods of highest I/O demand
during the day by looking at the anticipated cumulative effect of user activity, system activity (virus
checkers), and background activity (local or remote replication). Balance the I/O where possible with
scheduling (backup during off-peak hours) and even distribution of users across ESGs. Then, plan a design
that will handle the resulting peak I/O load.
Start with the measurement or estimate of the I/O profile of the Exchange users in the organization. Plan
for the peak user load time, typically mid-morning on Monday.
An example with 3,000 Exchange users, separated evenly into four storage groups, runs through the set
of calculations in the next few sections. The example is highlighted in a series of text boxes.

- 1,000 heavy users at a peak of 1 IOPS each [1,000 IOPS]

- 2,000 typical users at a peak of .5 IOPS each [1,000 IOPS]

- A general read/write ratio of 2:1

- Maximum concurrency of active users at 90%

This results in a requirement of [(1000 + 1000) x .9] = 1,800 host-based IOPS for the 3,000 users
during their peak activity period.
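
The same arithmetic, as a minimal sketch for checking the numbers (variable names are illustrative):

```python
# Peak host-based IOPS for the running example: user counts times per-user
# IOPS, scaled by the expected concurrency of active users.
heavy_users, heavy_iops = 1000, 1.0
typical_users, typical_iops = 2000, 0.5
concurrency = 0.90

host_iops = (heavy_users * heavy_iops + typical_users * typical_iops) * concurrency
print(round(host_iops))  # 1800 host-based IOPS at peak
```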

RAID Types and the Read/Write Ratio

Depending on the particular organizational requirements, there are two RAID type options that can be
appropriate for production Exchange database LUNs:

- RAID 1/0: This offers the best performance with high protection, but only 50 percent of the RAID
  group capacity is usable. It is frequently recommended because, with today's larger disk drives, it
  provides sufficient space across the number of spindles required for handling peak I/O load. On a
  RAID 1/0 LUN, there are two physical I/O operations for each write requested (a write to each
  mirrored disk), described as a write penalty of two.

- RAID 5: This configuration offers a higher usable capacity per RAID group than RAID 1/0. It can
  be effective for environments with very large mailboxes and/or lower IOPS requirements. However, in
  a RAID 5 group there are four physical I/O operations for each write requested (two reads to calculate
  parity, one write for data, and one write for parity).
Regardless of the RAID type chosen, it is important to configure enough drives to handle the I/O demand.
Table 2. Write Penalty by RAID Type

RAID Type                         Write Penalty
RAID 1/0 (Striping + Mirroring)   2
RAID 5 (Striping + Parity)        4
Use this formula to adjust the base user I/O requirements in the ESG by figuring in the write penalty for
each RAID group:

(Base IOPS x Read %) + (Base IOPS x Write % x Write Penalty) = RAID-Adjusted Back-End IOPS

Carrying on the example above, with 1,800 total base IOPS and a 2:1 read/write ratio:
For RAID 1/0:
(1800 x 2/3) + (1800 x 1/3 x 2) = 1200 + 1200 = 2400 IOPS
For RAID 5:
(1800 x 2/3) + (1800 x 1/3 x 4) = 1200 + 2400 = 3600 IOPS
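
As a sketch in code, assuming only the formula above and the Table 2 penalties (the helper name is
illustrative):

```python
def raid_adjusted_iops(base_iops, read_fraction, write_penalty):
    """Apply the RAID write penalty to a host-based IOPS figure."""
    reads = base_iops * read_fraction
    writes = base_iops * (1 - read_fraction)
    return reads + writes * write_penalty

# A 2:1 read/write ratio means two-thirds reads; penalties are from Table 2.
print(raid_adjusted_iops(1800, 2/3, 2))  # RAID 1/0: ~2400 back-end IOPS
print(raid_adjusted_iops(1800, 2/3, 4))  # RAID 5:   ~3600 back-end IOPS
```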

Other Factors That May Impact I/O

There are other administrative operations that may impact I/O. Several of these should be scheduled to
take place only during off-peak times (see the following examples). This additional I/O activity must be
accounted for by increasing the IOPS capacity by some amount over the RAID-adjusted IOPS requirement.
To estimate the amount of I/O overhead these additional activities will cost, it is best to perform tests in an
environment that matches the target production environment as closely as possible. The less sure you are of
this overhead, the more capacity you should assign to ensure good performance.

Examples of Background I/O Activity That Cannot Be Scheduled to Off-Peak Times
- High load on the server: The more active mailboxes a server is managing and the less memory it has,
  the less likely any particular user's mailbox will be cached. If this has not been figured into the
  per-user IOPS requirement, it should be figured in here. For example, going from 2,500 to 4,000 users
  on a system can increase the IOPS per user by about 10 percent.

- Server-based antivirus protection: Besides the extra reads, antivirus software can add 20 percent or
  more to the CPU utilization of the Exchange server.

- Integrated features and applications: The impact here depends on the number and type of any
  integrated features (such as content indexing) or applications (such as workflow), and the amount of
  their use.

- Synchronous or asynchronous mirroring: If a mirroring solution is used for distance replication or
  disaster recovery, it should be factored in very carefully.

The total I/O demand during peak user load can be calculated by adding the cumulative overhead of the
background activity to the RAID-adjusted IOPS requirement. This overhead is often calculated simply as
an added percentage, but some activities (such as virus checkers) involve primarily reads, in which case it's
more accurate to ignore the write penalty in the calculation.
RAID-Adjusted User IOPS + I/O Overhead = IOPS Requirement at Peak User Activity
For example, if operating on an active 3,000-user Exchange server running a frequently used
workflow application, it would be reasonable to start with an estimated overhead percentage of 20
percent. Continuing the example to calculate the IOPS requirement during peak user activity:
RAID 1/0:
2400 + 20% = 2880 IOPS
RAID 5:
3600 + 20% = 4320 IOPS
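
Continuing the sketch, the overhead is applied as a straight percentage on top of the RAID-adjusted
figure:

```python
# Peak IOPS requirement: RAID-adjusted user IOPS plus estimated background
# overhead (20 percent in the running example).
overhead = 0.20
print(round(2400 * (1 + overhead)))  # RAID 1/0: 2880 IOPS
print(round(3600 * (1 + overhead)))  # RAID 5:   4320 IOPS
```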

There are other schedulable activities that can significantly increase I/O demand on the Exchange LUNs.
You should understand their impact. It is important to factor these activities into the peak I/O requirements
if they are scheduled to take place during a period of high user activity.
Examples of Schedulable Activities
- Online backup to disk or tape:
  This places heavy read activity on the production LUNs. There is added overhead if the Exchange
  server is used to manage the backup.

- Local clone-based replication:
  Clone-based replication using an application such as RMSE involves synchronizing all source
  Exchange LUNs in the ESG to their clones. During the incremental synchronization part of the
  process, there is heavy back-end read activity against the production LUNs.
  Once the copy is complete on the clones, Eseutil performs an integrity check on each page of the
  database replicas. The validated copy may also then be archived or copied (via SAN Copy) to a
  remote site. These checking and archiving functions cause high read activity, but it is to the clones
  rather than the production LUNs. As long as best practice is followed and the clone LUNs are on
  separate drives from production LUNs, this heavy read activity comes into calculations only in terms
  of backup scheduling and use of overall array resources. A single Eseutil process can use up to 30
  percent of the CPU of one SP.

- Local snap-based replication:
  Snaps will have a more significant impact on production Exchange LUNs than clone-based replication.
  If snaps are chosen as a backup method, it is particularly important to factor in the associated impact.
  This is discussed further in the section Snapshot-Based Replication on page 22.

- Exchange online maintenance:
  By default, Exchange schedules online database maintenance for a 4-hour period nightly to perform
  functions that include clearing out deleted mailboxes and deleted items that have gone past their
  retention period, plus online defragmentation. The timing and duration can be adjusted.
  Because of the heavy I/O that online maintenance adds, schedule it to take place during the period of
  lightest activity. At the beginning of online maintenance, Exchange performs an Active Directory
  lookup for each user in the database. Slightly offsetting the online maintenance start times of the
  databases will reduce the impact of these searches on the Active Directory.
  You cannot perform a backup on a database at the same time it is undergoing online maintenance.
  Maintenance will pause until the backup job for that database completes, but it will not extend
  operation past its allotted time window. Take this into consideration to ensure that online maintenance
  gets enough time each day.
In summary, take additional I/O activity into consideration when calculating the anticipated demand. Then,
design to accommodate the peak I/O load with that overhead factored in.

Calculating the Capacity Requirement for Database LUNs


Usually the I/O requirements will determine the number of drives required, but it is also necessary to know
the space requirement for the databases in the storage group. For certain environments (such as large
mailboxes or low IOPS per user) the storage capacity requirements may call for more disk space than
performance needs dictate. Comparatively, this is an easy calculation.
For each category of users in the ESG, multiply the maximum allowed mailbox size by the number of users
in that category. If the number of users is planned to grow, factor that in here. Allow an additional
percentage of the database size for deleted item retention (~10 percent for a typical 30-day retention
period). Because it's important not to run out of space on the LUN, allow an additional buffer of 10 to 20
percent of the sum of the databases on the LUN.

Offline Defragmentation
With Exchange 2003, in most cases it is no longer necessary to perform offline defragmentation of the
databases. Normal online maintenance will defragment the database, but it will not compact the size of the
file. The only way to actually shrink the database size is to perform offline defragmentation, where the
database is dismounted and Eseutil is used to rebuild a new copy. To have the shortest time offline, this
rebuild is performed on the same LUN as the source. There must be free space equaling at least 110 percent
of the size of the database for the rebuild. If multiple databases are stored on the same LUN, enough space
must be allowed to handle the rebuild of the database with the largest size.
Summarizing the Calculation

Space Required for the ESG Database LUNs =
    Maximum Mailbox Size x Number of Mailboxes
  + Extra space for deleted item retention
  + Public folder space (if part of the ESG)
  + 10% to 20% free space for growth protection
  + Space for offline defragmentation, if required

For example, with 750 users in one ESG, distributed evenly across five databases but all on the same
LUN (no offline defrag space included):

250 heavy users @ 100 MB mailbox        25 GB
500 typical users @ 75 MB mailbox       37.5 GB
Sum requirement for mailboxes           62.5 GB
Add 10% for deleted item retention      62.5 x 1.1 = 68.8 GB
Size of each database                   68.8 / 5 = 13.8 GB
Add 15% for extra free space            68.8 x 1.15 = 79.1 GB

To include space for offline defragmentation, add space equivalent to the size of the largest database.

With space for offline defragmentation
(extra free space is already available)   79.1 + 13.8 = 92.9 GB

When four or five databases are planned for each storage group, deleted item retention is typical, and space
is required for offline defragmentation, you can perform a quick capacity calculation by simply allocating
free space equal to 50 percent of the mailbox total (62.5 x 1.5 = 93.75).
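
The same arithmetic in a short Python sketch (names are illustrative; the calculation follows the paper's
decimal MB-to-GB convention):

```python
# ESG database LUN capacity for 750 users in five databases on one LUN.
mailbox_mb = 250 * 100 + 500 * 75        # 62,500 MB of mailbox quota
with_retention = mailbox_mb * 1.10       # +10% for deleted item retention
largest_db = with_retention / 5          # five equal databases per ESG
with_growth = with_retention * 1.15      # +15% extra free space
with_defrag = with_growth + largest_db   # room to rebuild the largest database

print(round(with_defrag / 1000, 1))      # ~92.8 GB (92.9 above, which uses
                                         # rounded intermediate values)
```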

Choosing a RAID and Disk Type


Regardless of RAID type, a physical disk can handle a certain number of Exchange-style IOPS. The IOPS
capacity of disks continues to improve with new disk models, but the performance improvement has not
kept pace with the increase in storage capacity. Consequently, most Exchange disk configurations today
are determined by I/O requirements rather than capacity.
There is not consistent agreement on the IOPS capability of a disk drive. Although some sequential I/O
tests have indicated that a CLARiiON 10K rpm drive can perform at a speed greater than 300 IOPS, a more
practical value to use with the Exchange 4 KB random I/Os on CLARiiON has been determined to be in the
vicinity of 130 IOPS. Similarly, a practical value to use for 15K rpm drives is 180 IOPS.


Table 3. IOPS per Spindle

Disk rpm   CLARiiON Disk I/O Capacity with Exchange Databases
10K        130 IOPS
15K        180 IOPS

Using these values, you can divide the RAID-adjusted IOPS requirement by the per-spindle figure to
construct a table indicating the number of drives needed to handle the I/O demand of the ESG. Although in
the end you may be required to round up the number of drives to meet RAID 1/0 or RAID 5 requirements,
the following table leaves the number as calculated (rounded up to the next whole drive).

RAID-Adjusted IOPS / IOPS per Disk = Drive Count for the Exchange Database LUNs
Continuing the example, calculating the drive requirement for the RAID 1/0 IOPS total (2880) and the
RAID 5 IOPS total (4320):

Table 4. Disk-Drive Requirements by RAID Type and Disk Speed - All Users

          RAID 1/0        RAID 5
10K rpm   23 (2880/130)   34 (4320/130)
15K rpm   16 (2880/180)   24 (4320/180)
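
A brief sketch of the division behind Table 4, rounding up to whole drives:

```python
from math import ceil

# Drive counts: RAID-adjusted IOPS divided by the per-spindle figures from
# Table 3, rounded up to the next whole drive.
per_disk = {"10K": 130, "15K": 180}
for raid, iops in (("RAID 1/0", 2880), ("RAID 5", 4320)):
    for rpm, disk_iops in per_disk.items():
        print(f"{raid} on {rpm} rpm: {ceil(iops / disk_iops)} drives")
# RAID 1/0: 23 (10K) and 16 (15K); RAID 5: 34 (10K) and 24 (15K)
```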

Since it's advisable to lay out Exchange databases on dedicated drives, the most straightforward design is
to dedicate one or more RAID groups for each ESG.
If the 3,000 users are spread across four storage groups, applying the example to some dedicated RAID
configurations:

Table 5. Disk-Drive Requirements by RAID Type and Disk Speed - Per ESG

          RAID 1/0   RAID 5
10K rpm   6 (3+3)    10 (two 4+1 groups) or 9 (a 4+1 and a 3+1)
15K rpm   4 (2+2)    6 (5+1)

This assumes that the I/O requirements for each storage group are the same. In practice, some fine tuning
is often required because of varying user counts and activity level.

Comparing RAID 1/0 to RAID 5

It is an accepted notion that RAID 1/0 is a better choice for random-write environments like Exchange. The
effect is somewhat subtle; since all writes hit the write cache, RAID 1/0 and RAID 5 groups perform
equally well until the storage system is sufficiently busy and the write cache becomes saturated. The
advantage of using RAID 1/0 rather than RAID 5 with Exchange is that RAID 1/0 groups can flush the
cache of Exchange's random write load about 15 percent to 30 percent faster than RAID 5 groups. This
equates to cache-speed performance at a higher random-write load. Additionally, rebuild times and rebuild
impact are reduced with RAID 1/0 in the event of disk failures (see Table 6, reproduced from the CLARiiON
Best Practices for Fibre Channel Storage white paper).


Table 6. RAID Types and Relative Performance in Failure Scenarios

RAID Type   Rebuild IOPS Loss   Rebuild Time                      Impact of Second Failure during Rebuild
RAID 5      50%                 15% to 50% slower than RAID 1/0   Loss of data
RAID 1/0    20% - 25%           15% to 50% faster than RAID 5     Loss of data 14% of the time in an
                                                                  eight-disk group (1/[n-1])
RAID 1*     20% - 25%           15% to 50% faster than RAID 5     Loss of data

* RAID 1 is not a recommended RAID type for Exchange database LUNs, but it may be appropriate for some smaller
log LUN configurations.
The downside may be cost. RAID 1/0 requires more drives for a given capacity, but that extra capacity may
not be valuable when the spindle count is determined by I/O throughput. Consider the following
calculation to determine the number of users that one physical drive can handle:
User IOPS per Drive = IOPS per Drive x Host-Based IOPS / Back-End IOPS
The ratio of host-based IOPS to back-end IOPS depends on the read/write ratio and the RAID type. The
easiest way to get this figure is to add the left and right sides of the read/write ratio and divide that sum
by (read ratio + write ratio x write penalty).
RAID 1/0, User IOPS per Drive with a 3:1 read/write ratio on a 10K drive:
130 x (3 + 1) / (3 + 1 x 2) = 130 x 4 / 5 = 104 IOPS
RAID 5, User IOPS per Drive with a 3:1 read/write ratio on a 10K drive:
130 x (3 + 1) / (3 + 1 x 4) = 130 x 4 / 7 = 74 IOPS
Users per Drive = User IOPS per Drive / IOPS per User
RAID 1/0, .8 IOPS per user:
104 / .8 = 130 users per drive
RAID 5, .8 IOPS per user:
74 / .8 = 92 users per drive
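
These figures can be generated with a small helper; the following sketch simply encodes the two formulas
above (the function name is illustrative):

```python
def user_iops_per_drive(disk_iops, read_ratio, write_ratio, write_penalty):
    """Host-visible IOPS one spindle supports, given the write penalty."""
    return disk_iops * (read_ratio + write_ratio) / (
        read_ratio + write_ratio * write_penalty)

# 10K drives (130 IOPS), 3:1 read/write ratio, .8 IOPS per user.
for label, penalty in (("RAID 1/0", 2), ("RAID 5", 4)):
    user_iops = user_iops_per_drive(130, 3, 1, penalty)
    print(f"{label}: {user_iops:.0f} user IOPS per drive, "
          f"{user_iops / 0.8:.0f} users per drive")
# RAID 1/0: 104 and 130; RAID 5: 74 and 93 (the text above rounds 74.3 down
# to 74 before dividing, giving 92 users per drive).
```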
Table 7 uses these calculations for six drives, configured as RAID 1/0 and RAID 5, with a few different
mailbox sizes, allowing 50 percent free space on the LUN for deleted item retention, growth, and offline
defragmentation.
Table 7. Capacity Comparison for Six 73 GB Drives by RAID Type and Mailbox Size

                                                   Required Capacity
                     Usable     Users @      75 MB       150 MB      250 MB
                     Capacity   .8 IOPS      Mailboxes   Mailboxes   Mailboxes
RAID 1/0 (3+3) 10K   198 GB     780 (130x6)  88 GB       176 GB      293 GB
RAID 5 (5+1) 10K     330 GB     552 (92x6)   62 GB       124 GB      207 GB

Note that once the mailbox size gets much over 150 MB, there is not enough space on the RAID 1/0 group
to accommodate the number of users whose I/O it can handle. This does not take into consideration the
fact that the increased mailbox size causes the IOPS per user to increase.
RAID 5 becomes more appropriate when the capacity requirements are high, relative to the I/O demand,
such as when the I/O demand is low (< .4 IOPS) or the mailbox limit is high (>250 MB).
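
The required-capacity columns of Table 7 are the supported user count times the mailbox limit, plus the
50 percent free-space allowance. A sketch of that arithmetic (illustrative names, decimal GB):

```python
def required_capacity_gb(users, mailbox_mb, free_fraction=0.5):
    """Mailbox quota plus free space for retention, growth, and defrag."""
    return users * mailbox_mb * (1 + free_fraction) / 1000

for label, users in (("RAID 1/0 (3+3)", 780), ("RAID 5 (5+1)", 552)):
    caps = [round(required_capacity_gb(users, mb), 1) for mb in (75, 150, 250)]
    print(label, caps)
# RAID 1/0 (3+3): [87.8, 175.5, 292.5] -> rounded in Table 7 to 88/176/293 GB
# RAID 5 (5+1):   [62.1, 124.2, 207.0] -> 62/124/207 GB
```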


Comparing 10K rpm to 15K rpm

Another choice for buying performance is the 15K rpm drive. The 15K rpm drive offers up to 30 percent
better performance than 10K rpm drives in the kind of random-access workload that Exchange presents. The
increased speed helps the write cache avoid saturation and keeps writes going at cache speeds. This does
not apply to the sequential-access log devices, where the two drive speeds perform about the same.

Comparing 73 GB, 146 GB, and 300 GB


Smaller drives offer more performance per gigabyte. However, the performance/gigabyte curve drops as
larger drives are deployed. Base the disk-size decision for production Exchange drives on:

- The I/O capacity of the drive (the number of users it will support, when averaged across the RAID
  group).

- The average maximum mailbox size for those users.

- Design simplicity and flexibility.


For example, if a RAID 1/0 3+3 group will support the I/O demand of 600 Exchange users with an average
mailbox maximum of 100 MB, 73 GB drives provide more than enough storage capacity. In some cases,
where the predominant disk on the storage system is a larger size or faster speed, it may make sense to
standardize on that type to simplify LUN layout. Refer to Appendix A for some comparative examples.

Capacity Check
The amount of usable space in a RAID group can be calculated with the following formulas:[2]
RAID 1/0:
Usable Space of a RAID 1/0 Group = Usable Drive Space x (# Drives / 2)
RAID 5:
Usable Space of a RAID 5 Group = Usable Drive Space x (# Drives - 1)
Table 8 compares the usable capacity of 10 drives configured as RAID 1/0 and as RAID 5.
Table 8. Usable Capacity of 10 Drives by Disk Size and RAID Type

Raw Capacity   Usable Capacity   Usable Capacity, 10 Spindles   Usable Capacity, 10 Spindles
               per Drive         as 5+5 R1/0                    as Two 4+1 R5
36 GB *        33 GB             165 GB (33x10/2)               264 GB (2x33x[5-1])
73 GB          66 GB             330 GB                         528 GB
146 GB         134 GB            670 GB                         1072 GB
300 GB         268 GB            1340 GB                        2144 GB

* No longer sold
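
The two formulas in sketch form, reproducing the 73 GB row of Table 8 (function names are
illustrative):

```python
def raid10_usable_gb(usable_per_drive, drives):
    """Usable capacity of a RAID 1/0 group: half the drives hold mirrors."""
    return usable_per_drive * drives / 2

def raid5_usable_gb(usable_per_drive, drives_per_group, groups=1):
    """Usable capacity of RAID 5: one drive's worth per group goes to parity."""
    return usable_per_drive * (drives_per_group - 1) * groups

print(raid10_usable_gb(66, 10))          # 330.0 GB as one 5+5 RAID 1/0 group
print(raid5_usable_gb(66, 5, groups=2))  # 528 GB as two 4+1 RAID 5 groups
```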
Typically, a disk layout that meets the I/O requirements for an ESG will contain more than enough capacity
to meet the ESG storage requirements. However, if the mailboxes are particularly large, the I/O demand per
user is very low, or if a large amount of space is required for offline defragmentation, the storage
requirements may affect the design.

[2] For exact figures, especially with the use of vault drives, refer to the CLARiiON Capacity Calculator at
http://clariipub.corp.emc.com/cse/capacitycalc/capacitycalc.htm

Referring to Table 8, if the required storage for the ESG went beyond 330 GB, the RAID 1/0 spindle
count of 10 for 36 GB and 73 GB drives would not be adequate. Options would be to increase the drive
count, move up to 146 GB drives, or switch to RAID 5.
Using the calculation of 92.9 GB per ESG from the example, disk space requirements would not be a
factor except for a RAID 1/0 2+2 configuration of 36 GB drives.
Appendix A includes some additional examples of I/O and capacity calculations.

Summary
Because the storage capacity of disk drives has outpaced their increase in I/O throughput, IOPS capacity is
the standard to use today when determining the required number of drives. Since a RAID 1/0 group can
perform better than a RAID 5 group under certain I/O loads with the same number of spindles, and
since the space disadvantage of RAID 1/0 has become less significant, we recommend RAID 1/0 as the
default choice for Exchange database volumes. RAID 5 may be appropriate for some customers depending
on I/O load and cost considerations.
The decision to choose between 10K or 15K drives will likely come down to cost. Since IOPS are the
determining factor in how many disks you need, the number of 15K drives required will usually be less
than the 10K drive requirement. Balance the additional cost per drive against savings on additional drives,
DAEs, and possibly cabinets.

MetaLUNs
CLARiiON storage systems can combine multiple LUNs into a larger metaLUN that spans multiple RAID
groups. MetaLUNs offer two primary advantages: I/O load balancing and expandability. Usually you will
configure an ESG for its maximum anticipated size at the start, but it is possible to use a metaLUN to
handle gradual growth.
The main advantage is the ability to distribute I/O over many spindles without resorting to host striping.
Striped volumes are particularly advantageous with workloads such as Exchange that are random and
bursty, and metaLUNs make the use of striped volumes simple. Suppose that you have planned two ESGs
to reside on their own 3+3 RAID 1/0 groups, for a total of 12 spindles. An alternative design would be to
create a metaLUN for each of the ESGsboth spanning the two 3+3 groups. The same 12 spindles would
still be handling the I/O of the two ESGs, but in this case, if the I/O demand of one ESG is higher than the
other, the combined load will be balanced across all of the drives. This can help to avoid an I/O bottleneck
for a particular ESG. The two storage groups would also share the cost of a disk rebuild. By spanning two
RAID groups, they double the risk of being affected by a rebuild but a rebuild will only affect half the disks
of the metaLUN.

Figure 1. MetaLUNs Sharing Two RAID Groups


Another choice would be a single 6+6 RAID group. It would yield the same performance, but with less
growth potential. To accommodate growth, additional free space on the existing RAID group set can be
used by concatenating the metaLUN. RAID group expansion also provides room for metaLUN growth (by
concatenating the metaLUN), along with added performance.
For added load balancing across a RAID group set, you can interleave the data of multiple storage groups
on the RAID set by creating metaLUNs for each ESG in the following order (illustrated in Figure 2):

- Stripe the first component of ESG1
- Stripe the first component of ESG2
- Concatenate the next component of ESG1
- Concatenate the next component of ESG2

Figure 2. Interleaving MetaLUNs

Building Blocks
Configuring metaLUNs can add a level of complexity to an Exchange design. In practice, the metaLUN
configuration performs well and is easier to manage if you choose from a limited number of proven RAID
group types and sizes. These groups serve as flexible building blocks that will be easy to multiply out in
the full Exchange storage design.
Following are recommended building block elements that are small enough to be flexible, allow for growth,
and have been proven to perform well:

- RAID 1/0: 3+3; 4+4; 5+5

- RAID 5: 4+1
It's possible to overuse metaLUNs. While offering the benefits of flexibility and load balancing,
metaLUNs residing on the same RAID group all share the risk of a physical disk failure in that group. It is
worth considering the consequences of a RAID group failure when you plan your metaLUN layout.
Typically, two or three ESG metaLUNs sharing a RAID group set is a practical limit.

Log LUN Configuration


When configuring disks for Exchange, most attention is paid to the database LUNs because they typically
represent the highest risk of performance bottleneck. But the database performance depends on the log
response time. Database transactions are gated by the completion of the associated log write.
When choosing a RAID type for log file LUNs, I/O performance and data protection are the overriding
factors rather than capacity. RAID 1/0 is the best RAID type to use for log LUNs. It provides better
response time than RAID 5 in degraded situations. In the case of a disk failure, RAID 1/0 rebuilds
complete faster than RAID 5. The longer the rebuild period, the more vulnerability there is to data loss.
Data loss always occurs if a second drive is lost during rebuild of a RAID 5 or RAID 1 group (see Table 6
on page 15).
Although writes to the log LUN are sequential, performance tests have shown that you can take best
advantage of a set of drives by sharing a set of log LUNs on them. A rough rule of thumb to use for log
drives is to allocate one-eighth to one-tenth the number of spindles you have allocated for the databases,
rounding up for RAID 1/0. For example, if you have calculated the need for 36 drives to handle the

databases of four Exchange storage groups on a server, you could allocate four drives in a RAID 1/0 2+2
group to handle the four transaction log LUNs for that server.
To avoid added recovery complications in the unlikely case of the physical loss of a RAID group, the log
files for multiple servers should be stored on separate drives.
There are some other important considerations in the design of the disk layout for the Exchange transaction
logs:

There is one set of transaction logs for each ESG (i.e., the logs for all databases in the ESG are
combined).

For manageability and flexibility, each set of log files should reside on its own LUN. This is an actual
requirement for VSS-based backups.

The transaction log files for an ESG should always be on a separate LUN from their associated
databases. The log LUN should never share the same spindles as the database LUN for the same
ESG, even on small systems. This is for protection rather than performance. If something should
happen to the database, the log files are essential to recover transactions since the last backup. If those
log files reside on the same physical disk and that disk is damaged, this option is lost.

Log I/Os are 100 percent writes. They are most frequently 512-byte writes, but can be up to 64 KB or
larger.

Host I/O to the log LUN equals approximately 10 to 15 percent of the host I/O to the database LUNs.

The size of each Exchange log file is 5 MB. Most online Exchange backup processes will delete log
files whose transactions have been committed to the database. It is important to confirm that
committed log files are being deleted to avoid running out of space on the log LUN.

Circular logging is an Exchange feature that causes log files to be deleted after their transactions have
been committed to the database. Only a handful of the logs are maintained at any time to save space.
However, this sacrifices the ability to recover a database up to the minute. It is off by default and
should never be turned on.

Calculating Log LUN Storage Capacity Requirements


You can calculate storage capacity requirements for a log LUN by multiplying 5 MB by the maximum
number of log files maintained before being pruned.
If you don't know the maximum number of log files generated in a day, you can use a rough rule of thumb
of one log file per user per day. (Microsoft uses an estimate of two logs per day for configuration; EMC's
actual log file rate is considerably less than one per day.)
Unless you specify otherwise, typical Exchange online backups will prune the log files on each run. For
example, if a backup is run nightly, the log LUN for a 1,000-user ESG would minimally need space for
1,000 log files, or 5 GB. Be sure to allow extra capacity to ensure that the log file LUN never runs out of
space.
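
A sketch of this sizing rule; the headroom factor is an illustrative safety margin, not an EMC
recommendation:

```python
def log_lun_gb(users, logs_per_user_per_day=1, days_between_prunes=1,
               headroom=1.5):
    """Log LUN capacity: 5 MB per log file, sized between backup prunes."""
    log_files = users * logs_per_user_per_day * days_between_prunes
    return 5 * log_files * headroom / 1000

# A 1,000-user ESG with nightly backups pruning the logs:
print(log_lun_gb(1000))  # 7.5 GB: the minimal 5 GB plus 50% headroom
```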

Additional Storage Considerations for the Exchange Production Data
The previous configuration guidelines have covered design recommendations for database and log LUNs
for medium-to-large Exchange implementations. This section describes some additional considerations.

Public Folders
It's difficult to provide general guidelines for configuring public folder storage because usage varies
so widely. Some organizations practically ignore the existence of Exchange's public folder capability,
while others use them extensively for shared document repositories, discussion groups, shared calendars,
and several other purposes. By default, the public store is contained within a dedicated database in the first

storage group; but when used actively, it is often configured on its own Exchange server with at least one
replicated copy to another Exchange server.
The best starting point for planning public folder storage for a newly migrated Exchange 2003 environment
is to examine the current I/O, storage usage, and growth rate for public folders in the current environment.
Adjust these measurements as necessary based on any planned changes to the public folder use policy (such
as adding a new integrated application that makes significant use of public folders, or switching shared
documents to file shares). Then, deploy using the principles described in this paper as with any other
Exchange storage group.

SMTP Queue
The SMTP message queue should be placed on CLARiiON storage. It is not necessary to create replicas of
the SMTP queue. However, the queue can be placed on one of the database LUNs for the server as long as
the additional I/O and capacity requirements are factored in.

Keeping EDB Files and STM Files Together

An Exchange message store (database) consists of an EDB file, containing all content generated by MAPI
clients, indexed message properties, and more, and a streaming (STM) file containing content generated by
Internet clients. Since the EDB file and STM file together compose a complete message store, it is advisable
to keep them on the same LUN.

Smaller Exchange Environments


If a relatively small (fewer than 20 database drives or 1,000 users) Exchange system is implemented on a
CLARiiON system, the same I/O requirements apply for the database LUNs, and thus it remains important
to allocate enough drives. It still may make sense to use RAID 1/0 for the database drives.
Database and log files should still be kept on separate RAID groups. For these smaller configurations, it
may be appropriate to use a RAID 1 pair for log files. If the number of drives is tight, the extra capacity on
the log drives could be used for some other light purpose.
If there are only two RAID groups available for Exchange data, the users can be split across two ESGs and
the logs placed with the databases for the alternate ESG. However, you must then factor in the log I/O
requirement when calculating the spindle count.

Figure 3. Sharing a Database LUN with the Logs from an Alternate Storage Group


Planning Storage for Local Recovery
This section provides guidelines for configuring storage to handle local replicas and disk-based backups of
Exchange production data.

SnapView for Disk-Based Replication
SnapView is an optional software package for the EMC CLARiiON storage system. (SnapView is supported on
all CLARiiON models CX300 and higher; the only current CLARiiON platform on which it is not supported is
the CX200.) Using SnapView, users can create a point-in-time view, or multiple views, of a LUN, which can
subsequently be made accessible to another server, or simply held as a point-in-time copy for possible
restoration. For instance, a system administrator can make the SnapView replica accessible to a backup server
so that the production server can continue processing without the downtime traditionally associated with
backup processes. In the event of data corruption on the source LUN, SnapView replicas can be used to restore
the contents of the corrupted LUN to the point in time of the replica's creation. SnapView can create replicas
using either clones (BCVs) or snapshots.
Two EMC products currently integrate SnapView with Exchange 2003 and the Windows Volume Shadow Copy
Service (VSS) to create verified SnapView replicas of an ESG, ready to be restored immediately:
- Replication Manager/SE (RMSE)
- Replication Manager (RM)

Clone-Based Replication
The backup option that provides the most rapid recovery of an Exchange database today is clone-based
replication. SnapView clones give users the ability to create fully populated copies of LUNs within a
single array. Once synchronized, clones can be fractured from their source and then presented to a
secondary server for read and write access. Following the initial synchronization, clones can be
incrementally resynchronized: only the data that has changed on the source since the clone was fractured
is copied to the clone. In the event of data corruption on the source LUN, clones can be used to restore
the source LUN via a reverse synchronization operation, which returns the source LUN to its point-in-time
state as of when the clone was fractured. In the unlikely event of a hardware error on the LUN (for
instance, a multiple-drive failure), the clone can be repurposed as the production LUN. Thus, clones
provide protection against both software errors and hardware errors.
RM and RMSE allow you to configure up to eight clones for each of the Exchange production LUNs.
Consider the following when clone-based replication is chosen as the backup method:
- Know the characteristics of the clone backups you'll be using:
  - Number of clones for each production LUN: each extra clone maintains a validated backup copy for a
    different point in time.
  - Timing and frequency of the backup operation: spread out backups of the storage groups and time them
    for low-usage periods to take best advantage of array resources.
- Fibre drives are recommended for clones; their performance is well suited to clone resynchronizations.
  ATA drives are not recommended for clone LUNs in an active Exchange environment because clone
  resynchronization to an ATA drive is considerably (up to several times) slower, which increases the
  likelihood of affecting the production environment during this period. Additionally, with an IOPS rate
  of less than half that of fibre drives, the backup window on ATA drives is also extended by an Eseutil
  check that takes longer to complete.
- Clone LUNs can be bound on drives that differ in size, speed, RAID geometry, or even drive type. For
  instance, some users may elect to use RAID 1/0 for their production LUNs and RAID 5 for their clones.
  RAID 5 is recommended for clones because they do not have the same IOPS requirements, and it provides
  greater capacity. Additionally, some users may elect to put production data on 15K rpm drives and use
  10K rpm drives for their clones.
- Eseutil will be the most I/O-intensive activity on the database clones. Use the Eseutil throughput
  requirements when determining the spindle count here.
- When using RAID 5 with clones, take modified RAID 3 (MR3) support into consideration. RAID 5 4+1 sets
  offer a good balance of rebuild time with MR3 support. (For information on MR3 writes, refer to the
  CLARiiON RAID 5 Optimizations section of the EMC CLARiiON Fibre Channel Storage Fundamentals white
  paper.)
- 146 GB and 300 GB drives may also be more appropriate for clones. The combination of RAID 5 and larger
  drives allows you to configure sufficient extra capacity to store multiple clones on the same RAID
  group. Since clone synchronization operations are scheduled, the I/O capacity of a set of drives can be
  shared to handle multiple backups occurring at different times.
- Do not place clone LUNs in the same RAID group that contains their source LUN.
- Plan clone layouts to avoid backups occurring simultaneously on different LUNs configured on the same
  RAID group. LUN resynchronizations and Eseutil integrity checks both involve heavy I/O; running two or
  more of these activities at the same time will slow them all down and possibly affect user response
  times.
- Consider using metaLUNs for clones to provide more spindles and improve Eseutil performance.
- Keep in mind the building blocks discussion on page 18. For simplicity, select your RAID group from the
  recommended choices and expand using this as a base.
- For extra protection, if you are configuring multiple clones for an Exchange LUN, it's best to alternate
  the clones on separate spindle sets.

Snapshot-Based Replication
RM and RMSE can also create a VSS shadow copy via a SnapView snapshot. As with clones, snapshots
provide users a readable and writeable LUN replica. Snapshots, however, are not fully populated copies.
They use a pointer-and-copy-based design: pointers map to data regions on the source LUN until those
regions are changed, at which point the original data is copied to a reserved area and the pointers are
redirected accordingly. In this way, users only have to allocate sufficient disk space to accommodate the
changes to the source LUN.
The process of allocating the pointers according to a particular point in time is referred to as starting a
SnapView session. To see the contents of a particular session, a user can activate a snapshot against that
session. The reserved LUN is the private LUN that holds the original source data, and the process of copying
that data is referred to as copy on first write (since it occurs only on the first change to a given region of
the source LUN). As with clones, SnapView session data can be used to restore a corrupted source LUN. Much
like the reverse synchronization that clones offer, SnapView sessions can be rolled back to the point-in-time
view of when the session was started.
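To make the pointer-and-copy design concrete, here is a deliberately simplified Python model of copy on first write. The class, chunk granularity, and names are invented for illustration; real SnapView tracks fixed-size chunks in the reserved LUN pool:

```python
class SnapSession:
    """Toy model of copy-on-first-write snapshot behavior.

    A dict stands in for the reserved LUN; list entries stand in for
    fixed-size chunks on the source LUN.
    """
    def __init__(self, source):
        self.source = source          # live, mutable chunk list
        self.reserved = {}            # chunk index -> original data

    def write(self, idx, data):
        # The first write to a chunk copies the original out first
        # (the COFW cost), preserving the point-in-time image.
        if idx not in self.reserved:
            self.reserved[idx] = self.source[idx]
        self.source[idx] = data

    def snapshot_read(self, idx):
        # Unchanged chunks are read straight from the source LUN,
        # which is why snap activity loads the production spindles.
        return self.reserved.get(idx, self.source[idx])

lun = ["a", "b", "c"]
sess = SnapSession(lun)
sess.write(1, "B")
assert sess.snapshot_read(1) == "b" and lun[1] == "B"
```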
Snap-based replication is not an ideal backup method to use with Exchange, for the following reasons:
- SnapView copy on first write (COFW) must complete before a change to production data is allowed. This
  adds overhead on the production LUNs, especially in the period shortly after the snap session is
  started, and can noticeably impact performance. Even if the replication is performed during off hours,
  the snap session remains active during the day to hold the backup copy; in that case, the highest COFW
  activity takes place when users become active the following morning.
- The Eseutil integrity check on the snap results in very heavy read activity on the production database
  LUN, which can cause elevated disk response times.
- The replica taken is not a completely separate physical copy; an unlikely physical loss of the
  production LUN results in loss of the backup as well.
- The space required for the reserved LUN pool (snap cache) is larger than typical (possibly equivalent
  to the cumulative size of all files on the source LUN) because of the random nature of Exchange I/O.

Comparison of Local Replication Options
When comparing clone-based with snapshot-based replication of Exchange data, the clone option has several
advantages, and it is the recommended solution for Exchange local replication. The following tables provide
a brief comparison of clone and snapshot capabilities and impact, and of the currently released EMC local
replication solutions.
Table 9. Comparison of SnapView Clones and Snapshots for Exchange Local Replication

Feature                                       | Clones                           | Snapshots
Accessible to secondary server                | Yes                              | Yes
Recovery in event of software error           | Yes                              | Yes
Maximum replicas                              | Up to eight per source LUN       | Up to eight per source LUN
Can be used with MirrorView                   | No                               | Yes
Recovery in event of hardware error           | Yes                              | No
Disk space, as a percentage of the source LUN | 100% per clone                   | Varies widely; with a daily snapshot for local Exchange replication, the snapshot may approach 100%
Performance impact on source                  | Only during resync; impact is typically less than COFW activity | Yes, for the duration of the session; any activity on the snap (such as Eseutil or backup to tape) directly impacts the production LUN

Table 10. Comparison of VSS-Integrated EMC Local Replication Products

Product                 | Supported Exchange Versions | Supported OS               | Supported Array                                                        | Replicas
RMSE (a)                | 5.5, 2000, 2003             | Windows 2000, Windows 2003 | CLARiiON                                                               | Up to eight clones or snapshots
Replication Manager (b) | 5.5, 2000, 2003             | Windows 2000, Windows 2003 | CLARiiON, Symmetrix, some third-party arrays (see EMC Support Matrix)  | Up to eight clones or snapshots

a. RMSE also supports replication of SQL and Windows file systems.
b. Replication Manager also supports additional operating systems and replication of additional applications.

Online Backup to Disk
Disk-based Exchange online backup has become a competitive alternative to tape backup, offering faster
performance and higher reliability. Consider the following when online backup to disk is chosen as the
primary backup method:
- Capacity requirements are determined by the size and frequency of the backups and the number of copies
  maintained before archiving to tape.
- Online backups perform sequential writes. A configuration that optimizes this type of I/O will perform
  best.


- CLARiiON ATA disk drives are effective for backup to disk. They perform well with sequential I/O
  operations, especially with FLARE Release 13 or higher and a RAID 3 configuration. They are also most
  competitive with the cost of tape backup.
- If keeping more than one backup of an ESG on disk, you can alternate copies across two different RAID
  groups for added protection.
- Most testing has been done on RAID 5 4+1 (RAID 3 4+1 for ATA drives); this has been determined to be
  the sweet spot for backup-to-disk performance.
- Multiple streams within the same LUN will often result in better overall throughput, although
  per-stream performance will understandably decline. If each stream is sent to a separate LUN on the
  RAID group, the data in a particular stream will be more contiguous on the disk and thus restore
  somewhat faster (by 5 MB/s to 10 MB/s).
- For the most rapid recovery and the least user impact, individual Exchange databases should be as small
  as possible with the smallest possible number of users. If the backup method will be online backup to
  disk, you have added incentive to use the maximum number of storage groups and databases within those
  ESGs.
- If online backup is performed during a period of active production, be sure to take this additional
  activity into account when calculating I/O requirements, also paying attention to overall array
  limitations.
For more detail on this topic, refer to the white paper EMC CLARiiON Backup Storage Solutions CX
Series: Backup-to-Disk Performance Guide.

Recovery Storage Groups
With the addition of the recovery storage group (RSG) feature in Exchange Server 2003, a separate server
for mailbox recovery is no longer required. Within an RSG, an administrator can mount a second copy of a
mailbox database and use it to recover any data it contains.
When allocating storage for RSGs, plan a single LUN to handle the size of the largest regular Exchange
storage group, plus existing logs for that ESG. Log files can reside in the same LUN as the databases since
users cannot log on to these and mail cannot be delivered to them.
With local replication to clones, it's convenient to mount a snapshot of the clones for use with a recovery
storage group. Of course, this is useful only if the data you need to recover is recent enough to exist on the
clone replicas. You cannot mount the clone directly for use with an RSG, because mounting the Exchange
database would make it unusable for standard VSS recovery. When planning to mount this replica, be sure
to allocate some snap cache for the purpose.
For more detail on this topic, refer to the Microsoft white paper Using Exchange Server 2003 Recovery
Storage Groups.

Planning Storage for Local Message Archiving
Message archiving on CLARiiON storage has become a growing component of new messaging system
designs. An archiving implementation generally involves adding one or more servers to the environment
to manage moving the content of messages past a set age out of the standard Exchange database
structure. The archiving servers also manage the near-line retrieval of these messages when a user
requests one.
EMC Legato EmailXtender software provides this archiving capability. The EmailXtender manual
provides a formula to estimate the amount of data to be archived.
Container File Storage (disk space for archived messages, GB) =
    Number of users
    x Messages per user per day
    x Days per work week
    x Number of weeks mail is retained
    x Average message size (KB)
    x .000001 (to convert the result to GB)

For example:
    3,200 x 20 x 5 x 250 x 50 x .000001 = 4,000 GB

EmailXtender also estimates approximately 20 percent overhead for the installation and Message Center
drive, plus .5 GB for queuing. The space required on the installation LUN in this case would be:
    4,000 x .2 + .5 = 800.5 GB

Because there is usually a very large quantity of archived data, and access to this data is less
time-critical, this is a very appropriate application for CLARiiON ATA drives.
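The formula and both worked results above translate directly into Python; the function names below are ours:

```python
def container_storage_gb(users, msgs_per_user_day, days_per_week,
                         weeks_retained, avg_msg_kb):
    """Container file storage (GB) per the EmailXtender manual's formula."""
    return (users * msgs_per_user_day * days_per_week *
            weeks_retained * avg_msg_kb * .000001)

def install_lun_gb(container_gb, overhead=0.20, queue_gb=0.5):
    """Installation/Message Center LUN: ~20 percent overhead plus .5 GB queuing."""
    return container_gb * overhead + queue_gb

archive = container_storage_gb(3200, 20, 5, 250, 50)
print(round(archive), round(install_lun_gb(archive), 1))   # 4000 800.5
```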

Storage-System Considerations
Once you have determined the I/O requirements of the new messaging system and settled upon an
appropriate number and type of disk drive, the next step is to consider the throughput and features of the
storage system.
The best throughput and data protection from the storage system results from a design that aims for
balanced use of the array resources, considering both the layout of the data and the timing of schedulable
activities. Consider the following when planning your disk layout:
- Avoid configuring Exchange database LUNs on the CLARiiON persistent storage manager (PSM) drives
  (drives 0-2). However, these drives may be suitable for the lower I/O requirements of a properly
  configured set of log LUNs.
- It's not required to have the log LUN and database LUN for an ESG managed by opposite CLARiiON storage
  processors (SPs). It matters more that the overall I/O demand is balanced across the two SPs.
- There is a small performance advantage (three to four percent) to be gained by binding the RAID 1/0
  primaries and secondaries on different back-end buses. The main advantage to this approach is that the
  administrator does not have to be cognizant of which LUN is on which back-end bus; the back end is
  balanced by virtue of the RAID group layout.
- Don't forget to include hot spares in the configuration. Typically, you should add one hot spare for
  every 30 drives.
- Clone LUNs must be assigned to the same SP as their source. (It is possible to configure a clone on the
  opposite SP from its source, but the clone would be trespassed for synchronizations in that case.)
  Ensure that both the current SP owner and the preferred owner are the same for the source and target
  LUNs.
- MirrorView (synchronous or asynchronous) cannot be used with any LUN that is part of a clone group
  (source or target).
- Some activities within the Exchange environment place heavy demand on the CLARiiON SP. These include
  performing an Eseutil check against a database and running CLARiiON layered applications, particularly
  when acting upon several LUNs at once. When configuring an array for Exchange and scheduling Exchange
  administrative operations, it's important to consider the limits of the SP CPU resources.

iSCSI Guidelines
The CLARiiON CX500i and CX300i models provide direct support for iSCSI connections. Note the
following considerations before choosing an iSCSI model CLARiiON storage system for Exchange:
- Compared to Fibre Channel, iSCSI offers a lower cost hurdle for customers moving from direct-attached
  or internal server-based storage to SAN. Connectivity components are less expensive, and IP network
  expertise is more common.
- There is inherently more processing required with the iSCSI protocol than with Fibre Channel, and the
  delivery of iSCSI packets is less regular. This introduces extra latency into the I/O stream. Exchange
  activity can generate high disk I/O, and it has a low tolerance for slow disk response.
- The iSCSI models support the ability to boot directly from the storage system, but only with an iSCSI
  HBA (TCP/IP Offload Engine, or TOE) installed in the server.
- The iSCSI models support only up to a two-node Exchange cluster.
- The CLARiiON Disk Library is fibre-based and requires the network infrastructure of a fibre SAN.
- Local replication, including VSS-supported shadow copies for Exchange 2003 via Replication Manager, is
  supported. These storage system models are internally identical to their equivalent Fibre Channel
  models; therefore, internal performance such as clone synchronizations should be the same.
- Remote replication using CLARiiON layered applications (SAN Copy or MirrorView) is not supported to or
  from the iSCSI models.
- Front-end ports on iSCSI models (1 Gb, versus 2 Gb on fibre models) have the potential to become a
  bottleneck during high I/O activity. Eseutil, for example, can process an Exchange database at the rate
  of 10 GB per minute. All CX models have Fibre Channel back ends.
When configuring an iSCSI model CLARiiON storage system for Exchange, note the following
recommendations:
- When working with the Microsoft iSCSI Initiator Service, use the Initiator Control Panel to configure
  the LUNs on the CLARiiON storage system as persistent targets. This is necessary to automatically
  reestablish a connection to the storage system upon a restart.
- Use Gigabit Ethernet and separate general network traffic from storage traffic. Dedicated Gigabit
  Ethernet offers the best throughput to the CLARiiON iSCSI models.
- Configure redundant controllers and switches for high availability.
- While it is common to use a network interface card (NIC) in servers for iSCSI connections to save cost,
  consider replacing the NIC with an iSCSI HBA. A TOE handles the overhead added by TCP/IP processing,
  which can be particularly important on an active Exchange server.

- If you use a NIC, install the software in this order:
  1. Microsoft iSCSI Initiator
  2. Navisphere Agent
  3. PowerPath
  During the installation of Navisphere Agent, make sure to answer yes when asked if the system uses the
  Microsoft iSCSI Initiator.

- If you use a TOE, the recommended order is:
  1. HBA software (such as QLogic SANsurfer)
  2. Navisphere Agent
  3. PowerPath
  During the installation of Navisphere Agent, make sure to answer no when asked if the system uses the
  Microsoft iSCSI Initiator.

- The bottom line for performance is to minimize latency in the iSCSI connections between servers and
  the storage system, and to test Exchange performance in the environment to ensure that response time
  meets requirements.
For further detail on Microsoft iSCSI guidelines, refer to Microsoft Knowledge Base article 839686,
Support for iSCSI Technology Components in Exchange Server. For information on Microsoft iSCSI cluster
support requirements, refer to the FAQ at:
http://www.microsoft.com/WindowsServer2003/technologies/storage/iscsi/iscsicluster.mspx

CLARiiON Storage Systems Comparison
Table 11 lists useful specifications for the five current CLARiiON CX models.

Table 11. CLARiiON CX Series Storage Systems Feature Summary

Feature                                             | CX700            | CX500            | CX300            | CX500i           | CX300i
Maximum Disks                                       | 240              | 120              | 60               | 120              | 60
Storage Processors (SP)                             | 2                | 2                | 2                | 2                | 2
CPUs/SP                                             | 2 x 3 GHz        | 2 x 1.6 GHz      | 1 x 800 MHz      | 2 x 1.6 GHz      | 1 x 800 MHz
Front-End Ports/SP                                  | 4 @ 2 Gb (fibre) | 2 @ 2 Gb (fibre) | 2 @ 2 Gb (fibre) | 2 @ 1 Gb (iSCSI) | 2 @ 1 Gb (iSCSI)
Back-End Fibre Channel Ports/SP                     | 4 @ 2 Gb         | 2 @ 2 Gb         | 1 @ 2 Gb         | 2 @ 2 Gb         | 1 @ 2 Gb
I/O Buses                                           | 4                | 2                | 1                | 2                | 1
Array Cache                                         | 8 GB             | 4 GB             | 2 GB             | 4 GB             | 2 GB
Highly Available Hosts                              | 256              | 128              | 64               | 128 (c)          | 64 (c)
Maximum LUNs                                        | 2048             | 1024             | 512              | 1024             | 512
MirrorView Images (Total Primary + Secondary)       | 100              | 50               | n/a              | n/a              | n/a
Snapshot LUNs                                       | 300              | 150              | 100              | 150              | 100
Clone Groups                                        | 50               | 25               | 25               | 25               | 25
Clone Objects (a)                                   | 100              | 50               | 50               | 50               | 50
SAN Copy Concurrent Sessions (and Max Destinations) | 16 (100)         | 8 (50)           | 4 (50)           | n/a              | n/a
Max Incremental SAN Copy Source LUNs                | 100              | 50               | 25               | n/a              | n/a
Exchange Users with no replication (b)              | 20,000           | 10,000           | 5,000            | 6,000            | 3,000
Exchange Users with local replication (RMSE)        | 10,000           | 5,000            | 2,500            | TBD              | TBD

a. Clone objects and MirrorView images share the same maximum.
b. As described in this paper, the number of Exchange users a storage system can handle will vary greatly.
The number in this table is provided primarily as a starting point and for comparison between the models. It
refers to a storage system dedicated to Exchange users (.4 IOPS/user). While it is very unusual for a
CLARiiON storage processor to fail, it is advisable to plan for and test an SP failure scenario to understand
how the environment would run in degraded mode, and then to size accordingly.
c. For iSCSI, each NIC/TOE connected to the storage system counts as a connection.

Be aware of these specifications when determining the appropriate storage system(s) for the new Exchange
storage design. For example, you can create up to 50 clone objects on a CLARiiON CX500. Each LUN in
a clone group, including the source, counts as one clone object. If the local replication plan calls for two
clone copies of each Exchange production LUN, the CX500 can support up to 16 Exchange LUNs (each
production LUN and its two clones represent three clone objects, for a total of 48 clone objects). If the
Exchange design follows the standard recommendation of two LUNs per storage group, the CX500 can
handle eight ESGs. The number of servers does not matter here: the CX500 could support four Exchange
servers with two ESGs each, two servers with four ESGs each, or any other combination totaling eight ESGs.
If only one clone is used per production LUN, the CX500 could support up to 12 ESGs, assuming that no
other resource limit is reached.
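This clone-object bookkeeping generalizes to any of the models in Table 11. A minimal Python sketch (the helper name and defaults are ours):

```python
def max_esgs(clone_object_limit, clones_per_lun, luns_per_esg=2):
    """Storage groups that fit under an array's clone-object ceiling.

    Every LUN in a clone group counts as one clone object, so a
    production LUN with N clones consumes N + 1 objects.
    """
    objects_per_lun = clones_per_lun + 1
    max_luns = clone_object_limit // objects_per_lun
    return max_luns // luns_per_esg

print(max_esgs(50, clones_per_lun=2))   # CX500, two clones each: 8 ESGs
print(max_esgs(50, clones_per_lun=1))   # CX500, one clone each: 12 ESGs
```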

Putting It All Together
This section presents some final suggestions for producing a successful Exchange storage design.

Consider Site-Specific Constraints
The resulting storage design must take into account any requirements or restrictions in the
environment where the messaging system will be implemented.
For example, the organization may have an existing CX500 available that they want to use before any
additional array is purchased. They may not be able to dedicate an entire array to Exchange use, or may
have a certain type of drive already in place for the Exchange data. There are likely to be many other
decisions made already that will affect the storage design, such as number and location of Exchange
servers, number of ESGs per server, network capacity, etc.
Regardless of the constraints, the core requirement of providing enough drives to meet peak I/O demand
remains, as does the strong recommendation to keep log and database LUNs for the same ESG on separate
spindles.

Configure the Cleanest-Looking Layout Diagram
Using a building block style, draw up a clean-looking storage layout diagram. This will ease understanding
of the design, help identify possible weaknesses, and aid in storage administration of the implementation.

Plan Throughout for Operational Resiliency
Increasing availability through the elimination of single points of failure is easier in SAN environments.
Consider all possible points of failure and distribute users from the same functional area to minimize the
impact of planned or unplanned downtime:
- Separate ESGs: width first (more ESGs), depth second (databases per ESG).
- Separate Exchange servers: typically 4,000 or fewer users per mailbox server.
- Separate storage systems: spread out users (see the following example).
- Geographically dispersed locations: for larger organizations.

Upper management, other priority users, and key group mailboxes may have stricter SLAs than the rest
of the organization. These users should also be distributed, in addition to any extra performance or HA
configurations they are provided.
Example
For an organization of 7,000 typical Exchange 2003 users, a single CX500 could handle their basic mailbox
requirements. A CX700 could handle their mailbox requirements, including local replication to clones. As
an alternative to the single CX700, consider designing the configuration with two CX500s handling 3,500
users each. Distributing these 3,500 users over all four ESGs places fewer than 900 users per Exchange
storage group and under 200 per database, minimizing user impact and maintaining conservative database
sizes.
Additionally, each of the CX500s could be configured to handle disaster recovery for the peer storage
system in the unlikely event one storage system experiences a significant problem. By providing additional
storage space and temporarily postponing local replication, all 7,000 users could be supported on a single
CX500. Placing these two storage systems in separate sites offers an added protection against site failure.
Even without taking disaster recovery into consideration there are advantages to using two smaller storage
systems in place of a single larger model. Distributing users from all departments across both storage
systems would allow department business to continue to function at some level in the unlikely event of a
significant failure of one unit.

Figure 4. Configuring Two CX500s in Place of One CX700 for Improved Availability

Validate the Design
Before committing to a particular design, it is a good idea to conduct a peer review and, if possible,
compare the design with known good configurations.
Where possible, the configuration should be built and tested with performance tools such as Microsoft
JetStress, Windows Performance Monitor, and EMC Navisphere Analyzer.
Expect issues to arise early in a rollout, and be prepared to address them.

Additional Recommendations for Optimal Performance
Once the CLARiiON storage system is in place and ready to be configured, follow the tuning guidelines in
this section.

Storage-System Tuning
The CLARiiON storage system is well behaved and performs well with the default parameters, but some
settings must be chosen at installation. For most systems, follow these guidelines:
- Read and write cache: ON for all LUNs.
- Read cache size: 20 percent of available cache.
- Write cache size: all remaining cache.
- LUN stripe element size: 128 blocks (the default).
- Prefetch settings: default settings.
- Cache page size: 4 KB. Heterogeneous systems (those serving UNIX as well as Windows hosts, or SQL
  Server as well as Exchange hosts) are best served with 8 KB.

Exchange Server and Windows Environment
A few server-tuning and environment recommendations are included here because the problems they can
avoid are often mistakenly attributed to disk issues.
- If the Windows server has 1 GB or more of physical memory, set the /3GB switch in the BOOT.INI file.
  This changes the user-mode allocation from 2 GB to 3 GB of the 4 GB Windows virtual address space and
  can avoid a memory bottleneck. Information on setting this switch, as well as the related /USERVA
  switch and the HeapDeCommitFreeBlockThreshold registry setting, is included in Microsoft Knowledge
  Base article 815372, How to Optimize Memory Usage in Exchange Server 2003, at:
  http://support.microsoft.com/?id=815372
- The location and speed of domain controllers, DNS servers, and the global catalog are vital to Exchange
  performance. The supporting servers should be close enough to the Exchange servers to provide a fast
  connection, and there should be enough of them to support the Exchange environment. Microsoft
  recommends at least one global catalog server processor for every four processors dedicated to
  Exchange. For example, one dual-CPU global catalog server can handle two four-processor Exchange
  servers.
- Do not use file-level scanning antivirus software on any Exchange files. This can corrupt the files,
  including database, log, and checkpoint files. For details, refer to Microsoft Knowledge Base article
  328841, Exchange and antivirus software, at:
  http://support.microsoft.com/?id=328841

Windows File-System Alignment
Windows has an internal structure called the Master Boot Record (MBR) that inhabits the beginning of a
physical device. The MBR uses hidden sectors on the drive; this value defaults to 63. As a result, on a
CLARiiON LUN, Windows will always create the first partition starting at the 64th sector, misaligning it
with the underlying RAID stripe. This causes disk crossings for a percentage of small I/O (typical of
Exchange), resulting in slightly lower performance.
The resolution is to change the MBR's hidden sector count from 63 to a value matching the stripe element
size. The recommended stripe element size is the default value of 128. Diskpar.exe is a command-line
utility provided by Microsoft in the Windows 2000 Resource Kit; it can explicitly set the starting offset of
the MBR. (For more information about using the Diskpar utility, see the Microsoft Windows 2000 Resource
Kit Help.) The Diskpar utility was not released as part of Windows 2003 or its Resource Kit. It has been
replaced by new functionality in the Diskpart utility (ending with the letter t) that is available as part of
Service Pack 1 for Windows 2003. On Windows 2003 systems without SP1, you can use Diskpar from
Windows 2000. You cannot use the pre-SP1 version of Diskpart in Windows 2003 for this purpose.
For Exchange, setting the offset with Diskpar or Diskpart is preferred to setting the offset in Navisphere
during the binding of a LUN. Aligning at the host, after the LUN is bound, offers these advantages:
- Any clone or SAN Copy of the disk will automatically include the alignment.
- If a change in the offset is required after implementation, data will still be lost, but you do not have
  to rebind the LUN.
- This approach is consistent with the recommended method for Symmetrix systems.
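To see why the 63-sector default matters, the following toy Python model (ours, not an EMC or Microsoft tool) counts stripe-element boundary crossings for a run of small sequential I/Os at each starting offset:

```python
STRIPE_ELEMENT = 128   # blocks (512-byte sectors), the CLARiiON default

def crossings(start_sector, io_sectors, n_ios):
    """Count I/Os that straddle a stripe-element boundary when a run of
    consecutive small I/Os begins at the given partition offset."""
    crossed = 0
    for i in range(n_ios):
        first = start_sector + i * io_sectors
        last = first + io_sectors - 1
        if first // STRIPE_ELEMENT != last // STRIPE_ELEMENT:
            crossed += 1
    return crossed

# 4 KB I/Os (8 sectors each): the default 63-sector offset makes 1 in 16
# I/Os cross a stripe element; a 128-sector offset eliminates crossings.
print(crossings(63, 8, 1000), crossings(128, 8, 1000))   # 62 0
```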


Conclusion
The CLARiiON CX series storage systems address a wide range of Exchange SAN storage requirements by
providing flexible levels of capacity, functionality, and performance. By following the guidelines provided
in this paper, you should be able to design an Exchange storage layout that takes the best advantage of
CLARiiON performance and protection capabilities.


Appendix A: Storage Design Examples
The three examples below demonstrate the best practices described in this paper.

Example 1
Given the following information:
- 4,000 typical users
- .7 IOPS average; 1 IOPS peak, for 4,000 peak IOPS
- All additional I/O overhead already factored in
- 3:1 read/write ratio, for 3,000 read IOPS and 1,000 write IOPS

RAID-Adjusted Back-End Disk IOPS Calculation

RAID 1/0:
Total IOPS = Read IOPS + (Write IOPS x Write Penalty)
           = 3,000 + (1,000 x 2) = 5,000
Drives Required = Total IOPS / IOPS per Drive
           5,000 / 130 = 38.5, rounded up to 40 drives @ 10K rpm
           5,000 / 180 = 27.8, rounded up to 28 drives @ 15K rpm

RAID 5 (4+1):
Total IOPS = 3,000 + (1,000 x 4) = 7,000
Drives Required:
           7,000 / 130 = 53.8, rounded up to 55 drives @ 10K rpm
           7,000 / 180 = 38.9, rounded up to 40 drives @ 15K rpm

(Drive counts are rounded up to whole RAID 1/0 mirror pairs or RAID 5 4+1 sets, which is why 38.5
becomes 40 rather than 39.)

Spindles Required to Handle Required IOPS

rpm | RAID 1/0 | RAID 5 (4+1)
10K | 40       | 55
15K | 28       | 40
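The same arithmetic, as a short Python sketch of this paper's method; rounding up to whole mirror pairs or 4+1 sets reproduces the table above:

```python
import math

WRITE_PENALTY  = {"RAID 1/0": 2, "RAID 5 (4+1)": 4}
IOPS_PER_DRIVE = {"10K": 130, "15K": 180}
GROUP_MULTIPLE = {"RAID 1/0": 2, "RAID 5 (4+1)": 5}   # mirror pairs / 4+1 sets

def drives_for_iops(read_iops, write_iops, raid, rpm):
    """RAID-adjusted back-end IOPS divided by the per-drive rating,
    rounded up to a whole number of RAID-group building blocks."""
    total = read_iops + write_iops * WRITE_PENALTY[raid]
    raw = total / IOPS_PER_DRIVE[rpm]
    step = GROUP_MULTIPLE[raid]
    return math.ceil(raw / step) * step

for raid in WRITE_PENALTY:
    for rpm in IOPS_PER_DRIVE:
        print(raid, rpm, drives_for_iops(3000, 1000, raid, rpm))
# RAID 1/0: 40 @ 10K, 28 @ 15K;  RAID 5 (4+1): 55 @ 10K, 40 @ 15K
```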

Capacity Calculation with 200 MB Mailboxes (73 GB Drives)

Capacity Required = Total Mailboxes x Mailbox Size x 1.5
                  = 4,000 x 200 MB x 1.5 = 1,172 GB
Drives Required = Capacity Required / Usable Space per Drive
                = 1,172 / 66 = 17.8, rounded up to 18 drives
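A matching sketch for the capacity side; the 66 GB and 136 GB usable-space figures are taken from this paper's worked examples and should be treated as illustrative:

```python
import math

USABLE_GB = {"73 GB": 66, "146 GB": 136}   # usable space per drive

def drives_for_capacity(mailboxes, mailbox_mb, drive="73 GB", factor=1.5):
    """Total Mailboxes x Mailbox Size x 1.5, divided by usable GB per drive."""
    required_gb = mailboxes * mailbox_mb * factor / 1024
    return math.ceil(required_gb / USABLE_GB[drive])

print(drives_for_capacity(4000, 200))             # 18 drives (1,172 GB / 66)
print(drives_for_capacity(4000, 400, "146 GB"))   # 18 drives (2,344 GB / 136)
```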

Capacity Calculation with 400 MB Mailboxes (73 GB Drives)

Capacity Required = 4,000 x 400 MB x 1.5 = 2,344 GB
Drives Required = 2,344 / 66 = 35.5, rounded up to 36 drives
Drive Requirements for Capacity by RAID Type and Disk Size

Mailbox Size | Disk Size | Base | RAID 1/0    | RAID 5 (4+1)
200 MB       | 73 GB     | 18   | 36 (18 x 2) | 25 (18 x 1.25, rounded up)
200 MB       | 146 GB    | 9    | 18          | 15
400 MB       | 73 GB     | 36   | 72          | 45
400 MB       | 146 GB    | 18   | 36          | 25

Matching Up the IOPS and Capacity Requirements
- 4,000 users with 200 MB mailboxes match best to 40 RAID 1/0 73 GB drives @ 10K rpm (close to the
  capacity requirement of 36 drives).
- The user IOPS and capacity requirements do not match well with RAID 5 (55 drives @ 10K or 40 @ 15K
  for IOPS, but only 25 drives needed for capacity).
- With 400 MB mailboxes for 4,000 users, capacity becomes more of a factor. The user IOPS and capacity
  requirements match well with RAID 5 (40 drives @ 15K for IOPS; 45 for capacity).
- Going to 146 GB drives with RAID 1/0 would also match well (40 drives @ 10K for IOPS; 36 for
  capacity).

Recommended Drives for Exchange Databases
- 4,000 users @ 200 MB mailboxes: 40 RAID 1/0 drives, 73 GB @ 10K
- 4,000 users @ 400 MB mailboxes: 45 RAID 5 drives, 73 GB @ 15K; or 40 RAID 1/0 drives, 146 GB @ 10K

Example 2
Given the following information:
- 8,000 light users
- .3 IOPS average; .5 IOPS peak, for 4,000 peak IOPS
- All additional I/O overhead already factored in
- 3:1 read/write ratio, for 3,000 read IOPS and 1,000 write IOPS

Because the IOPS total is the same as in the previous example, the result is the same.
Spindles Required to Handle Required IOPS

rpm | RAID 1/0 | RAID 5 (4+1)
10K | 40       | 55
15K | 28       | 40


Capacity Calculation with 200 MB Mailboxes

Capacity Required = 8,000 x 200 MB x 1.5 = 2,344 GB
Drives Required = 2,344 / 66 = 35.5, rounded up to 36 drives

Spindles Required to Handle Required Capacity

Mailbox Size | Disk Size | Base | RAID 1/0 | RAID 5 (4+1)
200 MB       | 73 GB     | 36   | 72       | 45
200 MB       | 146 GB    | 18   | 36       | 25

Matching Up the IOPS and Capacity Requirements
- With 200 MB mailboxes for 8,000 users, capacity is also a factor because of the relatively low I/O
  demand per user. The user IOPS and capacity requirements match well with RAID 5 (40 drives @ 15K for
  IOPS; 45 for capacity).
- Going to 146 GB drives with RAID 1/0 would also match well (40 drives @ 10K for IOPS; 36 for
  capacity).

Recommended Drives for Exchange Databases
- 8,000 light users @ 200 MB mailboxes: 45 RAID 5 drives, 73 GB @ 15K; or 40 RAID 1/0 drives,
  146 GB @ 10K

Note that this is the same result that was determined for 4,000 typical users with 400 MB mailboxes.

Example 3
An organization is revamping its messaging system to Exchange 2003. It has provided the following
information:
- 12,000 Exchange users (8,000 typical; 4,000 heavy). The 2:1 proportion of typical to heavy users will
  remain the same in all calculations.
- In the current Exchange 5.5 environment, a server with a proportionally balanced set of typical and
  heavy users generated .8 IOPS per user.
- The organization will use clustered Exchange servers, all at the same site in the Eastern Time Zone. Of
  the 12,000 users, 8,000 are also in the Eastern Time Zone; 2,000 are in the Pacific Time Zone (-3
  hours) and 2,000 are in Western Europe (+5/6 hours).
- Each active node will support 4,000 users spread evenly (also in the typical/heavy proportion) across
  four storage groups.
- The measured read/write ratio in the 5.5 environment is very close to 2:1.
- The organization plans to use local replication with rapid recovery, maintaining two clones for each
  production Exchange LUN. Backups will run twice a day, including once around midday.
- User concurrency during the peak period is 90 percent.
- Typical users will be allotted a maximum mailbox size of 75 MB; heavy users will be allotted 150 MB
  mailboxes.
- The majority of e-mail clients will be Outlook 2003 in cached mode. A significant number of users also
  connect with Outlook Web Access (OWA), especially during off hours.


Calculations
Initial IOPS Calculation
With 4,000 users per server and four ESGs, there will be 1,000 users per ESG.
Given the same level of activity, there is currently no indication that Exchange I/O demand increases when
going from 5.5 to 2003. However, users of Exchange 2003 tend to take more advantage of the mail system
features, so we will use a base peak IOPS per user estimate of .7 rather than .6.

Total Host-Based IOPS = Total Users x Percentage of Concurrent Users x IOPS per User
                      = 12,000 x .9 x .7 = 7,560 IOPS

Using the 2:1 read/write ratio:
RAID-Adjusted IOPS = (Total Host IOPS x Read Percentage) +
                     (Total Host IOPS x Write Percentage x Write Penalty)
RAID 1/0-Adjusted IOPS = (7,560 x 2/3) + (7,560 x 1/3 x 2) = 10,080 IOPS
RAID 5-Adjusted IOPS   = (7,560 x 2/3) + (7,560 x 1/3 x 4) = 15,120 IOPS

Adding Administrative I/O Cost and Determining the Database Spindle Count
Because some local replication operations are planned to take place during the day, we factor in 25 percent
additional IOPS overhead, and then create a small table with results for each RAID type and disk speed.
For RAID 1/0 with 10K rpm drives:

Required Drives = (RAID 1/0-Adjusted IOPS + 25% I/O Overhead) / IOPS per Drive
                = [10,080 + .25(10,080)] / 130 = 96.9, or 97 drives

Filling in the table (without rounding up to accommodate RAID groups):

Drives Necessary to Handle Exchange-Related I/O (not rounded up for RAID)

rpm | RAID 1/0 | RAID 5
10K | 97       | 146
15K | 70       | 105

There are a total of 12 ESGs. An option that fits well here is RAID 1/0 with 15K drives. By dedicating 72
drives to the Exchange database LUNs, you can divide this number evenly into RAID 1/0 3+3 groups. Using
metaLUNs to provide some I/O balancing between storage groups, one good configuration is to create two
metaLUNs, each holding one ESG, across two of the 3+3 RAID groups.
A summary formula to determine the number of disk drives is:

Required Physical Disks = [ (IOPS x %Reads) + WP x (IOPS x %Writes) + X ] / IOPS per Disk

where WP is the write penalty for the specific RAID type and X represents the additional I/O overhead
placed on the drives.
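The summary formula transcribes directly into Python, with X expressed as a fraction of the RAID-adjusted IOPS, as in the calculation above:

```python
import math

def required_disks(host_iops, read_fraction, write_penalty,
                   overhead_fraction, iops_per_disk):
    """The summary formula above: RAID-adjusted IOPS plus administrative
    overhead (X, given here as a fraction of the adjusted IOPS),
    divided by the per-disk IOPS rating."""
    adjusted = (host_iops * read_fraction +
                write_penalty * host_iops * (1 - read_fraction))
    adjusted *= 1 + overhead_fraction
    # round() guards against floating-point noise before the ceiling
    return math.ceil(round(adjusted / iops_per_disk, 6))

# Example 3 inputs: 7,560 host IOPS, 2:1 reads, 25 percent overhead
print(required_disks(7560, 2/3, 2, .25, 130))   # RAID 1/0 @ 10K rpm: 97
print(required_disks(7560, 2/3, 2, .25, 180))   # RAID 1/0 @ 15K rpm: 70
```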

Checking Capacity
For the 1,000 users in an ESG (one third @ 75 MB and two thirds @ 150 MB):
    25 GB + 100 GB = 125 GB
Adding 10 percent additional space for deleted item retention:
    137 GB, or 28 GB for each of five databases
Adding enough space for offline defragmentation of one of the databases (28 GB), including a 10 percent
buffer:
    137 + 28 + 14 = 180 GB (rounded)
Disk space requirements will not be the main determinant of drive type here, but if 36 GB drives were
chosen, the 180 GB would require additional spindles to be added to a RAID 1/0 configuration.
Database Considerations
With 73 GB drives, the usable storage capacity of two RAID 1/0 3+3 groups together is 66 GB x 6 =
396 GB. Splitting this in half for two metaLUNs yields 198 GB for each, which is quite close to the required
180 GB.
With 146 GB drives, there would be considerably more space available, but it would go unused since the
I/O capacity of the drives is already consumed by the Exchange activity. If 146 GB drives were chosen,
you would still limit the Exchange metaLUNs to around 180 GB each to keep down the synchronization
time and minimize the space used for the clones.
Log LUN Considerations
The rule of thumb is roughly one log drive for every eight to ten data drives; there are 72 data drives. By
adding a few extra log drives, we can configure three separate RAID 1/0 2+2 groups to hold the log LUNs
for each server. The extra I/O and space capacity makes it suitable for this configuration to include the
CLARiiON vault drives.
Although logs are pruned twice a day, each log LUN will be configured to hold up to four days' worth of
log files. With 1,000 users in a storage group, and roughly one log file (5 MB each) generated per user per
day, each LUN will be sized at 20 GB.
Clone Considerations
The clones should be configured in conjunction with planning the backup schedule. Plan to have only one
backup at a time operating on a RAID group. MetaLUNs are advisable here to provide more shared
spindles during the active backup cycle. One reasonable option is to create a set of RAID groups for each
of the three Exchange servers. Since there will be two clones for each Exchange production LUN, and
there will be eight production LUNs (four database LUNs @ 180 GB and four log LUNs @ 20 GB) per
server, each RAID group set will be configured with 16 metaLUNs, each holding one clone.
CLARiiON Exchange replication applications (SIME, RMSE, RM) operate on one ESG per backup job,
and Microsoft VSS allows only one backup job at a time per Exchange server. This means that if jobs
overlap, they will be against different Exchange servers. Keeping the clones for each server on separate
spindles should ensure that each server's RAID group set handles only one job at a time.
In this configuration, it becomes reasonable to push for higher capacity, with 146 GB drives in RAID 5
groups. With a usable capacity of 1,632 GB, three RAID 5 (4+1) groups of 146 GB 10K drives would just
handle the capacity requirement (1,600 GB) of the 16 clone LUNs.
Time Zone Considerations
The decision was made to place the 2,000 European users' mailboxes in two ESGs sharing the same drives,
and the 2,000 West Coast users' mailboxes in two other ESGs sharing a drive set. This still spreads I/O
and SP activity across the CLARiiON as a whole, and allows administrators to schedule backup and
maintenance activities for the distant users' data during those users' own off-peak times.
Additional Design Rules
A few other best practice guidelines were applied to determine the layout:
- Do not place Exchange database or clone LUNs on the CLARiiON PSM drives (0-2). An I/O bottleneck
  caused by high Exchange activity could affect performance of the entire array. These drives, and any
  others in their RAID group, also have less available space.
- Do not mix 10K and 15K rpm drives in the same one-third section (0-4; 5-9; 10-14) of a DAE. There has
  been a concern that in some situations the 15K drives in that section would perform at the slower 10K
  speed.
- Create a building block style layout that is repeatable as the environment expands.
- When creating RAID groups, build them across multiple buses for additional protection.

Resulting Design
The following figures describe one design that meets the requirements for this organization.
Database LUNs:
- One metaLUN for each ESG, built across two RAID 1/0 3+3 groups, with 73 GB 15K rpm drives
- Two ESGs from the same Exchange server sharing the same RAID group pair
Log LUNs:
- Four log LUNs for the ESGs of each Exchange server, combined on their own RAID 1/0 2+2 group with
  73 GB 15K drives
Clone LUNs:
- One metaLUN for each clone, built across three RAID 5 4+1 groups, with 146 GB 10K drives
- Two clones for each database and log LUN, resulting in a total of 16 clone metaLUNs on the set of
  three RAID 5 groups for each of the three Exchange servers

Figure 5. Example 3 Disk Layout Block Diagram


Figure 6. Example 3 Disk Layout Design


Appendix B: Quantifying Exchange User Activity
Usually the best starting point for determining I/O and capacity requirements for a new Exchange
configuration is to measure usage in the current environment. This appendix provides information on how
to gather some useful metrics.

Determining the Peak Activity Period
Typically the peak activity period occurs mid-morning on Mondays. To measure I/O activity and
determine the peak, run the System Monitor tool (PerfMon), charting the Physical Disk\Disk Transfers/sec
counter for the Exchange database LUNs across a 24-hour period with anticipated heavy activity (perhaps
at the end of a quarter; not during the summer if possible).
The high points on the chart should indicate the peak activity. It's possible that nightly online maintenance
will show a higher I/O rate than any point during the day. Since response time matters most when users
are active, you can usually calculate based on peak activity during daytime hours.
The following chart shows the host I/O activity on an Exchange database and log LUN. Peak IOPS during
the day for this LUN are about 300. Online maintenance was performed for six hours between 10 p.m. and
4 a.m. (22:00 to 04:00), and an online backup was taken at 6:00 p.m. (18:00).
[Line chart of Disk Transfers/sec over 24 hours (00:00 through 00:00), y-axis 0 to 600.]

Figure 7. Host I/O Activity for a 24-Hour Period


Measuring IOPS per User
- Select a production server with a user load matching your target community.
- Use System Monitor to track the Physical Disk\Disk Transfers/sec counter over the peak two hours of
  server activity.
- Calculate the current IOPS per user as follows:

IOPS/user = (average Disk Transfers/sec) / (number of users)
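PerfMon logs can be exported to CSV (or converted with the relog utility), after which this division is a few lines of Python. The column header below is a placeholder; actual PerfMon exports embed the host and instance names in the header:

```python
import csv

def iops_per_user(csv_path, transfers_column, user_count):
    """Average Disk Transfers/sec over the logged window, divided by the
    number of mailboxes on the measured server."""
    samples = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                samples.append(float(row[transfers_column]))
            except (KeyError, ValueError):
                continue   # skip malformed or blank samples
    return sum(samples) / len(samples) / user_count

# Example with a hypothetical export and column name:
# print(iops_per_user("peak_window.csv", "Disk Transfers/sec", 1000))
```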

Read/Write Ratio
During the peak activity period, use System Monitor to measure these two counters on the database LUN:
Disk Reads/sec : Disk Writes/sec

Performance Counter Guidelines
The following counters from the Windows System Monitor (PerfMon) provide an indication of how the
disk subsystem is performing in the Exchange environment:
- Physical Disk \ Avg. Disk sec/Read: over an hour, the average should be less than 20 milliseconds.
- Physical Disk \ Avg. Disk sec/Write: over an hour, the average should be less than 20 milliseconds.
- MSExchangeIS \ RPC Requests: over an hour, the average should be less than 25.
- MSExchangeIS \ RPC Averaged Latency: over an hour, the average should be less than 50 milliseconds.
- Processor \ % Processor Time: over an hour, the average should be less than 65 percent.
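These ceilings are easy to audit programmatically. The sketch below (counter labels abbreviated from the list above) flags any hourly average over its guideline:

```python
# Guideline ceilings from the list above (hourly averages)
THRESHOLDS = {
    "Avg. Disk sec/Read (s)":    0.020,
    "Avg. Disk sec/Write (s)":   0.020,
    "RPC Requests":              25,
    "RPC Averaged Latency (ms)": 50,
    "% Processor Time":          65,
}

def out_of_range(hourly_averages):
    """Return counters whose measured hourly average exceeds its ceiling."""
    return {name: value for name, value in hourly_averages.items()
            if value > THRESHOLDS.get(name, float("inf"))}

# Example: latency is over its 50 ms ceiling; CPU is fine
print(out_of_range({"RPC Averaged Latency (ms)": 72.0,
                    "% Processor Time": 40.0}))
```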
You can use the Exchange Server Best Practices Analyzer Tool to evaluate the overall health of an
Exchange server configuration. A free copy of the tool is available at:
http://www.exbpa.com


Appendix C: Additional Resources

EMC White Papers
- EMC CLARiiON Fibre Channel Storage Fundamentals
- EMC CLARiiON Best Practices for Fibre Channel Storage
- EMC CLARiiON Backup Storage Solutions CX Series: Backup-to-Disk Performance Guide
- CLARiiON/Exchange DR Cookbook: Disaster Recovery Configuration with RMSE and SAN Copy
- EMC CLARiiON Guidelines Summary for Highly Reliable Exchange Server Design

Microsoft White Papers
- Exchange 2003 Performance and Scalability Guide
- Troubleshooting Exchange Server 2003 Performance
- Optimizing Storage for Exchange Server 2003
- Deploying IP SANs with the Microsoft iSCSI Architecture
- Using Exchange Server 2003 Recovery Storage Groups
