
Question: How to improve performance on a VNX or CLARiiON storage system that is performing forced flushing

Environment: Product: CLARiiON
Environment: Product: VNX Series
Environment: EMC SW: Navisphere Analyzer
Environment: EMC SW: Unisphere
Problem: Forced flushing performance issues
Problem: High host response times for all hosts performing write I/O
Problem: Navisphere Analyzer will show the percentage write cache used as 100% (or very close to 100%) at times.
Problem: Navisphere CLI 'getcache' will show the percentage dirty pages as 100% (or very close to 100%) at times.
Problem: Ktrace logs contain large numbers of the following messages:
~ dropped ddrb ~ pri 0 [OPTIONAL], op 1 [READ].
Root Cause:
The performance problems here are caused by the write cache becoming 100% full. This can happen at different
times on each SP, and both SPs can be affected. When write cache is 100% full, the SP performs a "forced flush"
of data from write cache to the drives. While this is occurring, all hosts performing writes to the CLARiiON
storage system will see high response times.
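To confirm that forced flushing is occurring, the dirty-page level can be checked on each SP with the getcache
command mentioned above (the SP address below is a placeholder):

   naviseccli -h <SP_A_IP_address> getcache

A percentage of dirty cache pages at or close to 100% during the slow periods indicates that forced flushing is
taking place.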
Fix:
Follow these steps:
1. Find which LUNs are doing heavy write I/O during the periods of 100% full write cache.
2. ATA and SATA RAID 5 LUNs are the prime suspects, because they can process far fewer IOPS than FC, SAS,
and EFD drives.
3. See if there are ways of speeding up writes to these LUNs.
4. Add more spindles to the RAID group:

o Use up to nine drives in RAID 3 or 5.
o Use up to 16 drives in RAID 1/0.

or

o Use striped metaLUNs across multiple RAID groups (see emc226845). These RAID groups should be of the same
RAID type and should contain equal numbers of drives with the same performance characteristics (e.g., all 7200
RPM ATA). Ideally, any other LUNs using these RAID groups should be striped into similar metaLUNs, to avoid
some RAID groups becoming "bottlenecks."

Note: Never use striped metaLUNs across LUNs in the same RAID group.
For ATA drives, use RAID 3 instead of RAID 5, if the writes are sequential (see emc140046). This does not
apply to CX3 and CX4 SATA2 drives, because these have Native Command Queuing.
Make sure all ATA LUNs in the RAID group have the same SP owner (see emc119711). This does not apply to
SATA2 drives used by CX3 and CX4 series CLARiiONs.
5. If a Celerra is using these LUNs, the throughput load can be redistributed, using the Data Mover's volume
management, to stripe the workload over more LUNs/RAID groups.
6. Upgrading the CLARiiON SP hardware would reduce forced flushing, because there will be more write cache
available (see table below) and the CLARiiON will be able to flush the I/O more quickly (unless the drives are too
heavily utilized).
7. Increasing the size of the write cache (by redeploying some read cache) and decreasing the high/low watermarks
to 60/40%, respectively, will reduce the chances of forced flushing occurring. See the table below for the
recommended write cache sizes.
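As an illustration of this step only (the exact setcache switches can vary by release, so confirm them against the
Navisphere CLI documentation for the FLARE version in use; the SP address and the 3600 MB size, taken from the
CX4-480 row of the table below, are placeholders), the change might look like this:

   naviseccli -h <SP_A_IP_address> setcache -wc 0
   naviseccli -h <SP_A_IP_address> setcache -wsz 3600 -hw 60 -lw 40
   naviseccli -h <SP_A_IP_address> setcache -wc 1

Write cache normally has to be disabled before it can be resized, and disabling it flushes the existing dirty pages,
so this change should be made during a quiet period.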
8. LUNs used by Windows will perform better if the file system is correctly aligned with the LUN. See EMC
Knowledgebase solution emc64915 for details on how to align the file system.
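As a hedged illustration of this step (emc64915 remains the authoritative procedure, and the disk number below is
a placeholder), on Windows Server 2003 SP1 and later a new partition can be created pre-aligned with diskpart by
specifying an alignment value in KB:

   diskpart
   DISKPART> select disk 2
   DISKPART> create partition primary align=1024
   DISKPART> exit

An alignment of 1024 KB (or any other multiple of 64 KB) keeps the partition aligned with the CLARiiON's 64 KB
stripe elements. An existing misaligned partition cannot be realigned in place; the data must be migrated to a newly
created, aligned partition.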
9. If the configuration cannot be improved any further:

o Write caching can be disabled on selected LUNs that do not require high performance but are using a lot of
write cache (e.g., ATA backup-to-disk).
o Alternatively, the "Write Aside" cache setting on slow LUNs can be reduced.

o This value can be set individually on each LUN using the "naviseccli ... chglun -w ..." command (an illustrative
invocation is shown below).
o "Write-aside" specifies, in blocks, the largest write request size that will be written to cache for that LUN.
o Write requests larger than the write-aside value bypass write cache and are written directly to disk.
o Valid values are 16 through 65534.
o This will help if the I/O sizes filling the cache are large; write I/O smaller than the "Write Aside" setting will
still go to cache.
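For example, a hypothetical backup-to-disk LUN 42 could have its write-aside value reduced to 2048 blocks (1 MB)
so that larger writes bypass the cache. The SP address and LUN number are placeholders, and -l is assumed here to
be the switch that selects the target LUN:

   naviseccli -h <SP_A_IP_address> chglun -l 42 -w 2048

After this change, writes to LUN 42 larger than 1 MB go straight to disk, while smaller writes continue to be cached.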
Note: Upgrading the storage processors can provide higher levels of write cache and reduce SP utilization, which
should reduce the need for forced flushing.
Notes:
See article emc219608 for how to monitor the write cache levels, in order to determine how effective any
configuration changes have been.
Notes:
For VNX recommended cache settings, please refer to article emc267304.
The following values are the EMC Best Practice Recommendations for the write cache size on CX3 and CX4.

Model      Recommended Write Cache size (MB)
CX4-960    9760
CX4-480    3600
CX3-80     3072
CX3-40     2500
CX4-240    1011
CX3-20     842
CX4-120    498
CX3-10     250

The Read Cache should be set to use all the remaining available memory, once the write cache levels have been
set.
Notes:
See the EMC CLARiiON Best Practices for Fibre Channel Storage document on Powerlink for recommended
RAID configurations.
Notes:
Messages such as the following may appear in the Ktrace logs during periods of forced flushing:
DD c3dd57a0 dropped ddrb 0xfffffadfa6bf27f0, fru 0x32 [0.3.5], pri 0 [OPTIONAL], op 1 [READ].
DD 86944648 dropped ddrb 0xa2280510, fru 0x11 [0.1.2], pri 0 [OPTIONAL], op 1 [READ].
Dropped ddrb messages occur when there is heavy I/O traffic and there are optional requests in the Device
Handler (DH) request queue that can be dropped. Optional I/Os are things like cache pre-reads that can be
dropped when the cache is performing mostly writes.
