Power Vault in Symmetrix DMX-3 and DMX-4 Systems: Operational Overview and Best Practices Technical Note
P/N 300-005-416 REV A02 June 19, 2008
Contents
Executive summary
Introduction
Power subsystem overview
Power Vault overview
Best practices for site and system service activities
Conclusion
References
Executive summary
A key to the Symmetrix DMX's resiliency is its ability to maintain data integrity during power outages. Each Symmetrix storage array comes with battery backup in order to safely power down the array in the event of power loss. Ensuring that all data tracks in memory are successfully written to disk is essential to maintaining the consistency of application data stored on the Symmetrix. As cache size, disk size, and power requirements have grown, the time required to vault data in the event of a power outage has also increased. Power Vault is designed to limit the time needed to power off the system if it needs to switch to a battery supply.
Upon detecting the need to come offline or power down, the system stops all I/O and the disk directors (DAs) start writing the contents of the appropriate areas of global memory to special vault devices located on each DA. The vault image is fully redundant, with the specified contents of global memory being saved twice to independent disks. When the DAs are done writing (saving) everything to these vault disks, the machine either finishes powering down or remains in the offline state if it does not need to power down.
When the machine is powered back up, the DAs write all the saved information on the Power Vault devices back to the correct locations in global memory. Write-pending or format-pending tracks that were in cache before the power down are restored to cache.
Introduction
Vault save operations are triggered automatically by environmental conditions. Vaulting can also be initiated by manually powering down a system or taking the disk directors offline. When performing a vault operation, the Symmetrix uses disk storage called Power Vault devices. Power Vault devices are volumes on designated physical disk drives that each reserve a dedicated 5 GB of space for vaulting data, including metadata, from global memory during a powerdown operation. During powerup, the data is written back to global memory to restore the system. Power Vault devices are configured on the first four disk drives on each drive loop in the direct-attached storage bay. For each pair of disk directors, 160 GB of total capacity is reserved for vault devices.
This technical note describes the vault save and restore operations, configuration options and rules, and environmental monitoring. Best practice recommendations for powerdown and powerup procedures, as well as for preventing accidental triggering of a vault save operation, are also included.
Audience
This internal technical note is intended for technology professionals who support Symmetrix systems. The Power Vault in EMC Symmetrix DMX-3 and Symmetrix DMX-4 Systems white paper contains much of the same information found here without the detailed commands and tools. This document is available on EMC.com and Powerlink and is the document customers should reference.
lines when connected to the system bay PDUs. The PDPs contain the manual On/Off power switches, which are accessible through the rear door. Two Communications and Environmental Control (XCM) modules communicate with the system bay BBU modules to determine the BBU status and run battery tests. The XCMs monitor and log environmental events across critical components and report any operational problems. Critical components include global memory directors, power supplies, fans, and various on/off switches. The XCM environmental control is capable of monitoring each component's local voltages, ensuring optimum power delivery. The temperature of the global memory directors is also continuously monitored. Figure 1 illustrates the Symmetrix DMX-3 and DMX-4 system bay.
*The Service Processor consists of the KVM and the server.
**The Battery Backup Unit Assembly consists of two Battery Backup Unit Modules.
Storage bay
The storage bay power subsystem consists of the drive enclosure power supply/cooling modules and the BBU modules that provide the battery backup for the drive enclosures. The 2.2-kilowatt BBUs provide redundant backup for every four drive enclosures. The A-side BBU modules receive their power from the A-side PDU and support both the A- and B-side drive enclosures. The B-side BBU modules receive their power from the B-side PDU and support both the A- and B-side drive enclosures. The back of each drive enclosure also contains two LCCs. Each LCC (LCC A, LCC B) supports and controls one Fibre Channel loop and monitors the drive enclosure environment. Two PDPs, one for each zone, provide a centralized cabinet interface and distribution control of the AC power input lines when connected to the storage bay PDUs. The storage bay's PDPs contain the manual On/Off power switches, which are accessible through the rear door. Figure 3 illustrates the storage bay.
Figure 3. Storage bay
Further information on the power subsystems is available on Powerlink.
The following sections describe these operations in more detail.
Power Vault save operation
When a system is powered down, transitioned to offline, or when environmental conditions trigger a vault situation, a Power Vault save operation is initiated. The part of global memory being saved first reaches a consistent image (no more writes). The disk directors then write the appropriate sections of global memory to the vault disks, saving two copies of the logical data. The BBU modules automatically transition to battery backup when the Symmetrix system detects loss of AC power. The BBU modules maintain power to the Symmetrix system for up to 5 minutes while the global memory is vaulted to the vault disk drives. When the disk directors are done writing to the vault disks, the system either finishes powering off or remains in the offline state if it does not need to power off.
Power Vault restore operation
When the Symmetrix system is powered on, the startup program does the following:
• Initializes the hardware and the environmental system
• Restores the global memory from the saved data while checking the integrity of the data
• Performs cleanup, data structure integrity, and reinitialization of needed global memory data structures
At the end of the startup program, the system resumes normal operation once the BBUs are recharged enough to support another vault save. The system will not come back online until the BBU modules have a minimum of 300 seconds of holdup time. If any condition is not safe, the system will not resume operation and will call the EMC Customer Support Center for diagnosis and repair. In this state, EMC Customer Support will be able to communicate with the Symmetrix system and find out the reason for not resuming normal operation.
Note: A fully charged BBU will have 600 seconds of charge time, enough to support two vault save operations.
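The holdup-time thresholds above reduce to simple arithmetic. The following Python sketch is illustrative only: the function and constant names are hypothetical and are not part of Enginuity or SymmWin. It assumes that one vault save requires 300 seconds of holdup time and, following the "minimum of 300 seconds" wording, treats exactly 300 seconds as sufficient.

```python
# Illustrative sketch only; names are hypothetical, not Enginuity or SymmWin APIs.
VAULT_SAVE_HOLDUP_SECONDS = 300  # holdup time needed for one vault save (from this note)


def vault_saves_supported(holdup_seconds: int) -> int:
    """Return how many vault saves the given BBU holdup time could support."""
    if holdup_seconds < 0:
        raise ValueError("holdup_seconds must be non-negative")
    return holdup_seconds // VAULT_SAVE_HOLDUP_SECONDS


def can_resume_operation(holdup_seconds: int) -> bool:
    """True when the BBUs hold enough charge for at least one more vault save."""
    return vault_saves_supported(holdup_seconds) >= 1


if __name__ == "__main__":
    for seconds in (120, 300, 600):
        print(seconds, vault_saves_supported(seconds), can_resume_operation(seconds))
```

Under these assumptions, a fully charged BBU (600 seconds) supports two vault saves, which matches the note above.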
Please note that multiple component failures within a zone are handled differently than a complete zone failure. If power is lost to an entire zone, the system will not begin to vault. However, if a single zone encounters multiple component failures that reduce the number of operational power supplies to an insufficient value, then the system will begin to vault. In general, the system will vault if half of the SPS modules in the system bay fail. This needs to be taken into consideration especially with smaller 24-slot systems (DMX-3 1500 and DMX-4 1500). For example, a minimum configuration DMX-4 1500 is required to have only two SPS modules configured per zone in the system bay, but the system will begin to vault if two SPS modules fail in a single zone. A system with this configuration will also begin to vault if each zone loses a single SPS, because both zones would have lost redundancy.
Note: Systems with this type of configuration can be upgraded with additional SPS modules and power supplies to avoid these situations.
Six-slot systems (DMX-3 950 and DMX-4 950) have fewer components in the system bay and are essentially protected only by the other zone. Therefore, these systems will not need to vault until both zones experience a failure.
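As a rough illustration of the general rule for 24-slot systems described above (the system vaults when half of the system bay SPS modules fail), the following Python sketch checks the minimum-configuration DMX-4 1500 example. The function name and the exact threshold comparison are assumptions drawn from that paragraph. Enginuity's actual decision logic is not documented here and takes more inputs, and six-slot systems follow the different rule just described.

```python
# Simplified illustration; not Enginuity's actual logic. Names are hypothetical.
def system_bay_needs_vault(sps_per_zone: int, failed_zone_a: int, failed_zone_b: int) -> bool:
    """Apply the general rule: vault when half of the system bay SPS modules have failed.

    Models SPS component failures only. Loss of AC power to a whole zone is
    handled differently and does not by itself trigger a vault.
    """
    if sps_per_zone < 1:
        raise ValueError("sps_per_zone must be at least 1")
    for failed in (failed_zone_a, failed_zone_b):
        if not 0 <= failed <= sps_per_zone:
            raise ValueError("failed counts must be between 0 and sps_per_zone")
    total = 2 * sps_per_zone  # two power zones per system bay
    failed_total = failed_zone_a + failed_zone_b
    return 2 * failed_total >= total


if __name__ == "__main__":
    # Minimum-configuration DMX-4 1500: two SPS modules per zone.
    print(system_bay_needs_vault(2, 2, 0))  # two failures in a single zone
    print(system_bay_needs_vault(2, 1, 1))  # one failure in each zone
    print(system_bay_needs_vault(2, 1, 0))  # a single failure
```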
84,VALT,PRMS command
The 84,VALT,PRMS command, shown in Figure 4, displays the following information:
• Power Sources: Minimum required, configured count, and current count.
• Vault Drives: Minimum required and capacity count. The capacity count decrements if a drive that contains vault devices fails.
• NTV: A value of 0 indicates the system does not need to vault. A value of 1 indicates the system needs to vault.
• ATV: A value of 0 indicates the system does not have the ability to vault. A value of 1 indicates the system has the ability to vault.
• NTPD: A value of 0 indicates the system does not need to power down. A value of 1 indicates the system needs to power down.
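The three flags can be read as a small lookup. The Python sketch below is a hypothetical helper for turning flag values into the meanings listed above. It does not parse real 84,VALT,PRMS output, whose layout is not reproduced in this note, and the class and method names are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VaultFlags:
    """Flag values as listed for the 84,VALT,PRMS display (each 0 or 1)."""

    ntv: int
    atv: int
    ntpd: int

    def __post_init__(self) -> None:
        # Reject anything other than the two documented values.
        for name in ("ntv", "atv", "ntpd"):
            if getattr(self, name) not in (0, 1):
                raise ValueError(f"{name} must be 0 or 1")

    def describe(self) -> List[str]:
        """Translate each flag into the meaning given in this note."""
        return [
            "system needs to vault" if self.ntv
            else "system does not need to vault",
            "system has the ability to vault" if self.atv
            else "system does not have the ability to vault",
            "system needs to power down" if self.ntpd
            else "system does not need to power down",
        ]


if __name__ == "__main__":
    for line in VaultFlags(ntv=0, atv=1, ntpd=0).describe():
        print(line)
```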
It is also highly recommended that EMC Customer Service field personnel and site electricians verify that the AC power feeds to each rack of equipment come from redundant power sources/PDUs prior to beginning any power-related maintenance.
The following guidelines should be followed by customer site personnel to avoid accidentally causing the system to vault while performing site maintenance:
• Verify system health status prior to beginning any power-related maintenance.
• Discuss power maintenance plans with the local EMC Customer Service field team prior to activity.
• Have the local EMC Customer Service field team verify the Power Zone Task is disabled before beginning maintenance. The Power Zone Task section provides more information.
• When installing a new system, apply power from two separate source PDUs at the site to each bay's two PDPs. The Physical Planning Guides for each specific model provide additional details and are also available on Powerlink.
• Before beginning any power maintenance on site or on previously installed systems, verify that each bay's PDPs are connected to separate source PDUs.
• Configure power redundantly to ensure that any PDU shutoff will not take power away from both power zones of any bay.
• If power has been removed from any bay's zone more than once, or if power has been removed from any combination of zones, do not remove power from the other zone until adequate battery recharge time has passed on the discharged BBUs. Battery charge time must be greater than 300 seconds on the SPSs to support at least one vault save if needed. Charge time typically takes a minimum of 2 hours.
Note: Enginuity 5773 introduces a Battery Conservation algorithm that will shut off the SPS modules in a zone that loses AC power as long as the opposite zone is fully operational. This helps prevent the SPS modules from being completely drained during a single-zone power failure. The loss of power to the zone will still drain the SPS modules of some battery capacity, but the algorithm increases the number of single-zone power outages that can be experienced before the SPS modules drop below 300 seconds of holdup time. This functionality is also targeted for a future release of 5772 code.
EMC Customer Service field personnel should perform the following steps to verify battery charge status:
From the main SymmWin screen, select Tools > Environmental. Then, click the Health Check tab, and click Run Health Check, as shown in Figure 5.
If errors are present, click the Power System tab to view where the issue exists. Figure 6 shows sample results.
Figure 6. Power System tab: Enclosures with SPS battery charge less than 300 seconds
The Inlines command for displaying the system bay SPS status is 84,VALT,GPWR. Figure 7 shows a sample of the command output.
Figure 7. Inlines command 84,VALT,GPWR: System bay SPSs below 300 seconds battery charge time
The Inlines commands to display the storage bay SPS status are more tedious. The preferred method is via the Environmental GUI.
Run a health check before power activity. Before removing power from a zone, check that the opposite zone has all power supplies on without faults.
Before beginning any maintenance, check the SPS health. If previously unknown failures are found, contact the EMC Customer Support Center for assistance. Perform the following steps to verify SPS health: From the main SymmWin screen, select Tools > Environmental. Then, click the Health Check tab, and click Run Health Check. Figure 8 shows a faulty SPS in the system bay.
Figure 8. Environmental GUI health check results: SPS not ready in system bay
Click the Power System tab to identify the faulty component, as shown in Figure 9.
The Inlines command to display system bay SPS status is 84,VALT,GPWR. Figure 10 shows sample output.
Figure 11. Environmental GUI health check: SPS not ready in storage bay
If errors are present, click the Power System tab to view where the issue exists. Figure 12 shows sample results.
Figure 12. Power System tab: SPS not ready in storage bay
The Inlines commands to display BBU status and power for the storage bay are more complex; the preferred method is via the Environmental GUI.
Each BBU provides power to four disk-array enclosures (DAEs). Ensure that all four DAEs on opposite sides are healthy before maintenance. If previously unknown failures are found, contact the EMC Customer Support Center for assistance. Perform the following steps to verify DAE health: From the main SymmWin screen, select Tools > Environmental. Then, click the Health Check tab, and click Run Health Check, as shown in Figure 13.
Figure 13. Environmental GUI health check results: DAE SPS problems detected
If errors are present, click the Power System tab to view where the issue exists. Figure 14 indicates an SPS 4A issue in storage bay 1A for DAEs 13-16. The issue must be resolved before starting any power maintenance on the B-side power zone.
Figure 14. Power System tab: Problem detected on SPS 4A in storage bay 1A for DAEs 13-16
Check the battery test history along with charge state status before doing any maintenance. Perform the following steps to verify battery test history: From the main SymmWin screen, select Tools > Environmental. Then, click the Power System tab, and click the Battery Test tab, as shown in Figure 15.
Figure 15. Battery test history
If there is a failure, press F7 to refresh the display. If the failure remains, contact the EMC Customer Support Center for assistance.
• Whenever an action is taken on a power component, use the 6F,PLOG command (shown in Figure 16) to ensure that NTV is not set. A value of 1 indicates the system needs to vault. If NTV is set, there is a 60-second window to undo the action that set it.
• When removing disk drives that contain vault devices, use the 6F,PLOG command (shown in Figure 16) to check that NTV does not get set. A value of 1 indicates the system needs to vault.
Figure 16. Inlines command 6F,PLOG
Online DAE or disk drive upgrades
Do not connect any cables from the new DAEs/drive bays to the existing system until all SPS units in the new DAEs/drive bays are fully charged. Ensure all SPS units in the new DAEs or new drive bays are fully charged prior to proceeding with the online upgrade. There should be no flashing green LEDs on any SPS module. Normal charge time is 2 hours minimum. Furthermore, do not power down any drive bays that are installed even if the disk drives in the bay are not yet configured.
If there are any power or environmental issues in a daisy-chained environment, the vault devices that reside on that loop are discounted. If all the extended loops have power events due to the low SPS charge status, then Enginuity will discount all the drive loops and subsequently all the vault drives. If the number of available vault drives falls below a predetermined level as set by Enginuity, the system will begin to vault. Knowledgebase solution emc154235 provides more information.
Powerdown procedure
The Symmetrix contains no user-serviceable parts. Therefore, the system bay and storage bays should not be opened for any reason by untrained personnel. If the Symmetrix is in need of repair, only qualified personnel familiar with safety procedures for electrical equipment and the Symmetrix should access components inside the unit.
Note: The Symmetrix is designed to stay powered up in almost all situations. Unless there is an emergency situation, first call the EMC Customer Support Center for assistance before powering down the Symmetrix.
EMC Customer Service field personnel should power down the Symmetrix by executing a script. In the Procedure Wizard under RTS/CEs Services, run the Sym Offline/Shutdown script. The script ensures that the system bay is powered down before the storage bays; otherwise, the system may not be able to save the vault image. The script instructs the service personnel to perform these steps to power down the Symmetrix:
1. On the rear door of the system bay (Figure 17), press the Zone A and Zone B PDP power switches to the down O OFF position.
2. On the rear door of the storage bays (Figure 17), press the Zone A and Zone B PDP power switches to the down O OFF position.
Powerup procedure
Note: Do not use the following procedures for powering up the Symmetrix for the first time. If you have powered down the Symmetrix for an emergency condition, call the EMC Customer Support Center for assistance before powering up the system. Please reference the Physical Planning Guide and Quick Start Power Connection Guide for your specific model for power connection and configuration requirements.
Perform these steps to power up the Symmetrix after it has been routinely powered down by the PDP power switches:
1. On the rear door of each storage bay (Figure 17), press the Zone A and Zone B PDP power switches to the up I ON position.
2. On the rear door of the system bay (Figure 17), press the Zone A and Zone B PDP power switches to the up I ON position. The Symmetrix system begins its initial microcode program load (IMPL) startup procedure.
3. Wait at least 30 minutes for the IMPL procedure to complete.
Note: The actual IMPL procedure time varies by system type and configuration.
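The powerdown and powerup procedures switch the bays in opposite orders: the system bay goes off before the storage bays, and the storage bays come on before the system bay. The Python sketch below only illustrates that ordering. The function names and bay labels are hypothetical, and the sketch is not a substitute for the Sym Offline/Shutdown script.

```python
from typing import List

SYSTEM_BAY = "system bay"  # hypothetical label used only in this sketch


def powerdown_order(storage_bays: List[str]) -> List[str]:
    """System bay first, so the vault image can be saved while the drives still have power."""
    return [SYSTEM_BAY] + list(storage_bays)


def powerup_order(storage_bays: List[str]) -> List[str]:
    """Storage bays first, then the system bay, which starts the IMPL procedure."""
    return list(storage_bays) + [SYSTEM_BAY]


if __name__ == "__main__":
    bays = ["storage bay 1A", "storage bay 1B"]  # example labels only
    print("Power down:", powerdown_order(bays))
    print("Power up:  ", powerup_order(bays))
```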
All other configuration rules are strictly defined by Enginuity. The options and rules are discussed in the following section.
Power Vault Wait Time
The configuration file is customizable so that the machine can survive brownouts in areas where there are frequent power interruptions. The wait time can be set to 1, 2, or 3 minutes to ride out temporary power interruptions. After that time, the front-end directors are taken offline and the disk directors begin the vault save. The default setting is 1 minute. EMC Customer Service field personnel should consult the local Configuration RTS before attempting to make any changes to this setting. The exact system configuration must be taken into account before an increase to the wait time value is approved. The 3-minute value is an extreme case and is approved only in specific configurations. The Power Vault Wait Time setting is in the IMPL Initialization screen under Common Settings, as shown in Figure 18. An online configuration change must be performed to commit the change to the system.
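The permitted wait-time values can be expressed as a simple validation check. The Python sketch below is a hypothetical illustration of the 1, 2, or 3 minute rule and the 1-minute default. It does not read or write the actual configuration file, and every name in it is an assumption.

```python
# Hypothetical validation sketch; it does not touch any real configuration file.
ALLOWED_WAIT_MINUTES = (1, 2, 3)  # values permitted by this note
DEFAULT_WAIT_MINUTES = 1          # default setting per this note


def validate_wait_time(minutes: int = DEFAULT_WAIT_MINUTES) -> int:
    """Return the wait time unchanged if it is 1, 2, or 3 minutes; otherwise raise."""
    if minutes not in ALLOWED_WAIT_MINUTES:
        raise ValueError(f"Power Vault Wait Time must be one of {ALLOWED_WAIT_MINUTES}")
    return minutes


def ride_out_seconds(minutes: int = DEFAULT_WAIT_MINUTES) -> int:
    """Seconds the system waits before the directors go offline and the vault save begins."""
    return validate_wait_time(minutes) * 60


if __name__ == "__main__":
    print(ride_out_seconds())   # default setting
    print(ride_out_seconds(3))  # extreme case, approved only in specific configurations
```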
Figure 18. Power Vault Wait Time field in IMPL Initialization screen
Power Zone Task
The Power Zone Task is optional under normal operation. However, it is highly recommended and considered best practice that the local EMC Customer Service field team verifies this setting is disabled before beginning any power-related maintenance on the system and before the customer begins any power-related maintenance onsite. The task setting can be modified in the Symmetrix Site Configuration window, as shown in Figure 19. An online configuration change is not required to change this setting.
Figure 19. Enable Power Zone setting in Symmetrix Site Configuration window
When the Power Zone Task is enabled and the system loses AC power to a single power zone, a 20-hour timer is invoked. The system will call home to notify the EMC Customer Support Center of the event. When the timer counts down to 5 hours, the system will again call home. Before the 20-hour period ends, the EMC customer engineer can choose one of three options:
• Repair the cause of the power fault.
• Reset the 20-hour timer to continue the single power zone operation.
• Allow the system to vault, shutting down the system in an orderly manner.
Knowledgebase solution emc130487 provides more information on the Power Zone Task.
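The timer behavior described above can be sketched as a small calculation. The following Python example is an assumption-laden illustration: it models only the 20-hour countdown and the two call-home points (at the loss of zone power and at 5 hours remaining), it does not model a timer reset, and its names are hypothetical.

```python
# Illustrative model of the Power Zone Task timer; names are hypothetical.
TIMER_HOURS = 20                      # timer started when one power zone loses AC
SECOND_CALL_HOME_REMAINING_HOURS = 5  # second call home when this much time remains


def hours_remaining(elapsed_hours: float) -> float:
    """Hours left on the 20-hour timer, never below zero."""
    if elapsed_hours < 0:
        raise ValueError("elapsed_hours must be non-negative")
    return max(0.0, TIMER_HOURS - elapsed_hours)


def call_homes_expected(elapsed_hours: float) -> int:
    """Number of call-home notifications expected so far, assuming no timer reset."""
    count = 1  # first call home when the zone loses AC power
    if hours_remaining(elapsed_hours) <= SECOND_CALL_HOME_REMAINING_HOURS:
        count += 1  # second call home at 5 hours remaining
    return count


if __name__ == "__main__":
    for elapsed in (0, 10, 15, 20):
        print(elapsed, hours_remaining(elapsed), call_homes_expected(elapsed))
```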
Vault device configuration rules
The following configuration rules apply to vault devices:
• 5 GB on the first four drives of every drive loop is reserved for memory vaulting. Each disk director pair requires 32 such dedicated devices for a total of 160 GB of vault space per disk director pair.
• The vault devices can only use single-mirror data protection and cannot be configured with TimeFinder/Snap, virtual, or dynamic spare devices.
• The drive pool, virtual devices, or drive devices cannot reside in the 5 GB of vault space. However, they can reside on the same physical disk drive as the vault devices but not within the vault devices.
• The distribution of the vault devices across the disk directors, the back-end interfaces, and the physical disks should be such that a full vault save will be possible within the time frame dictated by the capacity of the battery (up to 5 minutes).
• The total capacity of all of the vault hypervolumes in the system will be at least sufficient to keep two logical copies of the persistent part of global memory.
• Physical drives that contain vault devices are candidates for permanent sparing if the following is true for each model:
  • On the DMX-3, if there is an available spare on the same primary loop
  • On the DMX-4, if there is an available spare on the same disk director processor
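The capacity figures in these rules follow from simple multiplication: 32 devices of 5 GB each give 160 GB per disk director pair. The Python sketch below reproduces that arithmetic and the two-copy requirement. The function names and the persistent_memory_gb input are hypothetical, and the check ignores the timing and distribution rules above.

```python
# Arithmetic sketch of the vault sizing rules; names are hypothetical.
VAULT_DEVICE_GB = 5                   # space reserved per vault device
VAULT_DEVICES_PER_DIRECTOR_PAIR = 32  # dedicated devices per disk director pair
VAULT_COPIES = 2                      # logical copies of persistent global memory


def vault_capacity_gb(director_pairs: int) -> int:
    """Total vault space reserved for the given number of disk director pairs."""
    if director_pairs < 0:
        raise ValueError("director_pairs must be non-negative")
    return director_pairs * VAULT_DEVICES_PER_DIRECTOR_PAIR * VAULT_DEVICE_GB


def holds_two_copies(director_pairs: int, persistent_memory_gb: float) -> bool:
    """True when the vault space can keep two copies of the persistent memory."""
    return vault_capacity_gb(director_pairs) >= VAULT_COPIES * persistent_memory_gb


if __name__ == "__main__":
    print(vault_capacity_gb(1))      # one disk director pair
    print(holds_two_copies(1, 64))   # example value for persistent memory size
```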
Conclusion
This technical note explained Power Vault operations, described hardware features and environmental monitoring, and recommended best practices for performing power maintenance. More information is provided in the References section.
References
DMX-3
• Symmetrix DMX-3 Product Guide
• Symmetrix DMX-3 Physical Planning Guide
• Symmetrix DMX-3 Quick Start Power Connection Guide
DMX-3 950
• Symmetrix DMX-3 950 Product Guide
• Symmetrix DMX-3 950 Physical Planning Guide
• Symmetrix DMX-3 950 Quick Start Power Connection Guide
DMX-4
• Symmetrix DMX-4 Product Guide
• Symmetrix DMX-4 Physical Planning Guide
• Symmetrix DMX-4 Quick Start Power Connection Guide
DMX-4 950
• Symmetrix DMX-4 950 Product Guide
• Symmetrix DMX-4 950 Physical Planning Guide
• Symmetrix DMX-4 950 Quick Start Power Connection Guide
Other
• Knowledgebase solution emc119567, "What is the correct power ON/OFF sequence for the Symmetrix DMX-3?"
Copyright 2007, 2008 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners.