
Troubleshooting Guide for BIOS POST on

13th Generation of Dell PowerEdge Servers


Wei Liu
Dell Server BIOS Development
September 2014


Revisions

Date | Description
August 2014 | Initial draft

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND
TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF
ANY KIND.
© 2014 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without the express
written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.
Dell, the DELL logo, and the DELL badge are trademarks of Dell Inc. Intel, the Intel Logo are trademarks or registered
trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Microsoft, Windows, and
Windows Server are registered trademarks of Microsoft Corporation in the United States and/or other countries. Other
trademarks and trade names may be used in this document to refer to either the entities claiming the marks and
names or their products. Dell disclaims any proprietary interest in the marks and names of others.

Table of contents
Revisions
Executive summary
1. BIOS Splash Screen Display
2. POST Error and Warning Messages
3. Post Code in iDRAC Web GUI
4. Driver Health Status Report
5. Dell Diagnostics (ePSA)
6. Red Screen of Death (RSOD)
7. Yellow Screen of Death (YSOD)

Executive summary
The Unified Extensible Firmware Interface (UEFI) is a set of industry-standard firmware interfaces that is
designed to replace the legacy BIOS to support modern operating systems and hardware architectures.
Dell has been shipping UEFI support in the BIOS since the 11th generation of PowerEdge servers through a
UEFI-over-Legacy model, where it is the legacy BIOS that initializes the whole system and loads the UEFI
layer at the end of Power-On Self-Test (POST) if needed. The Dell Lifecycle Controller technology is built
upon UEFI as well.
The BIOS on the 13th generation of Dell PowerEdge servers is now a native UEFI implementation, with a
Compatibility Support Module (CSM) to provide legacy BIOS interfaces to support operating systems that
are not UEFI-aware. The look and feel of the boot process is dramatically different from the previous
generations.
This guide provides troubleshooting guidance for issues that may arise during POST and in the pre-boot
environment on the 13th generation of PowerEdge servers.

1. BIOS Splash Screen Display


After the system is powered on, the Dell server BIOS may bring up the video display almost instantly. Fig. 1 is a
sample snapshot of the POST splash screen. The text next to the progress bar at the bottom of the screen
indicates the current phase of POST. The text can aid in troubleshooting issues that happen during the
system boot process.
The following table lists the currently supported progress texts in the BIOS:

Text Display | Phase of the Boot Process
Initializing Intel QuickPath Interconnect... | BIOS performs an early initialization of the chipset, processors, and QPI interfaces.
Configuring Memory | BIOS initializes the system memory.
Loading BIOS Drivers | BIOS starts the Driver Execution Environment (DXE) phase, and loads and executes DXE drivers to perform additional chipset, processor, and hardware initialization.
Initializing iDRAC | BIOS waits for iDRAC to become ready. This phase may take more than a few seconds on the first AC power-on of the system.
Initializing iDRAC Done | iDRAC initialization has completed.
Initializing PCIe, USB and Video | Start of PCI enumeration and detection of USB keyboard devices.
Initializing PCIe, USB and Video Done | PCI and USB enumeration has completed.
Legacy PCI option ROM initialization (BIOS boot mode only) | Applies to the BIOS boot mode only. The onscreen display varies, depending on the type of PCIe cards that are installed in the system.
Testing Memory (X% Complete) | Software-based memory test phase. A percent progress is displayed. Note: The memory test is disabled in the BIOS setup by default.
Testing Memory Done [No Errors] | Memory test completed without any issue.
Testing Memory Done [Errors Encountered] | Memory test has found error(s).
Testing Memory Aborted | Memory test was aborted by pressing <ESC> or the spacebar.
Loading Lifecycle Controller Drivers | BIOS loads the Lifecycle Controller drivers.
Loading Lifecycle Controller Drivers Done | BIOS has finished loading the Lifecycle Controller drivers.
Initializing Firmware Interfaces | BIOS connects the UEFI drivers to the device handles. The UEFI drivers from add-in PCIe cards are expected to be installed in this phase.
Running In-System Characterization... | In-System Characterization (ISC) is in progress.
Connecting iSCSI device(s) | The UEFI iSCSI device drivers are connected. This display applies to UEFI boot mode only, and appears when one or more iSCSI boot devices have been configured.
Enumerating Boot options | BIOS starts to enumerate Boot Options in the system.
Enumerating Boot options Done | The enumeration of Boot Options has completed.
Entering Lifecycle Controller | The system is booting into the Lifecycle Controller.
Lifecycle Controller: Applying Updates or Setting System Configuration | An Automated Task Application is being scheduled in the Lifecycle Controller.
Lifecycle Controller: Collecting System Inventory | Lifecycle Controller is collecting system inventory for this boot.
Lifecycle Controller: Done | Lifecycle Controller has finished execution.
Booting | BIOS has finished POST and is giving control to the operating system.

Fig. 1 POST splash screen and progress bar
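
When reports come in from several servers, it can help to translate the last progress text seen on the splash screen into the POST phase it belongs to. The following Python sketch is an illustration only. The dictionary holds a subset of the progress texts listed in this section, and the names POST_PHASES and describe_progress are hypothetical helpers, not part of any Dell tool.

```python
import re

# Subset of the progress texts from the table in this section, mapped to
# the phase of the boot process they indicate. Extend as needed.
POST_PHASES = {
    "Initializing Intel QuickPath Interconnect": "Early initialization of the chipset, processors, and QPI interfaces",
    "Configuring Memory": "System memory initialization",
    "Loading BIOS Drivers": "DXE phase: DXE drivers are loaded and executed",
    "Initializing iDRAC": "BIOS is waiting for iDRAC to become ready",
    "Initializing PCIe, USB and Video": "PCI enumeration and USB keyboard detection",
    "Testing Memory": "Software-based memory test",
    "Initializing Firmware Interfaces": "UEFI drivers are connected to device handles",
    "Enumerating Boot options": "Boot option enumeration",
    "Booting": "POST finished; control passes to the operating system",
}


def describe_progress(text):
    """Return the boot phase for a progress text, or None if it is not listed."""
    cleaned = text.strip()
    # Drop a trailing "(X% Complete)" or "[No Errors]" style suffix.
    cleaned = re.sub(r"\s*(\(.*\)|\[.*\])\s*$", "", cleaned)
    # Drop a trailing "Done" or "Aborted" so variants of one phase match.
    cleaned = re.sub(r"\s+(Done|Aborted)$", "", cleaned)
    # Drop the trailing "..." some texts carry.
    cleaned = cleaned.rstrip(". ")
    return POST_PHASES.get(cleaned)
```

For example, describe_progress("Testing Memory (42% Complete)") and describe_progress("Testing Memory Done [No Errors]") both resolve to the memory test phase, while a text that is not in the dictionary returns None.
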

2. POST Error and Warning Messages


The BIOS on the 13th generation of PowerEdge servers can display informational, warning and error
messages during POST to help you troubleshoot various issues. If the error occurs early in POST, such as
during memory initialization, then a pop-up message box with a detailed description of the issue (e.g. Fig.
2) may be displayed on the screen.

Fig. 2 An error message box in early POST


If the issue is detected later in POST, the corresponding error and warning messages are displayed
on the screen with a UEFIxxxx prefix. An event entry is logged in the Lifecycle Controller log (LC log) as
well. Depending on the severity of the error or warning, the system may continue to boot,
prompt with F1/F2/F10/F11 for user input, reset, or halt. The message comprises two parts: the
error or warning message itself, and a recommended response action. You can follow the
recommended response action to address the issue. For a complete list of POST error and warning
messages, see the Event and Error Message Reference Guide for 13th Generation Dell PowerEdge Servers.
In the following example, the UEFI driver for the Integrated Network card is not signed. The user has just
turned on Secure Boot in the BIOS setup utility. On the next boot, a few error messages are displayed on the
screen during POST.
- The first error message (UEFI0072) indicates that the UEFI driver from the Integrated NIC 1 Port 1
  Partition 1 was not loaded because it failed the Secure Boot authentication. You may address this issue
  by updating the NIC firmware to a version that supports UEFI driver signing.
- The second error message (UEFI0071) indicates that the previously configured UEFI network boot
  interface is no longer available. This is a result of the corresponding UEFI driver not being loaded.
- The third message, a warning (UEFI0074), indicates that the Secure Boot policy has been modified since
  the last time the system was booted. In this particular example, the user enabled Secure Boot on
  purpose, so no action needs to be taken.

Fig. 3 An example of POST error messages

Corresponding logs for the error and warning messages will be recorded in the Lifecycle Log (Fig. 4).
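
If the on-screen messages or the Lifecycle Log have been saved as text, the message IDs can be pulled out and looked up in the reference guide. The following Python sketch assumes the IDs always consist of the UEFI prefix followed by four digits, as in the examples in this section. The function name find_message_ids is hypothetical, and the guide does not specify an export format, so the input is treated as free text.

```python
import re

# Message IDs use a "UEFI" prefix followed by digits (for example UEFI0072).
# The four-digit width is an assumption based on the examples in this guide.
MESSAGE_ID = re.compile(r"\bUEFI\d{4}\b")


def find_message_ids(log_text):
    """Return the distinct UEFI message IDs in log_text, in first-seen order."""
    seen = []
    for match in MESSAGE_ID.finditer(log_text):
        if match.group() not in seen:
            seen.append(match.group())
    return seen
```

Each ID returned can then be matched against the Event and Error Message Reference Guide to find the recommended response action.
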

Fig. 4 Screen shot of the Lifecycle Log

3. Post Code in iDRAC Web GUI


If you cannot get to the screen display, the Post Code feature available in the iDRAC web GUI may
come in handy. This page displays the last system POST code with a descriptive text. The POST code helps to
detect pre-video hangs, report fatal errors, and analyze system failures during POST.

Fig. 5 An example of the Post Code in the iDRAC Web GUI

4. Driver Health Status Report


The UEFI specification defines a Driver Health Protocol (DHP). The DHP provides services allowing a UEFI
driver to express health status of a controller, return status messages associated with the health status,
perform repair operations if necessary and request configuration changes to place the controller back in a
usable state.
The Dell server BIOS checks the driver health status of each UEFI driver in the system, and displays the status
messages. The BIOS may invoke the repair and configuration utility if a repair or reconfiguration operation
is required. In most cases, you can follow the instructions on the screen to proceed.
Fig. 6 is an example display where the BIOS halts on errors returned from DHP. In this particular
example, the iDRAC DHP detected that the backplane 2 power cable has been disconnected, and the LSI SAS
controller requires configuration changes, possibly due to a catastrophic issue.
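
The health states that a driver can report through DHP can be modeled as a small lookup. The Python sketch below is a simplified illustration, not the Dell BIOS implementation. The enumeration values follow the EFI_DRIVER_HEALTH_STATUS names in the UEFI specification, while the next-step descriptions, the NEXT_STEP table, and the plan_actions helper are assumptions made for this example.

```python
from enum import Enum


class HealthStatus(Enum):
    # Values follow the EFI_DRIVER_HEALTH_STATUS names in the UEFI specification.
    HEALTHY = "EfiDriverHealthStatusHealthy"
    REPAIR_REQUIRED = "EfiDriverHealthStatusRepairRequired"
    CONFIGURATION_REQUIRED = "EfiDriverHealthStatusConfigurationRequired"
    FAILED = "EfiDriverHealthStatusFailed"
    RECONNECT_REQUIRED = "EfiDriverHealthStatusReconnectRequired"
    REBOOT_REQUIRED = "EfiDriverHealthStatusRebootRequired"


# Illustrative mapping only; actual firmware behavior may differ.
NEXT_STEP = {
    HealthStatus.HEALTHY: "continue boot",
    HealthStatus.REPAIR_REQUIRED: "invoke the driver's repair operation",
    HealthStatus.CONFIGURATION_REQUIRED: "open the configuration utility",
    HealthStatus.FAILED: "report the status messages and halt",
    HealthStatus.RECONNECT_REQUIRED: "reconnect the driver to the controller",
    HealthStatus.REBOOT_REQUIRED: "reboot the system",
}


def plan_actions(controllers):
    """Map {controller name: HealthStatus} to the steps that need attention."""
    return {
        name: NEXT_STEP[status]
        for name, status in controllers.items()
        if status is not HealthStatus.HEALTHY
    }
```

Given a dictionary of controller names (hypothetical labels chosen by the caller) and their reported status, plan_actions returns only the controllers that are not healthy, which mirrors how the Driver Health Manager lists only the device instances that need attention.
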


Fig. 6 Example of errors detected by UEFI Driver Health Protocol


The following (Fig. 7) is a snapshot of the Driver Health Manager in the case when a driver requires
configuration change. The Driver Health Manager lists all the device instances that require reconfiguration.
You can select each one of them and follow the instructions on the screen to configure the devices.


Fig. 7 Driver Health Manager

5. Dell Diagnostics (ePSA)


Dell Enhanced Pre-Boot System Diagnostics (ePSA) are diagnostic tests that are embedded in the system
(Fig. 8). These tests allow you to check the hardware health status outside the operating system
environment. The findings of these diagnostics can assist you in troubleshooting the fault and working
toward a resolution of the issue.
ePSA can be launched from Boot Manager -> System Utilities -> Launch Diagnostics (Fig. 9).


Fig. 8 Sample screen shot of ePSA


Fig. 9 Launching diagnostics from Boot Manager

6. Red Screen of Death (RSOD)


The Dell server BIOS implements an enhanced CPU exception handler (RSOD), which helps the user and
technical support analyze the software exception when the system crashes in the pre-boot UEFI
environment. The debug information is displayed on the screen, and additional information and stack
traces can be retrieved through the serial port (if available). You can save the dump and use it for
offline debugging.


A sample RSOD display is depicted in Fig. 10.

Fig. 10 An example of the RSOD screen shot


When an exception is raised by the processor, the BIOS displays the RSOD screen with the following
information related to the exception:

- The exception type, such as Page Fault, General Protection Fault, Divide by Zero,
  Breakpoint, and so on.
- A Dell-defined error value, prefixed with UEFIxxxx. Note that a corresponding error is
  logged to the LC log as well.
- Partial register set (x86 64-bit).
- Last-Branch records and associated module names, if available.
- Current RIP and faulting driver module name.
- Stack trace back from the faulted module.

Additional information is available from the serial port dump. To retrieve the serial dump, connect
the server to a client system with a null modem cable, open any terminal program (for example, PuTTY or
HyperTerminal) with the baud rate set to 115200 bps, and then press <ENTER>. The serial dump can also be
retrieved through Serial over LAN (SOL).


Note: The RSOD serial dump can be obtained at the point of failure. The serial session does not have to
be started prior to the RSOD.
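
The retrieval steps can also be scripted instead of using an interactive terminal program. The following Python sketch is a rough illustration under several assumptions: it relies on the third-party pyserial package, it assumes that <ENTER> corresponds to a carriage return, and it keeps pyserial's default framing (8 data bits, no parity, 1 stop bit), which this guide does not specify. The function names capture_dump and save_rsod_dump and the idle-read limit are hypothetical choices for this example.

```python
def capture_dump(port, chunk_size=4096, max_empty_reads=3):
    """Request and collect an RSOD serial dump from an open serial port.

    `port` is any object with write(bytes) and read(n) -> bytes, where read
    returns b"" on timeout (pyserial's serial.Serial behaves this way when
    it is opened with a timeout).
    """
    # Assumption: <ENTER> is sent as a carriage return.
    port.write(b"\r")
    chunks = []
    empty_reads = 0
    # Stop once several consecutive reads time out with no data.
    while empty_reads < max_empty_reads:
        data = port.read(chunk_size)
        if data:
            chunks.append(data)
            empty_reads = 0
        else:
            empty_reads += 1
    return b"".join(chunks)


def save_rsod_dump(device, out_path, baudrate=115200):
    """Open `device` at 115200 bps, capture the dump, and save it to out_path."""
    import serial  # third-party pyserial package; assumed to be installed

    with serial.Serial(device, baudrate=baudrate, timeout=2) as port:
        dump = capture_dump(port)
    with open(out_path, "wb") as handle:
        handle.write(dump)
    return len(dump)
```

The device name passed to save_rsod_dump depends on the client system (for example, a COM port on Windows or a /dev/tty device on Linux). The saved file can then be sent to Dell support together with the screen shot.
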
RSODs are usually caused by software issues, and may be resolved by updating the BIOS, Lifecycle
Controller, or the UEFI firmware for PCIe cards. If you still encounter an RSOD after all the firmware
updates, you may send the screen shot and serial dump to Dell support for further analysis.

7. Yellow Screen of Death (YSOD)


When a hardware error occurs in the UEFI pre-boot environment (excluding the CSM phase in BIOS boot
mode), the Dell server BIOS may display a Yellow Screen of Death (YSOD) with some of the software
context at the time the issue is detected.
The hardware errors include Non-Maskable Interrupts (NMI) and Machine Check Errors (MCE). You should
check the System Event Log (SEL) to identify the source and type of the error. Update the corresponding
device firmware if the error originates from a PCIe device.
Note: The stack trace displayed on the YSOD screen only provides some context information before the
failure, and not the source of the problem.
A sample YSOD is depicted in Fig. 11.


Fig. 11 An example of the YSOD screen shot
