
Troubleshooting Performance Insight

Irwin McCallister HP Software & Services

© 2009 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Agenda
- Introduction
- Log files and tools
- System wide problems
- No Data in report pack
- Missing nodes
- Summary and processing problems
- Reporting and Web Access Server
- Integrations with NNMi

19 May 2010

Introduction
- Data Flow
- Report Packs and Datapipes
- PI Servers and Pollers

Basic Data flow - SNMP Collection

[Diagram: data flow for an SNMP collection. Recoverable labels, roughly from source to report: Devices; Poller (mw_collect, dpipe_snmp, bcp_gateway); ODBC; Datapipe tables <raw>_upld1, <raw>_upld2, <raw>_keys and raw (trendpm); Report Pack tables Base, Rate, Hourly, Daily and Property (trendpm, trend_sum, trend_proc, ovpi_run_sql, SQL); Web Access Server.]

Troubleshooting Outline
- Based on the data flow
- Check the timer, database and Web Access Server (WAS) for major problems
- Follow the data flow:
  - Check data is being collected and inserted
  - Check trendpm raw-to-delta
  - Check summaries and other processing
  - Check reporting

Tools
- Management Console: Home/OVPI Status and Performance snap-in
- Log files
- SelfMonitoring.pl script
- Software Support Online
- PVLmon: add-on management and troubleshooting tool from HP Partner PerVigil

Tools - Management Console
- Example for OVPI Status
- First screenshot shows the database cannot be accessed (Oracle Listener was down)
- After the Listener is fixed, the database can be accessed (and the Tablespace Fill Status can also be displayed)

Log Files
Main PI log files
- trend.log: the main log for PI data collection and processing
- audit.log: start and stop times for processes
- metrics.log: amount of data collected
- website0.log and piweb.log: logs for the Web Access Server (WAS)
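A quick first look at these logs is to see which ones have been written to recently. The sketch below is a hedged illustration only: it assumes all five files sit in one directory that you pass in (this may not hold on a given install, especially for the WAS logs), and the one-hour threshold is a hypothetical value, not a PI default.

```python
import os
import time

# Main PI log files named on this slide.
MAIN_LOGS = ("trend.log", "audit.log", "metrics.log", "website0.log", "piweb.log")


def log_ages(log_dir, now=None, names=MAIN_LOGS):
    """Return {log name: seconds since last write, or None if the file is missing}."""
    now = time.time() if now is None else now
    ages = {}
    for name in names:
        path = os.path.join(log_dir, name)
        try:
            ages[name] = now - os.path.getmtime(path)
        except OSError:
            ages[name] = None  # file not found or not readable
    return ages


def stale_logs(ages, max_age_seconds=3600):
    """Names of logs that are missing or older than max_age_seconds (assumed threshold)."""
    return sorted(n for n, age in ages.items() if age is None or age > max_age_seconds)
```

For example, `stale_logs(log_ages(log_dir))` lists the logs worth opening first; a stale trend.log on a poller suggests collection is not running at all.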


Log Files
Other PI log files
- console.log: Management Console (piadmin) log
- builder.log: Builder log
- viewer.log: Viewer log
- report_<DATETIME>.log: Package Manager log

Log Files
Related log files
- Oracle alert log or Sybase error.log
- JBoss logs

System wide problems - 1
- All reports, NRT and Daily, in multiple report packs are empty
- Check Management Console / Home / OVPI Status
- Next check the trend.log for:
  - Database errors
  - Connection problems
  - Database full
  - Hung processes
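Scanning trend.log for these categories can be sketched as below. The patterns are illustrative assumptions, not the actual PI message texts: `ORA-nnnnn` is the standard Oracle error prefix, and the rest are guesses to adapt once you have seen real entries. Hung processes are not covered, since they tend to show up as an absence of progress rather than as a keyword.

```python
import re
from collections import Counter

# Illustrative patterns only; the real trend.log wording may differ.
PATTERNS = {
    "database error": re.compile(r"ORA-\d+|database error", re.IGNORECASE),
    "connection problem": re.compile(r"connect", re.IGNORECASE),
    "database full": re.compile(r"unable to extend|tablespace.*full", re.IGNORECASE),
}


def classify(line):
    """Return the categories whose pattern matches this log line."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(line)]


def summarize(lines):
    """Count matching lines per category."""
    counts = Counter()
    for line in lines:
        counts.update(classify(line))
    return counts
```

Calling `summarize(open(path))` on a log file gives a rough count per category, which helps decide whether to start with the database or with connectivity.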


System wide problems - 2
- Database connections:
  - ODBC: used by poller processes (mw_collect / pa_collect / ee_collect, bcp_gateway, trendpm) and trendcopy
  - JDBC: used by trend_sum, node_manager, Web Access Server
  - Client: used by ovpi_run_sql (sqlplus on Oracle, isql on Sybase)
- Database connection information comes from systems.xml and config.prp files
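Before debugging ODBC, JDBC or client settings individually, it can help to confirm that the database port is reachable at all from the PI server or poller. This is a minimal sketch using a plain TCP connect; the host and port are placeholders you must supply (1521 is the default Oracle listener port, but your install may use another), and a successful connect only shows that something is listening, not that logins work.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `port_open("dbhost.example.com", 1521)` (hypothetical host name) returning False from a poller while returning True from the server points at a network or firewall issue rather than a database one.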


System Performance
- Check indexmaint and db_delete_data run successfully each day
- Check they do not overlap; if they do, adjust the trendtimer.sched
- Oracle GATHER_STATS_JOB: recommendation is to disable this nightly Oracle job
- For Oracle, involve your DBA to tune the database
- For Sybase, use the dbtuner and see the additional tuning suggestions in the Admin Guide
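The overlap check for indexmaint and db_delete_data reduces to comparing two time intervals once you have each job's start and stop time (for example from audit.log). A small sketch follows; the timestamp format is chosen for this example and is not the format PI writes.

```python
from datetime import datetime


def parse(ts):
    """Parse a 'YYYY-MM-DD HH:MM' timestamp (format assumed for this sketch)."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")


def overlaps(start_a, end_a, start_b, end_b):
    """True if the half-open intervals [start_a, end_a) and [start_b, end_b) intersect."""
    return start_a < end_b and start_b < end_a


# Hypothetical run times for one night.
indexmaint = (parse("2010-05-19 02:00"), parse("2010-05-19 03:30"))
db_delete_data = (parse("2010-05-19 03:00"), parse("2010-05-19 04:00"))
print(overlaps(*indexmaint, *db_delete_data))
```

If the two jobs overlap on most nights, moving one of them later in trendtimer.sched by at least the typical run time of the other is the adjustment the slide refers to.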


No Data in a report pack, other report packs populating
- Check the OVPI Timer on the server and poller (if separate)
- Check reports: NRT (if available), Daily, Forecast
- If NRT reports are empty, check the metrics.log on the poller system
  - If metrics.log is not showing successful data loading, check the trend.log on the poller
  - If metrics.log shows data being loaded, check the trend.log on the poller for trendpm errors or hung processes for the datapipe collections
    - In case of hung trendpm, consider truncating the upload tables if they contain millions of rows
- Check the trend.log on the server for trend_sum or SQL errors or hung processes for the datapipe or report pack
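The checks above form a small decision tree. The sketch below is one reading of it, written as a function that returns the next place to look; the function name, arguments and return strings are invented for illustration, and the first branch (NRT populated, so look at server-side summaries) is an interpretation rather than something the slide states outright.

```python
def next_check(nrt_empty, metrics_shows_loading=None):
    """Suggest the next log to inspect for a report pack that shows no data.

    nrt_empty: True if the NRT reports are empty.
    metrics_shows_loading: True/False once metrics.log has been checked, None if not yet.
    """
    if not nrt_empty:
        # Raw data is arriving, so the gap is likely in summaries on the server.
        return "server trend.log: trend_sum or SQL errors, hung processes"
    if metrics_shows_loading is None:
        return "poller metrics.log: is data being loaded?"
    if not metrics_shows_loading:
        return "poller trend.log"
    return "poller trend.log: trendpm errors or hung processes"
```

For instance, `next_check(nrt_empty=True, metrics_shows_loading=True)` points at trendpm on the poller, which is where the upload-table truncation advice applies.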

Missing nodes - SNMP - 1
- Check NRT or Snapshot reports
- Ensure the node is not on the Exclusion List
- If using multiple pollers, check that the node has been assigned to the correct View group
- Check that the node is in the correct Type group
  - If nodes are not in the correct Type groups, manually re-run Type Discover (from poller if applicable)
  - If still not discovered, use ping and snmpwalk to verify connectivity, SNMP version, community string and MIB support
- SNMP agents are Type Discovered by trend_discover -t
  - Discovery runs each night on the PI server by default
  - Uses SNMP v1 only
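When verifying a node by hand, it is easy to test with a different SNMP version or community string than PI uses. The sketch below only builds the command line, assuming the Net-SNMP `snmpwalk` syntax (`-v`, `-c`, `-t`, `-r`); the snmpwalk available on your PI system may take different options. The default OID is the MIB-II system group, and the host and community values are placeholders.

```python
import shlex


def snmpwalk_command(host, community, version="1", oid="1.3.6.1.2.1.1",
                     timeout=None, retries=None):
    """Build a Net-SNMP style snmpwalk argument list; nothing is executed here."""
    cmd = ["snmpwalk", "-v", version, "-c", community]
    if timeout is not None:
        cmd += ["-t", str(timeout)]   # seconds to wait per attempt
    if retries is not None:
        cmd += ["-r", str(retries)]   # retries after the first attempt
    cmd += [host, oid]
    return cmd


def as_shell(cmd):
    """Render the argument list as a copy-and-paste shell command."""
    return " ".join(shlex.quote(part) for part in cmd)


print(as_shell(snmpwalk_command("192.0.2.10", "public")))
```

Because Type Discover uses SNMP v1 only, testing with `version="1"` first matches what discovery will see; repeat with `"2c"` only if the node is polled with the V2 flag set.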


Missing nodes - SNMP - 2
- Once nodes are assigned to the correct Type (and View) groups, they should appear in NRT reports within about 2 hours
- Check the trend.log on the poller for "produced no results" and other messages relating to the missing node
- If collections still fail, try these actions:
  - Double-check the Community String
  - Uncheck the SNMP V2 flag for that node
  - Increase the SNMP Timeout and Retries in the SNMP Profile
  - Reduce OIDs per PDU in SNMP Profile
  - Disable SNMP v2 GetBulk (if used)
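Raising timeouts and retries while lowering OIDs per PDU trades reliability against polling time. A rough way to see the effect is to compute the number of requests and the worst-case wait for an unresponsive node. This sketch assumes a fixed timeout per attempt with no back-off, which may not match how the PI collector actually retries, and all the numbers are made up.

```python
import math


def requests_needed(num_oids, oids_per_pdu):
    """Number of request PDUs needed to fetch num_oids values."""
    return math.ceil(num_oids / oids_per_pdu)


def worst_case_seconds(num_oids, oids_per_pdu, timeout_s, retries):
    """Upper bound if every request times out on every attempt (1 try + retries)."""
    return requests_needed(num_oids, oids_per_pdu) * timeout_s * (1 + retries)


# Hypothetical profile change: halve OIDs per PDU, keep timeout 2 s and 3 retries.
print(worst_case_seconds(40, 20, 2, 3))  # 2 requests * 2 s * 4 attempts = 16
print(worst_case_seconds(40, 10, 2, 3))  # 4 requests * 2 s * 4 attempts = 32
```

Smaller PDUs help agents that drop or truncate large responses, at the cost of more round trips per node, so apply the change to the affected SNMP Profile rather than globally.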


Missing nodes - SNMP - 3
- Double-check if MIB is fully supported on the node
  - Use snmpwalk for the specific MIB table
- Check for other SNMP errors
  - NoSuchName

Missing nodes - pa_collect - 1
- Check the System Resource NRT or Snapshot reports
- Ensure the server is not on the Exclusion List
- If using multiple pollers, check that the node has been assigned to the correct View group
- Check that the node is in the OVPA Type group
- Performance Agents (PA) or Operations Agents (OA) are added to the OVPA Type group by OVPA_Collection_Daily.pro
  - Discovery runs each night on the server by default
  - Configure HTTPS if agents require it
  - Manually run the discover if needed

Missing nodes - pa_collect - 2
- Check the trend.log on the poller for messages relating to the missing server
  - For PA or OA, both success and failure are logged
- If there are no messages for this server, recheck the OVPA Type (and View) group
- If there are "unable to ping" / "not pingable" messages in trend.log for this server, edit the {DPIPE_HOME}/data/pa_rpt.cnfg and change doping to: doping=no
- If still not collecting, use ovcodautil -ping to verify connectivity from the poller to target server
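The doping change can be scripted when several pollers need it. The sketch below works on the file's text and assumes pa_rpt.cnfg consists of simple `key=value` lines; that layout is a guess from the `doping=no` setting above, so inspect the real file (and keep a backup) before applying anything like this.

```python
def set_option(text, key, value):
    """Return text with `key=value` set: replace an existing `key=...` line or append one."""
    out, found = [], False
    for line in text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)  # unrelated lines are kept untouched
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


# Hypothetical file content, for illustration only.
print(set_option("doping=yes\nother=1\n", "doping", "no"))
```

The function is idempotent, so running it again on an already-edited file leaves `doping=no` in place without duplicating the line.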

Missing nodes - pa_collect - 3
- Once the agent is responding to ovcodautil -ping, pa_collect should also work
- If the base System Resource reports are populating but others are failing to populate for some servers, check the SystemResource_Supported_OVPA_and_EPC_Metrics.pdf document in {DPIPE_HOME}/packages/SystemResource/Docs. Not all metrics/reports are supported from Operations Agent, or on all platforms.
- If using a PI version earlier than 5.40, connectivity to agents can fail if Port 382 is blocked, even if that port is not being used. See KM527647

Slow Collections
- Too few child collectors
  - Default number of child collectors for mw_collect and pa_collect is 5, which is too low
  - Edit the trendtimer.sched on each polling system and increase child collectors to 25 as a starting point by adding -c 25 to the mw_collect and pa_collect entries
- Overloaded system
  - Options are to increase performance/capacity of the system, or reduce load, or both
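The child collector change can be applied line by line to trendtimer.sched. The sketch below assumes the option is written `-c 25` and that it can simply be appended to the end of each mw_collect or pa_collect entry; the sample schedule line is hypothetical, so compare it with your own file before editing, and keep a backup.

```python
import re

COLLECTORS = ("mw_collect", "pa_collect")


def add_child_option(line, children=25):
    """Append `-c <children>` to a collector entry unless it already has a -c option."""
    if not any(re.search(r"\b%s\b" % name, line) for name in COLLECTORS):
        return line  # not a collector entry
    if re.search(r"(^|\s)-c\s+\d+", line):
        return line  # child count already set; leave the existing value alone
    return f"{line.rstrip()} -c {children}"


# Hypothetical schedule entry, for illustration only.
print(add_child_option("5 - - {DPIPE_HOME}/bin/mw_collect -n -i 5"))
```

Entries that already carry a `-c` value are left unchanged on purpose, so a poller that was tuned by hand is not silently reset to 25.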


Summary and processing problems
- Database appears full, usually on Oracle
- Hung processes, usually due to SQL hanging or running very slowly in the database:
  - Usually have to kill the hung process
  - Ensure db_delete_data and indexmaint are running
  - Apply latest Patches or Hotfixes
  - Tune database
  - Reduce load if necessary

Reporting and Web Access Server
- Check piweb.log and website0.log files
- Upgrade to JBoss 4.2.3 with PI 5.4x (post-install step)
- Can increase WAS Java Heap memory if seeing java.lang.OutOfMemoryError (see Admin Guide)
- Report performance improvements in latest PI 5.41 Hotfix
- Thresholds also run in JBoss
- If JBoss won't start, check for an old Java process for PiJBoss
- Derby Database also runs in app server
  - Used to store metrics about collections that are displayed in Performance section of the Management Console
  - Database can grow very large or be corrupted
  - Can reduce retention and recreate Derby Database if needed

Integration with NNMi
- Node Sync
  - Import list of managed nodes and their SNMP Read Community Strings from NNMi to PI
  - PI then Type Discovers the node and starts polling
  - NNMi v8.11 or later patches supported with PI 5.31 or later
  - NNMi v9.0 supported with PI 5.41
- NNMi Incident and Availability Report Pack
  - Provides reports on NNMi Incidents
  - NNMi v8.11 or later patches supported with Report Packs 14.0 (for PI 5.40) or later
  - NNMi v9.0 supported with Report Packs 14.1 (for PI 5.41)

Node Sync with NNMi
- See the Integration Guide for configuration
- Uses nnmpi_wizard to configure on PI, nnmpi_cmd for automated sync
- Logs to {DPIPE_HOME}/log/NNMPI_Wizard.log
- For PI 5.40, upgrade to 5.41 or apply Hotfix 5.40.000 piweb HF05
- Patch NNMi 8 to the latest version available, currently NNM 8.x Patch 8

NNMi Incident & Availability Report Pack
- See the Incident and Availability Report Pack User Guide for information
- Logs to {DPIPE_HOME}/log/NNMi_Datapipe.log
- For Report Packs 14.0, upgrade to 14.1 (or apply Hotfix 5.40.000 piweb HF05)
- Patch NNMi 8 to the latest version available, currently NNM 8.x Patch 8
- Ensure configuration follows KM752866: PI server name needs to match systems.xml file
- Apply Hotfix for QCCR1B41849

Known Issues in Latest Releases
- Report Packs 14.1
  - Problems after upgrading Service Assurance and Cisco SAA Datapipes from earlier versions (does not impact new installs of the 14.1 versions of these packages)
  - Upgrades to 14.1 (and 14.0) versions of the SystemResource_Disk and SystemResource_NetInterface from previous versions are extremely slow
  - Request Hotfixes before upgrading these packages

Questions?


To connect with your peers, visit the HP Software Solutions Community:

www.hp.com/go/swcommunity