
SmartCloud Application Performance Management and Analytics
SmartCloud Analytics Log Analysis & Predictive Insights

2013 IBM Corporation

Please note
IBM's statements regarding its plans, directions, and intent are subject to change or withdrawal
without notice at IBM's sole discretion.
Information regarding potential future products is intended to outline our general product direction
and it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment, promise, or
legal obligation to deliver any material, code or functionality. Information about potential future
products may not be incorporated into any contract. The development, release, and timing of any
future features or functionality described for our products remains at our sole discretion.

Performance is based on measurements and projections using standard IBM benchmarks in a
controlled environment. The actual throughput or performance that any user will experience will
vary depending upon many factors, including considerations such as the amount of
multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the
workload processed. Therefore, no assurance can be given that an individual user will achieve results
similar to those stated here.


IBM Investment in Analytics

2005 to 2013:
- More than $16B in Acquisitions Since 2005
- More than 10,000 Technical Professionals
- More than 7,500 Dedicated Consultants
- Largest Math Department in Private Industry
- More than 27,000 Business Partner Certifications

Capability areas listed: Social Analytics/Consumer Insight, Workload Optimized Systems, Advanced Case Management, Content Analytics, Decision Management, Stream Computing, Pervasive Content, pureScale, pureXML, Deep Compression, Developer Productivity, Autonomic Operations

Shifting market for IT Operations

APM Digest survey* of Senior IT Ops @ Fortune 500
- 50% growing dissatisfaction with traditional performance management solutions for Production IT
- 75% of them are dissatisfied with their Business management solutions
- Inability to adapt to rapidly changing applications & workloads (Systems of Interaction)
- 30% of them believe that they do not have a way to proactively detect problems
- Looking to operate on raw data and gain actionable insights

Operational Visibility

IT Overwhelmed by data

IT Analytics solutions can predict, detect and help solve problems by churning through piles of data and
translating this to understandable, relevant information, and actionable insights.
* Source: APMDigest:
http://apmdigest.com/it-analytics-emerging-as-dissatisfaction-grows-with-apm-and-bsm-tools


Business Value to IT Analytics Adoption


Optimized Performance
Track, Optimize, and Predict capacity and performance needs over time
Perform:
- Track capacity and performance of applications and services in classic and cloud environments
- Optimize resource deployment with what-if and best fit planning tools
- Increase utilization of existing assets

Predictive Outage Avoidance
Ensure availability of applications and services
Predict:
- Use learning tools to augment custom best practices
- Leverage statistical methods to maximize predictive warning
- Use past maintenance to predict part failures

Faster Problem Resolution
Find & correct problems faster with tools that determine actions required to resolve issues
Search:
- Identify problems quicker with insight to large unstructured repositories
- Isolate problems quicker by bringing relevant unstructured data into problem investigations
- Repair problems quicker with the right details quickly to hand.

Improved Insight
Enhance visibility into systems resource relationships while increasing customer satisfaction
Know:
- Determine what resources are interdependent to assess impact of failures
- Gain insight into what is important to your customer
- Decrease customer churn and acquisition costs while increasing customer retention and satisfaction

Lower IT Administration Costs with Automated Analytics
- Escalate performance and capacity issues automatically, reducing manual analysis efforts
- Reduce manual customization using learning tools that automatically adjust to new normals
- Detect and present problems with a proposed resolution, to be able to do more with less


IT Data Requirements - Metrics, Events and Logs


IT professionals need Metrics, Events and Logs to resolve IT issues.
When we need to resolve problems with workloads we typically look at three types of data:
- Metrics: structured performance data
- Events: discrete alerts
- Logs: unstructured/semi-structured data

Metrics and events can tell you what is happening.
To answer why, you often need to look at the logs.


Operations / Performance Data is Exploding


A typical enterprise with 5000 servers, running 125 applications across 2 to 3
data centers, generates in excess of 1.3 TB of data per day.

Data Ratio:
- Metric Data: 3%
- Unstructured Data: 97%

Only 3% of the data generated is operations-oriented metric data.
97% is made up of unstructured/semi-structured data.
Workloads are running on heterogeneous platforms.
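To make the ratio concrete, the quoted figures can be turned into daily volumes. The sketch below only restates the slide's numbers (1.3 TB per day, 5000 servers, 3% metric data); decimal units (1 TB = 1000 GB) and the function name are assumptions made for illustration.

```python
# Figures quoted on the slide.
TOTAL_TB_PER_DAY = 1.3
SERVERS = 5000
METRIC_SHARE = 0.03  # structured, operations-oriented metric data


def daily_breakdown(total_tb: float, metric_share: float, servers: int) -> dict:
    """Split a daily data volume into metric and unstructured parts.

    Decimal units are assumed: 1 TB = 1000 GB, 1 GB = 1000 MB.
    """
    total_gb = total_tb * 1000
    metric_gb = total_gb * metric_share
    return {
        "metric_gb": metric_gb,
        "unstructured_gb": total_gb - metric_gb,
        "per_server_mb": total_gb * 1000 / servers,
    }


breakdown = daily_breakdown(TOTAL_TB_PER_DAY, METRIC_SHARE, SERVERS)
for name, value in breakdown.items():
    print(f"{name}: {value:.0f}")
```

Under these assumptions the split is roughly 39 GB of metric data against about 1,261 GB of unstructured data per day, or on the order of 260 MB per server.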


SmartCloud Analytics Marketecture Overview


End Customer Client value: Plan and optimize; Faster problem detection and resolution; Failure Risk Estimation and Avoidance; Insight & Care

Flexible Consumption Models: Digital Download, On Premise, SaaS, Embed

BP / Customer driven Ecosystem: Smarter Infrastructure Insight Pack, IT Operational Insight Pack

Other diagram labels:
- Search, Optimize, Predict
- Systems, Workloads, Wireless, Applications, Network, Voice, Security, Mainframe, Storage, Assets
- InfoSphere BigInsights, InfoSphere Information Server, InfoSphere Streams, IBM Watson, more . . .

SmartCloud Analytics delivers end-to-end problem resolution


Integration with various types of solutions to accelerate end-to-end
problem resolution and increase visibility into the IT systems*

[Diagram: SmartCloud Analytics (Logs, Support docs / Social data) at the centre, linked to surrounding solutions]

Monitoring Solution (Metrics):
- Link metrics in the context of Log search results.
- Add log data into APM and IT Dashboards
- Search metric data

Problem / Anomaly Detection Solutions:
- Detect and alert on anomalies based on trends observed in logs
- Search logs in the context of an anomaly event

Event Management Solutions (Events):
- Add and search event data in Log Analytics
- Show log searches in the context of events
- Show events in the context of Log search
- Alerts generated from scheduled searches

Discovery and APM solutions (Config / Topology):
- Refine / scope search in logs and docs using topology and configuration context

Service Desk Solution (Tickets):
- Search and analyze service tickets
- Search events, logs, docs with ticket context

*Planned roadmap items


Examples: Analytics for Operational Environments

Capacity Trending
- Server, Process, Middleware and DB Trending to automatically highlight risk while there is time available to avoid outages or slow-downs
- Extend with SPSS Correlation for maximum confidence

Dynamic Thresholds
- Automatically recommend and set thresholds based on attribute performance.

Event Thresholds
- Manage flows to highlight important alerts:
  -- throttle floods
  -- escalate threshold events that are important

Dynamic Thresholds and Trending are built-in to Tivoli Monitoring for immediate value,
and significantly reduce Set-up and Administration of Monitoring Environments
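As a rough illustration of what "recommend thresholds based on attribute performance" can mean, the sketch below derives an upper threshold from a metric's own history. The mean-plus-k-standard-deviations rule, the function names and the sample values are assumptions chosen for illustration; the slides do not describe the rule Tivoli Monitoring actually applies.

```python
from statistics import mean, stdev


def recommend_threshold(history: list[float], k: float = 3.0) -> float:
    """Recommend an upper threshold as mean + k standard deviations.

    This rule is an illustrative assumption, not the product's algorithm.
    """
    if len(history) < 2:
        raise ValueError("need at least two samples to estimate spread")
    return mean(history) + k * stdev(history)


def breaches(samples: list[float], threshold: float) -> list[float]:
    """Return the samples that exceed the recommended threshold."""
    return [s for s in samples if s > threshold]


# Hypothetical CPU utilization history (percent) for one attribute.
cpu_history = [40, 42, 41, 39, 43, 40, 41]
threshold = recommend_threshold(cpu_history)
print(f"recommended threshold: {threshold:.1f}")
print(breaches([44, 47, 90], threshold))
```

Because the threshold is recomputed from recent history, it follows the attribute as its normal level shifts, which is the administration saving the slide points to.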


Example: Capacity Analytics for Cloud & Virtualized Infrastructure


Provides visibility of how Resources are Allocated to Applications and Services
- Cloud consumers are conservative in estimating their system needs (over / under estimate)
- Understanding historical behavior helps optimize capital management

Know what Resources are Available and Predict how they will be Used
- Maintain awareness of total and available capacity
- Predict physical and virtual resource capacity bottlenecks
- Gain business agility by determining room for expansion via what-if analysis

Optimize Resource Allocation
- Right-size virtual machines
- Policy-driven workload placement for performance and security optimization

Visibility into the cloud infrastructure
- See and manage all major hypervisor environments from one place
- Leverage perspective from the past to the future to ensure resource availability
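Predicting a capacity bottleneck from historical behavior can be sketched as a trend extrapolation. The example below fits a straight line to daily utilization samples and estimates how many days remain before a limit is reached. The least-squares line, the function names and the sample data are assumptions for illustration only; the product's own trending method is not described in the slides.

```python
from typing import Optional


def linear_fit(ys: list[float]) -> tuple[float, float]:
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns (slope, intercept)."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def days_until(ys: list[float], limit: float) -> Optional[float]:
    """Days after the last sample until the fitted trend reaches `limit`.

    Returns None when the trend is flat or falling (no bottleneck predicted).
    """
    slope, intercept = linear_fit(ys)
    if slope <= 0:
        return None
    crossing_day = (limit - intercept) / slope
    return max(0.0, crossing_day - (len(ys) - 1))


# Hypothetical daily datastore utilization in percent.
utilization = [50, 52, 54, 56, 58]
print(days_until(utilization, limit=90))
```

The same function can back a simple what-if check: raising `limit` models adding capacity, and the returned number of days shows how much headroom that buys.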


IBM SmartCloud Analytics Log Analysis


Delivers Problem Isolation and Faster Problem Resolution
- Search and index unstructured data to provide a consolidated view
- Built on IBM's Big Data platform
- Integrate structured and unstructured data for better problem identification and resolution
- Extensible, with IBM and partner expertise built-in
- Get the last critical piece of data for identifying, isolating, and correcting problems faster

Faster Problem Resolution
Find & correct problems faster with tools that determine actions required to resolve issues
Search:
- Identify problems quicker with insight to large unstructured repositories
- Isolate problems quicker by bringing relevant unstructured data into problem investigations
- Repair problems quicker with the right details quickly to hand.


IBM SmartCloud Analytics Log Analysis


Collects large volumes of obscure unstructured data and transforms it
through analytics into actionable intelligence.

[Diagram labels: GBs of Obscure Log Files; Insight Packs; Intelligent Support Docs Integration through Advanced Text Analytics; Single Actionable Dashboard]


IBM SmartCloud Analytics - Log Analysis Client Value


IBM SmartCloud Analytics Log Analysis helps IT Generalists and Application
Specialists accelerate problem resolution through rapid analysis of unstructured data

Value

Faster Problem Identification and Isolation
- Quickly search structured and unstructured data.
- Perform cross domain analysis on this data.

Faster Problem Repair
- By linking expert knowledge to log error/warning messages

Improved Service Availability and Maintainability of Custom Apps
- Provide users with advanced insights into custom applications quickly

Highlights
- Collection and Annotation of data
- Generic Logs Support
- Federation of Data
- Advanced Text Analytics
- Downloadable insight packs on the ISM Library starting with WebSphere and DB2
- Tools to create custom insight packs for your own applications


SmartCloud Analytics Roadmap Overview


Release: SmartCloud Analytics Log Analysis v1.1, Workgroup Edition
Key Date: June 2013
Key Capabilities:
- Fast to install and download
- Data collection, annotation and indexing
- Search UI
- WAS & DB2 insight packs
- Generic log support
- Insight pack tooling

Releases: SmartCloud Analytics Log Analysis V.Next, Workgroup Edition and Enterprise Edition
Key Date: Q4 2013
Key Capabilities:
- Enterprise scalability*
- Integration with Tivoli Monitoring and Event Management Solutions
- Additional Content and Tooling
- Logstash support for data collection

* Enterprise scalability only on Enterprise Edition



Linking Information
Linking of search results with structured data
Supports linking indexed data with federated sources
Plan to provide out of the box linkages with key Tivoli/IBM products


Expert Guidance
Provides Expert advice by searching support docs
E.g. when there is an error message found in a log file, search in support documentation for relevant
information on further explanation and/or fix.
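The lookup described above can be sketched in a few lines: pull the message ID out of a log line, then look it up in an index of support documents. The message IDs come from the log samples elsewhere in this deck; the `SUPPORT_DOCS` index, its note texts and the function name are hypothetical, and a real deployment would search full support documentation rather than a dictionary.

```python
import re

# Hypothetical support-document index keyed by message ID.
# The note texts are placeholders, not real IBM support content.
SUPPORT_DOCS = {
    "SRVE0068E": "Placeholder note: uncaught exception in a servlet service method.",
    "DSRA1120E": "Placeholder note: application did not close all connection handles.",
}

# Message IDs in the sample logs look like four or five letters,
# four digits and a severity letter (for example SRVE0068E).
MESSAGE_ID = re.compile(r"\b([A-Z]{4,5}\d{4}[EWI])\b")


def expert_advice(log_line: str, docs: dict = SUPPORT_DOCS) -> list:
    """Return (message_id, note) pairs for every known ID found in the line."""
    return [(mid, docs[mid]) for mid in MESSAGE_ID.findall(log_line) if mid in docs]


line = (
    "[10/9/12 5:51:38:295 GMT+05:30] 0000006a servlet E "
    "com.ibm.ws.webcontainer.servlet.ServletWrapper service SRVE0068E: "
    "Uncaught exception created in one of the service methods of the servlet"
)
for message_id, note in expert_advice(line):
    print(message_id, "->", note)
```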


A Healthcare Provider reduces time to diagnose system problems by providing a holistic view of all relevant data

Need
- Had too many tools across structured and unstructured datasets, making problem resolution difficult and time consuming
- Desired a solution to time-correlate a view into many sources of data to perform problem detection, isolation and repair

Benefits
- Reduced time to determine root cause of problems by leveraging performance, event and log data
- Skills required to diagnose problems were easily saved and repeated to reduce overall costs


90-Day Free Trial Available

http://www-01.ibm.com/software/tivoli/products/log-analysis

Proactive, Predictive and Preventative Management


- Few companies are genuinely proactive or preventative
- Most organizations react to service outages in progress
- Diagnosis can be complicated by organizational silos, disparate tools, complexity and the sheer volume of data.
- Outages and degradation can cost millions of dollars, and impact brand, customer churn & retention
- CxOs are challenging their management teams to prevent outages rather than just reacting after failure.

Why aren't operations teams preventative today?
- Too much data to analyze manually
- Existing analytic techniques, such as standard thresholds, are not up to the task
- They cannot detect problems while they are emerging (before business impact)
- Set the threshold too high: insufficient warning before total failure.
- Set the threshold too low: too much noise, and everything is ignored

SmartCloud Analytics Predictive Insights


Proactive and self-learning Performance intelligence
- Real-time analytics for detecting and avoiding service disruption
- Uses advanced Watson research algorithms
- Correlates metrics across multiple domains and heterogeneous data sources.

Leverages IBM Big Data technology
- Embeds InfoSphere Streams, IBM's unique streaming analytic engine
- Enables ultra-high scalability on commodity server computing clusters and large algorithm sizes to maximize machine intelligence value
- Leverages InfoSphere DataStage, IBM's market leading mediation solution
- Quickly integrate to any monitoring source using a large library of out-of-the-box connectors

Leverages your Tivoli and non-Tivoli environments

Predictive Anomaly Detection using Behavior Learning


Automated problem detection
- Learn the environment through statistical analysis & correlation
- Predict problems with high confidence based on changes in metric behavior
- Augment manually applied thresholds by noticing when metric behavior deviates from normal
- Watch related groups of metrics for additional insight and maximum predictive warning

Discover and Group Resources based on behavior
- Augments service and application modeling

Powered by IBM Big Data and Data Mediation for rapid delivery, performance & scalability


Example Scenario: Internet Banking Application


Goal: Automatically learn normal mathematical relationships between metrics

[Figure: Internet Banking application stack (Application, ESB, Java / WAS, AIX, RHEL, Oracle, Windows, Core Banking Application on z/OS) beside a chart of Web Response Time and User Requests over Time, marking WRT Good, Anomaly Event, Early Warning, WRT Bad and Business Impacted]

- Learns that Web Response Time has a normal causal relationship with User Requests: WRT gets slower as user load gets higher.
- If this healthy historical relationship breaks down, say due to a memory leak, an anomaly is raised immediately
- The problem is detected even while WRT service is good

Emerging problems can be detected even while service levels are good in absolute terms
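The scenario can be approximated with a very small model: learn how Web Response Time normally depends on User Requests, then flag samples where that relationship no longer holds, whether or not the absolute value is still acceptable. The sketch uses a plain least-squares line and a residual tolerance; this is an assumption chosen for illustration, not the algorithm used by Predictive Insights, and all names, sample values and the SLA figure are hypothetical.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Least-squares fit y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar


class RelationshipModel:
    """Learns the normal relationship between load and response time."""

    def __init__(self, requests, response_times, k=4.0):
        self.slope, self.intercept = fit_line(requests, response_times)
        residuals = [rt - self.predict(r) for r, rt in zip(requests, response_times)]
        # Deviations larger than k residual standard deviations count as anomalous.
        self.tolerance = k * stdev(residuals)

    def predict(self, requests):
        return self.slope * requests + self.intercept

    def is_anomalous(self, requests, response_time):
        return abs(response_time - self.predict(requests)) > self.tolerance


# Hypothetical training data: requests per minute and response time in seconds.
model = RelationshipModel(
    requests=[100, 200, 300, 400, 500],
    response_times=[0.21, 0.29, 0.41, 0.50, 0.59],
)

SLA_SECONDS = 1.0  # hypothetical fixed threshold
sample = (150, 0.45)  # light load, yet slower than the learned relationship predicts
print("breaks fixed threshold:", sample[1] > SLA_SECONDS)
print("breaks learned relationship:", model.is_anomalous(*sample))
```

In this made-up data the sample is well inside the fixed threshold but far from the learned line, which mirrors the slide's point that the anomaly is raised while WRT is still good in absolute terms.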


Correlation of Multiple Metrics


Statistical models can discover mathematical relationships between metrics

[Figure: two copies of the Internet Banking application stack (Application, ESB, Java / WAS, AIX, RHEL, Oracle, Windows, Core Banking Application on z/OS), showing related metrics across components]

The extent this can be achieved depends on a number of factors, such as: range and type of data, availability of
data, and stability of environment. Analytics falls back to a single metric if metrics are unrelated.
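The fallback described above can be illustrated with a correlation check: metrics that move together are modeled as a group, and unrelated metrics are each modeled on their own. Pearson correlation and the 0.8 cut-off are assumptions picked for this sketch; the slides do not say which statistic or threshold the product uses, and the metric names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def metric_groups(name_a, a, name_b, b, min_abs_r=0.8):
    """Group two metrics if strongly correlated, else fall back to single metrics."""
    if abs(pearson(a, b)) >= min_abs_r:
        return [[name_a, name_b]]
    return [[name_a], [name_b]]


requests = [100, 200, 300, 400, 500]
cpu = [10, 20, 30, 40, 50]   # rises with load
fan_rpm = [5, 1, 4, 1, 5]    # unrelated to load in this made-up data

print(metric_groups("user_requests", requests, "was_cpu", cpu))
print(metric_groups("user_requests", requests, "fan_rpm", fan_rpm))
```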

Multiple Metrics Analysis - Value of this approach


- Learns normal operational behaviour across the infrastructure, including how metrics behave together.
- Maximize Advance Warning: Identifies metric relationship changes that signal a problem long before traditional thresholds
- Identifies problems before you know to look for them
- Detects service impacts that are not identifiable by fixed thresholds alone.
- Assists with root cause analysis by indicating the most offending metrics.
- Provides a more intelligent real-time assessment of data, able to detect problems as they are emerging
- Reduces expensive and time consuming false alerts.


Large retail bank increased online banking application availability through predictive analytics

Need
- Ensure critical retail banking applications were online 24x7 for high customer satisfaction
- Proactive anomaly detection was needed to ensure adequate time to resolve major incidents before they became service impacting

Benefits
- Alerted 10 major incidents in a 4 week period in advance of customer detection
- Simultaneously monitors over 80 servers and 40K metrics
- Estimated savings with outage avoidance analytics was $600K for this 4 week period

Example: Field Trial at Large Retail Bank


Retail Bank experiencing severe problems with their online banking application

Trial Scope:
- Online banking service with back end application
- ITM AIX, Linux, Windows, ITCAM for WAS, ITCAM for WRT
- ~80 servers
- ~40k metrics

Results:
- 15 Major Incidents reported during the 4 week trial period
- 10 major incidents were detected or predicted by SCA-Predictive Insights
- 5 missed incidents were application code problems and did not manifest in health metrics
- 100% of detectable problems detected

Prediction & Detection Intervals:
- Report included a Problem Start Time, a Problem Detection Time and Problem Resolution Time
- 6 out of the 10 detected incidents were predicted before the customer's Problem Start Time
- All 10 detected problems were detected before or around the customer's Problem Detection Time interval

Results for this Customer
- Using industry average outage costs, potential outage avoidance savings for 4 weeks: $600k
- Event reduction savings for 4 weeks: $53k
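The percentages behind these results follow directly from the incident counts. The short calculation below only restates the numbers reported for the trial; the function and key names are illustrative.

```python
def trial_summary(reported: int, detected: int, not_in_metrics: int, predicted_early: int) -> dict:
    """Derive the trial's detection ratios from the reported incident counts."""
    detectable = reported - not_in_metrics
    return {
        "detectable": detectable,
        "share_of_detectable": detected / detectable,
        "share_of_all_reported": detected / reported,
        "share_predicted_early": predicted_early / detected,
    }


# Counts from the field trial: 15 reported, 10 detected, 5 not visible in
# health metrics, 6 predicted before the customer's Problem Start Time.
summary = trial_summary(reported=15, detected=10, not_in_metrics=5, predicted_early=6)
for name, value in summary.items():
    print(name, value if isinstance(value, int) else f"{value:.0%}")
```

The "100% of detectable problems" figure therefore refers to the 10 incidents that showed up in health metrics; measured against all 15 reported incidents the share is about two thirds.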

Backup


Solution Architecture - Mediation


[Figure: Predictive Insights architecture stack, listed top to bottom]
- User Interface & Management: Tivoli Integrated Portal
- Post-Processing Rules: uses OMNIbus Rule Engine
- Anomaly Consolidation
- Analytic Application
- Analytic Engine: IBM InfoSphere Streams
- Mediation: IBM InfoSphere DataStage

Mediation:
- Market leading mediation, provided as option
- Proven rapid integration to new data sources.
- Productivity tooling & collaboration included
- High performance and scalability.
- Large framework of connectors.
- Fast integration to common monitoring data formats.
- Windows based development environment


Mediation Rapid Common Extraction


Predictive Insights provides a quick setup Common Extractor feature that allows fast extraction
from the most common interface types such as:
- CSV
- Databases and database connectors, e.g. JDBC

Monitoring Suite                 Interface            Implemented in trials
HP Sitescope                     JDBC                 Yes
Quest Foglight                   Script dump to CSV   Yes
CA Wily Introscope               JDBC                 Yes
IBM ITM TDW                      DB2                  Yes
IBM TDW Proxy Agent (low lat)    CSV                  Yes
IBM ITCAM TDW                    DB2                  Yes
Compuware VAM                    Script dump to CSV   Yes
HP Mercury BAC                   JDBC                 Yes
IBM Performance Manager          CSV                  Yes
Brix                             CSV                  Yes
IBM Service Quality Manager      CSV                  Yes

Other extractions can be quickly built from a large library of DataStage connectors
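For the CSV interface type, extraction amounts to turning rows into per-metric time series. The sketch below is a generic illustration using Python's standard `csv` module, not the Common Extractor itself; the column names (`timestamp`, `resource`, `metric`, `value`) and the sample rows are assumptions, since the slides do not show the file layouts used in the trials.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV dump from a monitoring suite.
SAMPLE = """timestamp,resource,metric,value
2013-06-01T00:00,web01,cpu,41.5
2013-06-01T00:05,web01,cpu,43.0
2013-06-01T00:00,db01,io_wait,3.2
"""


def load_metrics(csv_text: str) -> dict:
    """Group CSV rows into time series keyed by (resource, metric)."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["resource"], row["metric"])
        series[key].append((row["timestamp"], float(row["value"])))
    return dict(series)


for key, points in load_metrics(SAMPLE).items():
    print(key, points)
```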

Mediation Connector Library - InfoSphere DataStage

RDBMS: DB2 (on Z, I, P or X series), Oracle, Informix (IDS and XPS), Ingres, Netezza, Progress, RDB, RedBrick, SQL/DS, SQL Server, Sybase (ASE & IQ), Teradata, Universe, UniData, NonStop SQL, InfoSphere Federation Server, InfoSphere Classic Federation, and more

General Access: Sequential File, Complex Flat File, File Set, Data Set, Named Pipe, iWay, FTP, SFTP, Compressed / Encoded Data, External Command Call, Parallel/wrapped 3rd party apps, EMC InfoMover, Web logs, Email

Standards & Real Time: WebSphere MQ, Java Messaging Services (JMS), Java, XML & XSL-T, EBXML, Web Services (SOAP), Enterprise Java Beans (EJB), EDI, FIX, SWIFT, HIPAA

Legacy: Allbase/SQL, C-ISAM, D-ISAM, Datacom/DB, DS Mumps, Enscribe, Essbase, FOCUS, IDMS/SQL, ImageSQL, Infoman, KSAM, M204, MS Analysis, Nomad, Nucleus, RMS S2000, Supra, TOTAL, TurboImage, Unify, and many more

Enterprise Applications: JDE/PeopleSoft OneWorld, Oracle Applications, PeopleSoft, SAS, SAP BW, SAP R/3, Siebel, Ariba, Manugistics, I2

CDC: DB2 (on Z, I, P, X series), Oracle, SQL Server, Sybase, Informix, IMS, VSAM, ADABAS, IDMS, Datacom, etc.

Solution Architecture - Analytic Engine

[Figure: Predictive Insights architecture stack; layers as listed on the Mediation slide]

Analytic Engine (IBM InfoSphere Streams):
- Real-time streaming analytic engine, provided as a component
- High volume and low latency.
- Supports server clustering and redundancy (next rel)
- Enables large algorithm capacity: 80,000 metrics in a single algorithm instance (a typical banking application produces ~30,000 - 60,000 metrics)
- Allows multiple algorithm instances spread across commodity server computing clusters, taking maximum advantage of multi-core parallelism (next rel)

Solution Architecture - Analytics

[Figure: Predictive Insights architecture stack; layers as listed on the Mediation slide]

Analytic Application:
- Automated anomaly detection and prediction on time-series performance metrics
- Behavioural learning to model not only one metric at a time, but the relationships between them for anomaly detection...
- Single metric evaluation replacing many manual thresholds for any time series data
- Multiple metric correlation enabling earlier detection than traditional thresholds with higher confidence


Solution Architecture - Anomaly Consolidation

[Figure: Predictive Insights architecture stack; layers as listed on the Mediation slide]

Anomaly Consolidation:
- Targeted for next release, the alarm consolidation framework reduces the events that are presented externally, allowing for efficient processing and accurate alerts.
- Different techniques will be selectable depending on the richness of the data processed.
- It reduces the volume of external alarms forwarded to event consoles or application/domain administrators, without removing any information that could be useful in prediction, detection or RCA.

Internal Events:
  UV: Node A: Metric 1
  UV: Node B: Metric 2
  UV: Node C: Metric 3
  MV: Node B, C: Metric 2, 3
  MV: Node A, B, C: Metric 1, 3, 2
  UV: Node M: Metric 47

External Events:
  EXT: Node A, B, C, Metric 1, 2, 3
  EXT: Node M, Metric 47
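One way to picture the consolidation step is as a merge of internal events that share a node. The sketch below reproduces the slide's example, where six internal events collapse into two external ones. Merging on overlapping node sets is an assumption made for illustration; the slide states that the actual techniques are selectable and does not define them, and the function name is hypothetical.

```python
def consolidate(internal_events):
    """Merge internal events whose node sets overlap into external events.

    Each internal event is a (nodes, metrics) pair. The overlap rule is an
    illustrative assumption, not the product's consolidation technique.
    """
    groups = []  # disjoint groups, each {"nodes": set, "metrics": set}
    for nodes, metrics in internal_events:
        merged = {"nodes": set(nodes), "metrics": set(metrics)}
        remaining = []
        for group in groups:
            if group["nodes"] & merged["nodes"]:
                # Absorb every existing group that shares a node with this event.
                merged["nodes"] |= group["nodes"]
                merged["metrics"] |= group["metrics"]
            else:
                remaining.append(group)
        remaining.append(merged)
        groups = remaining
    return [(sorted(g["nodes"]), sorted(g["metrics"])) for g in groups]


# Internal events from the slide, as (nodes, metrics) pairs.
internal = [
    (["A"], [1]),
    (["B"], [2]),
    (["C"], [3]),
    (["B", "C"], [2, 3]),
    (["A", "B", "C"], [1, 3, 2]),
    (["M"], [47]),
]
for nodes, metrics in consolidate(internal):
    print("EXT: Node", ", ".join(nodes), "Metric", ", ".join(map(str, metrics)))
```

No node or metric is dropped in the merge, which matches the slide's claim that volume is reduced without removing information useful for prediction, detection or RCA.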


Solution Architecture - Anomaly Post Processing

[Figure: Predictive Insights architecture stack; layers as listed on the Mediation slide]

Post-Processing Rules (uses OMNIbus Rule Engine):
- The post-processing engine allows anomaly events to be modified, customized, or enriched
- It can optionally be used to put some business/domain context around the domain agnostic analytic anomaly events.
- Typically this will be used to re-prioritize anomaly severity to major if it is service impacting.
- For example, if Metric 2 represents Online Banking Web Response Time, then the anomaly severity can be changed to Major.
- This reuses OMNIbus Probe rules libraries, but is dependent on having a northbound OMNIbus object server to receive the anomaly events.

Internal Events:
  UV: Node A: Metric 1
  UV: Node B: Metric 2
  UV: Node C: Metric 3
  MV: Node B, C: Metric 2, 3
  MV: Node A, B, C: Metric 1, 3, 2
  UV: Node M: Metric 47

External Events:
  EXT: Node A, B, C, Metric 1, 2, 3
  EXT: Node M, Metric 47
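The re-prioritization example can be expressed as a single enrichment rule. The sketch below is written in Python purely to show the logic; it is not OMNIbus probe rules syntax, and the event fields, the default severity and the list of service-impacting metrics are hypothetical.

```python
from dataclasses import dataclass, replace

# Hypothetical list of metrics considered service impacting.
SERVICE_IMPACTING = {"Online Banking Web Response Time"}


@dataclass(frozen=True)
class AnomalyEvent:
    node: str
    metric: str
    severity: str = "Minor"  # assumed default for domain-agnostic anomalies


def post_process(event: AnomalyEvent, service_metrics=SERVICE_IMPACTING) -> AnomalyEvent:
    """Raise severity to Major when the anomalous metric is service impacting."""
    if event.metric in service_metrics:
        return replace(event, severity="Major")
    return event


print(post_process(AnomalyEvent("B", "Online Banking Web Response Time")))
print(post_process(AnomalyEvent("M", "Disk Queue Length")))
```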


Solution Architecture - User Interface & Management

[Figure: Predictive Insights architecture stack; layers as listed on the Mediation slide]

User Interface & Management (Tivoli Integrated Portal):
- TIP based anomaly visualization
- Allows all anomalous metrics to be visualized together
- Normalizes metric scales and allows pan/zoom etc., so that anomalous conditions are more readily apparent.
- In-context linking between OMNIbus, TBSM, ITMM AEL and anomaly charts
- Inherits all TIP features for unified user management and permissions.
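Normalizing metric scales is what lets metrics with different units share one chart. A minimal sketch, assuming simple min-max scaling to the range 0 to 1 (the slides do not say which normalization the charts use; the function name and sample values are hypothetical):

```python
def normalize(values: list[float]) -> list[float]:
    """Min-max scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical series in different units, plotted on one normalized axis.
response_time_s = [0.2, 0.3, 0.4]
user_requests = [10, 20, 30]
print(normalize(response_time_s))
print(normalize(user_requests))
```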


IBM SmartCloud Analytics - Differentiators

Licensing Model: Based on average data consumption, not your worst day

Expert Advice: Out of the box IBM expertise based on advanced text analytics

Text Analytics: Only product in industry with advanced text analytics engine able to extract insights from unstructured sources like service tickets & support documents

Integration: Integration with Tivoli products

Data Federation: Easily link structured and unstructured data

Platform: Core technology based on IBM's Big Data Platform; potential to use a common Big Data platform for the entire business

Easily Extendable: For customers, OEM or value-added reseller to add their own Insight Packs

Multi Source: Ingests metrics, config, events, logs, traces, and topology to perform RCA, not just logs



IBM SmartCloud Analytics - Differentiators

Anomaly Detection: Anomaly detection and prediction on logs and metrics; helps users know what to search for and what is trending

End to end Solution: End to end solution to predict, analyze and resolve problems, not just a point solution that analyzes a narrow spectrum of data


User Scenarios


Targeted Users
IT Operations / Operations Teams
Continuously monitors for anomalies across the entire infrastructure. When an anomaly is detected, this user routes the problem to the right team.

Application developer
Develops and tests large distributed multi-component applications on a middleware stack. Debugs application during development.

Application owner/Support Engineer
Supports a business application that is built on middleware. Needs to solve problems quickly to avoid any business impact. Needs to capture best practices in a tool so that diagnosis is less dependent on skills availability.

Analytic Content Creator
Domain experts build content and leverage available support documentation


Example 1
A developer validates a complex distributed
application by running tests overnight. He wants to
know which points in his application are causing
exceptions through detailed analysis of the stack
traces.

- He uses text analytics of complex logs such as stack traces to find patterns
- Searches for frequent problems and patterns to decide which portion of the code requires attention.


Example 2
Users of a healthcare application are facing problems.
It's taking too much time to view patient data and at
times they see errors on their browser. They complain
to customer support.

- The customer support engineer creates a ticket. The application support engineer now needs to find the root cause of the problem.
- The application is distributed over multiple nodes and involves various middleware and legacy technologies. It generates different types and large volumes of metric and log data.
- The support engineer searches the application logs to locate a period when transactions were slow for a specific set of users.
- Using problem context, he searches expert advice for solutions


Product Capabilities


Key Product Capabilities to Meet Client Needs


- Multiple options to upload log data
- Search a massive amount of log files
- Link structured data with unstructured data
- Expert guidance
- Analytics for trends and anomaly detection
- Ability to build insight packs for domain/application


Multiple Options to Upload Log Data


[Figure: log upload paths into the Log Analytics Server]
- Sources: Application/system, Application Components
- Push logs (Log File Agent, REST interface)
- Pull logs using remote monitoring (agentless option)
- Users: App Developer/IT Ops Engineer, Business Users


Searching Information

Log file

[10/9/12 5:51:38:295 GMT+05:30] 0000006a servlet E
com.ibm.ws.webcontainer.servlet.ServletWrapper service SRVE0068E:
Uncaught exception created in one of the service methods of the servlet
TradeAppServlet in application DayTrader2-EE5. Exception created :
javax.servlet.ServletException: TradeServletAction.doSell(...) exception
selling holding 3111 for user =uid:43 at
org.apache.geronimo.samples.daytrader.web.TradeServletAction.doSell(TradeServletAction.java:708)

Log Analytics Server
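Before a record like this can be searched by field, it has to be split into its parts. The sketch below pulls the timestamp, thread ID, component, event type and message out of the sample record with a regular expression. The field names and the pattern are assumptions inferred from this one sample; this is not the parser or annotator the product ships.

```python
import re

# Pattern inferred from the sample record above; field names are illustrative.
RECORD = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<thread_id>[0-9a-fA-F]{8})\s+"
    r"(?P<component>\S+)\s+"
    r"(?P<event_type>[A-Z])\s+"
    r"(?P<message>.*)$",
    re.DOTALL,
)

LINE = (
    "[10/9/12 5:51:38:295 GMT+05:30] 0000006a servlet E "
    "com.ibm.ws.webcontainer.servlet.ServletWrapper service SRVE0068E: "
    "Uncaught exception created in one of the service methods of the servlet "
    "TradeAppServlet in application DayTrader2-EE5."
)


def parse_record(line: str):
    """Return the record's fields as a dict, or None if the line does not match."""
    match = RECORD.match(line)
    return match.groupdict() if match else None


fields = parse_record(LINE)
print(fields["timestamp"], fields["thread_id"], fields["component"], fields["event_type"])
```

Once the fields are separated, a search such as "all E events from the servlet component in a given time window" becomes a filter over structured values instead of a free-text scan.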


Text Analytics on Logs


Developer: Which are the top 10 Java classes that have the most errors?

Need to extract what are errors and class names from WebSphere logs:

<ThreadID> <EventType> <ClassName>

[7/25/12 8:27:09:391 CEST] 00000028 E
com.ibm.bpe.util.Assert.assertion

[Figure callouts: "Follows immediately"; "Within a single log record"]

Information extraction may involve complex context-sensitive grammar that needs
processing beyond simple regular expression parsing.

Leverages market leading Text Analytics solution
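For records that follow the simple `<ThreadID> <EventType> <ClassName>` layout, the developer's question can be answered with a counter. The sketch below is deliberately a plain regular-expression approach, which the slide notes is not sufficient for logs that need context-sensitive extraction; it stands in for, and does not reproduce, the product's text analytics. The pattern, the optional component field and the function name are assumptions.

```python
import re
from collections import Counter

# Thread ID, optional short component name, event type, then the class name.
ERROR_RECORD = re.compile(
    r"^\[[^\]]+\]\s+[0-9a-fA-F]{8}\s+(?:\S+\s+)?(?P<event>[A-Z])\s+(?P<cls>[\w.$]+)"
)


def top_error_classes(lines, n=10):
    """Count records with event type E per class name; return the n most frequent."""
    counts = Counter()
    for line in lines:
        match = ERROR_RECORD.match(line)
        if match and match.group("event") == "E":
            counts[match.group("cls")] += 1
    return counts.most_common(n)


sample = [
    "[7/25/12 8:27:09:391 CEST] 00000028 E com.ibm.bpe.util.Assert.assertion",
    "[10/9/12 5:51:38:295 GMT+05:30] 0000006a servlet E "
    "com.ibm.ws.webcontainer.servlet.ServletWrapper service SRVE0068E: Uncaught exception",
]
for cls, count in top_error_classes(sample):
    print(count, cls)
```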


Sample App Dashboard


Example: Metadata Extraction from Tech Note


Ability to Build Insight Packs


Provides ability to build / deliver client, industry or scenario specific use cases. Insight Pack contains
information such as:

How to interpret log data?
[07/25/12 02:38:25:295 GMT+05:30] 00000010 TraceResponse E DSRA1120E: Application did
not explicitly close all handles to this Connection. Connection cannot be pooled.

How to link log data with metric data and application data?
[07/25/12 02:38:25:295 GMT+05:30] 00000010
TraceResponse E DSRA1120E: Application did not
explicitly close all handles to this Connection.
Connection cannot be pooled.

What to search in log files?

What are the sources of expert advice?


