
INITIAL Iub TROUBLESHOOTING OVERVIEW

Agenda

Objective
CPP O&M Concepts
Protocols
O&M Client Services
Counters Overview
Performance Management
Iub over ATM
Initial Counters
Iub Analysis
Fail After Admission
IP Iub Throughput
Questions?

OBJECTIVE
The main idea is to introduce the transport engineer to the basic concepts
of troubleshooting on the Iub interface, by presenting initial counters and
KPIs that can help identify which area needs further investigation.
Based on these conclusions, network optimization services can be
performed.

CPP O&M CONCEPTS


Moshell is a suite of tools for O&M of CPP-based nodes.
CPP is the Connectivity Packet Platform, on which the following nodes
are based: RNC, RBS, MGW, RXI.
Information collected by CPP counters every 15 minutes is stored in XML
files (ROP files).
This information is read and stored in an SQL database on a daily basis.

PROTOCOLS
Protocols used for accessing these services:
http
unsecure protocols (unencrypted): telnet, ftp, iiop
secure protocols (encrypted): ssh, sftp, ssliop
Figure 1 - Protocols
[Diagram of O&M access to the NODE from MoShell over TCP/IP (Ethernet or
IP over ATM). Labels: RS232 (HyperTerminal) and TELNET (23) / SSH (22) to
the OSE shell (COLI); HTTP (80); FTP (21) / SFTP (22) to the file system;
IIOP (56834) / SSL IOP (56836) to the MIB, covering CM (Configuration Mgmt),
FM (Fault Mgmt) and PM (Performance Mgmt, scanners).]

The O&M client services


Configuration Service (CS): Read and change configuration data;
configuration data is stored in the MO attributes
Alarm Service (AS): Retrieve the list of alarms currently active on
each MO
Notification Service (NS): Subscribe and receive notifications from
the node, informing about parameter/alarm changes in the MOs
Inventory Service (IS): Get a list of all HW and SW defined in the
node
Log Service (LS): Save a log of certain events such as changes in
the configuration, alarms raising and ceasing, node and board restarts
Performance Measurement (PM): Set up scanners whose counters are stored
in MO pm-attributes and output to an XML file every 15 minutes.

COUNTERS OVERVIEW

Counter Types:

Peg: a counter that is increased by 1 at each occurrence of a specific
activity.
Gauge: a counter that can be increased or decreased depending on the
activity in the system.
Accumulator: a counter that is increased by the value of a sample. It
indicates the total sum of all sample values taken during a certain time.
The name of an accumulator counter begins either with pmSum or
pmSumOfSamp.
Scan: a counter that is increased by 1 each time the corresponding
accumulator counter is increased. It indicates how many samples have
been read.
Probability Density Function (PDF): a list of range counters. If a sampled
value falls within a certain range, the counter for that range is increased.
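
The relationship between an accumulator counter and its scan counter can be illustrated with a minimal Python sketch. The function name and the sample values are hypothetical; the only thing taken from the text above is that the accumulator holds the sum of the samples and the scan counter holds how many samples were read, so their ratio is the average sample value.

```python
from typing import Optional


def average_from_accumulator(pm_sum: int, pm_samples: int) -> Optional[float]:
    """Average sample value: accumulator total divided by scan count."""
    if pm_samples == 0:
        # No samples were read in the period, so no average can be formed.
        return None
    return pm_sum / pm_samples


# Hypothetical values for one ROP period.
print(average_from_accumulator(pm_sum=1800, pm_samples=180))
```

Returning None for an empty period is a design choice of this sketch, so that a missing measurement is not mistaken for an average of zero.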

Counter Reset Behavior


Counter values are either reset at the end of the ROP period or
accumulated up to the counter limit.
For a counter that is not reset after the ROP period, the increment
during a ROP period is the difference between the values of two
consecutive ROPs.
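
As a sketch of the rule above for counters that are not reset, the per-ROP increment can be computed as follows. The function name is hypothetical, and treating a negative difference as "unknown" (for example after a node restart or after the counter limit was reached) is an assumption of this example, not something stated in the slides.

```python
from typing import Optional


def rop_increment(previous: int, current: int) -> Optional[int]:
    """Increment of a non-reset counter between two consecutive ROPs."""
    delta = current - previous
    if delta < 0:
        # Assumption: a decreasing value means the counter restarted or
        # wrapped, so the increment cannot be derived from these two values.
        return None
    return delta


# Hypothetical readings from two consecutive ROP files.
print(rop_increment(previous=1000, current=1250))
```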
Counter Classification
Counters can be grouped by NE Type:
RNC
RXI
RBS
Or by area of interest:
Radio Network RNC specific counters
Radio Network RBS specific counters
Transport Network counters

Iub over ATM

Figure 3 - Iub configuration example

AAL2 CAC and resource usage:

AAL2 connection admission control (CAC) is executed before a new
AAL2 connection is set up in the system.

AAL2 connections in UTRAN are always initiated by the RNC.

The RNC reserves a CID and the relevant bandwidth, and forwards the
establish request message through the AP. The message contains the
allocated CID, the traffic descriptors and the QoS.

CID
Because of standardization constraints, no more than 248 AAL2
connections can be simultaneously established on a single AAL2
path. More than 248 connections can be established between two
adjacent nodes only if more than one AAL2 path is configured.
When an AAL2 connection is allocated on an AAL2 path, a Channel
Identifier (CID) is reserved and assigned by the node that is
originating or forwarding the AAL2 connection request.
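
The 248-connection limit translates directly into a minimum number of AAL2 paths. The following Python sketch is only an illustration of that arithmetic; the function name is hypothetical and it ignores bandwidth, which CAC also checks.

```python
import math

# Limit stated above: at most 248 simultaneous AAL2 connections per path.
MAX_CONNECTIONS_PER_PATH = 248


def aal2_paths_needed(simultaneous_connections: int) -> int:
    """Minimum number of AAL2 paths for a given number of connections."""
    if simultaneous_connections < 0:
        raise ValueError("connection count cannot be negative")
    return math.ceil(simultaneous_connections / MAX_CONNECTIONS_PER_PATH)


for wanted in (100, 248, 249, 500):
    print(wanted, "connections ->", aal2_paths_needed(wanted), "path(s)")
```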

Figure 4 AAL2 Connections table

In particular:
The AAL2 path capacity assumed by CAC is equal to:
the configured PCR, for CBR AAL2 paths
the configured MCR, for UBR+ AAL2 paths
zero, for UBR AAL2 paths
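
The capacity rule above can be written as a small lookup. This is a sketch only: the function name and the use of plain strings for the ATM service category are assumptions made for illustration, and the rate unit is whatever unit the PCR and MCR are configured in.

```python
def cac_path_capacity(service_category: str, pcr: int = 0, mcr: int = 0) -> int:
    """Capacity that CAC assumes for an AAL2 path, per the list above."""
    category = service_category.upper()
    if category == "CBR":
        return pcr   # configured PCR for CBR paths
    if category == "UBR+":
        return mcr   # configured MCR for UBR+ paths
    if category == "UBR":
        return 0     # no capacity is assumed for plain UBR paths
    raise ValueError(f"unknown service category: {service_category}")


# Hypothetical configured rates.
print(cac_path_capacity("CBR", pcr=4000))
print(cac_path_capacity("UBR+", pcr=4000, mcr=1000))
print(cac_path_capacity("UBR", pcr=4000))
```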
Flow Control:
The Flow Control function dynamically adapts the transmission rate of
Best Effort services to the available Iub bandwidth, by reducing the
transmission rate during Iub congestion situations.

Initial counter check
The following counters are recommended for an initial investigation, as
they give clues on whether the source of the problem lies in the transport
network.
Checking whether the number of unsuccessful local or remote AAL2
connections is increasing indicates where potential problems exist:
at the NodeB, RXI or RNC. The OutConns counters, viewed at the AAL2
Access Points in the RNC looking towards the RXI/NodeB, and at the AAL2
Access Points in the RXI looking towards the NodeBs, are the best
counters to observe.

MO class: Aal2Ap
Counters:
pmUnSuccOutConnsLocalQosClassA/B/C/D
pmUnSuccInConnsLocalQosClassA/B/C/D
pmUnSuccOutConnsRemoteQosClassA/B/C/D
pmUnSuccInConnsRemoteQosClassA/B/C/D

To check bandwidth utilization
MO classes: VclTp, VplTp, Atmport
Counters: pmBwUtilizationRx, pmBwUtilizationTx

To check ATM link utilization
MO classes: VclTp, VplTp, Atmport
Counters: pmTransmittedAtmCells, pmReceivedAtmCells

To show the number of RRC/RAB establishment failures after admission
MO class: Utrancell
Counter: pmNoFailedAfterAdm

To check for congestion in the control plane

Iub interface
UniSaalTp: pmNoOfLocalCongestions
NbapCommon: pmNoOfDiscardedNbapMessages
Iublink: pmTotalTimeIublinkCongestedDl

Iu/Iur interface
NniSaalTp: pmNoOfLocalCongestions

To check for interface availability

Iub interface
UniSaalTp: pmLinkInServiceTime

Iu/Iur interface
NniSaalTp: pmLinkInServiceTime

The following counter shows whether Iub bandwidth is limiting HS services,
measured in %.
Note: if > 75%, the cause could be Iub capacity or radio limitations.
MO class: IubDataStreams
Counter: pmCapAllocIubHsLimitingRatioSpi<xx>

To see HS frame loss
MO class: IubDataStreams
Counters: pmHsDataFramesLostSpi<XX>, pmHsDataFramesReceivedSpi<XX>

To check ATM link quality
MO classes: Aal2PathVccTp, Aal5TpVccTp, VpcTp
Counters: pmBwLostCells, pmFwLostCells

To check the physical layer quality of the transmission link

ImaLink: pmSesIma, pmSesImaFe, pmUasIma, pmUasImaFe
ImaGroup: pmGrUasIma
E1PhyspathTerm, E1Ttp, E3PhysPathterm: pmEs, pmSes, pmUas
Os155SpiTtp: pmMsEs, pmMsSes, pmMsUas, pmMsBbe
Vc12Ttp, Vc4Ttp: pmVcEs, pmVcSes, pmVcUas

Iub analysis

The following flowchart summarises an Iub link analysis procedure based
on AAL2 Setup failure rate examination.

[Flowchart: AAL2 Setup Failure analysis, with one branch for Strict
Admission Traffic and one for Best Effort Traffic. Node labels: No AAL2
Setup Failure (OK); Local AAL2 Setup Failure; Remote AAL2 Setup Failure;
Lack of CID; Lack of Bw; Bad TN quality; Create More Class A VCs; Create
More Class B&C VCs; Check Flow Control Counters; Check Physical Layer
Quality.]

AAL2 Setup Failure Rate

The following KPIs and AAL2Ap counters are suggested to monitor the AAL2
Setup Failure rate on an Iub link.
Counters

Aal2Ap::pmUnSuccOutConnsLocalQoSClass<x> (A/B/C/D)
Number of unsuccessful attempts to allocate AAL2 resources during
establishment of outgoing connections on this Access Point (AP). Caused by
rejects in Connection Admission Control (CAC).

Aal2Ap::pmUnSuccOutConnsRemoteQoSClass<x> (A/B/C/D)
Number of unsuccessful establishments of outgoing connections on this AAL2
Access Point (AP).

Aal2Ap::pmSuccOutConnsRemoteQosClass<x> (A/B/C/D)
Number of successful establishments of outgoing connections on this AAL2
Access Point (AP).

KPIs

$$
\mathrm{AAL2\_Fail\_Rate\_Local\_ClassA}\,[\%] =
\frac{\mathrm{pmUnSuccOutConnsLocalQoSClassA}}
{\mathrm{pmSuccOutConnsRemoteQoSClassA} + \mathrm{pmUnSuccOutConnsLocalQoSClassA} + \mathrm{pmUnSuccOutConnsRemoteQoSClassA}}
\times 100\%
$$

$$
\mathrm{AAL2\_Fail\_Rate\_Remote\_ClassA}\,[\%] =
\frac{\mathrm{pmUnSuccOutConnsRemoteQoSClassA}}
{\mathrm{pmSuccOutConnsRemoteQoSClassA} + \mathrm{pmUnSuccOutConnsRemoteQoSClassA}}
\times 100\%
$$

Similar formulae can be used for Class B & Class C.


The AAL2_Fail_Rate_Local_ClassA KPI signals possible problems in the Iub section
between the RNC and the next connected node (NodeB or RXI).
The AAL2_Fail_Rate_Remote_ClassA KPI signals possible problems in the Iub section
beyond the next connected node, for example between an intermediate RXI and the NodeB.
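
The two Class A KPIs can be computed from the three Aal2Ap counters as in the Python sketch below. It assumes the denominators are sums of the listed counters (the operators are not legible in the extracted slide), that the inputs are per-period increments rather than raw running totals, and the function and parameter names are hypothetical.

```python
from typing import Optional, Tuple


def aal2_fail_rates(
    succ_remote: int, unsucc_local: int, unsucc_remote: int
) -> Tuple[Optional[float], Optional[float]]:
    """Return (local, remote) AAL2 setup failure rates in percent."""
    local_attempts = succ_remote + unsucc_local + unsucc_remote
    remote_attempts = succ_remote + unsucc_remote
    # None means there were no attempts, so no rate can be formed.
    local = 100.0 * unsucc_local / local_attempts if local_attempts else None
    remote = 100.0 * unsucc_remote / remote_attempts if remote_attempts else None
    return local, remote


# Hypothetical Class A increments for one measurement period.
local_rate, remote_rate = aal2_fail_rates(
    succ_remote=950, unsucc_local=30, unsucc_remote=20
)
print(f"local: {local_rate:.2f}%  remote: {remote_rate:.2f}%")
```

The same function applies to Class B and Class C by passing the corresponding counters.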

CID Utilization Estimate

This is a crude method of calculating the number of CIDs, as it does not
distinguish between traffic types.

There is a second method using Erlang counters, which will not be
demonstrated in this presentation.
Counters

Aal2Ap::pmExisTransConns
Number of existing connections for the Access Point (AP) transiting this node.
Gauge counter.

Aal2Ap::pmExisOrigConns
Number of existing connections for the Access Point (AP) originating in this node.
Gauge counter.

Aal2Ap::pmExisTermConns
Number of existing connections for the Access Point (AP) terminating in this node.
Gauge counter.

KPI

$$
\mathrm{Average\_No\_Connections} =
\frac{\mathrm{pmExisOrigConns} + \mathrm{pmExisTermConns} + \mathrm{pmExisTransConns}}{n}
$$

where n is the number of paths per AAL2 Access Point.


Note: if the RXI is a pure AAL2 switching node, then the pmExisOrigConns and
pmExisTermConns counters can be disregarded, as there can be no originated
or terminated connections in the node, only transiting connections.

This method of CID calculation gives a basic estimate of CID utilization.

In a typical Iub link with one VC (normally vc39) defined for Strict Admission
traffic and one VC (normally vc50) defined for Best Effort traffic, n = 2, so
the division by 2 in the formula averages the total number of used CIDs over
both traffic types. For example, if the counters return a total of 360, it is
not known whether this is 180 CIDs in each of ClassA and ClassB&C, or maybe
240 in ClassA and 120 in ClassB&C. If it is the latter, then VC expansion is
needed, as the maximum number of CIDs allowed per path (248) is being reached.
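
The estimate and its blind spot can be reproduced with a short Python sketch. The function name is hypothetical, and the numerator is assumed to be the sum of the three gauge counters; the value 360 and the two-path case come from the example above.

```python
MAX_CIDS_PER_PATH = 248  # limit per AAL2 path stated earlier in the deck


def average_connections_per_path(
    orig_conns: int, term_conns: int, trans_conns: int, n_paths: int
) -> float:
    """Average number of existing connections per AAL2 path."""
    if n_paths <= 0:
        raise ValueError("n_paths must be positive")
    return (orig_conns + term_conns + trans_conns) / n_paths


# 360 connections in total over two paths (one Class A, one Class B&C).
avg = average_connections_per_path(360, 0, 0, n_paths=2)
print(f"average per path: {avg}")
print(f"estimated CID utilization: {100 * avg / MAX_CIDS_PER_PATH:.1f}%")

# The average hides the split: 240 on one path and 120 on the other gives
# the same result, although one path is then close to the 248-CID limit.
```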

BW Utilization Estimate

Bandwidth utilization can be measured per VP and also per VC using
counters.
To monitor Best Effort VC utilization, it is better to use the Flow Control
methodology.
Counters

VplTp::pmTransmittedAtmCells = Number of transmitted ATM cells. This
counter is incremented for each transmitted ATM cell. Peg counter.
VplTp::pmReceivedAtmCells = Number of received ATM cells. This counter
is incremented for each received ATM cell. Peg counter.

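KPIs

$$
\mathrm{AAL2\_VP\_Utilisation\_Tx} =
\frac{\mathrm{VplTp{::}pmTransmittedAtmCells}}
{\mathrm{Meas\_Length}(s) \times \mathrm{egressAtmPCR}} \times 100\%
$$

$$
\mathrm{AAL2\_VP\_Utilisation\_Rx} =
\frac{\mathrm{VplTp{::}pmReceivedAtmCells}}
{\mathrm{Meas\_Length}(s) \times \mathrm{ingressAtmPCR}} \times 100\%
$$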
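
A minimal Python sketch of the VP utilization KPI follows. It assumes the PCR is expressed in cells per second and that the cell count is the increment over the measurement period (900 s for one 15-minute ROP); the function name and the sample values are hypothetical.

```python
def vp_utilisation_percent(
    atm_cells: int, meas_length_s: float, atm_pcr_cells_per_s: float
) -> float:
    """VP utilization in percent for one direction (Tx or Rx)."""
    if meas_length_s <= 0 or atm_pcr_cells_per_s <= 0:
        raise ValueError("measurement length and PCR must be positive")
    # Cells actually carried, divided by the cells the PCR would allow.
    return 100.0 * atm_cells / (meas_length_s * atm_pcr_cells_per_s)


# Hypothetical VP: PCR of 10000 cells/s, 4.5 million cells sent in one ROP.
print(vp_utilisation_percent(4_500_000, meas_length_s=900,
                             atm_pcr_cells_per_s=10_000))
```

Use egressAtmPCR with the transmitted-cell counter and ingressAtmPCR with the received-cell counter.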

TN quality

Physical Layer Quality

Several counters are available to monitor the availability and the quality of
physical and IMA terminations in CPP nodes.

Errored Seconds (ES): seconds with block errors during the PM interval.
These counters are incremented for each second where one or more blocks
with one or more errors are received.

Severely Errored Seconds (SES): seconds during available time having a
severe bit error rate.

Unavailable Seconds (UAS): the accumulated unavailable time in seconds during
the interval. Unavailable time starts when 10 consecutive SES are detected,
and ends when 10 consecutive non-SES are detected. These counters are
incremented for each second of unavailable time.
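
The start and end rule for unavailable time can be sketched as a small state machine over a per-second SES flag list. This is an illustration only, not the node's implementation: it assumes the usual ITU-T G.826 convention that the 10 triggering SES seconds count as unavailable and the 10 clearing seconds count as available, which the slides do not state, and all names are hypothetical.

```python
from typing import List


def unavailable_seconds(ses_flags: List[bool], threshold: int = 10) -> int:
    """Count unavailable seconds from a per-second SES sequence."""
    unavailable = [False] * len(ses_flags)
    in_unavailable_time = False
    run = 0  # consecutive SES (when available) or non-SES (when unavailable)
    for i, is_ses in enumerate(ses_flags):
        if not in_unavailable_time:
            run = run + 1 if is_ses else 0
            if run == threshold:
                # Assumption: the triggering seconds belong to unavailable time.
                for j in range(i - threshold + 1, i + 1):
                    unavailable[j] = True
                in_unavailable_time = True
                run = 0
        else:
            unavailable[i] = True
            run = run + 1 if not is_ses else 0
            if run == threshold:
                # Assumption: the clearing seconds belong to available time.
                for j in range(i - threshold + 1, i + 1):
                    unavailable[j] = False
                in_unavailable_time = False
                run = 0
    return sum(unavailable)


# 12 severely errored seconds followed by 15 clean seconds.
print(unavailable_seconds([True] * 12 + [False] * 15))
```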

Flow Control

HSDPA Congestion KPIs:

$$
\mathrm{HSFrameLossRatio} =
\frac{\mathrm{pmHsDataFramesLostSpi}\langle xx\rangle}
{\mathrm{pmHsDataFramesReceivedSpi}\langle xx\rangle + \mathrm{pmHsDataFramesLostSpi}\langle xx\rangle}
\times 100\%
$$

High frame loss indicates potential congestion problems.
<xx> = the supported SPI (Scheduling Priority Indicator)

HSFrameDelayDistribution = pmHsDataFrameDelayIubSpi<xx>

This counter indicates the percentage of times where Iub congestion has occurred per SPI
(Scheduling Priority Indicator).
Experience has shown that in highly loaded Iub cases, this counter could reach values of
about 65-75%.
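
The HS frame loss ratio can be evaluated per SPI as in the following Python sketch. It assumes the denominator is the sum of received and lost frames; the function name, the dictionary layout and the sample values are hypothetical.

```python
from typing import Dict, Optional, Tuple


def frame_loss_ratio_percent(lost: int, received: int) -> Optional[float]:
    """Lost frames as a percentage of all frames (received plus lost)."""
    total = received + lost
    if total == 0:
        return None  # no traffic on this SPI in the period
    return 100.0 * lost / total


# Hypothetical (lost, received) counter increments keyed by SPI.
per_spi: Dict[int, Tuple[int, int]] = {0: (15, 985), 5: (0, 0), 15: (200, 1800)}
for spi, (lost, received) in per_spi.items():
    print(f"SPI {spi:02d}: {frame_loss_ratio_percent(lost, received)}")
```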

Low HS Throughput Site Analysis Study Case
Counters were extracted and graphs plotted to illustrate the HS Frame Loss
Ratio and HSLimitIub KPIs over time.

Examining the resulting KPI graphs, it was evident that the channel
normally reserved for ClassA traffic (vc39) was experiencing abnormally high
bandwidth utilization.
The ClassB&C traffic channels (vc50 & vc51) were experiencing abnormally low
utilization (next slide).


Enhanced Uplink Congestion KPIs

$$
\mathrm{Eul\_Frame\_Loss\_Ratio} =
\frac{\mathrm{pmEdchDataFramesLost}}
{\mathrm{pmEdchDataFramesReceived} + \mathrm{pmEdchDataFramesLost}} \times 100\%
$$

High frame loss indicates potential congestion problems.

Eul_Frame_Delay_Distribution = pmEdchDataFrameDelayIub

This counter is difficult to post-process, so it is only recommended for
troubleshooting rather than performance monitoring.

Failure After Admission

What is Failure After Admission?

Failure After Admission refers to an RRC/RAB setup failure that occurs after
the user has been admitted to the network.

Admission to the network occurs when the user successfully completes an
initial RRC Connection Setup request.

An RRC failure after the initial admission could occur if the user wanted to
upswitch to a higher rate while on an existing call and the upswitch could
not be achieved due to lack of resources (Radio or Transport). This
would be perceived by the user as a slow connection.

On the other hand, a RAB setup failure would be perceived by the user as a
failure to set up a call.

In general, high Failure After Admission occurrences are mainly due to:

Transport Network: lack of BW/CIDs, or
Radio Network: lack of Channel Element availability.

Failure After Admission Study Case

The following procedure is performed for this study case:

Identification of a problem site, by extraction of the
pmNoFailedAfterAdm counter.
AAL2 Setup Failure Rate: counter retrieval and KPI calculation.
Graphical analysis to establish the correlation between both.
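
The study case relies on graphical analysis. As a complementary numerical check, which is not part of the original procedure, the Pearson correlation coefficient between the two time series can be computed. The sketch below uses only the standard library, and the sample series are hypothetical values, not measurements from the study case.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical hourly values for one site.
failed_after_adm = [2, 5, 9, 14, 30, 41]              # pmNoFailedAfterAdm
aal2_fail_rate_pct = [0.1, 0.3, 0.8, 1.2, 3.5, 4.9]   # AAL2 setup failure rate
print(f"correlation: {pearson(failed_after_adm, aal2_fail_rate_pct):.3f}")
```

A coefficient close to 1 would support the hypothesis that the failures after admission follow the AAL2 setup failures; it does not replace inspecting the graphs.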

Graphical Analysis

IP Iub Throughput

The client should define a user throughput threshold, in order to identify
the bandwidth target to be delivered (on average) per user.
After that, this threshold should be compared with the actual average
customer throughput, as defined below:

THROUGHPUT PER USER:

$$
\mathrm{AvUserThrHs}\,[\mathrm{kbit/s}] =
\frac{\sum_{\mathrm{Cells}} \mathrm{pmDlTrafficVolumePsIntHs}}{\mathrm{Meas\_Length}(s)}
\times \frac{1}{\sum_{\mathrm{Cells}} \mathrm{AvNrHsUsersPerCell}}
$$

This formula calculates the average bit rate per user on the Iub interface.

$$
\mathrm{AvNrHsUsersPerCell} =
\frac{\mathrm{pmSumBestPsHsAdchRabEstablish}}{\mathrm{pmSamplesBestPsHsAdchRabEstablish}}
+ \frac{\mathrm{pmSumBestPsEulRabEstablish}}{\mathrm{pmSamplesBestPsEulRabEstablish}}
+ \frac{\mathrm{pmSumBestPsStreamHsRabEst}}{\mathrm{pmSamplesBestPsStreamHsRabEst}}
$$
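
The throughput-per-user KPI can be sketched in Python as follows. The sketch assumes one reading of the formula: the HS traffic volume summed over the cells, divided by the measurement length and by the summed average number of HS users, where the user count per cell is the sum of the three sum/samples ratios. It also assumes the volume counter is expressed in kbit. The function names, the dictionary layout and the sample values are hypothetical.

```python
from typing import Dict, List, Optional

USER_COUNTER_PAIRS = [
    ("pmSumBestPsHsAdchRabEstablish", "pmSamplesBestPsHsAdchRabEstablish"),
    ("pmSumBestPsEulRabEstablish", "pmSamplesBestPsEulRabEstablish"),
    ("pmSumBestPsStreamHsRabEst", "pmSamplesBestPsStreamHsRabEst"),
]


def av_nr_hs_users_per_cell(cell: Dict[str, int]) -> float:
    """Average number of HS users in one cell (sum of sum/samples ratios)."""
    total = 0.0
    for sum_name, samples_name in USER_COUNTER_PAIRS:
        if cell[samples_name] > 0:
            total += cell[sum_name] / cell[samples_name]
    return total


def av_user_thr_hs(cells: List[Dict[str, int]], meas_length_s: float) -> Optional[float]:
    """Average HS throughput per user in kbit/s (volume assumed in kbit)."""
    volume = sum(cell["pmDlTrafficVolumePsIntHs"] for cell in cells)
    users = sum(av_nr_hs_users_per_cell(cell) for cell in cells)
    if meas_length_s <= 0 or users == 0:
        return None
    return volume / meas_length_s / users


# Hypothetical counters for a single cell over one 900 s ROP.
cell = {
    "pmDlTrafficVolumePsIntHs": 900_000,
    "pmSumBestPsHsAdchRabEstablish": 360, "pmSamplesBestPsHsAdchRabEstablish": 180,
    "pmSumBestPsEulRabEstablish": 360, "pmSamplesBestPsEulRabEstablish": 180,
    "pmSumBestPsStreamHsRabEst": 0, "pmSamplesBestPsStreamHsRabEst": 180,
}
print(av_user_thr_hs([cell], meas_length_s=900))
```

The result is the value to compare against the user throughput threshold defined by the client.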

If the throughput per user is below the defined threshold, it should be
identified whether it has been limited by Flow Control. This can be done
using the Iub congestion counter:

HSLimitIub = pmCapAllocIubHsLimitingRatioSpi<xx>

Another indication that the transport network is overloaded is given by the
frame loss counter, which should present values below 2%.

$$
\mathrm{HSFrameLossRatio} =
\frac{\mathrm{pmHsDataFramesLostSpi}\langle xx\rangle}
{\mathrm{pmHsDataFramesReceivedSpi}\langle xx\rangle + \mathrm{pmHsDataFramesLostSpi}\langle xx\rangle}
\times 100\%
$$

If frame loss counter returns low values, and Iub presents no limitation

RNC Iub throughput monitoring KPIs:

Average Iub throughput:

$$
\mathrm{IUB\_THR}\,[\mathrm{kbit/s}] = \frac{\mathrm{pmSumCapacity}}{\mathrm{pmSamplesCapacity}}
$$

Average Iub throughput regulated within ROP:

$$
\mathrm{IUB\_THR\_REG}\,[\mathrm{kbit/s}] = \frac{\mathrm{pmSumCapacityRegulation}}{\mathrm{pmSamplesCapacityRegulation}}
$$

Periods of Iub throughput limitation:

$$
\mathrm{IUB\_THR\_REG\_DURATION}\,[\mathrm{sec}] = \mathrm{pmTotalTimeCapacityRegulated}
$$

Observing these KPIs makes it possible to tell when low performance is due
to an internal RNC limitation rather than to the transport network.

IP IUB EVALUATION FLOWCHART
