
Syslog Connector Performance Tuning
Girish Mantry, Moehadi Liang
Technical Solutions Consultants
Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Agenda
In this session we will take a look at

Syslog connector variants


Connector components and operation
Stages in the event flow
Performance bottlenecks and tuning at each stage
Out of memory problems and tuning
Customer cases
General recommendations

Syslog connector variants, components, operation and event flow

Syslog Connector Variants

Network Listeners

Syslog Daemon
  UDP
  Raw TCP
  Default port 514

Syslog NG Daemon
  UDP
  Raw TCP
  TLS
  Default port 1999

ArcSight CEF Encrypted Syslog (UDP)
  UDP
  Symmetric Key Encryption
  Default port 514
  Only CEF format

Supported on all platforms
Configurable interfaces and ports

File Readers

Syslog Pipe
  Unix Pipe

Syslog File
  Regular File

Supported only on Unix platforms
Work in conjunction with the native syslog daemon

Syslog Connector Components

[Diagram: connector components. Devices (Type 1 to Type N) -> Queue (raw events) -> Main Flow (subagents; parsed events, then processed events) -> one Destination Flow per destination (components C1, C2 and a cache) -> ESM Transport -> ESM, Logger Transport -> Logger]

Note: Queuing only applies to network listeners and not to file readers

Event Flow

Event Reception
  Receives network packets on UDP/TCP sockets
  Extracts human readable syslog raw events from network packets

Event Queuing
  Raw events are written to a queue of files on the file system in the order in which they are received

Event Parsing
  Raw events are picked up from the file queue in a FIFO manner and parsed using regular expressions
  Information from device log formats is normalized into ArcSight event format

Event Processing
  Normalized events are categorized and processed in many ways useful for correlation and asset modeling
  Events are batched, filtered or aggregated as required for efficiency

Event Transport
  Enriched ArcSight events are sent to the ESM/Logger destination
  Events are cached when destinations are down and resent when they are back up

Performance Bottlenecks in the Event Flow and Tuning

Event Reception
Choice of Transport Protocol

UDP performs better on reliable networks

Use Raw TCP on unreliable networks

Use TLS for encrypted transport with Syslog NG

Bottleneck (when dealing with Raw TCP or TLS)

A Java application is not notified when a client closes the connection with a FIN; it finds out only when it next reads from the socket

Such connections remain idle in a CLOSE_WAIT state until the application closes them explicitly

Idle connections can accumulate over time and can exceed the connector limit or the OS limit

This happens faster with a large number of devices or with devices that create new connections frequently

Tuning

Parameter: tcppeerclosedchecktimeout
  Default: -1
  Recommendation: Set it to 30000 msec or higher to tell the connector to proactively check for connections closed by the peer and close them on the connector side as well

Parameter: tcpmaxsockets
  Default: 1000
  Recommendation: Increase it as required to accommodate simultaneous connections from a large number of devices
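
The idea behind a peer-closed check can be illustrated outside the connector. The sketch below is a minimal Python illustration and not the connector's own implementation: `peer_has_closed` and `close_dead_connections` are hypothetical helpers. A socket that is readable but returns zero bytes has received the peer's FIN, so a periodic sweep can close it instead of leaving it idle in CLOSE_WAIT.

```python
import select
import socket


def peer_has_closed(sock: socket.socket) -> bool:
    """Return True if the remote side has closed the connection."""
    readable, _, _ = select.select([sock], [], [], 0)  # poll, never block
    if not readable:
        return False  # nothing pending: idle but still open
    # MSG_PEEK inspects pending data without consuming it.
    # An empty result on a readable socket means end-of-stream.
    return sock.recv(1, socket.MSG_PEEK) == b""


def close_dead_connections(connections: list) -> list:
    """Close sockets whose peer has gone away; return the ones still open."""
    alive = []
    for sock in connections:
        if peer_has_closed(sock):
            sock.close()  # frees the descriptor that would sit in CLOSE_WAIT
        else:
            alive.append(sock)
    return alive
```

Running such a sweep on a timer (every 30 seconds, for example) bounds how long a dead connection can count against a socket limit.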

Event Queuing
Raw events received over the network are written to a file queue consisting of a certain number of files of fixed size
Bottleneck

With high event volumes, the file queue can build up quickly, leading to significant delays

When the file queue becomes full, the connector starts dropping events

Tuning

Enable syslog parser multithreading (may need to follow up with memory increase if required)

Increase the file queue size

Parameter: syslog.parser.multithreading.enabled
  Default: false
  Recommendation: Set it to true to enable multithreading

Parameter: syslog.parser.threadcount
  Default: -1
  Recommendation: Set it to a specific number on a single processor machine. You can do the same on a multiprocessor machine or leave it at -1 for the connector to decide based on the number of processors

Parameter: syslog.parser.threadsperprocessor
  Recommendation: Takes effect only when the threadcount is set to -1. Leave it at 1 or increase it as required. Total number of threads = number of processors * threadsperprocessor

Parameter: filequeuemaxfilecount
  Default: 100
  Recommendation: Increase this parameter to increase the number of files in the file queue

Parameter: filequeuemaxfilesize
  Default: 100000
  Recommendation: Specified in bytes. Increase this parameter to increase the size of each file in the file queue
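
The sizing arithmetic behind these parameters can be sketched in a few lines of Python. The function names, the average event size and the event rates below are assumptions made for illustration; only the thread count rule and the file count and file size parameters come from the table above.

```python
def parser_thread_count(threadcount: int, processors: int,
                        threads_per_processor: int = 1) -> int:
    """Parser threads, following the rule stated in the table."""
    if threadcount != -1:
        return threadcount  # an explicit value is used as-is
    return processors * threads_per_processor


def file_queue_capacity_bytes(max_file_count: int, max_file_size: int) -> int:
    """Upper bound on the raw event data the file queue can hold."""
    return max_file_count * max_file_size


def seconds_until_full(capacity_bytes: int, avg_event_bytes: int,
                       incoming_eps: float, parsed_eps: float) -> float:
    """Time before the queue fills up and events start to drop.

    Returns infinity when parsing keeps up with the incoming rate.
    """
    backlog_eps = incoming_eps - parsed_eps
    if backlog_eps <= 0:
        return float("inf")
    return capacity_bytes / (backlog_eps * avg_event_bytes)


capacity = file_queue_capacity_bytes(100, 100000)  # defaults from the table
# Assumed load: 500-byte events, 5000 eps arriving, 3000 eps parsed.
print(capacity, seconds_until_full(capacity, 500, 5000, 3000))
```

With these assumed numbers the default queue absorbs only a few seconds of backlog, which is why a sustained gap between the incoming and parsing rates needs more parser threads and not just a bigger queue.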

Event Parsing
Inspection and Device Type Detection

The connector has multiple subagents, one per device type, each with a parser whose regex matches something unique in that device's log

Subagent parsers are ordered so that specific regexes come ahead of generic ones, which lets the connector detect device types accurately

The connector inspects messages from each sender by applying the regexes in that order and associates the subagent with the sender when a match is found. A single sender can be associated with multiple device types and subagents

The associated subagent parsers are then used to parse messages from that sender. The inspection process is not reapplied unless a message from a new device type arrives from the same sender

Syslog senders and their associated subagent types can be seen in current/user/agent/syslog.properties
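
The detection scheme described above can be sketched as an ordered first-match search with a per-sender association. This is an illustration only: the subagent names and regexes are hypothetical, and the real connector's data structures are not described in this deck.

```python
import re

# Hypothetical subagents, ordered from most specific to most generic.
SUBAGENTS = [
    ("firewall_kv", re.compile(r"\bdevname=\S+")),
    ("sshd", re.compile(r"\bsshd\[\d+\]:")),
    ("generic_daemon", re.compile(r"\b\w+\[\d+\]:")),
]

# sender address -> names of the subagents already associated with it
associations: dict = {}


def detect_subagent(sender: str, message: str):
    """Return the name of the subagent to use for this message, or None."""
    known = associations.get(sender, [])
    # Fast path: only the subagents already associated with the sender.
    for name, pattern in SUBAGENTS:
        if name in known and pattern.search(message):
            return name
    # Slow path: full inspection in list order; the first match wins.
    for name, pattern in SUBAGENTS:
        if pattern.search(message):
            associations.setdefault(sender, []).append(name)
            return name
    return None
```

The slow path is the expensive one: with more than 100 subagents, an unrecognized message costs one regex evaluation per subagent, which is the cost a shorter subagent list reduces.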

Bottleneck

Inspection process involving regex matching could be expensive because connector has more than 100 subagents

Tuning

If you are sure of the device types in your environment, you can restrict the subagent list using the following properties

Parameter: usecustomsubagentlist
  Default: false
  Recommendation: Set it to true to make the connector consider the customized subagent list

Parameter: customsubagentlist
  Default: List of subagents (>100)
  Recommendation: Set it to the restricted subagent list based on the device types in your environment. Preserve the original relative order of the subagents so as not to affect the accuracy of subagent detection
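
One way to build a restricted list without disturbing the order is to filter the default list instead of typing a new one. The sketch below assumes the property value is a single delimited string; the `|` separator and the subagent names are assumptions, so check the actual value in your connector's properties before using anything like this.

```python
def restrict_subagents(default_list: str, wanted: set, sep: str = "|") -> str:
    """Keep only the wanted subagents, in their original relative order."""
    names = [name.strip() for name in default_list.split(sep)]
    return sep.join(name for name in names if name in wanted)


# Hypothetical default list, specific subagents first.
default = "a_specific|b_specific|c_generic|d_generic"
print(restrict_subagents(default, {"c_generic", "a_specific"}))
```

Because the result is derived from the default list, a specific subagent can never end up behind a generic one by accident.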

Event Parsing - continued
Regular expressions in parsers

Bottleneck
A badly written regular expression in the parser can be a big performance hit on the connector

Optimization
For supported device types, development has already optimized the regular expressions in the respective parsers. If you are authoring your own syslog flex connector parsers, consider the following guidelines.
Make your regexes only as generic as needed. Specific regular expressions perform better than generic ones
Use generic greedy expressions like .* and .+ only at the end of a regular expression, not at the beginning or in the middle. Elsewhere, replace them with non-greedy equivalents like .*? and .+? followed by a clear character or token marking the boundary.
Greedy quantifiers on more specific characters or meta characters are okay, for example \s+ for a continuous string of whitespace characters, \d+ for a continuous string of digits, or \w+ for a continuous string of alphanumeric characters
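
The guidelines above can be seen side by side on a made-up log line. Python's `re` module stands in here for the connector's Java regex engine; both are backtracking engines, so the same reasoning applies. The log line and both patterns are hypothetical examples, not taken from any shipped parser.

```python
import re

LINE = "Dec 5 11:35:29 fw01 action=deny src=10.0.0.1 dst=10.0.0.2"

# Generic: every field is a greedy .* so the engine has to backtrack
# repeatedly to discover where each field ends.
GENERIC = re.compile(r"^(.*) (.*) action=(.*) src=(.*) dst=(.*)$")

# Specific: each field uses a narrow class that stops at its own boundary.
SPECIFIC = re.compile(
    r"^(\w{3} \d+ \d\d:\d\d:\d\d) (\S+) action=(\w+) src=(\S+) dst=(\S+)$"
)

generic_fields = GENERIC.match(LINE).groups()
specific_fields = SPECIFIC.match(LINE).groups()

# Both patterns extract the same fields from this line.
print(generic_fields == specific_fields)
```

The two patterns agree on well-formed input; the difference shows up in the work done per line and on lines that almost match, where the generic pattern tries many more field boundaries before giving up.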

Maximum number of devices

Bottleneck
Connector allows up to a max of 5000 devices and does not process events from newer devices once this limit is reached

Tuning

Parameter: syslog.max.device.count
  Default: 5000
  Recommendation: Increase it as required to match the number of devices in your environment

Event Processing
Agent Batching

Batch size controls how many events go together from component to component in the event flow and eventually to the destination

Doubling or tripling default size of 100 could help improve the performance internally as well as over networks with latency

Do not increase beyond that because it could have a negative impact by increasing memory requirements to hold the batches

Categorization

Categorization files for different device types are loaded into memory and some of those can be big

Connector base memory usage can be high when dealing with a large number of device types

Java heap space may need to be bumped up

External Map File Processing

External map file query is executed for every batch of events

Make sure the query is simple and returns fast, if you are using this feature

Connector Filtering

Make sure that the filter condition is optimized and not extremely complex

Event Processing - continued
Field Based Aggregation

Groups events with same values in specified fields into buckets and produces aggregated events on time interval expiry or reaching event threshold

Restrict the field set to minimum required and choose an optimal event threshold value to keep the number of event field comparisons low

Choose an optimal time interval not to block the event flow for too long

Avoid using preserve common fields setting in a high event volume environment
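
To make the cost model concrete, here is a minimal sketch of field based aggregation, assuming events are plain dictionaries. The class, its method names and the `count` output field are hypothetical and do not reflect the connector's internals. Each incoming event costs one comparison key built from the chosen fields, so a smaller field set means less work per event.

```python
class FieldAggregator:
    """Bucket events by selected fields; flush on threshold or interval."""

    def __init__(self, fields, threshold, interval_seconds):
        self.fields = tuple(fields)       # fields that must be equal
        self.threshold = threshold        # flush a bucket at this many events
        self.interval = interval_seconds  # flush a bucket at this age
        self.buckets = {}                 # key -> (first_seen, count)

    def add(self, event: dict, now: float) -> list:
        """Add one event; return the aggregated events that are now due."""
        key = tuple(event.get(f) for f in self.fields)
        first_seen, count = self.buckets.get(key, (now, 0))
        self.buckets[key] = (first_seen, count + 1)
        return self.flush(now)

    def flush(self, now: float) -> list:
        """Emit buckets that hit the threshold or outlived the interval."""
        due = [key for key, (first, count) in self.buckets.items()
               if count >= self.threshold or now - first >= self.interval]
        out = []
        for key in due:
            _, count = self.buckets.pop(key)
            out.append({**dict(zip(self.fields, key)), "count": count})
        return out
```

An event held in a bucket is not forwarded until its bucket flushes, which is why a long interval delays the event flow.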

Name Resolution

Name resolutions are done in background threads and the event flow is not normally blocked for the answers to come back

If the Wait For Name Resolution feature is enabled, then the event flow is blocked for a certain timeout period for the answers to come back

Do not enable Wait For Name Resolution feature in an environment requiring frequent resolutions

Event Transport
Event caching can occur for a number of reasons - network latency and problems in the destination are the common reasons
Bottleneck

Excessive caching can cause delays in events reaching their destination

When cache becomes full, connector starts dropping events

Tuning

Enable transport multithreading (except when the root cause is a problem in the destination)

For the logger smart message transport, turn on the HTTPS persistent connection feature

Increase the cache size to hold events for longer in the cache and prevent loss of events

Parameter: http.transport.threadcount
  Recommendation: Applies only to the ESM transport. Increase it by small increments as required.

Parameter: transport.loggersecure.threads
  Recommendation: Applies only to the logger secure transport. Increase it by small increments as required.

Parameter: transport.loggersecure.connection.persistent
  Default: false
  Recommendation: Applies only to the logger secure transport. Change it to true to reuse existing HTTPS connections instead of tearing them down for every batch of events

Parameter: Cache Size
  Default: 1GB
  Recommendation: Increase it as required up to a limit of 50GB. This is a destination setting which can be configured using the ESM console, the connector appliance GUI or the local connector setup wizard.

Out of Memory Problems and Tuning

Java Process Memory and Management

Memory allocated to a Java process consists of heap space and native memory

Heap space is allocated as instructed by Java runtime parameters
  -Xms (initial heap size), -Xmx (maximum heap size), 256 MB by default on connectors

Native memory size = process memory size - size of heap space

Garbage collection reclaims the memory of unused objects
  Minor collections (GC) reclaim memory in the YOUNG generation and move survivors into OLD
  Major collections (Full GC) reclaim memory in all of the heap space and take much longer
  The JVM stops the application threads during a GC or Full GC

Frequent Full GCs affect application performance severely
  They are a clear indicator of the need to increase the maximum heap size

Memory limitations in the 32 bit connector build
  Total addressable space is 4GB, of which kernel space takes 1GB to 2GB depending on the OS
  User space available to the process is 2GB to 3GB depending on the OS
  Limits exist on max heap space: 1GB (connector appliances), 1.5 GB (Windows), 2 GB (Unix)
  Use the 64 bit connector build for higher memory

[Diagram: Process Memory]
  Heap Space
    YOUNG Generation: newly created objects
    OLD Generation: old objects surviving minor GCs
    PERMANENT Generation: classes, methods, etc
  Native Memory
    Code Generation
    Socket Buffers
    Thread Stacks
    Direct Memory Space
    JNI Code
    Garbage Collection
    JNI Allocated memory
  Out of memory errors can happen in any of these memory areas

Dealing with Java out-of-memory errors

Error: java.lang.OutOfMemoryError: Java heap space
Error: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
  Root Cause and Recommendation
  Garbage collection is unable to free up more space and memory could not be allocated for new objects
  Increase the maximum heap size using the -Xmx option in increments as required up to the limit
  If this still does not help, there could be a potential memory leak or a bug; open a support incident supplying the logs and heap dumps

Error: java.lang.OutOfMemoryError: PermGen space
  Root Cause and Recommendation
  Permanent generation area has become full due to loading many classes statically, creating dynamic classes or creating too many interned strings
  Default max size of PermGen space is 64 MB. Increase it in small increments using the -XX:MaxPermSize option

Error: java.lang.OutOfMemoryError: Unable to create a new native thread
  Root Cause and Recommendation
  JVM is low on native memory and unable to create a new VM thread. Make more native memory available by
    Reducing the heap space using the -Xms and -Xmx options
    Reducing the thread stack space using the -Xss option

Error: Out of Memory Error (allocation.cpp:211), pid=16950, tid=1855142800
  Root Cause and Recommendation
  Displayed in the fatal error logs when the JVM crashes due to a malloc failure. The system could be out of physical RAM or swap space, or the process size limit was hit on a 32 bit system. Take one or more of the following actions
    Reduce memory load on the system or increase physical memory or swap space
    Decrease the number of application threads, reduce the Java heap space and stack space

Adjusting memory options

On a software connector, add or edit settings in a file under the current/user/agent folder
  agent.wrapper.conf when running as a service
  setmem.sh (Unix) or setmem.bat (Windows) when running as a standalone application. This file may have to be created if it does not already exist.
    set ARCSIGHT_MEMORY_OPTIONS="-Xms256m -Xmx256m" (Example only. Add or remove options as required inside the double quotes)
    export ARCSIGHT_MEMORY_OPTIONS (only on Unix)

On a connector appliance
  Only heap space can be changed using the container command Configure Memory Settings
  Other settings can be changed using SSH or the diagnostic tools file editor, using the same mechanisms as for a software connector

Memory Type: Heap Space
  Running as a service
    wrapper.java.initmemory=256 (initial heap size)
    wrapper.java.maxmemory=256 (maximum heap size)
  Running as a standalone application
    -Xms256m -Xmx1024m
    It is recommended to increase only the max heap size

Memory Type: Perm Gen Space
  Running as a service
    Add additional java parameters with adjusted indexes
    wrapper.java.additional.7=-XX:PermSize=64m
    wrapper.java.additional.8=-XX:MaxPermSize=128m
  Running as a standalone application
    -XX:PermSize=64m -XX:MaxPermSize=128m
    It is recommended to increase only the max perm size

Memory Type: Stack Space
  Running as a service
    Add an additional java parameter with adjusted index
    wrapper.java.additional.9=-Xss64k
  Running as a standalone application
    -Xss64k
    Default stack size is OS dependent. Adjust and observe. Too low a value can cause StackOverflowError

Customer Cases

Customer Case 1 - Problem

The CEF Syslog TLS destination was caching at only 200 eps, while the ESM and Logger destinations did not cache at the same event rate

[Diagram: The Source Connector sends to ESM, to Logger, and over CEF Syslog TLS to the Syslog NG Connector. Syslog NG Source 1 and Syslog NG Source 2 also send to the Syslog NG Connector over TLS.]

Troubleshooting
Changing the transport protocol to UDP or Raw TCP did not help
Could not reproduce the problem in house
Customer captured packets with tcpdump and analyzed them using Wireshark

Large number of TCP Window Full messages

SEQ/ACK analysis showed that at times there was more than 10KB of data in flight, indicating that the receiver was too slow to process the incoming flood of packets

TCP receive buffer and window sizes got reduced over time, which contributed to the slow reception

Further enquiries revealed that the Syslog NG connector was receiving TLS data from 2 other sources

With this new knowledge of the customer environment, the problem could also be reproduced in house

Observed high memory usage and increased the heap space to 1024 MB, but it did not help

Root Cause
Destination Syslog NG connector did not close TCP connections when the sources closed their connections
The growing number of TCP connections forced the receive buffer size to be reduced, causing slower reception

Solution
Set the tcppeerclosedchecktimeout parameter to 30000 msec (half a minute)
This parameter tells the connector to proactively check for and close any TCP sockets closed by the peer

Customer Case 2 - Problem

Huge difference in event counts found between the Fortigate Firewall and Logger via the Syslog connector

[Diagram: Fortigate Firewall -> Syslog Connector (queuing, caching) -> Logger]

Observations
Incoming event rate was much higher than the processing rate and the connector was queuing heavily
During peak hours, queuing exceeded the size limit and a huge number of events were dropped
Caching was observed during peak hours and some events were dropped when the cache size limit was exceeded
High memory usage and frequent Full GCs were observed, affecting the performance of the connector

[Figures: Queue Rate(SLC) vs Events/Sec(SLC); Queue Drop Count; Events/Sec(SLC) vs Throughput(SLC); Cache Size and Current Drop count; Memory usage (Total vs Used)]

Frequent Full GCs
agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:35:29 | [Full GC
agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:37:08 | [Full GC
agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:38:52 | [Full GC
agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:40:30 | [Full GC
agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:42:06 | [Full GC
agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:43:47 | [Full GC
agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:45:32 | [Full GC
agent.out.wrapper.log:INFO | jvm 1 | 2012/12/05 11:47:10 | [Full GC
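
How frequent the Full GCs in this case were can be checked from the log timestamps. The sketch below is a small Python helper written for this purpose; the function name is hypothetical, and the line layout (timestamp between `|` separators, followed by `[Full GC`) is an assumption based on the excerpt shown for this case.

```python
import re
from datetime import datetime

FULL_GC = re.compile(
    r"\|\s*(\d{4}/\d\d/\d\d \d\d:\d\d:\d\d)\s*\|\s*\[Full GC"
)


def full_gc_intervals(lines):
    """Seconds between consecutive Full GC entries in wrapper log lines."""
    stamps = [
        datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        for m in map(FULL_GC.search, lines) if m
    ]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


TIMES = ["11:35:29", "11:37:08", "11:38:52", "11:40:30",
         "11:42:06", "11:43:47", "11:45:32", "11:47:10"]
LOG = [f"INFO | jvm 1 | 2012/12/05 {t} | [Full GC" for t in TIMES]

intervals = full_gc_intervals(LOG)
print(intervals)
print(sum(intervals) / len(intervals))
```

For these eight entries the gaps range from 96 to 105 seconds, so the JVM was running a Full GC roughly every 100 seconds.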

Customer Case 2 - Solution
Machines hosting the connectors were very powerful (64 bit Linux, 48 core CPU, 128 GB RAM, 600 GB hard disk)
Actions Taken
Increased the Java heap size to 2048 MB to reduce the frequency of Full GCs
Enabled syslog parser multithreading to keep up with the queuing rate
Increased the file queue size from 100 to 2000 files of 10 MB each, equivalent to 20 GB in total, to prevent dropping of events from the file queue
Increased the cache size from 1GB to 10GB to prevent dropping of events from the cache during peak hours
The above measures improved the performance of the connector significantly
Where they did not solve the problem completely, we asked the customer to split the event volume among multiple syslog connectors

Customer Case 3 - Problem

Customer had loggers in the UK and the USA forwarding events to an ESM manager in the USA. Only the UK loggers were experiencing caching and event loss

[Diagram: Logger in UK (onboard connector, caching) and Logger in USA (onboard connector) both forward to ESM in USA. USA Logger: Time per Batch ~ 40 msec. UK Logger: Time per Batch > 500 msec.]

Observations
Time per Batch = round trip time taken for a batch of events to travel from the logger to ESM and for the acknowledgment for the batch to come back from ESM to the logger
The US logger took an average of 40 msec/batch and the UK logger took an average of over 500 msec/batch
This large difference in the round trip time is indicative of network latency due to geographical distance and is the root cause of caching in the UK logger

Solution
Enabled multithreading on the ESM transport with a thread count of 2, which showed an improvement in throughput
Increased the thread count to 7 (number of processors in the CPU) and caching went away completely
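
Why more transport threads helped in this case can be shown with a simple upper-bound model. It assumes that each thread sends one batch and then waits for its acknowledgment, and it ignores all processing time. The function name is hypothetical, and the batch size of 100 is the default mentioned under Agent Batching, not a value reported for this customer.

```python
def max_events_per_second(threads: int, batch_size: int,
                          msec_per_batch: int) -> float:
    """Upper bound on throughput when every batch waits for its ack."""
    return threads * batch_size * 1000 / msec_per_batch


print(max_events_per_second(1, 100, 40))    # USA logger, one thread
print(max_events_per_second(1, 100, 500))   # UK logger, one thread
print(max_events_per_second(7, 100, 500))   # UK logger, seven threads
```

Under this model a single thread on the 500 msec link tops out at a small fraction of what the 40 msec link allows, and adding threads raises the ceiling proportionally because the waits overlap.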

Recommendations

Some Recommendations
Evaluate the number of devices, device types and event volume early in your deployment cycle
Split the load among multiple connectors when the incoming event rate exceeds the achievable maximum
that varies based on the underlying platform and environment
When splitting the load, consider grouping the devices of same type to one connector and another type to a
different connector
Evaluate the total capacity of your machine and other processes running to determine the number of
connectors to install on a single machine
Cumulative heap size allocated to connectors and other java processes should be well below the total
memory available on the system
Use 64 bit syslog connector builds to overcome out of memory errors caused by the 32 bit heap size limits
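
The cumulative heap recommendation can be turned into a quick check before installing another connector on a machine. The sketch below is illustrative only: the function names are hypothetical, and the 50% ceiling is an assumed stand-in for "well below", which the recommendation does not quantify.

```python
import re

UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def max_heap_bytes(jvm_options: str) -> int:
    """Extract the -Xmx value from a JVM option string, in bytes."""
    match = re.search(r"-Xmx(\d+)([kmgKMG]?)", jvm_options)
    if match is None:
        raise ValueError("no -Xmx option found")
    number, unit = int(match.group(1)), match.group(2).lower()
    return number * UNITS.get(unit, 1)


def heap_budget_ok(option_strings, total_ram_bytes: int,
                   max_fraction: float = 0.5) -> bool:
    """True if the combined max heap stays under a chosen share of RAM."""
    total_heap = sum(max_heap_bytes(opts) for opts in option_strings)
    return total_heap <= total_ram_bytes * max_fraction


# Four connectors with a 2 GB max heap each on a 16 GB machine.
print(heap_budget_ok(["-Xms256m -Xmx2048m"] * 4, 16 * 1024 ** 3))
```

Heap is only part of each process's footprint, so the ceiling has to leave room for native memory as well as for the operating system and other processes.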


Session TB3248 Speakers Girish Mantry, Moehadi Liang

Thank you
