
Best Practices & Guidelines for Smarts Performance & Scalability
2015 London User Group Meeting
Barry Weinstein
Product Management
barry.weinstein@emc.com

Copyright 2015 EMC Corporation. All rights reserved.

EMC Notifications
Forward Looking Statements Notice
This roadmap document contains forward-looking statements as defined under the Federal Securities Laws.
Actual results or deliverables could differ materially from those projected in the forward-looking statements
as a result of certain risk factors, including but not limited to: (i) adverse changes in general economic or
market conditions; (ii) delays or reductions in information technology spending; (iii) risks associated with
acquisitions and investments, including the challenges and costs of integration, restructuring and achieving
anticipated synergies; (iv) competitive factors, including but not limited to pricing pressures and new product
introductions; (v) the relative and varying rates of product price and component cost declines and the volume
and mixture of product and services revenues; (vi) component and product quality and availability; (vii) the
transition to new products, the uncertainty of customer acceptance of new product offerings and rapid
technological and market change; (viii) insufficient, excess or obsolete inventory; (ix) war or acts of
terrorism; (x) the ability to attract and retain highly qualified employees; (xi) fluctuating currency exchange
rates; and (xii) other one-time events and other important factors disclosed previously and from time to time
in EMC's filings with the U.S. Securities and Exchange Commission. EMC disclaims any obligation to update
any such forward-looking statements after the date of this roadmap document.


Agenda
Assumptions
Your Mileage May Vary
Memory
32bit vs 64bit architecture (a look back in time)
Operating System Memory
vSphere Machine Memory / Transparent Page Sharing

vSphere Guideline & Best Practices


Smarts Performance and Scalability

Our Assumptions
Support the largest Smarts topology possible

Avoid splitting topology whenever possible


Reduce Smarts footprint

Provide a responsive UI for operations and ensure timely notification processing


Optimize the time it takes Smarts to go from start-up to monitoring (time to first poll)

Fastest time possible

Polling should complete within the polling interval

Minimize late polling

Deploy on vSphere according to best practices


Provide guidance on server hardware


Etymology
The study of the history of words, their origins, and how their form and meaning
have changed over time.
In the United States, the Environmental Protection Agency requires a set of
standard emissions tests on all new vehicles which simulate city and highway
driving. Part of the test produces estimated city and highway gas mileage
figures. Since no test can exactly simulate all driving habits and conditions,
the actual gas mileage of each vehicle will vary. As a result, when these estimated
mileage claims from automobile manufacturers appear in advertisements, they
are almost always accompanied by the standard disclaimer "your mileage may
vary."
your mileage may vary
(idiomatic) It may work differently in your situation, or be different in your
experience. "Those batteries last nine hours in my laptop, but your mileage may vary."
(idiomatic) To express a possible difference in taste: "this is just my opinion, your
opinion may be different". "That red dress looks really good on you, but your mileage
may vary, of course."


Follow VMware Guidelines and Best Practices

Performance Best Practices for VMware vSphere 5.5


Performance Best Practices for VMware vSphere 5.1


Virtual Machine Memory


Gee, everything looks fine. Why is my VM running poorly?


Virtual Memory
The basic idea behind virtual memory is that the combined size
of all the program code, data, etc. may exceed the amount of
physical memory available for it. The OS keeps only those
parts of the program which are currently needed by the CPU in
RAM (main memory) and the rest on disk (pagefile).
A 16MB program can run in 4MB of space by carefully choosing
which 4MB to keep in memory at each instant, with pieces of
the program being paged between disk and memory as needed.
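
The paging idea above can be sketched as a toy simulation. This is a minimal illustration, not how any particular OS is implemented: the 4KB page size, the LRU eviction policy, and the two access patterns are all assumptions chosen for the example.

```python
from collections import OrderedDict

PAGE_SIZE_KB = 4  # assumed page size, for illustration only


def count_page_faults(accesses, resident_pages):
    """Replay page accesses against a fixed number of RAM frames using LRU."""
    ram = OrderedDict()  # page number -> None, ordered oldest to newest use
    faults = 0
    for page in accesses:
        if page in ram:
            ram.move_to_end(page)  # already resident: mark as recently used
            continue
        faults += 1  # page must be read in from the pagefile
        if len(ram) >= resident_pages:
            ram.popitem(last=False)  # evict the least recently used page
        ram[page] = None
    return faults


program_pages = 16 * 1024 // PAGE_SIZE_KB  # 16MB program -> 4096 pages
ram_frames = 4 * 1024 // PAGE_SIZE_KB      # 4MB of RAM   -> 1024 frames

# A hot loop over 512 pages, repeated 10 times, fits in RAM.
hot_loop = list(range(512)) * 10
# A sweep over the whole program, repeated 10 times, does not.
full_sweep = list(range(program_pages)) * 10

print(count_page_faults(hot_loop, ram_frames))
print(count_page_faults(full_sweep, ram_frames))
```

In this model the hot loop faults only on its first pass, while the full sweep faults on every access. That is the difference between a working set that fits in memory and one that does not.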


History: Challenges with 32 bit Smarts


Virtual Memory Challenged

[Diagram: a 32-bit address space with 4G RAM, of which ~2.5-3G is available RAM. The space is shared by many Polling & Discovery Threads, the SMARTS MODEL (Larger the Topology, the More Memory Used), Smarts Program Code, and the Operating System.]


64 bit Smarts
Memory is no longer the bottleneck (Virtual and Physical)

[Diagram: a 64-bit address space with 4096G RAM, of which ~4000G is available RAM. Virtual Machine Size in vSphere: how much Operating System memory is made available to this VM. The space is shared by many Polling & Discovery Threads, the SMARTS MODEL (Larger the Topology, the More Memory), Smarts Program Code, and the Operating System.]


Transparent Page Sharing

[Diagram: RAM on the Physical ESX Server, containing five boxes labeled "My DLL". Callout: VM RAM is the amount of RAM the OS can access.]



Memory in vSphere is Confusing

TPS Memory


There are so many Server CPU choices. Which one should I use?
Smarts has functions that are single threaded and others that
are multi-threaded

For example: Codebook creation is single threaded


Polling and discovery are multi-threaded
Faster clock speeds are critical for Smarts
Choose the CPU with the fastest clock speed when possible
This is more important for larger topologies

Use our sizing spreadsheet to calculate memory requirements


The calculated memory requirements typically do not vary much, so they are a good
indicator of practical memory requirements


General Hardware Recommendations


Acquire latest generation of affordable server technology
http://en.wikipedia.org/wiki/List_of_Intel_CPU_microarchitectures
http://en.wikipedia.org/wiki/List_of_Intel_microprocessors
Larger topologies benefit more from newer server hardware

If running ESX 4.x, use Intel Nehalem (or AMD equivalent) or higher
Released on Nov 17th, 2008 (vSphere v4.x EOL)
Extended Page Tables (EPT) & Hyperthreading

If running ESX 5.x, use Intel Sandy Bridge (or AMD equivalent) or higher
Released January 9th, 2011
More physical/logical cores, 11%+ increase in performance

If you are planning to use vMotion we highly recommend that your ESX servers
have a dedicated physical NIC configured for vMotion.
Note: vMotion requires shared (external) storage


Use SPECINT as a Guideline


http://www.spec.org/cpu2006/results/cpu2006.html
Industry standard benchmark
Not just processor speed

Your mileage still may vary
It should give you a relative indicator of how fast the server is

The larger the topology, the better a higher-SPECINT CPU will perform

SPECINT ratings are now > 100


Do not Run Service Assurance Suite on vSphere if you do not have a
Nehalem Processor (or AMD equivalent) or Higher


Performance on ESX over Time


[Chart: Proportion of Apps Performing Well (up to 100%) by ESX Version (ESX 2, ESX 3, ESX 3.5, vSphere/ESX 4.0). Each version is annotated with its Overhead, VM CPU (1, 2, 4, 8 vCPU), VM Memory (3.6 GB, 16 GB, 64 GB, 256 GB), and I/O figures (IOPS and network throughput).]

Improvements to vMotion in vSphere 5.x


Smarts Typically Does NOT Exploit More Than 4 vCPUs
Use M&R Guidelines
M&R SQL Server Database can use > 4 vCPUs
Check P&S guidelines


Percent Improvement: Internal P&S testing (Nehalem processor)

Comparisons done for 2nd discovery to show the effects of monitoring
Linux and Windows tested with both VM and bare metal configurations
Time measured from start of second discovery to system ready across multiple topologies
and averaged for each OS below
Topologies tested below ranged in size: Interfaces 4K - 78K, Ports 3K - 61K, Managed
Ports 92 - 2.5K
Discovery time improved significantly
Running the default discovery and polling threads
ESX 4.0, RH 5.x, Windows 2003

[Bar chart: AM and AM/PM domains, 7.0.3 vs 8.1.1. Values shown: 77, 72, 70, 58, 55, 54, 33, 36, 36, 35.]

* 7.0.3 results normalized to 100% to show relative improvement in 8.1.1



VMware Tools
VMware Tools is a set of utilities and drivers that improve the
performance and management of your virtual machines. The
tools are not installed by default. During testing, significant performance
degradation was observed without VMware Tools installed. Therefore, it is
recommended that VMware Tools be installed on every VM.


VMware Network Driver


The VMware network adapter type is set to E1000 by default. It is strongly
recommended that you change your network adapter type to VMXNET3.


Hyperthreading (enabled in BIOS)


VMware recommends that Hyperthreading be turned ON
Also called Simultaneous Multithreading (SMT)


Virtual Machine Memory Concepts

Configured memory = memory size of virtual machine assigned at creation.

Touched memory = memory actually used by the virtual machine. vSphere only allocates guest
operating system memory on demand.

Swappable = virtual machine memory that can be reclaimed by the balloon driver or by vSphere
swapping. Ballooning occurs before vSphere swapping. If this memory is in use by the virtual
machine (i.e., touched and in use), the balloon driver will cause the guest operating system to
swap. Also, this value is the size of the per-virtual machine swap file that is created on the VMware
Virtual Machine File System (VMFS) file system (.vswp file).

If the balloon driver is unable to reclaim memory quickly enough, or is disabled or not installed,
vSphere forcibly reclaims memory from the virtual machine using the VMkernel swap file.


Don't limit CPU, Memory, I/O, Network


RPS on an SSD (Flash Drive)


Several customers have seen a 50% improvement in
elapsed time reading/writing the RPS
Remember there are 2 RPS files (primary, secondary)
See if this would be significant in your environment

[2015/05/12 08:42:56 +723ms] t@896096000 InCharge Framework
ICF_MSG-*-ICF_RESTORESTART-PersistenceManager: restore started
[2015/05/12 08:46:56 +148ms] t@896096000 InCharge Framework
ICF_MSG-*-ICF_RESTOREFINISH-PersistenceManager: restore finished
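
To see whether RPS restore time is significant in your environment, the two log messages above can be turned into an elapsed time. The sketch below is one possible way to do that, not a Smarts tool. It assumes the timestamp format shown in the excerpt, that `+723ms` is a millisecond offset within the second, and that each message ID appears on the line after its timestamp. The function name `restore_seconds` is made up for this example.

```python
import re
from datetime import datetime, timedelta

# Matches the timestamp prefix used in the excerpt, e.g. "[2015/05/12 08:42:56 +723ms]".
STAMP = re.compile(r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \+(\d+)ms\]")


def restore_seconds(log_text):
    """Return seconds between ICF_RESTORESTART and ICF_RESTOREFINISH, or None."""
    last_stamp = None
    started = finished = None
    for line in log_text.splitlines():
        match = STAMP.search(line)
        if match:
            # Remember the most recent timestamp; the message ID follows it.
            base = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            last_stamp = base + timedelta(milliseconds=int(match.group(2)))
        if "ICF_RESTORESTART" in line:
            started = last_stamp
        elif "ICF_RESTOREFINISH" in line:
            finished = last_stamp
    if started is None or finished is None:
        return None
    return (finished - started).total_seconds()


sample = """\
[2015/05/12 08:42:56 +723ms] t@896096000 InCharge Framework
ICF_MSG-*-ICF_RESTORESTART-PersistenceManager: restore started
[2015/05/12 08:46:56 +148ms] t@896096000 InCharge Framework
ICF_MSG-*-ICF_RESTOREFINISH-PersistenceManager: restore finished
"""

print(restore_seconds(sample))
```

For the excerpt above this works out to just under four minutes. Running the same measurement before and after moving the RPS files to flash gives a direct comparison.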


FASTVP or Equivalent is Bypassed


Feature on the storage array

FASTVP and similar features from array vendors use an LRU
(Least Recently Used) algorithm to keep frequently used
storage on an SSD
And least frequently used storage on slower spinning disks

The RPS file(s) are relatively small and infrequently used, but...
Are critical to Smarts and single thread discovery
Larger topologies mean larger RPS files

https://www.emc.com/collateral/white-papers/h12102-vnx-fast-vpwp.pdf

ESX Servers in Maintenance Mode


Multiple VMs get vMotion'd Simultaneously

Maintenance Mode prepares an ESX to be taken down/offline
vCenter operator issues a command
All VMs get vMotion'd off the ESX, up to 4 vMotions in parallel
Network might/will become oversubscribed
vMotions may fail or time out

RECOMMENDATION:
vMotion 1 VM at a time
Then put the ESX in Maintenance Mode

How Can I Tell if the ESX Server is Oversubscribed for CPU?
How many sockets (physical CPUs)?
How many cores per socket?
Total cores = (# of sockets) * (cores/socket)
How many vCPUs are configured for all VMs on the ESX server?
If sum(vCPUs on all VMs) > total cores
You are oversubscribed (this might not be bad)
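
The check can be written out as a short calculation. This is a sketch only: the host and VM sizes are hypothetical, the function name is made up, and it counts physical cores as the slide does (hyperthreaded logical cores are ignored). A host is treated as oversubscribed when the vCPU total exceeds the physical core count.

```python
def cpu_oversubscription(sockets, cores_per_socket, vcpus_per_vm):
    """Return (total_cores, total_vcpus, ratio) for one ESX server."""
    total_cores = sockets * cores_per_socket
    total_vcpus = sum(vcpus_per_vm)
    return total_cores, total_vcpus, total_vcpus / total_cores


# Hypothetical host: 2 sockets x 8 cores, running five VMs.
cores, vcpus, ratio = cpu_oversubscription(2, 8, [4, 4, 4, 2, 1])
print(cores, vcpus, ratio, vcpus > cores)

# Adding two more 4-vCPU VMs pushes the same host past its core count.
cores, vcpus, ratio = cpu_oversubscription(2, 8, [4, 4, 4, 2, 1, 4, 4])
print(cores, vcpus, ratio, vcpus > cores)
```

A ratio above 1.0 means vCPUs outnumber cores. Whether that hurts depends on how busy the VMs are.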

How Many Cores are Available on Your ESX Server?
Recommendation
Do not overcommit CPU and Memory resources
Smarts domains typically do not benefit from more than 4 vCPUs
Large topologies, depending on network latency, will benefit from 4 vCPUs
Medium to small topologies, depending on network latency, will
likely only require 1 or 2 vCPUs
The Smarts Broker needs 1 vCPU, or can be combined with another Smarts server
Dedicate a Cluster (group of ESX servers) to management
All workloads are under your control


Misc Issues
Affinity / Anti-Affinity
Consider M&R Collector(s) to always run on the same ESX
as SAM
Consider SAM and large IP domains to run on the same ESX

ESM should be local to the vCenter servers

Consider disabling VMware Tools polling

Consider WARM STANDBY to provide continuous availability


Beware of vSphere Optimization Tools


So What is your Advice in Deploying Smarts on vSphere?
Size the VMs like physical machines according to the P&S
guidelines
Don't limit (cap) memory, CPU, network, I/O
Add 50% more memory (RAM)
vSphere will not allocate it if it isn't needed

Larger topologies benefit from faster servers


Use historical data if converting from physical to virtual


What does High Latency Mean?
Network latency
ping time

PLUS
Device latency
Device latency is how long the SNMP agent takes to respond
Large devices with many hundreds of ports/interfaces can take many
seconds to minutes to respond completely or time out
SNMP agents may have been de-prioritized by the device in times
of high CPU utilization

So How Many Devices (ports and interfaces) can Smarts Support Today?
How much time does it take to do discovery?
Higher latencies will benefit from more discovery threads
If CPU utilization is <65% (just an estimate)
Increase discovery threads and measure
At some point adding threads does not improve discovery time or maxes
out the CPU
Every customer/configuration has a different tolerance for discovery time

Can we poll all devices within the polling interval?

Measure avg late polling: does polling complete within the polling
interval? (use W4N Smarts Health Solution Pack)
Typically polling is not a CPU intensive process
Increase the number of polling threads and measure
At some point adding threads does not improve polling
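
The "increase threads and measure" loop can be sketched as follows. Everything here is illustrative: `tune_threads` and `measure` are hypothetical names, not part of Smarts. The 65% CPU ceiling is the slide's rough estimate, and the 5% minimum gain and step of 10 threads are assumptions. In practice `measure` would be a manual step: change the thread setting, run a discovery or polling cycle, then record elapsed time and CPU utilization.

```python
def tune_threads(measure, start=10, step=10, max_threads=100,
                 cpu_ceiling=65.0, min_gain=0.05):
    """Raise the thread count until CPU is saturated or gains flatten out.

    `measure(threads)` is a hypothetical callback supplied by the caller; it
    runs a discovery (or polling cycle) and returns (elapsed_hours, cpu_percent).
    """
    best_threads = start
    best_time, cpu = measure(start)
    threads = start + step
    while threads <= max_threads and cpu < cpu_ceiling:
        elapsed, cpu = measure(threads)
        if elapsed > best_time * (1 - min_gain):
            break  # adding threads no longer improves the time meaningfully
        best_threads, best_time = threads, elapsed
        threads += step
    return best_threads, best_time


def fake_measure(threads):
    # Synthetic model for demonstration only: time falls with threads until a
    # floor is reached, while CPU rises linearly. Not real Smarts measurements.
    elapsed = max(4.0, 100.0 / threads)
    cpu = 10.0 + 0.5 * threads
    return elapsed, cpu


print(tune_threads(fake_measure))
```

With the synthetic model the loop settles on 30 threads, because going to 40 no longer shortens the run. Real environments will plateau at different points, depending on device and network latency.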

Customer Feedback (REAL WORLD)


A single IP Domain with 640K Ports and Interfaces
Server Specint of 30

Running on vSphere
Discovery time ~20 hrs, avg late polling > 10 secs
10 discovery threads / 10 polling threads

Increase threads to 50 / 50
No resource constraints
Planning to increase threads to 80-100

Discovery time 5 hrs / avg late polling 3 seconds



THANK YOU

