
HP Global Technical Partner − Cadence

HP−UX Kernel Tuning Guide for Technical Computing

Getting The Best Performance On Your Hewlett−Packard HP 9000 Systems
Version 2.0

Introduction
This document describes the underlying basics of why and how an HP−UX kernel is tuned and configured. The
intent is to provide customers, developers, application designers, and HP's technical consultants with the
information necessary to optimize the performance of existing hardware configurations and to make
intelligent decisions when running applications on HP's UNIX platforms.

Hardware Considerations
HP, like other hardware vendors, offers a broad selection of products with a wide range of CPU performance,
memory, and disk options, varying greatly in price. Obviously, the performance of a software application will be
affected by the hardware it runs on. The reason so many different products are available is to allow
the customer to select the most cost effective solution for their particular software problem. A large, heavily
configured system may not be utilized to its full potential if you only need to solve small, simple problems,
while a less capable system may be overloaded trying to solve large, complex problems that exceed its
capacity. In either case, the system would not be cost effective when utilized in this manner.
Selecting the most cost effective system requires understanding your compute requirements as well as the
hardware options.

There are five key hardware areas that directly affect the performance you will obtain from your application:
CPU, Memory, Disk, Graphics, and Network. While all of these hardware areas are important, it is equally
important to configure a balanced system. It is counterproductive to buy the fastest CPU and then configure it
with insufficient memory. You might get better performance and throughput with a slower, less expensive
CPU if the difference in price is invested in more memory.

There are a large number of variables to consider when deciding on the hardware for your compute
infrastructure. The compute needs may vary from the very simple to the incredibly complex.

The best way to select the appropriate hardware configurations is to define your compute needs:

• How many users need to be served?
• What are the data server needs?
• What are the compute server needs?
• What are the application software needs?

A couple of different system configurations will likely be needed to fully cover your environment. Perhaps one,
two, or three base system configurations will properly handle your desktop computing needs: one hardware
configuration for one type of user, a slightly different configuration for another, and yet another configuration
for the user who has major memory and swap requirements. There may be a need for managing both
small and large batch tasks under a compute server or task queuing methodology. A data server will be needed
for storing large amounts of data, with a reliable backup system and revision control system. Add to this
collection a software server dedicated to managing large software applications and licensing programs. The best
way to select your appropriate hardware configuration(s) is to perform benchmark tests that duplicate your
intended use of the system. With relevant benchmark data in hand, you will have the information you need to
make intelligent tradeoff decisions on the cost/performance benefits of the available hardware options for your
site.

CPU

Many operations require a large number of integer and floating point calculations. Some applications rely
mainly on integer calculations, while others depend heavily on floating point. CPU performance is the
single most important factor in executing a large number of calculations in the shortest possible
time. Selecting the CPU is a tradeoff between cost, the size of the problems you will be solving, and your
perception of adequate performance. If an operation takes five seconds, is it worth an extra
$10,000 to do it in three seconds? If the operation takes five hours and the time can be
reduced to one or three hours, however, it may be worth the added expense. If the operation is done several
times a day, it is almost certainly worth it; if it is done only once a month, it may be questionable. When
evaluating hardware performance, you must prioritize the tasks to be performed relative to their importance,
frequency, and impact on overall productivity.

Tasks that are most affected by CPU performance are those that involve more computation than disk access or
graphics display. Don't forget to consider investment protection. The CPU that seems adequate today may not
meet your needs in the near future. The rapid pace of hardware development makes existing systems obsolete
in a very short period of time. How easy will it be for you to upgrade your systems to increase MIPS capacity
or take advantage of the latest compiler or hardware technology?

One standard benchmark that you can use to gauge CPU performance is SPECint.

Memory

One of the most commonly asked questions is "How much memory do I need?". Unfortunately, the real
answers to this question are "Enough" and "It depends". The amount of memory you need is directly related to
the size of the applications you are working with. While 'X' amount of memory may allow you to run your
application, it may not be large enough to allow for optimal performance. Memory management is a complex
topic. Memory, its relationship
to swap space, and its effect on performance are discussed in more detail in the section "Understanding
Memory and Swap" later in this document. Again, cost must be weighed versus benefits; certainly you can
spend the money to configure a system with enough memory to allow your application to be run in memory,
but depending on the application, the cycle time savings may not be worth it.

Disk

Application data sets can be quite large, and disk I/O is often a performance bottleneck. Beyond its obvious
effect on data loading bandwidth, disk I/O can also be the limiting factor in overall performance if a system
starts paging.

HP's philosophy is to design balanced systems in which no single component becomes a performance
bottleneck. HP has made significant enhancements to I/O performance in order to keep pace with the speed of
our CPUs. I/O performance depends on several parts of the system working together efficiently. The I/O
subsystems have been redesigned so that they now offer the industry's fastest and most functional I/O as
standard equipment.

To improve disk I/O performance:

Distribute the work load across multiple disks. Disk I/O performance can be improved by splitting the work
load. In many configurations, a single drive must simultaneously handle operating system access, swap, and
data file access. If these tasks are distributed across multiple disks, the load is shared, improving performance.
For example, a system might be configured with four logical volumes spread across more than one physical
volume: the HP−UX operating system on one volume, the application on a second volume, swap space
interleaved across all local disk drives, and data files on a fourth volume.

Split swap space across two or more disk volumes. Device swap space can be distributed across disk volumes
and interleaved. This will improve performance if your system starts paging. This is discussed in more detail
in the section on Swap Space Configuration later in this document.

Enable Asynchronous I/O − By default, HP−UX uses synchronous disk I/O when writing file system "meta
structures" (super block, directory blocks, inodes, etc.) to disk. This means that any file system activity of this
type must complete on disk before the program is allowed to continue; the process does not regain control
until the physical I/O completes. When HP−UX writes to disk asynchronously, the I/O is scheduled for some
later time and the process regains control immediately, without waiting.

Synchronous writes of the meta structures ensure file system integrity in case of system crash, but this kind of
disk writing also impedes system performance. Run−time performance increases significantly (up to roughly
ten percent) on I/O intensive applications when all disk writes occur asynchronously; little effect is seen for
compute−bound processes. Benchmarks have shown that load times for large files can be improved by as
much as 20% using asynchronous I/O. However, if a system using asynchronous disk writes of meta
structures crashes, recovery might require system administrator intervention with fsck and might also cause
data loss. You must determine whether the improved performance is worth this slight risk of data loss in the
event of a system crash. An uninterruptible power supply (UPS) will help reduce the risk of data loss during a
power failure.

Asynchronous writing of the file system meta structures is enabled by setting the value of the kernel
parameter fs_async to 1 and disabled by setting it to 0, the default. For instructions on how to configure kernel
parameters, see the section Kernel Configuration Parameters later in this document.

You may want to use a RAID (Redundant Array of Inexpensive Disks) configuration for reliability. Most
RAID configurations do not perform as well as non−RAID configurations, but the reliability gains may be
worth it.

Graphics and Color Mapping

Many tools use 2−D graphics and are X11−based, so a platform's X11 performance is key to maximizing
the graphics performance of these applications. X11 performance can be measured with the standard
benchmark xmark93.


Network

Many installations are client/server networks, primarily because of the need for shared data and massive
amounts of on−line storage. Therefore, the network configuration can be, and usually is, critical to overall
performance and throughput. Most current networks are ethernet−based, which, when combined with a 700
class machine, may create an unbalanced situation. For example, a single HP 735 can nearly saturate a single
ethernet wire under the right conditions. See the section labeled Networking later in this document for tuning
and configuration guidelines for ethernet networks. You can, of course, upgrade to Fast Ethernet, FDDI,
ATM, or another faster network technology if the budget allows.

Understanding Memory and Swap

There is a lot of confusion regarding cache memory, configuration of swap space, swap's relationship to
physical memory, kernel parameters affecting memory allocation, and the performance implications. If there
were a simple formula, this would be easy; however, there is not. It is important to understand memory in
order to understand these settings and how to determine optimal values for a given situation.

Memory Management

The HP−UX memory management system is composed of three basic elements: cache, main memory, and
swap space. Swap space comes in two types: device swap and file system swap. Device swap can be made up
of primary swap space, defined on the root file system disk drive, and secondary swap space, defined on the
remaining disk volumes. All of these memory elements can be optimized through HP−UX kernel parameter
tuning or application compile options.

The data and instructions of any process (a program in execution) must reside in physical memory to be
available to the CPU at the time of execution. RAM, the actual physical memory (also called "main memory"),
is shared by all processes. To execute a process, the HP−UX kernel maps a per−process virtual address space
into physical memory.

The term "memory management" refers to the rules that govern physical and virtual memory and allow for
efficient sharing of the system's resources by user and system processes.

Memory management allows the total size of user processes to exceed physical memory by using an approach
termed demand−paged virtual memory. Demand−paged virtual memory enables you to execute a process by
bringing parts of the process into main memory only as needed, that is, on demand, and pushing out to disk
parts of the process that have not been used recently.

The HP−UX operating system uses paging to manage virtual memory. Paging involves moving small units
(called pages) of a process between main memory and disk swap space.

One method for increasing the efficiency of memory allocation within memory management is to call mallopt
before the first malloc call in the EDA application code. This call is not available on all platforms; it controls
the memory allocation algorithm and other optimization options within the malloc library. Its use can improve
application execution time by up to 10X, depending on the data size. It is important that the Maxfast and
Numlblks options (i.e. the first two options to mallopt) be defined to reflect the sizes of the data blocks being
accessed.


Physical Memory

Physical memory is composed of hardware known as RAM (packaged as SIMMs, DIMMs, etc.). For the
CPU to execute a process, the relevant parts of that process must exist in the system's RAM.

The more main memory in the system, the more data it can hold and the more (or larger) processes it can
execute without having to page. The system can retain more processes in main memory, so the kernel pages
less frequently. Each time the system has to page, there is a performance cost, since reading from or writing to
disk is much slower than accessing memory.

Not all physical memory is available to user processes. The kernel occupies some main memory and is itself
never paged. The amount of main memory not reserved for the kernel is termed available memory, which the
system uses for executing processes.

Secondary Storage

Main memory stores the computer data required for program execution. During execution, data also resides in
two faster implementations of memory found in the processor subsystem: registers and cache. Program files
are kept in secondary storage or secondary memory, typically disks accessible either via system buses or the
network. Data no longer needed in main memory is also written to secondary storage to make room for active
processes.

Swap

A temporary form of secondary data storage is termed swap, a name dating from early UNIX implementations
that managed physical memory by moving, i.e. swapping, entire processes between main memory and
secondary storage. HP−UX uses paging, a more efficient memory resource management mechanism. It should
be noted that HP−UX does not "swap" any more; it pages and, as a last resort, deactivates processes.
Deactivation replaces what was formerly known as swapping entire processes out.

While executing a program, data and instructions can be paged (copied) to and from secondary storage, or
disk, if the system load warrants such behavior.

Swap space is initially allocated when the system is configured. HP−UX supports two types of swap space:
device swap space and file system swap space. Device swap is allocated on the disk before a file system has
been created and can take the following forms:

• an entire disk
• a designated area on a disk
• a software disk−striped partition on a disk

If the entire disk hasn't been designated as swap, the remaining space on the disk can be used for a file system.
File−system swap space is allocated from a mounted file system. If more swap space is required, it can be
added dynamically to a running system, as either device swap or file−system swap.

Note that file−system swap has significantly lower performance than device swap: it must use separate
read/write requests for each page block and uses a smaller page swapping size than device swap. The I/O for
file system swap also contends with user I/O on that file system, further degrading performance. File system
swap should be avoided.


Either SAM or the swapon command can be used to enable disk space or a directory in a file system for swap.

NOTE: Once allocated, you cannot remove either type of swap without rebooting the system. HP−UX also
uses an early swap space reservation method to make sure it has space available, but it only allocates the space
when it actually needs to write to it.

Virtual Address Space

Virtual memory uses a structure for mapping processes termed the virtual address space. The virtual address
space contains information and pointers to the memory that the process can reference.

One virtual address space (vas) exists per process and serves several purposes:

• It provides the overall description of the process.
• It contains pointers to another element in the memory management subsystem: per−process regions
(pregions).
• It keeps track of the pregions most recently involved in page faults.

Each HP−UX process executes within a 4 Gb virtual address space (this may change in the near future). The
virtual address space structure points to per−process regions, or pregions. Pregions are logical segments that
point to specific segments of a process, including code (text, or process instructions), data, u_area and kernel
stack, user stack, shared memory segments and shared library code and data segments.

The size of various memory segments is controlled by the values assigned to certain configurable kernel
parameters. It is beyond the scope of this paper to discuss all the process virtual memory segments. The
following, however, is a description of the segments most relevant to this discussion.

Text − The text segment holds a process's executable object code and may be shared by
multiple processes. The maximum size of the text segment is limited by the configurable
operating−system parameter maxtsiz.

Data − The data segment contains a process's initialized (data) and uninitialized (.bss) data
structures, along with the heap, private "shared" data, "user" stack, etc. A process can
dynamically grow its data space. The total allotment for initialized data, uninitialized data,
and dynamically allocated memory (heap) is governed by the configurable kernel parameter
maxdsiz.

Stack − Space used for local variables, subroutine return addresses, kernel routines, etc. The
u_area contains information about process characteristics. The kernel stack, which is in the
u_area, contains a process's run−time stack while executing in kernel mode. Both the u_area
and kernel stack are fixed in size. Space available for the remaining stack use is determined by
the configurable parameter maxssiz.

Shared Memory − Address space which is sharable among multiple processes.

Configurable Parameters

HP−UX configurable kernel parameters limit the size of the text, data, and stack segments for each individual
process. These parameters have pre−defined defaults, but can be reconfigured in the kernel. Some may need
to be adjusted when swap space is increased. This is discussed in more detail in the section on configuring the
HP−UX kernel.


bufpages          Sets the number of buffer pages
create_fastlinks  Stores symbolic link data in the inode
fs_async          Enables asynchronous writes to disk
hpux_aes_override Controls directory creation on automounted disk drives
maxdsiz           Limits the size of the data segment
maxfiles          Sets the soft file limit per process
maxfiles_lim      Sets the hard file limit per process
maxssiz           Limits the size of the stack segment
maxswapchunks     Limits the maximum number of swap chunks
maxtsiz           Limits the size of the text (code) segment
maxuprc           Limits the maximum number of user processes
netmemmax         Sets the network dynamic memory limit
nfile             Limits the maximum number of "opens" in the system
ninode            Limits the maximum number of open inodes in memory
nproc             Limits the maximum number of concurrent processes
npty              Sets the maximum number of pseudo ttys
The four GB virtual address space is divided into four one−GB quadrants, used as follows:

• The first quadrant always contains the process's text segment (code), and sometimes some of the data
(EXEC_MAGIC).
• The second quadrant contains the data segment (static data, stack, and heap, etc.).
• The third quadrant contains shared library code, shared memory mapped files and sometimes shared
memory.
• The fourth quadrant contains shared memory segments, shared memory−mapped files, shared library
code, and I/O space.

Physical Memory Versus Performance

The amount of memory available to applications is determined by the amount of swap configured plus
physical memory. The size of physical memory determines how much paging will be done while applications
are running. Paging imposes a performance penalty because pages are being moved between physical memory
and secondary storage, or disk. The more time that is spent paging, the slower the performance. There is a
critical threshold for physical memory size below which the system spends almost all its CPU time paging.
This is known as thrashing, and it is evident when system performance virtually comes to a standstill and even
simple commands, like ls, take a long time to complete.

Optimally, all operations would be done in physical memory and paging would never occur. However,
memory costs money, so there is usually a tradeoff made between budgetary constraints and the minimum
acceptable performance level. Understanding how memory size affects performance can help you make sure
you are maximizing your expenditure on memory. One thing to keep in mind is that memory needs are always
changing and the base system configuration will need to be constantly addressed. HP's Glance/GlancePlus is a
good application that will help you address and resolve memory versus performance issues.

Where Is The Memory Going?

To help you understand the minimum memory configuration you should consider, it helps to understand how
memory is consumed. On a system, you will minimally have the following memory consuming resources:

• HP−UX Operating System: 10−12 MB
• Windowing System: 21 MB (X11), 25 MB (VUE), 32 MB (CDE)

Any other processes or services running on the system will consume additional memory. As you can see, if
you add these up, before you even load the first application you are already consuming approximately 50 Mb
of memory. This isn't quite as straightforward as it seems, however. HP−UX uses a paging algorithm to move
data in and out of physical memory, and the only data not subject to paging is the HP−UX kernel itself. Of the
25 Mb of executable code in VUE, you will not be using all of it at any given time. Since code is overwritten
if it isn't used, and there are many functions in VUE that you may seldom or never use, some percentage of
the executable code will never be paged in. The same behavior applies to applications: in an application with
phases of significant disk I/O or LAN activity followed by intensive CPU activity, the code for the inactive
phases need not stay resident.

Determining Appropriate Physical Memory Size

There are a couple of ways to determine whether the amount of physical memory in your system is adequate.
The first is to run a series of timed benchmarks on systems with increasing levels of physical memory and
determine the impact of additional memory on those operations. Another way is to use one of HP's
performance tools to monitor the system operation. It will tell you how much paging is occurring, if any.

If you plot memory size versus the time to perform a typical operation in an application, you will get a
dog−leg shaped curve for most operations: performance increases fairly steeply as memory size is increased,
up to a point. Beyond that point the curve flattens out, and adding memory will not significantly improve
performance.

The ideal memory configuration is one that falls on the breakpoint. If your memory is less than the breakpoint,
you are not getting all the performance you could from your system. The performance breakpoint varies
depending on the operation being performed in combination with the data set used. The only accurate way to
determine the optimal memory size is to perform timed benchmarks using real data.

HP−UX Configuration
This section explains HP−UX configurable software settings and parameters that affect system capacity
and/or performance. Most of this section is common for HP−UX 9.X and HP−UX 10.X. Specific differences
are noted.

Swap Configuration

How much swap do I have?

SAM, Glance/GlancePlus, top, and swapinfo all show swap information. To see how much swap space is
configured on your system, and how much is in use, execute one of the following commands:

• top
• Glance/GlancePlus
• sam (requires the root password)
• /etc/swapinfo −t (HP−UX 9.X; requires root login)
• /usr/sbin/swapinfo −t (HP−UX 10.X; requires root login)

Any user can execute top and Glance. The sam program and the swapinfo command both require root
privilege, because they must open the kernel memory file /dev/kmem to read swap usage information. Since
this is a critical operating system file, access is usually restricted to root.

How Much Swap Do I need?

The amount of swap available determines the maximum address space, or virtual memory, available for
applications. The minimum recommendation is twice as much swap space as physical memory. If swap is too
small and you try to load something that exceeds the available swap, you will get an out of memory error. If
you configure more swap than you will ever need, you waste valuable disk space. The correct swap size varies
considerably depending on the application(s) run on a system. The optimal swap configuration may vary
between individual users and/or systems; however, optimizing swap on a user−by−user basis is not advised. A
common swap size for systems should be settled on for ease of supportability and maximum long−term design
flexibility.

The correct swap space configuration for your site can only be accurately determined by monitoring swap
usage while working with real data. This could be done either with the swapinfo command or using a tool like
HP's GlancePlus. GlancePlus allows you to monitor system resources on a per process basis and will track
high water marks over a period of time. You would configure a system with more swap than you expect to
need and then run GlancePlus while running an application in a real work environment. By monitoring the
high water mark, you can determine the maximum swap space used and adjust the swap configuration
accordingly. Obviously, if you experience out of memory errors, swap space is too small.

Swap space should not be less than the amount of physical memory in your system.

NOTE: For best performance, swap space should be distributed evenly across all disks at the same priority.
There are two types of swap space in HP−UX: device and file system. Device swap provides much better
performance because it uses raw disk I/O. File system swap should be avoided.

Configuring Swap Space

As mentioned previously, device swap is preferred over file system swap to achieve the best performance. The
ideal swap configuration is device swap interleaved on two or more disks. When device swap is interleaved on
2 or more disks, the system alternates between the disks as paging requests occur, providing better
performance than a single disk.

SAM is the easiest method for adding and configuring swap space. Swap configuration is under the Disks and
File System area of SAM. For more information on configuring swap, please see the on−line Help section
within SAM's Swap Configuration.

Kernel Configuration Parameters

Bufpages
Bufpages specifies how many 4096−byte memory pages are allocated for the file system buffer cache. These
buffers are used for all file system I/O operations, as well as all other block I/O operations in the system (exec,
mount, inode reading, and some device drivers.).

In HP−UX 10.X, we highly recommend this kernel parameter be set to 0. This enables the dynamic buffer
cache, which was reworked in the 10.X OS.

In HP−UX 9.X, we do NOT recommend using dynamic buffer cache. A fixed buffer cache can be specified
by setting bufpages to a non−zero value (for example, 4096) and nbuf to 0. A bufpages value of 4096 yields
2048 buffer headers and allocates 16 Mb of buffer pool space at system boot time. If you wish to reserve 10%
of physical memory for the file system buffer cache, the value can be calculated as:

bufpages = 0.10 * (physical memory in Mb) * 256

since each Mb of physical memory contains 256 pages of 4096 bytes.


Create_Fastlinks
Create_fastlinks tells the system to store HFS symbolic link data in the symbolic link's inode. This reduces
disk space usage and speeds things up. By default, this feature is disabled for backward compatibility. We
recommend all systems have create_fastlinks enabled by setting this kernel parameter to 1.

Dbc_Max_Pct
This parameter determines the percentage of main memory that the dynamically allocated buffer cache is
allowed to grow to. As the system will use as much memory as it can for buffer cache, when performing
intense block I/O, this becomes the size of the buffer cache on a system that is not feeling memory pressure
due to process invocations. The problem arises when memory stress due to process space requirements
requires the system to start paging, at which point, the system tries to reclaim buffer cache pages to allocate
them to running processes. But the system is also trying to allocate as much buffer cache as it can, causing a
vicious cycle of allocating and deallocating memory between buffer cache and process memory space,
creating a large amount of overhead.

The idea, then, is to keep this number reasonably low, allowing you to have the cache space while keeping the
application space large enough to avoid high levels of conflict between them. The default value is 50%, but
we recommend starting at 25%. We have seen systems that need a buffer cache maximum as low as 5%, with
a minimum of 2%. This is something that requires careful attention and appropriate adjustment.

If this form of thrashing in main memory becomes an increasing problem, the only good fix is to purchase
more physical memory.

Fs_Async
This kernel parameter controls the switch between synchronous or asynchronous writes of file system meta
structures to disk. Asynchronous writes to disk can improve file system I/O performance significantly.
However, synchronous writes to disk make it easier to restore file system integrity if a system crash occurs
while file system meta structures are being updated on the file system. Depending on the application, you will
need to decide which is more important. The decision should be based on what types of applications are going
to be run. You may value file system integrity more than I/O speed. If so, fs_async should be set to 0.

HPUX_AES_Override
This value is part of the OSF/AES compliance. It controls directory creation on automounted disk drives. We
recommend hpux_aes_override be set to 1. If this value is not set, you may see the following error message:

mkdir: cannot create /design/ram: Read−only file system.


This system parameter cannot be set using SAM; the kernel must be modified manually. It is best to modify
the other parameters with SAM first and then change this parameter, or SAM will override your 'unsupported'
value with the default.

Maxdsiz
Maxdsiz defines the maximum size of the data segment of an executing process. The default value of 64 Mb is
too small for most applications. We recommend this value be set to the maximum value of 1.9Gb. If maxdsiz
is exceeded by a process, it will be terminated, usually with a SIGSEGV (segmentation violation) and you
will probably see the following message:

Memory fault(coredump)
In this case, check the values of maxdsiz, maxssiz, and maxtsiz. For more information on these parameters,
please see the on−line Help section within SAM's Kernel Configuration. If you need to exceed the specified
maximum of 1.9Gb, there are a couple of (as yet unsupported) ways to do so. Contact your Hewlett−Packard
technical consultant for the details. It is important to note that the maxdsiz parameter must be modified in
order for these procedures to work. Maxdsiz will need to be set to 2.75Gb or 3.6Gb depending on the method
chosen and/or size required.

Maxfiles
This sets the soft limit for the number of files a process is allowed to have open. We recommend this value be
set to 200.

Maxfiles_Lim
This sets the hard limit for the number of files a process is allowed to have open. This parameter is limited by
ninode. The default for this kernel parameter is 2048.

Maxssiz
Maxssiz defines the maximum size of the stack of a process. The default value is 8Mb. We recommend this
value be set to a value of 79 Mb.

Maxswapchunks
This (in conjunction with some other parameters) sets the maximum amount of swap space configurable on
the system. Maxswapchunks should be set high enough to accommodate all anticipated swap. Also remember
that swap space, once configured, is made available for paging (at boot) by specifying it in the file /etc/fstab.
The maximum swap space limit, in bytes, is (maxswapchunks * swchunk * DEV_BSIZE). We recommend
this parameter be set to 2048.
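As a sketch, that limit can be computed directly. The swchunk (2048) and DEV_BSIZE (1024 bytes) values below are the usual defaults, an assumption you should verify on your own system:

```shell
# Maximum configurable swap = maxswapchunks * swchunk * DEV_BSIZE.
# swchunk=2048 and DEV_BSIZE=1024 are assumed defaults; check yours.
maxswapchunks=2048
swchunk=2048
dev_bsize=1024
bytes=$((maxswapchunks * swchunk * dev_bsize))
echo "maximum configurable swap: $bytes bytes ($((bytes / 1024 / 1024)) Mb)"
# → maximum configurable swap: 4294967296 bytes (4096 Mb)
```

With the recommended value of 2048, the ceiling works out to 4 Gb of configurable swap.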

Maxtsiz
Maxtsiz defines the maximum size of the text segment of a process. We recommend 1024 MB.

Maxuprc
This restricts the number of concurrent processes that a user can run. A user is identified by the user ID
number and not by the number of login instances. Maxuprc is used to keep a single user from monopolizing
system resources. If maxuprc is too low, the system issues the following error message when the user
attempts to invoke too many processes:

no more processes
We recommend maxuprc be set to 200.
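To see how close a given user is to the limit, count that user's processes (a sketch; ps output formats vary, and the header line is subtracted):

```shell
# Count the current user's processes for comparison against maxuprc.
user=$(id -un)
count=$(ps -u "$user" | wc -l)
echo "$user is running $((count - 1)) processes"   # minus the ps header line
```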

Maxusers
This kernel parameter is used in various algorithms and formulae throughout the kernel. It is used to limit
system resource allocation, not the actual number of users on the system. It is also used to define system
table sizes. The default values of nproc, ncallout, ninode and nfile are defined in terms of maxusers. We
recommend fixed values for nproc, ninode and nfile. Set maxusers to 124.

Netmemmax
This specifies how much memory can be used for holding partial Internet Protocol (IP) messages in memory.
They are typically held in memory for up to 30 seconds. The default of 0 allows up to 10% of total memory to
be used for IP−level reassembly of packet fragments. Values for netmemmax are specified as follows:


Value   Description

−1      No limit; 100% of memory is available for IP packet reassembly.

0       netmemmax limit is 10% of real memory.

>0      Specifies that X bytes of memory can be used for IP packet reassembly.
        The minimum is 200 Kb, and the value is rounded up to the next multiple
        of pages (4096 bytes).
If system network performance is poor, it might be because the system is dropping fragments due to
insufficient memory for the fragmentation queue. Setting this parameter to −1 will improve network
performance, but at the risk of leaving less memory available for processes. We recommend −1 for
systems acting as data servers only. For all other systems, we recommend a setting of 0.

Nfile
Nfile sizes the system file table. It contains entries in it for each instance of an open of a file. It therefore
restricts the total number of concurrent "opens" on your system. We suggest that you set this at 2800. This
parameter defaults to ((16 * (nproc + 16 + maxusers) / 10 ) + 32 + 2 * npty). If a process attempts to open one
more (than nfile) file, the following message will appear on the console:

file: table is full

When this happens, running processes may fail because they cannot open files and no new processes can be
started.
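As a cross-check, the default formula can be evaluated with the fixed values this guide recommends (nproc=1024, maxusers=124, npty=512); the derived default lands near the suggested 2800:

```shell
# Evaluate the nfile default formula with this guide's recommended values.
nproc=1024
maxusers=124
npty=512
nfile=$(( (16 * (nproc + 16 + maxusers) / 10) + 32 + 2 * npty ))
echo "derived nfile default: $nfile"
# → derived nfile default: 2918
```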

Ninode
Ninode sizes the in−core inode table, also called the inode cache. For performance, the most recently accessed
inodes are kept in memory. Each open file has an inode in the table. An entry is also made in the table for each
login directory, each current directory, each mount point directory, etc. We recommend that ninode be
set to 15,000.

Nproc
Nproc sizes the process table, which restricts the total number of concurrent processes on the system. When
someone (or some process) attempts to start more processes than nproc allows, the system issues these messages:

at console window : proc: table is full


at user shell window: no more processes
Set nproc to 1024.

Npty
This parameter limits the number of master/slave pty data structures that can be opened. These are used by
network programs like rlogin, telnet, xterm, etc. We recommend this parameter be set to 512.

Configuring Kernel Parameters

The following are the suggested kernel parameter values.

# Parameter           Value

bufpages              0                    # on HP−UX 10.X
#                     4096                 # on HP−UX 9.X
create_fastlinks      1
dbc_max_pct           25
fs_async              1
maxdsiz               2063806464
maxfiles              200
maxfiles_lim          2048
maxssiz               (383*1024*1024)
maxswapchunks         4096
maxtsiz               (1024*1024*1024)
maxuprc               200
maxusers              124
netmemmax             0                    # on desktop systems
#                     −1                   # on data servers
nfile                 2800
ninode                15000
nproc                 1024
npty                  512

Configuring Kernel Parameters in 9.X

In HP−UX 9.X we recommend manual kernel configuration. All work related to creating a new kernel in 9.X
takes place in the /etc directory. Copy the old kernel configuration file, dfile, to a new name, modify the
dfile, and run make to build the new kernel. Then copy the new kernel file into place after saving the
old kernel.

• cd /etc/
• cp dfile dfile.old
• vi dfile
• Modify the dfile to include the kernel parameters and values suggested above.
• config dfile
• make −f config.mk
• mv /hp−ux /hp−ux.old
• mv /etc/hp−ux /hp−ux
• cd / ; shutdown −h 0

Note: For more information on manual kernel configuration, please see the HP−UX System Administration
"How To" Book

Configuring Kernel Parameters in 10.X

In HP−UX 10.X we recommend first manually modifying the kernel parameter hpux_aes_override and then
modifying the other kernel parameters in SAM by using a tuned parameter set. The hpux_aes_override kernel
parameter is the only recommended parameter that must be modified manually. The other parameters can
then be updated with SAM, or modified manually along with hpux_aes_override. We recommend using SAM
to take advantage of its built−in kernel parameter rule checker.


To configure a kernel manually, you must be root.

All work related to creating a new kernel in 10.X takes place in the /stand/build directory. You will create a
new kernel configuration file, saving the existing configuration file, system, under a new name. Run
mk_kernel to build the new kernel and copy the new kernel file into place after saving the old kernel (under
another name). Then reboot the system:

• cd /stand/build
• /usr/lbin/sysadm/system_prep −s system
• vi system
• Either add or modify the entries to match:
• hpux_aes_override 1
• mk_kernel −s system
• mv /stand/system /stand/system.prev
• mv /stand/build/system /stand/system
• mv /stand/vmunix /stand/vmunix.prev
• mv /stand/build/vmunix_test /stand/vmunix
• cd / ; shutdown −h 0
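The steps above, gathered into one sequence for review. This is HP−UX 10.X specific, must be run as root, and ends in a shutdown, so treat it as a checklist rather than something to paste blindly:

```shell
# HP-UX 10.X kernel rebuild (run as root; reboots the machine at the end).
cd /stand/build
/usr/lbin/sysadm/system_prep -s system   # extract the running configuration
vi system                                # add: hpux_aes_override 1
mk_kernel -s system                      # build the new kernel
mv /stand/system /stand/system.prev      # save the old configuration
mv /stand/build/system /stand/system
mv /stand/vmunix /stand/vmunix.prev      # save the old kernel
mv /stand/build/vmunix_test /stand/vmunix
cd / ; shutdown -h 0
```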

Note: For more information on manual kernel configuration, please see the HP−UX 10.X System
Administration "How To" Book.

To configure the remaining kernel parameters with SAM, follow these steps:

• Login to the system as root


• Place the list of kernel parameter values above in the file:
• /usr/sam/lib/kc/tuned/stuff.tune

(The first line should be "STUFF Applications" in the format shown in the general "Configuring
Kernel Parameters" section above.)

• Start SAM by typing the command: sam


• With the mouse, double−click on Kernel Configuration .
• On the next screen, double−click on Configurable Parameters.
• SAM will display a screen with a list of all configurable parameters and their current and pending
values. Click on the Actions selection on the menu bar and select Apply Tuned Parameter Set ... on
the pull−down menu. Select STUFF Applications from the list and click on the OK button.
• Click on the Actions selection on the menu bar and select Create A New Kernel. A confirmation
window will be displayed warning you that a reboot is required. Click on YES to proceed.
• SAM will build the new kernel and then display a form with two options:
♦ Move Kernel Into Place and Reboot the System Now
♦ Exit Without Moving the Kernel Into Place
♦ If you select the first option and then click on OK, the new kernel will be moved into place
and the system will be automatically rebooted.
♦ If you select the second option, you must later move the kernel from the /stand/build
directory into place as /stand/vmunix yourself.

Networks

Network configuration can also have an impact on performance. Virtually all installations use some form of
local area network to facilitate sharing of data files and to simplify system management. Most installations use
NFS to mount remote file systems so they appear local to the user. This enables the user to access data from
any disk on the network as easily as from a local disk. This imposes a performance penalty, however, because
the I/O bandwidth for accessing data on an NFS mounted disk is less than that for a directly connected disk.
There are a few system configuration recommendations that can be made to maximize the convenience that
NFS and the local area network provide while minimizing the performance penalty.

• Patches. Always install the latest HP−UX NFS patch. HP periodically releases patches that correct
problems associated with NFS, many of them performance related. If you are using NFS, you should
make sure the latest patch is installed on both the client and server. See the PATCHES section for
more details. General HP−UX patch information can be found on http://us−support.external.hp.com.
• Local vs. Remote. You will need to determine what things are located remotely, and which should be
local. From a system administration viewpoint, the most convenient scenario is to have applications,
data, home directories, and basically anything anyone cares about on a central NFS file server which
is backed up regularly. That server is then accessed by multiple clients, which are typically
workstations with a minimal amount of local disk for OS and swap, and are not backed up. At the
other extreme, for maximum performance it is best to have no network access whatsoever and keep
everything on local disks. Between those two extremes there are a continuum of options, all of which
have associated tradeoffs.
• Subnetting. In general, it is a bad idea to have too many systems on a single wire. Implementation of a
switched ethernet configuration with a multi host server or a server backbone configuration can
preserve existing wiring while maximizing performance. If you are doing rewiring, seriously consider
using fiber for future upgradability.
• Local paging. When applications are located remotely, one trick you can use is to set the "sticky bit"
on the applications binaries, using the chmod +t and find commands. This forces the system to page
the text segment to the local disk, improving performance. Otherwise, it is paged across the network.
Of course, this only applies when actual paging occurs. More recently, there is a kernel
parameter, remote_nfs_swap, which when set to 1 accomplishes the same thing.
• Demand loading. Previous versions of this document recommended setting the demand−loading bit on
binaries using the chatr command. There has been some controversy over this: empirical data has shown
that it does make a difference, while some sources state that there is no difference between
demand−loadable binaries and normally loaded ones. The current conclusion is that there is indeed a
difference, and that it may be beneficial to reduce startup times by setting the demand−loading bit as
described.
• File locking. Make sure the revisions of statd and lockd throughout the network are compatible; if
they are out of synch, it can cause mysterious file locking errors. This particularly affects user mail
files and Korn shell history files.
• NFS configuration. On NFS servers, a good first order approximation is to run two nfsd processes per
disk. The default is four total, which is probably not enough on a server. On 9.x systems, too many
nfsd processes can cause context switching bottlenecks, because all the nfsds are awakened any time a
request comes in. On 10.x systems, this is not the case and you can safely have extra nfsd processes.
Start with 30 or 40 nfsd's. On NFS clients run sixteen biod processes. In general, HP−UX 10.X has
much better NFS performance than previous versions.
• Design the LAN configuration to minimize inter−segment traffic. To accomplish this, ensure that
heavily used network services (NFS, licensing, etc.) are available on the same local segment as the
clients being served. Avoid heavy cross−segment automounting.
• Maximize the usage of the automounter. It allows you to centralize administration of the network and
gives you greater flexibility in configuring it. Avoid using specific machine names, which may change
over time, in your mount scheme; force mount points that make sense. /net ties you to a
particular server, which may change over time.
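The sticky-bit trick from the "Local paging" item above can be sketched as below. The /apps path in the comment is hypothetical; the runnable part only demonstrates the mode change on a throwaway file:

```shell
# On a real system you would target the NFS-mounted binaries, e.g.:
#   find /apps -type f -perm -0111 -exec chmod +t {} \;   # /apps is hypothetical
# Demonstration of the mode change on a scratch file:
dir=$(mktemp -d)
touch "$dir/app" && chmod 755 "$dir/app"
chmod +t "$dir/app"                  # set the sticky bit
ls -l "$dir/app" | awk '{print $1}'  # mode now ends in 't', e.g. -rwxr-xr-t
rm -rf "$dir"
```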


• You can watch network performance with Glance, the netstat command, and the nfsstat command.
There are other tools, like NetMetrix or a LAN analyzer, for watching LAN performance. Additionally,
you can use the HP products PerfView Software/UX and HP MeasureWare/UX to collect data over time
and analyze it. You may want to tune the timeo and retrans variables; for HP systems, small values
are good: 4 for retrans and 7 for timeo. The default values for wsize and rsize, 8K, are almost always
appropriate. Do NOT use 1024 unless talking to an Apollo system running NFS 2.3 on SR10.3; 8K is
appropriate for 10.4 Apollos running NFS 4.1.
• Explore using dedicated servers for computing, file serving, and licensing. A good scenario has a
group of dedicated servers connected with a fast "server backbone", which is then connected to an
ethernet switch, which is itself connected to the desktop systems.
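A hypothetical client mount line illustrating those values (the server name, export path, and mount point are placeholders, and exact option syntax varies between NFS implementations):

```shell
# 8K transfers with the suggested retrans/timeo values for HP clients.
mount -o rsize=8192,wsize=8192,retrans=4,timeo=7 server:/export /mnt/data
```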

Flexlm Licensing
Some EDA applications use FlexLM, a commonly used UNIX licensing scheme. Some things you may want
to be aware of:

• Licensing can generate significant network traffic. Some EDA applications perform a "breath of life"
license check periodically. This varies from application to application; some intervals are as short as
40 seconds.
• In heavy usage mission critical situations, configure three machines to be your redundant license
cluster, and make licensing the only thing running on those machines. They can be small
workstations, for example, but don't bog them down with NFS or other services.
• You can mix license files from many vendors and use a single server or cluster to serve them. The
vendors must support FlexLM 2.2 or above, and you must use the LM_LICENSE_FILE environment variable.
• There is NO FlexLM performance benefit in node−locked licenses; the server is still contacted for
license checkin and checkout.
• Use the following order in the license file: node−locked multi−license lines,
node−locked single−license lines, floating multi−license lines, floating single−license lines.
• You must call the vendor hotline and get a new license file if you want to either change the node
associated with a node lock license or change servers.
• By default the device file /dev/lan0 is overprotected for FlexLM usage; it is set to rw−−−−−−−. This must
be changed for FlexLM to work; rw−r−−r−− is appropriate. This has been fixed in 10.x. The symptom
is that the root user can execute applications successfully, but an ordinary user cannot.

X Terminal Configuration
Many EDA sites are moving to X terminal (or "X station" in HP talk) configurations. Here are some
guidelines regarding these configurations:

• Server memory. You will need 64Mb to start, and 24−48Mb for each X terminal to be served
depending on the application. The more memory, the better. Swap space configuration should fall
along the same lines as other systems, just all on the server.
• X terminal memory. 18Mb minimum. This allows efficient usage of fonts.
• Server kernel configuration. Set maxusers to 64. Set npty to 512.
• Networking. Try and keep X terminal traffic away from critical NFS traffic on the network.
• Use NFS to load the server files; it's faster than TFTP.
• Font paths. You may have to hardwire the paths to the EDA vendor specific fonts in the setup screen.
Or set up a font server.


Patches
Since patch numbers change frequently, it is recommended that you always check for the latest information.
Here are some general recommendations:

• If you are using dynamic buffer cache on a 9.x system, load the latest kernel patch that mentions
dynamic buffer cache. These patches limit the growth of the buffer cache to half of physical memory,
and also modify cache management algorithms to be more efficient. These are not needed on 800
systems (in 9.X), or systems not using dynamic buffer cache.
• Always load the latest kernel megapatch, ARPA transport patch, NFS/automounter patches,
statd/lockd patches, and SCSI patch. Many performance and reliability improvements can be had.
• Load the latest C compiler and linker. The linker in particular is required for 9.01 systems.
• Load HP−VUE or CDE, and X/Motif patches at your discretion. Generally these are bug fixes.
• Almost always load the latest X server. Many display issues have been solved in the past by loading
the latest X server. There have been isolated instances in the past of a new X server causing problems
with EDA applications, though. When in doubt, call the hotline.

How to get patches. If you have WWW access, go to http://us−support.external.hp.com and follow the links
to the patch list. This is also a good way to browse the latest patch list. You can also get patches by e−mail. If
you know the name of the patch you want, send a message to support@support.mayfield.hp.com
with the text "send patchname" (substitute the name of the patch you want for "patchname").
You can get a current list by sending the text "send patchlist". To get a complete guide on using the mail
server, send the text "send guide". If you have HP SupportLine access, patches can be requested
from the HP SupportLine at (800) 633−3600, and are also available via FTP.

How to tell what patches are loaded. On 9.x systems, scan the directory /etc/filesets; on 10.x systems, use the
swlist command. Patches are named PHxx_nnnn, where xx can be KL, NE, CO, or SS, and nnnn is the
patch number, which is always unique regardless of the PHxx category. If a patch has been loaded
on a 9.x system, a file with the same name as the patch will exist in /etc/filesets. If a patch has been loaded on
a 10.x system, it should be listed in the output of swlist.

How to load patches. Patches are shipped as shell archives, named after the patch. To unpack the shell
archive, enter sh filename where filename is the path to the patch shell archive. You will end up with two
files, a .text file and a .updt file. The .text file has detailed information about the patch. The .updt file is the
actual patch source. You can install the patch with /etc/update on 9.X, either in command line mode or
interactive mode. Use the following command line:
/etc/update −s/pathname−to−updt−file −S700 −r \*

You must specify either −S700 or −S800. The −r allows a kernel rebuild and reboot if you are installing a
kernel patch, so be prepared to reboot the system.

Using interactive mode, point to the patch file as if it were a tape device in the "Change Source or
Destination" menu, then have at it.

Make sure you are in single user mode when installing any patch.

To install a patch on a 10.X system, use the following command line:


swinstall −x autoreboot=true −x match_target=true −s /pathname−to−depot−file


You can install multiple patches at a time by creating a netdist area that contains the patches using /etc/updist,
or by specifying a list of patches in a file using the −f switch.

Patch management. Patch management can be a fulltime job for a large site. HP recommends that large sites
that don't want to tackle that particular task purchase the PSS support option. This service provides a
consultant who, among other things, provides patch management. It's well worth the money.

How to make a patch tape. On a 9.x system, you can use dd to make a patch tape as follows:
dd if=/pathname−to−updt−file of=/rmt/0m bs=2k

On a 10.x system, use the following command:


swpackage −s /pathname−to−depot −x target_type=tape −d /rmt/0m patchname

Performance Tips
Kernel Parameters
Most, if not all of the kernel parameter tuning has been covered in the preceding sections of this document.
Any additional/future parameters will appear here.

File Systems
When using UFS (HFS) file systems, configure them with a block size of 64K and a fragment size of 8K. HFS
file systems have historically preferred to perform I/O in 64K blocks. We have seen improved performance
using a VxFS (JFS) file system as a "scratch" file system, that is, a file system whose contents you do not
care about once the application crashes or completes successfully. When doing so, you need to
mount this file system with three specific options in order to gain performance. They are:

• nolog
• mincache=tmpcache
• convosync=delay

The on−line (advanced) JFS product is required to use these options. In our experience, the JFS block size is
of no consequence; JFS prefers to perform I/O in 64K chunks regardless of the block size.
Supported block sizes are 1, 2, 4, and 8K. There are no fragments on a JFS file system.
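A hypothetical mount command combining the three options (the volume and mount point names are placeholders):

```shell
# Mount a JFS (VxFS) scratch file system with the performance options;
# requires the on-line (advanced) JFS product.
mount -F vxfs -o nolog,mincache=tmpcache,convosync=delay \
      /dev/vg01/lvscratch /scratch
```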

When striping with LVM, one should make sure that the file system block size and the LVM stripe size are
identical. This will aid performance.

When mounting file systems, position them at mount points as close to the root of the tree as possible.
This helps shorten directory search paths. It is very important that file systems containing tools
used by the application(s) be mounted as close to the top as possible.

As of the latest revision (2.0) of this document, there is a JFS "mega patch" for performance. The patch
number is PHKL_12901 for 700's and PHKL_12902 for the 800's.

Logical Volume Manager


The following are simply recommendations; you do not have to follow them. Obviously, there are pros and
cons to everything, and this is not the forum for that discussion. Use as many physical disks as possible, and
stripe them if you can. If you have followed the file system recommendation of a 64K block size, use a 64K
stripe size as well; we would suggest a 64K stripe size for LVM in any case. Ideally, you will have identical
disks (make, model, size, geometry, etc.). When you have control, place your logical

volumes so that the "pieces" of a logical volume are located in the same place across the physical devices. For
example, with four physical devices, you stripe a logical volume so that 25% of it appears on each of the
four disks, and each piece appears at the "top" of each disk.

Startup Program
We have noticed many customers and ISVs using the C shell as a startup shell. This might be OK on other
variants of UNIX, but it does not fare as well on HP−UX (due to the implementation) as the Korn shell or
POSIX shell. When a process forks many children, the .cshrc file is executed for each fork. We have seen
.cshrc files that are extremely long AND that source files which source other files, and so on. This is very
time consuming and degrades performance. If possible, do not use the C shell.

The PATH Variable


This is one of the most abused areas causing performance problems: PATH variables that are far too long,
with the directory containing the tools most frequently used by the application positioned at the end. Keep
PATH short, and put the most heavily used directories first.

NFS
Check your buffer cache size. Some say 128 Kb for each 1000 IOPs a server expects to deliver.
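A sketch of that rule of thumb; the 4000 IOPs figure is only an example:

```shell
# Buffer cache rule of thumb: 128 Kb per 1000 expected NFS IOPs.
iops=4000
kb=$(( iops * 128 / 1000 ))
echo "suggested buffer cache: ${kb} Kb"
# → suggested buffer cache: 512 Kb
```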

Check your disk and file system configurations:

• LVM configuration/layout
• Multiple disk striping?
• HFS? ...check your block/fragment sizes
• JFS? ...check your mount options

Reads and writes...server and client block sizes should match. Pay attention to the suggestions for file systems
(above).

nfsd's: start with 30 or 40. Some say that 2 per spindle is adequate.

Make sure that ninode is at least 15000 (on 10.X). Some people have seen performance degradation on
multiprocessor systems when ninode is greater than 4000, so check it on your system. The details of this
problem are too involved for this document.

NFS file systems should be exported with the async option in /etc/exports.

Some items that can be investigated...

nfsd invocations

• nfstat −a

UDP buffer size

• netstat −an | grep −e Proto −e 2049

How often the UDP buffer overflows

• netstat −s | grep overflow


NFS timeouts: are they a result of packet loss? Do they correlate with errors reported by the links? Use
lanadmin or netstat −i to check this.

IP fragment reassembly timeouts?

• netstat −p ip

UDP socket buffer overflows?

• ...see above

mounting through routers?

• check to see if routers are dropping packets

check for transport bad checksums

• netstat −s

is server dropping requests as duplicates?

• nfsstat

is client getting duplicate replies? (badxid)

• nfsstat on CLIENT

Some people have mentioned serious problems caused by too many levels of hierarchy within the netgroup
file. This file is re−read very many times, and the more hierarchy it contains, the longer it takes to read.

(c) Copyright 1996 Hewlett−Packard Company.

December 1, 1997
