
D.H. Brown Associates, Inc. http://www.dhbrown.com
A summary of this report is available to all of our subscribers free of charge. Sponsors of our collaborative
program in Systems Software (SS) receive the full report as part of our comprehensive services. Those
interested in the program should contact Bill Moran, Research Director Open Systems at
moran@dhbrown.com or 914-937-4302, ext. 230.
1999-2000 Operating System Function Review
EXECUTIVE SUMMARY
[Bar chart: overall functional ratings on a 5.00-9.00 scale labeled Fair, OK, Good, Very Good, Excellent; systems plotted: IRIX 6.5, Solaris 7, HP-UX 11.0, Tru64 UNIX 5.0, AIX 4.3.3]
OVERALL RESULTS
AIX 4.3.3 retains the lead for operating-system functions, maintaining its
significant advantage in Internet and web-application functions and achieving
strong ratings in several other areas. While Tru64 UNIX 5.0 places first in three
of the five studied areas (Reliability, Availability, Serviceability (RAS); system
management; and directory/security services), it falls into last place for
scalability, largely due to Compaq's relatively limited SMP support. HP-UX 11.0
achieves a strong standing in Internet and directory/security functions, but has
an average ranking in the remaining areas. Solaris 7 leads in scalability and does
well in RAS, but it has unimpressive capabilities in the remaining areas. IRIX 6.5
does well in scalability, but it trails in every other category.
FIGURE 1: Overall Functional Ratings as of January 1, 2000
SCALABILITY RESULTS
Solaris 7 captures the lead for overall scalability, supporting a very broad SMP
range, offering strong 64-bit capabilities, and holding competitive ratings in other
areas. IRIX 6.5 follows closely; it also offers a very broad SMP range, but lacks
performance evidence for its database clustering options. HP-UX 11.0 has the
best 64-bit capabilities, thanks to the extraordinary memory ranges supported on
HP's SCA hardware, and very competitive scalability clustering options. AIX 4.3.3
provides leading performance clustering capabilities on IBM's Scalable Parallel
Processor (SP) hardware, but its SMP range remains average. Further, AIX's
maximum file size, 64 GB, falls significantly behind competitors, most of whom
support at least 1 TB. Tru64 UNIX 5.0 has strong 64-bit capabilities and good
scalability clustering options, along with a variety of miscellaneous performance
optimizations that are useful for particular classes of applications. However, until
Compaq's long-awaited future eight-, 16-, and 32-node high-end GS-series
systems arrive in 1H00, Tru64 UNIX's SMP range significantly trails its
competitors at 14 processors.
SS, March 2000
Copyright 2000 D.H. Brown Associates, Inc.
RELIABILITY, AVAILABILITY,
SERVICEABILITY (RAS) RESULTS
Tru64 UNIX 5.0 shares the lead with IRIX 6.5 for RAS functions. Tru64 UNIX
offers unmatched storage reliability features and leading HA clustering functions,
thanks in part to its clustering file system, which is unique among all studied
products. IRIX offers particularly strong resiliency functions, along with
competitive HA clustering options. Solaris 7 follows, having the strongest
resiliency functions due to its unmatched Dynamic Reconfiguration and
Alternate Pathing capabilities. Solaris 7 offers only average HA clustering
capabilities, however. HP-UX 11.0 and AIX 4.3.3 have roughly equivalent RAS
capabilities. HP offers stronger resiliency functions and the best overall
serviceability functions, but IBM offers very strong HA clustering functions.
SYSTEM MANAGEMENT RESULTS
Tru64 UNIX 5.0 also ranks first for system management, thanks to very strong
operating-system management functions, which now match the strength of AIX,
the long-time leader in this area. Tru64 provides very strong heterogeneous
management capabilities, bundling the ability to host Windows NT network-
authentication functions. AIX 4.3.3 follows closely with the strongest hardware
management, supporting plug-and-play configuration of RS/6000 hardware and
peripherals and strong remote manageability based on its web-based system
manager. HP-UX 11.0 achieves average ratings across most functional areas, but
still trails in heterogeneous management and interoperability functions. Solaris 7
stands out for its leading resource-management capabilities, but has yet to catch
up with the leaders for operating-system management functions. IRIX 6.5 leads
in storage management and has strong resource-management tools, but has
relatively weak operating-system management and remote-manageability
functions.
INTERNET AND WEB APPLICATION SERVICES RESULTS
AIX 4.3.3 retains the lead for Internet and web application services, benefiting
from the strongest support for TCP/IP protocols and extensions; the strongest
Internet file, mail, and web services; and the richest set of e-commerce options.
HP-UX 11.0 follows; it also offers a strong TCP/IP implementation, coupled
with very good e-commerce options and Internet file, mail, and web services.
Tru64 UNIX 5.0 has a very strong JVM implementation and unique support for
Microsoft's DCOM distributed object protocol, but otherwise has average
capabilities. Solaris 7 places fourth overall, a surprising position for a pioneering
Internet company that has contributed so much technology to the industry. In a
rapidly growing arena where every player wants to be first, Sun's choice to bundle
its own web server with Solaris 7 rather than iPlanet or Apache; a modest set of
e-commerce offerings; and the lack of many TCP/IP extensions hamper Sun's
functional leadership. IRIX 6.5 includes a good set of bundled Internet file, mail,
and web services, but it trails in most other areas.
DIRECTORY AND SECURITY SERVICES RESULTS
Tru64 UNIX 5.0 leads in directory and security services, bundling the strongest
set of directory services and sharing the top spot for secure networking
functions. AIX 4.3.3 follows closely, sharing the lead for Virtual Private Network
(VPN) functions while also providing very competitive directory services.
HP-UX shares the lead for secure networking and VPN functions, but provides only
average directory services. Solaris 7 has competitive directory services, but
average capabilities in remaining areas. IRIX 6.5 trails in all areas.
TABLE OF CONTENTS
EXECUTIVE SUMMARY
  OVERALL RESULTS
  SCALABILITY RESULTS
  RELIABILITY, AVAILABILITY, SERVICEABILITY (RAS) RESULTS
  SYSTEM MANAGEMENT RESULTS
  INTERNET AND WEB APPLICATION SERVICES RESULTS
  DIRECTORY AND SECURITY SERVICES RESULTS
METHODOLOGY
  NOTES ON THE 1999-2000 EDITION
    MICROSOFT WINDOWS NT/2000
    SUN SOLARIS 8
SCALABILITY
  SUMMARY
  SCALABILITY CRITERIA
    64-BIT SUPPORT
    SMP/NUMA SCALABILITY
    SMP BENCHMARK EVIDENCE
    MAXIMUM SMP CONFIGURATION SIZE
    SMP LINEARITY
    PERFORMANCE CLUSTERING
    TECHNICAL COMPUTING CLUSTERS
    DATABASE CLUSTERING
    PACKAGED WEB SERVER FARMS
    MISCELLANEOUS PERFORMANCE OPTIMIZATIONS
  AIX 4.3.3
  HP-UX 11.0
  IRIX 6.5
  SOLARIS 7
  TRU64 UNIX 5.0
RELIABILITY, AVAILABILITY AND SERVICEABILITY (RAS)
  SUMMARY
  RAS CRITERIA
    RESILIENCY FUNCTIONS
    HIGH-AVAILABILITY CLUSTERING FUNCTIONS
    STORAGE RELIABILITY AND SCALABILITY
    SERVICEABILITY ENHANCEMENTS
  AIX 4.3.3
  HP-UX 11.0
  IRIX 6.5
  SOLARIS 7
  TRU64 UNIX 5.0
SYSTEM MANAGEMENT
  SUMMARY
  SYSTEM MANAGEMENT CRITERIA
    OPERATING-SYSTEM MANAGEMENT
    EVENT MANAGEMENT
    HARDWARE STATE MANAGEMENT
    STORAGE PERIPHERAL MANAGEMENT
    REMOTE MANAGEABILITY
    RESOURCE MANAGEMENT
    HETEROGENEOUS MANAGEMENT AND INTEROPERABILITY
  AIX 4.3.3
  HP-UX 11.0
  IRIX 6.5
  SOLARIS 7
  TRU64 UNIX 5.0
INTERNET AND WEB APPLICATION SERVICES
  SUMMARY
  INTERNET AND WEB APPLICATION CRITERIA
    TCP/IP FEATURES
    WEB APPLICATION SERVICES
    E-COMMERCE TOOLS
    BUNDLED FILE, MAIL, AND WEB SERVERS
  AIX 4.3.3
  HP-UX 11.0
  IRIX 6.5
  SOLARIS 7
  TRU64 UNIX 5.0
DIRECTORY AND SECURITY SERVICES
  SUMMARY
  DIRECTORY SERVICES CRITERIA
  SECURITY INFRASTRUCTURE CRITERIA
  VIRTUAL PRIVATE NETWORKING (VPN) CRITERIA
  AIX 4.3.3
  HP-UX 11.0
  IRIX 6.5
  SOLARIS 7
  TRU64 UNIX 5.0
METHODOLOGY
In this study, D.H. Brown Associates, Inc. (DHBA) evaluates five leading UNIX
operating systems (IBM AIX 4.3.3, Hewlett-Packard HP-UX 11.0, SGI IRIX
6.5, Sun Solaris 7, and Compaq Tru64 UNIX 5.0) based on their functional
capabilities as of January 1, 2000. In this edition of the study, each operating
system receives a rating for its support of over 100 functional items across five
areas:
scalability,
RAS,
system management,
Internet and web application services, and
directory and security services.
This study primarily notes items for their existence or non-existence on a given
platform, although it judges some according to the quality and breadth of their
implementation. Vendors receive maximum credit only for functions they bundle
and integrate in their operating systems. They take a penalty if the function
requires a separately priced option and suffer a greater penalty if the function is
not available directly from the operating system's supplier (i.e., if it requires
involvement of a third-party supplier). They receive a maximum penalty if a
function is unavailable for the platform or if it can be implemented only through
an awkward workaround.
The individual ratings are summed into a score for each of the five functional
categories, based on weights indicated at the beginning of each chapter. The
overall ranking results from the average of all category rankings; each of the
major functional areas gets an equal weight toward the total.
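The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DHBA's actual scorecard: the report gives the ordering of the penalties (bundled, then separately priced, then third party, then unavailable) and says categories are averaged with equal weight, but the numeric credit values, item weights, and the 0-to-1 scale below are hypothetical.

```python
from statistics import mean

# Hypothetical credit per availability level. The report states only the
# ordering of the penalties, not the numeric values used here.
CREDIT = {
    "bundled": 1.0,         # bundled and integrated: maximum credit
    "priced_option": 0.75,  # separately priced option: penalty
    "third_party": 0.5,     # requires a third-party supplier: greater penalty
    "unavailable": 0.0,     # unavailable or awkward workaround: maximum penalty
}


def category_score(items):
    """Weighted score (0 to 1) for one functional category.

    items: list of (weight, availability) pairs; the weights are hypothetical.
    """
    total_weight = sum(weight for weight, _ in items)
    earned = sum(weight * CREDIT[level] for weight, level in items)
    return earned / total_weight


def overall_score(category_scores):
    """Average the category scores so each category carries equal weight."""
    return mean(category_scores.values())


# Made-up line items for two of the five categories.
example = {
    "scalability": category_score(
        [(2, "bundled"), (1, "priced_option"), (1, "unavailable")]
    ),
    "RAS": category_score([(1, "bundled"), (1, "third_party")]),
}
print(example, overall_score(example))
```

Because the category scores are averaged without further weighting, a category with many line items counts no more toward the total than one with few.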
To determine its ratings for the studied functional items, DHBA evaluated each
operating system and its layered products using a variety of approaches,
including:
hands-on evaluation,
examination of system documentation and related publications, and
discussions with marketing and engineering staff from each operating-system
vendor.
DHBA must emphasize that this report represents a technology assessment,
which exposes findings that remain distinct from other types of research, such as
market-share statistics, customer-satisfaction surveys, or laboratory-based stress
testing. One cannot extrapolate the results of this assessment to draw
conclusions in other domains. The industry has frequently shown that the best
technology does not always win in the marketplace.
To arrive at a complete profile of an operating-system product, users should
consider a number of factors in addition to those addressed by this study,
including:
Application portfolio: An operating system is only as useful as the applications
available for it. The suitability of an application portfolio for a given user,
though, ultimately depends on that user's specific requirements.
Quality: As with any other complex technical product, an operating system
may ship with a number of defects that are independent of its relative
technical richness. Formal methods to measure quality vary; two alternatives
are stress testing and collecting empirical data based on customer-satisfaction
surveys.
Vendor support: At the high end of software complexity, operating systems
introduce a notoriously high support burden, especially when deployed on
servers. The ability of vendors to meet those support requirements may vary.
Vendor experience: Vendors offering multiple operating systems may have
different levels of experience within their respective product lines, depending
on when they entered the market and with what level of commitment.
Skills availability: This factor applies both to the skills available within a user's
organization and in the market as a whole.
Hardware/system capabilities: Since an operating system will only perform as well
as its underlying hardware, users must remain aware of factors such as
processor performance and the SMP ranges available on host platforms.
Cost: A complex and contentious area, this factor depends not only on the
prices of operating-system software and associated client license fees, but also
on any necessary add-on packages, the price and price/performance of
underlying hardware, and a wide variety of hard-to-measure "soft" costs
related to ongoing management and training.
NOTES ON THE 1999-2000 EDITION
DHBA revised the latest version of its scorecard to reflect new areas of
technology differentiation among vendors and shifts in enterprise-level
computing priorities. While the scalability, RAS, and system-management
categories remain unchanged, the other top-level categories changed as follows:
The features evaluated in the "PC client support" category were incorporated
into the system-management category, as baseline file- and print-sharing
capability for PCs has become commoditized. Relevant differentiation now
relates to operating systems' ability to integrate PC and UNIX management
functions in terms of heterogeneous network resources.
The features evaluated in the "distributed enterprise services" category were
split, with some functions moving to the "Internet and web applications"
category (formerly "Internet/intranet") and the remainder going into a new
category that evaluates directory and security services. These changes reflect
the industry's growing orientation around web-based infrastructures for
network and application architectures.
MICROSOFT WINDOWS NT/2000
While the previous version of this report rated Microsoft's Windows NT 4.0
product, this edition does not include Windows NT. The release of Windows
2000, which occurred after the research deadline for this report, introduces
significant architectural changes, including a major kernel upgrade and a new
approach to network services (the Active Directory Service). These changes will
potentially require that DHBA modify its scorecard line items to take the
Windows 2000 development into account before reassessing the entire operating-
system area. This work is in progress, and DHBA expects to publish a new
report reflecting the changed landscape later this year.
SUN SOLARIS 8
This edition of the report evaluates Solaris 7, rather than the recently introduced
Solaris 8. Sun shipped Solaris 8 after the research deadline, and the company has
staggered the release of some Solaris 8 functions, so that the entire solution set is
not currently available. DHBA will not formally evaluate Solaris 8 until a number
of critical features ship. In addition, many enhancements Sun touts for Solaris 8
were in fact previously shipped for Solaris 7 in the form of patches and add-ons.
These enhancements were included in this report.
SCALABILITY
[Bar chart: scalability functional ratings on a 5.00-9.00 scale labeled Fair, OK, Good, Very Good, Excellent; systems plotted: Tru64 UNIX 5.0, AIX 4.3.3, HP-UX 11.0, IRIX 6.5, Solaris 7]
SUMMARY
Solaris 7 captures the lead for overall scalability, supporting a very broad SMP
range, offering strong 64-bit capabilities, and holding competitive ratings in other
areas. IRIX 6.5 follows closely; it also offers a very broad SMP range, but lacks
performance evidence for its database clustering options. HP-UX 11.0 has the
best 64-bit capabilities, thanks to the extraordinary memory ranges supported on
HP's SCA hardware, and very competitive scalability clustering options. AIX
4.3.3 provides leading performance clustering capabilities on IBM's SP hardware,
but its SMP range remains average. Further, AIX's maximum file size, 64 GB,
falls significantly behind competitors, most of whom support at least 1 TB.
Tru64 UNIX 5.0 has strong 64-bit capabilities and good scalability clustering
options, along with a variety of miscellaneous performance optimizations that are
useful for particular classes of applications. However, until Compaq's
long-awaited future eight-, 16-, and 32-node high-end GS-series systems arrive in
1H00, Tru64 UNIX's SMP range significantly trails its competitors at 14
processors.
SCALABILITY CRITERIA
Three basic functional areas determine the scalability of a system in an enterprise
environment:
64-bit support: the ability to exploit processing, memory, and storage beyond
the 4 GB limitation imposed by 32-bit systems. Several levels of 64-bit
capabilities exist, including 64-bit processor support, large file systems, large
files, large physical memories, and large process address spaces (where "large"
means greater than 4 GB).
Shared-memory multiprocessing (SMP) support: the ability to take advantage of
multiple processors in a server. Criteria include kernel locking granularity,
kernel thread mechanisms, and evidence of scalability based on
industry-standard benchmarks.
Performance clustering options: the ability to grow system capacity, including
performance and storage, by lashing together multiple servers using
high-speed interconnects. Typically, a system's ability to handle technical
applications and commercial applications (e.g., database or web) classifies its
performance clustering capabilities.
FIGURE 2: Scalability Functional Ratings
64-BIT SUPPORT
64-bit support typically pays off the most for applications that use large
databases. 64-bit systems can cache complete database indexes (or the database
contents themselves) in physical memory, offering a roughly 10x improvement in
access time over disk. Performance improvements in real-world situations with
real workloads prove substantially more modest; TPC-C results for various 64-bit
vendors show gains closer to the range of 10% to 2x, for example.
In general, operating systems can support 64-bit capabilities at four incremental
levels:
64-bit processor support: can run on 64-bit processors such as Alpha, MIPS, PA-
RISC, PowerPC, and UltraSPARC. All current Intel X86 processors use 32-
bit instruction sets, although Pentium Pro and Pentium II Xeon support 36-
bit physical memory addressing (i.e., a maximum of 64 GB RAM).
Large storage support: can support file systems and files greater than 4 GB.
Large file systems must use large RAID configurations, which may range as
high as 2-4 TB. Support for large files requires the availability of API
functions that allow applications to access 64-bit ranges.
Large physical memory support: can take advantage of physical memory greater
than 4 GB. While this capability proves most useful when coupled with 64-bit
virtual memory (see below), applications can exploit the larger memory
configurations even in systems that otherwise support only 32 bits. In
particular, administrators can configure database systems to use the extra
physical memory for caching purposes, boosting performance.
Large virtual memory support: the ability for applications to run in a 64-bit
process address space. Only operating systems with this capability qualify as
fully 64-bit enabled.
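The size limits quoted in this list follow directly from the number of address bits. A short sketch of the arithmetic is shown below; the helper name is illustrative only, and the report's "GB" is taken here to mean 2^30 bytes.

```python
GIB = 2 ** 30  # bytes per gibibyte (assumed meaning of the report's "GB")


def addressable_gib(address_bits):
    """Return how many GiB can be addressed with the given number of address bits."""
    return 2 ** address_bits // GIB


for bits in (32, 36, 64):
    print(f"{bits}-bit addressing: {addressable_gib(bits):,} GiB")
# 32 bits -> 4 GiB, the limit imposed by 32-bit systems
# 36 bits -> 64 GiB, the physical addressing limit cited for Pentium Pro and Pentium II Xeon
# 64 bits -> 17,179,869,184 GiB
```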
SMP/NUMA SCALABILITY
The ability of an operating system to exploit SMP systems continues to represent
a critical differentiator in server environments. Relevant factors include:
The degree to which the kernel has been optimized to exploit multiple
processors, which influences the absolute range of processors that it can
effectively support. This ranges from two processors to more than 100 in
advanced NUMA architectures.
The availability of mechanisms, such as threads, to support SMP-optimized
applications.
The availability of performance evidence for high-end systems from
industry-standard benchmarks such as TPC-C and TPC-D, which stress I/O as
well as computation.
SMP BENCHMARK EVIDENCE
When discussing the quality of SMP implementations, quibbling over the details
of kernel architecture becomes relatively meaningless beyond a certain point.
Developers can plan only a finite degree of SMP scalability into the design; the
final analysis hinges on industry-standard benchmark performance.
Traditional uniprocessor performance metrics such as SPECint95 and SPECfp95
cannot measure SMP system performance because they submit tasks to the
system serially rather than concurrently. Instead, SMP performance must be
assessed using benchmarks that stress running jobs in parallel. These tests fall
into two basic classes: technical and commercial. Technical users can rely on
benchmarks such as the NAS Parallel Series to assess parallelized engineering-
related application performance on SMP systems, particularly those related to
computational fluid dynamics and finite element analysis. Users can project
performance from NAS test results to the extent that other technical applications
rely on similar algorithmic techniques.
Commercial workloads present a greater challenge to SMP implementations,
because they tend to exhibit a high degree of communication and
synchronization overhead among processors. Most commercial applications are
essentially database applications that stress I/O, cache management, and
communication. As the number of processors increases, these operations place
increasing demands on scarce resources such as memory/bus bandwidth and
I/O bandwidth, rigorously exercising kernel-locking mechanisms. Commercially
oriented benchmarks thus provide the most credible assessment of the quality of
an SMP implementation.
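One standard way to reason about why synchronization overhead limits SMP scaling is Amdahl's law, which the report itself does not use. The sketch below assumes a hypothetical fraction of the work that must run serially (for example, while holding a kernel lock) and shows how the ideal speedup flattens as processors are added. The fractions and processor counts are chosen only for illustration.

```python
def amdahl_speedup(processors, serial_fraction):
    """Ideal speedup on `processors` CPUs when `serial_fraction` of the work cannot run in parallel."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


# The serial fractions and processor counts are hypothetical illustration values.
for serial_fraction in (0.01, 0.05):
    for processors in (4, 14, 64):
        speedup = amdahl_speedup(processors, serial_fraction)
        print(f"serial={serial_fraction:.0%} cpus={processors:3d} speedup={speedup:5.1f}")
```

Under this model the speedup can never exceed 1 / serial_fraction, however many processors are added, which is one reason commercial workloads with heavy locking are a demanding test of an SMP kernel.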
Many types of benchmarks claim to measure realistic commercial server
performance. Proprietary benchmarks tend to have a narrow focus, favoring
particular architectures, such as PC fileservers or terminal-centric hosts, or
particular products, such as SAP. To overcome any potential biases in the
measured workloads, vendors tend to rely on a number of industry-standard
benchmarks to demonstrate SMP scalability for commercial applications. Multi-
vendor committees define these benchmarks and require that results be
published under strict guidelines, including detailed auditing procedures. Some of
the most rigorous and widely accepted tests relevant to SMP systems include:
SPECint_rate95: a variation of the SPECint95 test commonly used to measure
raw processor performance. The SPEC (Standard Performance Evaluation
Corporation) committee manages several SPEC benchmarks. The
SPECint_rate95 benchmark measures the capacity of a computer to execute
multiple CPU-intensive processes concurrently. It derives from the same set
of applications used in the traditional SPECint95 test, but runs multiple
copies of this application set in parallel. However, because SPECint_rate95
tests involve relatively little I/O or interprocess communication, they are fairly
forgiving of inefficient SMP kernel locking and thus serve only as a baseline
measure of commercial SMP capabilities.
SPECweb96: measures web server performance. The SPECweb96
benchmark is designed to provide comparable measures of how well systems
can handle HTTP GET requests. The SPEC committee based the workload
of this test on analysis of server logs from websites ranging from a small
personal server up through some of the Internet's most popular sites. Built
on the framework of the SPEC SFS/LADDIS benchmark, SPECweb96 can
coordinate the driving of HTTP protocol requests from single- or multiple-
client systems.
TPC-C: measures database transactions completed per minute, expressed in
tpmC ratings. The Transaction Processing Performance Council, an
organization devoted to benchmarking transaction-processing systems,
manages this test. To determine the number of transactions a system can
process in a given timeframe, TPC benchmarks measure the total
performance of the system, including the computer, operating system,
database-management system, and any other related components involved in
the transaction-processing operation.
TPC-D: designed for decision support and tests 17 complex queries. TPC-D
results are relative numbers based on the size of the database being queried
and yield a single-user QppD Power metric and a multiple-user QthD
Throughput metric. TPC-D is being phased out in favor of TPC-H (below),
after vendors discovered that by pre-caching the specific queries made by the
benchmark, scores could leap dramatically.
TPC-H: restores the emphasis on ad hoc (not pre-cached) queries. Like TPC-
D, it is designed for decision support. TPC-H results are relative numbers
based on the size of the database being queried and yield a single-user QphH
query-per-hour metric.
While it is tempting to simply compare the absolute performance numbers
obtained with a particular operating system to rate SMP capabilities, this
approach proves invalid. The benchmark result achieved by an SMP server
depends on a variety of factors, including the processor performance (e.g., Intel
X86 compared to RISC), the cache sizes used (which can range from 256 KB to
8 MB), the hardware interconnect design and performance, the database or web
server, and the applications used. The operating system itself represents only one
component in this equation. However, one can draw relevant conclusions about
the capabilities of an SMP kernel from two key metrics:
Maximum configuration size: the largest number of processors on which the
operating system was tested with an industry-standard benchmark; and
Linearity: the ratio of performance gained when additional processors are
added to the system.
MAXIMUM SMP CONFIGURATION SIZE
Regardless of the absolute performance value achieved, benchmark results
published from high-end configurations help to prove that the operating system
itself can effectively and competitively exploit that number of processors. At a
certain point, every SMP system will roll over (become slower as more
processors are added), because synchronization overhead starts to overtake
computing performance. Since superior SMP designs can push that threshold to
larger numbers of processors, vendors choose to run benchmarks on
configurations that produce the most impressive results with the smallest number
of processors.
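The rollover effect described above can be illustrated with a simple contention model. The sketch below uses Gunther's Universal Scalability Law, which is not part of this report's methodology; the coefficients SIGMA (serialization) and KAPPA (synchronization cost between processors) are hypothetical values chosen only to make the curve visible.

```python
import math


def relative_throughput(n, sigma, kappa):
    """Modeled throughput of n processors relative to a single processor.

    sigma penalizes serialized work; kappa penalizes pairwise
    synchronization, which grows roughly with n * (n - 1).
    """
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def rollover_point(sigma, kappa):
    """Processor count at which the modeled throughput peaks."""
    return math.sqrt((1 - sigma) / kappa)


# Hypothetical coefficients, not measured on any system in this report.
SIGMA, KAPPA = 0.05, 0.002

peak = rollover_point(SIGMA, KAPPA)
print(f"modeled throughput peaks near {peak:.1f} processors")
for n in (1, 4, 8, 16, 32, 64):
    print(n, round(relative_throughput(n, SIGMA, KAPPA), 2))
```

In this model a better SMP design corresponds to smaller coefficients, which moves the peak out to a larger processor count.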
SMP LINEARITY
Linearity on SMP systems is typically expressed as a percentage relative to the
ideal scalability, in which increasing the number of processors from n-1 to n
should produce n/(n-1) times the performance. An ideal SMP system would truly
be linear, i.e., each added CPU would contribute the full performance of a
single processor (100% linearity). Loath to be measured by such an unforgiving
standard, vendors hesitate to disclose the SMP benchmark data points necessary
to draw conclusions about linearity. Even when vendors provide multiple
measurements for the same machine, the tested environments almost always vary
by processor clock speed, cache size, database system, database version, or
operating-system version.
In rare instances, vendors have released enough benchmark data on a system to
allow a gauge of its linearity. For example, on Bull's Escala servers, which are
internally identical to IBM's current 32-bit RS/6000 servers, earlier versions of
AIX achieved 70% linearity on TPC-C benchmarks when going from four to
eight processors in identical configurations.
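One plausible reading of the linearity definition above is the measured speedup divided by the ideal speedup. The sketch below follows that reading; the report does not spell out its exact formula, so the figures it quotes may have been derived differently, and the benchmark scores used here are hypothetical.

```python
def linearity(perf_small, cpus_small, perf_large, cpus_large):
    """Measured speedup as a fraction of the ideal (perfectly linear) speedup."""
    measured_speedup = perf_large / perf_small
    ideal_speedup = cpus_large / cpus_small
    return measured_speedup / ideal_speedup


# Hypothetical scores for a four-way and an eight-way configuration
# that are otherwise identical.
four_way_score = 10_000.0
eight_way_score = 14_000.0

result = linearity(four_way_score, 4, eight_way_score, 8)
print(f"linearity: {result:.0%}")
```

The comparison is only meaningful when the two configurations differ in processor count alone, which is why the surrounding text stresses identical configurations.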
PERFORMANCE CLUSTERING
Clusters can sometimes increase a system's capacity, including performance and
storage. To scale performance on a cluster, applications work in concert with
clustering software to partition their workloads into subtasks, which the
clustering software then distributes across a group of clustered servers. Since
even the fastest cluster interconnects usually have lower bandwidth and greater
latency than the bus in an SMP (in some cases by several orders of magnitude),
synchronization among the subtasks becomes a critical bottleneck that systems
must minimize. Identifying opportunities for coarse-grained parallelism proves
key to effective scalability on clusters. A variety of parallel-programming tools
and techniques have emerged to assist in partitioning applications for clusters.
Their use requires considerable expertise, however, and some classes of
applications fundamentally cannot be adapted at all. If sufficiently partitioned,
applications can exploit clustered systems containing hundreds or even thousands
of nodes, delivering monumental gains in performance.
TECHNICAL COMPUTING CLUSTERS
Today, parallel computing addresses some of the world's deepest computational
problems across a variety of scientific and engineering domains, including
simulation of natural phenomena, finite element analysis, and mechanical design.
From a hardware standpoint, clustered computing has evolved into variants such
as Clusters of Workstations (COWs) and Massively Parallel Processors (MPPs),
all of which share the assumption that attached nodes are dedicated exclusively to
their participation in cluster activity. Software designs have converged around
two public-domain parallel processing packages, Message-Passing Interface (MPI)
and Parallel Virtual Machine (PVM), which handle dispatch, collection, and
management of processing tasks across cluster nodes.
Even as researchers have gained parallel programming experience, they have
continued looking for more affordable alternatives to expensive supercomputer
products (many of whose developers have now adopted parallel architectures
themselves). Recently, the dramatic improvements in price-performance of such
commodity technology as Intel X86 processors and Ethernet LAN adapters have
allowed developers to pursue advanced cluster architectures based entirely on
industry-standard technology, requiring little or no involvement from major
hardware vendors. The development of low-cost operating systems and related
PVM-type capability on Linux through software known as Beowulf has extended
the popularity and availability of technical computing clusters.
DATABASE CLUSTERING
Technical applications tend to be analysis-oriented and thus read more data than
they write. By contrast, most commercial applications involve Online Transaction
Processing (OLTP), which usually requires that database records be updated
frequently. Since any data set changes must be copied to every node in the cluster
to maintain consistency, OLTP-oriented tasks tend to scale better on SMP
systems, which suffer a much less severe penalty with regard to inter-processor
communication.
However, a few commercial applications rely on analysis as well and thus lend
themselves well to cluster deployment. For example, data warehousing involves
scanning large databases for patterns that can be used to help make business
decisions. (Decision support is typically cited as a key benefit of data-
warehousing applications.) Many classes of data warehousing applications can
partition their data sets so as to minimize inter-node synchronization, allowing
them to achieve good scalability on clusters. However, data partitioning and
distribution must be implemented at the core of a database engine to work
effectively, meaning that database systems require modifications to properly
support clustered operation.
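The partitioning idea can be sketched in a few lines. The example below is an illustration only, not how any of the database products named in this report work internally: the row layout, the partition_rows, local_scan, and merge helpers, and the four-node cluster are all hypothetical. Each node scans only its own partition, and a single merge step at the end is the only point where nodes exchange data.

```python
from collections import defaultdict


def partition_rows(rows, key, node_count):
    """Assign each row to a node by hashing its partition key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[hash(row[key]) % node_count].append(row)
    return partitions


def local_scan(rows):
    """Work done independently on one node: total sales per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return totals


def merge(partials):
    """Combine the per-node partial results into one answer."""
    combined = defaultdict(float)
    for partial in partials:
        for region, amount in partial.items():
            combined[region] += amount
    return dict(combined)


# Hypothetical fact table with an integer partition key.
sales = [
    {"customer_id": i, "region": "east" if i % 3 else "west", "amount": 10.0 * i}
    for i in range(1, 13)
]

partitions = partition_rows(sales, "customer_id", node_count=4)
result = merge(local_scan(rows) for rows in partitions.values())
print(result)
```

A read-mostly query like this one needs no communication between nodes until the merge, whereas an OLTP update would have to be made visible to other nodes immediately.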
Several commercial database systems including Oracle Parallel Server (OPS),
IBM DB2 Universal Database (UDB), and Informix XPS have been extended
to work on clusters of servers connected by high-speed interconnects.
PACKAGED WEB SERVER FARMS
While technical clustering revolves around distribution of compute cycles across
nodes, and commercial database clustering revolves around distribution of both
disk I/O and compute cycles across nodes, IP clustering focuses on the
distribution of network requests such as TCP/IP or web service requests across
nodes. The largest websites on the Internet process millions of hits per day, a
volume of traffic that can exceed the capabilities of a single server.
IP clusters allow ISPs or corporate Intranet sites to map all the traffic destined
for a single node (say, home.netscape.com) to a farm of multiple web servers
across which the Internet traffic is balanced. This mapping can take place either
in hardware (at a router-like device sitting in front of the web server farm) or in
software (on a separate server that sits in front of the web server farm). Virtually
all operating systems support the hardware approach, epitomized by the
expensive but well-known Cisco LocalDirector solution. LocalDirector, like
other hardware products, takes incoming IP sessions and rewrites the IP headers
of a packet stream to redirect them to a particular server using a technique called
Network Address Translation defined in RFC 1631. This approach requires no
changes to DNS configurations and minimal configuration of web servers, other
than to ensure that the web servers have mirrored data or are operating off a
common network-based file store. The balancing of connections can occur for a
broad range of TCP/IP services such as email or FTP and not just web services.
On the downside, systems require a backup LocalDirector to avoid having a
single point of failure, and throughput can be limited by the rate at which
LocalDirector can rewrite packets. In addition, the hardware approach does not
offer the ability to dynamically balance the connections according to the load on
each server in the web server farm.
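The address-rewriting step can be sketched as follows. This is a simplified illustration of the general technique, not a description of how LocalDirector is implemented: the AddressTranslator class, the dictionary packet layout, and all addresses are hypothetical. Note that the dispatcher picks back-end servers in fixed rotation and never consults their load, which mirrors the limitation noted above.

```python
from itertools import cycle


class AddressTranslator:
    """Rewrites the destination of packets sent to one virtual address."""

    def __init__(self, virtual_ip, backends):
        self.virtual_ip = virtual_ip
        self._next_backend = cycle(backends)
        self._sessions = {}  # (client address, client port) -> back-end address

    def translate(self, packet):
        """Return a copy of the packet with its destination rewritten."""
        if packet["dst"] != self.virtual_ip:
            return packet  # not addressed to the farm; pass through unchanged
        session = (packet["src"], packet["sport"])
        if session not in self._sessions:
            # New session: pick the next server in rotation.
            self._sessions[session] = next(self._next_backend)
        return {**packet, "dst": self._sessions[session]}


# Hypothetical virtual address and back-end servers.
farm = AddressTranslator("192.0.2.10", ["10.0.0.1", "10.0.0.2"])

first = farm.translate({"src": "198.51.100.7", "sport": 40000, "dst": "192.0.2.10"})
again = farm.translate({"src": "198.51.100.7", "sport": 40000, "dst": "192.0.2.10"})
other = farm.translate({"src": "198.51.100.8", "sport": 40001, "dst": "192.0.2.10"})
print(first["dst"], again["dst"], other["dst"])
```

Keeping a session table ensures that every packet of an established session reaches the same back-end server.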
The earliest software approaches revolved around a technique called Round
Robin DNS. In this approach, when a DNS server was asked for the IP address
of a site name (e.g., www.dhbrown.com), it would return a numeric IP address
that alternated among a predefined set of IP addresses, each referring to a
separate back-end server (e.g., 127.1.1.1, 127.1.1.2, 127.1.2.5). This approach had
two main flaws, however. First, DNS mappings often got cached in intermediate
routers and other DNS servers in such a way that the load was not evenly
distributed. Second, if a back-end server failed, the DNS table had to be modified
by hand to remove the failed system's IP address. Otherwise, connections would
continue to be routed to the dead system, even when a user pressed the reload
button on their browser.
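The rotation itself is simple, as the following sketch shows. It is a toy model rather than a real DNS server: the RoundRobinResolver class is hypothetical, and the addresses are the example values from the text above.

```python
from itertools import cycle


class RoundRobinResolver:
    """Hands out the addresses registered for a name in fixed rotation."""

    def __init__(self, records):
        # One endless rotation per site name.
        self._rotations = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        return next(self._rotations[name])


resolver = RoundRobinResolver(
    {"www.dhbrown.com": ["127.1.1.1", "127.1.1.2", "127.1.2.5"]}
)

answers = [resolver.resolve("www.dhbrown.com") for _ in range(4)]
print(answers)
```

The resolver knows nothing about the health of the back-end servers, so a failed address keeps appearing in the rotation until someone edits the table. Any cache between the client and the resolver also replays a single answer, which skews the distribution.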
Over the last several years, a variety of other software approaches have sprung up
that attempt to provide the load-balancing via software that intercepts incoming
requests for information and distributes those requests accordingly. As with
database clustering solutions, a full evaluation of these products goes beyond the
scope of this paper. However, at a minimum, all the studied operating systems
can support the hardware redirection approach to IP clustering, as well as the
primitive round-robin DNS approach.
MISCELLANEOUS PERFORMANCE OPTIMIZATIONS
Several vendors have tried to address scalability in particular situations. Because
of their limited applicability, DHBA weighted these areas less heavily in its
rankings. Potential areas for optimization include:
Multiple 2 GB shared-memory segments: Large servers running several copies of 32-
bit enterprise applications such as SAP can run into bottlenecks over shared
memory, if the kernel can only provide 2 GB of shared memory to the whole
system. A proper kernel design or modification can enable multiple
applications to use their own private 2 GB shared-memory windows without
exhausting the limited shared-memory space addressable by a 32-bit kernel.
This tactical feature should help scalability in certain server consolidation
environments, where the application vendor has yet to port its application to
64 bits.
Dynamic page sizing: Historically, operating systems used fixed-size I/O pages.
However, some classes of applications may benefit from different page sizes.
For example, applications that involve use of many small files may operate
more efficiently with small page sizes, while I/O-intensive applications
implementing large block transfers may run better with large page sizes. Some
operating systems allow administrators to set page size by process.
Kernel thread architecture: All studied environments now support kernel threads,
which are required to effectively scale threaded applications on SMP systems.
Some environments innovate over traditional one-to-one (1-1) thread
mechanisms, in which each application thread has one corresponding kernel
thread, with the addition of MxN thread mechanisms. MxN thread
scheduling multiplexes user threads over a fixed (but configurable) number of
kernel threads. In some application classes, MxN thread-scheduling boosts
application efficiency, as it avoids calling kernel functions directly, thus
reducing the overhead of saving and restoring the kernel state when making
those calls. MxN also potentially allows the creation of many more user
threads, because it requires a smaller overhead per thread.
Kernel-based asynchronous I/O: Asynchronous I/O mechanisms prove useful in
programming SMP applications by allowing threads to continue processing
while waiting for time-consuming I/O operations such as disk reads to
complete. Some operating systems support asynchronous I/O deep in the
kernel, potentially making its use more efficient with heavy-duty programs
such as databases.
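The multiplexing idea behind MxN threads can be shown by analogy with a fixed-size worker pool. This is only an analogy under stated assumptions: the tasks below are queued work items rather than true user-level threads, and M_TASKS and N_WORKERS are hypothetical values. It does not reproduce any vendor's thread library.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

M_TASKS = 8    # application-level units of work (the "M" side)
N_WORKERS = 2  # fixed but configurable pool of OS threads (the "N" side)


def task(task_id):
    """Report which worker thread ended up running this unit of work."""
    return task_id, threading.current_thread().name


# All M tasks are multiplexed over at most N worker threads.
with ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
    results = list(pool.map(task, range(M_TASKS)))

workers_used = {name for _, name in results}
print(f"{len(results)} tasks ran on {len(workers_used)} worker thread(s)")
```

Creating another task costs only a queue entry rather than a new OS thread, which loosely corresponds to the smaller per-thread overhead described above.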
AIX 4.3.3
AIX offers good scalability overall, with excellent performance clustering
capabilities and solid 64-bit and SMP capabilities. IBM's latest generation of 64-
bit SMP servers supports up to 24 processors, twice what it supported last year.
IBM has published respectable TPC-C results demonstrating that AIX can
effectively exploit such high-end configurations.
While linearity remains unknown at the 12-way or 24-way level, AIX has achieved
impressive results on past system generations. For example, as noted earlier,
using Bull's Escala servers, which are internally identical to IBM's current 32-bit
RS/6000 servers, earlier versions of AIX achieved 70% linearity on TPC-C
benchmarks when going from four to eight processors in identical
configurations. SPECweb96 shows 69% linearity between similar two-way and
four-way H70s. For technical batch jobs, IBM relies mainly on clustered SP
systems, but the older R50 has shown linearity of 82% on SPECfp_rate_base95
and SPECint_rate_base95. This is somewhat low, but still respectable.
IBM introduced its first 64-bit UNIX hardware two years ago and over the last
year it extended 64-bit benefits to the midrange of its server line with the H70.
Also over the last year, AIX's maximum physical memory support has grown to
64 GB with AIX 4.3.3 running on the S80. While AIX provides some degree of
64-bit capability across all four major criteria (large file systems, files, physical
memory, and address space), two implementation weaknesses remain:
AIX 4.3's maximum file size, 64 GB, falls significantly behind competitors,
most of whom support at least 1 TB.
AIX 4.3's 64-bit addressing scheme rests on a hybrid 32-bit/64-bit kernel
addressing mechanism that penalizes 64-bit application performance, among
other tradeoffs (described more fully below).
Some controversy has arisen over AIX's single kernel, since the inner kernel itself
remains 32-bit, meaning that its pointers internally remain 32 bits wide. The 64-
bit pointers used by applications are handled internally by the kernel as 64-bit
cookies passed among internal routines. Manipulation of these pointers is
restricted to a small set of 64-bit-aware kernel routines. A full description and
examination of the implications of this falls beyond the scope of this paper,
except for a brief examination of the tradeoffs. With AIX, 64-bit applications
that frequently call upon (32-bit) kernel routines may incur a small performance
penalty for checking, reshaping, and creating internal kernel data structures. IBM
points out that 32-bit applications running on a 64-bit kernel face at least some
overhead of a similar nature. Given the relatively small number of 64-bit
applications and the great benefits that those applications receive from going to
64 bits, AIX's kernel architects feel comfortable with the occasional case where a
64-bit application takes a small performance penalty. In return, AIX offers the
unique ability to use older device drivers and a single kernel across its 32-bit and
64-bit systems. Furthermore, the cache effects of larger 64-bit code and data that
reduce performance may offset much of the potential gain of a true 64-bit kernel.
Overall, the issue has an impact on some 64-bit applications' performance, but
the impact appears to be minor.
AIX is equipped with very good clustering options. IBM's HACMP clustering
package supports industry-leading high-availability (HA) functions, and on IBM's
SP systems, AIX has proven its ability to support world-class computational
problems. In terms of concurrent database support as rated by DHBA's HA
research, IBM ranks second only to Tru64 UNIX in breadth of capabilities. IBM
has demonstrated the scalability of its systems with strong TPC-D benchmarks
on a 48-node SP. IBM also holds second place for clustering performance on the
TPC-C benchmark with a five-node cluster of S70 servers. Vendors such as
Oracle also support their OPS parallel database on AIX. The SP and AIX
support the broadest range of clustered databases, including IBM DB2 UDB
EEE, Informix XPS, Oracle Parallel Server, Red Brick xPP, and Sybase MPP.
Note that there is typically a three-month delay before the most recent version of
AIX is made available on the SP.
IBM's technical clustering stands out for its SP system, which consists of AIX
systems connected by a proprietary, high-performance switch; a version of MPI
optimized for the switch; and sophisticated cluster-management tools.
Like other vendors, IBM depends on third-party reverse proxy software (such as
iPlanet Proxy Server) for web-farm clustering. IBM supports 64-bit kernel
asynchronous I/O and an MxN thread model, as well as the ability to support
multiple pools of shared memory on 32-bit systems using techniques analogous
to HP's Memory Windows feature. However, unlike all the other products
studied in this report, AIX does not support dynamic page sizing, in part due to
hardware limitations.
HP-UX 11.0
HP-UX offers solid scalability, with strong SMP support up to 32-way
systems and 64-bit capabilities that match those of other vendors. However, HP-UX
lacks MxN threads and strong concurrent database capabilities. While HP's V-
class servers have supported up to 32 processors since last year, HP has recently
moved its NUMA technology from Convex into HP-UX, allowing non-uniform
shared-memory servers that began shipping at the end of 1999 to reach 128
processors. HP's SMP servers support up to 32 processors today, with
prototype NUMA systems of 128 processors expected to become generally
available in early 2000. At least some performance gain from additional
processors seems likely at the 32-way level, since HP has published TPC-C, TPC-
H, and TPC-D benchmarks on 32-processor systems. Linearity remains
somewhat unclear, but some eight-way N-class and 32-way V-class results suggest
reasonable scalability, despite the fact that they are not strictly comparable, due
to different backplanes and other minor factors. For example, two 440 MHz
systems running TPC-H benchmarks with Informix as the database show 63%
linearity between eight processors and 32 processors. Similarly, an eight-way 440
MHz system running TPC-C benchmarks with Sybase suggests 47% linearity
when compared to a 32-way 440 MHz system running Oracle. (Assuming Oracle
is better than Sybase, 47% would be an upper bound.) A later 32-way Sybase
result with a slightly newer operating system and slightly newer Sybase yields 52%
linearity.
In terms of SPECweb96 linearity, one- to two-way linearity is 74%, with two- to
four-way being 94% and four- to eight-way being 85% (one- to eight-way thus is
78%). Also, a 16-way V-class showed 55% SPECweb96 linearity over a
comparably clocked four-way K-class, albeit with different backplanes. In terms
of trivially parallelized benchmarks useful for technical or batch-oriented
computing, SPECint_rate_base95 linearity from a 16-way to a 32-way 440 MHz
V-Class is 84%, while a 200 MHz V-Class shows 93% linearity going from a one-
way to a 16-way configuration. No results for HP's 128-way NUMA systems
are yet available.
HP-UX 11.0 supports 64-bit capabilities in all four areas: files, file systems,
physical memory, and process address space. HP's 64-bit servers exceed many
other UNIX competitors in terms of maximum memory capacity, with support
for 128 GB in the SCA-node V-class servers that started shipping at the end of
1999. Previously, HP's 32 GB V-class limit matched that of others. HP-UX
supports large files and file systems up to 1 TB. HP has largely put compatibility
and software-availability transition issues for the 64-bit platform behind it, having
filled in holes such as its OpenGL implementation.
As measured by DHBA's HA research, HP's support for performance clustering
functions is limited by its lack of virtual raw disk access, low-overhead
messaging protocols, a distributed lock manager in the kernel, and software
RAID5 support. Still, HP-UX systems support both XPS and OPS. Moreover, in
terms of benchmark evidence, HP has provided eight-node TPC-C and eight-
node TPC-H benchmarks, the former running Oracle and the latter with
Informix. Like other vendors, HP depends on third-party reverse proxy software
(such as iPlanet Proxy Server) for web-farm clustering.
HP's Memory Windows feature, introduced largely to support large SAP
installations on 32-bit systems, removes the restriction that all applications on a
server would have to share a single 1.75-2.75 GB pool of shared memory.
Memory Windows allow each application to have its own, semi-private pool of
1+ GB. With HP-UX 11.0, HP finally introduced kernel threads, using the 1-1
model, but has yet to catch up with competitors offering the more modern MxN
threads for peak scalability. HP does support dynamic page-sizing optimizations
and kernel asynchronous I/O, features that prove useful for accelerating database
performance.
IRIX 6.5
IRIX has excellent scalability, supporting more processors and memory in SMP
systems than any other studied UNIX product. Commercial performance clustering
remains less fully addressed, however. IRIX offers particularly strong SMP capabilities
for technical requirements. Currently, SGI systems scale as high as 512 processors
with their Origin2000 NUMA hardware, although mainstream commercial
benchmarks do not go nearly as high. SGI has published TPC-C benchmark results
for its 28-way Origin2000 servers as well as a respectable 32-way TPC-D result. While
linearity cannot be determined based on the single TPC-C result, two reasonable data
points on the TPC-D benchmark indicate 77% linearity on database query power and
throughput between comparable eight-way and 32-way systems running the same
processors, operating system, and database.
SPECweb96 linearity is fine, with one- to two-way Origin results indicating 52-59%
linearity. Two- to four-way linearity appears to be between 59% and 70%,
depending on the choice of system pairs, with more results falling at the high end
of that range. Four- to eight-way SPECweb linearity appears to be 60%.
SGI has largely abandoned its vision of a Cellular IRIX that would solve
scalability problems such as kernel page table bottlenecks by running multiple
images of the operating system. Instead, SGI has turned its focus to tuning for
technical and batch applications. Linearity for trivially parallelizable technical and
batch applications appears strong, with 99% linearity from one- to four- to eight-
to 16- to 32- to 64-way Origin2000 systems all running at 250 MHz. From 64- to
128-way, linearity stays at 88% and from 128- to 256-way, linearity improves
slightly to 92%. Overall then, linearity from a one-way to a 256-way system
appears remarkably strong over a very wide range, with each processor
performing at 88% of maximum performance.
SGI provides 64-bit hardware across its entire product line. The 64-bit version of
IRIX, first released in 1993, is highly mature, meaning compatibility issues are
largely an issue of the past. IRIX stands out for its large real memory support of
256 GB in a single Origin2000 system, four times its nearest competitor, Sun's
Enterprise 10000 server. IRIX is also notable for its storage scalability. SGI has
customers who use its XFS file system with over 100 TB, a stronger claim of real-
world testing than any of its competitors, most of whom guarantee 1 TB-level
testing and support.
SGI has sharpened its focus to emphasize technical performance clustering
scalability, rather than commercial database cluster scalability. While Oracle
supports its Parallel Server, and Informix supports XPS on IRIX, SGI has not
yet run any TPC-C or TPC-D benchmarks using either of these systems. SGI's
ORIGIN ARRAY clustering technology supports eight SMP nodes linked by
Fibre Channel (FC), aimed primarily at technical tasks. SGI offers a suite of
capabilities, including NQE and LSF for workload balancing/distribution; the
TotalView cluster debugger for debugging a single MPI application across a
cluster; ArrayServices, allowing commands to be run across the cluster for cluster
management; and a cluster accounting package that helps track system usage.
SGI's 48-node, 128-way SMP cluster at LANL demonstrates its ability to deliver
and extend this technology in the most advanced technical-computing
environments. Current SGI environments support up to 512-way SMP, and
those systems can be partitioned into smaller clusters as desired.
In terms of miscellaneous optimizations, IRIX implements MxN thread
scheduling, per-process dynamic page sizing, and kernel asynchronous I/O.
SOLARIS 7
Solaris provides excellent SMP scalability, with support for up to 64 processors.
TPC-C and TPC-D benchmarks at the 64-way level suggest at least some
performance gains from that number of processors. Linearity remains extremely
difficult to judge, even more so than for other vendors, because Sun ran all its tests with
different databases or hardware configurations. If one drops the requirement for
comparable databases and backplanes, a 64-way Oracle configuration shows 40%
linearity per processor over a four-way Sybase configuration, both running 400
MHz processors with 4 MB L2 caches. An even grosser comparison can be made
from that 64-way system to a 24-way system running Sybase on 336 MHz
processors, indicating 69% linearity, assuming that TPC-C results would scale
perfectly with comparable clock speeds. A strong correlation exists between
SPECint speeds and clock rates and between SPECint and TPC-C results, but
the accuracy of these linearity results is highly qualified given the number of
changing variables.
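A comparison across different clock speeds can be normalized as sketched below, under the same assumption the paragraph makes: that results would scale perfectly with clock rate. The function name and all input numbers are hypothetical, and the report does not state that its own figures were computed exactly this way.

```python
def clock_normalized_linearity(perf_small, cpus_small, mhz_small,
                               perf_large, cpus_large, mhz_large):
    """Linearity after scaling out the clock-speed difference.

    Assumes performance is proportional to clock rate, so the ideal
    speedup is the processor ratio multiplied by the clock ratio.
    """
    measured_speedup = perf_large / perf_small
    ideal_speedup = (cpus_large / cpus_small) * (mhz_large / mhz_small)
    return measured_speedup / ideal_speedup


# Hypothetical scores: a two-way 300 MHz system and a four-way 600 MHz system.
result = clock_normalized_linearity(100.0, 2, 300.0, 300.0, 4, 600.0)
print(f"clock-normalized linearity: {result:.0%}")
```

Each additional normalization adds another assumption, which is why such cross-configuration figures deserve the heavy qualification given above.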
In terms of SPECweb96 linearity, one- to two-way improvements on comparable
Enterprise 250s run a surprisingly low 63%, or 72% on comparable Enterprise
450s. The Enterprise 450 shows 80% linearity from two to four processors. In
terms of technical compute scalability, Sun's SPECfp_rate_base95 scales between
78 and 87% on various 32- to 64-way configurations.
Solaris 7 added support for large process address spaces to its previous support
for large files, file systems, and real memory. Sun's support for large physical
memory sizes is strong at 64 GB, matching or exceeding all vendors except SGI.
Sun supports files and file systems up to 1 TB. Like HP-UX and IRIX, Solaris 7
comes in both 32-bit and 64-bit flavors, chosen at install time, with the same
binary compatibility for applications and the minor inconvenience of checking
that proper device drivers are available and installed on the appropriate 32-bit or
64-bit hardware.
Sun's commercial clustering performance has been validated by four-node
benchmarks with Oracle 8i and four-node TPC-D benchmarks with Informix
Dynamic Server XP. As one of the first UNIX environments to optimize for
kernel threads, Solaris pioneered the MxN thread model and fully supports
kernel asynchronous I/O. Sun does not appear to support a shared-memory
feature similar to HP's Memory Windows for systems running 32-bit
applications on large memory servers.
TRU64 UNIX 5.0
While Tru64 UNIX's 64-bit and clustering technologies remain key areas of
strength, the operating system has yet to prove its scalability on high-end SMP
configurations, weakening its scalability range. Until Compaq's future eight-, 16-,
and 32-node high-end GS-series systems arrive in 1H00, Tru64 UNIX's SMP
support remains limited to 14 processors. Linearity of that SMP support
remains unclear at best, questionable at worst. Compaq has published 12-way
TPC-D results and recent eight-way TPC-C and TPC-H results (as well as a now-
obsolete 10-way TPC-C result). These results indicate that additional processors
do improve performance through at least the eight-way range and probably up
through the 12-way range. While it might appear odd that Compaq has released
newer benchmarks with fewer processors, this is likely due not to operating-
system factors but to the fact that the newer Alpha 21264 processor saturates the
memory bus more quickly than the 21164 for which the systems were originally
designed.
On the positive side, comparable six-way and eight-way 700 MHz Sybase
systems, with slightly different backplanes, do show 97% TPC-C linearity over
that narrow processor range. However, SPECweb linearity is much more modest:
48% improvement going from one- to two-way (DS20), 84% going from two-
way DS20 to four-way ES40, and only 11% faster performance going from a
four-way system to a 10-way system with faster clock speed, once the MHz gains
are scaled out. SPECfp_rate_base95 linearity is surprisingly poor, with 52%
improvement from a four-way to an eight-way. This is perhaps due to inadequate
memory bandwidth for the 21264 on the 8400; SPECint_rate_base95 linearity for
that same comparison is 95%.
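The arithmetic behind such linearity figures can be sketched in a few lines. The sketch below assumes that linearity means measured speedup divided by ideal speedup (the ratio of processor counts), with clock-speed gains divided out first; the report does not state DHBA's exact formula, and the throughput numbers are hypothetical rather than published benchmark results.

```python
def improvement(perf_small, perf_large):
    """Fractional performance gain going from the smaller to the larger system."""
    return perf_large / perf_small - 1.0


def linearity(perf_small, cpus_small, perf_large, cpus_large,
              mhz_small=1.0, mhz_large=1.0):
    """Measured speedup divided by ideal speedup (assumed definition).

    The ideal speedup is the ratio of processor counts. Clock-speed gains are
    divided out first so a faster clock is not counted as SMP scaling.
    """
    measured = (perf_large / perf_small) / (mhz_large / mhz_small)
    ideal = cpus_large / cpus_small
    return measured / ideal


# Hypothetical throughput numbers, for illustration only.
six_way, eight_way = 30000.0, 38800.0
print(f"improvement: {improvement(six_way, eight_way):.1%}")
print(f"linearity:   {linearity(six_way, 6, eight_way, 8):.1%}")
```

Under this assumed definition, a result can show a healthy raw improvement and still score poorly on linearity if most of the gain comes from a faster clock rather than from the added processors.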
Tru64 UNIX's lead in 64 bits grows less relevant by the year, as all other UNIX
vendors now effectively match its 64-bit addressing capabilities. Compaq boasts
the largest portfolio of 64-bit applications, but the payoff of this achievement,
aside from large-memory databases, remains unproven.
From the very beginning, Compaq (then Digital) designed its operating system to
be a fully 64-bit environment. Not surprisingly, Tru64 UNIX offers the strongest
64-bit functionality and the best compatibility story for customers going forward.
Compaq's entire line of Alpha hardware has been 64-bit capable for almost five
years and the system provides large files, file systems, process address space, and
physical memory of up to 28 GB (limited by hardware). Tru64 UNIX's AdvFS
file system has supported large files since its introduction, although its UFS file
system has not. Unlike on competing platforms, all Tru64 UNIX applications are
available on its 64-bit platform without future migration issues. While Tru64
UNIX does not yet conform to the UNIX98 standard, it does support the earlier
UNIX95 standard and offers a full set of 64-bit device drivers and applications
with a single operating-system binary.
Compaq's TruCluster Server software provides effective clustering support.
Although its HA clustering functions rank as average, Tru64 has focused more on
performance scalability clustering, ranking first in terms of concurrent database
support as rated by DHBA's HA research. Tru64 UNIX systems support both
Informix XPS and Oracle OPS parallel databases. An eight-node Oracle OPS and
Tru64 UNIX cluster was the first system to break the 100,000 tpmC TPC-C
barrier. More recently, Compaq released a new TPC-H benchmark running
Informix XPS across eight nodes running Tru64 UNIX.
Like other vendors, Compaq depends on third-party reverse proxy software
(such as iPlanet Proxy Server) for web-farm clustering. Tru64 UNIX does
provide an MxN thread model for SMP applications. Its dynamic page sizing is
particularly flexible, allowing page sizes to vary across processors and on a
per-process basis.
RELIABILITY, AVAILABILITY AND SERVICEABILITY (RAS)
[Chart: RAS functional ratings for AIX 4.3.3, HP-UX 11.0, Solaris 7, Tru64 UNIX 5.0, and IRIX 6.5 on a scale from 5.00 to 9.00 (Fair, OK, Good, Very Good, Excellent)]
SUMMARY
IRIX 6.5 shares the lead with Tru64 UNIX 5.0 for RAS functions. Tru64 UNIX
offers unmatched storage reliability features and leading HA clustering functions,
thanks in part to its clustering file system, which is unique among all studied
products. IRIX offers particularly strong resiliency functions, along with
competitive HA clustering options. Solaris 7 follows, having the strongest
resiliency functions due to its unmatched Dynamic Reconfiguration and
Alternate Pathing capabilities. Solaris 7 offers only average HA clustering
capabilities, however. HP-UX 11.0 and AIX 4.3.3 have roughly equivalent RAS
capabilities. HP offers stronger resiliency functions and the best overall
serviceability functions, but IBM offers very strong HA clustering functions.
RAS CRITERIA
Virtually all systems have downtime, but enterprise environments place a
premium on functions that help minimize it. Developers have created a number
of software tools and mechanisms to reduce both planned and unplanned
downtime, including:
Resiliency functions: allow an operating system to adapt to outages of certain
hardware components in single systems, including I/O, CPUs, and memory.
HA clustering functions: protect a complex of multiple systems against hardware
and software failures (both in the operating system and applications) by
allowing servers to fail over operations to a backup server.
Storage reliability functions: maintain the integrity of system and user data in
the event of unplanned shutdowns, for example through journaling file systems.
FIGURE 3: RAS Functional Ratings
RESILIENCY FUNCTIONS
In general, hardware has become more reliable over time. Server designs
increasingly build on highly integrated components, reducing complexity and
hence the number of points of failure. Hardware areas particularly vulnerable to
mechanical failure, such as storage, can be protected through techniques such as
RAID. Systems now build in redundancy for components such as fans to further
improve reliability. Despite these improvements, critical failures can still occur in
components such as memory and CPUs. Leading-edge developers have
responded by introducing features that allow an operating system to adapt to
certain hardware failures, in some cases drawing on techniques that have
traditionally been implemented in mainframe environments. Emerging operating-
system technology that enables such self-healing includes:
Dynamic processor resilience: can adapt to processor failures by isolating failed
CPU components. In the event of a soft error (a non-fatal error that allows
the system to continue processing), the system should gracefully discontinue
use of the failed unit. If a processor failure results in a system crash, the
system should reboot automatically after isolating the failed unit.
Dynamic memory resilience: can dynamically cordon off memory that has suffered
single-bit errors so that software no longer risks using potentially unreliable
areas. Most systems typically can detect and correct single-bit failures with
error-correcting code (ECC) memory. With dynamic memory resilience,
however, the operating system registers repeated single-bit failures in
software so it can isolate affected areas before fatal double-bit errors occur.
Dynamic reconfiguration: can support online addition and removal of I/O
adapters, CPUs, and memory modules for repairs or upgrades. Dynamic
removal of CPUs and memory requires the operating system to gracefully
dry up use of those resources. Dynamic reconfiguration of I/O typically
requires support for Alternate Pathing (AP) in the operating system, so any
logical I/O reference can be switched among different physical I/O adapters.
Software-aware internal partitions: can support the division of a large SMP system
into several smaller SMP systems, each running its own copy of the
operating system for increased reliability.
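The dynamic memory resilience policy described above can be illustrated with a small sketch. It assumes a simple per-page counter and a fixed retirement threshold; the class name, the threshold value, and the page numbers are hypothetical and do not describe any vendor's implementation.

```python
from collections import Counter


class MemoryResilienceMonitor:
    """Tracks corrected single-bit errors per page and retires repeat offenders."""

    def __init__(self, threshold=3):
        self.threshold = threshold      # hypothetical policy value
        self.error_counts = Counter()   # page number -> corrected-error count
        self.retired = set()            # pages no longer handed out

    def record_corrected_error(self, page):
        """Register one ECC-corrected error; return True if this retires the page."""
        if page in self.retired:
            return False
        self.error_counts[page] += 1
        if self.error_counts[page] >= self.threshold:
            self.retired.add(page)
            return True
        return False

    def is_usable(self, page):
        """A page is usable until it has been retired."""
        return page not in self.retired


monitor = MemoryResilienceMonitor(threshold=3)
for page in [7, 7, 12, 7]:
    if monitor.record_corrected_error(page):
        print(f"page {page} retired after repeated single-bit errors")
```

The point of the threshold is that a single corrected error is tolerated, while a page that keeps producing them is isolated before a second, uncorrectable bit error can occur.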
HIGH-AVAILABILITY CLUSTERING FUNCTIONS
The vast majority of risks to system reliability derive from failures in software,
including the operating system, middleware, and applications. Administrators can
use HA clustering techniques to maintain the availability of operating-system
services and applications by failing over to a backup system in the event of
system outage, either planned or unplanned. HA clustering allows one or more
servers to take over for a server that has crashed or stopped processing normally
due to an operating-system or application failure, allowing processing to
continue. By isolating faults on the failed node, the remaining nodes can continue
functioning, keeping the overall clustered system in operation, albeit at reduced
capacity.
In some cases, clustering can help with some management tasks by absorbing
planned downtime in addition to addressing system failure. For example, a cluster
could allow testing of new software or hardware in a working system while still
protecting the remaining nodes from any resulting failures. Clusters can also be
used to respond to failure of hardware components such as disks or adapters.
Note that most clustering solutions only try to ensure that service gets restored
within a reasonable time limit. They do not necessarily guarantee continuous
service. In fact, at the time of a failure, cluster clients will likely receive errors
while the cluster completes state transition changes. Unlike Fault-Tolerant (FT)
systems, which tend to use specifically designed and usually costly proprietary
mechanisms to enable truly continuous availability, clusters emphasize the use of
standard building blocks (i.e., traditional servers used to construct
meta-systems with some level of a single-system image). As part of the design
tradeoff, a cluster's failover process does not necessarily occur immediately or
transparently.
Full-function HA clustering solutions typically include a number of components,
including:
Failure detection and recovery: Clustering software monitors the health of systems
and applications by running agents that continuously probe for certain
conditions. Vendors usually provide agents for monitoring hardware, the
operating system, and key applications such as databases and messaging
systems. They typically also provide an API that developers can use to
configure monitoring of their own applications.
Failover configuration: When agents detect a failure, they can trigger a variety of
actions, depending on the configurability of the clustering package. First, the
system must decide whether to attempt a local recovery or initiate a failover,
in which the workload is moved to a backup server. In failover situations,
support for more than two nodes becomes a significant added value, because
of the ability to perform cascading and multidirectional failover. Cascading
failover provides higher levels of reliability by allowing the workload to
continue migrating to yet another backup node if the primary backup node
fails. Multidirectional failover allows a failed node's workload to be split and
failed over to multiple backup nodes.
Cluster administration: The basic definition of a cluster has long invited
contentious debate in both marketing and academic circles. The one concept
agreed on by all relates to the fundamental requirement for a single-system
image: the ability to view and operate the cluster as if it were a single virtual
server. From a client's perspective, a cluster implementation should be
transparent and require no special modification to client software or
hardware. From an operator standpoint, administration should involve a
single point of interaction, and management tools should hide the
implementation details of multiple servers as much as possible.
Disaster recovery: Many clustering packages depend on the ability for systems to
share disks, since backup nodes need to access the same data used by primary
nodes. However, most shared-storage configurations constrain the distance
between nodes to the maximum length of I/O channels such as SCSI or FC,
which at best extend to campus ranges of a thousand yards or so. Disaster-
recovery configurations allow nodes to be separated by geographically
significant distances, measured in miles or even continents. These greater
distances protect systems from outages that affect entire sites, such as floods
or terrorist attacks.
Cluster filesystem: The cluster can share a single file system across multiple
nodes, both for data and for the operating-system code itself on the root file
system. This feature dramatically simplifies serviceability and manageability of
HA clusters.
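The difference between cascading and multidirectional failover, as described in the failover configuration item above, can be sketched as two small selection functions. The node names, the round-robin split, and the function names are assumptions made for illustration; real clustering packages apply their own placement policies.

```python
def cascading_target(backup_order, healthy):
    """Return the first healthy node in the configured backup order, or None."""
    for node in backup_order:
        if node in healthy:
            return node
    return None


def multidirectional_plan(workloads, backup_order, healthy):
    """Spread a failed node's workloads round-robin across all healthy backups."""
    targets = [node for node in backup_order if node in healthy]
    if not targets:
        return {}
    return {w: targets[i % len(targets)] for i, w in enumerate(workloads)}


# Hypothetical cluster: node-b, the primary backup, is also down.
healthy = {"node-c", "node-d"}
order = ["node-b", "node-c", "node-d"]
print(cascading_target(order, healthy))
print(multidirectional_plan(["db", "web", "mail"], order, healthy))
```

Both functions need more than two nodes to be useful, which is why support for larger clusters adds value in failover situations.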
STORAGE RELIABILITY AND SCALABILITY
Since data management represents a central function in most server
environments, operating systems must implement specific features to maintain
the integrity of storage. A journaling file system (JFS) provides two particularly
important storage reliability benefits by increasing the robustness of the file
system and reducing the time required to boot a system configured with large
amounts of storage after unplanned shutdowns. Journaling employs transaction-
based logging techniques similar to those of database systems. Before updating
any file system control information (i.e., metadata), the operating system enters
information concerning the update into a disk-based log. Only after the system
has confirmed that it has written the user data safely to disk does it attempt to
update the actual metadata. If the system loses power or otherwise fails during
the metadata update, the JFS can reconstruct the all-important metadata from
information in the log. In this way, file systems always move from one consistent
state to another, never attempting unsafe writes.
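The log-then-update sequence can be illustrated with a toy model. The sketch below keeps the journal and the metadata in memory and simulates a crash with a flag; the class, the field names, and the "applied" marker are hypothetical simplifications, not the on-disk format of any real journaling file system.

```python
class JournaledMetadata:
    """Toy metadata store that logs each update before applying it."""

    def __init__(self):
        self.log = []        # stands in for the disk-based journal
        self.metadata = {}   # stands in for on-disk file-system metadata

    def update(self, key, value, crash_before_apply=False):
        entry = {"key": key, "value": value, "applied": False}
        self.log.append(entry)          # 1. record the intended update in the journal
        if crash_before_apply:
            return                      # simulated power loss before the metadata write
        self.metadata[key] = value      # 2. update the metadata itself
        entry["applied"] = True         # 3. mark the journal entry complete

    def recover(self):
        """Replay logged updates that never reached the metadata; return the count."""
        replayed = 0
        for entry in self.log:
            if not entry["applied"]:
                self.metadata[entry["key"]] = entry["value"]
                entry["applied"] = True
                replayed += 1
        return replayed


fs = JournaledMetadata()
fs.update("inode 12 size", 4096)
fs.update("inode 12 size", 8192, crash_before_apply=True)
print(fs.recover(), fs.metadata)
```

Recovery only walks the journal rather than the whole file system, which is the reason journaling shortens boot time on systems with large amounts of storage.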
SERVICEABILITY ENHANCEMENTS
Enterprise systems administrators require a broad portfolio of tools to help them
service the operating system. They use these tools to harden the system against
failures (usually by performing postmortems on past failures) and to tune it for
optimal performance. Potential serviceability options include:
Checkpoint/restart: This capability allows the operating system to take
snapshots of a running application, including memory contents and register
values. When a server fails, the snapshot can be used after the server comes
back up to restore an application to its exact state at the time of failure.
Resource management: While standard UNIX provides disk quotas and per-
process resource limitations, some systems provide more advanced
management capabilities, such as allocating CPU and memory percentages by
user or user group.
Year 2000 validation: While most Year 2000 risks derive from shortsighted
designs in application code, users must also test operating-system functions
as a minimum level of protection. A number of vendors certify that their
operating systems work in Year 2000 conditions, in some cases referring to
assessments by independent organizations.
Enhanced core dump analysis: This capability provides tools for analyzing
application failures. When UNIX applications crash, they leave behind a core
dump file containing the state of the application at the time of failure.
Operating systems can provide enhanced abilities to analyze these files with
more system-specific detail than standard debuggers provide.
Efficient kernel dump: This capability provides tools for analyzing total system
failures. Extreme software failures can result in operating-system crashes. As
with application crashes, developers can examine dump files containing a
snapshot of the entire system memory at the time of failure. On high-end
servers configured with very large amounts of memory, especially 64-bit
systems, such files can grow significantly. Operating systems can make
analysis of such files more efficient by reducing the amount of data through
compression or elimination of irrelevant information.
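The two dump-reduction techniques named in the last item, eliminating irrelevant data and compressing the rest, can be sketched as follows. The page size, the rule that all-zero pages are irrelevant, and the function name are assumptions made for illustration; actual kernels choose what to save using their own criteria.

```python
import zlib


def write_selective_dump(pages, page_is_relevant):
    """Keep only relevant pages and compress them; return (blob, kept_page_numbers)."""
    kept = [num for num in sorted(pages) if page_is_relevant(num, pages[num])]
    payload = b"".join(pages[num] for num in kept)
    return zlib.compress(payload), kept


PAGE = 4096
# Hypothetical memory image: two free (all-zero) pages and one page with data.
memory = {
    0: b"\x00" * PAGE,
    1: b"kernel-stack" * 341 + b"\x00" * 4,
    2: b"\x00" * PAGE,
}
# Treat a page as relevant if it contains any non-zero byte.
blob, kept = write_selective_dump(memory, lambda num, data: any(data))
print(kept, len(blob), "bytes instead of", PAGE * len(memory))
```

On servers with tens of gigabytes of memory, skipping unused pages typically matters more than the compression ratio itself, since most of the saving comes from never writing those pages at all.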
AIX 4.3.3
AIX has processor resilience and partial memory resilience. If AIX discovers a
sick processor or memory block at boot time, the system turns off the defective
part and does not use it. This includes a situation when a processor encounters
too many recoverable errors, although recoverable errors for ECC memory are
not yet trapped in a similar manner. In any case, if the system is halted for a sick
processor or memory block, then the processor and block are turned off and not
used when the system reboots. However, AIX does not yet support any dynamic
reconfiguration, other than the ability to turn off processors using the
cpudisable command.
AIX achieves the highest overall rating for HA cluster features according to
DHBA's HA scorecard. While achieving competitive ratings in every area, AIX
breaks out with overwhelming advantages for Disaster Recovery/Remote Data
Replication. AIX does not yet offer a full root CFS, although it provides
networked file systems with its SP cluster hardware. In terms of storage
reliability, AIX includes a journal file system, albeit one that protects the integrity
of the file system as a whole, and not individual files.
AIX matches many of the serviceability and performance improvements
provided by competitors, including efficient core and kernel dump facilities, and
some checkpoint/restart capability. AIX does not offer dump analysis tools.
AIX's kernel dumps are selective in the data saved and can be compressed
on the fly when created. For example, a device driver that runs in kernel
mode can explicitly specify what it wants to dump, possibly querying
the device upon a dump for specific status information. IBM claims that such
device driver data typically runs about 0.25 MB per driver. IBM claims its dump
sizes are in general limited to less than 10% of real memory and less than 5% on
larger systems (>4 GB RAM), with 64 GB systems producing dumps typically
2 GB in size. In terms of checkpoint/restart capability, IBM's LoadLeveler allows
checkpoint/restart of a set of processes with parent-child relationships intact and
process IDs in place. This capability includes any kind of process run under
LoadLeveler, even Perl and shell scripts, without requiring explicit API hooks.
However, LoadLeveler is available only at additional charge, and does not include
more sophisticated multiple-process-family checkpoint/restart or socket
checkpoint/restart.
HP-UX 11.0
While HP-UX offers little in the way of dynamic reconfiguration, it is at the
forefront for resiliency features. When certain classes of faults occur at run-time,
HP-UX will trap and check for processor failure. If a processor has failed, HP-
UX will notify an administrator who can take the processor down while the
system stays up. HP-UX can also check memory at run-time to detect if single-bit
hard errors or repeating soft errors have occurred. If so, HP deallocates the
respective 4K page of memory to prevent a second and fatal bit error. (Note that
ECC memory detects, but does not correct, a second bit error.) HP-UX also logs
these errors for later analysis or reboot so the bad memory is permanently kept
offline throughout succeeding boot cycles. HP-UX 11 does not yet support
dynamic addition and removal of CPU, memory, or I/O devices.
HP-UX offers solid HA cluster features, taking third place according to DHBA's
HA scorecard. In particular, HP's clustering options offer leading cluster failover
configuration and detection/backup/recovery functions. In terms of storage
reliability, HP-UX includes a journal file system, albeit one that protects the
integrity of the file system as a whole, not individual files.
Miscellaneous reliability features include the ability to log all console error
messages to a file (syslogd), and core and kernel dump infrastructures
optimized for efficient saving of relevant information to disk. User core dumps
can be analyzed with WDB, HP's Windowed Debugger, which can attach to a
running or hung process or examine application core files. For
kernel core dumps, HP has an internal tool, /usr/contrib/bin/q4,
typically used only by HP field support or a few specially trained customers.
HP-UX lacks a checkpoint/restart tool, although HP provided one on past SPP-UX
based systems from its Convex division.
IRIX 6.5
While IRIX lags its competitors in traditional failover HA capabilities critical to
the commercial server market, SGI's focus on reliability for technical servers
shows. IRIX offers strong processor and memory resiliency, a unique integrated
checkpoint-restart capability, highly flexible dynamic page sizing, and a static
partitioning capability that is second only to Sun's Dynamic Domains.
IRIX offers strong resilience to processor and memory failures. IRIX can catch a
wide range of faults in user mode. When IRIX encounters an unrecoverable CPU
error, it runs a software routine to gather hardware diagnostics. If the failure did
not cause a crash, an administrator can stop scheduling processes onto the CPU
via the mpadmin command. Some processor errors are automatically recovered
from without crashing or administrator intervention, including cache errors in
the instruction cache, cache errors on clean data lines, and unrecoverable cache
errors for user data, in which case the running process is killed. IRIX can also use
those scheduler features to restrict a processor whose cache has too many
single-bit (correctable) errors. If the failure did cause a crash, the CPU is disabled after
rebooting. The power-on test also detects the failed CPU and automatically
disables it. If ECC-correctable errors grow beyond a certain threshold, SGI says
IRIX can deallocate those pieces of memory so they are not used by the
operating system for further read/write operations. IRIX also offers alternate
pathing of network and disk I/O, providing automatic failover to a second
already-configured Ethernet or SCSI adapter. Despite strong resilience to failure,
SGI does not yet provide any support for dynamic reconfiguration of CPUs,
memory, or I/O devices.
Finally, IRIX stands out for its hardware processor-partitioning capability, a
reliability enhancement that allows several operating-system images to run on a
single large server, ensuring that in the case of operating-system or hardware
failure in one partition, other partitions will remain up. While the partitioning is
inferior to Sun's Dynamic Domains capability (which allows partition
sizes to be shrunk or expanded at run-time rather than at reboot), this capability
is available on SGI's whole Origin 2000 product line. While the ability to run
different versions of the operating system and to tune partitions can essentially
be accomplished using a cluster, those clusters cannot aggregate their compute
resources to form a single-system image upon reboot as the SGI processor
partitioning scheme allows.
SGI has made significant strides forward in its HA failover offerings, largely
catching up to other major UNIX vendors. In 1999, SGI extended its FailSafe
product with version 2.0, from a dual-node-only failover solution to one that
provides eight-node multidirectional and cascading failover. FailSafe now includes
disaster recovery and remote data replication capabilities over two-kilometer
Ethernet or FC connections. Management of the cluster can occur within a
single GUI environment called IRIS Console, and preconfigured failover scripts
are available for a wide range of server scenarios, including web, email, NFS, and
Samba file serving, and Oracle or Informix database serving. Still, SGI currently
lacks software RAID 5 support or a cluster filesystem to enable single-system
image clusters. A cluster filesystem is currently under development, however.
IRIX supports efficient kernel dumps, allowing both minimal dumps and
compressed dumps. These dumps can then be analyzed (in compressed form) by
a software tool (icrash) and by the hardware diagnostic processor (FRU), which
will perform an automatic analysis of the kernel dump and will generate a list of
probable causes, ordered by percentage likelihood that a specific item is causing
the failure. The crash information can also automatically be sent back to SGI
over the network or dial-out modem for further analysis or response. IRIX does
not support compression for user-level program dumps, since unlike full system
dumps, they generally are not multi-gigabytes in length.
SGI also provides availmon as a standard part of IRIX embedded in the system
boot and shutdown processes. The availmon utility differentiates between
controlled shutdowns, system panics, system hangs, power cycles, and power
failures. Uptime is tracked by a lightweight daemon, and diagnostic information is
collected from icrash, syslog, hinv, versions, and gfxinfo. All availability and
diagnostic data for cooperating systems are maintained in an SGI database (a
check-box is presented upon installation asking if users want to send failure
information to SGI over the Internet). This database provides SGI with overall
reliability data and a specific problem history for individual machines. While it
remains unclear whether IRIX can log console error messages to a file, IRIX
systems do provide unattended reboot capability in case of system failure in an
attempt to restore services.
SGI has integrated checkpoint-restart into the kernel and bundles it
beginning with IRIX 6.5, a further improvement over its previous unbundled
Hibernator II offering. The kernel integration allows restart of kernel-level state
such as open sockets, MPI jobs, open files, and open file descriptors.
Checkpoint-restart acts as the technical-server counterpart to HA
failover options popular for commercial servers. Checkpoint-restart capabilities
make the most sense for long compute or batch jobs, while HA failover is best
suited for transactional environments with many small jobs. IRIX is also
unusually advanced in its ability to support dynamic page sizes. Not only can page
size be specified on a per-process basis at run-time, but IRIX allows processes to
change their page size while executing or even to have multiple page sizes within
the same process.
SOLARIS 7
While Solaris 7 offers only minimal resiliency, with the ability to disable failed
CPUs only on reboot, it is at the forefront of Dynamic Reconfiguration
functions. Solaris supports I/O reconfiguration and Alternate Pathing on most
of its server product line, providing a unique differentiation relative to all studied
products. These functions enable online repair and reconfiguration of CPUs,
memory, and I/O as follows:
Dynamic Reconfiguration: enables HA by allowing a system administrator to dry
up defective server components such as CPUs, memory, and I/O without
application interruption by off-loading processes. Then, Sun's hot-plug
hardware capability allows the defective component to be replaced without
creating any electrical problems. This reduces both planned downtime (e.g.,
for upgrades) and unplanned downtime (e.g., for component failures). Note
that while Solaris 7 includes operating-system support for these functions,
only Sun's Enterprise 10000 servers currently have the necessary hardware
support for all classes of reconfiguration.
Alternate Pathing: allows an I/O path to be redirected transparently to
applications, allowing a server to adapt to I/O device failure.
Solaris 7 has a modest standing on DHBA's HA scorecard and is notable
primarily for its software RAID capabilities. Sun does not yet support a CFS,
although one is expected in 2000. On Sun's Enterprise 10000 servers, Solaris
supports a unique approach to resource management through Dynamic System
Domains (DSD). DSDs permit subdividing a server into subsystems, each with
their own copy of the operating system and the ability to operate completely
isolated from the other parts of the system. That is, if one domain crashes, the
remaining domains continue to function.
The key distinction from cluster-in-a-box alternatives, in which multiple
computers are simply configured in a single enclosure, is the ability to shift
partition boundaries dynamically. For example, administrators can deploy a
number of smaller domains during business hours that are assigned to individual
departments, and can consolidate the domains into a single large SMP during
after-hours batch processing. Up to 16 domains may be created. Finer-grained
resource management within a domain requires Sun Resource Manager, which is
covered in the system-management section later in this paper.
Solaris supports at least two miscellaneous serviceability features: efficient core
dumping via compression and a dump analysis tool. Solaris lacks other features
such as efficient kernel dumps and checkpoint/restart capabilities. Solaris
provides storage reliability through a journaling file system, although it only
protects the integrity of the file system structure and not the integrity of the files
themselves.
TRU64 UNIX 5.0
Tru64 UNIX provides moderate dynamic resilience and reconfiguration
functions. As with AIX, operators can take processors offline, with Tru64 UNIX
5.0 adding the capability to take the boot processor offline and the ability for the
processor to be taken offline automatically in case of internal cache errors. When
a processor failure does lead to a crash, Tru64 UNIX will reboot unattended with
the failed CPU disabled. In terms of memory resilience, at boot time Tru64
UNIX can detect single-bit memory errors and deallocate pages. At run-time,
however, it will simply log errors without deallocating failed components. High-
end Tru64-based servers can be partitioned into multiple operating-system
images.
Tru64 UNIX 5.0 also supports some alternate pathing functions for both storage
and network peripherals, over SCSI, Ethernet, and TruConnect busses. With the
proper redundant configurations, Tru64 UNIX will automatically switch from
using a failed component to a redundant component in the following areas:
SCSI multipath: for SCSI disks in a multipath configuration, if an in-use path
fails, then the system will reroute requests to another path. The failed path is
put into a state where it is tested for viability. The time between each test of a
failed path increases over time until some maximum delay is reached.
NETRAIN: allows a set of network adapters to represent a logical network
device. If the active adapter in the set fails, then one of the other adapters is
designated as the active adapter.
Memory Channel: provides data redundancy.
Distributed Raw Disk (DRD): provides multiple paths to SCSI disks, tapes, and
media changers that are on a SCSI bus that is shared between multiple
systems in a Tru64 UNIX cluster. If the path to a SCSI device fails on one
system, the DRD will route the requests through another system in the
cluster. For tapes and media changers, the application must be involved with
the rerouting because of the characteristics of these device types. However,
for SCSI disks, the rerouting of the I/O requests is transparent to the
applications.
Logical Storage Manager (LSM): provides redundancy of data across multiple
disks.
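The SCSI multipath behavior described in the list above, in which requests are rerouted to a surviving path and a failed path is retested at growing intervals up to a maximum, can be sketched as follows. The doubling factor, the delay values, and the path names are assumptions made for illustration; the report states only that the interval increases until some maximum delay is reached.

```python
def retest_delays(initial=1.0, factor=2.0, maximum=60.0):
    """Yield successive waits (in seconds) between viability tests of a failed path."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def pick_path(paths, failed):
    """Route I/O to the first configured path not marked failed, or None."""
    for path in paths:
        if path not in failed:
            return path
    return None


schedule = retest_delays()
print([next(schedule) for _ in range(8)])
print(pick_path(["scsi0", "scsi1"], failed={"scsi0"}))
```

Capping the delay keeps a long-dead path from being probed constantly while still allowing it to return to service within a bounded time once repaired.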
Tru64 UNIX ranks competitively for HA cluster features according to DHBA's
HA scorecard, achieving particularly strong ratings in cluster administration
facilities. Most notably, after years of development effort, Tru64 UNIX 5.0 now
optionally includes a clustered file system as part of the TruCluster Server 5.0
software.
While sold separately from the base operating system, Tru64 UNIX 5.0's CFS
marks a significant technological advance that remains unmatched by its
competitors. A cluster file system (CFS) provides a single file system image across
a cluster, partially delivering a single-system image. A CFS enables simplified
administration, because program and configuration files need not be duplicated
and maintained on all nodes. Furthermore, a CFS enhances the single-system
image presented to cluster users with a single file system. Without a CFS,
separate configuration file copies must be maintained on every node through
manual or automatic propagation of changes.
The CFS also simplifies operating-system installation, since only a single install is
needed. Nodes share a common file system, allowing access to a node's disk and
its configuration information even if the node is down. The CFS also reduces the
burden of writing start and stop scripts, a task previously required in cluster
environments where applications had to be spawned on multiple nodes.
One implication of a CFS is that processes must have unique process-ID
numbers across the cluster, since many applications create temporary files that
depend on the process ID to create a unique filename. Tru64 UNIX 5.0 enables
a cluster-wide namespace for processes by incorporating a node ID into the
process ID, yielding a guaranteed unique process ID. Tru64 UNIX also provides
cluster aliasing, allowing all the nodes in a cluster to be accessed from a single IP
name and address. Thus, the cluster appears as a single-system image to systems
outside the cluster.
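As a rough illustration of how a node ID can be folded into a process ID to make it unique across a cluster, consider the sketch below. The bit widths and function names are hypothetical; the report does not describe the actual Tru64 UNIX encoding.

```python
NODE_BITS = 8    # assumption: room for up to 256 cluster nodes
LOCAL_BITS = 19  # assumption: per-node process IDs below 2**19


def cluster_pid(node_id, local_pid):
    """Combine a node ID and a node-local PID into one cluster-wide PID."""
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node_id out of range")
    if not 0 <= local_pid < (1 << LOCAL_BITS):
        raise ValueError("local_pid out of range")
    # The node ID occupies the high bits, so equal local PIDs never collide.
    return (node_id << LOCAL_BITS) | local_pid


def split_pid(pid):
    """Recover (node_id, local_pid) from a cluster-wide PID."""
    return pid >> LOCAL_BITS, pid & ((1 << LOCAL_BITS) - 1)


def temp_name(pid):
    """Typical application idiom: build a temporary filename from the PID."""
    return "/tmp/app.%d" % pid
```

Two processes with the same local PID on different nodes now produce different temporary filenames on the shared file system, which is the collision the text describes.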
In terms of storage reliability, only Tru64 UNIX offers a data-journaling file
system, which continues to represent a powerful differentiator unmatched by any
of the other studied systems. Not only is the file-system structure protected from
corruption, but files themselves are protected from being left in an unknown
state after a crash.
Tru64 UNIX offers a competitive set of serviceability enhancements. The
Compaq Crash Analysis Tool automates on-site crash dump analysis, although a
license for that software requires a warranty/service contract. The tool can create
a signature of the crash and compare that signature to a database of other crash
signatures and the faults that caused them. Thus a match can yield a quick
diagnosis. The latest database can be downloaded from a service center. Compaq
also includes system hardware-analysis tools for logging and correlating hardware
failures or expected failures using Tru64 UNIX 5.0's new event-management
tools (covered in the system-management section later in this paper). Compaq
also bundles a Revision and Control Management Tool (RCM) with the Compaq
Crash Analysis Tool. RCM acts as a hardware and software revision checker,
reading part numbers and comparing them to a pre-approved configuration
topology. RCM takes a snapshot of the status of hardware and all components
bundled with the operating system and can compare them online to a database of
previous states for that system stored in a Compaq support center.
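The signature-matching idea behind crash analysis can be sketched as follows. This is only an illustration of the general technique (reduce a crash to a stable fingerprint, then look it up in a table of known faults); the inputs, the use of SHA-1, and all names are assumptions and do not describe how the Compaq tool actually computes signatures.

```python
import hashlib


def crash_signature(panic_string, stack_frames, depth=5):
    """Fingerprint a crash from its panic message and top stack frames."""
    top = stack_frames[:depth]  # deeper frames vary too much to be useful
    text = panic_string + "|" + ">".join(top)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def record(database, panic_string, stack_frames, diagnosis):
    """Store a known fault under the signature of the crash it caused."""
    database[crash_signature(panic_string, stack_frames)] = diagnosis


def diagnose(database, panic_string, stack_frames):
    """Return the stored diagnosis for a matching crash, or None."""
    return database.get(crash_signature(panic_string, stack_frames))
```

A locally cached database of this kind can be refreshed from a central service, which corresponds to downloading the latest signature database as described above.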
SYSTEM MANAGEMENT
[Bar chart: system-management ratings for IRIX 6.5, Solaris 7, HP-UX 11.0, AIX 4.3.3, and Tru64 UNIX 5.0 on a scale from 5.00 (Fair) through 6.00 (OK), 7.00 (Good), and 8.00 (Very Good) to 9.00 (Excellent)]
SUMMARY
Tru64 UNIX 5.0 ranks first for system management, thanks to very strong
operating-system-management functions, which now match the strength of AIX,
the long-time leader in this area. Tru64 also provides very strong heterogeneous
management capabilities, bundling the ability to host Windows NT network-
authentication functions. AIX 4.3.3 follows closely with the strongest hardware
management, supporting plug-and-play configuration of RS/6000 hardware and
peripherals and strong remote manageability based on its web-based system
manager. HP-UX 11.0 achieves average ratings across most functional areas, but
still trails in heterogeneous management and interoperability functions. Solaris 7
stands out for its leading resource-management capabilities, but has yet to catch
up with leaders for operating-system-management functions. IRIX 6.5 leads in
storage management and has strong resource-management tools, but relatively
weak operating-system-management and remote-manageability functions.
SYSTEM MANAGEMENT CRITERIA
As UNIX systems become increasingly complex and users deploy them in ever-
more-critical enterprise roles, UNIX system management has become a key area
of differentiation. UNIX system-management functions now fall into the
following classes:
Operating-system state management: GUI tools and infrastructure that facilitate
software, patch, and driver installation, operating-system configuration, and
event management.
Hardware state management: Plug-and-play management of peripherals.
Storage peripheral management: Provides such capabilities as disk, volume, and file
system management; online file-system backup capabilities; and tape
management.
Resource management: Provides the ability to limit usage of system resources by
user or application.
Remote manageability: Enables remote operating-system access, template-based
installation across multiple servers, and web-based administration.
Heterogeneous management and interoperability: Enables heterogeneous management
across UNIX, Windows NT, NetWare, and Linux platforms.
FIGURE 4: System Management Functional Ratings
OPERATING-SYSTEM MANAGEMENT
UNIX has historically had notoriously poor system management. Most UNIX
systems require administrators to hand-edit a large and dispersed set of cryptic
configuration files stored in the /etc directory, a crude and error-prone
process. Hand-editing has been simplified in many cases through the
development of GUI management tools, allowing newer administrators to
employ a "recognize and point" approach that is easier to learn than the old
"remember and type" approaches.
Management of dispersed files has gone through three phases:
1. De facto locations for configuration information, in /etc for system
information and dot files in users' home directories.
2. Centralized registries that store all system information in a single file,
manipulated and searched with database-like queries.
3. Scanning and parsing tools that capture the state of existing /etc files and
abstract away management tasks through a query-and-modify interface.
These approaches offer varying tradeoffs for the administrator, as described in
the following table:
State Management            Pros                                      Cons
Approach
--------------------------  ----------------------------------------  --------------------------------
/etc files store state      Familiar to experienced administrators.   Requires learning curve for
                            Can modify by hand for installation on    administrators.
                            remote machine.                           Files have differing definitions
                            Corruption easier to fix by hand due to   of white space and comments.
                            ASCII format.

Registry stores state       Can store past state for version control  Corrupt registries can require
                            of configuration changes.                 complete reinstall.
                            Removes configuration file parsing
                            burden from application developers.

Tools query and modify      Maintains all the benefits of /etc files  Lacks version control.
/etc state                  above.
                            Also removes configuration file parsing
                            burden from application developers.
                            Also simplifies development of further
                            management tools.

TABLE 1: State Management Tradeoffs
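The third approach, tools that parse existing /etc files and expose a query-and-modify interface, can be sketched in a few lines. The example assumes a simple "key=value" file with "#" comments, which is only one of the many formats found under /etc; the function names are illustrative and do not correspond to any vendor's tool.

```python
def query(lines, key):
    """Return the value stored for key, or None if the key is absent."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = stripped.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def modify(lines, key, value):
    """Return a new list of lines with key set to value.

    Comments, blank lines, and unrelated settings are passed through
    untouched, so the file stays hand-editable.
    """
    result, replaced = [], False
    for line in lines:
        stripped = line.strip()
        name, sep, _ = stripped.partition("=")
        if sep and not stripped.startswith("#") and name.strip() == key:
            result.append("%s=%s" % (key, value))
            replaced = True
        else:
            result.append(line)
    if not replaced:
        result.append("%s=%s" % (key, value))
    return result
```

Because the file itself remains the store of record, an administrator can still repair it with a text editor, which is the benefit the table attributes to this approach. Nothing here keeps earlier versions, matching the "lacks version control" drawback.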
As networks have expanded across organizations, lowly administrative chores
such as restarting a printer queue or handling a backup procedure often fall to
less-experienced (and less expensive) system administrators. Some operating
systems enable administrative role delegation for functions that normally require
broad administrative privileges, so full trusted access to the entire network does
not need to be granted to every low-level administrator. The ability to safely
delegate such limited authority allows more experienced administrators to avoid
spending their time being interrupted by trivial tasks.
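A minimal sketch of role delegation follows, assuming a simple model in which each role carries a fixed set of permitted tasks. The role names, task names, and users are invented for illustration and are not taken from any of the reviewed operating systems.

```python
# Hypothetical roles: each maps to the administrative tasks it may perform.
ROLE_TASKS = {
    "print_operator": {"restart_print_queue"},
    "backup_operator": {"run_backup", "verify_backup"},
}

# Hypothetical assignment of users to roles.
USER_ROLES = {
    "pat": {"print_operator", "backup_operator"},
    "sam": {"print_operator"},
    "root": {"admin"},
}


def may_perform(user, task):
    """Return True if the user holds a role that permits the task."""
    roles = USER_ROLES.get(user, set())
    if "admin" in roles:
        return True  # full administrators are not restricted
    return any(task in ROLE_TASKS.get(role, set()) for role in roles)
```

Here "sam" can restart a print queue without ever holding the root password, while a task such as adding users remains reserved for full administrators.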
Another challenge relates to operating-system state management. Administrators
have to contend with a continuous cycle of updates to the operating system and
applications, as well as a stream of patches that address particular issues. If an
installation of software or a patch causes problems to the system, the ability to
roll back (back out of) those changes to the system can be a significant time-
saving feature for system administrators.
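The rollback capability can be pictured as an undo log: before a patch changes anything, the prior state of each affected component is saved so that it can be restored later. The sketch below models installed components as a dictionary of versions; the class and its methods are hypothetical and greatly simplified compared with a real patch manager.

```python
class PatchLog:
    """Track component versions and allow the latest patch to be backed out."""

    def __init__(self, state):
        self.state = dict(state)  # component name -> installed version
        self._undo = []           # stack of (patch_id, saved prior versions)

    def apply(self, patch_id, changes):
        """Install a patch, remembering what each component looked like before."""
        saved = {name: self.state.get(name) for name in changes}  # None = absent
        self._undo.append((patch_id, saved))
        self.state.update(changes)

    def rollback(self):
        """Back out the most recent patch and return its identifier."""
        patch_id, saved = self._undo.pop()
        for name, version in saved.items():
            if version is None:
                self.state.pop(name, None)  # component was added by the patch
            else:
                self.state[name] = version
        return patch_id
```

After `apply("P1", {"libc": "1.1", "newtool": "0.1"})`, a call to `rollback()` restores the earlier `libc` version and removes `newtool`, leaving the system as it was before the patch.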
EVENT MANAGEMENT
One of the most tedious problems with administering UNIX systems for
availability and security is that logs of various system services get stored all over
the system. In addition, each of these logs tends to have its own peculiar format
for describing a particular event. In 1999, several UNIX vendors introduced
the ability to track, view, and notify parties about system events with a unified
interface, marking a major improvement in UNIX capability.
Event management is tremendously useful for administrators because it provides
them with a single console for tracking the following types of events on the
system:
Disk full
Disk fails
CPU error
System panic
Configuration change
Subsystem started/stopped
Application started/stopped
Application error
Repeated failed login
System logs tend to grow rapidly, in some cases swelling rapidly enough to eat up
all available disk space. Furthermore, the logs, which are scattered throughout the
system, grow at different paces. Because a single log captures events as
they occur, old logs can be cleaned up and removed frequently. This allows the
event manager's log to serve as the only log that must be stored for debugging
purposes; its size can be more easily tracked and backed up as needed.
Also, event management simplifies tracking of system events, allowing issues to
be addressed more proactively. Previously, such problems might have been
ignored until they generated a crisis. Event management brings a quantum
improvement in the ability to monitor a UNIX system.
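The core of a unified event interface is normalization: each service's peculiar log format is translated into one common record that a single console can filter and display. The sketch below assumes two made-up line formats (a syslog-style line and a bracketed application line); neither the formats nor the field names come from any vendor's event manager.

```python
import re
from collections import namedtuple

Event = namedtuple("Event", "source severity message")

# Hypothetical syslog-style line: "Mar  3 10:15:02 node1 login: message"
SYSLOG_RE = re.compile(
    r"^\w{3}\s+\d+\s[\d:]+\s\S+\s(?P<src>[\w/-]+):\s(?P<msg>.*)$"
)
# Hypothetical application line: "[ERROR] diskmon - message"
APP_RE = re.compile(r"^\[(?P<sev>\w+)\]\s(?P<src>\w+)\s-\s(?P<msg>.*)$")


def normalize(line):
    """Translate one raw log line into a common Event record."""
    match = APP_RE.match(line)
    if match:
        return Event(match.group("src"), match.group("sev").lower(), match.group("msg"))
    match = SYSLOG_RE.match(line)
    if match:
        # This syslog-style format carries no severity; default to "info".
        return Event(match.group("src"), "info", match.group("msg"))
    return Event("unknown", "info", line)


def console(lines, severity=None):
    """Return normalized events, optionally filtered by severity."""
    events = [normalize(line) for line in lines]
    if severity is None:
        return events
    return [event for event in events if event.severity == severity]
```

Once every source is reduced to the same record, notifying an administrator about, say, all "error" events becomes a single filter rather than a separate script for each log file.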
While event management has been somewhat possible with such heterogeneous
system-management tools as HP OpenView, Tivoli TME10, and CA Unicenter,
integrating this capability within the operating system itself allows a greater range
of system-specific information to be gathered and tracked. This capability further
enables event management for a much broader range of administrators, because
they would otherwise need to purchase, install, and configure a complex
framework. Complex frameworks still retain their value, however, for managing
networks of hundreds of systems or of heterogeneous systems.
HARDWARE STATE MANAGEMENT
In addition to the usual tasks of maintaining user accounts and application
software, routine administration of server environments typically involves adding
and replacing processor cards; memory banks; disks; adapters for I/O and
networking; terminals; printers; and other hardware. Managing the state of
connected hardware as administrators repair and upgrade systems typically
involves several phases:
Physically installing and connecting the hardware;
Reflecting the state of installed hardware at a low level (i.e., in firmware);
Updating the operating system with appropriate device drivers; and
Making hardware resources available to applications.
In the past, performing system upgrades typically involved in-depth
understanding of hardware architectures and at least a cursory knowledge of the
operating system's innards. This meant that only expensive field support
technicians or in-house experts were up to the job. As more systems begin to use
industry-standard parts for disk, peripherals, and memory, however, the emphasis
has shifted. Now, developers seek to simplify overall system maintenance so
relatively untrained personnel can perform upgrades and simple repairs. Server
administrators now covet the plug-and-play simplicity long offered by systems
such as the Apple Macintosh.
STORAGE PERIPHERAL MANAGEMENT
Most UNIX systems allow disks to be partitioned into smaller logical volumes.
However, file systems and individual files remain limited to a size no larger than
individual disks, which becomes a problem for data-intensive applications such as
databases, CAD/CAM/CAE, and image processing. Storage-management tools
such as Logical Volume Managers (LVM) overcome this limitation by allowing
the creation of a "virtual disk" or "volume" made up of one or more physical
disks. Combining several disks to form a logical volume can increase capacity,
reliability, and/or performance. Unlike the more primitive filesystem and
physical-partition approaches, logical volumes also often allow
administrators to manipulate them online, without requiring a reboot.
LVMs manage disks in terms of logical volumes, not physical ones. The LVM
runs as a layer beneath the basic file system, translating requests for logical disk
volumes into physical device commands. Acting as an interpreter, the LVM can
represent several small disks as one large virtual disk (disk spanning), or one large
disk as several smaller disk partitions (disk partitioning). Thus, large files can span
multiple disk units. Other software RAID capabilities such as parity checking and
mirroring can also be incorporated automatically as part of the abstraction
provided by the LVM.
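The translation step for disk spanning can be shown with a small sketch: a logical block number on the virtual disk is mapped to a physical disk and a block offset on that disk. It assumes simple concatenation of whole disks and measures sizes in blocks; real volume managers work with extents and richer metadata, so this is an illustration rather than any vendor's algorithm.

```python
def locate(disk_sizes, logical_block):
    """Map a logical block of a spanned volume to (disk index, block on disk).

    disk_sizes lists the capacity of each physical disk in blocks, in the
    order in which the disks are concatenated to form the volume.
    """
    if logical_block < 0:
        raise IndexError("negative block number")
    offset = logical_block
    for index, size in enumerate(disk_sizes):
        if offset < size:
            return index, offset
        offset -= size  # skip past this disk and continue with the next one
    raise IndexError("block beyond end of volume")
```

For a volume built from disks of 100, 50, and 200 blocks, logical block 120 resolves to block 20 on the second disk, so a file larger than any single disk can still be addressed as one contiguous range.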
More importantly, perhaps, sophisticated LVMs add the ability to move volumes
to different physical locations and to extend volumes if not enough space is
initially allocated. The system can accomplish both while the volumes remain
online and in use. Without the added layer of abstraction provided by volume
management, many of these operations require the system to be shut down and
rebooted, increasing the need for planned downtime to reconfigure the system.
Volumes can also be shrunk, but face a significant limitation in some cases.
Operating systems that do not support shrinking a file system will still require a
backup/reconfigure/restore operation for the data on that file system.
REMOTE MANAGEABILITY
As enterprises depend more and more on networks, the IT infrastructure
becomes more distributed, dramatically increasing the number of servers that
they deploy. Large enterprises routinely disperse servers geographically, in some
cases across different continents and time zones. Thus, it becomes increasingly
important to have the capability to effectively manage operating systems
remotely. If an enterprise depends on a thousand servers, it is simply not feasible
to maintain a thousand system administrators locally.
A number of techniques have emerged to help manage operating systems
remotely, including:
Remote operating-system access: Since the operating system controls all server
functions, administrators must be able to communicate with it remotely.
Ideally, a remote administrator should be able to use the system as if he or
she were physically next to the hardware. Remote interaction might occur
over character-oriented sessions (as if the administrator were using a local
ASCII terminal) or via a distributed GUI (with graphics and keyboard/mouse
events being passed back and forth based on the native look-and-feel of the
environment being managed).
Web-based system management: Some systems can be managed remotely across
networks from any web browser using mechanisms such as Java. Java's user
interface widgets closely match those of mainstream Windows widgets,
offering management tools that are relatively intuitive to inexperienced users.
Template-based installation: The template approach employs a cookie-cutter
method, in which a template server is created and tested, then replicated
across multiple servers using some distribution mechanism. This technique
incurs a cost, because the server that hosts the template is not put to
productive use. However, it has the advantage of allowing administrators to press a
standard server, which might be idle or used for low-priority tasks, into
service if a critical server crashes. By changing the configuration of the
replacement server, administrators can make it into a replacement for the
critical server. This ability provides tremendous flexibility for managing
systems. As a further step, administrators may automate the update of
common parts so they can ensure all servers are in fact identical. Making the
servers identical allows administrators to guarantee that the backup server
will act as the critical server once they change the configuration. In addition,
if a critical server starts behaving in a problematic fashion, administrators can
use an identical server to replicate the problem, rather than having to take the
critical server out of service.
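One way to check that replicated servers really are identical to the template is to compare file digests. The sketch below represents each server as a mapping from path to file contents and reports every path that differs; the function names and the choice of SHA-256 are assumptions made for illustration, not features of any reviewed installation tool.

```python
import hashlib


def manifest(files):
    """Return a mapping of path -> SHA-256 digest for the given file contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def drift(template_files, server_files):
    """List paths that are missing, extra, or different relative to the template."""
    template = manifest(template_files)
    server = manifest(server_files)
    # A path counts as drift if either side lacks it or the digests disagree.
    return sorted(
        path
        for path in template.keys() | server.keys()
        if template.get(path) != server.get(path)
    )
```

An empty result means the server matches the template, which is the condition under which a standby machine can be expected to behave like the critical server once its configuration is switched.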
RESOURCE MANAGEMENT
When every user depends solely on their own system to perform all their work,
or when each application runs on its own server, resource management is largely
a matter of making sure each person and application has the right-sized systems.
In UNIX environments, the disk space has traditionally been shared,
strengthening the need for disk quotas, but other resources have been partitioned
in a more ad hoc manner.
The ability to properly allocate resources has grown in importance in response to
three factors:
HA scenarios are being implemented more widely to avoid node failures that
can result in insufficient capacity to run a business's critical and non-critical
applications;
UNIX systems are growing increasingly capable of scaling to extremely large
workloads; and
Businesses are attempting to use these larger systems to cut management and
system costs through server consolidation.
In HA scenarios, resource management can ensure that critical applications
maintain the share of resources they need to meet business-critical performance
requirements. Resource management also enables administrators to more easily
control and reduce the impact of occasional runaway processes that would
otherwise take over the whole system, either accidentally or as a deliberate denial-
of-service attack. Such sudden usage spikes can be controlled online even before
the precise origin of the problem is located and fixed on a more permanent basis,
a useful troubleshooting capability.
Using new high-performance UNIX machines such as IBM's SP, HP's V-2500,
or Sun's Enterprise 10000, businesses have adopted server-consolidation
strategies, moving multiple applications onto a single large SMP system. Such
strategies employ two basic mechanisms: partitioning a large SMP server into
several smaller SMP servers (each with their own operating-system image and
configuration) and adding resource constraint code within the scheduler for
finer-grained control of specific resources within the SMP server.
Partitioning schemes can be either static or dynamic. Static partitioning implies
that the size of partitions can only be changed upon a reboot, as the system
stores partition information in a file accessed by the firmware at boot time.
Dynamic partitioning implies that the number of processors in each partition can
be changed at run-time without a reboot. This requires more sophisticated
implementations of the operating system (and sometimes of the applications), so
software can properly "dry up" and release resources, as well as recognize and
make use of newly added resources.
Finer-grained resource management within a server typically attempts to limit or
ensure adequate availability of very specific resources within a system through:
CPU management,
Memory management (both physical and virtual),
Network bandwidth management,
Disk I/O bandwidth management, and
Management of other resources (such as logins, connections, file descriptors,
or printer usage).
The granularity of such resource management varies; most tools provide the
ability to manage resources on a per-user, per-group, or per-application basis.
Some provide additional degrees of control such as minimum/maximum usage
or control by time of day, date, or time of month.
Effectively managing resource usage requires the ability to map business priorities
onto computing priorities, so resource-management tools come with a variety of
hierarchical controls for allocating resources. Just as UNIX's group mechanism
provides common security for a set of users or applications, resource-
management tools often allocate resources to a set of users and/or applications
known as a class. Resource management allows server consolidation to guarantee
each department or application a certain portion of system resources. Some
resource-management tools offer additional hierarchical mechanisms called tiers,
which allow specific classes to receive priority for spare resources.
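The interplay of classes and tiers can be sketched with a simplified allocator. It assumes each class has a tier (lower number means higher priority), a share weight, and a current demand; capacity is split within a tier in proportion to shares, capped at demand, and whatever is left cascades to the next tier. This is a deliberately reduced model for illustration and does not reproduce the algorithm of any particular resource manager.

```python
def allocate(capacity, classes):
    """Divide capacity among classes, tier by tier.

    classes maps a class name to (tier, shares, demand). Shares must be
    positive. Capacity a class does not need is passed down to lower tiers
    rather than redistributed within its own tier (a simplification).
    """
    grants = {}
    remaining = capacity
    for tier in sorted({spec[0] for spec in classes.values()}):
        members = {name: spec for name, spec in classes.items() if spec[0] == tier}
        total_shares = sum(shares for _, shares, _ in members.values())
        available = remaining  # what higher-priority tiers left over
        for name, (_, shares, demand) in members.items():
            entitled = available * shares / total_shares
            grants[name] = min(entitled, demand)
            remaining -= grants[name]
    return grants


# Example: two tier-0 classes and one tier-1 class sharing 100 CPU units.
example = allocate(100, {"db": (0, 3, 60), "web": (0, 1, 10), "batch": (1, 1, 100)})
# db receives 60, web receives 10, and the spare 30 cascades to batch.
```

The example shows the guarantee described in the text: the consolidated database and web workloads get what they need first, and the batch class receives only the spare capacity.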
HETEROGENEOUS MANAGEMENT AND INTEROPERABILITY
Heterogeneous computing networks with a mix of UNIX and Windows
platforms are relatively common, and cross-platform interoperability and
management remain important criteria. Two additional platforms may require
interoperability and management in many environments, with Linux deployment
increasing and NetWare decreasing.
At the network protocol level, UNIX-Windows interoperability has become
relatively straightforward, thanks in part to the dominance of the Internet, which
is based on the TCP/IP protocol used by most UNIX systems, rather than the
older Windows-based NetBEUI protocol. As a result, UNIX-Windows
interoperability issues have largely shifted to the service level (i.e., the ability to
share file, print, and application resources across both platforms).
Historically, accessing UNIX files and printers from PCs required each client to
have extensions that worked on UNIX terms, such as Sun's PC-NFS software,
a cumbersome arrangement that incurred significant software costs and
administration burdens. Now, a number of products exist that enable a UNIX
server to act as a file and print server for Windows clients using Microsoft's
native Server Message Block (SMB) protocol. When configured with these
systems, Windows clients can access UNIX files and printers transparently using
their native protocols. UNIX servers simply appear in the Windows Network
Neighborhood as virtual Windows NT servers.
SMB, a networking protocol defined by Microsoft, Intel, and IBM, allows
machines running DOS, Windows, and OS/2 to share files and printers across a
network. Like UNIX's NFS, SMB is a high-level protocol supporting remote file
operations. It resides above the transport protocols (NetBEUI, TCP/IP,
IPX/SPX) that manage the transfer.
SMB was originally designed to run over Microsoft's NetBIOS protocol, but is
now supported over TCP/IP. SMB effectively acts as a Remote Procedure Call
(RPC) specialized for file systems. A redirector packages SMB requests into a
Network Control Block (NCB) structure that can be sent over the network to a
remote device. The network provider listens for SMB messages destined for it
and removes the data portion of the SMB request, so that a local device can process
the request. Several mechanisms have emerged that allow UNIX systems to share
files and printers with Windows-based systems using SMB, including Totalnet
Advanced Server (TAS), Samba, and Advanced Server for UNIX (AS/U).
TAS, a commercially developed product from Syntax Inc., allows the seamless
introduction of UNIX servers into a variety of PC-oriented networks using their
native protocols. TAS can provide file, print, and application services to clients in
Windows NT, IBM OS/2, Apple Macintosh, and Novell NetWare networks.
Samba is an open-source implementation of the SMB protocol for UNIX.
Samba includes the following components:
An SMB server for providing Windows NT and LAN Manager-style file and
print services to SMB clients running on Windows 95, Windows NT, OS/2
Warp Server, and others.
A NetBIOS name server, which supports the browsing capabilities required
to make UNIX servers appear in the Windows Network Neighborhood.
An ftp-like SMB client that allows UNIX users to access PC disks and
printers.
A tar extension to the SMB client to back up PCs from UNIX systems.
Command-line tools that support some of the Windows NT administrative
functions and can be used on Samba or Windows NT.
Samba 2.0 can use a Windows NT Primary Domain Controller (PDC) for user
authentication in exactly the same way a Windows NT system does, allowing a
Linux system to be a client member of a Domain. Samba 2.0 does not yet
provide the ability to host PDC services for other Windows NT clients, however.
Advanced Server for UNIX (AS/U), a commercially developed product,
implements full Windows NT network services on UNIX platforms. Unlike
Samba and TAS, AS/U uses code licensed from Microsoft rather than reverse-
engineered techniques. In fact, AS/U derives from the same networking code
used by Windows NT itself. AT&T secured a license from Microsoft to port the
same code to UNIX kernels and now resells it to third-party OEMs.
AS/U's implementation gives it a major advantage over TAS and Samba, allowing
UNIX systems to host Primary Domain Controllers (PDCs), which are used to
maintain Windows NT's Directory Service (NTDS) and network-authentication
protocols. PDC support allows UNIX systems to take over a number of
administrative functions in Windows NT-centric environments. Among these
functions is the ability to authenticate network logins by Windows clients using
Windows NT's native security protocols. With AS/U, Windows NT
administrative infrastructures can potentially be rehosted entirely on UNIX
servers. (Without PDC support, users must continue to maintain Windows NT
servers for managing user information.)
Many desktop systems continue to rely on Novell NetWare for file- and print-
sharing and thus depend on Novell's IPX/SPX protocols to access remote
resources. To support such systems without reconfiguration, UNIX systems can
emulate NetWare services using a number of add-on packages. For example,
Novell supplies versions of NetWare for UNIX on a number of platforms. Syntax
TAS also provides NetWare compatibility so that UNIX systems appear as virtual
NetWare servers.
Due to the growing move of software developers to the Linux platform, Linux
interoperability has grown in importance for conventional UNIX vendors.
Vendors have pursued various approaches: Tools like lxrun can provide the
ability to run Linux binaries on other UNIX systems with the same chip
architecture by converting system calls on the fly. Support for GNU tools such
as the gcc compiler also enhances interoperability. Compatibility for desktop
users benefits from porting Linux's GUI interfaces to conventional UNIX
platforms.
AIX 4.3.3
AIX long ago set the standard for user-friendly UNIX system management with
its GUI-based Systems Management Interface Tool (SMIT), which allows the
operating system to be configured interactively and comprehensively. In AIX 4.3,
IBM addressed remote manageability with its web-based System Manager, a Java-
based system-management GUI containing many of the functions of SMIT. The
web-based System Manager delivered the first UNIX administration interface
that resembled Windows closely enough to enable users familiar with Windows
to work easily in the new environment. The widgets (e.g., controls and menu
structures) employed in the interface closely match those of Windows 95 and
Windows NT 4.0: folder tabs for navigating hierarchies of configuration
information, tool tips that explain items underneath the cursor, and controls such
as check-boxes and data-entry boxes that mimic Windows. More importantly, the
Java implementation allows remote management of AIX systems across networks
from any Java-enabled browser.
The web-based System Manager provides users with platform independence in
the management of their AIX systems. It can be used to manage AIX systems in
several different ways:
on a single AIX system to manage that local system;
on an AIX system to remotely manage another AIX system;
on properly configured PCs or other clients to remotely manage an AIX
system through a Java 1.1 and AWT 1.1-compliant web browser.
Web browsers that support Java 1.1, including AWT 1.1, are needed to run the
web-based system manager from a browser. While the web-based System
Manager itself does not provide administrative role delegation, that capability is
available via SMIT. An administrator can go through the SMIT menus and check
off any features they would like to delegate to another user. AIX's Network
Installation Manager (NIM) addresses the requirement for creating system-
software images from pre-assembled templates.
From the start, AIX also dealt remarkably well with the issue of detecting and
installing hardware changes automatically and transparently. The AIX Object
Data Manager (ODM), while not a standard UNIX function, acts as a registry
(akin to that found in Windows NT today) that manages all AIX configuration
information. When an RS/6000 server boots up, a complex detection and
configuration mechanism registers new hardware and makes sure necessary
device drivers are reflected in the ODM tables. As long as the proper device
drivers are available to the operating system, AIX device management comes as
close to plug-and-play as any UNIX system.
AIX 4.3.3 adds to AIX's traditionally strong storage management (LVM) with
online JFS backup and support for RAID 0+1 (concurrent mirroring and
striping).
AIX 4.3.3 also introduced the AIX Workload Manager (WLM) for resource
management. WLM manages CPU and memory usage, but not disk bandwidth or
network bandwidth. WLM has a sophisticated hierarchical management
infrastructure that supports both classes and a set of tiers that allow spare
resources to cascade to different sets of classes. However, outside of WLM's
fine-grained resource management, AIX does not provide any capability to
partition SMP servers into multiple independent operating environments.
AS/U runs on AIX, but must be obtained from a third party, Groupe Bull. AIX
Fast Connect for Windows is an add-on feature of AIX that supports high-
performance SMB file-serving but not PDC authentication. Developed entirely
by IBM, it can be ordered for a flat fee (currently $1,500, with no per-client
license fee) for any copy of AIX 4.3.2 or above. A time-restricted version of AIX
Fast Connect is shipped on the Bonus Pack at no charge. AIX Fast Connect has
support for all HACMP modes, including mutual takeover, as of September 1999.
The default mode of AIX Fast Connect handles all forms of password
authentication supported by AIX itself, including local, NIS, DCE, and LDAP.
This default mode (plain text password) is the same as used by Samba and greatly
simplifies password administration for AIX users. AIX does not support
synchronization of passwords across UNIX and Windows NT systems.
The AIX 4.3.3 Bonus Pack also includes an evaluation version of Novell
Network Services 4.1 for AIX, Version 2.2.1. The evaluation package contains:
A two-user license for NetWare File and Print Services and
A two-user license for Novell Directory Services (NDS) that supports all
functions except directory replication, which is automatically added with
additional user licenses.
IBM research has been developing an application to allow AIX systems to run
PowerPC Linux binaries (similar to lxrun), but for now, AIX's Linux
compatibility mainly consists of shipping the GNU C compiler with the
operating system.
HP-UX 11.0
HP-UX's System Administration Management (SAM) tool supports most routine
system-management tasks interactively, with both full-screen terminal and Motif
(X-windows) interfaces. SAM provides medium-granularity administrative
delegation, allowing a system administrator to map a user or group to a particular
set of sub-areas within the SAM interface, delegating the ability to carry out those
tasks without knowing the root password.
Remote management typically requires traditional shell tools or X-windows. A
native web-based system-management tool has not yet arrived on HP-UX,
although HP points out that customers can buy GraphOn's Java implementation
of X-windows and thus run SAM through a web interface. HP also offers two
hardware solutions for remote management:
The Secure Web Console that plugs into the RS-232 port of a server and
connects to the LAN. This tool provides a low-security browser console
interface for managing HP-UX systems and is bundled with N-, L- and A-
class servers.
A Consolidated Hardware Console, which is a Windows NT server that
connects up to 224 servers via their RS-232 ports and provides a
management console to remote browsers using Secure Sockets Layer (SSL).
HP-UX has not adopted a registry approach like Windows NT or AIX, but HP-
UX 11 introduced a DMI (Desktop Management Interface) repository that
contains information used for kernel configuration. Eventually, HP expects this
DMI repository to fetch and store all sorts of operating-system and software-
configuration information:
Hardware components: physical configuration, host file systems, tuning
parameters, volume groups, routing definitions, NIS configurations, system
contact information and
Software components: software locations, software-bundling information,
product contents, control files, and fileset dependencies.
HP also offers an Event Monitoring Service (EMS), delivered at no charge as
part of the operating system (coincidentally, HP also typically includes EMS as
part of its MC/ServiceGuard HA offerings). Like Tru64 UNIX's event
management, HP's EMS provides a unified framework and user interface for
system-wide logging and notification. While the full suite of capability is optional,
HP bundles the hooks for EMS into HP-UX, and they are used by
implementations of certain HP-UX features, such as dynamic processor
resilience, which is an EMS monitor. Third parties such as Oracle have built EMS
into some database products, so that database errors are linked into the EMS
console via the same communications mechanisms used by other EMS
components. Device drivers require additional code to communicate device
failures to EMS.
HP now provides patch and application rollback through its Software Distributor
product, since patches are distributed as Software Distributor bundles. While
Software Distributor does not explicitly provide apply-and-commit options for
keeping and removing old system state, it preserves such state information,
allowing it to be removed manually by organizations that have a consistent apply-
and-commit approach to software installation.
HP-UX 11.0 also includes Ignite/UX, a software distribution mechanism that
allows operators to create a "golden" system image of a complete HP-UX
environment (including kernel settings, volumes, file systems, and applications)
that can then be rolled out to many distributed systems.
HP-UX supports dynamically loadable device drivers and some auto-
configuration of devices detected at bootup. HP-UX scans the I/O card space at
each boot and, if new hardware is discovered, the system tells the Operator
Console what it found, permitting an automated driver installation to be manually
invoked from SAM. The system does not presume the new hardware should be
added just because it is present. This makes addition a conscious, if simple,
decision by the system administrator and prevents hardware changes without the
superuser's consent.
HP's storage-management options are fairly strong. HP's MirrorDisk/UX
provides logical volume management on the root disk as well as data disks,
mirrors data on up to three volumes to protect against disk failure, supports dual
I/O paths between disks and systems with automatic failover to the second path,
allows backup of a mirrored logical volume from a second system (configured
with read-only access to the shared disk) for availability, and provides an atomic
split function that ensures time stamps for multiple mirrored volumes will be
identical when taken offline. HP's optional Online JFS allows online disk
defragmentation, online file-system expansion (contraction must be done offline),
and online backup when used in conjunction with NetBackup or OmniBack.
NetBackup from Veritas and OmniBack from HP perform block-level
incremental backups, with NetBackup offering a broader feature set.
HP offers an unbundled option for resource management in HP-UX, the
Process Resource Manager (PRM), which can allocate CPU, real (but not virtual)
memory, and disk I/O bandwidth usage on a per-application, per-process, or per-
user basis. Usage limits within PRM, typically controlled by percentages, can be
varied according to time and date. While PRM supports allocation among groups
of users akin to "classes," it does not support management hierarchies like the
"tiers" provided in AIX WLM. Unlike AIX WLM, however, PRM configuration
can be modified without requiring a reboot. HP-UX does not provide other
forms of resource management such as static or dynamic partitions with multiple
operating-system images on a large SMP server.
In terms of platform interoperability and management, in 1999 HP shipped 32-
bit and 64-bit versions of AS/U for HP-UX 11, providing solid Windows file,
print, and authentication services. HP has dropped NetWare for UNIX with HP-
UX 11. While HP lacks the cross-platform management and synchronization
tools available under Tru64 UNIX, HP does provide HA scripts for AS/U
failover; Samba scripts are working in HP's labs and represent a likely target for
release as a future product.
IRIX 6.5
Reflecting SGI's strong graphics heritage and appeal to non-technical users,
IRIX became a pioneer in wrapping UNIX's operating-system management
into a truly attractive, highly interactive package.
In IRIX 6.5 SGI added RoboInst, a new tool that provides four types of
automated network installations: initial installations of IRIX 6.5, updates to IRIX
6.5, unbundled software products, and patches. RoboInst can also automate tasks
including re-partitioning disks, creating a file system, installing software, and
executing shell scripts. A related GUI tool, Software Manager, can be used to
install software on single systems whenever a CD with software is loaded in the
CD-ROM drive. Software Manager also allows software or patches to be
uninstalled, a new capability in IRIX 6.5 that even rolls back the actions of
scripts that change configuration files.
IRIX uses traditional /etc files for storing configuration information and
traditional UNIX logs for event monitoring or management, lacking more
sophisticated advances. Some IRIX device drivers are dynamically loadable, while
others are not. While SGI has begun to address some distributed system-
management issues, it still lacks cutting-edge features such as a Java
implementation for management from non-X Window System environments
(i.e., Windows platforms). Furthermore, SGI has not spent enough time to date
on the issue of simplifying hardware management; on the whole, it resorts to
traditional UNIX methods that depend on user expertise.
SGI offers IRISConsole, an enhanced X Window System-based system-
management tool. SGI also bundles its EnlightenDSM tool with every server,
which provides management of users and groups across the network via
directory services such as NIS+. In addition, IRIX supports CA Unicenter and
HP OpenView system-management frameworks, bundling the Unicenter
framework and an OpenView MIB. With Trusted IRIX, SGI can delegate root-
level tasks to non-root users using privilege assignment lists. Trusted IRIX 6.5 is
available and does not significantly lag the mainstream IRIX release schedule,
a delay that has traditionally been a problem with such high-security releases.
Regular IRIX supports a least-privilege mechanism through the implementation
of POSIX P1003.1e Draft 15 capabilities.
The superuser privileges have been broken out into a set of distinct capabilities
which can be granted and relinquished through a set of inheritance rules. IRIX
supports three capability styles: Traditional Superuser, Augmented Superuser
(traditional root and non-root processes can access root capability), and No
Superuser (root uid is irrelevant, only capability settings matter). Selected
capabilities include:
addition and removal of swap space,
ability to use the uadmin(2) call to reboot or shut down a system,
ability to manipulate the scheduler (relevant especially for real-time
programs),
ability to modify disk quotas, and
ability to override file mode read and search access restrictions when
accessing an object (e.g., for system backup).
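The least-privilege idea behind this capability model can be sketched in a few lines of Python. This is only an illustration: the capability names and the `can` helper below are hypothetical and are not actual IRIX identifiers or APIs. The sketch models how a process holding a narrow capability set differs from a traditional superuser that holds all of them.

```python
from enum import Flag, auto


class Capability(Flag):
    """Hypothetical capability names modeled on the list above."""
    SWAP_MGT = auto()         # add and remove swap space
    SHUTDOWN = auto()         # reboot or shut down the system
    SCHED_MGT = auto()        # manipulate the scheduler
    QUOTA_MGT = auto()        # modify disk quotas
    DAC_READ_SEARCH = auto()  # override read/search restrictions


# A traditional superuser effectively holds every capability at once.
ALL_CAPABILITIES = Capability(0)
for cap in Capability:
    ALL_CAPABILITIES |= cap


def can(held: Capability, needed: Capability) -> bool:
    """Return True when every needed capability is in the held set."""
    return (held & needed) == needed


# A backup operator only needs to read files it does not own.
backup_operator = Capability.DAC_READ_SEARCH
```

Under this model, `can(backup_operator, Capability.SHUTDOWN)` is false even though the operator can read any file, which is the kind of separation a single all-powerful root account cannot express.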
IRIX was first in offering strong resource management with ShareII, which
provided the ability to restrict or ensure availability of CPU time, system
memory, and other resources such as connect time, logins, and printer/plotter
usage. In 1999 SGI stopped selling the ShareII product, however, and has
focused development efforts on IRIX job limits that map more closely to the
historic management approaches available on Cray's UNICOS. Job limit
functionality does not ship until 1Q00, however, and is not included in this
study's findings. Instead, IRIX depends on a newer version of its "pset" or
processor set capability, which traditionally has been able to restrict applications to a specific
group of processors within the system. In IRIX 6.5, SGI extended this capability
to recognize and specify the exact physical locations of CPUs within a processor
group (now called a "cpuset") to take advantage of the lower latencies of
communication with physically adjacent CPUs, memory, and I/O devices in
SGI's ccNUMA servers. SGI enhanced cpusets to ensure that memory allocated
for jobs comes first from the same node board as the CPU, and that memory can
be reserved for exclusive use of a cpuset. Overall, however, the cpusets provide
substantially less control and less ease-of-management than the previous ShareII
product.
IRIX's Priority I/O feature can still guarantee disk bandwidth from the raw disk
level through the operating-system and file-system level to meet application-level
requests. In other words, IRIX can guarantee throughput to its XFS file system
and can reserve and schedule I/O bandwidth for a specific I/O channel, a facility
particularly useful in media-serving applications, where dropping video frames is
unacceptable. Static, partition-based resource management is available on all
Origin 2000 servers, and SGI continues to support traditional cluster-oriented
scheduling and resource-management tools such as the Load Sharing Facility
(LSF) and Network Queueing Environment (NQE).
For meeting heterogeneous platform interoperability and management needs,
SGI offers Windows file and print services as an unbundled option in its Samba
implementation for IRIX. Because it uses Samba, IRIX lacks the Primary
Domain Controller (PDC) capability necessary to authenticate Windows NT
logins. Although HA failover scripts are available for SGI's Samba
implementation, the synchronization and cross-platform account-management
tools like those found in Tru64 UNIX are not available on IRIX.
SOLARIS 7
Historically serving highly technical users on its workstations, Sun has only
recently begun to address ease-of-use criteria in system management as part of its
move to focus more on commercial user requirements. Solaris long had a
somewhat primitive GUI-based system-management tool, admintool, which
lacked the breadth of SMIT in AIX or SAM in HP-UX, covering only the basics
of adding and deleting user profiles, printers, host names, serial ports, and
software.
More recently though, Sun has begun to focus on more advanced system-
management tools and has delivered several improvements. With Solaris 7, Sun
introduced a new GUI-based system-management tool called Solaris
Management Console, a point-and-click administration tool for Solaris that
provides a centralized integration point for Solaris system administration and
management tools. The console is configurable and extensible, allowing
integration of system-management applications based on a variety of
development methods, including the X-Window system, scripts, Java, and
HTML.
Another tool, WebStart, provides a Java-based GUI tool for installing system
software and software add-ons, providing both ease of use and remote
manageability. Sun also provides a central dialog for access to its unbundled
system-management products called AdminSuite. AdminSuite provides a unifying
framework for grouping Sun's unbundled system-management tools, but it falls
outside of the scope of this evaluation, which primarily assesses single-system
management tools rather than enterprise system-management frameworks.
Finally, Sun offers SyMon, a GUI-driven tool optimized for management of
distributed Solaris servers.
Sun has handled its hardware management cleverly with its OpenBoot firmware
mechanism. When a Sun server is switched on, the OpenBoot Programmable
Read-Only Memory (PROM) system activates, allowing the system to be
controlled even before Solaris boots up, for the purpose of diagnostics and
other operations. Enterprise users routinely configure the boot PROM to be
accessible remotely through a serial port, so that pre-boot problems can be
resolved even if the network is inoperable. OpenBoot also tracks the presence of
installed hardware and forwards setup information to the Solaris operating
system, which can then configure itself appropriately. Reconfiguration is not
quite as transparent as in AIX, since administrators have to manually force
reconfiguration with a special command, but the integration between OpenBoot
hardware and Solaris software clearly provides added value.
The Solaris JumpStart mechanism, which resembles HP's Ignite/UX, allows
operators to create a template of a Solaris environment, including all necessary patch
updates, which can then be rolled out to many distributed systems. However, patch
and software rollback capability and administrative role delegation are not yet
standard on Solaris. Solaris does not include an event-management or event-
monitoring framework, although enterprise frameworks are available.
Sun's optional Solaris Resource Manager v1.5, based on technology used in the
ShareII product, provides CPU and virtual (not physical) memory management.
While disk I/O bandwidth is not controlled, version 1.5 integrates network-bandwidth
management that previously was available as a separate product. Network
bandwidth can be allocated by application protocol (e.g., HTTP or NFS), for
traffic both inbound and outbound. Very fine-grained controls are available,
allowing bandwidth to be managed even on a per-URL basis by intercepting and
examining network packets. It also provides a rich set of miscellaneous resource-
management controls, allowing flexible control over the maximum number of
logins, connections, or processes. Resources can be managed by user, user group,
process, or process groups, with additional scripting required for changing
resource limits over different time periods. SRM provides class-based
management, without support for cascading excess resources to different tiers.
Management parameters can be changed on the fly without a reboot. On
Enterprise 10000 hardware, Solaris also allows resource management to be
performed through Dynamic Domains, which allow multiple operating-system
images (and applications) to be isolated from each other. These partitions can
grow and shrink without a reboot.
In terms of platform interoperability and management, Sun's PC NetLink
product, formerly known as Project Cascade and based on AS/U technology,
provides both file and print sharing, as well as PDC authentication services. PC
NetLink is bundled with all one- to eight-processor Sun servers, along with an
unlimited client license. NetWare interoperability is provided by an older
product, SunLink PC, which is essentially a repackaged version of Syntax's TAS
package. Solaris interoperability with Linux is also particularly strong: a tool
called lxrun in Solaris 7 provides the ability to run Linux binaries (of the same
chip architecture) under Solaris. Support for lxrun comes from a third party,
however. Some GNU tools for Solaris are also available on Sun's website.
TRU64 UNIX 5.0
While Tru64 UNIX V4 introduced Compaq's GUI management tools, Tru64
UNIX 5.0 provides a better underlying architecture and Java-based interfaces for
remote management. The improved architecture benefits the system's more
flexible and attractive range of management tools.
While operating systems such as AIX and Windows NT store all information in a
binary registry, Tru64 UNIX adopts a slightly different framework that better
maintains a traditional UNIX approach. The /etc files remain, but the system
accesses them via a single framework (internally known as MCL) that contains
component definitions for each file and how it can be read and modified. The
MCL creates a common data model that cleanly maps to and can be exported to
SNMP MIBs. This model can also be exported in future releases to LDAP
directories and the Common Information Model (CIM) of the cross-platform
Web Based Enterprise Management (WBEM) initiative.
Another piece of the framework provides a structured set of APIs that is
independent of a particular user interface. These APIs can be used for building
management tools in past, present, and future user-interface paradigms. This
piece of the framework is known as the Sysman User Interface Toolkit, or SUIT,
and all of Compaq's UNIX system-management tools are built on top of it.
Together, MCL and SUIT form a strong architectural alternative to AIX's Object
Data Manager (ODM) repository for storing configuration information. Both
provide a central mechanism for accessing configuration data, both provide
technology for cloning systems, and both provide a framework for multiple types
of management-user interfaces. The MCL/SUIT approach better maintains the
existence of the /etc files familiar in traditional UNIX administration. The
ODM stored-registry approach allows for more sophisticated rollback of system-
configuration changes and automatic loading of detected device drivers.
The major tool administrators employ is SysMan, which contains a set of
command-line flags that allow it to launch alternatively as an X11/Motif GUI
(the default), a full-screen curses ASCII interface, a command-line
interface, a Java interface, or as an interface within Compaq's Insight Manager
tool (available on PCs). The Java interface is known as SysMan Station and can
run either in a browser or as a standalone Java application.
While Tru64 UNIX's Motif-based GUI has a hierarchical, menu-oriented
structure that resembles IBM's SMIT and HP's SAM, the Java-based SysMan
Station provides a more graphically sophisticated, icon-based approach. In
general, the management function available in each user interface appears
identical. However, SysMan Station provides some additional real-time graphical
monitoring of the system, depicting the components in a system hierarchically
and highlighting those that encounter errors or fail. Graphical monitoring even
extends to monitoring multiple nodes with shared busses in cluster scenarios.
Thus users can perform cluster management with the same set of tools as single-
system management, a situation not now possible with many competitors' cluster
offerings.
Tru64 UNIX 5.0 provides web and SNMP integration in the Compaq Enterprise
Management framework. Included in the 5.0 release are Compaq Insight Manager
Agents for Tru64 UNIX, which provide local- and remote-management
capabilities through a dedicated, corporate-wide HTTP port. Working in
conjunction with hardware and firmware, the agents export system information;
monitor various system components such as CPU, memory, and I/O devices;
track storage, networking, and environmental components such as fans and
power supplies; and also provide information on CPU and file-system utilization.
The agents include a sophisticated SNMP-to-HTML rendering engine that can
provide management data for dynamic display using smart Java scripts. The
agents broadcast their services to other agents on the network, allowing users to
discover and monitor systems in the enterprise from any system with the
Compaq Insight Manager agents.
Tru64 UNIX includes a feature called division of privileges within the SysMan
tool interface, allowing delegation of a full range of typical root actions to
particular users or user groups without requiring them to have the superuser
password. The granularity of this delegation is moderate, delegating not at the
individual task level, but delegating authority in 14 different areas of
responsibility, including:
Network Management
Network Configuration
Mail Management
Mail Configuration
Printer Management
Printer Configuration
Event Management
Event Configuration
Host Management
Process Management
File Management
Power Management
Keyboard Configuration
Security
Tru64 UNIX's Event Manager provides a centralized mechanism for gathering,
storing, distributing, and acting upon events occurring in the system. The system
can post events directly to a common data store or can gather events via built-in
daemons that monitor traditional UNIX logs. Events can trigger notifications to
applications (via an API) or administrators (via email). In addition, events can be
listed and manipulated within GUI interfaces or text-oriented file formats and
commands. While experienced administrators can continue employing older
fragmented logs for advanced troubleshooting, the new tools reduce the learning
curve required for newer or part-time system administrators to keep up with
system operation.
Tru64 UNIX's hardware state management has improved in Version 5. Tru64
UNIX can now manage the state of Compaq devices transparently and load
device drivers automatically on startup, but non-Compaq devices still require
manual procedures. Tru64 UNIX's Remote Installation Services (RIS) provides
configuration and distribution of a master operating-system image and layered
software. While Tru64 UNIX cannot roll back application installation, its
dupatch tool in Version 5 can roll back patches. When a patch is installed,
dupatch compresses any files that are changed or removed and saves them so
a rollback can be performed later. Operating-system rollback is only partially
supported, through a new operating-system upgrade process (for example, if a
component is missing). Once the operating system is installed, however, one
can only restore the system to a backup of the old version.
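The save-then-restore technique described for patch rollback can be sketched as follows. This is a minimal Python illustration of the general approach (compress and keep every file a patch changes or removes, then restore on request), not the dupatch implementation. The function names, the `manifest.json` file, and the flat file layout are all assumptions made for the example.

```python
import gzip
import json
from pathlib import Path


def apply_patch(root: Path, backup: Path, changes: dict) -> None:
    """Install a patch, saving compressed originals first.

    `changes` maps a file name (flat, no subdirectories, by assumption)
    to its new bytes, or to None when the patch removes the file.
    """
    backup.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for name, new_data in changes.items():
        target = root / name
        if target.exists():
            # Compress and keep the original so it can be restored later.
            saved = gzip.compress(target.read_bytes())
            (backup / (name + ".gz")).write_bytes(saved)
            manifest[name] = "saved"
        else:
            manifest[name] = "created"  # rollback must delete this file
        if new_data is None:
            target.unlink(missing_ok=True)
        else:
            target.write_bytes(new_data)
    (backup / "manifest.json").write_text(json.dumps(manifest))


def rollback_patch(root: Path, backup: Path) -> None:
    """Return every patched file to its state before apply_patch ran."""
    manifest = json.loads((backup / "manifest.json").read_text())
    for name, state in manifest.items():
        target = root / name
        if state == "saved":
            saved = (backup / (name + ".gz")).read_bytes()
            target.write_bytes(gzip.decompress(saved))
        else:
            target.unlink(missing_ok=True)
```

The sketch also shows why this style of rollback covers patches more readily than whole applications or operating-system upgrades: it only works when the installer knows in advance exactly which files it will touch.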
In terms of resource partitioning, Tru64 UNIX provides support for static
partitioning on current hardware. Future eight-, 16-, and 32-node high-end GS-
series systems should support dynamic partitioning. Tru64 uniquely bundles its
own class scheduler, which provides a modest level of resource management,
with its operating system. The class scheduler allows CPU and real-memory
allocation across users. These allocations occur via minimum and maximum
percentages managed by user, user group, or process ID. While users or
processes can be grouped into classes, the scheduler does not support tiered
hierarchies of classes. Tru64 UNIX does not support disk I/O bandwidth or
network-bandwidth management except in the narrow case of ATM
infrastructures.
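As a rough illustration of allocation by minimum and maximum percentages, the Python sketch below divides 100% of CPU among flat (untiered) classes. It is not the Tru64 UNIX class scheduler algorithm, which this report does not describe. The function name, the class names, and the two-pass policy (satisfy minimums first, then hand out spare capacity in listed order up to each maximum) are assumptions chosen for simplicity.

```python
def allocate_cpu(classes: dict) -> dict:
    """Grant CPU percentages to flat classes.

    `classes` maps a class name to (min_pct, max_pct, demand_pct).
    Assumes the minimums add up to no more than 100.
    """
    granted = {}
    # Pass 1: every class gets its demand, up to its guaranteed minimum.
    for name, (min_pct, max_pct, demand) in classes.items():
        granted[name] = min(demand, min_pct)
    spare = 100.0 - sum(granted.values())
    # Pass 2: spare capacity goes to classes that still want more,
    # in listed order, never exceeding a class's maximum.
    for name, (min_pct, max_pct, demand) in classes.items():
        want = min(demand, max_pct) - granted[name]
        extra = max(0.0, min(want, spare))
        granted[name] += extra
        spare -= extra
    return granted


shares = allocate_cpu({
    "db": (40, 80, 90),     # wants 90%, capped at 80%
    "batch": (10, 50, 30),  # wants 30%, gets what is left
    "web": (20, 40, 5),     # lightly loaded, uses only 5%
})
```

With these inputs, "db" ends up at its 80% ceiling, "batch" receives the remaining 15%, and "web" keeps the 5% it asked for. A tiered scheduler would add a further rule about which classes see spare capacity first, which is the kind of hierarchy the class scheduler lacks.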
Unlike all other products, Tru64 UNIX bundles Advanced Server for UNIX
(AS/U), and version 5.0 further improves cross-platform management between
its UNIX and Windows NT environments, including single password capability
across UNIX and Windows NT environments. In other words, an application
that performs authentication on Tru64 UNIX Version 5.0 can transparently
support the use of Windows NT usernames and passwords. Applications that
write to Tru64 UNIX's Security Integration Architecture (SIA) APIs can easily
switch from employing traditional NIS or /etc/passwd databases to code that
authenticates Tru64 UNIX users against a Windows NT Primary Domain
Controller (running on a Windows NT server or on a UNIX server running
AS/U). Such applications currently available include, but are not limited to, ftp,
telnet, rlogin, dtlogin, rsh, and login. The single sign-on feature is included in
Tru64 UNIX partly because of Compaq's close relationship with Microsoft, since
the software underlying this feature has to understand the proprietary encryption
methods used by Windows NT's authentication process.
The approach taken by Tru64 UNIX has a drawback in that the current
implementation requires that applications use the Tru64 UNIX SIA APIs to
access the single sign-on capability. This requirement is not an issue for
applications provided with Tru64 UNIX, as in general they use SIA to make
authorization calls. However, for applications coded to other pluggable
authentication APIs (GSS-API, PAM, or SSPI) this requirement is an issue. For
such environments, Tru64 UNIX provides an alternative approach:
synchronizing passwords across Windows NT and UNIX systems.
Although Tru64 UNIX versions earlier than Version 5.0 could synchronize
UNIX and Windows NT passwords, an end user on a Windows NT system had
to launch a separately installed tool to replicate his or her password on the UNIX
system. Starting with Tru64 UNIX Version 5.0, a password change done on a
Windows NT system with Windows NT's default tools is automatically replicated
on a Tru64 UNIX system. Synching passwords changed on a Tru64 UNIX
system is not yet totally automated; a specific UNIX command must be launched
to replicate the UNIX passwords on a Windows NT system. However, Compaq
states that an administrator could easily automate the launch of this command by
adding it to a password-handling script.
Note that both the password sharing and password synchronization approaches
depend on Windows NT's Primary Domain Controller safely securing password
information. Experienced UNIX administrators may distrust this dependence,
given the long history of holes found in security mechanisms that are not publicly
disclosed and are relatively new. This risk is unavoidable, however, for
organizations that create heterogeneous UNIX and Windows NT networks.
Approaches such as allowing Windows NT applications to authenticate using
UNIX passwords (perhaps using Microsoft's SSPI pluggable authentication
interface) would require Microsoft acquiescence, which appears an unlikely
prospect.
Finally, Tru64 UNIX adds the unique capability to manage existing UNIX and
Windows NT user accounts (not just the passwords themselves) from the
standard set of Tru64 management tools. This allows adding and removing users,
managing printers, and modifying file-sharing privileges across platforms from a
single management console. NetWare interoperability remains available through
Pathworks software. Tru64 UNIX does not yet support the ability to run Alpha
Linux binaries.
INTERNET AND WEB APPLICATION SERVICES
Figure 5: Internet and Web Application Services Functional Ratings
[Bar chart rating IRIX 6.5, Solaris 7, Tru64 UNIX 5.0, HP-UX 11.0, and AIX 4.3.3 on a scale from 5.00 to 9.00 (Fair, OK, Good, Very Good, Excellent).]
SUMMARY
AIX 4.3.3 retains the lead for Internet and web application services, benefiting
from the strongest support for TCP/IP protocols and extensions; the strongest
Internet file, mail, and web services; and the richest set of e-commerce options.
HP-UX 11.0 follows, also offering a strong TCP/IP implementation and coupling
that with very good e-commerce options and Internet file, mail, and web
services. Tru64 UNIX 5.0 has a very strong JVM implementation and unique
support for Microsoft's DCOM distributed object protocol, but otherwise has
average capabilities. Solaris 7 places fourth overall, a surprising position for a
pioneering Internet company that has contributed so much technology to the
industry. In a rapidly growing arena where every player wants to be first, Sun's
choice to bundle its own web server with Solaris 7 rather than iPlanet or Apache,
a modest set of e-commerce offerings, and the lack of many TCP/IP extensions
hamper Sun's functional leadership. IRIX 6.5 includes a good set of bundled
Internet file, mail, and web services, but trails in most other areas.
INTERNET AND WEB APPLICATION CRITERIA
As the Internet has flourished, Internet-related services have become a prime
area for product differentiation among operating-system developers. DHBA's
assessment of Internet capabilities examines nearly 50 functions in the following
five areas:
Major TCP/IP extensions: support for IPSec, RSVP, DiffServ, IntServ, IP
multiplexing, and IP multicast.
Minor TCP/IP features and performance enhancements: over a dozen miscellaneous
TCP/IP stack features described further below.
Web application services: tools that can be used for development of web
applications, such as Java Virtual Machine technology, Object Request
Brokers, Microsoft DCOM support, and transaction-processing tools.
E-commerce tools and layered packages: availability of native middleware and
applications focused specifically on leveraging Internet technology for online
selling.
Bundled web, mail, and file services: a staple requirement for both Internet and
intranet servers.
In addition, this study evaluates some ancillary functions related to Internet
infrastructures, including directory services, network security, and virtual private
networking (VPN), which appear in the section on directory and security
services.
TCP/IP FEATURES
By their nature, operating systems form part of the backbone of computing
infrastructures. Thus, strong support for IP protocols, TCP/IP extensions, and
related tools represents a key part of a good Internet offering. While all studied
products support such standard dial-up Internet access protocols as Point-to-
Point Protocol (PPP) and Serial Line Internet Protocol (SLIP), a number of
extensions can add significant value. Major TCP/IP extensions are highlighted
for the additional capability they bring to the platform, enabling new or different
types of Internet applications that cannot effectively be implemented without
their presence. Such extensions include support for IPSec, RSVP, DiffServ,
IntServ, IP multiplexing, and IP multicast.
As enterprise infrastructures and the Internet become more enmeshed, secure
Internet services such as IPSec grow in importance. So that no one can intercept
information in transit, IPSec secures traffic that passes over the public Internet
by transparently encrypting IP packets on both transmission endpoints without
requiring support in intervening routers or any special application coding. Among
other benefits, IPSec represents an important component for enabling VPNs
(see the Directory and Security Services section, below).
Advanced IPv4 features include Resource Reservation Protocol (RSVP),
DiffServ, IntServ, and IP multicasting. RSVP can assign varying priority levels to
IPv4 packets, allowing networks to promise varying quality-of-service (QoS)
guarantees, assuming that intervening routers support RSVP. IP multicasting also
requires router support to reduce upstream bandwidth requirements so as to
deliver one-to-many IPv4 broadcasting capability for audio, video, software, or
data streams. IP multiplexing (not to be confused with IP multicasting) allows a
single system to be seen as multiple numeric IP addresses, even on the same
network interface card.
DiffServ is an emerging IETF standard that attempts to improve QoS capabilities
by increasing the type-of-service bits in a standard TCP/IP packet header from
three to six, and by defining the routing behaviors associated with those bit
patterns. DiffServ routers adopt the appropriate behavior indicated by the packet
and do not retain information about traffic flows. DiffServ is expected to be used
predominantly in IP backbone environments.
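The move from three to six type-of-service bits can be illustrated with a little bit arithmetic. The following is a minimal sketch, not any vendor's implementation: the function names and the small behavior table are hypothetical, and the only outside facts assumed are that the six-bit code point occupies the top six bits of the same header byte and that code point 46 is the standard value for expedited forwarding.

```python
def ip_precedence(tos_byte: int) -> int:
    """Original scheme: the top three bits of the type-of-service byte."""
    return (tos_byte >> 5) & 0b111


def dscp(tos_byte: int) -> int:
    """DiffServ scheme: the top six bits of the same byte."""
    return (tos_byte >> 2) & 0b111111


# Hypothetical behavior table for illustration only. A DiffServ router
# picks a forwarding behavior from the bit pattern alone and keeps no
# per-flow state.
BEHAVIORS = {
    46: "expedited forwarding",
    0: "best effort",
}


def classify(tos_byte: int) -> str:
    """Look up the forwarding behavior; unknown code points fall back."""
    return BEHAVIORS.get(dscp(tos_byte), "best effort")


if __name__ == "__main__":
    marked = 0xB8  # header byte carrying code point 46
    print(ip_precedence(marked), dscp(marked), classify(marked))
```

Because the decision depends only on the bits carried in each packet, the router needs no memory of earlier packets in the same flow.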
IntServ attempts to define how application services describe their bandwidth
and latency requirements, how this information can be made available to routers
(typically via RSVP), and how the appropriate quality of service can be tested and
validated. Unlike DiffServ, IntServ routers must classify packets based on several
IP packet header fields and maintain state information for each flow. With RSVP,
an application requests to reserve resources along a route from the source IP to
the destination IP. Routers along the path then approve or deny the request and,
if approved, reserve the appropriate resources.
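The approve-or-deny behavior described above can be sketched as a simple admission-control loop. This is an illustrative model only, not the RSVP wire protocol: the Router class, its capacity figures, and the reserve_path helper are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """A router that keeps per-flow state, as IntServ requires."""
    name: str
    capacity_kbps: int
    reservations: dict = field(default_factory=dict)  # flow id -> kbps

    def available(self) -> int:
        return self.capacity_kbps - sum(self.reservations.values())

    def reserve(self, flow_id: str, kbps: int) -> bool:
        """Approve the request only if enough bandwidth remains."""
        if kbps > self.available():
            return False
        self.reservations[flow_id] = kbps
        return True

    def release(self, flow_id: str) -> None:
        self.reservations.pop(flow_id, None)


def reserve_path(path, flow_id: str, kbps: int) -> bool:
    """Ask every router from source to destination to reserve bandwidth.

    If any router denies the request, reservations already granted
    along the path are released again.
    """
    granted = []
    for router in path:
        if not router.reserve(flow_id, kbps):
            for earlier in granted:
                earlier.release(flow_id)
            return False
        granted.append(router)
    return True


if __name__ == "__main__":
    path = [Router("edge", 1000), Router("core", 500)]
    print(reserve_path(path, "video", 400))   # fits on both routers
    print(reserve_path(path, "backup", 200))  # denied by the second router
```

The per-flow dictionary in each router is the state that DiffServ routers avoid keeping.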
Minor extensions either reflect performance enhancements (rather than
improvements in functionality) or features whose presence or absence affects
user capabilities only in small ways. Table 2 lists these enhancements.

TABLE 2: Overview of TCP/IP Functions and Extensions

                                                        AIX      HP-UX    IRIX     Solaris  Tru64
                                                        4.3.3    11.0     6.5      7        UNIX 5.0
Major TCP/IP Extensions
  IPSec                                                 Yes      Yes      No       No [1]   Yes
  RSVP                                                  Yes      Yes      Yes      Yes      Yes
  DiffServ                                              Yes      Yes      No       No       No
  IntServ                                               Yes      No       No       No       No
  IP Multiplexing (IP Aliasing)                         Yes      Yes      Yes      Yes      Yes
  IP Multicast                                          Yes      Yes      Yes      Yes      Yes
Minor TCP/IP Extensions and Tools
  TCP Selective Acknowledgment (SACK)                   Yes      No       Yes      Yes      Yes
  IPv6                                                  Yes      No [2]   No       No [3]   Yes
  ATM IP Switching                                      Yes      Yes      Yes      No       Yes
  Supernet CIDR Support                                 Yes      Yes      No       Yes      Yes
  SOCKS 5 Support                                       Yes      No       No       No       No
  Multilink PPP (server-side)                           Yes      Client   Yes      Yes      No
  IP (modem) Dial-up Tool                               Yes      Yes      Yes      No       Yes
Performance Optimizations
  Gigabit Ethernet Drivers                              Yes      Yes      Yes      Yes      Yes
  Ethernet bonding (multi-NIC), multi-threaded support  Yes      Yes      Yes      Yes [4]  Yes
  TCP Large Windows                                     Yes      Yes      Yes      Yes      Yes
  Zero Copy TCP/Hardware Checksum                       Yes      Yes      Yes      Yes      Yes
  Path MTU Discovery                                    Yes      Yes      Yes      Yes      Yes
  Path MTU Discovery over UDP                           Yes      Yes      No       No       Yes
  Open Shortest Path First                              Yes      Yes      No       Yes      Yes
  TCP/IP Gratuitous ARP                                 Yes      Yes      Yes      No       Yes

[1] IPv4 version available
[2] Early access version available
[3] Available via download
[4] Unbundled
WEB APPLICATION SERVICES
As web browsers become the primary entry point for a growing number of day-
to-day computing activities, developers have increasingly begun to explore
possibilities for segmenting application designs along web boundaries, i.e.,
shifting application logic from clients to web servers and implementing user
interfaces with HTML-based presentation layers. On the surface, the approach
delivers several benefits, including client independence (since web access is
supported by a wide variety of platforms) and geographic independence (the
ability to access both applications and data from any location). Further,
organizations that have long struggled to maintain huge networks of PCs sense
that they can potentially use a web-based application approach to ease their
management burden by centralizing applications and simplifying clients as much
as possible, which enables greater efficiency through increased economies of
scale, managed either by in-house IT operations or a new breed of Application
Service Providers (ASPs).
Simultaneously, the growing role of e-commerce and a profusion of other
emerging services available to the public on the Internet have resulted in the need
for vastly more complex applications that are deployed on servers and can be
accessed reliably by huge numbers of globally dispersed web clients. These trends
demand increasing support for the development of web applications, i.e.,
applications that have specifically been designed for deployment on web-based
infrastructures. While such tools have historically been offered as layered
products that can be hosted on diverse operating systems, aggressive operating-
system developers such as Microsoft have begun to fold rich application services
directly into the base of the operating system, increasing the pressure on UNIX
suppliers to follow suit.
While all operating systems provide some support for running Java applications,
each vendor has the opportunity to optimize Java by tying Java primitives more
closely to native system services and improving the choice and implementations
of algorithms used in Java execution. While Java benchmarks remain heavily
dependent on the power of underlying hardware, this evaluation examines which
versions of Java are supported and describes significant optimizations made in
the threading and memory management of each operating system's Java
implementation. Other emerging web application services include support for
object request brokers (ORBs), transaction processing, and Microsoft's
Distributed Component Object Model (DCOM) architecture.
E-COMMERCE TOOLS
The Internet represents the first medium to automate both marketing-on-demand
and sales-on-demand on the same platform without requiring human intervention for
each customer. As such, Internet technology offers a compelling and unique promise
of enabling e-commerce. However, the tools and infrastructure to fully enable that
vision remain relatively new. While front-end activities such as credit-card encryption
with SSL have become well-defined and widely accepted, back-end and business-logic
infrastructures remain in their infancy.
Nonetheless, bundled e-commerce capabilities are rife with opportunities for
differentiation. Although the variable cost per customer associated with e-
commerce is low, the startup expertise and integration required to develop a full
e-commerce solution suitable for businesses remain relatively high. DHBA's
assessment in this sub-category examines the degree to which vendors and their
platforms enable e-commerce solutions for customers.
Bundled or unbundled software tools that work with web servers have become
central to e-commerce middleware products. Vendors offer these packages of
tools to address a variety of needs, such as providing user interface templates,
performing site management, and (most importantly) improving security for safe
passage of critical information. In the absence of these applications, companies
could not realistically manage transactions efficiently, performance and reliability
would be severely compromised, and Internet buying could become a textbook
example of how unsafe the Internet can be.
Some vendors innovate their own application-level solutions; others provide a
combination of layered software and ISV products; and still others depend
almost entirely on ISVs. While this report mentions companies' partnerships
with third-party ISVs, DHBA currently gives vendors credit in the ratings only
for capabilities they provide or bundle themselves.
In a drive to keep sites fresh and manageable, many companies have begun
exploring user-interface tools that extend beyond the traditional page-at-a-time
authoring approach to tools that create a consistent, website-wide look-and-feel
based on a user interface template. Applications that create a template of product
information and replicate it for all the remaining web pages shrink formatting
time considerably and ease the burden of uploading new product information or
updating old information.
Beyond mass-exporting information in the same format, some e-commerce tools
have a complementary mass-import capability. The ability to load catalogs (i.e.,
product information) in large quantities, as opposed to loading a page of data at a
time, can significantly ease site development for businesses developing their first
e-commerce site.
No e-commerce site can function properly without appropriate site-
administration tools. These GUI tools provide a wide spectrum of information,
such as control and view access to product information, shopper groups, security
exposure, error detection and correction, statistical data, shipping information,
installation and configuration, addition/removal of supplementary software
products, etc. With such information, administrators can control the application
and content either locally or remotely and can take necessary action to address a
security compromise or fix errors.
Security concerns remain at the heart of all e-commerce applications; effective
e-commerce solutions must have a robust security model. Secure Sockets Layer
(SSL), a widely accepted security standard, encrypts all information between
client and server. All leading browsers and web servers support SSL. However,
some applications take the next step forward to support the Secure Electronic
Transaction (SET) standard, specifically designed to protect credit-card
transactions. SSL encrypts only users' credit-card and personal-identification
numbers, whereas SET also verifies the identities of the customer and the
merchant via a digital certificate. Applications built on SET or SSL use
encryption technology to handle credit-card authorization, verification, and
receipt, allowing credit-card transactions to be cleared automatically. Though
typically available through third parties, some system vendors provide their own
options to handle such tasks.
E-commerce applications usually need to access data in a variety of different
formats. Often this data resides in databases, requiring e-commerce applications
to support ODBC connectivity. Some also support JDBC, which allows Java
applications to access databases. Billing systems form another part of the e-
commerce chain, providing hard copy of transactions to both buyers and sellers,
reducing costs, and improving customer service by detailing relevant information.
A number of operating system-level enhancements can improve the performance
of e-commerce systems, including:
Dynamic page caching: Allows the system to dynamically cache frequently
accessed web pages in the memory of a server or gateway to speed
performance and provide up-to-date information to shoppers;
Bandwidth allocation: Allows I/O bandwidth to be
reserved or prioritized according to the type of Internet protocol (HTTP for
web traffic, FTP, or mail) or according to the location of a page or set of
pages within a given website;
Transaction prioritization: A more granular version of bandwidth allocation, this
feature can give priority to transactions (not just browsing) and even to high-
volume customers within the transaction stream; and
Encryption accelerator support: Since encryption can reduce the number of hits a
site can achieve by a factor of 10, offloading this task to hardware can
dramatically improve the performance and scalability of an e-commerce site.
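Transaction prioritization, as described in the list above, can be sketched with an ordinary priority queue. This is a minimal illustration under stated assumptions, not a description of any studied product: the RequestQueue class, the two request classes, and the high_volume flag are hypothetical names chosen for the example.

```python
import heapq
import itertools

# Lower rank is served first. Purchases outrank plain browsing.
PRIORITY = {"transaction": 0, "browse": 1}


class RequestQueue:
    """Orders pending web requests by class, then by customer tier."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker keeps arrival order

    def submit(self, kind: str, customer: str, high_volume: bool = False):
        # Within one class, high-volume customers move ahead of others.
        rank = (PRIORITY[kind], 0 if high_volume else 1, next(self._arrival))
        heapq.heappush(self._heap, (rank, kind, customer))

    def next_request(self):
        """Return (kind, customer) for the most urgent pending request."""
        _, kind, customer = heapq.heappop(self._heap)
        return kind, customer


if __name__ == "__main__":
    queue = RequestQueue()
    queue.submit("browse", "alice")
    queue.submit("transaction", "bob")
    queue.submit("transaction", "carol", high_volume=True)
    for _ in range(3):
        print(queue.next_request())
```

Plain bandwidth allocation would stop at the first element of the rank (the traffic class); the second element is what makes the scheme more granular.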
BUNDLED FILE, MAIL, AND WEB SERVERS
As the core service in Internet environments, bundled web servers assume
central importance. Mail services, while arguably equally critical, currently have
minimal differentiation across vendor implementations. Most operating systems
bundle POP3 and IMAP4 mail servers, as well as fairly recent versions of
sendmail. While all systems offer some form of file services such as the Network
File System (NFS) or Server Message Block (SMB), support varies for extensions
such as CacheFS, AutoFS, WebNFS, and CIFS.
AIX 4.3.3
IBM bundles and integrates Java with AIX, which runs it on startup and includes
its own just-in-time (JIT) compiler. In 1999, AIX added support for Java 2
version 1.2.2 with a JIT compiler developed by IBM's Tokyo Research
Laboratory, which brings a number of performance enhancements, including:
Mixed-mode interpreter (MMI) for faster application start-up. MMI selectively
compiles the most frequently executed methods.
Efficient exploitation of AIX native POSIX threads for improved scalability.
Improved heap management: the new JIT includes improved object allocation
heuristics for small objects, which is significant because Java applications tend
to use many small objects. It also introduced thread local heaps (TLH) for
quick allocation, allowing memory to be allocated from the TLH without
acquiring the heap lock. Reduced heap-lock contention also improves SMP
scalability. AIX has also refined Java's heap growth algorithm so growth is
based on factors such as free space in heap after garbage collection; percent
of time spent in garbage collection; expansion amount adapted based on size
of failing request; and avoidance of compaction when expansion is necessary
to fulfill a request.
Advanced garbage-collection heuristics to reduce pause times (e.g., compaction
avoidance and fragmentation reduction).
These optimizations, together with a few others in the mark-and-sweep garbage
collection, have resulted in reductions of 50% to 75% in pause times when
garbage collecting, according to IBM.
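The thread local heap idea mentioned above can be sketched in a few lines. This is a simplified model and not IBM's JIT code: SharedHeap, ThreadLocalAllocator, and the chunk size are hypothetical. The sketch only shows why small allocations rarely need the heap lock once each thread owns a private chunk.

```python
import threading


class SharedHeap:
    """Global heap guarded by a lock; hands out fixed-size chunks."""

    def __init__(self, chunk_size: int = 1024):
        self.chunk_size = chunk_size
        self.lock = threading.Lock()
        self.lock_acquisitions = 0
        self._next_free = 0

    def grab_chunk(self):
        """Take the heap lock once and return a (start, end) address range."""
        with self.lock:
            self.lock_acquisitions += 1
            start = self._next_free
            self._next_free += self.chunk_size
            return start, start + self.chunk_size


class ThreadLocalAllocator:
    """Serves small allocations from a per-thread chunk without locking."""

    def __init__(self, heap: SharedHeap):
        self.heap = heap
        self._local = threading.local()

    def allocate(self, size: int) -> int:
        """Return the start address of a block of `size` bytes."""
        if size > self.heap.chunk_size:
            raise ValueError("large objects are outside this sketch")
        top = getattr(self._local, "top", 0)
        end = getattr(self._local, "end", 0)
        if top + size > end:
            # Private chunk exhausted: the only point that takes the lock.
            top, end = self.heap.grab_chunk()
            self._local.end = end
        self._local.top = top + size
        return top


if __name__ == "__main__":
    heap = SharedHeap(chunk_size=1024)
    allocator = ThreadLocalAllocator(heap)
    for _ in range(128):
        allocator.allocate(16)
    print(heap.lock_acquisitions)
```

In this model, 128 allocations of 16 bytes fill two chunks, so the heap lock is taken twice rather than 128 times.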
In terms of other middleware, IBM bundles the standard edition of its
WebSphere product, which includes its own object request broker in the Bonus
Pack shipping with every AIX release. While IBM does not bundle any
transaction processing, its CICS and MQSeries products stand out as highly
mature leading implementations. AIX does not include any Microsoft DCOM
capability.
True to its slogans and marketing, IBM has taken the most active role among the
studied operating-system vendors in providing e-commerce software options for
its AIX customers. IBM offers a comprehensive set of proprietary e-commerce
solutions with its Net.Commerce product and Payment Suite, both of which are
optimized for AIX. IBM also offers several GUI-based tools for site creation,
administration, and management. The company also provides a Java-based web-
page design tool for creating templates. Net.Commerce v3, introduced in 1999,
now includes the ability to import catalogs of data in a systematic way for e-
commerce sites, avoiding tedious data entry.
Net.Commerce supports both SET and SSL standards to address security issues
and credit-card clearance. Database support is provided through DB2, which is
bundled with Net.Commerce, and ODBC connectivity. IBM also features several
performance optimizations, including support for hardware-based encryption
acceleration through a PCI card and the ability to run multiple Net.Commerce
sites on a single machine. IBM's Payment Suite provides all the necessary
functionality to complete the e-commerce package with Consumer Wallet on the
client side, Payment Server for billing and credit-card clearance, and Payment
Registry and Payment Gateway for certificate authentication and SET
communication, respectively.
With the introduction of IntServ and DiffServ in the underlying TCP/IP stacks
of AIX 4.3.3, filters can be set up to manage bandwidth according to source and
destination address ranges in the operating system that then apply to
Net.Commerce. While DiffServ can mark packets for expedited forwarding and
assured forwarding, transaction prioritization has not been fully integrated into
Net.Commerce.
IBM currently bundles Apache with a kernel-based HTTP server, having
dropped FastTrack and the Lotus Domino Go web server from its Bonus Pack.
Mail server support is solid with POP3 and IMAP4 support, including support of
the most recent 8.9.3 version of sendmail, which includes filters for detecting and
avoiding transmission of spam. AIX has solid NFS support, with version 3 fully
supported in AIX 4.3. Client-side NFS features, such as CacheFS and AutoFS,
are also supported.
As shown in Table 2, AIX 4.3.3 offers the strongest available support for IP
protocols, filling in past weaknesses by implementing TCP Selective
Acknowledgment and by raising the bar with a production implementation of the
new IETF IntServ and DiffServ extensions. IntServ and DiffServ controls
remain primitive in this first implementation, lacking easy-to-understand GUI
interfaces for configuration.
A few of IBM's IP protocols, extensions, and tools deserve further highlighting.
AIX includes nice GUI interfaces for both TCP/IP dial-up and configuration
within its web-based system-management tool. AIX 4.3.3 also offers strong
support for a variety of IPv6 services, including 128-bit addressing, IP-layer
security, dynamic auto-configuration, redundant routing and multi-homing, and
tunneling support (for encapsulation in IPv4 packets across non-IPv6 networks).
AIX's IPSec implementation supports 40-bit, 56-bit, and Triple-DES encryption
options; it also offers filter rules to control network traffic by characteristics such
as source and destination address, specific protocol or port, or subnet mask.
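Filter rules of the kind described above can be modeled as a first-match list. The sketch below is illustrative only and does not reproduce AIX's actual rule syntax or commands: the FilterRule fields, the decide helper, and the default action are assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilterRule:
    action: str                      # "permit" or "deny"
    source: str                      # network written as address/mask
    destination: str
    protocol: Optional[str] = None   # None matches any protocol
    port: Optional[int] = None       # None matches any port

    def matches(self, src: str, dst: str, protocol: str, port: int) -> bool:
        return (
            ipaddress.ip_address(src) in ipaddress.ip_network(self.source)
            and ipaddress.ip_address(dst) in ipaddress.ip_network(self.destination)
            and self.protocol in (None, protocol)
            and self.port in (None, port)
        )


def decide(rules, src, dst, protocol, port, default="deny"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(src, dst, protocol, port):
            return rule.action
    return default


if __name__ == "__main__":
    rules = [
        # Block telnet from the 10.1.x.x subnet to anywhere.
        FilterRule("deny", "10.1.0.0/255.255.0.0", "0.0.0.0/0", "tcp", 23),
        # Allow everything else from that subnet to one server network.
        FilterRule("permit", "10.1.0.0/255.255.0.0", "192.168.5.0/255.255.255.0"),
    ]
    print(decide(rules, "10.1.2.3", "192.168.5.9", "tcp", 23))
    print(decide(rules, "10.1.2.3", "192.168.5.9", "tcp", 443))
```

Rule order matters in this model: the narrower deny rule must precede the broader permit rule.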
HP-UX 11.0
HP includes the Java JDK 1.2.2 with the Hotspot compiler in its latest quarterly
DART release. The Hotspot compiler maps Java's thread primitives to HP-UX's
native 1:1 threads. HP offers both classic and Hotspot Java virtual machines, with
Hotspot included as a native operating-system component. The Hotspot garbage
collector includes both the traditional mark-and-sweep algorithmic approach, as
well as a generational garbage-collection algorithm. HP-UX manages memory in
three aging stages (short-, medium-, and long-term allocation), making the
garbage collection process more efficient for short-lived objects.
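The benefit of aging stages can be seen in a toy model. The following sketch is not HP's collector: the stage names follow the text, but the promotion threshold, the Obj class, and the collect function are hypothetical. It shows that a collection of the short-term stage never has to examine objects that have already been promoted.

```python
STAGES = ["short", "medium", "long"]
PROMOTE_AFTER = 2  # assumption: collections survived before promotion


class Obj:
    def __init__(self, name: str):
        self.name = name
        self.stage = 0      # index into STAGES
        self.survived = 0   # collections survived in the current stage


def collect(objects, live_names, stage: int):
    """Collect one stage: reclaim its dead objects, age its survivors."""
    kept = []
    for obj in objects:
        if obj.stage != stage:
            kept.append(obj)   # other stages are not examined at all
            continue
        if obj.name not in live_names:
            continue           # unreachable: reclaimed
        obj.survived += 1
        if obj.survived >= PROMOTE_AFTER and obj.stage < len(STAGES) - 1:
            obj.stage += 1     # promote to the next aging stage
            obj.survived = 0
        kept.append(obj)
    return kept


if __name__ == "__main__":
    heap = [Obj("session-cache"), Obj("temp-string")]
    heap = collect(heap, {"session-cache"}, stage=0)
    heap = collect(heap, {"session-cache"}, stage=0)
    print([(o.name, STAGES[o.stage]) for o in heap])
```

Short-lived objects die in the first stage, so frequent collections of that stage stay cheap.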
Unlike most other vendors, HP bundles an object request broker from Iona with
HP-UX. As with most others, Microsoft DCOM support is not available, and
transaction processing remains primarily the domain of third-party providers.
HP's e-commerce platform largely depends on ISVs for application-level
capability. It depends particularly on iPlanet, which delivers GUI tools for site
management, SSL support, LiveWire database service, and optimized caching,
among the other features provided by its FastTrack Server. HP provides QoS
functions for prioritizing transactions and customers along with preventing
server overloads. HP also supports SET via its Verifone products, although that
capability does not extend to credit-card clearance. The iPlanet Application
Server is available on HP-UX. However, other e-commerce features such as
catalog import capability, user interface templates, and billing features are left to
third parties.
HP bundles the iPlanet FastTrack web server and provides Apache and Zeus
(unsupported) from its website. HP has quietly dropped bundling of Oracle's
Web Application Server v3.0. Mail-server support includes not only POP3 and
IMAP4, but also a slightly dated sendmail v8.8. File services include support for
NFS, CacheFS, AutoFS, and (optionally) SMB support. HP-UX lacks support for
WebNFS, for which HP sees little demand. HP plans to highlight CIFS file
serving for Windows 2000 clients early in 2000.
HP-UX offers strong TCP/IP capability, missing only IntServ, TCP Selective
Acknowledgment, and SOCKS 5 support. HP-UX stands out for its support of
DiffServ, which is not yet widely supported. HP also claims to provide a
particularly strong Gigabit Ethernet driver implementation, hitting 940-960
Mbits/sec measured bandwidth.
IRIX 6.5
IRIX 6.5 now offers Java 1.2 support, stressing its suitability for soft-real-time
behavior. IRIX 6.5's JVM maps Java threads to SGI's native threads. IRIX can
also assign specific threads to specific processors, useful for quasi-real-time Java
behavior. With Java 1.2, SGI has incorporated additional improvements in Java's
memory-management system via an incremental garbage collector. The
incremental garbage collector now runs in a completely separate thread. It stops
all other threads, marks all other threads, and sweeps through each one collecting
garbage, breaking the pseudo-real-time characteristics of the conventional
garbage-collection algorithm. SGI has also improved global register allocations,
reducing run-time memory requirements and memory-management overhead.
SGI's Java implementation also supports the n32 MIPS ABI (the same binary
interface as SGI's current C and C++ compilers support), making it easy for
Java coders to call C and C++ code. Progress remains unclear on SGI's Java front-end
to its native compilers, allowing statically compiled Java code to attain close to
native binary performance. Other middleware services such as object request
brokers and transaction processing depend on third-party products. IRIX does
not support Microsoft DCOM.
Unlike IBM or Microsoft, SGI does not offer a proprietary e-commerce solution
set. Still, IRIX keeps up with Compaq and HP in overall e-commerce support by
bundling products from the Sun-AOL-Netscape alliance. IRIX 6.5 bundles the
iPlanet FastTrack Server. It also bundles Site Manager for IRIX, which provides
site-administration tools in a limited context, as well as optimized caching and
database support through LiveWire. Site Manager for IRIX analyzes websites for
usage and server errors. GUI tools in Site Manager provide a visual directory of
the files on the web site and a 3D view of the site data, including dynamic
animations of site traffic. SGI's Internet Gateway similarly provides easily
configured network connections.
SGI's WebFORCE Intranet Junction for IRIX is a website tool that provides
templates, graphics, and third-party options for additional functions. SGI's
WebFORCE Director product provides load balancing among servers to prevent
server overloads, although IRIX does not yet offer a full-featured bandwidth-
allocation tool. IRIX supports hardware encryption acceleration via PCI cards
that accelerate SSL and IPSec from a third party, Rainbow Technologies. Like
most other UNIX vendors, SGI depends on third parties to provide catalog
importing, credit-card clearance, and billing capabilities.
SGI has focused on making HA capabilities easily accessible for web server
customers, offering a "Crate-to-Network in 20 minutes" web server that requires
little UNIX expertise to install. Users need to answer fewer than 10 questions
from an interactive program, and the configuration tool will properly set up a
dual-failover HA web server.
SGI bundles Apache, Zeus, and the iPlanet FastTrack Server on all its servers,
but does not provide kernel-based HTTP serving. Mail services include POP3
and IMAP4, as well as sendmail. SGI provides a full range of file services,
including NFS v3, CacheFS, AutoFS, and SMB, lacking only Sun's WebNFS.
Despite support for a broad range of TCP/IP performance extensions, IRIX has
yet to support IPSec, IntServ, DiffServ, or IPv6. IRIX has added support for
TCP Selective Acknowledgment since DHBA's last report.
SOLARIS 7
Solaris 7 integrates a JVM with a JIT, along with Java 1.2. The JVM maps Java
thread synchronization operations to operating-system primitives, reducing
overhead for thread synchronization. Finer-grained locking primitives also
improve thread responsiveness. Sun has tested the scalability of the JVM on SMP
systems with up to eight CPUs. Sun has also improved the garbage collection in
its JVM, introducing a handle-less system that has direct access to memory.
Other middleware services such as object request brokers and transaction
processing depend on third-party products. Solaris does not support Microsoft
DCOM.
Sun has adopted an ISV-centric and infrastructure-centric strategy for its e-
commerce offerings, rather than a solution strategy. As such, developing an e-
commerce site on Sun systems requires a fair amount of development work. Still,
Solaris' e-commerce offerings are marginally broader than those of other UNIX vendors
with similar strategies.
Sun's Java Wallet technology provides developers with a complete framework for
e-commerce from the user interface through public-key infrastructures to back-
end Java frameworks. However, real applications leveraging the Java Wallet
infrastructure have yet to make a significant impact.
Sun provides SunScreen SPF, a hardware-based encryption accelerator. Solaris
includes dynamic page caching for further enhancement of e-commerce site
performance. Database connectivity and server-side business-logic code can be
load-balanced via the NetDynamics Application Server that Sun recently acquired.
While Sun depends on ISVs for SET security, its native web server supports SSL.
Sun Internet Administrator manages website issues, while management of e-
commerce transactions is handled by ISVs. Third-party software likewise is
required for catalog mass-data importing, credit-card clearance, and billing.
Solaris 7 bundles its own HotJava browser and web server. While Sun's tools are
adequate for the job, companies building intranets on a variety of different UNIX
systems may prefer to pick a single server and browser combination, in which case
iPlanet makes a better choice. Bundled mail server support is solid with POP3,
IMAP4, and the latest version of sendmail, v8.9, all available in Solaris 7. Solaris 7
supports all studied file-server functions, including NFS V3, CacheFS, AutoFS, SMB,
and WebNFS (a mechanism to run NFS over the Internet).
Solaris supports a solid set of major TCP/IP extensions, including RSVP, IP
multicasting, and IP multiplexing. IPSec and IPv6 will ship with Solaris 8. RSVP
support, once available only in Solstice Bandwidth Manager, has now been
moved into the base operating system. ATM IP switching, once slated to arrive
in February 1999, does not yet appear to be supported. Solaris support for
IntServ, DiffServ, SOCKS 5, server-side multilink PPP, Path MTU discovery over
UDP, and TCP/IP Gratuitous ARP still appears to be missing.
TRU64 UNIX 5.0
Tru64 UNIX integrates a JVM with a JIT, along with improvements to threads
(mapping to M:N native threads) and memory management. JDK 1.2 can be
downloaded from the web. Tru64 UNIX is also the only studied UNIX system
to include support for Microsoft DCOM. Object request brokers and
transaction-processing support are unbundled and provided by third parties,
although Compaq points out it has partner relationships with BEA Systems,
Iona, and Inprise to provide CORBA, Orbix, and Inprise's VisiBroker products.
Tru64 UNIX features limited e-commerce functions in Compaq's Open Source
Internet Solutions (OSIS) software package. The included e-commerce tools are
primarily iPlanet features such as GUI tools for site management, which fall short
compared to IBM's Net.Commerce or Microsoft's Site Server. The iPlanet-based
features include SSL support (but no SET), dynamic page caching, and ODBC
support through iPlanet LiveWire. Tru64 UNIX provides limited additional site
administration through the Apache Web Server and iPlanet FastTrack Server.
However, Compaq does not provide user templates, credit-card clearance
software, or billing systems on its own, instead relying on ISVs. Tru64 UNIX
also lacks tools for transaction prioritization. Bandwidth allocation is available
over ATM networks, useful for backbone networks within companies, but this
capability does not yet extend to the Ethernet networks connecting to most
desktops. Importing mass data for online catalogs and customizable user
interface templates are provided via third-party Oracle and Intershop products.
Compaq provides a try-and-buy evaluation CD of leading third-party products as
part of its Innovators Program; the CD is bundled in each operating-system
offering. Compaq also supplies the iPlanet SuiteSpot and Application Server for
Tru64 UNIX. The iPlanet SuiteSpot includes the higher-end Enterprise Pro web
server, as well as a mail server, calendar, workgroup, and directory services. The
Application Server is designed to be a host for custom business-logic applications
in a multi-tier client/server environment, offering services such as HA failover,
dynamic load balancing, application partitioning, connection caching and pooling,
and results caching and pooling.
Compaq bundles iPlanet FastTrack 3.01 on every Tru64 UNIX server and
workstation product. In addition, Compaq's OSIS package, targeted largely at
ISPs but also bundled with every Tru64 UNIX server, includes a variety of
commercial and public-domain Internet tools, as well as the Apache and Zeus
web servers. Compaq has adapted every service included in OSIS to work with its
TruCluster HA clustering software, so that multiple instances run correctly.
Apache is used most widely, while Zeus offers better scalability on SMP systems
due to its threaded architecture. Tru64 UNIX does not appear to have the
kernel-based HTTP serving optimizations provided by AIX. Mail protocols
supported in the Internet AlphaServer Software bundle include POP3, IMAP4,
and sendmail v8.9.3. While Tru64 UNIX supports NFS V3, SMB file-serving, and
WebNFS, it has not implemented optimizations such as AutoFS and CacheFS.
As shown in Table 2, Tru64 UNIX offers a broad range of TCP/IP extensions,
with particularly aggressive support for IPv6, IPv4, IPSec, RSVP, and IP
multicast, although it lacks support for newer DiffServ and IntServ extensions.
As with AIX, connecting a Tru64 UNIX system to the Internet over a modem
line requires a series of commands that Tru64 UNIX packages in an intuitive
GUI tool interface.
DIRECTORY AND SECURITY SERVICES
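[Chart: directory and security services functional ratings for IRIX 6.5, Solaris 7, HP-UX 11.0, AIX 4.3.3, and Tru64 UNIX 5.0 on a scale of 5.00 (Fair), 6.00 (OK), 7.00 (Good), 8.00 (Very Good), and 9.00 (Excellent)]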
SUMMARY
Tru64 UNIX 5.0 leads in directory and security services, bundling the strongest
set of directory services and sharing the top spot for secure networking
functions. AIX 4.3.3 follows closely, sharing the lead for Virtual Private Network
(VPN) functions while also providing very competitive directory services. HP-
UX shares the lead for secure networking and VPN functions, but provides only
average directory services. Solaris 7 has competitive directory services, but
average capabilities in remaining areas. IRIX 6.5 trails in all areas.
DIRECTORY SERVICES CRITERIA
In large networks, it becomes increasingly difficult for users and administrators to
track user IDs, passwords, server host IDs, and printers throughout the
enterprise. System management itself becomes a database problem. Thus,
operating systems supporting enterprise networks must provide a special-purpose
distributed database called a directory service that provides users and
administrators with an up-to-date and global reference to all network resources.
For example, directory services can authorize users anywhere on the network,
allowing them to log in from any client system, regardless of their geographic
location or the server through which they are connecting.
To avoid becoming a bottleneck and to meet enterprise scalability requirements,
the directory service must directly address traditional database issues, including
reliability and performance. Replication becomes a particularly important
capability. The system must transparently copy and synchronize the directory
service database onto multiple disks or servers while retaining its appearance to
all users as a single entity. Replication can improve directory-service performance
by funneling requests to alternate servers if a particular directory service server
becomes overloaded. Replication can also enhance reliability: if a server or disk
becomes disabled, the system can pass queries to a replica of the
affected database.
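The replication behavior described above can be illustrated with a minimal Python sketch. It is not drawn from any of the reviewed products; the class names, the `max_load` threshold, and the sample entry are all hypothetical, and a real directory service would also have to handle updates that arrive while a replica is down.

```python
class DirectoryReplica:
    """One copy of the directory database (hypothetical in-memory stand-in)."""

    def __init__(self, name, max_load=2):
        self.name = name
        self.max_load = max_load   # assumed overload threshold
        self.entries = {}
        self.online = True
        self.load = 0              # number of requests currently in progress

    def available(self):
        return self.online and self.load < self.max_load


class ReplicatedDirectory:
    """Presents several replicas to callers as a single directory."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def write(self, key, value):
        # Copy the update onto every replica so all copies stay identical.
        # (Simplification: a real service would queue updates for offline replicas.)
        for replica in self.replicas:
            replica.entries[key] = value

    def lookup(self, key):
        # Route the query to the first replica that is neither down nor overloaded.
        for replica in self.replicas:
            if replica.available():
                return replica.name, replica.entries.get(key)
        raise RuntimeError("no directory replica available")


primary = DirectoryReplica("primary")
backup = DirectoryReplica("backup")
directory = ReplicatedDirectory([primary, backup])
directory.write("uid=alice", {"home": "/home/alice"})

print(directory.lookup("uid=alice"))   # answered by the primary
primary.online = False                 # simulate a failed server or disk
print(directory.lookup("uid=alice"))   # same entry, now answered by the backup
```

The caller only ever talks to `ReplicatedDirectory`, which is what lets the replicas appear as a single entity.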
FIGURE 6: Directory and Security Services Functional Ratings
Some of the key directory services in use today include:
Lightweight Directory Access Protocol (LDAP): an open standard for directory
services based on a subset of X.500, a vast and comprehensive information-
exchange standard. As LDAP gains implementations on a wide variety of
operating systems, it promises to become both a de jure and de facto
standard. Leaders in LDAP implementation have begun to integrate LDAP
with system operations, such as authenticating user logins against the LDAP
database.
Network Information Service (NIS): has long been used by UNIX systems as a
network store for usernames and passwords. NIS+, a more advanced and
secure version, extends NIS to store a broader range of system configuration
information.
NetWare Directory Service (NDS): provides a hierarchical structure for tracking
users on large networks. While NDS is perhaps the most established scalable
directory service available, its impact in the market has been limited in the
past by its dependency on the NetWare platform.
Windows NT Directory Service (NTDS): manages user authentication on
Windows NT 4.0 and 3.5x networks using Primary Domain Controllers
(PDCs) and Backup Domain Controllers (BDCs).
Remote Authentication Dial-In User Service (RADIUS): is a special-purpose
directory service for securely managing dial-in remote access. RADIUS
typically increases security for network access by integrating smart-card
authentication with the user login process.
While operating systems may support these services at different levels, this study
specifically assesses the ability to host the server component of a given directory
service.
SECURITY INFRASTRUCTURE CRITERIA
Historically, vendors focused primarily on single-system security. With the rise of
enterprise networks and ubiquitous Internet connectivity, network security has
grown considerably in importance. A comprehensive discussion of network
security represents a vast topic largely beyond the scope of this report, but
operating systems can provide some functions that facilitate deployment of
secure networks. They include:
Kerberos, a sophisticated mechanism for managing distributed user
authentication.
TCP/IP wrappers, which allow administrators to place restrictions on
incoming and outgoing TCP/IP services and also allow network activity to be
logged.
Trusted operating-system networking, which provides tools that have been
modified for secure network operation. Common UNIX tools such as
telnet and ftp traditionally passed password information in plain text
over the network, where it was vulnerable to interception. Some operating
systems have bolstered network security by providing secure versions of
telnet, ftp, and similar tools which plug such holes. Network directory
services such as NIS can also benefit from secure network implementations
(e.g., NIS+). Secure directory services make it harder for remote users
to obtain lists of encrypted passwords, which can be broken with widely known
dictionary attacks that compare the encrypted passwords against a self-generated
encrypted list of frequently used passwords or dictionary entries.
Pluggable security types, which allow applications to take advantage of up-to-date
improvements in security mechanisms by allowing administrators to simply
plug a new authentication module into the system without having to upgrade
the applications themselves. Developers benefit from standardized access to
the modules if the operating system supports the Generic Security Service
API (GSS-API).
Support for the Generic Security Service API (GSS-API), which provides a
common method of access to several authentication technologies. GSS-API
allows programmers to structure applications so they can be linked to work
with any GSS-API mechanism.
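The idea behind pluggable security types can be sketched in a few lines of Python. This is an illustration only, not the interface of any reviewed operating system or of GSS-API itself; the registry, module names, and sample credentials are hypothetical, and the unsalted SHA-256 hash is a simplification that a production system should not copy.

```python
import hashlib
import hmac

_MODULES = {}   # module name -> callable(user, secret) returning True or False


def register(name, module):
    """Administrator-side step: plug a new authentication module into the system."""
    _MODULES[name] = module


def authenticate(user, secret, stack):
    """Application-facing call: succeeds if any configured module accepts the user."""
    return any(_MODULES[name](user, secret) for name in stack)


def _digest(password):
    # Simplified for illustration; real systems use salted, slow password hashes.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


PASSWORD_DB = {"alice": _digest("correct horse")}   # hypothetical sample data
TOKEN_DB = {"bob": "483921"}                        # hypothetical one-time codes


def password_module(user, secret):
    stored = PASSWORD_DB.get(user)
    return stored is not None and hmac.compare_digest(
        stored.encode("utf-8"), _digest(secret).encode("utf-8"))


def token_module(user, secret):
    expected = TOKEN_DB.get(user)
    return expected is not None and hmac.compare_digest(
        expected.encode("utf-8"), secret.encode("utf-8"))


AUTH_STACK = ["password"]   # system configuration, edited by the administrator


def login(user, secret):
    """The application: it never changes when modules are added or replaced."""
    return "welcome" if authenticate(user, secret, AUTH_STACK) else "denied"


register("password", password_module)
print(login("bob", "483921"))    # denied: only password checking is configured

register("token", token_module)  # plug in a new mechanism
AUTH_STACK.append("token")
print(login("bob", "483921"))    # welcome: login() itself was not modified
```

The application calls only `login()`, so adding the token mechanism requires a configuration change rather than an application upgrade.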
VIRTUAL PRIVATE NETWORKING (VPN) CRITERIA
VPNs allow a remote user to access an internal corporate network using standard
TCP/IP services rather than a dial-up modem. Historically, corporations have
used either dedicated and expensive leased lines or dial-up remote access servers.
These traditional approaches potentially incur such problems as limitation of
simultaneous users by the available bank of modems, security problems from
trivial passwords and insecure gateway software, and costly long-distance calls.
VPNs overcome these barriers by allowing users to dial their local service
provider and connect over the Internet to access company systems as if they
were just another node on the local area network. Doing this securely requires
overcoming two barriers: authenticating users before they are allowed to access
corporate internal resources, and securing all traffic that passes over the public
Internet so no one can intercept it.
VPN infrastructures vary somewhat in capabilities; stronger VPN solutions
include the following improvements beyond base VPN functionality:
ICSA certification: ICSA.net performs extensive testing on security-related
products.
IPSec-based implementation: earlier VPNs employed a variety of mechanisms
that were proprietary or less interoperable than IPSec.
Filtering tunnels based on IP addresses: since IP addresses of major business
partners should be stable, this function restricts VPN conversations to IP
addresses or address ranges on an approved list. The system should provide
the ability to define tunnels when needed, allocating them on demand.
Certificate-based digital signatures: provide for Internet Key Exchange (IKE)
authentication.
Logging: should log IPSec and IKE messages for security auditing.
Improved management tools (web-based, GUI, or textual commands): help simplify
installation of VPNs and the associated keys.
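The tunnel-filtering criterion above (restricting VPN conversations to an approved list of addresses and creating tunnels on demand) can be sketched with Python's standard `ipaddress` module. The approved networks, addresses, and function names below are hypothetical examples and do not reflect any vendor's configuration format.

```python
import ipaddress

# Hypothetical approved list: one partner subnet and one single host.
APPROVED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_approved(address, approved=APPROVED_NETWORKS):
    """Return True if the address falls inside any approved network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in approved)


def open_tunnel(local, remote, active_tunnels, audit_log):
    """Create a tunnel on demand, but only toward an approved remote address."""
    if not is_approved(remote):
        audit_log.append(f"REJECT tunnel {local} -> {remote}")
        return None
    tunnel = (local, remote)
    active_tunnels.add(tunnel)
    audit_log.append(f"ACCEPT tunnel {local} -> {remote}")
    return tunnel


tunnels, log = set(), []
open_tunnel("10.0.0.1", "192.0.2.44", tunnels, log)    # inside the partner subnet
open_tunnel("10.0.0.1", "203.0.113.9", tunnels, log)   # not on the approved list
print(log)
```

Recording both accepted and rejected requests in `audit_log` mirrors the logging criterion, although real products log IPSec and IKE protocol messages rather than simple strings.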
AIX 4.3.3
AIX 4.3.3 includes IBM's SecureWay LDAP v3.11, which has been tested to
support millions of entries and thousands of clients. The directory service
includes SSL support for server and client, which can be turned off if necessary,
and uses X.509v3 public-key certificates. It also supports replication and remote
management through a web-based GUI. AIX allows the service to be integrated
with login operations, so that AIX users, groups and roles can be stored,
replicated, and retrieved across a network of systems. Like most UNIX systems,
AIX also supports NIS. AIX 4.3.3 introduces support for NIS+ Version 2.5.
IBM's AIX Bonus Pack includes an evaluation version of Novell Network
Services 4.1 for AIX, Version 2.2, which includes a two-user-per-server license
for NDS, but does not support replication. Windows NT Directory Services can
be deployed via IBM's Advanced Server for UNIX option.
For secure networking, AIX supports Kerberos v5 as part of its layered DCE
options. TCP/IP wrappers are supported in AIX Firewall, which is included in
the Bonus Pack. Some secure tools are supported, including NFS, rsh, rlogin, and
rcp. While AIX does not support the complete GSSAPI standard according to
RFC 2078, it does expose some pluggable security APIs as part of IBM's DCE
options for AIX.
AIX bundles a complete set of VPN functions that are ICSA certified and based
on the IPSec standard. AIX can log IPSec and IKE messages for auditing
purposes. Filter tunnels can be defined when needed based on specific IP
addresses using both command-line tools and a GUI, either locally or remotely.
AIX also supports certificate-based digital signatures for IKE authentication.
HP-UX 11.0
HP-UX 11.0 bundles the iPlanet Directory Server, and HP supports an NIS to
LDAP gateway that enables customers to consolidate user management into the
LDAP directory. HP-UX includes both NIS and NIS+ directory services, with
enhancements such as fallback from NIS to DNS, secure NIS maps (i.e., root-
only access), secure updating of NIS maps, and NIS IP address authentication.
Windows NT Directory Services are available as an option through the HP-UX
version of Advanced Server for UNIX.
For secure networking, HP-UX supports Kerberos v5 and functionality
equivalent to TCP/IP wrappers, called secure inetd.sec.
HP-UX supports trusted operating-system functions, including secure telnet,
ftp, rcp, rsh, rlogin, and NIS. HP-UX also supports pluggable security
types and the GSS-API for modular network authentication. For VPN support,
HP-UX builds in ICSA-certified functions at the base operating-system level. A
full set of layered tools for IPSec/IKE logging, filter tunneling, default and
configurable application-level, rule-based policies, and a fully automated
certificate-retrieval process is available in HP's unbundled Praesidium option.
IRIX 6.5
SGI offers the optional iPlanet Directory Server for IRIX 6.5, which supports
LDAP v2 or v3, and SSL connections. On the client side, IRIX 6.5 includes an
LDAP library for nsd, so that UNIX directory functions such as
getXbyY() can fetch their information from the LDAP server. While the
library supports both LDAP v2 and v3, SGI is awaiting an export waiver before it can
ship secure connections to the LDAP server; once granted, that capability will be
supplied as a layered product. Server-side NIS+ support is available as part of the unbundled
EnlightenDSM option from SGI.
For secure networking, IRIX 6.5 now includes all the features that formerly
required SGI's unbundled Commercial Security Pack, including not only
Kerberos v5, but also ACL support, least-privilege capabilities, and trusted login,
ftp, ftpd, rlogin, rsh, and rcp, along with other Kerberos commands. Secure NIS
functions are not included, however. While IRIX does not include any direct
software support for VPN functions, SGI resells the third-party Gauntlet firewall
product, which provides a variety of VPN capabilities.
SOLARIS 7
Solaris 7 includes an LDAP v3 server in some of its feature set packages. In the
Solaris for ISP feature set, Sun's LDAP server has been tested with up to a
million entries, but its functions have not been integrated with basic operating-
system operations such as logins. Solaris supports both NIS and NIS+, for which
Sun has increased the security key length from 192 to 640 bits. Solaris provides Windows NT
Directory Services as part of its Solaris Easy Access Server feature set, and a
version of NDS for Solaris is available from Novell. Solaris also provides a
RADIUS server in certain packages.
Solaris 7 includes Kerberos security in some of its packages and has some
support for pluggable security types. For example, the Solaris RPC mechanism
has been modified based on GSSAPI, so that functions such as NFS are no
longer bound to a single security mechanism. However, trusted tools require the
older Trusted Solaris 2.5.1. Solaris 7 also does not bundle any VPN functions,
which require the Sun.Net or SKIP add-ons.
TRU64 UNIX 5.0
Tru64 UNIX bundles the iPlanet SuiteSpot web server package, which also
includes the LDAP v3 server. The iPlanet LDAP v3 server is also included in the
bundled OSIS package. Tru64 UNIX also includes an LDAP authentication
security module as part of its OSIS package, allowing the directory server to be
transparently used by the operating system as the source for all user
authentication and identification information for services such as logins and mail
access. Tru64 UNIX supports NDS V3.2 as part of the unbundled NetWare for
UNIX option and is the only studied product to bundle Windows NT Directory
Service support through the included Advanced Server for UNIX package. Tru64
UNIX is also the only product to match Solaris for RADIUS server support,
with the Basic Merit AAA Radius Server 3.5.14.2 that comes bundled with OSIS.
Compaq sells an optional VPN product for Tru64 UNIX named Raptor Tunnel-
EC. While Raptor Tunnel-EC has not been ICSA certified, it does support basic
VPN functionality. Raptor Tunnel-EC does not yet support auditing of IPSec
and IKE errors and setup/teardown. Still, Raptor Tunnel-EC provides a nice
GUI interface, can set up tunnels between locations on demand, and can filter
tunnels based on IP address, allowing only pre-validated IP sources or
destinations for secure connection.
