A large body of literature has appeared over the past three or four decades on how developers
can measure various aspects of software development and use, from the productivity of the
programmers coding it to the satisfaction of the ultimate end users applying it to their business
problems. Some metrics are broader than others. In any scientific measurement effort, you must
balance the sensitivity and the selectivity of the measures employed. Here we are primarily
concerned with the quality of the software end product as seen from the end user's point of view.
Although much of the software metrics technology used in the past was applied downstream, the
overall trend in the field is to push measurement methods and models back upstream to the
design phase and even to measurement of the architecture itself. The issue in measuring software
performance and quality is clearly its complexity as compared even to the computer hardware on
which it runs. Managing complexity and finding significant surrogate indicators of program
complexity must go beyond merely estimating the number of lines of code the program is
expected to require.
Historically, software quality metrics have measured exactly the opposite of quality, that
is, the frequency of software defects or bugs. The inference, of course, was that quality in
software is the absence of bugs. So, for example, measures of error density per thousand lines
of code discovered per year or per release were used. Lower values of these measures implied
higher build or release quality. For example, a density of two bugs per 1,000 lines of code (LOC)
discovered per year was considered quite good, but this is a very long way from today's Six
Sigma goals. We will start this article by reviewing some of the leading historical quality models
and metrics to establish the state of the art in software metrics today and to develop a baseline on
which we can build a true set of upstream quality metrics for robust software architecture.
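Before turning to those models, the defect-density arithmetic above is easy to make concrete. The sketch below is illustrative only (it is not drawn from any standard); it converts a defect density into the corresponding Sigma level, counting one defect opportunity per line of code and applying the conventional 1.5-sigma shift:

```python
from statistics import NormalDist

def defect_density(defects: int, loc: int) -> float:
    """Defects per thousand lines of code (KLOC)."""
    return defects / (loc / 1000)

def sigma_level(defects_per_kloc: float) -> float:
    """Approximate Sigma level for a given defect density, treating each
    line of code as one opportunity and adding the usual 1.5-sigma shift."""
    dpmo = defects_per_kloc * 1000  # defects per million lines of code
    return NormalDist().inv_cdf(1 - dpmo / 1_000_000) + 1.5

# The historical "good" level of two bugs per KLOC per year:
print(round(sigma_level(defect_density(2, 1000)), 2))  # ~4.38 Sigma
# One fault per KLOC, and the Six Sigma level of 0.0034 faults per KLOC:
print(round(sigma_level(1.0), 2))     # 4.59
print(round(sigma_level(0.0034), 2))  # 6.0
```

The 4.59 Sigma figure quoted later for one fault per KLOC falls out of exactly this calculation.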
Perhaps at this point we should attempt to settle on a definition of software architecture
as well.
Most of the leading writers on this topic do not define their subject term, assuming that the
reader will construct an intuitive working definition on the metaphor of computer architecture or
even its earlier archetype, building architecture. And, of course, almost everyone does! There is
no universally accepted definition of software architecture, but one that seems very promising
has been proposed by Shaw and Garlan:
Abstractly, software architecture involves the description of elements from which systems are
built, interactions among those elements, patterns that guide their composition, and constraints
on those patterns. In general, a particular system is defined in terms of a collection of
components, and interactions among those components.1
This definition follows a straightforward inductive path from that of building architecture,
through system architecture, through computer architecture, to software architecture. As you will
see, the key word in this definition, for software at least, is
. Having chosen a
definition for software architecture, we are free to talk about measuring the quality of that
architecture and ultimately its implementations in the form of running computer programs. But
first, we will review some classical software quality metrics to see what we must surrender to
establish a new metric order for software.
Two leading firms that have placed a great deal of importance on software quality are IBM and
Hewlett-Packard. IBM measures user satisfaction in eight dimensions for quality as well as
overall user satisfaction: capability or functionality, usability, performance, reliability,
installability, maintainability, documentation, and availability (see Table 3.1). Some of these factors
conflict with each other, and some support each other. For example, usability and performance
may conflict, as may reliability and capability or performance and capability. IBM has user
evaluations down to a science. We recently participated in a study of an IBM middleware product
covering only the usability dimension: five pages of questions plus a two-hour interview with a
specialist consultant. Similarly, Hewlett-Packard uses five Juran quality parameters:
functionality, usability, reliability, performance, and serviceability. Other computer and software
vendor firms may use more or fewer quality parameters and may even weight them differently
for different kinds of software or for the same software in different vertical markets. Some firms
focus on process quality rather than product quality. Although it is true that a flawed process is
unlikely to produce a quality product, our focus here is entirely on software product quality, from
architectural conception to end use.
The implementation of total quality management (TQM) has many varieties, but the four essential
characteristics of the TQM approach are as follows:
1. Customer focus
2. Process improvement
3. The human side of quality (total participation)
4. Measurement and analysis
In 1993 the IEEE published a standard for a software quality metrics methodology (IEEE Std 1061)
that has since defined and led development in the field. Here we begin by summarizing this
standard. It was
intended as a more systematic approach for establishing quality requirements and identifying,
implementing, analyzing, and validating software quality metrics for software system
development. It spans the development cycle in five steps, as shown in Table 3.2.
A typical "catalog" of metrics in current use will be discussed later. At this point we merely want
to present a gestalt for the IEEE recommended methodology. In the first step it is important to
establish direct metrics with values as numerical targets to be met in the final product. The
factors to be measured may vary from product to product, but it is critical to rank the factors by
priority and assign a direct metric value as a quantitative requirement for that factor. There is no
mystery at this point, because Voice of the Customer (VOC) and Quality Function Deployment
(QFD) are the means available not only to determine the metrics and their target values, but also
to prioritize them.
The second step is to identify the software quality metrics by decomposing each factor into
subfactors and those further into the metrics. For example, a direct final metric for the factor
reliability could be faults per 1,000 lines of code (KLOC), with a target value of, say, one fault per
1,000 lines of code (LOC). (This level of quality is just 4.59 Sigma; Six Sigma quality would be
3.4 faults per 1,000 KLOC, or 0.0034 faults per KLOC.) For each validated metric at the metric
level, a value should be assigned that will be achieved during development. Table 3.3 gives the
IEEE's suggested paradigm for a description of the metrics set.6
Name: Name of the metric
Metric: Mathematical function to compute the metric
Cost: Cost of using the metric
Benefit: Benefit of using the metric
Impact: Whether the metric can be used to alter or stop the project
Target value: Numerical value to be achieved to meet the requirement
Factors: Factors related to the metric
Tools: Tools to gather data, calculate the metric, and analyze the results
Application: How the metric is to be used
Data items: Input values needed to compute the metric
Computation: Steps involved in the computation
Interpretation: How to interpret the results of the computation
Considerations: Metric assumptions and appropriateness
Training: Training required to apply the metric
Example: An example of applying the metric
History: Projects that have used this metric and its validation history
References: List of projects used, project details, and so on
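A description paradigm like this maps naturally onto a simple record type. The sketch below is illustrative only (the class and the sample metric are invented here; they are not part of the IEEE standard), with fields drawn from a subset of Table 3.3:

```python
from dataclasses import dataclass, field

@dataclass
class MetricDescription:
    """One entry in a metrics set, following the Table 3.3 paradigm."""
    name: str            # Name of the metric
    metric: str          # Mathematical function to compute the metric
    target_value: float  # Numerical value to be achieved to meet the requirement
    factors: list = field(default_factory=list)  # Related quality factors
    tools: list = field(default_factory=list)    # Data-gathering and analysis tools
    interpretation: str = ""                     # How to interpret the results

# A hypothetical reliability metric like the one discussed above:
fault_density = MetricDescription(
    name="Fault density",
    metric="faults discovered / KLOC",
    target_value=1.0,
    factors=["reliability"],
    interpretation="Lower is better; compare against the target value.",
)
print(fault_density.name, fault_density.target_value)
```

Capturing each metric as such a record makes the later implementation and validation steps straightforward to automate.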
To implement the metrics in the metric set chosen for the project under design, the data to be
collected must be determined, and assumptions about the flow of data must be clarified. Any
tools to be employed are defined, and any organizations to be involved are described, as is any
necessary training. It is also wise at this point to test the metrics on some known software to
refine their use, sensitivity, accuracy, and the cost of employing them.
Analyzing the metrics can help you identify any components of the developing system that
appear to have unacceptable quality or that present development bottlenecks. Any components
whose measured values deviate from their target values are noncompliant.
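As a minimal illustration of that compliance check (the component names and values below are invented, and it assumes a lower-is-better metric such as faults per KLOC), one might flag noncompliant components like this:

```python
def noncompliant(measurements: dict, targets: dict) -> list:
    """Return names of components whose measured metric value exceeds
    its target; both arguments map component name -> value."""
    return sorted(name for name, value in measurements.items()
                  if value > targets.get(name, float("inf")))

measured = {"parser": 0.8, "scheduler": 2.3, "ui": 1.7}
targets  = {"parser": 1.0, "scheduler": 1.0, "ui": 1.0}
print(noncompliant(measured, targets))  # ['scheduler', 'ui']
```

Components flagged this way are exactly the candidates for redesign or for a closer look at development bottlenecks.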
Validation of the metrics is a continuous process spanning multiple projects. If the metrics
employed are to be useful, they must accurately indicate whether quality requirements have been
achieved or are likely to be achieved during development. Furthermore, a metric must be
revalidated every time it is used. Confidence in a metric will improve over time as further usage
experience is gained.