This article is the third in a series. The first two articles describe the history and
definition of two important but generally under-utilized software metrics: the cost of
(poor) quality (CoQ) framework (a measure of efficiency), and defect containment (a
measure of effectiveness). Industry benchmark data for these metrics and for alternative
appraisal methods were reviewed. In this article I describe an approach to modeling and
managing efficiency and effectiveness that integrates these two metrics to provide both
prospective (leading) and retrospective (lagging) indicators of software project
outcomes. This model facilitates “simulation” of alternative strategies and provides
quantitative indications of the impact of alternatives under consideration.
As George Box said, “All models are wrong--some are useful” (Box 1979). This model
uses parameters taken from a variety of public sources, but makes no claim that these
parameter values are valid or correct in any particular situation. It is hoped that the
reader will take away a thought process and perhaps make use of this or similar models
using parameter values appropriate and realistic in the intended context. It is widely
recognized that all benchmark values are subject to wide (and typically unstated)
variation. Many parameter values will change significantly as a function of project size,
application domain, and other factors.
The complete model includes five tables; the first four include user-supplied
parameters and calculate certain fields based on those parameters. The fifth table
summarizes the results of the other four. This article will look at the summary first,
and then will provide an overview of the details upon which the summary is based.
I have defined a number of different scenarios (four of which we will examine in this
article), all based on an assumed size of 1000 function points, to illustrate use of the
model. Many other scenarios might be constructed. The first two scenarios assume
defects are “inserted” at U.S. average rates (Jones 2009, p.69) of a total of 5.0
defects per function point, including bad fixes and documentation errors. Scenarios 3
and 4 reflect results reported by high maturity groups in which defects inserted are
reduced to 2.7 per function point. These reductions are generally consistent with
best-in-class results reported in Jones (2008).
The table below summarizes key parameters associated with each of these
scenarios – again, these reflect a reasonable set of assumptions, but an almost
infinite set of alternative assumptions might be reasonable in a given
situation. This model is available on request to anyone interested – you are most
welcome to try it out with parameter values appropriate to your situation.
In this illustration a scenario 2 mix of appraisal activities reduces total non-value-added
(NVA) effort (including both pre- and post-release effort) by 40 percent compared to the
scenario 1 test-only approach typically used by average groups (68.6 person months in
scenario 2 vs. 113.5 in scenario 1). More mature organizations, as a result of lower
defect insertion and improved inspection effectiveness, can reduce NVA effort by an
additional 38 percent (42.7 person months in scenario 3 vs. 68.6 in scenario 2).
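For reference, these percentages follow directly from the person-month figures cited above:
(113.5 − 68.6) / 113.5 ≈ 39.6 percent, rounded here to 40 percent, and
(68.6 − 42.7) / 68.6 ≈ 37.8 percent, rounded to 38 percent.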
Model Parameters
Four sets of parameter values, each in a separate Excel table, are required to
generate the summary conclusions described previously. One set of tables is used
for each scenario and is contained in a single Excel sheet (tab) dedicated to that
scenario. These tables and the parameters in each are as follows:
1. Defect insertion and removal forecast. This table contains a row for each
distinct appraisal step, for example, requirements inspection, design
inspection, code inspection, static analysis, and unit-function-integration-
system-acceptance tests. Any desired set may be identified. Defects inserted
are forecast on a “per size” basis, for example, using the Jones benchmark
value per function point. The percentage of defects expected to be removed by
each appraisal step is also specified, again using Jones benchmark values or
values determined locally. Given these user-supplied parameters, the model
calculates defects found and remaining at each appraisal step and the final
total containment effectiveness (TCE) percentage.
2. Inspection effort forecast. This table contains a row for each of three types of
inspections--requirements, design, and code. Other rows may be added. For
each row the user specifies the percentage of the work product to be
inspected--0 percent if the inspection is not performed, as in the test-only
scenario, up to 100 percent. The user also specifies the expected average
number of defects to be found per inspection and the number of rework hours
forecast per defect. The model calculates the number of inspections required
to remove the number of defects forecast by Table 1, along with inspection
person months, rework person months, and total person months required. A
simplified sketch of the calculations behind both tables follows this list.
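To make these mechanics concrete, the following sketch (in Python rather than Excel)
illustrates the kind of cascade and effort calculation Tables 1 and 2 perform. Every
parameter value shown--defects per function point, removal fractions, defects found per
inspection, rework hours per defect, hours per person month--is an illustrative
placeholder, not a benchmark value from the actual model.

# Illustrative defect insertion/removal cascade and inspection-effort estimate.
# All parameter values below are placeholder assumptions, NOT the Jones
# benchmark values used in the actual Excel model.

SIZE_FP = 1000                  # project size in function points
DEFECTS_PER_FP = 5.0            # total defects inserted per function point (assumed)

# Appraisal steps and the fraction of *remaining* defects each removes (assumed).
removal_fraction = [
    ("Requirements inspection", 0.60),
    ("Design inspection",       0.55),
    ("Code inspection",         0.60),
    ("Static analysis",         0.50),
    ("Unit through acceptance tests", 0.80),
]

inserted = SIZE_FP * DEFECTS_PER_FP
remaining = inserted
defects_found = {}
for step, frac in removal_fraction:
    found = remaining * frac
    defects_found[step] = found
    remaining -= found
    print(f"{step:32s} found {found:7.1f}   remaining {remaining:7.1f}")

# Total containment effectiveness: share of inserted defects removed pre-release.
tce = sum(defects_found.values()) / inserted
print(f"TCE = {tce:.1%}")

# Table 2 style inspection-effort estimate (placeholder values).
DEFECTS_PER_INSPECTION = 8.0    # average defects found per inspection (assumed)
REWORK_HOURS_PER_DEFECT = 2.5   # rework hours forecast per defect (assumed)
HOURS_PER_INSPECTION = 12.0     # total participant hours per inspection (assumed)
HOURS_PER_PERSON_MONTH = 132.0  # working hours per person month (assumed)

inspection_steps = ["Requirements inspection", "Design inspection", "Code inspection"]
inspection_defects = sum(defects_found[s] for s in inspection_steps)
inspections_needed = inspection_defects / DEFECTS_PER_INSPECTION
inspection_pm = inspections_needed * HOURS_PER_INSPECTION / HOURS_PER_PERSON_MONTH
rework_pm = inspection_defects * REWORK_HOURS_PER_DEFECT / HOURS_PER_PERSON_MONTH
print(f"Inspections needed: {inspections_needed:.0f}, "
      f"inspection effort {inspection_pm:.1f} PM, rework {rework_pm:.1f} PM, "
      f"total {inspection_pm + rework_pm:.1f} PM")

In the real model each of these placeholders is a user-supplied cell, so comparing
alternative scenarios amounts to changing the inputs and comparing the resulting TCE and
effort figures.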
The pre-test impact factor is used to quantify the overall impact of these consequences;
in effect, this value indicates the percent reduction expected for a given test step due to
pre-test appraisals. The value may in some instances be 100 percent (1.0) if incoming
quality is believed to be sufficiently good to simply not do certain types of tests (for
example, unit tests).
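As a hedged illustration (the actual Excel formula may differ), one plausible way to
apply such a factor is to scale each test step's baseline effort by (1 − impact factor),
continuing the sketch above:

def adjusted_test_effort(baseline_pm, impact_factor):
    """Reduce a test step's baseline effort by the pre-test impact factor.

    impact_factor = 0.0 means no reduction; 1.0 means the step is skipped entirely.
    """
    return baseline_pm * (1.0 - impact_factor)

print(adjusted_test_effort(10.0, 0.4))   # 6.0 person months remain (placeholder values)
print(adjusted_test_effort(3.0, 1.0))    # 0.0 -- unit testing skipped entirely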
Sanity Check
Does all of this really make sense? How does this “simulation” compare to what we
know about the real world? One way to evaluate the results of these models is to
examine their conclusions in relation to total effort likely to be devoted to an actual
project. Sanity checking any model is always a good idea. Experimenting with these
models has shown that some published parameter values lead to totally implausible
conclusions--for example, pre-release NVA effort can exceed total project effort when
some of these values are used. Obviously such a conclusion cannot be valid; at least
one parameter value must be incorrect when the results do not make sense.
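A plausibility check of this kind can be expressed in a single comparison; the figures
below are placeholders rather than values from any of the scenarios:

def plausible(pre_release_nva_pm, total_project_pm):
    """Pre-release NVA effort cannot plausibly exceed total project effort."""
    return pre_release_nva_pm <= total_project_pm

print(plausible(70.0, 160.0))    # True: within total project effort
print(plausible(180.0, 160.0))   # False: at least one parameter value must be wrong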
In general, the sanity check seems consistent with published results (for example,
Jones 2009; 2010; Humphrey 2008) if one assumes the scenarios considered are
roughly representative of CMMI levels 1 to 5, respectively.
The Excel model described here is available from the author at no cost – to obtain a
copy, including additional detail about the model and the parameters used, email
ggack@process-fusion.net
BIOGRAPHY
Gary Gack is the founder and president of Process-Fusion.net, a provider of
assessments, strategy advice, training, and coaching relating to integration and
deployment of software and IT best practices. Gack holds an MBA from the Wharton
School, is a Lean Six Sigma Black Belt, and is an ASQ Certified Software Quality
Engineer. He has more than 40 years of diverse experience, including more than 20
years focused on process improvement. He is the author of many articles and a book
entitled Managing the Black Hole: The Executive’s Guide to Software Project Risk.
LinkedIn profile: http://www.linkedin.com/in/garygack.