
DREXEL UNIVERSITY

Software Size and Effort Estimation
An exploration of algorithmic and non-algorithmic techniques
Brian Driscoll
11/8/2010

Introduction
There is little debate over the observation that accurate planning and estimation at an early
stage of the Software Development Life Cycle (SDLC) is critical to the success of any software project [1],
[2], [5], [6], [19]. However, there is also little debate that accurate prediction of the size of a software
product at an early project stage is difficult to achieve [3], [4]. Since software product size estimation is a
primary driver in the allocation of resources to a project, there is clearly significant value in conducting
the software size estimation process in such a way that it yields the most accurate results possible. A
project that is over-estimated will result in under-utilization of resources, while a project that is under-
estimated will result in over-utilization of resources, which may cause higher defect rates, or else will
result in the utilization of premium resources (consultants) in order to complete the project on or close
to schedule. To date there have been many different methods proposed to conduct software size
estimation, as well as controlled and observational studies performed to assess the effectiveness of the
same. In this paper I shall provide a constrained review of the literature on software size estimation,
covering three methods, both traditional and more recent, for predicting software product size
at early stages in the SDLC. Specifically, I shall describe the method, scope, advantages, and
disadvantages of using Function Point Analysis, Estimation by Analogy, and Use Case Points for
software size and effort estimation.

Function Point Analysis


Function Point Analysis (FPA) has become one of the most widely used methods for software size
estimation since its introduction to the software development industry in 1979 [7]. Many approaches
derived from Albrecht’s original work on the function point analysis method have been developed since,
including MKII Function Points, Feature Points, 3D Function Points, Full Function
Points, IFPUG, and COSMIC-FFP [5], [8]. The principles underlying the FPA approach are based upon the
functionality that the developed system is to provide to the end user. More specifically, the approach
entails counting the number of inputs and outputs to be made available in the application, and
weighting each by the value it has to the customer or end user. The weighted sum of these input and
output counts is referred to as “function points” [17]. Albrecht’s claims regarding Function Point Analysis
are that the number of function points possessed by a proposed system correlates highly to the final
number of Source Lines of Code (SLOC) of the developed system and to the work effort required to
produce the developed system [17].
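
The weighted sum described above can be sketched in a few lines. This is a minimal illustration, not the procedure from [17]: the five item types, the average-complexity weights, and the 14-characteristic adjustment factor follow the commonly cited IFPUG-style scheme and are assumptions of this sketch, as are the example counts.

```python
# Average-complexity weights in the style of IFPUG counting practice.
# These values are an assumption of this sketch, not figures taken from the paper.
AVERAGE_WEIGHTS = {
    "external_inputs": 4,
    "external_outputs": 5,
    "external_inquiries": 4,
    "internal_logical_files": 10,
    "external_interface_files": 7,
}


def unadjusted_function_points(counts):
    """Weighted sum of the counted items (all treated as average complexity)."""
    return sum(AVERAGE_WEIGHTS[kind] * number for kind, number in counts.items())


def adjusted_function_points(counts, characteristic_ratings):
    """Scale the unadjusted count by a value adjustment factor.

    characteristic_ratings: 14 general system characteristic ratings, each 0..5.
    """
    if len(characteristic_ratings) != 14:
        raise ValueError("expected 14 general system characteristic ratings")
    if any(not 0 <= rating <= 5 for rating in characteristic_ratings):
        raise ValueError("each rating must be between 0 and 5")
    value_adjustment_factor = 0.65 + 0.01 * sum(characteristic_ratings)
    return unadjusted_function_points(counts) * value_adjustment_factor


# Hypothetical counts for an illustrative system.
example_counts = {
    "external_inputs": 10,
    "external_outputs": 8,
    "external_inquiries": 5,
    "internal_logical_files": 4,
    "external_interface_files": 2,
}
example_ufp = unadjusted_function_points(example_counts)
example_afp = adjusted_function_points(example_counts, [3] * 14)
```

A real count would classify each item as low, average, or high complexity before weighting; treating every item as average keeps the sketch short.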

One advantage of using FPA to estimate system size and the work effort required to produce a
system is that the analysis can be performed relatively early in the SDLC, usually at any point after
requirements gathering is performed [3]. Function Point Analysis can be performed so early in the
Software Life Cycle because only inputs to and outputs from the system need to be known in order to
perform the analysis [16], [17]. According to Fehlmann, the early stage at which FPA can be performed is
attractive to business managers because it reduces the temporal and financial costs sunk into
determining whether or not a particular project is feasible. Additionally, Fehlmann points out that case
studies performed to compare FPA to other functional analysis methods consistently showed that FPA
provides similar estimates to other functional analysis methods that are performed either
simultaneously or later in the SDLC [16]. Thus, it seems advantageous to use Function Point Analysis to
estimate system size because it can be performed at a relatively early stage of the SDLC with results
comparable to those provided by other techniques, thus reducing risk exposure without degrading the
quality of the estimate.

Another advantage of using Function Point Analysis to estimate system size is that it readily
outperforms other estimation methods that attempt to directly estimate a system’s Source Lines of
Code [7]. Both informal [7] and formal [3] validation studies performed to compare the accuracy of
Function Point Analysis to SLOC-based estimation methods showed that FPA more accurately predicts
system size and development effort than SLOC-based estimation methods such as SLIM, COCOMO, and
expert estimation (though it should be noted that expert estimation of SLOC is exceedingly rare).
Additionally, as pointed out in [17] and verified in [3], Function Point Analysis can also be used to
estimate a project’s SLOC. This proves extremely beneficial for project effort estimation, as in [3] it is
shown that Estimated SLOC has a much higher correlation to Actual Man-Months than does any type of
functional size measure, including Function Point Analysis.

A third and final advantage to using Function Point Analysis to estimate system size is that it is
one of the most widely used software size estimation techniques currently in use by practitioners [7],
[18]. Given that the FPA approach is so widely used, it follows that there is a large community of practice
around the approach, as well as a large research community dedicated to studying the approach. Although
such a review is not trivial, both are readily apparent in the literature on functional sizing methods.
That there is a research community dedicated to FPA is important because it indicates that the approach
is being continuously examined, revised, and validated even as the types of software being developed
have changed.

While there are clearly advantages to using Function Point Analysis in software estimation, there
are also disadvantages to employing the approach. One of the more notable disadvantages of the
Function Point Analysis approach is that it is a poor estimator of actual financial cost of developing a
software product. A study performed by Heemstra and Kusters found that, for large projects
(> 200 man-months), organizations that used FPA to estimate project budget had more budget overruns
than those that did not use FPA [7]. Heemstra and Kusters felt that this association between use of FPA
and cost overruns could be explained by the fact that successfully implementing FPA is difficult to do,
and often requires more time than can be dedicated to the task. Nonetheless, it seems to follow that
within normal business constraints FPA is not an effective cost estimator, and therefore project
managers must either choose a complementary cost estimation technique or choose a different
estimation technique altogether that provides acceptable estimation for cost as well as other factors.

Another disadvantage of using Function Point Analysis is that it seems also to be a poor
estimator of the actual development effort required for a software product. Results from [3] indicate
that Albrecht’s formula for deriving estimated project effort from function points does not correlate well
to the actual effort expended in developing the software product. Specifically, Kemerer’s experiments in
[3] yielded an R-squared value of only 0.553 when using function points to predict effort in man-months,
whereas other predictors (most notably KSLOC) yielded much higher R-squared values for their
respective linear regression formulas. It is also indicated in [3] that the correlation between function
points and project effort suffers for projects that are dissimilar to those used to formulate the FPA
methods, namely the projects completed by the DP Services Group at IBM. Thus, Kemerer concludes that it
is likely best not to use FPA to predict project effort directly. It follows from Kemerer’s analysis that
project managers should use alternative approaches to estimate project effort, especially if the project
to be estimated is not a business data processing application such as those produced by IBM’s DP
Services Group at the time that the FPA approach was developed.
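
For readers unfamiliar with the R-squared statistic cited above, the following sketch shows how it is obtained from a simple least-squares line relating a size measure to effort. The data points are invented for illustration and are not Kemerer's; the function names are likewise hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted line."""
    slope, intercept = linear_fit(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


# Hypothetical projects: function point counts and actual effort in man-months.
function_points = [100, 200, 300, 400, 500]
man_months = [10, 25, 28, 45, 50]
fit_quality = r_squared(function_points, man_months)
```

A value near 1 means the size measure explains most of the variation in effort; a value such as 0.553 leaves nearly half of that variation unexplained.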

Estimation by Analogy
Estimation by Analogy is an approach to software size estimation that uses data recorded from
past projects to estimate new projects. To be more specific, the method entails characterizing a project
to be estimated, then finding a historical project with the same, or very nearly the same,
characterization to use as the basis for estimation [4], [9]. The actual values from the historical project
are used as the initial estimated values for the new project; however, these values may be altered
depending upon specific conditions that are unique to the new project [9], [10], [11], [12]. The general
methods used in estimation by analogy follow from the design and implementation of case-based
reasoning systems, and fall into the realm of machine learning. In machine learning, software tools used
for estimation improve their target estimates as the amount of historical project data increases.
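
The core of the method, characterizing the target project and retrieving the closest historical project, can be sketched as a nearest-neighbor search. This is a simplified illustration under stated assumptions, not the algorithm of any particular tool: the feature names, the min-max scaling, and the project records are all hypothetical.

```python
import math


def _scale(value, low, high):
    """Min-max scale a value into [0, 1]; constant features map to 0."""
    return 0.0 if high == low else (value - low) / (high - low)


def nearest_analogue(target, history, features):
    """Return the historical project closest to the target in scaled feature space."""
    bounds = {}
    for feature in features:
        values = [project[feature] for project in history] + [target[feature]]
        bounds[feature] = (min(values), max(values))

    def distance(project):
        # Euclidean distance over the scaled features.
        return math.sqrt(sum(
            (_scale(project[f], *bounds[f]) - _scale(target[f], *bounds[f])) ** 2
            for f in features
        ))

    return min(history, key=distance)


# Hypothetical historical projects with recorded actual effort.
history = [
    {"name": "A", "screens": 12, "interfaces": 2, "team_size": 4, "effort_hours": 1800},
    {"name": "B", "screens": 40, "interfaces": 9, "team_size": 10, "effort_hours": 7200},
    {"name": "C", "screens": 15, "interfaces": 3, "team_size": 5, "effort_hours": 2300},
]
target = {"screens": 14, "interfaces": 3, "team_size": 5}

analogue = nearest_analogue(target, history, ["screens", "interfaces", "team_size"])
initial_estimate = analogue["effort_hours"]  # starting point, to be adjusted by the estimator
```

Scaling each feature before measuring distance keeps a feature with a large numeric range from dominating the comparison; the retrieved effort is only an initial value that the estimator may then adjust for conditions unique to the new project.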

One of the distinct advantages of using estimation by analogy is that the process by which
estimates are created closely resembles the process followed by human estimators. This resemblance is
advantageous in two ways. First, the estimation process, including its inputs and outputs, is more
easily understood by the user than algorithmic models (such as FPA) [9]. Second, the results of the
estimation process seem to be more trusted than results produced by other methods
[11]. This increased trust appears to be due to the relative opacity of algorithmic
estimation techniques when compared to estimation by analogy. Thus, it seems it would be
advantageous to use estimation by analogy because it is more easily understood by the user and thus its
resulting estimates are more trusted by the user as well.

Another advantage held by analogy-based estimation is that it can be applied to problem
domains that are difficult to model and in which an algorithmic method might not be suitable [9], [11]. It
tends to be the case that algorithmic methods fall short when there is a significant amount of “noise”, or
irrelevant data, in the models used to generate the algorithm. Unfortunately, it is often difficult to know
exactly what data is and is not to be considered “noisy data” when generating models for estimation,
and therefore it is often difficult to eliminate this data [9], [4]. It is also difficult to generate algorithmic
models when the interactions among factors that drive project effort and/or cost are not readily known
[9]. Fortunately, the methods and tools used in analogy-based estimation are able to account for “noisy
data”, and it is not necessary to know how factors contributing to project cost and effort interact with
one another in order to generate an estimate using these tools and techniques. Thus, it is advantageous
to use estimation by analogy when the problem domain in which the target project lies is difficult to
model, when there is incomplete information regarding the significance of relationships among the
factors that contribute to project cost and effort, or when it is difficult to determine what project data is
relevant and what project data is not relevant.

Of course, just as there are advantages with estimation by analogy, there are disadvantages as
well. One of the most obvious, and perhaps most crippling, disadvantages of the approach is that there
is relatively little consistent research proving that estimation by analogy is in fact better than algorithmic
approaches. For instance, the work done by Shepperd et al. in [4] suggests that estimation by analogy
(using the ANGEL estimation tool) is superior to linear regression and stepwise multiple regression, two
algorithmic estimation techniques. However, Walkerden and Jeffrey concluded in [9] that neither the
results obtained by ANGEL, nor those obtained by ACE, another analogy-based estimation tool, were
significantly different from those obtained using linear regression or those obtained from an unaided
human subject. Unfortunately, there does not seem to be a definitive explanation for why experimental
results using estimation by analogy differ, and thus it is difficult to propose estimation by analogy as a
superior alternative to algorithmic methods for this reason.

Another disadvantage of using estimation by analogy is that it is dependent upon the existence
of a database of suitable projects from which to select a project that is analogous to the target project
[9]. There are several reasons why there may not be sufficient data available with which to generate an
estimate for a target project based on a comparable source project. First, if the target project or its
parameters are sufficiently novel or unique, then there may not be relevant data from which to derive
an estimate. Given the rapid pace of change in the software development business currently, it is not
difficult to imagine a situation whereby existing source data is obsolete (and therefore useless) when it
comes time to estimate a new project [9]. Second, it may be the case that there is not a source project
that is similar enough (or close enough, in terms of n-dimensional Euclidean distance) to the target
project to be considered a good analogue [4], [9]. This is potentially the case in emergent software
development areas, such as Software-as-a-Service projects in recent years, where the nearest neighbors
were either simple web applications or enterprise-grade client-server applications. Thus, it may be that
estimation by analogy is unsuitable for projects for which there is not sufficient historical data from
which to generate an estimate.

Use Case Points


The employment of use cases as inputs to functional size estimation was first proposed by
Karner in 1993 [15], making it a relatively new approach to functional sizing of a software system.
Even so, the approach has received much attention in recent years due to the rise of the Unified
Modeling Language and Object-Oriented Programming techniques. The approach estimates project
effort in person-hours using sufficiently detailed use case descriptions. More specifically, the approach
assigns Use Case Points based on use case attributes such as Actors and Use Cases, then weights
(adjusts) those Use Case Point values based upon technical and environmental factors. The Adjusted Use
Case Point values are then used to estimate effort with the equation:

Estimated Effort = (Use Case Points) x (Person Hours/Use Case Point)
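
The calculation can be sketched as follows. The weights, the two adjustment formulas, and the default of 20 person-hours per point follow the commonly cited formulation attributed to Karner; they are not given in this paper and should be read as assumptions of the sketch, as should the example counts and factor totals.

```python
# Weights per complexity class, as commonly cited for Karner's method (assumed here).
ACTOR_WEIGHTS = {"simple": 1, "average": 2, "complex": 3}
USE_CASE_WEIGHTS = {"simple": 5, "average": 10, "complex": 15}


def weighted_total(counts, weights):
    """Sum of count * weight over the complexity classes present in counts."""
    return sum(weights[complexity] * number for complexity, number in counts.items())


def use_case_points(actor_counts, use_case_counts, technical_factor, environmental_factor):
    """Adjusted Use Case Points.

    technical_factor and environmental_factor are the already-weighted sums of the
    individual technical and environmental factor ratings.
    """
    unadjusted = (weighted_total(actor_counts, ACTOR_WEIGHTS)
                  + weighted_total(use_case_counts, USE_CASE_WEIGHTS))
    technical_complexity = 0.6 + 0.01 * technical_factor
    environmental = 1.4 - 0.03 * environmental_factor
    return unadjusted * technical_complexity * environmental


def estimated_effort_hours(points, hours_per_point=20):
    """Estimated Effort = (Use Case Points) x (Person Hours / Use Case Point)."""
    return points * hours_per_point


# Hypothetical project: 4 actors, 9 use cases, illustrative factor totals.
points = use_case_points(
    actor_counts={"simple": 2, "average": 1, "complex": 1},
    use_case_counts={"simple": 3, "average": 4, "complex": 2},
    technical_factor=30,
    environmental_factor=15,
)
effort = estimated_effort_hours(points)
```

The person-hours-per-point rate is the parameter an organization would most plausibly calibrate from its own completed projects.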

While the Use Case Points method bears some similarity to Function Point Analysis, and may
have been influenced by the same, there are some important differences between the two approaches.
The first and most notable difference is that the Use Case Points method strictly depends on the use of
Use Cases to describe the delivered functionality of the completed system, and as such the approach
cannot be employed for projects that do not utilize use cases to describe system functionality. The
second difference is that, while the process of counting Function Points is standardized, the process of counting Use
Case Points has not yet been standardized [14]. Therefore, two separate evaluations of a project using
Function Point Analysis should yield the same number of Function Points, whereas two evaluations of a
project using Use Case Points may not.

The primary advantage of using the Use Case Points method to estimate software system size
and required effort is that it is tailored to more modern software design and development techniques,
namely the use of Unified Modeling Language and Use Cases in specifying software systems. Current
system design techniques and software development practices place an emphasis on creating a high-
level system specification up-front and deferring detailed specification until it is absolutely necessary to
do so [8]. This poses a problem for users of Function Point Analysis, as FPA requires a slightly more
detailed specification (down to the I/O level) in order to provide an estimate with relative confidence.
Use Case Points methods can be used as soon as a high-level design has been created, allowing for an
estimate to be obtained very early in the Software Development Life Cycle. Clearly, this limits exposure
to projects that are infeasible by filtering them out before a significant financial investment has been
made.

Another advantage to using the Use Case Points method to estimate software system size and
required effort is that it provides results that are at least as accurate as those provided by human
estimators with less actual effort expended in producing those results. In independent studies, both
Kusumoto et al. and Anda found that the results provided by Use Case Points estimation tools were as
accurate as, if not more accurate than, those of their human counterparts [8], [13]. Observational results provided in [8]
showed that estimates provided by the U-EST tool were 80-120% of the estimates provided by experts.
Meanwhile, the more formal results in [13] showed the Mean Magnitude of Relative Error (MMRE) to be
0.37 for human estimators, whereas the MMRE for the Use Case Points method implemented by the
author was either 0.21 or 0.29 depending upon the assignment of environmental factors. What’s more,
both studies found that results were obtained much more quickly with the use of software tools to aid
Use Case Points Analysis than were obtained from human estimators evaluating the same projects. Once
again, in current software development practices it is optimal to spend the least amount of time
possible on effort that does not contribute directly to the software product, so clearly it is advantageous
to have an estimation method and related tools that provide results comparable to those obtained from
human estimators in less time.
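
For reference, the MMRE figures quoted above are averages of per-project relative errors. A minimal sketch of the calculation, using invented actual and estimated effort values rather than data from [13]:

```python
def magnitude_of_relative_error(actual, estimate):
    """MRE for one project: absolute error as a fraction of the actual value."""
    return abs(actual - estimate) / actual


def mean_magnitude_of_relative_error(pairs):
    """MMRE over a collection of (actual, estimate) pairs."""
    return sum(magnitude_of_relative_error(a, e) for a, e in pairs) / len(pairs)


# Hypothetical (actual, estimated) effort values in person-hours.
projects = [(100, 80), (200, 250), (400, 400)]
mmre = mean_magnitude_of_relative_error(projects)
```

Lower values indicate estimates that are closer to the actual effort on average, so an MMRE of 0.21 represents a smaller typical error than one of 0.37.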

An unfortunate disadvantage of using the Use Case Points method for estimating software size
and required effort is that there is no standard by which to evaluate and apply the Use Case Points
method. While the general process of the Use Case Points method is singular and universal, adopters of
this method may only use their own best judgment to determine how to measure Unadjusted Use Case
Points [14]. A side effect of this fact is that two different implementations of the Use Case Points
method may produce two completely different estimates for the same project. Of course, it is the case
that any one implementation of the method will be deterministic, and is therefore internally valid.
However, the method itself cannot be considered consistent because there is no deterministic method
of counting Use Case Points that applies to all implementations of the method.

Another disadvantage of using Use Case Points is its dependency upon the use case as a
software design construct. One of the advantages of FPA is that the inputs to the approach can be
generalized such that function points can be calculated for 30-year-old systems developed in COBOL, as
well as for contemporary systems developed in contemporary languages. The same cannot be said for
the Use Case Points method. Although use cases are widely used today [14], there is no guarantee that
they will be nearly as widely used in the future. Thus, an organization that has made a significant
investment in adopting the Use Case Points method of estimating system size and required effort will
surely be disappointed should use cases fall out of fashion in favor of some other
design construct.

Conclusion
It is clear that all three of the approaches explored in this paper, Function Point Analysis,
Estimation by Analogy, and Use Case Points, hold promise for providing relatively accurate estimates of
software size and required development effort at a relatively early stage in the Software Development
Life Cycle. Just the same, it is clear that all three approaches have certain limitations that prevent any
one of them from being the single best approach to estimating project size or effort. Yet, despite any
limitations that these three approaches have, none should be removed from an organization’s
consideration when attempting to choose an appropriate estimation method. Rather, an organization’s
software development and business processes must factor into its decision to choose one method or
another.

Any of the methods described thus far would be an appropriate choice for organizations that
require an estimation of project size and effort as early as is reasonable in the Software Development
Life Cycle. It is known that Function Point Analysis can be performed as soon as requirements gathering
has been completed, and can be performed again after requirements have been clarified. It is also
known that the Use Case Points method can be applied as soon as use cases have been written during
the requirements gathering process, and can be performed again after use cases have been clarified.
Finally, it is known that Estimation by Analogy can be performed as soon as any pertinent information
about a proposed project is known, and can be performed again each time more information about a
proposed project is discovered.

Concerning consistency as a factor in an organization’s decision to choose a particular
estimation method, it seems as though Function Point Analysis is likely the desired approach. Since the
results of Estimation by Analogy techniques are heavily dependent upon a database of relevant
historical project data from which to select an analogous project, the results provided by different
analogy-based estimators (or even the same estimator at different times) are not likely to be consistent.
And, since a standard for Use Case Points measurement has not yet been developed, results from
different UCP systems are likely to differ – perhaps significantly – from one another. On the other hand,
obtaining Function Counts in the FPA process has been standardized by an international governing body,
so the function counts provided by one entity for one project should be the same as those provided by
another entity pertaining to the same project. Therefore, if consistency of results is a deciding factor for
an organization to choose a particular method, then Function Point Analysis is the most desirable of the
three approaches discussed.

Then again, it may be the case that some other factor, or number of factors, is key
to an organization’s decision to use one estimation approach or another. In this paper I have provided
only brief descriptions of each of three approaches to software size and effort estimation that can be
used early in the Software Development Life Cycle. There are other approaches not discussed here, as
well as other advantages and disadvantages of the included approaches that have not been explored in
detail. Further research and experimentation are required to determine empirically which of the three
approaches discussed here provides the most accurate results given similar project conditions.

References
[1] Dolado, Jose Javier. “A Validation of the Component-based Method for Software Size Estimation.”
IEEE Transactions on Software Engineering, vol. 26, no. 10, pp. 1006-1021, 2000.

[2] Offen, Raymond J, and Ross Jeffrey. “Establishing Software Measurement Programs.” IEEE Software,
March/April issue, pp. 45-53, 1997.

[3] Kemerer, Chris F. “An Empirical Evaluation of Software Cost Estimation Models.” Communications of
the ACM, vol. 30, no. 5, pp. 416-429, 1987.

[4] Shepperd, Martin, Chris Schofield, and Barbara Kitchenham. “Effort Estimation Using Analogy.”
Proceedings of the ICSE-18, pp. 170-178, 1996.

[5] Hastings, T.E., and A.S.M. Sajeev. “A Vector-Based Approach to Software Size Measurement and
Effort Estimation.” IEEE Transactions on Software Engineering, vol. 27, no. 4, pp. 337-350, 2001.

[6] Zivkovic, Ales, Marjan Hericko, and Tomaz Kralj. “Empirical Assessment of Methods for Software Size
Estimation.” Informatica, vol. 27, no. 4, pp. 425-432, 2003.

[7] Heemstra, F.J., and R.J. Kusters. “Function point analysis: evaluation of a software cost estimation
model.” European Journal of Information Systems, vol. 1, no. 4, pp. 229-237, 1991.

[8] Kusumoto, Shinji, et al. “Estimating Effort by Use Case Points: Method, Tool, and Case Study.”
Proceedings of the 10th Annual International Symposium on Software Metrics, 2004.

[9] Walkerden, Fiona, and Ross Jeffrey. “An Empirical Study of Analogy-based Software Effort
Estimation.” Empirical Software Engineering, no. 4, pp. 135-158, 1999.

[10] Idri, Ali, Alain Abran, and Taghi M. Khoshgoftaar. “Fuzzy Analogy: A New Approach For Software Cost
Estimation.” International Workshop on Software Measurement, pp. 93-101, 2001.

[11] Li, Jingzhou, et al. “A flexible method for software effort estimation by analogy.” Empirical Software
Engineering, no. 12, pp. 65-106, 2007.

[12] Jeffrey, Ross, Melanie Ruhe, and Isabella Wieczorek. “Using Public Domain Metrics to Estimate
Software Development Effort.” Seventh International Software Metrics Symposium, pp. 16-27, 2001.

[13] Anda, Bente. “Comparing Effort Estimates Based On Use Case Points with Expert Estimates.”
(Unpublished), retrieved from http://de.scientificcommons.org/42390807 on 10/31/10.

[14] Anda, Bente, Hege Dreiem, et al. “Estimating Software Development Effort Based on Use Cases –
Experience from Industry.” Lecture Notes in Computer Science, Iss. 2185, pp. 487-502, 2001.

[15] Mohagheghi, Parastoo, Bente Anda, and Reidar Conradi. “Effort Estimation of Use Cases for
Incremental Large-Scale Software Development.” ICSE ’05, 2005.

[16] Fehlmann, Thomas. “When use COSMIC FFP? When use IFPUG FPA? A Six-Sigma View.”
IWSM/MetriKon, 2006.

[17] Albrecht, Allan J., and John E. Gaffney. “Software Function, Source Lines of Code, and Development
Effort Prediction: A Software Science Validation.” IEEE Transactions on Software Engineering, vol. SE-9,
no. 6, pp. 639-648, 1983.

[18] Pow-Sang, Jose Antonio and Ricardo Imbert. “Including the Composition Relationship among
Classes to Improve Function Points Analysis.” Proceedings VI Jornadas Peruanas de Computación-JPC'07,
2007.

[19] Fetcke, Thomas, Alain Abran, and Tho-Hau Nguyen. “Mapping the OO-Jacobson Approach into
Function Point Analysis.” Proceedings of TOOLS-23’97, pp. 1-11, 1998.
