
Software Engineering Practices

Unit 10 Software Cost Estimation

Lecture


Keywords

effort, size of the product, COCOMO (COnstructive COst MOdel), organic project, embedded project, semidetached project, function point analysis (FPA), input types (EI), output types (EO), inquiry types (EQ), logical internal files (ILF), interfaces (EIF), unadjusted function points (UFP), degree of influence (DI), technical complexity factor (TCF), adjusted function point measure (FP), application composition model, early design model, post-architecture model, source lines of code (SLOC), scale factor, software understanding increment (SU), degree of assessment and assimilation (AA)

Software development takes not only time, but also money, and reliable cost and schedule estimates remain hard to obtain at an early stage. Since progress is difficult to see, schedule slippages often go undetected for quite a while, and schedule overruns are the rule rather than the exception.

Estimating the cost of a software development project all too often relies on mere guesstimates. Fortunately, there are exceptions: a number of algorithmic models now exist that estimate the total cost and development time of a software development project from estimates of a limited number of relevant cost drivers.

In most cost estimation models, a simple relation between cost and effort is assumed. The effort may be measured in man-months, for instance, and each man-month is taken to cost a fixed amount. The total estimated cost is then obtained by multiplying the estimated number of man-months by this constant factor. In this unit, we freely use the terms cost and effort as if they were synonymous.

The notion of total cost is usually taken to indicate the cost of the initial software development effort, i.e. the cost of the requirements engineering, design, implementation and testing phases. Thus, maintenance costs are not taken into account. Unless explicitly stated otherwise, this notion of cost is the one used here. In the same vein, development time is taken to mean the time between the start of the requirements engineering phase and the point in time when the software is delivered to the customer. Lastly, the notion of cost as used here does not include possible hardware costs either; it concerns only the personnel costs involved in software development.

Research in the area of cost estimation is far from mature. Different models use different measures, which makes mutual comparison very difficult.

General algorithmic models assume a relation between the effort needed (E, measured, for example, in man-months) and the size of the product (in KLOC, Kilo Lines Of Code = Lines Of Code / 1000) of the form:

[TEX]E = b \cdot KLOC^{c}[/TEX]   (10.1)

Table 10.1 - Base formulae for the relation between size and effort

Origin          b     c
Halstead        0.7   1.5
Boehm           2.4   1.05
Walston-Felix   5.2   0.91
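To see how strongly these parameter sets disagree, the base relation can be sketched in Python; the 50 KLOC size is an arbitrary example:

```python
def effort(b, c, kloc):
    """Effort in man-months from the base relation E = b * KLOC**c."""
    return b * kloc ** c

# Parameter sets from table 10.1.
MODELS = {
    "Halstead":      (0.7, 1.5),
    "Boehm":         (2.4, 1.05),
    "Walston-Felix": (5.2, 0.91),
}

for name, (b, c) in MODELS.items():
    # The same 50 KLOC product yields quite different estimates per model.
    print(f"{name}: {effort(b, c, 50):.1f} man-months")
```

The spread of the results for one and the same product size illustrates why these formulae cannot be transferred between environments without recalibration.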

Several questions come to mind immediately: What is a line of code? Do we count machine code, or the source code in some high-level language? Do we count comment lines, or blank lines that increase readability? Different models use different definitions of these notions.

The actual numbers used in those models result from an analysis of real project data. If these data reflect different project types or development environments, so will the models. Thus, differences between the formulae reflect differences between the sets of projects on which the various models are based.

These models reflect factors that bear on development cost and effort, and they allow software developers to identify strategies for improving software productivity.

10.1 COCOMO

COCOMO (COnstructive COst MOdel) is an algorithmic software cost estimation model developed by Barry W. Boehm, and one of the best-documented cost estimation models. In its simplest form, often called Basic COCOMO or COCOMO 81, the formula that relates effort to software size is (10.1), where b and c are constants that depend on the kind of project being executed.

COCOMO distinguishes three classes of project:

  - organic: a relatively small team develops familiar types of software in a familiar, in-house environment;
  - embedded: the product must operate within tightly coupled hardware, software and operational constraints;
  - semidetached: an intermediate form between organic and embedded.

Table 10.2 - Parameters of Basic COCOMO

Class          b     c
Organic        2.4   1.05
Semidetached   3.0   1.12
Embedded       3.6   1.20

The COCOMO formulae are based on a combination of expert judgment, an analysis of available project data, earlier models, and so on. Even for the projects on which the model was based, the basic model does not give very accurate results.

You can use the implementation of the Basic COCOMO model in the program USC COCOMO 81.
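The Basic COCOMO computation is small enough to sketch directly; the function and dictionary names below are ours, and the parameters are those of table 10.2:

```python
# Basic COCOMO parameters (b, c) per project class, from table 10.2.
COCOMO81 = {
    "organic":      (2.4, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded":     (3.6, 1.20),
}

def basic_cocomo(project_class, kloc):
    """Estimated effort in man-months for a project of the given class and size."""
    b, c = COCOMO81[project_class]
    return b * kloc ** c
```

For a 32 KLOC system, for example, the organic estimate is about 91 man-months while the embedded estimate is about 230: because of the larger exponent, the gap widens quickly as size grows.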

10.2 Function Point Analysis

Function point analysis (FPA) is a method of estimating costs in which the problems associated with determining the expected amount of code are circumvented. FPA is based on counting the number of different data structures that are used. In the FPA method, it is assumed that the number of different data structures is a good size indicator. FPA is particularly suitable for projects aimed at realizing business applications, because in these applications the structure of the data plays a very dominant role. The method is less suited to projects in which the structure of the data plays a less prominent role and the emphasis is on algorithms (such as compilers and most real-time software).

The following five entities play a central role in the FPA model: the number of input types (EI), the number of output types (EO), the number of inquiry types (EQ), the number of logical internal files (ILF), and the number of interfaces (EIF).

The number of unadjusted function points (UFP) is a weighted sum of these five entities:

Table 10.3 - Counting rules for UFP

Type   Complexity level
       Simple   Average   Complex
EI     3        4         6
EO     4        5         7
EQ     3        4         6
ILF    7        10        15
EIF    5        7         10

Table 10.4 - Determining complexity levels

Number of      Number of data elements
file types     1-4       5-15      > 15
0-1            Simple    Simple    Average
2-3            Simple    Average   Complex
> 3            Average   Complex   Complex

As in other cost estimation models, the unadjusted function point measure is adjusted by taking into account a number of application characteristics that influence development effort. Figure 10.1 contains the 14 characteristics used in the FPA model. The degree of influence of each of these characteristics is valued on a six-point scale, ranging from zero (no influence, not present) to five (strong influence). The total degree of influence (DI) is the sum of the scores for all characteristics.

Figure 10.1 Application characteristics in FPA

Then this number is converted to a technical complexity factor (TCF) using the formula:

[TEX]TCF = 0.65 + 0.01 \cdot DI[/TEX]

The adjusted function point measure (FP) is then obtained as [TEX]FP = UFP \cdot TCF[/TEX].

In applying the FPA cost estimation method, it still remains necessary to calibrate the various entities to the development environment used.
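The UFP and FP computations described above can be sketched as follows; the weights come from table 10.3, while the entity counts in the test are invented for illustration:

```python
# Weights from table 10.3, indexed (simple, average, complex).
UFP_WEIGHTS = {
    "EI":  (3, 4, 6),
    "EO":  (4, 5, 7),
    "EQ":  (3, 4, 6),
    "ILF": (7, 10, 15),
    "EIF": (5, 7, 10),
}
LEVEL = {"simple": 0, "average": 1, "complex": 2}

def unadjusted_fp(counts):
    """counts maps (entity type, complexity level) to a number of occurrences."""
    return sum(UFP_WEIGHTS[t][LEVEL[lvl]] * n for (t, lvl), n in counts.items())

def adjusted_fp(ufp, di):
    """Apply the technical complexity factor TCF = 0.65 + 0.01 * DI,
    where DI is the total degree of influence (0..70 over the
    14 application characteristics of figure 10.1)."""
    tcf = 0.65 + 0.01 * di
    return ufp * tcf
```

Note that since each characteristic scores 0 to 5, DI ranges from 0 to 70, so TCF ranges from 0.65 to 1.35; a DI of 35 leaves the UFP count unchanged.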

10.3 COCOMO II

COCOMO II is a revision of the basic COCOMO model, tuned to the life cycle practices of the 1990s and 2000s. It reflects our cumulative experience with and knowledge of cost estimation.

COCOMO II provides three increasingly detailed cost estimation models. These models can be used for different types of projects, as well as during different stages of a single project: the Application Composition model, the Early Design model, and the Post-Architecture model.

The Post-Architecture model can be considered an update of the original COCOMO model; the Early Design model is an FPA-like model; and the Application Composition model is based on counting system components of a large granularity, such as screens and reports.

Total effort is estimated in the Application Composition model as follows:

  1. Estimate the number of screens, reports, and 3GL components in the application.

  2. Determine the complexity level of each screen and report (simple, medium or difficult); 3GL components are assumed to be always difficult. The complexity of a screen depends on the number of views and tables it contains; the complexity of a report depends on the number of sections and tables it contains. Table 10.6 is used to determine these complexity levels for screens.

  3. Use the numbers given in table 10.5 to determine the relative effort (in Object Points) to implement each object.

  4. The sum of the Object Points for the individual objects yields the number of Object Points (OP) for the whole system.

  5. Estimate the reuse percentage r, resulting in the number of New Object Points (NOP):

     [TEX]NOP = OP \cdot (100 - r) / 100[/TEX]

  6. Determine a productivity rate PROD, the number of New Object Points that can be realized per man-month.

  7. Estimate the number of man-months needed for the project:

     [TEX]E = NOP / PROD[/TEX]

Table 10.5 - Counting Object Points

Object type      Complexity level
                 Simple   Medium   Difficult
Screen           1        2        3
Report           2        5        8
3GL component    -        -        10

Table 10.6 - Complexity levels for screens

Number      Number and source of data tables
of views    total < 4          total < 8          total >= 8
            (< 2 on server,    (2-3 on server,    (> 3 on server,
             < 3 on client)     3-5 on client)     > 5 on client)
< 3         simple             simple             medium
3-7         simple             medium             difficult
>= 8        medium             difficult          difficult
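The Application Composition steps above can be sketched as follows; the productivity rate PROD is assumed to be given (COCOMO II derives it from developer experience and tool maturity, which is not shown here), and the object counts in the test are invented:

```python
# Object Point weights from table 10.5; 3GL components are always difficult.
OP_WEIGHTS = {
    "screen": {"simple": 1, "medium": 2, "difficult": 3},
    "report": {"simple": 2, "medium": 5, "difficult": 8},
    "3GL":    {"difficult": 10},
}

def app_composition_effort(objects, reuse_pct, prod):
    """objects: iterable of (object type, complexity level) pairs.
    reuse_pct: estimated reuse percentage r.
    prod: productivity rate in New Object Points per man-month.
    Returns the estimated effort in man-months."""
    op = sum(OP_WEIGHTS[t][lvl] for t, lvl in objects)   # steps 3-4
    nop = op * (100 - reuse_pct) / 100                   # step 5: New Object Points
    return nop / prod                                    # step 7
```

For example, 10 simple screens, 4 medium reports and one 3GL component give OP = 40; with 20% reuse and a productivity rate of 8, the estimate is 4 man-months.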

The Early Design model uses unadjusted function points (UFPs) as its basic size measure, counted in the same way as in FPA. Next, the unadjusted function points are converted to Source Lines Of Code (SLOC), using a SLOC/UFP ratio that depends on the programming language used. These ratios are given in the Function Point Languages Table.

The Early Design model does not use the FPA scheme to account for application characteristics. Instead, it uses a set of seven cost drivers, which are combinations of the full set of cost drivers of the Post-Architecture model. These cost drivers are rated on a seven-point scale, ranging from extra low to extra high. The values assigned are similar to those in table 10.7.

Table 10.7 - Cost drivers and associated effort multipliers in COCOMO II

Cost driver                     Very low   Low    Nominal   High   Very high   Extra high

Product factors
  Reliability required            0.75     0.88    1.00     1.15     1.39         -
  Database size                     -      0.93    1.00     1.09     1.19         -
  Product complexity              0.75     0.88    1.00     1.15     1.30        1.66
  Required reusability              -      0.91    1.00     1.14     1.29        1.49
  Documentation needs             0.89     0.95    1.00     1.06     1.13         -

Platform factors
  Execution time constraints        -        -     1.00     1.11     1.31        1.67
  Main storage constraints          -        -     1.00     1.06     1.21        1.57
  Platform volatility               -      0.87    1.00     1.15     1.30         -

Personnel factors
  Analyst capability              1.50     1.22    1.00     0.83     0.67         -
  Programmer capability           1.37     1.16    1.00     0.87     0.74         -
  Application experience          1.22     1.10    1.00     0.89     0.81         -
  Platform experience             1.24     1.10    1.00     0.92     0.84         -
  Language and tool experience    1.25     1.12    1.00     0.88     0.81         -
  Personnel continuity            1.24     1.10    1.00     0.92     0.84         -

Project factors
  Use of software tools           1.24     1.12    1.00     0.86     0.72         -
  Multi-site development          1.25     1.10    1.00     0.92     0.84        0.78
  Required development schedule   1.29     1.10    1.00     1.00     1.00         -

After the unadjusted function points have been converted to KLOC, the cumulative effect of the cost drivers is accounted for by the formula:

[TEX]E = b \cdot KLOC^{c} \cdot \prod_{i=1}^{7} EM_{i}[/TEX]

where [TEX]EM_{i}[/TEX] is the effort multiplier associated with the i-th cost driver.
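A minimal sketch of this computation follows; b and c are calibration constants and the multipliers passed in are placeholders taken from the cost driver table, not calibrated values:

```python
from math import prod  # product of an iterable, available since Python 3.8

def early_design_effort(ksloc, b, c, multipliers):
    """Effort as the base size relation scaled by the product of the
    effort multipliers: E = b * KSLOC**c * EM_1 * ... * EM_n."""
    return b * ksloc ** c * prod(multipliers)
```

With all ratings nominal (every multiplier 1.00) the base estimate is unchanged; a penalty such as very high product complexity (1.30) can be largely offset by very high programmer capability (0.74), since 1.30 * 0.74 is roughly 0.96.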

Finally, the Post-Architecture model is the most detailed of the three. Its effort equation is very similar to that of the basic COCOMO model:

[TEX]E = b \cdot KLOC^{c} \cdot \prod_{i} EM_{i}[/TEX]

where the product now runs over the full set of cost drivers.

It differs from the original COCOMO model in its set of cost drivers, the use of lines of code as its base measure, and the range of values of the exponent b. The differences between the COCOMO and COCOMO II set of cost drivers (see table 10.7) reflect major changes in the field.

In COCOMO II, the user may use both KSLOC and UFP as a base measure. It is also possible to use UFP for part of the system. The UFP counts are converted to KSLOC counts as in the Early Design model, after which the effort equation applies.

The COCOMO II model uses five scale factors [TEX]W_{i}[/TEX], each of which is rated on a six-point scale from very low (5) to extra high (0). The exponent b for the effort equation is then determined by the formula:

[TEX]b = 1.01 + 0.01 \sum_{i=1}^{5} W_{i}[/TEX]

where the scale factors are: precedentedness (how novel this type of project is to the organization), development flexibility, architecture/risk resolution, team cohesion, and process maturity.
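The exponent computation can be sketched as follows; the additive constant 1.01 is the value used in this formulation of COCOMO II and should be treated as an assumption of the sketch:

```python
def exponent_b(scale_factors):
    """Exponent for the COCOMO II effort equation from the five scale
    factors W_i, each rated 0 (extra high) .. 5 (very low)."""
    assert len(scale_factors) == 5
    assert all(0 <= w <= 5 for w in scale_factors)
    return 1.01 + 0.01 * sum(scale_factors)
```

The exponent thus ranges from 1.01 (all factors extra high) to 1.26 (all factors very low), so diseconomies of scale grow as the project becomes less precedented, less flexible, and less mature.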

The Basic COCOMO model has only the first two of these factors. Basic COCOMO allows us to handle reuse in the following way. The three main development phases, design, coding and integration, are estimated to take 40%, 30% and 30% of the average effort, respectively. Reuse can be catered for by separately considering the fractions of the system that require redesign (DM), recoding (CM) and re-integration (IM). An adjustment factor AAF is then given by the formula:

[TEX]AAF = 0.4 \cdot DM + 0.3 \cdot CM + 0.3 \cdot IM[/TEX]

An adjusted value AKLOC, given by

[TEX]AKLOC = KLOC \cdot AAF / 100[/TEX]

is next used in the COCOMO formulae, instead of the unadjusted value KLOC.

In this way, a lower cost estimate is obtained if part of the system is reused. By treating reuse this way, it is assumed that developing reusable components does not require any extra effort. This assumption is not realistic.

COCOMO II uses a more elaborate scheme to handle reuse effects. This scheme reflects two additional factors that impact the cost of reuse: the quality of the code being reused and the amount of effort needed to test the applicability of the component to be reused.

The extra effort needed for reuse is denoted by the software understanding increment (SU). If the software to be reused is strongly modular, closely matches the application in which it is to be reused, and is well-organized and properly documented, then SU is estimated at 10%. If the software is poorly structured and poorly documented, SU may be as high as 50%.

The degree of assessment and assimilation (AA) denotes the effort needed to determine whether a component is appropriate for the present application. It ranges from 0% (no extra effort required) to 8% (extensive test, evaluation and documentation required).

Both these percentages are added to the adjustment factor AAF, yielding the equivalent kilo number of new lines of code, EKLOC:

[TEX]EKLOC = KLOC \cdot (AAF + SU + AA) / 100[/TEX]

You can use the implementation of the COCOMO II Post-Architecture model in the program USC COCOMO II.
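The reuse adjustment described above can be sketched as follows; DM, CM, IM, SU and AA are all percentages, and the function names are ours:

```python
def aaf(dm, cm, im):
    """Adjustment factor from the percentages of the reused part needing
    redesign (DM), recoding (CM) and re-integration (IM), weighted by the
    40/30/30 split of effort over design, coding and integration."""
    return 0.4 * dm + 0.3 * cm + 0.3 * im

def ekloc(kloc, dm, cm, im, su=0.0, aa=0.0):
    """Equivalent new KLOC under the COCOMO II reuse scheme: the software
    understanding increment SU and the assessment-and-assimilation effort AA
    are added to AAF before scaling the reused size."""
    return kloc * (aaf(dm, cm, im) + su + aa) / 100
```

For example, reusing a 100 KLOC component that needs half of its design, code and integration redone (AAF = 50), with SU = 10 and AA = 4, counts as 64 KLOC of new code.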

 

 

References
  1. Barry Boehm. Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.
  2. Barry Boehm et al. Software Cost Estimation with COCOMO II. Englewood Cliffs, NJ: Prentice-Hall, 2000.

Part of the material was taken from:

  1. Hans van Vliet. Software Engineering: Principles and Practice. 2007.


© 2006—2023 Sumy State University