Software Engineering Practices

Unit 7 Software Maintenance

Lecture

Keywords

corrective maintenance, adaptive maintenance, preventive maintenance, maturity, version-centered analysis, history-centered analysis, S-program, P-program, E-program, reverse engineering (reengineering), refactoring

Software maintenance encompasses all modifications to a software product after delivery. According to IEEE, it is possible to distinguish three types of maintenance activity:

Corrective maintenance deals with the repair of faults found.
Adaptive maintenance deals with adapting software to changes in the environment, such as new hardware or the next release of an operating system, and accommodating new or changed user requirements.
Preventive maintenance concerns activities aimed at increasing the system’s maintainability, such as updating documentation, adding comments, and improving the modular structure of the system.

The correction of errors accounts for only about 21% of the total maintenance effort (fig. 7.1). Van Vliet divides the adaptive category into adaptive maintenance (changes that leave the system's functionality unchanged) and perfective maintenance (functional enhancements to the system). Of the adaptive maintenance effort in the IEEE sense, about two thirds concerns changes to accommodate changing user needs, while the remaining third largely concerns adapting software to changes in the external environment.

Figure 7.1 — Distribution of maintenance activities

The total cost of system maintenance is estimated to comprise at least 50% of total life cycle costs. Figure 7.2 gives an estimate of the number of people working in software development compared to software maintenance, according to Jones (2006).

Figure 7.2 — Distribution of maintainers to developers in USA

In many organizations, the definition of software maintenance does not follow the IEEE definition. Some organizations, for instance, classify change efforts larger than, say, three months as development rather than maintenance.

The maintenance categories mentioned above refer to the software only. Keeping software alive incurs other costs too, though. For example, new users must be trained, and the helpdesk needs to be staffed. Nowadays, these supporting costs account for around 25% of the cost of keeping a system deployed.

Another way to look at the distribution of maintenance cost and prevailing types of maintenance tasks is along the time dimension. It is possible to identify the following maintenance life cycle stages:

Introductory - the stage of a new system, during which most of the effort is spent on user support. Users have to be trained, and they will often contact the helpdesk for clarification.
Growth - the stage in which more and more users start to explore the system’s possibilities. Emphasis during this stage is on correcting faults.
Maturity - the stage in which users already know what the system can and cannot do, and ask for enhancements. In this period the system can successfully evolve.
Decline - the final stage. Technology replacement, such as a move to another platform or user interface kit, constitutes a major category of maintenance tasks during this period.

Successful maintenance requires knowledge of the application. After initial delivery, this knowledge is usually available, for example because it is explicitly transferred to the maintenance organization via documentation and training. Over time, however, this knowledge evaporates, and at some point it becomes scant. This point more or less coincides with the transition from the mature stage to the declining stage. In the latter stage, changes become tactical. For example, necessary changes are realized through patches and wrappers.

The main activities of a single maintenance task are:

Isolation - the first activity is concerned with determining the part of the system (modules, classes) that needs to be changed.
Modification - this concerns the actual changes. One or more components are adapted to accommodate the change.
Testing - after the changes have been made, the system has to be tested anew (regression testing).

As a rule, isolation takes about 40% of effort, while the other two activities each take about 30%. This distribution is not the same for all types of maintenance. For corrective maintenance, isolation often takes an even larger share, while for adaptive maintenance tasks, the actual modification takes longer. During corrective maintenance, the fault that caused the failure has to be found, and this may take a lot of effort. Once it is found, the actual modification often is fairly small. For adaptive maintenance tasks, the reverse holds.

A particularly relevant issue for software maintenance is that of reverse engineering, the process of reconstructing a lost blueprint. Before changes can be realized, the maintainer has to gain an understanding of the system. Since the majority of operational code is unstructured and undocumented, this is a major problem. More fundamentally, maintenance will remain a big issue: because of the changes made to software, its structure degrades. Specific attention to preventive maintenance activities aimed at improving system structure is needed from time to time to fight system entropy.

Lehman and Belady studied the evolution of software systems and formulated their well-known laws of software evolution. Empirical studies have given general support for these laws.

Apparently, there is quite a bit of regularity in the evolution of software. We can use this insight and try to predict the future evolution of a specific system by looking at the actual evolution of that system so far. We then base our next action on information from the past. We may, for example, decide which components to reengineer by looking at components that changed a lot in the recent past. The assumption then is that components that changed a lot in the recent past are likely to change in the near future too.
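The idea of selecting reengineering candidates from recent change activity can be sketched as follows. This is only an illustration under stated assumptions: the change log, the component names, the 180-day window and the function name recent_hotspots are all hypothetical and do not come from the lecture.

```python
from collections import Counter
from datetime import date, timedelta


def recent_hotspots(changes, today, window_days=180, top=3):
    """Rank components by how often they changed inside the recent window.

    `changes` is an iterable of (component, date_of_change) pairs.
    """
    cutoff = today - timedelta(days=window_days)
    counts = Counter(
        component for component, changed_on in changes if changed_on >= cutoff
    )
    return counts.most_common(top)


# Hypothetical change log: (component, date of change).
CHANGES = [
    ("billing", date(2024, 5, 2)),
    ("billing", date(2024, 5, 20)),
    ("billing", date(2024, 6, 11)),
    ("reports", date(2024, 4, 15)),
    ("reports", date(2024, 6, 1)),
    ("auth", date(2023, 1, 9)),  # old changes, outside the window
    ("auth", date(2023, 2, 3)),
    ("auth", date(2023, 3, 7)),
    ("auth", date(2023, 3, 8)),
]

hotspots = recent_hotspots(CHANGES, today=date(2024, 6, 30))
print(hotspots)
```

In this made-up log the auth component has the most changes overall, but none of them are recent, so it does not appear among the candidates. The choice of window length is a judgment call that the lecture does not prescribe.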

Gîrba and Ducasse (2006) distinguish two types of analysis of evolutionary data: version-centered analysis and history-centered analysis. In version-centered analysis, differences between successive versions of a system are studied. The results are typically depicted in a figure with time (i.e. successive versions) along one axis and the relevant aspects of the system on another. For example, we may consider the relative size of the different components of a system over time, as illustrated in figure 7.3.

Figure 7.3 — Size versus version

Each rectangle in figure 7.3 denotes a component. The width and height of a rectangle each stand for an attribute of that component. The width may for instance denote the number of classes of a component, while the height denotes its number of interfaces. Figure 7.3 tells us that component A is stable and small, while component D is stable and big. Component C shows a steady growth from one version to the next, and component B exhibits some ripple effects in versions 2 and 3, and is stable since then.
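The kind of reading a user performs on such a picture can be imitated on raw per-version data. The sketch below assumes one attribute (number of classes) per component and version; the numbers are invented to resemble the description above and are not read from figure 7.3, and the labels returned by describe_trend are an illustrative choice.

```python
def describe_trend(sizes):
    """Label a component's size series across successive versions."""
    deltas = [later - earlier for earlier, later in zip(sizes, sizes[1:])]
    if all(d == 0 for d in deltas):
        return "stable"
    if all(d > 0 for d in deltas):
        return "steady growth"
    if all(d < 0 for d in deltas):
        return "steady shrinkage"
    return "irregular"


# Hypothetical class counts for versions 1 to 5 of each component.
SIZES = {
    "A": [3, 3, 3, 3, 3],
    "B": [5, 8, 6, 6, 6],
    "C": [2, 4, 6, 8, 10],
    "D": [20, 20, 20, 20, 20],
}

trends = {name: describe_trend(series) for name, series in SIZES.items()}
for name, label in trends.items():
    print(name, SIZES[name], label)
```

A purely version-centered view would stop at printing the series; the labelling step is what the human reader normally does by eye.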

In a history-centered analysis, a particular viewpoint is chosen, and the evolution of a system is depicted with respect to that viewpoint. For example, figure 7.4 shows how often different components are changed together. Each node denotes a component, and the thickness of the edges denotes how often two connected components are changed together (so-called co-changes). A thicker edge between components indicates more frequent co-changes. The latter information may for instance be derived from the versioning database.

Figure 7.4 — Components that change together

From figure 7.4 we learn that components /util/figs and /util/tools are changed together frequently. The same holds for components /util/tools and /workflow/paint. The names of the components suggest that components /util/figs and /util/tools are structurally related, while /util/tools and /workflow/paint are structurally unrelated. From this additional information, we might infer that the interaction between components /util/tools and /workflow/paint deserves our attention. Alternatively, we may label the components with the (external) features they participate in, and the view then shows whether changes frequently affect different features.
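Co-change frequencies of this kind can be derived from a versioning database by counting how often two components appear in the same change set. The sketch below assumes each commit has already been reduced to the set of components it touched; the component names are those used above, but the commit data itself is hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count, for every pair of components, the commits that changed both."""
    counts = Counter()
    for changed in commits:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(changed)), 2):
            counts[pair] += 1
    return counts


# Hypothetical commits, each given as the set of components it touched.
COMMITS = [
    {"/util/figs", "/util/tools"},
    {"/util/figs", "/util/tools"},
    {"/util/tools", "/workflow/paint"},
    {"/util/tools", "/workflow/paint"},
    {"/util/figs", "/util/tools", "/workflow/paint"},
    {"/workflow/paint"},
]

edges = co_change_counts(COMMITS)
for (left, right), weight in edges.most_common():
    print(left, right, weight)
```

Each count corresponds to the thickness of an edge in a picture such as figure 7.4. Commits that touch a single component contribute no edge.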

A version-centered analysis depicts the version information as-is. It is up to the user to detect any pattern. In figure 7.3, it is the user who has to detect growing or shrinking components; the picture just presents the facts. In a history-centered analysis, some hypothesis guides the representation, and the patterns are then encoded in the representation, as in figure 7.4.

Observing that most software is subject to change in the course of its existence, Lehman and Belady formulated eight laws of software evolution.

Lehman qualified the application of such laws by distinguishing between three categories of software:

An S-program is written according to an exact specification of what that program can do.
A P-program is written to implement certain procedures that completely determine what the program can do.
An E-program is written to perform some real-world activity. How it should behave is strongly linked to the environment in which it runs, and such a program needs to adapt to varying requirements and circumstances in that environment.

The laws apply only to the last category of systems. These laws are:

Continuing Change - an E-type system must be continually adapted or it becomes progressively less satisfactory.
Increasing Complexity - as an E-type system evolves, its complexity increases unless work is done to maintain or reduce it.
Self Regulation - E-type system evolution processes are self-regulating with the distribution of product and process measures close to normal.
Conservation of Organisational Stability - the average effective global activity rate in an evolving E-type system is invariant over the product's lifetime.
Conservation of Familiarity - as an E-type system evolves, all associated with it, developers, sales personnel and users, for example, must maintain mastery of its content and behavior to achieve satisfactory evolution. Excessive growth diminishes that mastery. Hence the average incremental growth remains invariant as the system evolves.
Continuing Growth - the functional content of an E-type system must be continually increased to maintain user satisfaction over its lifetime.
Declining Quality - the quality of an E-type system will appear to be declining unless it is rigorously maintained and adapted to operational environment changes.
Feedback System - E-type evolution processes constitute multi-level, multi-loop, multi-agent feedback systems and must be treated as such to achieve significant improvement over any reasonable base.

The term reverse engineering (reengineering) as applied to software means different things to different people. In their paper surveying the various uses and defining a taxonomy, Chikofsky and Cross define reverse engineering as "the process of analyzing a subject system to create representations of the system at a higher level of abstraction". It can also be seen as "going backwards through the development cycle". In this model, the output of the implementation phase (in source code form) is reverse-engineered back to the analysis phase, in an inversion of the traditional waterfall model. In practice, two main types of reverse engineering emerge. In the first case, source code is already available for the software, but higher-level aspects of the program, perhaps poorly documented or documented but no longer valid, have to be discovered. In the second case, there is no source code available for the software, and any efforts towards discovering one possible source code for the software are regarded as reverse engineering. This second usage of the term is the one most people are familiar with. Reverse engineering of software can make use of the clean room design technique to avoid copyright infringement.

Restructuring is also done while the system is being adapted; it is a popular practice in Extreme Programming (XP). Refactoring concerns improving the internal structure of an existing program's source code while preserving its external behavior. The functionality of the system does not change. The transformation of spaghetti-code to structured code is a form of refactoring. The redesign of a system (possibly after a design recovery step) is another example of refactoring. In agile methods, refactoring the code to improve its design is an explicit process step. Refactoring is a white-box method, in that it involves inspection of and changes to the code. It is also possible to modernize a system without touching the code.
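A small, invented example may make the notion of behavior-preserving restructuring concrete. Both functions below compute the same hypothetical shipping cost; the second has a cleaner internal structure, and the closing loop checks that the external behavior is unchanged. The domain, the names and the fee values are assumptions chosen only for illustration.

```python
def shipping_cost_before(weight_kg, express):
    """Original version: nested branches with duplicated arithmetic."""
    cost = 0
    if weight_kg <= 1:
        if express:
            cost = 5 * 2
        else:
            cost = 5
    else:
        if express:
            cost = (5 + (weight_kg - 1) * 2) * 2
        else:
            cost = 5 + (weight_kg - 1) * 2
    return cost


BASE_FEE = 5            # fee covering the first kilogram
FEE_PER_EXTRA_KG = 2    # fee for each kilogram beyond the first
EXPRESS_MULTIPLIER = 2  # express delivery doubles the standard cost


def shipping_cost_after(weight_kg, express):
    """Refactored version: named constants, one formula, no duplication."""
    extra_kg = max(weight_kg - 1, 0)
    standard = BASE_FEE + extra_kg * FEE_PER_EXTRA_KG
    return standard * EXPRESS_MULTIPLIER if express else standard


# Regression check: the refactoring must not change observable results.
for weight in (0, 1, 2, 3.5):
    for express in (False, True):
        assert shipping_cost_before(weight, express) == shipping_cost_after(weight, express)
```

The final loop plays the role of the regression test that accompanies any maintenance change: it gives some evidence, not proof, that the two versions agree.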

Changing software impairs its structure. By a conscious application of software quality assurance procedures during maintenance, developers may limit the negative effects. If the software quality factors that affect maintenance effort and cost are known, those factors can be measured and preventive actions may be taken accordingly.

Quality control issues get considerable attention during software development. Software quality assurance, however, should broaden its scope to maintenance as well. The implementation of changes during maintenance requires the same level of quality assurance as development work. The components of software quality assurance procedures, as discussed in Unit 6, apply equally well to software maintenance.

Software quality assurance can be backed up by measurements that quantify quality aspects. Relationships between such measures can then be sought.

A particularly relevant issue during maintenance is to decide when to reengineer. At a certain point in time, evolving an old system becomes next to impossible and a major reengineering effort is required or the system enters the servicing stage. There are no hard figures on which to decide this, but certain system characteristics indicate system degradation:

Frequent system failures;
Overly-complex program structure and logic flow;
Code written for previous generation hardware;
Running in emulation mode;
Very large modules or subroutines;
Excessive resource requirements;
Hard-coded parameters that are subject to change;
Difficulty in keeping maintenance personnel;
Seriously deficient documentation;
Missing or incomplete design specifications.

The greater the number of such characteristics present, the greater the potential for redesign.
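Since there are no hard figures, the most that can be automated is counting how many of the listed characteristics have been observed for a system. The sketch below does just that; the function name and the sample observations are hypothetical, and no threshold is implied.

```python
DEGRADATION_INDICATORS = (
    "frequent system failures",
    "overly-complex program structure and logic flow",
    "code written for previous generation hardware",
    "running in emulation mode",
    "very large modules or subroutines",
    "excessive resource requirements",
    "hard-coded parameters that are subject to change",
    "difficulty in keeping maintenance personnel",
    "seriously deficient documentation",
    "missing or incomplete design specifications",
)


def count_indicators(observed):
    """Return how many of the listed degradation indicators were observed."""
    unknown = set(observed) - set(DEGRADATION_INDICATORS)
    if unknown:
        raise ValueError(f"not in the checklist: {sorted(unknown)}")
    return len(set(observed))


# Hypothetical assessment of one legacy system.
OBSERVED = [
    "frequent system failures",
    "very large modules or subroutines",
    "seriously deficient documentation",
]

present = count_indicators(OBSERVED)
print(f"{present} of {len(DEGRADATION_INDICATORS)} indicators present")
```

Such a count is only useful for comparing systems or tracking one system over time; deciding when to reengineer remains a judgment call.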

Improvements in software maintenance require insight into the factors that determine maintenance cost and effort. Software metrics provide such insight. To measure is to know. By carefully collecting and interpreting maintenance data, we may discover the major cost drivers of software maintenance and initiate actions to improve both quality and productivity.

References

Jones, C. The Economics of Software Maintenance in the Twenty First Century, Version 3, February 14, 2006. Technical report, http://www.spr.com.
Gîrba, T. and Ducasse, S. Modeling history to analyze software evolution. Journal of Software Maintenance and Evolution: Research and Practice, 18, 2006: 207-236.
Herraiz, Israel; Rodriguez, Daniel; Robles, Gregorio; Gonzalez-Barahona, Jesus M. "The evolution of the laws of software evolution". ACM Computing Surveys. 46 (2), 2013: 1-28.
Chikofsky, E. J.; Cross, J. H. "Reverse Engineering and Design Recovery: A Taxonomy". IEEE Software 7 (1), 1990: 13-17.

Part of the material was taken from:

Software Engineering: Principles and Practice. Hans van Vliet. 2007.
