Quality Planning for Software Development

Tom Walton
Alcatel Networks
tom.walton@alcatel.com

Abstract

A historical project is used as a reference model for the verification planning process. The planning equations and an example are presented. The example objective is to increase the level of defect removal from the final software from the historical level of 35% to about 88% of the defects introduced into the code, and to reduce the overall number of delivered defects/KLOC by 85%. The example illustrates the methods used to estimate the phase-by-phase effort required to achieve the specified goal. The use of the planning model and the resulting data in day-to-day activities is also discussed.

1. Introduction

Any plan has two critical elements: the objective and the resources to achieve that objective. Presented here is a method of quantitatively determining the resources required to achieve a specified level of defect removal during a software development project.

1.1 The Problem

Once a software development work product (such as a design document or a code module) is completed, the number of defects it contains is fixed. All subsequent effort is spent finding and removing these defects. Intuitively, spending more time and effort finding defects will result in a more reliable product. The problem for planners is determining how much time to spend and on which processes to spend it. The method presented here provides planners with a tool for allocating the resources necessary to achieve their defect removal goals.

1.2 The Basic Equation

The method presented is based on an observation made at Alcatel and described in a paper [1]. Based on the results of 1400 code inspection reports, the rate of defect discovery (defects per person-hour of effort) was observed to be neither a function of the size of the piece of code being inspected nor of the rate at which it was inspected. Figure 1-1 summarizes the original data.

[Figure 1-1. Averaged Code Inspection Data (Code Inspection Results Summary)]

This observation is expressed by the equation:

    Defects/KLOC x KLOC/hr = Constant    (Equ. 1-1)

The number of defects found per hour of effort is not a true constant. We have found that it is roughly proportional to the defect density of the code, but treating it as constant is a very good approximation within any given software development project.
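Read as a planning tool, Equation 1-1 says that verification effort scales linearly with the number of defects to be removed. A minimal sketch of that use (the rate constant below is invented for illustration, not taken from the paper's data):

```python
# Planning use of Equ. 1-1 (hypothetical numbers). If the discovery
# rate (defects per person-hour) is roughly constant for a project,
# effort scales linearly with the number of defects to be removed.

def verification_effort(defects_to_find: int, discovery_rate: float) -> float:
    """Person-hours needed to find a given number of defects at a
    constant discovery rate (defects per person-hour)."""
    return defects_to_find / discovery_rate

# Example: a phase must remove 200 defects; suppose inspections on this
# project historically find 0.21 defects per person-hour.
print(f"{verification_effort(200, 0.21):.0f} person-hours")  # ~952
```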

The observation was based on code inspection data, but it soon became clear that the equation applies to all systematic verification processes, and to requirements and design documentation as well as to program code. Figure 1-2 provides an illustration of the principle taken from integration testing.

[Figure 1-2. Constant Rate of Defect Discovery (Integration Test Defect Trend: defects found vs. cumulative total effort, hr)]

The equation provides a design curve that is applicable to every verification process used in software development. Using this equation, managers may specify the desired level of defect removal and accurately calculate the amount of effort required to achieve it.

2. Verification Effectiveness

Verification Effectiveness (VE) for a process is the ratio of the defects found by the process to the defects presented to the process. It provides a method of measuring the effectiveness of a verification process that is independent of any preceding or following processes. An example is shown in Table 2-1.

[Table 2-1. Verification Effectiveness Calculation]

There are limits to the effectiveness values that are achievable. A good summary of the range of effectiveness values that can be expected was given by Capers Jones [2], but it is best to be guided by your own experience.

Since the rate of defect discovery is approximately proportional to the defect density, the amount of effort required to achieve a given value of effectiveness is almost constant, all else remaining equal. Because of this property, effectiveness was selected as the adjustable value for the planning process. All other planning parameters are either historical values, calculated from historical values, or project-specific estimates.
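Because each process is presented with exactly the defects the previous one left behind, the VE bookkeeping chains naturally from phase to phase. A sketch with invented counts (the paper's actual worked example is Table 2-1):

```python
# Verification Effectiveness: VE = defects found / defects presented.
# Invented counts; each process is presented with the survivors of the
# process before it.

phases = [                 # (process, defects found by the process)
    ("Code Inspection", 300),
    ("Unit Test", 160),
    ("Integration Test", 120),
]

presented = 1000           # defects present when verification begins
for name, found in phases:
    ve = found / presented
    print(f"{name:18s} presented={presented:4d} found={found:3d} VE={ve:.3f}")
    presented -= found     # survivors go on to the next process
```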
3. Assumptions

There are two types of defects in a work product: defects inherited from the parent work product, termed communicable defects, and errors made while producing the work product itself. In the case of program code, this means that there are code defects that are the result of errors in the design documentation, whatever their root cause, plus code defects caused by mistakes made while coding. It is assumed that the density of errors caused by mistakes is constant, while the number of defects due to communicable defects in the parent is proportional to the number of defects remaining in the parent when it was used. The constant of proportionality is called the Breeding Factor, denoted F. Therefore, the number of defects expected in a body of code can be found using Equation 2-2:

    N_code = Size x D_c + N_d x F_c    (Equ. 2-2)

where:
    N_code = the expected number of code defects,
    Size = lines of code,
    D_c = the defect density for coding errors,
    N_d = the number of design defects remaining at coding,
    F_c = the average number of code defects per design defect remaining at coding.

A similar equation can be written for design documentation based on the defects remaining in the requirements. These factors will not remain exactly constant from project to project but, unless there is a significant change in the development processes or the process is sensitive to size changes, they will support estimates that are accurate enough for project planning. As each process occurs, actual project data can be used to improve the estimates.

The second assumption is that the four levels of code verification occur in the order in which they are listed. This is necessary in order to determine the number of defects present in the code when each process is applied. The third assumption is that when defects are found, they are removed without introducing a significant number of additional defects.

3.1 Which Defects Count

Code inspection will find coding standard violations, while system test will not. In order to maintain consistency, the defects counted and used in the planning process must all be of the same type. Only the kind of defects that system test is capable of finding may be counted when calculating the effectiveness of the other phases. These are termed Functional defects, and they have the capacity to cause abnormal system behavior. Functional defects found by tracing or root cause analysis are not included in the planning numbers.
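Equation 2-2 is easy to mechanize. A sketch (the parameter values below are hypothetical, chosen only to be of the same order as the example project's):

```python
# Defect injection model (Equ. 2-2), with invented parameters.
# Defects in a work product = the producer's own mistakes (density * size)
# plus communicable defects bred from defects left in the parent.

def expected_defects(size_kloc: float, density_per_kloc: float,
                     parent_defects_left: float, breeding_factor: float) -> float:
    return size_kloc * density_per_kloc + parent_defects_left * breeding_factor

# Hypothetical: 400 KLOC, 5.5 coding errors/KLOC, 458 design defects
# still present when coding starts, 1.7 code defects per design defect.
print(f"Expected code defects: {expected_defects(400, 5.5, 458, 1.7):.0f}")  # ~2979
```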

4. The Planning Process

The objective of the example project is to reduce the number of delivered defects, adjusted for project size, by 85%. This is an arbitrary target.

4.1 Defect Estimates

There are a number of defect estimating methods provided in the literature [3, 4]. For this paper, Equation 2-2 is assumed to apply, using the parameters shown in Table 4-1. The estimate of the number of defects present in the software is based on the historical defect density, the product size and the level of defect removal, and the effect of communicable defects is taken into account. The data for the base case is shown in Tables 4-1, 4-2 and 4-3. The base case effort data is from the reference project history.

[Table 4-1. Project Estimating Parameters: Work Product, Size, D, F]

Some of the historical SW Design document defects are the result of undetected requirement errors, and some of the program code defects are the result of undetected design errors. Thus, finding a greater fraction of the requirement defects will result in relatively fewer design and code defects.

Table 4-2. Base Case Defect Data

    Work Product   Process           Starting   VE       Left
    Requirements   Req. Inspection        590   0.4356    333
    SW Design      Des. Inspection       1330   0.3436    873
    Code           Code Inspection       3712   0.0685   3458
    Code           Unit Test             3458   0.0628   3241
    Code           Int. Test             3241   0.1449   2772
    Code           Sys. Test             2772   0.1264   2421

Table 4-3. Base Case Effort Data

    Verification Process   Effort per Defect (hr)
    Req. Inspection         1.50
    Des. Inspection         3.10
    Code Inspection         4.80
    Unit Test               7.28
    Integration Test       10.64
    System Test            17.50

4.2 Planned Discovery Profile

To illustrate the planning process, the base case project is re-planned. Table 4-4 shows the effect of selecting better values of VE on the number of defects initially inserted, the number removed and the number remaining. The data is calculated using Equation 2-2.

Table 4-4. Planned Defect Discovery Profile

    Work Product   Process           Starting   VE     Left
    Requirements   Req. Inspection        590   0.70    177
    SW Design      Des. Inspection       1145   0.60    458
    Code           Code Inspection       3006   0.60   1202
    Code           Unit Test             1202   0.45    661
    Code           Int. Test              661   0.30    463
    Code           Sys. Test              463   0.20    370

The planner specifies the effectiveness values required to achieve the desired level of defect removal. The plan may be modified by adjusting the specified effectiveness values. The planned effectiveness values are somewhat arbitrary, but the choices should reflect the relative cost, including repairs, of finding a defect using each process, and the proven capability of each process.

4.3 Effort Calculation

Table 4-4 shows what will happen if more defects are removed by each process. Since at each stage the number of defects present is reduced compared to the base case, the effort required to find each defect will be greater than in the base case. The discovery effort from the reference project is adjusted using the Weibull function shown in Equation 4-1:

    D = N (1 - e^(-aT^b))    (Equ. 4-1)

where:
    D = the number of defects found after cumulative verification effort T,
    N = the total number of defects present,
    T = the cumulative verification effort.

For the reference project, b = 1.0333; generally, b is very close to one. The value of a is not important; a value of 0.002 was used.

Two adjustment factors are required to adjust the historical effort data: one to reflect the lower initial defect density (Ad) and one to reflect the improved defect removal (Ap). The adjustment for reduced initial density is simply the ratio of the historical and planned densities; thus, for the code verification processes, Ad = 1.235. Figure 4-1 illustrates how the effort adjustment factor (Ap) is determined for Unit Testing.

[Figure 4-1. Effort Adjustment for Unit Test: defects found vs. dimensionless effort; Ap = Reference Slope / Plan Slope]
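Numerically, determining Ap amounts to inverting Equation 4-1 at the cumulative found-fractions that bound each process's segment of the curve and comparing the average slopes of the two segments, as in Figure 4-1. A sketch of that calculation as we read it (the helper names are ours); with the Unit Test effectiveness values from Tables 4-2 and 4-4 it reproduces the paper's Ap of about 2.77:

```python
import math

A, B = 0.002, 1.0333  # Weibull parameters from the reference project

def t_at(f: float) -> float:
    """Invert Equ. 4-1 (normalized to N = 1): the dimensionless effort T
    at which a cumulative fraction f of the defects has been found."""
    return (math.log(1.0 / (1.0 - f)) / A) ** (1.0 / B)

def slope(f1: float, f2: float) -> float:
    """Average discovery rate (fraction found per unit effort) over the
    segment of the Weibull curve that one process occupies."""
    return (f2 - f1) / (t_at(f2) - t_at(f1))

# Unit Test: cumulative fractions found before and after the process.
ref = (0.0685, 0.0685 + (1 - 0.0685) * 0.0628)  # reference VEs (Table 4-2)
plan = (0.60, 0.60 + (1 - 0.60) * 0.45)         # planned VEs (Table 4-4)

ap = slope(*ref) / slope(*plan)  # Figure 4-1: reference slope / plan slope
print(f"Ap = {ap:.3f}")                                  # ~2.766 (paper: 2.768)
print(f"Unit Test hr/defect = {7.28 * ap * 1.235:.2f}")  # ~24.87 (paper: 24.88)
```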

The resulting calculated effort per defect is the historical effort multiplied by the two adjustment factors. For unit testing, Ap = 2.768; the estimated average effort to find a defect by unit testing therefore becomes 24.88 hours, assuming that the planned number of defects is removed by code inspection. The effort adjustments require that the standard Weibull function be solved for T, the verification effort, at the beginning and end of each process. The calculations for all phases are summarized below.

4.3.1 Requirements and Design

Since each work product has a different number of defects when it is completed, each work product must be treated separately. The calculations for Requirements and Design are identical, since only one verification phase is applied to each. The Weibull function is solved for time twice in each case, once for the reference case and once for the planned case. Ap is the ratio of the two resulting average defect discovery slopes, SR and SP (see Figure 4-1). Where applicable, the defect density adjustment is also applied.

[Table 4-5. Requirements Defect Discovery Effort]

For requirements defects, Ad = 1. Multiplying these values by the base case effort yields a requirements defect discovery effort of 1.918 hours per functional defect.

Table 4-6. Design Defect Discovery Effort

    Case        Found   T        S
    Reference     -     177.17   2.218
    Planned      687    376.05   1.827

For design defects, Ad = 1.162, the ratio of the two design defect densities. Multiplying these values by the base case effort yields a design defect discovery effort of 4.383 hours per functional defect.

4.3.2 Code

For code inspection, the calculation of the effort adjustments is the same as for the design inspections. For the subsequent processes (unit test, etc.), the number of defects remaining depends on the number of defects found by the previous processes. Thus, the segments of the Weibull curve used to calculate the adjustment factor do not start at the origin, but where the previous process left off. This is illustrated in Figure 4-1 for Unit Testing. Repeating the calculations for all of the phases yields the results set out in Table 4-7. Table 4-8 shows the calculated total verification effort for each verification process.

Table 4-7. Effort Adjustment Factors

    Process            Ap      Ad      A
    Unit Test          2.768   1.234   3.416
    Integration Test   4.084   1.234   5.040
    System Test        4.790   1.234   5.911

Table 4-8. Calculated Total Effort

    Process            Planned Found   Historical Effort/Defect (hr)   Total Effort (hr)
    Req. Inspection          413          1.50                             792
    Des. Inspection          687          3.10                            3005
    Code Inspection         1804          4.80                           14513
    Unit Test                541          7.28                           13454
    Integration Test         198         10.64                           10618
    System Test               93         17.50                            9620
    Total                                                                52002

This plan will deliver an estimated 370 defects, or 0.925 defects/KLOC. This compares to the reference project, which delivered an estimated 2421 defects, or 6.053 defects/KLOC.
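Table 4-8 is then a straightforward rollup: planned defects found in each phase times the adjusted effort per defect. A sketch using the adjusted per-defect efforts quoted in the text and the factors from Table 4-7:

```python
# Rollup of the planned verification effort (Table 4-8). Each entry:
# (process, planned defects found, adjusted effort per defect in hours),
# where adjusted effort = historical effort * Ap * Ad.

plan = [
    ("Req. Inspection",   413, 1.918),          # 1.50 hr/def, Ad = 1
    ("Des. Inspection",   687, 4.383),          # 3.10 hr/def, Ad = 1.162
    ("Code Inspection",  1804, 8.06),           # 4.80 hr/def
    ("Unit Test",         541, 24.88),          # 7.28 hr/def
    ("Integration Test",  198, 10.64 * 5.040),  # A = 5.040 (Table 4-7)
    ("System Test",        93, 17.50 * 5.911),  # A = 5.911 (Table 4-7)
]

KLOC = 400
total = sum(found * effort for _, found, effort in plan)
print(f"Total verification effort: {total:.0f} hr, "
      f"{total / KLOC:.1f} hr/KLOC")  # ~52,000 hr and ~130 hr/KLOC
```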

The planning process has thus shown how to reduce the defect content of the new release by 85%, compared to the reference release, with no change in the development process. To achieve this improvement, the plan requires the expenditure of 130.0 hours/KLOC for verification, compared to the 39.3 hours/KLOC spent for verification during the reference project, an increase of 231% in verification effort.

5. Using the Planning Data

5.1 Where the Size is Known

If the size of the item being verified is known, then the size and effort data can be used to determine a verification speed. In the example project, there are 400,000 lines of code to be inspected using 14,513 hours of effort, so the average rate of inspection must be 27.5 lines per hour (400,000 lines / 14,513 hours). This rate can be used to plan each individual inspection. For example, the inspection of a new code module containing 300 lines of code would require about 11 person-hours of inspection effort. The same calculation can be applied to planning the inspection of requirements and design documents, and to unit testing. If these rates are observed in planning and conducting the verification steps, then, on average, the desired number of defects will be discovered.

If the work products can be identified in advance and their sizes estimated, then the effort required for their verification can be allocated on an individual basis. For example, if the size of one subsystem in our example project is expected to be 75,000 new and changed lines, then the effort to be allocated to code inspection for that subsystem can be calculated (2,727 hours) and allowed for in the project schedule.

It isn't necessary to allocate the verification effort evenly. If a part of the system being developed is more critical than the rest, the effort spent on inspection and test for the critical portions may be increased, and the remainder of the estimated effort allocated to the rest of the system. The verification reports for each process can be tracked to ensure that each design team is following the plan.

5.2 Where Size is Not Known

In the case of integration testing and system testing, the amount of code being exercised by each test is generally not known, so it would be difficult to determine a rate of testing. Instead, a specific number of tests are typically available, and executing them quickly or slowly will not change the number of test failures. It is therefore necessary to estimate how much effort it would take to execute this finite set of test cases and adjust the plan accordingly. It should be possible to estimate how much effort would be required to execute these tests if it is assumed that no defects are found. Given the rate of defect discovery, this effort will yield a number of defects, which will have to be repaired. The re-testing effort associated with the repair effort will yield a few more defects, and so on.
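Both the rate-based allocation of Section 5.1 and the fixed-test-set estimate of Section 5.2 reduce to a few lines of arithmetic. A sketch using the example project's inspection numbers; the Section 5.2 parameters are invented for illustration, since the paper gives none:

```python
# Section 5.1: size-based inspection planning (example project numbers).
LINES_TO_INSPECT = 400_000
INSPECTION_BUDGET_HR = 14_513

rate = LINES_TO_INSPECT / INSPECTION_BUDGET_HR           # ~27.6 lines/hr
print(f"300-line module:       {300 / rate:.0f} hr")     # ~11
print(f"75,000-line subsystem: {75_000 / rate:.0f} hr")  # ~2721 (2727 at 27.5 lines/hr)

# Section 5.2: effort for a fixed test set converges as a repair/re-test
# loop. All three parameters below are assumed, not from the paper.
base_effort = 1_500.0       # hr to execute all tests if none fail
defects_per_test_hr = 0.02  # constant discovery rate (Equ. 1-1)
retest_hr_per_defect = 3.0  # re-test effort triggered by each repair

total, delta = 0.0, base_effort
while delta > 1.0:          # add re-test passes until the tail is negligible
    total += delta
    delta *= defects_per_test_hr * retest_hr_per_defect
print(f"Planned test execution effort: {total:.0f} hr")  # ~1,595
```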
5.3 Training and Awareness

It is not very useful to go through a planning process if the results are not communicated to the people who must carry out the resulting plan. It is therefore imperative that the software development and test team staff receive training on the planning process and the resulting plan. During the project, the results of the verification efforts relative to the plan must be made visible to the project staff. Requiring verification progress reports at all progress meetings can help achieve this. The Project Manager must understand the results and take action when the results are not as expected. Publishing progress charts can also improve awareness of the process and what it is achieving.

5.4 Rework Estimates

If repair effort data is available for defects found at each stage, the rework effort can be estimated from the estimated defect counts and included in the project plan and project schedule. Doing this will help to ensure timely repair of the discovered defects.

5.5 Maintaining the Plan

The quality plan must be revisited periodically and updated when things change. For example, if the estimated size of each software component is replaced with its actual size as it is completed, a continuously improved size estimate will result. This revised estimate can be fed into the quality plan, and improved effort and defect estimates will be produced.

If the actual defect density of a type of work product is different from the assumed value, it will become apparent from the rate of defect detection. For example, the estimated average defect density of the program code of the example project was 7.504 defects/KLOC, and the adjusted estimated discovery effort for Code Inspection was 8.06 hours per defect. Suppose that, after a significant fraction of the code has been inspected, the measured Code Inspection effort is found to be 9.20 hours per defect. It is likely that the actual defect density of the code is lower than initially assumed by the ratio of these two values. Applying the ratio results in an average adjusted defect density of 6.574 defects/KLOC. Generally, for code defects, it is a good idea to look at the unit testing data as well, and to adjust the estimates only if the results from both processes are in agreement.

For those items where size is known, the plan can be adjusted to reflect the actual rate of verification. Thus, if the rate of design document inspection turns out to be 0.650 pages per hour rather than the planned 0.778 pages per hour of effort, it is likely that the number of defects that will be removed will be greater than planned. If the planned effectiveness of the design inspection process is adjusted upward to reflect the lower inspection rate, then the effects of the increased defect removal on the project can be estimated.
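The density recalibration described above is a single ratio; a sketch with the numbers from the text:

```python
# Recalibrating the code defect density from observed inspection effort
# (Section 5.5). The discovery rate is roughly proportional to defect
# density, so effort per defect is inversely proportional to it.

assumed_density = 7.504   # defects/KLOC, initial estimate
planned_effort = 8.06     # hr/defect, adjusted plan value
measured_effort = 9.20    # hr/defect, observed after partial inspection

revised_density = assumed_density * planned_effort / measured_effort
print(f"Revised density: {revised_density:.3f} defects/KLOC")  # 6.574
```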

5.6 Resisting Schedule Compression

Every software development group is under pressure to deliver the product as quickly as possible, which can lead to the cutting of corners and the skipping of verification steps. The quality planning model can be used to demonstrate what will happen if this is done. Since the planning results are based on historical data, development managers can easily justify the estimates. Without process changes or improved development tools, management cannot expect the developers to greatly exceed their historical performance.

5.7 Post Mortems

The data collected during a project in support of the Quality Plan is valuable for project post mortems. The performance of the team against the plan can be evaluated and recommendations made for adjustments in the next release. The impact of any process changes made for the project should also be evident in the collected data.

REFERENCES

[1] Steven Q. Quigley, "Measuring Code Inspection Effectiveness", Proceedings of the Seventh International Conference on Applications of Software Measurement, October 1996, p. 517.
[2] Capers Jones, "Software Challenges: Software Defect-Removal Efficiency", Computer, Vol. 29, No. 4, April 1996, pp. 94-95.
[3] Kai-Yuan Cai, Software Defect and Operational Profile Modeling, Kluwer Academic Publishers, 1998.
[4] John D. Musa, Anthony Iannino and Kazuhira Okumoto, Software Reliability: Measurement, Prediction, Application, McGraw-Hill, New York, 1987.