
International Journal of Computational Intelligence and Information Security, March 2012 Vol. 3, No. 3

Application Of Genetic Algorithm In Software Testing: A Survey


Abhishek Singhal¹, Swati Chandna², Abhay Bansal³
¹ ² ³ Amity School of Engineering & Technology, Amity University, Noida
¹ asinghal1@amity.edu, ² swatichandna111@gmail.com, ³ abansal1@amity.edu

Abstract
Test data generation is the process of identifying a set of data that satisfies the criteria set for testing. Researchers have developed many kinds of test data generators, including random, symbolic, and dynamic test data generators. This paper analyses and compares the techniques that generate test cases using genetic algorithms.

Keywords: Test case generation, Genetic algorithms, Fitness functions

1. INTRODUCTION
Software testing is a major part of software engineering. Its purpose is to identify errors, and it is where verification and validation take place. It is also the most time-consuming and costly activity in satisfying customers: most of the time in software development is normally spent on testing. The main goal of software testing is therefore to design the minimal number of test cases that detects the maximum number of errors [1]. This paper compares the work of various researchers on identifying and implementing test case generators and on techniques to optimize them.

2. GENETIC ALGORITHM
Genetic algorithms are a rapidly growing area of Artificial Intelligence, inspired by Darwin's theory of evolution. A genetic algorithm starts with a set of candidate solutions, each represented by a chromosome; the set is known as the population. A new population is generated from the solutions of the previous one. New solutions are selected according to their fitness, and the procedure is repeated until a specified condition is satisfied.

3. BASIC GENETIC ALGORITHM OUTLINE


1. [START] Generate a random population of n chromosomes.
2. [FITNESS] Evaluate the fitness value of each chromosome.
3. [NEW POPULATION] Create a new population by applying genetic operators such as selection, crossover, and mutation.
4. [REPLACE] Replace the current population with the newly generated population.
5. [TEST] If the specified condition is satisfied, stop and return the solution [1]; otherwise, return to step 2.

Figure 1: The GA Algorithm
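The outline in Figure 1 can be sketched in Python as follows. This is a minimal illustration, not the implementation of any surveyed system: the bit-list chromosome encoding, tournament selection, single-point crossover, and all names and default parameters (`run_ga`, `tournament`, `mutation_rate`, and so on) are assumptions chosen for brevity.

```python
import random

def tournament(population, scores, rng, k=2):
    """Selection: return the fittest of k randomly drawn chromosomes."""
    picks = rng.sample(range(len(population)), k)
    return population[max(picks, key=lambda i: scores[i])]

def crossover(a, b, rng):
    """Single-point crossover producing two children."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]

def run_ga(fitness, n_bits=16, pop_size=20, max_generations=100,
           target=None, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # [START] random population of chromosomes (lists of bits)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(max_generations):
        # [FITNESS] evaluate every chromosome
        scores = [fitness(c) for c in population]
        # [NEW POPULATION] selection, crossover, mutation
        new_population = []
        while len(new_population) < pop_size:
            p1 = tournament(population, scores, rng)
            p2 = tournament(population, scores, rng)
            c1, c2 = crossover(p1, p2, rng)
            new_population.append(mutate(c1, mutation_rate, rng))
            new_population.append(mutate(c2, mutation_rate, rng))
        # [REPLACE] the new generation replaces the old one
        population = new_population[:pop_size]
        # [TEST] stop once the (assumed) target fitness is reached
        best = max(population + [best], key=fitness)
        if target is not None and fitness(best) >= target:
            break
    return best
```

For example, `run_ga(sum, n_bits=16, target=16)` searches for a chromosome with as many 1 bits as possible; the stopping condition here is a target fitness or a generation limit, which is one common choice among many.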


4. GENERATING TEST DATA USING GENETIC ALGORITHM


Applying a genetic algorithm to the generation of test cases is a main milestone. The code first generates random numbers. The GA then creates the chromosomes of the initial generation, each being the binary representation of one random number. Taking this as the current generation, the fitness of every chromosome is evaluated using the fitness function.

Algorithm to generate test cases using a genetic algorithm:

1. Let K be the number of elements.
2. Generate random numbers according to K.
3. Convert each random number into binary form.
4. Evaluate the fitness of every member of the population.
5. Apply mutation and crossover according to the evaluation.
6. Convert the bit string of every fit individual back into an integer.
7. Finally, pass on the generated test data and measure the time taken by the entire run.
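The seven steps above might look like the following in Python. This is a sketch under stated assumptions: inputs are 8-bit unsigned integers, the fitness function is a hypothetical one that rewards inputs close to satisfying a predicate `x == goal`, and the fitter half of each generation is kept as parents. None of these choices comes from the surveyed papers.

```python
import random
import time

N_BITS = 8  # assumption: each test input is an integer in [0, 255]

def to_bits(value):
    """Step 3: integer -> fixed-width binary string."""
    return format(value, f"0{N_BITS}b")

def to_int(bits):
    """Step 6: binary string -> integer."""
    return int(bits, 2)

def fitness(bits, goal=100):
    """Hypothetical fitness: closeness of the input to satisfying x == goal."""
    return 1.0 / (1.0 + abs(to_int(bits) - goal))

def flip(bit):
    return "1" if bit == "0" else "0"

def generate_test_data(k, generations=30, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    start = time.perf_counter()
    # Steps 1-3: K random numbers, converted to binary chromosomes
    population = [to_bits(rng.randrange(2 ** N_BITS)) for _ in range(k)]
    for _ in range(generations):
        # Step 4: evaluate fitness and keep the fitter half as parents
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:max(2, k // 2)]
        # Step 5: crossover and mutation refill the population
        children = []
        while len(children) < k - len(parents):
            a, b = rng.sample(parents, 2)
            point = rng.randint(1, N_BITS - 1)
            child = a[:point] + b[point:]
            child = "".join(flip(bit) if rng.random() < mutation_rate else bit
                            for bit in child)
            children.append(child)
        population = parents + children
    # Step 6: decode the individuals, fittest first
    ranked = sorted(population, key=fitness, reverse=True)
    test_data = [to_int(bits) for bits in ranked]
    # Step 7: report the test data and the elapsed time
    return test_data, time.perf_counter() - start
```

Calling `generate_test_data(10)` returns ten decoded test inputs, ordered from fittest to least fit, together with the elapsed time in seconds.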

5. THE RELATED WORK


Automatic generation of test cases is a major concern. Software testing research has therefore addressed random test generation, symbolic test generation, dynamic test generation and, most recently, test case generation using genetic algorithms.

Xanthakis et al. [1] applied genetic algorithms to generate test cases. In this approach the user chooses a path, and the relevant branch predicates are extracted from the program. The algorithm is then used to find test data that satisfies those branch predicates.

Roper et al. [2] developed a system that uses genetic algorithms to generate test cases that achieve branch coverage of the program. It takes a C program and instruments it with probes that provide feedback on the branch coverage achieved. The initial population is created randomly, followed by an iterative search that runs the data and measures its coverage.

Jones et al. [3] developed a system for automatic test case generation that uses a control flow graph representing three iterations of each loop, so the graphs are acyclic. When run, the algorithm records the number of branches it reaches and calculates a fitness for each branch. The fitness function takes the branch value and the branch condition.

Sthamer [4] used genetic algorithms for white-box testing, including mutation testing and branch and boundary testing. The programs under test were written in Ada and included triangle classification, binary search, and various small sorting programs. The fitness function used the predicates of the program.

Pargas et al. [5] developed an algorithm for automatic test case generation called GenerateData. The algorithm uses the control flow graph to search for test data that satisfies the given criteria. It was implemented as a tool called TGen, which evaluates the program for branch and statement coverage.

Bueno and Jino [6][7] developed a system that generates test data by identifying the paths in the program. The genetic algorithm was also used to identify infeasible paths. They developed a fitness function, based on control and data flow information, called the path similarity metric.

Michael et al. [8] discussed the application of genetic algorithms to automatic test case generation. They developed a system called GADGET, which generates test cases that satisfy condition-decision coverage. The system was developed for programs written in C and C++.

Praveen Ranjan Srivastava et al. [9] applied genetic algorithms to the problem of test generation. They treat the population as a set of test data in which each individual is an element of the test data, so that feasible test data can be generated. Their system randomly generates the initial population, evaluates the fitness function, and then applies mutation and crossover. This process continues until it finds the optimal test data.


Roya Alavi et al. [10] developed a system for generating test cases using a genetic algorithm seeded with initial test cases. They proposed a method called KMGA (K-means Genetic Algorithm), which combines k-means clustering with a genetic algorithm. The k-means algorithm clusters the test cases according to an internal criterion; its purpose is to minimize the sum of the distances of every point from its cluster centre. After clustering, the resulting centres are passed to the genetic algorithm as test instances.
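The clustering step of such an approach can be illustrated with a plain k-means routine whose centres would then seed the genetic algorithm. This is only a sketch of the general idea, not the KMGA implementation: the representation of test cases as numeric tuples, the function name `k_means`, the Euclidean distance, and the fixed iteration limit are all assumptions.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Cluster numeric tuples and return the k cluster centres."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # assumption: start from k random points
    for _ in range(iterations):
        # Assign every point to its nearest centre (Euclidean distance)
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centres[i]))
            clusters[nearest].append(p)
        # Move each centre to the mean of its cluster (keep it if empty)
        new_centres = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centres[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centres == centres:
            break  # assignments are stable
        centres = new_centres
    return centres

# Hypothetical initial test cases: pairs of numeric inputs
initial_tests = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
seeds = k_means(initial_tests, k=2)  # centres to feed the GA as test instances
```

In this toy data set the two centres settle near (0.33, 0.33) and (10.33, 10.33), so the genetic algorithm would start from one representative of each group instead of from six near-duplicate inputs.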

6. COMPARISON RESULTS
The genetic algorithm based test data generation techniques are compared along several dimensions, as summarized in Table 1. The first dimension is the coverage criterion. The techniques of Xanthakis and Bueno generate test data for selected paths of the program; each takes one path at a time in a given sequence. In the work of Xanthakis, the genetic algorithm is used to find input data that satisfies all branch predicates of a chosen path, whereas Bueno's technique can also be applied to generate test data for sub-paths from the entry node to some goal node other than the exit node. The techniques of Roper, Jones, Pargas and Michael attempt to achieve a desired level of branch coverage: all branches in the software are exercised, but loops are restricted to zero, one, two and three iterations. Pargas's technique uses the Control Dependence Graph, so the paths are acyclic, and Michael's technique uses condition-decision coverage.

The second dimension is the fitness function. Roper's fitness function is the coverage achieved, i.e. the ratio of the number of branches covered to the total number of branches. Jones considers two fitness functions: the Hamming distance and a simple reciprocal of the difference between two predicate values. The former may be applied in general, while the latter applies only to predicates in which numerical values are compared. Pargas's fitness function is the number of predicates that the execution of a test case has in common with the predicates on a control-dependence predicate path of the target. Finally, Bueno's technique uses a fitness function defined in terms of NC, the path similarity; EP, the absolute value of the path predicate (branch) function; and MEP, the maximum predicate function value among the candidates.

Many approaches are used to decide which individuals survive, such as high fitness, high average, high fitness in the selected subpopulation, and a hybrid between random selection and high fitness.
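Three of the fitness functions discussed above are simple enough to sketch directly. The function names, the 16-bit word size used for the Hamming distance, and the `delta` term that guards against division by zero are assumptions made for illustration; the cited papers may define these quantities differently.

```python
def coverage_fitness(covered_branches, total_branches):
    """Roper-style fitness: fraction of branches exercised by a test case."""
    return len(set(covered_branches)) / total_branches

def hamming_distance(a, b, n_bits=16):
    """Number of differing bits between two predicate operand values.

    A smaller distance means the operands are closer to being equal.
    """
    mask = (1 << n_bits) - 1
    return bin((a ^ b) & mask).count("1")

def reciprocal_fitness(lhs, rhs, delta=1.0):
    """Reciprocal of the difference between two numeric predicate values.

    `delta` is an assumed offset that keeps the result finite when lhs == rhs.
    """
    return 1.0 / (abs(lhs - rhs) + delta)
```

For instance, a test case that exercises branches 1, 2 and 3 of a program with four branches gets `coverage_fitness({1, 2, 3}, 4) == 0.75`, and `hamming_distance(5, 3)` is 2 because the bit patterns 101 and 011 differ in two positions.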
TABLE 1: COMPARISON ACCORDING TO COVERAGE CRITERION AND FITNESS FUNCTION

Authors                     Fitness Function                        Criteria
Xanthakis                   Branch distance values                  Path
Roper                       Percentage of coverage achieved         Branch
Jones                       Hamming distance or reciprocal          Branch
Pargas                      Common predicates                       Statement and branch
Michael                     Predicate function                      Branch (condition-decision)
Praveen Ranjan Srivastava   Based upon the priority                 Feasible path
Roya Alavi                  Cj is calculated (number of clusters)   K-means algorithm (Euclidean distance)


7. CONCLUSION AND FUTURE WORK


Genetic algorithms have made the automatic generation of test cases possible, which allows a search for optimal test data. The techniques developed so far, described in Section 5, still have many problems that can be explored further, and a new technique can be developed that generates optimal test data.

REFERENCES
[1] Xanthakis S, Ellis C, Skourlas C, Le Gall A, Katsikas S, Karapoulios K. (1992) Application of Genetic Algorithm to Software Testing. In 5th International Conference on Software Engineering and its Applications, pp. 625-636.
[2] Roper M, Maclean I, Brooks A, Miller J, Wood M. (1995) Genetic Algorithm and the Automatic Generation of Test Data. Technical Report RR/95/195, Department of Computer Science, University of Strathclyde.
[3] Jones B F, Sthamer H H, Eyres D E. (1998) Automatic Structural Testing Using Genetic Algorithm. Software Engineering Research Journal, 41(2):98-107.
[4] Sthamer H H. (1995) The Automatic Generation of Test Data Using Genetic Algorithm. Ph.D. Thesis, University of Glamorgan, Pontypridd, Wales, Great Britain.
[5] Pargas R P, Harrold M J, Peck R R. (1999) Test Data Generation Using Genetic Algorithm. Journal of Software Testing, Verification and Reliability.
[6] Bueno P M S, Jino M. (2002) Identification of Potentially Infeasible Program Paths by Monitoring the Search for Test Data. Proceedings of the Fifteenth IEEE International Conference on Automated Software Engineering.
[7] Bueno P M S, Jino M. (2011) Automatic Test Data Generation for Program Paths Using Genetic Algorithms. International Journal of Software Engineering and Knowledge Engineering; 25th Brazilian Symposium on Software Engineering (SBES).
[8] Michael C C, McGraw G E, Schatz M A. (2001) Generating Software Test Data by Evolution. IEEE Transactions on Software Engineering.
[9] Praveen Ranjan Srivastava, Priyanka Gupta, Yogita Arrawatia, Suman Yadav. (March 2009) ACM SIGSOFT Software Engineering Notes, Volume 34, Number 2.
[10] Roya Alavi and Shahriar Lofti. (2011) The New Approach for Software Testing Using a Genetic Algorithm Based on Clustering Initial Test Instances. 2011 International Conference on Computer and Software Modelling, IPCSIT vol. 14, IACSIT Press, Singapore.
