• ## Empirical Study

Far more new real programs than toy programs have been introduced...

### Mutant Reduction Techniques

One of the major sources of computational cost in Mutation Testing is the inherent running cost in executing the large number of mutants against the test set. As a result, reducing the number of generated mutants without significant loss of test effectiveness has become a popular research problem. For a given set of mutants, M, and a set of test data T, MS_T(M) denotes the mutation score of the test set T applied to mutants M. The mutant reduction problem can be defined as the problem of finding a subset mutants M' from M, where MS_T(M') =~ MS_T(M). This section will introduce four techniques used to reduce the number of mutants, Mutant Sampling, Mutant Clustering, Selective Mutation and Higher order Mutation.

Mutant Sampling: Mutant Sampling is a simple approach that randomly chooses a small subset of mutants from the entire set. This idea was first proposed by Acree [Acree80] and Budd [Budd80]. In this approach, all possible mutants are generated first as in traditional Mutation Testing. x% of these mutants are then selected randomly for mutation analysis, and the remaining mutants are discarded.

Selective Mutation: A reduction in the number of mutants can also be achieved by reducing the number of mutation operators applied. This is the basic idea underpinning Selective Mutation, which seeks to find a small set of mutation operators that generate a subset of all possible mutants without significant loss of test effectiveness. This idea was first suggested as 'constrained mutation' by Mathur [Mathur91]. Offutt et al. [OffuttRZ93] subsequently extended this idea calling it Selective Mutation.

Mutant Clustering:The idea of the Mutant clustering was first proposed in Hussain's masters thesis [Hussain08]. Instead of selecting mutants randomly, Mutant Clustering chooses a subset of mutants using clustering algorithms. The process of Mutation Clustering starts from generating all first order mutants. A clustering algorithm is then applied to classify the first order mutants into different clusters based on the killable test cases. Each mutant in the same cluster is guaranteed to be killed by a similar set of test cases. Only a small number of mutants are selected from each cluster to be used in \MT, the remaining mutants are discarded.

Higher Order Mutation: Higher Order Mutation is a comparatively new form of Mutation Testing introduced by Jia and Harman [JiaH08b]. The underlying motivation was to seek to find those rare but valuable higher order mutants that denote subtle faults. In traditional Mutation Testing, mutants can be classified into first order mutants (FOM) and higher order mutant(HOM). FOMs are created by applying mutation operator only once. HOM are generated by applying mutation operators more than once. In their work, Jia and Harman introduced the concept of subsuming HOMs. A subsuming HOM is harder to kill than the FOMs from which it is constructed. As a result, it may be preferable to replace FOMs with the single HOM to reduce the number of the mutants. In particular, they also introduced the concept of a strongly subsuming HOM (SSHOM) which is only killed by a subset of the intersection of test cases that kill each FOM from which it is constructed.

### Execution Cost Reduction Techniques

In addition to reducing the number of generated mutants, the computational cost can also be reduced by optimizing the mutant execution process. This section will introduce the three set of techniques used to optimize the execution process that have been considered in the literature.

Strong, Weak, and Firm Mutation: Based on the way in which we decide whether to analysis if a mutant is killed during the execution process, Mutation Testing techniques can be classified into three types, Strong Mutation, Weak Mutation and Firm Mutation.

Strong Mutation is often referred as traditional Mutation Testing. That is, it is the formulation originally proposed by DeMillo et al. [DeMilloLS78]In Strong Mutation, for a given program p, a mutant m of program p is said to be killed only if mutant m gives a different output from the original program p.

To optimize the execution of the Strong Mutation, Howden [Howden82] proposed Weak Mutation. In Weak Mutation, a program p is assumed to be constructed from a set of components C = {c1,...,cn}. Suppose mutant m is made by changing component cm, mutant m is said to be killed if any execution of component cm is different from mutant m. As a result, in Weak Mutation, instead of checking mutants after the execution of the entire program, the mutants only need to be checked immediately after the execution point of the mutant or mutated component.

Firm Mutation was first proposed by Woodward and Halewood [WoodwardH88]. The idea of Firm Mutation is to overcome the disadvantages of both weak and strong mutations by providing a continuum of intermediate possibilities. That is the 'compare state' of Firm Mutation lies between the intermediate states after execution (Weak Mutation) and the final output (Strong Mutation).

Runtime Optimization Techniques: The interpreter-based Technique is one of the optimization techniques used in first generation of mutation testing tools [OffuttK87]. In interpreter-based techniques, the result of a mutant is interpreted from its source code directly. The main cost of this technique is determined by the cost of interpretation. Interpreter-based tools provide additional flexibility and are sufficiently efficient for mutating small programs. However, due to the nature of interpretation, it becomes slower as the scale of programs under test increases.

To reduce the cost of interpretation, the compiler-based technique was proposed [Delamaro93, DelamaroM96], because execution of compiled binary code is much faster than interpretation. In compiler-based techniques, the mutant source is compiled into an executable program first, then each compiled mutant will be executed by a number of test cases. As each mutant generated are The overall cost of this technique is the sum of the compilation cost and the running cost. There is also a speed limitation due to the high compilation cost for large programs [ChoiM93].

DeMillo et al. proposed the compiler-integrated technique [DeMilloKM91] to optimise the performance of the traditional compiler-based techniques. Because there is only a minor syntactic difference between each mutant and the original program, compiling each mutant separately in the compiler-based technique will result in redundant compilation cost. In the compiler-integrated technique, a instrumented compiler is designed to generate and compile mutants.

The Mutant Schema Generation approach is also designed to reduce the overhead cost of the traditional interpreter-based techniques [Untch92, UntchOH93, Untch95]. Instead of compiling each mutant separately, the mutant schema technique generates a metaprogram. Just like a "super mutant" this metaprogram can be used to represent all possible mutants. Therefore, to run each mutant against the test set, only this metaprogram need be compiled. The cost of this technique is composed of a one-time compilation cost and the overall runtime cost. As this metaprogram is a compiled program, its running speed is faster than the interpreter-based technique.

The most recent work on reduction of the compilation cost is the Bytecode Translation technique. This technique is another way to first proposed by Ma et al. [OffuttMK04, MaOK05]. In Bytecode Translation, mutants are generated from the compiled object code of the original program, instead of the source code. As a result, the generated `bytecode mutants' can be executed directly without compilation. As well as saving compilation cost, Bytecode Translation can also handle off-the-shelf programs which do not have available source code.

Advanced Platforms Support for Mutation Testing: Mutation Testing has also been applied to many novel computer architectures to distribute the overall computational cost among many processors. In 1988, Mathur and Krauser [MathurK88] were the first to perform Mutation Testing on a vector processor system. Krauser et al. [KrauserMR88,KrauserMR91] proposed an approach for concurrent execution mutants under SIMD machines. Fleyshgakker and Weiss [FleyshgakkerW94,WeissF93] proposed an algorithm that significantly improved techniques for parallel \MT. Choi and Mathur [ChoiM93] and Offutt et al. [OffuttPFK92] have distributed the execution cost of \MT~through MIMD machines. Zapf [Zapf93] extended this idea in a network environment, where each mutant is executed independently.