Comparison with Evolutionary Algorithms

Table 1: Comparison of ASAP, FlipGA, RFGA and SAW on Test Suite 1

			ASAP		FlipGA		RFGA		SAW
Inst.	n	m	SR	AFES	SR	AFES	SR	AES	SR	AES
1	30	129	1.00	27	1.00	120	1.00	253	1.00	754
2	30	129	1.00	2024	1.00	1961	1.00	14370	1.00	88776
3	30	129	1.00	498	1.00	784	1.00	6494	1.00	12516
4	40	172	1.00	59	1.00	189	1.00	549	1.00	3668
5	40	172	1.00	54	1.00	165	1.00	316	1.00	1609
6	40	172	1.00	1214	1.00	1618	1.00	24684	0.78	154590
7	50	215	1.00	85	1.00	219	1.00	480	1.00	2837
8	50	215	1.00	103	1.00	435	1.00	8991	1.00	8728
9	50	215	1.00	7486	1.00	11673	0.92	85005	0.54	170664
10	100	430	1.00	62939	1.00	132371	0.54	127885	0.16	178520
11	100	430	1.00	281	1.00	1603	1.00	18324	1.00	43767
12	100	430	1.00	298	1.00	1596	1.00	15816	1.00	37605

We will consider three evolutionary algorithms for SAT, here called FlipGA [9], RFGA [7] and SAW [1]. FlipGA is a heuristic based genetic algorithm combining a simple GA with the Flip Heuristic. RFGA uses an adaptive refining function to discriminate between chromosomes that satisfy the same number of clauses and a heuristic mutation operator. The SAW algorithm is a (1, $\lambda^*$ ) ( $\lambda^*$ is the best $\lambda$ found in a suitable number of test experiments) evolutionary strategy using the SAW-ing (stepwise adaptation of weights) mechanism for adapting the fitness function according to the behavior of the algorithm in the previous steps. We test ASAP on the same instances (test suites 1, 2) used in [1,7,9], which are 3-SAT instances generated using the generator developed by Allen van Gelder. These instances are available at http://www.in.tu-clausthal.de/~gottlieb/benchmarks/3sat. All instances lay in the phase transition, where the number of clauses is approximately 4.3 times the number of the variables.

- Test suite 1 contains four groups of three instances each. The groups have a number of variables of 30,40,50 and 100.

The performance of genetic algorithms is generally evaluated by means of two measures: the Success Rate (SR), that is, the percentage of runs in which the algorithm found a solution for that instance or group of instances; and the Average number of evaluations to Solution (AES) index, which counts the average number of fitness evaluations performed to find the solution. Note that the AES takes into account only successful runs.

Since our algorithm uses also local search, we use an approximated estimation of the AES called Average Flip cost in terms of fitness Evaluation to Solution(AFES). The AFES index is based on the number of flips performed during the execution of the local search (both accepted and not accepted flips are counted) and is an estimation of the cost of the local search step in terms of fitness evaluations. If the local search performs n_flips flips (including accepted and not accepted flips), one can estimate a cost of K * n_flips/n_vars fitness evaluations (cf. [9]), where n_vars is the number of variables in the instance and K is the clause length. This applies only to K-SAT instances which are randomly generated.

The results of the experiments are given in Tables 1, 2, where n and m denote the number of variables and of clauses, respectively. All the algorithms are run 50 times on each problem instance, and the average of the results is reported. Moreover, the termination conditions for all algorithms is satisfied either if a solution is found or if a maximum of 300000 chromosomes have been generated.

The results show that ASAP has a very good performance, with SR equal to 1 in all instances, and smaller AFES than FlipGA in all but one instance (instance 2) where it has AFES slightly bigger than FlipGA.