Genetic Programming and Evolvable Machines

Software review: PonyGE2
Vu TM
Learning feature spaces for regression with genetic programming
La Cava W and Moore JH
Genetic programming has found recent success as a tool for learning sets of features for regression and classification. Multidimensional genetic programming is a useful variant of genetic programming for this task because it represents candidate solutions as sets of programs. These sets of programs expose additional information that can be exploited for building block identification. In this work, we discuss this architecture and others in terms of their propensity for allowing heuristic search to utilize information during the evolutionary process. We investigate methods for biasing the components of programs that are promoted in order to guide search towards useful and complementary feature spaces. We study two main approaches: 1) the introduction of new objectives and 2) the use of specialized semantic variation operators. We find that a semantic crossover operator based on stagewise regression leads to significant improvements on a set of regression problems. The inclusion of semantic crossover produces state-of-the-art results in a large benchmark study of open-source regression problems in comparison to several state-of-the-art machine learning approaches and other genetic programming frameworks. Finally, we look at the collinearity and complexity of the data representations produced by different methods, in order to assess whether relevant, concise, and independent factors of variation can be produced in application.
Automated discovery of test statistics using genetic programming
Moore JH, Olson RS, Chen Y and Sipper M
The process of developing new test statistics is laborious, requiring the manual development and evaluation of mathematical functions that satisfy several theoretical properties. Automating this process, hitherto not done, would greatly accelerate the discovery of much-needed, new test statistics. This automation is a challenging problem because it requires the discovery method to know something about the desirable properties of a good test statistic in addition to having an engine that can develop and explore candidate mathematical solutions with an intuitive representation. In this paper we describe a genetic programming-based system for the automated discovery of new test statistics. Specifically, our system was able to discover test statistics as powerful as the t-test for comparing sample means from two distributions with equal variances.
Guest editorial: special issue on selected papers from the European conference on genetic programming
Silva S and Foster JA
Evolutionary algorithms and synthetic biology for directed evolution: commentary on "on the mapping of genotype to phenotype in evolutionary algorithms" by Peter A. Whigham, Grant Dick, and James Maclaurin
Kell DB
I rehearse two issues around the commentary of Whigham and colleagues. (1) There really are many more reasons than those given as to why natural evolution cannot reasonably find or select the 'optimal' individual. (2) A series of experimental molecular biology programmes, known generically as directed evolution, can use operators and selection schemes that natural evolution cannot. When developed further using the methods of synthetic biology, there are no operators or schemes for in silico evolution that cannot be applied precisely to directed evolution. The issues raised apply only to natural evolution but not to directed evolution.
Visualising the global structure of search landscapes: genetic improvement as a case study
Veerapen N and Ochoa G
The search landscape is a common metaphor to describe the structure of computational search spaces. Different landscape metrics can be computed and used to predict search difficulty. Yet, the metaphor falls short in visualisation terms because it is hard to represent complex landscapes, both in terms of size and dimensionality. This paper combines local optima networks, as a compact representation of the global structure of a search space, and dimensionality reduction, using the t-distributed stochastic neighbour embedding algorithm, in order to both bring the metaphor to life and convey new insight into the search process. As a case study, two benchmark programs, under a genetic improvement bug-fixing scenario, are analysed and visualised using the proposed method. Local optima networks for both iterated local search and a hybrid genetic algorithm, across different neighbourhoods, are compared, highlighting the differences in how the landscape is explored.
Visualisation with treemaps and sunbursts in many-objective optimisation
Walker DJ
Visualisation is an important aspect of evolutionary computation, enabling practitioners to explore the operation of their algorithms in an intuitive way and providing a better means for displaying their results to problem owners. The presentation of the complex data arising in many-objective evolutionary algorithms remains a challenge, and this work examines the use of treemaps and sunbursts for visualising such data. We present a novel algorithm for arranging a treemap so that it explicitly displays the dominance relations that characterise many-objective populations, as well as considering approaches for creating trees with which to represent multi- and many-objective solutions. We show that treemaps and sunbursts can be used to display important aspects of evolutionary computation, such as the diversity and convergence of a search population, and demonstrate the approaches on a range of test problems and a real-world problem from the literature.
Editorial introduction
Spector L
Highlights of genetic programming 2020 events
Nicolau M
Editorial introduction
Spector L
Editorial introduction
Trujillo L, Hu T, Lourenço N and Zhang M