Genetic algorithm for feature engineering

Genetic Algorithms are inspired by the concepts of evolution through natural selection. They are often used in high dimensional spaces where grid / random search would be prohibitive.

Genetic Algorithms encode the space to explore with genes and proceed by generations. For each generation:

  • individuals forming the current population are evaluated (fitness)
  • the best individuals are chosen to mix their genes together (crossover)
  • independent random changes are performed (mutation)

This plugins deals with feature creation and selection, powered by genetic algorithms. Starting from a dataset with features and a target, it will automatically select among features both from the dataset and their combinations (product, sum and differences). In this setting, an individual is represented by a boolean array with a value for every feature (originals and combinations) indicating whether it is selected or not as an input for the model to train.

Plugin information

Version0.0.1
AuthorDataiku
Released2018-08-01
Last updated2018-08-01
LicenseLGPL v3.0
Source codeGithub
Reporting issuesGithub