API Reference

The finchGE API consists of:

`finchge.core.Population(initialiser, population_size)`

Container for Individuals in an evolutionary run.

Population is representation-agnostic and does not perform genotype, tree, or phenotype initialisation logic.

Create a Population.

Exactly one of individuals or initialiser must be provided.

Parameters:

Name	Type	Description	Default
`initialiser`	`Optional[GEInitialiser]`	Initializer used to generate individuals.	required
`population_size`	`Optional[int]`	Number of individuals to generate (required if initialiser is used).	required

`from_individuals(individuals, population_size)` `classmethod`

Construct a Population directly from a list of Individuals.

This is used when offspring are produced by crossover/mutation strategies that already return fully-formed Individuals.

`finchge.core.Individual(*, genotype=None, phenotype=None, used_genome=None, used_codon_count=0, invalid=False, tree=None)`

Represents a single individual in an evolutionary algorithm.

An Individual encapsulates the genetic representation (genotype), its expressed form (phenotype), and metadata produced during evaluation and selection. Fitness values are not assigned at construction time and must be computed by a FitnessEvaluator.

Algorithm-specific information (e.g., Pareto rank, crowding distance, or reference-point association) is stored in the meta dictionary and is expected to be populated and cleared by the evolutionary algorithm.

Attributes:

Name	Type	Description
`genotype`	`Optional[list[int]]`	Genetic representation of the individual. A list of integers or None if not yet initialized or derived from a tree-based initialiser.
`phenotype`	`str`	Phenotypic representation derived from the genotype via grammar-based mapping. Empty until mapping occurs.
`used_genome`	`Optional[list[int]]`	Portion of the genotype consumed during genotype-to-phenotype mapping. Set during evaluation.
`used_codon_count`	`Optional[int]`	Number of codons consumed during mapping.
`invalid`	`bool`	Indicates whether the individual is invalid (e.g., mapping failure). Invalid individuals should not participate in fitness-based selection.
`tree`	`Optional[str]`	Serialized derivation tree (e.g., JSON) produced during mapping or tree-based initialisation.
`fitness`	`list[float]`	Fitness values assigned during evaluation. Empty until evaluated. Supports both single-objective and multi-objective optimization.
`meta`	`Dict[str, Any]`	Algorithm-specific metadata managed by the evolutionary algorithm, such as Pareto rank, crowding distance, or dominance information.

Initialize an Individual.

Parameters:

Name	Type	Description	Default
`genotype`	`Optional[list[int]]`	Genetic representation as a list of integers, or None if the individual is initialized from a tree or will be derived later.	`None`
`phenotype`	`str`	Phenotypic representation of the individual. Defaults to an empty string and is populated during mapping.	`None`
`used_genome`	`Optional[list[int]]`	Portion of the genotype actually consumed during mapping. Populated during evaluation.	`None`
`used_codon_count`	`Optional[int]`	Number of codons consumed during genotype-to-phenotype mapping.	`0`
`invalid`	`bool`	Whether the individual is invalid (e.g., mapping failure).	`False`
`tree`	`Optional[str]`	Optional serialized derivation tree (e.g., JSON) associated with this individual.	`None`

`clone()`

Create a deep copy of this individual.

Returns:

Type	Description
`Individual`	A new Individual with copied genotype, phenotype, fitness, and metadata.

`from_genotype(genotype)` `classmethod`

Construct an Individual from a genotype.

Parameters:

Name	Type	Description	Default
`genotype`	`list[int]`	Genetic representation as a list of integers.	required

Returns: A new Individual initialized with the given genotype.

`from_tree(tree)` `classmethod`

Construct an Individual from a derivation tree.

Parameters:

Name	Type	Description	Default
`tree`	`str`	Root of the derivation tree used to initialize the individual.	required

Returns:

Type	Description
`Individual`	A new Individual initialized with the given tree representation.

`get_meta(key, expected_type=None)`

Retrieve algorithm-specific metadata stored on the individual.

Parameters:

Name	Type	Description	Default
`key`	`str`	Metadata key to retrieve.	required
`expected_type`	`Type[T] \| None`	Expected type of the metadata value if being strict.	`None`

Raises:

Type	Description
`ValueError`	If the key is missing or the value does not match the expected type.

Returns:

Type	Description
`Any`	The metadata value associated with the given key.

`mutated(mutation_strategy)`

Apply a mutation strategy to this individual.

Parameters:

Name	Type	Description	Default
`mutation_strategy`	`GEMutationStrategy`	Mutation operator used to produce a mutated individual.	required

Returns:

Type	Description
`Individual`	A new Individual representing the mutated offspring.

`sort_key(maximize)`

Return a scalar key suitable for sorting individuals by fitness. Invalid or unevaluable individuals are always pushed to the end for minimization use +inf and for maximization: use -inf

`finchge.grammar.Grammar(grammar_str, parser=None)`

Represents a grammar and provides methods for mapping genotypes to phenotypes using grammatical evolution.

This class utilizes GrammarParser to parse the grammar (typically, Backus-Naur Form (BNF))

Parameters:

Name	Type	Description	Default
`grammar_str`	`str`	The grammar definition as a BNF-formatted string.	required
`parser`	`Optional[GrammarParser]`	GrammarParser will be used if not provided	`None`

`analyze()`

Analyze grammar structure for initialisation and diagnostics. Computes recursion, min/max derivation depth, arity, and termination feasibility. Results are stored on Rule objects.

References

[1] Conor Ryan, J. J. Collins, and Michael O'Neill. 1998. Grammatical Evolution: Evolving Programs for an Arbitrary Language. In Proceedings of the First European Workshop on Genetic Programming (EuroGP '98). Springer-Verlag, Berlin, Heidelberg, 83-96.

[2] O’Neill, M. and Ryan, C. (2001,) Grammatical Evolution. IEEE Transactions on Evolutionary Computation, 5, 349-358. https://doi.org/10.1109/4235.942529

`compute_min_ramp(*, population_size, max_init_depth)`

Compute minimum ramp depth.

Ramping should start at the first depth where the grammar can generate enough distinct derivations to support population diversity across the requested ramp range.

`compute_permutations_by_depth(max_depth)`

Estimate the number of distinct derivation trees available at each exact depth.

The internal count is cumulative up to a depth budget. The stored permutation count subtracts the previous depth budget so each entry represents trees whose maximum depth is exactly that depth.

`describe(expanded=True)`

Returns summary information about the grammar, including rule counts and structure. Set expanded=False to display original contracted versions like [a-z] for range if the grammar uses that syntax Default setting is to display expanded version.

Parameters:

Name	Type	Description	Default
`expanded`	`bool`	flag whether to show expanded version True by default.	`True`

Returns:

Name	Type	Description
`str`	`str`	A formatted string containing grammar statistics and structure.

`from_file(filename, parser=None)` `classmethod`

Create a Grammar instance from a file containing BNF rules. This method reads the BNF grammar from a text file and initializes the grammar with the same parameters as the direct constructor.

Parameters:

Name	Type	Description	Default
`filename`	`str`	Path to the file containing BNF grammar rules	required
`parser`	`Optional[GrammarParser]`	GrammarParser will be used if not provided	`None`

`finchge.grammar.parser`

`GrammarParser`

Bases: ABC

Base class for all grammar parsers.

Any custom parser must implement the parse() method which returns a standardized tuple containing grammar components.

`parse()` `abstractmethod`

Parse input data and return grammar components.

Returns:

Name	Type	Description
`tuple`	`tuple[dict[str, Rule], dict[str, Rule], str, list[str], list[str]]`	containing the rules, terminals and non-terminals

Response Contains

rules_original (dict[str, Rule]): Mapping of non-terminal symbols to Rule objects (original written form without range expansion).
rules (dict[str, Rule]): Mapping of non-terminal symbols to Rule objects in Expanded (range notation) form.
start_rule (str): The first non-terminal rule, considered the start rule.
terminals (set[str]): Set of terminal symbols.
non_terminals (set[str]): Set of non-terminal symbols.

`BNFGrammarParser(grammar_str)`

Bases: GrammarParser

Parses a grammar string written in Backus-Naur Form (BNF) into structured rules.

The parser identifies terminal and non-terminal symbols, splits productions into alternatives, and validates the structure of the grammar.

Attributes:

Name	Type	Description
`grammar_str`	`str`	The raw BNF grammar string.

Args: grammar_str (str): The grammar definition in BNF format.

`parse()`

Parses the BNF grammar string into rules, terminals, and non-terminals.

Returns:

Name	Type	Description
`Tuple`	`tuple[dict[str, Rule], dict[str, Rule], str, list[str], list[str]]`	Tuple containing rules (original), rules_expanded, start_rule, terminals and non-terminals.

Response contains

rules_original (dict[str, Rule]): Mapping of non-terminal symbols to Rule objects. (Original may be contracted form)
rules (dict[str, Rule]): Mapping of non-terminal symbols to Rule objects in Expanded (range notation) form.
start_rule (str): The first non-terminal rule, considered the start rule.
terminals (set[str]): List of terminal symbols.
non_terminals (set[str]): List of non-terminal symbols.

Raises:

Type	Description
`ValueError`	If the grammar syntax is invalid or contains undefined symbols.

`remove_comments(lines)`

Remove comments and empty lines.

`consolidate_multiline_rules(lines)`

If same rule sreads across multiple lines, then just make them a single line Simple rule is unti we see a token before ::= , its the same rule

Parameters:

Name	Type	Description	Default
`lines`	`list[str]`		required

Returns:

`split_choices(rhs_symbols)`

Split RHS into production choices For now it simply finds choices based on presence of choice symbol (Can be improved later, if needed.)

`finchge.grammar.rule`

`Rule(symbol, choices)`

Represents a single production rule in a BNF grammar.

Each rule contains a non-terminal symbol (e.g., "") and a list of possible expansions (choices) for that symbol.

Parameters:

Name	Type	Description	Default
`symbol`	`str`	The non-terminal symbol on the left-hand side (LHS) of the rule.	required
`choices`	`list[list[str]]`	A list of possible right-hand side (RHS) expansions. Each choice is a list of symbols (either terminals or non-terminals).	required

`finchge.grammar.GenotypeMapper(*, grammar, max_tree_depth=None, max_recursion_depth=None, max_wraps=6, repair_strategy=None, random_state=None)`

Bases: RandomStateMixin

Bidirectional genotype-phenotype codec for Grammatical Evolution.

GenotypeMapper implements the core mapping logic of Grammatical Evolution (GE), providing:

Decoding (map): genotype >>> derivation tree >>> phenotype
Encoding (reverse_map): derivation tree >>> genotype (that reproduces the same tree)

Follows classic GE semantics

Stack-based depth-first expansion
Modulo-based production selection
Leftmost-first expansion (children pushed reversed onto a LIFO stack)
Codon wrapping with a configurable limit
Explicit recursion depth control

This class is grammar-aware but grammar-agnostic: all syntactic structure comes from the provided Grammar instance.

Attributes:

Name	Type	Description
`grammar`	`Grammar`	Grammar defining the genotype-to-phenotype mapping rules.
`rules`	`dict[str, Rule]`	Grammar production rules indexed by non-terminal symbols.
`non_terminals`	`list[str]`	Set of grammar non-terminal symbols.
`start_rule`	`str`	Start symbol used for derivation.
`max_recursion_depth`	`int`	Maximum allowed recursion depth during decoding.
`max_wraps`	`int`	Maximum number of genotype wraps allowed during decoding.
`repair_strategy`	`Optional[RepairStrategy]`	Optional phenotype repair strategy applied after decoding.

Initialize a GenotypeMapper with grammar and mapping constraints.

Parameters:

Name	Type	Description	Default
`grammar`	`Grammar`	Grammar defining production rules and non-terminals used for mapping.	required
`max_tree_depth`	`Optional[int]`	Max depth limit for resulting trees.	`None`
`max_recursion_depth`	`int`	Maximum depth allowed during derivation tree expansion. Mapping is marked invalid if exceeded. Defaults to 20.	`None`
`max_wraps`	`int`	Maximum number of times the genotype may be wrapped when codons are exhausted. Mapping is marked invalid if exceeded. Defaults to 6.	`6`
`repair_strategy`	`Optional[RepairStrategy]`	Optional strategy to repair or post-process the phenotype after successful decoding. Defaults to None.	`None`
`random_state`	`int`	random state	`None`

`map(genotype)`

Decode a genotype into a derivation tree and phenotype using standard Grammatical Evolution (GE) mapping. This implementation performs stack-based depth-first expansion using leftmost derivation semantics.

Follows leftmost derivation semantics

Children are attached to the tree in left-to-right order
Children are pushed onto the stack in reverse order ensuring leftmost child expands first

Codons are consumed sequentially and mapped to grammar productions using modulo selection.

Wrapping is supported when the genotype is exhausted, up to max_wraps times.

Mapping is aborted and marked invalid if

wrapping limit exceeds
recursion depth exceeds max_recursion_depth
the final tree contains unresolved non-terminals

Parameters:

Name	Type	Description	Default
`genotype`	`list[int]`	List of integer codons representing the genotype.	required

Returns:

Name Type Description

MappingResult

Object containing:

Text Only

    - phenotype (str): Generated phenotype string.
    - used_genome (list[int]): Codons actually consumed.
    - used_codon_count (int): Number of codons consumed.
    - invalid (bool): Whether mapping failed.
    - tree (TreeNode): Root of the derivation tree.
    - tree_json (str): Serialized tree representation.

`reverse_map(tree, *, codon_size=127, pad_to_length=None, pad_mode='random')`

Encode a derivation tree into a genotype that reproduces the same tree.

This method traverses the tree in the same order that map() consumes codons: leftmost depth-first expansion of non-terminals.

Parameters:

Name	Type	Description	Default
`tree`	`Union[TreeNode, str]`	TreeNode root or JSON serialized tree.	required
`codon_size`	`int`	Maximum codon value (inclusive).	`127`
`pad_to_length`	`Optional[int]`	Optional genome length padding.	`None`
`pad_mode`	`str`	"zeros" or "random".	`'random'`

Returns:

Type	Description
`list[int]`	list[int]: Genotype reproducing this tree.

`finchge.grammar.mapper.MappingResult(phenotype, used_genome, used_codon_count, invalid, tree, tree_str)` `dataclass`

Result of genotype to phenotype mapping.

`finchge.grammar.derivation_tree.TreeNode(symbol)`

Represents a derivation tree (root node) and its children. The Derivation Tree is used during genotype-to-phenotype mapping.

As TreeNode is used to represent both node and full derivation tree, the properties such as depth, max_depth should be used with caution. The propertydepth is the depth of current node, while max_depth should be used to get the depth of the tree.

Tree depth is derived dynamically from parent links. The property depth gives the depth of the current node, while max_depth gives the maximum depth of the full tree.

Each node holds a symbol (terminal or non-terminal) and may have children. The tree supports JSON and string (CSV-like) serialization. Args: symbol (str): The symbol associated with this node.

`add_child(child)`

Adds a child to the current node and updates depth, parent, and root.

Parameters:

Name	Type	Description	Default
`child`	`TreeNode`	Node to add as a child.	required

Raises:

Type	Description
`TypeError`	If `child` is not a TreeNode instance.

`clone()`

Deep copy of this node and its entire subtree. Parent pointers are reconstructed correctly.

`from_dict(data)` `classmethod`

Constructs a TreeNode from a dictionary.

Parameters:

Name	Type	Description	Default
`data`	`dict`	Dictionary with keys 'symbol' and 'children'.	required

Returns:

Name	Type	Description
`TreeNode`	`TreeNode`	Reconstructed tree.

Raises:

Type	Description
`KeyError`	If required fields are missing.

`from_json(json_str)` `classmethod`

Constructs a TreeNode from a JSON string.

Parameters:

Name	Type	Description	Default
`json_str`	`str`	JSON representation of the tree.	required

Returns:

Name	Type	Description
`TreeNode`	`TreeNode`	Reconstructed tree.

Raises:

Type	Description
`JSONDecodeError`	If JSON string is invalid.

`from_string(s)` `staticmethod`

Deserialize a tree from its canonical string representation.

Parameters:

Name	Type	Description	Default
`s`	`str`	Serialized tree string.	required

Returns:

Name	Type	Description
`TreeNode`	`TreeNode`	Root node of reconstructed tree.

`replace_subtree_with(new_subtree)`

Replace this subtree with new_subtree.

Parameters:

Name	Type	Description	Default
`new_subtree`	`TreeNode`	Root of the subtree that will replace `self`.	required

Returns:

Type	Description
`TreeNode`	The root of the resulting tree.

`size()`

convert the tree into a phenotype

`swap_subtree_with(other)`

Swap this subtree with another subtree.

Returns the new roots of both trees.

`to_dict()`

Converts the node and its subtree into a dictionary.

Returns:

Name	Type	Description
`dict`	`dict[str, Any]`	Dictionary representation of the tree.

`to_json(indent=2)`

Serializes the tree into a JSON-formatted string.

Parameters:

Name	Type	Description	Default
`indent`	`int`	Indentation for the JSON string.	`2`

Returns:

Name	Type	Description
`str`	`str`	JSON string representation of the tree.

`to_phenotype()`

convert the tree into a phenotype

`to_string()`

Reversible tree serialization using length-prefixed symbols.

Format

Length followd by the symbol and the children in braces. For example.

Text Only

<len>:<symbol>{child1,child2,...}

Leaf : 1:x, internal node: 3:add{1:x,1:y}

Returns:

Name	Type	Description
`str`	`str`	Canonical serialized tree string.

`finchge.algorithm`

`BaseAlgorithm(random_state=None)`

Bases: RandomStateMixin, ABC

Abstract base class for Search Algorithm

`BaseAlgorithmSO(random_state=None)`

Bases: BaseAlgorithm

Base class for single-objective algorithms. Provides default sort_population and get_best_individual using self.max_best, which subclasses must set in init.

`BaseAlgorithmMO(random_state=None)`

Bases: BaseAlgorithm

Base class for multi-objective algorithms. Enforces get_pareto_front. All multi-objective algorithms inherit BaseAlgorithmMO and must implement get_pareto_front

`GeneticAlgorithm(selection, crossover, mutation, replacement, elite_size, fitness_evaluator, random_state=None)`

Bases: BaseAlgorithmSO

Genetic Algorithm

Parameters:

Name	Type	Description	Default
`selection`	`GESelectionStrategy`	Selection strategy instance or function.	required
`crossover`	`GECrossoverStrategy`	Crossover strategy instance or function.	required
`mutation`	`GEMutationStrategy`	Mutation strategy instance or function.	required
`replacement`	`GEReplacementStrategy`	Replacement strategy instance or function.	required
`elite_size`	`int`	Number of elite individuals to carry over.	required
`random_state`	`Optional[int]`	random state	`None`

`evolve_one_generation(population)`

Perform one generation of evolution on the given population.

Parameters:

Name	Type	Description	Default
`population`	`Population`	The population to evolve.	required

Returns:

Name	Type	Description
`Population`	`Population`	The evolved population.

`SteadyStateGA(selection, crossover, mutation, fitness_evaluator, random_state=None)`

Bases: BaseAlgorithmSO

Steady-State Genetic Algorithm for single-objective GE.

Implements the Genitor model (Whitley, 1989): each iteration selects two parents, produces two offspring via crossover and mutation, evaluates them, then replaces the two worst individuals in the current population. This repeats until population_size offspring have been processed.

Unlike generational GAs, offspring enter the live population immediately. Newly inserted individuals are eligible for both parent selection and replacement in subsequent iterations of the same generation. No explicit elitism is needed: the best individuals survive naturally because only the worst are displaced each iteration.

Parameters:

Name	Type	Description	Default
`selection`	`GESelectionStrategy`	Parent selection strategy.	required
`crossover`	`GECrossoverStrategy`	Crossover operator producing two offspring per call.	required
`mutation`	`GEMutationStrategy`	Mutation operator applied to each offspring.	required
`fitness_evaluator`	`FitnessEvaluator`	Evaluator for phenotype mapping and fitness scoring.	required
`random_state`	`Optional[int]`	Seed for reproducibility.	`None`

`IslandGA(num_islands, migration_interval, migration_size, selection, crossover, mutation, replacement, elite_size, fitness_evaluator, random_state=None)`

Bases: BaseAlgorithmSO

Island Model Genetic Algorithm for single-objective GE.

The population is partitioned into num_islands independent sub-populations (islands), each evolving in isolation using standard GA operators. Every migration_interval generations, the best migration_size individuals from each island migrate to the next island in a ring topology, replacing that island's worst individuals.

Island isolation promotes diversity by allowing different islands to explore different regions of the search space. Migration periodically shares good solutions to prevent any island from stagnating in a local optimum.

Requires population_size to be divisible by num_islands.

Parameters:

Name	Type	Description	Default
`num_islands`	`int`	Number of independent sub-populations. Must be >= 2.	required
`migration_interval`	`int`	Generations between migration events.	required
`migration_size`	`int`	Number of individuals exchanged per migration event. Must be < population_size // num_islands.	required
`selection`	`GESelectionStrategy`	Selection strategy for within-island evolution.	required
`crossover`	`GECrossoverStrategy`	Crossover strategy for within-island evolution.	required
`mutation`	`GEMutationStrategy`	Mutation strategy for within-island evolution.	required
`replacement`	`GEReplacementStrategy`	Replacement strategy for within-island evolution.	required
`elite_size`	`int`	Number of elites preserved per island per generation.	required
`fitness_evaluator`	`FitnessEvaluator`	Evaluator for phenotype mapping and fitness.	required
`random_state`	`Optional[int]`	Seed for reproducibility.	`None`

`NSGA2(selection, crossover, mutation, replacement, elite_size, fitness_evaluator, random_state=None)`

Bases: BaseAlgorithmMO

NSGA-II algorithm.

Parameters:

Name	Type	Description	Default
`selection`	`GESelectionStrategy`	Selection strategy.	required
`crossover`	`GECrossoverStrategy`	Crossover strategy.	required
`mutation`	`GEMutationStrategy`	Mutation strategy.	required
`replacement`	`GEReplacementStrategy`	Replacement strategy.	required
`elite_size`	`int`	Number of elite individuals to carry over.	required
`fitness_evaluator`	`FitnessEvaluator`	Evaluator used to score individuals.	required
`random_state`	`Optional[int]`	Seed for reproducible randomness.	`None`

`evolve_one_generation(population)`

Perform one generation of evolution on the given population.

Parameters:

Name	Type	Description	Default
`population`	`Population`	The population to evolve.	required

Returns:

Name	Type	Description
`Population`	`Population`	The evolved population.

`get_pareto_front(population)`

Return the Pareto front (rank 0 individuals).

`sort_population(population)`

Sort the population using NSGA-II rank and crowding distance.

Parameters:

Name	Type	Description	Default
`population`	`Population`	The population to sort.	required

`NSGA3(selection, crossover, mutation, fitness_evaluator, num_divisions=12, random_state=None)`

Bases: BaseAlgorithmMO

NSGA-III algorithm.

Parameters:

Name	Type	Description	Default
`selection`	`GESelectionStrategy`	Selection strategy.	required
`crossover`	`GECrossoverStrategy`	Crossover strategy.	required
`mutation`	`GEMutationStrategy`	Mutation strategy.	required
`fitness_evaluator`	`FitnessEvaluator`	Evaluator used to score individuals.	required
`num_divisions`	`int`	Number of divisions for reference points.	`12`
`random_state`	`Optional[int]`	Seed for reproducible randomness.	`None`

`evolve_one_generation(population)`

Perform one generation of evolution on the given population.

Parameters:

Name	Type	Description	Default
`population`	`Population`	The population to evolve.	required

Returns:

Name	Type	Description
`Population`	`Population`	The evolved population.

`get_pareto_front(population)`

Return the Pareto front.

Parameters:

Name	Type	Description	Default
`population`	`Population`	Population to extract the Pareto front from.	required

Returns:

Type	Description
`list[Individual]`	Individuals in the first non-dominated front.

`finchge.initialisation`

`RandomGenomeInitialiser(genome_length, codon_size, random_state=None)`

Bases: GEInitialiser

Random genome initialiser for Grammatical Evolution.

Generates Individuals with a fixed-length integer genome where each codon is sampled uniformly from [0, codon_size].

This initialiser requires genome_length and codon_size to be provided explicitly via configuration or constructor.

`from_config(config)` `classmethod`

Create a RandomGenomeInitialiser using FinchConfig object

Parameters:

Name	Type	Description	Default
`config`	`FinchConfig`	FinchConfig instance	required

Returns:

Type	Description
`RandomGenomeInitialiser`	RandomGenomeInitialiser

Raises:

Type	Description
`ConfigError`	If required configuration keys are missing or invalid.

`RVDInitialiser(genome_length, codon_size, population_size, random_state=None, mapper=None)`

Bases: GEInitialiser

Random Valid Distinct (RVD) initialiser.

Generates individuals by sampling random genomes while rejecting invalid mappings and duplicate phenotypes. - RVD initialisation [Nicolau, 2017].

Note

RVD initializer instance is supposed to be used once per run or the stored RVD state may cause issues. To reset the stored state call reset() , if same RVDInitializer instance has to be reused.

`from_config(config)` `classmethod`

Create a RVDInitializer from the [ge] config section.

Parameters:

Name	Type	Description	Default
`config`	`FinchConfig`	FinchConfig instance	required

Returns:

Type	Description
`RVDInitialiser`	RandomGenomeInitialiser

Raises:

Type	Description
`ConfigError`	If required configuration keys are missing or invalid.

`reset()`

If same RVD instance is used for multiple runs. must reset

`FullTreeInitialiser(init_min_depth, init_max_depth, strict_full=True, random_state=None)`

Bases: GETreeInitialiser

Full tree initialiser as defined by Koza (1992).

Produces trees where all internal nodes expand until max depth. Terminals are only allowed at the maximum depth.

`GrowTreeInitialiser(init_min_depth, init_max_depth, random_state=None)`

Bases: GETreeInitialiser

Grow tree initialiser as defined by Koza (1992).

Allows productions to terminate early below max depth, resulting in irregular tree shapes.

`PIGrowInitialiser(init_max_depth, population_size, random_state=None)`

Bases: GETreeInitialiser

Position-Independent Grow (PI-Grow) tree initialiser.

Generates derivation trees using grow-style initialisation with position-independent expansion.

`from_config(config)` `classmethod`

Create a PI-Grow initialiser from GE configuration.

Parameters:

Name	Type	Description	Default
`config`	`FinchConfig`	FinchConfig instance	required

Returns:

Type	Description
`PIGrowInitialiser`	PIGrowInitializer

Raises:

Type	Description
`ConfigError`	If required configuration keys are missing or invalid.

`RHHInitialiser(init_max_depth, population_size, strict_full=True, random_state=None)`

Bases: GETreeInitialiser

Ramped Half-and-Half (RHH) Initialisation for Grammatical Evolution.

This implementation of RHH Initialisation is the Grammatical Evolution is based on "Sensible Initialization" approach as an analogue of Koza’s Ramped Half-and-Half (RHH) method originally developed for tree-based Genetic Programming. It generates an initial population of derivation trees using a combination of Full and Grow construction strategies across a range of depth limits, promoting structural diversity while respecting grammar constraints.

In this approach, grammar production rules assume the role of GP functions in controlling tree growth. Recursive production rules are preferentially selected during Full initialisation to ensure tree expansion until the specified depth limit is reached. During Grow initialisation, both recursive and terminating productions may be selected, allowing trees of irregular shape to be generated.

Production feasibility is guided by grammar analysis, including recursion detection and minimum derivation depth calculation, ensuring that generated trees can terminate within the allowed depth budget.

References

Koza, J. R. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press.

Ryan, C., Collins, J. J., & O’Neill, M. (1998). Grammatical Evolution: Evolving Programs for an Arbitrary Language. In Proceedings of the First European Workshop on Genetic Programming.

Ryan, C., & Azad, R. M. A. (2003). Sensible Initialisation in Grammatical Evolution. In Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language. Springer.

Fenton, M., McDermott, J., Fagan, D., Forstenlechner, S., Hemberg, E., & O’Neill, M. (2017). PonyGE2: Grammatical Evolution in Python. arXiv:1703.08535.

`from_config(config)` `classmethod`

Create a RampedHalfAndHalfInitializer from configuration.

Parameters:

Name	Type	Description	Default
`config`	`FinchConfig`	FinchConfig instance	required

Returns:

Type	Description
`RHHInitialiser`	RampedHalfAndHalfInitializer

Raises:

Type	Description
`ConfigError`	If required configuration keys are missing or invalid.

`PTC2Initialiser(target_size, random_state=None, max_depth=None)`

Bases: GETreeInitialiser

`RampedPTC2Initialiser(init_min_size, init_max_size, population_size, max_depth=None, random_state=None)`

Bases: GETreeInitialiser

Ramped size-based population initialiser using PTC2 family algorithms.

This initialiser generates individuals using Probabilistic Tree Creation 2 across a ramped distribution of target tree sizes.

Behaviour

• If max_depth is None: Uses PTC2 (size-controlled only).

• If max_depth is provided: Uses PTC2D (size + depth constrained).

`finchge.operators.selection`

`GESelectionStrategy(max_best, random_state=None)`

Bases: RandomStateMixin, ABC

`TournamentSelection(max_best, tournament_size=3, random_state=None)`

Bases: GESelectionStrategy

Selection strategy that chooses individuals using tournament selection.

In tournament selection, a fixed number of individuals (tournament_size) are randomly chosen from the population, and the best among them (based on fitness) is selected. This process repeats until the desired number of individuals is selected. Selection pressure increases with larger tournament sizes.

Parameters:

Name	Type	Description	Default
`max_best`	`bool`	Whether higher fitness values are better. True for maximization, False for minimization.	required
`tournament_size`	`int`	The number of individuals competing in each tournament. Must be >= 2. Defaults to 3.	`3`
`random_state`	`Optional[int]`	Random seed.	`None`

`select(population_size, individuals)`

Selects a subset of individuals using tournament selection.

Repeatedly selects a group of individuals (a "tournament") at random from the population, and chooses the best among them based on fitness. This process continues until the desired number of individuals is selected. Whether the best individual is determined by maximizing or minimizing fitness depends on the max_best setting.

Parameters:

Name	Type	Description	Default
`population_size`	`int`	The number of individuals to select for the next generation.	required
`individuals`	`list[Individual]`	A list of individuals to select from. Each must have a `fitness` attribute, which is expected to be a list containing a single float value (e.g., [fitness]).	required

Returns:

Type	Description
`list[Individual]`	A list of selected individuals based on tournament outcomes.

`RouletteWheelSelection(max_best, random_state=None)`

Bases: GESelectionStrategy

Fitness-proportionate selection (roulette wheel selection).

Roulette wheel selection is one of the earliest and most widely used selection mechanisms in evolutionary computation. The method assigns selection probabilities to individuals proportional to their fitness, enabling stochastic sampling while maintaining selection pressure toward fitter individuals.

Each individual occupies a proportion of a conceptual roulette wheel, where the area assigned to each individual is proportional to its fitness. Selection is performed by sampling from this distribution, ensuring that individuals with higher fitness are more likely—but not guaranteed—to be selected.

Fitness-proportionate selection originates from early genetic algorithm research and is commonly attributed to the foundational work on genetic algorithms by John Holland.

Parameters:

Name	Type	Description	Default
`max_best`	`bool`	Whether higher fitness values are better. True for maximization, False for minimization.	required
`random_state`	`Optional[int]`	Random seed.	`None`

`select(population_size, individuals)`

Selects a subset of individuals using roulette wheel selection.

This method performs fitness-proportionate selection, where individuals are assigned selection probabilities based on their fitness values. The fitness values are shifted to ensure they are non-negative, and if necessary, inverted for minimization problems. If the total weight of the shifted fitness values is zero or negative, the method falls back to uniform random selection.

Parameters:

Name	Type	Description	Default
`population_size`	`int`	The number of individuals to select for the next generation.	required
`individuals`	`list[Individual]`	A list of individuals to select from. Each individual must have a `fitness` attribute, which is expected to be a list containing a single float value (e.g., [fitness]).	required

Returns:

Type	Description
`list[Individual]`	A list of selected individuals based on roulette wheel selection probabilities.

`RankSelection(max_best, selection_pressure=1.5, random_state=None)`

Bases: GESelectionStrategy

Selection strategy that selects individuals based on their rank in the population.

In rank selection, individuals are first sorted by fitness, and then assigned selection probabilities based on their rank rather than their raw fitness value. The selection pressure can be adjusted using the selection_pressure parameter, which influences how strongly the best individuals are favored. A higher selection pressure makes the selection more biased toward higher-ranked individuals.

Parameters:

Name	Type	Description	Default
`max_best`	`bool`	Whether higher fitness values are better. True for maximization, False for minimization.	required
`selection_pressure`	`float`	The selection pressure that controls the bias toward higher-ranked individuals. Must be between 1.0 and 2.0. Defaults to 1.5.	`1.5`
`random_state`	`Optional[int]`	Random seed.	`None`

Raises:

Type	Description
`ValueError`	If selection_pressure is not between 1.0 and 2.0.

`select(population_size, individuals)`

Selects a subset of individuals using linear rank-based selection.

This method applies rank selection by sorting individuals based on fitness and assigning selection probabilities based on their rank rather than their raw fitness values. The selection pressure parameter controls how much more likely the best-ranked individuals are to be selected compared to lower-ranked ones.

Parameters:

Name	Type	Description	Default
`population_size`	`int`	The number of individuals to select.	required
`individuals`	`list[Individual]`	A list of individuals to select from. Each individual must have a `fitness` attribute, which is a list containing a single float value representing its fitness.	required

Returns:

Type	Description
`list[Individual]`	A list of selected individuals from the population based on rank-weighted probabilities.

`TruncationSelection(max_best, truncation_threshold=0.5, random_state=None)`

Bases: GESelectionStrategy

Truncation selection strategy.

Truncation selection is a deterministic selection mechanism in which only the top-performing portion of a population is allowed to reproduce. The selected elite subset is determined by sorting individuals according to fitness and selecting a predefined fraction of the best individuals.

The population is ranked by fitness and only the top k fraction of individuals is retained as eligible parents. Selection from this elite pool may then occur randomly or deterministically.

Parameters:

Name	Type	Description	Default
`max_best`	`bool`	Whether higher fitness values are better. True for maximization, False for minimization.	required
`truncation_threshold`	`float`	Threshold value.	`0.5`
`random_state`	`Optional[int]`	Random seed.	`None`

`select(population_size, individuals)`

Selects a subset of individuals using truncation selection.

First, the population is sorted by fitness (descending for maximization, ascending for minimization). Then, the top fraction specified by truncation_threshold is selected as the elite pool. Finally, individuals are randomly chosen from this elite pool to reach the desired population size.

Parameters:

Name	Type	Description	Default
`population_size`	`int`	The number of individuals to select for the next generation.	required
`individuals`	`list[Individual]`	A list of individuals to select from. Each individual must have a `fitness` attribute that is comparable.	required

Returns:

Type	Description
`list[Individual]`	A list of selected individuals randomly chosen from the elite portion of
`list[Individual]`	the population.

`LexicaseSelection(case_key='errors', case_max_best=False, random_state=None)`

Bases: GESelectionStrategy

Implements Lexicase Selection for parent selection in Genetic Programming.

This strategy filters a population by evaluating individuals against one case at a time in a randomized order. This is particularly effective for problems where different individuals might excel at different parts of the fitness landscape, as it preserves diversity better than standard tournament selection.

Attributes:

Name	Type	Description
`case_key`	`str`	The key used to retrieve the per-case fitness vector from an individual's metadata.
`required_case_keys`	`tuple`	A tuple containing the primary case key needed for evaluation.

Initializes the selection strategy.

Parameters:

Name	Type	Description	Default
`case_key`	`str`	Metadata key where the casewise error/fitness list is stored.	`'errors'`
`case_max_best`	`bool`	Set to True if higher values in the case vector are better (e.g., accuracy). Defaults to False (e.g., error/loss).	`False`
`random_state`	`Optional[int]`	Seed for the random number generator.	`None`

`select(population_size, individuals)`

Selects parents using Lexicase selection.

`EpsilonLexicaseSelection(case_key='errors', epsilon=0.0, case_max_best=False, random_state=None)`

Bases: LexicaseSelection

A variant of Lexicase Selection that uses a margin of error (epsilon).

Instead of requiring individuals to exactly match the 'best' performance on a given case, this strategy allows individuals to survive if they are within a specified distance (epsilon) of the best performer. This helps prevent premature convergence caused by tiny differences in floating-point fitness values.

Initializes the Epsilon Lexicase strategy.

Parameters:

Name	Type	Description	Default
`case_key`	`str`	Metadata key for the casewise fitness vector.	`'errors'`
`epsilon`	`float`	The buffer allowed when comparing fitness values.	`0.0`
`case_max_best`	`bool`	True if higher values are better.	`False`
`random_state`	`Optional[int]`	Seed for the random number generator.	`None`

`NSGA2TournamentSelection(max_best=False, tournament_size=2, exploration_prob=0.1, random_state=None)`

Bases: GESelectionStrategy

Tournament selection using NSGA-II crowded-comparison operator.

This selection method implements the parent selection strategy introduced in the NSGA-II multi-objective evolutionary algorithm. Selection is based on Pareto rank and crowding distance to balance convergence toward the Pareto front with population diversity.

Individuals are compared using a crowded comparison operator:

Prefer individuals with lower Pareto rank.
If ranks are equal, prefer individuals with higher crowding distance.

Tournament selection is applied using this comparison operator.

`crowded_comparison_operator(tournament)`

Selects the best individual using NSGA-II's crowded comparison operator.

The operator selects individuals based on lower Pareto rank (better) If same rank, higher crowding distance (better)

Parameters:

Name	Type	Description	Default
`tournament`	`list[Individual]`	The individuals competing in the tournament.	required

Returns:

Type	Description
`Individual`	The selected winner from the tournament.

`select(population_size, individuals)`

Selects individuals using NSGA-II tournament selection.

For each tournament: 1. With probability exploration_prob, a random individual is chosen 2. Otherwise, the crowded comparison operator selects the best individual based on Pareto rank and crowding distance

Parameters:

Name	Type	Description	Default
`population_size`	`int`	Number of individuals to select.	required
`individuals`	`list[Individual]`	List of individuals to select from. Each individual must have 'rank' and 'crowding_distance' metadata.	required

Returns:

Type	Description
`list[Individual]`	List of selected individuals.

`NSGA3TournamentSelection(tournament_size=2, exploration_prob=0.1, random_state=None)`

Bases: GESelectionStrategy

Tournament selection using NSGA-III rank-based selection.

This selection strategy implements the rank-based selection component of NSGA-III, a many-objective evolutionary algorithm designed for problems involving four or more objective functions.

Selection is performed using Pareto rank only. When multiple individuals share the same rank, selection among them is performed randomly to preserve diversity. NSGA-III relies on reference point-based niching rather than crowding distance.

`rank_only_operator(tournament)`

Selects an individual based on Pareto rank only. First finds the individuals with the best (lowest) rank, then randomly selects one from among them to maintain diversity.

Parameters:

Name	Type	Description	Default
`tournament`	`list[Individual]`	The individuals competing in the tournament.	required

Returns:

Type	Description
`Individual`	A randomly selected individual from those with the best rank.

`select(population_size, individuals)`

Selects individuals using NSGA-III tournament selection.

For each tournament a random individual is chosen with probability exploration_prob, otherwise, the rank-only operator selects from the best-ranked individuals

Parameters:

Name	Type	Description	Default
`population_size`	`int`	Number of individuals to select.	required
`individuals`	`list[Individual]`	List of individuals to select from. Each individual must have 'rank' metadata.	required

Returns:

Type	Description
`list[Individual]`	List of selected individuals.

`finchge.operators.crossover`

`OnePointCrossover(codon_size, crossover_proba, within_used=True, mode='fixed', random_state=None)`

Bases: GECrossoverStrategy

One-point crossover operator.

A single crossover point is selected in each parent genome, and the genome segments after this point are exchanged to form offspring. Parent genomes are split at a randomly selected point. Offspring are constructed by concatenating prefix segments from one parent with suffix segments from the other. Supports both fixed and variable one-point crossover variants with mode parameter.

variable: Select one crossover point independently in each parent. This can change offspring genome lengths.
fixed: Select the same crossover point in both parents. This preserves offspring genome lengths.

Initialize the one-point crossover strategy.

This operator supports both fixed and variable one-point crossover:

"fixed": Uses the same crossover point in both parents. Offspring genome lengths are preserved. Equivalent to PonyGE-style fixed one-point crossover.
"variable": Uses independent crossover points in each parent. Offspring genome lengths may differ from parents. Equivalent to PonyGE-style variable one-point crossover.

Parameters:

Name	Type	Description	Default
`codon_size`	`int`	Maximum codon value (inclusive).	required
`crossover_proba`	`float`	Probability that crossover is applied; otherwise, offspring are copies of the parents.	required
`within_used`	`bool`	If True, crossover points are restricted to the used portion of each parent's genome (i.e., within `used_codon_count`). If False, the full genome length is considered.	`True`
`mode`	`Literal['variable', 'fixed']`	Crossover mode. Either "fixed" or "variable".	`'fixed'`
`random_state`	`Optional[int]`	Optional random seed for reproducibility.	`None`

`cross(parent1, parent2)`

Perform one-point crossover

Parameters:

Name	Type	Description	Default
`parent1`	`Individual`	First parent.	required
`parent2`	`Individual`	Second parent.	required

Returns:

Type	Description
`Tuple[Individual, Individual]`	Two offspring individuals.

`SubtreeCrossover(crossover_proba, non_terminals, random_state=None)`

Bases: GECrossoverStrategy

Subtree crossover exchanges randomly selected subtrees between two parent program trees while preserving syntactic correctness with respect to a grammar.

Two compatible non-terminal nodes are selected from parent trees, and the subtrees rooted at these nodes are swapped.

Initialize the subtree crossove.

Parameters:

Name	Type	Description	Default
`crossover_proba`	`float`	Probability of crossover occurring.	required
`non_terminals`	`list[str]`	List of grammar non-terminal symbols that are valid crossover points.	required

`cross(p_0, p_1)`

Perform subtree crossover between two parent individuals.

If crossover does not occur, cloned trees are returned as new individuals.

Parameters:

Name	Type	Description	Default
`p_0`	`Individual`	First parent individual.	required
`p_1`	`Individual`	Second parent individual.	required

Returns:

Type	Description
`Tuple[Individual, Individual]`	A tuple containing two newly constructed offspring individuals.

`TwoPointCrossover(codon_size, crossover_proba, within_used=True, mode='fixed', random_state=None)`

Bases: GECrossoverStrategy

Two-point crossover generalizes one-point crossover by selecting two crossover points in each parent genome and exchanging the intermediate segments.

Offspring are formed by combining prefix and suffix regions from one parent with the middle region from the other parent.

Initialize the two-point crossover strategy.

This operator supports both fixed and variable two-point crossover:

"fixed": Uses the same two crossover points in both parents. A segment between the two points is exchanged, preserving offspring genome lengths. Equivalent to PonyGE-style fixed two-point crossover.
"variable": Uses independently selected crossover points in each parent. Exchanged segments may differ in length, allowing offspring genomes to grow or shrink. Equivalent to PonyGE-style variable two-point crossover.

Parameters:

Name	Type	Description	Default
`codon_size`	`int`	Maximum codon value (inclusive).	required
`crossover_proba`	`float`	Probability that crossover is applied; otherwise, offspring are copies of the parents.	required
`within_used`	`bool`	If True, crossover points are restricted to the used portion of each parent's genome (i.e., within `used_codon_count`). If False, the full genome length is considered.	`True`
`mode`	`Literal['fixed', 'variable']`	Crossover mode. Either "fixed" or "variable".	`'fixed'`
`random_state`	`Optional[int]`	Optional random seed for reproducibility.	`None`

`cross(parent1, parent2)`

Perform two-point crossover.

Parameters:

Name	Type	Description	Default
`parent1`	`Individual`	First parent.	required
`parent2`	`Individual`	Second parent.	required

Returns:

Type	Description
`Tuple[Individual, Individual]`	Two offspring individuals.

`UniformCrossover(crossover_proba, within_used=True, random_state=None)`

Bases: GECrossoverStrategy

Uniform crossover operator.

Uniform crossover recombines parent genomes by independently swapping genes between parents at each position with fixed probability.

Instead of using fixed crossover points, uniform crossover applies a binary mixing mask across the genome, providing maximal mixing of parental genetic material.

Initialize uniform crossover

Parameters:

Name	Type	Description	Default
`crossover_proba`	`float`	Probability of crossover occurring.	required
`within_used`	`bool`	If True, crossover is limited to the used portion of each parent's genome.	`True`

`cross(parent1, parent2)`

Perform uniform crossover.

Parameters:

Name	Type	Description	Default
`parent1`	`Individual`	First parent.	required
`parent2`	`Individual`	Second parent.	required

Returns:

Type	Description
`Tuple[Individual, Individual]`	Two offspring individuals.

`finchge.operators.mutation`

`CyclicMutation(mutation_probability, segment_size=3, random_state=None)`

Bases: GEMutationStrategy

Cyclic mutation strategy.

Rotates fixed-size contiguous segments of the genome with a per-gene mutation probability.

Initialize the cyclic mutation strategy.

Parameters:

Name	Type	Description	Default
`mutation_probability`	`float`	Probability in [0.0, 1.0] that a given position triggers a cyclic rotation.	required
`segment_size`	`int`	Size of the segment to rotate.	`3`

Raises:

Type	Description
`ValueError`	If mutation_probability is not in [0.0, 1.0].
`ValueError`	If segment_size is less than 2.

`mutate(individual)`

Apply cyclic mutation to an individual.

Each genome position is independently selected with mutation_probability. When selected, the contiguous segment starting at that position is rotated by one step.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	Individual to mutate.	required

Returns:

Type	Description
`Individual`	A new Individual with a mutated genotype.

`DuplicationMutation(mutation_probability, segment_size=2, random_state=None)`

Bases: GEMutationStrategy

Duplication mutation strategy.

With a given probability, duplicates a contiguous segment of fixed size and copies it into another non-overlapping location in the genome.

Initialize the duplication mutation strategy.

Parameters:

Name	Type	Description	Default
`mutation_probability`	`float`	Probability in [0.0, 1.0] of applying the duplication mutation.	required
`segment_size`	`int`	Length of the segment to duplicate, must be > 0.	`2`

Raises:

Type	Description
`ValueError`	If mutation_probability is not in [0.0, 1.0].
`ValueError`	If segment_size is not a positive integer.

`mutate(individual)`

Apply duplication mutation to an individual.

A segment of length segment_size is copied from one position and overwrites another non-overlapping segment with probability mutation_probability.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	Individual to mutate.	required

Returns:

Type	Description
`Individual`	A new Individual with a mutated genotype.

`GaussianMutation(mutation_probability, codon_size, std_dev=1.0, random_state=None)`

Bases: GEMutationStrategy

Gaussian noise mutation strategy.

Adds Gaussian noise to selected genes and clamps results to non-negative integers.

Initialize the Gaussian mutation strategy.

Parameters:

Name	Type	Description	Default
`mutation_probability`	`float`	Probability in [0.0, 1.0] that a given gene is mutated.	required
`codon_size`	`int`	codon size (needed to clamp between 0 and codon_size to avoid producing codons above codon_size)	required
`std_dev`	`float`	Standard deviation of the Gaussian noise.	`1.0`

Raises:

Type	Description
`ValueError`	If mutation_probability is not in [0.0, 1.0].

`mutate(individual)`

Apply Gaussian mutation to an individual.

Each gene is independently selected with mutation_probability. Selected genes receive Gaussian noise, are rounded to integers, and clamped to be non-negative and within codon_size.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	Individual to mutate.	required

Returns:

Type	Description
`Individual`	A new Individual with a mutated genotype.

`IntFlipMutation(mutation_probability, codon_size, mode='per_codon', mutation_events=1, within_used=True, random_state=None)`

Bases: GEMutationStrategy

Integer flip mutation strategy.

Parameters:

Name	Type	Description	Default
`mutation_probability`	`float`	Probability of mutation per gene.	required
`codon_size`	`int`	Maximum codon value (inclusive).	required

`mutate(individual)`

Perform integer flip mutation on an Individual.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	Individual to apply mutation on.	required

Returns:

Name	Type	Description
`Individual`	`Individual`	Mutated Individual.

`InversionMutation(segment_probability, random_state=None)`

Bases: GEMutationStrategy

Inversion mutation strategy.

With a given probability, selects a random contiguous segment of the genome and reverses its order.

Initialize the inversion mutation strategy.

Parameters:

Name	Type	Description	Default
`segment_probability`	`float`	Probability in [0.0, 1.0] of applying the inversion mutation.	required

Raises:

Type	Description
`ValueError`	If segment_probability is not in [0.0, 1.0].

`mutate(individual)`

Apply inversion mutation to an individual.

With probability segment_probability, a random contiguous segment [start:end) of the genome is reversed.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	Individual to mutate.	required

Returns:

Type	Description
`Individual`	A new Individual with a mutated genotype.

`MultipleMutation(strategies, probabilities=None, random_state=None)`

Bases: GEMutationStrategy

Multiple mutation strategies combiner.

Randomly selects and applies one mutation strategy according to provided probabilities.

Initialize the multiple mutation strategy.

Parameters:

Name	Type	Description	Default
`strategies`	`list[GEMutationStrategy]`	List of mutation strategies to choose from.	required
`probabilities`	`Optional[list[float]]`	Optional selection probabilities for each strategy. If None, all strategies are selected uniformly.	`None`

Raises:

Type	Description
`ValueError`	If probabilities length does not match strategies or if probabilities sum to zero.

`inject_rng(rng, np_rng=None)`

Inject RNGs to individual mutations for deterministic runs Args: rng: np_rng:

Returns:

`mutate(individual)`

Apply a randomly selected mutation strategy.

One mutation strategy is selected according to the configured probabilities and applied to the individual.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	Individual to mutate.	required

Returns:

Type	Description
`Individual`	A mutated Individual.

`SubtreeMutation(mutation_probability, non_terminals, tree_generator, mutation_max_depth, mutation_events=1, random_state=None)`

Bases: GEMutationStrategy

Subtree mutation replaces randomly selected subtrees within a program tree with newly generated random subtrees. Mutation targets nodes representing non-terminal grammar symbols. The subtree rooted at the selected node is replaced with a newly generated subtree that satisfies grammar constraints. Subtree mutation is a core mutation mechanism in genetic programming and was introduced alongside subtree crossover in early GP research.

Characteristics:

Maintains syntactic validity of individuals.
Introduces structural novelty.
Controls search-space exploration via subtree depth limits.

`mutate(individual)`

Mutate an individual using subtree mutation.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	Parent individual.	required

Returns:

Type	Description
`Individual`	A newly constructed mutated individual.

`SwapMutation(mutation_probability, random_state=None)`

Bases: GEMutationStrategy

Swap mutation strategy. Randomly swaps pairs of genes in the genotype based on a per gene mutation probability.

Initialize the swap mutation strategy.

Parameters:

Name	Type	Description	Default
`mutation_probability`	`float`	Probability in [0.0, 1.0] that a given gene position is selected for swapping.	required

Raises:

Type	Description
`ValueError`	If mutation_probability is not in [0.0, 1.0].

`mutate(individual)`

Apply swap mutation to an individual.

Each gene position is independently selected with mutation_probability. Selected positions are randomly paired and their values swapped.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	Individual to mutate.	required

Returns:

Type	Description
`Individual`	A new Individual with a mutated genotype.

`finchge.operators.replacement`

`GenerationalReplacement(max_best, random_state=None)`

Bases: GEReplacementStrategy

Generational replacement with protected elitism for single-objective optimization.

At each generation the entire offspring population replaces the old one, but a fixed number of elite individuals from the previous generation are guaranteed a slot in the next — they are never competed against offspring. This matches the behaviour of PonyGE2's generational replacement (Whitley, 1989).

Elites are taken from the best elite_size individuals in the old population and placed unconditionally into the result. The remaining population_size - elite_size slots are filled by the best-ranked offspring.

Parameters:

Name	Type	Description	Default
`max_best`	`bool`	If True, higher fitness is better (maximisation). If False, lower fitness is better (minimisation).	required

Notes

Elites are selected exclusively from the old population.
An elite is never displaced by a better offspring (protected elitism).
Fitness values must be evaluated before calling replace.
Individuals with unusable fitness are excluded from the elite pool.

`replace(new_population, old_population, elite_size, population_size)`

Applies protected elitism: reserves the best elite_size individuals from the old population unconditionally, then fills the remaining slots with the best-ranked offspring.

Parameters:

Name	Type	Description	Default
`new_population`	`list[Individual]`	Offspring generated this generation.	required
`old_population`	`list[Individual]`	Previous generation population.	required
`elite_size`	`int`	Number of individuals from old_population to preserve.	required
`population_size`	`int`	Target size of the returned population.	required

Returns:

Type	Description
`list[Individual]`	List of `population_size` individuals: elites first, then
`list[Individual]`	the best `population_size - elite_size` offspring.

`NSGA2ElitistReplacement(maximize_flags, random_state=None)`

Bases: GEReplacementStrategy

NSGA-II elitist replacement strategy using Pareto rank and crowding distance.

This replacement strategy implements the environmental selection step of NSGA-II. It preserves elite individuals from the previous population and fills the remaining population by selecting individuals from successive Pareto fronts, prioritizing diversity via crowding distance.

The strategy assumes that: - Non-dominated sorting has already been performed. - Each individual has a valid Pareto rank stored in meta["rank"]. - Crowding distance is computed per front using NSGA-II rules.

This class is not compatible with NSGA-III, as it relies on crowding distance for diversity preservation. For NSGA-III, reference-point-based environmental selection must be used instead.

Parameters:

Name	Type	Description	Default
`maximize_flags`	`list[bool]`	List indicating whether each objective should be maximized (True) or minimized (False). This is used to perform non-dominated sorting when combining parent and offspring populations.	required

Methods:

Name	Description
`replace`	Performs elitist environmental selection according to NSGA-II.

`replace(new_population, old_population, elite_size, population_size)`

Combines old and new populations, runs non-dominated sorting and crowding-distance ranking, and selects population_size individuals from successive Pareto fronts, with elite_size guaranteed elites from the previous generation prepended.

Parameters:

Name	Type	Description	Default
`new_population`	`list[Individual]`	Offspring generated this generation.	required
`old_population`	`list[Individual]`	Previous generation population.	required
`elite_size`	`int`	Number of top-ranked old individuals guaranteed a slot.	required
`population_size`	`int`	Target size of the returned population.	required

Returns:

Type	Description
`list[Individual]`	List of `population_size` individuals ranked by Pareto front
`list[Individual]`	and crowding distance.

`RandomElitistReplacement(max_best, random_state=None)`

Bases: GEReplacementStrategy

Random replacement strategy with elitism for single-objective optimization.

This replacement strategy preserves a fixed number of elite individuals from the current population and fills the remaining population slots by randomly sampling from the non-elite individuals of the old population and the newly generated individuals.

Elites are selected based on a scalar fitness value, using either maximization or minimization as specified at construction time. Diversity is introduced through random sampling rather than explicit distance or dominance measures.

Parameters:

Name	Type	Description	Default
`max_best`	`bool`	If True, higher fitness values are considered better (maximization). If False, lower fitness values are considered better (minimization).	required

Methods:

Name	Description
`replace`	Preserves elite individuals and randomly selects remaining individuals to form the next generation.

Raises:

Type	Description
`ValueError`	If `elite_size` is invalid.
`ValueError`	If any individual does not have a single-objective fitness representation (fitness list length != 1).
`ValueError`	If there are not enough eligible individuals to fill the population.

Notes

Assumes that fitness values are evaluated prior to replacement.
Does not use Pareto dominance, crowding distance, or reference-point niching.
Random sampling is performed without replacement.

`finchge.config`

`FinchConfig(data)`

Configuration manager for Grammar Evolution (GE) utils.

FinchConfig provides an interface for loading, accessing, and modifying GE configuration from YAML and INI files.

Can load conifg as dictionaries (dict[str, dict[str, Any]]) through a construtor of from_file. Supports automatic file discovery and typed access to configuration sections. When working in a single proplem project with one configuration, FinchConfig can detect the file automatiaally. Although automatic config detection is handy during quick tasks such as checking grammar and other components. For more organized utils it is recommended to provide the actual file. As this class may silently pick the config file from current working directory if available any.

Attributes:

Name	Type	Description
`_data`		Internal storage for configuration data as nested dictionaries.

Initializes a FinchConfig instance with configuration data.

Parameters:

Name	Type	Description	Default
`data`	`dict[str, dict[str, Any]]`	Nested dictionary where outer keys are section names and inner dictionaries contain section key-value pairs. Example: Text Only `{"eperiment": {"start_symbol": "<expr>"}, ...}`	required

Note

Prefer using the class methods from_file, from_yaml, or from_ini to create instances rather than calling this constructor directly.

`experiment` `property`

Retrieves the experiment configuration section. Returns: Dictionary of experiment configuration parameters.

`ge` `property`

Retrieves the grammatical evolution configuration section.

Typically contains

start_symbol: The starting non-terminal symbol
rules: Grammar production rules
max_depth: Maximum derivation depth
crossover_rate: Probability of applying crossover
mutation_rate: Probability of applying mutation
selection_method: Selection strategy (tournament, roulette, etc.)
tournament_size: If using tournament selection
other related parameters

Returns:

Type	Description
`dict[str, Any]`	Dictionary of grammar configuration parameters.

`parallel` `property`

Retrieves the parallel configuration section.

Returns:

Type	Description
`dict[str, Any]`	Dictionary of experiment configuration parameters.

`copy(update=None)`

Creates a deep copy of the configuration with optional updates.

Parameters:

Name	Type	Description	Default
`update`	`dict[str, dict[str, Any]] \| None`	Optional dictionary of updates to apply to the copy. Format: {section_name: {key: value, ...}, ...} If a section doesn't exist, it will be created.	`None`

Returns:

Type	Description
`FinchConfig`	A new FinchConfig instance with the copied (and optionally updated) data.

Example

Text Only

>>> new_config = config.copy({
...     "experiment": {"num_generations": 500},
...     "new_section": {"key": "value"}
... })

`from_dict(data)` `classmethod`

Loads dict as FinchConfig. Internally finchGE uses FinchConfig class. Args: data (dict[str, Any]): configuration as a dictionary

Returns: FinchConfig instance

`from_file(path=None)` `classmethod`

Creates a FinchConfig instance from a configuration file.

Automatically detects file format based on extension and loads the configuration. If no path is provided, searches for common config filenames in the current directory.

If the config files follow expected name (ge_config) and are in current working directory, they can be automatically detected. Othewise they have to be passed as path argument.

Parameters:

Name	Type	Description	Default
`path`	`str \| None`	Optional path to configuration file. If None, searches for 'ge_config.yaml', 'ge_config.yml', or 'ge_config.ini' in the current directory.	`None`

Returns:

Type	Description
`FinchConfig`	FinchConfig instance loaded with configuration data.

Raises:

Type	Description
`FileNotFoundError`	If no configuration file is found.
`RuntimeError`	If multiple configuration files are found (when path is None).
`ValueError`	If the file format is unsupported or the file content is invalid.

`from_ini(path)` `classmethod`

Creates a FinchConfig instance from an INI configuration file.

Parses INI sections and converts string values to appropriate Python types using ast.literal_eval. Values that cannot be evaluated remain as strings.

Parameters:

Name	Type	Description	Default
`path`	`str`	Path to the INI configuration file.	required

Returns:

Type	Description
`FinchConfig`	FinchConfig instance with parsed configuration data.

Raises:

Type	Description
`FileNotFoundError`	If the INI file does not exist.
`Error`	If the INI file is malformed.

Example INI format

Text Only

[grammar]
start_symbol = "<expr>"
max_depth = 10

[initialisation]
population_size = 100
method = "ramped_half_and_half"

`from_yaml(path)` `classmethod`

Creates a FinchConfig instance from a YAML configuration file.

Parameters:

Name	Type	Description	Default
`path`	`str`	Path to the YAML configuration file.	required

Returns:

Type	Description
`FinchConfig`	FinchConfig instance with parsed configuration data.

Raises:

Type	Description
`FileNotFoundError`	If the YAML file does not exist.
`YAMLError`	If the YAML file is malformed.
`ValueError`	If the top-level YAML structure is not a dictionary.

Example YAML format:

Text Only

    grammar:
        start_symbol: "<expr>"
        max_depth: 10
    initialisation:
        population_size: 100
        method: "ramped_half_and_half"

`section(name)`

Retrieves a configuration section by name.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name of the configuration section.	required

Returns:

Type	Description
`dict[str, Any]`	Dictionary containing the section's key-value pairs.
`dict[str, Any]`	Returns an empty dictionary if the section does not exist.

`to_dict()`

Converts the FinchConfig instance into a dict

Returns: dictionary for the FinchConfig

`to_json(indent=2)`

Converts the GECOnfig instance to json Args: indent: indent for the json

Returns: json string

`ConfigValidator`

Defines validation rules for FinchGE configuration files.

This class contains the schema and validation logic used to verify the structure and values of a FinchGE configuration.

Attributes:

Name	Type	Description
`MANDATORY_SECTIONS`	`set[str]`	Sections that must be present for a valid configuration.
`RECOMMENDED_SECTIONS`	`set[str]`	Sections that are optional but strongly recommended.
`OPTIONAL_SECTIONS`	`set[str]`	Fully optional sections.
`REQUIRED_FIELDS`	`dict[str, set[str]]`	Required fields for each configuration section.
`FIELD_VALIDATORS`	`dict[str, dict[str, Callable]]`	Field-level validators keyed by section and field name.

`finchge.fitness`

`GEFitnessFunction(maximize=False, case_data_key=None)`

Bases: ABC

Base class for a fitness function used in genetic algorithms.

This abstract class defines the interface for evaluating the fitness of a given phenotype. Subclasses must implement the evaluate method.

Parameters:

Name	Type	Description	Default
`maximize`	`bool`	Indicates whether the goal is to maximize (True) or minimize (False) the fitness score.	`False`

`required_context_keys` `property`

Declare what context keys this fitness function needs. Override this in subclasses that need more than y_pred/y_true.

`evaluate(context)` `abstractmethod`

Evaluate the fitness score using the provided context.

Parameters:

Name	Type	Description	Default
`context`	`EvaluationContext`	EvaluationContext Typeddict containing evaluation inputs, such as 'y_pred', 'y_val', 'x_val', etc.	required

Returns:

Name	Type	Description
`Fitness`	`Fitness`	The computed fitness.

`AccuracyFitness()`

Bases: GEFitnessFunction

Fitness function that evaluates model accuracy on a validation set.

This metric computes the classification accuracy between the predicted labels and the ground-truth labels from the validation data. It is intended to be used in supervised classification tasks, and is a maximization objective.

Methods:

Name	Description
`evaluate`	Computes accuracy using 'y_pred' and 'y_true' from the context.

`evaluate(context)`

Evaluates the accuracy of the model's predictions.

Parameters:

Name	Type	Description	Default
`context`	`EvaluationContext`	EvaluationContext Typeddict containing evaluation inputs. Must include: - 'y_pred': Predicted labels from the model (numpy array or array-like). - 'y_test': True labels for the test set (numpy array or array-like).	required

Returns:

Name	Type	Description
`float`	`Fitness`	Accuracy score between 0.0 and 1.0

Raises:

Type	Description
`ValueError`	If y_test and y_pred have different shapes or are empty

`MAEFitness()`

Bases: GEFitnessFunction

Mean Absolute Error fitness for regression problems. Lower values indicate better fit (minimization objective).

`evaluate(context)`

Calculate Mean Absolute Error between true and predicted values.

Parameters:

Name	Type	Description	Default
`context`	`dict[str, Any]`	Must contain: - 'y_true': Ground truth values (array-like) - 'y_pred': Predicted values (array-like)	required

CrossEntropy (log loss) fitness for classification.

Lower values are better (minimize). Requires predicted probabilities.

`HingeLossFitness()`

Bases: GEFitnessFunction

Mean Hinge Loss fitness for binary classification problems. This fitness evaluates how well a model separates two classes by measuring the margin between predictions and true labels. Hinge loss is calculated as hinge_i = max(0, 1 - y_true_i * y_pred_score_i) Should be used only for margin-based classifiers. The ground truth should be encoded between -1 and 1 , not 0/1. Similarly, the prediction scores should be raw model outputs (real-valued scores), representing the model’s confidence or margin, not class labels. The overall fitness value is the mean hinge loss across all samples.

`evaluate(context)`

Parameters:

Name	Type	Description	Default
`context`	`dict[str, Any]`	Context containing `y_true` labels encoded as -1 or +1, and `y_pred_score` with raw model outputs.	required

Returns:

Type	Description
`Fitness`	Fitness value for the hinge loss.

`RewardFitness(maximize=True, optimal_fitness=None)`

Bases: GEFitnessFunction

Fitness function for control problems with reward maximization. Simply returns the reward (higher is better).

Initialize reward fitness.

Parameters:

Name	Type	Description	Default
`maximize`	`bool`	Whether to maximize (True) or minimize (False)	`True`
`optimal_fitness`	`Optional[float]`	Known optimal fitness value	`None`

`evaluate(context)`

Evaluate fitness from context.

Expected context keys

y_pred: Array of rewards from runner

Parameters:

Name	Type	Description	Default
`context`	`dict[str, Any]`	context with y_pred	required

Returns:

Type	Description
`Fitness`	Total reward (sum across episodes)

`StringMatchFitness(target)`

Bases: GEFitnessFunction

`finchge.grammar.repair_strategy`

`RepairStrategy`

Bases: ABC

Abstract base class for phenotype repair strategies.

`repair(phenotype)` `abstractmethod`

Repair the given phenotype and return repaired phenotype

`finchge.core.GrammaticalEvolution(fitness_evaluator, grammar=None, config=None, initialiser=None, algorithm=None, expt_logger=None, checkpoint_manager=None, random_state=None)`

Bases: RandomStateMixin

Grammatical evolution class for running the evolution. This is the evolution controller class responsible for running FinchGE utils. This class is responsible for wiring up different components. This includes loading experiment configurations, loading grammar, initializing the initial population, setting up the fitness evaluator, and using a genetic algorithm to run the evolution. GrammaticalEvolution class is also responsible for running the evolution loop.

Parameters:

Name	Type	Description	Default
`fitness_evaluator`	`FitnessEvaluator`	Evaluator to evaluate the fitness of individuals.	required
`grammar`	`Optional[Grammar]`	BNF Grammar to be used. If not provided config must be available and must contain grammar_file value	`None`
`config`	`Optional[FinchConfig \| dict[str, Any]]`	Configuration settings for the GE algorithm.	`None`
`initialiser`	`GEInitialiser \| GETreeInitialiser \| None`	Initializer class either integer based or tree based initializer	`None`
`algorithm`	`Optional[Any]`	Evolutionary algorithm to be used e.g., GA, NSGA.	`None`
`expt_logger`	`BaseLogger \| None`	Logger used for experiment-level records.	`None`
`checkpoint_manager`	`CheckpointManager \| None`	Checkpoint manager used to save and resume runs.	`None`
`random_state`	`Optional[int]`	Seed for reproducible randomness.	`None`

Note: If algorithm is not provided, GA will be used (provided that config is available)

`halt(min_gens_allowed=0)`

halt signal to stop generation loop : can specify minimum allowed generations, if halt signal received it will wait if min_generations are not satisfied yet, if halt signal is received after minimum generations allowed, will halt immediately

Parameters:

Name	Type	Description	Default
`min_gens_allowed`	`int`	Minimum number of generations before halt	`0`

`run()`

Finds the fittest individual ( or a pareto front for multi-objective) in the population and logs the results.

`finchge.fitness.FitnessEvaluator(fitness_functions, mapper, runner=None, encode_trees=False, parallel_config=None, require_case_data=False)`

Fitness evaluator is used to evaluate the fitness of the individuals. It is a wrapper for fitness function to allow convenient fitness evaluation especially for use cases where data and models are involved. For example in hyperparameter optimization or neural architecture search. The phenotype is converted to model using the 'phenotype_to_model' parameter.

Supports both model-based evaluation (e.g., using train/validation data) and phenotype-only evaluation (e.g., string matching or symbolic tasks).

Parameters:

Name	Type	Description	Default
`fitness_functions`	`Union[GEFitnessFunction, list[GEFitnessFunction]]`	One or more fitness function instances.	required
`mapper`	`GenotypeMapper`	Genotype mapper.	required
`runner`	`PhenotypeRunner \| None`	Runner for evaluating phenotypes.	`None`
`encode_trees`	`bool`	Whether to encode trees to integer genotypes. This is needed if genome-based operators are used after tree-based initialization.	`False`
`parallel_config`	`dict[str, Any] \| None`	Parallel section of the config.	`None`
`require_case_data`	`bool`	Whether case-based evaluation is required, for example when using lexicase selection.	`False`

`count_objectives()`

Counts the number of objectives.

Returns:

Name	Type	Description
`int`	`int`	Number of fitness functions (objectives).

`derive_eval_seed(base_seed, phenotype, env_version)`

Although we share same random state using private RNG shared in different components, It will not work for parallelized evaluation. Each process, or even each machine, will run evaluations in a different order, so random state will fail.

So, we must use some deterministic way to evaluate. One way is to use different seeds in deterministic way Same phenotype and base seed combination should result in same seed in each run.

So we derive a 32-bit seed for evaluation/training in a worker process. Same (base_seed, phenotype, env_version) will give same evaluation seed.

`evaluate_individual(individual)`

Evaluates a given phenotype by training the corresponding model and applying fitness functions.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	An individual configuration or representation used to generate a model.	required

Returns:

Name	Type	Description
`list`	`None`	A list of fitness scores corresponding to each fitness function.

`evaluate_population(population)`

NEW: Evaluates entire population. Uses parallel execution if enabled, sequential otherwise.

Users should call this instead of manually calling evaluate().

`get_env_version()`

Return current environment version. To distinguish environments in adaptive environments.

`get_fitness_functions()`

Returns the list of fitness function instances.

Returns:

Name	Type	Description
`list`	`list[GEFitnessFunction]`	List of fitness function objects.

`get_maximize_flags()`

Returns the maximize flags for each objective :return:

`get_objective_names()`

Retrieves human-readable names of the objectives.

Returns:

Name	Type	Description
`list`	`list[str]`	List of objective names derived from fitness function class names.

`is_multi_objective()`

Determines if the evaluation is multi-objective.

Returns:

Name	Type	Description
`bool`	`bool`	True if more than one fitness function is provided, False otherwise.

`refresh_mapping(ind)`

Ensure that genotype, phenotype, and tree are consistent.

Rules

If already mapped, do nothing.
If tree exists and genotype is absent:
- reverse-map genotype when encode_trees=True
- otherwise derive phenotype directly from tree
If genotype exists, map through the mapper

`set_env_version(version)`

Update environment version (e.g. generation, dataset update).

`shutdown()` `async`

Cleanup parallel resources

`finchge.utils.cache`

`CacheManager(*, cache_type='lru', experiment_id=None, evaluator_namespace=None, **kwargs)`

Bases: Generic[V]

Central cache manager.

Responsibilities: - key construction (hashing, namespacing) - backend selection (LRU / disk ) - fitness cache access

`finchge.utils.logger`

`BaseLogger`

Bases: ABC

Interface for experiment tracking. Observer Pattern

`on_generation_end(generation, population, best=None, pareto_front=None, fitness_stats=None)` `abstractmethod`

Called at the end of each generation.

`on_run_end(result)` `abstractmethod`

Called when a run ends.

`on_run_start(log_dir, obj_names, config)` `abstractmethod`

Called when a run starts.

`ExperimentLogger(exclude=None, compress_genotypes=False, log_population_samples=False, sample_size=5, custom_log_hook=None)`

Bases: BaseLogger

Experiment logger

Logs essential data for post-hoc analysis. This provides minimal essential logs only. To add more logs, this class can be extended or wrapped to add custom metrics and analysis.

`on_generation_end(generation, population, best=None, pareto_front=None, fitness_stats=None, extra_data=None)`

Log data at the end of each generation.

Parameters:

Name	Type	Description	Default
`extra_data`	`Optional[dict[str, Any]]`	Additional user-provided data to log	`None`

`on_run_end(result)`

Finalize logging at the end of the run.

`on_run_start(log_dir, obj_names, config, algorithm=None)`

Initialize logging when the experiment run starts.

Parameters:

Name	Type	Description	Default
`algorithm`	`Optional[Any]`	Optional reference to algorithm instance for custom logging hooks	`None`

`setup_logging(logger_id='finch', group_name=None, verbose=False)`

Setup logging for a specific project instance.

`get_log_dir(logger_id='finch')`

summary Retrieve the log directory for a given project instance.

Parameters:

Name	Type	Description	Default
`logger_id`	`str`	description. Defaults to "finch".	`'finch'`

Raises:

Type	Description
`KeyError`	description

Returns:

Name	Type	Description
`str`	`str`	description

`finchge.utils.random_mixin`

`RandomStateMixin(*args, random_state=None, **kwargs)`

Mixin to add random state control All classes using randomness either using python random or numpy random should extend from RandomMixin. And, should provide random_state parmeter in the constructor, Otherwise, the determinism is not guaranteed across runs.

`getstate()`

Called when pickling. Removes unpicklable RNGs and stores only the seed.

`setstate(state)`

Called when unpickling. Restores object state and reinitializes RNGs.

`get_seed_info()`

Get information about the seed for logging

`finchge.utils.results`

`ResultHelper(project_id='finch')`

`StatsHelper`

Utility class for computing population-level fitness statistics.

`compute_fitness_stats(*, individuals, objective_names=None)` `staticmethod`

Compute fitness statistics for a population.

Parameters:

Name	Type	Description	Default
`individuals`	`list[Individual]`	List of evaluated Individuals.	required
`objective_names`	`Optional[list[str]]`	Optional list of objective names. If None, objectives will be indexed numerically.	`None`

Returns:

Type	Description
`list[Dict[str, Any]]`	A list of dictionaries, one per objective, containing:
`list[Dict[str, Any]]`	fitness_name
`list[Dict[str, Any]]`	min
`list[Dict[str, Any]]`	max
`list[Dict[str, Any]]`	avg
`list[Dict[str, Any]]`	std
`list[Dict[str, Any]]`	total_count
`list[Dict[str, Any]]`	valid_count

`finchge.runners`

`PhenotypeRunner(random_state=None)`

Bases: RandomStateMixin, ABC

Base class for all phenotype runners.

Every runner must implement run() which returns (predictions, targets) as numpy arrays.

`get_metadata()`

Optional metadata about the executor/problem.

Returns:

Type	Description
`dict[str, Any]`	Dictionary with metadata (can be empty)

`run(phenotype, context_hints=None)` `abstractmethod`

Run phenotype and return context dictionary.

Parameters:

Name	Type	Description	Default
`phenotype`	`str`	Program to evaluate	required
`context_hints`	`Optional[set[str]]`	What context keys the fitness functions need. None means assume minimal (y_pred/y_true only).	`None`

Returns:

Type	Description
`dict[str, Any]`	Dictionary with at least 'y_pred' and 'y_true', plus any hinted keys.

`SymbolicRegressionRunner(data_train, data_val=None, data_test=None, random_state=None)`

Bases: DirectEvalRunner

Runner for symbolic regression.

The phenotype is interpreted as a symbolic expression and evaluated directly on the active evaluation split.

`ControlRunner(env_factory, n_episodes=1, optimal_fitness=None, random_state=None)`

Bases: PhenotypeRunner, ABC, Generic[T]

Base class for control problem runners.

Type parameter T represents the environment type.

Initialize control runner.

Parameters:

Name	Type	Description	Default
`env_factory`	`Callable[[], T]`	Function that creates new environment instances	required
`n_episodes`	`int`	Number of episodes to run per evaluation	`1`
`optimal_fitness`	`Optional[float]`	Known optimal fitness value	`None`

`get_metadata()`

Return metadata about the control problem.

`run(phenotype, context_hints=None)`

Run multiple episodes and return rewards.

`LogicRunner(X, y, interpreter=None, random_state=None)`

Bases: PhenotypeRunner

Runner for logic problems like multiplexer.

Evaluates phenotype on all truth table combinations. Returns predictions and true values for fitness calculation.

`run(phenotype, context_hints=None)`

Evaluate phenotype on all input combinations.

Parameters:

Name	Type	Description	Default
`context_hints`	`Optional[set[str]]`	What context keys the fitness functions need. None means assume minimal (y_pred/y_true only).	`None`
`phenotype`	`str`	Logic program string	required

Returns:

Type	Description
`dict[str, Any]`	dict with context having y_pred, y_true and other context info as requested in context_hints argument

`MLModelRunner(data_train, data_val, model_parser, data_test=None, random_state=None)`

Bases: TrainEvalRunner

Runner for problems where phenotype builds and trains a model. This runner can be used in problems such as hyperparameter optimization and architecture search etc.

The phenotype is parsed into a trainable model or configuration, fitted on the training split, and evaluated on the active evaluation split.

`finchge.utils.checkpoint`

`CheckpointManager`

Bases: Protocol

Checkpoint manager interface.

`exists()`

Return True if at least one checkpoint exists.

`load_latest(*, expected_config_hash=None)`

Load and return the latest checkpoint.

`save(*, generation, population, algorithm, config, py_rng_state, np_rng_state)`

Persist a checkpoint and return the path written.

`should_save(generation)`

Return True if this generation should be checkpointed. Based on 'every' parameter's value, decide whether to save or not.

`CheckpointState(version, config_hash, generation, population, algorithm, rng_state)` `dataclass`

All state required to resume a GE run deterministically.

`FileCheckpointManager(directory, *, every=10, keep_last=5, filename_prefix='checkpoint_gen_')`

Bases: CheckpointManager

Filesystem checkpoint manager using pickle. Population and Algorithm must be pickle-serializable. RNG state is stored to guarantee deterministic resume. Files are written as: checkpoint_gen_[GEN].pkl

`load_latest(*, expected_config_hash=None)`

Load latest checkpoint and verify config hash.

`should_save(generation)`

Return True if this generation should be checkpointed. Based on 'every' parameter's value, decide whether to save or not.

`RNGState(py_state, np_state)` `dataclass`

Serializable RNG states for deterministic resume.

`stable_config_hash(config)`

Create a hash of config for checkpoint compatibility checks.

`finchge.symbolic`

`SymbolicExpression(expression)`

Parse and evaluate a restricted symbolic expression.

The class provides a small, controlled expression language intended for symbolic regression and genetic programming. Expressions are parsed using Python's AST and evaluated using protected numerical operators to avoid NaN/Inf where possible.

Supported features

Variables of the form x0, x1, x2, ...
Basic arithmetic operations (+, -, , /, *)
Selected mathematical functions (sin, cos, exp, log, etc.)
Constant expressions
Optional normalization of NumPy-style column syntax (e.g. x[:, 0] → x0)
Simple structural metrics (node count, depth)

Notes

This is not a full symbolic algebra system. Expressions are evaluated numerically and are not automatically simplified.

Attributes:

Name	Type	Description
`original_expression`	`str`	The input expression string as provided by the user.
`expression`	`str`	Normalized expression string after preprocessing (e.g. slice syntax rewritten).
`variables`	`list[str]`	Sorted list of variable names used in the expression (e.g. ['x0', 'x2']).
`_tree`	`Expression`	Parsed AST representation of the expression.

Raises:

Type	Description
`ExpressionSyntaxError`	If the expression contains unsupported syntax.
`ExpressionEvaluationError`	If evaluation fails at runtime.

`eval(X)`

Evaluate the expression on X.

Parameters

X: Input data. Expected shape is (n_samples, n_features). A scalar is treated as shape (1, 1). A 1D array is treated as a single sample with multiple features.

Returns

np.ndarray Shape (n_samples,).

`GERegressor(grammar, config, fitness_functions, generations=100, population_size=100, random_state=None)`

Bases: RandomStateMixin, BaseEstimator, RegressorMixin

Scikit-learn compatible symbolic regression model based on Grammatical Evolution (GE).

The estimator evolves mathematical expressions defined by a grammar and selects the best-performing individual according to the provided fitness function(s). The resulting symbolic expression can then be evaluated to make predictions.

Attributes:

Name	Type	Description
`ge_result_`	`GEResult`	Result object returned by the GE run.
`selected_individual_`	`Individual \| None`	Individual currently selected for prediction.
`selected_expression_`	`SymbolicExpression \| None`	Parsed symbolic expression corresponding to the selected individual.

Initialize the symbolic regression estimator.

Parameters:

Name	Type	Description	Default
`grammar`	`Grammar`	Grammar defining the search space of valid symbolic expressions.	required
`config`	`FinchConfig \| dict[str, Any]`	Configuration controlling GE behavior (operators, mutation rates, limits, etc.). A dictionary will be converted into a `FinchConfig`.	required
`fitness_functions`	`GEFitnessFunction \| list[GEFitnessFunction]`	Fitness function(s) used to evaluate candidate expressions.	required
`generations`	`int`	Number of evolutionary generations. Defaults to 100.	`100`
`population_size`	`int`	Number of individuals per generation. Defaults to 100.	`100`
`random_state`	`int \| None`	Random seed for reproducibility.	`None`

`fit(X, y)`

Run grammatical evolution to discover a symbolic expression that fits the provided data.

This launches the evolutionary search, evaluates individuals using the configured fitness function(s), and stores the resulting GE run along with the best individual (if available).

Parameters:

Name	Type	Description	Default
`X`	`Any`	Training features.	required
`y`	`Any`	Target values.	required

Returns:

Name	Type	Description
`GERegressor`	`GERegressor`	The fitted estimator.

`predict(X)`

Evaluate the selected symbolic expression on the input data.

Parameters:

Name	Type	Description	Default
`X`	`Any`	Input features.	required

Returns:

Type	Description
`NDArray[float64]`	NDArray[np.float64]: Predicted target values.

Raises:

Type	Description
`RuntimeError`	If the estimator has not been fitted or no individual has been selected.

`select_individual(individual)`

Manually select an individual from the GE population for prediction.

Useful for multi-objective runs where no single best individual is automatically chosen.

Parameters:

Name	Type	Description	Default
`individual`	`Individual`	Individual whose phenotype will be used as the prediction expression.	required

`finchge.grammar.range_handlers`

`ArraySliceHandler()`

Bases: RangeHandler

Handles x[:, 0], x[:, 1], etc. This will be useful if we don't have many columns we can just have a grammar like: ::= x[:, 0] | x[:, 1] However, when the number of columns is higher we may need ArraySliceRangeHandler.

`ArraySliceRangeHandler()`

Bases: RangeHandler

Handles Array slicer with longer range. For example: x[:, 0..4] will expand to x[:, 0], x[:, 1], ..., x[:, 4]

If it's not a range and we just need to pick columns randomly we can use individual array slices.

`CharClassHandler()`

Bases: RangeHandler

Handles character classes: [a-z], [A-Za-z0-9_]

`CharRangeHandler()`

Bases: RangeHandler

Handles character ranges: 'a'..'z' or 'a'..'z' step 2

`NumericRangeHandler()`

Bases: RangeHandler

Handles numeric ranges: 10..50, -5..5, 10..50 step 5

`RangeHandler(priority=0)`

Bases: ABC

Base class for all range handlers. This is the interface for different types of range handlers. There are several range handlers currently supported by finchGE BNFGrammarParser. To add custom range support new range handler can be created and should register by calling.

parser.range_registry.register(MyCustomRangeHandler())

All range handlers must inherit from RangeHandler.

`pattern` `abstractmethod` `property`

Return regex pattern that matches this range type.

`can_handle(token)` `abstractmethod`

Check if this handler can process the token.

`expand(token)` `abstractmethod`

Expand the range into individual symbols.

`RangeHandlerRegistry()`

Registry for all range handlers with priority support.

`expand_token(token)`

Expand a token using the appropriate handler (checking in priority order).

`get_combined_pattern()`

Get a combined regex pattern for all handlers in priority order.

`get_handler(token)`

Find a handler that can process the token (checking in priority order).

`register(handler)`

Register a new handler, maintaining priority order.

`SymbolicVariableRangeHandler()`

Bases: RangeHandler

Handles variable ranges for evaluation .

Replaces ArraySliceRangeHandler. Example: x[0..4] expands to x0, x1, x2, x3, x4 OR x0..4 expands to x0, x1, x2, x3, x4

`SymolicVariableHandler()`

Bases: RangeHandler

Handles individual variables like x0, x1, etc in symbolic regression, supporting easy evaluation with SymPy like libraries.

`finchge.benchmarks`

`Benchmark(random_state=None)`

Bases: RandomStateMixin, ABC

Abstract base class for all benchmarks.

`uses_data()`

Check if this benchmark uses traditional train/test data.

Returns:

Type	Description
`bool`	True if the benchmark implements _generate_data, False for control problems

`finchge.benchmarks.regression`

`KozaQuarticBenchmark(random_state=None, train_samples=None, test_samples=None, x_range=None, train_type='uniform', test_type='grid')`

Bases: Benchmark

`KeijzerBenchmark(version, random_state=None, train_samples=None, test_samples=None)`

Bases: Benchmark

`NguyenBenchmark(version, random_state=None, train_samples=None, test_samples=None, x_range=None, noise_std=None, train_type='grid', test_type='uniform')`

Bases: Benchmark

`VladislavlevaBenchmark(version, random_state=None, train_samples=1024, test_samples=5000, x_range=None, noise_std=None)`

Bases: Benchmark

`finchge.benchmarks.logic`

`MultiplexerBenchmark(version, random_state=None)`

Bases: Benchmark

`finchge.benchmarks.control`

`MazeBenchmark(version='medium', max_steps=None)`

Bases: Benchmark

`grammar()`

Return the Grammar object.

`MazeEnvironment(grid, max_steps=200, step_penalty=-1.0, goal_reward=100.0)`

Bases: ControlEnvironment

`from_version(version='medium')` `classmethod`

Creating env for given version

`get_available_actions()`

Returns only the actions that won't result in hitting a wall or going OOB.

`MazeInterpreter()`

Interpreter for maze navigation programs.

`SantaFeTrailBenchmark(random_state=None, max_steps=None)`

Bases: Benchmark

Santa Fe Trail (Artificial Ant) benchmark.

The ant must navigate a 32x32 grid with 89 food pellets along a winding trail. Fitness is the number of food pellets eaten within the time limit.

`grammar_str()`

Return grammar for Santa Fe Trail problem.

`SantaFeEnvironment(grid, max_steps, total_food, start_pos, start_dir)`

Bases: ControlEnvironment

`from_default()` `classmethod`

Static factory to create environment

`get_action_space()`

Return complete action space.

`get_available_actions()`

Return available actions (all actions always available).

`SantaFeInterpreter()`

Interpreter for Santa Fe ant programs using cleaner grammar.

`evaluate(tokens, observation, pc=0)`

Evaluate tokenized program.

Parameters:

Name	Type	Description	Default
`tokens`	`List[str]`	List of tokens from tokenize()	required
`observation`	`bool`	Current food-ahead sensor value	required
`pc`	`int`	Program counter (index into tokens)	`0`

Returns:

Type	Description
`Tuple[Optional[str], int]`	Tuple of (action, new_pc)

`program_to_string(tokens)`

Convert tokens back to readable program string. Tokens separated by space

`tokenize(program)`

Convert program string to list of tokens.

`CartPoleBenchmark(random_state=None, max_steps=None, n_episodes=1)`

Bases: Benchmark

`grammar()`

Return the Grammar object.

`CartPoleEnvironment(max_steps, gravity, mass_cart, mass_pole, length, force_mag, tau, theta_threshold_degrees, x_threshold, random_state=None)`

Bases: ControlEnvironment

`CartPoleInterpreter()`

Interpreter for cartpole balancing programs.

API Reference

finchge.core.Population(initialiser, population_size)

from_individuals(individuals, population_size) classmethod

finchge.core.Individual(*, genotype=None, phenotype=None, used_genome=None, used_codon_count=0, invalid=False, tree=None)

clone()

from_genotype(genotype) classmethod

from_tree(tree) classmethod

get_meta(key, expected_type=None)

mutated(mutation_strategy)

sort_key(maximize)

finchge.grammar.Grammar(grammar_str, parser=None)

analyze()

compute_min_ramp(*, population_size, max_init_depth)

compute_permutations_by_depth(max_depth)

describe(expanded=True)

from_file(filename, parser=None) classmethod

finchge.grammar.parser

GrammarParser

parse() abstractmethod

BNFGrammarParser(grammar_str)

parse()

remove_comments(lines)

consolidate_multiline_rules(lines)

split_choices(rhs_symbols)

finchge.grammar.rule

Rule(symbol, choices)

finchge.grammar.GenotypeMapper(*, grammar, max_tree_depth=None, max_recursion_depth=None, max_wraps=6, repair_strategy=None, random_state=None)

map(genotype)

reverse_map(tree, *, codon_size=127, pad_to_length=None, pad_mode='random')

finchge.grammar.mapper.MappingResult(phenotype, used_genome, used_codon_count, invalid, tree, tree_str) dataclass

finchge.grammar.derivation_tree.TreeNode(symbol)

add_child(child)

clone()

from_dict(data) classmethod

from_json(json_str) classmethod

from_string(s) staticmethod

replace_subtree_with(new_subtree)

size()

swap_subtree_with(other)

to_dict()

to_json(indent=2)

to_phenotype()

to_string()

finchge.algorithm

BaseAlgorithm(random_state=None)

BaseAlgorithmSO(random_state=None)

BaseAlgorithmMO(random_state=None)

GeneticAlgorithm(selection, crossover, mutation, replacement, elite_size, fitness_evaluator, random_state=None)

evolve_one_generation(population)

SteadyStateGA(selection, crossover, mutation, fitness_evaluator, random_state=None)

IslandGA(num_islands, migration_interval, migration_size, selection, crossover, mutation, replacement, elite_size, fitness_evaluator, random_state=None)

NSGA2(selection, crossover, mutation, replacement, elite_size, fitness_evaluator, random_state=None)

evolve_one_generation(population)

get_pareto_front(population)

sort_population(population)

NSGA3(selection, crossover, mutation, fitness_evaluator, num_divisions=12, random_state=None)

evolve_one_generation(population)

get_pareto_front(population)

finchge.initialisation

RandomGenomeInitialiser(genome_length, codon_size, random_state=None)

from_config(config) classmethod

RVDInitialiser(genome_length, codon_size, population_size, random_state=None, mapper=None)

from_config(config) classmethod

reset()

FullTreeInitialiser(init_min_depth, init_max_depth, strict_full=True, random_state=None)

GrowTreeInitialiser(init_min_depth, init_max_depth, random_state=None)

PIGrowInitialiser(init_max_depth, population_size, random_state=None)

from_config(config) classmethod

RHHInitialiser(init_max_depth, population_size, strict_full=True, random_state=None)

from_config(config) classmethod

PTC2Initialiser(target_size, random_state=None, max_depth=None)

RampedPTC2Initialiser(init_min_size, init_max_size, population_size, max_depth=None, random_state=None)

finchge.operators.selection

GESelectionStrategy(max_best, random_state=None)

TournamentSelection(max_best, tournament_size=3, random_state=None)

select(population_size, individuals)

RouletteWheelSelection(max_best, random_state=None)

select(population_size, individuals)

RankSelection(max_best, selection_pressure=1.5, random_state=None)

select(population_size, individuals)

`finchge.core.Population(initialiser, population_size)`

`from_individuals(individuals, population_size)` `classmethod`

`finchge.core.Individual(*, genotype=None, phenotype=None, used_genome=None, used_codon_count=0, invalid=False, tree=None)`

`clone()`

`from_genotype(genotype)` `classmethod`

`from_tree(tree)` `classmethod`

`get_meta(key, expected_type=None)`

`mutated(mutation_strategy)`

`sort_key(maximize)`

`finchge.grammar.Grammar(grammar_str, parser=None)`

`analyze()`

`compute_min_ramp(*, population_size, max_init_depth)`

`compute_permutations_by_depth(max_depth)`

`describe(expanded=True)`

`from_file(filename, parser=None)` `classmethod`

`finchge.grammar.parser`

`GrammarParser`

`parse()` `abstractmethod`

`BNFGrammarParser(grammar_str)`

`parse()`

`remove_comments(lines)`

`consolidate_multiline_rules(lines)`

`split_choices(rhs_symbols)`

`finchge.grammar.rule`

`Rule(symbol, choices)`

`finchge.grammar.GenotypeMapper(*, grammar, max_tree_depth=None, max_recursion_depth=None, max_wraps=6, repair_strategy=None, random_state=None)`

`map(genotype)`

`reverse_map(tree, *, codon_size=127, pad_to_length=None, pad_mode='random')`

`finchge.grammar.mapper.MappingResult(phenotype, used_genome, used_codon_count, invalid, tree, tree_str)` `dataclass`

`finchge.grammar.derivation_tree.TreeNode(symbol)`

`add_child(child)`

`clone()`

`from_dict(data)` `classmethod`

`from_json(json_str)` `classmethod`

`from_string(s)` `staticmethod`

`replace_subtree_with(new_subtree)`

`size()`

`swap_subtree_with(other)`

`to_dict()`

`to_json(indent=2)`

`to_phenotype()`

`to_string()`

`finchge.algorithm`

`BaseAlgorithm(random_state=None)`

`BaseAlgorithmSO(random_state=None)`

`BaseAlgorithmMO(random_state=None)`

`GeneticAlgorithm(selection, crossover, mutation, replacement, elite_size, fitness_evaluator, random_state=None)`

`evolve_one_generation(population)`

`SteadyStateGA(selection, crossover, mutation, fitness_evaluator, random_state=None)`

`IslandGA(num_islands, migration_interval, migration_size, selection, crossover, mutation, replacement, elite_size, fitness_evaluator, random_state=None)`

`NSGA2(selection, crossover, mutation, replacement, elite_size, fitness_evaluator, random_state=None)`

`evolve_one_generation(population)`

`get_pareto_front(population)`

`sort_population(population)`

`NSGA3(selection, crossover, mutation, fitness_evaluator, num_divisions=12, random_state=None)`

`evolve_one_generation(population)`

`get_pareto_front(population)`

`finchge.initialisation`

`RandomGenomeInitialiser(genome_length, codon_size, random_state=None)`

`from_config(config)` `classmethod`

`RVDInitialiser(genome_length, codon_size, population_size, random_state=None, mapper=None)`

`from_config(config)` `classmethod`

`reset()`

`FullTreeInitialiser(init_min_depth, init_max_depth, strict_full=True, random_state=None)`

`GrowTreeInitialiser(init_min_depth, init_max_depth, random_state=None)`

`PIGrowInitialiser(init_max_depth, population_size, random_state=None)`

`from_config(config)` `classmethod`

`RHHInitialiser(init_max_depth, population_size, strict_full=True, random_state=None)`

`from_config(config)` `classmethod`

`PTC2Initialiser(target_size, random_state=None, max_depth=None)`

`RampedPTC2Initialiser(init_min_size, init_max_size, population_size, max_depth=None, random_state=None)`

`finchge.operators.selection`

`GESelectionStrategy(max_best, random_state=None)`

`TournamentSelection(max_best, tournament_size=3, random_state=None)`

`select(population_size, individuals)`

`RouletteWheelSelection(max_best, random_state=None)`

`select(population_size, individuals)`

`RankSelection(max_best, selection_pressure=1.5, random_state=None)`

`select(population_size, individuals)`

`TruncationSelection(max_best, truncation_threshold=0.5, random_state=None)`