Hello World in Genetic Algorithms

First I am going to introduce evolution in a funny way just to make the following presentation of the genetic algorithm easier and thereafter I will program a genetic algorithm step by step.
If you know what evolution is and want to go direct to the introduction of genetic algorithms click here.
If you already know what a genetic algorithm is and would like to discover how to program click here.
I will not go into the details of how to compile and run a C program.
If you do not know C please read this before.

Hein?! What is a Genetic Algorithm?

Well, we need to know what is Evolution to answer it. But evolution is a simple thing, that can be explained easily with circles and squares (maybe with a little bit of color too ;) ):

+ =

In the first figure there is a starting population (the colored circles), in the second figure the population is shown at the environment (the squares are the predators and the gray background are the stones where the population lives). Finally in the third figure only the surviving population is shown. The reason why the red and the gray individuals have survived is that the red intimidates the predators and the gray is much more difficult to find amongst the stones.
You might have noticed that this is not a complete evolution yet, because if we leave the surviving population for some time in an environment something will happen, that is:


In short the evolution in a darwinistic point of view is basically Population + Environment = Surviving Population. And Surviving Population after some time = Surviving Population + New Generation. This Surviving Population + New Generation will form the new Population that will interact with the Environment again, going on and on in this cycle. This cycle should be something like:

Ok, but where is the Genetic Algorithm (GA) in all this explanation? Simple, it is an algorithm that tries to evolve things (programs, equations or anything that you would like and that you can describe) in a way similar to evolution. The ideia is to have better adapted things, just like in the evolution, and if these things are better adapted to some conditions than others, by setting the conditions we could somehow guide the evolution. Not that guiding evolution is a trivial, but it is not that difficult.

An Overview

The GA Cycle diagram is:
Associating the GA Cycle with the Evolution Cycle (we had seen before), it can easily be checked that the only different aspect is the presence of evaluation in the GA Cycle. That is, we try to simulate the Environment press to select beings better adapted by an Evaluation function (fitness function).
The GA begins with an Initial Population, that may be a random combination of the characteristics to be evolved. Later it goes through in a process of Evaluation where each being in the population will be evaluated, that is, it will be assigned to every beings a grade based on its characteristics (the function that gives these grades is called the fitness function).
After the Initial Population was avaliated and each individual has a fitness associated to it. This fitness will be taken into account in order to decide some way of reproducting, normally in a way that the better adapted being (better fitness) will influence the New Generation more than the other less adapted. Then by making part of the Initial Population die, we give place to the New Population that will be the breedings of the Initial Population.
We run this cycle until some number of generations are completed or until some condition is satisfied. (You could note here that if we continue to evolve on the fly, we will end up with a coevolution or something near. But this is for an other tutorial ;) ).
You are tired of explanations, are not you? Well, let's stop talking and have some fun programming. :P

Coding

Here is the secret recipe for the magic. First we start creating a file with the name genetic.cpp and inserting a skeleton of C code:

	#include<stdio.h> 
	#include<stdlib.h>
	
	int main()
	{
		
		return 0;
	}
	


Well, we have nothing more than a "ready to compile" program.
Let's imagine we want to evolve some characteristics. We know which characteristics are good, and which ones are not so good to what we want to evolve (in other words, we are trying to maximize a function).
Suppose that what we want to evolve is a lemonade. Yeah.. what is better in the summer? :) (think of a tea if you are reading this in the winter, hehe ;) )
To evolve this lemonade we have some ingredients in the kitchen and we want that the computer tries some mixture of ingredients and tell us of which ingredients a lemonade should be made in the end of the evolution.
The list of ingredients is: salt, sugar, lemon, egg, water, onion and apple. This would be our chromosome, that is the DNA we want to modify in order to have a lemonade.
But we do not want just one chromosome in the code, but a population of chromosomes, thus by setting an Initial population of 10 randomly chromosomes, we will have:ps2 gameplay bleach

	int i,j,g;	//counters
	int population=10;
	bool chromosome[10][7];	//population

	//initializing population
	for(i=0;i<population;i++)
	{
		for(j=0;j<7;j++)
		{
			//randomize the chromosome 
			chromosome[i][j]=rand()%2;			
		}
	}
	

In the code above the chromosome was set to bool, because we would like to know simply which ingredients to put in the lemonade, thus 0 means not put and 1 means put the ingredient in the lemonade.
We need a Fitness Function to evaluate the population of chromosomes. Then if the ingredients that are present in the lemonade receive a positive coeficient and those that are not present receive a negative coeficient by maximizing this function (selecting the chromosomes with greater fitness) we would guide the evolution to the lemonade. Here we set a function to calculate the fitness:

	int fitness(bool* chromosome)
	{
		// the ingredients are:  0     1     2     3   4    5     6
		// 			salt sugar lemon egg water onion apple
		return ( -chromosome[0] + chromosome[1] + chromosome[2]	
			 -chromosome[3] + chromosome[4] - chromosome[5]	
			 -chromosome[6] );	
	}
	

Here goes the evolution loop:

	for(g=0;g<100;g++)
	{
		printf("generation %d\n",g);

		//Evaluation
		
		//Reproduction

	}

	

Where we evolve the GA for 100 generations and stop to check what kind of chromosomes we get. (In our case what kind of lemonade we have, hehe).
Now working inside the Evolution loop, the Evaluation will be:


	//Evaluation
	int best=0;	//will store the index for the best in the population

	for(i=1;i<population;i++)
	{
		if(fitness(chromosome[best])<fitness(chromosome[i]))
			best=i;	
	}
			
	

In the code above the best chromosome is selected among the population. And the code below use Elitism to reproduce:


	//Reproduction
	for(i=0;i<population;i++)
	{
		//to not reproduct with itself
		//it would be a waste of time 
		if(i!=best)
		{
			for(j=0;j<7;j++)
			{
				//either the gene will be kept the 
				//same or it will be changed
				//to be what the best chromosome is
				if(rand()%2)
					chromosome[i][j]=chromosome[best][j];			
				else
					chromosome[i][j]=chromosome[i][j];			

				//mutation	
				if(rand()%100<4)
					chromosome[i][j]=rand()%2;			
					
			}
						
		}	
	}
	
	printf("best fitness %d\n",fitness(chromosome[best]));
					
			
	

The base of elistism is to select the best and make it reproduce with all of the others (It is like the sexual relationship between bull and cows, where the best bull reproduce with all cows). But we need to put some Mutation in the reproduction to increase the population diversity. Notice also that the name for this kind of operation where 2 chromosomes (in this case, the best one and some other) give birth to the new generation is Crossing Over.

Well, that is it... we made it! if you run you should see the fitness growing til it reaches the number 3, that is the maximum, since only 3 genes are positive in our function.
Here you can download the code.

Further Information: