# Course:CPSC312-2021/NaturalSelection

Authors: Jason Dien, Kuanmin Huang, Maria Sottile

### What is the problem?

Modeling the population of a species with a given birth rate B and death rate D is pretty easy with the formula `N(i+1) = N(i) + B*N(i) - D*N(i)`, where N(i) is the population at time step i.
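As a minimal sketch, the recurrence above can be written in Prolog as follows. The predicate names `next_population/4` and `population_after/5` are hypothetical and chosen only for illustration; they are not taken from the project code.

```prolog
% next_population(+N, +B, +D, -Next)
% One time step of N(i+1) = N(i) + B*N(i) - D*N(i).
next_population(N, B, D, Next) :-
    Next is N + B*N - D*N.

% population_after(+Steps, +N0, +B, +D, -N)
% Apply the one-step rule Steps times, starting from N0.
population_after(0, N, _, _, N).
population_after(Steps, N0, B, D, N) :-
    Steps > 0,
    next_population(N0, B, D, N1),
    Steps1 is Steps - 1,
    population_after(Steps1, N1, B, D, N).
```

For example, `?- population_after(1, 10, 0.5, 0.25, N).` gives `N = 12.5`, so a population of 10 is no longer a whole number after a single step.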

However, this is trivial to implement in Prolog (and doesn’t make a cool project). It also breaks down for small populations (fewer than 10 individuals), where you quickly end up with a fractional number of individuals. We want to build a more interesting natural selection simulation that runs on probabilities and random numbers rather than rates. The probability that an individual survives, reproduces, or dies depends on environmental factors such as temperature, food availability, and proximity to other species.
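To illustrate the idea of drawing a random number per individual instead of multiplying by a rate, here is a small sketch assuming SWI-Prolog's `library(random)`. The predicates `step_population/4` and `count_successes/3` are hypothetical, and the fixed probabilities stand in for values that the real simulation would derive from the environment.

```prolog
:- use_module(library(random)).

% step_population(+N, +PSurvive, +PReproduce, -Next)
% Each of the N individuals survives with probability PSurvive;
% each survivor then adds one offspring with probability PReproduce.
% (Assumed rules for illustration; the project's rules may differ.)
step_population(N, PSurvive, PReproduce, Next) :-
    count_successes(N, PSurvive, Survivors),
    count_successes(Survivors, PReproduce, Births),
    Next is Survivors + Births.

% count_successes(+Trials, +P, -Count)
% Count is the number of successes in Trials independent draws,
% each succeeding with probability P.
count_successes(0, _, 0).
count_successes(Trials, P, Count) :-
    Trials > 0,
    Rest is Trials - 1,
    count_successes(Rest, P, RestCount),
    random(R),                      % R is a float in the open interval (0.0, 1.0)
    (   R < P
    ->  Count is RestCount + 1
    ;   Count = RestCount
    ).
```

Because every individual is counted separately, the result is always a whole number, even for very small populations.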

1. We modeled three different geographic regions (desert, jungle, and plains biomes), each with its own temperature, rainfall/humidity level, and food availability. This is set up so that other biomes can easily be added to expand the scope of the simulation.
2. We have four different species (bear, wolf, rabbit, tortoise) that each prefer a different climate. We also established a food chain (a wolf will eat a rabbit, and a bear will eat everyone else), and simple facts like who is an herbivore or a carnivore. This is also set up so that other species can easily be added to the simulation.
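The base facts described in the list above could be laid out roughly like this. This is only a sketch: the predicate names are hypothetical, the climate values are left out because they are not given here, and deriving `carnivore/1` and `herbivore/1` from the food chain is an assumption rather than the project's actual definition.

```prolog
% Biomes and species named in the write-up.
biome(desert).
biome(jungle).
biome(plains).

species(bear).
species(wolf).
species(rabbit).
species(tortoise).

% Food chain: a wolf eats a rabbit, and a bear eats everyone else.
eats(wolf, rabbit).
eats(bear, Prey) :-
    species(Prey),
    Prey \== bear.

% Assumption: a carnivore eats at least one other species,
% and an herbivore eats none of them.
carnivore(S) :- species(S), once(eats(S, _)).
herbivore(S) :- species(S), \+ eats(S, _).
```

Adding a new species or biome then only requires a new `species/1` or `biome/1` fact plus any `eats/2` facts that involve it.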

At the topmost level, we give the simulation an environment (or a list of environments) that includes information about which biome it is and a list of the populations of the different species that live there. Then, the simulation runs for a given number of days (or time steps) and returns the resulting environment (or list of environments) with updated species and population counts to reflect the passage of time.
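A rough sketch of such a top-level loop is shown below, assuming SWI-Prolog. The `env(Biome, Populations)` and `pop(Species, Count)` terms, and all predicate names, are hypothetical data structures picked for this example. The per-population update rule is passed in as a closure, and `lose_one/4` is a deliberately trivial stand-in for the real survival and reproduction rules.

```prolog
:- use_module(library(apply)).

% simulate(+Days, :Step, +Envs0, -Envs)
% Run every environment in the list forward by Days time steps.
% Step is called as call(Step, Biome, Species, Count0, Count).
simulate(0, _, Envs, Envs).
simulate(Days, Step, Envs0, Envs) :-
    Days > 0,
    maplist(step_env(Step), Envs0, Envs1),
    Days1 is Days - 1,
    simulate(Days1, Step, Envs1, Envs).

% Update every population in one environment by one time step.
step_env(Step, env(Biome, Pops0), env(Biome, Pops)) :-
    maplist(step_pop(Step, Biome), Pops0, Pops).

step_pop(Step, Biome, pop(Species, N0), pop(Species, N)) :-
    call(Step, Biome, Species, N0, N).

% Stand-in rule (not the project's): each population loses one
% individual per day and never drops below zero.
lose_one(_Biome, _Species, N0, N) :-
    N is max(0, N0 - 1).
```

With the stand-in rule, `?- simulate(4, lose_one, [env(desert, [pop(wolf, 1), pop(rabbit, 9)]), env(plains, [pop(rabbit, 10)])], Envs).` returns `Envs = [env(desert, [pop(wolf, 0), pop(rabbit, 5)]), env(plains, [pop(rabbit, 6)])]`.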

### What is the something extra?

We added natural language processing so that users can query facts about the model (What is the preferred temperature for a tortoise? What is an animal that hunts a rabbit?) and run a natural selection simulation dictated in plain text (What is the resulting environments of 1 wolf and 9 rabbit in a desert and 10 rabbit in a plain after 4 days?).
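As an illustration of how one of these questions might be parsed, here is a small definite clause grammar sketch, assuming SWI-Prolog 7 or later (where double-quoted text is a string). It handles only the "hunts" question; the grammar, the `ask/2` and `token_atom/2` helpers, and the flat `eats/2` facts are hypothetical and are not the project's parser.

```prolog
:- use_module(library(apply)).

% Food-chain facts from the write-up, flattened for this example.
eats(wolf, rabbit).
eats(bear, wolf).
eats(bear, rabbit).
eats(bear, tortoise).

% question(-Answer) parses "what is an animal that hunts a <prey>".
question(Hunter) -->
    [what, is], article, [animal, that, hunts], article, [Prey],
    { eats(Hunter, Prey) }.

article --> [a].
article --> [an].

% ask(+Text, -Answer): lower-case and tokenize Text, then parse it.
ask(Text, Answer) :-
    split_string(Text, " ", "?.", Parts),   % split on spaces, strip ? and .
    maplist(token_atom, Parts, Tokens),
    phrase(question(Answer), Tokens).

token_atom(String, Atom) :-
    string_lower(String, Lower),
    atom_string(Atom, Lower).
```

The query `?- ask("What is an animal that hunts a rabbit?", A).` answers `A = wolf` first and `A = bear` on backtracking.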

### What did we learn from doing this?

1. This simulation was easier to write in Prolog from the bottom up than from the top down. We knew what kind of base facts we wanted to start with (i.e. different species interacting with each other and the geographies that they live in), and we knew that we wanted to run a simulation for a group of species in an environment over several days. But we couldn’t write the top-level predicate until we had written many of the predicates dealing with survival and reproduction probabilities and figured out what data structures we needed to pass all the facts around the program.
2. Prolog is really good at running this kind of simulation. Even though our simulation is more complicated than plugging numbers into a birth rate/death rate formula, using Prolog’s random number generator was a lot easier than we expected. It also lets us build the simulation out of multiple simple formulas that combine into something a lot more complex, but still easy to work with and understand.
3. Expressing iteration through recursion in Prolog can be a bit counter-intuitive at first, but the result is functionally equivalent to a loop and no harder to implement.
4. Prolog is also very good at implementing NLP. With base facts written down and the use of pattern matching, we can let users ask about the model, and we can easily extend what we learned in class about NLP to support more complicated questions.
5. Prolog can be tricky to debug when a bug does not register as an error even though the behavior is clearly broken. One example is a result that contains attributed variables or is left indefinite when there should be a single definite answer. Since no clear error is thrown, it is hard to trace the source of the problem without a deep understanding of the code. This can make collaboration tricky, since the problem might originate in a different section where the failing case was never encountered in testing.
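One possible way to surface this kind of silent problem earlier is to check results explicitly, so that an unbound or malformed value raises an error instead of passing through unnoticed. The sketch below assumes SWI-Prolog's `library(error)` and a hypothetical `env(Biome, Populations)` term holding `pop(Species, Count)` entries; `check_env/1` is not part of the project.

```prolog
:- use_module(library(error)).
:- use_module(library(lists)).

% check_env(+Env)
% Throws an error if any part of the environment is unbound or has
% the wrong type; succeeds quietly otherwise.
check_env(env(Biome, Pops)) :-
    must_be(atom, Biome),
    must_be(list, Pops),
    forall(member(pop(Species, Count), Pops),
           (   must_be(atom, Species),
               must_be(nonneg, Count)      % a non-negative integer
           )).
```

For instance, `?- check_env(env(desert, [pop(rabbit, _)])).` raises an instantiation error at the check rather than letting the unbound count surface later in an unrelated predicate.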