Thursday 6 December 2007

What is evolution? (part 2)

When an organism's genome has changed in comparison with those of its parent(s), and that change propagates within a population, evolution may be said to be occurring in the population. Are long-term fixations of traits in populations the quanta of evolution? Some changes have no observable effect on the life of the organism, some have visible effects, and some change the organism's reproductive system or behaviour. Groups of organisms evolve which mate very slightly differently from the rest of the population, or which are reproductively or ecologically isolated. Over the generations, groups which mate more often with each other come to resemble each other more than they resemble the rest of the population, and gradually drift, genetically, further from the main population, to the point where their DNA no longer recombines with that of the rest of the population at all (or at least only very infrequently). We might then say that there are two species, and reproductive isolation of a group of individuals is one of the ways that people have approached the definition of what a species is. The species model may be the subject of a future blog post.

Looking back from the organisms that are alive today, and trying to decipher when the branches occurred (the 'speciation events' that separated ancestral populations into reproductively isolated clades) is the subject of much research and scientific interest. In the plant world, the recent work by Moore et al. (2007; no relation) develops modern phylogenetic analyses of chloroplast genomic data from angiosperms (flowering plants broadly), in an effort to clarify the basal branching order. They concluded that a rapid basal expansion occurred between 143.8 ± 4.8 and 140.3 ± 4.8 Mya, and found support through multiple maximum likelihood analyses for a number of hypotheses of branching order. This time coincides with the Berriasian age of the Early Cretaceous. This age was characterised by continued cooling including glaciation at high altitudes, and increased tropical humidity, during the break up of Gondwana, and is overlapped by estimates of the timing of the 1R plant genome duplication, identified in Arabidopsis (De Bodt et al., 2005).

In the animal clade, I recently read some really interesting work by McPeek and Brown (2007). They integrated molecular and fossil phylogenetic data sets to address the question of whether species richness in a clade is best accounted for by clade age or clade diversification rate. They found that, in animals, clade age is the dominant signal in clade species richness. It will be interesting to see whether the same holds true for plants. I really felt for the researcher who measured all the pictures of trees with calipers. Hopefully, the very recently published work of Laubach and von Haeseler (2007) will make a useful contribution to future studies of this type. Laubach and von Haeseler developed a Java application, TreeSnatcher, which semi-automates the process of extracting Newick-format trees by analyzing the tree structure and branch lengths of pixel images of multifurcating phylogenetic trees. I would be interested to try this one out.
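For anyone who hasn't met the Newick format, here is a minimal sketch in Python of what reading such a tree might look like. This is not TreeSnatcher's code or output: the example tree string, the function names, and the nested-dictionary layout are all my own assumptions, and the parser ignores quoted labels and comments that the full format allows.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts.

    Each node is {'name': str, 'length': float or None, 'children': list}.
    Assumption: no quoted labels, no bracketed comments.
    """
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # step over '('
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this group of children
                    break
                else:
                    raise ValueError(f"unexpected {text[pos]!r} at {pos}")
        # Optional node label, read up to the next delimiter.
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        # Optional branch length after ':'.
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",():;":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected {text[pos]!r} at {pos}")
    return root


def leaf_names(node):
    """Collect the names of the tips, left to right."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


# Made-up example: the root F has three children, i.e. it is multifurcating.
tree = parse_newick("(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;")
tips = leaf_names(tree)
```

A real study would more likely use an established phylogenetics library than a hand-rolled parser; the point here is only to show how the tree structure and branch lengths are encoded in the text.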

My small palaeobotany collection, courtesy of eBay and its participants, now includes a small fossil cone that I haven't managed to take a good picture of. I'm going to try the digital camera through a 10x microscope, and we'll see. Any ideas or pointers for strategies to build an evolutionary sample of fossil plant specimens? The database architect in me sees this as a snowflake schema problem. I'm thinking that age, clade, geographic location, plant part, and accessibility of sample data are the dominant dimensions. What would be the most informative snowflake sampling strategies? Availability of published trees from the literature, and access to an overall time-based taxonomy, would be important.
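To make the snowflake idea a bit more concrete, here is a rough sketch using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the table and column names are hypothetical, and the rows are placeholders, not real specimens. The fact table is the specimen, and each of the five dimensions hangs off it, with age and location normalised one level further (age into period, locality into country) and clade pointing at its parent clade.

```python
import sqlite3

# Hypothetical snowflake schema: 'specimen' is the fact table.
SCHEMA = """
CREATE TABLE geologic_period (
    period_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE geologic_age (
    age_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    period_id INTEGER NOT NULL REFERENCES geologic_period(period_id)
);
CREATE TABLE clade (
    clade_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_clade_id INTEGER REFERENCES clade(clade_id)
);
CREATE TABLE country (
    country_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE locality (
    locality_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    country_id INTEGER NOT NULL REFERENCES country(country_id)
);
CREATE TABLE plant_part (
    plant_part_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE data_access (
    data_access_id INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE specimen (
    specimen_id INTEGER PRIMARY KEY,
    label TEXT NOT NULL,
    age_id INTEGER REFERENCES geologic_age(age_id),
    clade_id INTEGER REFERENCES clade(clade_id),
    locality_id INTEGER REFERENCES locality(locality_id),
    plant_part_id INTEGER REFERENCES plant_part(plant_part_id),
    data_access_id INTEGER REFERENCES data_access(data_access_id)
);
"""


def build_demo_db():
    """Create an in-memory database with placeholder rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO geologic_period VALUES (1, 'Cretaceous')")
    conn.execute("INSERT INTO geologic_age VALUES (1, 'Berriasian', 1)")
    conn.execute("INSERT INTO clade VALUES (1, 'placeholder parent clade', NULL)")
    conn.execute("INSERT INTO clade VALUES (2, 'placeholder child clade', 1)")
    conn.execute("INSERT INTO country VALUES (1, 'placeholder country')")
    conn.execute("INSERT INTO locality VALUES (1, 'placeholder locality', 1)")
    conn.executemany(
        "INSERT INTO plant_part VALUES (?, ?)", [(1, "cone"), (2, "leaf")]
    )
    conn.execute("INSERT INTO data_access VALUES (1, 'photographed only')")
    conn.executemany(
        "INSERT INTO specimen VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "placeholder specimen A", 1, 2, 1, 1, 1),
            (2, "placeholder specimen B", 1, 2, 1, 2, 1),
            (3, "placeholder specimen C", 1, 1, 1, 1, 1),
        ],
    )
    return conn


def coverage_by_period_and_part(conn):
    """Count specimens per (period, plant part) to show gaps in the sample."""
    query = """
        SELECT p.name, pp.name, COUNT(*)
        FROM specimen s
        JOIN geologic_age a ON a.age_id = s.age_id
        JOIN geologic_period p ON p.period_id = a.period_id
        JOIN plant_part pp ON pp.plant_part_id = s.plant_part_id
        GROUP BY p.name, pp.name
        ORDER BY p.name, pp.name
    """
    return conn.execute(query).fetchall()


conn = build_demo_db()
coverage = coverage_by_period_and_part(conn)
```

A coverage query like this is one possible way to approach the sampling question: cells with a count of zero (or missing altogether) are the combinations of dimensions where the collection says nothing yet.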

--

De Bodt S, Maere S, Van de Peer Y (2005). Genome duplication and the origin of angiosperms. Trends in Ecology and Evolution 20:11, 591-597.

Laubach T, von Haeseler A (2007). TreeSnatcher: coding trees from images. Bioinformatics 23, 3384-3385.

McPeek MA, Brown JM (2007). Clade age and not diversification rate explains species richness among animal taxa. The American Naturalist 169:4, E97-106.

Moore MJ, Bell CD, Soltis PS, Soltis DE (2007). Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms. PNAS 104:49, 19363-19368.

Sunday 2 December 2007

What is evolution?

I talked in the last post about recombination, one of the evolutionary forces which change DNA sequences over time.  Selection is another, and perhaps the one that most often gets 'blamed' for the evolutionary changes that are observed by scientists.  Mutation and genetic drift are the other two.  Selection tends to get a lot of focus because it can provide a good story - we observe that some DNA has changed, and we seek an explanation, or 'reason' for the change, and selection for increased fitness is a convenient hook to hang our rationality on.  The other three forces, implicated in 'non-adaptive' evolution,  are more difficult to build a good story around, so are not so often considered.  

My favourite evolution textbook at the moment is The Origins of Genome Architecture by Mike Lynch (2007), and he makes this point very forcefully, along with some other very interesting observations, in the discursive chapter at the end, entitled Genomfart (you'll have to buy his book or Google it :). It's a great book and gives clear and detailed coverage, well backed-up by references, to the mechanisms and effects of genome evolution both from the base pair up, and the population down. Mike is particularly keen to reiterate the point that you can't understand evolution without understanding populations, the level at which genetic drift operates. I couldn't agree more, and am hoping to start some modeling of populations in the near future, from the bottom up, to learn more about some of the properties of populations that emerge when plants are domesticated. More of those experiments if and when the grant proposal is accepted :)
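To give a feel for why drift is a population-level force, here is a minimal sketch of a neutral Wright-Fisher simulation in Python. It is not the modeling described in the grant proposal and not taken from Lynch's book; the function name, parameters, and population sizes are my own assumptions, and it deliberately leaves out selection, mutation, and recombination so that random sampling is the only thing changing allele frequency.

```python
import random


def wright_fisher(pop_size, start_freq, generations, rng):
    """Follow one allele at a neutral biallelic locus in a diploid population.

    Each generation, the 2 * pop_size gene copies are drawn at random from
    the previous generation's allele frequency (binomial sampling).
    Returns the list of frequencies, stopping early on loss or fixation.
    """
    copies = 2 * pop_size
    count = round(start_freq * copies)
    trajectory = [count / copies]
    for _ in range(generations):
        freq = count / copies
        # Draw each gene copy of the next generation independently.
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        trajectory.append(count / copies)
        if count == 0 or count == copies:
            break  # allele lost or fixed; drift alone cannot change it now
    return trajectory


def fraction_resolved(pop_size, replicates, generations, seed):
    """Fraction of replicate populations where the allele was lost or fixed."""
    rng = random.Random(seed)
    resolved = 0
    for _ in range(replicates):
        final = wright_fisher(pop_size, 0.5, generations, rng)[-1]
        if final in (0.0, 1.0):
            resolved += 1
    return resolved / replicates


# Compare a small and a larger population over the same number of generations.
small = fraction_resolved(pop_size=10, replicates=200, generations=100, seed=1)
large = fraction_resolved(pop_size=500, replicates=200, generations=100, seed=1)
```

Running this, the small populations would be expected to lose or fix the allele far more often than the large ones within the same number of generations, which is the sense in which drift is stronger in small populations (a domestication bottleneck being one example).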

In the meantime, I am mostly concerned with some bottom-up effects, and have been looking for SNPs in cotton this week (Single Nucleotide Polymorphisms - effects of the evolutionary force of mutation). A colleague has been puzzling over something described as 'desi cotton', trying to ascertain its genotype, and one of the places this led me was to a very interesting and useful site, the Multilingual Multiscript Plant Name Database, which Michel Porcher runs. This amazing resource lists the names of plants in over 60 languages and various scripts, and is a really world-class reference. It's in my favourites list on the right here, because this is one of my favourite kinds of site - a place where someone has really put a lot of work, over a long period of time, into making something comprehensive and authoritative. When I find a site like this, I can't help wanting to tell people!

--
Lynch, M.  (2007).  The origins of genome architecture.  Sinauer Associates.