Monday, 30 August 2010

The crop genomes cometh

There has been a sudden rash (well two!) of published whole crop genomes this week, the fruit of the improvements in sequencing efficiency brought to us by the '2nd gen.' sequencing technologies.

The wheat genome will prove very interesting, as it's the first huge polyploid crop genome to be sequenced. Wheat is thought to be a hexaploid, evolved from a hybrid of three different ancestral cereal genomes. Assembly of the small fragments of a genome like this is a bioinformatician's nightmare. First, such a huge genome contains enormous numbers of repeat regions due to proliferation of 'selfish' DNA transposable elements. In addition, the genome contains several copies of most of the genes, originating from the three ancestral genomes, and figuring out which is which is rather tricky. How did the group who have published manage to do it? Well, I'm not sure they actually did. It seems that what has been published is a set of unassembled 454 short reads providing 5-fold coverage of the genome as a whole. Millions upon millions of short fragments. Very useful information, but not quite a whole genome. We will see what we can make of that.
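For a rough sense of scale, fold coverage is just the total number of sequenced bases divided by the genome size. Here is a minimal sketch, assuming a wheat genome of roughly 17 Gb and an average read length of 400 bp (both are my own round numbers for illustration, not figures from the announcement), and assuming reads land at random along the genome:

```python
import math


def reads_needed(genome_size_bp: int, coverage: float, mean_read_len_bp: int) -> int:
    """Reads required for a given average fold coverage, from c = N * L / G."""
    total_bases = genome_size_bp * coverage
    # Round up: a fraction of a read still means sequencing one more read.
    return math.ceil(total_bases / mean_read_len_bp)


def fraction_uncovered(coverage: float) -> float:
    """Expected fraction of bases hit by no read at all, if reads fall at
    random (Poisson approximation): exp(-c)."""
    return math.exp(-coverage)


# Assumed, illustrative values (not from the published data set).
GENOME_SIZE_BP = 17_000_000_000
MEAN_READ_LEN_BP = 400

n_reads = reads_needed(GENOME_SIZE_BP, 5, MEAN_READ_LEN_BP)
print(f"{n_reads:,} reads for 5-fold coverage")
print(f"{fraction_uncovered(5):.2%} of bases expected to be missed entirely")
```

Under those assumptions that is over two hundred million reads, and even then a small fraction of the genome would not be touched by any read, before we even start worrying about repeats and the three ancestral genomes.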

It was also announced that the Golden Delicious apple genome has been sequenced. Apples are evolutionarily very interesting, as they are very highly heterozygous - rampant outcrossers - which is why they never come true from seed. All the real Golden Delish out there are clones of a single genome that grew up in America, possibly from a seedling spread in waste from cider presses. But that's another story for another time.

Thursday, 6 December 2007

What is evolution? (part 2)

When an organism's genome has changed, in comparison with those of its parent(s), and when that change propagates within a population, evolution may be said to be occurring in the population. Are long term fixations of traits in populations the quanta of evolution? Some changes have no observable effect on the life of the organism, some have visible effects, and some change the organism's reproductive system or behaviour. Groups of organisms evolve which mate very slightly differently from the rest of the population, or are reproductively or ecologically isolated. Over the generations this leads groups which mate more often with each other to resemble each other more than the rest of the population, and to gradually drift, genetically, further from the main population to the point where their DNA no longer recombines with that of the rest of the population at all (or at least very very infrequently). We then might say that there are two species, and a reproductively isolated group of individuals is one of the ways that people have approached the definition of what a species is. The species model may be the subject of a future blog.

Looking back from the current organisms that are alive, and trying to decipher when the branches occurred - the 'speciation events' that separated ancestral populations into reproductively isolated clades - is the subject of much research and scientific interest. In the plant world, the recent work by Moore et al., 2007 (no relation) develops modern phylogenetic analyses of chloroplast genomic data from angiosperms (flowering plants broadly), in an effort to clarify the basal branching order. They concluded that a rapid basal expansion occurred between 143.8 ± 4.8 and 140.3 ± 4.8 Mya, and found support through multiple maximum likelihood analyses for a number of hypotheses of branching order. This time coincides with the Berriasian age of the Early Cretaceous. This age was characterised by continued cooling including glaciation at high altitudes, and increased tropical humidity, during the break up of Gondwana, and is overlapped by estimates of the timing of the 1R plant genome duplication, identified in Arabidopsis (De Bodt et al., 2005).

In the animal clade, I read some really interesting work by McPeek and Brown (2007) recently. They integrated molecular and fossil phylogenetic data sets to address the question of whether species richness in a clade is best accounted for by clade age or clade diversification rate. They found that, in animals, clade age is the dominant signal in clade species richness. It will be interesting to see whether the same holds true for plants. I really felt for the researcher who measured all the pictures of trees with calipers. Hopefully, the very recently published work of Laubach and von Haeseler (2007) will make a useful contribution to future studies of this type. Laubach and von Haeseler developed a Java application, TreeSnatcher, which semi-automates the process of extracting Newick-format trees by analyzing the tree structure and branch lengths of pixel images of multifurcating phylogenetic trees. Would be interested to try this one out.
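For anyone who hasn't met it, Newick format writes a tree as nested parentheses, with optional branch lengths after colons and a semicolon at the end. Here is a minimal sketch of a reader for simple trees, just to show the structure. The example tree and the function names are mine and have nothing to do with how TreeSnatcher works internally; the sketch also ignores quoted labels, comments and whitespace inside the string.

```python
def parse_newick(text: str) -> dict:
    """Parse a simple Newick string into nested dicts with the keys
    'name', 'length' and 'children'."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick string ends with a semicolon")
    pos = 0

    def parse_node() -> dict:
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # step over '('
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this group of children
                    break
                else:
                    raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
        # Optional label (tips usually have one, internal nodes often don't).
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        # Optional branch length after a colon.
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError("unexpected text before the final ';'")
    return root


def leaf_names(node: dict) -> list:
    """Collect the tip labels, left to right."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


# A made-up tree whose root has three children (a multifurcation).
tree = parse_newick("((A:0.1,B:0.2):0.3,C:0.4,D:0.5);")
print(leaf_names(tree))
```

The point is only that once a tree is in this text form, tip names and branch lengths can be read by a program rather than measured off a figure with calipers.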

My small palaeobotany collection, courtesy of ebay and its participants, now includes a small fossil cone that I haven't managed to take a good picture of. I'm going to try the digital camera through a 10x microscope, we'll see. Any ideas or pointers for strategies to build an evolutionary sample of fossil plant specimens? The database architect in me sees this as a snowflake schema problem. I'm thinking that age, clade, geographic location, plant part, and accessibility of sample data are the dominant dimensions. What would be the most informative snowflake sampling strategies? Availability of published trees from the literature, and access to an overall time-based taxonomy would be important.
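To make the snowflake idea a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module: a central specimen table with one dimension table per axis, and the clade dimension normalised further into a parent/child hierarchy (that extra normalisation is what makes it a snowflake rather than a plain star). All table and column names are hypothetical, and the example rows, including the age assigned to the cone, are made up purely for illustration.

```python
import sqlite3

# Hypothetical schema: 'specimen' sits in the middle, dimensions around it.
SCHEMA = """
CREATE TABLE clade (
    clade_id        INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    parent_clade_id INTEGER REFERENCES clade(clade_id)
);
CREATE TABLE geological_age (
    age_id    INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    start_mya REAL,
    end_mya   REAL
);
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE plant_part (
    part_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE specimen (
    specimen_id     INTEGER PRIMARY KEY,
    description     TEXT,
    clade_id        INTEGER REFERENCES clade(clade_id),
    age_id          INTEGER REFERENCES geological_age(age_id),
    location_id     INTEGER REFERENCES location(location_id),
    part_id         INTEGER REFERENCES plant_part(part_id),
    data_accessible INTEGER NOT NULL DEFAULT 0
);
"""


def build_example_db() -> sqlite3.Connection:
    """Create an in-memory database holding a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO clade VALUES (?, ?, ?)",
        [(1, "seed plants", None), (2, "conifers", 1)],
    )
    conn.execute(
        "INSERT INTO geological_age VALUES (1, 'Early Cretaceous', 145.0, 100.5)"
    )
    conn.execute("INSERT INTO location VALUES (1, 'unknown')")
    conn.executemany(
        "INSERT INTO plant_part VALUES (?, ?)", [(1, "cone"), (2, "leaf")]
    )
    conn.execute(
        "INSERT INTO specimen VALUES (1, 'small fossil cone', 2, 1, 1, 1, 0)"
    )
    return conn


def coverage_by_age_and_part(conn: sqlite3.Connection) -> list:
    """Count specimens for every age/part combination, zeros included,
    so that gaps in the sample show up."""
    query = """
        SELECT a.name, p.name, COUNT(s.specimen_id)
        FROM geological_age AS a
        CROSS JOIN plant_part AS p
        LEFT JOIN specimen AS s
               ON s.age_id = a.age_id AND s.part_id = p.part_id
        GROUP BY a.age_id, p.part_id
        ORDER BY a.age_id, p.part_id
    """
    return conn.execute(query).fetchall()


for row in coverage_by_age_and_part(build_example_db()):
    print(row)
```

A query like the last one is one way of turning the sampling question into something answerable: the combinations with a count of zero are the holes in the collection.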

--

De Bodt S, Maere S, Van de Peer Y. (2005). Genome duplication and the origin of angiosperms. Trends in Ecology and Evolution 20:11, 591-597.

Laubach T, von Haeseler A (2007). TreeSnatcher: coding trees from images. Bioinformatics 23, 3384-3385.

McPeek MA, Brown JM (2007). Clade age and not diversification rate explains species richness among animal taxa. The American Naturalist 169:4, E97-106.

Moore MJ, Bell CD, Soltis PS, Soltis DE (2007). Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms. PNAS 104:49, 19363-19368.

Sunday, 2 December 2007

What is evolution?

I talked in the last post about recombination, one of the evolutionary forces which change DNA sequences over time.  Selection is another, and perhaps the one that most often gets 'blamed' for the evolutionary changes that are observed by scientists.  Mutation and genetic drift are the other two.  Selection tends to get a lot of focus because it can provide a good story - we observe that some DNA has changed, and we seek an explanation, or 'reason' for the change, and selection for increased fitness is a convenient hook to hang our rationality on.  The other three forces, implicated in 'non-adaptive' evolution,  are more difficult to build a good story around, so are not so often considered.  

My favourite evolution textbook at the moment is The Origins of Genome Architecture by Mike Lynch (2007), and he makes this point very forcefully, along with some other very interesting observations, in the discursive chapter at the end, entitled Genomfart (you'll have to buy his book or Google it :).  It's a great book and gives clear and detailed coverage, well backed-up by references, to the mechanisms and effects of genome evolution both from the base pair up, and the population down.  Mike is particularly keen to reiterate the point that you can't understand evolution without understanding populations, the level at which genetic drift operates.  I couldn't agree more, and am hoping to start some modeling of populations in the near future, from the bottom up, to learn more about some of the properties of populations that emerge when plants are domesticated.  More of those experiments if and when the grant proposal is accepted :)
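As a taste of what bottom-up population modelling can look like, here is a minimal sketch of pure genetic drift at a single locus with two alleles, using the textbook Wright-Fisher model. The model choice, function name and parameter values are mine, for illustration only; this is not the design of the proposed experiments. There is no selection, mutation or recombination in it at all, yet allele frequencies still wander and eventually one allele is lost or fixed.

```python
import random


def wright_fisher(pop_size: int, start_freq: float, generations: int, seed=None) -> list:
    """Allele frequency trajectory under drift alone.

    pop_size is the number of diploid individuals, so each generation
    holds 2 * pop_size gene copies.
    """
    rng = random.Random(seed)
    copies = 2 * pop_size
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Every gene copy in the next generation is drawn at random
        # from the current generation's gene pool.
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = count / copies
        trajectory.append(freq)
        if freq in (0.0, 1.0):
            break  # the allele has been lost or fixed; nothing more can change
    return trajectory


# Compare a small population (say, a domestication bottleneck) with a large one.
for n in (10, 1000):
    traj = wright_fisher(pop_size=n, start_freq=0.5, generations=200, seed=42)
    print(f"N={n}: ran {len(traj) - 1} generations, final frequency {traj[-1]:.3f}")
```

Running it with different population sizes shows the usual pattern: the smaller the population, the faster drift pushes the frequency to 0 or 1, which is one reason bottlenecks matter so much when thinking about domestication.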

In the meantime, I am mostly concerned with some bottom-up effects, and have been looking for SNPs in cotton this week (Single Nucleotide Polymorphisms - effects of the evolutionary force of mutation).  A colleague has been puzzling over something described as 'desi cotton', trying to ascertain its genotype, and one of the places this led me was to a very interesting and useful site, the Multilingual Multiscript Plant Name Database which Michel Porcher runs.  This amazing resource lists the names of plants in over 60 languages and various scripts, and is a really world-class reference.  It's in my favourites list on the right here, because this is one of my favourite kinds of site - a place where someone has really put a lot of work over a long period of time into making something comprehensive and authoritative.  When I find a site like this, I can't help wanting to tell people!

--
Lynch, M.  (2007).  The origins of genome architecture.  Sinauer Associates.

Thursday, 29 November 2007

What is a plant? (part 2)

Plants are eukaryotes.  'Eu' from the Greek for true and 'karyote' for nut, meaning the nucleus.  Eukaryotes have cells with a nucleus containing their genome in the form of DNA organised in chromosomes.  Eukaryotes, all the plants, animals and other living things, except bacteria and archaea, are thought to have evolved from a single common ancestor, or in other words, eukarya are monophyletic.  Prokaryotes (bacteria and archaea which don't have a nucleus) have their DNA organised differently.  This all sounds very simple and clear, except that things are a bit more complicated (when in life are they not?).  When I said that the eukaryote nucleus contains the genome, this is only partly true.  There are some other organelles (structures within eukaryote cells) that also contain DNA.  The mitochondria, which generate energy for eukaryote cells, contain DNA and, in the plants, the chloroplasts, which carry out that most planty of processes, photosynthesis, also contain DNA.

When organisms reproduce sexually, nuclear DNA from two parental cells comes together to make a 'daughter' genome, but the same thing doesn't happen for the organelle DNA - organelle DNA is generally inherited directly from the mother 'egg' cell.  This makes for some very different evolutionary trajectories for the different sets of DNA.  Recombination features strongly as an evolutionary force in nuclear DNA (due to the 'coming together'), but organelle DNA does not recombine in meiosis.  Recombination is still a force in organelle DNA evolution though, but through a different mechanism.  But we need to backtrack a little first, and look at where the organelles came from in the first place.

There's good evidence that the organelles did not start out as part of eukaryote cells.  It seems that they were originally free-living prokaryotes, that either invaded, or were subsumed by an ancient proto-eukaryote.  Somehow, rather than eating, or being eaten by one another, the prokaryote took up residence inside the proto-eukaryote, in either a symbiotic or parasitic arrangement.  We can tell this happened because of the similarities of the DNA in the organelles we find in eukaryotes today, and the DNA in the genomes of some prokaryotes.  The 'host' cell provides some kind of shelter or nutrients and the prokaryote provides energy or photosynthesis.  Exactly when this event happened, and whether it happened just once, in a single cell from which every eukaryote alive today has evolved, remains an open question.

Now that we know how the organelles got where they are, living inside the eukaryote cells, let's get back to their DNA, and how recombination acts upon it.  Over the millennia since the endosymbionts took their places, there is very good evidence () that some of the genes from the original nuclear genome have migrated to the organelles, and some of the DNA from the original prokaryote has migrated from the organelle genomes to the nuclear genome of the host cells.  The reason why we can be so sure that this happened is that some of the nuclear genes in some eukaryotes look very much like some prokaryote genes, and some organelle genes look very much like some eukaryote nuclear genes.  There is evidence that this process is still happening in the flowering plants (Adams et al., 2002).

So we have said we know these things about the way genomes are structured, because we have found segments of DNA in different places that are too similar to have arisen by chance, so must have evolved from a common ancestral segment of DNA.  We call this homology - the fact that two biological features share similarity through descent.  But couldn't these things have arisen independently of each other? Isn't it possible that the same DNA evolved twice in separate organisms at random?  Isn't it possible that some gene sequences are so useful that they evolved twice independently by the force of natural selection?  Tricky questions indeed.  There are strong theoretical methods for determining whether we should accept or reject the hypothesis of homology in any given case.  Maybe I'll save the theory for another post on another day.
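To give a flavour of the 'too similar to have arisen by chance' argument without getting into the real theory, here is a deliberately naive sketch. It assumes the two DNA segments are already lined up without gaps, and that each of the four bases is equally likely at every position, so two unrelated sequences agree at any one position with probability 1/4. Proper homology tests are far more careful than this (they deal with gaps, base composition and the fact that we search many sequences at once), so treat it as an illustration of the reasoning and nothing more.

```python
from math import comb


def p_at_least_k_matches(n_sites: int, k_matches: int, p_match: float = 0.25) -> float:
    """Chance that two unrelated, aligned sequences agree at k_matches or
    more of n_sites positions: the upper tail of a binomial distribution.

    Fine for short segments; very long ones would need log-space arithmetic.
    """
    return sum(
        comb(n_sites, i) * p_match**i * (1 - p_match) ** (n_sites - i)
        for i in range(k_matches, n_sites + 1)
    )


# Two 100-base segments that agree at 90 positions (made-up numbers).
p = p_at_least_k_matches(100, 90)
print(f"chance of 90 or more matches out of 100 at random: {p:.2e}")
```

Even for a segment only 100 bases long, the chance of 90 or more matches at random is vanishingly small under these assumptions, which is why a match that good makes common descent (or transfer of the DNA itself) a much better explanation than coincidence. Whether selection could drive two unrelated sequences to the same place is a separate question that this little calculation says nothing about.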

For now, I'll come back to what a plant is.  There's something about being a eukaryote that seems to help organisms grow bigger and more complex, rather than the apparently simple organisms that are the bacteria and archaea.  Some plants are single cells, but many have adopted multicellularity as a way of getting their heads above the crowd, into the light.

So plants are eukaryotes, with cells having a genome in their nuclei, and having tamed prokaryote chloroplasts harvesting all that lovely sunlight to help them grow and develop. We've also touched on what evolution is here.  I mentioned recombination (where pieces of DNA are exchanged between one DNA segment and another).  Generally DNA is a molecule that can reproduce itself, but if it always reproduced itself identically, there would be no such thing as evolution.  Recombination is one of the forces that act on DNA to cause it to evolve.  The DNA message may be different after the recombination occurs than it was before, and if the DNA is different then the organism might be different.  I'll come back to the other three evolutionary forces another time.

To finish for today, I'd like to share a very interesting paper about recombination that I came across.  I talked about recombination that occurs in meiosis, and about recombination between nuclear and organelle DNA, but it does not stop there by any means.  Horizontal gene transfer is one of the other kinds of recombination, that is, the exchange of DNA between non-mating organisms.  Many bacteria regularly swap plasmids with each other, and viruses recombine both with each other and with their hosts' DNA.  Single-celled eukaryotes are also known to engage in horizontal gene transfer, but we don't often come across instances of recombination between the genomes of completely unrelated complex eukaryotes.  Richardson and Palmer (2007) review horizontal gene transfer in plants, which has been found in a number of cases between mitochondrial genes in particular.  The extreme example is a very interesting plant, Amborella trichopoda, whose mitochondria may contain more 'foreign', or horizontally transferred genes, apparently 'captured' from a wide range of mosses and other plants, than 'native' genes.  A. trichopoda is endemic to the Pacific island of New Caledonia, where it grows as a shrub in the understory of the forest, often covered with mosses and other epiphytic plants.  It seems possible that these plants have infiltrated the tissues of our shrub in the past, and thereby transferred genes, but this isn't really clear, and although some of these 'foreign' genes are expressed in the living shrubs, it also isn't clear whether they are functional.  It will be interesting to find out!

--
Adams KL, Qiu YL, Stoutemyer M, Palmer JD.  2002.  Punctuated evolution of mitochondrial gene content: high and variable rates of mitochondrial gene loss and transfer to the nucleus during angiosperm evolution.  Proceedings of the National Academy of Sciences, USA 99: 9905-9912.

Richardson AO, Palmer JD.  2007.  Horizontal gene transfer in plants.  Journal of Experimental Botany 58(1): 1-9.

Tuesday, 27 November 2007

What is a plant? (part 1)

Plants are green aren't they? And they grow, and they don't walk about.  Well mostly...  As well as the green plants, Glaucophyta and Rhodophyta (red algae) are usually regarded as plants, but I'm going to leave them aside for the rest of this posting and focus on the green plants.

In the introduction to my PhD thesis I wrote that 'Green plants are characterised as containing chlorophyll a and b, storing photosynthetic products such as starch inside chloroplasts, and having cell walls made of cellulose (McCourt et al., 1996)'. In retrospect, this is a reasonable-ish definition, but is limited to describing shared morphological features, and doesn't necessarily speak to aspects of phylogeny and most recent common ancestry.  The word 'synapomorphy' might have been more specific, implying that the features were inherited from a common ancestor.  The Tree of Life web project (http://www.tolweb.org/green_plants) contains a similar morphological definition and also circumscribes the green plants as 'all organisms commonly known as green algae and land plants, including liverworts, mosses, ferns and other nonseed plants, and seed plants'. That page also has a great list of links and references. Now we know what we're talking about. Kind of.

Palaeos.org (http://www.palaeos.org/) takes quite a nice look at what a plant is, and makes an attempt at situating plants in time as well as space, explicitly delimiting the group of organisms (Chlorobionta) which we can call green plants, and which we think has a single common ancestor. They also observe that this involves the use of the taxonomic hypothesis of common ancestry, which, as a hypothesis, may well turn out to be incorrect, despite strongly supported phylogenies.

So now, as well as talking about what a plant is made of, and what kinds of different shapes and sizes they come in, we raise the question of time. When did the first common ancestor of all the things we call plants arise, and also, what (and when) was the first thing that we might recognise as a plant (the two might possibly be different). That can be another story for another day.

For now, I'd just like to share with you my favourite green alga, Aegagropila linnaei, or Marimo, as the Japanese know them, or Kuluskitur in Icelandic. They used to be called Cladophora aegagropila, when people thought they were closely related to the Cladophora seaweeds, but molecular evidence said otherwise (Hanyuda et al., 2002). Marimo are different from many plants, as they live under water (like lots of the green algae though). In fact, they live in only a few lakes in the far northern hemisphere, in cold, shallowish, brackish waters. Also, unlike lots of plants they are not anchored to a surface, they are free-living.

They grow in the form of green balls, up to several inches across, and either roll around on the bottom of the lake, or sometimes photosynthesise and generate bubbles of oxygen which allow them to float up to the surface. Unlike their close relatives, the Cladophora seaweeds, they express chitin as part of their cell walls, so are quite 'crispy'. Also, unlike many plants, they seem able to shut down photosynthesis in the absence of light for long periods, then rapidly re-form the chloroplasts when light is available again (Yoshida et al., 1998).

They are a protected species in both Japan and Iceland, but do pop up on ebay from time to time. I'm not sure of the original source of the ones for sale. In the 1990s there was a journal called 'Marimo Research', but it seems to have disappeared without a trace, and I haven't been able to get hold of a copy. If anyone knows where to find it, I'd be very pleased to know.

My pet marimo live in this jar, in ordinary mineral water from the supermarket. The water hasn't been changed for a year or so, and is still clear, and they are still growing happily, but I have seen recommendations that you change the water monthly, and massage the marimo to help them stay clean. They are somewhat sacred beings with many myths and legends surrounding them. A little bit fell off one, and I looked at it under my old Russian microscope; you can see the chloroplasts quite clearly. This second pic was taken with a digital camera through the eyepiece.


To be continued...


--
Hanyuda T, Wakana I, Arai S, Miyaji K, Watano Y,  Ueda K.  2002. Phylogenetic relationships within Cladophorales (Ulvophyceae, Chlorophyta) inferred from 18S rRNA gene sequences with special reference to Aegagropila linnaei. Journal of Phycology 38: 564-571.

McCourt RM, Chapman RL, Buchheim M, Mishler BD. 1996. Green plants. Version 01 January 1996 (under construction). http://tolweb.org/Green_plants/2382/1996.01.01 in The Tree of Life Web Project, http://tolweb.org/.

Yoshida T, Horiguchi T, Nagao M, Wakana I, Yokohama Y. 1998. Ultrastructural study of chloroplasts of inner layer cells of a spherical aggregation of “Marimo” (Chlorophyta) and structural changes seen in organelles after exposing to light. Marimo Research 7: 1-13.

Monday, 26 November 2007

In the beginning...

How to start a blog about plant evolution?

To introduce myself, I'm a researcher in plant evolution, with particular interests in crop domestication, and in polyploidy and transposable element dynamics.  I approach the subject from a deeply-held and long-standing interest in botany, and an almost equally deeply-held and long-standing interest in getting computers to do interesting things.

This blog will be a record of interesting things about plant evolution that I come across from time to time, interesting facts that come from live research projects, and whatever else seems to fit.

I don't take a position on whether plants were/are designed by a superior intelligence, supernatural or otherwise.  I think there are some problems with definitions of superior, intelligence and natural that would get in the way of a serious argument about whether our plants were designed by one.  Having said that, I think that the combined wisdom of a thousand generations of foragers, gardeners, farmers and plant breeders could legitimately be called a superior intelligence.

I do think that variations in DNA and morphology within and between individuals and populations can tell us some really interesting things about evolution, and I think that we are living in an exciting time in the search for knowledge of how recombination, mutation, selection, and genetic drift have shaped the plants our lives depend on.  But enough of that for now ...

I hope you enjoy reading this blog and return from time to time.  I hope you find something of interest, and feel moved to add your thoughts.