Monday, 23 July 2012

Taxonomy basics (part 3): Drawing a phylogenetic tree

As promised, a little methodology on drawing phylogenetic trees. The process is relatively simple to understand - different organisms are grouped according to characters they share. These characters are chosen by the person doing the study, and do need to be defined reasonably carefully. I think a worked example will be best to explain here, so I'm going to go out on a limb and use one from one of my undergrad assignments. The assignment was to draw (by hand) a phylogenetic tree of anything we liked, but it needed to be things, not organisms. I used chocolate bars, and will do so again as it's really good to see how simple characters can group things together.

Basically you need a few simple things to draw a tree. The first are the organisms you're studying, and from these you need to derive your characters and their 'states'. To do this, examine the 'organisms' and pick apart their appearance, internal anatomy etc, and these become your characters. You need to be able to give these characters 'states', and these need to be well defined (ie if something is blue and something else red, the character would be colour and the states would be blue and red). For example a character of size might be useful, but if you define the states as 'small, medium or large' then it's an arbitrary measurement, and it will be impossible for anyone else to use your characters to see how you came up with the tree you did. A better set of character states would be 0-1.9cm long, 2-3.9cm long, and 4cm or longer (depending of course on what organisms you're describing), as it can be exactly replicated by any other researcher. Generally the more characters you can use the better; I'll list the ones I'll use shortly.
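
To show why measured states are repeatable, here is a minimal Python sketch of scoring the length character. The function name and the specimen lengths are made up for illustration, and I've assumed the boundaries sit at exactly 2cm and 4cm so that no measurement falls between states.

```python
def length_state(length_cm: float) -> int:
    """Return the character state (0, 1 or 2) for a length in centimetres."""
    if length_cm < 0:
        raise ValueError("length cannot be negative")
    if length_cm < 2.0:
        return 0  # state 0: shorter than 2cm
    if length_cm < 4.0:
        return 1  # state 1: from 2cm up to, but not including, 4cm
    return 2      # state 2: 4cm or longer


# Hypothetical measurements, only here to show the scoring in action.
specimens = {"specimen_1": 1.4, "specimen_2": 2.0, "specimen_3": 5.3}
scores = {name: length_state(cm) for name, cm in specimens.items()}
print(scores)
```

Anyone with a ruler and the same specimens would get the same scores, which is the whole point of defining states this way.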
The other thing that is useful is one or two 'outgroup(s)'. These help root the tree, and ensure the characters you're picking will split closely related organisms rather than working because they're completely independent of everything. For example if you chose red and blue as character states for cars and you had motorbikes and ships as outgroups, it would probably show up as a bad character. Outgroups shouldn't be too far removed from what you're studying (eg if you're studying cuttlefish then a jellyfish would be an inappropriate outgroup, but an octopus is closely related but different enough to work well).

So for the sweets example, we'll start out with 6 chocolate bars (in this case we'll use a Milky Way bar, Mars bar, Snickers, Violet Crumble, Crunchie and a Flake) and one lolly that isn't a chocolate bar as our outgroup - say a lolly snake.
From these 7 'organisms' we need to find characters that will split them up and hopefully place closely related 'species' together.
So our characters (and their states) might be:
1. Chocolate coated (yes/no)
2. Nougat present (yes/no)
3. Caramel present (yes/no)
4. Honeycomb present (yes/no)
5. Nuts present (yes/no)

So these 5 characters would be enough to split up the chocolates, but probably not enough to completely resolve them to separate 'species'. To draw the tree from these characters, you first need to make up a character matrix. In this case it's pretty simple, as we've got a binary character set (all of the answers are no, which we'll code as a 0, or a yes, which we'll code as a 1). You can have as many states as you need - up to 4 or 5 can work well - but too many may make the character useless. Often matrices are polarized so that the outgroup scores a 0 for every character, but in this case that happened anyway so it isn't necessary.

Anyway, our matrix looks like this:
(I used letters for the individual 'species' as it shortens the name and is easier to put on a tree)
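
For anyone who would rather see the matrix as data, here is a rough Python sketch. The scores are my reading of each bar against the 5 characters above, not a copy of the original matrix, and I've used the full names rather than letters so I'm not guessing which letter belongs to which bar.

```python
CHARACTERS = ["chocolate coated", "nougat", "caramel", "honeycomb", "nuts"]

# One row per 'species'; columns follow CHARACTERS; 1 = yes, 0 = no.
# These scores are assumed for illustration, not taken from the original matrix.
MATRIX = {
    "Lolly snake":    [0, 0, 0, 0, 0],  # outgroup
    "Milky Way":      [1, 1, 0, 0, 0],
    "Mars bar":       [1, 1, 1, 0, 0],
    "Snickers":       [1, 1, 1, 0, 1],
    "Flake":          [1, 0, 0, 0, 0],
    "Violet Crumble": [1, 0, 0, 1, 0],
    "Crunchie":       [1, 0, 0, 1, 0],
}


def format_matrix(matrix, characters):
    """Return the matrix as aligned text, one 'species' per line."""
    width = max(len(name) for name in matrix)
    numbers = " ".join(str(i + 1) for i in range(len(characters)))
    header = " " * width + "  " + numbers
    rows = [
        f"{name:<{width}}  " + " ".join(str(score) for score in scores)
        for name, scores in matrix.items()
    ]
    return "\n".join([header] + rows)


print(format_matrix(MATRIX, CHARACTERS))
```

The column numbers in the printout match the character numbers in the list above.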

So to draw a tree, we start out with whatever character is common to most species. In this case it's being chocolate coated (character 1) which splits the outgroup off immediately (diagram below - sorry for crappy quality, I don't have time to do them properly!). The character used to split the tree at that point can be indicated with a horizontal line and a number, which allows you to see exactly which characters separate species. Everything that occurs after that number has that character, so from the 1 on the tree below, we can see that A does not have any chocolate, but B, C, D, E, F and G do.
So the next step is to do the same thing again, with the remaining characters. The next one shared by most of our chocolates is character 2, which is containing nougat. So this splits off 3 species, shown below:
We keep going the same as before, looking at the next character that is common to B, C and D or E, F and G (as they're the biggest groups remaining). We'll use character 3 (caramel) next:
And to split the other big group, we'll use character 4 (honeycomb):

And now we'll use our last character, 5 (nuts):

So this is the final tree. It's ok, but not fantastic. We can see that A, E, B, C and D are all separate species, but we've failed to split F and G on the characters we used (both have chocolate and honeycomb, but that's as far as we got). If we were to use genetics as well, we might get further here. If we took brand to equal genus, then F and G would split nicely (Violet Crumble = Nestle, Crunchie = Cadbury). I'll now colour-code branches according to brand:
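
The grouping steps can also be checked with a short Python sketch. It is only an illustration of the by-hand method, not what real tree-building software does, and the matrix scores are again my assumed reading of the bars rather than the original matrix.

```python
from collections import defaultdict
from itertools import combinations

# Assumed scores: chocolate coated, nougat, caramel, honeycomb, nuts.
MATRIX = {
    "Lolly snake":    [0, 0, 0, 0, 0],  # outgroup
    "Milky Way":      [1, 1, 0, 0, 0],
    "Mars bar":       [1, 1, 1, 0, 0],
    "Snickers":       [1, 1, 1, 0, 1],
    "Flake":          [1, 0, 0, 0, 0],
    "Violet Crumble": [1, 0, 0, 1, 0],
    "Crunchie":       [1, 0, 0, 1, 0],
}


def groups_by_character(matrix):
    """Map each character number to the set of 'species' scoring 1 for it."""
    n_chars = len(next(iter(matrix.values())))
    return {
        i + 1: frozenset(name for name, row in matrix.items() if row[i] == 1)
        for i in range(n_chars)
    }


def fits_one_tree(groups):
    """True if every pair of groups is nested or shares no members."""
    return all(
        a <= b or b <= a or not (a & b)
        for a, b in combinations(groups.values(), 2)
    )


def unresolved(matrix):
    """Return lists of 'species' whose scores are identical."""
    by_scores = defaultdict(list)
    for name, row in matrix.items():
        by_scores[tuple(row)].append(name)
    return [names for names in by_scores.values() if len(names) > 1]


groups = groups_by_character(MATRIX)
# Biggest group first, the same order the tree was drawn in by hand.
for number, members in sorted(groups.items(), key=lambda kv: -len(kv[1])):
    print(f"character {number}: {', '.join(sorted(members))}")
print("characters fit one tree:", fits_one_tree(groups))
print("not separated:", unresolved(MATRIX))
```

With these scores every group sits neatly inside another or is completely separate from it, which is why the characters never contradict each other here, and the only pair left unseparated is the two honeycomb bars.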

Outgroup is green, Cadbury are purple, Nestle red and Mars are black. So in this case the Nestle bar was a good example of convergent evolution - that is two organisms evolving independently in similar environments that end up looking similar but are completely unrelated. It also shows that problems can arise if you use morphology alone - without the 'genetic' info here we couldn't split them at all. Just remember, things that are only one join (node) apart are most closely related (eg in this one C and D are more closely related than D and B or C and B).

This was a reasonably simple example, with only 5 characters across 7 'species' and realistically only one option for the best tree. However if you throw a few more species and a few more characters into the mix, there are multiple options for what the tree could look like, and because of this it is easiest to use computer programs to try hundreds of re-combinations rather than doing it by hand.
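
To give a feel for how quickly the options pile up, here is a small Python sketch that counts the possible rooted trees where every split is into two branches. It uses the standard counting result for labelled species, (2n-3) x (2n-5) x ... x 3 x 1; the function name is my own.

```python
def rooted_tree_count(n_species: int) -> int:
    """Count distinct rooted, two-way-branching trees for n labelled species."""
    if n_species < 1:
        raise ValueError("need at least one species")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-3 (nothing to multiply for n < 3).
    for k in range(3, 2 * n_species - 2, 2):
        count *= k
    return count


for n in (3, 4, 7, 10, 20):
    print(f"{n} species: {rooted_tree_count(n)} possible trees")
```

Even the 7 'species' used here have 10395 possible trees on paper, and 10 species have over 34 million, so checking them all by hand is not realistic.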

So that is how you draw a phylogenetic tree. Hopefully it helps in understanding how they come about, and how to read them. You'll come across terms like monophyly and paraphyly, which I might go through in a future post (for a definition right now, google a taxonomic dictionary ;) ), but it should hopefully be easier to understand relationships between species by looking at these diagrams.


One final note: if anyone has any questions or comments, feel free to ask away. I'll try to answer as best I can ;) Also if you've something you're interested in for a post about, let me know and I'll see what I can do... Perhaps next time for something different I'll write about the long-awaited 'talking' trees ;)

Saturday, 14 July 2012

Plant of the week #5

Leptospermum myrsinoides
(heath tea-tree)



A brief description: A common shrubby species throughout its range, the heath tea-tree grows to about 1-2 meters in height. However it is often found as an understory shrub in Eucalyptus woodlands, where it is much smaller. The 5-petal flowers are white (sometimes pink) and appear in spring.

Taxonomy: The genus Leptospermum belongs to Myrtaceae, the same family as the Eucalyptus genus, and this is one of about 85 Leptospermum species. Most species are endemic to South-Eastern Australia, with one found in New Zealand and another in Malaysia.

Distribution: SA, VIC, rare but reported in NSW (mostly in the SE).

Conservation status: Locally common in SA and VIC, not considered at risk in the wild.

Interesting things about tea-tree: Various species in this genus are commercially important for the garden industry and also for honey producers. Dense plantings of tea-trees are popular as hedges and there are many cultivars that are common garden plants (particularly due to the drought-tolerance of older plants). Honey made from the nectar of some species has been found to have antibacterial and antifungal properties.

Taxonomy basics (part 2): How do you decide what is a species?

The underlying question all taxonomists ask is what exactly is a species?
Perhaps slightly surprisingly, this is a question still without a strict answer. Generally it's thought to be if two groups of organisms can't interbreed, then you have separate species, but this doesn't always work (bacteria, for example, screw up this definition by reproducing asexually).
So aside from a definition debate, how do we work out whether or not something should be named as a new species, and also where it fits in with other currently known species?

There are 2 ways to go about this, and depending on the field you're in, you can use one or the other or (preferably) both.

The first method used is the traditional morphological method. This involves picking apart what the organism looks like and using those traits to group it with things that are similar. It's the method that has been used since the first ancient attempts were made to classify things, and realistically humans have been doing it for centuries. How did we determine that a mouse and a worm are different species? Because they look like they are.
Thankfully methods in morphological study are more advanced than that now. Traits studied might be form and function, as well as internal structures of that plant/animal. However different traits are also given different weighting, according to their importance, as not all characteristics are necessarily useful for classifying animals/plants/fungi. For example red hair in Orangutans and humans would certainly be a trait that would be largely ignored, as there are so many others that are far more important for determining whether or not they're different species.

However it is exactly this sort of uninformative characteristic that makes a purely morphological study potentially unreliable (although good systematists know this and choose better characters/leave out really dubious ones). It is important to recognise this, and make sure your characteristics are reliable. For example colour is usually a bad character for animal species (but is often ok for plants), as it can be really variable within a species.

Let's make a case-study of two similar species, say... the plant I'm working on, Cassytha pubescens (Lauraceae), and a dodder species, Cuscuta australis (Convolvulaceae). The reason I'll use these two is that they're frequently confused, and are a good example of the pros and cons of using morphology alone to describe species.

(Cuscuta image from http://www.natureloveyou.sg/Cuscuta%20australis/Main.html; Cassytha photo is my own) 

Cuscuta australis
Cassytha pubescens






             

So you can see from the above photos that the two plants are superficially similar. They're both rootless, twining parasitic vines that can spread quite rapidly. Single plants can cover rather large areas, with dense mats forming over the host vegetation. Their leaves are so reduced they appear like scales, and they both attach to host plants using haustoria. These traits, without looking at any others, would immediately create an idea that they should be grouped together.

However an immediate difference we can see is the colour - the Cassytha is green, whereas the Cuscuta is yellow to orange. This is due to their different modes of parasitism - Cassytha is a hemiparasite (takes water and the associated dissolved nutrients from its host) and Cuscuta is a holoparasite (takes photosynthates from its host, which is why it isn't green - it doesn't typically photosynthesise much). In addition to this, Cassytha is a perennial plant and Cuscuta is an annual. So on outward appearance we might group them together, but we can already see that their physiology is very different.

Other differences can be easily found in the flowers and fruits of the plants. Typically when classifying plants by their morphology any characters you can find that are not reliant on flowers and fruits are good. This is because vegetative characteristics (eg veins in leaves, or leaf arrangement etc) are there the whole year around (ok, excluding deciduous trees!) but flowers and fruits are only on the plant at certain times of the year. Having said that, flowers and fruits are really important for classifying plants. It just makes it a pain to identify them in the field if they're not in flower/fruit!

The above examples both have leaves that are so tiny they're called scales - they're only a few mm long and sit flush with the stem at the nodes where the vine branches. So while they can be used for taxonomic purposes, flower and fruit characters are also really useful here.

In the photo of Cassytha, you can see the flowers are yellow, whereas in Cuscuta they're white. Cassytha flowers are tiny, and open about half-way in comparison to Cuscuta flowers. They're anatomically different (which can get complicated, so I won't explain that in detail) and you can also see the distribution of the flowers is different - the position of flowers along the stem bearing them and also the structure of the inflorescence is different - in Cassytha they're borne on spikes and have about 3-5 flowers in a cluster, in Cuscuta they're closer to the stem and have many in tight clusters. The fruits are really different too (although you can't see them in the photos), with Cassytha having fleshy fruits that are vertebrate dispersed, and Cuscuta having light, dry papery fruits.
So all the physiology and flower/fruit characters are pointing to them being very different species, from different families.

There are many other differences between these two plants; I've just picked out a few obvious ones. So you can see how once you start to look closely, sufficient characters can be found to place them into different families.
This is why morphological studies work. For some areas of taxonomy, this is essential - for example palaeontology. Morphological studies are the only method they have to determine which ancient organisms belong together. But for those lucky enough to be working on living things, we can also use the newer, shinier methods created by our new(ish) knowledge of genetics.

So, to explain how we can do this, we turn to genetics 101, dot-point style:
- All living things have DNA (we won't go into the 'is a virus a living thing' debate here)
- This DNA is made up of 4 nucleotide bases, Adenine, Guanine, Thymine, Cytosine, known as A, G, T and C.
- These pair up with each other (A with T, G with C)
- These pairs of bases run in sequence and create a double-helix of DNA.
- A specific region of DNA sequence makes up a 'gene' and is responsible for a certain function; ie it might be for blue eyes.
- Because all members of a species must have compatible genes, you tend to get subtle changes between species that allow us to tell them apart if we look at their DNA (if we can figure out which genes are best to look for this in).
- If the DNA can be extracted and sequenced, you end up with a string of A, T, G and Cs. By comparing these between your samples, you can determine if there are large enough differences to call them different species.
- Usually more than one gene is used - the more the better - as there is some variability within species as well as between them.
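
As a rough illustration of the dot-points above, here is a minimal Python sketch of base pairing and of comparing sequences between samples. The sequences are invented for the example, and it assumes they are already lined up and the same length. It only reports the proportion of positions that differ; how big that needs to be before you call something a different species is a judgement this sketch doesn't make.

```python
# A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(sequence: str) -> str:
    """Return the bases that pair with each base of the given strand."""
    return "".join(PAIRS[base] for base in sequence.upper())


def proportion_different(seq_a: str, seq_b: str) -> float:
    """Proportion of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return differences / len(seq_a)


# Made-up sequences for one imaginary gene, not real data.
samples = {
    "sample_1": "ATGGCGTACGTTAGC",
    "sample_2": "ATGGCGTACGTTAGC",
    "sample_3": "ATGACGTTCGTAAGC",
}

print(partner_strand(samples["sample_1"]))
for other in ("sample_2", "sample_3"):
    difference = proportion_different(samples["sample_1"], samples[other])
    print(f"sample_1 vs {other}: {difference:.2f}")
```

In practice you would repeat this across several genes, as the last dot-point says, since a single gene can vary within a species too.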

So from genetics combined with morphology we could, theoretically, end up with two different answers to whether or not our Cassytha and Cuscuta are in fact different species. However usually the two methods complement each other, and neither is without its own problems. The morphology results can be influenced by which characters you choose, or even how you score them, and the genetics could be influenced by how many genes you used, whether you took them from chloroplasts (which are like our mitochondria and usually only inherited from the 'female' plant) or the nucleus of the cells, or even which genes you picked. However with good experimentation and care these problems can be easily minimised.

Taxonomy isn't really that complicated. Both of these methods rely on the same principle - detecting differences between species based on the characters you chose to use (whether they be genetic or morphological ones). There are also ways to represent the relationships between species visually, and these are typically known as phylogenetic trees. The process for making these is simple, but because you need so many characters to justify splits between species, it can get mindbogglingly complicated with all the recombining and maths that becomes involved. This is why we let computers do the legwork for us now.
Having said that, I might make a post on how to construct a phylogenetic tree to finish up the taxonomy thread, as it's a concept that I think is way better understood if you actually make one from scratch, by hand. It's so easy to just plug in numbers and have a computer spit out an answer without actually understanding any of the processes involved.
So that's what I'll talk about next time :)

Tuesday, 10 July 2012

Taxonomy basics (part 1): What is it, and why do it?

Metaphorically speaking, people like to put things in boxes. It's human nature to categorise things, and plants, animals and fungi are no different. Besides being human nature, it's also a useful tool to help describe something to anyone from any background - for example you might want to describe a leopard on the African savannah. You could try to tell someone that it's a large cat with spots, but a general description such as that won't tell them exactly what it was. If you gave them a species name, in this case Panthera pardus, they would then be able to look up exactly what you meant. This is because of the way species names (often called 'latin' or scientific names) work, with each two-part (binomial) name unique to the species you're trying to describe.

The naming system we currently use was created by Carl Linnaeus in 1735. Names are typically written in two parts (although there is a whole set of different levels which I'll run through later), with the first word describing the group that species belongs to, and the second the actual species it is. The first part is known as the genus name, and in the case of the leopard it is Panthera, a genus it shares with jaguars, tigers and lions. So the information you get from the genus is the sort of species it is, and what it is closely related to. Once it's been used the first time the genus name is often abbreviated to its first letter, which, whether abbreviated or written in full, is always capitalised. The whole name should also be written in italics or underlined (it's just the convention that is followed, so it's clear exactly what is the species name).
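
The writing conventions are simple enough to put into a few lines of Python. This is just a sketch with a made-up function name; the lower-case species part follows the usual convention, and plain text can't show the italics.

```python
def format_binomial(genus: str, species: str, abbreviate: bool = False) -> str:
    """Write a two-part name: capitalised genus, lower-case species."""
    genus = genus.strip().capitalize()
    species = species.strip().lower()
    if not genus or not species:
        raise ValueError("both the genus and the species part are needed")
    if abbreviate:
        # After the first full use, the genus is often cut to its first letter.
        genus = genus[0] + "."
    return f"{genus} {species}"


print(format_binomial("panthera", "Pardus"))                   # first mention
print(format_binomial("panthera", "Pardus", abbreviate=True))  # later mentions
```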

The second part of the name is the unique name of the species itself. This tells you exactly what your species is, in the case of the leopard it is pardus, which distinguishes it from the jaguars, tigers and lions that make up the rest of the Panthera genus. So in full its binomial name is Panthera pardus, telling you it's a big cat with spots that lives in Africa.
Or is it?

There are other levels of classification in addition to the genus and species levels. There are broader groups, such as family, or even kingdom, and there are also some other lower levels that might include subspecies, breed, or in plants variety or cultivar. These last few are all the same species, they're just different forms of the same species, and they should be able to reproduce with each other to produce viable offspring. For example going back to the leopard, there are a couple of sub-species you might have. P. pardus pardus is the African leopard, however the International Union for Conservation of Nature (IUCN, the big-boys in validating species internationally) currently recognise 9 subspecies of leopard, with another 2 possible subspecies described from skulls. These include subspecies from South-East Asia, China, India, Russia and the Middle-East. However in theory as they're only subspecies and may look a little different, if you stuck a Javanese and an African leopard together, they should still be able to breed and create fertile offspring.

Confused? I'll put in the hierarchy now then, and use a plant-based example this time, we'll make it the... Sunflower (Helianthus annuus) :)

Kingdom: Is it a plant, animal, fungus, or bacterium? Plant
Phylum: If it's a plant, is it a flowering plant? Seed plant? Seedless vascular plant? Flowering plant (angiosperm)
Class: Be more specific? It belongs to the Asterids, a very species-rich group of plants.

Order: Asterales - they have many little flowers clustered together to produce one large 'flower' like inflorescence
Family: Now we're getting to the other closely-related species, it belongs to the Asteraceae family, a group of plants with daisy-like flowers.
Genus: Helianthus - this tells us what its closest relatives are :)
Species: annuus - tells us exactly what it is

(then there might be subspecies and cultivars or even hybrids listed here)


The other useful thing about two part names is that it allows the actual species name to be re-used. For example people like to name species after other people, so if you get someone famous like Sir Joseph Banks, you end up with a bunch of species named after/in honour of him. So you might have a plant with the species name banksii (for example Banks' Grevillea, Grevillea banksii) but also animal species like the Red-tailed Black Cockatoo (Calyptorhynchus banksii) and the binomial name allows you to differentiate the two because the plant and the cockatoo will have different genus names. Clever eh? It also means our imaginations are not taxed as heavily, as I suspect the number of species in the world would far outweigh our creativeness at naming them ;)

Usually the names you'll see written down will be from Family level down (you can tell it's a family name in botany if it ends in 'aceae', or in zoology if it ends in 'idae'), and in botany at least you should mention who named the species the first time you write the name out in full as well (it allows people to look it up/refer to it etc). Often with plants you'll find the genus and species name, followed by the family name in brackets e.g. Helianthus annuus (Asteraceae). The use of family level ID is good, as it allows you to either see how weird your species is (such as the one that is the subject of my PhD, Cassytha pubescens (Lauraceae) as it is the only Lauraceae that looks and behaves like that) or to look up what it is related to quickly and easily.
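
The ending rule for family names is easy to sketch in Python. The function name is made up, Felidae (the cat family) is my own example, and the check only covers the two endings mentioned here, so treat it as a rule of thumb rather than a full set of naming rules.

```python
def family_kind(name: str) -> str:
    """Guess whether a name is a plant or animal family from its ending."""
    lowered = name.strip().lower()
    if lowered.endswith("aceae"):
        return "plant family"
    if lowered.endswith("idae"):
        return "animal family"
    return "not recognised as a family name"


for name in ("Asteraceae", "Lauraceae", "Felidae", "Panthera"):
    print(f"{name}: {family_kind(name)}")
```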

So now we know how to write out a descriptive species name and why to do it, the remaining obvious question is how? How do you decide what is a species? And as that's a question with a rather chunky answer, I'll put that in a fresh post later ;)