Systematics and Biogeography: October 2007

Such expressions as that famous one of Linnæus, and which we often meet with in a more or less concealed form, that the characters do not make the genus, but that the genus gives the characters, seem to imply that something more is included in our classification, than mere resemblance. I believe that something more is included; and that propinquity of descent,—the only known cause of the similarity of organic beings,—is the bond, hidden as it is by various degrees of modification, which is partially revealed to us by our classifications (Darwin, 1859, p. 413f).

Monday, 29 October 2007

The Great Phenetic Revival 2 Revisited: A Reply to Felsenstein

Recently on our blog we received a reply to our posting on The Great Phenetic Revival 2: Phenetics from Joseph Felsenstein (University of Washington). We thought it would be a pity to relegate our reply to the comments section and instead include it as a separate post.

Felsenstein claims not to have been "trying to give the history of 'phylogenetic methods'" in his chapter 10. Nevertheless, this seems not to have prevented him from making sweeping (and damning) statements concerning classification - some published before the publication of his book: "The focus of systematics has shifted massively away from classification: it is the phylogenies that are central, and it is nearly irrelevant how they are then used in taxonomy" (Felsenstein 2001: 467); "Systematists get so worked up declaiming the centrality of classification in systematics that I have argued the opposite" (Felsenstein in Franz 2005, p. 495). Others see things in much the same light: "Many phylogeneticists now see nomenclature and classification as largely irrelevant to phylogenetics..." (Hillis 2007: 331).

Still, Felsenstein sees himself as commenting only upon "algorithmic methods", when, of course, any method proposed can be made 'algorithmic' and many attempted to do so in constructing early versions of data matrices, way before Sneath or Sokal (see figure above as well as Tillyard 1919, Abel 1910 and Willman 2003).

Cladistics and phenetics might (erroneously) be seen as methods. Felsenstein wished to drop the terminology: "Making this distinction [between phenetics and cladistics] implies that something fundamental is missing from the 'phenetic' methods, that they are ignoring information that the 'cladistic' methods do not. In fact, both methods can be considered to be statistical methods, making their estimates in slightly different ways ... In this book we will give the terms 'cladistic' and 'phenetic' a rest and consider all approaches as methods of statistical inference of the phylogeny" (Felsenstein 2004: 145-146). Our comment on Felsenstein's wish to drop the terms 'cladistic' and 'phenetic' was that he wished "to grant equal time to all quantitative (numerical) methodologies", which now leaves us puzzled as to what, exactly, in this passage was "an outrageous misrepresentation of the content of my Chapter 10".

Further, "... numerical phylogenetics is not 'based on simple similarity'. It just isn't. There is no way you can compute either a parsimony tree, or a likelihood tree, from a table of similarities between species". What, then, is it based upon? The matrices that grace our systematic accounts certainly look to us as if they are sets of similarities.

To many (us included), cladistics was about the reform of palaeontology rather than the elaboration, support and promotion of one kind of method or another. That reform began in the 1960s almost entirely independently of the numerical development of data manipulation, the latter manifesting itself as the ever-present pernicious influence of phenetics (regardless of that manifestation as 'parsimony', 'compatibility', 'likelihood', etc.). Felsenstein doesn't mention palaeontology in his history chapter but does later in "Phylogenies and Paleontology" (Felsenstein 2004: 547 et seq.). Here his imprecision seems a little troubling: "...If the fossil record of a group has been searched thoroughly enough, then we should not only be allowed to interpret fossils as ancestors, we should be encouraged to do so" (Felsenstein 2004, p. 547) - searched thoroughly enough; we should not only be allowed to...we should be encouraged to do so. How thorough is enough? And since when has the scientific endeavour required 'permission' to be 'allowed' and 'encouraged' to 'believe' something? It was because of such 'beliefs' that the first cladistic revolution was necessary. It is from the ever-present phenetics that the second cladistic revolution will (eventually) be born.

References
Abel, O. 1910. Kritische Untersuchungen über die palaogenen Rhinocerotiden Europas. Abhandlungen Kaiserlich-Koenigliche Geologische Reichsanstalt 20: 1-22.
Felsenstein, J., 2001. The troubled growth of statistical phylogenetics. Systematic Biology 50: 465-467.
Felsenstein, J., 2004. A digression on history and philosophy. In: Felsenstein, J. (Ed.), Inferring Phylogenies. Sinauer Associates, Sunderland, MA, pp. 123-146.
Franz, N. 2005. On the lack of good scientific reasons for the growing phylogeny/classification gap. Cladistics 21: 495-500.
Hillis, D. M. 2007. Constraints in naming parts of the Tree of Life. Molecular Phylogenetics and Evolution 42: 331-338.
Tillyard R. J., 1919. The panorpoid complex. Part 3: the wing venation. Proceedings of the Linnean Society of New South Wales 44: 533-717.
Willman, R. 2003. From Haeckel to Hennig: the early development of phylogenetics in German-speaking Europe. Cladistics 19: 449-479.

The Great Phenetic Revival 3: DNA Barcoding

DNA Barcoding is not often directly associated with phenetics or numerical taxonomy. Given that it is without any methodological or theoretical foundation, DNA Barcoding has very little to do with anything associated with taxonomy or comparative biology. Then we happened upon Sokal & Rohlf (1970) The Intelligent Ignoramus, an Experiment in Numerical Taxonomy published in Taxon (19: 305-319).

What is striking about Sokal & Rohlf's "Intelligent Ignoramus" is that it totally undermines our in-built ability to classify, even groups we do not know. Anyone from the southern hemisphere would know how to group "Song Birds" based on characters that have not been pointed out to them. It is what we do naturally. Consider the European Robin (Erithacus rubecula), the American Robin (Turdus migratorius) and the European Blackbird (Turdus merula). Based on their common names we group them as Blackbird (E. Robin, A. Robin). But if we look at them, it becomes clear that the European Blackbird looks like an American Robin. See for yourself. You don't need a detailed list of pre-defined characters; it is what we do naturally. But Sokal & Rohlf don't think so.

The Intelligent Ignoramus is a simpleton. They are "unprogrammed" with little to no training in biology, meaning they are unable to identify the organisms before them as "bees" and therefore are presumed to lack the ability to classify bees in general. Sokal & Rohlf's test was to assess: "The feasibility of using technicians untrained in taxonomy to collect data for use in numerical taxonomic studies .. [and ] ... the analysis of their descriptions were compared with the data of Michener and Sokal (1957)" (Sokal & Rohlf 1970: 305). The results astounded the authors.

The two technicians managed to find a similar classification of bees. Anyone looking at bees for 170 hours (the figure given in the results) would obtain a general knowledge of bee morphology. Obviously this time would be remarkably reduced if a trained taxonomist had pointed out the relevant morphology and the characters that relate various bee groups. Instead of seeing the obvious, Sokal & Rohlf make an enormous blunder. What if the technicians had measured all the specimens, that is, quantified the qualities that helped them group the bees? Quantities can be standardized and therefore automated. In fact, one "... could hire teams of technicians to study the specimens, make the necessary measurements, and record the data and perhaps even select the characters themselves. One step beyond this would be to automate the entire process completely" (Sokal & Rohlf 1970: 318).

The future of taxonomy envisioned by Sokal & Rohlf - groups of "Intelligent Ignoramuses" coding taxa to be processed by numerical methods - is typical of phenetics and their attempts to remove taxonomists from doing what they do best. Now, all of this is starting to sound familiar. "Intelligent Ignoramuses" that can identify and classify taxa without the burden of taxonomic training have reappeared in the guise of DNA Barcoding (identification) and the Phylocode (classification).

Can we re-label DNA Barcoding as Phenetic? Usually phenetic methods or techniques are considered, but rarely do we ever identify phenetic ideas or intentions. The Great Phenetic Revival is the revival not only of phenetic methods but of the ideas endorsed by early pheneticists like Sokal & Rohlf. Read phenetics as Numerical Taxonomy and one quickly realizes that it's about numerical data - quantities, not qualities - and about obtaining such data mechanically and processing it quickly. The taxonomist is not a machine. He or she does not seek to provide measurements. The aim is to discover homologies. The same is true for classifications. They are based on monophyly, not on some general rule of classification. Unfortunately, the Great Phenetic Revival is about the rise of Intelligent Ignoramuses, those that wish to supplant taxonomy and systematics with phenetics under the guise of helping taxonomy. What a frightening thought.

Thursday, 25 October 2007

The Great Phenetic Revival 2: Phenetics

To all intents and purposes, by the late 1970s phenetics was dead. Why, then, some 30 years after its death, does Felsenstein (2004) in chapter 10 (Digression on history and philosophy) of his book Inferring Phylogenies begin by acknowledging Sokal and Sneath (1963), the first bible of phenetics, as the beginning, if not the foundation of phylogenetic methods (Felsenstein, 2004:124)?

Felsenstein notes (2004:145):

"Many systematists believe that it is important to label certain methods (primarily parsimony methods) as 'cladistic' and others (distance matrix methods, for example) as 'phenetic'."

Why does Felsenstein reduce the theory of cladistics and phenetics to different types of method? Part of the reason is that Felsenstein wishes to grant equal time to all quantitative (numerical) methodologies. Unwittingly (perhaps), he makes phylogenetics a 'realist' agenda for numerical systematics - rather than 'phenetic', a term carefully avoided.

Methods to one side, Felsenstein sees no purpose in discussions of classification, noting that,

"I would say that the effort put into this controversy is further evidence that systematists do not have their priorities straight. In their day-to-day work they really do not make much use of classifications, but they show a strange obsession with fighting about them for reasons that seem to me to be an historical curiosity" (Felsenstein 2005)

Felsenstein's dismissal of classification should not be surprising. Homology and monophyletic groups, crucial to the enterprise of classification, are not necessary under phenetics (in fact, Felsenstein mentions neither homology nor monophyly in his book) (see Williams & Ebach 2005).

So, what, exactly is phenetics? Or what has it become?

Phenetics is more than just a method of grouping by overall similarity; it's more than just a method. It's a way to not do (to avoid) classification, namely to group taxa without any notion of homology beyond mere similarity, to form arbitrary groups without any notion of relationship (paraphyly) and to work comfortably with branching diagrams depicting similarity without any specified hierarchy (unrooted trees). Phenetics is a synthesis that unites various numerical procedures to find non-groups that stem from ancestral 'vices' in a world in which the taxonomist and systematist has no prior knowledge or conviction of classification.

Perhaps there is a broader question: What are the fundamental differences between the Modern Synthesis, numerical phylogenetics, numerical taxonomy and phenetics? We know of none; all are based on simple similarity.

References
Felsenstein, J. 2004. Inferring Phylogenies. Sinauer Associates Inc., Massachusetts.
Sokal, R.R., Sneath, P.H.A. 1963. Principles of Numerical Taxonomy. W.H. Freeman, San Francisco.
Williams, D.M., Ebach, M.C. 2005. Drowning by Numbers: Re-reading Nelson's Nullius in Verba. Botanical Review 72, 355-387.

Wednesday, 24 October 2007

Planet Bob

The guys at the International Institute for Species Exploration at Arizona State University, in cooperation with Media Alchem of Seattle, have just released the trailer and movie to Planet Bob.

Planet Bob is a fantastic way to advertise the importance of taxonomy to a general audience. Making people aware of taxonomy through a large media campaign is a novel idea and one that I hope will attract the attention of policy makers and funding bodies. We wish Quentin Wheeler and the IISE the best of luck.

Please visit the Planet Bob website http://www.planetbob.asu.edu/

Tuesday, 23 October 2007

The Great Phenetic Revival 1: Stratophenetics

Phenetics haunts us again in what appears to be a great revival in molecular systematics and paleontology - two fields that are linked not by their data but by a common world view.

The molecular clock and all that is associated with it, mainly a belief in a new empirical attempt to date nodes on trees, is an old idea that stems back to stratophenetics, a term coined by P.D. Gingerich in 1979. The practice of stratophenetics is simple. Taxa that share similarities can be clustered as phenograms (or on graphs) and, together with their respective fossil dates, compared directly to existing hypotheses of classification (i.e. phylogenies). The practice of assigning dates to taxa as well as to nodes (i.e. ancestors or events) is the underlying principle. In short, the ingredients for the stratophenetic recipe are:

Any taxic hierarchy (i.e. phenograms, cladograms, area cladograms etc.).

The oldest known fossil date for each taxon.

Interpreting nodes as either ancestors or events and cladograms and phenograms as explicit ancestor-descendant relationships.

This then can be turned into molecular clocks, stratocladistics or any other recent attempt at dating nodes (Wagner 1995, Hunn & Upchurch 2001, Donoghue & Moore 2003, Makovicky in press). Given the number of times that stratophenetics pops up in systematics and biogeography, many would be under the impression that it is a good idea. We beg to differ.

Stratophenetics, like stratocladistics, tests existing phylogenetic hypotheses based on existing classifications (see Wagner, 1995). The test adds the extra stratigraphical data (i.e. the age of fossil taxa) to any given ancestor-descendant lineage. The better the stratigraphy, the better rates of speciation can be retrodicted - something like a fossil clock. Unlike molecular clocks, however, stratophenetics and stratocladistics go further. They use the fossil clock to test whether the lineage is correct. If, for instance, the clock tells us that taxon A is the oldest, followed by B, C and D sequentially, it would contradict a hypothesis that related A more closely to D than to either B or C (if the lineage follows chronological order, namely A=>B=>C=>D). If we were to interpret phenograms and cladograms as real phylogenetic trees (rather than overall classifications), we could interpret the tree (AD)(BC) as showing the node that unites B and C (herein node X) to be older and therefore more ancestral than A. Many would go one step further and identify node X as an ancestor that is older than A but not necessarily older than the node that unites A and D. Molecular clocks do not go that far, only adopting the oldest known age as the "minimal" age for any given node.
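
The arithmetic behind such a test can be sketched in a few lines of Python. The ages below are invented for illustration (A oldest, D youngest), and the "gap" measure is a simplified tally of the unrecorded time a tree implies, assuming that the lineages arising from a node originate together; it is not the procedure of Gingerich (1979) or Wagner (1995). Each node receives the oldest fossil date among its descendants as its minimum age.

```python
def min_age_and_gap(tree, oldest_fossil):
    """Return (minimum age, implied unrecorded time) for a nested-tuple tree.

    A leaf is a taxon name; an internal node is a tuple of subtrees.
    Ages are in millions of years before present, so larger means older.
    """
    if isinstance(tree, str):
        return oldest_fossil[tree], 0
    children = [min_age_and_gap(child, oldest_fossil) for child in tree]
    # A node can be no younger than its oldest descendant.
    age = max(child_age for child_age, _ in children)
    # Each child lineage must reach back to the node, fossils or not.
    gap = sum(child_gap + (age - child_age) for child_age, child_gap in children)
    return age, gap


# Hypothetical oldest fossil dates (Ma): A, then B, C and D in sequence.
ages = {"A": 40, "B": 30, "C": 20, "D": 10}

grouped_with_d = (("A", "D"), ("B", "C"))   # the tree (AD)(BC)
chronological = ("A", ("B", ("C", "D")))    # branching in stratigraphic order

for name, tree in [("(AD)(BC)", grouped_with_d), ("(A(B(CD)))", chronological)]:
    age, gap = min_age_and_gap(tree, ages)
    print(f"{name}: root at least {age} Ma, implied gap {gap} Myr")
```

With these invented dates, (AD)(BC) implies 50 Myr of unrecorded history against 30 Myr for the chronological arrangement, which is the sense in which stratigraphy is said to contradict the first tree. Note that the dates attach to taxa; the node ages are inferred, not observed.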

Where molecular systematics and paleontology share a common world view is in the belief that the oldest node of a group (e.g. Node X) is a real taxon or event - most likely an ancestor or radiation. Gingerich (1979) thought the same and, on careful consideration, would be the father not only of stratophenetics but of molecular clocks as well.

References
Donoghue, M. J. and B. R. Moore. 2003. Toward an integrative historical biogeography. Integrative and Comparative Biology 43: 261-270.
Gingerich, P. D. 1979. The stratophenetic approach to phylogeny reconstruction in vertebrate paleontology. In J. Cracraft and N. Eldredge (eds.), Phylogenetic Analysis and Paleontology, Columbia University Press, New York, pp. 41-77.
Hunn, C. A. & Upchurch, P. 2001. The importance of time/space in diagnosing the causality of phylogenetic events: Towards a "Chronobiogeographical" paradigm? Systematic Biology 50: 391-407.
Makovicky, P.J. In press. Telling time from fossils: a phylogeny-based approach to chronological ordering of paleobiotas. Cladistics.
Wagner, P.J. 1995. Stratigraphic tests of cladistic hypotheses. Paleobiology 21:153-178.

Monday, 22 October 2007

Phenetic "Natural" Classifications

Why would anyone talk about Phenetic "Natural" Classifications? Strangely, the concept turned up in a recent review of Johann-Wolfgang Wägele's book Foundations of Phylogenetics by Norman Platnick in The Quarterly Review of Biology (Vol. 81: 56-57).

What caught our eye was the following:

"Phenetics is the theory that clustering by raw similarity (i.e., by counting as significant both the presence and absence of characters, the 0s as well as the 1s in data matrices) will retrieve natural groups" (Platnick 2007: 56).

Phenetics and Natural Groups? We had to investigate.
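
To make Platnick's definition concrete, here is a minimal sketch in Python using a made-up matrix (taxa A, B and C and their character scores are hypothetical, not drawn from Platnick or from Sokal & Sneath). It contrasts a simple matching coefficient, which counts shared 0s as well as shared 1s, with a shared-presence (Jaccard) coefficient that ignores the shared 0s.

```python
def simple_matching(a, b):
    """Fraction of characters on which two taxa agree, 0s and 1s alike."""
    if len(a) != len(b):
        raise ValueError("taxa must be scored for the same characters")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def shared_presence(a, b):
    """Shared 1s divided by characters present in at least one taxon."""
    if len(a) != len(b):
        raise ValueError("taxa must be scored for the same characters")
    both = sum(x == 1 and y == 1 for x, y in zip(a, b))
    either = sum(x == 1 or y == 1 for x, y in zip(a, b))
    return both / either if either else 0.0


# Hypothetical presence (1) / absence (0) scores for six characters.
matrix = {
    "A": [1, 1, 0, 0, 0, 0],
    "B": [1, 0, 1, 0, 0, 0],
    "C": [0, 0, 0, 0, 0, 1],
}

for first, second in [("A", "B"), ("A", "C")]:
    raw = simple_matching(matrix[first], matrix[second])
    present = shared_presence(matrix[first], matrix[second])
    print(f"{first}-{second}: raw similarity {raw:.2f}, shared presence {present:.2f}")
```

In this toy matrix A and C have no character in common, yet raw similarity scores them as half alike, purely on the strength of shared absences.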

The concept of Natural Groups or a Natural Classification in phenetics was championed by P.H.A. Sneath and R.R. Sokal. Their claim followed Gilmour's dictum, namely a "... system of classification is the more natural the more propositions there are that can be made regarding its constituent classes" (Sokal & Sneath 1963: 19).

If we look at Gilmour (1951) we see that his definition states: "In the general theory of classification, classifications which serve a large number of purposes are called natural, while those serving a more limited number of purposes are termed artificial" (Gilmour 1951: 401).

It is clear that the meaning of the term "Natural" has been misinterpreted, both by Gilmour and by Sokal & Sneath. No one who wished for a Natural Classification would have bought into the idea that Natural = more data whereas Artificial = less data. Moreover, Sneath and Sokal (1973) went as far as to defend their version of natural classification by using A.-P. de Candolle's distinction that Artificial Classifications, namely Systems (i.e. Linnaeus' system), should be rejected in favour of Natural Classifications, namely a Method (Candolle, 1813) - a concept that was also supported by Goethe.

Gilmour's Natural Classification is a System of Classification and not a Natural Classification or Method. The former imposes a way to order nature (i.e. overall similarity) whereas the latter discovers the way nature is ordered (homology and monophyly). The mistake is monumental and is one that often gets made (e.g. the Phylocode).

References

Candolle A.-P. de 1813. Theorie elementaire de la botanique. Deterville, Paris.
Gilmour J.S.L. 1951. The development of taxonomic theory since 1851. Nature 168: 400-402.
Sneath P.H.A.& Sokal R.R. 1973. Numerical Taxonomy. Freeman, San Francisco.
Sokal R.R. & Sneath P.H.A. 1963. Principles of Numerical Taxonomy. W. H. Freeman, San Francisco.

Thursday, 18 October 2007

The Authors

The authors after two days of correcting proofs (at Gabana's in Berlin).

The new Blue Book?

It certainly is blue but Foundations of Systematics & Biogeography focuses on the history of comparative biology from Goethe to 21st century systematics and biogeography. The book is the history of our field from Ernst Haeckel to Adolf Naef, a rather neglected story that is unfairly dismissed as "typology" and forgotten. Perhaps this is Mayr's great undoing of what seems to be the foundations of 21st century systematics and biogeography.

The book is a combination of our collaborative efforts since 2001. During that time David and I have tried to unravel what are for us keystones in our understanding of comparative biology. We are still mulling over a few, such as the role of homology in molecular data and why paraphyly is still the biggest problem in systematics. We hope that the book and this blog will help people try to unravel that same history.

The book is available from Springer in December. Watch this space for more news!