Systematics, AI and Frankenstein
Will artificial intelligence enhance or distract from taxonomy?
In systematic biology, there are two kinds of rapidly developing artificial intelligence. The first is technological; the second consists of bad ideas and shortcuts that create the illusion that we know things about Earth’s species that we in fact do not. A recent opinion piece in Entomology Today, an online publication of the Entomological Society of America, supports embracing AI as a new means to advance taxonomy. I agree with this positive outlook, but only to a degree. Taxonomists have always been early adopters of new technology. Years ago, Norman Platnick and colleagues at the American Museum of Natural History, in the project SPIDA-Web, demonstrated how machine learning could result in increasingly accurate automated identifications of spider species based on microscopic images, an early AI success story. It is easy to think of a wide range of routine taxonomic activities that might be enhanced, accelerated, or automated by AI, from prioritizing field locations where new species might be found to routine sorting of mass lots of specimens.
Photo: Courtney Maxson, U.S. Army. Source: https://www.defense.gov/News/News-Stories/Article/Article/2427173/artificial-intelligence-enablers-seek-out-problems-to-solve/
While there were a number of statements in the piece to which I am tempted to respond, one in particular suggested to me that a different, concerning kind of artificial intelligence is abroad in the systematics community: a false belief that simply gathering and analyzing data ever more efficiently, with the limited goal of telling one species from another more quickly or easily, is the equivalent of the science and scholarship of taxonomy done to its own high standards. The statement that I found most alarming was the following (italics mine):
“A few decades ago, traditional systematic entomology was derided as stamp collecting, an outdated field soon to be replaced by (back then) cladistics, phylogenetics, and studies of evolutionary processes. Turns out, systematics as a whole survived and is thriving, while cladistics not so much.”
This rosy assessment of the state of systematics depends very much on your understanding of what systematic biology is, or is not. Activities like DNA barcoding are indeed being funded and welcomed by users of taxonomic information who neither understand taxonomy as a science nor seemingly care whether the names it provides to them are theoretically justified and attached to explicitly testable hypotheses. Cladistics not only revolutionized how systematics is done, it fulfilled the long-sought aim of making classifications accurately reflect relationships due to a history of common ancestry. Systematics is a fundamental science that aims to discover and know every kind of living thing, the characters which make each species unique among all others, and the history of descent with transformation that explains the origins of the diverse species and species attributes that we see today.
The extent to which systematics is alive and well as a credible science, as opposed to a species identification service, is directly correlated with the degree to which cladistics and so-called descriptive taxonomy are doing well. The ultimate expressions of systematics are taxonomic revisions, monographs, and phylogenetic classifications that are at once optimal information storage systems and the basis for making predictions about attributes of species not yet fully documented. The purpose of the science of systematics is to explore, document, analyze, describe, understand, name and classify the kinds of organisms that exist, or have existed, as well as the pattern of relationships among them that explains their similarities and differences. Simply telling species apart with DNA barcodes or accelerating species descriptions by minimizing their information content does not point to a systematics that is thriving. It reveals a systematics being redefined as a mere service, its theoretical foundations, synthesis of diverse evidence, and lofty aims dangerously degraded.
Too many taxonomists—long deprived of recognition and funding—have willingly engaged in this charade, reducing taxonomy to a sad parody of itself, reducing its theoretical integrity, and reducing its informativeness in a bid to enjoy funding and the approval of those unappreciative of systematics as an independent science.
The success of systematics should be measured in the extent and depth of our knowledge of the diversity and history of life, not in positions, grants, or amount of praise showered on those willing to betray its rich traditions in order to appear modern and technologically savvy, or to pander to the desires of those who gladly use taxonomic information and knowledge while caring little about its quality.
Individual characters and claims of homology, as well as individual species and monophyletic groups of species, are, when done well, based on scientific hypotheses that make explicit predictions about the distribution of characters, whether autapomorphies restricted to one species or synapomorphies shared by descendants of a common ancestral species. The age-old quest to classify the astounding diversity of life in a natural classification, a system that reflects relationships as they exist in nature (now understood to be the result of evolutionary history), cannot be abandoned without fundamentally changing what systematics is. Serious, scientifically and theoretically justifiable systematics, contrary to this opinion piece, is facing an existential threat.
Of course, AI, digital imaging, DNA barcodes, and every other technology that comes along can be adapted to further the ends of systematics and should be folded into taxonomic practice. But when such technology, or trivial outcomes such as simply telling species apart for the convenience of other biologists, is pursued while abandoning the aims of taxonomy as an independent, fundamental science, systematics does not thrive. Instead, it is made less significant, less informative, less interesting, less reliable and less impactful.
For science, society and our intellectual lives, it is imperative that systematics be supported to conduct exploration and research guided by its own goals and enabled by its own theories and methods. Like every fundamental science, systematics is driven by raw, irrepressible curiosity to know what kinds of organisms exist, what makes each kind different from every other, and how the perplexing kaleidoscope of similarities and differences among species makes sense in the light of cladistic relationships, as expressed in Linnaean classifications.
Just having jobs and grant money to gather data does not mean that systematics is thriving. If McDonald’s began dispensing SAE-30 instead of hamburgers, no one would say that it was thriving as a fast-food establishment, no matter how much motor oil it sold. Systematists are in the business of creating knowledge about species and their attributes (all of them, not just their DNA) and their relationships, not simply telling species apart or estimating their possible existence based on degrees of genetic distance.
Unless taxonomists stand up for their science, defend its purpose, aims and independence, and resist pressures to fundamentally change both its mission and methods, then the noble science known to Aristotle, Linnaeus, Darwin, and Hennig will cease to exist, replaced by rote procedures and trivial data unworthy of centuries of tradition and accumulated knowledge.
Frankenstein’s monster, I suppose, could be said to be thriving in the sense that it walks and grunts, but it is no longer the complex, thinking, compassionate, sentient, lovable and wondrous human beings who once occupied the parts of cadavers from which it was stitched together. AI speeding the identification of putative species that are not based on all available evidence, not formulated as testable hypotheses, not described in their full complexity, is not a taxonomy that I recognize, much less one that could be said to be thriving. It is a mere service industry so degraded in intellectual content as to cease being a rigorous or independent science.
As taxonomists embrace AI, as they should, they must integrate it as one of many useful tools and not repeat the tragic mistake of DNA barcoding, blowing it out of proportion, confusing simply telling apart supposed species with the science of taxonomy which pursues all relevant evidence, rigorous theory, and a mission unique among the life sciences. Happily, both systematists and users of taxonomic information are best served when systematics is done to its own high standards. It is time to reject technology bandwagons and simply incorporate useful technologies into the toolbox of taxonomy; to reassert and focus on systematics’ own mission, while spinning off knowledge and information useful to science and society as a by-product.
It is anything but intelligent to conclude that the current state of systematics is good. Great collections are being divorced from the scientists who can best study and use them; major sources of evidence of relationships are ignored in order to profit from fads; institutional leaders with no understanding of taxonomy contribute to its dismemberment; a theory- and evidence-rich science is being diminished to trivial data gathering; and users of taxonomic knowledge unwittingly undermine the reliability of the information they seek by relentless efforts to remake taxonomy in their own image rather than supporting it to build on centuries of proven excellence and advances to be the best science that it can be. As taxonomists we can no longer be silent, compliant participants in the degradation and redefinition of our own science, willingly allowing technology to replace knowledge and the goals of other sciences to supplant those of systematics itself.
References
Russell, K.N., Do, M.T., Huff, J.C. and N.I. Platnick (2007) Introducing SPIDA-Web: Wavelets, Neural Networks and Internet Accessibility in an Image-Based Automated Identification System, pp. 131-152 in N. MacLeod (ed.), Automated Taxon Identification in Systematics: Theory, Approaches, and Applications. Systematics Association Special Volume, Taylor & Francis, London.
Automated identification of species based on images will depend on the species being described and imaged in the first place. The images will have to be fed into whatever AI application emerges--it will not be able to identify a species unless there is an image of it in its memory. Otherwise AI might label as new a species that in fact has been described but for which it does not have an image. It seems to me dubious that a sufficient bank of images can be assembled to make such an application broadly useful. Further, the client who uses it will have to be able to take a quality image of the specimen to be identified. Meanwhile, in the next lab over, an experienced taxonomist is sorting specimens an order of magnitude faster than the application. By the way, whatever happened to SPIDA-Web? I did a Google search for it and found nothing. Could it be that the requirement to have multiple images of each of the tens of thousands of described spider species proved an obstacle too great to overcome?