
12 December 2010

It's just a stage. A phylotypic stage. Part II: The flies

The controversy about the existence of the phylotypic stage is more than some bickering about whether one blobby, slimy fish-thing looks more like a Roswell alien than another one does. It's about whether the phylotypic stage means something, whether it tells us something important about development and how developmental changes contribute to evolution. To answer such a question, we need more than another set of comparisons of the shape and movements of embryos and their parts. We need a completely different way of looking at the phylotypic stage, to see if something notable is going on under the hood. So vertebrates all look the same at the tailbud stage. What does that mean?

Embryos look the way they do because of the positions and behaviors of the cells that make them up. The cells in an embryo all have the same DNA, and the link between that DNA and those specific cell behaviors is the basic process of gene expression. (This is a fundamental principle of developmental biology.) And by gene expression, we usually mean the synthesis of messenger RNA under the direction of genes in the DNA. Different cell types express different sets of genes, and the orchestration of the expression of particular genes at particular times is a big part of what makes development happen. When considering the phylotypic stage, then, developmental biologists wondered: is the apparent similarity of embryos at that stage reflected by similarities in gene expression? Or, more specifically, does the hourglass model hold up when we look at gene expression? This was the focus of the two articles in last Friday's Nature that inspired the cool cover.

10 December 2007

Gene duplication: "Not making worse what nature made so clear"

But he that writes of you, if he can tell
That you are you, so dignifies his story,
Let him but copy what in you is writ,
Not making worse what nature made so clear,
And such a counterpart shall fame his wit,
Making his style admired every where.
--Sonnet 84, The Oxford Shakespeare
One of the most common refrains of anti-evolutionists is the claim that evolutionary mechanisms can only degrade what has already come to be. All together now: "No new information!" It's a sad little mantra, an almost religious pronouncement that is made even more annoying by its religious underpinnings, hidden or overt.
But it's a good question: how do new genes come about?

One major source of new genes is gene duplication, which is as conceptually simple as it sounds. It might seem a little odd, but the duplication of discrete sections of genetic material is commonplace in genomes. In fact, a significant amount of the genetic variation among individual humans is due to copy number variation: variation, from person to person, in the number of copies of particular genes or chunks of genetic material. Genes can be duplicated within a genome via various mechanisms, including the rare but fascinating occurrence of whole-genome duplication. In any case, it is very clear that gene duplication and subsequent evolution explain the existence of thousands of the most interesting genes in animal genomes.

It should be obvious that gene duplication gives you more genes, but perhaps it's not so clear how this can yield something truly new. For many years, new genes were thought to arise after duplication by a process called neofunctionalization. The basic idea is this: consider a gene A, with a set of functions we'll call F1 and F2. Now suppose the gene is duplicated, so that we now have genes A and B, both capable of carrying out F1 and F2. In neofunctionalization, gene B is free to vary and (potentially) acquire new functions, because gene A is still making sure that F1 and F2 are covered. So the duplication has created an opportunity for a little "experimentation." Most of the time, gene B will be mutated into another piece of genomic debris, a pseudogene with no evident function. (The human genome is riddled with pseudogenes, and that's a story all its own.) Occasionally, though, the tinkering will yield a gene with a new evolutionary trajectory. This model makes good sense and surely accounts for numerous genetic innovations during evolution.
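The neofunctionalization scenario above can be sketched as a toy simulation. Everything here is illustrative: the per-mutation chance of stumbling onto a new function, and the number of mutational "attempts" before gene B decays into a pseudogene, are made-up assumptions, not measured rates.

```python
import random

def duplicate_fate(rng, p_new_function=0.001, attempts=100):
    """Toy model of gene B's fate after duplication.

    Gene A is assumed to keep covering F1 and F2, so gene B is free
    to drift. Each mutational attempt either yields a new function
    (rarely) or moves B one step closer to pseudogenization.
    """
    for _ in range(attempts):
        if rng.random() < p_new_function:
            return "neofunctionalized"
    return "pseudogene"

rng = random.Random(1)
fates = [duplicate_fate(rng) for _ in range(10_000)]
print(fates.count("pseudogene"), fates.count("neofunctionalized"))
```

With these arbitrary numbers, roughly nine duplicates in ten end up as pseudogenes, matching the intuition that genomic debris is the most common outcome and a new evolutionary trajectory the rare one.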

But another model has come to the fore in the last several years, in which the two duplicates seem to "divide and conquer." The process is called subfunctionalization, and the idea is straightforward: gene A covers F1, while gene B covers F2. Straightforward perhaps, but this scenario creates some interesting evolutionary opportunities that aren't immediately obvious. Here in this newest Journal Club, I'll look at another example of the experimental analysis of evolutionary principles and hypotheses, summarizing some recent work that examines subfunctionalization in the laboratory.

In the 11 October issue of Nature, Chris Todd Hittinger and Sean B. Carroll examine an actual example of subfunctionalization in an elegant set of experiments that seek to re-create the evolutionary changes that occurred after a gene duplication. Specifically, they looked at the events that led to the formation of a new pair of functionally intertwined genes in yeast. The genes are GAL1 and GAL3, and there are several aspects of this story that make it an ideal system in which to experimentally explore the creation of new genes.
  1. GAL1 and GAL3 arose following a whole-genome duplication in an ancestral yeast species about 100 million years ago. The ancestral form of the gene (see Note 1 at the end of this article) is still present in other species of yeast (namely, those that branched off before the duplication event). This means that the authors were able to compare the new genes (meaning GAL1 and GAL3) and their functions to the single ancestral gene and its functions.
  2. The genomes of these yeast species have been completely decoded, so that the authors had ready access to the sequences of the genes of interest and any DNA sequences in the neighborhood.
  3. Decades of research on yeast have yielded superb tools for the manipulation of the yeast genome. Using these resources, the authors were able to create custom-designed yeast strains in which genes of interest were altered to suit experimental purposes. (Those of us who work in mammalian systems can only dream of being able to do this kind of genetic modification with such ease.)
  4. The biochemical functions of GAL1 and GAL3 were already well known.
Hittinger and Carroll capitalized on this excellent set of tools, and added a key component of their own. They needed a way to measure the fitness of different strains of yeast, namely strains that had been modified to resemble various ancestral forms. But most typical methods for testing gene function are unsuitable for estimating fitness, which is the relevant issue. The question, in other words, is focused not on the ability of a particular protein to perform a particular function, but on the ability of a particular protein to change the fitness of the organism that expresses it. The authors' solution can only be described as elegant: they assessed the fitness of various yeast strains by measuring the outcomes of head-to-head competitions between strains. Their experimental approach, developed by a colleague (see Note 2), employed some very nice genetic tricks and a sophisticated analytical tool called flow cytometry. (Take some time to read about Abbie Smith's research at ERV if you haven't already done so; in her work on HIV, she asks similar questions regarding fitness and uses a very similar approach in seeking answers.)
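To make the competition idea concrete, here is a minimal sketch of how relative fitness is typically extracted from such an assay. This is a generic log-ratio estimator, not the authors' exact protocol; the strain frequencies are assumed to come from flow-cytometry counts at the start and end of the competition.

```python
import math

def selection_coefficient(p0, pt, generations):
    """Per-generation selection coefficient of strain A vs. strain B.

    p0 and pt are strain A's frequency in the mixed culture at
    generation 0 and generation t. s > 0 means strain A outcompetes
    strain B; s = 0 means the strains are equally fit. Uses the
    standard log-odds estimator:
        s = [ln(pt/(1-pt)) - ln(p0/(1-p0))] / generations
    """
    logit = lambda p: math.log(p / (1.0 - p))
    return (logit(pt) - logit(p0)) / generations

# Hypothetical example: strain A rises from 50% to 60% of the
# culture over 20 generations of competition.
s = selection_coefficient(0.50, 0.60, 20)
print(round(s, 4))  # → 0.0203
```

Because the estimator works on frequency ratios, even very small fitness differences become measurable if the competition is allowed to run for enough generations, which is what makes this approach so much more sensitive than assaying each protein's function in isolation.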

Why did the authors choose the GAL1-GAL3 system for close scrutiny? The two genes are critical components of a system in yeast that controls the utilization of galactose (a certain sugar) as an energy source. The GAL1 protein is an enzyme that begins the breakdown of galactose; the GAL3 protein controls the induction of the GAL1 protein. When galactose is present, the GAL3 gene is induced, such that GAL3 protein amounts increase a few fold. The GAL3 protein is in turn a potent inducer of the GAL1 gene: when galactose is present, GAL1 protein levels increase 1000-fold or so. The two proteins are very similar to each other, and both are very similar to the single protein that is found in the genomes of yeasts that never underwent the genome duplication. So this means that the ancestral protein is bifunctional: it must carry out the very different processes of induction and of galactose metabolism. Not surprisingly, situations like this are thought to involve trade-offs between the two different functions of the protein, creating "adaptive conflicts." The reasoning is straightforward: mutations that would improve function A might degrade function B, and vice versa. So the protein is not optimized for either function. There is an adaptive conflict between the two functions. The GAL1-GAL3 system clearly involves subfunctionalization following duplication, and because the ancestral gene is available for comparison, the story invites exploration of the notion of adaptive conflict.

Hittinger and Carroll found that there is indeed an adaptive conflict that was resolved by the evolution of GAL1 and GAL3 following the duplication. But the nature of that conflict is not what some might have predicted. Look again at my description of adaptive conflict above. I focused exclusively on the proteins themselves, claiming that the conflict would arise during attempts to optimize two functions in a single protein. But there's another possibility (that need not exclude the first): perhaps the conflict occurs in the regulation of the expression of those proteins. In the case of GAL1 and GAL3, the two different genes can be turned on and off by two different signaling systems. But in the ancestral situation, there's only one gene and therefore fewer opportunities for diversity in the signaling that leads to expression.

The data presented by Hittinger and Carroll suggest that there is not strong adaptive conflict between the two functions of the ancestral protein. If such a conflict existed, we would expect that changes in GAL1 that make it look more like GAL3 (and vice versa) would cause significant decreases in fitness. But that's not what the fitness analysis showed, and the authors inferred that the adaptive conflict must occur in the arena of regulation, and not in the context of actual protein function. The story is complicated, and I'm not convinced that the authors have ruled out adaptive conflict at the level of the structure of the proteins. Nevertheless, their subsequent experiments demonstrate a clear adaptive conflict in the regulation of expression of the different proteins, and an efficient resolution of that conflict in the subfunctionalization of the two genes following duplication. Those results are strengthened by some detailed structural analysis that seems to account for the physical basis of the optimization that occurred during evolution of the GAL1 and GAL3 genes, optimization that occurred in DNA sequences that control the levels of expression of protein.

If you're a little dizzy at this point, relax and let's zoom out to reflect on this article's significance in evolutionary biology, and its relevance for those who are influenced by the claims of anti-evolution commentators.

First, take note that this article is another example of a sophisticated, hypothesis-driven experimental analysis of a central evolutionary concept. Research like this is reported almost daily, though you'd never learn this by reading the work of Reasons To Believe or the fellows of the Discovery Institute. The mischaracterization of evolutionary biology by the creationists of those organizations is a scandal, and as you might already know, my blog's main purpose is to give evangelical Christians an opportunity to explore the science that is being so carefully avoided by those critics. You don't need to understand sign epistasis or the structure of transcription factors to get this take-home message: evolutionary biologists are hard at work solving the problems that some prominent Christian apologists can't or won't even acknowledge. How does gene duplication lead to the formation of genes with new functions? The folks at the Discovery Institute can't even admit that it happens. Over at Reasons To Believe, they don't mention gene duplication at all, despite their fascination with "junk DNA." That's from a ministry that claims to have developed a "testable model" to explain scores of questions regarding origins.

This makes me mad. No matter what you think of the age of the earth or the need for creation miracles, you should be upset by Christians who mangle science to serve apologetic ends.

Second, it's important to note that Hittinger and Carroll's paper is not merely a significant contribution to our understanding of subfunctionalization. It's also a salvo in an apparently intensifying debate within evolutionary biology regarding the kinds of genetic changes that are more likely to drive evolutionary change. Sean Carroll is one of the leading lights in the new field of evolutionary developmental biology, or evo-devo, and one of the tenets of this upstart school is the claim that most of the genetic changes that lead to adaptation -- and especially to changes in form -- occur in regulatory regions of the genome and not in the genes themselves. (More technically: evo-devo advocates like Carroll postulate that changes in form are more likely to arise from mutations in cis-regulatory regions than in protein-coding sequences within genes.) This assertion is hotly contested, as are many of the other basic views of the evo-devo school. The antagonists include some serious evolutionary biologists, Michael Lynch and Jerry Coyne among them. (Lynch is the guy who took the time to explain why Michael Behe's paper on gene duplication was a joke. Coyne co-wrote the book on speciation, literally.)

I'm a developmental biologist, and therefore partial to many of the arguments of evo-devo thinkers. I'm excited about the union of evolutionary and developmental biology, and I do think that many of the new evo-devo ideas are thought-provoking and potentially fruitful. But the debate is riveting and informative, and I find Lynch and Coyne and their talented colleagues to be alarmingly convincing. I'm worried about some of those cool ideas, but I do take some comfort in this thought: any idea that can survive the onslaught of Lynch and Coyne is a hell of a good idea.

It's easy to see how the disputes spawned by the brash (and perhaps rash) evo-devo folks can lead to innovation and discovery, even if many of their proposals are diminished or destroyed in the process. The disagreement is pretty clear-cut, and both sides seem to agree on how to figure out who's right. They'll go to the lab; they'll perform hypothesis-driven experiments; they'll analyze their data; they'll write up their findings; their work will be subjected to peer review. In other words, they'll do real science.
---
Note 1: The ancestral gene itself, of course, isn't available for analysis. The authors are studying the ancestral form of the gene, using a yeast species that never experienced the whole-genome duplication.
Note 2: As Hittinger and Carroll indicate in the acknowledgments, the experimental design was developed by Barry L. Williams, who was a postdoctoral fellow in Carroll's lab and is now on the faculty at Michigan State. And by the way, this little state of Michigan doesn't have much of an economy, but boy are we crawling with gifted evolutionary biologists.

Article(s) discussed in this post:

  • Hittinger, C.T. and Carroll, S.B. (2007) Gene duplication and the adaptive evolution of a classic genetic switch. Nature 449:677-681.

15 October 2007

How to evolve a new protein in (about) 8 easy steps

If you have only read the more superficial descriptions of intelligent design theory, and specifically the descriptions of irreducible complexity, you might (reasonably) conclude that Michael Behe and other devotees of ID have claimed that any precise interaction between two biological components (two parts of a flagellum, two enzymes in the blood clotting cascade, or a hormone and its receptor) cannot arise through standard Darwinian evolution. (If you don't know anything about the term 'irreducible complexity' you should probably read a little about it before proceeding.) In other words, you may be under the impression that Behe doesn't think that such a system could arise through a stepwise process of mutation and selection. You may even be under the impression that Behe has demonstrated the near impossibility of such a system coming to be through naturalistic means.
This article was UPDATED on 1 November 2007, incorporating some corrections and clarifications provided by the senior author of the studies described. In other words, this post was peer reviewed, and this is the final version.

You would be mistaken, albeit (in my opinion) understandably so. Behe has not claimed this -- though he's often come pretty close -- and recently he has made it clear that this is not his position. Unfortunately, many of the critiques of irreducible complexity contain significant errors, including the claim that Behe rejects all stepwise accounts of molecular evolution, and you have to look pretty hard to find well-reasoned examinations of the problems with Behe's interesting but fruitless challenge to evolutionary theory.

My purpose in the preamble above is to make it clear that this Journal Club is not intended to refute Behe's claims regarding the ability of Darwinian mechanisms to generate irreducibly complex structures. (In my view, his claims are wholly mistaken, and Christian enthusiasm for his natural theology is a disastrous mistake. But that's for another time.) Rather, it is to discuss a superb recent example of the kind of experimental molecular analysis of evolution that can be done in this postgenomic era. Experiments like this are revealing how evolutionary adaptation actually comes about at the molecular level, thereby addressing the very questions raised by ID thinkers. ID apologists are, in a sense, wise to attack the work described here, because these experiments are the first fruits of the types of analysis that will usher ID into permanent scientific ignominy.

So, to our two papers.

How, exactly, does a protein acquire a new function during evolution? This is one of those "big questions" in evolutionary biology. Broad concepts such as gene duplication are quite helpful in formulating explanations, but the specific question raised is focused on the details -- the actual steps -- that must occur during the step-by-step modification of a protein such that it performs a different job than the proteins from which it has descended. The constraints on the process of change are significant, and the issues are similar to those I discussed when describing the concept of fitness landscapes in morphospace. The problem, basically, is this: how can you change a protein without wrecking it in the process? In other words, can you get from function A to function B, step by step, without passing through an intermediate form, call it protein C, which is worthless (or even harmful)?

These are precisely the questions addressed in an elegant set of experiments described in two papers over the last year or so. The second article, by Ortlund et al., was reported in the 14 September issue of Science, and built on work reported in Science in April 2006. Their studies focused on two closely-related proteins that are receptors for steroid hormones. In this case, the steroids of interest are corticosteroids (the kind often used to treat inflammation; Ortlund et al. studied receptors for cortisol, which is of course quite similar to cortisone) and a mineralocorticoid (a less well-known hormone, aldosterone, that regulates salt and water balance). The hormones are structurally similar (being steroids).

Joseph Thornton, at the University of Oregon, has been studying the origins of these receptors for about 10 years, and has assembled an interesting (and detailed) account of their history. The basic outline is as follows: the original steroid receptor was an estrogen receptor, and is extremely ancient, apparently arising "before the origin of bilaterally symmetric animals" (Thornton et al., Science 2003). (That's seriously ancient, sometime in the Cambrian or earlier.) The progesterone receptor seems to have arisen next, followed by the androgen (i.e., testosterone) receptor. (Now that's intriguing.) Fairly late in this game, the two receptors of interest to us here, the corticosteroid receptor and the mineralocorticoid receptor, were added to the vertebrate repertoire. The two modern receptors are thought to descend from an ancestral corticosteroid receptor, which underwent a gene duplication. Hereafter, I'll refer to the receptors as the corticosteroid receptor and the aldosterone receptor, hoping that all the jargon won't obscure the message.

In a widely-discussed paper published in Science a year ago (Bridgham et al., Science 2006), Thornton's group determined the most likely DNA sequence of this ancestral gene, then "resurrected" it, meaning simply that they created that very DNA sequence in the lab. (Determining the ancestral sequence was a nifty piece of work; actually making the DNA is quite straightforward, especially if you have a little dough.)

Their experiments showed that the ancestral receptor could bind to a hormone that didn't exist yet (aldosterone) while it was functioning as a receptor for corticosteroids. In other words, the receptor was available for activation by aldosterone long before aldosterone was around. (All jawed vertebrates make corticosteroids, but only tetrapods make and use aldosterone, an innovation that occurred at least 50 million years later.) The modern corticosteroid receptor has since lost its ability to interact with aldosterone, and Bridgham et al. chart the most likely evolutionary path, at the molecular level, by which we and other tetrapods came to have a corticosteroid receptor that won't bind to aldosterone. The surprising result, however, is the fact that the ancient receptor was able to bind aldosterone, millions of years before aldosterone is thought to have been present.

The 2006 paper is, I think, more notable as an illustration of an important evolutionary principle ("molecular exploitation" is the authors' term) than as a set of observations; Michael Behe's trashing of the group's work is disgusting, but it's true that the findings are limited in scope. It's worth having a look at the whole paper, though (and I believe it's available with free registration), because the authors very clearly explain the rationale for their continuing work, which is to begin to address one of the major "gaps in evolutionary knowledge": the mechanisms underlying stepwise evolution of "complex systems that depend on specific interactions among the parts."

If you're well-read on ID thought, that last sentence should sound pretty familiar. So let's note that prominent papers in science's premier journals are acknowledging that the evolutionary mechanisms that generate complex structures -- including "irreducibly complex" systems -- are as yet poorly understood. And let's give ID credit for asking a good question. (Not a new one...but a good one.)

The 2006 paper did not, as advertised, utterly destroy ID arguments, and again Behe is right to criticize the near-hysteria surrounding that work. But I find Behe's bravado otherwise unconvincing. Because that paper did set up the most recent work, and the whole story illustrates rather clearly how ID's question will (soon) be answered.

The most recent paper adds significantly to the picture, and introduces some genetic concepts that Behe's fans should pray he understands. The authors (Ortlund et al.) took their analysis to a far more detailed level, by extending their previous observations to include much more of the receptor family tree. In the 2006 work, they had assembled a detailed family tree for the receptors, by looking at DNA sequences from living species known to represent various branches on the tree of life. In other words, they chose organisms such as lampreys, bony fish, amphibians and mammals, and examined their DNA codes (for the receptors) to find the changes that occurred in each branch of the lineage.

Now, please stop and think about this, because it's really cool. What the authors did was mine existing databases of DNA sequence data, pulling out the sequences of the steroid receptors from 29 different vertebrate species. You could repeat this part of the experiment right now, by referring to their list of organisms in Supplemental Table S5, which provides the ID codes needed to locate the DNA sequences in the Entrez Gene database. Then they charted the changes in the DNA sequence in the context of the tree of life as sketched out in the fossil record. The tree they assembled includes all the steroid receptors, and I've annotated it a little if you want to have a look. They used this tree to guide their further experiments, as I'll explain below.

What the most recent paper added to the story was an analysis of the 3-D structure of the various postulated intermediates in the evolutionary pathway. The authors accomplished this by making proteins from the "resurrected" genes, then crystallizing them and using X-ray diffraction techniques to determine their precise structures.

Examination of their receptor family tree revealed something interesting. Most vertebrates have highly specific receptors: the corticosteroid receptor isn't strongly stimulated by aldosterone, and vice versa. But some living vertebrates (skates, in particular) show a different pattern: the corticosteroid receptor isn't all that specific for cortisol. Because the ancestral receptor also lacked specificity (as shown in the 2006 paper), the authors concluded that the receptor acquired its discriminating taste at some point between the branching-off of skates (and their kin) and the separation of fish from tetrapods. Their Figure 1 is a little crowded, but it illustrates this nicely:


To follow the evolutionary narrative in this graph, start at the blue circle, which represents the ancestral receptor that was "resurrected" in the 2006 paper and that happily binds to both corticosteroids and aldosterone. (The graphs on the right side of the figure demonstrate the specificity, or lack thereof, of the receptors at different times in history.) There's a branch leading up and to the left, to the various GRs (corticosteroid receptors), and one leading up and to the right, to the MRs (aldosterone receptors). At the green circle, another branching event occurred, 440 million years ago, at which point certain groups of fishes (skates among them) branched off, up and to the right. The receptor at that point is an ancestral corticosteroid receptor, and it still isn't specific for corticosteroids. But the receptor at the yellow circle, in the common ancestor of tetrapods and bony fishes, is specific. The authors conclude that specificity arose between those two points, between 420 and 440 million years ago. With some (deliberate?) irony, they indicate that process with a black box.

The rest of the paper explores the pathway by which the receptor might have been successively altered so as to install specificity for cortisol. During those 20 million years of evolution, at least 36 different changes were introduced in the makeup of the receptors. By looking at the 3-D structures of the ancestral forms, the authors were able to discern the specific functional ramifications of these various changes, and they found that the alterations fell into three groups:
  • Group 'X' alterations included the changes reported in the 2006 article. These are the biggies, the ones that account for much of the functional 'switch' between GRs and MRs. These alterations don't account for the specificity change that occurred inside the black box in Figure 1.
  • Group 'Y' alterations are all strongly conserved (meaning that they were permanent changes), and occurred during the black box time period. Moreover, this group of changes is always seen together: modern receptors have all of these alterations, while ancestral receptors have none of them.
  • Group 'Z' alterations are also conserved changes, but they don't always occur together like group 'Y'.
The authors set about the work of examining the function of "resurrected" receptors bearing these groups of changes. When they introduced group 'X' changes into the ancestral receptor, they got a receptor that was almost modern (i.e., specifically tuned to cortisol) but not quite; this was what the previous work had indicated. Then they hypothesized that the group 'Y' changes, because they were so highly conserved and because they all occurred together, would make the transition complete. But no: instead, the group 'Y' alterations made the receptor worthless, unable to bind any hormone at all. Surprise! Looking at their 3-D structures, they figured out what this meant. The group 'Y' changes were somehow important, but they could only have a beneficial influence in the presence of another set of alterations, group 'Z', which had to occur in advance. The biophysical details don't concern us, but the basic idea is that the group 'Z' changes created a permissive environment for the group 'Y' changes, which are the alterations that complete the development of the modern specific form of the receptor for cortisol.

In genetics, we have a word for this type of interaction between genetic influences: epistasis. The fascinating history of steroid receptor evolution includes examples of what the authors call "conformational epistasis," meaning that some alterations in 3-D structure are required in advance for other alterations to ever get off the ground. Specifically, some alterations are evolutionary dead ends, because they yield worthless proteins, unless those alterations follow another set of changes that generated a different -- and more fruitful -- environment.

The authors then construct a map of what they call "restricted evolutionary paths through sequence space," showing how you can get there from here, without traversing an evolutionary no-man's-land of non-function. The path includes changes that don't apparently improve the receptor, but that yielded the right environment for the changes that did improve function. Their map is in Figure 3:


The idea is that you want to get from the lower left corner of the cube (the ancestral receptor) to the upper right corner (the modern receptor) without hitting a stop sign (a worthless receptor). The green arrows indicate a change in function of some kind, the white arrows indicate no change. Yes, you can get there from here.
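The cube-walking logic is easy to play with in code. Here is a toy version: three substitutions take an ancestral genotype (000) to the modern one (111), one substitution at a time, and some intermediate combinations are nonfunctional dead ends that block the path. The blocked genotypes below are invented for illustration, not taken from the paper's data.

```python
from itertools import permutations

# Hypothetical dead-end genotypes (the "stop signs" on the cube).
BLOCKED = {"010", "110"}

def accessible_paths(blocked):
    """Enumerate orders of the 3 substitutions that avoid dead ends.

    Each path starts at 000, flips one site per step, and is kept
    only if no intermediate genotype falls in the blocked set.
    """
    paths = []
    for order in permutations(range(3)):
        genotype = ["0", "0", "0"]
        steps, ok = ["000"], True
        for site in order:
            genotype[site] = "1"
            state = "".join(genotype)
            if state in blocked:
                ok = False
                break
            steps.append(state)
        if ok:
            paths.append(steps)
    return paths

for path in accessible_paths(BLOCKED):
    print(" -> ".join(path))
```

With this particular blocked set, three of the six possible substitution orders get through; change BLOCKED and the count changes, which is the whole point of "restricted evolutionary paths through sequence space."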

The authors note that their data "shed light on long-standing issues in evolutionary genetics," first among them the question of whether adaptation proceeds through "large-effect" changes (mutations) or through baby steps. Their conclusion:
Our findings are consistent with a model of adaptation in which large-effect mutations move a protein from one sequence optimum to the region of a different function, which smaller-effect substitutions then fine-tune; permissive substitutions of small intermediate effect, however, precede this process.
They note that the large-effect changes are inherently easier to identify (of course), and that the painstaking work of "resurrecting" the ancestral proteins and studying their function is the only way to identify the critical small-effect alterations that made the "big jump" work.

The authors also comment on the big "contingency" debate. I'll write more on the whole "rewinding the tape of life" question some other time; for now, we'll just consider the authors' words:
A second contentious issue is whether epistasis makes evolutionary histories contingent on chance events. We found several examples of strong epistasis, where substitutions that have very weak effects in isolation are required for the protein to tolerate subsequent mutations that yield a new function. Such permissive mutations create “ridges” connecting functional sequence combinations and narrow the range of selectively accessible pathways, making evolution more predictable.
If you have read my summary of the wormholes in morphospace story, this metaphor of "ridges" should make a little sense. The authors here are describing the same concept: an evolutionary exploration of a design space, with paths meandering through a map of the possibilities. But:
Whether a ridge is followed, however, may not be a deterministic outcome. If there are few potentially permissive substitutions and these are nearly neutral, then whether they will occur is largely a matter of chance. If the historical “tape of life” could be played again, the required permissive changes might not happen, and a ridge leading to a new function could become an evolutionary road not taken.
The history of the steroid hormone receptor, then, appears to include several different aspects of evolutionary biology combined: "chance" creating opportunity, leading (via epistasis) to selection for improvement, all done step by step, with some steps generating more apparently dramatic change than others.

Amazingly, Michael Behe is pretending that this analysis is utterly unimportant, with no implications at all for ID proposals, because the receptor-hormone system isn't "irreducibly complex." Some critics of ID claim that the goalposts are being regularly moved, and I'm inclined to agree. But let's just grant Behe the difference between protein-hormone interactions and protein-protein interactions. Does anyone really believe that Joseph Thornton's work doesn't show us exactly how the "irreducible complexity" challenge is going to fare in the near future?

30 August 2007

Which came first, the bird or the smaller genome?

ResearchBlogging.orgIt’s easy to think of a genome as a collection of genes, perhaps because so many of the metaphors used to explain genes and genomes (blueprint, book of life, Rosetta Stone) can give one the impression that everything in a genome is useful or functional. But genomes are, in fact, packed with debris. Many genomes contain huge collections of fossil genes: genes that have been inactivated by mutation but never discarded, sort of like the old cheap nonfunctional VCRs in my basement. And many genomes contain even more massive collections of another kind of fossil-like DNA: mobile elements, or their remnants. The human genome, for example, contains over 1 million copies of a single type of mobile genetic element, the Alu element (a retrotransposon). Together, the various types of mobile genetic elements make up nearly half of the human genome.

Think about that. Almost half of the human genome is made up of known mobile elements, pieces of DNA that can move around, either within a genome or between genomes with the help of a virus. This extraordinary fact -- and many of the specifics surrounding it -- constitutes one of the most compelling sources of evidence in favor of common descent, the kind of data for which only common ancestry provides a complete (or even reasonable) explanation. I’ll come back to this topic regularly.

Now it turns out, not surprisingly, that differences in genome size among types of organisms are determined primarily by the numbers of these mobile elements, and not by the number of genes. In fact, there is wild variation in genome size among types of organisms, and the variation has little to do with the numbers of genes expressed by those organisms. Consider birds, the subject of this week’s Journal Club (“Origin of avian genome size and structure in non-avian dinosaurs,” Organ et al., Nature 446:180-184, 8 March 2007).

Birds have remarkably small genomes, averaging one-third to one-half the size of typical mammalian genomes. (The chicken genome, for example, is less than half the size of the mouse genome.) Why might this be? The authors point to two important ideas. First, the chicken genome has been fully sequenced and analyzed, and it contains far less of the debris mentioned above: the processes that create (or multiply) mobile genetic elements are significantly less active in birds than in mammals and other vertebrates. Second, small genome size is intriguingly correlated with flight. Bats have small genomes compared to other mammals, and flightless birds have larger genomes compared to other birds. This has led to the proposal that a small genome might offer a selective advantage to flying animals, by reducing the energy cost of hauling all that debris around. So it seems that a smaller genome is advantageous for flying vertebrates, and that genome size can be reduced by restraining the production of mobile genetic elements. This raises several interesting questions, including this one: did the reduction in genome size accompany the origin of bird flight, or did it happen in advance? In other words, we can propose at least two alternative scenarios:
  • 1) flight drove the genome change, by favoring small genomes, or
  • 2) the genome change happened first, and helped to get flight off the ground. ;-)
How can we even hope to distinguish between these possible explanations? We would need, somehow, to look at the genomes of the ancestors of birds. And all evidence indicates that the relevant ancestors of birds are dinosaurs; in fact, today's birds are considered to be flying dinosaurs. The recent description of protein sequences from T. rex bone provided strong confirmation of the birds-from-dinosaurs hypothesis, but no DNA was recovered from the samples, and no information about genome structure can be inferred from those otherwise fascinating studies. If only, à la Jurassic Park, we could get some dino DNA...

Enter Organ et al. with a wonderfully creative idea. It turns out that, in organisms alive today, cell size is strongly correlated with genome size. In other words, organisms with large genomes tend to have larger cells. This relationship was first described in red blood cells, but Organ et al. show that it also holds in bone cells. Using bones from living species, they created a statistical model that enabled them to infer genome size from the size of bone cells. Then they combined their model with measurements of bone cell size from fossilized bones of long-extinct animals, and were able to estimate the genome size of dozens of extinct species, including 31 dinosaur species and several extinct bird species. Their results are remarkable: small genomes are found in the entire lineage (with one interesting exception, Oviraptor) that gave rise to birds, all the way back to the theropod dinosaurs that are the typical reference point in the dinosaur-to-bird story. Here's how the authors put it: "Except for Oviraptor, all of the inferred genome sizes for extinct theropods fall within the narrow range of genome sizes for living birds." Even if you don't have access to Nature, you can have a look at the cool family tree in Figure 2, which shows small genomes in red and larger ones in blue. It's a compelling image.
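The basic logic of the inference is just regression: calibrate the cell-size/genome-size relationship in living species, then read fossil cell sizes through the fitted line. Here's a toy sketch of that idea; the species data points are made-up numbers for illustration, not the measurements from the paper (which also used more sophisticated phylogenetic methods).

```python
# Toy calibration data: (bone-cell lacuna size in cubic microns,
# genome size in picograms). Values are invented for illustration.
living_species = [
    (200.0, 1.2),  # small-genomed, bird-like
    (250.0, 1.5),
    (400.0, 2.8),
    (500.0, 3.4),  # larger-genomed, mammal-like
    (650.0, 4.5),
]

def fit_line(points):
    """Ordinary least-squares fit of y = a + b*x."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    b = (sum((x - mx) * (y - my) for x, y in points)
         / sum((x - mx) ** 2 for x, _ in points))
    a = my - b * mx
    return a, b

a, b = fit_line(living_species)

def predict_genome_size(cell_size):
    """Infer genome size from a (possibly fossil) bone-cell measurement."""
    return a + b * cell_size

# A fossil theropod osteocyte in the small range predicts a bird-like genome.
print(round(predict_genome_size(230.0), 2))
```

The payoff is that bone cells fossilize well: the calibration is built entirely on living species, but the prediction step needs nothing from a fossil except a cell-size measurement.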

The results suggest that small genomes arose long before dinosaurs took to the air, and raise some interesting questions about the interplay of physiological function (e.g., energy consumption associated with flight) and genome structure. Certainly scenario #1 above is not favored by these findings: flight apparently arose in organisms that already had much smaller genomes than many of their earthbound cousins. The relationship between flight and small genome size, then, remains unclear and even mildly controversial. Organ et al. acknowledge that the two characteristics did not arise together, but after reference to the larger genomes in flightless birds, they conclude their paper by noting that "the two may be functionally related, perhaps at a physiological level." And they postulate that small genome sizes may have been favored by warm-bloodedness and its associated energetic demands. But a minireview of the paper raises several criticisms of these hypotheses, and it is clear that the evolutionary forces acting on genome size are complex and yet poorly understood.

Notwithstanding the unanswered questions regarding genome evolution, this paper is the kind of scientific article that should be carefully considered by those who deny common descent. Following are some aspects of the story that create interesting questions for creationists and/or design advocates.

Consider the results presented in Figure 2. Outside of common ancestry, how are we to account for these data? The strong correlation between flight and small genome size in living organisms might look like some kind of "design" to someone who favors that sort of thinking, but Organ et al. have conclusively uncoupled genome size and flight. Of course those of us who see the universe as a creation will be happy to marvel at the advantages presented by small genomes to flying organisms, and perhaps we'll all think of these wonders as evidence of brilliant "design." But it seems to me that "design" does not serve a significant explanatory role here. On the contrary, I maintain that the work of Organ et al. demonstrates the following: in dinosaur lineages, the best way to predict genome size in an extinct species is to know the ancestry of the species. Common design doesn't help. Common descent explains the pattern.

And yet, I think it gets much worse than that for anti-evolution thinkers. I regularly see certain old-earth creationists (e.g. the folks at Reasons To Believe) and design proponents (e.g. William Dembski) arguing that "junk DNA" (which includes, but is not limited to, the 45% of the human genome composed of mobile elements and their debris) is not "junk" but can have important functions. (The arguments of these critics are flawed in several ways, which I'll detail some other time.) While it's true that mobile elements have contributed to the formation of new genes from time to time, and are thought to be significant sculptors of genomic evolution, it's also true that mobile elements are indiscriminate in their jumping, and their continued hopping about is a documented cause of harmful mutation. Here, though, is a significant quandary for a design advocate considering a bird genome: if these mobile elements have important functions in the organism, then how is it that birds can get by with 1/4 as many of them as, say, squirrels? Why, if these elements have important functions in the organism, do bats seem to need far fewer of them than, say, rats? (The genome of the big brown bat is 40% the size of the genome of the aardvark. Hello!) It seems to me that these facts are best understood when one considers the possibility that most of this DNA is essentially parasitic, and that some types of organisms have benefited by restraining its spread. A "design" perspective with regard to genome size is just not helpful, and if that perspective insists on excluding common ancestry, then it's worse than worthless.

Article(s) discussed in this post:

  • Organ, C.L., Shedlock, A.M., Meade, A., Pagel, M. and Edwards, S.V. (2007) Origin of avian genome size and structure in non-avian dinosaurs. Nature 446:180-184.