Friday, April 26, 2013

Science on the Desktop

For decades, I've been hoping I'd live long enough to see a day when serious science could be done on the desktop by dedicated amateurs. Amateur astronomers know what I'm talking about. You can't do much particle physics on the desktop, and there are no affordable desktop electron microscopes (yet), but if comparative genomics is your thing? Get ready to rock and roll, my friend.

Over the weekend I discovered http://genomevolution.org and promptly went nuts. Let me take you on a tour of what's possible.

First I should explain that my background is in microbiology, and I've always had a soft spot in my heart (not literally) for organisms with ultra-tiny genomes: things like Chlamydia trachomatis, the sexually transmitted parasite. It's technically a bacterium, but you can't grow it in a dish. It requires a host cell in which to live.

It turns out there are many of these itty-bitty obligate endosymbionts (at least a dozen major families are known), and because of their small size and obligate intracellular lifestyle, they have a lot in common with mitochondria. Which is to say, like mitochondria, they're about a micron in size, they divide on their own, they have circular DNA, and they provide services to the host in exchange for living quarters.

When you look at one of these little creatures under the microscope (whether it's Chlamydia or Ehrlichia or Anaplasma or what have you), you see pretty much the same thing. (See photo.) Namely, a tiny bacterium living in cytoplasm, mimicking a mitochondrion.

When Lynn Margulis wrote her classic 1967 paper suggesting that mitochondria were once tiny bacterial endosymbionts, it seemed laughable at the time, and her ideas were widely criticized (in fact her paper was "rejected by about fifteen journals," she once recalled). Now it's taught in school, of course. But we have a long way to go before we understand how mitochondria work. And we really, really need to know how they work, because for one thing, mitochondria seem to be deeply involved in orchestrating apoptosis (programmed cell death) and various kinds of signal transduction, and until we understand how all that works, we're going to be hindered in understanding cancer.

When I discovered the tools at http://genomevolution.org, one of the first things I did, on a what-the-hell basis, was compare the genomes of two small endosymbionts, Wolbachia pipientis and Neorickettsia sennetsu. The former lives in insects; the latter, in flatworms that infect fish, bats, birds, horses, and probably lots else. Note that for a horse to get Potomac horse fever, first the Neorickettsia has to infect a tiny flatworm; then the flatworm has to be ingested by a dragonfly, caddisfly, or mayfly; then the horse has to eat the worm-infected fly (or maybe be bitten by it, although only infection-by-ingestion has been demonstrated). The parasite-of-a-parasite chain of events is not only fascinating in its own right, it suggests (to me) that parasites enable each other through shared strategies at the biochemical level. I might as well spoil some suspense here by revealing that there's yet another layer of parasitism (and biochemical enablement) going on in this picture, involving viruses. But we're getting ahead of ourselves.

I mentioned Wolbachia a second ago. Wolbachia is a fascinating little critter: it's found in the reproductive tract of anywhere from 20% to 70% of all insect species (plus an undetermined number of spiders, mites, crustaceans, and nematodes), yet it doesn't cause disease, and in fact it appears many insects are unable to survive without it. Wolbachia is unusual in that the extracellular phase of its lifecycle (the part where it spreads from one host to another) isn't known; no one has observed it. What's more (and this part is incredible), Wolbachia has adapted to a stem-cell niche: It lives in the cells that give rise to insect egg cells. Thus, all newborn female progeny of an infected mother are infected, and all eggs pass on the Wolbachia. In this sense, Wolbachia follows the same maternal inheritance pattern as mitochondria (whereby the mother passes on the organelle and its genome).

I quickly found, via Sunday afternoon desktop genomics, that Wolbachia and Neorickettsia (and other endosymbionts: Anaplasma, Ehrlichia, etc.) have many genes in common—hundreds, in fact. And when I say "genes in common," I mean that the genes often show better-than-50% similarity in DNA base-pair matching.
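For anyone wondering what a number like "better than 50%" actually measures, here's a minimal sketch in Python. It assumes the two sequences have already been aligned (gaps shown as "-") and counts the share of compared positions where the bases agree. The function name and the toy sequences are made up for illustration; real tools may define the denominator differently (for example, by counting gap columns), so treat this as one reasonable convention rather than what any particular site computes.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of non-gap alignment columns where both sequences share a base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where neither sequence has a gap
    matches = 0   # compared columns where the bases are identical
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gap columns (an assumption; conventions vary)
        compared += 1
        if base_a == base_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy, invented fragments (not real Wolbachia or Neorickettsia sequence).
seq_a = "ATGGCGTTAACC-GTA"
seq_b = "ATGACGTTTACCGGTT"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Two random DNA sequences of similar base composition would match at roughly a quarter of positions by chance, which is why matches above 50% across a whole gene are worth noticing.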

It's important to put some context on this. These little organisms have DNA that encodes only about 1,000 genes. (By comparison, E. coli has around 4,400 genes.) Endosymbionts lack genes for common metabolic pathways. They cannot biosynthesize amino acids, for example; instead they rely on the host to provide such nutrients ready-made. If 400 to 500 of an endosymbiont's roughly 1,000 genes are shared across major endosymbiont families, that's a huge percentage (40% to 50%). It suggests there's a set of core genes, numbering in the low hundreds, that encapsulate the basic "strategy" of endosymbiosis.

A little more context: Mitochondria have their own DNA and look a lot like endosymbionts. But here's the thing: Mitochondrial DNA is tiny (only about 15,000 base pairs, versus a million for an endosymbiont). It turns out, 97% of the "stuff" that makes up a mitochondrion is encoded in the nucleus of the host. If you include these nuclear genes, mitochondria actually rely on about 1,000 genes total, of which only 3% are in the organelle's DNA. Lynn Margulis would say that what happened is, the endosymbiont ancestor of today's mitochondrion originally had DNA of about a million base-pairs (1,000 genes), but some time after taking up residency in the host cell, the invader's DNA mostly migrated to the host nucleus.

Why did symbiont-to-host DNA migration stop at 97%? Why not 100%? If we look at that 3%, we find genes coding for tRNA and bacterial ribosomes (specialized protein-making machinery) plus genes for enormous, complex transmembrane enzyme systems: cytochrome c oxidase and NADH dehydrogenase. (The former is the endpoint of oxidative respiration; the latter the entry-point.) Obviously it must be advantageous for these genes to be proximal to the organelle.

But why even have an organelle (a physical compartment)? One might ask why it's necessary to have a mitochondrial parasite swimming around in the cytoplasm at all, when most of the genes are part of the host's DNA? The answer is, the stuff that goes on inside the confines of the mitochondrion needs to be contained, because it's violently toxic stuff involving superoxide radicals, redox reactions, "proton pumps," and Fenton chemistry (transition-metal peroxide reactions). A containment structure is definitely called for, to segregate this toxic chemistry from the rest of the cell.

We might ask how it is that the DNA of the protobacterial ancestor of today's mitochondria wound up in the host nucleus in the first place. Let's consider the possibilities. Protobacterial (symbiont) DNA may have transferred to the host all at once, or it might have migrated piecemeal, over time. Or both. Is it realistic that huge amounts of endosymbiont DNA could have migrated to the host nucleus all at once? Yes. It's been suggested that vacuolar phagocytosis drove invader DNA to the nucleus in a big gulp. Evidence? Wolbachia inhabits the vacuolar space.

But export of genes and gene products to the host might have occurred piecemeal as well. A little desktop exploration provides some clues. If you use GenomeView or any number of other online tools to explore the DNA of Wolbachia, several things pop out at you. First is that many Wolbachia genes are mitochondria-like: They encode things like cytochrome c oxidase, cytochrome b, NADH dehydrogenase, succinyl-CoA synthetase, Fenton-chemistry enzymes, and a slew of oxidases and reductases (including a nitroreductase). Wolbachia is clearly engaged in providing what might be called redox-detox services for the host, which is the same value proposition that mitochondria offer. This makes sense, because if Wolbachia cells were a net drag on the respiratory potential of host-cell mitochondria (if they couldn't at least hold their own with respect to mitochondria), the host would die.

The second thing that jumps out at you when you look at the Wolbachia genome is the abundance of genes devoted to export processes: membrane proteins, permeases, type I, II, and IV secretion systems, ABC transporters, etc., plus at least 60 ankyrin-repeat-domain genes. All of this is strong evidence of specializations aimed at export of genes and gene products to the host. But the most stunning "smoking gun" of all is the presence, in Wolbachia DNA, of five reverse-transcriptase genes, plus genes for resolvases, recombinases, transposases, DNA polymerases, RNA polymerases, and phage integrases. In essence, there's a complete suite of retroviral machinery, designed for export of foreign DNA into host DNA.

An example of one of 113 phage-derived genes in Wolbachia (lower gene array). In this case, the gene matches a phage gene found in Candidatus Hamiltonella (upper gene array). The two homologs exhibit 59% DNA sequence similarity, despite widely differing GC ratios. See text for discussion.

But wait. There's more. The third thing that jumps straight in your face when you start looking at the Wolbachia genome is the presence of (are you ready?) no fewer than 113 genes for phage-related proteins, including major and minor capsid and HK97-style prohead proteins, plus tail proteins, baseplate, tail tube, tail tape-measure, and sheath proteins; late control gene D; phage DNA methylases; and so on. (For non-biologists: phage is the term for viruses that attack bacteria.)

In the above screenshot, I'm comparing Wolbachia DNA (lower strip) to DNA from another insect-infecting endosymbiont, Candidatus Hamiltonella, which is known to contain an intact virus (phage) in its DNA. Many phage proteins in Wolbachia have corresponding matches in the Hamiltonella genome. In this case, we're looking at a gene (the gold-colored stretch pointed at by red arrows) that is 1440 nucleotides long, with a 59% sequence match across genomes. The match percentage is remarkably high given that the Hamiltonella version of this gene has a 51.7% GC content while the Wolbachia version has a 40.6% GC. Also, note that Wolbachia itself has an overall GC of 34.2%. The fact that Wolbachia's putative phage genes are significantly higher in GC content than Wolbachia's non-phage genes is good confirmation that the genes really are from phage.
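If you'd like to check GC figures like these yourself, the calculation is simple enough to sketch in a few lines of Python. GC content is just the percentage of bases in a sequence that are G or C. The function name and the toy sequence below are invented for illustration, and ignoring ambiguous bases such as N is my own assumption; the percentages at the end are the ones quoted above, not values computed from real sequence data.

```python
def gc_content(sequence: str) -> float:
    """Percent of unambiguous bases (A, C, G, T) that are G or C."""
    # Ignore anything that is not A, C, G, or T (e.g. N); this is an assumption.
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc_count = sum(1 for b in bases if b in "GC")
    return 100.0 * gc_count / len(bases)


# Toy, invented fragment just to show the call.
print(f"{gc_content('ATGCGCAT'):.1f}% GC")

# Figures quoted in the text: the putative phage gene versus the genome overall.
wolbachia_gene_gc = 40.6
wolbachia_genome_gc = 34.2
gap = round(wolbachia_gene_gc - wolbachia_genome_gc, 1)
print(f"Phage-like gene is {gap} percentage points above the genome average")
```

A gene whose GC content sits several points away from the rest of the genome is a common hint that it arrived from somewhere else, which is the reasoning used in the paragraph above.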

It's 100% clear that viral DNA has made its way into the DNA of Wolbachia (either recently or long ago), and it's reasonable to hypothesize that Wolbachia has repurposed the retrovirus-like phage genes for packaging and exporting Wolbachia DNA to the host nucleus.

Okay, so maybe you have to be a biologist for any of this stuff to make your hairs stand on end. To me, it's a dream come true to be able to do this kind of detective work on a Sunday afternoon while sitting on the living-room couch, using nothing more than a decrepit five-year-old Dell laptop with a wireless connection. The notion that you can do comparative genomics and proteomics while watching an Ancient Aliens rerun on TV is (for me) totally cerebrum-blowing. It makes me wonder what's just around the corner.