Today I am at home, theoretically packing my van in order to start moving house at 10am. What this actually means, of course, is that I’m enjoying the chance to still be in my bedroom at 8:30am with a cup of tea and the science news, including this super interesting story.
A collaboration of scientists from Edinburgh, Cambridge, Cork, Utah and Seattle, funded by the BBSRC, has discovered firstly that the flu virus has one more gene than they were expecting, and secondly that the allelic differences in this gene control how the host (i.e. you and me) responds to contracting the virus. Now you may be thinking ‘The influenza A virus genome is only 14,000 bases long! [because you’re a massive geek like me and know things like that…] How can it possibly a) code for 13 genes in the first place and b) have a spare one hiding that nobody has noticed?!’
There are two ways in which a gene can be hidden in plain sight. The first is called alternative splicing and doesn’t really give you a new gene per se, but it can still change the biochemistry of a cell quite drastically. Genes don’t actually come in one big block. They are broken up into pieces called exons, which are separated in the genome by pieces of non-coding DNA called introns. When it’s time to make a protein, the entire sequence is transcribed (copied) and the introns are then spliced (cut) out. Typically a gene might have around 5 exons (although some have many more – the one I work with has 14!) but sometimes there are exons that can be added in or left out depending on the final protein being made.
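If you like to see this sort of thing as code, here is a toy Python sketch of splicing. Everything in it is made up for illustration: the sequence, the `SEGMENTS` list and the `splice` function are my own inventions and not a real gene or a real bioinformatics tool. It just shows how leaving one exon out gives you a different final message from the same stretch of DNA.

```python
# Toy gene: exons in capitals, introns in lower case.
# The sequence is invented purely for illustration.
SEGMENTS = [
    ("exon", "ATGGCC"),
    ("intron", "gtaagtttag"),
    ("exon", "AAAGGG"),
    ("intron", "gtcccctag"),
    ("exon", "TTTTGA"),
]


def splice(segments, skip=()):
    """Join the exons in order, dropping every intron.

    `skip` holds the 0-based numbers of any exons to leave out,
    which is a crude stand-in for alternative splicing.
    """
    mature = []
    exon_number = 0
    for kind, seq in segments:
        if kind != "exon":
            continue  # introns are always cut out
        if exon_number not in skip:
            mature.append(seq)
        exon_number += 1
    return "".join(mature)


print(splice(SEGMENTS))            # all three exons joined together
print(splice(SEGMENTS, skip={1}))  # the middle exon left out
```

Same gene, two different messages, depending only on which exons make the cut.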
The second kind of gene hiding is by using a different reading frame. The genetic code is made up of 4 bases or “letters”, which together have to code for some 20 different amino acids, plus start and stop codons. Four single letters are clearly not enough, and nor are the 16 possible two-letter combinations, so the actual code is made up of three-letter codons (64 combinations), with some amino acids being coded for by several different codons. This means that any particular string of letters can be read in three different ways. AAG GGC GAG TCA AGG TCC TTT would be read as KGESRSF (or Lysine, Glycine, Glutamic Acid, Serine, Arginine, Serine, Phenylalanine) but AGG GCG AGT CAA GGT CCT TTC (the same letters, borrowing one extra base from whatever comes next) would be read as RASQGPF – a completely different sequence, just by moving a single base to the right. What’s happened in the virus is sort of a combination of these two effects. The newly discovered gene is a combination of the viral PA protein with an alternative C-terminal ending. Pretty cool, huh?
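For anyone who wants to play with reading frames themselves, here is a minimal Python sketch that translates the example sequence above in each of the three frames. The codon table is the standard genetic code; the `translate` function and the other names are just my own for this example, and I’ve tacked the extra C onto the end of the sequence so that the shifted frame has a complete final codon.

```python
from itertools import product

# Standard genetic code, with codons listed in T, C, A, G order.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA string starting `frame` bases in (0, 1 or 2).

    Spaces are ignored and any incomplete codon at the end is dropped.
    """
    dna = dna.replace(" ", "").upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# The sequence from the example, plus the one borrowed base.
sequence = "AAG GGC GAG TCA AGG TCC TTT C"

for frame in range(3):
    print(frame, translate(sequence, frame))
```

Frame 0 gives KGESRSF and frame 1 gives RASQGPF, matching the two readings above; frame 2 gives a third, different sequence again.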
So what does this all mean for us anyway? Well, it seems that this newly discovered gene is involved in host responses. The ‘active’ copy of it is capable of maintaining a low-level response in the host, who will then recover. (And remember, this is really what the pathogen ‘wants’. Killing your host is a good way to make sure that you don’t get to visit any more hosts.) But when the gene is not active the host’s immune system goes into overdrive, causing the kind of potentially fatal infection seen in the Spanish flu pandemic in 1918. That’s all I’ve got on the topic for now, as I can’t get to the journal article until I’m back in work on Monday, but it sounds like some really good science!