Sometimes I am astounded by the sheer volume of data that we create in science nowadays. Where a few years ago we were sequencing individual genes, made up of a few thousand letters, now with a single Illumina run we can generate terabytes of data.
But what to do with that data? A lot of genomics at the moment is concerned with targeted resequencing and bulk segregant analysis. Producing genome #1 is a lot of hard work, and doesn’t tell us all that much. Producing genomes #2 to #10 for the same species tells us a lot more: Why does wheat cultivar 1 have a higher yield than wheat cultivar 2? Why is apple variety 1 susceptible to a disease when apple variety 2 is not?
The key is in SNPs: Single Nucleotide Polymorphisms, single-letter differences in the DNA sequence that can lead to different variants of a protein being made. Our genomes vary at thousands of locations (or loci, to use the technical term), and by comparing groups of individuals (known as bulks) who are all the same in one respect (e.g. a group of people with a hereditary form of cancer versus a group without it) we can hope to spot which differences recur in one of the groups. This may not tell us why the groups are different, but it gives us a ‘bookmark’ or molecular marker to search for. Maybe not all musicians carry music stands, but knowing that the two are linked means we can try to spot musicians by spotting music stands.
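To make the idea of comparing bulks a little more concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the sequences are toy data, and the function names and the 0.8 threshold are my own assumptions, not part of any real analysis pipeline. It simply asks, at each locus, whether some letter is much more common in one bulk than in the other.

```python
from collections import Counter

def allele_frequency(bulk, locus, allele):
    """Fraction of individuals in a bulk carrying `allele` at `locus`."""
    calls = [individual[locus] for individual in bulk]
    return Counter(calls)[allele] / len(calls)

def candidate_markers(bulk_a, bulk_b, min_difference=0.8):
    """List (locus, allele, difference) for alleles enriched in bulk_a.

    `min_difference` is an arbitrary illustrative threshold on the
    difference in allele frequency between the two bulks.
    """
    hits = []
    for locus in range(len(bulk_a[0])):
        # Every letter seen at this locus in either bulk.
        alleles = {individual[locus] for individual in bulk_a + bulk_b}
        for allele in sorted(alleles):
            diff = (allele_frequency(bulk_a, locus, allele)
                    - allele_frequency(bulk_b, locus, allele))
            if diff >= min_difference:
                hits.append((locus, allele, round(diff, 2)))
    return hits

# Toy data: each string is one individual, each position is one locus.
susceptible = ["ACGT", "ACGA", "ACGT", "ACGT"]
resistant = ["ATGT", "ATGA", "ATGT", "ATGA"]

print(candidate_markers(susceptible, resistant))  # [(1, 'C', 1.0)]
```

Here only locus 1 stands out: every susceptible individual has a C there and every resistant one has a T. Locus 3 also varies, but the variation is spread across both bulks, so it makes a poor marker. Real data adds sequencing errors, missing calls and far more loci, which is where proper statistics come in.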
The problem is that sifting through all of this data is time-consuming and computationally expensive. What’s more, humans can often spot patterns as accurately as software can. This is where crowd-sourcing comes in. By turning pattern-spotting into a game, researchers can convince ordinary people to help them with their work.
Scientists at the John Innes Centre, TGAC and The Sainsbury Laboratory in Norwich working on a fungal disease called ash dieback have produced a game that allows anybody with a Facebook account to help them sift through the data. Fraxinus is a game in which players identify matching and non-matching sequences, denoted by different coloured leaves, which correspond to real sequencing data. Results of the project are then made available on the crowdsourcing website OpenAshDieBack. The scientists have their data analysed, and some members of the public may learn something about plant pathology and genetics along the way!
Somehow I’m still not sure my supervisor will be impressed if he catches me on Facebook during office hours…