A little knowledge is a dangerous thing, so I must be positively treacherous to myself as far as genomics is concerned. I've spent much of the last 16 years or so looking at gene and gene product sequences in exchange for a salary. Having participated in sequencing the odd organism or thirty, and with a fairly sound background in statistics, I'm reasonably well-acquainted with many of the issues surrounding genome sequencing, genotyping, and genome-wide association studies (GWAS). But still I was drawn to getting myself genotyped by 23andMe.
The initial process is fairly painless, so long as you don't find spending money on things you don't strictly require for survival to be painful. You simply saunter - as well as you can saunter on the web - over to their website, add a kit (or, to be precise, a small tube into which you can spit) to your basket, and pay for it, just like buying a book at Amazon. But this will eventually turn out to be a book written with letters amounting to interesting parts of your DNA: a guide to your own health, biology and ancestry. A book written by your parents, for you. A book that everyone carries around within their cells, but that has been essentially inaccessible until this century.
For the total cost of $299 plus shipping (which, to the UK, was a not inconsiderable $79) you can be part of the front line of humans to knowingly read, and potentially understand, parts of their own genome! Never mind that many people, including me, think that if we give it ten years the total cost will be around $10 for a complete readout of your own genome: this is still an astonishingly low price. That is, it's low so long as you're relatively affluent, with the kind of disposable income that would make a third or more of the planet weep at the tat you waste it on.
Once you've registered with the site and paid your money, you get the opportunity to participate in a number of surveys (around 40, at the moment) which, even if they don't spell it out without a trip to the extensive FAQs, look very much like the sort of thing you'd use for GWAS. Topics range from medical history to physical appearance (is my lower back really hairy?), the ability to recognise people's mood from their expression (turns out my poor social skills probably aren't due to being on the autistic spectrum), to family history of Parkinson's. Some are more successful than others; the skin colour survey is clearly dependent on the ability of your monitor to render colour faithfully, for example.
On the whole, the surveys are about as much fun as those on YouGov, and seem a bit more professional than the ones in, say, Grazia. Not that I read Grazia - their surveys may well make the Office for National Statistics weep with envy - but you know what I mean. The surveys seem to be voluntary, so if you're not comfortable with sharing, you don't have to.
[Image: "The Book of Life"]
The analysis process is chip-based. 23andMe state that they have a custom SNP (single nucleotide polymorphism - a single base change in the DNA) chip, from the Illumina OmniExpress Plus family. Rather than sequencing your DNA, these kinds of chips test a predefined set of positions in the genome and report which bases you carry at each one. Of course, the nature of that set is important and, since 23andMe have a custom chip, it's not explicit up front which positions are being analysed. The nature of the service suggests that the focus is on health issues and heredity. So what you can expect to get back, in exchange for your spit and almost $300, is a long list of positions in your DNA, the bases you carry at each, and whether your particular variant has previously been associated with some health issue, or particular ancestry.
Obviously, interpreting this kind of data is difficult, and one of the key services 23andMe provide is that interpretation. Each result is given as a headline name (e.g. 'Psoriasis') with a confidence level indicated with stars (a subjective assessment of the quality of research supporting the association), a population percentage risk, the percentage risk associated with your genotype, and the ratio of your risk to the population risk. Clicking on the headline link takes you to a page that describes the health issue and gives an estimate of heritability and the trade-offs between environment and genetics for that issue, a description of the genetic markers and their expected effects, and links to the primary literature. Results are divided into elevated, decreased, and typical risk classes. It looks like the interpretations are updated over time.
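The arithmetic behind those headline numbers is simple enough to sketch. What follows is a minimal Python illustration and not 23andMe's actual method: the function names, the 20% tolerance used to call a result 'typical', and the example percentages are all invented for the purpose.

```python
def risk_ratio(genotype_risk_pct: float, population_risk_pct: float) -> float:
    """Ratio of the risk for a genotype to the average population risk."""
    if population_risk_pct <= 0:
        raise ValueError("population risk must be a positive percentage")
    return genotype_risk_pct / population_risk_pct


def risk_class(ratio: float, tolerance: float = 0.2) -> str:
    """Bin a ratio into one of three classes.

    The tolerance is a hypothetical cut-off chosen for illustration;
    the thresholds 23andMe use are not stated here.
    """
    if ratio > 1 + tolerance:
        return "elevated"
    if ratio < 1 - tolerance:
        return "decreased"
    return "typical"


# Made-up numbers: 10% of the population affected, 12.5% for this genotype.
ratio = risk_ratio(12.5, 10.0)
label = risk_class(ratio)
print(ratio, label)
```

Bear in mind that a ratio on its own says little: 1.25 times a rare risk is still a rare risk, which is presumably why the absolute percentages are reported alongside it.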
But if 23andMe were only offering the genotyping and interpretation, I wouldn't have been as interested, and probably wouldn't have signed up. However, they also allow you to download the raw data from your sample. This comes as a large-ish (25MB or so, listing 950,000 SNPs) plain text file in tab-separated format, indicating the SNP identifier, the chromosome, location on that chromosome, and the SNP call on the forward strand:
rs4477212 1 72017 AA
rs3094315 1 742429 AA
rs3131972 1 742584 GG
rs12124819 1 766409 AA
rs11240777 1 788822 AG
rs6681049 1 789870 CC
rs4970383 1 828418 CC
rs4475691 1 836671 CT
rs7537756 1 844113 AG
rs13302982 1 851671 GG
(not my data - this is one of the example files)
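As a rough sketch of how one might start poking at a file like this, here is a small Python reader. It rests on a few assumptions of mine rather than on any documented specification: that each data line holds exactly four whitespace-separated fields as in the sample above, that any header lines start with '#', and that chromosome labels are best kept as text in case they are not all numeric. The names SnpCall, read_calls and is_heterozygous are my own.

```python
import io
from typing import Iterable, Iterator, NamedTuple


class SnpCall(NamedTuple):
    rsid: str
    chromosome: str  # kept as text, in case labels are not all numeric
    position: int
    genotype: str


def read_calls(lines: Iterable[str]) -> Iterator[SnpCall]:
    """Yield one SnpCall per data line, skipping blanks and '#' comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # split() with no argument handles tabs and spaces alike
        rsid, chromosome, position, genotype = line.split()
        yield SnpCall(rsid, chromosome, int(position), genotype)


def is_heterozygous(call: SnpCall) -> bool:
    """True when the two called bases differ, e.g. 'AG'."""
    return len(call.genotype) == 2 and call.genotype[0] != call.genotype[1]


# The ten example rows shown above, tab-separated.
SAMPLE = (
    "rs4477212\t1\t72017\tAA\n"
    "rs3094315\t1\t742429\tAA\n"
    "rs3131972\t1\t742584\tGG\n"
    "rs12124819\t1\t766409\tAA\n"
    "rs11240777\t1\t788822\tAG\n"
    "rs6681049\t1\t789870\tCC\n"
    "rs4970383\t1\t828418\tCC\n"
    "rs4475691\t1\t836671\tCT\n"
    "rs7537756\t1\t844113\tAG\n"
    "rs13302982\t1\t851671\tGG\n"
)

calls = list(read_calls(io.StringIO(SAMPLE)))
hets = [call.rsid for call in calls if is_heterozygous(call)]
print(len(calls), hets)
```

For a real download you would pass an open file handle to read_calls instead of the in-memory sample; because it is a generator, the 950,000 or so rows never need to sit in memory at once unless you ask for a list.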
Human genome builds are publicly available so, with this data, mapping SNPs and investigating my own genome - to a limited extent - becomes feasible. Now I just have to sit back, and wait for the spit-tube to arrive.