Showing posts with label bioinformatics. Show all posts

Friday, 11 July 2014

The Baserate Fallacy, revisited

In which I share some IPython code for an interactive demo to calculate and visualise the probability that a positive test result is a true positive. You can get it here, and preview it here. Then I get entangled in a topic outwith my expertise.
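For a flavour of the calculation, here is a minimal sketch of Bayes' theorem applied to a diagnostic test. It is not the code from the demo itself: the function name, parameter names and example numbers are illustrative assumptions.

```python
def positive_predictive_value(sensitivity, false_positive_rate, base_rate):
    """Return P(condition is present | test is positive) via Bayes' theorem.

    sensitivity:         P(test positive | condition present)
    false_positive_rate: P(test positive | condition absent)
    base_rate:           P(condition present) in the tested population
    """
    true_positives = sensitivity * base_rate
    false_positives = false_positive_rate * (1.0 - base_rate)
    total_positives = true_positives + false_positives
    if total_positives == 0:
        raise ValueError("the test never returns a positive with these inputs")
    return true_positives / total_positives


# Hypothetical test: 95% sensitive, 5% false positive rate, 1% base rate.
ppv = positive_predictive_value(0.95, 0.05, 0.01)
print(f"P(true positive | positive test) = {ppv:.3f}")
```

With a low base rate, most positives come from the much larger unaffected group, which is why the answer is so far below the test's headline accuracy.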

Sunday, 10 November 2013

ANI are you okay? Are you okay ANI?

In which I describe Average Nucleotide Identity (ANI), which we can use to pigeonhole bacteria into conceptual boxes labelled as 'species'. And I share more code.

Wednesday, 13 February 2013

A Nice New Paradox

In which I work through a popular statistical puzzle/paradox (with potential implications for interpretation of large data studies). With example code.

Friday, 8 February 2013

Surely this has been done already...

In which it turns out that if it was done already, it was hiding somewhere. Also, I share a script that retrieves the corresponding nucleotide coding sequences from NCBI, given only a set of protein sequences.


Friday, 1 February 2013

KEGGWatch, part III

In which I finally get around to sharing some code, and give some examples of downloading and modifying KEGG pathway maps.

KEGGWatch, part II

In which I don't quite get around to writing a KGML parser and visualisation module (all very Tristram Shandy, this!), with a view to submitting to Biopython. This post describes some of the rationale and design choices - tune into part III for code and examples of use.

Monday, 21 January 2013

KEGGWatch, part I

In which I attempt to visualise metabolic maps for comparative genomics, and lead up to making a contribution to Biopython.

Sunday, 23 September 2012

The Colours, Man! The Colours!

In which I take a short diversion into colour theory, and share some code to automate colour selection for class data.
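One simple way to automate colour choice for class data is to space hues evenly around the colour wheel. The sketch below assumes that approach, which may not be the one used in the post; the function name and defaults are hypothetical.

```python
import colorsys


def class_colours(n_classes, saturation=1.0, value=1.0):
    """Return n_classes hex colour strings with evenly spaced hues."""
    colours = []
    for i in range(n_classes):
        hue = i / n_classes  # fraction of the way round the colour wheel
        red, green, blue = colorsys.hsv_to_rgb(hue, saturation, value)
        colours.append("#{:02x}{:02x}{:02x}".format(
            round(red * 255), round(green * 255), round(blue * 255)))
    return colours


# Example: one colour for each of five classes.
print(class_colours(5))
```

Evenly spaced hues are easy to generate, but they are not perceptually uniform, so neighbouring classes can still be hard to tell apart when there are many of them.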

Thursday, 19 July 2012

On Reciprocal Best BLAST Hits

In which I narrowly avoid a rant. Reciprocal best BLAST hits can improve the quality of your searching, and are a good way to find candidate orthologues. There's evidence and everything.
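The idea can be sketched in a few lines: take the best hit for each query in both search directions, and keep only the pairs that pick each other. This is a minimal illustration, assuming hits are already parsed into (query, subject, bit score) tuples; the function names and toy data are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples.
    Ties are resolved in favour of the first hit encountered.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy data: geneA1 and geneB1 choose each other; geneA2's best hit does not reciprocate.
a_vs_b = [("geneA1", "geneB1", 250.0), ("geneA1", "geneB2", 90.0),
          ("geneA2", "geneB1", 120.0)]
b_vs_a = [("geneB1", "geneA1", 245.0), ("geneB2", "geneA1", 88.0)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```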

Sunday, 1 July 2012

Dead fish, and multiple-test correction

In which a salmon is resurrected, but not enough to really be significant. Why finding 20 positive results when your P-value threshold suggests you should only see 10 isn't necessarily anything to be excited about. And an introduction to Bonferroni and Benjamini-Hochberg multiple-test correction.
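As a rough sketch of the two corrections: Bonferroni compares every P-value against alpha divided by the number of tests, while Benjamini-Hochberg finds the largest rank k whose sorted P-value is at most (k/m) * alpha and rejects everything up to that rank. The function names and the P-values below are made up for illustration.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject each test whose P-value is at most alpha / (number of tests)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for false discovery rate control."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P-value
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_rank = rank  # largest rank that passes its own threshold
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical P-values from ten tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("uncorrected:", sum(p <= 0.05 for p in pvals))
print("Bonferroni: ", sum(bonferroni(pvals)))
print("B-H:        ", sum(benjamini_hochberg(pvals)))
```

Bonferroni controls the chance of any false positive at all, so it is the stricter of the two; Benjamini-Hochberg controls the expected proportion of false positives among the results you call significant.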

Saturday, 23 June 2012

The Base Rate Fallacy in Effector-Finding


In which an oft-overlooked bit of genome-mining statistics is considered, and your enjoyment of a holiday could depend heavily on other people's hygiene.

Thursday, 8 September 2011