Tuesday, March 06, 2012


When I was a little whelp, I had a brief and unsuccessful engagement at a bioinformatics startup in the Bay Area called Ingenuity. The company had a cool idea, a knowledge base for molecular biology, and smart and creative people to implement it. I was tasked, by a long-haired, Tufte-toting Stanford grad student, with developing rich new UI elements using the budding technology of the time, dynamic HTML. In particular, I was to implement a search bar that could automatically suggest terms from an ontology - the autocomplete feature.

There was even a prototype that sort-of worked on the right version of Netscape when the server was 30 feet away. I pulled my hair out trying to get it working consistently across browsers. A better engineer, say John Resig, might have pulled it off, but I had to admit defeat. Like many DHTML toys of the time, it just wasn't ready for production. So, I recommended scrapping the idea in favor of a simple "google-box". This was not well received and my tenure was not long in coming to a close.

For years after, I'd argue on every project for simple, stripped-down web UIs. "Look at Google," I'd say, pointing to the minimalist search box. Trying to do anything advanced in a browser, I'd warn, was an invitation to cross-browser compatibility issues and nightmarish debugging sessions.

Meanwhile, as if to make me look like a bozo (as if I need any help), Google introduced autocomplete, AKA Google Suggest, first as a Google Labs project in 2004, and finally rolled it out on the main Google page in 2008. So much for my "Look at Google" argument. Not that it matters now, but I feel slightly vindicated by the fact that it took even the mighty Google this long to deploy a feature that I was basically shit-canned for failing to implement in 2000. ...not that I'm bitter. But, anyway, back in the present day...

Doug Basset, Chief Scientific Officer at Ingenuity, gave a demo last week at the ISB, where I now reside, showing off Ingenuity Variant Analysis, a new tool built on top of their knowledge base that helps find disease-causing genetic variations in resequencing data.

The basic trick is to filter down the millions of genetic variants found in any individual genome to those consistent with a given condition, starting with each variant's frequency in the population and genetic properties like homo- or heterozygosity. The neat part comes next.
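To make that first filtering pass concrete, here's a minimal sketch in Python. Everything in it is my own invention for illustration: the `Variant` record, its field names, the `filter_variants` function and the 1% frequency cutoff are assumptions, not anything taken from Ingenuity's product.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Hypothetical record; a real pipeline would carry far more annotation.
    gene: str
    population_frequency: float  # fraction of the population carrying the variant
    zygosity: str                # "homozygous" or "heterozygous"


def filter_variants(variants, max_frequency, required_zygosity=None):
    """Keep variants at or below max_frequency and, if given, of the required zygosity."""
    return [
        v for v in variants
        if v.population_frequency <= max_frequency
        and (required_zygosity is None or v.zygosity == required_zygosity)
    ]


# Made-up example data.
variants = [
    Variant("A", 0.001, "homozygous"),
    Variant("D", 0.30, "homozygous"),     # too common to explain a rare condition
    Variant("E", 0.002, "heterozygous"),  # wrong zygosity for a recessive model
]

candidates = filter_variants(variants, max_frequency=0.01, required_zygosity="homozygous")
print([v.gene for v in candidates])
```

With a rare, recessive condition in mind, only the variant in gene A survives this toy filter.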

For each variant surviving this far, the program traverses the graph of facts in the knowledge base. Of course, it will find known associations between variants or their host genes and disease. What's better, it can also find relationships a few degrees removed from direct implication. Say, A turns off B, which regulates C. The biological process to which C belongs runs amok in disease X. Suddenly, a variant in a functional domain of gene A looks like an interesting candidate. If it works, you end up with a handful of genes with enough evidence to warrant further investigation.
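The A, B, C example can be sketched as a bounded breadth-first search over a toy graph of facts. I have no idea how Ingenuity actually does it, so treat this as an assumption-laden illustration: the `KNOWLEDGE_BASE` dictionary, the relation names, "process P", "gene D" and "gene E", and the `find_path` function are all hypothetical.

```python
from collections import deque

# Toy knowledge base: each entity maps to a list of (relation, target) facts.
# All entities and relations here are invented for illustration.
KNOWLEDGE_BASE = {
    "gene A": [("turns off", "gene B")],
    "gene B": [("regulates", "gene C")],
    "gene C": [("belongs to", "process P")],
    "process P": [("runs amok in", "disease X")],
    "gene D": [("regulates", "gene E")],
}


def find_path(start, target, max_hops=4):
    """Breadth-first search for the shortest chain of facts from start to target.

    Returns a list of (source, relation, target) triples, or None if the
    target cannot be reached within max_hops facts.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, neighbor in KNOWLEDGE_BASE.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


evidence = find_path("gene A", "disease X")
if evidence:
    for source, relation, destination in evidence:
        print(f"{source} {relation} {destination}")
```

The hop limit is the "few degrees removed" part: a variant in gene A gets linked to disease X through four facts, while gene D, which never connects to the disease in this toy graph, comes back with nothing.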

The product's Flash-based UI is very slick and modern, with drop-shadows, ghosting and barber-pole progress bars. Tables have spiffy little sparkline graphics. Right in the middle of the demo, a search dialog popped up and there was the autocomplete feature, mocking me. It's certainly no big deal these days. My current project has an autocompleting search box, too, thanks to jQuery and Solr. But, I guess the memory of flubbing that gig still has a little sting left in it.