Sunday, July 25, 2010

Gaggle Genome Browser

There's a certain windmill I've been tilting at for a couple of years now. It's known as the Gaggle Genome Browser, and we've published a paper on it called "Integration and visualization of systems biology data in context of the genome."

The Gaggle Genome Browser is a cross-platform desktop program, based on Java and SQLite, for interactively visualizing high-density genomic data. It joins heterogeneous data by location on the genome to create information-rich visualizations of genome organization, transcription and its regulation. As always, a key feature is interoperability with other bioinformatics apps through the Gaggle framework.
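To give a rough idea of what "joining by location" means, here's a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample values are all invented for illustration; this is not the browser's actual schema or code (the real program is written in Java). It just shows how two unrelated data sets can be combined when all they share is genome coordinates.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tracks."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical layout: one table per track, keyed by coordinates.
        CREATE TABLE genes (
            name TEXT, seq TEXT, strand TEXT,
            start_pos INTEGER, end_pos INTEGER);
        CREATE TABLE probes (
            seq TEXT, strand TEXT,
            start_pos INTEGER, end_pos INTEGER, value REAL);
        CREATE INDEX probes_by_location ON probes (seq, strand, start_pos);
    """)
    conn.executemany(
        "INSERT INTO genes VALUES (?, ?, ?, ?, ?)",
        [("geneA", "chromosome", "+", 100, 500),
         ("geneB", "chromosome", "+", 800, 1200)])
    conn.executemany(
        "INSERT INTO probes VALUES (?, ?, ?, ?, ?)",
        [("chromosome", "+", 90, 150, 1.5),     # overlaps geneA
         ("chromosome", "+", 300, 360, 2.5),    # overlaps geneA
         ("chromosome", "+", 600, 660, 0.1),    # intergenic
         ("chromosome", "+", 1000, 1060, -1.0), # overlaps geneB
         ("chromosome", "-", 120, 180, 9.0)])   # opposite strand
    return conn


def mean_signal_per_gene(conn):
    """Join probes to genes wherever their coordinate ranges overlap."""
    rows = conn.execute("""
        SELECT g.name, COUNT(*), AVG(p.value)
        FROM genes AS g
        JOIN probes AS p
          ON p.seq = g.seq
         AND p.strand = g.strand
         AND p.start_pos <= g.end_pos   -- probe starts before gene ends
         AND p.end_pos >= g.start_pos   -- probe ends after gene starts
        GROUP BY g.name
        ORDER BY g.name
    """)
    return rows.fetchall()


if __name__ == "__main__":
    for name, n_probes, mean_value in mean_signal_per_gene(build_demo_db()):
        print(name, n_probes, mean_value)
```

The overlap test is the whole trick: two intervals intersect when each one starts before the other ends. Anything that can be reduced to sequence, strand, start and end can be joined this way, whatever kind of data it is.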

Here it is displaying some tiling microarray data for Sulfolobus solfataricus. The reference sample is shown in blue circles overlaid with segmentation in red. Eight time-points along a growth curve are plotted as a heat map, with red indicating increased transcription relative to the reference and green indicating decreased transcription. We also show Pfam domains, predicted operons, and some previously observed non-coding RNAs, several of which we were able to confirm.

One of the features I'm most proud of is the integration with R, a tactic also being used by MeV. At this point it's only partially complete. There's quite a bit more that could be done with it, and I'm looking for time (or help!) to finish.

The past couple of years have seen a whole crop of new genome browsers. See the entry browsing genomes for a partial list. One reason is a new generation of lab hardware and techniques, including ChIP-chip, tiling arrays and high-throughput next-generation sequencing. Another is the ever-changing landscape in computing.

It's lacking polish in some places. There's plenty yet to be done. Maybe later, I'll write up some lessons learned and mistakes made, but for now, I'm happy to have it published and out there.

Read more about the biology here:

Check out the screencast by OpenHelix here: