Showing posts with label visualization. Show all posts
Showing posts with label visualization. Show all posts

Wednesday, May 29, 2013

Shiny talk by Joe Cheng

Shiny is a framework work for creating web applications with R. Joe Cheng of RStudio, Inc. presented on Shiny last evening in Zillow's offices 30 stories up in the former WaMu Center. Luckily, the talk was interesting enough to compete with the view of Elliot bay aglow with late evening sunlight streaming through breaks in the clouds over the Olympics.

Shiny is very slick, achieving interactive and pleasant looking web UIs with node.js, websockets and bootstrap under the hood. It's designed on a reactive programming model (like bacon and ember) that eliminates a lot of the boiler-plate code associated with listeners or observers in UI coding.

Shiny comes in two parts, the shiny R package for developing Shiny apps and Shiny server for deploying them. The RStudio company intends to create a paid tier consisting of an enterprise server and a paid hosting service, Glimmer, which is free for now.

Among several demos were a plot of TV Show Rankings over time and a neat integration with Google's Geochart library to map World Bank health nutrition and population statistics. There are also some examples of combining D3 with Shiny (G3Plot).

Possibly, the coolest demo was a tutorial on reactive programming in the form of an R console in a browser. Chunks of code could drag-and-drop around as in "live document" systems like IPython notebooks or Chris Granger's Clojure IDE Light Table.

Links

Thursday, March 28, 2013

Playing with earthquake data

This is a little example inspired by Jeff Leek's Data analysis class. Jeff pointed us towards some data from the US Geological Survey on earthquakes. I believe these are the earthquakes for the last 7 days.

According to Nate Silver, analysis of earthquake data has made fools of many. But, that won't stop us. Let's see what we can do with it.

fileUrl <- "http://earthquake.usgs.gov/earthquakes/catalogs/eqs7day-M1.txt"
download.file(fileUrl, destfile = "./data/earthquake_7day.csv", method = "curl")
eq <- read.csv("./data/earthquake_7day.csv")

Let's be good citizens and record the provinance of our data.

attributes(eq)["date.downloaded"] <- date()
attributes(eq)["url"] <- fileUrl
save(eq, file = "earthquakes_7day.RData")
attributes(eq)$date.downloaded
## [1] "Mon Feb  4 20:03:48 2013"

Let's see what they give us.

colnames(eq)
##  [1] "Src"       "Eqid"      "Version"   "Datetime"  "Lat"      
##  [6] "Lon"       "Magnitude" "Depth"     "NST"       "Region"

Something looks a little funny about the distribution of magnitudes. What's up with that bump at magnitude 5? I'm guessing that some detectors are more sensitive than others and the less sensitive ones bottom out at around 5.

hist(eq$Magnitude, main = "Histogram of Earthquake Magnitude", col = "#33669988", 
    border = "#000066")

plot of chunk unnamed-chunk-5

I'm guessing that the better detectors are on the west coast of the US.

westcoast <- c("Northern California", "Central California", "Southern California",  "Greater Los Angeles area, California", "Santa Monica Bay, California", "Long Valley area, California", "San Francisco Bay area, California", "Lassen Peak area, California", "offshore Northern California", "Washington", "Seattle-Tacoma urban area, Washington", "Puget Sound region, Washington", "Olympic Peninsula, Washington", "Mount St. Helens area, Washington", "San Pablo Bay, California", "Portland urban area, Oregon", "off the coast of Oregon")
hist(eq[eq$Region %in% westcoast, "Magnitude"], main = "Histogram of Earthquake Magnitude - West Coast US", 
    col = "#33669988", border = "#000066")

plot of chunk unnamed-chunk-6

Now that looks a little smoother. Let's try R's mapping ability. Check out the ring of fire.

library(maps)
map()
points(x = eq$Lon, y = eq$Lat, col = rgb(1, 0, 0, alpha = 0.3), cex = eq$Magnitude * 
    0.5, pch = 19)

plot of chunk unnamed-chunk-7

Zoom in on the west coast.

map("state", xlim = range(c(-130, -110)), ylim = range(c(25, 50)))
points(x = eq$Lon, y = eq$Lat, col = rgb(1, 0, 0, alpha = 0.3), cex = eq$Magnitude * 
    0.5, pch = 19)

plot of chunk unnamed-chunk-8

Lucky for us, almost all the recent quakes out here were tiny and the biggest was way off the coast.

summary(eq[eq$Region %in% westcoast, "Magnitude"])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     1.0     1.2     1.4     1.5     1.7     5.3

There are supposed to be active fault lines here in Seattle and you can see a few little ones. Down by Mt. St. Helens, they get a few moderately bigger ones. The only quakes I've felt recently were the contractors doing demolition for remodeling my basement.

The knitr output can also be seen on RPubs: Playing with Earthquake Data and on GitHub cbare/earthquakes.rmd.