Monday, September 26, 2011

Hipster programming languages

If you look at the programming languages that are popular these days, a few patterns emerge. I'm not talking about languages that have the most hits on the job sites. I'm talking about what the cool kids are coding in - the folks that hang out on hacker-news or at Strange Loop. Languages like Clojure, Scala and CoffeeScript. What do these diverse languages have in common other than an aura of geek-chic?

  • Functional programming is emphasized over object-oriented programming.
  • Common patterns for manipulating lists: map, filter, reduce.
  • Modern syntax in which everything is an expression and syntactic noise like semicolons is reduced.
  • CoffeeScript compiles to JavaScript, while both Clojure and Scala target the JVM. Targeting legacy platforms seems to be getting easier and more popular.
  • Innovative approaches to concurrency.

This last point deserves some elaboration. It might be a stretch to compare the event-driven nature of node.js with immutable data structures with actor model and software transactional memory, but, at heart, these are all strategies for dealing with concurrency. One place where Java was ahead of its peers was concurrency, so it's cool that Clojure and Scala are taking the next steps in concurrent programming on the JVM.

CoffeeScript

CoffeeScript is javascript, redesigned. It cleans up the syntax adding many of the niceties familiar from Ruby and Python. Curly braces and semicolons are out. String interpolation, list-comprehensions, default arguments, and more tasty sugar are in. Its creator, Jeremy Ashkenas, believes in code as literature and it shows all the way through the project. Take a look at the annotated source code for the CoffeeScript grammar and see if it doesn't make you weep for the ugliness of your own code.

Don't forget, CoffeeScript gets about 10 times hipper when combined with node.js, the event-driven app-server based on google's V8 javascript engine.

Clojure

Clojure is a dialect of lisp targeting the JVM. One of Clojure's key features is a set of immutable functional data structures, with efficiency comparable to Java's mutable collection classes. The beauty of a Lisp dialect is that the syntax of the language is also it's data representation. Code is data, data is code. There are lots of Clojure resources floating around, including a famous talk by Rich Hickey on state, identity, value and time and a project to port SICP to Clojure. Extra points if your client-side code is written in ClojureScript.

Several hip people have recommended reading The Joy of Clojure. Also on Rich Hickey's bookshelf is Chris Okazaki's book Purely Functional Data Structures.

Scala

Scala is a strongly typed object-functional hybrid. It's targets include the JVM and Microsoft's CLR. It's an academic language derived from the ML family, but meant to be a pragmatic replacement for Java. It has a C++ like reputation for being fully understood only by guru level developers. One of it's key features is a type system that is Turing complete in itself. I guess I'm not completely convinced that a rocket-science type system is the answer, but it's there's cool stuff in there - generics done properly, higher-kinded types, which as near as I can tell takes parametric types to a level of meta beyond generics. One nice thing is that Scala has a tighter mapping to Java than Clojure so the interop between the two is a little more reasonable.

The Akka project is a Scala platform for concurrent applications, providing both the actor model and software transactional memory. Those wanting to learn more can track down some interesting talks by language designer Martin Odersky available, plus the Scala Types podcast.

A couple more

Not enough languages for you? I'll throw in a couple more hip languages, R and Haskell. Truly cool kids know Haskell. What can I say, except that I am not yet that cool. I need to go out to a shed in the woods with a couple of books and learn me some Haskell.

R may have a bastardized syntax, but, eventually, it's functional core shines through. R is seeing a surge in popularity based on the highly hip and trendy field of data science, where it's powerful statistical methods and graphing come in handy. Aside from mclapply, R is a bit lacking in support for concurrency. [See correction below in comments!] Rhipe and Revolution Analytics's RHadoop are trying to change that by enabling distributed data analysis with R and Hadoop.

Fashion

You might be tempted to say it's all fashion. What goes around comes around. To some extent that's true, but, in each of these languages, there's something new and worthwhile to be learned. We have a ways to go before code is as expressive as we want it to be. Someone smart said that you'll like a programming language in proportion to what it teaches you. Mostly, I want to remind myself to set aside some time to play with these languages and see what new tricks they have to teach this old dog.

PS: When this post grows up, it wants to be Programming language development: the past 5 years by the very hip Michael Fogus.

Friday, September 23, 2011

Network Science

Network analysis is hip. Applications range over social networks, security, biology, and economics. At this point, you'll hardly be the first one to the party, but if you want to give network science a try, here's a random grab-bag of resources to get started.

Coordination of frontline defense mechanisms under severe oxidative stress, Kaur et al. 2010

Learning network science

Jon Kleinberg, a professor of computer science at Cornell University, co-wrote Networks, Crowds, and Markets: Reasoning About a Highly Connected World along with David Easley. He also wrote Algorithm Design, an undergraduate textbook.

A 2004 review paper by Barabasi and Oltvai Network biology: understanding the cell's functional organization. covers a broad range of applications of networks in modern biology. Barabasi is also author of Linked.

A Science special issue on networks, from July 2009, revisits the foundations of network analysis, and delves into applications to ecological interactions, counter-terrorism, and finance.

Video and slides are available for Drew Conway's presentation on social network analysis in R, which mostly focuses on software tools.

Tools for analyzing networks

Software tools for working with networks include the R packages graph, igraph, network. Also, the NetworkX library for Python looks quite powerful. Visualization tools tend to come and go, but some well-known tools are: Cytoscape, Gephi, and GraphViz.

More network stuff

Monday, September 19, 2011

Applying control theory to the cell

In a talk about research goals in the systems biology of microbes, Adam Arkin referenced the Internal Model Principle of control theory. Here are a couple definitions.

A regulator for which both internal stability and output regulation are structurally stable properties must utilize feedback of the regulated variable and incorporate in the feedback loop a suitably reduplicated model of the dynamic structure of the exogenous signals which the regulator is required to process.

Towards an Abstract Internal Model Principle Wonham, 1976

That's a mouthful. This one's a little less scary.

Internal Model Principle: control can be achieved only if the control system encapsulates, either implicitly or explicitly, some representation of the process to be controlled.

Lecture notes on Introduction to Robust Control by Ming T. Tham, 2002

Driving this thinking is the discovery that microbes show anticipatory behavior and the associations can be fairly readily entrained and lost in a few generations. Ilias Tagkopoulos and Saeed Tavazoie, in Predictive behavior within microbial genetic networks, demonstrated associative learning through rewiring gene regulatory networks. It turns out that when E. coli senses a shift to mammalian body temperature, it begins the transition to anaerobic metabolism, nicely anticipating the correlated structure of it's environment.

In another example, Amir Mitchell working at Weizmann, showed that yeast anticipates the stages of fermentation in Adaptive prediction of environmental changes by microorganism.

This raises some important questions. How is the internal model encoded within the cell? And how does the cell acquire, parameterize and adjust its internal model over evolutionary time scales? The answers will lead to a deeper understanding of living systems and might even feed new techniques and principles back to control theory.

An interesting challenge will be to experimentally read out the information embedded in the cell's control systems and then the informatics problem of how to represent and work with such things.

Understanding how this works is a prerequisite for re-engineering living systems, otherwise known as synthetic biology, championed by George Church and Drew Endy. This month, by the way, the journal Science has a special issue on synthetic biology.

I'm fascinated by the idea of applying engineering principles to biology - evolved systems, rather than engineered artifacts. Maybe that's because my spaghetti code looks a lot like the messy interconnectedness of biology. Creating software feels organic, rather than wholly predesigned. The engineering of complex software systems tends to be an adaptive evolutionary process. As messy as biology is, modularity naturally emerges. Maybe biology has something to teach us about organizing this chaos.