Tuesday, August 28, 2012

The Joy of Clojure

Clojure is a modern Lisp dialect that runs on the JVM. Since its release in 2007, it's become quite popular with elite hackers of a certain persuasion. Michael Fogus and Chris Houser's Joy of Clojure is a thoroughly enjoyable bit of summer reading explaining the philosophy behind the language, showing how Clojure promotes simplicity and flexibility, deals with time and state, and generally brings fun back into programming.

Concurrency is one of the motivating factors behind Clojure. Hickey's approach to sane management of state is anchored on the concept of immutability. The language encourages "immutability by default" with persistent collections - efficient implementations of list, vector, map and set that preserve historical versions of their state. When concurrency becomes necessary, Clojure provides some very modern constructs including STM (software transactional memory), agents similar to the actor model of Erlang, and atomic values.

Aside from the parentheses, one big difference between Clojure and Java (as well as fellow JVM resident, Scala) is dynamic typing. Entities in Clojure are typically modeled in terms of the basic data structures already mentioned, especially maps and vectors. In practice, this ends up looking like JSON or Javascript objects, which are just bags of properties. The book includes a basic implementation of prototype based inheritance as an example. I've always thought that was more appropriate to dynamic languages than building class hierarchies (as in Python and Ruby). You might go so far as to say that JSON is to Clojure what lists are to classic Lisp.

Clojure's interop with it's host platform gracefully bridges the conceptual gap between Java and Lisp, enabling Clojure to call into Java code making available the wealth of Java libraries. It's equally possible to expose Clojure APIs to Java code. Dynamically generating Java classes and proxies as well as creating and implementing interfaces on the fly leads to the belief in the Clojure community that "Clojure does Java better than Java." In spite of the differences in semantics, Clojure feels like a natural layer on top of Java, raising the levels of abstraction and dynamism while easing many pain-points that every Java programmer stumbles over.

The Clojure language is also branching out beyond the JVM. Clojurescript is a Clojure to javascript cross-compiler. Another offshoot targets Microsoft dot.net's CLR.

Aside from persistent data structures and Java interop, Clojure comes with a bunch of functional goodness built in. Clojure's multimethods put control of method dispatch into the programmer's hands. Macros let you construct your own flow-of-control structures, as demonstrated by do-until and unless examples in the book. Illustrating how Lisp and it's macros can be used to construct little languages or DSLs, the book implements a mini-SQL interpreter.

One especially nice aspect of the book is the putting it all together sections, which cover examples like: lazy quicksort, A*, and a builder for chess moves. These longer examples are still bite sized, making them more easily digested than the extended case studies found in some books.

The Joy of Clojure is not an introductory book nor is it a language reference. It will appeal to the reader who already has some programming experience. It's a good idea to spend some time with the online tutorials first. Where the book is strongest is answering the why questions, getting you started learning how to think about programming in Clojure and showing you how Clojure changes the way you think.

Companies using Clojure:

Popular Clojure projects and libraries:

Videos

Luckily for those wanting to learn more without leaving their hammock, there are lots of videos about Clojure. A lot of clear thinking about software, whether you do it in Clojure or not, can be found in Rich Hickey's talks.

More Clojure Stuff

Sunday, August 19, 2012

Scientific Python

Astronomer Joshua Bloom gave a talk titled Python as Super Glue for the Modern Scientific Workflow at SciPy2012. Bloom teaches Python for Scientific Computing at Berkeley (available as a podcast). Bloom showcased Pythonic contributions to work on Supernova and machine-learning in astronomy.

Python has solid support for data analysis and scientific computing, starting with Numpy for matrix manipulation and SciPy, which adds diff-eqs, optimization, statistics, etc. and matplotlib for graphics. I keep meaning to check out Pandas, scikit-learn, and Sage.

IPython keeps getting more impressive and appears to be evolving in the direction of Mathematica's Live Notebooks. I have a long-standing thing for mixing prose and code. Fernando Perez’s talk on IPython at SciPy2012 and a the longer version IPython in-depth are in my viewing queue.

Part of the beauty of Python is it's breadth as a general purpose programming language. Libraries for everyday programming tasks like web application and database interaction are well developed in the Python world. There are good arguments in favor of DSLs for math and statistics, the approach embodied by R, Matlab, Mathematica and Julia. On the other hand, some may agree with John Cook, who puts it like this: "I’d rather do math in a general-purpose language than try to do general-purpose programming in a math language."

More

Sunday, August 05, 2012

Simplicity made easy

Rich Hickey, inventor of Clojure, gave a talk at last year's Strange Loop called Simple Made Easy. In it, he makes the case that Lispy languages and Clojure in particular provide tools for achieving modularity and simplicity. Simplicity Matters, Hickey's keynote at RailsConf in April of this year is a variant on the same themes. Here are my notes, either quoted verbatim or paraphrased.

“Simplicity is a prerequisite for reliability”
- Edsger Dijkstra

“Simplicity is the ultimate sophistication”
- Leonardo da Vinci

Simplicity vs complexity

Complexity comes from braiding together, interleaving or conflating multiple concepts. Complect (braid together) vs. compose (place together)

Simplicity is defined as having one role - one task, one concept, one dimension. Composing simple components is the way we write robust software. A good designer splits things apart. When concepts are split apart, it becomes possible to vary them independently, use them in different contexts or to farm out their implementation.

There are reasons why you have to get off of this list. But there is NO reason why you shouldn't start with it.

Abstraction for simplicity

Analyze programs by asking who, what, when, where, why and how. The more your software components can say, "I don't know, I don't want to know", the better.

  • what: operations - specify interface and semantics, don't complect with how; how can be somebody else's problem
  • who: data or entities
  • how: implementation, connected by polymorphism constructs
  • when and where: use queues to make when and where less of a problem
  • why: policy and rules of the application

Data representation

Data is really simple. There are not a lot of variation in the essential nature of data. There are maps, there are sets, there are linear sequential things... We create hundreds of thousands of variations that have nothing to do with the essence of the stuff and that make it hard to manipulate the essence of the stuff. We should just manipulate the essence of the stuff. It's not hard, it's simpler. Same goes for communications... (44:30)

Information is simple, represent data as data, use maps and sets directly (56:30)

(From the RailsConf talk:) Subsystems must have well defined boundaries and interfaces and take and return data as lists as maps (aka JSON). ReSTful interfaces are popular in the context of web services. Why not do the same within our programs?

Choosing simplicity

Simplicity is a choice, but it takes work. Simplicity requires vigilance, sensibility and care. Choose simple constructs. Simplicity enables change and is the source of true agility. Simplicity = opportunity. Go make simple things.