Sunday, December 4, 2011

Why Clojure

After reading articles about Clojure for the past several months, and in particular listening to Rich Hickey's talks online regarding Clojure's approach to state management and persistent data structures, I have decided to start programming in Clojure.
I spent some time studying Haskell out of interest before and absolutely love that language. The deciding factor for me was that Clojure compiles to java bytecode and executes on the JVM.
I am forbidden from using anything other than Java for production systems at work... but my main frustrations with Java are less to do with code bloat, and inelegance but more to do with the fact that a majority of programs we write in Java tend to be highly threaded and performant applications. It is precisely in this space that i find Java's locking mechanisms and OOP in general to be very stifling.

Managing state in such applications has to be done at the application level, and this necessarily means people who write and maintain such threading code have to break encapsulation and understand the implementation level details of each object(s) in their system. Add to it the fact that locks are not composable, and that most applications we write need to deal with complex roll back logic between caching layers and databases, we end up with hairy unmaintainable mess of complex locking and custom roll back strategies which are difficult to write and impossible to maintain.

This is I think the sweet spot of a language like Clojure., and my main reason to learn it. Of course, I would be lying if I did not admit I love functional programming,  but to me Clojure's approach to state, identity , data structures that support concurrency, STM etc, make Clojure more than merely intellectually appealing, but a language where I can really get my work done efficiently and enjoy coding at the same time.

How am I going about learning Clojure?

The quickest route for me was to dowload leiningen, and counterclockwise. Leiningen is a Clojuresque dependency management tool, which allows you to set up new Clojure projects, download relevant libraries and even start a REPL for executing Clojure code. Clojure projects can be customized by editing project.clj, which itself is a Clojure file, so no more Maven for me!.

I spent a couple of days learning basic syntax of the language, and writing short programs to do basic stuff that the core language already has APIs for. This way, I get to learn a new language, and refer back to the Clojure source code for idiomatic Clojure, once i finished coding up my solutions.

Here is a simple program that computes the product of two numbers:


(def product (fn [x,y] (* x y)))


The notation is as follows:
def is what is called a special form in Clojure. It creates and interns a var, in this case product. The var is bound to the context defined by the function that takes two arguments and returns the product of the two arguments, so this bit of code successfully defines the operation of multiplication.

Clojure provides another special form that simplifies the act of creating a function and binding it to a variable.

(defn product [x,y] (* x y))

works exactly the same way as the code shown earlier.

As far as the function itself goes, the only interesting thing to note is that Clojure like other Lisps has a prefix notation so the operator * appears first, followed by the arguments it acts on.

Clojure has a rich set of data structures that I have not begun to understand yet. My foray into Clojure has just started with the ubiquitous list data structure.

Clojure's list data structure comes with a bunch of useful methods in the core library. I wanted to implement some of them myself to get a better feel for the language.

How to obtain the head of a list? That is, given [1 2 3], the head is 1, the first element of the list.

This is made simple because of a very powerful feature in Clojure called destructuring.
Here is how the resulting code looks:

(def head (fn [x] (let [[h] x] h)))


What we have done here, is to use the let keyword to bind variables to certain expressions. in Clojure, let allows you to destructure collections and maps, that is to bind vector literals to portions of sequential things. in this case, we use this to bind the variable h to the first element of x.

We can use the same logic to come up with a function that knows when a list is empty:


(defn isEmpty? [x] (let [[f] x] (nil? f)))



A more interesting function is tail, which computes the last element of a list. My initial attempt at this was:


(defn tail [x] (let [[f & rest] x] 
                (if(isEmpty? rest) f (tail rest) )))


Although this method is functionally correct, it runs out of stack space on sufficiently large lists. What happens is that the JVM currently does not support tail call optimization, so this recursive invocation leads to stack growth until the VM runs out of stack space. Clojure does allows you to set compiler hints for tail call optimization, which is what I ended up using in my second attempt at this:


(defn tail [x] (let [[f & rest] x]
                 (if (isEmpty? rest) f
                   (recur rest))))

Another interesting function is min, which computes the minimum in a list of comparable elements.


(defn smallest [x] (reduce  #(if(> %1 %2) %2 %1)  x))



This function illustrates the use of anonymous functions and their syntax. The idea is to make one pass over the list, and compare the value at hand, with the previously noted smallest value, exchanging the smallest value for the value at hand, if the latter turns out to be smaller than the former.



Now I am on my way to learning more Clojure, and getting better at recognizing and internalizing idiomatic Clojure. It has just been three days since I started learning Clojure, so clearly I have a very long way to go, but I like what I see, and this is probably the most fun I have had programming since I learnt Haskell.

I hope to keep posting here weekly on my attempts at getting better at Clojure, it might be useful to other newbies.
One last thing today:
Yesterday I finally bit the bullet and set up Emacs and SLIME for Clojure programming. I had wanted to learn Emacs for a long time, but now I finally have a real need for it. From my experience so far, I am loving the power of Emacs to which this is the first time I have been exposed.
I would also like to give a shout out for Counterclockwise., this is an amazing project and for someone using the JVM and having to integrate with legacy java applications, this would be my preferred route.
I tried Netbeans Enclojure without ever being able to successfully use it. La Clojure from IntelliJ seems good enough, but among the three Counterclockwise to me is far and away the better tool for Clojure development where integration with legacy Java source code is important.
For fresh Clojure projects where little to no java integration is required, I would go for Emacs.

Getting back to the trivial product function we started off with, I'd like to leave you with a slight generalization inspired by SICP.

Suppose, instead of computing the product of two numbers, we wished to compute n!, that is the factorial of n. How would we do that?

Since * is already natively defined in Clojure, this can be easily accomplished by :


(defn factorial [n] (apply * (range 1 (inc n))))


How would we compute for example, the sum of all numbers from 1 to n?






(defn sum [n] (apply + (range 1 (inc n))))



What is we were asked to compute the sum of squares of all numbers from 1 to n ?


(defn sum_of_squares [n] (apply + (map  #(* %1 %1) (range 1 n))))


At this point, it should be clear that the three different functions we have written are not all that different. Clearly, there is a common thread of abstraction that ties the functions together.

Actions like product, sum etc can be thought of as special cases of accumulation, ie. the act of taking a range of numbers, mapping them and reducing the resulting mapping to an accumulated value.

The right abstraction therefore is to view these functions as examples of accumulations.

This can be accomplished in Clojure as follows:


(defn accumulate [aggregator  mapper start end]
  (apply aggregator
         (map mapper
              (range start end) )))

The factorial method could now be re written as


(defn factorial [n] (accumulate * identity 1 (inc n)))



This does not seem like much progress, after all we have almost the same amount of code as before.
What about sum of squares ?


(defn sum_of_squares [n] (accumulate + #(* %1 %1) 1 (inc n)))

Hmm, still not as attractive as the idea first seemed. It does not look like we have reduced the coding required as much as we thought. Can we do better? In Haskell, currying allows us to use partially applied form of accumulate without having to define a new function. A somewhat similar effect in Clojure can be achieved by using the keyword partial.
For example, the sum of squares function can be rewritten as :


(partial accumulate + #(* %1 %1) )

This , in effect gives us a function of two arguments, start and end which can then be set to 1 and n, in order to produce a function that will sum numbers from 1 to n.

No comments:

Post a Comment