Book Reviews

Clojure Data Analysis Cookbook by Eric Rochester

ISBN: 978-1782162643
Publisher: Packt Publishing
Pages: 342

Learning Clojure was a rewarding experience. Not only is the language itself a marvel of elegance, there are also several good books available. I'm glad that the Clojure community has continued the Lisp tradition of exceptionally good books. My two favorites turned out to be The Joy of Clojure and Clojure Programming, and I often turned to them for advice. But reading books only takes you so far. All learning is based upon a feedback loop. Without diving into learning theory, suffice it to say there's a lot of support for the idea of learning by doing. For me, to really grasp a language, I want extensive programming experience with it. I need to make mistakes, learn from them, and see my designs improve as I accumulate more expertise and memorize the basic idioms. To get that experience with Clojure, I decided to code my side projects in the language. After a few web apps and smaller utilities, my interests turned to data mining and analysis. In that context, Clojure Data Analysis Cookbook is a great next step.

The book covers a wide range of topics related to big data analysis: from a gentle start with straightforward data mining, through parallel and distributed computing, all the way to graphical visualization of results and integration with related technologies like Mathematica and R. In between you'll find short chapters on Clojure's concurrency support, including the fairly new reducers library. All chapters are built around recipes with a common format. A recipe starts with the necessary preparations, then outlines how to perform the task, and concludes with a short discussion. Often, the recipes include references to further reading. I found the references useful since the explanations in the recipes are quite superficial. Perhaps that's to be expected given the cookbook format. But I believe the value of the book could be raised significantly just by providing a little more in-depth content here. Some tasks are indeed conceptually challenging despite the brevity of the Clojure code.

To me, the main reason for buying the book was its coverage of Incanter. As I started to prototype an analysis tool suite, I decided to build the core of it around the Incanter library. Incanter is a set of libraries and an environment for statistical computing and visualization. It promises to bring the power of the R language to Clojure programmers. As far as the library goes, I found Incanter quite powerful. The major hurdle was the lack of comprehensive documentation. Sure, the individual APIs are all documented. But there's little to no information on how to work efficiently with the library. That's where this book shines: Clojure Data Analysis Cookbook proved to be an excellent introduction to Incanter. I would say that approximately one third of the book covers different usage scenarios in Incanter, including the sparsely documented Zoo library used to visualize time series data.

If you're into data analysis and already know the basics of Clojure, this book is a must-buy. The fast-evolving ecosystem around Clojure will probably make much of the book obsolete over the next few years, but I do hope the author manages to keep the material up to date. Clojure Data Analysis Cookbook fills an important gap in this age of data mining and analysis.

Reviewed November 2013