
David Pollak founded Visi.Pro, Cloud Computing for the Rest of Us, along with the Visi Language open source project. David founded the Lift Web Framework and continuously contributes to Lift. David has posted 39 posts at DZone.

Project Plugh: Open Source Log Analysis

05.23.2013

Short Answer

I'm building an open source log management/analysis tool that will offer folks what Splunk offers, except it's open. https://github.com/projectplugh

Longer answer

I'm using a combination of "right for the task" open source things to build a product that is long overdue. Specifically:

  • Riak to store log data. Why? From what I can see, Riak has the best write scalability of any open source data store, so building a system that can sustain 100,000 writes per second is likely possible with Riak; that doesn't seem possible with other data stores. Riak also supports REST, which means that writing log adapters is a whole lot easier than dumping data into a proprietary wire format, and managing/securing HTTP requests is easy because there are tons of tools that can do it. Riak also offers map/reduce via JavaScript, which is a piece of magic that will be very useful.
  • Clojure for the web front end/user interaction piece. Why? I want to learn Clojure. And core.logic gives us Datalog out of the box which means I don't have to write a query language. Clojure plus AngularJS plus some bits of Lift-style server-push means flowing results to the browser as they become available.
  • ClojureScript for map/reduce jobs as well as browser-side "what-if?". Why? Being able to express the query logic in Datalog and compile that Datalog into JavaScript via ClojureScript means a unified query language. Plus browsers these days support 1M+ row datasets, so running the same queries in the browser on subsets of data makes for a very "Excel-like" experience because nothing goes over the wire.
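To make the Riak bullet concrete, here is a minimal sketch of what a log adapter and a JavaScript map/reduce job could look like over plain HTTP. It is written in Python only for illustration (the project itself is Clojure), and it is not Plugh's code. The node address, the `logs` bucket, the entry fields, and the helper names `build_log_write` and `build_error_count_job` are all assumptions. The URL layout (`/buckets/<bucket>/keys`, `/mapred`) and the built-in `Riak.reduceSum` follow Riak's HTTP API as I understand it, so check them against the Riak version you run. The functions only build the requests; nothing is sent.

```python
import json
import urllib.request
from urllib.parse import quote

RIAK_BASE = "http://localhost:8098"  # assumption: a local node on Riak's usual HTTP port
BUCKET = "logs"                      # hypothetical bucket name


def build_log_write(entry, base=RIAK_BASE, bucket=BUCKET):
    """Build (but do not send) a POST that stores one log entry as JSON.

    POSTing to /buckets/<bucket>/keys asks Riak to generate the key.
    """
    url = f"{base}/buckets/{quote(bucket, safe='')}/keys"
    body = json.dumps(entry, sort_keys=True).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# JavaScript map phase: emit 1 for every stored entry whose level is ERROR.
# The "level" field is part of the hypothetical entry format used here.
MAP_ERRORS_JS = """
function(value) {
  var entry = JSON.parse(value.values[0].data);
  return entry.level === 'ERROR' ? [1] : [];
}
"""


def build_error_count_job(base=RIAK_BASE, bucket=BUCKET):
    """Build (but do not send) a map/reduce job that counts ERROR entries."""
    job = {
        "inputs": bucket,  # run over every key in the bucket
        "query": [
            {"map": {"language": "javascript", "source": MAP_ERRORS_JS}},
            {"reduce": {"language": "javascript", "name": "Riak.reduceSum"}},
        ],
    }
    return urllib.request.Request(
        f"{base}/mapred",
        data=json.dumps(job).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_log_write({"host": "web1", "level": "ERROR", "msg": "disk full"})
    print(req.get_method(), req.full_url)
    print(build_error_count_job().full_url)
```

Against a running node, passing either request to `urllib.request.urlopen` would send it. Because the requests are ordinary HTTP, the same adapter could sit behind any proxy, auth layer, or load balancer that already handles HTTP traffic, which is the point the bullet makes.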

The Backstory

I signed up to give a presentation at Strange Loop on pushing data to the browser via Scala/Lift and Clojure... problem is I haven't done much of any Clojure work, so I needed a project that would let me build some nifty server-push technology in Clojure (yeah... stuff exists, but it wasn't invented here...) I advertised a 75% discount in my rate for a project that would let me learn Clojure and do server-push stuff, but nobody took me up on the offer.

I was lamenting to @meangrape (Jay Edwards) and he said, "let's build something together," so we noodled a bunch of ideas and came up with the idea of an open source project that would out-do Splunk. We looked around at technologies (I had wanted to use Datomic, but it's not open and it's got crappy write performance). Jay recommended Riak. I looked at it and said, "holy crap, this solves all my problems and it'll pour me a beer."

Then I pinged Jordan West... we chatted and I got even more excited.

Talked to James and the RedMonk folks about it this morning and James' take was "go forth and make it happen," so I am.

The MVP

We're developing in the open (no mailing list yet… but soon). We've got a Minimum Viable Product and I'm currently slogging through hooking Netty up to Clojure (yeah… I could use a library, but learning Clojure is what I want to do).

Misc

Yeah… there's a web site and a Twitter feed but there's nothing at either one.

The Name?

The name comes from Colossal Cave Adventure... "A hollow voice says, 'Plugh'". It's a magic word. My current tag line is "Plugh: open source magic to spelunk your log data."

Published at DZone with permission of its author, David Pollak. (source)
