Cascalog

Fully featured data processing and querying library
for Clojure or Java.

The main use cases for Cascalog are processing "Big Data" on top of Hadoop and doing analysis on your local computer. Cascalog is a replacement for tools like Pig, Hive, and Cascading, and it operates at a significantly higher level of abstraction than those tools.

Simple

Functions, filters, and aggregators all use the same syntax. Joins are implicit and natural.
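
As a rough illustration of that shared syntax, the sketch below runs a filter, a function, and an aggregator over two small in-memory datasets. The namespace name and the `age` and `gender` datasets are made up for this example, and the `cascalog.logic.ops` namespace assumes a Cascalog 2.x release (earlier releases expose the same operations under `cascalog.ops`).

```clojure
(ns example.queries
  (:use cascalog.api)
  (:require [cascalog.logic.ops :as c])) ; assumption: Cascalog 2.x; use cascalog.ops on 1.x

;; Hypothetical in-memory datasets. A sequence of tuples can act as a generator.
(def age    [["alice" 28] ["bob" 33] ["chris" 40]])
(def gender [["alice" "f"] ["bob" "m"] ["chris" "m"]])

;; Filter: a predicate with no output variables keeps only matching tuples.
(?<- (stdout) [?person]
     (age ?person ?age)
     (< ?age 30))

;; Function: same predicate form, with :> binding the result to a new variable.
(?<- (stdout) [?person ?double-age]
     (age ?person ?age)
     (* 2 ?age :> ?double-age))

;; Aggregator and implicit join: reusing ?person in both generators joins them,
;; and c/count counts the tuples in each ?gender group.
(?<- (stdout) [?gender ?count]
     (age ?person _)
     (gender ?person ?gender)
     (c/count ?count))
```

All three queries use the same predicate form, an operation followed by its variables. The join in the last query is never spelled out; it follows from `?person` appearing in both generators.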

Expressive

Queries are built by logical composition, and you can run arbitrary Clojure code in your query with little effort. You specify what you want, not how to compute it.
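
A minimal sketch of both ideas, assuming an in-memory `age` dataset and a helper function `initial`, both invented for this example: an ordinary Clojure function is called inside a query, and one query is reused as the input to another.

```clojure
(ns example.compose
  (:use cascalog.api))

;; Hypothetical in-memory dataset of [name age] tuples.
(def age [["alice" 28] ["bob" 33] ["chris" 40]])

;; An ordinary Clojure function (hypothetical helper), used below as an operation.
(defn initial [s]
  (subs s 0 1))

;; <- defines a query without running it, so it can serve as a generator later.
(def young
  (<- [?person]
      (age ?person ?age)
      (< ?age 30)))

;; ?<- defines and runs a query. This one builds on `young` and calls `initial`.
(?<- (stdout) [?person ?initial]
     (young ?person)
     (initial ?person :> ?initial))
```

Neither query says how to scan, filter, or order the work; each only states the relationships the output variables must satisfy.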

Scalable

Cascalog queries run as a series of MapReduce jobs. The same code can run against a single data file on your laptop or against petabytes of data on a computing cluster.