Guide list

The Cascalog documentation is organized as a set of guides covering a range of topics.

We recommend reading the guides in this order, if possible:

Getting started

An overview of Cascalog with a quick tutorial that helps you get started. It should take about 30 minutes to read and try the provided code examples.

Understanding Cascalog

Operations

Joins

  • Inner
  • Outer
  • Cross

Running on a cluster

Misc.

Testing and Debugging

Upgrading from 1.x to 2.x

Cascalog for the Impatient

  • This guide is a set of progressive coding examples that start with a simple file copy and build up to a MapReduce implementation of the TF-IDF algorithm.

Real Code Examples

Blog posts from around the web

List of companies using Cascalog

Help improve this site

Let us know what was unclear or what has not been covered. Perhaps you dislike the guide style, or you have found grammar or spelling mistakes. Reader feedback is key to making the documentation better.

This documentation site is open source and we welcome pull requests.