Wednesday, August 15, 2012

Cassovary: A Big Graph-Processing Library

We are open sourcing Cassovary, a big graph-processing library for the Java Virtual Machine (JVM) written in Scala. Cassovary is designed from the ground up to efficiently handle graphs with billions of edges. It comes with some common node and graph data structures and traversal algorithms. A typical use is large-scale graph mining and analysis.
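
To give a rough sense of what "node and graph data structures and traversal algorithms" can look like on the JVM, here is a minimal sketch in plain Java. It is not Cassovary's API: the class name CompactDigraph and its methods are hypothetical, invented for this illustration. It assumes nodes are numbered 0 to n-1 and stores all out-edges in two flat int arrays, one common way to keep per-edge memory low, then runs a breadth-first traversal over that layout.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical illustration only; this class is not part of Cassovary. */
final class CompactDigraph {
    // Out-neighbors of node v are targets[offsets[v]] .. targets[offsets[v + 1] - 1].
    private final int[] offsets;
    private final int[] targets;

    /** Builds the graph from parallel arrays: edge i goes from src[i] to dst[i]. */
    CompactDigraph(int numNodes, int[] src, int[] dst) {
        if (src.length != dst.length) {
            throw new IllegalArgumentException("src and dst must have the same length");
        }
        offsets = new int[numNodes + 1];
        for (int s : src) {
            offsets[s + 1]++;                      // count each node's out-degree
        }
        for (int v = 0; v < numNodes; v++) {
            offsets[v + 1] += offsets[v];          // prefix sums give start positions
        }
        targets = new int[src.length];
        int[] next = Arrays.copyOf(offsets, numNodes); // next free slot per node
        for (int i = 0; i < src.length; i++) {
            targets[next[src[i]]++] = dst[i];
        }
    }

    /** Number of out-edges of node v. */
    int outDegree(int v) {
        return offsets[v + 1] - offsets[v];
    }

    /** Breadth-first search from start; returns hop counts, or -1 if unreachable. */
    int[] bfsDistances(int start) {
        int[] dist = new int[offsets.length - 1];
        Arrays.fill(dist, -1);
        dist[start] = 0;
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.poll();
            for (int i = offsets[v]; i < offsets[v + 1]; i++) {
                int w = targets[i];
                if (dist[w] == -1) {               // first visit is the shortest path
                    dist[w] = dist[v] + 1;
                    queue.add(w);
                }
            }
        }
        return dist;
    }
}
```

For example, new CompactDigraph(5, new int[]{0, 0, 1, 2, 4}, new int[]{1, 2, 3, 3, 0}).bfsDistances(0) walks outward from node 0 along out-edges. The flat-array layout in this sketch costs one int per edge plus one int per node, which is the kind of trade-off that matters once a graph reaches billions of edges.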

At Twitter, Cassovary forms the bottom layer of a stack that we use to power many of our graph-based features, including “Who to Follow” and “Similar to.” We also use it for relevance in Twitter Search and the algorithms that determine which Promoted Products users will see. Over time, we hope to bring more non-proprietary logic from some of those product features into Cassovary.

Please use, fork, and contribute to Cassovary if you can. If you have any questions, ask on the mailing list or file issues on GitHub. Also, follow @cassovary for updates.

-Pankaj Gupta (@pankaj)