Cassandra is an open source distributed data persistence system designed for storing and managing large amounts of data across many servers.
Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tunably consistent, column-oriented database that bases its distribution design on Amazon’s Dynamo and its data model on Google’s Bigtable. Created at Facebook, it is now used at some of the most popular sites on the Web.

Installing Cassandra
Cassandra can be installed on most popular operating systems: Windows (XP/Vista/7/8), Mac OS X, and Linux variants such as Ubuntu, Red Hat, and CentOS.
Any platform with JVM 1.6 or higher should work just fine. There is a Debian package, and DataStax provides a third-party RPM distribution.
The more popular route is the tar distribution. To follow this article, download the binary distribution from the Apache website and unpack it.
I am demonstrating this on Windows 7.
Once unpacked, the folder structure should look like this.
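On Linux or Mac OS X, the unpacking step could be sketched roughly as below. This is only an illustration: the function name unpack_cassandra, the archive name, and the destination directory are placeholders of mine, not something the distribution provides.

```shell
# unpack_cassandra ARCHIVE DEST
# Extract a gzipped tar archive into DEST and list what landed there.
# (Hypothetical helper for illustration; not part of Cassandra.)
unpack_cassandra() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    tar -xzf "$archive" -C "$dest" || return 1
    # Show the top-level entry created by the archive.
    ls "$dest"
}

# Example call (archive name and target directory are placeholders):
# unpack_cassandra "apache-cassandra-$VERSION-bin.tar.gz" "$HOME/opt"
```

Inside the unpacked root you should find directories such as bin, conf, interface, javadoc, and lib, which are described further down.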
I generally create two environment variables:
JAVA_HOME, which points to the Java JDK, and
CASSANDRA_HOME, which points to the root directory of Cassandra, as shown above in the screenshot.
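For readers on a Unix-like shell rather than Windows, the same two variables might be set along these lines. Both paths are made-up examples and need to be replaced with wherever your JDK and Cassandra actually live. On Windows the equivalent is done through the system environment variable settings.

```shell
# Hypothetical locations; adjust both to match your machine.
export JAVA_HOME="/usr/lib/jvm/java-6-openjdk"
export CASSANDRA_HOME="/opt/apache-cassandra"

# Optional: put the Cassandra scripts (cassandra, nodetool, ...) on the PATH.
export PATH="$CASSANDRA_HOME/bin:$PATH"

echo "JAVA_HOME=$JAVA_HOME"
echo "CASSANDRA_HOME=$CASSANDRA_HOME"
```

Putting these lines in your shell profile makes them persist across sessions.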
By convention, the bin directory contains the executables (batch and shell scripts) to run Cassandra, including the startup scripts and the nodetool utility. It also has scripts for converting SSTables (the data files) to JSON and back.
The conf directory contains the configuration for Cassandra. In versions up to 0.6, the storage-conf.xml file defines the data store by configuring keyspaces and column families. cassandra.yaml and the .sh file configure Cassandra and its environment, and the log4j files set the logging levels.
The interface directory contains the RPC description file, cassandra.thrift, which defines Cassandra's interface.
The javadoc directory contains the standard Javadoc API documentation for Cassandra. Cassandra is a wonderful project, but the code contains precious few comments, so you might find the Javadoc's usefulness limited. It may be more fruitful to simply read the class files directly if you're familiar with Java.
The lib directory contains all the dependencies and libraries that Cassandra needs, such as JSON parsers and Google's collections library.
There is also a git repo with the source available:
git clone git://git.apache.org/cassandra.git
By default, Cassandra keeps its data and commit logs under /var/lib/cassandra and writes its log files under /var/log/cassandra (these are Linux filesystem paths).
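In the past (version <= 0.6), Cassandra kept these settings in a file called
storage-conf.xml. From 0.7 onward they live in cassandra.yaml, and all log-related settings go in the log4j properties files
(log4j-server.properties for the server, log4j-tools.properties for the tools). On Windows I recommend changing these paths appropriately.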
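As a rough sketch of changing the log location from a shell, the helper below rewrites one line of a log4j properties file. It assumes the file sets the log path with the key log4j.appender.R.File, so check your own copy before relying on it. The name set_log_file and the example paths are my own placeholders.

```shell
# set_log_file PROPS LOGFILE
# Point the file appender in a log4j properties file at LOGFILE.
# Assumption: the path is set via the key log4j.appender.R.File.
# Use forward slashes in LOGFILE (e.g. C:/cassandra/logs/system.log);
# backslashes would be treated as escapes by sed and by the properties format.
set_log_file() {
    props="$1"
    logfile="$2"
    scratch="$props.tmp"
    sed "s|^log4j\.appender\.R\.File=.*|log4j.appender.R.File=$logfile|" \
        "$props" > "$scratch" && mv "$scratch" "$props"
}

# Example call (both paths are placeholders):
# set_log_file conf/log4j-server.properties "C:/cassandra/logs/system.log"
```

Editing the file by hand in a text editor works just as well. The data and commit log directories are changed separately, in cassandra.yaml.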
cd apache-cassandra-$VERSION
sudo mkdir -p /var/log/cassandra
sudo chown -R `whoami` /var/log/cassandra
sudo mkdir -p /var/lib/cassandra
sudo chown -R `whoami` /var/lib/cassandra
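Before starting Cassandra it can be worth confirming that these directories exist and are writable by your user. A minimal sketch, with check_dirs being a hypothetical helper name of mine:

```shell
# check_dirs DIR...
# Report whether each directory exists and is writable by the current user.
# Returns non-zero if any directory fails the check.
check_dirs() {
    status=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "ok: $dir"
        else
            echo "missing or not writable: $dir"
            status=1
        fi
    done
    return "$status"
}

# Example call:
# check_dirs /var/log/cassandra /var/lib/cassandra
```

If either directory is reported as missing or not writable, rerun the mkdir and chown commands above for it.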
If you like to configure other variables DataStax has a nice description of what goes in the
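Open a command prompt, change to the Cassandra root directory, and type the following (the -f flag keeps Cassandra running in the foreground, so its log output appears in the console):
sh bin/cassandra -f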
You should see something like this (Windows screenshot).
Hope this helps