NoSQL Zone is brought to you in partnership with:

Amresh is a software designer/ developer based in New Delhi, India. He has worked with renowned software service providers. His area of interests are server side web development, NoSQL databases and Java programming. He loves reading, and sharing knowledge. Amresh is a DZone MVB and is not an employee of DZone and has posted 8 posts at DZone. You can read more from them at their website. View Full User Profile

Composite Keys in Cassandra

11.14.2012
| 7472 views |
  • submit to reddit
Introduction

A composite key consists of one or more primary key fields. Each field must be of data type supported by underlying data-store.

In JPA (Java Persistence API), there are two ways of specifying composite keys:

1. Composite Primary Key:

@Entity
@IdClass(TimelineId.class)
public class Timeline {
    @Id int userId;
    @Id long tweetId;

    //Other non-primary key fields
}
Class TimelineId {
    int userId;
    long tweetId;
}

2. Embedded Primary Key:

@Entity
public class Timeline {
    @EmbeddedId TimelineId id;

//Other non-primary key fields
}

@Embeddable
Class TimelineId {
   int userId;
   long tweetId;
}

Above Timeline entity is inspired from famous twissandra example. Starting 1.1 release, Cassandra supports composite keys.

Cassandra Composite Keys in Action

Visit this page in order to understand Cassandra Schema in general. In this section I will give you a feel of how composite keys are stored in Cassandra.

Let's start Cassandra 1.1.x server and run following commands from Cassandra/bin directory:

CQL:

./cqlsh -3 localhost 9160

CREATE KEYSPACE twissandra with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor=1;

use twissandra;

CREATE TABLE timeline(
    user_id varchar,
    tweet_id varchar,
    tweet_device varchar,
    author varchar,
    body varchar,
    PRIMARY KEY(user_id,tweet_id,tweet_device));

INSERT INTO timeline (user_id, tweet_id, tweet_device, author, body) VALUES ('xamry', 't1', 'web', 'Amresh', 'Here is my first tweet');
INSERT INTO timeline (user_id, tweet_id, tweet_device, author, body) VALUES ('xamry', 't2', 'sms', 'Saurabh', 'Howz life Xamry');
INSERT INTO timeline (user_id, tweet_id, tweet_device, author, body) VALUES ('mevivs', 't1', 'iPad', 'Kuldeep', 'You der?');
INSERT INTO timeline (user_id, tweet_id, tweet_device, author, body) VALUES ('mevivs', 't2', 'mobile', 'Vivek', 'Yep, I suppose');

cqlsh:twissandra> select * from timeline;
 user_id | tweet_id | author  | body
---------+----------+---------+------------------------
   xamry |       t1 |  Amresh | Here is my first tweet
   xamry |       t2 | Saurabh |        Howz life Xamry
  mevivs |       t1 | Kuldeep |               You der?
  mevivs |       t2 |   Vivek |         Yep, I suppose

cqlsh:twissandra> SELECT * FROM timeline WHERE user_id='xamry';
 user_id | tweet_id | tweet_device | author  | body
---------+----------+--------------+---------+------------------------
   xamry |       t1 |          web |  Amresh | Here is my first tweet
   xamry |       t2 |          sms | Saurabh |        Howz life Xamry

cqlsh:twissandra> select * from timeline where tweet_id = 't1';
 user_id | tweet_id | tweet_device | author  | body
---------+----------+--------------+---------+------------------------
   xamry |       t1 |          web |  Amresh | Here is my first tweet
  mevivs |       t1 |         iPad | Kuldeep |               You der?

cqlsh:twissandra> select * from timeline where user_id = 'xamry' and tweet_id='t1';
 user_id | tweet_id | tweet_device | author | body
---------+----------+--------------+--------+------------------------
   xamry |       t1 |          web | Amresh | Here is my first tweet

cqlsh:twissandra> select * from timeline where user_id = 'xamry' and author='Amresh';
Bad Request: No indexed columns present in by-columns clause with Equal operator

cqlsh:twissandra> select * from timeline where user_id = 'xamry' and tweet_device='web';
Bad Request: PRIMARY KEY part tweet_device cannot be restricted (preceding part tweet_id is either not restricted or by a non-EQ relation)

cqlsh:twissandra> select * from timeline where user_id = 'xamry' and tweet_id = 't1' and tweet_device='web';
 user_id | tweet_id | tweet_device | author | body
---------+----------+--------------+--------+------------------------
   xamry |       t1 |          web | Amresh | Here is my first tweet

Cassandra-cli:

impadmin@impetus-ubuntu:/usr/local/apache-cassandra-1.1.2/bin$ ./cassandra-cli -h localhost -p 9160
Connected to: "Test Cluster" on localhost/9160
Welcome to Cassandra CLI version 1.1.2

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown] use twissandra;
Authenticated to keyspace: twissandra
[default@twissandra] list timeline;
<pre>Using default limit of 100
Using default column limit of 100
-------------------
RowKey: xamry
=> (column=t1:web:author, value=Amresh, timestamp=1343729388951000)
=> (column=t1:web:body, value=Here is my first tweet, timestamp=1343729388951001)
=> (column=t2:sms:author, value=Saurabh, timestamp=1343729388973000)
=> (column=t2:sms:body, value=Howz life Xamry, timestamp=1343729388973001)
-------------------
RowKey: mevivs
=> (column=t1:iPad:author, value=Kuldeep, timestamp=1343729388991000)
=> (column=t1:iPad:body, value=You der?, timestamp=1343729388991001)
=> (column=t2:mobile:author, value=Vivek, timestamp=1343729389941000)
=> (column=t2:mobile:body, value=Yep, I suppose, timestamp=1343729389941001)

Observations
  1. First part of composite key (user_id) is called "Partition Key", rest (tweet_id, tweet_device) are remaining keys.
  2. Cassandra stores columns differently when composite keys are used. Partition key becomes row key. Remaining keys are concatenated with each column name  (":" as separator) to form column names. Column values remain unchanged.
  3. Remaining keys (other than partition keys) are ordered, and it's not allowed to search on any random column, you have to start with the first one and then you can move to the second one and so on. This is evident from "Bad Request" error.




Published at DZone with permission of Amresh Singh, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)