
Don Pinto is a Product Marketing Manager with experience in cloud and database technologies. Don is a DZone MVB, is not an employee of DZone, and has posted 83 posts at DZone.

Secondary Indexes or Full-Text Search?

03.23.2013

Applications get data from Couchbase Server in different ways: they can use basic key-value operations, secondary indexes (views), or full-text search. As a developer, how do you decide whether to use secondary indexes or full-text search for your new app feature? This post explains the differences between secondary indexes and full-text search indexes so that you know which one to use to access data in Couchbase for the scenario at hand.

Views in Couchbase Server are defined in JavaScript using a map function, which pulls data out of your documents, and an optional reduce function, which aggregates the data emitted by the map function. In the map function, you specify which attributes to build the index on. Views are indexed asynchronously, so query results are eventually consistent with respect to the documents stored.
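To make the map and reduce steps concrete, here is a minimal conceptual sketch in Python. It is not Couchbase's API (real views are written in JavaScript and run inside the server); the sample documents and the names `map_fn`, `build_index`, and `reduce_sum` are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample documents, keyed by document ID.
DOCS = {
    "item::1": {"type": "sweater", "name": "Red sweater", "price": 30},
    "item::2": {"type": "pants", "name": "Blue pants", "price": 25},
    "item::3": {"type": "sweater", "name": "Blue sweater", "price": 35},
}


def map_fn(doc_id, doc):
    """Yield (key, value) rows, standing in for emit() in a view's map function."""
    if "type" in doc and "price" in doc:
        yield doc["type"], doc["price"]


def build_index(docs):
    """Run the map function over every document and keep the rows sorted by key."""
    rows = [
        (key, value, doc_id)
        for doc_id, doc in docs.items()
        for key, value in map_fn(doc_id, doc)
    ]
    return sorted(rows)


def reduce_sum(rows):
    """Aggregate the emitted values per key, in the spirit of a sum-style reduce."""
    totals = defaultdict(int)
    for key, value, _doc_id in rows:
        totals[key] += value
    return dict(totals)


index = build_index(DOCS)
totals = reduce_sum(index)
print(totals)
```

The map step decides which attribute becomes the index key (here `type`), and the reduce step only ever sees what the map step emitted.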

Visually, this is what the data structure for a secondary index looks like:

Using a B-tree data structure for secondary indexes optimizes quick key-based lookups (in this case on ‘Item name’) and range queries. For example, imagine that you are building a product catalog app and want to list all the product names that start with ‘A’ through ‘F’. With a secondary index in Couchbase on ‘item name’, only part of the B-tree needs to be accessed, because the matching keys sit next to each other in sorted order.
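The reason a range query touches only part of the index is that the keys are kept in sorted order. The following sketch uses a sorted Python list and binary search as a stand-in for a B-tree; the product names and the `range_lookup` helper are hypothetical, and a real B-tree would store the keys in on-disk nodes rather than in one list.

```python
import bisect

# Hypothetical product names; the index keeps them sorted, as B-tree leaves would.
names = ["Scarf", "Belt", "Gloves", "Apron", "Hat", "Fleece", "Cardigan"]
sorted_keys = sorted(names)


def range_lookup(keys, start, end):
    """Return every key k with start <= k < end using two binary searches."""
    lo = bisect.bisect_left(keys, start)
    hi = bisect.bisect_left(keys, end)
    return keys[lo:hi]


# Names starting with 'A' through 'F': everything from "A" up to, but excluding, "G".
a_to_f = range_lookup(sorted_keys, "A", "G")
print(a_to_f)
```

Only the two boundary positions are searched for; the matching keys are then read as one contiguous slice, with no scan over the rest of the catalog.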

So why use full-text search?

Imagine that you want to list all the products in your store that contain the keyword ‘red’. This includes items such as ‘red sweaters’ and ‘red pants’, and even items whose color attribute is ‘red’. A full-text search index maps each term to the list of IDs of the documents that contain it, which means you can quickly get back the document IDs for a particular term.

Couchbase Server integrates with Elasticsearch, a full-text search engine. Using the Couchbase adapter for Elasticsearch, documents are replicated in real time to Elasticsearch. Elasticsearch parses each document and builds a full-text index so that you can search across all your documents from your app.

The figure above shows how a full-text search index maps terms found in the documents to document IDs. This data structure suits ad hoc search queries: for example, if you’re looking for “sweaters”, you get back the document IDs for the red and blue sweaters.
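The term-to-document mapping can be sketched as a small inverted index. This is a simplified illustration in Python, not how Elasticsearch is implemented: the documents, the `tokenize` rule (lowercase alphanumeric runs, with no stemming or relevance ranking), and the function names are all assumptions.

```python
import re
from collections import defaultdict

# Hypothetical catalog documents, keyed by document ID.
DOCS = {
    "item::1": {"name": "Red sweater", "color": "red"},
    "item::2": {"name": "Blue sweater", "color": "blue"},
    "item::3": {"name": "Red pants", "color": "red"},
    "item::4": {"name": "Wool scarf", "color": "red"},
}


def tokenize(text):
    """Split text into lowercase terms (a deliberately naive analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map every term found in any string attribute to the set of document IDs."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for value in doc.values():
            if isinstance(value, str):
                for term in tokenize(value):
                    index[term].add(doc_id)
    return index


def search(index, term):
    """Return the sorted IDs of the documents that contain the term."""
    return sorted(index.get(term.lower(), set()))


inverted = build_inverted_index(DOCS)
print(search(inverted, "red"))
print(search(inverted, "sweater"))
```

Note that the search for ‘red’ also finds the wool scarf, which matches only through its color attribute. That is the "across any attribute" behavior a key-based secondary index does not give you.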

Now that you understand secondary indexes and full-text search indexes, let’s look at when you should use full-text search and when you should consider a secondary index in your app.

You should use full-text search when:

-  you want to search through large amounts of textual data such as web page content, blog posts, digital articles, and content metadata. Full-text search indexes let you search across the entire dataset and any attribute, and rank the results by some form of relevance.

-  your app needs term-based search.

You should use a secondary index when:

-  you have queries in your app that run over and over again.

-  you know exactly which attributes to query on based on your application. Your queries are exact matches or range queries. For example, you want to get item number “1000” or want a list of all the documents of type “pants” with sizes between 5 and 10.
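The "type and size range" case above can be served by an index on a compound key. In a Couchbase view this would typically be an array key emitted from the map function and queried with a start key and an end key. The Python sketch below only imitates that idea with sorted tuples; the documents and the `query_range` helper are hypothetical.

```python
import bisect

# Hypothetical documents, keyed by document ID.
DOCS = {
    "item::1000": {"type": "pants", "size": 4},
    "item::1001": {"type": "pants", "size": 6},
    "item::1002": {"type": "pants", "size": 10},
    "item::1003": {"type": "pants", "size": 12},
    "item::1004": {"type": "sweater", "size": 8},
}

# Index rows sorted by the compound key (type, size).
rows = sorted(((doc["type"], doc["size"]), doc_id) for doc_id, doc in DOCS.items())
keys = [key for key, _doc_id in rows]


def query_range(start_key, end_key):
    """Return the document IDs whose key satisfies start_key <= key <= end_key."""
    lo = bisect.bisect_left(keys, start_key)
    hi = bisect.bisect_right(keys, end_key)
    return [doc_id for _key, doc_id in rows[lo:hi]]


# Range query: pants with sizes between 5 and 10, inclusive.
pants_5_to_10 = query_range(("pants", 5), ("pants", 10))

# Exact match: start and end key are the same.
pants_12 = query_range(("pants", 12), ("pants", 12))

print(pants_5_to_10, pants_12)
```

Because the attributes are known in advance, the index can be built once and the same query shape reused cheaply, which fits queries that run over and over again.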
 
So, when you’re building your next app feature on Couchbase and deciding whether to use a secondary index or a full-text search index, apply the guidelines above to select the best index for your specific use case. If you're interested in learning more about Couchbase Server and Elasticsearch, register now and don't miss the upcoming webinar.
 
Published at DZone with permission of Don Pinto, author and DZone MVB. (source)
