Leigh has been in the technology industry for over 15 years, writing marketing and technical documentation for Sun Microsystems, Wells Fargo, and more. She currently works at New Relic as a Marketing Manager. Leigh is a DZone MVB, is not an employee of DZone, and has posted 106 posts at DZone.

The Best Unknown Databases for PHP Apps

08.03.2012

Matthew Setter is a professional technical writer and passionate web application developer. He’s also the founder of Malt Blue, the community for PHP web application development professionals and PHP Cloud Development Casts – learn Cloud Development through the lens of PHP. You can connect with him on Twitter, Facebook, LinkedIn or Google+ anytime.

If you’re a longtime PHP developer, an open source enthusiast or geek of any stripe, then you’ve heard about what could be called the golden quadrilogy of relational databases for web-based application development: MySQL, PostgreSQL, Microsoft SQL Server, and Oracle.

With their dominance of the market, you might not have heard about alternatives to the big four. And you could easily be forgiven for not looking elsewhere. But despite their relative merits, impressive pedigrees, commercial support, and large communities, they are hardly the only shows in town.

In this day and age, there is a plethora of options available to us. In this three-part series, I’m going to walk you through five alternative databases that you may or may not have heard of. In it, I’ll explain:

* Why they can be a valuable addition to your software development toolkit
* Where they can be best applied
* How to get started with them

Whether you’re involved in embedded development, OLTP, OLAP, massive scalability and storage or simple database-backed applications, you’re not going to walk away with the same perspective that you had before you started reading these posts. So without further ado, let’s get started with a look at a veteran of the Internet: Berkeley DB.

Berkeley DB
Berkeley DB has been around for quite a long time, predating the World Wide Web as we know it today. Back in 1991, the University of California was migrating from BSD v4.3 to v4.4 and it wanted to distribute a version of UNIX that had no proprietary AT&T code. Berkeley DB was to replace the existing hsearch and dbm/ndbm packages.

Margo Seltzer and Mike Olson created a library toolkit that provided a very robust and high performing key/value datastore. Since then, the code has been used in a variety of products, including the 389 Directory Server (formerly the Netscape LDAP Server), Asterisk PBX, Postfix, Sendmail, SpamAssassin, and Subversion. Each version became increasingly sophisticated, without adding too much complexity.

They include such features as:

* Concurrent data access
* Transaction logging and recovery
* High availability capabilities
* Write-ahead logging
* Checkpoints

In addition, Berkeley DB adds the following features:

* Records and keys can be up to 4GB in length
* A single database can be up to 256 terabytes
* Supports fast data access (keyed and sequential)
* Supports ACID transactions, fine grained locking, hot backups and replication
* Available under Sleepycat Public License or a proprietary license
* Multiple language bindings, including C/C++, Java, C#, .NET, Perl, Python, Ruby, and PHP

What’s more, it’s been available for years in standard Linux and BSD package repositories. You’ve likely already been in contact with it, or are only a step or two away. The reason you might not have heard of it is that it’s not quite a database, at least not in the same vein as, say, Oracle, SQL Server or MySQL. Whereas these are of the traditional RDBMS variety, Berkeley DB is a library and consequently takes a different approach.

With Berkeley DB, you cannot access the database over a network via TCP. Instead, you make in-process API calls. You don’t have a table, row and column structure either.

Whereas traditional RDBMSs place clear constraints on how you structure your datastore and how it can be optimized, Berkeley DB leaves this wide open. You can stipulate just how you want the data to be stored, as suits the needs of your application – similar to the new breed of NoSQL databases including MongoDB, Hadoop and CouchDB.

What’s It Good For?
So what can you use Berkeley DB for? Well, the answer is just about anything. Maybe you’re looking for:

* An embedded database
* An in-memory database
* A massive key/value store
* A datastore that’s both ultra-fast and ultra-scalable

No matter which of these, or which combination of them, you’re looking for, Berkeley DB is up to the task.

How Do You Use It?
But enough talking. Let’s install Berkeley DB locally and have a look at the code so you can see just how easy it is to use.

If you’re on Windows, refer to the Google Code project tutorial. If you’re on Linux or a BSD variant, it’s likely to be in the package repositories. If you want to install it from source, grab a copy of the latest version and run a standard configure, make and make install similar to that below.

./configure
make
make install

After that, you need to install or configure the dba extension for PHP. While the dba extension doesn’t have many functions available, you shouldn’t underestimate the functionality it can provide.
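Before running the examples, it can help to confirm that your PHP build actually has the dba extension and the db4 handler compiled in. The following is a minimal sketch using extension_loaded() and dba_handlers(); the helper name hasDbaHandler is made up for illustration and is not part of the extension.

```php
<?php

/**
 * Report whether the dba extension is loaded and offers the given handler.
 * hasDbaHandler() is a hypothetical helper name used only in this sketch.
 */
function hasDbaHandler($handler)
{
    if (!extension_loaded('dba')) {
        return false;
    }

    // dba_handlers() lists the handlers compiled into this PHP build
    return in_array($handler, dba_handlers(), true);
}

if (hasDbaHandler('db4')) {
    echo "db4 handler is available\n";
} else {
    echo "db4 handler is missing; check how the dba extension was built\n";
}
```

If the handler is missing, the dba_open calls in the rest of this post will fail, so this is worth checking first.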

Let’s look at a simple example:

<?php
 
define('DATABASE_NAME', "/tmp/test.db");
define('RECORD_KEY', "key");
define('RECORD_VALUE', "This is another example!");
 
$dbID = dba_open(DATABASE_NAME, "n", "db4");
 
if (!$dbID) {
    echo "dba_open failed\n";
    exit;
}
 
// replace/insert a value
dba_replace(RECORD_KEY, RECORD_VALUE, $dbID);
 
// display the value if it exists and delete afterwards
if (dba_exists(RECORD_KEY, $dbID)) {
    echo dba_fetch(RECORD_KEY, $dbID);
    dba_delete(RECORD_KEY, $dbID);
}
 
dba_close($dbID);

In the example above, we’ve defined a set of constants which store the name of our database, the record key and the record value. We then open a db4 database in "n" mode, which gives us create, truncate and read-write access.
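Because "n" truncates the file on every open, anything stored in an earlier run is lost. If you want the data to survive between runs, the "c" mode (read-write, creating the database only if it doesn’t exist) is probably the better fit. Here is a rough sketch of the difference; the file path and the helper name openAndCheck are assumptions made for this demonstration.

```php
<?php

// Hypothetical path, used only for this demonstration
$path = sys_get_temp_dir() . '/dba-modes-demo.db';

/**
 * Open $path in the given mode, optionally store one record, and
 * report whether the key 'greeting' is present before closing.
 */
function openAndCheck($path, $mode, $store)
{
    $db = dba_open($path, $mode, 'db4');
    if (!$db) {
        throw new Exception("dba_open failed for mode $mode");
    }

    if ($store) {
        dba_replace('greeting', 'hello', $db);
    }

    $found = dba_exists('greeting', $db);
    dba_close($db);

    return $found;
}

// Only run the demonstration when the db4 handler is present
if (extension_loaded('dba') && in_array('db4', dba_handlers(), true)) {
    var_dump(openAndCheck($path, 'n', true));  // create or truncate, then store
    var_dump(openAndCheck($path, 'c', false)); // reopen without truncating
    var_dump(openAndCheck($path, 'n', false)); // truncate again, store nothing
}
```

The second call should still see the record, while the third should not, since "n" empties the database before handing it back.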

We then do a quick check to see whether the database opened successfully and set about adding the data to the database. Though the command is named dba_replace, it does one of two things:

1. If the record exists, its value is replaced with the new value.
2. If it doesn’t, then it’s created with the value provided.

Nice and simple.
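If you would rather not overwrite an existing record, the dba extension also provides dba_insert, which refuses to store a value under a key that is already present. The sketch below wraps it in a hypothetical helper, storeIfAbsent, and checks dba_exists first so the result doesn’t depend on how a particular handler reports a duplicate key. The database path is an assumption for the demonstration.

```php
<?php

/**
 * Store $value under $key only if the key is not already taken.
 * storeIfAbsent() is a hypothetical helper; $db is an open dba handle.
 */
function storeIfAbsent($db, $key, $value)
{
    if (dba_exists($key, $db)) {
        return false;
    }

    return dba_insert($key, $value, $db);
}

// Only run the demonstration when the db4 handler is present
if (extension_loaded('dba') && in_array('db4', dba_handlers(), true)) {
    $db = dba_open(sys_get_temp_dir() . '/dba-insert-demo.db', 'n', 'db4');

    if ($db) {
        var_dump(storeIfAbsent($db, 'key', 'first'));  // key is new
        var_dump(storeIfAbsent($db, 'key', 'second')); // key already exists
        echo dba_fetch('key', $db), "\n";              // value from the first call
        dba_close($db);
    }
}
```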

Following this, we check if the record exists, fetch it from the database, print it, and delete it afterwards. After all the work is done, we close the connection. Now let’s have a look at a slightly less trivial example.

A Slightly Less Trivial Example
Let’s say that we’re creating a simple phonebook backed by a Berkeley DB file. The key will be the user’s username and the value will be the full contact details of the user in JSON format.

class simpleAddressBook
{
    // the database filename
    const DATABASE_NAME = "/tmp/address-book.db";
 
    protected $_dbhndl = null;
 
    public function __construct($dbname = null)
    {
        if (!empty($dbname)) {
            $this->_dbhndl = dba_open($dbname, "n", "db4");
        } else {
            $this->_dbhndl = dba_open(self::DATABASE_NAME, "n", "db4");           
        }
 
        if (!$this->_dbhndl) {
            throw new Exception("dba_open failed");
        }
    }
 
    public function __destruct()
    {
        dba_close($this->_dbhndl);
    }
 
    public function persist($username, array $userData = array())
    {
        if (!empty($username) && !empty($userData)) {
            return dba_replace($username, json_encode($userData), $this->_dbhndl);
        }
        return FALSE;
    }
 
    public function remove($username)
    {
        if (!empty($username) && dba_exists($username, $this->_dbhndl)) {
            return dba_delete($username, $this->_dbhndl);
        }
        return FALSE;
    }
 
    public function find($username)
    {
        if (!empty($username) && dba_exists($username, $this->_dbhndl)) {
            return json_decode(dba_fetch($username, $this->_dbhndl), TRUE);
        }
        return FALSE;
    }
}
 
$ab = new simpleAddressBook();
$username = "settermjd";
 
if (($userRecord = $ab->find($username)) !== FALSE) {
    print_r($userRecord);
} else {
    printf("User [%s] was not found<br />", $username);
}
 
$recordItem = array(
    'name' => array(
        'first' => 'Matthew',
        'last' => 'Setter'
    ),
    'email' => 'matthew@maltblue.com',
    'url' => 'http://www.maltblue.com',
    'phone' => '+44 12345 67890',
);
 
if ($ab->persist($username, $recordItem)) {
    print "User stored successfully<br />";
}
 
if (($userRecord = $ab->find($username)) !== FALSE) {
    print_r($userRecord);
} else {
    printf("User [%s] was not found<br />", $username);
}

Have a look at address-book.php. You can see that there is a class called simpleAddressBook. It has three methods: persist, remove and find. It also implements a constructor and destructor for opening and closing the database connection.

Following the class, we interact with the address book. We first check whether a record exists for the given username. We then build the record as a simple array and pass it to the persist function, which JSON encodes it and stores it in the database. Finally, we search for the record again. If it’s found, find retrieves it and JSON decodes it back to the original array value, which we print on the screen.
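Since Berkeley DB only stores strings, the JSON step is what lets us keep a nested array under a single key. The round trip can be tried on its own, without a database; this is just a sketch using a shortened version of the record above.

```php
<?php

$record = array(
    'name'  => array('first' => 'Matthew', 'last' => 'Setter'),
    'email' => 'matthew@maltblue.com',
);

// What persist() hands to dba_replace: a plain JSON string
$stored = json_encode($record);
echo $stored, "\n";

// What find() does with the fetched string: passing TRUE as the second
// argument yields associative arrays rather than stdClass objects
$restored = json_decode($stored, TRUE);

// The decoded value matches the array we started with
var_dump($restored === $record);
```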

As you can see, though there aren’t many functions available, we do have a lot of flexibility in what we can do with Berkeley DB. This post has only scratched the surface of what’s possible. But I hope you can see how much power it has and how flexible it is to work with.
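One thing the address book class doesn’t cover is listing every entry. The dba extension supports this with dba_firstkey and dba_nextkey, which walk the keys one at a time. The following is a sketch only: listRecords is a hypothetical function name, and it assumes every value was stored as JSON, as persist does above.

```php
<?php

/**
 * Return every record in an open dba database as key => decoded value.
 * listRecords() is a hypothetical helper; $db is an open dba handle.
 */
function listRecords($db)
{
    // First pass: collect the keys. dba_firstkey() resets the internal
    // key pointer and dba_nextkey() advances it; both return false
    // once there are no more keys.
    $keys = array();
    for ($key = dba_firstkey($db); $key !== false; $key = dba_nextkey($db)) {
        $keys[] = $key;
    }

    // Second pass: fetch and decode each value
    $records = array();
    foreach ($keys as $key) {
        $records[$key] = json_decode(dba_fetch($key, $db), TRUE);
    }

    return $records;
}
```

Inside simpleAddressBook this could become a method that passes $this->_dbhndl as the handle. Collecting the keys before fetching keeps the iteration separate from the reads, which is a cautious choice rather than a requirement I’ve verified for every handler.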

Winding Up
I hope you’ve enjoyed this first part of the series. In the next installment, I’ll be looking at Firebird, another old hand of the database world and newcomer Gladius DB, a database written in pure PHP and flat files. Until then, let me know what you think of Berkeley DB. Do you see a use for it in your applications?

Want to Learn More?
If you’d like more information about Berkeley DB, check out the following links:

http://www.aosabook.org/en/bdb.html
http://en.wikipedia.org/wiki/Berkeley_DB
http://www.dolcevie.com/anton/docs/dr-dobbs-article.pdf
http://www.stanford.edu/class/cs276a/projects/docs/berkeleydb/reftoc.html

Published at DZone with permission of Leigh Shevchik, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)