PHP-FPM, nginx and Couchbase - The Connection Problems
Before we get started, I’d like to point you to a blog post by my colleague Michael Nitschinger, in which you can learn about the best way to set up your basic PHP and Couchbase environment if you are not using nginx and PHP-FPM:
Recently, we’ve been seeing many people using the Couchbase PHP SDK alongside nginx and PHP-FPM. There have, however, been some recurring issues among these users, which I aim to troubleshoot in this article!
The issue you may be experiencing is that you cannot control the number of connections made through FPM to Couchbase; you are at the mercy of the number of PHP child processes. The fact is that in FPM you do not need as many connections to Couchbase as you have processes.
The performance impact can be huge. Let’s say we have FPM's max_children set to 300; under load, you'll have 300 PHP processes running, and each one of those may have 4 persistent CouchbaseClient instances. Across 16 machines, this is almost 20,000 client objects (300 × 4 × 16 = 19,200). These parallel FPM processes cannot share a single Couchbase connection, and this is our main issue. Our other issue is that once 300 processes are reached, these connections will be torn down and restarted. This is expensive and something we should avoid.
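To make the arithmetic concrete, here is a minimal Python sketch of the worst-case count, using the figures from the example above. The function name is hypothetical and only there for illustration; it is not part of any Couchbase or FPM tooling.

```python
def total_client_objects(max_children, clients_per_process, machines):
    """Worst-case number of Couchbase client objects across the whole fleet."""
    # Each FPM child holds its own clients; nothing is shared between processes.
    return max_children * clients_per_process * machines


# Figures from the example: 300 children, 4 persistent clients each, 16 machines.
print(total_client_objects(300, 4, 16))  # 19200
```

Plugging in your own max_children and machine count gives a quick upper bound on what the cluster has to cope with.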
Let's step back a bit and get to the root of the issue. Our Couchbase Smart Clients keep a connection open to the cluster REST manager so that they are notified about changes in the topology. This works fine for most languages, in which we would typically use a limited number of processes with perhaps just 2-3 objects in each of them, so that each server would probably have just 5 such connections to the cluster. However, in a PHP deployment with FPM, people typically spin up around 2000 such processes, and even if you use just ONE Couchbase object in each of them, each server will then occupy 2,000 connections to our REST server.
There are a couple of ways of solving these issues, though. One way is to use the Couchbase Configuration Cache (available in the PHP SDK v1.1.5+). When using the config_cache, creating a new client instance first looks for a cached version of the cluster config; if one is found, the client doesn't query the node for the config at all and just uses the cached value. Only if it can't connect via a cached config will it open a connection (on port 8091, speaking HTTP) to one of the Couchbase nodes you pass to the constructor in your code.
The config_cache is an optional path to a directory where the library may store files containing the cluster topology. (It is used to cache the configuration instead of connecting to the REST server each time to download it.) You can enable the configuration cache by putting the following line into your Couchbase.ini file:
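The cache-first lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the SDK's actual implementation: the function names, the JSON file format and the fake fetch function are all assumptions, and the sketch falls back to the cluster only when the cache file is missing or unreadable.

```python
import json
import os
import tempfile


def load_cluster_config(cache_path, fetch_from_rest):
    """Return (config, source): the cached copy if usable, else a fresh download."""
    try:
        with open(cache_path) as f:
            return json.load(f), "cache"
    except (OSError, ValueError):
        # No cache file yet, or it is unreadable: fall back to the cluster.
        config = fetch_from_rest()
        with open(cache_path, "w") as f:
            json.dump(config, f)
        return config, "rest"


# Stand-in for the HTTP request to port 8091; records how often it is called.
rest_calls = []


def fake_fetch():
    rest_calls.append(1)
    return {"nodes": ["node1:8091", "node2:8091"]}


cache_file = os.path.join(tempfile.mkdtemp(), "cluster.json")
first = load_cluster_config(cache_file, fake_fetch)
second = load_cluster_config(cache_file, fake_fetch)
print(first[1], second[1], len(rest_calls))
```

Only the first call reaches the (fake) REST endpoint; the second is served from the file, which is the saving the config cache is meant to give each new client instance.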
couchbase.config_cache = "/PATH/TO/SOME/DIRECTORY"
You can read more about the Config Cache here: http://www.couchbase.com/wiki/display/couchbase/libcouchbase+configuration+cache
By utilising the config_cache, we can remove roughly ¾ of the connections that would have been made, saving resources and partly fixing the issue. The reason this helps, I believe, is that it removes ¾ of the client objects that are being created. The persistent connections are also shared more, which provides some more efficiency.
We can also change the FPM config to run fewer parallel processes by reducing pm.max_children and increasing pm.max_requests. I suggest raising the pm.max_requests value from 300 to 3000 or 5000 (depending on what kind of memory bloat you see in your PHP processes over time), which will also reduce the number of times these client objects have to be recreated, and reducing pm.max_children to 100 or so.
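As a rough sketch, the two FPM settings discussed above would look something like this in a pool configuration file. The values are simply the ones suggested in this article, so treat them as a starting point to tune rather than a recommendation for every workload; the rest of your pool definition stays as it is.

```ini
; Fragment of a PHP-FPM pool file; only the two directives discussed above.
; Fewer parallel workers means fewer Couchbase client objects per machine.
pm.max_children = 100
; Recycle each worker less often so its clients are recreated less often.
; Try 3000 to 5000, depending on the memory bloat you see over time.
pm.max_requests = 3000
```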
Have a look at my sample FPM.conf file for reference: https://gist.github.com/rbin/82e47f7f75f2072f02fd