David Catuhe is a Microsoft HTML5 Evangelist based in Paris, France. He does much of his research on Windows 8, Kinect, and the various HTML5 standards and JS libraries. David is a DZone MVB and is not an employee of DZone and has posted 23 posts at DZone.

Reducing Pressure on the Garbage Collector with F12 Developer Bar for IE11

09.05.2013

As you may know, I’m working on a 3D engine for WebGL (Babylon.js) in my spare time. A 3D engine is a place where matrices, vectors and quaternions live. And there may be tons of them!

Please note that everything done here applies to Internet Explorer 11 and Windows 8.1 apps developed with HTML5/JavaScript.

Removing Non-required Instantiations

For instance, let’s have a look at this scene:

image

Using the F12 developer bar, you can launch a profiler to analyze what is going on in terms of performance. The profiler has "Start" and "Stop" buttons to capture a period of time, after which it gives you this screen:

image

The drawElements function is the function that actually renders the objects, so it is no surprise to see it at the top. The second one (multiply) multiplies two matrices. This function is used a lot: for each object, you have to compute the matrix required to draw it, the matrix used to position each texture, and so on.

multiply is used more than 12,000 times during a period of two seconds!

Here is the code for this function:

BABYLON.Matrix.prototype.multiply = function (other) {
    var result = new BABYLON.Matrix();

    result.m[0] = this.m[0] * other.m[0] + this.m[1] * other.m[4] + this.m[2] * other.m[8] + this.m[3] * other.m[12];
    result.m[1] = this.m[0] * other.m[1] + this.m[1] * other.m[5] + this.m[2] * other.m[9] + this.m[3] * other.m[13];
    result.m[2] = this.m[0] * other.m[2] + this.m[1] * other.m[6] + this.m[2] * other.m[10] + this.m[3] * other.m[14];
    result.m[3] = this.m[0] * other.m[3] + this.m[1] * other.m[7] + this.m[2] * other.m[11] + this.m[3] * other.m[15];

    result.m[4] = this.m[4] * other.m[0] + this.m[5] * other.m[4] + this.m[6] * other.m[8] + this.m[7] * other.m[12];
    result.m[5] = this.m[4] * other.m[1] + this.m[5] * other.m[5] + this.m[6] * other.m[9] + this.m[7] * other.m[13];
    result.m[6] = this.m[4] * other.m[2] + this.m[5] * other.m[6] + this.m[6] * other.m[10] + this.m[7] * other.m[14];
    result.m[7] = this.m[4] * other.m[3] + this.m[5] * other.m[7] + this.m[6] * other.m[11] + this.m[7] * other.m[15];

    result.m[8] = this.m[8] * other.m[0] + this.m[9] * other.m[4] + this.m[10] * other.m[8] + this.m[11] * other.m[12];
    result.m[9] = this.m[8] * other.m[1] + this.m[9] * other.m[5] + this.m[10] * other.m[9] + this.m[11] * other.m[13];
    result.m[10] = this.m[8] * other.m[2] + this.m[9] * other.m[6] + this.m[10] * other.m[10] + this.m[11] * other.m[14];
    result.m[11] = this.m[8] * other.m[3] + this.m[9] * other.m[7] + this.m[10] * other.m[11] + this.m[11] * other.m[15];

    result.m[12] = this.m[12] * other.m[0] + this.m[13] * other.m[4] + this.m[14] * other.m[8] + this.m[15] * other.m[12];
    result.m[13] = this.m[12] * other.m[1] + this.m[13] * other.m[5] + this.m[14] * other.m[9] + this.m[15] * other.m[13];
    result.m[14] = this.m[12] * other.m[2] + this.m[13] * other.m[6] + this.m[14] * other.m[10] + this.m[15] * other.m[14];
    result.m[15] = this.m[12] * other.m[3] + this.m[13] * other.m[7] + this.m[14] * other.m[11] + this.m[15] * other.m[15];

    return result;
};

It is verbose, but there is nothing complex in it.

Things look more worrying when you use the F12 developer bar to track the responsiveness of your page:

image

As you can see, the garbage collector (orange bars) is called very often! This is not a good thing, because it can cause visual glitches by interrupting the rendering of your frames.

On the same screen, you can also have more details:

image

This capture shows something important: the garbage collector runs on a background thread (12516), which frees up time on the render thread. You can see it running at the same time as the animation frame callback, since babylon.js uses requestAnimationFrame to render each frame.

Even though IE11's garbage collector runs on a background thread, we still have to reduce memory pressure, because our code may run on low-end hardware where background threads are not available, or in browsers that do not have a background garbage collector.

So, as much as you can, avoid instantiation in frequently called code (the new BABYLON.Matrix() here) and prefer reusing objects over creating new ones. The updated multiply function can then be:

BABYLON.Matrix.prototype.multiplyToRef = function (other, result) {
    // result is an existing BABYLON.Matrix; the product is written into result.m, so nothing is allocated here
    result.m[0] = this.m[0] * other.m[0] + this.m[1] * other.m[4] + this.m[2] * other.m[8] + this.m[3] * other.m[12];
    result.m[1] = this.m[0] * other.m[1] + this.m[1] * other.m[5] + this.m[2] * other.m[9] + this.m[3] * other.m[13];
    result.m[2] = this.m[0] * other.m[2] + this.m[1] * other.m[6] + this.m[2] * other.m[10] + this.m[3] * other.m[14];
    result.m[3] = this.m[0] * other.m[3] + this.m[1] * other.m[7] + this.m[2] * other.m[11] + this.m[3] * other.m[15];

    result.m[4] = this.m[4] * other.m[0] + this.m[5] * other.m[4] + this.m[6] * other.m[8] + this.m[7] * other.m[12];
    result.m[5] = this.m[4] * other.m[1] + this.m[5] * other.m[5] + this.m[6] * other.m[9] + this.m[7] * other.m[13];
    result.m[6] = this.m[4] * other.m[2] + this.m[5] * other.m[6] + this.m[6] * other.m[10] + this.m[7] * other.m[14];
    result.m[7] = this.m[4] * other.m[3] + this.m[5] * other.m[7] + this.m[6] * other.m[11] + this.m[7] * other.m[15];

    result.m[8] = this.m[8] * other.m[0] + this.m[9] * other.m[4] + this.m[10] * other.m[8] + this.m[11] * other.m[12];
    result.m[9] = this.m[8] * other.m[1] + this.m[9] * other.m[5] + this.m[10] * other.m[9] + this.m[11] * other.m[13];
    result.m[10] = this.m[8] * other.m[2] + this.m[9] * other.m[6] + this.m[10] * other.m[10] + this.m[11] * other.m[14];
    result.m[11] = this.m[8] * other.m[3] + this.m[9] * other.m[7] + this.m[10] * other.m[11] + this.m[11] * other.m[15];

    result.m[12] = this.m[12] * other.m[0] + this.m[13] * other.m[4] + this.m[14] * other.m[8] + this.m[15] * other.m[12];
    result.m[13] = this.m[12] * other.m[1] + this.m[13] * other.m[5] + this.m[14] * other.m[9] + this.m[15] * other.m[13];
    result.m[14] = this.m[12] * other.m[2] + this.m[13] * other.m[6] + this.m[14] * other.m[10] + this.m[15] * other.m[14];
    result.m[15] = this.m[12] * other.m[3] + this.m[13] * other.m[7] + this.m[14] * other.m[11] + this.m[15] * other.m[15];
};

It is almost the same code, minus the instantiation. But removing 6,000 instantiations per second (12,000 calls over two seconds) can be a great optimization!

The point here is that you have to create a storage matrix for each operation: the matrix is created once, inside the constructor, and reused each time the multiply operation is needed.
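The storage-matrix pattern can be sketched as follows. This is a minimal illustration and not babylon.js code: Matrix, Node3D, _worldMatrix and computeWorldMatrix are hypothetical names, the Float32Array storage is an assumption, and the 4x4 product is written as a loop for brevity instead of being unrolled.

```javascript
// Hypothetical stand-in for BABYLON.Matrix: 16 values stored in this.m.
function Matrix() {
    this.m = new Float32Array(16); // assumption: typed-array storage, zero-filled
}

Matrix.identity = function () {
    var result = new Matrix();
    result.m[0] = result.m[5] = result.m[10] = result.m[15] = 1;
    return result;
};

// Writes this * other into result.m without allocating anything.
// result must be a different object from both this and other.
Matrix.prototype.multiplyToRef = function (other, result) {
    for (var row = 0; row < 4; row++) {
        for (var col = 0; col < 4; col++) {
            var sum = 0;
            for (var k = 0; k < 4; k++) {
                sum += this.m[row * 4 + k] * other.m[k * 4 + col];
            }
            result.m[row * 4 + col] = sum;
        }
    }
};

// Hypothetical scene node: the storage matrix is created once, in the constructor.
function Node3D(localMatrix) {
    this.localMatrix = localMatrix;
    this._worldMatrix = new Matrix(); // reused on every call
}

// Meant to be called once per frame; no Matrix is created here.
Node3D.prototype.computeWorldMatrix = function (parentWorldMatrix) {
    this.localMatrix.multiplyToRef(parentWorldMatrix, this._worldMatrix);
    return this._worldMatrix;
};

var node = new Node3D(Matrix.identity());
var parent = Matrix.identity();
parent.m[12] = 5; // an arbitrary non-identity entry

var first = node.computeWorldMatrix(parent);
var second = node.computeWorldMatrix(parent);

console.log(first === second); // true: the same storage object is returned each time
console.log(first.m[12]);      // 5
```

One caveat with this style: the storage matrix must not be one of the operands, because its values would be overwritten while they are still being read. Callers also should not hold on to the returned matrix across frames, since its contents change on the next call.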

After doing the same thing for every function working with matrices, vectors, colors and quaternions, the responsiveness graph of babylon.js is far better:

image

Great, isn’t it? The garbage collector will obviously still run, but far less frequently.

GC-friendly Array Object

I also found another way to reduce memory pressure. While rendering a frame, I use a lot of arrays to track active objects, active shaders, active particles, and so on.

So basically at the beginning of every frame, I was using this code:

this._activeMeshes = [];

Even though this code is simple, it has a big impact on memory, because a new array is allocated every frame and the previous one becomes garbage. That’s why I decided to create a new kind of array that can reuse its initially allocated space:

// Garbage collector friendly array
BABYLON.Tools.GCFriendlytArray = function (capacity) {
    this.data = new Array(capacity);
    this.length = 0;
};

BABYLON.Tools.GCFriendlytArray.prototype.push = function (value) {
    if (this.length >= this.data.length) {
        this.data.length *= 2;
    }
    this.data[this.length++] = value;
};

BABYLON.Tools.GCFriendlytArray.prototype.reset = function () {
    this.length = 0;
};

BABYLON.Tools.GCFriendlytArray.prototype.indexOf = function (value) {
    var position = this.data.indexOf(value);

    if (position >= this.length) {
        return -1;
    }

    return position;
};

With this small piece of code, you get an array that can be reset in order to reuse its memory. You can create it with an estimated size and just call reset() to, well, reset it.

Using the new array is like using a standard array: you have a length property and a push function. The only difference is in how you access the data, because you have to go through myArray.data:

for (subIndex = 0; subIndex < activeMeshes.length; subIndex++) {
    activeMeshes.data[subIndex].render();
}
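Putting the pieces together, the per-frame pattern might look like the sketch below. ReusableArray is a compact, hypothetical stand-in that mirrors the GC-friendly array above, and Scene, evaluateActiveMeshes and isVisible are names assumed for illustration, not the babylon.js API.

```javascript
// Compact stand-in for the GC-friendly array above (hypothetical name).
function ReusableArray(capacity) {
    this.data = new Array(capacity);
    this.length = 0;
}

ReusableArray.prototype.push = function (value) {
    if (this.length >= this.data.length) {
        this.data.length *= 2; // grow the backing store only when it is full
    }
    this.data[this.length++] = value;
};

ReusableArray.prototype.reset = function () {
    this.length = 0; // keep the backing store, forget the logical content
};

// Hypothetical scene: the list is allocated once, in the constructor.
function Scene(meshes) {
    this.meshes = meshes;
    this._activeMeshes = new ReusableArray(256);
}

// Meant to be called at the beginning of every frame.
Scene.prototype.evaluateActiveMeshes = function () {
    this._activeMeshes.reset(); // takes the place of: this._activeMeshes = [];
    for (var index = 0; index < this.meshes.length; index++) {
        if (this.meshes[index].isVisible) {
            this._activeMeshes.push(this.meshes[index]);
        }
    }
    return this._activeMeshes;
};

var scene = new Scene([{ isVisible: true }, { isVisible: false }, { isVisible: true }]);
var backingStore = scene._activeMeshes.data;

scene.evaluateActiveMeshes(); // frame 1
scene.evaluateActiveMeshes(); // frame 2

console.log(scene._activeMeshes.length);                // 2
console.log(scene._activeMeshes.data === backingStore); // true: no new array per frame
```

The initial capacity of 256 is arbitrary here. Choosing it close to the expected number of active meshes avoids the doubling step during normal frames.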

Some Additional Notes

Please note that the optimizations described here are useful in my case because I was looking for performance and did not care about memory consumption. Indeed, I had to use a lot of memory to create the required cached objects.

For instance, the GC-friendly arrays have to reserve a lot of memory that will not necessarily be used. The tradeoff between memory and performance must be taken seriously.
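The memory side of this tradeoff can be made visible with a small sketch. It uses a compact, hypothetical ReusableArray that mirrors the GC-friendly array above, and it shows two effects: the backing store never shrinks, and slots past length still reference objects pushed before reset(), so those objects stay reachable.

```javascript
// Compact stand-in for the GC-friendly array above (hypothetical name).
function ReusableArray(capacity) {
    this.data = new Array(capacity);
    this.length = 0;
}

ReusableArray.prototype.push = function (value) {
    if (this.length >= this.data.length) {
        this.data.length *= 2;
    }
    this.data[this.length++] = value;
};

ReusableArray.prototype.reset = function () {
    this.length = 0;
};

var list = new ReusableArray(4);
for (var i = 0; i < 5; i++) {
    list.push({ id: i }); // the fifth push doubles the backing store from 4 to 8
}

console.log(list.length);      // 5
console.log(list.data.length); // 8

list.reset();
list.push({ id: 99 });

console.log(list.length);      // 1
console.log(list.data.length); // 8: the reserved space is kept
console.log(list.data[1].id);  // 1: an object from before reset() is still referenced
```

If retained objects become a problem, one option is to clear the unused slots from time to time, at the cost of extra work per reset.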

Going Further

Please find here some great links about the F12 developer bar of IE delivered during Build 2013:

Published at DZone with permission of David Catuhe, author and DZone MVB. (source)