
Justin Bozonier is the Product Optimization Specialist at GrubHub, formerly Sr. Developer/Analyst at Cheezburger. He has engineered a large, scalable analytics system and worked on actuarial modeling software. As Product Optimization Specialist he currently leads split test design, implementation, and analysis. The opinions expressed here are his own and not those of his employer.

Message-Oriented Object Design and Machine Learning in JavaScript

10.20.2012

This article will show how to use Message Oriented Object Design (not unlike Message Oriented Programming aka MOP or Actor Model) to model your user interface as an actor and handle some more complex processing while updating the user interface. Specifically, the sample code implements a simple machine learning exercise wherein you enter any character on your keyboard and the program attempts to guess what you chose (without cheating ;).
 
First, what is Message Oriented Object Design (henceforth referred to as MOOD)? It is an object-oriented design philosophy wherein we view objects as sending immutable messages or publishing events on channels. MOOD systems also rely on the configuration of object networks to enable collaboration between objects. A core tenet is the lack of inter-object getters (be they method or property calls). Since I wrote this example in JavaScript, which has no inherent support for this concept, all of the ideals of MOOD will need to be enforced by convention. Message Oriented Object Design is a term I made up. I'm not sure that it's sufficiently different from Message Oriented Programming or Message Based Programming to warrant existence, but I also don't want to sully those terms with my own ideas if there are important subtleties I'm missing.
 
I'm in the process of writing an in-depth article on Message Oriented Object Design, so if you want to know more just let me know and I'll contact you when it's available. For now, suffice it to say that the words object and actor are interchangeable, as are the words message, method, and event.
 
The problem we're going to solve is this: given a text box where a user can enter any single character, we will create a system that uses each entry to predict what the user's next entry will be and also updates the web page with stats on how we're doing. Because we're using the MOOD philosophy, there will be no getters between objects (using them on private methods within an object is perfectly acceptable, though).
 
To get started, I wrote the following very simple JavaScript object to represent the user interface:

var user_actor = function(guess_dom_target, accuracy_dom_target)
{
  this._target_element = guess_dom_target;
  this._accuracy_target_element = accuracy_dom_target;
};
user_actor.prototype.send_guesses_to = function(channel)
{
  this._guess_channel = channel;
};
user_actor.prototype.value_entered = function(value)
{
  this._guess_channel.next(value);
};
user_actor.prototype.previously_guessed_value_updated = function(guess_value)
{
  this._target_element.html(guess_value);
};
user_actor.prototype.accuracy_updated = function(accuracy_value)
{
  this._accuracy_target_element.html(accuracy_value * 100);
};

One of the key ideas that makes MOOD so powerful is that it views your user interface as just another MOOD object (basically as an actor). This means that all of the UI eventing that can be so troublesome finds a home here. The idea of asynchronous actions will be built into all of our objects, so even as we switch contexts to work on the machine learning portion of the system, the overall object design will look very familiar.
 
Here you can also see the concept of a channel in my objects. In MOOD (and Message Oriented Programming) we always assume that we're using a channel which will pass along our message to the correct object. So while we will end up passing an object reference as the channel, this assumption forces us to view our code as though it is an isolated object, unaware of how its method calls will affect others. This enables an extremely clean separation of concerns (the Single Responsibility Principle, or SRP) and makes it easier to spot when we violate SRP. How? Look at the semantic meaning of the code in the object. Does any of it seem out of place for an object that's managing this type of UI? Why isn't there any knowledge of the learning that this program will be doing? Think about this as we continue.
 
Once I had this code written, I wrote some quick test code just to make sure it was outputting the correct values to the correct spots in the HTML. I'll leave writing that code as an exercise to the reader as it is fairly trivial.
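For readers who would rather skip the exercise, a quick check along those lines might look like the sketch below. It is not my original test code: fake_element is a hypothetical stand-in for a jQuery object that only remembers what html() was given, recording_channel is a hypothetical stand-in for the learner, and the actor is repeated from above so the snippet runs on its own.

```javascript
// user_actor repeated from the listing above so this snippet is self-contained.
var user_actor = function(guess_dom_target, accuracy_dom_target)
{
  this._target_element = guess_dom_target;
  this._accuracy_target_element = accuracy_dom_target;
};
user_actor.prototype.send_guesses_to = function(channel)
{
  this._guess_channel = channel;
};
user_actor.prototype.value_entered = function(value)
{
  this._guess_channel.next(value);
};
user_actor.prototype.previously_guessed_value_updated = function(guess_value)
{
  this._target_element.html(guess_value);
};
user_actor.prototype.accuracy_updated = function(accuracy_value)
{
  this._accuracy_target_element.html(accuracy_value * 100);
};

// Hypothetical stand-in for a jQuery object: remembers the last html() argument.
var fake_element = function()
{
  this.last_html = null;
};
fake_element.prototype.html = function(content)
{
  this.last_html = content;
};

// Hypothetical stand-in for the learner: remembers every value it is sent.
var recording_channel = { received: [] };
recording_channel.next = function(value)
{
  this.received.push(value);
};

var guess_display = new fake_element();
var accuracy_display = new fake_element();
var test_user = new user_actor(guess_display, accuracy_display);
test_user.send_guesses_to(recording_channel);

test_user.value_entered("a");
test_user.previously_guessed_value_updated("b");
test_user.accuracy_updated(0.5);

console.log(recording_channel.received); // values forwarded to the channel
console.log(guess_display.last_html);    // guess written to the page
console.log(accuracy_display.last_html); // accuracy shown as a percentage
```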
 
Next up, I iterated on the actor that manages the learning task. While the latest code uses a Markov chain to learn the user's patterns, I built it incrementally, starting by just having it guess "yesterday's weather" (i.e., use the current input as our prediction of the next input). This is the completed implementation:

var learning_actor = function()
{
  this._markov_chain = {};
  this._guessed_value = "";
  this._previous_value = "";
};
learning_actor.prototype.set_guess_channel_to = function(channel)
{
  this._guess_channel = channel;
};
learning_actor.prototype._make_best_guess = function(current_value)
{
  var value_to_guess = current_value;
  var score_to_beat = -1;
  
  var guess_list = this._markov_chain[current_value];
  for(var previously_guessed_value in guess_list)
  {
    var score = guess_list[previously_guessed_value];
    if(score > score_to_beat)
    {
      value_to_guess = previously_guessed_value;
      score_to_beat = score;
    }
  }
  
  return value_to_guess;
};
learning_actor.prototype._learn_from_new_information = function(previous_value, current_value)
{
  if(this._markov_chain[previous_value] == null)
  {
    this._markov_chain[previous_value] = {};
    this._markov_chain[previous_value][current_value] = 0;
  }
  
  if(isNaN(this._markov_chain[previous_value][current_value]))
    this._markov_chain[previous_value][current_value] = 0;
    
  this._markov_chain[previous_value][current_value] = this._markov_chain[previous_value][current_value] + 1;
};
learning_actor.prototype.next = function(value)
{
  this._guess_channel.guessed(this._guessed_value, value);

  this._learn_from_new_information(this._previous_value, value);
  this._guessed_value = this._make_best_guess(value);
  this._previous_value = value;
};

As a refresher, the Markov chain as I've implemented it tells us which value is most likely to be entered next given the previous value. I won't go into the implementation details, but the code is fairly concise and hopefully legible enough to be deciphered.
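To make the data structure concrete, here is a small standalone sketch of the same counting idea outside of any actor. The helper names count_transitions and most_likely_next are made up for this illustration; they mirror _learn_from_new_information and _make_best_guess but are not part of the sample.

```javascript
// Build a table of counts:
// chain[previous][current] = how many times "current" followed "previous".
function count_transitions(entries)
{
  var chain = {};
  var previous = ""; // the learning actor also starts with an empty previous value
  for(var i = 0; i < entries.length; i++)
  {
    var current = entries[i];
    if(chain[previous] == null)
      chain[previous] = {};
    chain[previous][current] = (chain[previous][current] || 0) + 1;
    previous = current;
  }
  return chain;
}

// Pick the follower with the highest count.
// If the current value has never been seen, fall back to repeating it.
function most_likely_next(chain, current)
{
  var best = current;
  var best_score = -1;
  var followers = chain[current];
  for(var candidate in followers)
  {
    if(followers[candidate] > best_score)
    {
      best = candidate;
      best_score = followers[candidate];
    }
  }
  return best;
}

var chain = count_transitions("abcabcab".split(""));
console.log(JSON.stringify(chain));
console.log(most_likely_next(chain, "a")); // "b" has followed "a" every time so far
console.log(most_likely_next(chain, "z")); // never seen, so it falls back to "z"
```

After "abcabcab" the table records that "b" followed "a" three times, "c" followed "b" twice, and "a" followed "c" twice, plus one entry for the very first keystroke following the empty starting value.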
 
The learning actor has four main parts:

  • The next(value) message that is passed the value that the user entered.
  • The _learn_from_new_information(previous_value, current_value) method that trains our Markov chain.
  • The _make_best_guess(value) method that uses our trained Markov chain to make an educated guess about the user's next entry.
  • Last but not least, a simple set_guess_channel_to(channel) message that we can use to publish what we guessed and what the right guess actually was.
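For comparison, the earlier "yesterday's weather" iteration could have looked roughly like the sketch below. This is a reconstruction from the description above rather than the code I actually had at that stage, and weather_learning_actor and recording_channel are hypothetical names. It answers the same two public messages, so nothing else in the network would need to change to swap it in.

```javascript
// Hypothetical reconstruction: guess that the next entry repeats the current one.
var weather_learning_actor = function()
{
  this._guessed_value = "";
};
weather_learning_actor.prototype.set_guess_channel_to = function(channel)
{
  this._guess_channel = channel;
};
weather_learning_actor.prototype.next = function(value)
{
  // Publish the guess made before this value arrived, alongside the actual value.
  this._guess_channel.guessed(this._guessed_value, value);
  // "Yesterday's weather": the current input becomes the prediction for the next one.
  this._guessed_value = value;
};

// Hypothetical recording channel standing in for the scoreboard.
var results = [];
var recording_channel = {
  guessed: function(my_guess, correct_guess)
  {
    results.push(my_guess == correct_guess);
  }
};

var learner = new weather_learning_actor();
learner.set_guess_channel_to(recording_channel);

var entries = "aabba".split("");
for(var i = 0; i < entries.length; i++)
{
  learner.next(entries[i]);
}
console.log(results); // one true/false per entry: was the previous guess right?
```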


Initially, I had actually written the code that is now in the scoreboard actor as a part of the learning actor. Here's that code:

var scoreboard_actor = function()
{
  this._values_entered_count = 0;
  this._correctly_guessed_value_count = 0;
};
scoreboard_actor.prototype.set_display_channel_to = function(channel)
{
  this._display_channel = channel;
};
scoreboard_actor.prototype.guessed = function(my_guess, correct_guess)
{
  this._values_entered_count = this._values_entered_count + 1;
  
  if(my_guess == correct_guess)
  {
    this._correctly_guessed_value_count = this._correctly_guessed_value_count + 1;
  }
  
  this._display_channel.accuracy_updated(this._correctly_guessed_value_count / this._values_entered_count);
  this._display_channel.previously_guessed_value_updated(my_guess);
};

You can see it's fairly simple, which is why I was hesitant to move it to a new class. As you get started with this style, you will feel this hesitation quite often. I recommend fighting through the pain until you come upon the first "major" refactoring you need to do. I guarantee the ease with which you'll be able to make that change will astound you, and you'll be hooked. Another reason I hesitated to move this out of my learning actor is that I assumed I would be duplicating the concept of "the previous value must equal the last". Since MOOD doesn't allow for getters, I knew that the only way I could have shared that logic would be copy 'n' paste reuse (read: ewww). Look at the algorithm left over in the learning actor, though. It never cares whether or not we guessed right. It only tracks the entries and makes a hypothesis regarding them. So if guess checking wasn't a concern of the learning algorithm, why did I have it there to begin with? I simply wanted to display a scoreboard. Hence the creation of my scoreboard actor.
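Because the scoreboard only talks to a display channel, it can be exercised without the learner or the DOM. In the sketch below, fake_display is a hypothetical stand-in that answers the two messages the scoreboard sends; the actor is repeated from above so the snippet runs on its own.

```javascript
// scoreboard_actor repeated from the listing above so this snippet is self-contained.
var scoreboard_actor = function()
{
  this._values_entered_count = 0;
  this._correctly_guessed_value_count = 0;
};
scoreboard_actor.prototype.set_display_channel_to = function(channel)
{
  this._display_channel = channel;
};
scoreboard_actor.prototype.guessed = function(my_guess, correct_guess)
{
  this._values_entered_count = this._values_entered_count + 1;

  if(my_guess == correct_guess)
  {
    this._correctly_guessed_value_count = this._correctly_guessed_value_count + 1;
  }

  this._display_channel.accuracy_updated(this._correctly_guessed_value_count / this._values_entered_count);
  this._display_channel.previously_guessed_value_updated(my_guess);
};

// Hypothetical stand-in for the user actor: records every message it receives.
var fake_display = {
  accuracies: [],
  guesses: [],
  accuracy_updated: function(accuracy_value)
  {
    this.accuracies.push(accuracy_value);
  },
  previously_guessed_value_updated: function(guess_value)
  {
    this.guesses.push(guess_value);
  }
};

var scoreboard = new scoreboard_actor();
scoreboard.set_display_channel_to(fake_display);

scoreboard.guessed("a", "a"); // right
scoreboard.guessed("b", "a"); // wrong
scoreboard.guessed("c", "c"); // right
scoreboard.guessed("d", "d"); // right

console.log(fake_display.accuracies); // running accuracy after each guess
console.log(fake_display.guesses);    // guesses forwarded to the display
```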
 
We've got all of these objects but what to do with them? The configuration of our objects is referred to as the network configuration. This is essentially just a different flavor of dependency injection. The difference here being that your configuration will be able to be factored away from the rest of your code and isolated if you so choose. Here's the object network configuration for this code:

var computer_guess_display = $('#computer_guess');
var computer_guess_accuracy_display = $('#computer_guess_accuracy');

// my_user is deliberately global so the inline event handlers in the page can reach it.
my_user = new user_actor(computer_guess_display, computer_guess_accuracy_display);
var my_learner = new learning_actor();
var my_scoreboard = new scoreboard_actor();

my_user.send_guesses_to(my_learner);
my_learner.set_guess_channel_to(my_scoreboard);
my_scoreboard.set_display_channel_to(my_user);

The first thing that should stand out to you is that we are making no attempt to make our objects immutable. In MOOD, just like in the Actor Model, we are guaranteed that an actor will only ever be used from the context of a single thread throughout its lifetime. This might seem to be a poor constraint. Here is why it's not: imagine the learning actor gets some VERY complex logic. That isn't a stretch, depending on how accurate you want the guesses to be. So, if you had written this code without using this style and didn't explicitly design for asynchronicity, what might happen? The first time that learning actor needs to really think, your UI will freeze up. Because we wrote this using Message Oriented Object Design, however, we can throw that logic _anywhere_ and it won't block our UI. What do I mean by anywhere? I mean we could literally host it on a web service, and instead of implementing our actor in the page we could have an actor that is responsible for interacting with the web service. Someday, if JavaScript gets threads, we could even throw the extra work onto a thread and create a channel object to manage the threading context of the passed messages. The rest of our code wouldn't change in either case. If you need an actual example, leave me a comment to that effect, because for now this seems easy to see, especially once someone has pointed it out. In the meantime, if you've been thinking that Message Oriented Object Design is a lot of extra work for pedantic, self-indulgent programmers, think about whether or not your code could do that.
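As one hedged illustration of that idea, a channel can be an object of its own that accepts a message now and delivers it later. The deferred_channel below is hypothetical and not part of the sample. It assumes the message names to forward are known up front and that setTimeout (or any scheduler function passed in) is an acceptable way to hand the work off.

```javascript
// Hypothetical channel: forwards the named messages to a target, but only when
// the scheduler runs the delivery (setTimeout by default), so senders never wait.
var deferred_channel = function(target, message_names, schedule)
{
  var run_later = schedule || function(work) { setTimeout(work, 0); };
  var make_forwarder = function(name)
  {
    return function()
    {
      var args = arguments;
      run_later(function() { target[name].apply(target, args); });
    };
  };
  for(var i = 0; i < message_names.length; i++)
  {
    this[message_names[i]] = make_forwarder(message_names[i]);
  }
};

// Demonstration with a hand-rolled scheduler so the ordering is easy to see.
var pending = [];
var manual_schedule = function(work) { pending.push(work); };

var log = [];
var slow_learner = {
  next: function(value) { log.push("learner got " + value); }
};

var channel = new deferred_channel(slow_learner, ["next"], manual_schedule);
channel.next("a");
log.push("sender continued");

// Nothing reaches the learner until the queued deliveries are run.
while(pending.length > 0)
{
  pending.shift()();
}
console.log(log); // the sender's entry is logged before the learner's
```

Under these assumptions, the network configuration would only change in one place, for example my_user.send_guesses_to(new deferred_channel(my_learner, ["next"])), and neither the user actor nor the learning actor would need to know.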
 
Oh yeah. Also, notice that there is only one VERY thin object in all of that code that has anything to do with the DOM. The rest is trivially unit testable. And not just testable in a small way, but testable as in only the object under test will be exercised. I didn't TDD this code. That's just the way the MOOD pulls me.
 
Also, I apologize in advance for my horrible naming of methods and objects. Hopefully it still gets the point across.
 
That's it! Go ahead and try it, it's pretty neat. Just "randomly" pressing keys on the keyboard the way I do, the code was able to guess correctly 40% of the time or so. Not bad at all! It will also pick up regular patterns like "abcabcabc" pretty quickly, and you'll see the code try to follow you if you do something like "aaaabababaaaababababab". Of course, like all learning agents, the more random the string you enter, the worse the agent will perform.
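The "abcabcabc" behaviour can be checked without a browser. The sketch below repeats the learning actor from above (lightly condensed) and wires it to a hypothetical tally_channel that only counts hits, in place of the scoreboard and user actors.

```javascript
// learning_actor repeated from the listing above so this snippet is self-contained.
var learning_actor = function()
{
  this._markov_chain = {};
  this._guessed_value = "";
  this._previous_value = "";
};
learning_actor.prototype.set_guess_channel_to = function(channel)
{
  this._guess_channel = channel;
};
learning_actor.prototype._make_best_guess = function(current_value)
{
  var value_to_guess = current_value;
  var score_to_beat = -1;
  var guess_list = this._markov_chain[current_value];
  for(var previously_guessed_value in guess_list)
  {
    var score = guess_list[previously_guessed_value];
    if(score > score_to_beat)
    {
      value_to_guess = previously_guessed_value;
      score_to_beat = score;
    }
  }
  return value_to_guess;
};
learning_actor.prototype._learn_from_new_information = function(previous_value, current_value)
{
  if(this._markov_chain[previous_value] == null)
    this._markov_chain[previous_value] = {};
  if(isNaN(this._markov_chain[previous_value][current_value]))
    this._markov_chain[previous_value][current_value] = 0;
  this._markov_chain[previous_value][current_value] = this._markov_chain[previous_value][current_value] + 1;
};
learning_actor.prototype.next = function(value)
{
  this._guess_channel.guessed(this._guessed_value, value);
  this._learn_from_new_information(this._previous_value, value);
  this._guessed_value = this._make_best_guess(value);
  this._previous_value = value;
};

// Hypothetical channel that only counts how many guesses were right.
var tally_channel = { hits: 0, total: 0 };
tally_channel.guessed = function(my_guess, correct_guess)
{
  this.total = this.total + 1;
  if(my_guess == correct_guess)
    this.hits = this.hits + 1;
};

var learner = new learning_actor();
learner.set_guess_channel_to(tally_channel);

var entries = "abcabcabc".split("");
for(var i = 0; i < entries.length; i++)
{
  learner.next(entries[i]);
}
console.log(tally_channel.hits + " of " + tally_channel.total + " guessed correctly");
```

The first four guesses miss: three while each transition within "abc" is seen for the first time, and a fourth because the step from "c" back to "a" is only learned once the pattern wraps around. After that, every guess lands, giving 5 hits out of 9 entries.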
 
The full HTML source code is here for you to download and try. It requires jQuery (saved as jquery.js alongside the page) to work. Leave a comment if you have any questions! :)

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
  <title>LearningU.js</title>
  <script type="text/javascript" src="jquery.js"></script>
  <script type="text/javascript">
    var my_user = null;
    $(document).ready(function(){
    // Start user_actor
      var user_actor = function(guess_dom_target, accuracy_dom_target)
      {
        this._target_element = guess_dom_target;
        this._accuracy_target_element = accuracy_dom_target;
      };
      user_actor.prototype.send_guesses_to = function(channel)
      {
        this._guess_channel = channel;
      };
      user_actor.prototype.value_entered = function(value)
      {
        this._guess_channel.next(value);
      };
      user_actor.prototype.previously_guessed_value_updated = function(guess_value)
      {
        this._target_element.html(guess_value);
      };
      user_actor.prototype.accuracy_updated = function(accuracy_value)
      {
        this._accuracy_target_element.html(accuracy_value * 100);
      };
      
    // Start learning_actor
      var learning_actor = function()
      {
        this._markov_chain = {};
        this._guessed_value = "";
        this._previous_value = "";
      };
      learning_actor.prototype.set_guess_channel_to = function(channel)
      {
        this._guess_channel = channel;
      };
      learning_actor.prototype._make_best_guess = function(current_value)
      {
        var value_to_guess = current_value;
        var score_to_beat = -1;
        
        var guess_list = this._markov_chain[current_value];
        for(var previously_guessed_value in guess_list)
        {
          var score = guess_list[previously_guessed_value];
          if(score > score_to_beat)
          {
            value_to_guess = previously_guessed_value;
            score_to_beat = score;
          }
        }
        
        return value_to_guess;
      };
      learning_actor.prototype._learn_from_new_information = function(previous_value, current_value)
      {
        if(this._markov_chain[previous_value] == null)
        {
          this._markov_chain[previous_value] = {};
          this._markov_chain[previous_value][current_value] = 0;
        }
        
        var likely_values_based_on_prev_value = this._markov_chain[previous_value];
        
        if(isNaN(likely_values_based_on_prev_value[current_value]))
          likely_values_based_on_prev_value[current_value] = 0;
          
        likely_values_based_on_prev_value[current_value] = likely_values_based_on_prev_value[current_value] + 1;
      };
      learning_actor.prototype.next = function(value)
      {
        this._guess_channel.guessed(this._guessed_value, value);
      
        this._learn_from_new_information(this._previous_value, value);
        this._guessed_value = this._make_best_guess(value);
        this._previous_value = value;
      };
      
    // Start scoreboard_actor
      var scoreboard_actor = function()
      {
        this._values_entered_count = 0;
        this._correctly_guessed_value_count = 0;
      };
      scoreboard_actor.prototype.set_display_channel_to = function(channel)
      {
        this._display_channel = channel;
      };
      scoreboard_actor.prototype.guessed = function(my_guess, correct_guess)
      {
        this._values_entered_count = this._values_entered_count + 1;
        
        if(my_guess == correct_guess)
        {
          this._correctly_guessed_value_count = this._correctly_guessed_value_count + 1;
        }
        
        var guess_accuracy = this._correctly_guessed_value_count / this._values_entered_count;
        this._display_channel.accuracy_updated(guess_accuracy);
        this._display_channel.previously_guessed_value_updated(my_guess);
      };
      
      //Create actors
      // my_user needs to be global so UI can use it.
      var computer_guess_display = $('#computer_guess');
      var computer_guess_accuracy_display = $('#computer_guess_accuracy');
      
      my_user = new user_actor(computer_guess_display, computer_guess_accuracy_display);
      var my_learner = new learning_actor();
      var my_scoreboard = new scoreboard_actor();
      
      my_user.send_guesses_to(my_learner);
      my_learner.set_guess_channel_to(my_scoreboard);
      my_scoreboard.set_display_channel_to(my_user);
    });
  </script>
</head>
<body>
<div>
    You: <input name="human_value" value="" onclick="this.select();" onfocus="this.select();" maxlength="1" onkeyup="my_user.value_entered(this.value); this.select();" />
  </div>
  <div>
    My Guess: <span id="computer_guess"></span>
  </div>
  <div>
    My Accuracy: <span id="computer_guess_accuracy">0</span>%
  </div>
</body>
</html>

 

Published at DZone with permission of Justin Bozonier, author and DZone MVB. (source)
