Rise of the Bots: Deep Learning

In recent years, big companies like Google, Facebook, and Microsoft have acquired start-ups (and/or hired talent) that deal with a special branch of Artificial Intelligence (or simply AI) known as Deep Learning. AI is nothing new, and the dream of intelligent machines that can learn and reason was around well before the 1950s, when the term AI was first coined. In ancient times there were myths about super beings in the form of machines and hybrids that permeated cult imagery and stories with common themes. At the root of these themes is an exploration of human fears and ethical concerns at the spectre of AI. This naturally made its way to Hollywood as early as the original Frankenstein movie, which captured an attempt by humans to forge something unnatural and artificial.

From “I, Robot” to AMC’s “Humans”

Yesterday I watched the third episode of “Humans”, a new AMC series set in present-day London, England. It explores AI in the form of human-like bots, known as “Synths”, that help with chores around the house; some even provide other, more sinister services. In the series, the Synths are capable of highly intelligent conversation and behaviour, to the point where it’s hard to distinguish them from a real human, except for their glassy green eyes. While this series is a work of fiction, it may not be so far-fetched if the field of AI, especially as it relates to Deep Learning, continues to progress at the rate it has been.

Deep Learning

What is Deep Learning anyway? In the field of AI there is a special branch known as Machine Learning, which traditionally uses rule-based methods for algorithms that learn from sample data. Another method, which has been reinvigorated through recent groundbreaking discoveries, is now collectively known as Deep Learning. At the heart of this method is an architecture that is loosely based on how the human brain is structured, namely a network of neurons connected by synapses that get excited under certain conditions. This structure is simply known as a Neural Network (NN), and it is composed of an input layer and an output layer, with a number of layers in between, known as “hidden” layers. It is said that the first NN was invented in 1958 by the psychologist Frank Rosenblatt, who called it the Perceptron. It was intended to model how the human brain processes visual data and recognizes objects.

So why all the Hype now?

It has something to do with the way these NNs are trained. Each connection into a neuron has a weight, which acts as a knob to reduce or amplify the signal coming from the previous layer before the result is forwarded to the next layer for processing. The concept of training comes down to adjusting these knobs based on the desired outputs for a set of known inputs. So let’s say I feed some NN a picture of a dog as input, and it outputs that it recognizes it as a cat. I would then adjust the weights in such a way as to penalize those neurons that gave rise to the output of a cat. Working out these adjustments from the output layer back towards the input layer is known as back-propagation.
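
To make the knob-turning idea a bit more concrete, here is a minimal sketch in JavaScript of a single neuron learning from labelled examples. Note that it uses the simple perceptron update rule, not full back-propagation, and that the names (predict, trainStep), the learning rate and the toy data (logical OR) are all made up for illustration.

```javascript
// A single neuron: weighted sum of the inputs followed by a step function.
function predict(weights, bias, inputs)
{
    var sum = bias;
    for (var i = 0; i < inputs.length; i++)
    {
        sum += weights[i] * inputs[i];
    }
    return sum > 0 ? 1 : 0;
}

// One training step (perceptron rule): nudge each weight in the direction
// that reduces the gap between the desired and the actual output.
function trainStep(neuron, inputs, desired, learningRate)
{
    var error = desired - predict(neuron.weights, neuron.bias, inputs);
    for (var i = 0; i < inputs.length; i++)
    {
        neuron.weights[i] += learningRate * error * inputs[i];
    }
    neuron.bias += learningRate * error;
    return error;
}

// Hypothetical toy data: teach the neuron logical OR.
var neuron = { weights: [0, 0], bias: 0 };
var samples = [
    { inputs: [0, 0], desired: 0 },
    { inputs: [0, 1], desired: 1 },
    { inputs: [1, 0], desired: 1 },
    { inputs: [1, 1], desired: 1 }
];

// Show the neuron every sample a number of times, adjusting as we go.
for (var epoch = 0; epoch < 20; epoch++)
{
    samples.forEach(function (s)
    {
        trainStep(neuron, s.inputs, s.desired, 0.1);
    });
}
```

After the loop, calling predict(neuron.weights, neuron.bias, [0, 1]) should answer 1, since the weights have been nudged until every sample comes out right. A real NN does the same kind of nudging, just across many neurons and layers at once.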

Yes yes, but why all the hype now? Well, the general intuition that “having more hidden layers between the inputs and outputs makes for a more intelligent NN” was not realized, since no feasible training method produced better predictions as more hidden layers were added. This all changed in 2006, when Geoffrey Hinton, an AI professor and researcher at the University of Toronto, made a breakthrough in the way these NNs are trained. Instead of training through back-propagation across all hidden layers at once, it was more effective to train one layer at a time, i.e. keep previous layers constant, with a final sweep of back-propagation across all layers for fine-tuning. Of course this explanation is overly simplified, and the details are beyond the scope of this article. This gave rise to deeper NNs, since it was now possible to demonstrate a gain in intelligence when more hidden layers were added. Thus Deep Learning was born.

The latest Bot

Unlike the bots (or Synths) in the series Humans, we are nowhere near human-like conversations, but a recent attempt by Google, published here, looks somewhat promising. The chat bot described in the paper can hold a conversation about the meaning of life, and can even troubleshoot a help desk scenario… well, not really! In this case the training set was not a set of pictures of dogs and cats; rather, it was a collection of subtitles from movies or transcripts of help desk scenarios. The idea for the subtitles set is to train the NN to predict the next sentence given the previous sentence. In the subtitles case it is assumed that the 62M sentences are turn-based, where one person says ABC in one sentence and the next sentence is a response XYZ. Of course this is not always the case, so the data set is said to be “noisy”. Nevertheless, the idea is to train the NN so that, given input ABC, it outputs XYZ. The limitations are obvious, as noted by the authors, but it has produced some surprising conversations:

Human: is sky blue or black ?
Machine: blue .
Human: does a cat have a tail ?
Machine: yes .
Human: does a cat have a wing ?
Machine: no
Human: can a cat fly ?
Machine: no .
Human: how many legs does a cat have ?
Machine: four , i think .

Keep in mind that the machine had no prior formal knowledge representation of cats or the colour of the sky, nor did it use traditional information extraction and rule-based query answering crafted by linguists. It simply learned to predict answers given open-domain movie subtitles. Looks impressive, right? I’d say yes!
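
The turn-based assumption about the subtitles is easy to picture in code. Below is a rough JavaScript sketch of how consecutive sentences could be paired up into training examples, based only on my reading of the description above. The function name makeTrainingPairs and the sample lines are my own invention and are not taken from the paper or its data set.

```javascript
// Turn a list of consecutive subtitle sentences into (input, response)
// training pairs: every sentence is treated as the reply to the one before.
function makeTrainingPairs(sentences)
{
    var pairs = [];
    for (var i = 0; i + 1 < sentences.length; i++)
    {
        pairs.push({ input: sentences[i], response: sentences[i + 1] });
    }
    return pairs;
}

// Made-up sample lines, only here to show the shape of the data.
var subtitles = [
    "where are you going ?",
    "to the store .",
    "bring back some milk ."
];

var pairs = makeTrainingPairs(subtitles);
// pairs[0] is { input: "where are you going ?", response: "to the store ." }
```

Notice that “to the store .” shows up once as a response and once as an input, whether or not the same character actually spoke both lines. That is exactly where the noise in the data set comes from.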

Conclusion

The idea of artificial intelligence and the associated fears and ethical concerns have existed since ancient times. The recent methods of Deep Learning are bringing us closer to Hollywood fiction. Does the above “really” represent an intelligent conversation? Or is it a surprising outcome of replaying some sentences that simply aligned well with the input? If the machine consistently produced well-aligned sentences as answers, can that be considered intelligence? (Turing Test Alert). I tend to think it’s a step closer to realizing the dream of intelligent machines, and it’s definitely noteworthy, but we’re still a long way off from the Synths portrayed in the Humans series.


Image Processing using the HTML5 Canvas

Last year I was in between jobs, and decided to learn HTML5 since it was gaining popularity within the developer community due to its cross-platform capability and the rise of mobile apps. I am a business applications specialist by profession, and a game programmer hobbyist at heart, so I decided to have some fun and make a Zombies game in HTML5 to learn the ropes of this new technology. Recently I turned my learning experience into a lecture series on how to make a Zombies game.

I’m terrible at graphics, so I set out on a public domain hunt for Zombie images suitable for a game. Instead of finding a nice sprite sheet with all the animation sequences laid out by direction in one image, I found 5 animation sequences of a “red zombie” as individual PNG images. Each animation sequence, e.g. walking, attacking etc., had a number of frames broken into individual images. Altogether there were 448 images!

Zombie Images

When I opened an image in a Paint program I noticed that its background was not transparent, and neither was the shadow. While I’m sure there are applications that can process these images in bulk mode, I decided that this would be a good opportunity to learn about the HTML5 Canvas and how to manipulate the pixels of an image.

The HTML5 Canvas element is used to draw graphics on the fly on a web page. First I needed to construct a simple web page with a canvas so that I could draw an image on it.

<html>
<head>
<title>HTML5 Test</title>
</head>
 
<body>
 
<div>HTML5 Test!</div>
 
<canvas id="mainScreen" width="1024" height="640">
Canvas not supported!
</canvas>
 
</body>
 
</html>

The code above includes a canvas tag. You can test it out: if you see “Canvas not supported!” your browser does not support the HTML5 canvas, otherwise you’re fine.

Now let’s write some code to display the Zombie image as is. This involves creating a new Image object and setting its src attribute to the URL where the image is located. I have uploaded a Zombie image to amazon for you to test with, here.


<html>
<head>
<title>HTML5 Test</title>
<script type="text/javascript">
 
var g_mainScreen = null;
 
window.onload = onReady;
 
function onReady()
{
 g_mainScreen = document.getElementById("mainScreen").getContext("2d");
 
 var img = new Image();
 
 img.onload = function()
 {
  g_mainScreen.drawImage(this, 0,0);
 };
 
 img.src = "https://s3.amazonaws.com/ken-z/attack.png";
}
 
</script>
 
</head>
 
<body>
 
<div>HTML5 Test!</div>
 
<canvas id="mainScreen" width="1024" height="640">
Canvas not supported!
</canvas>
 
</body>
 
</html>

Line 8 registers a handler for onload so that onReady is invoked once the document loads. We get a handle to our canvas (line 12) and grab its 2d context so we can use its drawImage function. We create an image dynamically (line 14), and register an anonymous function to handle it once it’s loaded (line 16). We set the image’s src property to the location of the Zombie image (line 21), which kicks off the loading process. Once the image is loaded we simply draw it on the canvas (line 18) at location x = 0, y = 0, which is the top left corner of the canvas.

Original Zombie Image

If you run this code in your browser, it will display the Zombie image in the top left corner.  The problem is the background color is not transparent, and the shadow should really be displayed at 50% opacity.

The Zombie image is made of thousands of pixels, and each pixel is composed of Red, Green and Blue settings in addition to Alpha. The Alpha is what controls the opacity. We need to somehow set it to zero for the background color so that it is completely transparent, and to 128 (50%) for the shadow.

To do this we need to scan all the pixels in the image. If we notice a background pixel we change its Alpha to zero, and if we notice a shadow pixel we set its Alpha to 128, which is 50% of 256 (given the range of 0 to 255).
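
Before scanning, it helps to picture how the pixels are laid out. The canvas hands the image data back as one flat array of bytes, 4 per pixel in the order red, green, blue, alpha, row after row. Here is a small sketch of that layout. The helper names pixelOffset and getPixel and the tiny 2x1 test image are my own, just for illustration.

```javascript
// Offset of the red byte of pixel (x, y) in a flat RGBA array,
// where rows are stored one after another and each pixel takes 4 bytes.
function pixelOffset(x, y, width)
{
    return (y * width + x) * 4;
}

// Read one pixel out of the flat array as an object.
function getPixel(data, x, y, width)
{
    var i = pixelOffset(x, y, width);
    return { r: data[i], g: data[i + 1], b: data[i + 2], a: data[i + 3] };
}

// A hypothetical 2x1 image: one background pixel and one shadow pixel,
// both fully opaque (alpha = 255).
var data = new Uint8ClampedArray([106, 76, 48, 255,   39, 27, 17, 255]);

var shadow = getPixel(data, 1, 0, 2);    // the second pixel in the first row

// 50% opacity on the 0 to 255 scale.
var halfAlpha = Math.round(255 * 0.5);   // 128
```

The alpha byte of a pixel is always at offset + 3, so a loop that only cares about alpha can start at index 3 and step by 4.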

The plan is to use a work canvas which is off screen, where we can draw the image, examine and manipulate its pixels, and then make a new image out of it, so we can draw it on our main screen canvas.

<html>
<head>
<title>HTML5 Test</title>
<script type="text/javascript">
 
var g_mainScreen = null;
 
window.onload = onReady;
 
function onReady()
{
 g_mainScreen = document.getElementById("mainScreen").getContext("2d");
 
 var img = new Image();
 
 img.onload = function()
 {
  processImage(this);
 };
 
 img.src = "https://s3.amazonaws.com/ken-z/attack.png";
}

function processImage(original)
{
    var workCanvas = document.createElement("canvas");
    workCanvas.width = original.width;
    workCanvas.height = original.height;

    var wctx = workCanvas.getContext("2d");
    wctx.drawImage(original, 0, 0);

    var imageData = wctx.getImageData(0, 0, original.width, original.height);
    var length = imageData.data.length;

    for (var i = 3; i < length; i += 4)
    {
        // if transparent color then set alpha to 0
        if (imageData.data[i - 3] ==  106 && 
            imageData.data[i - 2] ==  76  &&
            imageData.data[i - 1] ==  48)
        {
            imageData.data[i] = 0;
        }

        // if shadow color then set alpha to 128 (50%)
        else if (imageData.data[i - 3] == 39 && 
                 imageData.data[i - 2] == 27 && 
                 imageData.data[i - 1] == 17)
        {
            imageData.data[i] = 128;
        }
    }

    wctx.putImageData(imageData, 0, 0);

    var img = new Image();

    img.onload = function()
    {
        g_mainScreen.drawImage(this, 0, 0);
    };

    img.src = workCanvas.toDataURL();
}
 
</script>
 
</head>
 
<body>
 
<div>HTML5 Test!</div>
 
<canvas id="mainScreen" width="1024" height="640">
Canvas not supported!
</canvas>
 
</body>
 
</html>

If you run this html page in your browser you’ll probably get a cross-origin violation error. This is because the html page is saved to your local file system, and your browser considers that to be the origin of the page. When the code tries to read the image data via getImageData (line 33), the browser’s security mechanism kicks in and notices that you’re trying to read image data which came from a different origin. To solve this issue you can either put the image and the html page on your own server, so that both share the same origin, or try out the html page I uploaded to amazon here.
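
Another option, which I did not use here, is to ask the browser for a CORS-enabled image by setting the image’s crossOrigin property before setting its src. This only helps if the server hosting the image sends an Access-Control-Allow-Origin header that permits your page, and I can’t promise the image above is served that way, so treat the following as a sketch. The function name loadImageForProcessing and its createImage parameter are hypothetical.

```javascript
// Load an image in a way that keeps the canvas readable, ASSUMING the image
// server sends an Access-Control-Allow-Origin header that permits this page.
// createImage is a hypothetical hook so the function can be exercised
// without a browser; by default a normal Image object is created.
function loadImageForProcessing(url, onLoaded, createImage)
{
    var img = createImage ? createImage() : new Image();

    // Must be set before src, otherwise the request is made without CORS.
    img.crossOrigin = "anonymous";

    img.onload = function()
    {
        onLoaded(img);
    };

    // Setting src kicks off the loading process.
    img.src = url;

    return img;
}
```

In the page above, the body of onReady could then call loadImageForProcessing with the image URL and processImage as the callback. If the server does not allow the request, the image will fail to load rather than load silently, so it would be worth adding an onerror handler as well.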

Instead of displaying the original image once it’s loaded, we now call processImage (line 18). In this function we draw the image on a temporary work canvas (lines 26 to 31) and get its data as imageData (line 33). Now that we have the imageData we simply loop through all the pixels (each pixel is 4 bytes, red + green + blue + alpha = 4), compare the color to the background (106, 76, 48) or shadow (39, 27, 17), and set the alpha to 0 or 128 respectively. This is done in the loop from line 36 to 53.

The second step is to put the data back on our work canvas using putImageData (line 55). Then we create a new Image (line 57) and load it by setting its src to the result of workCanvas.toDataURL, which converts our manipulated image to a data URL (line 64).

Once the manipulated image is loaded, we simply draw it on our mainScreen canvas (line 61).

With this technique you can process images however you like, by inspecting their pixels as raw data composed of 4 bytes per pixel.
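
As one more example of what “however you like” can mean, here is a small sketch that turns the same kind of raw pixel data into grayscale. It is not part of the Zombies game. The function name toGrayscale is mine, and averaging the three colour channels is just the simplest choice I could think of.

```javascript
// Convert RGBA pixel data to grayscale in place, leaving alpha untouched.
// Works on imageData.data from getImageData, or on any array of RGBA bytes.
function toGrayscale(data)
{
    for (var i = 0; i + 3 < data.length; i += 4)
    {
        // Simple average of red, green and blue (an illustrative choice).
        var gray = Math.round((data[i] + data[i + 1] + data[i + 2]) / 3);

        data[i] = gray;       // red
        data[i + 1] = gray;   // green
        data[i + 2] = gray;   // blue
    }

    return data;
}

// Try it on a single opaque pixel in the Zombie background color.
var sample = new Uint8ClampedArray([106, 76, 48, 255]);
toGrayscale(sample);
// sample is now 77, 77, 77, 255
```

In the page above you would call toGrayscale(imageData.data) between getImageData and putImageData, in the same spot as the alpha loop.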

If you’d like to learn more about image processing, take a look at lectures 5 and 8 of the Zombies game course.