Quantifying Picture Differences


Not really motion detection, but more useful in a programmatic way.

When I started to write the scripting dictionary for Daydreamer 2.0, I decided to take seriously the dictum that I was taught by the AppleScript team – expose everything to the script, even if you can't figure out a use for it. Well, okay, I thought. Daydreamer's main product is a picture. I can't figure out a use for it, but I'll provide it.

Now being wired the way I am, I had to figure out a use. In conversations with friends, a security system became obvious pretty quickly. So, when I got the code to the testing stage, I wrote a script called Alert to serve as an example security system. (You can find it in the Daydreamer distribution in the AppleScripts folder.)

There are a few problems with that original script, however. First, I didn't provide or suggest a way to get the source picture. Second, the script is a one-shot: it stops after a change has been detected. And third (and most important), it is far too sensitive in detecting change. Any change sets it off, and light shifts a lot during a day. All told, a pretty useless script.

I dealt with the first problem in my last blog entry, Using an iSight with Daydreamer. The second is trivial, and I'll deal with it here. The third problem is the hardest. When I wrote that first script, I knew I didn't have the graphic chops to write picture difference-detecting code. From time to time over the last year, I made a half-hearted attempt to find code I could use to ignore trivial changes and catch important ones.

I found nothing until I was looking at the sample code at Apple's ADC web site for another project (I was looking for Core Image sample code). What I found was Image Difference (mad props to jcr - if I see you at WWDC, I'll buy you a beer!). Image Difference finds the difference between two pictures by first inverting one (with a category on NSImage), and then subtracting one from the other graphically with NSImageView's setImage:differenceFromImage: method. The result makes it pretty clear to the human eye the difference between the pictures (of course, the human eye is pretty good at detecting differences at the gross level I'm interested in).
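For readers who want the idea without the Cocoa drawing machinery, here is a minimal sketch of a per-channel difference image in plain C. It assumes packed 8-bit RGB pixels and uses an absolute difference per channel. The sample code's graphical compositing may not match this exactly. The `RGBPixel` layout and the `difference_image` name are illustrative assumptions, not taken from Apple's sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: one byte each for red, green and blue, no alpha. */
typedef struct {
    uint8_t redByte;
    uint8_t greenByte;
    uint8_t blueByte;
} RGBPixel;

/* Hypothetical helper: writes |a - b| for every channel of every pixel into out. */
void difference_image(const RGBPixel *a, const RGBPixel *b,
                      RGBPixel *out, size_t pixelCount)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < pixelCount; i++) {
        out[i].redByte   = (uint8_t)abs((int)a[i].redByte   - (int)b[i].redByte);
        out[i].greenByte = (uint8_t)abs((int)a[i].greenByte - (int)b[i].greenByte);
        out[i].blueByte  = (uint8_t)abs((int)a[i].blueByte  - (int)b[i].blueByte);
    }
}
```

Identical pixels come out black (all zeros), and the more a pixel changed, the brighter it appears in the output, which is why the differenced picture is easy to read by eye.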

I figured that I could use another category (following the sample code's example) to calculate a numeric difference. I was also pretty confident that I could make it sufficiently scriptable, since it was written in Cocoa. My first attempt simply added up the red, green and blue values for each pixel in the differenced image.

- (float) calcRGB
{
    // -bitmapData returns a raw byte pointer, not an NSData object ;-)
    // Assumes tightly packed rows (no padding) in RGBPixel layout.
    RGBPixel *pixels = (RGBPixel *)[self bitmapData];

    int row, column;
    int widthInPixels = [self pixelsWide];
    int heightInPixels = [self pixelsHigh];

    // Accumulate in a double: a float loses precision once the sum passes 2^24.
    double differenceIndex = 0.0;
    for (row = 0; row < heightInPixels; row++)
    {
        for (column = 0; column < widthInPixels; column++)
        {
            RGBPixel *thisPixel = &pixels[(widthInPixels * row) + column];
            differenceIndex += thisPixel->redByte;
            differenceIndex += thisPixel->greenByte;
            differenceIndex += thisPixel->blueByte;
        }
    }
    return (float)differenceIndex;
}

This gave me a numeric difference all right, but it was huge. I decided to go ahead and make a new scriptable app that would allow setting the two images via AppleScript, and would calculate the numeric difference when AppleScript asked for it.
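To put a number on "huge", here is a small worked calculation in C. The 640 by 480 frame size is an assumption chosen for illustration, and `max_raw_sum` and `per_pixel_average` are hypothetical helper names. The sketch shows the ceiling of the raw sum and how dividing by the pixel count brings it down to a range that does not depend on image size.

```c
#include <stdint.h>

/* Largest possible raw sum: every channel of every pixel at 255. */
uint64_t max_raw_sum(uint32_t width, uint32_t height)
{
    return (uint64_t)width * height * 3u * 255u;
}

/* Scale a raw sum down to an average per pixel (0 to 765 for 8-bit RGB). */
double per_pixel_average(double rawSum, uint32_t width, uint32_t height)
{
    uint64_t pixels = (uint64_t)width * height;
    return pixels == 0 ? 0.0 : rawSum / (double)pixels;
}
```

For the assumed 640 by 480 frame the ceiling is 235,008,000, which is awkward to compare against a threshold and changes whenever the picture size changes. The per-pixel average of that same sum is 765.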

When I had that basically running, I rewrote the Alert script to use the iSight, to keep running after a difference was found, and to use this difference engine. I ran more experiments, first finding an average pixel difference, and then taking it further to an average difference for each of red, green and blue. I ended up with a two-pass difference that produced a pretty reasonable result.

- (float) calcRGB
{
    // -bitmapData returns a raw byte pointer, not an NSData object ;-)
    // Assumes tightly packed rows (no padding) in RGBPixel layout.
    RGBPixel *pixels = (RGBPixel *)[self bitmapData];

    int row, column;
    int widthInPixels = [self pixelsWide];
    int heightInPixels = [self pixelsHigh];
    long totalPixels = (long)heightInPixels * widthInPixels;
    if (totalPixels == 0)
        return 0.0; // empty image: nothing to compare

    // Pass 1: sum each channel so we can find its average.
    double redSum = 0.0;
    double greenSum = 0.0;
    double blueSum = 0.0;
    for (row = 0; row < heightInPixels; row++)
    {
        for (column = 0; column < widthInPixels; column++)
        {
            RGBPixel *thisPixel = &pixels[(widthInPixels * row) + column];
            redSum += thisPixel->redByte;
            greenSum += thisPixel->greenByte;
            blueSum += thisPixel->blueByte;
        }
    }
    double redAverage = redSum / totalPixels;
    double greenAverage = greenSum / totalPixels;
    double blueAverage = blueSum / totalPixels;

    // Pass 2: add up how far each channel rises above its average.
    // Values at or below the average are ignored.
    double differenceIndex = 0.0;
    for (row = 0; row < heightInPixels; row++)
    {
        for (column = 0; column < widthInPixels; column++)
        {
            RGBPixel *thisPixel = &pixels[(widthInPixels * row) + column];
            double redExcess = thisPixel->redByte - redAverage;
            if (redExcess > 0.0)
                differenceIndex += redExcess;
            double greenExcess = thisPixel->greenByte - greenAverage;
            if (greenExcess > 0.0)
                differenceIndex += greenExcess;
            double blueExcess = thisPixel->blueByte - blueAverage;
            if (blueExcess > 0.0)
                differenceIndex += blueExcess;
        }
    }
    return (float)(differenceIndex / totalPixels);
}

The result is a reasonably sized number. My experiments lead me to think that an "interesting" difference would be somewhere in the region of 10 to 50.
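A portable C restatement of the two-pass idea makes it easier to see why it copes better with light shifts. This is a sketch under assumptions: the `RGBPixel` layout, the function names, and the threshold of 10 (the low end of the range above) are all illustrative, and real lighting changes are rarely perfectly uniform.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: one byte each for red, green and blue, no alpha. */
typedef struct {
    uint8_t redByte;
    uint8_t greenByte;
    uint8_t blueByte;
} RGBPixel;

/* Two-pass score over a differenced image: per-channel averages first,
   then the per-pixel average of everything that rises above them. */
double two_pass_difference(const RGBPixel *pixels, size_t pixelCount)
{
    if (pixelCount == 0)
        return 0.0;

    double redSum = 0.0, greenSum = 0.0, blueSum = 0.0;
    for (size_t i = 0; i < pixelCount; i++) {
        redSum   += pixels[i].redByte;
        greenSum += pixels[i].greenByte;
        blueSum  += pixels[i].blueByte;
    }
    double redAverage   = redSum   / (double)pixelCount;
    double greenAverage = greenSum / (double)pixelCount;
    double blueAverage  = blueSum  / (double)pixelCount;

    double excess = 0.0;
    for (size_t i = 0; i < pixelCount; i++) {
        double r = pixels[i].redByte   - redAverage;
        double g = pixels[i].greenByte - greenAverage;
        double b = pixels[i].blueByte  - blueAverage;
        if (r > 0.0) excess += r;
        if (g > 0.0) excess += g;
        if (b > 0.0) excess += b;
    }
    return excess / (double)pixelCount;
}

/* Hypothetical threshold check; returns 1 when the score is worth an alert. */
int is_interesting(double score, double threshold)
{
    return score >= threshold;
}
```

If every pixel of the differenced image carries the same value, as an even change in lighting would produce, nothing rises above the average and the score is 0. If one pixel in four is at 200 and the rest are at 0, the averages are 50, the bright pixel exceeds them by 150 in each channel, and the score is 450 / 4 = 112.5. A tiny four-pixel image exaggerates the effect, but the direction is the point: localized change scores high, uniform change does not.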

My Numeric Image Diff app (compiled for Tiger/Universal) and the new alert script can be found in a zip archive. You can also download the source.

I know this is not a perfect solution, but I'm done with the topic. The monkey's off my back, and I'm on to other projects.

Posted: Sat - July 15, 2006 at 03:00 PM          

