Tags: visual descriptor, image descriptor, perceptual hash

Topic: Perceptual Hash: Average -- How to compare images  
A perceptual hash is a hash value generated from an input image. It acts as a distinct (but not unique) fingerprint for that image. The purpose of these hashes is to let us compare two images quickly and efficiently, and to detect whether, and by how much, the two images differ from each other.

Hashes are generally known as a way to create unique identifiers for sets of data, like passwords, without giving away any details about the data itself. The key property of such hashes is that they are as unique as possible: changing even one byte in the input data generates a completely different hash value. Perceptual hashes, on the other hand, work differently. They are intended to change only very slightly when a small part of an input image is altered.

Practical applications include very quickly finding duplicate images in a large library, even if the duplicates differ in size or colour. There are a number of ways to generate such a hash from an image. In this post, we will focus on the simplest one: the Average Hash.

The Average method holds up to minor colour changes and changes in brightness and contrast, and is indifferent to changes in aspect ratio and image size. It is a great algorithm if you are looking for something specific. For example: if we have a small thumbnail of an image and we wish to know whether the full-size version exists somewhere in our collection, the Average Hash will find it fast.

The Average Hash is lightweight and easy, but it can generate false misses if gamma correction or a colour histogram adjustment is applied to the image. This is because those operations shift colours along a non-linear scale, changing where the "average" is located and therefore changing which bits end up above or below the average. If there are modifications, like added text or a head spliced into place, then the Average Hash probably won't do the job either, and we need to fall back to a different method.


Generating hashes:

Creating the average hash for an image is very easy and involves the following steps:

Code:
define IMAGE as any input image.

Convert IMAGE to greyscale:
Code:
for each PIXEL in IMAGE:
    Set the current pixel to (R+G+B) / 3
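As a concrete sketch, here is what this step can look like in Python with the Pillow library (assumed installed; the file name is just an example). Note that Pillow's built-in conversion uses weighted luminance (0.299*R + 0.587*G + 0.114*B) rather than the plain (R+G+B)/3 average, which works just as well for this hash:

Code:
from PIL import Image  # Pillow, assumed installed

img = Image.open("photo.jpg")  # example file name
grey = img.convert("L")        # built-in greyscale conversion (luma weights)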

Resize the image down to 8x8 pixels, regardless of its size or aspect ratio.
How to do this is beyond the scope of this post; I suggest you use an existing image library for it.
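For example, with Pillow the resize is a one-liner (a sketch continuing from the snippet above):

Code:
# Force the image down to 8x8; the aspect ratio is deliberately ignored.
small = grey.resize((8, 8), Image.LANCZOS)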

Compute the average colour value for all 64 pixels:
Code:
define AVERAGE as a 32-bit, unsigned integer.

for each PIXEL in IMAGE:
    Add the pixel colour to AVERAGE

Divide AVERAGE by the number of pixels (64).
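In Python this step reduces to a couple of lines; a sketch continuing from the snippets above:

Code:
pixels = list(small.getdata())        # the 64 greyscale values, 0-255
average = sum(pixels) // len(pixels)  # integer average over all pixels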

Now use the average to create the hash value:
Code:
define HASH as a 64-bit, unsigned integer.

for each PIXEL at index N in IMAGE:
    If PIXEL is higher than AVERAGE:
        Set the Nth bit of HASH to 1.
    Otherwise:
        Set the Nth bit of HASH to 0.
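A minimal Python sketch of this step, continuing from the snippets above (the bit index is simply the pixel's position in the 8x8 grid):

Code:
hash_value = 0
for n, pixel in enumerate(pixels):
    if pixel > average:
        hash_value |= 1 << n  # set the nth bit; unset bits stay 0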

Comparing hashes:

In order to compare two image hashes and see how similar the images actually are, we use something called the Hamming Distance. This simply counts the number of bit positions at which the two hashes differ. The larger the Hamming Distance, the more different the images are. A distance of 0 means the hashes are identical and the images are almost certainly the same. A distance of 1-5 means the images are likely very similar. In our case, the maximum possible distance is 64, which means the images are completely different.

Code:
define DISTANCE as a 64-bit, unsigned integer, initialised to 0.
define HASH_A and HASH_B as the two hashes to compare.

for each BIT_A in HASH_A and BIT_B in HASH_B:
    If BIT_A is not equal to BIT_B:
        Increment DISTANCE by 1.
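In a language with bitwise operators, this loop collapses to an XOR and a bit count. A sketch in Python (the function name hamming_distance is my own choice):

Code:
def hamming_distance(hash_a, hash_b):
    # XOR leaves a 1 bit wherever the two hashes disagree; count those bits.
    return bin(hash_a ^ hash_b).count("1")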

Now we have the distance between the two hashes and we know how the two images compare to each other.
Whether we consider two images equal depends on our needs. Ideally, we define some threshold distance: any distance larger than this threshold means the images are not the same, and the candidate is discarded.
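Putting it all together, here is a sketch of how the whole pipeline might be used. All the names and the threshold value of 5 are illustrative assumptions, not fixed rules:

Code:
from PIL import Image  # Pillow, assumed installed

THRESHOLD = 5  # tune this for your own collection

def average_hash(path):
    # Greyscale, resize to 8x8, compute the average, then build the 64-bit hash.
    small = Image.open(path).convert("L").resize((8, 8), Image.LANCZOS)
    pixels = list(small.getdata())
    average = sum(pixels) // len(pixels)
    hash_value = 0
    for n, pixel in enumerate(pixels):
        if pixel > average:
            hash_value |= 1 << n
    return hash_value

def hamming_distance(hash_a, hash_b):
    return bin(hash_a ^ hash_b).count("1")

if hamming_distance(average_hash("a.jpg"), average_hash("b.jpg")) <= THRESHOLD:
    print("The images are likely the same.")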
 
posted on 2013-04-06 23:00 by Hanson-jun