2011年3月16日 星期三

Lost in Binarization: Query-Adaptive Ranking for Similar Image Search with Compact Codes

The combination of local feature like SIFT and the BoW framework has been adopted in many system for image representation. But such representation suffers from:
  1. High dimensionality
  2. A long list of visual word for each image
which makes indexing and searching efficiently a hard problem. Binary embedding is so far the most promising indexing method, comparing with traditional tree-based method, which is not suitable for high dimensional data, and inverted file, which is not as efficient as it is in text domain due to 2. above.

But when performing image search using binary representation, many images will have the same distance to the query one, due to the discrete nature of Hamming distance, and it is unlikely that all these images are as similar to the query as others.The paper proposed a method that can perform fine-grained re-ranking for images with the same hamming distance.

The argument of the paper is that, although every bit in a binary representation has the same importance in Hamming distance, this is not the case for real-world data. Some bits is more discriminative than others, and the importance of each bit may vary in different class of images. According to the argument, the system is designed as followed:
For each semantic class, a "importance", or weight, of each bit in the binary representation is offline calculated. When a query image comes in, a weight for the query will be adaptively generated, based on the weight pre-calculated, and is than used to re-rank results with the same Hamming distance.

To generate the weight of each semantic class, two criteria are considered:
  1. Intra-class similarity -
    Image in the same class should remain close together after the re-weighting transform.
    This is formulated as:
    where ai is the weight of i'th semantic class, ci is the center of the i'th semantic class, and x is an image.
  2. Inter-class relationship -
    The distance relationship of each class should remain the same.
    This is formulated as:
    where sij is the distance between the center of class i and j.
So the problem is formulated as
and can be transformed into a quadratic-programing (QP) like problem
which can be solved iteratively.

As for the weight of query image, it is calculated using a pseudo-relevance feedback method. The weight is the average of the semantic class weight of the top k  most similar images. The distance metric used in the system is

Comment:
1. The usage of semantic class is, although somehow reasonable, lack of validation. This is especially true when the dataset is more diverse in sense of semantic class or lack of labeling.

沒有留言:

張貼留言