Tuesday, March 1, 2011

Distinctive Image Features from Scale-Invariant Keypoints

This paper introduces SIFT, a widely used local feature that has been shown to work well in many applications, especially object recognition and image matching.


The SIFT feature has two main parts, extracted in four stages:
  1. Feature point identification
    1. Scale-space extrema detection -
      Identify interest points by searching for scale-space extrema. This is done by finding local extrema (both maxima and minima) of the difference-of-Gaussian (DoG) convolved image, comparing each sample with its neighbours in the current and adjacent scales. The DoG is a close approximation of the scale-normalized Laplacian of Gaussian, whose extrema have been shown to produce stable features.
      Subsequent computations are performed at the scale at which each point was detected, and are thus scale invariant.
    2. Keypoint localization -
      The position of each interest point is then located more precisely by finding the exact local extremum. This is done by expanding the DoG function around the interest point found in step 1 as a Taylor series up to second order and locating the extremum analytically.
      The keypoints found are then filtered to eliminate poor keypoints that arise from edge responses. This is done by thresholding the ratio of the two principal curvatures of the DoG function (the larger to the smaller), which is large on an edge.
    3. Orientation assignment -
      Each keypoint is then assigned a principal direction. The direction is determined by building an orientation histogram of the image gradients around the keypoint, with the maximum being the principal direction. If the histogram has multiple strong peaks (in the paper, peaks within 80% of the highest one), a keypoint is created for each direction.
      Subsequent computations are performed relative to the keypoint's direction, and are thus rotation invariant.
  2. Feature point descriptor
    1. Keypoint descriptor -
      The descriptor is a set of image gradient histograms. The region around the keypoint is divided into a 4*4 array of subregions, and a gradient histogram with 8 orientation bins is created for each subregion. Each keypoint is therefore described by a 4*4*8 = 128 dimensional descriptor, along with its position, scale and orientation.
      The descriptor is invariant to affine changes in illumination: a brightness offset does not affect the gradients, and a contrast change is cancelled by normalizing the descriptor to unit length.
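
The two keypoint tests in stages 1 and 2 can be sketched in a few lines of plain Python. This is only a minimal illustration, not the paper's implementation: it assumes the DoG layers are already computed and stored as nested lists indexed as dog[scale][y][x], the function names are made up for this post, and the Hessian is estimated with simple finite differences. The threshold r = 10 is the value used in Lowe's paper.

```python
def is_scale_space_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly larger or strictly smaller
    than all 26 neighbours in the current and the two adjacent scales."""
    center = dog[s][y][x]
    neighbours = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return center > max(neighbours) or center < min(neighbours)


def passes_edge_test(layer, y, x, r=10.0):
    """Keep a point only if the ratio of principal curvatures of one
    DoG layer is below r, checked via trace^2 / det of the 2x2 Hessian."""
    # Second derivatives by finite differences (an assumption of this sketch).
    dxx = layer[y][x + 1] - 2.0 * layer[y][x] + layer[y][x - 1]
    dyy = layer[y + 1][x] - 2.0 * layer[y][x] + layer[y - 1][x]
    dxy = (layer[y + 1][x + 1] - layer[y + 1][x - 1]
           - layer[y - 1][x + 1] + layer[y - 1][x - 1]) / 4.0
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0.0:
        # Curvatures have opposite signs (or one is zero): reject.
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r


# Toy data: an isolated blob and a nearly straight vertical ridge.
flat = [[0.0, 0.0, 0.0] for _ in range(3)]
blob = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
ridge = [[0.0, 0.95, 0.0], [0.0, 1.0, 0.0], [0.0, 0.95, 0.0]]

print(is_scale_space_extremum([flat, blob, flat], 1, 1, 1))  # True
print(passes_edge_test(blob, 1, 1))   # True: curvature is similar in both directions
print(passes_edge_test(ridge, 1, 1))  # False: strong curvature across the ridge only
```

The trace and determinant are used so that the eigenvalues of the Hessian never have to be computed explicitly; only their ratio matters.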

Several desirable properties make SIFT a successful local feature. It is invariant to rotation and scaling, and partially invariant to affine distortion and changes in illumination. It also extracts large numbers of keypoints from typical images, and its descriptors are highly distinctive.
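
The illumination invariance can be checked on a toy example. The sketch below is a simplification, not the real descriptor: it uses a single 1-D row of pixels instead of 4*4 gradient histograms, and the function names are made up for this post. It only shows why an affine change I' = a*I + b (with a > 0) leaves a unit-normalized gradient vector unchanged: the offset b disappears when pixel differences are taken, and the gain a is cancelled by the normalization.

```python
import math


def gradient_magnitudes(row):
    """Absolute differences of neighbouring pixels in a 1-D row
    (a toy stand-in for image gradient magnitudes)."""
    return [abs(b - a) for a, b in zip(row, row[1:])]


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return list(vec)
    return [v / norm for v in vec]


row = [10.0, 12.0, 20.0, 18.0, 30.0, 31.0]
brighter = [2.0 * p + 15.0 for p in row]  # contrast gain 2, brightness offset 15

d_original = normalize(gradient_magnitudes(row))
d_brighter = normalize(gradient_magnitudes(brighter))

# The two vectors agree up to floating-point rounding.
print(all(abs(u - v) < 1e-9 for u, v in zip(d_original, d_brighter)))  # True
```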
