Text retrieval has a long research history and is a relatively well-studied problem, with many promising techniques available. These techniques have been applied in many applications, such as search engines, and have proven very successful. Image retrieval, on the other hand, is a relatively new problem and is not as mature as text retrieval. It would therefore be very helpful if text retrieval techniques could be applied to image retrieval, which is the aim of this paper.
The key idea of the paper is to transform low-level visual features into "visual words". Low-level visual features are first obtained by detecting interest regions and representing each region with a SIFT feature descriptor. A visual vocabulary is then built by clustering these descriptors: the center of each cluster is a "visual word", and each extracted feature point is assigned to the visual word of its cluster. These visual words are analogous to regular words in text, so text retrieval techniques can then be applied.
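To make the vocabulary-building step concrete, here is a minimal sketch. It assumes k-means as the clustering method (the summary above only says "clustering"), uses toy 2-D points in place of real 128-dimensional SIFT descriptors, and picks a simple farthest-first initialization so the result is deterministic. The function names `build_vocabulary` and `quantize` are made up for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, d):
    """Index of the cluster center closest to descriptor d."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], d))


def build_vocabulary(descriptors, k, iters=20):
    """Cluster descriptors with plain k-means; each center is one visual word."""
    # Farthest-first initialization (an assumption, chosen for determinism).
    centers = [list(descriptors[0])]
    while len(centers) < k:
        far = max(descriptors, key=lambda d: min(sq_dist(c, d) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(centers, d)].append(d)
        for i, g in enumerate(groups):
            if g:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers


def quantize(centers, descriptors):
    """Map every descriptor to the id of its nearest visual word."""
    return [nearest(centers, d) for d in descriptors]


# Toy data: two well-separated groups of 2-D "descriptors".
toy = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (5.0, 5.1), (5.1, 5.0), (4.9, 5.0)]
vocab = build_vocabulary(toy, k=2)
words = quantize(vocab, toy)
```

After this step an image is no longer a set of descriptors but a list of integer word ids, which is what lets the text retrieval machinery take over.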
After visual words are obtained, they are processed as if they were words in a text, and the structure of the retrieval system follows that of a regular text retrieval system:
- Building Vocabulary -
- Term weighting using tf-idf
- Stop word removal
- Indexing -
- Inverted file
- Retrieval Model -
- Vector space model
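The pipeline listed above can be sketched in a few lines. This is only an illustration under stated assumptions: images are given as lists of visual word ids, the stop list simply drops a fraction of the words with the highest document frequency, tf-idf uses the common `tf * log(N / df)` form, and ranking is by cosine similarity. The names `build_index`, `search`, and `stop_fraction` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tfidf(words, idf):
    """tf-idf vector (as a dict) over the words that survived the stop list."""
    kept = [w for w in words if w in idf]
    return {w: (c / len(kept)) * idf[w] for w, c in Counter(kept).items()}


def build_index(docs, stop_fraction=0.0):
    """docs maps image id -> list of visual word ids."""
    df = Counter()
    for words in docs.values():
        df.update(set(words))
    # Stop list: the most frequent visual words (fraction is an assumption).
    stop = {w for w, _ in df.most_common(int(len(df) * stop_fraction))}
    idf = {w: math.log(len(docs) / c) for w, c in df.items() if w not in stop}
    vectors = {doc_id: tfidf(words, idf) for doc_id, words in docs.items()}
    # Inverted file: visual word -> set of images containing it.
    inverted = defaultdict(set)
    for doc_id, vec in vectors.items():
        for w in vec:
            inverted[w].add(doc_id)
    return idf, vectors, inverted


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query_words, idf, vectors, inverted):
    """Rank only the images that share at least one word with the query."""
    q = tfidf(query_words, idf)
    candidates = set()
    for w in q:
        candidates |= inverted.get(w, set())
    scored = [(cosine(q, vectors[d]), d) for d in candidates]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


# Toy collection: word 0 occurs in every image and ends up on the stop list.
docs = {"a": [0, 0, 1, 2], "b": [0, 3, 3, 4], "c": [0, 5, 5, 1]}
idf, vectors, inverted = build_index(docs, stop_fraction=0.2)
results = search([1, 2], idf, vectors, inverted)
```

The inverted file is where the speed comes from: a query only touches images that share at least one visual word with it, instead of being compared against the whole collection.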
The main benefit of the proposed method is its time efficiency. With very simple techniques, it can greatly reduce retrieval time while returning results comparable to those of a pure visual matching method.