posted on 2000-03-01, 00:00authored byWei-Hao Lin, Rong Jin, Alexander Hauptmann
Web image retrieval is a challenging task that requires efforts from image processing, link structure analysis, and Web text retrieval. Since content-based image retrieval is still considered very difficult, most current large-scale Web image search engines exploit text and link structure to "understand" the content of the Web images. However, local text information, such as caption, filenames and adjacent text, is not always reliable and informative. Therefore, global information should be taken into account when a Web image retrieval system makes relevance judgment. We propose a re-ranking method to improve Web image retrieval by reordering the images retrieved from an image search engine. The re-ranking process is based on a relevance model, which is a probabilistic model that evaluates the relevance of the HTML document linking to the image, and assigns a probability of relevance. The experiment results showed that the re-ranked image retrieval achieved better performance than original Web image retrieval, suggesting the effectiveness of the re-ranking method. The relevance model is learned from the Internet without preparing any training data and independent of the underlying algorithm of the image search engines. The re-ranking process should be applicable to any image search engines with little effort.