Copyright ACM, 2000

Image Retrieval Using Flexible image subblocks

ByoungChul Ko

Dept. of Computer Science, Yonsei Univ.
Seoul, Korea, 120-749

Hae-Sung Lee

Dept. of Computer Science, Yonsei Univ.
Seoul, Korea, 120-749

Hyeran Byun

Dept. of Computer Science, Yonsei Univ.
Seoul, Korea, 120-749



Color histogram, Central moment, Ohta-color space, multi-step k-nearest neighbor search, flexible subblock.


In this research, we propose a flexible subblock image retrieval algorithm which is robust to object translation, lighting change, object appearance or disappearance in an image which is divided into 9 non-overlapping subblocks. Furthermore, using Ohta-color moments and biorthogonal wavelet frames from each subblock, we can reduce the dimension of color space and improve the performance. Also, as the two features are applied to the multi-step k-nearest neighbor algorithm, this system clearly outperforms the global color histogram, RGB moments, and HSI moments using the same method. In addition, we provide another retrieval environment for the user which is a relative block location search as well as an absolute block location search. In the case of the  block location search, the user selects a block the user wants to search.


1.     Introduction

The presence of vast digital repositories(both on-line and off-line) leads to many schemes of indexing and retrieving such data[1].(e.g. QBIC [2], Netra [3], VisualSeek [4], PickToSeek [5]).  The purpose of these schemes is to retrieve similar images rapidly which is what the user is in search of. Most systems, basically, work in the same way: a feature vector is extracted from each image in the database and the set of all feature vectors is organized into a database index. At the query time, a feature is extracted from the query image and a user-provided sketch or an image from an image montage is matched against the feature vectors in the index [6]. Among all of the visual features, color is one of the most dominant and important feature for image representation. As color plays an important role in image composition, many color indexing techniques have been studied. Global color histograms and moments [6] have many advantages [7], but they provide no color-based spatial information. Thus, when the image collection becomes very large, many false hits frequently occur [12]. So, spatial information of color, texture, directional properties and shape are needed to overcome the limitations of global color indexing. In this paper, we propose a new scheme which uses both color moments and directional properties obtained from non-overlapping 9 subblocks. In addition, we apply flexible subblocks and multi-step k-nearest neighbor search algorithm to 9 subblocks along with the two features.


This paper is organized as follows. In the next section, we explain the limitations of the existing methods. Section 3 is dedicated to algorithms for another retrieval approach. In section 4, we propose the new algorithm for image retrieval including color spatial information and geometric properties using multi-step k-nearest neighbor search algorithm. The experimental results and evaluations are explained in section 5. We conclude the article with a remark on this scheme in section 6.

2.     Color Indexing Method

2.1     The limitation of color histogram

A color histogram is a vector where each entry stores the number of a given color in image [7]. It is a strong cue for retrieval of images and, also, is computationally the least intensive. Additionally, it is robust against occlusion and changes in camera viewpoint. It is widely used for content-based image retrieval.  But, color histogram has some limitations.

Figure 1. Two images with similar color histogram

 Since it merely describes which colors are present in the image, and in what quantities, it provides no spatial information like Figure 1. In addition, color histograms are sensitive to both compression artifacts and changes in overall image brightness [7]. For example, if the color histogram of an image is a vector , where  is the number of colors(bin) in the image Q, the similarity between the two images may be measured using the . But, image retrieval  which is based on the  produces many false positives, i.e., it does not retrieve all the images with perceptually similar color histogram. This happens because perceptually similar color histograms may be a large  a part from each other. Changes in lighting which may result in a slight shift in the color histogram, can cause the metric to misjudge the similarity completely[8]. So, histogram intersection[9] is used to retrieve image more than .

2.2     Other limitations of color histogram

The color histogram requires additional storage space and a large amount of processing. For an image of size, the histogram calculation requires O()additions and O()increments. In addition, if N is the number of histogram bins, O(N) operations are required to compare a pair of histograms. To decrease the computational complexity, the number of bins should be reduced [10]. QBIC [2] computes the average(RGB) and Munsell color coordinates of each object and each image. It also computes a k elements color histogram(primarily k=64). This method quantizes color space to k levels and reduces the colors in the original image. The second approach to reducing the number of bins is based on the observation that a small number of bins capture the majority of pixel counts in a histogram [9]. But, Experiments have shown that this approach results in a marginal degradation in performance [10].

2.3     An alternative approach against color histogram

An alternative approach to reducing the computational complexity in color indexing is to use the dominant features of a histogram. A color distribution of an image is interpreted as probability distribution which can be characterized by its moment [8]. This moment approach outperforms other histogram-based approaches which are based on reducing the number of bins, in terms of storage requirements, retrieval speed, and robustness. However, the use of low-order moments is sensitive to change in illumination [10]. Gong et al. [11] have proposed the use of local histograms. Here, an image is partitioned into 9(3x3) subimages. For each subimage, a color histogram is generated. The content of the image is represented by the histogram of the entire image and the histograms of the subimages. However, this technique only considers color information, but geometric information is required to retrieve more precise images. An extracted index uses more memory than an index which would store a color histogram for global image. Stricker & Dimai [8] have proposed a technique in which the image is partitioned into 5 overlapping fuzzy regions. They implemented this approach by storing the first three moments of each color channel of an image in an index, i.e., for a HSV image they store only 9 floating point numbers per image. Though this technique is invariant to small translation and rotation of 90 degrees around the center of the image is guaranteed, good results cannot be expected in the case of a large object disappearing or appearing from the same background or abrupt change of lights (Figure. 2). The performance of their algorithm is only illustrated with 3 queries.

3.     Block-Based segmentation of image

  From Section 2, we knew that color histogram method has more limitations in comparison to advantages. So, we need to develop a new image retrieval scheme allowing for such characteristics as

· having invariance to rotation and translation of objects in the images,

· taking into account the spatial color information as well as        

  global color information,

· decreasing storage computational complexity problem.

Figure 2. Same images have the different number of object, translated positions and abrupt changes of light (viewpoint).

3.1     Block-Based features extraction

In order to represent more meaningful spatial color information in an image and reduce the index size, we divide the image into 9(3x3) partially non-overlapping subblocks and compute the three central moments of color distributions of each image. In this system, we use Ohta-color space [13] for image representation. The axes of this space are the three largest eigenvectors of RGB space, found from the principal components analysis of a large selection of natural images.


The advantage of the Ohta-color space is that the color channels are approximately decorrelated, which makes it a good choice for computing per channel histograms [13].

3.1.1     Color Properties

From the Ohta-color space, we calculate three central moments. The first moment () is average color of each image, and the second central moment () is the variance. The third central moment () is the skewness of each color channel. 

The color index size for one image as follows,

81 = 3 color channel(I1,I2,I3) ´ 9 subblocks ´ 3 moments

In similarity matching, we are not interested in all of the subblocks but in the seven most similar subblocks. That is, if corresponding subblocks between query and database images are not perfectly matched, then candidate images will be declared if the number of similar subblocks is over seven.  

3.1.2     Directional Properties

Texture is an important feature of a visible surface where repetition of fundamental pattern occurs. Texture features such as contrast, uniformity, coarseness, roughness, regularity, frequency, density, and directionality provide significant information for scene interpretation and image classification [15]. Several techniques based on features derived from a multiresolution representation of texture image have been reported. Jacob et al. [16] chose Haar wavelets, because they are the fastest to compute and the simplest to implement. The coefficients of these decompositions are distilled into small “signatures” for each image. Gabor decomposition [17] and MSAR(Multiresolution Simultaneous Autoregressive Model) [18] or wavelet transform [19]  have studied extracting texture features.  In this paper, we do not use the classical approaches to extract directional texture information. Rather, we choose the biorhogonal wavelet frames(BWF).  Frames are over-complete version of a basis set. If the set of functions or vectors is dependent and yet does allow the expansion described in equation (2), then the set is called a frame. If one is using a frame, a dual frame set can be specified so that analysis and synthesis can be done as r non-orthogonal basis [20].



Since, the BWF is wavelet-based, the inherent multiresolution nature of a wavelet provides more flexibility in the analysis of images [21]. Using the BWF, we can get the same size of lowpass and highpass images from the original image. Each highpass images are decomposed again into X-Y directional subimage.

Figure 3. Image decompositon using Biorthogonal Wavelet Frames.(S : Lowpass filtered image, W1+: X-directional highpass filtered image, W2+: Y-directional highpass filtered image)

From the first-level highpass filtered images(W11, W21), we calculate X-Y directional amplitude. The extracted directional feature index size for one image as follows,

20 = 9 subblocks x 2(X-Y directional features of subblock) + 2(Global X-Y directional features)

Fig. 4. First level highpass filtered images (a) Original image (b) X-directional image (c) Y-directional image

4.     Similarity matching method using multi-step k-NN search

To determine the similarity of two images, we use both the three color moment and the X-Y directional properties. Most CBIR(Content-Based Image Retrieval) systems use the linear constraint model to combine the features [22].

But, the relationship among the features is certainly not linear, so such a metric is difficult to construct. Therefore, this approach is a mistake, even though it is commonly used by researchers working with content-based retrieval [23]. Instead of a linear comparison of features, we use multi-step k-NN(nearest neighbor) search. Multi-step algorithms are available for similarity searches in the presence of complex high-dimensional or even user-adaptable similarity distance functions [23].

4.1     Flexible subblock

In our system, each subblock is treated independently. So, these can not influence other subblocks. Most system using subblocks [6][11] may be influenced by other blocks. The reason is that they use a linear combination of all subblocks, like in equation (3).



But, in this case, as subblocks are closely correlated each other closely, abrupt change of one or two blocks may influence the matching results.

In this paper, we propose a flexible subblock scheme for color: even though the corresponding subblocks between query and database images are not perfectly matched for color, primary candidate images will be declared if the number of similar subblocks is over seven.


Figure 5. Primary candidate image by flexible subblock method (S: Similar block , X: dissimilar block)

4.2     Matching procedure

We use a multi-stage k-nearest neighbor algorithm in order to search similar images by two features. Along with similarity range queries, k-nearest neighbor queries are an important query types in similarity search [23]. We use an adapted version of the multi-step k-nearest algorithm of [24]. The query image denoted by , and the parameter k specifies the requested number of neighbors. After the equation (3) is converted into the equation (4), distances of corresponding blocks are calculated by queation (4).


In the first stage, if the number of df  within a threshold is over 7, then the image will be declared the primary candidate image. In order to allow the error of flexible subblcoks, we declare the number of candidate images to be k+4. Then, these primary cadidate images are ranked by global moment distance.


From the ranked primary candidate images, the maximum of the global directional distance is determined by using equation (6).


In the second stage, a range query on the index is performed by  returning all primary candidate images that have a directional distance of, at most .  For all of these candidates, the exact image distance is evaluted and k closest images are reported with a new rank using equation (7).



5.     Evaluation and Results

We have performed a variety queries using a set of 1,500 images from the WWW containing natural images, synthetic images (e.g. graphics, drawing), and 90 images from Cornell University ( for comparison test with other method. In order to demonstrate the effectiveness of our algorithm, its performance is compared with the RGB histogram using histogram intersection algorithm, and our algorithm using RGB, HSI, Ohta color spaces. Three weight matrices for RGB, HSI, Ohta color moments are given in Table 1.











First Moment
































Table 1. Three weight matrices for each color spaces

To evaluate the performances of each method, recall is estimated by using color images from Cornell University image data and query results are retrieved from our generated test set of 1,500 color images. The indexing method based on Ohta-color moment and directional properties is clearly superior to the other methods. Here, we found that RGB color is superior to HSI color, in the case of using moments. Retrieval results are shown in Figure 8,9. The color version of retrieval results are accessible at

Figure 6. Image Retrieval Performance (%)

5.1     Query by block Search

We provide the user not only with an absolute block location search, but also another retrieval environment, a relative block location search. In the case of an absolute block location search, the user selects a block that he or she wants to search. The system compares corresponding blocks in the database image using multi-step k-NN algorithm. At this time, directional properties are not global properties but local properties. In the same way, if a user selects the relative block location search, the system compares all of the corresponding blocks in the database image like Figure 7.

Figure 7. Relative block search

6.     Conclusion

In this research, we propose a flexible subblock image retrieval scheme which is robust to object translation, lighting change, object appearance or disappearance in an image. Furthermore, using the Ohta-color moments and directional properties, we can reduce the dimension of color space, and improve the performance. As the two features are applied to the multi-step k-NN algorithm, it clearly outperforms the global color histogram, RGB moments, and HSI moments using the same method. If we combine this algorithm with shape information in the image, we will be obtain better than what we obtain now.



Figure 8. Retrieval Results (k=8)


Figure 9. Retrieval Results (k=5): left-absolute search, right-relative search


[1]     Shih-Fu Chang, William Chen, Hari Sundaram, Semantic Visual Templates: Linking Visual Features to Semantics, International conference on Image Processing (ICIP     '98), Chicago, Illinois, October 4-7, 1998.

[2]     C. Faloutsos, M. Flickner, W.Niblack, D. Petkovic, W. Equitz, R. Barber, Efficient and Effective Querying by Image Content, Research Report #RJ 9203 (81511), IBM Almanden Research Center, San Jose, Aug. 1993.

[3]     W.Y.Ma , B.S. Manjunath, Netra: A toolbox for navigating large image database, IEEE International Conference on Image Processing, 1997.

[4]     J.R. Smith and S.F. Chang, VisualSEEK: A Fully Automated Content-Based Image Query System, ACM Multimedia 1996, Boston MA,Nov,1996.

[5]     Theo Gevers and Arnold W.M. Smeulders, The PicToSeek WWW Image Search System, International Conference on Multimedia Computing and System,1999.

[6]     Markus Stricker and Alexander Dimai, Color Indexing with Weak Spatial Constraints, Storage and Retrieval for Image and Video Database IV, vol. 2570, SPIE conference, Feb. 96, San Jose.

[7]     Greg Pass, Ramin Zabih, Justin Miller, Comparing Imgaes Using Color Coherence Vectors, In 4th ACM Conference on Multimedia, 1996.

[8]     Markus Stricker and Markus Orengo, Similarity of Color Images, SPIE: Storage Retrieval Image and Video Database III, 2420, February 1995, 381-392.

[9]     M.J. Swain, Interactive indexing into image database, Proc. SPIE: Storage and Retrieval Image and Video Database 1908, February 1993, 95-103.

[10]  F. Idris and S. Panchanathan, Review of Image and Video Indexing Techniques, Journal of Visual communication and Image Representation, Vol. 8, No.2, June, pp.146-166, 1997.

[11]  Y. Gong, H. Zhang, H. Chuant, and M. Skauuchi, An image database system with content capturing and fast image indexing abilities, in Proceedings of the International Conference on Multimedia Computing and Systems, May 1994, pp. 121-130.

[12]  Yi Tao and William I. Grosky, Spatial Color Indexing: A Novel Approach for Content-Based Image Retrieval, International Conference on Multimedia Computing and System, 1999.

[13]   Y-I Ohta, T. Kanade , and T. Sakai, Color information for region segmentation, Comp. Graph. And Img. Oroc., 13:222-241, 1980.

[14]  Martin Szummer and Rosalind W. Picard, Indoor-Outdoor Image Classification, IEEE Intl Workshop on Content-based Access of Image and Video Databases, Jan 1998.

[15]  M. Tuceryan and A. K. Jain, Texture analysis, in Handbook of Pattern Recognition and Computer Vision, pp. 235-276, World Scientific, Singapore, 1993.

[16]  Charles E. Jacobs, Adam Finkelstein, David Salesin, Fast Multiresolution Image Querying, Proc. of SIGGRAPH 95, NewYork, 1995

[17] W. Y. Ma and B. S. Manjunath, Image Indexing using a texture dictionary, Digital Image Storage Archiving Systems 2606, October 1995, 288-298.

[18] H. Zhang and D. Zhong, A scheme for visual feature based on image indexing, Storage Retrieval Image Video Database III 2420, February 1995, 36-46.

[19] J. L. Chen and A. Kundu, Rotation and gray scale invariant texture identification using wavelet decomposition and hidden Markov model, IEEE Trans. Patt. Anal. Mach. Intell. 16(2), February, 1994, 208-214.

[20] C. Sideney Burrus, Ramesh A. Gopinath, Haitao Guo, Introduction Wavelets and Wavelet Transforms, A primer, Prentice-Hall, 1998

[21] John-Wei Hsieh, Hong-Yuan Liao, Ming-Tat Ko, and Kuo-Chin Fan, A New Wavelet-Based edge detector via Constrained Optimization, Image and Vision Computing, Vol. 15, ISS. 7, pp. 511-527, 1997.

[22] Hari Sundaram and Shih-Fu Chang, Efficient Video Sequence Retrieval in Large Repositories. Proc. IS&T/SPIE Symposium on Electronic Imaging: Science and Technology(EI ’99), San Jose, CA, January, 1999.

[23] Thomas Seidl ,and Hans-Peter Kriegel, Optimal Multi-Step k-nearest Neighbor Search, Proc. ACM SIGMOD Int. Conf. On Management of Data, Seattle, Washington, June 1998.

[24] Korn F., Sidiropoulos N., Faloutsos C., Siegel E., Protopapas Z., Fast Nearest Neighbor Search in Medical Image Databases, Proc. 22nd VLDB Conference, Mumbai, India, 1996, pp. 215-226.





































Columns on Last Page Should Be Made Equal Length

Copyright 2000 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and or fee.
SAC 2000 March 19-21 Como, Italy
(c) 2000 ACM 1-58113-239-5/00/003>...>$5.00