Image Retrieval Using Flexible image subblocks
ByoungChul Ko
Dept. of Computer Science, Yonsei Univ.
Seoul, Korea, 120-749
+82-2-361-2719
soccer1@aipiri.yosnei.ac.kr
Hae-Sung Lee
Dept. of Computer Science, Yonsei Univ.
Seoul, Korea, 120-749
+82-2-361-2719
geneel@aipiri.yosnei.ac.kr
Hyeran Byun
Dept. of Computer Science, Yonsei Univ.
Seoul, Korea, 120-749
+82-2-361-2719
hrbyun@aipiri.yosnei.ac.kr
Keywords
Color histogram, Central moment, Ohta-color
space, multi-step k-nearest neighbor search, flexible subblock.
ABSTRACT
In
this research, we propose a flexible subblock image retrieval algorithm which
is robust to object translation, lighting change, object appearance or
disappearance in an image which is divided into 9 non-overlapping subblocks.
Furthermore, using Ohta-color
moments and biorthogonal wavelet frames from each subblock, we can reduce the
dimension of color space and improve the performance. Also, as the two features
are applied to the multi-step k-nearest neighbor algorithm, this system clearly
outperforms the global color histogram, RGB moments, and HSI moments using the same method. In addition, we
provide another retrieval environment for the user which is a relative block
location search as well as an absolute block location search. In the case of
the block location search, the
user selects a block the user wants to search.
The presence of vast digital repositories(both on-line and off-line) leads to many schemes of indexing and retrieving such data[1].(e.g. QBIC [2], Netra [3], VisualSeek [4], PickToSeek [5]). The purpose of these schemes is to retrieve similar images rapidly which is what the user is in search of. Most systems, basically, work in the same way: a feature vector is extracted from each image in the database and the set of all feature vectors is organized into a database index. At the query time, a feature is extracted from the query image and a user-provided sketch or an image from an image montage is matched against the feature vectors in the index [6]. Among all of the visual features, color is one of the most dominant and important feature for image representation. As color plays an important role in image composition, many color indexing techniques have been studied. Global color histograms and moments [6] have many advantages [7], but they provide no color-based spatial information. Thus, when the image collection becomes very large, many false hits frequently occur [12]. So, spatial information of color, texture, directional properties and shape are needed to overcome the limitations of global color indexing. In this paper, we propose a new scheme which uses both color moments and directional properties obtained from non-overlapping 9 subblocks. In addition, we apply flexible subblocks and multi-step k-nearest neighbor search algorithm to 9 subblocks along with the two features.
This paper is organized as follows. In the next section, we explain the limitations of the existing methods. Section 3 is dedicated to algorithms for another retrieval approach. In section 4, we propose the new algorithm for image retrieval including color spatial information and geometric properties using multi-step k-nearest neighbor search algorithm. The experimental results and evaluations are explained in section 5. We conclude the article with a remark on this scheme in section 6.
A color histogram is a vector where each entry stores the number of a given color in image [7]. It is a strong cue for retrieval of images and, also, is computationally the least intensive. Additionally, it is robust against occlusion and changes in camera viewpoint. It is widely used for content-based image retrieval. But, color histogram has some limitations.

Figure 1. Two images with
similar color histogram
Since it merely describes which colors are present in the
image, and in what quantities, it provides no spatial information like Figure
1. In addition, color histograms are sensitive to both compression artifacts
and changes in overall image brightness [7]. For example, if the color
histogram of an image
is a vector
, where
is the number of
colors(bin) in the image Q, the
similarity between the two images may be measured using the
. But, image retrieval
which is based on the
produces many
false positives, i.e., it does not retrieve all the images with perceptually
similar color histogram. This happens because perceptually similar color
histograms may be a large
a part from each
other. Changes in lighting which may result in a slight shift in the color
histogram, can cause the
metric to misjudge the similarity completely[8]. So,
histogram intersection[9] is used to retrieve image more than
.
The color
histogram requires additional storage space and a large amount of processing.
For an image of size
, the histogram calculation requires O(
)additions and O(
)increments. In addition, if N is the number of histogram
bins, O(N) operations are required to compare a pair of histograms. To decrease
the computational complexity, the number of bins should be reduced [10]. QBIC
[2] computes the average(RGB) and Munsell color coordinates
of each object and each image. It also computes a k elements color
histogram(primarily k=64). This method quantizes color space to k
levels and reduces the colors in the original image. The second approach to
reducing the number of bins is based on the observation that a small number of
bins capture the majority of pixel counts in a histogram [9]. But, Experiments
have shown that this approach results in a marginal degradation in performance
[10].
An alternative approach to reducing the computational complexity in color indexing is to use the dominant features of a histogram. A color distribution of an image is interpreted as probability distribution which can be characterized by its moment [8]. This moment approach outperforms other histogram-based approaches which are based on reducing the number of bins, in terms of storage requirements, retrieval speed, and robustness. However, the use of low-order moments is sensitive to change in illumination [10]. Gong et al. [11] have proposed the use of local histograms. Here, an image is partitioned into 9(3x3) subimages. For each subimage, a color histogram is generated. The content of the image is represented by the histogram of the entire image and the histograms of the subimages. However, this technique only considers color information, but geometric information is required to retrieve more precise images. An extracted index uses more memory than an index which would store a color histogram for global image. Stricker & Dimai [8] have proposed a technique in which the image is partitioned into 5 overlapping fuzzy regions. They implemented this approach by storing the first three moments of each color channel of an image in an index, i.e., for a HSV image they store only 9 floating point numbers per image. Though this technique is invariant to small translation and rotation of 90 degrees around the center of the image is guaranteed, good results cannot be expected in the case of a large object disappearing or appearing from the same background or abrupt change of lights (Figure. 2). The performance of their algorithm is only illustrated with 3 queries.
From Section 2, we knew that color histogram method has more limitations in comparison to advantages. So, we need to develop a new image retrieval scheme allowing for such characteristics as
¡¤ having invariance to rotation and translation of objects in the images,
¡¤ taking into account the spatial color information as well as
global color information,
¡¤ decreasing storage computational complexity problem.

Figure 2. Same images have the different number of object, translated
positions and abrupt changes of light (viewpoint).
In order to represent more meaningful spatial color information in an image and reduce the index size, we divide the image into 9(3x3) partially non-overlapping subblocks and compute the three central moments of color distributions of each image. In this system, we use Ohta-color space [13] for image representation. The axes of this space are the three largest eigenvectors of RGB space, found from the principal components analysis of a large selection of natural images.
(1)
The advantage of
the Ohta-color space is that the color channels are approximately
decorrelated, which makes it a good choice for computing per channel histograms
[13].
From the Ohta-color
space, we calculate three central moments. The first moment (
) is average color of each image, and the second central
moment (
) is the variance. The third central moment (
) is the skewness of each color channel.
The color index size for one image as follows,
81 = 3 color
channel(I1,I2,I3) ¢¥ 9
subblocks ¢¥ 3 moments
In similarity matching, we are not interested in all of the subblocks but in the seven most similar subblocks. That is, if corresponding subblocks between query and database images are not perfectly matched, then candidate images will be declared if the number of similar subblocks is over seven.
Texture is an important feature of a visible surface where repetition of fundamental pattern occurs. Texture features such as contrast, uniformity, coarseness, roughness, regularity, frequency, density, and directionality provide significant information for scene interpretation and image classification [15]. Several techniques based on features derived from a multiresolution representation of texture image have been reported. Jacob et al. [16] chose Haar wavelets, because they are the fastest to compute and the simplest to implement. The coefficients of these decompositions are distilled into small ¡°signatures¡± for each image. Gabor decomposition [17] and MSAR(Multiresolution Simultaneous Autoregressive Model) [18] or wavelet transform [19] have studied extracting texture features. In this paper, we do not use the classical approaches to extract directional texture information. Rather, we choose the biorhogonal wavelet frames(BWF). Frames are over-complete version of a basis set. If the set of functions or vectors is dependent and yet does allow the expansion described in equation (2), then the set is called a frame. If one is using a frame, a dual frame set can be specified so that analysis and synthesis can be done as r non-orthogonal basis [20].
(2)
where, 
Since, the BWF is wavelet-based, the inherent multiresolution nature of a wavelet provides more flexibility in the analysis of images [21]. Using the BWF, we can get the same size of lowpass and highpass images from the original image. Each highpass images are decomposed again into X-Y directional subimage.

Figure 3. Image decompositon using Biorthogonal Wavelet Frames.(S :
Lowpass filtered image, W1+: X-directional highpass filtered image, W2+:
Y-directional highpass filtered image)
From the first-level highpass filtered images(W11, W21), we calculate X-Y directional amplitude. The extracted directional feature index size for one image as follows,
20 = 9 subblocks x 2(X-Y directional
features of subblock) + 2(Global X-Y directional features)

Fig. 4. First level highpass filtered images (a) Original image (b)
X-directional image (c) Y-directional image
To determine the similarity of two images, we use both the three color moment and the X-Y directional properties. Most CBIR(Content-Based Image Retrieval) systems use the linear constraint model to combine the features [22].
But, the relationship among the features is certainly not linear, so such a metric is difficult to construct. Therefore, this approach is a mistake, even though it is commonly used by researchers working with content-based retrieval [23]. Instead of a linear comparison of features, we use multi-step k-NN(nearest neighbor) search. Multi-step algorithms are available for similarity searches in the presence of complex high-dimensional or even user-adaptable similarity distance functions [23].
In our system, each subblock is treated independently. So, these can not influence other subblocks. Most system using subblocks [6][11] may be influenced by other blocks. The reason is that they use a linear combination of all subblocks, like in equation (3).
(3)
where,

But, in this case, as subblocks are closely correlated each other closely, abrupt change of one or two blocks may influence the matching results.
In this paper, we propose a flexible subblock scheme for color: even though the corresponding subblocks between query and database images are not perfectly matched for color, primary candidate images will be declared if the number of similar subblocks is over seven.

Figure 5. Primary candidate image by flexible subblock method (S: Similar
block , X: dissimilar block)
We use a multi-stage k-nearest
neighbor algorithm in order to search similar images by two features. Along
with similarity range queries, k-nearest neighbor queries are an important
query types in similarity search [23]. We use an adapted version of the
multi-step k-nearest algorithm of [24]. The query image denoted by
, and the parameter k specifies the requested
number of neighbors. After the equation (3) is converted into the equation (4),
distances of corresponding blocks are calculated by queation (4).
(4)
![]()
In the first stage, if the number
of df within a threshold is over 7, then the image will be declared the
primary candidate image. In order to allow the error of flexible subblcoks, we
declare the number of candidate images to be k+4. Then, these primary cadidate
images are ranked by global moment distance.
(5)
![]()
From the ranked primary candidate
images, the maximum
of the global directional distance is determined by using
equation (6).
(6)

In the second stage, a range
query on the index is performed by
returning all primary candidate images that have a directional distance
of, at most
. For all of
these candidates, the exact image distance is evaluted and k closest images are
reported with a new rank using equation (7).
(7)
![]()
We have performed a variety queries using a set of 1,500 images from the WWW containing natural images, synthetic images (e.g. graphics, drawing), and 90 images from Cornell University (http://cs.cornell.edu/rdz) for comparison test with other method. In order to demonstrate the effectiveness of our algorithm, its performance is compared with the RGB histogram using histogram intersection algorithm, and our algorithm using RGB, HSI, Ohta color spaces. Three weight matrices for RGB, HSI, Ohta color moments are given in Table 1.
|
|
R |
G |
B |
H |
S |
I |
I1 |
I2 |
I3 |
|
First Moment |
1 |
1 |
1 |
3 |
2 |
1 |
1 |
1 |
1 |
|
Second Moment |
2 |
2 |
2 |
3 |
1 |
1 |
2 |
2 |
2 |
|
Third Moment |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
Table 1. Three weight matrices
for each color spaces
To evaluate the performances of
each method, recall is estimated by using color images from Cornell University
image data and query results are retrieved from our generated test set of 1,500
color images. The indexing method based on Ohta-color moment and
directional properties is clearly superior to the other methods. Here, we found
that RGB color is superior to HSI color, in the
case of using moments. Retrieval results are shown in Figure 8,9.
The color version of retrieval results are accessible at
http://vip.yonsei.ac.kr/Frip.

Figure 6. Image Retrieval
Performance (%)
We provide the user not only with an absolute block location search, but also another retrieval environment, a relative block location search. In the case of an absolute block location search, the user selects a block that he or she wants to search. The system compares corresponding blocks in the database image using multi-step k-NN algorithm. At this time, directional properties are not global properties but local properties. In the same way, if a user selects the relative block location search, the system compares all of the corresponding blocks in the database image like Figure 7.

Figure 7. Relative block
search
In this research, we propose a flexible subblock image retrieval scheme which is robust to object translation, lighting change, object appearance or disappearance in an image. Furthermore, using the Ohta-color moments and directional properties, we can reduce the dimension of color space, and improve the performance. As the two features are applied to the multi-step k-NN algorithm, it clearly outperforms the global color histogram, RGB moments, and HSI moments using the same method. If we combine this algorithm with shape information in the image, we will be obtain better than what we obtain now.

Figure 8. Retrieval Results (k=8)

Figure 9. Retrieval Results (k=5): left-absolute search, right-relative search
[1] Shih-Fu Chang, William Chen, Hari Sundaram, Semantic Visual Templates: Linking Visual Features to Semantics,
International conference on Image Processing
(ICIP '98), Chicago,
Illinois, October 4-7, 1998.
[2] C. Faloutsos, M.
Flickner, W.Niblack, D. Petkovic, W. Equitz, R. Barber, Efficient and Effective Querying by Image Content, Research Report
#RJ 9203 (81511), IBM Almanden Research Center, San Jose, Aug. 1993.
[3] W.Y.Ma , B.S.
Manjunath, Netra: A toolbox for
navigating large image database, IEEE International Conference on Image
Processing, 1997.
[4] J.R. Smith and
S.F. Chang, VisualSEEK: A Fully Automated
Content-Based Image Query System, ACM Multimedia 1996, Boston MA,Nov,1996.
[5] Theo Gevers and
Arnold W.M. Smeulders, The PicToSeek WWW
Image Search System, International Conference on Multimedia Computing and
System,1999.
[6] Markus Stricker
and Alexander Dimai, Color Indexing with
Weak Spatial Constraints, Storage and Retrieval for Image and Video
Database IV, vol. 2570, SPIE conference, Feb. 96, San Jose.
[7] Greg Pass, Ramin
Zabih, Justin Miller, Comparing Imgaes
Using Color Coherence Vectors, In 4th ACM Conference on
Multimedia, 1996.
[8] Markus Stricker
and Markus Orengo, Similarity of Color
Images, SPIE: Storage Retrieval Image and Video Database III, 2420,
February 1995, 381-392.
[9] M.J. Swain, Interactive indexing into image database,
Proc. SPIE: Storage and Retrieval Image and Video Database 1908, February 1993,
95-103.
[10] F. Idris and S.
Panchanathan, Review of Image and Video
Indexing Techniques, Journal of Visual communication and Image
Representation, Vol. 8, No.2, June, pp.146-166, 1997.
[11] Y. Gong, H. Zhang,
H. Chuant, and M. Skauuchi, An image database system with content capturing and
fast image indexing abilities, in Proceedings of the International Conference
on Multimedia Computing and Systems, May 1994, pp. 121-130.
[12] Yi Tao and William
I. Grosky, Spatial Color Indexing: A
Novel Approach for Content-Based Image Retrieval, International Conference
on Multimedia Computing and System, 1999.
[13] Y-I Ohta, T. Kanade , and T. Sakai, Color information for region segmentation,
Comp. Graph. And Img. Oroc., 13:222-241, 1980.
[14] Martin Szummer and
Rosalind W. Picard, Indoor-Outdoor Image Classification, IEEE Intl
Workshop on Content-based Access of Image and Video Databases, Jan 1998.
[15] M. Tuceryan and A.
K. Jain, Texture analysis, in Handbook of Pattern Recognition and Computer
Vision, pp. 235-276, World Scientific, Singapore, 1993.
[16] Charles E. Jacobs,
Adam Finkelstein, David Salesin, Fast Multiresolution Image Querying, Proc. of
SIGGRAPH 95, NewYork, 1995
[17] W. Y. Ma and B. S.
Manjunath, Image Indexing using a texture
dictionary, Digital Image Storage Archiving Systems 2606, October 1995,
288-298.
[18] H. Zhang and D. Zhong, A
scheme for visual feature based on image indexing, Storage Retrieval Image
Video Database III 2420, February 1995, 36-46.
[19] J. L. Chen and A. Kundu, Rotation and gray scale invariant texture identification using wavelet decomposition and hidden Markov model, IEEE Trans. Patt. Anal. Mach. Intell. 16(2), February, 1994, 208-214.
[20] C. Sideney Burrus, Ramesh A. Gopinath, Haitao Guo, Introduction Wavelets and Wavelet Transforms, A primer, Prentice-Hall, 1998
[21] John-Wei Hsieh, Hong-Yuan Liao, Ming-Tat Ko, and Kuo-Chin Fan, A New Wavelet-Based edge detector via Constrained Optimization, Image and Vision Computing, Vol. 15, ISS. 7, pp. 511-527, 1997.
[22] Hari Sundaram and Shih-Fu Chang, Efficient Video Sequence Retrieval in Large Repositories. Proc. IS&T/SPIE Symposium on Electronic Imaging: Science and Technology(EI ¡¯99), San Jose, CA, January, 1999.
[23] Thomas Seidl ,and Hans-Peter Kriegel, Optimal Multi-Step k-nearest Neighbor Search, Proc. ACM SIGMOD Int. Conf. On Management of Data, Seattle, Washington, June 1998.
[24] Korn F., Sidiropoulos N., Faloutsos C., Siegel E., Protopapas Z., Fast Nearest Neighbor Search in Medical Image Databases, Proc. 22nd VLDB Conference, Mumbai, India, 1996, pp. 215-226.
Columns on Last Page Should Be Made Equal Length