Computer Graphics & Geometry
Deputy of Chief of the CST department
Moscow Engineering Physics Institute
Abstracts The problem of image segmentation using brightness and color features including originally developed methods as well as methods of segments description applicable for future classification and comparison are reviewing in this paper.
The problem of image segmentation is one of important parts in most image processing and recognition applications. Whole system performance usually depends on the segmentation process quality.
This paper consists of some descriptions of original and enhanced methodics of image segmentation (using fixed threshold or dynamically changing criteria; iterational algorithms) in compare with wide- known segmentation algorithms. Another issue of this paper is to define a set of segment features allowing us to create computationally low-cost and effective methods of patches comparison and classification.
Experimental processing system PDem have been created to investigate problems related with brightness and color criteria in the patches segmentation process and segmentation descripting.
All methods are realized in practice.
Key words: images processing; images segmentation; machine vision; computer graphics ; objects from the picture.
The input image segmentation plays very important role in image processing aimed to objects description squeezing. The segmentation in common case may be defined as breaking image for some parts (patches) by similarity of the involved pixels properties.
It is some number of common segmentation algorithms. In most cases brightness- based segmentation used for monochrome images or color coordinate- based segmentation for color images. However, it's possible to make a segmentation using another features such as texture, shape or contours.
One of the simplest segmentation algorithms using brightens is binarization with following breaking the bi-colored image on 'black' and 'white' parts. Because of it's simplicity this method is widely used. But certain faults not allow to apply binarization on any image. The main fault of this method is an unacceptability for images with different light levels in different scene parts.
Anyway, for the pictures that contains objects with homogenous brightness on the homogenous background this method (like a black text on the white paper) may be used.
For these images the brightness can be a distinctive feature for target object separation. An important problem for this method is to find the optimal binarization threshold. If the choice of binarization threshold is not optimal, some sought objects (for example letters) can disappear or a part of noised background can be appended to the object.
A few analytical approaches for definition of the binarization optimal threshold are known. One of the methods is to establish the threshold on that value so the total sum of elements with underthreshold brightness is adjusted with apriory probability of these brightness values . For example, if it is known that black signs on the text page take 25% from the total square, so the threshold level for this image is necessary to establish in such way that brightness of fourth of elements will be lower the threshold.
Another approach concerning brightness threshold limitation is to choose the threshold corresponding with a bimodal histogram minimum and that is disposed between it's two peaks.  The definition of this minimal value often is embarrassed because of the locally non-monotonous stepped histogram. So a part of the histogram between the peaks is approximated with an analytical function, and it's minimum can be founded with calculation of derivatives. For example, if the histogram is approximated near it's minimum with a simple expression
y= ax2 + bx + c, the histogram minimum will be when x=-b/2a.
Also, for brightness threshold definition the Laplace operator can be used.
Let's consider image area in object region where the brightness increases from low "plateau" level to high "plateau" level that are connected with a slanted surface. On a flat sections the Laplacian is equal to zero and on slanted sections is nearly a zero. On areas where the transition from low "plateau" is made, the Laplacian will has big positive value, and where the transition to high "plateau" is made - big negative value. A histogram which is constructed using source image points corresponding to very high or very low values of Laplacian only, will be bimodal with a clear "valley" between the peaks. There are another methods of brightness threshold determination using another procedures of brightness differential separation.
If image background is not homogeneous, it is necessary to make the brightness threshold adapting to it's middle level. It can be performed with breaking images to little fragments and establishing thresholds for every fragment. The thresholds for every element of the image can be determinated with interpolation between fragment's centers.
2. Using Additional Features
The idea of image binarization using brightness criterion can be propagated to color images segmentation. There are methodics of source image breaking to areas based on threshold limitation in color coordinate systems like RGB, YIQ and in system intensity � color(hue)- saturation that are defined (Tenninbaum, Stanford Research Institute) as
Intensity I = R + G + B
Saturation S = 1-3min(r,g,b), where r=R/I; g=G/I; b=B/I
There are at least two methods of working with color coordinates. According to first of them, originating from all color components, a common coefficient is defined, which compares how close threshold level and color characteristics of viewed image pixel are disposed. Root-mean-square deviation can be considered as a simple example. According to another method, the comparison is made consistently for every color characteristic, starting with more significant one.
Thus all discussed above methods for monochrome images comparison, are suitable for color processing as subprocedures.
Let's consider some segmentation types.
A lot of works pay a big attention to texture segmentation. However, complexity of disclosing and texture parameters measuring hinders practical using of this idea.
For example, one of the approaches to texture segmentation  consists in calculation of some degree of texture granularity in all image points, with following definition this degree changes. In fact, source image is exposed to preliminary processing to transform it to brightness image. The main trouble of this approach consists in the fact that the measuring of a texture parameters performs in some window. So their values in border areas between texture areas, are average. As a result it is hard to localize the border between texture areas.
It is often useful in image recognition problems to divide a complex form object into connected set of parts with simple form, that can be easily described. This form segmentation can be done in two steps. Firstly the arbitrary form object is approximated with set of connected linear or curvilinear segments. Then the approximated form is exposed with segmentation by salient points. The main rules of segmentation are quite simple: nearest points of concavity are joined, forming a 'isthmuses', which then are removed from main body; the process is continued until all parts will become convex.
In spite of attraction and simplicity of described above segmentation methods, that are based on brightness, they have a set of valuable defects. The basic lack is the necessarity of same level of scene light for all part of an image, and the impossibility to separate two or more objects from one scene if they have different levels of light or quite different color characteristics.
To get better image segmentation a few another methods based on brightness segmentation which is described above, can be used. For example, in accordance with one of these methods, the histogram is broken into more than two zones, and then the source image limitation by brightness with dividing to a few layers is made.
In simple case this method means the histogram separation into several parts with the same length, and it not always gets the reasonable results.
More perfect approach requires to analyze the image histogram if there several local minimums. So in a presented on the picture histogram two local minimums easily can be found. Thus all source image pixels belongs not to two categories (like in binarization case), but to three categories and then an separated areas analysis will be done for every of three groups of pixels.
Definition of several minimum points is more complex problem in comparison with definition of the only minimum like in binarization case. It can be achieved, for example, with analysis of changing a sign of derivative of histogram with preliminary suppressed noise. Noise suppression can be made with a set of simple filters, for example, in simple case, new value of i column height of histogram can be calculated as
When real world scenes images are processed there are usually a big range of submitted colors and shades. In this case described above methods don't allow to get an acceptable result. Much better to make the segmentation of image with local threshold adjustment, for example, as it was described for binarization case. However this approach is not ideal also.
With the purpose of overcoming these restrictions new methods of segmentations described in this article were developed.
3. Dynamic segmentation
It is better in this situation to use segmentation with dynamic threshold , to which this article is dedicated. Using dynamic threshold allows to eliminate troubles related to necessarily to define a set of threshold values for whole image and errors of local thresholds determination.
Method of segmentation with dynamic threshold is based on representation of image as a set of "patches", that are a set of four- or eight- connected points with close colour- brightness characteristics. As against of that, the threshold limitation methods that are described above, firstly divide the image into several "layers" and then separate segments from binary "layer". If the "layerization" is used, it is easily can be get a situation when adjacent segments with close characteristics and belonging to the same object will be refer to different "layers" and then their following consolidation and referring to the same object will be embarrassed.
To make a segmentation with dynamic threshold firstly it is necessary to define a value of acceptable color-brightness characteristics range of pixels, that are belonging to one segment . The task of getting such threshold value can be solved basing on desired size of segments and their number.
Each segment separation begins with fixing of a pixel which presumable is inside the future segment. Problem of this pixel choice will be described below. The fixed pixel will be the first segment pixel and her brightness (color characteristics) will be brightness reference level of segment. Further we will call this level «base brightness» or «base color» for color images.
Then the border image pixels are examined. If pixel brightness is differs from base value not more than the point is added to the segment. Forming of the segment is finished if there are no more pixels that are borders to segment pixels and satisfying tocondition. After that a new start pixel is defined and the process repeat again until all image pixels are exhausted.
A set of segments is a result of this algorithm execution. These segments are separated by brightness criteria, each segment separation is done maximum independently from the whole image without blurring of threshold value in border points that usually takes place when simple threshold brightness limitation with threshold definition in every image point is made.
However, this processing scheme not always brings good results because it consist fixed criterion , as well as in the earlier described methods. The result of this rigid limitation will be that in this case, like in case of simple threshold limitation, the image part with smooth varied brightness (and representing single object with high probability) will be divided on some segments, that will complicate in the further task of recognition.
One of the possible ways how to avoid this impasse is to apply «Dynamical » threshold. That means that the threshold may be changed not only if switched from one segment to another, but also may be changed inside segment when processing. In original algorithm this may be achieved as the following. On every step of the described above algorithm the brightness of just added pixel takes in account. The updated value of the base brightness will be used when the next pixel checking. We offer to use one of two expressions for recalculating of the new value of base brightness :
1) , where is the brightness of pixel j.
And the criterion of an belonging of a boundary point is expressed as
It is easy to see, that (1) is a special case of (2) for k = 1. In this case is the arithmetic average of brightness already of joined pixels.
For k> 1 there is as strengthening of influence earlier joined pixels, and for k < 1, on the contrary, 'forgetting' of an initial level.
As the joining of borders occurs radialy from center (in an ideal case), it is more reasonable to apply k> 1, that will allow to keep dominant meaning of brightness of the image central area. In practice the best results are achieved for k ~ 1,01 for average pixels amount in a segment about 200. The best value of this factor is strongly depends on the amount of pixels. For reduction of influence of amount of pixels the common expression of the factor can be determined as, for example
Summing up this section, it is possible to tell, that such dynamically varying criterion of pixels selection , belonging to a segment, allows to make brightness segmentation of the images containing smooth varying of tones better. Thus the required computing resources are increased insignificantly (especially for k = const).
Let's return to a task of using described method to full-color images. In the given context the expression "full-color" means not only use of colors for representation of the information, but also as an antonym to expression "pseudo-color" the images, such, as, for example, thermogrammas.
In computer processing of the images some systems of color representation may be found usable. Most widespread is the RGB system, in which the colors are represented as a combination of red, green and blue components. Despite of affinity of such color representation to a biological mechanisms (the humans perceives colors dividing it on these three components), such system has a number of lacks.
One of closest to psycho- physical perception of color by the human is system "intensity-color tone - saturation", considered above, however it requires rather complicated transformations from RGB standard.
Much easier looks the problem with color translation from RGB to HSV (hue, saturation, value), that very close to mentioned "intensity-color tone-saturation" system. 
The translation from RGB to HSV system may be carried out by the following expressions: 
Such transformation performs in integers, without use of trigonometrical functions, and consequently with the least computing expenses.
The described above method of image segmentation with a dynamic threshold was realized for full-color images with use RGB and HSV systems of color representations.
Because in RGB representation all components are equivalent (in terms of importance), may be evaluated as "color distance", i.e. distance between two points in 3D color cube with axes R, G and B. The criterion, on which the selection of pixels to be joined to a segment performs as expressed:
The color of the initial images is not an obstacle for realization of dynamic change of base color. The only difference from a case of the monochrome image consists in necessity of a storage of base color (or colors of pixels) in triple variables, and also the complicated calculation of a color difference on the basis of above mentioned criteria.
Despite using of dynamic base color in construction of the image segment the set of segments received on an output of this method, not always reflects a real state of things. Frequently, the area of the image appropriate to one surface on a scene and having close characteristics appears broken on some segments. Therefore the next important phase of overall process of segmentation is the association of several neighboring segments in one, provided that their color characteristics are close. For achievement of good results it is required to define very complex rules of segments association. So, in the given project the simplified algorithms of revealing of segments subject to association were used. The necessity of affinity of the color characteristics determined how was described above, lays in their basis (criterion) as also necessity of a close position of segments. Additional parameter used in described work, is a "desirable minimum number of pixels in a segment". If joining segment contains amount of pixels, smaller than necessary, the probability of its joining grows.
Let's return to a question of a finding of an authentic start pixel, from which a segment grows in a method with a dynamic threshold.
The two-step processing of the image was applied for this purpose in described system. At the first stage the choice of start pixels performs arbitrary. Then the centers of areas are determined, and they are accepted on base points of segments for the second iteration of segmentation(Red marks shows the center this ). As the practice shows, for achievement of the best result it is usable to choose for the first stage larger, than for second.
4. Segments description
The next important stage in image analyzes and processing is to describe segment (patch) properties in order to future comparison and classification.
We propose segments description based on the set of simple geometrical and grammatical features.
|1) Outer sizes and geometrical center is the simplest features and define the size and position of the patch on the picture. Geometrical center position is very important in motion detection tasks.|
|2) The center of maximum distance from borders defines the center of most wide part of a segment. Evidently, this center lies in the «segment skeleton» naturally defined as the following («fire in prairie» method).
Guess that our segment is a part of prairie covered with dry grass. Let's fire it starting from whole border simultaneously. Fire will expands to the center and two waves of fire will converge in some points. These points are belongs to a «skeleton». The center of maximum distance from borders will be a point fired very last.
|3) The mass center being collated with another centers shows relative mass distribution and is a valuable simple feature of the segment.|
|4) The mass distribution on directions and the maximal border distance distribution on directions indirectly describe the shape of segment and remaining computationally simple at the same time.|
| 5) Finally, simple grammatical description shows one more way to describe streamlined segment shape.
In this case we describe a segment by a chain of symbols that represent abrupt turns to the left (-) or right (+). Please note, that the chain is ringed, so it is related with segment orientation as well as showed above features.
It is also important that features do not have any real value by itself. We should take in account features correlation instead. So, for example, we should evaluate and use centers displacement vectors from predefined one (say, from geometrical center).
Basing on the features signed above (which set, certainly, is redundant, and consequently not noncontradictory), it is possible to define segments and likeness criterion. If we will consider as an n-dimensional vector, where n is related with number of features, then likeness may be expressed:
As a result of application of the described above advanced methods of segmentation of the image on brightness criterion with a dynamic threshold we managed to achieve significant improvement of quality of a final set of segments and to simplify the further processing of the image. Segment description based on the features set showed above let us create relatively simple and effective comparison and classification algorithms.
Table 1: some commonly-used color models 
|Acronym alternate||Intended Use
||Axis 1||Axis 2||Axis 3||Used in|
||Device-specific color specification||Intensity of red gun||Intensity of green||Intensity of blue gun||Macintosh Windows|
||Saturation||Value||Macintosh Color Picker
||Lightness||Saturation||Windows Custom Color dialog
||Measuring color distance in accordance with
|Lightness||Red / Green balance||green/blue balance
||North American broadcast television||Luminance
||Blue- Green / Orange balance||Yellow � Green/Magenta balance|
|Opponent||Model of human color vision channels||Achrornatic||Red / Green||yellow/blue|
|TekHVC||Perceptually uniform color selection on electronic displays
||Color selection for Tektronix terminals
References1. Sarah Douglas and Ted Kirkpatrick. Do Color Models Really Make a Difference? Department of Computer Science, University of Oregon, 1995
Computer Graphics & Geometry