Color Image Processing

RGB Images
4. An RGB color image is an M × N × 3 array of color pixels.
2. Each color pixel is a triplet corresponding to red, green and blue
components of an RGB image at a specific spatial location.
3. An RGB image can be viewed as a stack of three gray-scale images
that produce a color image on the screen when fed into the red, green, and
blue inputs of a color monitor.
4. The three images forming an RGB color image are referred to as the red,
green and blue component images.
5. The data class of the component images determines their range of values.
6. The number of bits used to represent the pixel values of the component
images determines the bit depth of an RGB image.
7. The number of possible colors in an RGB image is (2^b)^3, where b is the
bit depth.

8. The cat operator is used to stack the three component images, having syntax:
rgb_image = cat (3, fR, fG, fB)
9. The three component images can be extracted by using the syntax:
fR = rgb_image (: , : , 1)
fG = rgb_image (: , : , 2)
fB = rgb_image (: , : , 3)
10. The RGB color space is shown graphically as an RGB color cube.
11. The vertices of the cube are the primary and secondary colors of light.
12. Moreover, function rgbcube is used to view the color cube from any perspective as:
rgbcube(vx, vy, vz)
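The stacking, extraction, and color-count items above can be sketched outside MATLAB as well; the following NumPy snippet (with hypothetical 2×2 component images) mirrors cat(3, ...) and the (:, :, k) indexing:

```python
import numpy as np

# Hypothetical 2x2 component images standing in for fR, fG, fB.
fR = np.array([[255, 0], [0, 0]], dtype=np.uint8)
fG = np.array([[0, 255], [0, 0]], dtype=np.uint8)
fB = np.array([[0, 0], [255, 0]], dtype=np.uint8)

# MATLAB's cat(3, fR, fG, fB) corresponds to stacking along a third axis.
rgb_image = np.dstack((fR, fG, fB))   # shape (M, N, 3)

# Extracting the component images mirrors rgb_image(:, :, k).
assert np.array_equal(rgb_image[:, :, 0], fR)
assert np.array_equal(rgb_image[:, :, 2], fB)

# With b bits per component, an RGB image can represent (2^b)^3 colors.
b = 8
num_colors = (2 ** b) ** 3
print(rgb_image.shape, num_colors)   # (2, 2, 3) 16777216
```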

Indexed Images
1. An indexed image has two components: a data matrix of integers, X, and
a colormap matrix, map.
2. An indexed image uses direct mapping of pixel intensity values to
colormap values.
3. An indexed image can be displayed as:
imshow (X, map)
or, image (X)
colormap (map)
4. The function imapprox is used to approximate an indexed image by
fewer colors, having syntax:
[Y, newmap] = imapprox (X, map, n)
5. When the number of rows in map is less than the number of distinct integer
values in X, the same color is used to display multiple values of X.
6. Color map can be specified by using the statement:
map (k, :) = [r(k) g(k) b(k)]
7. The function whitebg is used to change the background color of the
figure, having syntax:
whitebg ('color_long_name')
or, whitebg ('color_short_name')
or, whitebg ([RGB_values])

Predefined Colormaps

Conversion between RGB, Indexed, and Gray-scale Intensity Images

NTSC Color Space
1. In this, image data consists of three components: luminance (Y), hue (I),
and saturation (Q).
2. The YIQ components are obtained from the RGB components by using
function rgb2ntsc, having syntax:
yiq_image = rgb2ntsc (rgb_image)
3. Similarly, the function ntsc2rgb is used to transform YIQ components to
RGB components, having syntax:
rgb_image = ntsc2rgb (yiq_image)

YCbCr Color Space
1. In this, image data consists of three components: luminance (Y),
difference between blue component and reference value (Cb), and
difference between red component and reference value (Cr).
2. The conversion from RGB to YCbCr is done by using function rgb2ycbcr,
having syntax:
ycbcr_image = rgb2ycbcr (rgb_image)
3. Similarly, the function ycbcr2rgb is used to transform YCbCr image to
RGB, having syntax:
rgb_image = ycbcr2rgb (ycbcr_image)

HSV Color Space
1. HSV stands for hue, saturation, and value, which correspond to tint, shade, and tone.
2. It is formulated by looking down the gray axis of the RGB color cube, which results
in a hexagonally shaped color palette.
3. Moving along the vertical (gray) axis changes the size of the hexagonal plane, yielding
the volume depicted in the figure.

4. Function rgb2hsv is used to convert an image from RGB to HSV, having syntax:
hsv_image = rgb2hsv (rgb_image)

5. Similarly, the function hsv2rgb is used to transform HSV image back to
RGB, having syntax:
rgb_image = hsv2rgb (hsv_image)

CMY and CMYK Color Spaces
1. The secondary colors of light are Cyan, Magenta, and Yellow.
2. These colors are also considered as the primary colors of pigments.
3. Cyan pigment can be obtained by subtracting red light from reflected
white light.
4. Similarly, pure magenta doesn’t reflect green and pure yellow doesn’t
reflect blue.
5. Black pigment can be obtained from the equal amounts of cyan,
magenta, and yellow.
6. Fourth color black can be added to produce true black, giving rise to
CMYK color space.
7. To convert from RGB to CMY and vice-versa, imcomplement function is
used, having syntax:
cmy_image = imcomplement (rgb_image)
and, rgb_image = imcomplement (cmy_image)
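A minimal sketch of the RGB-to-CMY complement, in NumPy rather than the toolbox's imcomplement; the single test pixel is hypothetical:

```python
import numpy as np

# imcomplement on a uint8 RGB image computes 255 - value per channel;
# for an image normalized to [0, 1] this is 1 - value.
rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)   # a single pure-red pixel

cmy = 255 - rgb          # complement of red is cyan: (0, 255, 255)
back = 255 - cmy         # complementing twice recovers the original

assert np.array_equal(cmy, np.array([[[0, 255, 255]]], dtype=np.uint8))
assert np.array_equal(back, rgb)
```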

HSI Color Space
1. HSI (hue, saturation, intensity) color space decouples the
intensity component from the color-carrying information in a
color image.
2. Figure shows the HSI model based on color triangles and circles.


Color Transformations
1. Linear interpolation can be implemented by using interp1q function,
having syntax:
z = interp1q (x, y, xi)
2. The function spline can be used to implement cubic spline interpolation,
having syntax:
z = spline (x, y, xi)

Color Image Smoothing
1. RGB color image smoothing with linear spatial filtering consists of the
following steps:
a. Extract three component images:
fR = fc ( : , : , 1); fG = fc ( : , : , 2); fB = fc ( : , : , 3);
b. Filter each component individually:
fR_filtered = imfilter (fR, w)
c. Reconstruct the filtered RGB image:
fc_filtered = cat (3, fR_filtered, fG_filtered, fB_filtered)
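The three steps above can be sketched in NumPy; box3 is a hypothetical 3×3 mean filter standing in for imfilter (f, w, 'replicate'), and the image values are random stand-ins:

```python
import numpy as np

def box3(channel):
    # 3x3 mean filter with replicate padding (mimics imfilter with 'replicate').
    p = np.pad(channel.astype(float), 1, mode='edge')
    out = np.zeros_like(channel, dtype=float)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            out += p[dy:dy + channel.shape[0], dx:dx + channel.shape[1]]
    return out / 9.0

fc = np.random.rand(4, 5, 3)                       # hypothetical RGB image
channels = [box3(fc[:, :, k]) for k in range(3)]   # b. filter each component
fc_filtered = np.dstack(channels)                  # c. reconstruct, like cat(3, ...)
assert fc_filtered.shape == fc.shape
```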

Color Image Sharpening
1. Laplacian filter mask can be used to sharpen the blurred image, fb,
having syntax:
lapmask = [1 1 1; 1 -8 1; 1 1 1];
2. Enhanced image can be computed and displayed using command:
fen = imsubtract (fb, imfilter (fb, lapmask, 'replicate'));
imshow (fen)
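A NumPy sketch of the same sharpening recipe; conv3 is a hypothetical helper standing in for imfilter with replicate padding, and the image is a random stand-in (no clipping to uint8 is done here):

```python
import numpy as np

lapmask = np.array([[1, 1, 1],
                    [1, -8, 1],
                    [1, 1, 1]], dtype=float)

def conv3(f, w):
    # 3x3 correlation with replicate padding; for this symmetric mask
    # correlation and convolution coincide.
    p = np.pad(f.astype(float), 1, mode='edge')
    out = np.zeros_like(f, dtype=float)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * p[i:i + f.shape[0], j:j + f.shape[1]]
    return out

fb = np.random.rand(5, 5)        # hypothetical blurred image
fen = fb - conv3(fb, lapmask)    # subtracting the Laplacian response sharpens edges

# On a perfectly flat image the Laplacian response is zero, so nothing changes.
flat = np.full((4, 4), 0.5)
assert np.allclose(flat - conv3(flat, lapmask), flat)
```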

Image Segmentation

Point Detection
1. The detection of isolated points embedded in the areas of constant or
nearly constant intensity in an image is straightforward in principle.
2. The point detection approach can be implemented by using command:
g = abs (imfilter (double (f), w)) >= T
3. The other approach is to find the points in all neighborhoods for which
the difference of the maximum and minimum pixel values exceeds a
specified value of T.
4. The function ordfilt2 can be used to implement this approach, having syntax:
g = imsubtract (ordfilt2 (f, m*n, ones (m, n)), . . .
ordfilt2 (f, 1, ones (m, n)));
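The max-minus-min idea can be sketched in NumPy; local_range is a hypothetical helper standing in for the ordfilt2 difference above:

```python
import numpy as np

def local_range(f, m=3, n=3):
    # Difference between the max and min in each m x n neighborhood,
    # mimicking ordfilt2(f, m*n, ones(m, n)) - ordfilt2(f, 1, ones(m, n)).
    h, w = f.shape
    p = np.pad(f.astype(float), ((m // 2,), (n // 2,)), mode='edge')
    mx = np.full((h, w), -np.inf)
    mn = np.full((h, w), np.inf)
    for i in range(m):
        for j in range(n):
            win = p[i:i + h, j:j + w]
            mx = np.maximum(mx, win)
            mn = np.minimum(mn, win)
    return mx - mn

f = np.zeros((5, 5))
f[2, 2] = 1.0                    # an isolated bright point
g = local_range(f) > 0.5         # threshold the max-min difference with T = 0.5
assert g[2, 2]                   # the isolated point is detected
assert not g[0, 0]               # the flat background is not
```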

Line Detection
1. Line detection is the next level of complexity after point detection.
2. If the mask in figure (a) were moved around an image, it would respond
more strongly to lines oriented horizontally.
3. Similarly, the second mask responds to lines oriented at +45 degrees
(fig. (b)), the third mask to vertical lines (fig. (c)), and the fourth mask to
lines at -45 degrees (fig. (d)).


+45 degree


– 45 degree

Edge Detection
1. Edge detection is the most common approach for detecting meaningful
discontinuities in intensity values.
2. Several edge estimators are provided by function edge, having syntax:
[g, t] = edge (f, 'method', parameters)
3. Edge detectors available in function edge are:
a. Sobel
b. Prewitt
c. Roberts
d. Laplacian of Gaussian
e. Zero crossings
f. Canny

Line Detection using Hough Transform
1. Hough transform is the approach of linking line segments in an image.
2. Function hough has either the default syntax:
[H, theta, rho] = hough (f)
or, [H, theta, rho] = hough (f, 'ThetaRes', val1, 'RhoRes', val2)
3. Specified number of peaks are found by using function houghpeaks,
having syntax:
peaks = houghpeaks (H, NumPeaks)
or, peaks = houghpeaks (. . . , 'Threshold', val1, 'NHoodSize', val2)
4. To determine meaningful line segments associated with the peaks, and to
obtain the start and end points of the lines, function houghlines is used, having syntax:
lines = houghlines (f, theta, rho, peaks)
or, lines = houghlines (. . . , 'FillGap', val1, 'MinLength', val2)

Image Thresholding
1. Image thresholding occupies a central position in applications of image
segmentation because of its intuitive properties and simplicity of
implementation.
2. Basic global thresholding
a. Visual inspection of the image histogram is one of the way of
choosing a threshold
b. The preferred approach is to use an algorithm capable of choosing
the threshold automatically based on image data.
c. One such approach is as following:
i. Select an initial estimate for the global threshold, T.
ii. Segment the image using T.
iii. Compute the average intensity values, m1 and m2, for the two groups
of pixels produced by the segmentation.
iv. Compute a new threshold value: T = (m1 + m2)/2.
v. Repeat steps ii through iv until the difference between successive
values of T is smaller than a predefined parameter.
vi. Use function im2bw to segment the image as:
g = im2bw (f, T/den)
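The iterative algorithm above can be sketched in NumPy (basic_global_threshold is a hypothetical helper; the two-population test signal is made up):

```python
import numpy as np

def basic_global_threshold(f, dT=0.5):
    # Iterative global threshold selection, following the steps above.
    T = f.mean()                          # i.  initial estimate
    while True:
        g1, g2 = f[f > T], f[f <= T]      # ii. segment with current T
        m1 = g1.mean() if g1.size else 0.0   # iii. group means
        m2 = g2.mean() if g2.size else 0.0
        T_new = 0.5 * (m1 + m2)           # iv. new threshold
        if abs(T_new - T) < dT:           # v.  stop when T stabilizes
            return T_new
        T = T_new

# Two well-separated intensity populations.
f = np.concatenate([np.full(50, 20.0), np.full(50, 200.0)])
T = basic_global_threshold(f)
assert 20 < T < 200
g = f > T                                 # vi. im2bw-style segmentation
assert g.sum() == 50
```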
3. Optimum global thresholding using Otsu's method
a. Function graythresh is used to compute Otsu's threshold, having syntax:
[T, SM] = graythresh (f)
4. Using image smoothing to improve global thresholding
a. Noise can turn a simple thresholding problem into an unsolvable one.
b. When noise cannot be reduced at the source, and thresholding is the
segmentation method of choice, image smoothing is done prior to
thresholding to enhance the performance.
5. Using edges to improve global thresholding
a. Following algorithm is used to describe this process:
i. Compute an edge image from the input image.
ii. Specify a threshold value, T.
iii. Threshold the edge image to produce a binary image which is
used as marker image.
iv. Compute the histogram using only the pixels in input image that
correspond to the locations of 1-valued pixels in binary image.
v. Use the histogram to segment the input image globally using Otsu's
method.
b. The percentile2i function is used to compute an intensity value
corresponding to a specified percentile, P, having syntax:
I = percentile2i (h, P)
6. Variable thresholding based on local statistics
a. Local thresholding is illustrated using the standard deviation and
mean of the pixels in a neighborhood of every point in an image.
b. Function stdfilt is used to compute the local standard deviation,
having syntax:
g = stdfilt (f, nhood)
7. Image thresholding using moving averages
a. A special case of the local thresholding method is based on
computing a moving average along scan lines of an image.
b. Function movingthresh is used to compute this, having syntax:
g = movingthresh (f, n, K)
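A simplified NumPy sketch of moving-average thresholding; unlike the toolbox's movingthresh, alternate rows are not scanned in reverse here, and the helper name is hypothetical:

```python
import numpy as np

def moving_average_threshold(f, n=5, K=0.5):
    # Each pixel is compared with K times the running mean of the
    # previous n pixels along its scan line.
    g = np.zeros_like(f, dtype=bool)
    for r in range(f.shape[0]):
        row = f[r].astype(float)
        for c in range(len(row)):
            window = row[max(0, c - n + 1):c + 1]
            g[r, c] = row[c] > K * window.mean()
    return g

f = np.array([[10, 10, 10, 200, 10, 10]], dtype=float)
g = moving_average_threshold(f, n=3, K=1.5)
assert g[0, 3]            # the bright pixel exceeds 1.5x the local mean
assert not g[0, 0]        # a flat run does not
```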

Region-based segmentation
1. Region growing
a. It is a procedure that groups pixels or subregions into larger region
based on predefined criteria for growth.
b. The basic approach is to start with a set of seed points and from
these grow regions by appending to each seed those neighboring
pixels that have predefined properties similar to the seed.
c. Function regiongrow is used to do basic region growing, having syntax:
[g, NR, SI, TI] = regiongrow (f, S, T)
2. Region splitting and merging
a. In this approach the image is subdivided initially into a set of arbitrary,
disjoint regions, which are then merged and/or split in an attempt to
satisfy predefined conditions.
b. Function splitmerge is used for the same, having syntax:
g = splitmerge (f, mindim, @predicate)

Segmentation using Watershed Transform
1. Watershed segmentation using distance transform
a. The distance transform is a useful tool to use with the watershed
transform for segmentation.
b. Function bwdist is used to compute the distance transform, having syntax:
D = bwdist (f)
c. Function watershed is then applied to the negative of the distance
transform, having syntax:
L = watershed (A, conn)
2. Watershed segmentation using gradient
a. The gradient magnitude is used often to preprocess a gray-scale
image prior to using the watershed transform for segmentation.
b. The gradient magnitude image has high pixel values along object
edges, and low pixel values everywhere else.
c. Then the watershed transform would result in watershed ridge lines
along object edges.

Spatial Filtering

Linear Spatial Filtering
1. Linear spatial filtering is implemented by using the imfilter function, having syntax:
g = imfilter (f, w, filtering_mode, boundary_options, size_options)
2. The most common syntax for imfilter is:
g = imfilter (f, w, 'replicate')
3. To perform convolution when working with filters that are neither
pre-rotated nor symmetric, two options can be used.
a. Use the following syntax for imfilter:
g = imfilter (f, w, 'conv', 'replicate')
b. Use the function rot90 (w, 2) to rotate w by 180 degrees and then use
g = imfilter (f, w, 'replicate')
4. Options for function imfilter:

Nonlinear Spatial Filtering
1. Two functions are used to perform nonlinear filtering: nlfilter and colfilt.
2. Function colfilt organizes the data in the form of columns and is faster
than nlfilter.
3. The colfilt function has syntax as:
g = colfilt (f, [m n], 'sliding', @fun)
4. The function padarray is used for padding the input image explicitly:
fp = padarray (f, [r c], method, direction)

5. Options for function padarray:

Linear Spatial Filters
1. The function fspecial is used to generate a filter mask, having syntax:
w = fspecial ('type', parameters)
2. Spatial filters supported by function fspecial:

Nonlinear Spatial Filters
1. The ordfilt2 function, which generates order-statistic filters, is used for
generating nonlinear spatial filters, having syntax:
g = ordfilt2 (f, order, domain)
2. The function medfilt2 is used for median filtering, having syntax:
g = medfilt2 (f, [m n], padopt)


What is Digital Image Processing?
1. An image may be defined as a two-dimensional function f(x,y), where x
and y are spatial coordinates.
2. Amplitude of f at any pair of coordinates (x,y) is called intensity or gray
level of the image at that point.
3. When x, y, and amplitude values of f are all finite, discrete quantities,
image is said to be a digital image.
4. Hence, digital image processing refers to processing digital images by
means of a digital computer.
5. A digital image is composed of a finite number of elements, each of
which has a particular location and a value.
6. These elements are referred to as picture elements, image elements,
pels, and pixels.
7. Pixel is the most widely used term to denote the elements of an image.
8. Moreover, digital image processing encompasses processes whose
inputs and outputs are images and, in addition, encompasses processes
that extract attributes from images, up to and including the recognition of
individual objects.

Background on MATLAB and the Image Processing Toolbox
1. MATLAB stands for MATrix LABoratory.
2. It was written originally to provide easy access to matrix software
developed by LINPACK and EISPACK projects.
3. MATLAB engines incorporate the LAPACK and BLAS libraries,
constituting the state of the art in software for matrix computation.
4. MATLAB is a high level language for technical computing.
5. Its typical uses include:
a. Math & computation
b. Algorithm development
c. Data acquisition
d. Modeling & simulation
e. Data analysis, exploration, and visualization
f. Scientific and engineering graphics
g. Application development, including GUI building.
6. MATLAB is complemented by a family of application-specific solutions
called toolboxes.
7. The Image Processing toolbox is a collection of MATLAB functions that
extend the capability of MATLAB environment for the solution of image
processing problems.
8. There are various other toolboxes, such as the Signal Processing, Fuzzy Logic,
and Wavelet toolboxes, which are sometimes used to complement the IPT.

Areas of Image Processing covered in tutorials
1. The areas covered in this tutorial are:
a. DIP fundamentals
b. Intensity transformations and spatial filtering
c. Processing in the frequency domain
d. Image restoration
e. Color Image Processing
f. Wavelets
g. Image compression
h. Morphological image processing
i. Representation & Description
j. Object Recognition

MATLAB desktop
MATLAB environment consists of following main parts

1. Command Window
2. Command History
3. Workspace
4. Current Folder
5. Editor Window

Command Window
1. Main window in MATLAB.
2. Used to enter variables.
3. Used to run functions and
M-file scripts.
4. All commands are typed after
command prompt “>>”.

Command History
1. Statements entered in
command window are logged
in Command History.
2. View and search previously
run statements.
3. Copy and execute selected
statements.

Workspace
1. Lists all variables in use for
as long as MATLAB has been
open.
2. Type “who” in the command
window to list all the
variables in use.
3. Type “whos” to list all the
variables along with their
current values, dimensions, etc.

4. The “clear” command is used to
clear all the variables from
the workspace.
5. Save all the variables and data
to a .mat file to use them
later.

Current Folder
1. Lists all M-files, etc. available
in current directory.
2. Set working folder as current
directory or as a part of the
search path so that MATLAB
will find the files easily.


Editor Window
1. Used to create and edit scripts.
2. Click the “New Script” button
in the Toolbar.

“Help” System
1. Type “help” in the command
window to go through the
various help topics.
2. Type “help elfun” (Elementary
Math Functions), and MATLAB
will list all the functions
belonging to that specific
category.
3. The “help” command is used to get specific help about a particular function.

4. To open the help window on the specific topic of interest, use “doc” command.

5. The help keyword is used to get
help for a specific function,
but the lookfor command is used
to search all functions, etc.
for a specific keyword.
6. Example: >>lookfor plot

Intensity Transformations

1. The term spatial domain refers to the image plane itself.
2. Spatial domain techniques operate directly on the pixels of image.
3. The spatial domain processes are denoted as:
g (x,y) = T [f (x,y)]
where, f (x,y) is the input image, g (x,y) is the processed image, and T is an
operator on f.
Intensity Transformation Functions
1. imadjust
a. Function imadjust is the basic IPT tool for intensity transformations of
grayscale images.
b. Syntax:
g = imadjust (f, [low_in high_in], [low_out high_out], gamma)
c. The negative of an image can also be obtained by using
imcomplement function.
g = imcomplement (f)
2. Logarithmic and contrast stretching transformations
a. These transformations are basic tools for dynamic range
manipulation.
g = c * log (1 + double (f))
c. One of the principal uses of the log transformation is to compress
dynamic range.
d. When performing log transformation, the compressed values would
be brought back to the full range of the display.
e. This statement is used to do this for 8 bits:
gs = im2uint8 (mat2gray (g))
f. Use of mat2gray brings the values to the range [0,1], and im2uint8
brings them to the range [0,255].
g. The function shown in fig. (a) is called a contrast-stretching
transformation function.
h. The function shown in fig. (b) is called a thresholding function.
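The log-transform-plus-rescaling recipe above (steps b through f) can be sketched in NumPy; the input values are hypothetical:

```python
import numpy as np

f = np.array([[0, 10, 1000, 100000]], dtype=float)   # wide dynamic range

g = np.log(1 + f)                        # c * log(1 + f) with c = 1

# mat2gray-style scaling to [0, 1], then im2uint8-style scaling to [0, 255].
g01 = (g - g.min()) / (g.max() - g.min())
gs = np.uint8(np.round(255 * g01))

assert gs.min() == 0 and gs.max() == 255
```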

3. Handling a variable number of input and/or outputs
a. Function nargin is used to check the number of arguments input into
the M-function.
b. Function nargout is used in connection with the outputs of an
M-function.
body of an M-function, nargchk function is used.
4. Function to compute negative, gamma, log and contrast stretching
a. Function changeclass is used for this purpose, having syntax:
g = changeclass (newclass, f)
where, image f is converted to the class specified in parameter newclass,
and output it as g.
5. Function used for image scaling
a. Function gscale is used for this purpose, having syntax:
g = gscale (f, method, low, high)
where, image f is the image to be scaled.

Histogram processing and function plotting
1. Generating and plotting image histograms
a. The histogram of a digital image can be displayed by using imhist
command, having syntax:
h = imhist (f,b)
b. Histograms are also plotted using bar graphs, having syntax:
bar (horz, z, width)
c. Stem graph can also be used to plot histograms, having syntax:
stem (horz, z, 'color_linestyle_marker', 'fill')
d. Histograms are also plotted using plot command, having syntax:
plot (horz, z, 'color_linestyle_marker')
2. Histogram Equalization: To implement histogram equalization in the
toolbox, histeq function is used, having syntax:
h = histeq (f, nlev)
3. Histogram Matching: It is implemented in toolbox by using the syntax:
g = histeq (f, hspec)
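Histogram equalization via the normalized cumulative histogram can be sketched in NumPy; hist_equalize is a hypothetical helper approximating what histeq does (its exact quantization differs):

```python
import numpy as np

def hist_equalize(f, levels=256):
    # Map each intensity through the normalized cumulative histogram.
    hist = np.bincount(f.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / f.size
    lut = np.round((levels - 1) * cdf).astype(np.uint8)
    return lut[f]

# A dark image whose values occupy only [0, 63] ...
f = (np.arange(1024) % 64).astype(np.uint8).reshape(32, 32)
g = hist_equalize(f)
# ... gets stretched toward the full [0, 255] range.
assert g.max() == 255
```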

Morphological Image Processing

Dilation
1. Dilation grows and thickens objects in a binary image.
2. The structuring element controls the specific manner and extent of this
thickening.
3. It is performed by function imdilate, having syntax:
A2 = imdilate (A, B)
4. The function strel constructs structuring elements with a variety of
shapes and sizes, having syntax:
se = strel (shape, parameters)

Erosion
1. Erosion shrinks and thins objects in a binary image.
2. It is performed by function imerode, having syntax:
se = strel (shape, parameters)
A2 = imerode (A, se)
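Binary dilation and erosion can be sketched in NumPy; dilate and erode are hypothetical helpers for flat structuring elements, with erosion obtained from dilation by duality:

```python
import numpy as np

def dilate(A, se):
    # Binary dilation: OR together shifted copies selected by the
    # structuring element (an imdilate sketch for flat elements).
    h, w = A.shape
    m, n = se.shape
    p = np.pad(A, ((m // 2,), (n // 2,)), mode='constant')
    out = np.zeros_like(A, dtype=bool)
    for i in range(m):
        for j in range(n):
            if se[i, j]:
                out |= p[i:i + h, j:j + w].astype(bool)
    return out

def erode(A, se):
    # Erosion by duality: a pixel survives only if se fits entirely inside A.
    return ~dilate(~A, se[::-1, ::-1])

se = np.ones((3, 3), dtype=bool)
A = np.zeros((7, 7), dtype=bool)
A[2:5, 2:5] = True                      # a 3x3 square object

assert dilate(A, se).sum() == 25        # dilation grows it to 5x5
assert erode(A, se).sum() == 1          # erosion shrinks it to its center
```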

Combining Dilation and Erosion
1. Opening and closing
a. Functions imopen and imclose are used to implement opening and
closing in the toolbox, having syntax:
C = imopen (A, B)
and C = imclose (A, B)
2. Hit-or-Miss transformation
a. Function bwhitmiss is used to implement hit-or-miss transformation,
having syntax:
C = bwhitmiss (A, B1, B2)
3. Using Lookup Tables
a. A lookup table (LUT) is a faster way to compute the hit-or-miss
transformation when the structuring elements are small.
b. A lookup table is constructed by using function makelut, based on a
user-supplied function.
c. Function applylut processes binary images using this lookup table.
d. The line of code
persistent lut
establishes a variable called lut and declares it to be persistent.

Function bwmorph
1. Function bwmorph implements various useful operations by combining
dilations, erosions, and lookup table operations, having syntax:
g = bwmorph (f, operation, n)
2. The two main operations are thinning and skeletonization.
a. Thinning means reducing binary objects or shapes in an image to
strokes that are a single pixel wide.
b. Skeletonization reduces binary image objects to a set of thin strokes.
c. Thinning and skeletonization produce short extraneous spurs called
parasitic components.
d. The process of cleaning up these spurs is called pruning.

Labeling Connected Components
1. The term connected component is defined in terms of paths, which in
turn depend on adjacency.
2. All the connected components in a binary image are computed by using
bwlabel function, having syntax:
[L, num] = bwlabel (f, conn)

Morphological Reconstruction
1. Reconstruction involves two images and a structuring element.
2. Function imreconstruct is used for morphological reconstruction, having syntax:
out = imreconstruct (marker, mask)
3. Opening by reconstruction
a. It restores exactly the shapes of the objects that remain after erosion.
b. The first step is to erode the image with a structuring element se, using
the imerode function, having syntax:
fe = imerode (f, se)
c. Finally, reconstruction is obtained by:
fobr = imreconstruct (fe, f)
4. Filling Holes
a. Function imfill performs the hole-filling operation, having syntax:
g = imfill (f, 'holes')
5. Clearing Border Objects
a. To perform this operation, function imclearborder is used.
g = imclearborder (f, conn)


Digital Image Representation
1. An image is a two-dimensional function, f(x,y), where x and y are spatial
coordinates.
level of the image at that point.
3. The intensity of monochrome images is referred to as gray level.
4. Individual 2-D images are combined to form color images.
5. For example, in the RGB color system, three individual component
images combine to form a color image.
6. Due to this, many of the techniques developed for monochrome images
can be extended to color images by processing the three component
images individually.
7. An image may be continuous with respect to the x- and y- coordinates,
and also in amplitude.
8. So converting such an image to digital form requires that the
coordinates, as well as the amplitude, be digitized.
9. Digitizing the coordinate values is called sampling; digitizing the
amplitude values is called quantization.
10. Thus, when x, y, and f are all finite, discrete quantities, the image is said
to be a digital image.

Reading Images
1. Images can be read by using imread function.
2. Syntax to read the image from current directory is:
imread ('filename')
3. If the file that has to be read is stored in a specified directory then
include a full or relative path to that directory in filename.
4. The row and column dimensions of an image are given by using the size
function.
5. Additional information about an array is displayed by using the whos
function.

Displaying Images
1. imshow function is used to display the images on the MATLAB desktop.
2. Syntax:
imshow (f)
3. Function pixval is used to display the intensity values of the individual
pixels interactively.

Writing Images
1. imwrite function is used to write the image to the disk.
2. Syntax:
imwrite (f, 'filename')
3. A more general syntax for a JPEG image is:
imwrite (f, 'filename.jpg', 'quality', q)
4. imfinfo function is used to get an idea of image compression achieved
and to obtain other details of an image.
5. Syntax:
imfinfo filename
6. A more general syntax for a tif image is:
imwrite (g, 'filename.tif', 'compression', 'parameter', …
‘resolution’, [colres rowres])
7. The contents of figure window can be exported to the disk in two ways.
a. Use the File pull down menu in the figure window and then
choose Export and then select a location, filename, and format.
b. Use the print command, as:
print -fno -dfileformat -rresno filename

Image Formats Supported by imread & imwrite

Data Classes

Image Types
1. Intensity images
a. It is a data matrix whose values have been scaled to represent
intensities.
2. Binary images
a. It is a logical array of 0s and 1s.
b. If A is a numeric array, logical array B can be created as
B = logical (A)
c. islogical function is used to test if an array is logical:
islogical (C)
3. Indexed images
a. It has two components: a data matrix of integers, X, and a
colormap matrix, map.
b. It is displayed as:
imshow (X, map)
c. Indexed image can also be displayed as:
image (X)
colormap (map)
4. RGB images
a. An RGB color image is an M × N × 3 array of color pixels.
b. Each pixel is a triplet corresponding to the red, green, and blue
components of an RGB image at a specific spatial location.

Converting between Data Classes
1. The general syntax is:
B = data_class_name (A)
where, data_class_name is one of the names of the data types.

Converting between Image Classes and Types

Array Indexing
1. Vector Indexing
a. An array of dimension 1 × N is called a row vector.
b. The elements of such a vector are accessed using
one-dimensional indexing.
c. The elements of a vector are enclosed by square brackets and
are separated by spaces or by commas.
d. A row vector is converted to column vector by using transpose
operator (.’).
e. The colon notation is used to access blocks of elements.
f. Function linspace generates a row vector, x, of n elements
linearly spaced between and including a and b, whose
syntax is
x = linspace (a,b,n)
g. A vector can be used as index into another vector.
2. Matrix Indexing
a. Matrices can be represented as a sequence of row vectors
enclosed by square brackets and separated by semicolons.
b. Elements can be selected in matrices just as in vectors, but two
indices can be needed: one to establish row location and the
other for the corresponding column.
c. The colon notation is used in matrix indexing to select a
two-dimensional block of elements out of a matrix.
3. Selecting array dimensions
a. Operations of the form
operation (A,dim)
apply the given operation along dimension dim, where operation denotes
an applicable operation, A is an array, and dim is a scalar.
b. Function ndims gives the number of dimensions of an array,
d = ndims (A)

Some Important Standard Arrays
1. zeros(M,N) generates an M × N matrix of 0s of class double.
2. ones(M,N) generates an M × N matrix of 1s of class double.
3. true(M,N) generates an M × N logical matrix of 1s.
4. false(M,N) generates an M × N logical matrix of 0s.
5. magic(M) generates an M × M “magic square”.
6. rand(M,N) generates an M × N matrix whose entries are uniformly
distributed random numbers in the interval [0,1].
7. randn(M,N) generates an M × N matrix whose entries are normally
distributed random numbers with mean 0 and variance 1.
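For readers working outside MATLAB, the standard arrays above have close NumPy analogues (note that NumPy's rand samples [0, 1), slightly different from MATLAB's interval):

```python
import numpy as np

M, N = 3, 4
z = np.zeros((M, N))          # zeros(M,N): M x N matrix of 0s (double)
o = np.ones((M, N))           # ones(M,N):  M x N matrix of 1s (double)
t = np.ones((M, N), bool)     # true(M,N):  logical 1s
fl = np.zeros((M, N), bool)   # false(M,N): logical 0s
u = np.random.rand(M, N)      # rand(M,N):  uniform random numbers
g = np.random.randn(M, N)     # randn(M,N): standard normal random numbers

assert z.shape == (M, N) and z.sum() == 0
assert t.dtype == bool and t.all()
assert (0 <= u).all() and (u < 1).all()
```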