Image Segmentation

Point Detection
1. The detection of isolated points embedded in the areas of constant or
nearly constant intensity in an image is straightforward in principle.
2. Point detection can be implemented with the command below, where w is
the point detection mask and T is a nonnegative threshold:
g = abs (imfilter (double (f), w)) >= T
3. Another approach is to find the points in all neighborhoods of size
m x n for which the difference between the maximum and minimum pixel
values exceeds a specified value T.
4. The function ordfilt2 can be used to implement this approach, having
syntax:
g = imsubtract (ordfilt2 (f, m*n, ones (m, n)), ...
ordfilt2 (f, 1, ones (m, n)));
g = g >= T;
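
The two point detection approaches can be tried on a small synthetic image. This is a minimal sketch, not a definitive implementation: the 3 x 3 mask w, the 'replicate' boundary option, and both threshold values are assumptions chosen for illustration.

```matlab
% Synthetic test image: constant background with one isolated bright pixel.
f = 10 * ones(9, 9, 'uint8');
f(5, 5) = 200;

% Approach 1: Laplacian-type point mask followed by a threshold.
w = [-1 -1 -1; -1 8 -1; -1 -1 -1];             % assumed point detection mask
r = abs(imfilter(double(f), w, 'replicate'));  % 'replicate' avoids border responses
T = max(r(:));                                 % assumption: keep only the strongest response
g1 = r >= T;

% Approach 2: maximum minus minimum over every m-by-n neighborhood.
m = 3; n = 3;
d = imsubtract(ordfilt2(f, m*n, ones(m, n)), ordfilt2(f, 1, ones(m, n)));
g2 = d >= 100;                                 % assumption: 100 is an illustrative threshold
```

Note that the second approach marks every neighborhood that contains the point, so g2 is a 3 x 3 block around the pixel found in g1.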

Line Detection
1. Line detection is the next level of complexity.
2. If the mask in fig. (a) were moved around an image, it would
respond more strongly to lines oriented horizontally.
3. Similarly, the second mask responds to lines oriented at +45 degrees
(fig. (b)), the third mask to vertical lines (fig. (c)), and the fourth
mask to lines at -45 degrees (fig. (d)).


[Figure: line detection masks (a) horizontal, (b) +45 degree, (c) vertical, (d) -45 degree]
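
The behaviour of the masks can be checked on a synthetic image. The figure is not reproduced here, so the horizontal and vertical masks below are the commonly used 3 x 3 line masks and are an assumption about what the figure shows.

```matlab
% Synthetic image: a single horizontal line on a dark background.
f = zeros(9, 9);
f(5, :) = 1;

% Assumed 3x3 line masks for horizontal and vertical lines.
wh = [-1 -1 -1;  2  2  2; -1 -1 -1];
wv = [-1  2 -1; -1  2 -1; -1  2 -1];

gh = abs(imfilter(f, wh));   % response of the horizontal mask
gv = abs(imfilter(f, wv));   % response of the vertical mask
```

At the center of the line the horizontal mask gives the strongest response, while the vertical mask gives none.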

Edge Detection
1. Edge detection is the most common approach for detecting meaningful
discontinuities in intensity values.
2. Several edge detectors are provided by function edge, having syntax:
[g, t] = edge (f, 'method', parameters)
where g is a logical edge image and t is the threshold that was used.
3. Edge detectors available in function edge are:
a. Sobel
b. Prewitt
c. Roberts
d. Laplacian of Gaussian
e. Zero crossings
f. Canny
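
A short sketch of calling edge with two of the listed detectors. The synthetic step image is an assumption made so that the example needs no image file.

```matlab
% Synthetic image with a vertical step edge between columns 16 and 17.
f = zeros(32, 32);
f(:, 17:end) = 1;

[gs, ts] = edge(f, 'sobel');   % ts is the threshold chosen automatically
gc = edge(f, 'canny');         % same image, Canny detector
```

Both outputs are logical images of the same size as f.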

Line Detection using Hough Transform
1. The Hough transform is an approach for finding and linking line
segments in an image.
2. Function hough has either the default syntax:
[H, theta, rho] = hough (f)
or, [H, theta, rho] = hough (f, 'ThetaRes', val1, 'RhoRes', val2)
3. A specified number of peaks is found by using function houghpeaks,
having syntax:
peaks = houghpeaks (H, NumPeaks)
or, peaks = houghpeaks (..., 'Threshold', val1, 'NHoodSize', val2)
4. To determine meaningful line segments associated with the peaks and to
get the start and end points of the lines, function houghlines is used,
having syntax:
lines = houghlines (f, theta, rho, peaks)
or, lines = houghlines (..., 'FillGap', val1, 'MinLength', val2)
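
The three functions are normally used one after another. The sketch below assumes a synthetic binary image with one horizontal segment; the 'FillGap' and 'MinLength' values are illustrative only.

```matlab
% Binary image containing one horizontal line segment in row 51.
f = false(101, 101);
f(51, 11:91) = true;

[H, theta, rho] = hough(f);            % accumulator and its axes
peaks = houghpeaks(H, 1);              % strongest peak only
lines = houghlines(f, theta, rho, peaks, 'FillGap', 5, 'MinLength', 20);

% Each element of lines has fields point1, point2 (as [x y]), theta and rho.
p1 = lines(1).point1;
p2 = lines(1).point2;
```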

Thresholding
1. Image thresholding has a central position in applications of image
segmentation because of its intuitive properties and simplicity of
implementation.
2. Basic global thresholding
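a. Visual inspection of the image histogram is one way of
choosing a threshold.
b. The preferred approach is to use an algorithm capable of choosing
the threshold automatically based on image data.
c. One such approach is as follows:
i. Select an initial estimate for the global threshold, T.
ii. Segment the image using T, giving two groups of pixels: those with
intensity above T and those at or below T.
iii. Compute the average intensity values m1 and m2 of the pixels in the
two groups.
iv. Compute a new threshold value from the average intensity values:
T = (m1 + m2)/2.
v. Repeat steps ii through iv until the change in T between successive
iterations is smaller than a predefined value.
vi. Use function im2bw to segment the image as:
g = im2bw (f, T/den)
where den is an integer (e.g., 255 for an 8-bit image) that scales T to
the range [0, 1] required by im2bw.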
3. Optimum global thresholding using Otsu’s method
a. Function graythresh is used to compute Otsu's threshold, having
syntax:
[T, SM] = graythresh (f)
where T is the threshold normalized to the range [0, 1] and SM is the
separability measure.
4. Using image smoothing to improve global thresholding
a. Noise can turn a simple thresholding problem into an unsolvable one.
b. When noise cannot be reduced at the source, and thresholding is the
segmentation method of choice, image smoothing is done prior to
thresholding to enhance the performance.
5. Using edges to improve global thresholding
a. The following algorithm describes this process:
i. Compute an edge image from the input image.
ii. Specify a threshold value, T.
iii. Threshold the edge image to produce a binary image which is
used as a marker image.
iv. Compute the histogram using only the pixels in the input image that
correspond to the locations of 1-valued pixels in the binary image.
v. Use the histogram to segment the input image globally using Otsu's
method.
b. Function percentile2i is used to compute an intensity value
corresponding to a specified percentile, P, having syntax:
I = percentile2i (h, P)
6. Variable thresholding based on local statistics
a. Local thresholding is illustrated using the standard deviation and
mean of the pixels in a neighborhood of every point in an image.
b. Function stdfilt is used to compute the local standard deviation,
having syntax:
g = stdfilt (f, nhood)
7. Image thresholding using moving averages
a. A special case of the local thresholding method is based on
computing a moving average along scan lines of an image.
b. Function movingthresh is used to compute this, having syntax:
g = movingthresh (f, n, K)
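
The iterative procedure for basic global thresholding can be sketched as follows and compared with Otsu's method. The synthetic bimodal image and the stopping tolerance of 0.5 are assumptions made for illustration.

```matlab
% Synthetic bimodal image: dark left half, bright right half.
f = uint8([50 * ones(8, 4), 200 * ones(8, 4)]);

% Iterative global threshold.
T = mean(double(f(:)));          % initial estimate
dT = Inf;
while dT > 0.5                   % assumption: stop when T changes by less than 0.5
    g = f > T;                   % segment with the current T
    m1 = mean(double(f(g)));     % mean of the pixels above T
    m2 = mean(double(f(~g)));    % mean of the pixels at or below T
    Tnew = (m1 + m2) / 2;        % new threshold
    dT = abs(Tnew - T);
    T = Tnew;
end
gIter = im2bw(f, T / 255);       % den = 255 for uint8 data

% Otsu's method for comparison; graythresh returns a value in [0, 1].
gOtsu = im2bw(f, graythresh(f));
```

For this image both methods separate the two halves in the same way.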

Region-based segmentation
1. Region growing
a. It is a procedure that groups pixels or subregions into larger regions
based on predefined criteria for growth.
b. The basic approach is to start with a set of seed points and from
these grow regions by appending to each seed those neighboring
pixels that have predefined properties similar to the seed.
c. Function regiongrow is used to do basic region growing, having syntax:
[g, NR, SI, TI] = regiongrow (f, S, T)
2. Region splitting and merging
a. In this approach the image is subdivided initially into a set of
arbitrary, disjoint regions, which are then merged and/or split.
b. Function splitmerge is used for the same, having syntax:
g = splitmerge (f, mindim, @predicate)
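
The idea of region growing can be illustrated with toolbox functions only. This is a sketch under stated assumptions, not the regiongrow function named above (which appears to be a custom M-function): the seed position, the tolerance T, and the use of imreconstruct to collect the connected pixels are choices made for this example.

```matlab
% Small test image: a dark region (values near 10) next to bright pixels.
f = [10 10 10 90;
     10 12 11 90;
     90 90 13 90;
     90 90 90 90];

seed = false(size(f));
seed(1, 1) = true;                 % single seed point at (1,1)
T = 5;                             % assumption: maximum |difference| from the seed value

TI = abs(f - f(1, 1)) <= T;        % pixels that satisfy the similarity predicate
g = imreconstruct(seed, TI);       % keep only those 8-connected to the seed
```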

Segmentation using Watershed Transform
1. Watershed segmentation using distance transform
a. The distance transform is a useful tool with the watershed transform
for segmentation.
b. Function bwdist is used to compute the distance transform, having
syntax:
D = bwdist (f)
c. Function watershed is then applied to the negative of the distance
transform, having syntax:
L = watershed (A, conn)
2. Watershed segmentation using gradient
a. The gradient magnitude is used often to preprocess a gray-scale
image prior to using the watershed transform for segmentation.
b. The gradient magnitude image has high pixel values along object
edges, and low pixel values everywhere else.
c. Then the watershed transform would result in watershed ridge lines
along object edges.
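
A minimal sketch of watershed segmentation using the distance transform, assuming a synthetic image of two overlapping disks. The disk positions and radii are illustrative, and the distance is taken inside the objects by complementing the image first.

```matlab
% Two overlapping disks that form one connected object.
[x, y] = meshgrid(1:60, 1:40);
bw = ((x - 22).^2 + (y - 20).^2 <= 12^2) | ((x - 38).^2 + (y - 20).^2 <= 12^2);

D = bwdist(~bw);          % distance from each object pixel to the background
L = watershed(-D);        % catchment basins of the negated distance transform
bw2 = bw & (L ~= 0);      % cut the object along the watershed ridge lines (L == 0)

[~, nBefore] = bwlabel(bw);
[~, nAfter]  = bwlabel(bw2);
```

Before the cut the disks form a single component; afterwards they are separated.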

Spatial Filtering

Linear Spatial Filtering
1. Linear spatial filtering is implemented by using the imfilter function,
having syntax:
g = imfilter (f, w, filtering_mode, boundary_options, size_options)
2. The most common syntax for imfilter is:
g = imfilter (f, w, 'replicate')
3. To perform convolution when working with filters that are neither
pre-rotated nor symmetric, two options can be used.
a. Use the following syntax for imfilter:
g = imfilter (f, w, 'conv', 'replicate')
b. Use the function rot90 (w, 2) to rotate w by 180 degrees and then use
correlation (the default):
w = rot90 (w, 2);
g = imfilter (f, w, 'replicate')
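4. Options for function imfilter:
a. Filtering mode: 'corr' (correlation, the default) or 'conv'
(convolution).
b. Boundary options: P (pad with the constant value P; the default is 0),
'replicate', 'symmetric', or 'circular'.
c. Size options: 'same' (the default) or 'full'.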
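
The two ways of performing convolution can be checked against each other. The test image and the deliberately asymmetric mask below are assumptions chosen so that correlation and convolution give different results.

```matlab
f = magic(5);                       % small test "image" of class double
w = [1 2 3; 4 5 6; 7 8 9];          % asymmetric mask

g1 = imfilter(f, w, 'conv', 'replicate');        % option a: convolution mode
g2 = imfilter(f, rot90(w, 2), 'replicate');      % option b: correlate with the rotated mask
g3 = imfilter(f, w, 'replicate');                % plain correlation, for contrast
```

g1 and g2 are identical, while g3 differs because w is not symmetric.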

Nonlinear Spatial Filtering
1. Two functions are used to perform nonlinear filtering: nlfilter and colfilt.
2. Function colfilt organizes the data in the form of columns and is faster
than nlfilter.
3. The colfilt function has syntax as:
g = colfilt (f, [m n], 'sliding', @fun)
4. The function padarray is used for padding the input image explicitly:
fp = padarray (f, [r c], method, direction)

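5. Options for function padarray:
a. method: 'symmetric', 'replicate', or 'circular'; if omitted, the
array is padded with zeros.
b. direction: 'pre', 'post', or 'both' (the default).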
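
A sketch of how colfilt and padarray work together. The maximum filter passed to colfilt is an assumption chosen because it is easy to verify; any function that returns one value per column could be used instead.

```matlab
f = magic(6);
m = 3; n = 3;

% Pad explicitly so that border neighborhoods are filled by replication
% instead of the zeros that colfilt would use.
fp = padarray(f, [1 1], 'replicate', 'both');

% Each column of the matrix passed to the function is one m-by-n
% neighborhood; the function must return one value per column.
gp = colfilt(fp, [m n], 'sliding', @(A) max(A, [], 1));

g = gp(2:end-1, 2:end-1);        % remove the padding again
```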

Linear Spatial Filters
1. The function fspecial is used to generate a filter mask, having syntax:
w = fspecial ('type', parameters)
2. Spatial filters supported by function fspecial include 'average',
'disk', 'gaussian', 'laplacian', 'log', 'motion', 'prewitt', and 'sobel'.

Nonlinear Spatial Filters
1. The ordfilt2 function, which generates order-statistic filters, is used
for generating nonlinear spatial filters, having syntax:
g = ordfilt2 (f, order, domain)
2. The function medfilt2 is used for median filtering, having syntax:
g = medfilt2 (f, [m n], padopt)
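
The median filter is a special case of an order-statistic filter, which the following sketch illustrates. The test image and the 3 x 3 neighborhood size are assumptions.

```matlab
f = magic(7);
m = 3; n = 3;

g1 = medfilt2(f, [m n]);                       % median filter, zero padding by default
g2 = ordfilt2(f, (m*n + 1)/2, ones(m, n));     % middle element of the sorted neighborhood
```

Both calls pad the image with zeros by default, so the results agree at the borders as well.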


What is Digital Image Processing?
1. An image may be defined as a two dimensional function f(x,y), where x
and y are spatial coordinates.
2. Amplitude of f at any pair of coordinates (x,y) is called intensity or gray
level of the image at that point.
3. When x, y, and the amplitude values of f are all finite, discrete
quantities, the image is said to be a digital image.
4. Hence, digital image processing refers to processing digital images by
means of a digital computer.
5. A digital image is composed of a finite number of elements, each of
which has a particular location and a value.
6. These elements are referred to as picture elements, image elements,
pels, and pixels.
7. Pixel is the most widely used term to denote the elements of an image.
8. Moreover, digital image processing encompasses processes whose
inputs and outputs are images and, in addition, encompasses processes
that extract attributes from images, up to and including the recognition of
individual objects.

Background on MATLAB and the Image Processing Toolbox
1. MATLAB stands for MATrix LABoratory.
2. It was written originally to provide easy access to matrix software
developed by the LINPACK and EISPACK projects.
3. Today, MATLAB engines incorporate the LAPACK and BLAS libraries,
constituting the state of the art in software for matrix computation.
4. MATLAB is a high level language for technical computing.
5. Its typical uses include:
a. Math & computation
b. Algorithm development
c. Data acquisition
d. Modeling & simulation
e. Data analysis, exploration, and visualization
f. Scientific and engineering graphics
g. Application development, including GUI building.
6. MATLAB is complemented by a family of application-specific solutions
called toolboxes.
7. The Image Processing Toolbox (IPT) is a collection of MATLAB functions
that extend the capability of the MATLAB environment for the solution of
image processing problems.
8. There are various other toolboxes, such as the Signal Processing, Fuzzy
Logic, and Wavelet toolboxes, which are sometimes used to complement IPT.

Areas of Image Processing covered in tutorials
1. The areas covered in this tutorial are:
a. DIP fundamentals
b. Intensity transformations and spatial filtering
c. Processing in the frequency domain
d. Image restoration
e. Color Image Processing
f. Wavelets
g. Image compression
h. Morphological image processing
i. Representation & Description
j. Object Recognition

MATLAB desktop
The MATLAB environment consists of the following main parts:

1. Command Window
2. Command History
3. Workspace
4. Current Folder
5. Editor Window

Command Window
1. Main window in MATLAB.
2. Used to enter variables.
3. Used to run functions and
M-file scripts.
4. All commands are typed after
command prompt “>>”.

Command History
1. Statements entered in
command window are logged
in Command History.
2. View and search previously
run statements.
3. Copy and execute selected statements.

Workspace
1. Lists all variables created since MATLAB was opened.
2. Type “who” in the command window to list all the variables in the
workspace.
3. Type “whos” to list all the variables with their dimensions,
classes, etc.

4. The “clear” command is used to clear all the variables from the
workspace.
5. Save all the variables and data to a MAT-file (.mat file) with the
“save” command to use them later.

Current Folder
1. Lists all m files, etc. available
in current directory.
2. Set working folder as current
directory or as a part of the
search path so that MATLAB
will find the files easily.


Editor Window
1. Used to create and edit scripts.
2. Click the “New Script” button
in the Toolbar.

“Help” System
1. Type “help” in the command window to list the available help topics.
2. Type “help elfun” (Elementary Math Functions), and MATLAB will list
all the functions in that category.

3. “help” followed by a function name is used to get specific help about
that function.

4. To open the help window on the specific topic of interest, use “doc” command.

5. help keyword is used to get
help for a specific function,
but lookfor command is used
to search for all functions, etc.
with a specific keyword.
6. Example: >>lookfor plot

Intensity Transformations

1. The term spatial domain refers to the image plane itself.
2. Spatial domain techniques operate directly on the pixels of image.
3. The spatial domain processes are denoted as:
g (x,y) = T [f (x,y)]
where, f (x,y) is the input image, g (x,y) is the processed image, and T is an
operator on f.
Intensity Transformation Functions
1. imadjust
a. Function imadjust is the basic IPT tool for intensity transformations of
grayscale images.
b. Syntax:
g = imadjust (f, [low_in high_in], [low_out high_out], gamma)
c. The negative of an image can also be obtained by using
imcomplement function.
g = imcomplement (f)
2. Logarithmic and contrast stretching transformations
a. These transformations are basic tools for dynamic range manipulation.
b. Logarithm transformations are implemented using the expression
g = c * log (1 + double (f))
c. One of the principal uses of the log transformation is to compress
dynamic range.
d. After a log transformation, it is often desirable to bring the
compressed values back to the full range of the display.
e. For 8 bits, this is done with the statement
gs = im2uint8 (mat2gray (g))
f. Use of mat2gray brings the values to the range [0,1], and im2uint8
brings them to the range [0,255].
g. The function shown in fig. (a) is
called a contrast-stretching
transformation function.
h. The function shown in fig. (b) is
called a thresholding function

3. Handling a variable number of input and/or outputs
a. Function nargin is used to check the number of arguments input into
the M-function.
b. Function nargout is used in connection with the outputs of an
M-function.
c. To check in the body of an M-function whether the correct number of
arguments was passed, the nargchk function is used.
4. Function to compute negative, gamma, log and contrast stretching
a. Such a function uses function changeclass to convert its result to the
required class, having syntax:
g = changeclass (newclass, f)
where image f is converted to the class specified in parameter newclass
and output as g.
5. Function used for image scaling
a. Function gscale is used for this purpose, having syntax:
g = gscale (f, method, low, high)
where f is the image to be scaled.
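
The negative and log transformations described above can be sketched with toolbox functions only. The 2 x 2 test image and the choice c = 1 are assumptions; changeclass and gscale are not used here.

```matlab
f = uint8([0 64; 128 255]);

% Negative of the image, obtained in two equivalent ways.
gneg1 = imadjust(f, [0 1], [1 0]);     % reversed output range
gneg2 = imcomplement(f);

% Log transformation (c = 1), rescaled to the full 8-bit range for display.
g  = log(1 + double(f));
gs = im2uint8(mat2gray(g));
```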

Histogram processing and function plotting
1. Generating and plotting image histograms
a. The histogram of a digital image can be displayed by using imhist
command, having syntax:
h = imhist (f,b)
b. Histograms are also plotted using bar graphs, having syntax:
bar (horz, z, width)
c. Stem graph can also be used to plot histograms, having syntax:
stem (horz, z, 'color_linestyle_marker', 'fill')
d. Histograms are also plotted using plot command, having syntax:
plot (horz, z, 'color_linestyle_marker')
2. Histogram Equalization: To implement histogram equalization in the
toolbox, histeq function is used, having syntax:
g = histeq (f, nlev)
3. Histogram Matching: It is implemented in toolbox by using the syntax:
g = histeq (f, hspec)
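
A brief sketch of computing a histogram and equalizing an image. The low-contrast ramp image is an assumption, and plotting is left out so that the example runs without a display.

```matlab
% Low-contrast test image that uses only the gray levels 0..63.
f = uint8(repmat(0:63, 64, 1));

h = imhist(f);            % 256-bin histogram for uint8 input
g = histeq(f, 256);       % equalize onto 256 output levels
```

The vector h can then be passed to bar, stem, or plot, for example as bar (0:255, h). After equalization the gray levels of g are spread over a wider range than those of f.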

Morphological Image Processing

Dilation
1. Dilation grows and thickens objects in a binary image.
2. The structuring element controls the specific manner and extent of this
thickening.
A2 = imdilate (A, B)
4. The function strel constructs structuring elements with a variety of
shapes and sizes, having syntax:
se = strel (shape, parameters)

Erosion
1. Erosion shrinks and thins objects in a binary image.
2. It is performed by function imerode, having syntax:
se = strel (shape, parameters)
A2 = imerode (A, se)
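
Dilation and erosion can be compared on a small binary image. The 3 x 3 square object and the square structuring element are assumptions for illustration.

```matlab
A = false(7, 7);
A(3:5, 3:5) = true;               % 3x3 square object

se = strel('square', 3);          % 3x3 structuring element of ones
Ad = imdilate(A, se);             % grows the square to 5x5
Ae = imerode(A, se);              % shrinks the square to a single pixel
```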

Combining Dilation and Erosion
1. Opening and closing
a. Functions imopen and imclose are used to implement opening and
closing in the toolbox, having syntax:
C = imopen (A, B)
and C = imclose (A, B)
2. Hit-or-Miss transformation
a. Function bwhitmiss is used to implement hit-or-miss transformation,
having syntax:
C = bwhitmiss (A, B1, B2)
3. Using Lookup Tables
a. A lookup table (LUT) is often a faster way to compute the hit-or-miss
transformation when structuring elements are small.
b. A lookup table is constructed by using function makelut, based on a
user-supplied function.
c. Function applylut processes binary images using this lookup table.
d. The line of code
persistent lut
establishes a variable called lut and declares it to be persistent.
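
Opening and closing are combinations of the two basic operations, as this sketch shows. The test image with one isolated noise pixel is an assumption.

```matlab
A = false(9, 9);
A(3:7, 3:7) = true;      % 5x5 square object
A(1, 1) = true;          % isolated noise pixel
B = strel('square', 3);

C1 = imopen(A, B);                     % opening removes the isolated pixel
C2 = imdilate(imerode(A, B), B);       % opening = erosion followed by dilation
C3 = imclose(A, B);                    % closing = dilation followed by erosion
```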

Function bwmorph
1. Function bwmorph implements various useful operations by combining
dilations, erosions, and lookup table operations, having syntax:
g = bwmorph (f, operation, n)
2. The two main operations are thinning and skeletonization.
a. Thinning means reducing binary objects or shapes in an image to
strokes that are a single pixel wide.
b. Skeletonization reduces binary image objects to a set of thin strokes.
c. Thinning and skeletonization produce short extraneous spurs called
parasitic components.
d. The process of cleaning up these spurs is called pruning.
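
A short sketch of thinning and skeletonization with bwmorph. The thick bar used as input is an assumption; n = Inf repeats the operation until the image stops changing.

```matlab
f = false(9, 15);
f(3:7, 3:13) = true;              % thick horizontal bar

g = bwmorph(f, 'thin', Inf);      % thinning until nothing changes
s = bwmorph(f, 'skel', Inf);      % skeletonization until nothing changes
```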

Labeling Connected Components
1. The term connected component is defined in terms of a path, which in
turn depends on adjacency.
2. All the connected components in a binary image are computed by using
bwlabel function, having syntax:
[L, num] = bwlabel (f, conn)
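
The effect of the conn argument can be seen on an image with two diagonally touching pixels. The test image below is an assumption chosen for that purpose.

```matlab
f = logical([1 1 0 0 0;
             1 0 0 0 1;
             0 0 0 1 0;
             0 0 0 0 0]);

[L8, n8] = bwlabel(f, 8);   % diagonal neighbors belong to the same component
[L4, n4] = bwlabel(f, 4);   % diagonal neighbors are separate components
```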

Morphological Reconstruction
1. Reconstruction involves two images, a marker and a mask, and a
structuring element.
2. Function imreconstruct is used for morphological reconstruction, having
syntax:
out = imreconstruct (marker, mask)
3. Opening by reconstruction
a. It restores exactly the shapes of the objects that remain after erosion.
b. The first step is to erode the image with a structuring element se
using the imerode function, having syntax:
fe = imerode (f, se)
c. Finally, reconstruction is obtained by:
fobr = imreconstruct (fe, f)
4. Filling Holes
a. Function imfill performs holes filling operation, having syntax:
g = imfill (f, 'holes')
5. Clearing Border Objects
a. To perform this operation, function imclearborder is used.
g = imclearborder (f, conn)
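
The three reconstruction-based operations can be sketched on small binary images. All test images and the 3 x 3 structuring element are assumptions for illustration.

```matlab
% Opening by reconstruction: the square survives erosion, the thin bar does not.
f = false(11, 11);
f(2:6, 2:6) = true;               % 5x5 square
f(9, 2:4) = true;                 % 1x3 bar
se = ones(3);
fe = imerode(f, se);
fobr = imreconstruct(fe, f);      % square restored exactly, bar removed

% Filling holes in a ring.
ring = false(7, 7);
ring(2:6, 2:6) = true;
ring(3:5, 3:5) = false;
filled = imfill(ring, 'holes');

% Clearing an object that touches the image border.
b = f;
b(1:3, 11) = true;                % extra object on the right border
cleared = imclearborder(b, 8);
```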


Digital Image Representation
1. An image is a two dimensional function, f(x,y), where x and y are spatial
coordinates.
2. Amplitude of f at any pair of coordinates (x,y) is called intensity or gray
level of the image at that point.
3. The intensity of monochrome images is referred to as gray level.
4. Color images are formed by combining individual 2D images.
5. For example, in the RGB color system, a color image consists of three
individual component images (red, green, and blue).
6. Due to this, many of the techniques developed for monochrome images
can be extended to color images by processing the three component
images individually.
7. An image may be continuous with respect to the x- and y- coordinates,
and also in amplitude.
8. So converting such an image to digital form requires that the
coordinates, as well as the amplitude, be digitized.
9. Digitizing the coordinate values is called sampling; digitizing the
amplitude values is called quantization.
10. Thus, when x, y, and f are all finite, discrete quantities, the image is said
to be a digital image.

Reading Images
1. Images can be read by using imread function.
2. Syntax to read the image from the current directory is:
imread ('filename')
3. If the file that has to be read is stored in a specified directory then
include a full or relative path to that directory in filename.
4. The row and column dimensions of an image are given by the size
function.
5. Additional information about an array is displayed by the whos
function.

Displaying Images
1. imshow function is used to display the images on the MATLAB desktop.
2. Syntax:
imshow (f)
3. Function pixval is used to display the intensity values of the individual
pixels interactively (in recent toolbox releases this role is taken by
impixelinfo).

Writing Images
1. imwrite function is used to write the image to the disk.
2. Syntax:
imwrite (f, 'filename')
3. A more general syntax for JPEG images is:
imwrite (f, 'filename.jpg', 'quality', q)
4. imfinfo function is used to get an idea of image compression achieved
and to obtain other details of an image.
5. Syntax:
imfinfo filename
6. A more general syntax for TIFF images is:
imwrite (g, 'filename.tif', 'compression', 'parameter', ...
'resolution', [colres rowres])
7. The contents of figure window can be exported to the disk in two ways.
a. Use the File pull down menu in the figure window and then
choose Export and then select a location, filename, and format.
b. Use the print command, as:
print -fno -dfileformat -rresno filename
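
Writing an image and reading it back can be sketched as a round trip. The file name is generated with tempname, which assumes that the temporary directory is writable; PNG is used because it is lossless.

```matlab
f = uint8(magic(16));             % test image (the value 256 saturates to 255)
fname = [tempname '.png'];        % temporary file name

imwrite(f, fname);                % write to disk
g = imread(fname);                % read back
info = imfinfo(fname);            % file details such as Width, Height, Format
[M, N] = size(g);                 % row and column dimensions

delete(fname);                    % clean up the temporary file
```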

Image Formats Supported by imread & imwrite

Data Classes

Image Types
1. Intensity images
a. It is a data matrix whose values have been scaled to represent
intensities.
2. Binary images
a. It is a logical array of 0s and 1s.
b. If A is a numeric array, logical array B can be created as
B = logical (A)
c. islogical function is used to test if an array is logical:
islogical (C)
3. Indexed images
a. It has two components: a data matrix of integers, X, and a
colormap matrix, map.
b. It is displayed as:
imshow (X, map)
c. Indexed image can also be displayed as:
image (X)
colormap (map)
4. RGB images
a. An RGB color image is an M x N x 3 array of color pixels.
b. Each pixel is a triplet corresponding to the red, green, and blue
components of an RGB image at a specific spatial location.

Converting between Data Classes
1. The general syntax is:
B = data_class_name (A)
where, data_class_name is one of the names of the data types.
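
A few conversions between classes, as a sketch. The input values are assumptions chosen to show rounding and saturation.

```matlab
A = [0 2; 0.5 0];
B = logical(A);                    % nonzero -> true, zero -> false
isB = islogical(B);                % true

u = uint8([-5 100.6 300]);         % conversion rounds and saturates to [0, 255]
d = double(u);                     % back to double, values unchanged
s = im2double(uint8([0 255]));     % toolbox conversion that also rescales to [0, 1]
```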

Converting between Image Classes and Types

Array Indexing
1. Vector Indexing
a. An array of dimension 1 x N is called a row vector.
b. The elements of such a vector are accessed using
one-dimensional indexing.
c. The elements of a vector are enclosed by square brackets and
are separated by spaces or by commas.
d. A row vector is converted to a column vector by using the transpose
operator (.').
e. The colon notation is used to access blocks of elements.
f. Function linspace generates a row vector x of n elements linearly
spaced between, and including, a and b, having syntax:
x = linspace (a,b,n)
g. A vector can be used as index into another vector.
2. Matrix Indexing
a. Matrices can be represented as a sequence of row vectors
enclosed by square brackets and separated by semicolons.
b. Elements can be selected in matrices just as in vectors, but two
indices can be needed: one to establish row location and the
other for the corresponding column.
c. The colon notation is used in matrix indexing to select a
two-dimensional block of elements out of a matrix.
3. Selecting array dimensions
a. Operations of the form
operation (A,dim)
where operation denotes an applicable operation, A is an array, and
dim is a scalar, act along dimension dim of A.
b. Function ndims gives the number of dimensions of an array,
d = ndims (A)
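
The indexing rules above in a compact sketch. All values are assumptions for illustration.

```matlab
v = [1 3 5 7 9];            % row vector
c = v.';                    % column vector via the transpose operator
blk = v(2:4);               % colon notation: elements 2 through 4
sel = v([1 5]);             % a vector used as an index into another vector
x = linspace(0, 1, 5);      % 5 points from 0 to 1 inclusive

A = [1 2 3; 4 5 6; 7 8 9];  % rows separated by semicolons
a23 = A(2, 3);              % row 2, column 3
B = A(1:2, 2:3);            % two-dimensional block
colsum = sum(A, 1);         % operation along dimension 1 (down the columns)
rowsum = sum(A, 2);         % operation along dimension 2 (across the rows)
d = ndims(A);               % number of dimensions
```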

Some Important Standard Arrays
1. zeros(M,N) generates an M x N matrix of 0s of class double.
2. ones(M,N) generates an M x N matrix of 1s of class double.
3. true(M,N) generates an M x N logical matrix of 1s.
4. false(M,N) generates an M x N logical matrix of 0s.
5. magic(M) generates an M x M “magic square”.
6. rand(M,N) generates an M x N matrix whose entries are uniformly
distributed random numbers in the interval [0,1].
7. randn(M,N) generates an M x N matrix whose entries are normally
distributed random numbers with mean 0 and variance 1.

Image Compression

1. Image compression refers to the reduction of the amount of data required to represent a digital image.
2. By removing one or more of the three basic data redundancies, compression can be achieved.
3. Redundancies may be:
a. Coding redundancy
b. Interpixel redundancy
c. Psychovisual redundancy
4. Image compression systems comprise an encoder and a decoder.
5. Image f(x,y) is fed into the encoder which creates a set of symbols from the input data and uses them to
represent the image with compression.

6. The compression achieved can be quantified numerically via the
compression ratio:
CR = n1/n2
where n1 and n2 denote the number of bits in the original and the encoded
image, respectively.
7. Function imratio can be used to compute the ratio of the number of bits
used in the representation of two image files or variables, having syntax:
Cr = imratio (f1, f2)
8. Function bytes returns the number of bytes in the input image, having
syntax:
b = bytes (f)
9. To view or use the compressed image, it must be fed into a decoder,
which produces a reconstructed output image.
10. The reconstructed output image may or may not be an exact replica of
the original image.
11. If it is, the system is called error free, information preserving, or
lossless; if not, the compression is called lossy, with error e (x, y).
12. Function compare can be used to compute and display the error
between two matrices, having syntax:
rmse = compare (f1, f2, scale)
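
The compression ratio and the root-mean-square error can be computed with base MATLAB alone. This sketch does not use imratio, bytes, or compare (which appear to be custom M-functions); the "compression" by dropping the four low-order bits is an assumption chosen to keep the example short.

```matlab
f1 = uint8(magic(8));            % original image, 8 bits per pixel
f2 = bitshift(f1, -4);           % keep only the 4 high-order bits of each pixel

n1 = numel(f1) * 8;              % bits in the original representation
n2 = numel(f2) * 4;              % bits if f2 is stored with 4 bits per pixel
CR = n1 / n2;                    % compression ratio

% Decoder side and root-mean-square error of the lossy result.
fhat = bitshift(f2, 4);
e = double(f1) - double(fhat);   % error image e(x, y)
rmse = sqrt(mean(e(:).^2));
```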

Coding Redundancy
1. Coding redundancy is present when the code words assigned to the gray
levels use more bits than are needed to represent them.
2. Uncompressed images are typically coded with a fixed-length code word
for each pixel, regardless of how probable its gray level is.
3. The average information per source output is called the entropy of the
source.
4. If an image is emitted by a gray-level source, the gray-level histogram of
an observed image is used to model the source symbol probabilities and
generate an estimate called the first-order estimate.
5. Function entropy can be used to compute the first-order estimate of a
matrix, having syntax:
h = entropy (x, n)
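
The first-order entropy estimate follows directly from the normalized histogram. This is a base MATLAB sketch, not the entropy (x, n) function named above; the eight-pixel test vector is an assumption chosen so that the result is easy to verify by hand.

```matlab
x = uint8([0 0 0 0 1 1 2 3]);

counts = accumarray(double(x(:)) + 1, 1);   % histogram of the gray levels present
p = counts / sum(counts);                   % estimated symbol probabilities
p = p(p > 0);                               % drop empty bins (0*log2(0) is taken as 0)
h = -sum(p .* log2(p));                     % first-order entropy estimate in bits/pixel
```

With probabilities 1/2, 1/4, 1/8, and 1/8 the estimate is 1.75 bits per pixel.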

Huffman Codes
1. When coding the gray levels of an image or the output of a gray-level
mapping operation one symbol at a time, Huffman codes yield the smallest
possible number of code symbols per source symbol.
2. The first step is to create series of source reductions.
3. The second step is to code each reduced source.
4. The function huffman is used to implement the source reduction and
code assignment procedures, having syntax:
CODE = huffman (p)
5. All MATLAB global variables must be declared in the function using the
syntax:
global x y z
6. Function cell is used to initialize the CODE, having syntax:
X = cell (m, n)
7. The function sort is used to sort the vector in ascending order of
probability at each iteration, having syntax:
[y, i] = sort (x)
8. The output is generated by using functions celldisp and cellplot as:
celldisp (s)
or, cellplot (s)
9. The final step is calling the makecode function as
makecode (s, [])
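
The two steps (source reduction and code assignment) can be sketched in a single loop. This is an illustrative sketch, not the huffman or makecode functions named above; the probability vector and all variable names are assumptions.

```matlab
p = [0.4 0.3 0.2 0.1];               % symbol probabilities
n = numel(p);
codes = repmat({''}, 1, n);          % code word for each symbol
groups = num2cell(1:n);              % symbols contained in each active node
q = p;                               % probabilities of the active nodes

while numel(q) > 1
    [q, i] = sort(q);                % ascending order of probability
    groups = groups(i);
    % Prefix '0' to the symbols of the least probable node, '1' to the next.
    for s = groups{1}, codes{s} = ['0' codes{s}]; end
    for s = groups{2}, codes{s} = ['1' codes{s}]; end
    % Merge the two nodes (one source reduction).
    groups = [{[groups{1} groups{2}]}, groups(3:end)];
    q = [q(1) + q(2), q(3:end)];
end

len = cellfun(@numel, codes);        % code word lengths
avgLen = sum(p .* len);              % average bits per symbol
```

For these probabilities the code word lengths are 1, 2, 3, and 3 bits, giving an average of 1.9 bits per symbol.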

Interpixel Redundancy
1. It results from correlations between the pixels of an image or between
the pixels of neighboring images in a sequence of images.

Psychovisual Redundancy
1. Associated with real or quantifiable visual information.
2. Eliminating psychovisually redundant data results in a loss of
quantitative information; this operation is called quantization.
3. Function quantize is used to quantize the elements of the matrix, having
syntax:
y = quantize (x, b, type)
4. The improved gray-scale (IGS) quantization method recognizes the eye's
inherent sensitivity to edges and breaks them up by adding to each pixel
a pseudorandom number.
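
Uniform quantization to b bits can be sketched with bit operations. This is not the quantize function named above; truncating the low-order bits is an assumption about the simplest form of quantization, and IGS quantization is not shown.

```matlab
x = uint8([0 37 128 200 255]);
b = 4;                                  % number of bits to keep

lo = uint8(2^(8 - b) - 1);              % mask of the bits to discard (15 for b = 4)
y = bitand(x, bitcmp(lo));              % zero the discarded low-order bits
```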

JPEG Compression
1. The JPEG (Joint Photographic Experts Group) standard is one of the continuous tone, still frame compression
standards.
2. The first step is to subdivide the input image into non-overlapping pixel blocks of size 8 x 8.

3. As each 8 x 8 block or subimage is processed, its 64 pixels are level
shifted by subtracting 2^(m-1), where 2^m is the number of gray levels in
the image, and its 2D discrete cosine transform is computed.
4. The resulting coefficients are then simultaneously normalized and
quantized.
5. After quantizing each block's DCT coefficients, the elements of the
result are reordered in a zigzag pattern.
6. Function im2jpeg is used for JPEG compression, having syntax:
y = im2jpeg (x, quality)
7. Two specialized block processing functions used to simplify the
computations are:
a. blkproc automates the entire process of dealing with images in
blocks, having syntax:
B = blkproc (A, [M N], FUN, P1, P2, ...)
(in recent toolbox releases blkproc has been superseded by blockproc)
b. im2col gives a matrix in which each column contains the elements of
one distinct block of the input image, having syntax:
B = im2col (A, [M N], 'distinct')
8. The function jpeg2im is used to decompress a compressed image,
having syntax:
x = jpeg2im (y)
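
The processing of a single 8 x 8 block can be sketched with dct2 and idct2. This is not im2jpeg or jpeg2im: the single uniform step size q stands in for the JPEG normalization array and is an assumption, and the zigzag reordering and entropy coding steps are left out.

```matlab
x = uint8(magic(8));                  % one 8x8 block of an 8-bit image
m = 8;                                % 2^m gray levels

% Encoder side.
shifted = double(x) - 2^(m - 1);      % level shift by 2^(m-1) = 128
c = dct2(shifted);                    % 2-D discrete cosine transform
q = 16;                               % assumption: one uniform step size
cq = round(c / q);                    % normalize and quantize

% Decoder side.
xhat = uint8(idct2(cq * q) + 2^(m - 1));   % denormalize, inverse DCT, undo the shift

% Without quantization the transform pair is invertible.
roundTrip = idct2(c);
```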