Huang Dong’s Blog, email: huangdongxy@hotmail.com

November 27, 2008

2-D Image Fourier Transform

source 1,
source 2

  • The Fourier Transform (FT) is used when we want to access the geometric characteristics of a spatial domain image.
  • In most implementations the Fourier image is shifted so that the DC-value F(0,0) (i.e. the image mean, up to normalization) is displayed in the center of the image. The further an image point is from the center, the higher its corresponding frequency.
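
The centering described above can be sketched in a few lines of Python (standard library only). This is a minimal illustration, not the code used for the figures in this post: dft2 and shift_center are hypothetical helper names, and the DFT is assumed to be unnormalized, so F(0,0) equals the pixel sum (the mean times the number of pixels).

```python
import cmath

def dft2(image):
    """Naive 2-D DFT. Returns F[v][u]: v = vertical, u = horizontal frequency index."""
    rows, cols = len(image), len(image[0])
    # Transform every row first (horizontal frequency u) ...
    row_ft = [[sum(image[y][x] * cmath.exp(-2j * cmath.pi * u * x / cols)
                   for x in range(cols))
               for u in range(cols)]
              for y in range(rows)]
    # ... then every column (vertical frequency v).
    return [[sum(row_ft[y][u] * cmath.exp(-2j * cmath.pi * v * y / rows)
                 for y in range(rows))
             for u in range(cols)]
            for v in range(rows)]

def shift_center(spectrum):
    """Cyclically shift so that index (0, 0) moves to (rows // 2, cols // 2)."""
    rows, cols = len(spectrum), len(spectrum[0])
    return [[spectrum[(v - rows // 2) % rows][(u - cols // 2) % cols]
             for u in range(cols)]
            for v in range(rows)]

# Small 4x4 test image (assumed values); its pixel sum is 136, its mean 8.5.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
spectrum = dft2(image)
centered = shift_center(spectrum)

print(abs(spectrum[0][0]))   # DC value before the shift, at index (0, 0)
print(abs(centered[2][2]))   # same value after the shift, now at the center
```

Dividing the DC value by the number of pixels (16 here) gives the image mean.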

We will now experiment with some simple images to better understand the nature of the transform.

The image

shows 2-pixel-wide vertical stripes. The Fourier transform of this image is

If we look carefully, we can see that it contains 3 main values: the DC-value and, since the Fourier image is symmetric about its center, two points corresponding to the frequency of the stripes in the original image. Note that the two points lie on a horizontal line through the image center, because the image intensity in the spatial domain changes the most when we move across it horizontally.
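
The three main values can be reproduced on a tiny synthetic image. The following Python sketch (standard library only) assumes an 8x8 image with 2-pixel-wide stripes and an unshifted, unnormalized DFT; dft2 is a hypothetical helper, not the tool used for the figures above.

```python
import cmath

def dft2(image):
    """Naive 2-D DFT. Returns F[v][u]: v = vertical, u = horizontal frequency index."""
    rows, cols = len(image), len(image[0])
    row_ft = [[sum(image[y][x] * cmath.exp(-2j * cmath.pi * u * x / cols)
                   for x in range(cols))
               for u in range(cols)]
              for y in range(rows)]
    return [[sum(row_ft[y][u] * cmath.exp(-2j * cmath.pi * v * y / rows)
                 for y in range(rows))
             for u in range(cols)]
            for v in range(rows)]

N = 8
# 2-pixel-wide vertical stripes: columns 0-1 bright, 2-3 dark, 4-5 bright, 6-7 dark.
stripes = [[1 if (x // 2) % 2 == 0 else 0 for x in range(N)] for y in range(N)]

spectrum = dft2(stripes)
# Collect the (u, v) positions whose magnitude is not numerically zero.
peaks = sorted((u, v) for v in range(N) for u in range(N)
               if abs(spectrum[v][u]) > 1e-6)
print(peaks)  # [(0, 0), (2, 0), (6, 0)]
```

All three points have v = 0, i.e. they lie on the horizontal frequency axis. Index 6 is the mirror image of index 2 (it corresponds to frequency -2), which is the symmetry about the center mentioned above.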

Similar effects as in the above example can be seen when applying the Fourier Transform to

The magnitude of the Fourier Transform:

Magnitude of FT

We can see that, again, the main components of the transformed image are the DC-value and the two points corresponding to the frequency of the stripes. However, the logarithmic transform of the Fourier Transform,

Log of magnitude of FT

shows that now the image contains many minor frequencies. The main reason is that a diagonal can only be approximated by the square pixels of the image, hence, additional frequencies are needed to compose the image. The logarithmic scaling makes it difficult to tell the influence of single frequencies in the original image. To find the most important frequencies we threshold the original Fourier image at 5% of the main peak.

Compared to the original Fourier image, several more points appear. They are all on the same diagonal as the three main components, i.e. they all originate from the periodic stripes. The represented frequencies are all multiples of the basic frequency of the stripes in the spatial domain image.
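
The log scaling and the 5% threshold can be sketched as follows in Python (standard library only). The image here is an assumption: idealized 16x16 diagonal stripes that wrap around the border, so the off-diagonal minor frequencies described above do not appear. What remains visible is that every point surviving the threshold lies on one diagonal, at a multiple of the basic stripe frequency.

```python
import cmath
import math

def dft2(image):
    """Naive 2-D DFT. Returns F[v][u]: v = vertical, u = horizontal frequency index."""
    rows, cols = len(image), len(image[0])
    row_ft = [[sum(image[y][x] * cmath.exp(-2j * cmath.pi * u * x / cols)
                   for x in range(cols))
               for u in range(cols)]
              for y in range(rows)]
    return [[sum(row_ft[y][u] * cmath.exp(-2j * cmath.pi * v * y / rows)
                 for y in range(rows))
             for u in range(cols)]
            for v in range(rows)]

N, PERIOD, WIDTH = 16, 8, 2   # assumed image size, stripe period and stripe width
image = [[1 if (x + y) % PERIOD < WIDTH else 0 for x in range(N)]
         for y in range(N)]

magnitude = [[abs(c) for c in row] for row in dft2(image)]

# Log scaling compresses the dynamic range so weak components become visible.
log_magnitude = [[math.log(1 + m) for m in row] for row in magnitude]

# Keep only the components that reach 5% of the main peak (the DC value here).
peak = max(max(row) for row in magnitude)
kept = sorted((u, v) for v in range(N) for u in range(N)
              if magnitude[v][u] >= 0.05 * peak)
print(kept)
```

With these parameters the basic frequency index is N / PERIOD = 2, and every surviving point has u equal to v and a multiple of 2.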

Finally, we present an example (finding the orientation of text) where the Fourier Transform is used to gain information about the geometric structure of the spatial domain image. Text recognition using image processing techniques is simplified if we can assume that the text lines run in a predefined direction. Here we show how the Fourier Transform can be used to find the initial orientation of the text, so that a rotation can then be applied to correct it. We illustrate this technique using

a binary image of English text. The logarithm of the magnitude of its Fourier transform is

and

is the thresholded magnitude of the Fourier image. We can see that the main values lie on a vertical line, indicating that the text lines in the input image are horizontal.

If we proceed in the same way with

which was rotated about 45°, we obtain

and

in the Fourier space. We can see that the line of the main peaks in the Fourier domain is rotated according to the rotation of the input image. The second line in the logarithmic image (perpendicular to the main direction) originates from the black corners in the rotated image.
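
One way to turn this observation into a number is to take the strongest non-DC peak and rotate its direction by 90°. The Python sketch below (standard library only) is an assumption-laden illustration: text_line_angle is a hypothetical helper, stripes stand in for real text lines, and angles are measured in index coordinates (x to the right, y downward).

```python
import cmath
import math

def dft2(image):
    """Naive 2-D DFT. Returns F[v][u]: v = vertical, u = horizontal frequency index."""
    rows, cols = len(image), len(image[0])
    row_ft = [[sum(image[y][x] * cmath.exp(-2j * cmath.pi * u * x / cols)
                   for x in range(cols))
               for u in range(cols)]
              for y in range(rows)]
    return [[sum(row_ft[y][u] * cmath.exp(-2j * cmath.pi * v * y / rows)
                 for y in range(rows))
             for u in range(cols)]
            for v in range(rows)]

def text_line_angle(image):
    """Estimate the direction of periodic lines, in degrees within [0, 180)."""
    rows, cols = len(image), len(image[0])
    spectrum = dft2(image)
    best, best_mag = (1, 0), -1.0
    for v in range(rows):
        for u in range(cols):
            if (u, v) == (0, 0):
                continue  # skip the DC value
            mag = abs(spectrum[v][u])
            if mag > best_mag:
                best_mag = mag
                # Indices above the midpoint represent negative frequencies.
                best = (u - cols if u > cols // 2 else u,
                        v - rows if v > rows // 2 else v)
    peak_angle = math.degrees(math.atan2(best[1], best[0]))
    # Intensity changes fastest along the peak direction,
    # so the lines themselves run perpendicular to it.
    return (peak_angle + 90.0) % 180.0

# Stand-ins for text: horizontal lines, and the same pattern along a diagonal.
horizontal = [[1 if (y // 4) % 2 == 0 else 0 for x in range(16)] for y in range(16)]
diagonal = [[1 if (x - y) % 8 < 4 else 0 for x in range(16)] for y in range(16)]

angle_h = text_line_angle(horizontal)
angle_d = text_line_angle(diagonal)
print(angle_h, angle_d)
```

For the horizontal pattern the peak lies on the vertical frequency axis and the estimate is about 0°; for the diagonal pattern it is about 45°. Rotating the image back by the estimated angle would then align the lines with the horizontal.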

Common variants of DFT: DCT

The main advantages of the DCT are that it yields a real-valued output image and that it is a fast transform. A major use of the DCT is in image compression, i.e. reducing the amount of data needed to store an image. After performing a DCT it is possible to discard the coefficients that encode high-frequency components, to which the human eye is not very sensitive. Thus the amount of data can be reduced without seriously affecting the way an image looks to the human eye.
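
The idea of discarding high-frequency coefficients can be illustrated on a single row of pixels. The Python sketch below (standard library only) assumes an orthonormal DCT-II and uses a naive O(n²) implementation rather than a fast one; dct, idct, and the sample row are illustrative choices, not taken from any particular codec.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a 1-D sequence (naive, not a fast algorithm)."""
    n = len(signal)
    return [(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
            * sum(signal[i] * math.cos(math.pi * (i + 0.5) * k / n)
                  for i in range(n))
            for k in range(n)]

def idct(coeffs):
    """Inverse of dct() above (an orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum((math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
                * coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n)
                for k in range(n))
            for i in range(n)]

row = [0, 1, 2, 3, 4, 5, 6, 7]            # one smooth row of pixel values
coeffs = dct(row)                          # all coefficients are real numbers

# Keep the 4 lowest-frequency coefficients and zero out the rest.
truncated = coeffs[:4] + [0.0] * 4

restored = idct(coeffs)                    # exact up to rounding
approx = idct(truncated)                   # reconstructed from half the data
max_error = max(abs(a - b) for a, b in zip(row, approx))
print(max_error)
```

For a smooth row like this one, the reconstruction from half of the coefficients stays close to the original, which is the effect that compression relies on.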

Template Matching

In the text below, we located the areas that have a high degree of correlation with the template “A”. The correlation is high wherever the image closely resembles the template.

Places with high correlation are indicated by bright white spots. However, the correlation image is upside-down and mirrored left to right. Rotating it by 180° would yield a direct correspondence with the places in the original text.
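
As a rough illustration of template matching, the Python sketch below (standard library only) slides a small template over an image and records the correlation coefficient at each position. It works directly in the spatial domain, which is an assumption and not necessarily how the correlation image above was produced; ncc, match_template, and the 3x3 template are hypothetical.

```python
import math

def ncc(patch, template):
    """Correlation coefficient between two equally long lists of numbers."""
    n = len(patch)
    mean_p, mean_t = sum(patch) / n, sum(template) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, template))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in patch)
                    * sum((t - mean_t) ** 2 for t in template))
    return num / den if den > 0 else 0.0   # flat patches score 0

def match_template(image, template):
    """Slide the template over the image and return a map of scores."""
    th, tw = len(template), len(template[0])
    flat_t = [value for row in template for value in row]
    scores = []
    for r in range(len(image) - th + 1):
        scores.append([])
        for c in range(len(image[0]) - tw + 1):
            patch = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            scores[-1].append(ncc(patch, flat_t))
    return scores

# A crude 3x3 stand-in for the letter "A", pasted into an empty 7x9 image.
template = [[0, 1, 0], [1, 1, 1], [1, 0, 1]]
image = [[0] * 9 for _ in range(7)]
for i in range(3):
    for j in range(3):
        image[2 + i][3 + j] = template[i][j]

scores = match_template(image, template)
best = max(((r, c) for r in range(len(scores)) for c in range(len(scores[0]))),
           key=lambda rc: scores[rc[0]][rc[1]])
print(best)  # (2, 3): the row and column where the template was pasted
```

The brightest spot of the score map marks the top-left corner of the best match, with no flipping, because the template is compared directly with each patch.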

Edge Detection

We performed edge detection with horizontal and vertical line filters, as well as a third kernel (von in the code below), using the command imcorrcoef(). This is similar to template matching, but it uses a small kernel as the template and does not require the template and the image to be the same size.

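// Read the image and convert it to grayscale
let = imread("L.bmp");
let = im2gray(let);
// 3x3 kernels: horizontal lines, vertical lines, and a Laplacian-like kernel
hor  = [-1 -1 -1;  2  2  2; -1 -1 -1];
vert = [-1  2 -1; -1  2 -1; -1  2 -1];
von  = [-1 -1 -1; -1  8 -1; -1 -1 -1];
pattern = von;  // alternatively: hor or vert
// Correlate the image with the chosen kernel and display the result
c = imcorrcoef(let, pattern);
imshow(c);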
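
To see what the hor and vert kernels respond to, here is a small Python sketch (standard library only). It uses plain, unnormalized correlation without border padding, which is an assumption and may differ from what imcorrcoef() computes; correlate is a hypothetical helper.

```python
def correlate(image, kernel):
    """Plain 'valid' cross-correlation: no normalization, no border padding."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(len(image[0]) - kw + 1)]
            for r in range(len(image) - kh + 1)]

hor = [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]
vert = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]

# 5x5 test image containing a single horizontal line on row 2.
image = [[1 if r == 2 else 0 for _ in range(5)] for r in range(5)]

hor_response = correlate(image, hor)
vert_response = correlate(image, vert)
print(hor_response)   # [[-3, -3, -3], [6, 6, 6], [-3, -3, -3]]
print(vert_response)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The horizontal kernel responds strongly where it is centered on the line, while the vertical kernel gives no response at all, since each of its rows sums to zero across a horizontal line.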
