A good FR methodology should address representation as well as classification issues, and a good representation method should require minimal manual annotations.
Biologically motivated computationally intensive approaches to image pattern recognition
Author: Nikolay Petkov
Date: 1995
Abstract:
The approaches concerned are biologically motivated, in that we try to mimic and use mechanisms employed by natural vision systems, more specifically the visual system of primates. Visual information representations are computed that are motivated by the function of the primary visual cortex, more specifically by the function of so-called simple cells.
Cortical filters and images
Computational models of visual neurons with linear spatial summation
The receptive field of a visual neuron is the part of the visual field within which a stimulus can influence the response of the concerned neuron.
The response r of a neuron to an input image s(x,y) can be modelled as follows:
- linear spatial summation. The receptive field function is like the impulse response of a linear system.
- thresholding and non-linear local contrast normalization. (this part not understood!!!)
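The linear part of this model can be sketched as a weighted sum of the input over the receptive field, followed by a simple threshold. A minimal sketch in NumPy; the helper name neuron_response and the bare half-wave thresholding are my own choices, and the local contrast-normalization stage is left out, matching the open note above:

```python
import numpy as np

def neuron_response(s, rf, threshold=0.0):
    """Response r of a visual neuron with linear spatial summation:
    r = sum over (x, y) of s(x, y) * rf(x, y), where rf is the
    receptive field function (the 'impulse response' of the neuron),
    followed by simple thresholding. Contrast normalization omitted.
    s and rf are equally sized 2-D arrays (image patch, receptive field)."""
    r = float(np.sum(s * rf))          # linear spatial summation
    return max(r - threshold, 0.0)     # thresholding nonlinearity
```

A receptive field matched to the stimulus gives a large positive response; a sign-flipped one is cut off at zero by the threshold.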
Simple cells
In this study we are concerned with computer simulations of so-called simple cells in the primary visual cortex.
Neurophysiological research has shown that, over the population of all simple cells, the receptive field sizes vary considerably, with the diameters of the smallest and the largest receptive fields in a ratio of at least 1:30.
It has been found that the spatial aspect ratio varies in a very limited range of 0.23 < γ < 0.92. The value γ = 0.5 is used in our simulations.
The ratio σ/λ determines the number of parallel excitatory and inhibitory zones which can be observed in a receptive field. Neurophysiological research shows that the parameters λ and σ are closely correlated; on the set of all cells, the ratio σ/λ, which determines the spatial-frequency bandwidth of a cell, varies in a very limited range of 0.4–0.9, corresponding to two to five excitatory and inhibitory stripe zones in a receptive field. The value σ/λ = 0.5 is used in our simulations.
In our simulations, we use for φ the following values: 0 (symmetric receptive fields to which we refer to as ‘center-on’ in analogy with retinal ganglion cell receptive fields whose central areas are excitatory), π (symmetric receptive fields to which we refer to as ‘center-off’, since their central lobes are inhibitory), and -0.5π and 0.5π (antisymmetric receptive fields with opposite polarity).
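The receptive field function these parameters describe is the standard 2-D Gabor model of a simple cell: a Gaussian envelope times a cosine carrier in rotated coordinates. A sketch, assuming that model (the function name gabor_rf and the kernel-size handling are my own; parameter names follow the notes: wavelength λ, orientation θ, phase φ, ratio σ/λ, aspect ratio γ):

```python
import numpy as np

def gabor_rf(size, lam, theta, phi, sigma_ratio=0.5, gamma=0.5):
    """2-D Gabor receptive field function (simple-cell model).
    phi = 0 gives a 'center-on' field, phi = pi a 'center-off' field,
    phi = +/-pi/2 the two antisymmetric fields of opposite polarity."""
    sigma = sigma_ratio * lam
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xt = x * np.cos(theta) + y * np.sin(theta)      # rotated coordinates
    yt = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xt**2 + (gamma * yt)**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xt / lam + phi)
```

The center pixel is maximally excitatory for φ = 0, inhibitory for φ = π, and zero for the antisymmetric phases, matching the center-on/center-off terminology above.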
For technical applications one might wish to have filters with a more distinct effect, in that such a filter should enhance edges or bars, but not both at the same time. This would simplify the interpretation of the resulting cortical images and the subsequent processing steps. We propose a mechanism for the elimination of so-called ‘shadow’ lines in cortical images obtained from filters with antisymmetric spatial summation functions. An edge which is strongly enhanced by a cortical filter with an antisymmetric receptive field function of orientation θ and phase φ will give rise to a pair of parallel lines in the cortical image produced by a cortical filter with an antisymmetric receptive field function of the same orientation θ and phase φ+π. We call these lines ‘shadow’ lines and propose to eliminate them by a mechanism we call lateral inhibition. Roughly speaking, this mechanism acts as follows: the value of a pixel in a cortical image corresponding to orientation θ and phase φ is set to zero if a pixel of higher value is found in a certain vicinity of the concerned pixel in the cortical image corresponding to orientation θ and phase φ+π. Here, it is proposed to extend the lateral inhibition mechanism by applying it to all cortical images with the same set of receptive field parameter values except the phase φ.
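A minimal sketch of the lateral inhibition step for one pair of opposite-phase cortical images. The square (2r+1)×(2r+1) vicinity is an assumption (the text only says ‘a certain vicinity’), and the function name is hypothetical:

```python
import numpy as np

def lateral_inhibition(img_phi, img_phi_pi, radius=1):
    """Suppress 'shadow' lines: zero a pixel of the cortical image for
    phase phi wherever the cortical image for phase phi + pi contains a
    larger value within a (2*radius+1)^2 neighbourhood of that pixel."""
    h, w = img_phi.shape
    # Pad with -inf so border pixels see only real opposite-phase values
    padded = np.pad(img_phi_pi, radius, mode="constant",
                    constant_values=-np.inf)
    out = img_phi.copy()
    for i in range(h):
        for j in range(w):
            # maximum of the opposite-phase image in the local vicinity
            local_max = padded[i:i + 2*radius + 1, j:j + 2*radius + 1].max()
            if local_max > img_phi[i, j]:
                out[i, j] = 0.0
    return out
```

Extending the mechanism as proposed above would mean taking the maximum over all cortical images sharing the other parameters, not just the φ+π one.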
Using cortical images
Extracting lower dimension representations
Although the sets of cortical images computed according to the above model deliver usefully structured information, they themselves do not give an ultimate solution to the image pattern recognition problem.
All pixel values in a cortical image are summed together to build a quantity which is (partially) characteristic of the input image. (A positive aspect of this scheme is that the result does not depend on the precise position of the object to be recognized.) Since a number of different cortical channels are used, each of them computing a different cortical image from the input image, a number of such quantities are computed, one per cortical channel. Together these quantities form a descriptor vector which is considered as a projection of the input image onto a point in a lower-dimension space. This representation is then used to discriminate among different images.
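The descriptor construction above amounts to one sum per cortical channel. As a sketch (the helper name descriptor_vector is my own):

```python
import numpy as np

def descriptor_vector(cortical_images):
    """Project an input image onto a low-dimensional descriptor: one
    component per cortical channel, each component being the sum of all
    pixel values of that channel's cortical image. The sums are
    insensitive to the precise position of the object in the image."""
    return np.array([img.sum() for img in cortical_images])
```

The length of the vector equals the number of cortical channels (combinations of orientation, phase, and receptive field size), which is typically far smaller than the number of pixels.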
Computing optic flow
Line Detection
Brief Description
While edges (i.e. boundaries between regions with relatively distinct gray levels) are by far the most common type of discontinuity in an image, instances of thin lines in an image occur frequently enough that it is useful to have a separate mechanism for detecting them. Here we present a convolution-based technique which produces an image description of the thin lines in an input image. Note that the Hough transform can also be used to detect lines; in that case, however, the output is a parametric description of the lines in an image.
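As a sketch of such a convolution-based detector, the classic 3×3 line-detection masks for four orientations can be correlated with the image; the specific kernels below are the standard ones for one-pixel-wide lines and are an assumption, since the text does not list them:

```python
import numpy as np

# Classic 3x3 line-detection kernels, one per orientation. Each sums
# to zero, so flat regions give zero response; a bright thin line in
# the matching orientation gives a strong positive response.
LINE_KERNELS = {
    "horizontal": np.array([[-1, -1, -1],
                            [ 2,  2,  2],
                            [-1, -1, -1]]),
    "vertical":   np.array([[-1,  2, -1],
                            [-1,  2, -1],
                            [-1,  2, -1]]),
    "diag_45":    np.array([[-1, -1,  2],
                            [-1,  2, -1],
                            [ 2, -1, -1]]),
    "diag_135":   np.array([[ 2, -1, -1],
                            [-1,  2, -1],
                            [-1, -1,  2]]),
}

def line_response(image, kernel):
    """Valid-mode 2-D correlation of image with a line kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

Thresholding the four response images and taking, per pixel, the orientation of the strongest response yields the image description of thin lines mentioned above.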