
BioSystems 87 (2007) 314–321

A phase-based stereo vision system-on-a-chip


Javier Díaz a,*, Eduardo Ros a,1, Silvio P. Sabatini b,2, Fabio Solari b,3, Sonia Mota c,4
a Department of Computer Architecture and Technology, University of Granada, Spain
b Department of Biophysical and Electronic Engineering (DIBE), University of Genoa, Via Opera Pia 11A, I-16145 Genova, Italy
c Department of Computer Science and Numerical analysis, University of Cordoba, Spain

Received 28 February 2005; received in revised form 8 July 2006; accepted 15 July 2006

Abstract
A simple and fast technique for depth estimation based on phase measurement has been adopted for the implementation of a
real-time stereo system with sub-pixel resolution on an FPGA device. The technique avoids the attendant problem of phase warping.
The designed system takes full advantage of the inherent processing parallelism and segmentation capabilities of FPGA devices
to achieve a computation speed of 65 megapixels/s, which can be arranged with a customized frame-grabber module to process
211 frames/s at a size of 640 × 480 pixels. The processing speed achieved is higher than conventional camera frame rates, thus
allowing the system to extract multiple estimations and be used as a platform to evaluate integration schemes of a population of
neurons without increasing hardware resource demands.
© 2006 Elsevier Ireland Ltd. All rights reserved.

Keywords: Stereopsis; Real-time; Bio-inspired systems; FPGAs; Gabor filters

* Corresponding author at: Department of Computer Architecture and Technology, University of Granada, ETSI Informática, c/ Periodista Daniel Saucedo Aranda s/n, E-18071 Granada, Spain. Tel.: +34 958 240461; fax: +34 958 248993.
E-mail addresses: jdiaz@atc.ugr.es (J. Díaz), eros@atc.ugr.es (E. Ros), silvio.sabatini@unige.it (S.P. Sabatini), fabio.solari@unige.it (F. Solari), smota@uco.es (S. Mota).
1 Present address: Dept. Arquitectura y Tecnología de Computadores, University of Granada, ETSI Informática, c/ Periodista Daniel Saucedo s/n, E-18071 Granada, Spain. Tel.: +34 958 246128; fax: +34 958 248993.
2 Tel.: +39 010 3532289/794; fax: +39 010 3532289/777.
3 Tel.: +39 010 3532289/794; fax: +39 010 3532289/777.
4 Present address: Área de Ingeniería de Sistemas y Automática, Dpto. de Informática y Análisis Numérico, Universidad de Córdoba, Campus de Rabanales, E-4071 Córdoba, Spain. Tel.: +34 957 21 21 72; fax: +34 957 21 86 30.

0303-2647/$ - see front matter © 2006 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.biosystems.2006.09.028

1. Introduction

Stereo vision allows many biological systems to reconstruct depth information encoded within multiple images. This task is accomplished in the visual cortex by a specialized receptive field structure (DeAngelis et al., 1991).

Significant studies have shown that a substantial proportion of neurons in the striate and extrastriate cortex of monkeys have stereoscopic properties; that is, they respond differentially to binocular stimuli, thus providing cues for stereoscopic depth perception (Hubel and Wiesel, 1962; Barlow et al., 1967; DeAngelis et al., 1998). Stereoscopic neurons display disparity selectivity and correlation selectivity. Many neurons have tuned disparity response profiles that collectively cover the entire range of physiological disparities. These cells can be classified on the basis of their responses: first, neurons with peak responses at (or about) zero disparity (tuned zero neurons, excitatory or inhibitory) which have narrow and symmetrical receptive fields; second, neurons that are tuned to larger disparities, either crossed (tuned near neurons) or uncrossed (tuned far neurons). These have broader excitatory receptive fields that are asymmetrically wider toward the smaller disparities, and commonly include an inhibitory component around zero disparity. Other stereoscopic cells have reciprocal profiles (near or far neurons, respectively) in the sense that they respond with excitation to crossed or uncrossed disparities and with suppression to disparities of the opposite sign (Poggio et al., 1988).

Furthermore, binocular depth perception is useful in many visual applications such as autonomous robot navigation and grasping tasks. Due to the intensive calculation required to estimate the disparity values, most of the approaches implemented so far process the sequences off-line, rendering them unsuitable for real applications. The use of customized hardware allows us to process stereo-image sequences in real-time. These hardware-based approaches generally use correlation-based models (Brown et al., 2003) because they are quite suitable to hardware architecture. In contrast to feature correspondence and correlation techniques, during the last decade phase-based computational models have been proposed as an interesting alternative (Fleet and Jepson, 1993; Fleet et al., 1996), mainly because they are based on local operations and produce dense depth maps with direct sub-pixel resolution. Several real-time approaches based on this technique have recently been proposed by Porr et al. (2002) and Darabiha et al. (2003).

In this paper we describe how to deal with the properties of bio-inspired systems to be designed as embedded systems for real-world applications. We describe an embedded stereo processing system based on an FPGA device known as a system-on-a-chip (SoC), which computes a modified phase-based technique originally described by Solari et al. (2001). This model avoids the explicit computation of the single local phases of Gabor-filtered binocular images, making the approach hardware friendly and thus allowing our design to outperform previous approaches. The system includes all the hardware controllers necessary for a two-camera frame-grabber, external memory management units, VGA visualization output generation, user control interface for system configuration, etc. This allows us to use it as a smart embedded sensor that works as a system-on-a-chip, providing low level vision disparity information.

2. From biological models to real-time hardware systems

Engineering processing architectures designed for tasks that biological systems solve with impressive ease can benefit considerably by mimicking computing strategies developed by nature over long periods of evolution. But the adaptation of such techniques is not straightforward, since the physical principles upon which biological tissues are based are very different from those characteristically used in electronic technology. Furthermore, biological and electrical technologies face different restrictions which are overcome by resorting to different strategies.

Nevertheless, an opportunistic attitude which takes the key-functional principles that contribute to the outstanding performance of biological systems and also uses technology-motivated computing techniques to adapt those computing primitives must be of considerable interest. This opportunistic approach should on its own merits provide a suitable solution to the individual task in question, whilst also helping to identify and characterize the functional principles that support the high performance observed in biological systems. For example, biological systems widely use massive parallel processing to overcome the slow chemical-based principles that support most of the computing and transmission principles of neurons. On the other hand, whereas electrical technology allows faster devices (more than three orders of magnitude), the connectivity allowed by current technology is restricted to 2D patterns and so this massive parallelism becomes impossible to adopt in electronic devices.

To be able to adopt biologically inspired processing schemes we use a time-slicing technique and we have developed a very fast computing unit that abstracts the functional principles upon which the emulated scheme is based. In this way, we can process in stereo the disparity between two images several times (with different shifts and spatial scales) and thus obtain multiple disparity estimations which in a biological system would have been extracted by different populations of neurons. We then integrate all these estimations constructively to achieve the best performance.

We illustrate here one example of such an approach. We have developed a very fast disparity estimation system that is able to obtain multiple disparity estimations (up to eight) at a conventional camera frame rate and VGA resolution. This allows the exploration of integration schemes in the framework of real-time processing tasks. In Section 6 we call neural population coding the set of estimations obtained on multiple scales and with multiple shift profiles. It is documented that the performance of biological systems is based upon multiple estimations (Fleet et al., 1996) and an efficient selection mechanism that integrates complementary information from different sources.

Conventionally, parallel processing of different circuits is limited due to the limited transmission bandwidth. Especially significant are the constraints deriving from the external memory access, which is usually one of the important bottlenecks for FPGA processing capability, but due to the on-chip system management of external and internal memory, and since the described architecture consists of one single processing unit, with the whole system implemented on the same device (as a SoC), the access control is carefully designed and this bandwidth limitation is overcome. Furthermore, the proposed scheme is scalable; since there are plenty of available computing resources on the same chip, two or more processing units can be used, if further parallelism is needed, to extract more estimations or increase the spatial resolution.
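The time-slicing argument can be made concrete with a little arithmetic. The sketch below is only an illustration: it takes the 65 megapixels/s throughput quoted in the abstract, assumes an ideal unit that emits one pixel per clock with no blanking or memory stalls, and asks what frame rate remains when the same unit is reused for several passes over each image pair. The helper name `frames_per_second` is hypothetical and does not come from the paper.

```python
def frames_per_second(pixel_rate_hz, width, height, passes=1):
    """Frame rate of one processing unit that is time-sliced `passes` times per frame.

    Assumes an ideal pipeline: one pixel per clock, no blanking or memory stalls.
    """
    pixels_per_frame = width * height * passes
    return pixel_rate_hz / pixels_per_frame


PIXEL_RATE_HZ = 65_000_000  # 65 megapixels/s, as quoted in the abstract

single_pass = frames_per_second(PIXEL_RATE_HZ, 640, 480)
eight_passes = frames_per_second(PIXEL_RATE_HZ, 640, 480, passes=8)

print(f"1 pass per frame:   {single_pass:.1f} fps")
print(f"8 passes per frame: {eight_passes:.1f} fps")
```

Under these assumptions a single pass at 640 × 480 gives roughly 211 fps, and eight passes per image pair still leave a rate close to that of a conventional camera, which is the headroom the multiple-estimation scheme relies on.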

3. Hardware-friendly phase-based stereo

In our approach, we will use only tuned-excitatory neurons. The results of Jones and Palmer's experiments (Jones and Palmer, 1987) suggest modelling the shapes of the receptive fields (RFs) by two-dimensional Gabor filters with variable spatial phase. In particular, experiments carried out by Pollen and Ronner (1981) suggest that most of the simple cells can be combined in pairs, one cell of each pair with even symmetry and the other one with odd symmetry. This can be modelled by a cosine function and a sine function, corresponding to the real and imaginary parts of a complex-valued Gabor filter, respectively. Among various computational vision models that make use of Gabor functions such as localized spatial filters or as basis functions for image transformations (Daugman, 1985; Porat and Zeevi, 1988; Fogel and Sagi, 1989; Chang and Chatterjee, 1993), phase-based approaches for stereo vision have been widely studied recently (Sanger, 1988; Fleet et al., 1991). In these models disparity is computed as the one-dimensional shift along the epipolar lines necessary for aligning the phase values of the bandpass filtered versions of the binocular stereo signal. An illustrative scheme is shown in Fig. 1.

Fig. 1. Phase-based disparity estimation using neurons with receptive fields as quadrature Gabor filters.

Formally, the left and right observed intensities from the two eyes, $I^L(x)$ and $I^R(x)$, respectively, are related as

$$I^L(x) = I^R[x + \delta(x)] \tag{1}$$

where $\delta(x)$ is the (horizontal) binocular disparity.

Disparity can be estimated in terms of phase differences in the spectral components of the stereo-image pair (Fleet and Jepson, 1993; Fleet et al., 1996). Since the two images are locally related by a shift, in the neighbourhood of each image point the local spectral components of $I^L(x)$ and $I^R(x)$ are related by a phase difference equal to $\Delta\phi(k) = \phi^L(k) - \phi^R(k) = k\delta$, where $\phi$ is the image local phase at this position and $k$ is the spatial frequency.

Spatially localized phase measures can be obtained by filtering operations with complex-valued quadrature pair bandpass kernels (e.g. Gabor filters), approximating a local Fourier analysis on the retinal images (see Solari et al., 2001; Fleet et al., 1991, 1996; Fleet and Jepson, 1993). Considering a complex Gabor filter with a peak frequency $k_0$ and a spatial extension $\sigma$:

$$h(x; k_0) = \exp\left(-\frac{x^2}{\sigma^2} + j k_0 x\right) = h_C(x; k_0) + j\,h_S(x; k_0) \tag{2}$$

the resulting convolutions with the left and right binocular signals can be expressed as

$$Q(x) = \int I(\xi)\, h(x - \xi; k_0)\, d\xi = C(x) + jS(x) = \rho(x)\, e^{j\phi(x)} \tag{3}$$

where $\rho(x)$ and $\phi(x)$ denote their amplitude and phase components, and $C(x)$ and $S(x)$ are the responses of the quadrature filter pair. Local phase measurements are stable and show a quasi-linear behaviour over relatively large spatial extents, except around singular points where the amplitude of $Q(x)$ vanishes and the phase becomes unreliable. This property of the phase signal yields good predictions of binocular disparity by

$$\delta(x) = \frac{[\phi^L(x) - \phi^R(x)]_{2\pi}}{k(x)} = \frac{[\Delta\phi(x)]_{2\pi}}{k(x)} \tag{4}$$

where $[\cdot]_{2\pi}$ denotes the principal part of its argument, i.e. $[\cdot]_{2\pi} \in (-\pi, \pi]$, and $k(x)$ is the average instantaneous frequency of the bandpass signal, measured using the phase derivative from the left and right filter outputs (the $x$ subscript indicates differentiation along the x-axis):

$$k(x) = \frac{\phi_x^L(x) + \phi_x^R(x)}{2} \tag{5}$$

As a consequence of the linear phase model, the instantaneous frequency is generally constant and close to the tuning frequency of the filter ($\phi_x \approx k_0$), except near singularities where abrupt frequency changes occur as a function of spatial position. Therefore, a disparity estimation at a point $x$ is accepted only if $|\phi_x - k_0| < k_0 \mu$, where $\mu$ is a proper reliability threshold.

It should be noted that Eq. (4) does not require the explicit calculation of the left and right phases. Therefore, following the approach proposed by Solari et al. (2001), we can compute directly the phase difference in the complex plane using the following identities:

$$\begin{aligned}
[\Delta\phi(x)]_{2\pi} &= [\arg(Q^L Q^{R*})]_{2\pi} \\
&= \operatorname{arctan2}\big(\operatorname{Im}(Q^L Q^{R*}),\ \operatorname{Re}(Q^L Q^{R*})\big) \\
&= \operatorname{arctan2}\big(C^R S^L - C^L S^R,\ C^L C^R + S^L S^R\big)
\end{aligned} \tag{6}$$

where $Q^*$ denotes the complex conjugate of $Q$.

This formulation is computationally simple because it is composed primarily of algebraic combinations of the filter outputs. Moreover, it embeds the calculation of the principal part of phase differences, without explicit manipulations of the two phases of the left and right images. In this way, it takes into account the periodicity of the phase without incurring wrapping effects on the resulting depth map. Furthermore, following Fleet et al. (1991), for the expression of the average spatial frequency (5), to eliminate the need for an explicit calculation of phases and, consequently, the problems arising from phase unwrapping, we use the following identities:

$$\phi_x = \frac{\operatorname{Im}[Q^* Q_x]}{\rho^2} = \frac{S_x C - S C_x}{C^2 + S^2} \tag{7}$$

where $Q_x$, $C_x$, and $S_x$ are the spatial derivatives of $Q$, $C$, $S$.

This approach has several advantages which make the system hardware-friendly. Although Eq. (6) increases the number of multiplications, current FPGA devices include embedded multipliers making this technology of specific interest for vision tasks. In fact, the main advantage provided by this approach is to avoid the explicit logic required for a wrap-around mechanism. This implies a considerable reduction of comparison logic. Furthermore, the division operation is reduced by 50%. This corresponds to a real benefit because division in fixed-point arithmetic requires high precision. Although from a computational point of view there is no difference between computing disparity from differences of the phase on the monocular images or from a direct measure of the binocular phase difference (without explicit computation of monocular phases), quantization errors make the former approach noisier, and it also requires more hardware resources. We evaluated both methods using random-dot stereograms and fixed-point data of 32 bits, and found that the direct phase-difference computation yields higher performance when the available operation precision is limited.

To address the hardware implementation of this approach the basic steps can be summarized as follows:

1. dc component image removal using the local contrast $I - I_{mean}$ operator in a 9 × 9 pixel window.
2. Even and odd Gabor 17 taps filtering of left and right images.
3. Direct phase difference calculation using Eq. (6).
4. Disparity computation using Eq. (4) assuming $k(x) \approx k_0$.

The dc component image removal is particularly relevant because (in a first approximation) the retina produces a neural image of local contrast (Shapley and Enroth-Cugell, 1984).

4. Hardware system implementation

The implementation of the previous simplified phase-based model (Solari et al., 2001) requires being consistent with the discussion in Section 2. Large neural populations are not suitable for implementation in hardware because the available hardware resources are limited. We have designed a processing unit using fine-grain parallelism resources based on highly pipelined structures and short processing times. We describe the implementation of a SoC for real-time stereo computation which can be used in embedded systems. The device is a general purpose system for image stereo computation where the technology is based on re-configurable hardware (FPGA).

The choice of a phase-based stereo approach is also justified because of its robustness to illumination changes. As commented in (Cozzi et al., 1997), the contrast test shows that this approach is not very sensitive to differences in such magnitude. The approach seems to be rather robust to unbalanced images as well (usual in real cameras which have different luminance gain).
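Steps 2 to 4 of the summary in Section 3 (quadrature Gabor filtering, the direct phase difference of Eq. (6), and the disparity of Eq. (4) with k(x) ≈ k0) can be sketched in software as follows. This is a minimal floating-point illustration on a one-dimensional scanline, not the fixed-point FPGA data path. The tuning frequency `k0 = 2π/8`, the envelope width `sigma = 4.0` and the function names are assumptions chosen for the example; only the 17-tap kernel length comes from the text. The dc removal of step 1 is skipped because the synthetic test signal is already zero-mean.

```python
import math


def gabor_pair(taps, k0, sigma):
    """Even (cosine) and odd (sine) kernels of h(x) = exp(-x^2/sigma^2 + j*k0*x)."""
    half = taps // 2
    xs = range(-half, half + 1)
    envelope = [math.exp(-(x * x) / (sigma * sigma)) for x in xs]
    even = [e * math.cos(k0 * x) for e, x in zip(envelope, xs)]
    odd = [e * math.sin(k0 * x) for e, x in zip(envelope, xs)]
    return even, odd


def filter_at(signal, kernel, x):
    """Convolution response at position x: sum over u of signal[x - u] * kernel(u)."""
    half = len(kernel) // 2
    return sum(signal[x - u] * kernel[u + half] for u in range(-half, half + 1))


def disparity_at(left, right, x, taps=17, k0=2 * math.pi / 8, sigma=4.0):
    """Disparity from the direct phase difference, with no monocular phases computed."""
    even, odd = gabor_pair(taps, k0, sigma)
    c_l, s_l = filter_at(left, even, x), filter_at(left, odd, x)
    c_r, s_r = filter_at(right, even, x), filter_at(right, odd, x)
    # Eq. (6): principal value of the phase difference from filter outputs only.
    dphi = math.atan2(c_r * s_l - c_l * s_r, c_l * c_r + s_l * s_r)
    # Eq. (4) with the instantaneous frequency approximated by k0.
    return dphi / k0


# Synthetic scanlines obeying Eq. (1): I_L(x) = I_R(x + shift).
TRUE_SHIFT = 1.5
K0 = 2 * math.pi / 8
right = [math.cos(K0 * x) for x in range(64)]
left = [math.cos(K0 * (x + TRUE_SHIFT)) for x in range(64)]

estimate = disparity_at(left, right, x=32)
print(f"estimated disparity: {estimate:.3f} px (true shift {TRUE_SHIFT} px)")
```

Because `atan2` already returns the principal value, no separate wrap-around logic is needed, which is the property the text highlights as hardware-friendly. The estimate is only meaningful while the true phase difference stays inside (−π, π], which is why the basic model is restricted to small disparities.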
In Fig. 2 we show the algorithm outputs for a couple of standard image pairs. We compare the software and hardware results of the raw model (just one spatial scale and without neuron shifting) and we also show the results from the multiple estimation model described in Section 6.

Fig. 2. Software vs. hardware implementation. (a) Original images, (b) software stereo processing, (c) hardware stereo processing, (d) results using the multiple estimation-based model described in Section 6. The disparity is encoded in grey levels; light pixels indicate short distances. Note that small differences between the software (b) and the hardware model (c) are visible as salt and pepper noise present in the hardware-produced images due to the limited precision available in the hardware implementation.

The previous outputs (Fig. 2b and c) represent the raw data extracted from the stereo sensor encoded using a disparity-to-grey levels map. The system set-up requires image rectification and camera calibration (which is a critical stage). The present implementation only includes a simple pre-processing method based on image displacements that runs in a previous system configuration. An improved calibration pre-processing step can be implemented using an embedded calibration module to achieve better stereo-image rectification.

The hardware system architecture according to the model described in Section 3 is shown in Fig. 3. The confidence measure used in the system is the neuron energy (modulus of the Gabor filter outputs) because phase is not well defined near modulus singularities. The system is configured by five stages in the coarse-grain pipeline (Fig. 3). All the processing stages are designed with micro-pipeline data-paths. Therefore, the total latency of the system is about 115 clock cycles. Nevertheless, the data throughput is one estimation per clock cycle. The system has been implemented in a stand-alone board as a prototype for embedded applications, the RC300 board from Celoxica (see http://www.celoxica.com). All the processing operations are fully computed in the FPGA device (as a SoC).

Fig. 3. Stereo hardware architecture. The figure shows the main processing units designed for the stereo vision system. Each sub-unit has been developed to process the data using a fine-grain pipeline structure. The efficient use of the intrinsic parallelism and segmentation capabilities available in the FPGAs allows the computation of one estimation per clock cycle. We have implemented a customized pipeline processing structure with parallel computing blocks in different stages for computing left and right image primitives at the same time. The micro-pipeline module computes the phase difference using a LUT for the arctan function.

5. System performance and requirements

The system frequency is 65 MHz and produces one pixel per clock cycle, meaning that we can compute up to 65 megapixels/s (for instance corresponding to 211 fps of 640 × 480 pixels per image, or 52 fps of 1280 × 960 pixels of resolution). The system quality depends on image resolution and disparity range. The present implementation runs well for small disparities (typically values under 4 pixels for 15 taps Gabor filters). The first stage of camera calibration reduces the global image displacement and improves the local disparity range. Compared with similar recent real-time implementations, Porr et al. (2002), which process at video-frame rate, and Darabiha et al. (2003), which process 256 × 360 pixels per image at up to 30 fps, our system outperforms these approaches.

Table 1 shows the required resources for the whole system. Note that in the convolutional stages the processing has been done with fixed point data representation of nine bits. The arctan function has been implemented using a look-up-table of 1024 addresses of 10 bits with 5 fractional bits and some logic to decide the sign. As shown in Fig. 2, the hardware results are similar to the software ones implemented with double data precision, and after doing several trials we consider these bit widths as good trade-offs between the system accuracy and hardware resource requirements.

Table 1
System resources required on a Virtex-II XC2V6000-4

Slices (%)    EMBs (%)   Embedded multipliers (%)   Mpps   Gabor spatial scale (filter taps)   Image resolution   fps
6411 (18%)    15 (10%)   21 (14%)                   65     15                                  640 × 480          211
                                                                                               1280 × 960         52
9197 (27%)    39 (27%)   31 (21%)                   65     31                                  640 × 480          211
                                                                                               1280 × 960         52
13048 (38%)   71 (49%)   59 (49%)                   65     55                                  640 × 480          211
                                                                                               1280 × 960         52

EMBs stands for embedded memory blocks.

Each design is characterized by the megapixels per second and is completely modular. Therefore, we can choose different resolution versus frames per second trade-offs.

The FPGA re-configurability also allows different image scales computation. Provided that stereo techniques work better for small disparities, we have designed three different scales, with Gabor filters of 15, 31 and 55 taps. In this way, depending on the image structure, our FPGA can be re-configured for different scales to estimate the range of disparities that better match the image structure. Table 1 also shows the hardware resources required for these larger spatial scales (Gabor filters of 31 and 55 taps), which enlarge the range of available disparities computable by the system but reduce their resolution. Note that the system demand grows for each scale but the computing speed in terms of fps remains constant. In future research we plan to design a multi-resolution system plus scale integration unit to compute at each pixel the scale which best fits the image properties at this position.

6. Improvements to the basic model: multiple estimation-based scheme

The main limitation of the previous system is the limited range of disparities available due to the linear approximation of the phase model. Theoretically this is $\lambda/2$ ($\lambda = 2\pi/k_0$ being the period of the tuning frequency of the Gabor filter) but experimentally it is about $\lambda/3$ (for details see Cozzi et al., 1997). Usually the solution found in the literature consists of a coarse-to-fine approach, using confidence values from coarse scales to warp the image at fine scales. The problem of such an approach is that wrong estimations propagate from coarse to fine scales. Furthermore, there is no biological evidence of such kinds of architecture in the brain (Mallot et al., 1996).

Contrary to this approach, a parallel processing of spatial scales with a fusion integration stage is more biologically plausible. In a similar way to Fleet (1994) the scales are processed in parallel and integrated using a similarity measure. Shift neurons could also be added (Fleet, 1994; Fleet et al., 1996; Porr et al., 2002) to improve the disparity range using neurons with overlapping disparity tunings. Contrary to Fleet's approach (Fleet, 1994), which uses Gabor filter correlation and sub-pixel estimations by linear interpolation, our scheme uses sum of absolute differences (SAD) over the energy of the shifted cells (which is more hardware-friendly because it avoids square roots and division operations). At this stage, the cell with the lowest response encodes the winner shift value which achieves the best disparity tuning. Phase difference (instead of linear interpolation methods) is used to obtain sub-pixel disparity values. The shift offset obtained with SAD is combined with the value obtained from the basic model, providing the improved sub-pixel disparity estimation.

Qualitative results for this model are shown in Fig. 2d. Note that the disparity range and resolution are improved, obtaining smooth variation and disparity details. The approach has also been evaluated numerically with the sequences used in (Scharstein and Szeliski, 2002, 2003), for which we know the ground-truth. The accuracy using the RMS (root-mean-squared) error (measured in disparity units) between the computed disparity map and the ground-truth map is summarized in Table 2. The parameters used are the following: 9 shifted neurons (with a distance of 5 pixels between them) with $\lambda = 14$ to cover a wide disparity range (from −24 to 24 pixels) with overlapping.

Table 2
Evaluation of the multiple estimation approach using sequences provided by Scharstein and Szeliski (2002, 2003)

            RMS error
Sawtooth    1.98
Tsukuba     1.53
Venus       1.55

The processing speed of the system using a customized frame-grabber allows us to test several population types and fusion methods in real-time. For example, we can process each image pair eight times, using three spatial scales and a shifted distribution of five neurons with overlapping disparity tuning to increase the available range of disparities, obtaining an equivalent circuit running up to 26 fps of image sizes of 640 × 480 pixels using approximately the same system resources (memory resources demand is increased). Shifting a neuron just implies offset values in the frame-grabber of one of the cameras, and the different scales imply just changing the Gabor filter coefficients. Therefore, we use the same primitives described in Section 3. Furthermore, the outstanding processing speed achieved by our approach allows us to use the same circuits to process the images repetitively (with different shifts and filter scales), storing the results to be integrated in a simple winner-takes-all stage. In this fusion module we just take at each pixel the disparity value (among candidates) with the highest confidence value.

7. Conclusions

The adopted stereo computation technique is efficient and hardware-friendly. It provides sub-pixel resolution and the disparity range can be adapted to the image structure. Furthermore, it allows, in a straightforward manner, a multi-scale and multi-shift approach as an immediate improvement. The hardware is very powerful (65 megapixels/s that can be arranged as 211 fps of 640 × 480 pixels per image). This outstanding performance with a customized frame-grabber allows the system to be used as a platform for studying different models of neural population coding and integration mechanisms (which take full advantage of multiple disparity estimations) in real-time tasks.

We present a way of implementing a biological model onto programmable hardware which runs on a stand-alone chip for embedded applications. The efficient exploitation of the computing resources available on FPGA devices leads to an outstanding processing speed. A customized pipeline processing structure, including some well-balanced parallel processing modules, efficiently performs phase-based stereo estimations (about one million gates on the Virtex-II FPGA are required for the 15 taps Gabor filter system). The accuracy of the system depends on the bit-width adopted at the different computing stages; this is quantified using benchmark images. Some illustrative and promising results are shown in Fig. 2.
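The winner-takes-all fusion stage of Section 6 lends itself to a very small software sketch. The version below assumes that each pass (one combination of shift and filter scale) has already produced a disparity map and a confidence map of the same size, with the confidence standing in for the neuron energy mentioned in Section 4. The function name, the list-of-rows data layout and the numeric values are illustrative assumptions, not taken from the hardware design or from measured data.

```python
def winner_takes_all(candidates):
    """Fuse several disparity estimates per pixel.

    `candidates` is a list of (disparity_map, confidence_map) pairs, each map
    being a list of rows. At every pixel the disparity whose confidence is
    highest is kept.
    """
    first_disparity, _ = candidates[0]
    height, width = len(first_disparity), len(first_disparity[0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_disparity, _ = max(
                ((disp[y][x], conf[y][x]) for disp, conf in candidates),
                key=lambda pair: pair[1],
            )
            fused[y][x] = best_disparity
    return fused


# Two made-up 2 x 2 passes, e.g. a fine scale and a coarse scale.
fine = ([[1.0, 2.0], [0.5, 0.0]], [[0.9, 0.1], [0.4, 0.8]])
coarse = ([[4.0, 6.0], [3.0, 5.0]], [[0.2, 0.7], [0.6, 0.3]])

fused = winner_takes_all([fine, coarse])
print(fused)
```

Since each pixel is decided independently with a single comparison per candidate, the same logic maps naturally onto a streaming comparator that updates a stored best value as each pass completes, which is consistent with the text's remark that only the memory demand grows.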
Acknowledgments

This work has been supported by the EU grant DRIVSCO (IST-016276-2) and the National Spanish Grant DEPROVI (DPI2004-07032).

References

Barlow, H.B., Blakemore, C., Pettigrew, J.D., 1967. The neural mechanism of binocular depth discrimination. J. Physiol. 193, 327–342.
Brown, M.Z., Burschka, D., Hager, G.D., 2003. Advances in computational stereo. IEEE Trans. Pattern Anal. Mach. Intell. 25 (8), 993–1008.
Chang, C., Chatterjee, S., 1993. Ranging through Gabor logons - a consistent, hierarchical approach. IEEE Trans. Neural Netw. 4, 827–843.
Cozzi, A., Crespi, B., Valentinotti, F., Wörgötter, F., 1997. Performance of phase-based algorithms for disparity estimation. Mach. Vision Appl. 9 (5/6), 334–340.
Darabiha, A., Rose, J., MacLean, W.J., 2003. Video-rate Stereo Depth Measurement on Programmable Hardware (CVPR '03), vol. I, Madison, WI, June.
Daugman, J.G., 1985. Uncertainty relation for resolution in space, spatial frequency, and orientation optimised by two-dimensional visual cortical filters. J. Opt. Soc. Am. A 2, 1160–1169.
DeAngelis, G.C., Cumming, B.G., Newsome, W.T., 1998. Cortical area MT and the perception of stereoscopic depth. Nature 394, 677–680.
DeAngelis, G.C., Ohzawa, I., Freeman, R.D., 1991. Depth is encoded in the visual cortex by a specialized receptive field structure. Nature 352 (6331), 156–159.
Fleet, D.J., 1994. Disparity from local weighted phase-correlation. IEEE Int. Conf. Syst. Man Cybern. 1, 48–54.
Fleet, D.J., Jepson, A.D., 1993. Stability of phase information. IEEE Trans. Pattern Anal. Mach. Intell. 15, 1253–1268.
Fleet, D.J., Jepson, A.D., Jenkin, M.R.M., 1991. Phase-based disparity measurement. CVGIP: Image Understand. 53 (2), 198–210.
Fleet, D.J., Wagner, H., Heeger, D.J., 1996. Neural encoding of binocular disparity: energy models, position shifts and phase shifts. Vision Res. 36 (12), 1839–1857.
Fogel, I., Sagi, D., 1989. Gabor filters as texture discriminator. Biol. Cybern. 61, 103–113.
Hubel, D.H., Wiesel, T.N., 1962. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol. 160, 106–154.
Jones, J.P., Palmer, L.A., 1987. An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. J. Neurophysiol. 58 (6), 1233–1258.
Mallot, H.A., Gillner, S., Arndt, P.A., 1996. Is correspondence search in human stereo vision a coarse-to-fine process? Biol. Cybern. 74 (2), 95–106.
Poggio, G.F., Gonzalez, F., Krause, F., 1988. Stereoscopic mechanisms in monkey visual cortex: binocular correlation and disparity selectivity. J. Neurosci. 8, 4531–4550.
Pollen, D.A., Ronner, S.F., 1981. Phase relationship between adjacent simple cells in the visual cortex. Science 212, 1409–1411.
Porat, M., Zeevi, Y.Y., 1988. The generalized Gabor scheme of image representation in biological and machine vision. IEEE Trans. PAMI 10, 452–467.
Porr, B., Nurenberg, B., Wörgötter, F., 2002. A VLSI-compatible computer vision algorithm for stereoscopic depth analysis in real-time. Int. J. Comput. Vision 49 (1), 39–55.
Sanger, T.D., 1988. Stereo disparity computation using Gabor filters. Biol. Cybern. 59, 405–418.
Scharstein, D., Szeliski, R., 2002. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. IJCV 47 (1–3), 7–42.
Scharstein, D., Szeliski, R., 2003. High-accuracy stereo depth maps using structured light. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), vol. 1, Madison, WI, pp. 195–202.
Shapley, R., Enroth-Cugell, C., 1984. Visual adaptation and retinal gain control. Progr. Retinal Res. 3, 263–346.
Solari, F., Sabatini, S.P., Bisio, G.M., 2001. Fast technique for phase-based disparity estimation with no explicit calculation of phase. Electron. Lett. 37 (23), 1382–1383.
