Object Recognition: Computer Vision CSE399b Spring 2007, Jianbo Shi

Human vision: recognition
Slides taken from Bart Rypma
What & Where Visual Pathways
Established with electrophysiology, lesion,
neuropsychology and neuroimaging data
Monkey Lesion Data
Two types of Delayed
Response Task
Monkeys trained to
criterion on one of
these tasks
Then task was
reversed
After learning, either
temporal or parietal
lobe lesioned

Landmark Discrimination Task
Object Discrimination Task
Effects of Lesion on Landmark
Task
Unoperated monkeys
show no impairment
Temporal-lobe lesion
monkeys show
minimal impairment
Parietal-lobe lesion
monkeys show much
impairment
Effects of Lesion on Object Task
Temporal-lobe lesion
monkeys show much
impairment
Parietal-lobe lesion
monkeys show
minimal impairment
Monkey Lesion Data
Subsequent lesion work supports the what-
where distinction
Object discrimination: Ventral lesion
deficits restricted to visual modality
Posterior/Anterior Ventral Lobe distinction:
Posterior: Visual discrimination
Anterior: Visual memory
The What-Where Distinction:
Human Neuroimaging
Data indicate evidence for what-where distinction
Object task:
Same objects?

Spatial Task
Same locations?
Human Neuropsychological Data
Agnosia
Term coined by Sigmund Freud
From the Greek word for lack of
knowledge
The inability to recognize objects when
using a given sense, even though that sense
is basically intact (Nolte, 1999)

Agnosia
Usually involves damage to the occipito-temporal
pathway
Patient GS
Sensory abilities intact
Language normal
Unable to name objects
Agnosia
Two Types

Apperceptive
Object recognition failure due to perceptual
processing
Associative
Perceptual processing intact but subject cannot
use information to recognize objects
Agnosia
Depends on the availability of the object
representation to consciousness
Apperceptive
Associative
Apperceptive Agnosias (also known as visual space agnosias)
refer to a condition in which a person fails to recognize objects due to a functional
impairment of the occipito-temporal vision areas of the brain. Other elementary
visual functions such as acuity, colour vision, and brightness discrimination are still
intact. Apperceptive agnosics are unable to distinguish visual shapes and so have
trouble recognizing, copying, or discriminating between different visual stimuli. When
patients are able to identify objects, they do so based on inferences using colour,
size, texture and/or reflective cues to piece it together. For example, in the image
below, an apperceptive patient may not be able to distinguish a poker chip from a
scrabble tile despite their clear difference in shape and surface features.
This would be a problem for an apperceptive agnosia patient:
They also have trouble with object constancy under changes of view
Right hemisphere lesions
Associative Agnosias are also known as visual object agnosias.
Although they can present with a variety of symptoms, the main impairment is failure
to recognize visually presented objects despite having intact perception of that object.
A patient with an associative agnosia may be able to replicate a drawing of the object
but still fail to recognize it. Misidentifying an object as one that looks similar
is a common error. Three specific criteria are associated with a diagnosis of associative
agnosia (Farah, 1990):
1) Difficulty recognizing a variety of visually presented objects (e.g., naming them or
grouping them according to their semantic categories).
2) Normal recognition of objects from a verbal description or when using a sense
other than vision, such as touch, smell, or taste.
3) Elementary visual perception intact enough to copy an object, as exemplified in the
original and copied picture below. Overall, this loss can be thought of as "recognition
without meaning".
Prosopagnosia
Specific inability to recognize faces
Are faces and other objects in the world
represented in fundamentally different ways
in memory?
Does face-memory depend on
fundamentally different brain systems?

Are Faces Special?
Subjects presented with a face and asked to represent
a face-part
Subjects presented with a house and asked to
represent a house-part
Are Faces Special?
Houses represented in parts
Faces represented as wholes
Are Faces Special?
Objects represented in parts and holistically
Faces represented holistically

Object Recognition,
Computer Vision
Three distinct approaches:
1) Alignment, prototype
2) Part-based classification
3) Invariance (geometrical & photometrical), hashing
Hypothesis-Test:
Alignment Method
Recognition by Hypothesize and
Test
General idea
Hypothesize object identity and pose
Render object in camera
Compare to image
Issues
Where do the hypotheses come from?
How do we compare to image (verification)?
Step 1: correspondence
What are the features?
They have to project like points
Lines
Conics
Other fitted curves
Regions (particularly the center of a region,
etc.)
Step 2: Shape deformation and
matching
Pose consistency
Strategy:
Generate hypotheses using small numbers of
correspondences (e.g., triples of points for a
calibrated perspective camera)
Backproject and verify

Appropriate groups are frame groups
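The pose-consistency strategy can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the method of the cited paper: features are 2D points and the pose is a 2D similarity (rotation, scale, translation), so two correspondences fix the pose instead of the three needed for a calibrated perspective camera. Points are stored as complex numbers, and the names `similarity_from_pairs`, `count_inliers`, `align` and the tolerance `tol` are hypothetical.

```python
from itertools import combinations, permutations

def similarity_from_pairs(p0, p1, q0, q1):
    """Similarity z -> a*z + b that maps p0 to q0 and p1 to q1."""
    a = (q1 - q0) / (p1 - p0)
    b = q0 - a * p0
    return a, b

def count_inliers(model, image, a, b, tol):
    """Backproject every model point; count those landing near an image point.

    Simplification: several model points may match the same image point.
    """
    return sum(1 for m in model
               if min(abs(a * m + b - s) for s in image) <= tol)

def align(model, image, tol=0.1):
    """Hypothesize a pose from each pair of correspondences, then verify."""
    best_score, best_pose = 0, None
    for i, j in combinations(range(len(model)), 2):
        for k, l in permutations(range(len(image)), 2):
            a, b = similarity_from_pairs(model[i], model[j],
                                         image[k], image[l])
            score = count_inliers(model, image, a, b, tol)
            if score > best_score:
                best_score, best_pose = score, (a, b)
    return best_score, best_pose

# Toy data (assumed): a 4-point model, rotated 90 degrees, scaled by 2,
# translated, and embedded among two clutter points.
model = [0j, 2 + 0j, 2 + 1j, 1 + 3j]
true_a, true_b = 2j, 5 + 1j
image = [40 + 40j] + [true_a * m + true_b for m in model] + [-30 + 25j]

score, (a, b) = align(model, image)
print(score, a, b)
```

The cost is quadratic in both the number of model points and the number of image points, which is why the slide asks where good hypotheses come from: fewer, better candidate correspondences mean fewer poses to verify.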
Figure from Object recognition using alignment, D.P. Huttenlocher and S.
Ullman, Proc. Int. Conf. Computer Vision, 1986, copyright IEEE, 1986
Models
Body Recognition
G. Mori, X. Ren, A. Efros, and J. Malik, Recovering Human Body Configurations: Combining
Segmentation and Recognition, IEEE Computer Vision and Pattern Recognition, 2004.
Problem with the alignment algorithm:
Example 1: Viewpoint variations, many examples are needed
T. Sebastian
Example 2: Partial occlusion
T. Sebastian
Part-based Object Recognition
Binford 78
Shocks (or medial axis or skeleton) are the locus of centers
of maximal circles that are bitangent to the shape boundary
Shape boundary
Shocks
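A tiny worked case may make the definition concrete. The sketch below assumes the simplest possible shape, an axis-aligned rectangle sampled on an integer grid, where a point lies on the medial axis exactly when its nearest side is not unique (the maximal inscribed circle touches at least two sides). The function name `medial_axis_rectangle` is hypothetical, and this is not the shock-detection algorithm used for general shapes.

```python
def medial_axis_rectangle(width, height):
    """Grid points inside a width x height rectangle whose maximal
    inscribed circle touches at least two sides.

    Returns (x, y, r) triples, where r is the radius of that circle.
    """
    axis = []
    for x in range(1, width):
        for y in range(1, height):
            # Distances to the left, right, bottom and top sides.
            dists = [x, width - x, y, height - y]
            r = min(dists)
            # Bitangent (or better): the minimum is reached at least twice.
            if dists.count(r) >= 2:
                axis.append((x, y, r))
    return axis

skeleton = medial_axis_rectangle(8, 4)
for x, y, r in skeleton:
    print(x, y, r)
```

For the 8 x 4 rectangle the result is a horizontal segment along the middle plus four short diagonal branches running into the corners, the familiar skeleton of a rectangle.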
Computing part-decomposition
T. Sebastian
Complexity-increasing shape deformation paths are not optimal
Represent a deformation path by a pair of simplifying deformation
paths from A, B to a simpler shape C
T. Sebastian
A shock graph edit operation transforms a shape into an adjacent
transition shape
T. Sebastian
Edit-distance is defined as the sum of the costs of the edits in the
optimal edit sequence
T. Sebastian
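The idea of "sum of edit costs along the optimal edit sequence" can be illustrated with the classic dynamic program for sequences. This is a deliberately swapped-in, simpler technique: it compares flat sequences of part labels, not shock graphs, and the unit costs and the name `edit_distance` are assumptions made for the example.

```python
def edit_distance(a, b, ins=1.0, dele=1.0, sub=1.0):
    """Minimum total cost of insertions, deletions and substitutions
    that turn sequence a into sequence b."""
    n, m = len(a), len(b)
    # d[i][j] = cost of the cheapest edit sequence from a[:i] to b[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + dele
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            change = 0.0 if a[i - 1] == b[j - 1] else sub
            d[i][j] = min(d[i - 1][j] + dele,        # delete a[i-1]
                          d[i][j - 1] + ins,         # insert b[j-1]
                          d[i - 1][j - 1] + change)  # keep or substitute
    return d[n][m]

# Hypothetical part lists: the second shape is missing one arm.
full = ["torso", "arm", "arm", "leg", "leg"]
occluded = ["torso", "arm", "leg", "leg"]
print(edit_distance(full, occluded))
```

Losing a single part costs one deletion, so the distance stays small even though the shape has changed visibly. The shock-graph version has the same structure, but the edits act on graph nodes and edges, with costs tied to the shape deformation each edit represents.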
Shock graphs represent object parts and part hierarchy
Edit-distance is
robust in the presence
of part-based
changes
T. Sebastian
Invariance + hashing
Figure from Efficient model library access by projectively invariant indexing
functions, by C.A. Rothwell et al., Proc. Computer Vision and Pattern Recognition,
1992, copyright 1992, IEEE
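A minimal sketch of indexing by invariants follows. It is not the projective-invariant scheme of the cited figure: it assumes 2D point features and uses similarity invariants in the style of geometric hashing, where two basis points define a frame and every other point is described by its coordinates in that frame. Points are complex numbers; the names `invariant_keys`, `build_table`, `recognize` and the quantization `step` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

def invariant_keys(points, i, j, step=0.25):
    """Quantized coordinates of all other points in the frame of the
    basis (points[i], points[j]). Unchanged by rotation, scale, translation."""
    p0, p1 = points[i], points[j]
    keys = set()
    for k, p in enumerate(points):
        if k not in (i, j):
            z = (p - p0) / (p1 - p0)
            keys.add((round(z.real / step), round(z.imag / step)))
    return keys

def build_table(models, step=0.25):
    """Offline: store (model, basis) under every invariant key."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for key in invariant_keys(pts, i, j, step):
                table[key].append((name, i, j))
    return table

def recognize(table, image, step=0.25):
    """Online: try each image basis, look up keys, vote for (model, basis).
    Returns the best vote count seen for each model."""
    best = Counter()
    for i, j in permutations(range(len(image)), 2):
        votes = Counter()
        for key in invariant_keys(image, i, j, step):
            for entry in table.get(key, ()):
                votes[entry] += 1
        for (name, _, _), n in votes.items():
            best[name] = max(best[name], n)
    return best

# Toy library (assumed) and a rotated, scaled, shifted view of model "A".
models = {
    "A": [0j, 4 + 0j, 4 + 2j, 1 + 3j, 2 + 1j],
    "B": [0j, 3 + 0j, 3 + 3j, 3j],
}
table = build_table(models)
image = [2j * p + (3 - 1j) for p in models["A"]]
best = recognize(table, image)
print(best.most_common())
```

The table is built once per model library, so recognition avoids searching over poses for every model. A practical system would also need to cope with points that fall near bin boundaries and would verify the winning hypothesis against the image.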
Invariant Local Features
Image content is transformed into local feature
coordinates that are invariant to translation, rotation,
scale, and other imaging parameters
SIFT Features
David Lowe
