


Overview: 3D Stereo Vision Camera-sensors-systems, Advancements, and Technologies

3D Computer Stereo Vision
Research & Development and Innovation

Lakis Christodoulou

Electrical & Computer Engineering & Computer Science (E&CE&CS)

Cyprus University of Technology

FINAL

© 2013 Lakis Christodoulou. All rights reserved. 30 October 2013



Table of Contents

3D Stereo Vision Camera-sensors, Advancements and Technologies


List of Figures
Executive Summary
1. Introduction
1.1. Project Description
1.2. Project Scope and Objectives
1.3. Importance
1.4. Project Team
2. Beyond the State-of-the-Art
2.1. Executive Summary
2.2. Description
3. 3D Stereo Image-Video Processing and Analysis
4. Results
5. Conclusions and Recommendations
References


List of Figures
Figure 1. HARV ISR Stereo Vision Camera
Figure 2. HARV 3D Vision System
Figure 3. Capella - Stereo vision camera reference design
Figure 4. e-CAM_9V024_STEREO - Stereo Vision Camera Board Features
Figure 5. e-CAM_9V024_STEREO Camera Board
Figure 6. Point Grey - Bumblebee2 Stereo Vision Camera
Figure 7. Point-Grey - Stereo Vision Conversion from 2D to 3D
Figure 8. Sharp 3D Mobile Stereo Vision module
Figure 9. HTC EVO 3D
Figure 10. VGA Camera MT9V022 Module
Figure 11. Stereo Camera Set-up from Xilinx Spartan-3A DSP Video Starter Kit (VSK)
Figure 12. Active Stereo Vision System
Figure 13. USB OEM Stereo Vision Camera module
Figure 14. Quantum Stereo Vision Module
Figure 15. Red Rover's Stereo Vision Camera
Figure 16. An example 3D anaglyph made from NASA Mars rover navigation images
Figure 17. MobileRanger™ C3D Stereovision System
Figure 18. MobileRanger™ C3D Stereovision System
Figure 19. PCI nDepth™ 6cm Baseline Stereo Vision Camera
Figure 20. Videre Stereo and Multi-view Vision Camera Systems
Figure 21. US Army Stereo Vision Camera System
Figure 22. Microsoft Kinect 3D Camera
Figure 23. Microsoft Kinect Sensor Block Diagram – SoC Hardware Architecture
Figure 24. Stereo Vision Triangulation
Figure 25. The Microsoft Kinect 3D Camera Sensor-System
Figure 26. PrimeSensor Reference Design Hardware
Figure 27. Primesense Depth Image CMOS
Figure 28. The SAFFiR Autonomous Car of the Future
Figure 29. The 6D-Vision’s Algorithm – Stereo Vision Diagram
Figure 30. Volvo Automotive Stereo Vision Camera System
Figure 31. Intelligent Stereo Goggles Vision
Figure 32. 3DSS Stereo Vision Camera-Scanner


Figure 33. Minoru 3D WebCam Stereo Vision Sensor


Figure 34. Fujifilm 3D Stereo Digital Camera as a Digital Human Vision System
Figure 35. Fujifilm FinePix REAL 3D W1 - 3D Stereo Digital Camera
Figure 36. Sony 3D Stereo Camera
Figure 37. Sony RGB Sampling at Full HD 1920x1080
Figure 38. RGB and Bayer Pattern
Figure 39. RGB Vs. Bayer
Figure 40. HYTEK 3D iVCam2.0
Figure 41. AKC D32 3D Stereo Web-camera
Figure 42. True3Di Stereoscopic 3D Microscopic System
Figure 43. True3Di Stereo Vision Concept
Figure 44. Solid-look 3D HD EndoStereoVision Camera
Figure 45. Surveyor Stereo Vision System (SVS)
Figure 46. CCST 3D Stereo Vision System
Figure 47. HD Panasonic 3D Stereo Camera Vision System
Figure 48. 3D Stereo/Multi-view video H.264/MPEG4 AVC system
Figure 49. HD MAX Camera Sensor for Coastline Surveillance
Figure 50. The pair of color HDMAX cameras used for the NASA-CCST 3D Stereo Video
Imaging system.
Figure 51.a Pt. Hueneme image test target at 3 miles – long shot
Figure 51.b Pt. Hueneme image test target at 3 miles – 5x zoom
Figure 51.c Pt. Hueneme image test target at 3 miles – 10x zoom
Figure 52. General framework of a Stereo or Multi-camera Video Surveillance System
Figure 53. Minoru 3D SD/HD Stereo Web-Camera
Figure 54. Foreground and Background Detection and Segmentation Algorithm
Figure 55. Statistical Adaptive Techniques
Figure 56. Three-Frame Differencing Concept
Figure 57. Hybrid Motion-Segmentation-Detection Algorithm: Abstract Block Diagram
Figure 58. Hybrid Motion Object Detection & Segmentation Algorithm: Analytical Block
Diagram
Figure 59. Stereo Vision Processing – Algorithm System Architecture


Executive Summary
This technical report and book chapter summarizes the state of the art in 3D Stereo Vision
Camera-sensors and systems, covering the latest R&D advancements and technologies in the
field. A research survey of 3D Stereo Vision Camera-sensors and systems is highly important
because our research is based on 3D stereo image and video processing and analysis
algorithms for human surveillance, security, and monitoring. Our work has focused
primarily on algorithms and techniques for motion detection and object tracking in human
behavior monitoring scenes. After investigating the existing 3D Stereo Vision
Camera-sensors and systems in the literature, we proposed and implemented a robust 3D
Stereo Vision system for Multimodal Segmentation, Detection, Recognition, and Tracking,
targeting human scene segmentation and object detection and tracking in video sequences
with complex, moving backgrounds. The study of 3D Stereo Vision Camera-sensors provides
the essential understanding of 3D stereo computer vision and 3D depth information,
increases the performance of 3D stereo segmentation and recognition, and supports the
development of a low-cost 3D Stereo Vision system for Multimodal Segmentation, Detection,
Recognition, and Tracking.

• 3D Stereo Vision Sensors

• 3D Stereo Vision Systems


• 3D Vision-Imaging Advancements and Technologies
Book Chapter:

3D Video: Stereo and Multi-View Video Technology

Book chapter in "Three-Dimensional Television: Capture, Transmission, Display"

Special issue on Advances in 3D Video Processing

Journal of Visual Communication and Image Representation (JVCI)

Journal of 3D Video Processing

Depth Map and 3D Imaging Applications: Algorithms and Technologies presents various 3D
algorithms developed in recent years and investigates the application of 3D methods in various
domains. Containing five sections, this book offers perspectives on 3D imaging algorithms, 3D shape
recovery, stereoscopic vision and autostereoscopic vision, 3D vision for robotic applications, and 3D
imaging applications. This book is an important resource for professionals, scientists, researchers,
academics, and software engineers in image/video processing and computer vision.

http://www.igi-global.com/book/depth-map-imaging-applications/52998


Overview: 3D Stereo Vision Camera-sensors-systems, Advancements, and Technologies

Abstract

The recent advancements in hardware multi-processing and 3D stereo computer vision have
driven 3D video technology toward the design and development of new stereo vision cameras
for a variety of applications. Rapid hardware-electronic prototyping, the acceleration of
field-programmable gate array (FPGA) technology for designing and developing new optimized,
high-performance digital electronic circuitry, and the research advancements in 3D stereo
computer vision and 3D stereo video processing, transmission, viewing, and playback have
opened a huge technological and industrial market for 3D stereo vision camera-sensors and
systems. A variety of range-sensing hardware devices capable of generating point clouds is
now available, including industry-standard stereo camera models such as the Point Grey
Bumblebee2, laser range-finders, the Microsoft Kinect, and even smartphones like the HTC
EVO 3D. Mobile robots use various range sensors to construct representations of their
surrounding environment in order to make safe navigation decisions, to build useful maps,
and to acquire models of the environment. Stereo cameras provide visual information about
the environment. Two overlapping images (i.e., `stereo') allow us to measure the visual
disparity of corresponding features in the scene, from which we can construct a depth map,
or range image, of the visible surfaces in the environment. These surfaces are represented
as a collection of discrete 3D points with respect to the location of the stereo camera,
often referred to as a `point cloud.' This paper presents an overview of the latest
advancements and technologies in the stereo vision cameras developed to date.
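
To make the disparity-to-point-cloud pipeline just described concrete, the following is a minimal sketch (not from the original report) of computing a disparity map from a rectified stereo pair and reprojecting it into a point cloud with OpenCV; the file names and the Q matrix values are illustrative placeholders that would normally come from stereo calibration.

```python
import cv2
import numpy as np

# Load a rectified stereo pair (placeholder file names).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block-matching correspondence: disparity of each pixel, in pixels.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

# Q is the 4x4 disparity-to-depth reprojection matrix normally produced by
# cv2.stereoRectify; the numbers below are illustrative only.
Q = np.float32([[1, 0, 0, -320],   # -cx (principal point x)
                [0, 1, 0, -240],   # -cy (principal point y)
                [0, 0, 0,  600],   # focal length in pixels
                [0, 0, 10,    0]]) # 1 / baseline (baseline = 0.1 m)

# Reproject every valid disparity into a 3D point: the 'point cloud'.
points = cv2.reprojectImageTo3D(disparity, Q)
cloud = points[disparity > 0]      # N x 3 array of (X, Y, Z) coordinates
print(cloud.shape)
```
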
A key strength of stereo vision is that it can be used to locate an object in 3D space. It can
also give valuable information about that object (such as color, texture, and patterns) that
can be used by intelligent machines for classification. A visual system, or light sensor,
retrieves a great deal of information that other sensors cannot. Stereo vision is also a passive
sensor, meaning that it uses the radiation available from its environment. It is non-intrusive,
as it does not need to transmit anything for its readings. An active sensor, in contrast, sends
out some form of energy into the environment, which it then collects for its readings: a
laser sends out light that it then collects, and radar sends out its own form of
electromagnetic energy. A passive sensor is ideal when one wants not to influence the
environment or to avoid detection.
Research Motivation
3D Stereo Vision Camera-sensors offer clear advantages over standard conventional
monoscopic camera sensors. Their main benefit is the additional optical spatial view
provided by side-by-side stereo capture, which yields the disparity factor, 3D depth
information, and better handling of occlusion. The advantages are as follows:
• 3D vision sensors simulate human eyes to provide both color and depth information for
each pixel in the field of view. Depth information allows researchers to develop efficient
algorithms to quickly extract pixels within a defined distance and identify objects of
interest against a noisy visual background.
• A 3D vision accelerator provides a board-level solution for stereo disparity and depth,
relieving the computer CPU of this demanding task.


• 3D Stereo Vision can dramatically increase detection, classification, and tracking
performance.
• 3D vision based on lasers or scanners costs thousands of US dollars, whereas inexpensive
stereo vision camera-sensors can offer a low-cost stereo vision solution.

Keywords: 3D stereo vision, computer vision, depth map, stereo camera

Problem Statement

Estimation Principle

Today, many sensors are used for various purposes. The main principles used for
people/object detection are infrared ray, stereo camera, and time of flight.

Of the several distance-measuring methods available, the sensor described here uses the
stereo camera method. This method uses two images acquired from a right-eye CMOS sensor
and a left-eye CMOS sensor; it compares the difference between the two images and
calculates the distance to the object. Compared to infrared ray sensors, which use only one
or a few distance measurement spots, the stereo camera uses images to measure distance,
which makes it possible to acquire distance information for each pixel. With this capability,
stereo camera sensors can detect not only the existence of an object, but also its size and
movement.

3D Imaging Sensor Module


The 3D distance measurement sensor can be applied to a variety of markets.

Figure 1: Measuring Distance with the Stereo Camera Method


Figure 2: Luminance Image / Distance Information. Left: luminance image. Right: image
colored according to distance data.

Assembly Technology Enabled Both Miniaturization and Highly Precise Distance Measurement

The accuracy of distance detection relies on the baseline length; there is, therefore, a
tradeoff between size and accuracy. Ricoh's sensor module head measures 8.7 x 14.4 x
8.3 mm, approximately half the size of an adult thumbnail, and its accuracy is 3% at 1 m
(±3 cm).

Ricoh's assembly technology is what made this high accuracy and miniaturization possible.
To minimize module size, Ricoh adopted the smallest possible image pick-up devices.
Extraordinary accuracy is required during assembly to minimize the influence of
misregistration.

The adhesive process has a critical impact on the registration accuracy of the module.
Conventional methods typically use adhesive bonds, but Ricoh adopted laser welding (see
Table 1).

                                Laser welding                      Glue
Required time                   30 seconds/piece                   Few minutes/piece
(registration and fixing)
Misregistration                 Few μm                             Approximately 10 μm due to contraction
Dispersion factors              Minimal, due to instant welding    Adhesive amount, hardening time, vibration
Required equipment              Single unit                        Several units

Table 1: Comparison of Laser Welding and Adhesives


By using laser welding, Ricoh was able to avoid the misregistration caused by glue hardening
and by scattering during glue application. Beyond that, instant adhesion significantly
shortened the process time.

Applying laser welding thus enabled Ricoh to achieve both miniaturization and high accuracy.

Applications - Technologies
Being able to solve the stereo matching problem is important in two respects. Firstly, it
sheds light on how human depth perception might work. Secondly, there are
numerous applications in computer vision. For example, automated 3D visualisation of
terrain and cities has recently gained popularity. In this context, stereo-derived 3D models
can be integrated into Google Earth to allow the user to take a walk in a 3D computer
reconstruction of Vienna. In medical imaging, 3D reconstructions of organs created from
multiple 2D (MRI) images can aid in diagnosis. Apart from visualisation, stereo
reconstructions can be applied to robot navigation (autonomously driving cars) and to
assist blind people in navigating their environment. Without being
exhaustive, other applications include 3D tracking (surveillance, pose estimation, human-
computer interaction), depth segmentation (z-keying), industrial applications (quality
assurance) and novel view generation (free viewpoint video), to name just a few of them.
Basically, whenever one needs to infer geometric information from the surrounding world,
stereo vision represents a low-cost and non-intrusive alternative to active devices, such as
range finders.

We have added value to our products by combining various technologies with the stereo
camera distance measurement sensor. For example, we were able to halve the auto-focus
time by using the distance sensor as an adjunct to the contrast-method auto-focus
traditionally used in digital cameras.
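
The report does not describe Ricoh's implementation, but the idea of seeding contrast auto-focus with a stereo distance estimate can be sketched as follows; every function name here (read_distance_m, distance_to_focus_step, set_focus, contrast_score) is a hypothetical stand-in for a real camera driver API.

```python
# Hypothetical sketch of hybrid auto-focus: a stereo distance sensor gives a
# coarse depth estimate, and contrast AF then searches only a narrow window
# of lens positions around it, rather than sweeping the full focus range.
def hybrid_autofocus(read_distance_m, distance_to_focus_step,
                     set_focus, contrast_score, window=5):
    center = distance_to_focus_step(read_distance_m())  # coarse seed
    best_step, best_score = center, float("-inf")
    for step in range(center - window, center + window + 1):
        set_focus(step)
        score = contrast_score()        # e.g. image-gradient energy
        if score > best_score:
            best_step, best_score = step, score
    set_focus(best_step)                # settle on the sharpest position
    return best_step
```

Searching roughly 2 * window + 1 lens positions instead of the whole focus range is what makes a halved auto-focus time plausible.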

Beyond that, the sensor can be used as an industrial embedded module (Figure 3)
incorporating a high-speed distance calculation chip that provides luminance images and
distance information simultaneously.

Figure 3: Example of Providing Form


This sensor module, which achieves both miniaturization and high accuracy, is not limited to
consumer products. It can also be used as an input device for image recognition, or as an
embedded module for industrial goods.

The State-of-the-art in Stereo Vision Cameras

There is large interest in stereo vision cameras and systems, with great potential in the
R&D sector for stereoscopic remote vision, robot vision, tele-medical vision, automotive
vision, and many other applications. Stereo vision camera systems show great potential
and opportunities in hybrid sensor systems including GPS/GPRS, laser telemetry sensors,
digital signal processing and intelligent video analysis, 3D depth view and 3D object
segmentation, infrared (IR) night vision sensors, and more.

Stereo Vision Camera Sensors or Systems are identified in the following categories:

• 3D Stereo Defense Camera


• 3D Stereo Machine Camera
• 3D Stereo Robot Camera
• 3D Stereo Video Game and Entertainment Camera
• 3D Stereo WebCamera
• 3D Stereo Still Camera
• 3D Stereo Medical Imaging Camera
• 3D Mobile Smart-phones
One recent stereo vision camera, developed in 2011 by Chatten Associates [1], is shown in
Figure 1. The HARV ISR is a lightweight gimbal loaded with a color/NIR zoom camera, a
thermal camera, and a laser range finder, suitable for surveillance or ground vehicle
applications. The HARV ISR stereo vision camera system includes the following technological
specs: a 600 degree/second slew rate in pan, two mechanical axes (pan/tilt), a zoom camera
with 10:1 optical zoom, a thermal camera (320x240 or 640x480), a laser range finder, and a
digital video option.


Figure 1. HARV ISR Stereo Vision Camera

The remote vision work focuses on human/robotic interfaces, compact sensor gimbals, and
head-aimed vision systems for military and industrial applications.

Based on the same technological concept, Chatten Associates has also presented the HARV
3D Vision System [2], which provides an end-to-end solution for high-definition, immersive
visual tele-presence with natural stereoscopic depth perception.

Figure 2. HARV 3D Vision System

This vision system shows that stereo and high definition each offer advantages over
standard-definition monoscopic camera video, particularly for remote manipulation. The
HARV-3D Head-Aimed Remote Viewer provides viewers with an immersive tele-presence
with natural stereoscopic depth perception. The HARV-3D Digital Video provides
low-latency stereo digital video encoding/decoding and Ethernet video and control transport
based on the H.264/AVC video codec. The HARV-3D Vision system supports Stereo HD Zoom,
Stereo SD Zoom, Micro Stereo, Mono HD Zoom, and Mono SD Zoom configurations.

HD Spatial Resolution:
Visible & NIR
Resolution: HD (720p/59.94 or 1080i/59.94*) and SD (NTSC crop or squeeze)
Aspect Ratio: 16:9
HFOV: 63.5° wide

Zoom: 10x optical / 12x digital

SD Spatial Resolution:
Visible
Resolution: SD (NTSC)
Aspect Ratio: 4:3
HFOV: 40°

The HARV 3D Vision System includes a HARV-3D Gimbal paired with a HARV-3D Viewer
(wherever you look, the camera looks), with options for low-latency digital video &
transport.

The stereo vision camera reference design Capella [3] can be used by developers to prototype
stereo vision algorithms and by anyone who wants to integrate stereo vision into a product
design. Capella features the Gumstix® Overo® COM, the Tobi base board, and a camera
daughter card, the e-CAM_9V024_STEREO. The e-CAM_9V024_STEREO delivers pixel-synchronous
stereo frames to the OMAP/DM37x processor.

Capella is the world’s first embedded pixel synchronous Stereo Vision Camera Reference
Design for Texas Instruments’ (TI) OMAP35x and AM/DM37x processor on Gumstix® Overo®
COMs, designed and developed by e-con Systems.


Figure 3. Capella - Stereo vision camera reference design

e-con Systems offers a complete ready-to-use stereo camera reference design along with
the software drivers and SDK. This stereo vision package contains the stereo camera board,
the e-CAM_9V024_STEREO, with pre-aligned and pre-calibrated M12 lenses, assembled over the
Gumstix® Tobi baseboard powered by the Gumstix® Overo® COM in an ABS enclosure. All the
necessary software, with V4L2 stereo camera drivers and the OpenCV library, is preloaded on
the SD card of the Overo COM. A set of external interface cables, power adaptors, a mini
tripod for convenient mounting, and a CD with the necessary documentation, illustrations,
and sample stereo test applications are also included in the package.

Capella – Stereo vision reference design features

• Stereo camera board – e-CAM_9V024_STEREO


• Pixel synchronous Dual camera streams at 30fps
• Connects directly to 27pin camera FPC connector of Gumstix® Overo® COM
• Directly fits over Gumstix Tobi Base board
• Adjustable baseline on customer request
• Factory calibrated S-mount lens pair
• Standard V4L2 Compliant driver with associated APIs and custom IOCTLs
• Sample V4L2 applications demonstrating the synchronous stereo operation
• Complete Stereo camera development package

e-CAM_9V024_STEREO - Stereo Vision Camera Board Features


Figure 4. e-CAM_9V024_STEREO - Stereo Vision Camera Board Features

• 1/3” Global shutter Monochrome MT9V024 image sensor


• WVGA 752(H) x 480(V) (360,960 pixels) sensors output 736(H) x 480(V) images in 8-bit
greyscale format.
• Synchronous Parallel Monochrome 8bit video data at 30 fps per sensor
• Integrated stereo video output of 1472(H) x 480(V) pixels at 30fps
• Baseline distance of 100mm (can be altered on customer's request)
• S-mount lens holder (M12 P0.5) with lock-nut with pre-calibrated S-mount lens pair

Software support

• Linux 2.6.35 kernel


• Standard V4L2 Compliant driver with associated APIs and custom IOCTLs
• Sample V4L2 applications demonstrating the synchronous stereo operation
• OpenCV SDK
• OpenCV sample applications for Depth Measurement
• Complete Stereo camera development package
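
Since the package above preloads a V4L2 driver and the OpenCV library, and the board outputs an integrated side-by-side stereo frame of 1472(H) x 480(V) pixels (per the board features listed above), a minimal capture-and-split sketch might look like the following; the device index and frame-geometry handling are assumptions, not vendor code.

```python
import cv2

# The e-CAM_9V024_STEREO exposes both sensors as one integrated 1472x480
# side-by-side frame (per the feature list above). Assuming the V4L2 driver
# enumerates the board as /dev/video0:
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1472)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

ok, frame = cap.read()
if ok:
    left, right = frame[:, :736], frame[:, 736:]   # two 736x480 halves
    cv2.imwrite("left.png", left)
    cv2.imwrite("right.png", right)
cap.release()
```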

Hardware elements inside Capella reference design

The following picture shows the major hardware elements of the Capella stereo vision
reference design: the Gumstix® Overo® COM with the Tobi base board used in the reference
design kit, the e-CAM_9V024_STEREO stereo camera board, and the interfacing 27-pin flex
cable.


Figure 5. e-CAM_9V024_STEREO Camera Board

Point Grey has presented the Bumblebee Stereo Vision Camera System, based on two- and
three-view vision sensing.

Figure 6. Point Grey - Bumblebee2 Stereo Vision Camera

Point Grey Research unveiled the Bumblebee2 stereo vision camera system with faster
acquisition times, improved 3-D data quality, on-board colour processing, and GPIO
connectors for external trigger and strobe functionality. The binocular Bumblebee2 contains
two 1/3" progressive scan CCD sensors and transmits both the left and right images to a PC
via an IEEE-1394 interface. All Point Grey Research stereo vision products include the
Digiclops and Triclops SDK (software development kit), enabling users to control camera
settings, adjust image quality, and access real-time depth range images using stereo vision


technology. Extensive sample programmes and source code are also included for ease of
integration. The camera is available in color or monochrome as a 640 x 480 at 48 FPS or a
1024 x 768 at 20 FPS option, and offers a choice of 3.8 or 6.0 mm focal length lenses.

Figure 7. Point-Grey - Stereo Vision Conversion from 2D to 3D

Sensor: Two Sony 1/3" progressive scan CCDs, Color/BW

Resolution and FPS: 640x480 at 48 FPS or 1024x768 at 20 FPS

Lenses: 2.5mm (100° HFOV), 3.8mm (65° HFOV), 6mm (43° HFOV) focal lengths

Calibration: Pre-calibrated to within 0.1 pixel RMS error*

General Purpose IO: GPIO pins for external trigger / strobe

*Based on a stereo resolution of 640x480 and is valid for all camera models. Calibration
accuracy will vary from camera to camera.

Distance/depth (Z) is calculated using the triangulation relation Z = (b x r) / (a + c),
whose terms are defined below. The depth will be measured in real-world linear units
(i.e., meters).

(b) is the inter-camera distance or baseline; this measurement is predetermined and
constant. The greater the inter-camera distance, the more accurate the system (so long as
the cameras' fields of view still overlap slightly). The unit of the baseline will be a real
linear measurement (i.e., centimeters).


(r) is the focal length. The focal length can be calculated by placing a reference object of
known linear width at a known distance from the cameras. Measure the length of the object
as seen by the camera (unit in pixels). The ratio between pixels and real linear distance is
used later in conversion of units for (a) and (c). The distance at which this ratio is calibrated
is the focal length; therefore the unit of the focal length should be in real linear
measurements (i.e. centimeters).

(a) and (c) are the optical displacements of the target; these measurements are initially in
pixels. To find (a), measure the distance between the center of camera one's view and the
center of the target; (c) is found the same way for camera two. Convert these distances from
pixels into centimeters using the reference ratio/focal length so that the distance (Z) will
have the correct units.
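
A short worked example of this relation, Z = (b x r) / (a + c); all numbers below are illustrative, not from a real calibration.

```python
def stereo_depth(b, r, a, c):
    """Depth Z from baseline b, focal length r, and the two optical
    displacements a and c, all expressed in the same linear unit (cm)."""
    return (b * r) / (a + c)

# Illustrative values: 12 cm baseline, focal length equivalent to 50 cm,
# displacements of 0.3 cm and 0.2 cm after pixel-to-cm conversion.
Z = stereo_depth(b=12.0, r=50.0, a=0.3, c=0.2)
print(f"Z = {Z:.0f} cm")   # 12 * 50 / 0.5 = 1200 cm, i.e. 12 m
```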

This whole idea works on the principle of optical displacement. The closer an object is,
the more it will appear to move when the viewer changes perspective. Conversely, the
farther away an object is, the less it will appear to move when the viewer changes
perspective in the same way. The reason the sun is visible for so long in the day despite the
earth's motion is that the sun is far enough away that its optical displacement is small.
Cars on the highway move much more slowly relative to a stationary viewer than the earth does
relative to the sun, but those cars move out of view much faster than the sun moves
out of the earth's view. The optical displacement of the moving cars is much greater than
the sun's, because the distance between the target (the cars) and the viewer is much smaller
than when the sun is the target and the earth is the viewer. Distance calculation via optical
displacement measurement does not require the target's dimensions to be known, making it an
ideal method for any kind of exploratory mission.

Applications

• 3D Analytical: Machine Vision, Mobile Robotics


• Automotive: Computer Aided Navigation, Driver Guidance
• 3D Objects reconstruction
• 3D Video recording
• 2D synchronous dual camera applications

Sharp Stereo Vision and 3D video for Mobile Devices

In 2010, Sharp developed a 3D camera module for mobile devices capable of capturing
high-definition (HD) 3D video using progressive scanning with 720 effective scanning lines
and a spatial resolution of 1280 (H) x 720 (V) pixels. Deploying stereo vision on mobile and
wireless handheld devices can support 3D vision and depth mapping, enabling real-time
face-to-face communication and 3D video games on tablets such as the iPad.


This camera module can be embedded in mobile devices such as digital cameras, mobile
phones, and smartphones; the Sharp Aquos SH-12C and the HTC EVO 3D, for example, include a
stereoscopic (dual-lens) rear-facing 3D camera. The latest feature people want to see on
smartphones is 3D, and with devices like the LG Optimus 3D and HTC EVO 3D soon to hit the
market, they now have a new player to contend with in the shape of the Gingerbread-toting
Sharp Aquos SH-12C, featuring 3D and dual cameras.

The Sharp Aquos SH-12C is not a name that is going to roll off people's tongues easily.
However, it has dual cameras sitting nicely on its back that go all the way up to 8 megapixels
(compared to the HTC EVO 3D's 5 megapixels).

Figure 8. Sharp 3D Mobile Stereo Vision module

HTC EVO 3D

The HTC EVO 3D lets users capture and view life in 3D, with no glasses required.

Figure 9. HTC EVO 3D

Dual 5-megapixel camera lenses and a 720p HD camcorder with stereo sound recording let
users capture immersive 3D content. The EVO 3D was the first phone in the US to pack a
glasses-free stereoscopic 3D display.


A 2D/3D switch allows you to set the camera to capture photos or videos in 2D or 3D, and
the camera lenses capture high-definition photos and videos in either mode.

HTC is a bit late on the mobile 3D front, as one of its competitors, namely LG, has already
introduced a 3D-capable smartphone called the Optimus 3D that features an autostereoscopic
3D display as well as a 3D camera. However, HTC is now ready to offer its own alternative in
the form of the new EVO 3D smartphone, a mobile device that is… surprisingly similar to the
Optimus 3D from LG. The HTC EVO 3D features a 4.3-inch autostereoscopic 3D display (not
requiring you to wear glasses to get the 3D effect) with a 540x960 qHD resolution in 2D
mode. The new 3D-capable smartphone from HTC weighs 170 grams (6.0 ounces) with the
battery and measures 126 x 65 x 12.05 mm (4.96″ x 2.56″ x 0.47″), which is slightly bigger
than a traditional high-end smartphone. The phone uses a fast 1.2 GHz dual-core Snapdragon
processor and features a higher-capacity 1730 mAh battery to ensure longer use of the
device. On the back of the device there is a 3D camera using two 5-megapixel sensors with a
dual-LED flash in between them. The 3D camera can record 5-megapixel 2D images or
2-megapixel 3D photos, as well as 720p video in 2D or 3D mode.

The device's camera system lets you capture photos and videos in glasses-free 3D. That
means you can enjoy viewing the 3D media you've captured without wearing 3D glasses.

To capture photos and record videos in 3D, slide the 2D/3D switch to 3D before you take a
picture or record a video clip. To record video in HD, set Video quality to HD 720P
(1280 x 720).

High-definition MP4 video formats in Gallery:
• H.263 profile 0 @ 30 fps, WVGA (800x480), max 2 Mbps
• MPEG-4 simple profile @ 30 fps, 720p (1280x720), max 6 Mbps
• H.264 baseline profile @ 30 fps, 720p (1280x720), max 6 Mbps

3D Stereo Vision Camera Systems – Specs & Standards

What we have so far is the Minoru, a 3D webcam stereo vision 3D PC camera; however, it
provides no automatic focus and no focus information.

What we also have is a VGA CMOS sensor with an Ethernet port:


Figure 10. VGA Camera MT9V022 Module

More information about this camera can be found in the MT9V022 product brief available from
Micron Technology Inc.

4. J6 – Camera 2 Input (RJ45)
5. J5 – Camera 1 Input (RJ45)

Figure 11. Stereo Camera Set-up from Xilinx Spartan-3A DSP Video Starter Kit (VSK)

The camera provided with VSK is based on a Micron MT9V022 CMOS image
sensor. The image sensor is interfaced to the FMC-Video via an RJ45 connector with
a proprietary pin assignment. A standard CAT6 Ethernet cable can be used to connect
the camera to FMC-Video.

Supported cables: CAT5e and CAT6 Ethernet cable.

Interface Standards/Protocols:

IEEE 1394 standard

IEEE 1394 FireWire® Cable

Active Stereo Vision System

A typical active vision system is a camera head which can control some (or all) of the
following parameters:
• Spatial position
• Orientation of the viewing direction / neck movements (pan and tilt motors)
• Vergence movements of the eyes (pan motors for the cameras)
• Focus
• Zoom
• Iris
• Further camera parameters

Figure 12. Active Stereo Vision System

http://ni.www.techfak.uni-bielefeld.de/node/2908

OEM Stereovision Camera

A USB camera with two image sensors which are simultaneously triggered for exposure.

This camera gives you the raw data straight from the sensors, so you can write your own
stereovision software for it.

Figure 13. USB OEM Stereo Vision Camera module


Features

• CMOS color sensors
• Max resolution: 752 x 480
• Max frame rate: 20-23 FPS (depending on settings)
• USB interface
• Raw FOV (single lens): 102 x 74 degrees

http://www.visionhardwarepartner.nl/products/camera-%252aslash%252a-imaging/camera/oem-stereovision-camera.html

QuantumVision's 3-D system performs stereo vision with two image sensors

The Hammerhead 3-D machine-vision system [5] provides stereo vision without multiple
separate cameras. It measures 6 × 2.37 × 1.4 in. and includes two image sensors, integrated
environmental control for cold-temperature applications, Ethernet, PoE, RS-232,
RS-422/485, USB, and industrial digital I/O. Synchronization of the left and right image
sensors is automatically handled by the hardware. It can address security and surveillance,
robotic pick and place, vehicle guidance, quality assurance, sorting, material handling, and
optical gauging applications. The system can improve 2-D image processing with a library of
algorithms that includes 1-D and 2-D barcodes, linear measurement tools, and pattern
matching.

QuantumVision

Figure 14. Quantum Stereo Vision Module

Stereo Vision for 3D Mapping and Navigation

Four decades ago, Apollo astronauts landed on the moon and captured 3D images of the
lunar surface. Astrobotic will return to the moon and not only generate 3D imagery, but also


produce high-definition 3D video. This media is used for driving, exploration, science, and to
convey a rich, remote experience.

Figure 15. Red Rover's Stereo Vision Camera

Red Rover is equipped with two stereo cameras that extract 3D structure and create maps of
the moon. The rover uses these maps to plan a safe path around obstacles, such as rocks or
craters. The locations of these obstacles are detected by measuring the disparity between
the obstacle’s position in the left and right stereo images. Human eyes detect the position of
objects and perceive depth in much the same way. If only one camera were used for
navigation, precise positions of obstacles relative to the rover would be very difficult to
determine.

Additionally, scientists and the public alike can move through these 3D maps to experience
what it would be like to walk on the moon. Soon all of mankind can take that “one small
step” and walk in Neil Armstrong’s footprints.


NASA Multi-view / 3D Stereo Camera Sensor System

Figure 16. An example 3D anaglyph made from NASA Mars rover navigation images. The
yellow lines illustrate the disparity between matching objects from each stereo photo. The
greater the disparity, the closer the object is to the rover


RE2's manipulation technologies are also being used on mobile platforms. They are
developing a Robotic Nursing Assistant (RNA) to help nurses with difficult tasks, such
as helping a patient sit up, and transferring a patient to a gurney. The RNA uses a
mobile hospital platform with dexterous manipulators to create a capable tool for
nurses to use. RE2 is also working on an autonomous robotic door opening kit for
unmanned ground vehicles.

RE2's expertise in manipulation made them a natural choice to be the systems
integrator for the software track of the DARPA ARM program. The goal of this track is
to autonomously grasp and manipulate known objects using a common hardware
platform. Participants will have to complete various challenges with this platform, like
writing with a pen, sorting objects on a table, opening a gym bag, inserting a key in a
lock, throwing a ball, using duct tape, and opening a jar. There will also be an
outreach track that will provide web-based access. This will enable a community of
students, hobbyists, and corporate teams to test their own skills at these challenges.

RE2 had its own set of challenges: build a robust and capable hardware and software
platform for these participants to use. The ARM robot is a two-arm manipulator with
sensor head. The hardware, valued at around half a million dollars, includes:

• Manipulation
• Two Barrett WAM arms (7-DOF with force-torque sensors)
• Two Barrett Hands (three-finger, tactile sensors on tips and palm)
• Sensor head

• Swiss Ranger 4000 (176x144 at 54fps)


• Bumblebee 2 (648x488 at 48fps)
• Color camera (5MP, 45 deg FOV)
• Stereo microphones (44kHz, 16-bit)
• Pan-tilt neck (4-DOF, dual pan-tilt)

http://www.ros.org/news/robots/mobile-manipulators/

MobileRanger™ C3D Stereovision System

The TYZX DeepSea G2 cameras are shown below.

Figure 17. MobileRanger™ C3D Stereovision System. Left: A custom built trinocular stereo
cluster made of 3 Point Grey gray scale cameras and 1 color camera. Middle: A TYZX
DeepSea G2 stereo network camera with on board CPU. Right: A G2 camera with side
mounted FLIR Photon thermal infrared camera.

http://www.tyzx.com/products/DeepSeaG2.html


MobileRanger

MobileRanger™ C3D stereovision systems are top-of-the-line instruments for measuring depth
in demanding applications such as mobile robot navigation, people and object tracking,
gesture recognition, targeting, 3D surface visualization, and advanced human-computer
interaction. MobileRanger™ C3D stereovision systems combine an nDepth™ range-finding
stereo processor FPGA system with a rugged, high-quality stereo camera head.
MobileRanger™ C3D is ideal for robot applications because it offloads initial depth
calculations to the separate nDepth™ hardware, reserving the robot's main computing
resources for other tasks. MobileRanger can be used as a 3D sensor in robot navigation or
object sensing, and you can optionally access the raw stereo images to use the same device
for your own stereo vision research.

Figure 18. MobileRanger™ C3D Stereovision System

http://www.mobilerobots.com/Libraries/Downloads/MobileRobots_MobileRangerC3D_Stereocamera_Datasheet_-_ACA0195_ACA0196_ACA0295_ACA0296.sflb.ashx

PCI nDepth™ Vision System

6cm Baseline Stereo Vision Camera


The PCI nDepth™ vision system includes a 6cm 752x480 stereo vision camera. Available with
an automotive quality monochrome image sensor, the Focus Robotics stereo vision camera
boasts >60dB of dynamic range and near IR enhanced performance for use with non-visible
near IR illumination. Support for M12x0.5 uVideo lenses ensures you get the right optics for
your application's needs. The stereo camera connects easily to the PCI card using one
standard CAT6 cable up to 5 meters in length. The cable carries all necessary power, data,
and control for the stereo camera. With progressive scan, global shutter and low-noise
imaging technology, this stereo camera is ideal for a wide variety of imaging applications in
real-world environments.

Figure 19. PCI nDepth™ 6cm Baseline Stereo Vision Camera

Technical Specification

nDepth™ Vision Processor Subsystem
Resolution: WVGA (752x480)
Disparity Frame Rate: 30 frames per second at WVGA, with 92 disparity levels
Disparity Range: up to 124
Camera Calibration: calibration coefficients generated at the factory; the processor rectifies and undistorts images in real time
Calibration Error: 0.1 pixel RMS error
Stereo Algorithm: Sum of Absolute Differences (SAD) with 9x9 block matching
Left/Right Check: identifies places where correlation is contradictory and thus uncertain
Host Interface: standard PCI 33, direct DMA access
Processor Upgrades: ability to upgrade processor functionality in the field

6cm Baseline Stereo Vision Camera
Resolution: two 752x480 1/3-inch wide-VGA CMOS digital image sensors
Frame Rate: programmable up to 60 frames per second
Baseline: 6cm (contact us for custom baseline cameras)
Mounting: includes three standard tripod mounts on the


bottom (contact us for drawings)
Image Format: monochrome; near-IR enhanced performance for use with non-visible NIR illumination (contact us for information regarding color sensors)
Dynamic Range: >60dB
Analog to Digital Conversion: 10 bit
Shutter Type: global shutter photodiode pixels; simultaneous integration and readout
Controls: automatic/manual synchronized exposure and gain control
Interface: LVDS on CAT6 cable up to 5 meters in length
Focal Lens Length: uses standard M12x0.5 uVideo lenses; 4.3mm standard (contact us for additional lens options)
Power Supply: supplied via the CAT6 cable from the PCI card
Power Consumption: <1W at maximum data rate
Dimensions: 4.25in x 1.5in x 1.25in

Host Software Subsystem
Drivers: Linux and Windows drivers provide access to the depth image and the undistorted and rectified (calibrated) images
Demonstration Code: includes integration with the OpenCV image processing library and open-source samples; the Linux driver is also open source
Control API: includes programming interfaces for parameter control and in-field processor upgrades
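
The table above lists the stereo algorithm as Sum of Absolute Differences (SAD) with 9x9 block matching. As an illustration of what that computation is, here is a naive software sketch of SAD block matching over a rectified pair; the real nDepth FPGA pipeline is of course massively parallel and far more optimized.

```python
import numpy as np

def sad_disparity(left, right, max_disp=64, block=9):
    """Naive SAD block matching: for each 9x9 block in the left image,
    find the horizontal shift (disparity) in the rectified right image
    that minimizes the sum of absolute differences."""
    h, w = left.shape
    r = block // 2
    L, R = left.astype(np.int32), right.astype(np.int32)
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(r, h - r):
        for x in range(r + max_disp, w - r):
            patch = L[y - r:y + r + 1, x - r:x + r + 1]
            costs = [np.abs(patch - R[y - r:y + r + 1,
                                      x - d - r:x - d + r + 1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = np.argmin(costs)
    return disp
```

The left/right check from the table would repeat the match from the right image back to the left and discard pixels where the two disparities disagree.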

SRI Small Vision System IEEE 1394 (FireWire)

Videre Design

The DCS series (STH-MDCS/MDCS2/DCSG/STOC) is a series of all-digital devices with VGA or
megapixel imagers.

A standard 1394 PCI board or PCMCIA card can be used; the card must be OHCI (Open Host
Controller Interface) compliant.

Video Format Frame Sizes:
• 320x240
• 640x480
• 1280x960


1394 (digital) interface frame sizes:
• 1280x960
• 1024x768
• 640x480
• 512x384

http://www.videredesign.com/index.php?id=1

http://www.videredesign.com/assets/docs/manuals/smallv-4.4d.pdf

Figure 20. Videre Stereo and Multi-view Vision Camera Systems

The detection of pedestrians is an important field of research for commercial, governmental,
and military organizations. In particular, the US Army is actively developing obstacle
detection for multifunction utility/logistics equipment (MULE) vehicle operations, path
following, and intent-based anti-tamper surveillance systems. This article introduces a new
optical system for the detection of human shapes from unmanned MULE vehicles on the
move.
Locating people against background noise can be a challenging task. For example,
pedestrians can assume different poses, wear different clothes, and carry objects that
obscure the distinctive human silhouette. These problems are further compounded by
camera movements and different lighting conditions in uncontrolled outdoor environments.
To tackle this task, a number of monocular and stereo optical systems have been developed
that use visible light (daylight cameras) or far infrared (FIR) wavelengths (7–14μm). In many
scenarios, FIR cameras (often called ‘thermal’ cameras) are well suited to initial detection. In
other situations, such as sunny and hot environments, targets are harder to pick out from
the background, and daylight cameras are a better choice. In addition, daylight cameras
provide more detailed images and offer more reliable target verification.
The simultaneous use of two stereo camera systems, one based on visible light (daylight)
and the other on FIR wavelengths, has therefore been investigated to exploit the benefits
of both technologies [1,2].

Figure 21. US Army Stereo Vision Camera System

Designed and tested on vehicles as depicted in Figure 21, the system can detect both
stationary and moving pedestrians and exploits passive sensors, which detect apparent
motion by comparing the change in infrared temperature when, for example, a human
passes in front of an infrared source with a different temperature, such as a building.
At the start of four processing steps, the two stereo systems are used independently to scan
the target area. In this phase, different approaches are used to highlight portions of the
images that warrant further attention. For example, warm areas are detected in FIR images,
edge density is computed from FIR and daylight images, and techniques such as the disparity
space image, among others, further process the initial data.
Stereo-based computation of the scene allows the 3D position of features such as roads, as
well as their slope, distance, and size to be measured against the calibration parameters of
the system so that features incompatible with the presence of a person (or a small group of
people) can be discarded.
In the second step, areas highlighted in the two different spectra are filtered and fused
applying symmetry, size, and distance constraints. In the third step, different models and
filters are used to evaluate the presence of human shapes, which include neural networks,
adaptive boosting, and others [3].

Microsoft Kinect Sensor

Microsoft recently released the Microsoft Kinect for video gaming as a real-feeling
game experience. The Kinect sensor is also a major breakthrough for robotics and stereo
vision applications, based on the interesting stereo vision capabilities and features that it
provides.


Figure 22. Microsoft Kinect 3D Camera

Microsoft Kinect has introduced a very smart and successful stereo vision circuit and
concept-methodology. The Kinect applies stereo vision theory by combining the captured color
and depth images, so that one can project the color image back out into space and create a
"holographic" representation of the persons or objects that were captured.

The interesting part of the Kinect sensor is that it can easily be converted into a 3D camera by
combining the depth and color image streams received from the device, and projecting them
back out into 3D space in such a way that real 3D objects inside the cameras' field of view
are recreated virtually, at their proper sizes.

The following figure shows the Kinect sensor-system hardware block diagram. The Kinect
contains a regular color camera, sending images of 640x480 pixels 30 times a second. It also
contains an active-sensing depth camera using a structured light approach (using what
appears to be an infrared laser emitter and a micromirror array), which also sends (depth)
images of 640x480 pixels 30 times a second (although it appears that not every pixel is
sampled on every frame). In short, the Kinect contains an RGB camera, a depth sensor, and a
multi-array microphone.


Figure 23. Microsoft Kinect Sensor Block Diagram – SoC Hardware Architecture
PS1080 SoC Functionality. Photo by ifixit from:
http://www.ifixit.com/Teardown/Microsoft-Kinect-Teardown/4066/1
How does a Kinect sense depth?
– The IR emitter projects an irregular pattern of IR dots of varying intensities.
– The Depth Camera reconstructs a depth image by recognizing the distortion in this pattern.
The Microsoft Kinect is an accessory for the XBOX 360 console that turns the user’s body
into the controller. It is able to detect multiple bodies simultaneously and use their
movements and voices as input. The hardware for the Kinect is comprised of a color VGA
camera, a depth sensor, and a multi-array microphone. The VGA camera is used to
determine different features of the user and space by detecting RGB colors. It is mainly used
for facial recognition of the user. The multi-array microphone is a set of four microphones
that are able to isolate the voices of multiple users from the ambient noises in the room,
therefore allowing users to be a few feet away from the device but still be able to use the
voice controls. The third component of the hardware, the depth sensor (generally referred
to as the 3D camera), has two parts to it: an infrared projector and a CMOS (complementary
metal-oxide semiconductor) sensor. The infrared projector casts out a myriad of infrared
dots that the CMOS sensor is able to “see” regardless of the lighting in the room. This is,
therefore, the most important portion of the Kinect which allows it to function. But there is
a second component that would render the Kinect quite useless otherwise: the software
that interprets the inputs from the hardware. These two components will be the main focus
of the paper.


User videos posted on YouTube show the Kinect’s large array of scattered infrared dots
literally painting the user’s living room in a swathe of green lights. The rays are cast out via
the infrared projector in a pseudo-random array across a large area. The CMOS sensor is
able to then read the depth of all of the pixels at 30 frames per second. It is able to do this
because it is an active pixel sensor (APS), which is comprised of a two-dimensional array of
pixel sensors. Each pixel sensor has a photo detector and an active amplifier. This camera is
used to detect the location of the infrared dots.
Following this, depth calculations are performed in the scene using a method called Stereo
Vision Triangulation. Stereo Triangulation requires two cameras to be able to perform this
calculation. The depth measurement requires that corresponding points in one image be
found in the second image. Once those corresponding points are found, we can then
find the disparity (the number of pixels between a point in the right image and the
corresponding point in the left image) between the two images. If the images are rectified
(along the same parallel axis), then, once we have the disparity, we can then use
triangulation to calculate the depth of that point in the scene.
Stereo vision is a technique that uses two cameras to measure distances from the cameras,
similar to human depth perception with human eyes. The process uses two parallel cameras
aligned at a known distance of separation. Each camera captures an image and these images
are analyzed for common features. Triangulation is used with the relative position of these
matched pixels in the images, as seen in Figure 24 below.

Figure 24: Stereo Vision Triangulation


Triangulation requires knowing the focal length of the camera (f), the distance between the
camera bases (b), and the center of the images on the image plane (c1 and c2). Disparity (d) is
the difference between the lateral distances to the feature pixel (v2 and v1) on the image
plane from their respective centers. Using the concept of similar triangles, the distance from
the cameras (D) is calculated as D = b * f / d.
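
Applied per pixel over a whole disparity map, the relation D = b * f / d looks like this in a short numpy sketch; the focal length and baseline below are illustrative, roughly Kinect-like values.

```python
import numpy as np

def depth_from_disparity(disparity_px, f_px, b_m):
    """D = b * f / d for every pixel; zero disparity maps to infinity."""
    d = disparity_px.astype(np.float32)
    depth = np.full(d.shape, np.inf, dtype=np.float32)
    np.divide(b_m * f_px, d, out=depth, where=d > 0)
    return depth

disparity = np.float32([[10.0, 20.0],
                        [40.0,  0.0]])          # toy disparity map (pixels)
print(depth_from_disparity(disparity, f_px=580.0, b_m=0.075))
# [[4.35   2.175 ]
#  [1.0875   inf ]]
```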

The result for the computer vision system is a depth field map, which is a grayscale image of
equal size to the original image. Each gray level represents a distance from the camera. For
example, a black pixel identifies a point in the computer's vision as being at infinite
distance, and a white pixel signifies a point very near the camera. This processing can be
done on a computer, but some cameras exist that do the processing on-board using an FPGA.
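
Rendering a depth map as the grayscale field map just described (near = white, far or infinite = black) is a short mapping; the near/far clipping range below is an arbitrary choice, not part of any camera specification.

```python
import numpy as np

def depth_to_gray(depth_m, near=0.5, far=10.0):
    """Map depth to grayscale: nearer pixels brighter, far/infinite black.
    np.clip sends infinite depths to 'far', which maps to 0 (black)."""
    d = np.clip(depth_m, near, far)
    return (255.0 * (far - d) / (far - near)).astype(np.uint8)
```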

This stereoscopic triangulation requires two cameras, but the Kinect is unique in that its
depth sensor has only one camera to perform these calculations. This is because the infrared
projector is, in and of itself, a "camera" in the sense that it provides an image to compare
with the image taken by the CMOS sensor camera. The projected speckles are semi-random:
they form a generated pattern whose layout the Kinect already knows. Since the device knows
where the speckles are located, it has a reference image against which corresponding points
can be found. The CMOS sensor captures an offset image in which the disparity between dots
can be analyzed, and the depth can therefore be calculated. I am assuming that the images
are rectified, making it simple to calculate the depth with the equation in Figure 24. After
the depth calculations are obtained, all the data is interpreted and used in the system.

Figure 25. The Microsoft Kinect 3D Camera Sensor-System ( an I.R. transmitter, 3D Depth
Sensors, (RGB) Camera, a multi-array microphone, and a motorized tilt base)
The RGB camera delivers the three basic color components, displays the video and helps
enable facial recognition. It outputs video at a frame rate of 30 Hz and uses a maximum
resolution of 640 × 480 pixels, 32-bit color.
The 3D depth sensor in Figure 25 [39] consists of an infrared laser projector which captures
video data in 3D under any lighting conditions. The laser is projected into the room. The
sensor is able to detect the information based on what is reflected back at it. Together, the


projector and sensor create a depth map. Thus, the 3D depth camera provides detailed 3D
information about the environment. Simply said, it determines how far away an object is
from the camera. It has a practical ranging limit of 1.2–3.5 m distance when used with the
Xbox software.
The infrared (IR) camera is used for tracking movement and depth. Combined with an IR
emitter, the IR camera floods the room with invisible infrared light; since the eye does not
see IR light, lighting becomes a non-issue for the Kinect. The multi-array microphone enables
voice recognition, distinguishing the different voices of the players in a room, and it filters
out ambient noise. The four microphones are located along the bottom of the Kinect and
largely dictate the size and shape of the sensor device. The microphone array operates with
each channel processing 16-bit audio at a sampling rate of 16 kHz.
The motorized tilt is a pivot for sensor adjustment to track the users, even if they move
around. It is capable of tilting the sensor up to 27° either up or down, while the angular field
of view is of 57° horizontally and 43° vertically.
The area required to use the Kinect is approximately 6 m², although the sensor can maintain
tracking through an extended range of roughly 0.7 m to 6 m. The Kinect manual specifies
that a single user should stand approximately 2 meters from the sensor, while two players
should stay approximately 2.5 meters from the sensor.
Kinect is capable of simultaneously tracking up to six people, including two active players
and it can track 20 joints per player in real time. However, PrimeSense, which developed the
3D depth sensors, has stated that the number of people the device can "see" (but not
process as players) is only limited by how many will fit in the field-of-view of the camera.
PrimeSensor NITE Middleware
PrimeSense, an Israeli startup, is a leader in sensing and recognition solutions; its product
portfolio includes the PrimeSensor Reference Design hardware, a 3D data generation unit,
and the PrimeSense NITE middleware.
The PrimeSensor Reference Design (Figure 26) [40] is a low-cost, plug-and-play USB device.
This solution enables a device to perceive the world in 3D and to translate these perceptions
into a synchronized depth image, in much the same way that humans do. Basically, the
Reference Design generates real-time depth, color, and audio data of the scene.

Figure 26: PrimeSensor Reference Design Hardware


The 3D data generation unit provides the 3D sensing technology for the Kinect camera
device. It is a motion-control system that lets players control the interface through full-body
gestures.

Specs and Standards of the Kinect Sensor:

What's the accuracy of a Kinect sensor?
– Data Stream
• 640x480, 320x240 in Linux and Mac
• 1024x768, 640x480, 320x240 in Windows 7
• 30 frames/sec
– Depth Camera
• Field of View: Horizontal 58°, Vertical 45°, Diagonal 70°
• Spatial X/Y resolution: 3 mm
• Depth Z resolution: 1 cm
• Operating range: 0.8 m – 3.5 m
– Physical Tilt Range: ±27 degrees
What Makes The Kinect Special?

It is important to understand the difference between 3D cameras like the Kinect on one
hand, regular (2D) cameras on the other hand, and so-called "3D cameras" -- actually,
stereoscopic 2D cameras -- on the third hand (ouch).

Kinect vs Regular 2D Camera

Any camera, 2D or otherwise, works by projecting 3D objects (or people...), which you can
think of as collections of 3D points in 3D space, onto a 2D imaging plane (the picture) along
straight lines going through the camera's optical center point (the lens). Normally, once 3D
objects are projected to a 2D plane that way, it is impossible to go back and reconstruct the
original 3D objects. While each pixel in a 2D image defines a line from that pixel through the
lens back out into 3D space, and while the original 3D point that generated the pixel must lie
somewhere on that line, the distance that 3D point "traveled" along its line is lost in
projection. There are approaches to estimate that distance for many pixels in an image by
using multiple images or good old guesswork, but they have their limitations.

A 3D camera like a Kinect provides the missing bit of information necessary for 3D
reconstruction. For each 2D pixel on the image plane, it not only records that pixel's color,
i.e., the color of the original 3D point, but also that 3D point's distance along its projection
line. There are multiple technologies to sense this depth information, but the details are not
really relevant. The important part is that now, by knowing a 2D pixel's projection line and a
distance along that projection line, it is possible to project each pixel back out into 3D space,
which effectively reconstructs the originally captured 3D object(s). This reconstruction,
which can only contain one side of an object (the one facing the camera), creates a so-called
facade. By combining facades from multiple calibrated 3D cameras, one can even generate
more complete 3D reconstructions.
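
As a hedged sketch of this back-projection under a simple pinhole model (assuming a
rectified depth image, a focal length f in pixels, and a principal point (cx, cy); the function
is illustrative and not tied to any particular camera SDK):

import numpy as np

def backproject(depth_m, f_px, cx, cy):
    # Send every pixel back out along its projection line by the recorded
    # distance; the result is the "facade" described above (HxWx3 points).
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - cx) * z / f_px
    y = (v - cy) * z / f_px
    return np.dstack((x, y, z))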

Kinect vs So-Called "3D Camera"

There exist stereoscopic cameras on the market, which are usually advertised as "3D
cameras." This is somewhat misleading. A stereoscopic camera, which can typically be
recognized by having two lenses next to each other, does not capture 3D images, but rather
two 2D images from slightly different viewpoints. If these two images are shown to a viewer,
where the viewer's left eye sees the image captured through the left lens, and the right eye
the other one, the viewer's brain will merge the so-called stereo pair into a full 3D image.
The main difference is that the actual 3D reconstruction does not happen in the camera, but
in the viewer's brain. As a result, images captured from these cameras are "fixed." Since they
are not really 3D, they can only be viewed from the exact viewpoint from which they were
originally taken. Real 3D pictures, on the other hand, can be viewed from any viewpoint,
since that simply involves rendering the reconstructed 3D objects using a different
perspective.

While it is possible to convert stereo pairs into true 3D images using computer vision
approaches (so-called depth-from-stereo methods), those do not work very well in practice.
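
For completeness, a minimal depth-from-stereo sketch using OpenCV's classic block matcher
(assuming a rectified grayscale left/right pair; the file names and matcher parameters are
placeholders, and results on real scenes will show exactly the limitations mentioned above):

import cv2

left = cv2.imread('left.png', cv2.IMREAD_GRAYSCALE)    # placeholder files
right = cv2.imread('right.png', cv2.IMREAD_GRAYSCALE)

# Block-matching stereo correspondence; typical starting parameters
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right)  # fixed-point disparities (scaled by 16)

# Normalize to an 8-bit image for inspection
vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype('uint8')
cv2.imwrite('disparity.png', vis)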

Figure 27. Primesense Depth Image CMOS. (The I.R. invisible light is emitted and tracked by
the depth image CMOS. The PS1080 SoC then generates a depth image.)

This is a PrimeSense diagram explaining how their reference platform works; the Kinect is
the first (and, at the time, only) implementation of this platform.

One camera (together with the IR transmitter) provides input for the depth map (rumored to
be just 320x240), while a separate RGB camera captures the visible spectrum at 640x480
resolution.

SAFFiR Robot

SAFFiR, also known as the Shipboard Autonomous Firefighting Robot, is being developed by
scientists at America's Naval Research Laboratory. SAFFiR will be an autonomous bipedal
humanoid robot based on the CHARLI-L1 robot created at Virginia Tech.
The robot is designed with enhanced multi-modal sensor technology for advanced
navigation, including a sensor suite with a camera, a gas sensor, and a stereo IR camera that
enables it to see through smoke. Its upper body will be capable of manipulating fire
suppressors and throwing propelled extinguishing agent technology (PEAT) grenades. It is
battery powered, holding enough energy for 30 minutes of firefighting. Like a sure-footed
sailor, the robot will also be capable of walking in all directions, balancing in sea conditions,
and traversing obstacles.

http://www.nrl.navy.mil/media/news-releases/2012/nrl-designs-robot-for-shipboard-
firefighting

The Autonomous Car of the Future

Near-future cars will rely on advanced multi-sensor suites, whose capabilities depend on
radar range, communication latency, and pixel resolution.

Figure 28. The Autonomous Car of the Future

http://www.wired.com/magazine/2012/01/ff_autonomouscars/3/

1 Radar
High-end cars already bristle with radar, which can track nearby objects. For instance,
Mercedes’ Distronic Plus, an accident-prevention system, includes units on the rear bumper
that trigger an alert when they detect something in the car’s blind spot.

2 Lane-keeping
Windshield-mounted cameras recognize lane markings by spotting the contrast between the
road surface and the boundary lines. If the vehicle leaves its lane unintentionally, brief
vibrations of the steering wheel alert the driver.

3 LIDAR
Google employs Velodyne’s rooftop Light Detection and Ranging system, which uses 64
lasers, spinning at upwards of 900 rpm, to generate a point cloud that gives the car a 360-
degree view.

4 Infrared Camera
Mercedes’ Night View assist uses two headlamps to beam invisible, non-reflective infrared
light onto the road ahead. A windshield-mounted camera detects the IR signature and shows
the illuminated image (with hazards highlighted) on the dashboard display.

5 Stereo Vision
Mercedes’ prototype system uses two windshield-mounted cameras to build a real-time 3-D
image of the road ahead, spotting potential hazards like pedestrians and predicting where
they are headed.

6 GPS/Inertial Measurement
A self-driver has to know where it’s going. Google uses a positioning system from Applanix,
as well as its own mapping and GPS tech.

7 Wheel Encoder
Wheel-mounted sensors measure the velocity of the Google car as it maneuvers through
traffic.

Mercedes-Benz 6D-Vision

The “6D-Vision” system from Mercedes-Benz uses two cameras that view their surroundings
in the same manner that a human being’s two eyes do. This stereo arrangement enables 3D
depiction of the vehicle’s surroundings in real time. The system uses this information to
identify every object around the vehicle and assess the risk it might pose for a potential
collision.


The engineers modeled the system on the function of the human eye and brain, and in some
respects its abilities even surpass theirs. The new "6D-Vision" technology can identify
children playing at the side of the road in less than 0.2 seconds; a human being takes more
than twice as long. To achieve this remarkable feat, a stereo camera records three-dimensional
images of the area in front of and next to the vehicle in rapid succession. A purpose-built
algorithm analyzes the images virtually instantaneously. By comparing the sequence of
images, the system also recognizes whether and how fast objects such as cyclists,
pedestrians, or cars are moving. It even works very reliably in inclement weather and at
twilight.

Daimler will soon be including 6D-Vision systems in its Mercedes vehicle series, as the basis
for innovative assistance systems that recognize pedestrians and assist drivers as they pass
through blind crossings or navigate narrow highway construction sites. The research team
from Sindelfingen hopes that their innovations will find widespread acceptance in the
automotive industry, so that as many road users as possible are provided with this additional
safety feature. To help ensure that this is the case, the company plans to make the
technology available to other manufacturers. 6D-Vision has the potential to revolutionize
electronic vision not only in cars but also in autonomous service robots. These robots are
designed to serve as household helpers or to assist in caring for the infirm; to do so, they
must be able to monitor their surroundings and recognize where and how the people in their
charge move around. The six-dimensional view of the world provided by automotive
research makes this possible.

Figure 29. The 6D-Vision’s Algorithm – Stereo Vision Diagram

Volvo Stereo Vision Camera System


SARTRE - Safe Road Trains for the Environment.

When cars and pedestrians meet, it's rarely the pedestrian that gets the better of the
collision. This is why Continental has announced a stereo camera system designed to help
prevent this type of accident ("Two Eyes Are Better Than One – The Stereo Camera").


The system uses two CMOS cameras mounted 20cm (8 inches) apart and facing through the
windshield. This separation apparently allows the distance of an object in the 20 to 30 meter
range to be determined within 20 to 30cm (that’s 8 to 12 inches.) Stereo vision is becoming
quite well known for tasks like robot guidance, so I guess it’s a logical extension to move it
into an unconstrained urban environment. What’s not clear to me is how the system works
at night. Perhaps there’s a passive IR illumination system built-in? I also wonder how it will
deal with rain, frost or snow. In fact, why not just go with a Kinect-style IR pattern projection
approach?

•Stereo Vision Systems


•Multi-functional camera systems

Figure 30. Volvo Automotive Stereo Vision Camera System

Intelligent goggles for partly-sighted people


“Intelligent” goggles for partly-sighted people have been developed at Universidad
Carlos III in Madrid, Spain. The system consists of a pair of stereoscopic digital
cameras mounted on either side of a virtual reality headset, with two digital screens in
front of the wearer’s eyes in place of lenses. The cameras scan the field of vision in
front of the headset, convert it to digital code and then feed this to a separate
computer package. The computer then runs an algorithm, developed by the team, that
determines the distance and outline of any objects seen. What the cameras capture is
displayed on the headset's screens, and information about the objects is conveyed to the
wearer by overlaying them with color-coded silhouettes.


Figure 31. Intelligent Stereo Goggles Vision

“It detects objects and people who move within the visual field that a person with no visual
pathologies would have,” said Professor Vergaz, leader of the research team who has
developed the “intelligent” goggles. “Very often the patient does not detect them due to
problems of contrast. The information regarding depth is what is most missed by patients
who use this type of technical aid.”

3DSS (Three Dimensional Sensing System)

3DSS (Three Dimensional Sensing System) is a non-contact, non-laser 3D scanner [41]. The
3DSS has been developed based on active sensing techniques that combine stereo vision, the
triangulation principle, structured light, and the phase-shift method: different fringe
patterns are projected onto the object being measured and are observed with two cameras.
Through a special calibration program, the optical transformation equations are known, so
the 3D coordinates for each of the camera pixels can be determined automatically and with
high precision.

Figure 32. 3DSS Stereo Vision Camera-Scanner


Minoru 3D Webcam

Figure 33. Minoru 3D WebCam Stereo Vision Sensor

An alternative would be to get a 3D webcam that has the two lenses already fixed and
software that synchronizes them, so you only need to capture the output in the right 3D
format. One such 3D webcam is the Minoru:

[16] Minoru 3D WebCam http://3dvision-blog.com/minoru-the-worlds-first-consumer-3d-webcam/
….More Description of the Camera to be analyzed….


[17], [18], [19], [20]

Fujifilm FinePix REAL 3D W1 – The First True 3D Stereo Digital Camera

The latest Fujifilm FinePix Real 3D W1 digital camera represents an advanced stereo vision
camera for digital still stereo images, as shown in the figures below.

The Fujifilm 3D stereo digital camera acts like a digital human vision system (Figure 34). The
diagrams in Figure 34 show how the screen sends a different image to each eye, similar to
how we see reality: each eye views the scene from a slightly different angle, which causes
the brain to interpret a sense of depth.


Figure 34. Fujifilm 3D Stereo Digital Camera as a Digital Human Vision System


Figure 35. Fujifilm FinePix REAL 3D W1 - 3D Stereo Digital Camera

[21], [22], [23], [24]


The key advantage of the Fujifilm FinePix Real 3D W1 is that it can record both in 2D and in
3D (still images and movies). The Real 3D W1 camera uses two separate Fujinon lenses along
with two 10 megapixel CCD sensors to be able to simultaneously take two images just like
the eyes of a normal human see everything. The two images can then be combined into a 3D
stereo picture with the help of the RP (Real Photo) Processor 3D or into a 3D stereo video
clip with resolution of up to 640×480. On the back of the camera a 2.8-inch LCD screen is
available that is capable of displaying both 2D and 3D content so that you will be able to
preview the images and videos you shoot without having to worry if they are in 2D or 3D
mode. Another interesting feature, available when not shooting in 3D mode, is the ability to
take two 2D pictures with different settings simultaneously, to get the best result without
having to take two different pictures one after the other. And you get all that in a compact,
good-looking camera that is very easy to use for taking 2D or 3D images or shooting videos.

Specifications of Fujifilm FinePix REAL 3D W1

Number of effective pixels: 10 million pixels
CCD sensor: 1/2.3-inch CCD ×2 (two sensors for 3D)
Storage media: Internal memory (approx. 42 MB), SD/SDHC memory card
Still image file format: 3D: MPO+JPEG, MPO (Multi Picture Format compatible); 2D: JPEG (Exif Ver 2.2)
Movie file format: 3D: 3D-AVI (stereo AVI format with 2 image channels); 2D: AVI (Motion JPEG with sound)
Supported resolutions: L: 3648×2736 (4:3) / L: 3648×2432 (3:2) / M: 2592×1944 (4:3) / S: 2048×1536 (4:3) pixels
Lens: Fujinon 3x optical zoom lens, F3.7 (wide) – F4.2 (tele)
Lens focal length: f = 6.3–18.9 mm, equivalent to 35–105 mm on a 35 mm camera
Aperture: Wide: F3.7 / F5 / F8; Telephoto: F4.2 / F5.6 / F9
Sensitivity (ISO): Auto / equivalent to 100 / 200 / 400 / 800 / 1600 (Standard Output Sensitivity)
LCD display: 2.8-inch, approx. 230,000-dot color LCD monitor with Light Direction Control, approx. 100% coverage
Movie recording: 640×480 / 320×240 pixels (30 frames/sec) with stereo sound
Digital interface: USB 2.0 High-Speed
Dimensions: approx. 123.6 (W) x 68 (H) x 25.6 (D) mm
Weight: approx. 260 g

[21], [22], [23], [24]

Sony Bloggie 3D Stereo Camera

The Sony Bloggie 3D is also a digital 3D stereo camera for stereo still images [26], [27]. The
Bloggie 3D can record 1080p 3D video via its two sensors/lenses and also has a 2.4-inch
glasses-free 3D LCD. The camera carries two Exmor CMOS sensors and records Full-HD
1920x1080p MP4 video in 16:9 aspect ratio.

The Sony Bloggie 3D camera is certainly one of the latest advanced stereo vision cameras for
capturing 2D and 3D content at Full HD 1080p (1920x1080) based on the advanced
H.264/MPEG-4 AVC technology. In particular, the MHS-FS3 3D Bloggie HD camera captures
high-definition 2D or 3D videos. 3D videos can be played directly on the 2.4-inch LCD screen
without any special 3D glasses, or watched with others by connecting the Bloggie camera
directly to any compatible 3D HDTV; all 3D video can be watched in 2D as well. The Bloggie
3D comes with dual lenses for recording high-definition 3D video and still images, and 3D
content can be played back from the camera on a compatible 3D HDTV via a 3D-capable
HDMI cable (sold separately). The LCD is also 3D ready, enabling you to view 3D content
directly on the camera's LCD without the need for 3D glasses. The Bloggie 3D lets you record
your favorite moments in High Definition MP4 (H.264) format and features a 5 MP CMOS
sensor that takes crisp 5 MP still images in 2D and 2 MP still photos in 3D. The 2.4-inch LCD
screen rotates its orientation automatically, however you hold the camera: horizontally,
vertically, even upside down.

Figure 36. Sony 3D Stereo Camera

Features:

• 3D video and still image recording capability
• 3D viewing directly on the 2.4-inch LCD, no need for 3D glasses
• 1920x1080p MP4 HD video with 5 MP still images
• Full-screen vertical/horizontal record/playback on the 2.4" LCD
• Up to 4 hrs of 3D or 2D HD video with 8 GB of internal memory

Sony has also discussed color filter pattern innovations, among many other things. The
previous generation of Sony CineAlta CCD-based Super 35mm cameras used a linear RGB
pattern, whereas a Bayer pattern can produce a lot of artifacts:

Figure 37. Sony RGB Sampling at Full HD 1920x1080

Figure 38. RGB and Bayer Pattern


Figure 39. RGB Vs. Bayer

3D Stereo Webcam Video

HYTEK 3D iVCam - Stereo 3D Webcam Suite


The dual-camera setup can be adjusted in the horizontal/vertical direction, as can the
distance between the two cameras. This provides more flexibility for camera alignment, 3D
depth adjustment, and base object distance selection. Besides, it keeps the two webcams as
two separate devices, which is much better than a fixed single 3D webcam design in terms of
value and functionality. The 12 different video mixing modes also provide multiple 2D and
3D viewing methods besides anaglyph 3D.

Figure 40. HYTEK 3D iVCam2.0



This bundle pack includes:

1. Two USB PC cameras (camera exterior color may vary in the actual package)
   (max resolution 800x600; 30 fps at 320x240; 30 fps at 640x480; CMOS; 24-bit RGB; etc.)
2. Four pairs of red/cyan glasses
3. A mini tripod with extendable legs
4. One free license of the HYTEK Stereo 3D Camera driver software (valued at US$29)
5. Quick Installation Guide


AKC D32 3D Stereo Web-camera, with a microphone, stereo 3D/2D camera, 3d webcam

Figure 41. AKC D32 3D Stereo Web-camera

This 3D webcam [28] connects easily to your PC's USB port just like any other webcam, but
that's where the similarities end. The 3D webcam software performs stereoscopic anaglyph
processing that lets you be seen in three dimensions. The red/cyan anaglyph image it
produces can be viewed by anybody wearing commonly available red/cyan 3D glasses (one
pair included). The 3D webcam can also be used as a standard 2D webcam for anyone who
doesn't have the 3D glasses at hand. There is also a Picture-in-Picture function allowing you
to show the two images separately.
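
As a hedged sketch of the underlying anaglyph composition (pure NumPy; assumes two
rectified 8-bit RGB frames of equal size, with the usual red-from-left, cyan-from-right
channel assignment):

import numpy as np

def red_cyan_anaglyph(left_rgb, right_rgb):
    # Red channel from the left eye; green and blue (cyan) from the right.
    out = np.empty_like(left_rgb)
    out[..., 0] = left_rgb[..., 0]
    out[..., 1] = right_rgb[..., 1]
    out[..., 2] = right_rgb[..., 2]
    return out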


Image Sensor: CMOS
Max. Resolution: 1024x768
Interface Type: USB
Style: 3D webcam
Color: blue
Function: built-in microphone
Pixels: 1.2 Mega
Supports live video calls with your favorite instant messenger software (e.g., MSN, Skype, YouTube, etc.)
USB 2.0 full-speed and high-speed compatible
Desktop and monitor mountable
3D snapshot and 3D video capture
High Speed USB 2.0 3D Webcam
2 x VGA 640x480 CMOS sensors
Compatible with WIN2000 / XP / Vista / WIN7

True3Di Stereoscopic 3D Microscopic System

The True3Di 3D stereoscopic microscope system [32] combines hardware and software: the
microscopic monitor (True3Di) and the microscopic software (SIDP). It can be used for
anatomical operations or high-precision electronic parts assembly, providing accurate
distance and depth perception for precise and safe operation.

Figure 42. True3Di Stereoscopic 3D Microscopic System

The 3D stereoscopic display enables many viewers to watch the screen simultaneously.
On each scope, the SM-045 has 2 CCD cameras that transfer the images to the monitor,
rather than requiring the operator to look through the small scope.

Figure 43. True3Di Stereo Vision Concept

The high-performance 3D LifeViz™ system provides 3D medical images, combining a
compact, easy-to-use stereovision camera with image management software. The unique
validated technology of the 3D LifeViz™ produces life-like, high-quality 3D reconstructions.

Solid-Look Imaging Solutions

Solid-Look™ owns and manufactures EndoStereoVision™, a medical high-definition 3D
stereoscopic vision system. Solid-Look is a leader in innovative Full HD 3D stereoscopic
vision for Minimally Invasive Surgery (MIS).
EndoStereoVision™ is a cost-effective, full HD 1080, true-color, 3D stereoscopic vision
solution for Minimally Invasive Surgery.
EndoStereoVision™ is a complete and modular solution, including an innovative set of rigid,
FDA-approved optical stereo laparoscopes coupled with a six-CMOS 1/2" full medical-grade
stereoscopic camera and multiple stereoscopic displays.

Figure 44. Solid-look 3D HD EndoStereoVision Camera


Surgeons see in 3D stereo in real time, with a resolution starting from 1080 lines per eye and
true-color images displayed at 30 frames per second.
Thanks to these technologies, surgeons can record in real time and play back full
high-definition 3D stereoscopic videos in any format on any computer, using any 3D display.


In simple words, everyone sees stereoscopically what EndoStereoVision™ captures - from
any position inside or outside the surgery room and without any need of special helmets or
any infrared wireless synchronization.
Resolution is excellent, as are the perception of depth and the comfort of the viewing
position. These three factors are extremely important for any surgeon and team, reducing
surgery time while improving diagnostic and surgical precision and enabling safer operations.
EndoStereoVision™ is delivered as:
• a standalone solution for MIS,
• add-on to the best surgical robots
• add-on to surgical stereo microscope
to allow full high-definition 3D stereoscopic visualization on any screen from 10" to 200", to
record full high-definition video and images in any format, and to play back any 3D video at
any time, as well as to transmit/stream in real time to extend visualization to other parts of
the hospital, the city, or the continent, allowing surgical collaboration in real time.
EndoStereoVision™ is also a great tool for creating stereo-video libraries suitable for direct
broadcasting or long-term archiving: a must-have in medical training.
Stereo movies can be encrypted and digitally signed under state-of-the-art security
standards to guarantee authenticity, confidentiality, integrity, and evidence of time.
Solid-Look also provides 3D services for its customers:
• 3D images for IT workflow and video preservation,
• Design of new 3D full high definition medical innovation laboratories and operating
room
• Support for transmission of 3D HD images for conferences or surgery collaboration.
• 3D HD post production video editing for paper and conferences submission.

Professional, complete and modular Stereoscopic solutions for MIS Medical market.
Manual 3D stereoscopic vision system and integration with best surgical robots.
• 3D Stereoscopic FDA approved rigid optical scopes (10mm, 5mm(in development))
• Multiple format 3D stereoscopic Full HD visualization in real-time
• Recording and playback of Full HD 3D stereoscopic images in real-time.
• Full high definition 3D Medical Grade camera (Six 1/2" Sensors - 30fps at
1920x1080p)
• Stereoscopic 3D displays from 19" to 300", from single viewer with no glasses to
multiple viewers in full high definition (1920x1080 per each eye).
• 3D Stereoscopic HD Video Image Streaming and Satellite transmission in Real-Time

[34] http://www.solid-look.com/

Surveyor Stereo Vision System ("SVS")



Figure 45. Surveyor Stereo Vision System (SVS)

[36]
Two Full HD (2K) cameras, synchronized on a common base, creating a stereo Full HD
camera system.

In 2006-2007, the research group of the Multimedia Lab (MLab) at Florida Atlantic University
carried out large-scale work on 3D vision imaging systems, starting multiple R&D projects on
3D stereo and multi-view vision systems based on advanced 3D image-video processing
algorithms, multi-view video coding and interpolation techniques, and 3D stereo depth image
processing and analysis [37], [38].

The Center for Coastline Security Technology (CCST) focuses on research, simulation, and
evaluation of coastal defense and marine domain awareness equipment, sensors and
components. It builds upon the existing efforts and expertise in coastal systems and sensor
research at the Institute for Ocean and Systems Engineering (IOSE), the NASA Imaging
Technology Center, the Department of Computer Science and Engineering and the
University Consortium for Intermodal Transportation Safety and Security at Florida Atlantic
University.

Figure 46 shows the general architecture of the CCST 3D video system. The stereo views are
encoded at the sender by exploiting the large amount of redundancy among the views.
Asymmetric view coding makes it possible to trade off video quality per view: the quality of
one view can be dropped to reduce the video bit rate without significantly affecting the
perceived 3D quality. The MLab research group used the H.264/MPEG-4 AVC standard as the
core compression engine, with inter-view prediction to increase compression efficiency. The
coded views are communicated to the receiver, where the decoded views are rendered on an
appropriate display. The 3D displays use a pair of coded views (left and right) to display 3D
video with depth perception.


Figure 46. CCST 3D Stereo Vision System

The Sharp LL-151-3D autostereoscopic display was used to render the stereoscopic videos.
The display is a 15-inch, XGA-resolution (1024 by 768 pixels) panel that renders depth very
accurately and gives a true 3D experience. The perception of depth is achieved by a parallax
barrier that diverts different patterns of light to the left and right eyes.

For a low-cost 3D stereo vision camera system, the MLab research group utilized Panasonic
SD/HD camera video recorders ….

Figure 47. HD Panasonic 3D Stereo Camera Vision System


The MLab-CCST research group has developed a 3D/multi-view video coding system
(Figure 48) with an initial focus on security and surveillance. The goal of this project is to
develop technologies and tools for efficient compression, communication, and playback of
multi-view and 3D video.

Figure 48. 3D Stereo/Multi-view video H.264/MPEG4 AVC system

Figure 48 shows the general architecture of a multi-view video system. The multiple views
are encoded at the sender by exploiting the large amount of redundancies among the views.
We use H.264 as the core compression engine with inter-view prediction to increase
compression efficiency. The coded views are communicated to the receiver where the
decoded views are rendered on an appropriate display. The 3D displays use a pair of coded
views to display 3D video with depth perception.

The main technical objectives of the research, analysis, and development of the 3D
Stereo/Multi-view Camera System were to support and advance the following:
• 3D Video Compression
• 3D video improvement of surveillance applications
• Efficient compression of stereo views
• Development of a 3D video player for Sharp autostereoscopic displays – no glasses
required
• Asymmetric video encoding that exploits the human visual system
• One of the stereo views can be coded at lower quality
• Multi-view Video Coding
• Arrays of cameras record the same scene
• Video compressed by exploiting redundancies among views

The pair of color HDMAX cameras (Figure 50) was used for the 3D video imaging system. The
ultra-high-resolution (UHR) camera system's development, supported by the NASA Imaging
research group at FAU, the US Navy, and private industry, has resulted in a QUAD HD CMOS
camera system with 8 times the resolution of HDTV at 60 fps; no commercial camera in the
world is available with comparable specifications.

The main technical objective of the CCST research group was to develop a new advanced
HDMAX high-resolution QUAD HD progressive-scan electronic camera system (Figure 49) to
support 3D imaging and 3D video technologies for coastline security applications.

Figure 49. HD MAX Camera Sensor for Coastline Surveillance

Two Quad HD cameras equipped with 50 mm lenses, as shown in Figure 50, were mounted
side by side, and polarization-coded 3D imagery was projected onto the screen with good
results. The 3D Stereo Quad HD-MAX camera system in Figure 50 is characterized by the
following research findings:
• Thickness variations in the linear polarizers were found to introduce aberrations in the
imaging system with a subsequent reduction in image sharpness. Thinner, optically flatter
polarizing film was ordered and tested with positive results.
• Expected differences in the psychological impact of the 3D imagery were observed
when camera separation and angular orientation were changed.
• The 3-D imaging system, combining two cameras with the two projectors, was tested
on multiple occasions with different groups of people as observers.

Figure 50. The pair of color HDMAX cameras used for the
NASA-CCST 3D Stereo Video Imaging system.

3D playback and a real-time viewing experience were accomplished by interfacing the 3D
Stereo HDMAX Camera System with a Sony SRX-R105 projector configuration. The 3D Stereo
HDMAX Camera System was capable of delivering 3D imaging at the NASA Imaging
Technology Center and the MLab research group at Florida Atlantic University. This
configuration gave pleasing results, although it does not provide full
realism in the display because of discrepancies between the apparent distance to an object
as determined by the angular subtense of the object and the apparent distance to the object
as determined by the angular convergence of the observer’s two eyes. In simple terms, it
distorts depth somewhat. Basic configuration information is as follows.

The objective of this segment of the project was to develop a high definition, high frame rate
color 3D Stereo HDMAX Camera System for surveillance. A 3840x2160 30P color Stereo
HDMAX CMOS camera with variable frame rate and remotely controlled infrared filter
changers was designed, fabricated, tested and demonstrated. The camera gathers 50 times
the amount of information in its field of view as with standard resolution video cameras.

Camera Positioning
The two HDMAX cameras, equipped with their standard 50-mm focal length Canon
camera lenses, are positioned side-by-side and separated by approximately 4 inches. At this
separation, the cameras will produce 3D imagery that exaggerates somewhat the depth of
the scene as viewed on the projection screen. The cameras should both be level and at the
same elevation. They are then aligned such that the optical axes of the cameras converge
approximately 12 feet in front of the camera pair.
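
As a small worked check of this geometry (the 4-inch separation and 12-foot convergence
distance come from the text above; the per-camera toe-in angle is a derived, illustrative
quantity):

import math

baseline_in = 4.0             # camera separation, inches
convergence_in = 12.0 * 12.0  # convergence distance, 12 feet in inches

# Each camera's optical axis is toed in toward the convergence point
toe_in = math.degrees(math.atan((baseline_in / 2.0) / convergence_in))
print(f"toe-in per camera: {toe_in:.2f} degrees")  # about 0.80 degrees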

Figure 51.a Pt. Hueneme image test target at 3 miles – long shot


Figure 51.b Pt. Hueneme image test target at 3 miles – 5x zoom
Figure 51.c Pt. Hueneme image test target at 3 miles – 10x zoom

Beyond the State-of-the-art

3D Stereo Vision for Multimodal Segmentation, Detection, Recognition,


and Tracking System

Summary
This section provides technical documentation and an introductory overview of current
research activities in 3D stereo image and video analysis algorithms for a human-object
segmentation, detection, recognition, and tracking system. The system is being developed at
Cyprus University of Technology by Lakis Christodoulou and Takis Kasparis. The multimedia
image-video processing and analysis research group at the Dept. of Electrical and Computer
Engineering and Computer Science (E&CE&CS) at CUT has focused on developing robust
techniques and methodologies and advanced, innovative statistical adaptive image-video
processing algorithms for 3D stereo video, investigating and developing the techniques and
technologies needed to create and analyze 3D images and 3D videos from low-cost stereo
standard-definition (SD) and high-definition (HD) cameras, with a specific focus on human
surveillance, border and coastline security applications.


Proposed Methodology – Design Architecture

We propose to implement advanced and novel 3D stereo vision algorithms for segmentation,
detection, recognition, and tracking of video objects, and for video analysis using depth
information and statistical image-video processing.

Figure 52 illustrates how these research efforts fit into the overall surveillance system,
which is based on a low-cost, mounted stereo vision webcam.

Figure 52. General framework of a Stereo or Multi-camera Video Surveillance System

For our 3D stereo vision camera system we propose the use of a low-cost stereo vision
solution:

Figure 53. Minoru 3D SD/HD Stereo Web-Camera


Hardware Specs
Stereo Web Camera – Minoru (70 Euros only!)
Laptop Intel® Core™ i5 CPU – M 450 @ 2.40GHz
64-Bit Operating System – Windows 7


No 3D autostereoscopic display!
3D video content evaluation based on 3D depth video processing, with display in 2D format
Software Specs
MATLAB R2009a V.7.2.
Image Processing Toolbox
Signal Processing Toolbox
Algorithm Matlab m-script development

Introduction
The current research project proposes a novel hybrid motion object detection and
segmentation algorithm based on a statistical, adaptive threshold. Moving object detection
and segmentation are very important in intelligent video surveillance. The main motivation
of this work is to overcome the technical difficulties of existing motion detection and
segmentation techniques and realize an efficient, fast detection and segmentation
algorithm. The driving motivation is to use the proposed hybrid motion object detection and
segmentation algorithm in a 3D stereo/multi-view image sensor system that introduces
multi-data fusion and modeling for smart video surveillance and motion object monitoring.
The new hybrid object detection and segmentation algorithm would be deployed in a
prototype 3D stereo vision system for automatic, intelligent surveillance and monitoring that
provides object recognition and tracking. For stereo or multi-view video capturing, recording,
processing, and analysis, it is very important to develop efficient, robust, and fast detection
and segmentation algorithms. The research project focuses on developing intelligent,
biologically inspired image and video analysis algorithms capable of performing relevant
human or other object motion surveillance tasks based on visual information acquired from
one or more cameras.
We introduce a smart and efficient algorithm based on motion detection and
foreground/background segmentation, using DSP and adaptive threshold techniques that are
superior to existing conventional motion object detection and segmentation algorithms. The
proposed algorithm is based on a hybrid motion technique that relies on three-frame
differencing and on statistical quantities such as the mean and the variance. The algorithm
also includes a foreground-background segmentation methodology that is combined with a
motion detection algorithm. The hybrid motion algorithm has been tested and verified on
gate entrance and access control for a human object surveillance system. Experimental
results show that the improved hybrid motion algorithm overcomes the technical difficulties
of the three-frame differencing method. The hybrid motion algorithm has low computational
complexity, a high detection and segmentation accuracy rate, and fast processing speed.
The methodology also provides a low-cost webcam solution for visual surveillance and
automated monitoring applications that is efficient and robust for video analytics. The main
benefit is the development of a novel hybrid motion detection and segmentation video
object algorithm based on adaptive and statistical DSP algorithms.
multi-object detection-segmentation-recognition, and incorporating 3D Stereo/Multi-view

Lakis Christodoulou, All Rights Reserved ® 2013 , CopyRights© 2013 30-Oct-2013


Electrical & Computer Engineering & Computer Science
Lakis Christodoulou FINAL

video processing based on 3D computer vision, 3D Depth maps, and 3D object


reconstruction.
The goal of a visual surveillance system is to detect abnormal object behaviors and to raise
alarms when such behaviors are detected. After moving objects are detected, it is essential
to classify them into pre-defined categories, so that their motion behaviors can be
appropriately interpreted in the context of their identities and their interactions with the
environment. Consequently, object classification is a vital component in a complete visual
surveillance system.

Work Done
• Camera Environment Modeling
• Object Segmentation
• Object Detection

Current Status
• Disparity Map
• 3D Depth Map
• Correspondence Matching Algorithms
• 3D Stereo Vision Algorithms
• Evaluation Algorithms
• Stereo to Left and Right video channel conversion

Next Targets
• Object Classification
• Object Tracking
• Multi-Object Detection

Currently we have also proceeded with the following accomplished work:

1. 3D Image-Video Depth Processing, Rendering, and Reconstruction
2. 3D Stereo Image-Video Processing and Synthesis

Future Work

Design and implement a novel IR (infrared) stereo vision camera system with:
• The development of a remotely piloted platform for border and coastline surveillance
• An infrared, high-definition, high-frame-rate color stereo camera for surveillance
• Video analysis and image and video data mining
• Surveillance, monitoring, and detection simulation
• Infrared image sensors
• SD and Full-HD spatial resolution
• 3D laser scanner
• 3D laser telemetry
• Video-DSP analytics
• Real-time video processing
• Pan-tilt-zoom capabilities
• GPS/GPRS
• Wireless and radio communication
• Border surveillance and monitoring
• Coastline surveillance and security
• Stereo vision for medical endoscopy
• Stereo vision for automotive vision

3D VISION CAMERAS
Solution to challenges: the range/3D vision camera.
A 3D vision camera is a 2D camera plus depth perception.
Low cost: one sensor vs. current systems:
- SRR 24 GHz (30 m) + LRR 77 GHz (150 m)
- LRR (150 m) + stereo camera (25 m)
- Laser radar (40 m)
High performance: combined range detection and obstacle classification (2D pixel array).
Two main techniques:
• Stereo vision
• Time of flight (TOF)

3D VISION CAMERAS – STEREO VISION
Range measurement using stereo vision:
▸ Depth perception is achieved by comparing and processing the scene from two different
points of view.
▸ Main drawbacks:
- 2 sensors
- Very complex processing (the correspondence problem)
- Alignment-sensitive (mechanically weak) systems


Task: Object detection and segmentation algorithms
Goal: To improve upon our existing solution and provide summarized information about
objects of interest tracked across multiple scenes, answering the question: is it human?

Task: Human motion analysis algorithms
Goal: To investigate and implement algorithms capable of determining which type of activity
the people in a scene are engaged in (and whether it can be labeled as suspicious behavior).

Task: Biologically inspired models of visual attention in human visual surveillance
Goal: To investigate and implement algorithms that rely on computational models of visual
attention and novelty detection to solve visual surveillance problems.
Research Work Carried Out (Submitted)

Research Literature

"A NOVEL HYBRID MOTION OBJECT DETECTION AND SEGMENTATION ALGORITHM BASED ON
A STATISTICAL AND ADAPTIVE THRESHOLD"

[Figure residue: block diagram of the hybrid foreground/background motion detection and
segmentation algorithm with a statistical adaptive threshold, operating over frames F0 ...
Fi-2, Fi-1, Fi ... Fn-1]

Figure 54. Foreground and Background Detection and Segmentation Algorithm

SIPA 2011 Signal and Image Processing and Applications
Paper Submission Deadline: Jan 15, 2011
Main Conference: 22-24 June 2011, Crete, Greece
http://www.iasted.org/conferences/cfp-738.html
[738-014ljd.pdf]


In 2011 we presented the research work ADVANCED STATISTICAL AND ADAPTIVE THRESHOLD
TECHNIQUES FOR MOVING OBJECT DETECTION AND SEGMENTATION at the DSP2011
International Conference.
The proposed statistical adaptive algorithm is implemented in four main stages: i) statistical
analysis and computation based on the spatial content of the current frame; ii) temporal
n-frame differencing, consisting of the two-frame and three-frame differencing methods for
object detection; iii) adaptive thresholding based on the robust statistical quantities
computed in (i) and on the temporal differencing, taking into account the variations of the
moving pixels; and iv) segmentation of foreground moving objects and background
non-moving regions based on the statistical comparison of the two-frame and three-frame
differences. Figure 55 shows the analytical processing stages of the statistical and adaptive
algorithm for motion object detection and segmentation.
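
As a hedged, heavily simplified sketch of these stages in NumPy (my own reduction for
illustration, not the authors' exact implementation; the threshold mean + k·std and the
constant k are assumptions):

import numpy as np

def detect_moving_pixels(f_prev, f_curr, f_next, k=2.0):
    # Frames are grayscale arrays; cast to float to avoid uint8 wrap-around.
    p, c, n = (f.astype(float) for f in (f_prev, f_curr, f_next))
    d1 = np.abs(c - p)                     # FrameDiff_I:  n - (n-1)
    d2 = np.abs(n - c)                     # FrameDiff_II: (n+1) - n
    diff = np.minimum(d1, d2)              # pixel must move in both differences
    thresh = diff.mean() + k * diff.std()  # statistical adaptive threshold
    return diff > thresh                   # True = foreground (moving) pixel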
[Figure residue: image pre-processing over pixel columns i and rows j; classic statistical
threshold analysis; single-frame and three-frame differencing over frames n0, n1, n2, n3 ...
nF-1; adaptive threshold; foreground/background segmentation]

Figure 55. Statistical Adaptive Techniques

[Figure residue: frame stack Frame 0 ... Frame F-1 with pixel p(i, j); FrameDiff_I: n - (n-1);
FrameDiff_II: (n+1) - n]

Figure 56. Three-Frame Differencing Concept



[Figure residue: Video Input -> Block 1: Image Analysis & Pre-processing -> Block 2: Hybrid
Motion Algorithm -> Block 3: Statistical Adaptive Threshold -> Block 4: Object Detection and
Segmentation]

Figure 57. Hybrid Motion-Segmentation-Detection Algorithm: Abstract Block Diagram

ADVANCED STATISTICAL AND ADAPTIVE THRESHOLD TECHNIQUES FOR MOVING OBJECT
DETECTION AND SEGMENTATION
DSP 2011 International Conference on Digital Signal Processing
Image/Video Processing Techniques
Paper Submission Deadline: January 14, 2011
Main Conference: 06-08 July 2011, Corfu, Greece
http://www.dsp2011.gr/call
[dsp2011_submission_29.pdf]

AN IMPROVED VIDEO OBJECT DETECTION ALGORITHM BASED ON AN ADAPTIVE MOTION
HISTOGRAM THRESHOLD

The proposed hybrid motion detection and segmentation algorithm relies on three main
stages: i) the three-frame-differencing method for object detection, ii) an adaptive threshold
based on the computation of robust statistical quantities, and iii) segmentation into
foreground moving objects and background non-moving regions.


[Figure residue: Video Input -> Block 1: Image Analysis & Pre-processing -> Block 2: Hybrid
Motion -> Block 3: Statistical Adaptive Threshold -> Block 4: Object Detection and
Segmentation]

Figure 58. Hybrid Motion Object Detection & Segmentation Algorithm: Analytical Block Diagram

[SAMHT_15JAN_SIPA2011.doc]

Lakis Christodoulou, Takis Kasparis, and Christos Loizou, "A Novel Hybrid Motion Object
Detection and Segmentation Algorithm Based on a Statistical and Adaptive Threshold,"
Dept. of Electrical Engineering & Information Technology (EE&IT), Cyprus University of
Technology, and Intercollege, Dept. of Computer Science, School of Sciences, Limassol,
Cyprus; 4th Cyprus Workshop on Signal Processing and Informatics, University of Cyprus,
Nicosia, Cyprus, New Campus, THEE001 Room 148, July 14, 2011.
http://cwspi2011.cs.ucy.ac.cy
http://www.dsp-conferences.info/4th%20CWSPI%20Final%20Program-rev.pdf

3D Stereo Vision System Technical Seminar

An Overview of 3D Vision Systems and 3D HD TV Display Technologies


[Figure residue: left and right cameras produce a side-by-side stereo image, split into left and
right video channels; each channel passes through IA&P (Image Analysis & Preprocessing); a
stereo vision algorithm computes disparity and a 3D dense depth map, followed by statistical
analysis. IA&P: Image Analysis & Preprocessing]

Figure 59. Stereo Vision Processing – Algorithm System Architecture

A Low-Cost Stereo Vision System for Human Surveillance

A NEW STEREO VISION SYSTEM FOR HUMAN OBJECT MONITORING AND SURVEILLANCE

ICCV International Conference on Computer Vision
Paper Submission Deadline: March 1, 2011
Main Conference: 8-11 November 2011, Barcelona, Spain
http://www.iccv2011.org/call-for-papers

14th International Conference on Computer Vision (ICCV 2013)
http://www.iccv2013.org/index.html

[ICCV_01MARCH2011.doc]

A NOVEL ARCHITECTURE OF A 3D STEREO VISION SYSTEM


EURASIP Journal on Image and Video Processing
Published by Hindawi
Paper Submission Deadline: OPEN
http://www.hindawi.com/journals/ivp/

[ICCV_01MARCH2011.doc]

A NEW STEREO VISION SYSTEM FOR HUMAN OBJECT MONITORING AND SURVEILLANCE

References
[1] http://www.chattenassociates.com/content/harv-isr-gimbal
[2] http://www.chattenassociates.com/content/harv-3d-hd-stereo-vision-system
[3] http://www.e-consystems.com/Stereo-Vision-Camera.asp
[4] http://www.ptgrey.com/products/stereo.asp
[5] http://sharp-world.com/corporate/news/100512.html
[6] http://qvcorp.com/#loc=testimonials
[7] http://www.roadnarrows.com/store/e-con-cameras-solutions/capella-stereo-vision-camera.html
[8] http://astrobotic.net/2011/02/21/stereo-vision-for-3d-mapping-and-navigation/
[9] Alberto Broggi, Massimo Bertozzi, Mirko Felisa, Paolo Grisleri, and Michael Del Rose, "Combining camera systems for human shape detection," Defense & Security, SPIE Newsroom, 9 February 2009, DOI: 10.1117/2.1200902.1472. http://spie.org/x33633.xml
[10] http://www.tyzx.com/products/DeepSeaG2.html
[11] http://www.mobilerobots.com/researchrobots/accessories/mobilerangerc3d.aspx
[12] http://www.focusrobotics.com/products/systems.html
[14] http://www.videredesign.com/index.php?id=1 ; http://www.videredesign.com/assets/docs/manuals/smallv-4.4d.pdf
[15] FLIR: http://www-robotics.jpl.nasa.gov/publications/Arturo_Rankin/matthies98performance.pdf
[16] Minoru 3D Stereo Webcam: http://books.google.com.cy/books?id=djsHEnqIdFoC&pg=PA53&lpg=PA53&dq=6D-Vision+diagram&source=bl&ots=BfhFlzSMx1&sig=juFIuPLFIj68Epat_KUkr5hiZkg&hl=el&sa=X&ei=IMq3T_zUKoWm0QWcoNjyBw&ved=0CFgQ6AEwAA#v=onepage&q=6D-Vision%20diagram&f=false
[16] Uwe Franke, Clemens Rabe, Stefan Gehrig, Hernan Badino, and Alexander Barth, "Dynamic Stereo Vision for Intersection Assistance," Daimler AG, Group Research, Germany; University of Kiel; University of Frankfurt.
[17] http://www.mercedes-benzsa.co.za/media-room/news/15032386811/daimler-6d-vision-technology-as-one-of-the-most-promising-innovations-in-germany/
[18] Uwe Franke, Clemens Rabe, Hernan Badino, and Stefan Gehrig, "6D-Vision: Fusion of Stereo and Motion for Robust Environment Perception," DaimlerChrysler AG, 70546 Stuttgart, Germany.
[19] http://www.deutscher-zukunftspreis.de/en/nominierter/6d-vision-%E2%80%93-recognizing-danger-faster-humans
[20] http://www.6d-vision.com/
[21] http://3dvision-blog.com/tag/fujifilm-3d-camera/
[22] http://www.fujifilm.com/products/3d/camera/finepix_real3dw1/
[23] www.drt3d.com/W06-Fuji3d.pdf
[24] http://www.digitalcamerareview.com/default.asp?newsID=4756&review=sony+bloggie+3d
[25] ISee3D stereoscopic imaging system
[26] http://thetechjournal.com/electronics/camera-electronics/sony-bloggie-3d-now-available.xhtml#ixzz1vUmcstDu
[27] http://www.docs.sony.com/release/MHSFS3_handbook.pdf
[28] http://www.hytekautomation.ca/BNE001.aspx?productId=20
[29] http://www.aliexpress.com/product-fm/513577626-Free-shipping-popular-3D-webcam-3D-stereo-camera-with-a-microphone-stereo-3D-2D-camera-wholesalers.html?tracelog=back_to_detail_a
[30] http://www.htc.com/www/smartphones/htc-evo-3d/
[31] http://dl3.htc.com/htc_na/user_guides/htc-evo-3d-sprint-ug.pdf
[32] http://www.true3di.com/3d-microscopic.html
[33] http://www.quantificare.com/index.php?option=com_content&view=article&id=9&Itemid=53
[34] http://www.solid-look.com/
[36] http://www.surveyor.com/stereo/stereo_info.html
[37] Lakis Christodoulou, Liam M. Mayron, Hari Kalva, Oge Marques, and Borko Furht, "3D TV Using MPEG-2 and H.264 View Coding and Autostereoscopic Displays," Proceedings of the 14th Annual ACM International Conference on Multimedia, 2006.
[38] Lakis Christodoulou, Hari Kalva, Liam Mayron, Oge Marques, and Borko Furht, "Challenges and Opportunities in Video Coding for 3D TV," Dept. of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431.
[39] Glennoah Billie, "Microsoft Kinect Sensor Evaluation," NASA USRP Internship Final Report, Johnson Space Center, 8/5/2011; Southwestern Indian Polytechnic Institute, Albuquerque, New Mexico, 87184.
[40] Daria Nitescu, Denis Lalanne, and Matthias Schwaller, "Evaluation of Pointing Strategies for Microsoft Kinect Sensor Device," Final Project Report, University of Bern, University of Neuchatel, University of Fribourg, 14 February 2012.
[41] http://www.digitalmanu.com/pr-e.htm
[42] True-View™ – The 3D Device for Your Smartphone: http://www.bornrich.com/true-view-the-3d-device-for-your-smartphone.html