KEYWORDS: 3D modeling, Buildings, Cameras, Optical filters, Sensors, Data modeling, Systems modeling, Performance modeling, Detection and tracking algorithms, Roads
A mobile robot operating in a netcentric environment can utilize offboard resources on the network to improve its
local perception. One such offboard resource is a world model built and maintained by other sensor systems. In this
paper we present results from research into improving the performance of Deformable Parts Model object detection
algorithms by using an offboard 3D world model. Experiments were run for detecting both people and cars in 2D
photographs taken in an urban environment. After generating candidate object detections, a 3D world model built from
airborne Light Detection and Ranging (LIDAR) and aerial photographs was used to filter out false alarms using several
types of geometric reasoning. Comparison of the baseline detection performance to the performance after false alarm
filtering showed a significant decrease in false alarms for a given probability of detection.
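One of the geometric filters described above can be sketched as a physical-plausibility check: given a detection's pixel height and the range to that scene point from the 3D world model, the implied real-world height of the object must be reasonable for the target class. This is a minimal illustration; the `Detection` fields, thresholds, and the way ranges are looked up are assumptions, not the paper's actual interface.

```python
# Hedged sketch: prune person detections whose implied physical height is
# implausible, given ranges obtained from a (hypothetical) 3D world model.
from dataclasses import dataclass

@dataclass
class Detection:
    u: float            # column of box center (pixels)
    v_bottom: float     # row of box bottom (pixels)
    pixel_height: float # bounding-box height (pixels)
    score: float        # detector confidence

def implied_height_m(det, range_m, focal_px):
    # Pinhole approximation: world height ~ pixel height * range / focal length.
    return det.pixel_height * range_m / focal_px

def filter_people(detections, ranges_m, focal_px, lo=1.2, hi=2.2):
    """Keep only detections whose implied height is plausible for a person."""
    kept = []
    for det, r in zip(detections, ranges_m):
        if lo <= implied_height_m(det, r, focal_px) <= hi:
            kept.append(det)
    return kept
```

A 100-pixel-tall box 17 m away under a 1000-pixel focal length implies a 1.7 m object and survives; the same box 80 m away would imply an 8 m "person" and is rejected.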
Recent work in computer vision has demonstrated the potential to automatically recover camera and scene geometry
from large collections of uncooperatively collected photos. At the same time, aerial ladar and Geographic Information
System (GIS) data are becoming more readily accessible. In this paper, we present a system for fusing these data
sources in order to transfer 3D and GIS information into outdoor urban imagery. Applying this system to 1000+ pictures
shot of the lower Manhattan skyline and the Statue of Liberty, we present two proof-of-concept examples of geometry-based
photo enhancement which are difficult to perform via conventional image processing: feature annotation and
image-based querying. In these examples, high-level knowledge projects from 3D world-space into georegistered 2D
image planes and/or propagates between different photos. Such automatic capabilities lay the groundwork for future
real-time labeling of imagery shot in complex city environments by mobile smart phones.
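The cross-photo propagation described above reduces, in the simplest case, to projecting an annotated 3D world point into each georegistered camera. The sketch below uses invented intrinsics and poses purely for illustration; the paper's camera-recovery pipeline is not shown.

```python
# Minimal sketch: a label attached to a known 3D world point is projected
# into each georegistered camera. All camera parameters here are invented.
import numpy as np

def project(K, R, t, X):
    """Project world point X (3,) through intrinsics K, rotation R, translation t."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0,    0.0,   1.0]])
R = np.eye(3)
X = np.array([0.0, 0.0, 50.0])        # annotated landmark, 50 m ahead

uv1 = project(K, R, np.zeros(3), X)              # first photo
uv2 = project(K, R, np.array([-5.0, 0.0, 0.0]), X)  # a camera 5 m to the side
```

The same world-space annotation thus lands at different pixel coordinates in each photo, which is exactly what lets labels "propagate between different photos."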
KEYWORDS: Video, Databases, 3D video streaming, Signal processing, Sensor fusion, Target recognition, Current controlled current source, Machine vision, Computer vision technology, Reconstruction algorithms
We extend recent automated computer vision algorithms to reconstruct the
global three-dimensional structures for photos and videos shot at fixed
points in outdoor city environments. Mosaics of digital stills and
embedded videos are georegistered by matching a few of their 2D features
with 3D counterparts in aerial ladar imagery. Once image planes are
aligned with world maps, abstract urban knowledge can propagate from the
latter into the former. We project geotagged annotations from a 3D map
into a 2D video stream and demonstrate that they track buildings and streets
in a clip with significant panning motion. We also present an interactive
tool which enables users to select city features of interest in video
frames and retrieve their geocoordinates and ranges. Implications of this
work for future augmented reality systems based upon mobile smart phones
are discussed.
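The interactive geocoordinate-and-range lookup mentioned above amounts to back-projecting a pixel ray and intersecting it with the 3D scene. The sketch below intersects with a flat ground plane (z = 0) only; a real system would march the ray through a ladar height map, and the camera values here are assumptions for illustration.

```python
# Sketch of pixel -> geocoordinate lookup for a georegistered frame:
# back-project the pixel ray and intersect it with the ground plane z = 0.
import numpy as np

def pixel_to_ground(K, R, C, u, v):
    """Return (world point, range) where the ray through pixel (u, v) hits z = 0.

    K: 3x3 intrinsics, R: world->camera rotation, C: camera center in world frame.
    Returns None if the ray never reaches the ground.
    """
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray in camera frame
    d_world = R.T @ d_cam                             # rotate into world frame
    if d_world[2] == 0 or -C[2] / d_world[2] <= 0:
        return None                                   # ray parallel or upward
    s = -C[2] / d_world[2]                            # scale to reach z = 0
    X = C + s * d_world
    return X, float(np.linalg.norm(X - C))            # geocoordinate and range
```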
Motivated by the problem of uncovering networks of illicit actors in complex urban environments, we present a
prototype system for intuitive navigation of vehicle track data via interacting map and network views. Our system
combines 3D geospatial visualization, social network display and interactive track search software, and it provides a
multi-touch interface for operators to navigate urban scenes and investigate potentially suspicious vehicle activity. We
describe a case study to highlight the system's capabilities using ground truth vehicle data collected during a 2007 urban
exercise. This data is most naturally viewed as tracks in space and time. But as cluttered track displays obscure
potentially important actor relationships, our system provides a social network picture whose condensed format is easier
to interpret. Through coordinated space-time/vehicle network searches, we demonstrate how analysts can uncover "Red"
activities of tactical significance.
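One way to condense cluttered track displays into a network picture, as described above, is to link two vehicles whenever they are co-located within some distance at the same time. The track format and threshold below are invented stand-ins; the system's actual association logic is not specified in the abstract.

```python
# Hedged sketch: derive a "meeting" network from vehicle tracks by linking
# vehicles observed within max_dist meters of each other at the same timestamp.
from collections import defaultdict
from math import hypot

def meeting_graph(tracks, max_dist=25.0):
    """tracks: {vehicle_id: {t: (x, y)}}. Returns adjacency of co-located pairs."""
    graph = defaultdict(set)
    ids = sorted(tracks)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            for t in tracks[a].keys() & tracks[b].keys():  # shared timestamps
                ax, ay = tracks[a][t]
                bx, by = tracks[b][t]
                if hypot(ax - bx, ay - by) <= max_dist:
                    graph[a].add(b)
                    graph[b].add(a)
                    break                                  # one meeting suffices
    return dict(graph)
```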
KEYWORDS: LIDAR, 3D image processing, Geographic information systems, Cameras, Digital photography, Photography, 3D acquisition, Buildings, Satellites, Satellite imaging
Working with New York data as a representative and instructive example, we fuse aerial ladar imagery with satellite
pictures and Geographic Information System (GIS) layers to form a comprehensive 3D urban map. Digital photographs
are then mathematically inserted into this detailed world space. Reconstruction of the photos' view frusta yields their
cameras' locations and pointing directions which may have been a priori unknown. It also enables knowledge to be
projected from the urban map onto georegistered image planes. For instance, absolute geolocations can be assigned to
individual pixels, and GIS annotations can be transferred from 3D to 2D. Moreover, such information propagates among
all images whose view frusta intercept the same urban map location. We demonstrate how many imagery exploitation
challenges (e.g., identifying objects in cluttered scenes, selecting all photos containing some stationary ground target, etc.)
become mathematically tractable once a 3D framework for analyzing 2D images is adopted. Finally, we close by briefly
discussing future applications of this work to photo-based querying of urban knowledge databases.
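The "select all photos containing some stationary ground target" query above becomes a simple frustum-containment test once cameras are reconstructed: project the target into each recovered camera and keep the photos where it lands in front of the camera and inside the image. The poses and image size below are invented stand-ins for the recovered frusta.

```python
# Sketch of photo selection by frustum containment for a fixed ground target.
import numpy as np

def sees_target(K, R, t, X, width, height):
    """True if world point X projects inside a width x height image."""
    x = K @ (R @ X + t)
    if x[2] <= 0:                         # target behind the camera plane
        return False
    u, v = x[0] / x[2], x[1] / x[2]
    return 0 <= u < width and 0 <= v < height
```

Running this test over every reconstructed camera yields exactly the subset of photos whose view frusta intercept the target's map location.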
We assess the impact of supplementing two-dimensional video with three-dimensional geometry for persistent vehicle
tracking in complex urban environments. Using recent video data collected over a city with minimal terrain content, we
first quantify erroneous sources of automated tracking termination and identify those which could be ameliorated by
detailed height maps. They include imagery misregistration, roadway occlusion and vehicle deceleration. We next
develop mathematical models to analyze the tracking value of spatial geometry knowledge in general and high resolution
ladar imagery in particular. Simulation results demonstrate how 3D information could eliminate large numbers of false
tracks passing through impenetrable structures. Spurious track rejection would permit Kalman filter coasting times to be
significantly increased. Track lifetimes for vehicles occluded by trees and buildings as well as for cars slowing down at
corners and intersections could consequently be prolonged. We find that high-resolution 3D imagery can ideally yield an
83% reduction in the rate of automated tracking failure.
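The spurious-track rejection described above can be sketched as a coasting predictor that is killed the moment its constant-velocity prediction enters an impenetrable structure. The building footprint and coast limit below are illustrative assumptions, not the paper's model.

```python
# Sketch of 3D-aware coasting: a constant-velocity prediction may coast
# through an occlusion only while it stays off building footprints taken
# from a (hypothetical) height map.
def coast_track(pos, vel, n_steps, is_blocked, dt=1.0):
    """Predict forward up to n_steps; stop early if the prediction enters
    a structure. Returns the list of coasted positions."""
    x, y = pos
    vx, vy = vel
    path = []
    for _ in range(n_steps):
        x, y = x + vx * dt, y + vy * dt
        if is_blocked(x, y):     # impenetrable structure: reject the false track
            break
        path.append((x, y))
    return path
```

Because tracks that would pass through buildings die immediately, the coasting window for legitimately occluded vehicles can be lengthened without flooding the tracker with false continuations.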
A prototype image processing system has recently been developed which generates, displays and analyzes three-dimensional ladar data in real time. It is based upon a suite of novel algorithms that transform raw ladar data into cleaned 3D images. These algorithms perform noise reduction, ground plane identification, detector response deconvolution and illumination pattern renormalization. The system also discriminates static from dynamic objects in a scene. In order to achieve real-time throughput, we have parallelized these algorithms on a Linux cluster. We demonstrate that multiprocessor software plus Blade hardware yield a compact, real-time imagery generation adjunct to an operating ladar. Finally, we discuss several directions for future work, including automatic recognition of moving people, real-time reconnaissance from mobile platforms, and fusion of ladar plus video imagery. Such enhancements of our prototype imaging system can lead to multiple military and civilian applications of national importance.
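Two of the cleaning steps named above, ground-plane identification and noise rejection, can be sketched as simple height-histogram operations. This single-threaded toy only shows per-tile logic with invented bin sizes; the real pipeline distributes such work across a Linux cluster.

```python
# Illustrative sketch: estimate the ground plane as the most populated
# height bin, then discard returns implausibly far below it.
from collections import Counter

def estimate_ground_z(points, bin_m=0.5):
    """Ground height = the modal height bin among (x, y, z) points."""
    bins = Counter(round(z / bin_m) for _, _, z in points)
    return bins.most_common(1)[0][0] * bin_m

def remove_noise(points, ground_z, below_tol=1.0):
    """Drop returns more than below_tol meters beneath the estimated ground."""
    return [p for p in points if p[2] >= ground_z - below_tol]
```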