Capturing latent 3D representations of parallax-sensitive landmarks
31 May 2022
Abstract
Identifying 3D objects whose 2D appearance differs drastically between views is challenging, because the features that matter for matching are not always view-invariant and may not be visible from every perspective. This work aims to infer the 3D geometry of specific landmarks so that a viewpoint's orientation about the landmark can be predicted from 2D images. For our dataset, we use Google Earth to visit four well-known landmark sites and capture 2D images of each from a range of perspectives. The landmarks are chosen to be sensitive to parallax, which ensures wide variance in our training images. We implement a 5-layer autoencoder network that takes 224 × 224 × 3 images, encodes them into 3136-element vectors, and then reconstructs the input image from the encoding vector. We use the bottleneck encodings to predict the camera's azimuth, elevation, and range relative to the landmark. We then compare the input images against the predicted parameters and the reconstructed images to measure the accuracy of our model. Our experiments show that a simple autoencoder network can learn enough of a landmark's 3D geometry to accurately predict viewpoint orientation from 2D images of that landmark.
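The abstract gives only the input size (224 × 224 × 3) and the bottleneck length (3136), not the layer configuration. As a rough illustration of how those two numbers can fit together, the sketch below walks the shape arithmetic for one hypothetical encoder: five stride-2, 3 × 3 convolutions with padding 1. The layer count on the encoder side, the kernel size, the stride, and the channel widths in `ASSUMED_CHANNELS` are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

# Figures stated in the abstract.
INPUT_SHAPE = (224, 224, 3)   # height, width, channels
CODE_LENGTH = 3136            # length of the bottleneck encoding

# ASSUMPTION: channel widths of a hypothetical five-layer encoder.
# The abstract does not specify these values.
ASSUMED_CHANNELS = [8, 16, 32, 64, 64]


def conv_out(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_shapes(input_shape: Tuple[int, int, int],
                   channels: List[int]) -> List[Tuple[int, int, int]]:
    """Feature-map shape after each assumed stride-2 convolution."""
    height, width, _ = input_shape
    shapes = []
    for out_channels in channels:
        height, width = conv_out(height), conv_out(width)
        shapes.append((height, width, out_channels))
    return shapes


shapes = encoder_shapes(INPUT_SHAPE, ASSUMED_CHANNELS)
bottleneck_h, bottleneck_w, bottleneck_c = shapes[-1]
flat_length = bottleneck_h * bottleneck_w * bottleneck_c

# Ratio of input values to encoding values; this follows directly
# from the abstract's figures and does not depend on the assumptions.
input_values = INPUT_SHAPE[0] * INPUT_SHAPE[1] * INPUT_SHAPE[2]
compression_ratio = input_values / CODE_LENGTH

for index, shape in enumerate(shapes, start=1):
    print(f"layer {index}: {shape}")
print("flattened bottleneck length:", flat_length)
print("compression ratio:", compression_ratio)
```

Under these assumptions the spatial size halves at every layer (224, 112, 56, 28, 14, 7), and a 7 × 7 × 64 feature map flattens to exactly 3136 values. Other factorizations, such as 14 × 14 × 16 or 28 × 28 × 4, give the same length, so this configuration is only one possibility. The compression ratio of 48 depends only on the input size and encoding length given in the abstract.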
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Michael McCarver, Nazanin Rahnavard, and Abhijit Mahalanobis "Capturing latent 3D representations of parallax-sensitive landmarks", Proc. SPIE 12096, Automatic Target Recognition XXXII, 120960F (31 May 2022); https://doi.org/10.1117/12.2622530
KEYWORDS
Cameras
Bridges
3D modeling
Computer programming
Network architectures
3D image processing
Computer vision technology
