Carnegie Mellon University

3D Manipulation of Objects in Photographs

posted on 2015-07-01, 00:00 authored by Natasha Kholgade Banerjee

This thesis describes a system that allows users to perform full three-dimensional manipulations of objects in photographs. Cameras and photo-editing tools have contributed to the explosion in creative content by democratizing the process of creating visual realizations of users’ imaginations. However, shooting photographs with a camera is constrained by real-world physics, while existing photo-editing software is largely restricted to the 2D plane of the image. 3D object edits, though intuitive to humans, are simply not possible in photo-editing software. The fundamental challenge in providing 3D object manipulation is that estimating the 3D structure of the object, including the geometry and appearance of object parts hidden from the viewpoint of the camera, is ill-posed. 3D object manipulations reveal hidden parts of objects that were not previously seen from the viewpoint of the camera.

The key contributions of this thesis are algorithms that leverage 3D models from public repositories to obtain a three-dimensional representation of objects in photographs for 3D manipulation, with a seamless transition in the appearance of the object from the original photograph. 3D models of objects in online repositories cannot be used directly to manipulate photographed objects, as they show mismatches in geometry and appearance and do not contain three-dimensional illumination representing the scene where the photograph was captured. The work in this thesis provides a system that aligns the 3D model geometry, estimates three-dimensional illumination, and completes the appearance over the object in three dimensions to provide full 3D manipulation.

To correct the mismatch between the geometry of the 3D model and the photographed object, the thesis presents an automatic model alignment technique that performs an exhaustive search in the space of viewpoint, object location, scale, and non-rigid deformation. We also provide a manual geometry adjustment tool that allows users to perform final corrections while imposing smoothness and symmetry constraints. Given the matched geometry, we present an illumination estimation approach that uses the visible pixels to obtain three-dimensional environment illumination that produces plausible effects such as cast shadows and smooth surface shading. Our appearance completion approach relates visible parts of the object to hidden parts using symmetries over the publicly available 3D model. Our interactive system for editing photographs re-imagines typical photo-editing operations such as rotation, translation, copy-paste, scaling, and deformation as 3D manipulations of objects. Using our system, users have created a variety of manipulations of photographs, such as flipping cars, making dynamic compositions of multiple objects suspended in the air, performing animations, and altering the stories of historical images and personal photographs.




Degree Type

  • Dissertation


Department

  • Robotics Institute

Degree Name

  • Doctor of Philosophy (PhD)


Advisor(s)

  • Yaser Sheikh
