3D sensing with portable imaging systems is becoming increasingly popular in computer vision applications such as autonomous driving, virtual reality, robotic manipulation and surveillance, due to the decreasing cost and size of RGB cameras. Although small baseline vision systems are compact and portable, it is well known that the uncertainty in multi-view range finding is inversely related to the sensor baseline. On the other hand, besides compactness, small baseline vision systems have unique advantages such as easier correspondence and
large overlapping regions across views. The goal of this thesis is to develop computational methods and small baseline imaging systems for 3D sensing of complex scenes in real-world conditions. Our design principle is to physically model the scene complexities and explicitly infer the uncertainties for the images captured with small baseline setups. With this design principle, we make four contributions. In the first contribution, we propose a two-stage near-light photometric stereo method using a small (6 cm diameter) LED ring. The imaging system is compact compared to traditional photometric stereo systems. In the second contribution, we develop an algorithm to simultaneously estimate the occlusion pattern and depth for thin structures from a focal image stack, which is either captured by varying the focus/aperture of the lens or computed from a one-shot light field image. In the third contribution, we propose a learning-based method to estimate per-pixel depth and its uncertainty continuously from a monocular
video stream, with small camera baselines across adjacent frames. The resulting depth probability volumes are accumulated over time as more incoming frames are processed
sequentially, which reduces depth uncertainty and improves accuracy, robustness, and temporal stability. Finally, using a high-resolution camera paired with a
laser projector, we develop a high-spatial-resolution Diffuse Optical Tomography system that can accurately detect the boundaries and relative depth of heterogeneous structures
up to a depth of 8 mm below a highly scattering medium such as whole milk. We showcase the application of a small baseline vision system for in vivo microscale 3D reconstruction of capillary veins and develop a system for real-time analysis of microvascular blood flow for critical care. We believe that the computational methods
developed in this thesis will find further applications in compact 3D sensing under challenging conditions.
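
The inverse relation between range uncertainty and baseline noted above can be illustrated with a standard first-order error propagation. This is only a sketch under assumptions that are not part of the thesis text: a rectified two-view setup, with depth Z, focal length f (in pixels), baseline b, disparity d, and disparity noise of standard deviation \sigma_d, all of which are symbols introduced here for illustration.

```latex
% Assumed model: rectified two-view stereo (illustrative, not the thesis's own derivation)
\begin{align}
  Z &= \frac{f\,b}{d}
  && \text{(depth from disparity)} \\
  \frac{\partial Z}{\partial d} &= -\frac{f\,b}{d^{2}} = -\frac{Z^{2}}{f\,b}
  && \text{(substituting } d = f\,b/Z\text{)} \\
  \sigma_Z &\approx \left|\frac{\partial Z}{\partial d}\right| \sigma_d
            = \frac{Z^{2}}{f\,b}\,\sigma_d
  && \text{(first-order propagation)}
\end{align}
```

Under these assumptions, for a fixed depth, focal length, and matching noise, halving the baseline b doubles the depth uncertainty \sigma_Z, which is why small baseline systems benefit from explicit uncertainty modeling.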