We study distributed inference, learning, and optimization in scenarios involving networked entities operating over time-varying, random, and inherently ad-hoc networks. In this thesis, we propose distributed recursive algorithms in which the networked entities simultaneously incorporate locally sensed information and information obtained from their neighborhoods. The class of distributed algorithms proposed in this thesis encompasses distributed estimation, distributed composite hypothesis testing, and distributed optimization. The central theme of the scenarios considered is systems constrained by limited on-board batteries and hence by limited sensing, limited computation, and extremely limited communication resources. A typical example of such a resource-constrained scenario is a distributed data-parallel machine learning system in which the individual entities are commodity devices such as cellphones. The inherent ad-hoc nature of these setups, in conjunction with the random environments in which they operate, renders a central coordinator infeasible. Keeping in mind the resource-constrained nature of such setups, we propose distributed inference and optimization algorithms that characterize the interplay between communication, computation, and optimality, while allowing for heterogeneity among clients in terms of objectives, data collection, and statistical dependencies. With massive data, models for learning and optimization have grown so complex as to be almost analytically intractable. In such models, obtaining gradients of the associated loss function is very expensive, and potentially intractable owing to the lack of a closed form for the loss function. A major thrust of this thesis is therefore gradient-free, zeroth-order optimization, which accommodates distributed setups that exhibit data parallelism as well as potentially analytically intractable loss functions.
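To make the zeroth-order idea concrete: when gradients of the loss are unavailable, a standard device is to estimate them from function evaluations alone. The sketch below (an illustrative textbook construction, not the specific algorithms developed in this thesis) uses the classical two-point random-direction estimator and plugs it into gradient descent on a simple quadratic; the loss `f`, step size, and iteration count are all assumptions made for illustration.

```python
import numpy as np

def zeroth_order_gradient(f, x, delta=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Uses only function evaluations: sample a random unit direction u
    and return d * (f(x + delta*u) - f(x - delta*u)) / (2*delta) * u,
    an estimate of the gradient of a smoothed version of f.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # random direction on the unit sphere
    return d * (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u

# Illustrative use: descend on f(x) = ||x||^2 without ever calling a gradient.
f = lambda x: float(np.dot(x, x))
x = np.ones(5)
rng = np.random.default_rng(0)
for _ in range(1000):
    x -= 0.01 * zeroth_order_gradient(f, x, rng=rng)
# f(x) decreases toward the minimum at the origin
```

The estimator trades two function evaluations per step for a gradient surrogate whose variance grows with the dimension d, which is precisely the communication/computation/optimality trade-off regime the thesis studies.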
Beyond gradient-free optimization, in this thesis we also study projection-free zeroth-order methods for constrained optimization, which avoid potentially expensive projection steps onto the constraint set. The techniques developed in this thesis are generic and of independent interest in classical fields such as stochastic approximation, statistical decision theory, and optimization.
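For intuition on the projection-free idea, the prototypical method is the Frank-Wolfe (conditional gradient) scheme, which replaces projection with a linear minimization oracle over the constraint set. The sketch below is a minimal first-order illustration on the probability simplex (the objective, step-size rule, and iteration budget are assumptions for illustration, and it is not the zeroth-order variant developed in the thesis); over the simplex the oracle reduces to picking a vertex.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, steps=200):
    """Projection-free (Frank-Wolfe) minimization over the probability simplex.

    Each iteration calls a linear minimization oracle instead of projecting:
    over the simplex, argmin_{s} <g, s> is just the vertex e_i whose
    gradient coordinate g_i is smallest.
    """
    x = x0.copy()
    for t in range(steps):
        g = grad(x)
        i = int(np.argmin(g))          # linear minimization oracle
        s = np.zeros_like(x)
        s[i] = 1.0                     # simplex vertex e_i
        gamma = 2.0 / (t + 2)          # standard diminishing step size
        x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Illustrative use: minimize ||x - c||^2 over the simplex; c lies inside it.
c = np.array([0.2, 0.3, 0.5])
x = frank_wolfe_simplex(lambda x: 2 * (x - c), np.array([1.0, 0.0, 0.0]))
```

Because every iterate is a convex combination of feasible points, the method never needs a projection, which is attractive exactly in the resource-constrained regimes described above.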