# Heuristic Search Based Planning by Minimizing Anticipated Search Efforts

We focus on relatively low dimensional robot motion planning problems, such as planning for navigation of a self-driving vehicle, unmanned aerial vehicles (UAVs), and footstep planning for humanoids. In these problems, there is a need for fast planning, potentially compromising the solution quality. Often, we want to plan fast but are also interested in controlling the solution quality because we desire efficient planning. Bounded-suboptimal heuristic search algorithms are a popular alternative to optimal heuristic search algorithms that compromise solution quality for computation speed. Specifically, these searches aim to find a solution that can be computed faster than the optimal solution while guaranteeing that its cost is bounded within a specified factor of the optimal cost (bounded-suboptimality). Currently, most bounded-suboptimal heuristic search algorithms speed up planning by guiding the search using cost-to-goal estimates. Cost-to-goal estimates are not necessarily correlated with the search effort required to reach the goal. The key idea of this thesis is to *explicitly reason about anticipated search efforts required to reach the goal instead of cost-to-goal estimates, and achieve higher speedup by guiding the search along solutions that minimize these anticipated search efforts* while ensuring bounded-suboptimality of the computed solutions. Also, bounded-suboptimal heuristic search algorithms have been largely investigated in the context of deterministic planning. In this thesis, we use the key idea of this thesis to speed up planning while ensuring bounded suboptimality in the context of deterministic as well as probabilistic planning.

To this end, our first contribution is to speed up deterministic robot planning problems formulated as bounded-suboptimal heuristic search. Weighted A* (wA*) search is a popular bounded-suboptimal heuristic search algorithm. We investigate the problem of computing heuristics that explicitly aim to reduce the search efforts of wA*. For heuristic computation, it is common to solve a simpler planning problem in a relaxed space formed by relaxing some constraints in the original search space. We introduce conservative heuristics—novel heuristics that anticipate search efforts required to reach the goal. We first introduce the notion of a conservative path in the relaxed space, whose existence guarantees the existence of a feasible path in the original space. Conservative heuristics are computed such that if a conservative path exists in the relaxed space, then the search can follow the heuristic gradient to find a feasible path in the original space while expanding only those states that appear on this path. We propose an algorithm to compute conservative heuristics. We evaluate conservative heuristics theoretically as well as empirically using simulated experiments in humanoid footstep planning, planning for a UAV, and a real-world experiment in footstep planning for a NAO robot.

Our second contribution is to speed up a particular class of planning under uncertainty problems. Many real-world planning problems involve robots having to plan in partially known environments. This frequently requires planning under uncertainty over missing information about the environment. Unfortunately, the computational expense of such planning often precludes its scalability to real-world problems. The Probabilistic Planning with Clear Preferences (PPCP) framework computes optimal policies in a scalable way for a subset of such planning problems wherein there exist clear preferences over the actual values of missing information. It runs a series of fast, deterministic, A*-like searches to construct and refine a policy, guaranteeing optimality under certain conditions. Aligning with the key idea of this thesis, we anticipate that while running a series of A*-like searches to compute a policy, search efforts are correlated with the amount of missing information each path relies upon to reach the goal. We observe that by exploiting this correlation and minimizing the amount of missing information each path relies upon, marginally suboptimal policies can be computed significantly faster. To that end, we introduce Fast-PPCP, a novel planning algorithm that computes a provably bounded-suboptimal policy using significantly lesser number of searches than that required to find an optimal policy, for the same subset of problems that can be solved by PPCP. We evaluate Fast-PPCP theoretically and experimentally, showing its benefit over popular baselines. Moreover, to evaluate Fast-PPCP in discounted-reward problems such as RockSample, we also formulate a transformation that converts discounted-reward problems into discounted-cost problems to which Fast-PPCP can be applied.

In the final part of the thesis, we are motivated by the application of Fast-PPCP to real-world robot navigation problems in partially-known environments. In such problems, Fast-PPCP can potentially waste search efforts in exploring multiple partial policies that use an unknown region to reach the goal before discovering that those partial policies are invalid. To reduce this wastage, we introduce an optimized version of Fast-PPCP for planning in environments with large unknown regions. We demonstrate its utility in off-road robot navigation in simulation and on a physical robot.

## History

## Date

2022-09-30## Degree Type

- Dissertation

## Department

- Robotics Institute

## Degree Name

- Doctor of Philosophy (PhD)