<p dir="ltr">In this thesis, we outline how agents can leverage their pretrained knowledge to operate effectively within their specific environments, focusing on perception, cognition, and metacognition. Chapter 1 introduces the topic and establishes the concept of situated agent operation.</p><p dir="ltr">Chapters 2 and 3 explore the perceptual capabilities of agents. In Chapter 2, we examine how an agent can use common sense to interpret incomplete or ambiguous sensory data, enabling intelligent navigation and exploration. Chapter 3 examines how an agent can apply physical common sense to adapt its perceptual strategies when introduced to a new environmental context.</p><p dir="ltr">Chapters 4 and 5 assess the cognitive abilities of agents in understanding and executing situated language instructions. Chapter 4 explores embodied dialogue, focusing on how agents trained with different mechanisms process and respond to instructions given in dynamic dialogue settings. Chapter 5 investigates the challenges agents face in following situated instructions, particularly when human intent is ambiguous or incomplete.</p><p dir="ltr">Chapter 6 addresses metacognition by developing a framework for training agents to recognize their limitations and request assistance judiciously. We formulate metacognitive help-requesting as a reinforcement learning problem that jointly optimizes the reward function and the help-requesting policy itself.</p>