Long Range Navigator
Extending robot planning horizons beyond metric maps

1University of Washington, 2Google DeepMind, 3Overland AI, 4Mila
Off-road navigation requires reaching waypoints without accurate prior maps. Typically, the robot stack has a local map created through sensor fusion, which can be myopic. LRN enables the robot to look beyond local maps through affordances in image space.

Abstract

A robot navigating an outdoor environment with no prior knowledge of the space must rely on its local sensing to perceive its surroundings and plan. This sensing can come in the form of a local metric map or a local policy with some fixed horizon. Beyond that lies a fog of unknown space, marked with some fixed cost.

A limited planning horizon can often result in myopic decisions that lead the robot off course or, worse, into very difficult terrain. Ideally, the robot would have knowledge of the full environment, which can be orders of magnitude larger than its local cost map. In practice, this is intractable due to sparse sensing information and is often computationally expensive.

In this work, we make the key observation that long-range navigation requires only identifying good frontier directions for planning, rather than full map knowledge. To this end, we propose the Long Range Navigator (LRN), which learns an intermediate affordance representation mapping high-dimensional camera images to affordable frontiers for planning, and then optimizes for maximum alignment with the desired goal. Notably, LRN is trained entirely on unlabeled egocentric videos, making it easy to scale and adapt to new platforms. Through extensive off-road experiments on Spot and a Big Vehicle, we find that augmenting existing navigation stacks with LRN reduces human interventions at test time and leads to faster decision making, indicating the relevance of LRN.

Video

LRN Big Vehicle

Spot Robot Dump Site

Spot Robot Helipad

Spot Robot Mini River

LRN System Architecture


LRN takes egocentric camera images and a goal heading vector as input. It is composed of two components: 1) the Affordance Backbone, which computes affordable frontiers in image space as goal-agnostic heatmaps; these affordance hotspots are then projected into a discrete set of affordable headings for the robot to follow. 2) the Goal-Conditioned Head, which multiplies the affordance scores by a discrete Gaussian score centered on the goal heading and a separate Gaussian centered on the previous prediction (to maintain consistency). The heading with the maximum combined score (red) is selected. The local system can then use that frontier as a goal for local planning instead of the true goal. This process repeats as new sensor information comes in.
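For illustration, here is a minimal NumPy sketch of the goal-conditioned scoring step, assuming the heatmap has already been projected onto a discrete set of candidate headings. The function name, Gaussian widths, and heading grid below are hypothetical choices for the sketch, not the deployed implementation.

```python
import numpy as np

def select_heading(affordance_scores, headings, goal_heading, prev_heading,
                   sigma_goal=0.5, sigma_prev=1.0):
    """Pick the frontier heading with the best combined score.

    affordance_scores: per-heading scores projected from the affordance heatmap.
    headings: candidate headings in radians (robot frame).
    sigma_goal, sigma_prev: hand-tuned Gaussian widths (illustrative values).
    """
    def angle_diff(a, b):
        # Smallest signed angular difference, wrapped to [-pi, pi].
        return (a - b + np.pi) % (2 * np.pi) - np.pi

    # Gaussian score for alignment with the desired goal heading.
    goal_score = np.exp(-0.5 * (angle_diff(headings, goal_heading) / sigma_goal) ** 2)
    # Gaussian score around the previous prediction, for temporal consistency.
    consistency = np.exp(-0.5 * (angle_diff(headings, prev_heading) / sigma_prev) ** 2)

    combined = affordance_scores * goal_score * consistency
    return headings[np.argmax(combined)]

# Example: 32 candidate headings spanning the camera's field of view.
headings = np.linspace(-np.pi / 2, np.pi / 2, 32)
affordance = np.random.rand(32)  # placeholder for projected heatmap scores
best = select_heading(affordance, headings, goal_heading=0.8, prev_heading=0.6)
```

The selected heading is handed to the local planner as an intermediate goal, so the local system never needs the true (possibly distant) goal inside its map.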

LRN Qualitative Results


Our Long Range Navigator (LRN) enables robots to make informed decisions about long-range navigation by learning to identify affordable frontier directions from visual input. The image above demonstrates how LRN processes the environment to identify viable frontier directions. This intermediate affordance representation allows the robot to plan effectively over much longer horizons than traditional local metric maps.


LRN Mission Deployment


During missions, LRN's intermediate affordance representation allows the robot to look beyond the local map and steer clear of risky regions, which leads to fewer human interventions.

Scalable Data Collection from Videos

LRN Path Planning Results

LRN is powered by a scalable data collection method that can leverage in-the-wild videos or videos recorded by people. This allows us to train LRN without access to a robot. The tool is built on top of the incredible Co-Tracker. We track a point on the ground backward in time, which gives us distant frontiers in image space without the need for localization, state estimation, or other sensors like an IMU or LiDAR. We were able to walk around for about an hour with an Insta360 camera and use that footage to train the LRN deployed on the Spot robot.
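As a rough sketch of this labeling idea (not the released tool), the snippet below assumes a point-tracker callable in the style of Co-Tracker that can report a pixel's position in earlier frames; the names label_frontiers, track_point_backward, horizon, and ground_xy are hypothetical.

```python
def label_frontiers(video_frames, track_point_backward, horizon=300,
                    ground_xy=(640, 700)):
    """Generate (frame index, frontier pixel) training pairs from egocentric video.

    track_point_backward(frames, query_frame, query_xy) stands in for a point
    tracker such as Co-Tracker: given a pixel in query_frame, it returns that
    point's (x, y) location in every earlier frame. The horizon and ground-point
    pixel are illustrative values, not the authors' settings.
    """
    labels = []
    num_frames = len(video_frames)
    for t in range(0, num_frames - horizon, horizon):
        future = t + horizon
        # A ground point just ahead of the camera at frame `future` is a spot
        # the person actually walked to, so it is traversable.
        tracks = track_point_backward(video_frames, query_frame=future,
                                      query_xy=ground_xy)
        # Seen from the earlier frame t, that same point appears far away in
        # the image: a distant, affordable frontier label for frame t.
        labels.append((t, tracks[t]))
    return labels
```

Because the label comes purely from tracking pixels backward in time, no robot, localization, or depth sensing is required to produce it.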

Looking Ahead

We hope this work inspires more research on long-range decision making using visual inputs and on leveraging terminal value functions to extend the planning horizons of robots. Our work is not without limitations.

Currently, LRN has no notion of depth, so a goal in front of a cluster of trees might look the same as a goal behind the cluster. Handling this could require a mixture of heuristics that switches between LRN and a simple Euclidean heuristic. Moreover, this switch could be learned from experience or decided by a VLM. We currently use a hand-tuned parameter to combine the goal-direction and heatmap scores when deciding which direction to head towards; future work could learn this weighting from experience.

The robot does not have any memory. Even though our method helps the robot avoid local minima, it can still get stuck if the obstructions are simply not visible in the image. Adding memory, so the robot can backtrack and avoid repeatedly heading in the same direction, could be an important aspect from a deployment perspective.

Feel free to reach out if you have any questions, would like to discuss this work further, or are interested in deployable systems.

BibTeX

@article{schmittle2025lrn,
  author    = {Schmittle, Matthew and Baijal, Rohan and Hatch, Nathan and Scalise, Rosario and Guaman Castro, Mateo and Talia, Sidharth and Khetarpal, Khimya and Srinivasa, Siddhartha and Boots, Byron},
  title     = {Long Range Navigator (LRN): Extending robot planning horizons beyond metric maps},
  journal   = {arXiv},
  year      = {2025},
}