Fine-Tuning Hard-to-Simulate Objectives for Quadruped Locomotion: A Case Study on Total Power Saving

Legged locomotion is not just about mobility; it also encompasses crucial objectives such as energy efficiency, safety, and user experience, which are vital for real-world applications. However, key factors such as battery power consumption and stepping noise are often inaccurately modeled or missing in common simulators, leaving these aspects poorly optimized or unaddressed by current sim-to-real methods. Hand-designed proxies, such as mechanical power and foot contact forces, have been used to address these challenges but are often problem-specific and inaccurate.

In this paper, we propose a data-driven framework for fine-tuning locomotion policies that targets these hard-to-simulate objectives. Our framework uses real-world data to model these objectives and incorporates the learned model into simulation for policy improvement. We demonstrate the effectiveness of our framework on power saving for quadruped locomotion, achieving a 24-28% net reduction in total power consumption from the battery pack at various speeds. In essence, our approach offers a versatile solution for optimizing hard-to-simulate objectives in quadruped locomotion, providing an easy-to-adapt paradigm for continual improvement with real-world knowledge.

Motivation: Current popular low-cost quadruped robots suffer from limited battery life per charge^{1, 2, 3, 4}. Yet the complex dynamics of Permanent Magnet Synchronous Motors and advanced control strategies such as Field-Oriented Control make total power consumption hard to simulate.

Baseline: Traditionally, power saving relies on hand-designed proxies that represent analytical power consumption, including mechanical power and Joule heating.
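
As a rough illustration of what such an analytical proxy might look like, the sketch below sums mechanical power (joint torque times joint velocity) and a Joule-heating term proportional to squared torque. The function name, the clipping of negative mechanical power, and the `heat_coeff` value are assumptions made for illustration; they are not taken from this work.

```python
from typing import Sequence


def analytical_power_proxy(
    torques: Sequence[float],
    velocities: Sequence[float],
    heat_coeff: float = 0.5,  # hypothetical constant, in W / (N*m)^2
) -> float:
    """Hand-designed proxy: mechanical power plus Joule heating, in watts."""
    if len(torques) != len(velocities):
        raise ValueError("torques and velocities must have the same length")
    # Mechanical power per joint. Negative work is clipped to zero, which
    # assumes no energy is regenerated into the battery (an assumption).
    mechanical = sum(max(t * w, 0.0) for t, w in zip(torques, velocities))
    # Joule heating modeled as proportional to squared torque, since motor
    # current is roughly proportional to torque.
    heating = sum(heat_coeff * t * t for t in torques)
    return mechanical + heating
```

A proxy of this kind depends only on quantities the simulator already provides, which is what makes it convenient and also why it can miss losses that appear only at the battery.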

Ours: We present a data-driven proxy that leverages an LSTM-based measurement model trained on real-world data to accurately predict the robot's power consumption.

Metric: Power reduction after fine-tuning compared to the pre-trained policy.

Gross power: The power directly drawn from the battery pack.
Net power: The gross power minus the power measured when all motors are idle.
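
The two definitions above can be written out as a small sketch. The wattages in the usage note are made-up values chosen only to show how the same absolute saving yields a larger percentage on net power than on gross power; they are not measurements from the robot.

```python
def net_power(gross_w: float, idle_w: float) -> float:
    """Net power: gross battery power minus the all-motors-idle power."""
    return gross_w - idle_w


def reduction_pct(pretrained_w: float, fine_tuned_w: float) -> float:
    """Percentage reduction of the fine-tuned policy relative to the pre-trained one."""
    if pretrained_w <= 0:
        raise ValueError("pre-trained power must be positive")
    return 100.0 * (pretrained_w - fine_tuned_w) / pretrained_w
```

For example, with a hypothetical idle power of 30 W, a pre-trained gross power of 130 W, and a fine-tuned gross power of 105 W, `reduction_pct(130.0, 105.0)` gives the gross reduction, while `reduction_pct(net_power(130.0, 30.0), net_power(105.0, 30.0))` gives the net reduction. The net figure is larger because the constant idle power is removed from the denominator.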

Our method reduces net power by 24.2-28.4% (gross power by 19.4-20.3%) across the tested speeds, while fine-tuning with analytical proxies yields only a 5.0-11.8% net (3.9-8.3% gross) reduction.

Each cell gives the net power reduction, with the gross power reduction in parentheses.

Speed	Analytical proxy	Data-driven proxy (ours)
v = 0.5 m/s	11.8% (8.3%)	28.4% (19.6%)
v = 0.8 m/s	6.2% (4.5%)	27.0% (20.3%)
v = 1.1 m/s	5.0% (3.9%)	24.2% (19.4%)

Our approach reliably reduces power consumption, whereas the baseline exhibits greater variability and can sometimes lead to higher power usage (with normalized power > 1).

Each point represents the net power consumption of a 1-second segment within the test run. We normalize the power with respect to the pre-trained policy: \(\frac{P^\text{fine-tuned}}{\operatorname{Mean}(P^\text{pre-trained})}\).
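
The normalization above could be computed along the following lines. This is a minimal sketch: the helper names and the fixed number of samples per 1-second segment are assumptions, since the sampling rate of the power measurements is not stated here.

```python
from statistics import mean
from typing import List, Sequence


def segment_means(samples: Sequence[float], samples_per_segment: int) -> List[float]:
    """Average power over consecutive fixed-length segments; a trailing partial segment is dropped."""
    if samples_per_segment <= 0:
        raise ValueError("samples_per_segment must be positive")
    n_segments = len(samples) // samples_per_segment
    return [
        mean(samples[i * samples_per_segment:(i + 1) * samples_per_segment])
        for i in range(n_segments)
    ]


def normalized_power(
    fine_tuned_segments: Sequence[float],
    pretrained_segments: Sequence[float],
) -> List[float]:
    """Divide each fine-tuned segment by the mean pre-trained segment power."""
    baseline = mean(pretrained_segments)
    return [p / baseline for p in fine_tuned_segments]
```

A value below 1 means that segment used less net power than the pre-trained policy did on average, and a value above 1 means it used more.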

In-the-Wild Evaluation

To better mirror practical scenarios, we conducted long-distance evaluations both indoors and outdoors. The plot below illustrates the battery's state of charge (SoC) relative to the distance traveled, underscoring how our approach effectively extends the robot's operational range.


Our method fine-tunes hard-to-simulate objectives by iterating between real-world data collection and policy updates in simulation.
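
The alternation between real-world data collection and simulation updates might be organized as in the skeleton below. It is only a structural sketch: the three callables, the choice to accumulate data across iterations, and the iteration count are all hypothetical placeholders rather than the procedure used in this work.

```python
from typing import Callable, List, TypeVar

Policy = TypeVar("Policy")
Sample = TypeVar("Sample")
Model = TypeVar("Model")


def fine_tune_loop(
    policy: Policy,
    collect_real_data: Callable[[Policy], List[Sample]],  # run the policy on the robot
    fit_model: Callable[[List[Sample]], Model],           # e.g. train a power predictor
    improve_in_sim: Callable[[Policy, Model], Policy],    # fine-tune using the learned model
    num_iterations: int = 3,                              # hypothetical default
) -> Policy:
    """Alternate real-world data collection with policy updates in simulation."""
    dataset: List[Sample] = []
    for _ in range(num_iterations):
        # 1. Deploy the current policy and record real measurements.
        dataset.extend(collect_real_data(policy))
        # 2. Refit the measurement model on all data gathered so far
        #    (accumulating data across iterations is an assumption).
        model = fit_model(dataset)
        # 3. Update the policy in simulation against the learned model.
        policy = improve_in_sim(policy, model)
    return policy
```

Passing the three stages in as callables keeps the skeleton independent of any particular robot interface, model class, or simulator.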
