Fine-Tuning Hard-to-Simulate Objectives for Quadruped Locomotion:
A Case Study on Total Power Saving 🔋

Ruiqian Nai1,2,3, Jiacheng You1,2,3, Liu Cao1, Hanchen Cui3, Shiyuan Zhang1, Huazhe Xu1,2,3, Yang Gao1,2,3
1Tsinghua University, 2Shanghai AI Lab, 3Shanghai Qi Zhi Institute

Our method fine-tunes locomotion policies for hard-to-simulate objectives by iterating between real-world data collection and policy updates in simulation.

Abstract

Legged locomotion is not just about mobility; it also encompasses crucial objectives such as energy efficiency, safety, and user experience, which are vital for real-world applications. However, key factors such as battery power consumption and stepping noise are often inaccurately modeled or missing in common simulators, leaving these aspects poorly optimized or unaddressed by current sim-to-real methods. Hand-designed proxies, such as mechanical power and foot contact forces, have been used to address these challenges but are often problem-specific and inaccurate.

In this paper, we propose a data-driven framework for fine-tuning locomotion policies, targeting these hard-to-simulate objectives. Our framework leverages real-world data to model these objectives and incorporates the learned model into simulation for policy improvement. We demonstrate the effectiveness of our framework on power saving for quadruped locomotion, achieving a 24-28% net reduction in total power consumption from the battery pack at various speeds. In essence, our approach offers a versatile solution for optimizing hard-to-simulate objectives in quadruped locomotion, providing an easy-to-adapt paradigm for continual improvement with real-world knowledge.

Video

Case Study: Total Power Saving

Goal: Minimize power consumption from the battery pack while maintaining the desired speed.

Motivation: Popular low-cost quadruped robots currently suffer from limited battery life per charge [1, 2, 3, 4]. However, the complex dynamics of Permanent Magnet Synchronous Motors and advanced control strategies like Field-Oriented Control make total power consumption hard to simulate.

Baseline: Traditionally, power saving relies on hand-designed proxies that represent analytical power consumption, including mechanical power and Joule heating.
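To make the baseline concrete, here is a minimal sketch of such an analytical proxy. It assumes a common textbook form: mechanical power per joint is torque times joint velocity, and Joule heating is proportional to squared torque. The function name, the coefficient `heat_coeff` and its default value, and the choice to clip negative mechanical power at zero are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def analytical_power_proxy(
    torques: Sequence[float],
    joint_velocities: Sequence[float],
    heat_coeff: float = 0.3,  # hypothetical coefficient, W / (N*m)^2
) -> float:
    """Return an analytical estimate of total motor power in watts."""
    if len(torques) != len(joint_velocities):
        raise ValueError("torques and joint_velocities must have equal length")
    # Mechanical power per joint (torque * angular velocity). Negative values
    # (braking) are clipped to zero, which assumes no energy is regenerated.
    mechanical = sum(max(t * w, 0.0) for t, w in zip(torques, joint_velocities))
    # Joule heating, assumed proportional to squared torque.
    heat = heat_coeff * sum(t * t for t in torques)
    return mechanical + heat
```

A proxy of this kind only sees joint torques and velocities, so it cannot capture effects inside the motor driver or the rest of the power electronics.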

Ours: We present a data-driven proxy that leverages an LSTM-based measurement model trained on real-world data to accurately predict the robot's power consumption.

Metric: Power reduction after fine-tuning compared to the pre-trained policy.

  • Gross power: The power directly drawn from the battery pack.
  • Net power: The gross power minus the power measured when all motors are idle.
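The two definitions and the reduction metric can be sketched as follows. The function names and the wattages in the usage note are made up for illustration and are not measurements from the paper.

```python
def net_power(gross_power_w: float, idle_power_w: float) -> float:
    """Net power: gross battery power minus the power drawn with all motors idle."""
    return gross_power_w - idle_power_w


def power_reduction(pretrained_w: float, finetuned_w: float) -> float:
    """Fractional power reduction of the fine-tuned policy relative to the pre-trained one."""
    if pretrained_w <= 0:
        raise ValueError("pretrained_w must be positive")
    return (pretrained_w - finetuned_w) / pretrained_w


# Hypothetical readings in watts.
idle_w = 40.0
pretrained_gross_w = 100.0
finetuned_gross_w = 85.0

gross_reduction = power_reduction(pretrained_gross_w, finetuned_gross_w)
net_reduction = power_reduction(
    net_power(pretrained_gross_w, idle_w),
    net_power(finetuned_gross_w, idle_w),
)
```

With these made-up values the gross reduction is 15% and the net reduction is 25%. The same absolute saving is divided by a smaller baseline once idle power is subtracted, which is why a net reduction is always at least as large as the corresponding gross reduction.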

Final Power Reduction

Our method reduces net power by 24.2-28.4% (gross power by 19.4-20.3%) across the tested speeds, while fine-tuning with analytical proxies yields only a 5.0-11.8% net reduction.

Speed          Analytical proxy    Data-driven proxy (ours)
v = 0.5 m/s    11.8% (8.3%)        28.4% (19.6%)
v = 0.8 m/s    6.2% (4.5%)         27.0% (20.3%)
v = 1.1 m/s    5.0% (3.9%)         24.2% (19.4%)

Each entry is the net power reduction; the number in parentheses is the gross power reduction.

Power Distribution

Our approach reliably reduces power consumption, whereas the baseline exhibits greater variability and can sometimes lead to higher power usage (with normalized power > 1).

Figure: Power distribution.
Each point represents the net power consumption of a 1-second segment within the test run. We normalize the power with respect to the pre-trained policy: \(\frac{P^\text{fine-tuned}}{\operatorname{Mean}(P^\text{pre-trained})}\).
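The normalization can be sketched as below, assuming net power is logged at a fixed sample rate and an incomplete trailing segment is dropped. The function names and the logging assumptions are illustrative and not taken from the paper.

```python
from statistics import fmean
from typing import List, Sequence


def segment_means(
    samples: Sequence[float], sample_rate_hz: float, segment_s: float = 1.0
) -> List[float]:
    """Average power over consecutive fixed-length segments of a run."""
    per_segment = int(round(sample_rate_hz * segment_s))
    if per_segment <= 0:
        raise ValueError("each segment must contain at least one sample")
    n_full = len(samples) // per_segment  # drop any incomplete trailing segment
    return [
        fmean(samples[i * per_segment:(i + 1) * per_segment])
        for i in range(n_full)
    ]


def normalized_power(
    finetuned_segments: Sequence[float], pretrained_segments: Sequence[float]
) -> List[float]:
    """Divide each fine-tuned segment by the mean pre-trained segment power."""
    baseline = fmean(pretrained_segments)
    return [p / baseline for p in finetuned_segments]
```

A normalized value above 1 means that segment drew more net power than the pre-trained policy did on average.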

In-the-Wild Evaluation

To better mirror practical scenarios, we conducted long-distance evaluations both indoors and outdoors. The plots below show the battery's state of charge (SoC) against the distance traveled, illustrating how our approach extends the robot's operational range.

Indoor Evaluation

Outdoor Evaluation