Apache-2.0 · Bots

RLinf

Production-grade reinforcement learning infrastructure for embodied and agentic AI.

3.2K stars · 0.4K forks · Apache-2.0 license · 2026-06-04 verified
$ pip install rlinf
Open source
Overview

What is RLinf?

RLinf is a flexible and scalable open-source RL infrastructure designed for embodied and agentic AI. It supports real-world robot RL on Franka, XSquare Turtle2, and DOS-W1 arms, multiple simulation backends (ManiSkill, LIBERO, MetaWorld, IsaacLab, RoboCasa), and fine-tuning of state-of-the-art vision-language-action (VLA) models (Pi0, Pi0.5, GR00T, OpenVLA). It also extends to agentic AI with support for Search-R1, rStar2, and multi-agent RL.

Unified RL across simulation and real hardware

RLinf supports 10+ simulation backends (ManiSkill, LIBERO, MetaWorld, IsaacLab, RoboCasa, Calvin, etc.) and real-world robots (Franka, XSquare Turtle2, DOS-W1) with the same API.

You can prototype in simulation and deploy on real hardware without rewriting your RL pipeline.
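
The idea of one pipeline over interchangeable backends can be illustrated with a small, self-contained sketch. Everything below (`Env`, `ToySimEnv`, `ToyRealEnv`, `rollout`, `proportional_policy`) is hypothetical and written only for illustration; it is not RLinf's API. It shows a rollout function that depends only on a shared `reset`/`step` interface, so swapping the backend does not change the training-side code.

```python
from typing import Callable, List, Protocol, Tuple

Obs = List[float]
Action = List[float]


class Env(Protocol):
    """Hypothetical shared interface; not taken from RLinf."""

    def reset(self) -> Obs: ...

    def step(self, action: Action) -> Tuple[Obs, float, bool]: ...


class ToySimEnv:
    """Stand-in for a simulator: a 1-D point that should reach the origin."""

    def __init__(self, start: float = 1.0, horizon: int = 20) -> None:
        self.start = start
        self.horizon = horizon
        self.pos = start
        self.t = 0

    def reset(self) -> Obs:
        self.pos = self.start
        self.t = 0
        return [self.pos]

    def step(self, action: Action) -> Tuple[Obs, float, bool]:
        self.pos += action[0]
        self.t += 1
        reward = -abs(self.pos)  # closer to the origin is better
        done = abs(self.pos) < 1e-3 or self.t >= self.horizon
        return [self.pos], reward, done


class ToyRealEnv(ToySimEnv):
    """Stand-in for hardware: same interface, but each action is clipped."""

    def __init__(self, start: float = 1.0, horizon: int = 20,
                 max_step: float = 0.1) -> None:
        super().__init__(start, horizon)
        self.max_step = max_step

    def step(self, action: Action) -> Tuple[Obs, float, bool]:
        clipped = max(-self.max_step, min(self.max_step, action[0]))
        return super().step([clipped])


def rollout(env: Env, policy: Callable[[Obs], Action]) -> float:
    """Run one episode and return the summed reward; backend-agnostic."""
    obs = env.reset()
    total = 0.0
    done = False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total


def proportional_policy(obs: Obs) -> Action:
    """Move halfway toward the origin on every step."""
    return [-0.5 * obs[0]]


if __name__ == "__main__":
    print("sim return:", rollout(ToySimEnv(), proportional_policy))
    print("real return:", rollout(ToyRealEnv(), proportional_policy))
```

The same `rollout` call runs against both classes; only the object passed in changes.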

State-of-the-art VLA RL fine-tuning

Fine-tune Pi0, Pi0.5, GR00T, OpenVLA, LingBot-VLA and other VLA models using RL algorithms like GRPO, PPO, and DAPO.

VLA models are typically trained with imitation learning only. RLinf enables RL-based post-training that can surpass demonstration quality.
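
To make the RL post-training step more concrete, here is a minimal sketch of the group-relative advantage used in GRPO-style methods: several rollouts are sampled for the same task, and each rollout's reward is normalized against the group's mean and standard deviation. The function name `group_relative_advantages` and the `eps` value are assumptions made for this example, not RLinf identifiers, and implementations differ on details such as population versus sample standard deviation.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float],
                              eps: float = 1e-6) -> List[float]:
    """Normalize each reward against its group's mean and spread.

    Uses the population standard deviation; eps avoids division by zero
    when every rollout in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four rollouts of the same task: two successes (1.0), two failures (0.0).
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Successful rollouts receive positive advantages and failed ones negative, so no separate learned value function is needed for this step.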

Real-world online RL with HG-DAgger

Human-Gated DAgger (HG-DAgger) makes online RL on real robots safer: a human supervisor decides when the policy's actions are executed and when human corrections take over.

Online RL on real hardware is dangerous without safety mechanisms. HG-DAgger provides a practical bridge between human demonstrations and autonomous RL.
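
The gating idea can be sketched in a few lines. This is a simplified toy, not RLinf's implementation: the names `gated_rollout`, `drifting_policy`, `expert`, and `gate` are hypothetical, the "robot" is a 1-D point, and the human is replaced by a scripted rule that steps in whenever the proposed action would leave a safe range. Corrections are recorded so they could later be used to update the policy; that update step is omitted here.

```python
from typing import Callable, List, Tuple

State = float
Action = float

LIMIT = 1.0  # assumed safe range for the toy example: |state| <= LIMIT


def gated_rollout(
    step: Callable[[State, Action], State],
    policy: Callable[[State], Action],
    expert: Callable[[State], Action],
    gate: Callable[[State, Action], bool],
    state: State,
    horizon: int,
) -> Tuple[List[State], List[Tuple[State, Action]]]:
    """Run the policy, letting the expert override whenever the gate fires."""
    trajectory = [state]
    corrections: List[Tuple[State, Action]] = []
    for _ in range(horizon):
        proposed = policy(state)
        if gate(state, proposed):
            # Supervisor takes over; keep the (state, correction) pair.
            action = expert(state)
            corrections.append((state, action))
        else:
            action = proposed
        state = step(state, action)
        trajectory.append(state)
    return trajectory, corrections


def step(state: State, action: Action) -> State:
    return state + action


def drifting_policy(state: State) -> Action:
    """A flawed policy that always pushes in the same direction."""
    return 0.3


def expert(state: State) -> Action:
    """Scripted stand-in for the human: move halfway back to the origin."""
    return -0.5 * state


def gate(state: State, proposed: Action) -> bool:
    """Intervene when the proposed action would leave the safe range."""
    return abs(state + proposed) > LIMIT


if __name__ == "__main__":
    trajectory, corrections = gated_rollout(
        step, drifting_policy, expert, gate, state=0.0, horizon=10
    )
    print("states visited:", trajectory)
    print("corrections recorded:", len(corrections))
```

In a real setup the gate is a person watching the robot rather than a threshold, and an intervention typically lasts for a stretch of steps rather than a single one.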

Agentic AI support

Extends beyond robotics to support RL for language agents — Search-R1, rStar2, coding agents, and multi-agent systems.

RLinf is one of the few frameworks that bridges embodied RL and agentic RL in a single codebase.
Install

One command to start

$ pip install rlinf
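
After installing, one way to confirm that the package is visible to the current Python environment is to query its installed version with the standard library. This sketch uses only `importlib.metadata` and assumes the distribution name matches the `pip install` name above; the helper `installed_version` is defined here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("rlinf")
    print(found or "rlinf is not installed in this environment")
```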
Use cases

What teams use it for

RL-based post-training for VLA policies

After collecting demonstration data and training a VLA policy with imitation learning, use RLinf to fine-tune the policy with RL for higher success rates.

Real-world robot learning with safety guarantees

Deploy RLinf on a Franka arm with HG-DAgger for safe online learning — the human intervenes when the policy makes unsafe moves, and the system learns from both successes and corrections.

Multi-agent embodied RL research

Use RLinf's multi-agent support to study coordination between multiple robots performing collaborative tasks in simulation.

Ecosystem

Tags & capabilities

bot · open source · robotics

Comparison

How it stacks up

Choose RLinf for production RL across robots and agents

vs specialized RL libraries

Stable-Baselines3 is simpler for standard RL benchmarks but has no built-in real-robot integration. RLinf provides the full stack from simulation to real hardware to agentic AI.

FAQ

Questions

What RL algorithms does RLinf support?

RLinf supports IQL, GRPO, PPO, DAPO, Reinforce++, SAC, CrossQ, RLPD, SAC-Flow, DSRL, and RECAP/CFG, among others.

What robots are supported for real-world RL?

Franka arm (with RealSense and ZED cameras, Franka Hand, and Robotiq gripper), XSquare Turtle2 dual-arm, and DOS-W1. More robots are being added.

Can I use RLinf without real hardware?

Yes, RLinf supports 10+ simulation backends including ManiSkill, LIBERO, MetaWorld, IsaacLab, RoboCasa, Calvin, and more — all accessible with the same API.