# RLinf

Production-grade reinforcement learning infrastructure for embodied and agentic AI.

## Agent Decision Summary
- Risk level: elevated
- Source confidence: high
- Recommended workflows: Robotics or embodied agent workflow
- Permission surface: messages, hardware
- Agent JSON: https://www.openagent.bot/bots/rlinf.agent.json

## Summary
RLinf is a flexible and scalable open-source RL infrastructure designed for embodied and agentic AI. It supports real-world robot RL on Franka, XSquare Turtle2, and DOS-W1 arms, multiple simulation backends (ManiSkill, LIBERO, MetaWorld, IsaacLab, RoboCasa), and fine-tuning of state-of-the-art vision-language-action (VLA) models (Pi0, Pi0.5, GR00T, OpenVLA). It also extends to agentic AI with support for Search-R1, rStar2, and multi-agent RL.


## Guide
RLinf is a production-grade open-source reinforcement learning infrastructure that unifies embodied AI robotics and agentic AI language models under one RL framework.

### What it is
RLinf is a flexible and scalable RL infrastructure supporting 10+ simulation backends, real-world robot control, VLA model fine-tuning, and agentic AI. It implements major RL algorithms (PPO, GRPO, SAC, DAPO, IQL, CrossQ, RLPD) with a unified API that works identically across simulation and real hardware. Its real-world RL stack includes HG-DAgger for safe online training, and its agentic AI module extends RL to language agents.
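
To make the "same API across simulation and real hardware" idea concrete, here is a minimal sketch in plain Python. Everything in it (`Env`, `SimReachEnv`, `FakeRobotEnv`, `rollout`) is a hypothetical name invented for illustration and is not RLinf's actual interface; the "robot" is only a stand-in that clips actions before applying them.

```python
from typing import List, Protocol, Tuple
import random


class Env(Protocol):
    """Hypothetical minimal environment interface (NOT RLinf's API)."""

    def reset(self) -> List[float]: ...

    def step(self, action: float) -> Tuple[List[float], float, bool]: ...


class SimReachEnv:
    """Toy simulator: drive a 1-D point toward the origin."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self._pos = 0.0

    def reset(self) -> List[float]:
        self._pos = self._rng.uniform(-1.0, 1.0)
        return [self._pos]

    def step(self, action: float) -> Tuple[List[float], float, bool]:
        self._pos += action
        done = abs(self._pos) < 0.05
        # Reward is the negative distance to the goal.
        return [self._pos], -abs(self._pos), done


class FakeRobotEnv(SimReachEnv):
    """Stand-in for a hardware backend: same interface, but each action is
    clipped to a bounded step before it is applied."""

    MAX_STEP = 0.1

    def step(self, action: float) -> Tuple[List[float], float, bool]:
        safe = max(-self.MAX_STEP, min(self.MAX_STEP, action))
        return super().step(safe)


def rollout(env: Env, gain: float = 0.5, max_steps: int = 100) -> int:
    """Run a proportional controller and return the number of steps taken."""
    obs = env.reset()
    for t in range(1, max_steps + 1):
        obs, _reward, done = env.step(-gain * obs[0])
        if done:
            return t
    return max_steps


sim_steps = rollout(SimReachEnv(seed=1))
robot_steps = rollout(FakeRobotEnv(seed=1))
```

The point of the sketch is that `rollout` never checks which backend it received; swapping the simulator for the hardware stand-in changes only the object passed in.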

### Why it matters
Reinforcement learning for embodied AI has been held back by the gap between simulation research and real-world deployment. RLinf narrows this gap by providing the same API across 10+ simulators and multiple real robot platforms. It also connects robotics RL with agentic RL, a convergence that matters more as VLA models and language agents come to share architectures and training techniques.

### How it works
RLinf provides a modular architecture where environments, policies, and algorithms are swappable components. An experiment is configured via YAML or Python dict, specifying the simulator backend (or real robot), the policy model (from MLP to VLA), and the RL algorithm. For real-world RL, the HG-DAgger loop runs a policy on hardware, a human supervisor monitors and intervenes via a GUI, and the system logs both autonomous and human-corrected episodes for training.
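
The configuration-driven, swappable-component design described above can be sketched with a small registry. This is an illustrative pattern only: the registries, the `build` helper, and the config keys (`env`, `algorithm`, `name`) are assumptions made for this example and do not reflect RLinf's real configuration schema.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

# Hypothetical registries mapping a config name to a component factory.
ENV_REGISTRY: Dict[str, Callable[..., Any]] = {}
ALGO_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(registry: Dict[str, Callable[..., Any]], name: str):
    """Decorator that stores a factory under `name` in `registry`."""

    def decorator(factory):
        registry[name] = factory
        return factory

    return decorator


@register(ENV_REGISTRY, "toy_sim")
@dataclass
class ToySimEnv:
    num_envs: int = 1  # illustrative setting, not a real RLinf option


@register(ALGO_REGISTRY, "ppo")
@dataclass
class PPOSettings:
    lr: float = 3e-4
    clip_ratio: float = 0.2


def build(config: Dict[str, Dict[str, Any]]) -> Tuple[Any, Any]:
    """Resolve the `env` and `algorithm` sections of a config dict."""
    built = []
    for section, registry in (("env", ENV_REGISTRY), ("algorithm", ALGO_REGISTRY)):
        kwargs = dict(config[section])  # copy so the caller's dict is untouched
        name = kwargs.pop("name")
        if name not in registry:
            raise KeyError(f"unknown {section} component: {name!r}")
        built.append(registry[name](**kwargs))
    return built[0], built[1]


experiment = {
    "env": {"name": "toy_sim", "num_envs": 8},
    "algorithm": {"name": "ppo", "lr": 1e-4},
}
env, algo = build(experiment)
```

Because components are looked up by name, changing the backend or algorithm is a one-line change in the config rather than a code change, which is the property the paragraph above describes.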


## Use Cases
- Fine-tuning Pi0 with RL for higher manipulation success: Train Pi0 with imitation learning on demonstrations, then run RL post-training with RLinf to improve success rates beyond the demonstrator's performance.
- Real-world Franka arm learning with HG-DAgger: Set up a Franka arm with ZED cameras and Robotiq gripper, run online RL with human-gated intervention for safe exploration on real pick-and-place tasks.
- Agentic RL for search and reasoning: Use RLinf's Search-R1 and rStar2 support to apply RL training to language agent search and reasoning tasks, improving performance beyond supervised fine-tuning.

## Alternatives
- Stable-Baselines3 for simpler benchmark RL: SB3 is more lightweight for standard Gymnasium benchmarks. RLinf is the better choice when you need robot hardware integration, VLA support, or multi-agent coordination.

### Getting Started
- Clone the repository: https://github.com/RLinf/RLinf
- Read the docs: https://rlinf.readthedocs.io/en/latest/

### FAQ
- What RL algorithms does RLinf support?
  - RLinf supports IQL, GRPO, PPO, DAPO, Reinforce++, SAC, CrossQ, RLPD, SAC-Flow, DSRL, and RECAP/CFG among others.
- What robots are supported for real-world RL?
  - Franka Arm (with RealSense, ZED cameras, Franka Hand, Robotiq gripper), XSquare Turtle2 dual-arm, and DOS-W1. More robots are being added.
- Can I use RLinf without real hardware?
  - Yes. RLinf supports 10+ simulation backends, including ManiSkill, LIBERO, MetaWorld, IsaacLab, RoboCasa, and Calvin, all accessible through the same API.

## Why It Matters
RLinf matters because reinforcement learning for robotics has been fragmented across incompatible tools, simulators, and algorithms. RLinf provides a unified infrastructure that works across simulation, real-world robots, and agentic AI, with the aim of reducing the engineering overhead of setting up RL experiments from weeks to hours. Its support for real-world online RL (HG-DAgger) and production-grade RL algorithms (PPO, GRPO, SAC, DAPO) gives it unusually broad coverage for an open RL framework.


## Best For
- Robotics researchers running RL experiments across simulation and real hardware
- Teams fine-tuning VLA models with reinforcement learning
- Developers building agentic AI systems with RL-based training

## Not For
- Beginners looking for a simple out-of-the-box robot control interface (start with LeRobot)

## What It Actually Does
- Unified RL across simulation and real hardware: RLinf supports 10+ simulation backends (ManiSkill, LIBERO, MetaWorld, IsaacLab, RoboCasa, Calvin, etc.) and real-world robots (Franka, XSquare Turtle2, DOS-W1) with the same API.
  - Why it matters: You can prototype in simulation and deploy on real hardware without rewriting your RL pipeline.
- State-of-the-art VLA RL fine-tuning: Fine-tune Pi0, Pi0.5, GR00T, OpenVLA, LingBot-VLA and other VLA models using RL algorithms like GRPO, PPO, and DAPO.
  - Why it matters: VLA models are typically trained with imitation learning only. RLinf enables RL-based post-training that can surpass demonstration quality.
- Real-world online RL with HG-DAgger: Human-Gated DAgger (HG-DAgger) supports safer online RL on real robots. A human supervisor decides when the policy's actions are executed and when human corrections take over.
  - Why it matters: Online RL on real hardware is dangerous without safety mechanisms. HG-DAgger provides a practical bridge between human demonstrations and autonomous RL.
- Agentic AI support: Extends beyond robotics to support RL for language agents, including Search-R1, rStar2, coding agents, and multi-agent systems.
  - Why it matters: RLinf is one of the few frameworks that bridges embodied RL and agentic RL in a single codebase.
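
The human-gated loop from the HG-DAgger item above can be sketched as follows. This is a toy simulation under stated assumptions: the supervisor is a scripted function rather than a person at a GUI, the "robot" is a 1-D point, and all names (`Transition`, `run_gated_episode`, `cautious_supervisor`, `reckless_policy`) are hypothetical and not part of RLinf.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Transition:
    obs: float
    action: float
    from_human: bool  # True when the supervisor overrode the policy


def cautious_supervisor(obs: float, proposed: float, limit: float = 0.2) -> Optional[float]:
    """Scripted stand-in for a human: return a correction, or None to let
    the policy's proposed action through."""
    if abs(proposed) <= limit:
        return None
    # Steer toward the origin with a bounded step.
    return max(-limit, min(limit, -obs))


def reckless_policy(obs: float) -> float:
    """Deliberately poor policy that proposes overly large moves."""
    return -2.0 * obs


def run_gated_episode(
    policy: Callable[[float], float],
    supervisor: Callable[[float, float], Optional[float]],
    start: float,
    steps: int = 20,
) -> List[Transition]:
    """Execute the policy under supervision and log every step."""
    log: List[Transition] = []
    obs = start
    for _ in range(steps):
        proposed = policy(obs)
        correction = supervisor(obs, proposed)
        took_over = correction is not None
        action = correction if took_over else proposed
        log.append(Transition(obs, action, took_over))
        obs = obs + action  # toy dynamics: position integrates the action
    return log


log = run_gated_episode(reckless_policy, cautious_supervisor, start=1.0)
human_corrections = [t for t in log if t.from_human]
autonomous_steps = [t for t in log if not t.from_human]
```

Both lists would then feed training: the corrected steps act as supervised labels for states where the policy went wrong, while the autonomous steps record where it was trusted to act.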

## Typical Use Cases
- RL-based post-training for VLA policies: After collecting demonstration data and training a VLA policy with imitation learning, use RLinf to fine-tune the policy with RL for higher success rates.
- Real-world robot learning with human oversight: Deploy RLinf on a Franka arm with HG-DAgger for safer online learning. The human intervenes when the policy makes unsafe moves, and the system learns from both successes and corrections.
- Multi-agent embodied RL research: Use RLinf's multi-agent support to study coordination between multiple robots performing collaborative tasks in simulation.
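
As one concrete piece of the RL post-training mentioned in the first item above, GRPO (one of the algorithms RLinf lists) scores each rollout relative to the other rollouts sampled for the same task. The sketch below shows the commonly described group-normalized advantage; it is a generic illustration rather than RLinf's implementation, and the function name, the `eps` default, and the use of the population standard deviation are assumptions of this example.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Standardize rewards within one group of rollouts for the same task.

    Each advantage is (reward - group mean) / (group std + eps), so rollouts
    that beat their group get positive values and the rest get negative ones.
    """
    if not rewards:
        raise ValueError("rewards must contain at least one rollout")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts of the same manipulation task: 1.0 = success, 0.0 = failure.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

When every rollout in a group receives the same reward, all advantages are zero and the group contributes no learning signal, which is why varied outcomes within a group matter for this family of methods.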

## How It Compares
- Choose RLinf for production RL across robots and agents vs specialized RL libraries: Stable-Baselines3 is simpler for standard RL benchmarks but lacks robot integration. RLinf provides the full stack from simulation to real hardware to agentic AI.

## Fit Matrix
- Robotics or embodied agent workflow: strong. RLinf has multiple signals for robotics or embodied agent workflow, including matching tags, capabilities, category, or positioning. Required check: Separate simulator claims from hardware claims and verify safety boundaries before real-world operation.
- Coding agent workflow: partial. RLinf has at least one signal for coding agent workflow, but should be checked against a real task before adoption. Required check: Run a small repository change and inspect the diff, tests, and rollback path.
- Memory or RAG workflow: partial. RLinf has at least one signal for memory or RAG workflow, but should be checked against a real task before adoption. Required check: Create, update, retrieve, correct, and delete memory or retrieval objects with real data.
- Reusable skill workflow: partial. RLinf has at least one signal for reusable skill workflow, but should be checked against a real task before adoption. Required check: Run one skill end to end and check whether it produces evidence or structured output.
- Browser automation: weak. RLinf is not primarily positioned for browser automation in the current metadata. Required check: Run one non-sensitive website task and inspect clicks, waits, retries, and changed URLs.
- Connector or protocol layer: weak. RLinf is not primarily positioned for connector or protocol layer in the current metadata. Required check: Connect one low-risk service, then inspect schemas, auth scope, errors, and logs.

## Evidence
- verified: RLinf is listed as open source. Source: License metadata: Apache-2.0
- verified: RLinf has a recorded GitHub repository: RLinf/RLinf. Source: Resource facts and GitHub source link.
- inferred: RLinf's only recorded deployment mode is cloud. Source: OpenAgent decision signal metadata.
- inferred: RLinf is tagged with the robotics and messaging capabilities. Source: OpenAgent capability taxonomy.

## Missing Checks
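- Dedicated docs link is missing from the metadata; the recorded homepage (https://rlinf.readthedocs.io/en/latest/) is the documentation site.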
- Repository freshness has not been recorded.

## Next Actions
- Inspect repository: https://github.com/RLinf/RLinf
- Open Homepage: https://rlinf.readthedocs.io/en/latest/
- Install RLinf: pip install rlinf

## Command Line
### Install RLinf
Install RLinf from PyPI.

```bash
pip install rlinf
```

## Facts
- Category: bots
- Resource type: bot
- Open source: yes
- License: Apache-2.0
- Last verified: 2026-06-04
- GitHub repo: RLinf/RLinf
- GitHub stars: 3161

## Capabilities
- robotics
- messaging

## Structured Use Case Tags
- robotics-agent

## Getting Started
- View the GitHub repository: https://github.com/RLinf/RLinf
- Read the documentation: https://rlinf.readthedocs.io/en/latest/

## Links
- GitHub: https://github.com/RLinf/RLinf
- Homepage: https://rlinf.readthedocs.io/en/latest/

## Structured Outputs
- JSON: https://www.openagent.bot/bots/rlinf.json
- Markdown: https://www.openagent.bot/bots/rlinf.md
- Agent JSON: https://www.openagent.bot/bots/rlinf.agent.json
- Canonical: https://www.openagent.bot/bots/rlinf