Robosuite vs Raw MuJoCo: Which Should You Use for Manipulation RL?
Most modern manipulation RL papers run on MuJoCo physics, but only some use raw MuJoCo. Many instead reach for Robosuite, a task-and-asset framework that sits on top of MuJoCo and provides Franka, Sawyer, UR5e, and other arms with standardized observations, controllers, and benchmark tasks. This guide explains which layer you actually need, and where Robosuite's opinionated defaults help versus hurt.
TL;DR
- Robosuite is a higher-level manipulation framework built on MuJoCo. It ships robots, tasks, controllers, and gym-compatible environments.
- Raw MuJoCo is just the physics engine and Python bindings. Tasks, controllers, and observations are yours to define.
- If you want a benchmark result that is comparable to published work on LIBERO, RoboMimic, or MimicGen, use Robosuite. Those benchmarks are built on it.
- If you are building a novel robot, novel controller, or non-standard observation space, raw MuJoCo is lighter and less opinionated.
- Both are free: Robosuite is MIT-licensed, MuJoCo is Apache 2.0. Both run on CPU or GPU (via MJX for raw MuJoCo).
What each project actually is
MuJoCo (Multi-Joint dynamics with Contact) is a physics engine originally created by Emo Todorov, acquired by DeepMind in 2021, and fully open-sourced under Apache 2.0 in 2022. It is a C library with Python bindings that takes an MJCF model (an XML description of a scene and its robots), simulates contact-rich rigid-body dynamics, and returns state. That is, roughly, everything it does. There are no task definitions, no rewards, no benchmark environments, and no robots shipped by default; those live in the separate MuJoCo Menagerie repository or in MJCF files you write yourself.
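Everything MuJoCo consumes is MJCF. A minimal, hypothetical scene — one hinged capsule, no robot — fits in a few lines:

```xml
<mujoco model="minimal">
  <worldbody>
    <body pos="0 0 1">
      <joint name="swing" type="hinge" axis="0 1 0"/>
      <geom type="capsule" fromto="0 0 0 0 0 -0.4" size="0.04"/>
    </body>
  </worldbody>
</mujoco>
```

Everything else — rewards, resets, observation extraction — is code you write against the engine's state arrays.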
Robosuite is a Python framework originally developed by researchers at Stanford and UT Austin, now maintained under the ARISE Initiative, that provides manipulation-focused scaffolding on top of MuJoCo. Concretely, it ships pre-configured robot models (Franka Panda, Rethink Sawyer, Universal Robots UR5e, Kinova Gen3, and more), standardized controllers (operational space, joint position, joint velocity), a set of benchmark tasks (Lift, Stack, PickPlace, NutAssembly, Door, Wipe, ToolHang, and two-arm variants), and a Gymnasium-compatible environment API. The official site is the best starting point, and the repository at github.com/ARISE-Initiative/robosuite is actively maintained into 2026.
Where they share DNA and where they diverge
Everything that happens inside a Robosuite simulation is still MuJoCo under the hood — the physics, the contacts, the rendering, the MJCF scene format. Robosuite is not a fork of MuJoCo, it is a user of MuJoCo's Python bindings. That means any fix or improvement that lands in MuJoCo flows into Robosuite on its next upstream bump, and anything you could do in raw MuJoCo (custom controllers, unusual joint structures, deformables via composites) is still available from inside a Robosuite env if you subclass the right base classes.
What diverges is everything above the physics layer. Robosuite imposes a particular view of what a manipulation task looks like: a robot plus a task XML plus a controller plus an observation spec plus a reward function. That view fits the large majority of published manipulation research, which is why the framework has the adoption it does. It is a worse fit for research that breaks those assumptions: mobile manipulation, whole-body humanoid control, and dexterous-hand tasks with unusual sensor suites are all awkward in Robosuite but natural in raw MuJoCo.
Feature comparison
| Dimension | Robosuite | Raw MuJoCo |
|---|---|---|
| License | MIT | Apache 2.0 |
| Physics engine | MuJoCo (as a dependency) | MuJoCo (directly) |
| Pre-built tasks | 13+ standardized manipulation tasks | None shipped by default |
| Pre-built robots | Franka, Sawyer, UR5e, Kinova, Baxter, IIWA | Via Menagerie repo, manual import |
| Controllers | Operational space, joint position, joint velocity, IK | BYO |
| GPU acceleration | CPU by default; MJX port exists but partial | MJX (JAX-native GPU/TPU) |
| Sensor sim | RGB, depth, segmentation, proprioception, force-torque | Whatever you configure in MJCF |
| Photorealism | MuJoCo's built-in OpenGL renderer; fine, not photoreal | Same renderer; can integrate Blender offline |
| Parallel envs | Gym-style vector env on CPU | MJX scales to 4096+ on a GPU |
| ROS support | None direct; bridge via mujoco_ros | None direct; same bridge available |
| Learning curve | Low (hours to first training run) | Moderate (days to first clean task) |
| Paper & docs | robosuite whitepaper (arXiv:2009.12293); docs at robosuite.ai | Todorov et al. (IROS 2012); docs at mujoco.readthedocs.io |
Where Robosuite pays off
There are three places where Robosuite saves meaningful engineering time. The first is benchmark comparability. LIBERO, RoboMimic, and MimicGen — the three most cited manipulation imitation-learning and RL benchmarks of 2024-2026 — all build on Robosuite. If you want your result to sit alongside published numbers, building on Robosuite means you share task definitions, assets, and controllers with prior work, and the comparison is apples-to-apples. Reimplementing those benchmarks in raw MuJoCo is possible but introduces small divergences that your reviewers will notice.
The second is controllers. Implementing operational-space control well for a 7-DOF arm is a week of engineering; Robosuite gives you a working OSC controller out of the box, tuned against the robots it ships. If your policy outputs delta end-effector poses rather than joint torques, using Robosuite's controller is almost always the right move.
The third is VLA evaluation. OpenVLA, Octo, RT-X, and newer VLA families all publish eval harnesses that either use Robosuite directly or define tasks in Robosuite-compatible MJCF. If your end goal is to measure a VLA policy's success rate on standard suites, Robosuite is already on the critical path.
Where raw MuJoCo pays off
Raw MuJoCo is the right base when Robosuite's assumptions get in your way. Common scenarios: you are simulating a humanoid or mobile manipulator (Robosuite is arm-centric); you are designing a novel gripper or a multi-fingered hand with custom tendons (Robosuite's controller suite does not target this); you want MJX on GPU with thousands of parallel envs (the Robosuite MJX port is useful but less mature than raw MJX); you need full control over the observation schema for an unusual VLA or diffusion-policy architecture; or you are doing physics-centric research where the task scaffolding would just be overhead.
Raw MuJoCo also wins on installation minimalism. `pip install mujoco` gets you the engine, Python bindings, and a viewer, and you are done. There is no curriculum of robots, tasks, and controllers to learn before you can simulate anything. For single-file reproducible research, this is a real advantage.
Performance considerations
Both paths go through the same physics kernel, so per-step throughput on a single env is comparable. Where they diverge is parallelism. Raw MuJoCo gives you access to MJX, a JAX-native rewrite that runs the physics on GPU or TPU and scales cleanly to thousands of parallel envs. Robosuite's primary path is CPU vectorization via Gymnasium's vector-env API, which tops out well below what MJX can do on modern hardware. The Robosuite team has been steadily porting tasks to MJX, and by 2026 the common manipulation benchmarks all have an MJX variant. If your plan is massive-batch PPO on a single GPU, raw MuJoCo (through MJX) is the lower-risk path today; if your plan is moderate-batch RL or imitation learning with Gymnasium-compatible envs, Robosuite is fine.
Asset and ecosystem considerations
Robosuite's asset library is curated and consistent: every shipped robot has verified inertias, collision meshes, controller tunings, and example tasks. This is invaluable when starting out. MuJoCo Menagerie is the closest analog on the raw MuJoCo side — a DeepMind-maintained repository of high-quality robot models (Unitree, Boston Dynamics, Franka, Shadow Hand, ARX, and dozens more) in MJCF format. Menagerie assets tend to be higher-fidelity than Robosuite's (better collision meshes, more accurate inertias), but they ship just the model, with no tasks, controllers, or reward functions attached.
A common pattern in 2026 labs is to use Menagerie for the robot model, raw MuJoCo for the physics, and Robosuite-style task scaffolding (either from Robosuite itself or a custom lightweight clone) for the RL harness. This gives you high-fidelity models, predictable physics, and a familiar RL API without being locked into Robosuite's shipped arms.
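One way to picture that "custom lightweight clone" — the names here (`TaskSpec`, `rollout`) are illustrative, not any library's API, and the callables would wrap raw-MuJoCo calls in practice:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Obs = Dict[str, float]

# A hypothetical minimal task harness in the Robosuite style: reset, step,
# and reward hooks around whatever physics backend sits underneath.
@dataclass
class TaskSpec:
    reset_fn: Callable[[], Obs]               # build scene, return initial obs
    step_fn: Callable[[Obs, List[float]], Obs]  # apply action, advance physics
    reward_fn: Callable[[Obs], float]         # shaped or sparse task reward
    horizon: int = 500

def rollout(task: TaskSpec, policy: Callable[[Obs], List[float]]) -> float:
    """Run one fixed-horizon episode and return the summed reward."""
    obs, total = task.reset_fn(), 0.0
    for _ in range(task.horizon):
        obs = task.step_fn(obs, policy(obs))
        total += task.reward_fn(obs)
    return total
```

The point of the pattern is that the RL harness depends only on these three callables, so the robot model (Menagerie) and the physics (raw MuJoCo or MJX) can be swapped without touching the training loop.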
VLA evaluation: the decisive factor
For most teams in 2026, the decisive factor is whether their policy will be compared against published VLA numbers. If the answer is yes, the path of least resistance is Robosuite plus LIBERO or MimicGen, because that is where the published numbers live. If you are training in raw MuJoCo and then re-running on Robosuite benchmarks for evaluation, expect a few days of integration work per benchmark suite. It is doable, but it is the kind of work that tends to expand to fill whatever time the deadline allows.
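Much of that integration work is observation plumbing. A hypothetical adapter for it — the right-hand key names follow Robosuite's conventions, the left-hand names stand in for whatever your raw pipeline happens to emit:

```python
# Map a raw-MuJoCo observation dict onto the Robosuite-style keys a
# benchmark policy expects. KEY_MAP is illustrative, not a library API.
KEY_MAP = {
    "eef_pos": "robot0_eef_pos",
    "eef_quat": "robot0_eef_quat",
    "gripper_qpos": "robot0_gripper_qpos",
}

def to_robosuite_obs(raw_obs: dict) -> dict:
    missing = set(KEY_MAP) - set(raw_obs)
    if missing:  # fail loudly instead of silently evaluating on garbage
        raise KeyError(f"raw observation lacks keys: {sorted(missing)}")
    return {new: raw_obs[old] for old, new in KEY_MAP.items()}
```

Failing loudly on missing keys matters here: a silent zero-filled observation produces plausible-looking but meaningless success rates.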
When to use Robosuite
- You want to benchmark on LIBERO, MimicGen, or RoboMimic.
- You need a working operational-space controller for a Franka, Sawyer, UR5e, or similar arm.
- You are evaluating a VLA (OpenVLA, Octo, RT-X) on standard manipulation tasks.
- You want gym-compatible envs and a minimal training script.
- You want to write less code and read fewer MJCF files.
When to use raw MuJoCo
- You are simulating a humanoid, quadruped, or mobile manipulator.
- You are designing a novel gripper, dexterous hand, or unusual sensor suite.
- You need MJX on GPU with thousands of parallel envs.
- You are doing physics-centric research and need full control over integration, contact model, and constraints.
- You want the thinnest possible dependency stack for reproducibility.
Verdict
Robosuite is raw MuJoCo's opinionated face for manipulation research. If your project fits its mold — a tabletop arm, a handful of objects, a standardized controller, a benchmark comparison — use Robosuite and save yourself weeks of plumbing. If your project lives outside that mold, build on raw MuJoCo, borrow assets from Menagerie, and keep Robosuite as the eval harness for the specific benchmarks you need to publish against.
If you want a manipulation RL stack delivered with calibrated dynamics, VLA eval harnesses, and reproducible task definitions, the SVRC Services team builds them both ways and hands them off. See also our ACT vs Diffusion Policy guide for the policy side of the same question.