Pixie: 3D Physics from Pixels

Why learn physics fields?

Photorealistic 3D reconstructions (NeRF, Gaussian Splatting) capture geometry and appearance but lack physics, which limits them to static scenes. Interest in integrating physics into 3D modeling has surged recently, but existing test‑time optimisation methods are slow and scene‑specific. Pixie trains a neural network that maps pretrained visual features (i.e., CLIP) to dense material fields of physical properties in a single forward pass, enabling real‑time physics simulations.

PixieVerse Dataset

PixieVerse is a large-scale synthetic benchmark for visual-physics learning. It features thousands of high-quality assets that span diverse semantic classes and material behaviours, each annotated with dense physical properties. The dataset is labeled automatically via a VLM pipeline we developed.

Annotations: E (Young's modulus), ν (Poisson's ratio), ρ (density), material ID
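
Elasticity-based solvers commonly work with the Lamé parameters μ and λ rather than with E and ν directly. The sketch below shows the standard isotropic conversion from an annotated (E, ν) pair. The function name `lame_parameters` is hypothetical and is not taken from the Pixie code; whether Pixie's solver performs this exact conversion is an assumption.

```python
from typing import Tuple


def lame_parameters(E: float, nu: float) -> Tuple[float, float]:
    """Convert Young's modulus E and Poisson's ratio nu to (mu, lam).

    Standard isotropic linear-elasticity relations:
        mu  = E / (2 (1 + nu))
        lam = E nu / ((1 + nu) (1 - 2 nu))
    """
    if E <= 0.0:
        raise ValueError("Young's modulus must be positive")
    # nu = 0.5 is the incompressible limit, where lam diverges.
    if not -1.0 < nu < 0.5:
        raise ValueError("Poisson's ratio must lie in (-1, 0.5)")
    mu = E / (2.0 * (1.0 + nu))
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    return mu, lam


# Hypothetical example values, not taken from PixieVerse.
mu, lam = lame_parameters(E=1.0e6, nu=0.25)
```

For E = 1e6 Pa and ν = 0.25, both μ and λ come out to 4e5 Pa. The density ρ and the material ID need no conversion.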

Method Overview

Multi‑view posed RGB images are encoded by a NeRF with distilled CLIP features, yielding a 3D feature grid. A 3D U‑Net predicts material fields, which are transferred onto Gaussian splats and simulated with a Material Point Method (MPM) physics solver to produce 3D physics simulations.
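
The transfer step from the predicted grid onto Gaussian splats can be sketched as a simple lookup. The example below assigns each splat the value of the voxel that contains its center. Nearest-voxel lookup, the axis-aligned cubic grid, and all names (`world_to_voxel`, `transfer_to_splats`, `grid_min`, `voxel_size`) are assumptions made for illustration; the source does not state which interpolation scheme Pixie uses.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Grid = List[List[List[float]]]  # field[i][j][k], cubic resolution assumed


def world_to_voxel(p: Vec3, grid_min: Vec3, voxel_size: float,
                   resolution: int) -> Tuple[int, int, int]:
    """Map a world-space point to the index of the voxel containing it.

    Points outside the grid are clamped to the nearest boundary voxel.
    """
    idx = []
    for axis in range(3):
        i = int((p[axis] - grid_min[axis]) // voxel_size)
        idx.append(min(max(i, 0), resolution - 1))
    return idx[0], idx[1], idx[2]


def transfer_to_splats(centers: Sequence[Vec3], field: Grid,
                       grid_min: Vec3, voxel_size: float) -> List[float]:
    """Give every splat center the field value of its enclosing voxel."""
    resolution = len(field)
    values = []
    for c in centers:
        i, j, k = world_to_voxel(c, grid_min, voxel_size, resolution)
        values.append(field[i][j][k])
    return values


# Toy 2x2x2 field whose value encodes the voxel index as 100*i + 10*j + k.
toy_field = [[[100.0 * i + 10.0 * j + k for k in range(2)]
              for j in range(2)] for i in range(2)]
toy_centers = [(0.1, 0.1, 0.1), (0.9, 0.1, 0.6), (2.0, -1.0, 0.2)]
splat_values = transfer_to_splats(toy_centers, toy_field,
                                  grid_min=(0.0, 0.0, 0.0), voxel_size=0.5)
```

The same lookup would be run once per predicted field (material class, E, ν, ρ). Trilinear interpolation is a natural alternative for the continuous fields, while a discrete material ID is better served by a nearest-voxel lookup like this one.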

Results

We render a video of each 3D physics animation predicted by our model and by competing methods. A state-of-the-art VLM, Gemini-Pro-2.5, evaluates the realism of each video and scores the candidates.

What Pixie predicts: Pixie simultaneously recovers the discrete material class, E, ν, and ρ with a high degree of accuracy. For example, the model correctly labels foliage as elastic and the metal can as rigid, while recovering realistic stiffness and density gradients within each object.

Pixie against baselines visually: We visualize the predicted material class and E (left and right, respectively) for Pixie and NeRF2Physics, E for DreamPhysics (right), and the plasticity and hyperelastic function classes predicted by OmniPhysGS. The best Gemini score per scene is highlighted 🟢Green, while low scores are marked 🔴Red. Pixie produces stable, physically plausible motion. DreamPhysics remains overly stiff because its fine-grained E prediction is inaccurate or its E is too high. OmniPhysGS collapses under load because it combines plasticity and hyperelastic functions unrealistically. NeRF2Physics exhibits noisy artifacts.

Zero-shot Transfer to Real Scenes

Interactively explore Pixie's material predictions on captured NeRF scenes. Drag the slider to compare input RGB with predicted physics fields, switch feature views, and pick different scenes via thumbnails.

Ablation Study

Ablation Figure: Replacing CLIP with raw RGB or occupancy features hinders sim2real transfer. Incorrect predictions, such as leaves mislabeled as metal or a Young's modulus that is uniform within an object, are marked with question marks.

Ablation Figure: Replacing CLIP also severely degrades material accuracy (−20 %) and almost doubles the continuous errors shown in Table 1, confirming the importance of semantic priors.
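
The two kinds of numbers in this comparison can be illustrated with a minimal sketch: a classification accuracy for the discrete material ID and an error for a continuous field such as E. The exact metric definitions behind Table 1 are not given here, so the log10-scale absolute error below is an assumption, chosen because Young's modulus spans several orders of magnitude. Function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def class_accuracy(pred_ids: Sequence[int], true_ids: Sequence[int]) -> float:
    """Fraction of points whose predicted material ID matches the label."""
    if len(pred_ids) != len(true_ids) or not true_ids:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, t in zip(pred_ids, true_ids) if p == t)
    return hits / len(true_ids)


def mean_abs_log10_error(pred: Sequence[float], true: Sequence[float]) -> float:
    """Mean absolute difference of log10 values (assumed metric, see lead-in)."""
    if len(pred) != len(true) or not true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(math.log10(p) - math.log10(t)) for p, t in zip(pred, true))
    return total / len(true)


# Made-up values for illustration only; these are not results from the paper.
acc = class_accuracy([0, 1, 2, 2], [0, 1, 1, 2])
err = mean_abs_log10_error([1.0e6, 1.0e4], [1.0e5, 1.0e4])
```

Under this assumed error, a value of 1.0 means the prediction is off by one order of magnitude on average.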

Authors

Long Le¹

Ryan Lucas²

Chen Wang¹

Chuhao Chen¹

Dinesh Jayaraman¹

Eric Eaton¹

Lingjie Liu¹