Why learn physics fields?

Photorealistic 3D reconstruction methods (NeRF, Gaussian Splatting) capture geometry and appearance but not physics, limiting them to static scenes. There has recently been a surge of interest in integrating physics into 3D modeling, but existing test-time optimization methods are slow and scene-specific. Pixie instead trains a neural network that maps pretrained visual features (e.g., CLIP) to dense material fields of physical properties in a single forward pass, enabling real-time physics simulations.

PixieVerse Dataset

PixieVerse is a large-scale synthetic benchmark for visual-physics learning. It contains thousands of high-quality assets spanning diverse semantic classes and material behaviors, each annotated with dense physical properties. Labels are generated automatically by a VLM pipeline we developed.
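Each PixieVerse asset carries dense labels for Young's modulus E, Poisson's ratio ν, density ρ, and a discrete material ID. A minimal sketch of what one annotated record might look like (the class and field names below are illustrative, not the dataset's actual schema):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class PixieAsset:
    """Hypothetical per-asset record; field names are illustrative only."""
    points: np.ndarray           # (N, 3) point positions
    youngs_modulus: np.ndarray   # (N,) E, in Pa
    poisson_ratio: np.ndarray    # (N,) nu, dimensionless
    density: np.ndarray          # (N,) rho, in kg/m^3
    material_id: np.ndarray      # (N,) discrete material-model class

# Toy example: 4 points of a soft elastic object.
n = 4
asset = PixieAsset(
    points=np.zeros((n, 3)),
    youngs_modulus=np.full(n, 1e5),
    poisson_ratio=np.full(n, 0.3),
    density=np.full(n, 1000.0),
    material_id=np.zeros(n, dtype=int),
)
print(asset.youngs_modulus.shape)  # (4,)
```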

Tip: See our paper for plots of the data distributions.

Assets · Super-classes · Material Models · Annotations (E, ν, ρ, ID)

Method Overview

Multi‑view posed RGB images are encoded by a NeRF with distilled CLIP features, yielding a 3D feature grid. A 3D U‑Net predicts material fields that are transferred onto Gaussian splats and simulated with a Material Point Method (MPM) physics solver to produce 3D physics animations.
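The data flow above can be sketched as a single forward pass from a CLIP feature grid to dense material fields. The sketch below is a toy stand-in, not the actual model: it swaps the 3D U-Net for a per-voxel linear head, and the grid size, channel count, and number of material classes are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes: a distilled CLIP feature grid over a 32^3 voxel volume.
C, D = 16, 32                             # feature channels, grid resolution
feature_grid = rng.standard_normal((C, D, D, D))

# Toy stand-in for the 3D U-Net: a per-voxel linear head predicting
# [E, nu, rho] plus logits over 4 hypothetical discrete material classes.
W = rng.standard_normal((3 + 4, C)) * 0.1
flat = feature_grid.reshape(C, -1)        # (C, D^3)
out = W @ flat                            # (7, D^3)

continuous = out[:3].reshape(3, D, D, D)                 # dense E, nu, rho fields
material = out[3:].argmax(axis=0).reshape(D, D, D)       # discrete class field
print(continuous.shape, material.shape)
```

The key property this illustrates is that material prediction is a single feed-forward pass over the feature grid, rather than a per-scene optimization loop.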

Results

We render a video of each 3D physics animation predicted by our model and by competing methods. A state-of-the-art VLM, Gemini-Pro-2.5, evaluates the realism of each candidate and assigns a score.
VLM Score vs. Runtime: On the PixieVerse benchmark, Pixie outperforms DreamPhysics, OmniPhysGS, and NeRF2Physics by 2.2–4.6× in Gemini‑Pro realism score while running roughly 10³× faster.


Quantitative results
Insight: Pixie achieves state-of-the-art realism while running 10³× faster than existing approaches! ⚡️
More quantitative results: We also report perceptual metrics (PSNR, SSIM) against the reference videos in PixieVerse, the Gemini VLM scores, and five additional metrics, including discrete material accuracy and continuous errors over E, ν, ρ. Standard errors and 95% confidence intervals are included, and best values are bolded. Pixie is by far the best performer, with a 2.21–4.58× improvement in VLM score and 3.6–30.3% gains in PSNR and SSIM over competitors.
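For reference, PSNR between a rendered frame and its reference frame is 10·log₁₀(MAX²/MSE). A minimal numpy version, assuming frames normalized to [0, 1]:

```python
import numpy as np

def psnr(pred: np.ndarray, ref: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; higher is better."""
    mse = np.mean((pred - ref) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.zeros((8, 8))
pred = ref + 0.1            # uniform 0.1 error -> MSE = 0.01
print(psnr(pred, ref))      # ~20 dB
```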
What Pixie predicts: Pixie simultaneously recovers the discrete material class and the continuous properties E, ν, ρ with high accuracy. For example, the model correctly labels foliage as elastic and the metal can as rigid, while recovering realistic stiffness and density gradients within each object.
Tip: See the zoom-in lens for some incorrect predictions.
Pixie against baselines visually: We visualize the predicted material class and E (left and right, respectively) for Pixie and NeRF2Physics, E for DreamPhysics (right), and the plasticity and hyperelasticity function classes predicted by OmniPhysGS. The best Gemini score per scene is highlighted 🟢Green, while low scores are marked 🔴Red. Pixie produces stable, physically plausible motion; DreamPhysics remains overly stiff due to inaccurate fine-grained E prediction or overly high E; OmniPhysGS collapses under load due to unrealistic combinations of plasticity and hyperelasticity functions; and NeRF2Physics exhibits noisy artifacts.

Zero-shot Transfer to Real Scenes

Interactively explore Pixie's material predictions on captured NeRF scenes. Drag the slider to compare input RGB with predicted physics fields, switch feature views, and pick different scenes via thumbnails.

Ablation Study

Ablation Figure: Replacing CLIP features with raw RGB or occupancy features hinders sim-to-real transfer. Incorrect predictions, such as leaves mislabeled as metal or a Young's modulus that is uniform within an object, are marked with question marks.
Ablation Figure: It also severely degrades material accuracy (−20%) and nearly doubles the continuous errors shown in Table 1, confirming the importance of semantic priors.

Authors

Citation

If you find this work useful, please consider citing:
@inproceedings{le2025pixie,
  title={{Pixie}: Fast and Generalizable Supervised 3D Physics Learning from Pixels},
  author={Le, Long and Lucas, Ryan and Wang, Chen and Chen, Chuhao and Jayaraman, Dinesh and Eaton, Eric and Liu, Lingjie},
  year={2025}
}