Human Scene Interaction
Human-scene interaction (HSI) research focuses on understanding and modeling how humans interact with their 3D environments, with the goal of creating realistic and controllable simulations of human behavior within scenes. Current work emphasizes models that accurately capture dynamic object interactions and human motion from sources such as egocentric video and motion capture, often employing neural radiance fields, diffusion models, and reinforcement learning to synthesize and control these interactions. The field advances embodied AI, virtual and augmented reality, and robotics by enabling more realistic simulations and human-robot collaboration. The development of large-scale, high-quality HSI datasets is another key focus, as such data supports the training and evaluation of increasingly sophisticated models.
Papers
HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid
Xinyu Xu, Yizheng Zhang, Yong-Lu Li, Lei Han, Cewu Lu
EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting
Daiwei Zhang, Gengyan Li, Jiajie Li, Mickaël Bressieux, Otmar Hilliges, Marc Pollefeys, Luc Van Gool, Xi Wang