Abstract:
Recent breakthroughs in Generative AI have enabled the controllable generation of diverse and photorealistic 2D imagery, resulting in transformative applications in areas such as art and design. As human perception is inherently three-dimensional, the ability to generate 3D content in a controllable manner could unlock numerous applications in virtual and augmented reality, healthcare, autonomous vehicles, robotics, and more, with wide-reaching implications. However, we have yet to see 3D generation match the success of its 2D counterpart.
In this talk, I will outline three key challenges on the path to closing this gap. The first challenge is representing the 3D world in a controllable, expressive, and compact manner. To this end, I will describe a novel approach for representing signals (such as 3D objects or scenes) in a decomposable and interpretable manner that allows constraints to be imposed on the signal with provable guarantees. The second challenge is modeling the 3D world in a controllable manner from limited 2D observations. To this end, I will describe a framework for decomposing and manipulating objects in a 3D scene, as well as for generating them from novel views, given only 2D training data. The third challenge is providing an intuitive and flexible interface for humans to create 3D content in a controllable manner. To this end, I will describe a method for intuitively stylizing 3D objects using textual descriptions. Lastly, I will conclude with future directions on using controllable 3D generation for augmented reality, for photorealistic simulation in applications such as autonomous vehicles, and for enabling machines to better understand the world.
https://technion.zoom.us/j/98396614189