News, service

Meta Sam 3D: Create 3D from 2D Photos

25. 11. 25.

Meta has strengthened its vision AI competitiveness by unveiling its next-generation image segmentation model, SAM3, and its single-image-based 3D reconstruction technology, SAM3D. This release is considered a technological advancement that fundamentally changes the way we understand and edit images and videos. Meta also unveiled a new experimental platform, Segment Anything Playground, allowing anyone to try out both models.

Sam3 provides the ability to detect and segment objects within images using text commands alone. While existing models focused on categorized labels like vehicles or animals, Sam3 also recognizes specific expressions like the red hat. This allows users to select and edit only the desired objects during photo or video editing, offering new ways for creators and editors to utilize Sam3. Meta announced that Sam3 will be integrated into its Edits app and Meta's AI-based service, Vibes.

Sam3D is a model that reconstructs objects and people in three dimensions using just a single photo. Sam3D Object reproduces the shape and texture of various objects, such as furniture and household items, found in everyday scenes. Sam3D Body precisely estimates human full-body poses and body shapes. Meta explained that it built a large-scale image- and video-based dataset to achieve this, and designed it to handle complex poses and occluded bodies.

Meta anticipates that Sam3D will be utilized in fields where visual information and spatial understanding are crucial, such as virtual reality content creation, game production, sports medicine, and robotics. Facebook Marketplace has already introduced a new feature, View in Room. Users can preview furniture in their own space before purchasing, a feature created through the collaboration between Sam3D and Sam3.

The model also has limitations. Sam3D is trained to predict only one object at a time, limiting its ability to reflect interactions between multiple objects or people. Degraded texture detail is also observed during full-body reconstruction. Meta proposed improving output resolution and multi-object learning as future research tasks.

To expand collaboration with the research ecosystem, Meta has released the Sam3 model weights and evaluation dataset. Sam3D also provides checkpoints and inference code, allowing developers to validate and apply its features. Through the playground, users can upload their own photos or videos to experience text-prompt-based segmentation and 3D reconstruction.