The Segment Anything Model (SAM) is a promptable segmentation system from Meta AI with zero-shot generalization to unfamiliar objects and images, without the need for additional training.
Released on April 5, 2023, the Segment Anything project was developed by Meta AI. The company has made the model available under a permissive open license (Apache 2.0) and released the accompanying dataset for research purposes. Segmentation is the process of identifying the image pixels that belong to an object. Meta already uses similar technology internally for tasks such as tagging photos, moderating prohibited content, and determining which posts are recommended to users on Facebook and Instagram.
SAM can identify objects in images from a variety of input prompts, allowing a wide range of segmentation tasks without additional training. Supported prompts include foreground/background points, bounding boxes, and masks; text prompts are explored in the accompanying paper, but the capability is not supported in the released model. SAM's promptable design enables the model to be integrated with other systems.
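As a concrete illustration, here is a minimal sketch of point and box prompting with the official segment-anything Python package; the checkpoint name matches the released ViT-H weights, while the image path and coordinates are placeholders.

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load the released ViT-H checkpoint (placeholder path).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# Compute the image embedding once; all prompts below reuse it.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt with a single foreground point (label 1 = foreground, 0 = background).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return several candidate masks with quality scores
)

# Prompt with a bounding box instead, given as (x0, y0, x1, y1).
masks, scores, _ = predictor.predict(
    box=np.array([100, 100, 400, 400]),
    multimask_output=False,
)
```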
In the blog post accompanying the release of SAM, Meta discussed some potential future use cases for the model across various industries.
Previously, there were two primary approaches to segmentation. The first, interactive segmentation, required a user to iteratively refine a mask. The second, automatic segmentation, allowed specific object categories to be defined ahead of time but required training on a substantial number of manually annotated objects. SAM generalizes these two classes in a single model: it can perform both interactive and automatic segmentation in a flexible way, thanks to its promptable interface. SAM is also trained on a diverse dataset of over 1 billion masks, enabling it to generalize to new types of objects and images.
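For the fully automatic mode, the released package exposes a mask generator that prompts the model with a grid of points and filters the results. A minimal sketch, with placeholder file paths:

```python
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)

# Each result is a dict with a binary mask plus metadata such as area and bbox.
masks = mask_generator.generate(image)
print(f"{len(masks)} masks found; largest area: {max(m['area'] for m in masks)}")
```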
SAM is structured with a ViT-H image encoder that runs once per image and outputs an image embedding. The prompt encoder embeds input prompts such as clicks or boxes. A lightweight transformer-based mask decoder predicts object masks from the image embedding and the prompt embeddings.
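This split is visible in the released code. Below is a simplified sketch of how the three components compose, skipping the image pre- and post-processing that SamPredictor normally handles; the tensor shapes and call signatures reflect the repository at release and should be treated as assumptions.

```python
import torch
from segment_anything import sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")

with torch.no_grad():
    # 1) Heavy step, once per image: ViT-H encoder -> image embedding.
    #    The encoder expects a preprocessed 1024x1024 input; a random tensor stands in here.
    image = torch.randn(1, 3, 1024, 1024)
    image_embedding = sam.image_encoder(image)  # roughly (1, 256, 64, 64)

    # 2) Cheap step, per prompt: embed a single foreground point.
    point = torch.tensor([[[512.0, 512.0]]])  # (batch, num_points, xy)
    label = torch.tensor([[1]])               # 1 = foreground
    sparse, dense = sam.prompt_encoder(points=(point, label), boxes=None, masks=None)

    # 3) Lightweight decoder combines the two to predict masks.
    low_res_masks, iou_predictions = sam.mask_decoder(
        image_embeddings=image_embedding,
        image_pe=sam.prompt_encoder.get_dense_pe(),
        sparse_prompt_embeddings=sparse,
        dense_prompt_embeddings=dense,
        multimask_output=True,
    )
```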
The image encoder has 632M parameters, and the prompt encoder and mask decoder together have 4M parameters. The image encoder is implemented in PyTorch and requires a GPU for efficient inference. The prompt encoder and mask decoder can run directly in PyTorch or be converted to ONNX, and they run efficiently on either a CPU or a GPU.
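As a sketch of the ONNX path: assuming the prompt encoder and mask decoder have been exported with the repository's scripts/export_onnx_model.py, they can be run on CPU with ONNX Runtime. The input names below follow the export script at release; treat them as assumptions and confirm with session.get_inputs().

```python
import numpy as np
import onnxruntime

# Placeholder file name for a decoder exported via scripts/export_onnx_model.py.
session = onnxruntime.InferenceSession("sam_decoder.onnx", providers=["CPUExecutionProvider"])

inputs = {
    # Precomputed by the PyTorch image encoder (run once per image).
    "image_embeddings": np.random.randn(1, 256, 64, 64).astype(np.float32),
    # One foreground point plus a padding point with label -1.
    "point_coords": np.array([[[512.0, 512.0], [0.0, 0.0]]], dtype=np.float32),
    "point_labels": np.array([[1, -1]], dtype=np.float32),
    # No prior mask provided.
    "mask_input": np.zeros((1, 1, 256, 256), dtype=np.float32),
    "has_mask_input": np.zeros(1, dtype=np.float32),
    "orig_im_size": np.array([1024, 1024], dtype=np.float32),
}
masks, iou_predictions, low_res_masks = session.run(None, inputs)
```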