Segment Anything

The Segment Anything Model (SAM) is a promptable segmentation system from Meta AI with zero-shot generalization to unfamiliar objects and images, without the need for additional training.

Website: segment-anything.com
Is a: Software, Product

Product attributes

Industry: Artificial Intelligence (AI)
Launch Date: April 5, 2023
Product Parent Company: Meta AI
Technologies Used: Image segmentation

Software attributes

First Release: April 5, 2023

Other attributes

Blog: ai.facebook.com/blog/se...ntation/

Overview

The Segment Anything Model (SAM) is a promptable segmentation system with zero-shot generalization to unfamiliar objects and images, without the need for additional training. The Segment Anything project was developed by Meta AI and released on April 5, 2023. Meta released the model under a permissive open license (Apache 2.0) and made the accompanying dataset available for research purposes. Segmentation is the process of identifying which image pixels belong to an object. Meta already uses segmentation technology internally for tasks such as tagging photos, moderating prohibited content, and determining which posts to recommend to users on Facebook and Instagram.

Example of image segmentation using Segment Anything.

SAM can identify objects in images from a variety of input prompts, which allows for a wide range of segmentation tasks without additional training. Supported prompts include foreground/background points, bounding boxes, and masks. Text prompts were being explored, but that capability was not supported when the model was released. SAM's promptable design also allows the model to be integrated with other systems.
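
The prompt types listed above can be pictured as plain data. The following is a minimal sketch in Python, not the interface of the released code: the class name `Prompt`, its fields, and the label convention (1 for foreground, 0 for background) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]               # (x, y) in pixel coordinates
Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)


@dataclass
class Prompt:
    """Hypothetical container for the prompt types described in the text."""
    points: List[Point] = field(default_factory=list)
    labels: List[int] = field(default_factory=list)  # assumed: 1 = foreground, 0 = background
    box: Optional[Box] = None
    mask: Optional[List[List[int]]] = None           # coarse prior mask, rows of 0/1

    def validate(self) -> None:
        """Raise ValueError if the prompt is empty or internally inconsistent."""
        if len(self.points) != len(self.labels):
            raise ValueError("each point needs exactly one label")
        if any(label not in (0, 1) for label in self.labels):
            raise ValueError("labels must be 0 (background) or 1 (foreground)")
        if self.box is not None:
            x0, y0, x1, y1 = self.box
            if x0 >= x1 or y0 >= y1:
                raise ValueError("box must satisfy x_min < x_max and y_min < y_max")
        if not self.points and self.box is None and self.mask is None:
            raise ValueError("at least one prompt is required")


# One foreground click on the object, one background click to exclude a
# neighboring region, and a loose bounding box around the object.
prompt = Prompt(
    points=[(120.0, 80.0), (30.0, 40.0)],
    labels=[1, 0],
    box=(90.0, 50.0, 200.0, 160.0),
)
prompt.validate()
```

Keeping every prompt type optional mirrors the idea that points, boxes, and masks can be supplied alone or in combination.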

In the blog accompanying the release of SAM, Meta discussed some of the future potential use cases of the model across various industries, including the following:

  • AI systems—allowing a multimodal understanding of the world; for example, understanding both the visual and text content of a webpage
  • AR/VR—enabling the selection of an object based on a user’s gaze and then “lifting” it into 3D
  • Content creation—improving creative applications, such as extracting image regions for collages or video editing
  • Science—studying natural occurrences on Earth or even in space; for example, by localizing animals or objects to study and track in video
Model

Previously, there were two primary approaches to segmentation. The first, interactive segmentation, required a user to guide the method by iteratively refining a mask. The second, automatic segmentation, could segment only specific object categories defined ahead of time, and it required training on a substantial number of manually annotated objects. SAM generalizes these two approaches in a single model. Its promptable interface lets it perform both interactive and automatic segmentation in a flexible way. SAM is also trained on a diverse dataset of over 1 billion masks, enabling it to generalize to new types of objects and images.
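
One way a promptable model can cover the automatic case is to prompt it with a regular grid of points and collect the masks it returns. The sketch below only generates such a grid. The function name `point_grid` and its parameters are hypothetical, and the later steps (running the model on each point, then filtering and de-duplicating masks) are left out.

```python
from typing import List, Tuple


def point_grid(width: int, height: int, points_per_side: int) -> List[Tuple[float, float]]:
    """Return points_per_side**2 evenly spaced (x, y) points, each centered in its grid cell."""
    if points_per_side < 1:
        raise ValueError("points_per_side must be at least 1")
    step_x = width / points_per_side
    step_y = height / points_per_side
    return [
        ((col + 0.5) * step_x, (row + 0.5) * step_y)
        for row in range(points_per_side)
        for col in range(points_per_side)
    ]


grid = point_grid(width=640, height=480, points_per_side=4)
print(len(grid))  # 16
print(grid[0])    # (80.0, 60.0)
```

Each grid point would then act as a single foreground click, so the same prompt interface serves both a human clicking interactively and an automated sweep over the image.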

SAM is built around a ViT-H image encoder that runs once per image and outputs an image embedding. A prompt encoder embeds input prompts, such as clicks or boxes. A lightweight transformer-based mask decoder then predicts object masks from the image embedding and the prompt embeddings.
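
The practical consequence of this structure is that the expensive encoding step is paid once per image, while each new prompt only needs the cheap decoding step. The toy sketch below illustrates that caching pattern only. `ToySegmenter` is a hypothetical stand-in with no neural network: its "embedding" is a copy of the pixel values, and its "decoder" simply selects pixels that match the clicked value.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


class ToySegmenter:
    """Hypothetical stand-in for the encoder/decoder split; not the real model."""

    def __init__(self) -> None:
        self.encoder_calls = 0
        self._embedding: Optional[Image] = None

    def set_image(self, image: Image) -> None:
        # "Image encoder": the costly step, run once per image and cached.
        self.encoder_calls += 1
        self._embedding = [list(row) for row in image]

    def predict(self, point: Tuple[int, int]) -> Image:
        # "Prompt encoder" + "mask decoder": cheap, reuses the cached embedding.
        if self._embedding is None:
            raise RuntimeError("call set_image() before predict()")
        x, y = point
        target = self._embedding[y][x]
        return [[1 if value == target else 0 for value in row] for row in self._embedding]


image = [
    [0, 0, 5],
    [0, 5, 5],
    [9, 9, 5],
]
segmenter = ToySegmenter()
segmenter.set_image(image)            # encode once
mask_a = segmenter.predict((2, 0))    # click on a "5" pixel
mask_b = segmenter.predict((0, 2))    # click on a "9" pixel
print(segmenter.encoder_calls)        # 1
```

Two prompts produce two different masks, yet the encoder ran only once, which is what makes interactive use responsive.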

Structure of the Segment Anything Model.

The image encoder has 632M parameters, and the prompt encoder and mask decoder together have 4M parameters. The image encoder is implemented in PyTorch and requires a GPU for efficient inference. The prompt encoder and mask decoder can run directly in PyTorch or be converted to ONNX, and they run efficiently on either a CPU or a GPU.
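
A quick back-of-the-envelope calculation shows why the two parts have such different hardware needs. The parameter counts come from the text above. The 4 bytes per parameter (32-bit floats) is an assumption, and the figures cover weights only, not activations or runtime overhead.

```python
IMAGE_ENCODER_PARAMS = 632_000_000       # from the text
PROMPT_AND_DECODER_PARAMS = 4_000_000    # from the text
BYTES_PER_PARAM = 4                      # assumption: weights stored as float32

total_params = IMAGE_ENCODER_PARAMS + PROMPT_AND_DECODER_PARAMS
decoder_share = PROMPT_AND_DECODER_PARAMS / total_params

encoder_gb = IMAGE_ENCODER_PARAMS * BYTES_PER_PARAM / 1e9
decoder_mb = PROMPT_AND_DECODER_PARAMS * BYTES_PER_PARAM / 1e6

print(f"total parameters: {total_params:,}")            # 636,000,000
print(f"prompt encoder + decoder share: {decoder_share:.2%}")  # 0.63%
print(f"image encoder weights: {encoder_gb:.2f} GB")    # 2.53 GB
print(f"prompt encoder + decoder weights: {decoder_mb:.1f} MB")  # 16.0 MB
```

Under these assumptions, the prompt encoder and mask decoder account for well under 1% of the parameters, which is consistent with their being practical to run on a CPU.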

Further Resources

Title: Segment Anything
Author: Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick
Link: https://arxiv.org/abs/2304.02643
Date: April 5, 2023
