Sora (OpenAI)

Sora is a text-to-video AI model developed by OpenAI that can generate videos up to a minute long based on user prompts.

All edits by Arthur Smalley

Edits on 21 Feb, 2024

Arthur Smalley

edited on 21 Feb, 2024

Edits made to:

Timeline

Article (+8/-8 characters)

Article

Sora is a text-to-video AI model developed by OpenAI that can generate videos up to a minute long based on user prompts. Sora can generate scenes with multiple characters, specific types of motion, details for both the subject and background, and multiple shots within a single generated video with persistent characters and visual style. OpenAI announced Sora on February 15, 2024, making the model available to red teamers to assess potential areas of harm and risks. Upon the announcement, OpenAI also made the model available to visual artists, designers, and filmmakers to gain feedback on performance.

...

Sora uses a transformer architecture similar to OpenAI's GPT models. At a high level, Sora turns videos into patches by compressing them into a lower-dimensional latent space and decomposing the representation into spacetime patches. The model builds on previous OpenAI research from the Dall-E and GPT models. In particular, it uses the recaptioning technique from Dall-E 3, which involves generating descriptive captions for visual training data.

Timeline

February 15, 2024

Edits on 16 Feb, 2024

Arthur Smalley

edited on 16 Feb, 2024

Edits made to:

Infobox (+9 properties)

Timeline (+1 events) (+292 characters)

Description (+120 characters)

Article (+2503 characters)

Further Resources (+1 rows) (+5 cells) (+160 characters)

Article

Overview

Sora is a diffusion model that generates videos by starting from static noise and gradually removing that noise over many steps. Where large language models (LLMs) operate on text tokens, Sora operates on visual patches, an effective representation for models of visual data. Patches are scalable and allow generative models to be trained on a range of video and image types. Sora is a generalist model for visual data, generating videos and images of diverse durations, aspect ratios, and resolutions, and outputting up to one minute of high-definition video. It can generate entire videos at once, extend previously generated videos to make them longer, add missing frames to an existing video, and animate an existing still image to generate a video.

Sora uses a transformer architecture similar to OpenAI's GPT models. At a high level, Sora turns videos into patches by compressing them into a lower-dimensional latent space and decomposing the representation into spacetime patches. The model builds on previous OpenAI research from the Dall-E and GPT models. In particular, it uses the recaptioning technique from Dall-E 3, which involves generating descriptive captions for visual training data.
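The patch decomposition described above can be illustrated with a minimal sketch. It is not OpenAI's implementation: the compression network, patch sizes, and tensor layout used by Sora have not been published in this article, so the function name `to_spacetime_patches`, the (time, height, width, channels) layout, and the 2x2x2 patch size are all assumptions chosen for illustration. The sketch starts from an already-compressed latent tensor and only shows how such a tensor could be cut into non-overlapping spacetime blocks, each flattened into one token-like vector.

```python
import numpy as np


def to_spacetime_patches(latent, pt, ph, pw):
    """Split a latent video into flattened, non-overlapping spacetime patches.

    latent: array of shape (T, H, W, C); this layout is an assumption.
    pt, ph, pw: hypothetical patch extent in time, height, and width.
    Returns an array of shape (num_patches, pt * ph * pw * C).
    """
    T, H, W, C = latent.shape
    if T % pt or H % ph or W % pw:
        raise ValueError("latent dimensions must be divisible by the patch size")

    # Split each axis into (number of blocks, block size).
    x = latent.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # Bring the three block-index axes to the front, block contents to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # One row per spacetime patch.
    return x.reshape(-1, pt * ph * pw * C)


if __name__ == "__main__":
    # Stand-in for a compressed video: 4 frames of 4x4 latents with 2 channels.
    latent = np.arange(4 * 4 * 4 * 2, dtype=np.float32).reshape(4, 4, 4, 2)
    patches = to_spacetime_patches(latent, pt=2, ph=2, pw=2)
    print(patches.shape)  # (8, 16)
```

Each row plays the role a text token plays for an LLM: a transformer would consume the sequence of rows. Because the number of rows simply grows with duration and resolution, the same scheme accommodates inputs of different sizes, which is consistent with the scalability point made in the overview.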

...

OpenAI states that Sora has a number of weaknesses, including difficulty accurately simulating the physics of complex scenes and understanding specific instances of cause and effect. Sora can also confuse spatial details of a prompt or struggle with descriptions of events that take place over time. Working with red teamers, OpenAI is testing Sora to identify risks in areas such as misinformation, hateful content, and bias. The company states it is also building tools to detect misleading content and is leveraging existing safety methods built for other products, such as Dall-E 3. OpenAI did not disclose details of the footage Sora was trained on, stating only that the training corpus contained publicly available videos and videos licensed from copyright owners.

Further Resources

Title: Video generation models as world simulators
Author: OpenAI et al.
Link: https://openai.com/research/video-generation-models-as-world-simulators
Type: Technical report
Date: February 15, 2024

Infobox

Launch Date: February 15, 2024
Competitors
Industry
Product Parent Company: OpenAI
Official Website: https://openai.com/sora

Timeline

February 15, 2024

OpenAI announces Sora, a new text-to-video model that generates videos up to a minute long based on natural language prompts.

OpenAI is making the model available to red teamers for testing and to creative professionals (visual artists, designers, and filmmakers) to gain feedback on performance.

Edits on 16 Feb, 2024

"Created via: Web app"

Arthur Smalley

created this topic on 16 Feb, 2024

Edits made to:

Infobox (+1 properties)

Infobox

Is a: Product
