OpenAI, the company behind ChatGPT, has introduced its first artificial intelligence (AI)-powered text-to-video generation model, ‘Sora’.
The company claims it can generate videos up to 60 seconds long. This is longer than what any of its competitors in the segment offer, including Google’s Lumiere, which was unveiled last month.
According to the statement, Sora is currently available to red teamers (cybersecurity experts who extensively test software to help companies improve it) and to some content creators. The AI firm also plans to include Coalition for Content Provenance and Authenticity (C2PA) metadata once the model is deployed in an OpenAI product.
“Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions,” the company said in an X post.
As per the statement, “The length of the video it claims to generate is more than ten times what its rivals offer. Google’s Lumiere can generate 5-second-long videos, whereas Runway AI and Pika 1.0 can generate 4-second and 3-second-long videos, respectively.”
Sora is essentially a diffusion model that uses a transformer architecture similar to that of GPT models. The data it consumes and generates is represented in units called patches, which are akin to tokens in text-generating models. As per the company, videos and images are represented as collections of these smaller units of data.
“Using this visual data enabled OpenAI to train the video generation model in different durations, resolutions, and aspect ratios. In addition to text-to-video generation, Sora can also take a still image and generate a video from it,” as per the reports.
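To make the idea of patches more concrete, here is a minimal sketch of how a video could be cut into small blocks spanning a few frames and a few pixels each. It is an illustration only, not OpenAI's implementation, which has not been published as code. The function names `patchify` and `make_video`, the patch sizes, and the use of raw greyscale pixel values are all assumptions made for this example.

```python
def patchify(video, pt, ph, pw):
    """Split video[t][y][x] into non-overlapping blocks ("patches").

    Each patch covers pt frames, ph rows and pw columns and is returned
    as a flat list, loosely analogous to one token in a text model.
    Hypothetical helper for illustration; not Sora's actual method.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be multiples of the patch size")
    patches = []
    for t0 in range(0, frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                patches.append([
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ])
    return patches


def make_video(frames, height, width):
    """Build a toy greyscale clip whose pixel values are a running counter."""
    return [[[(t * height + y) * width + x for x in range(width)]
             for y in range(height)]
            for t in range(frames)]


# Two toy clips with different durations and aspect ratios (assumed sizes).
short_wide = make_video(frames=4, height=4, width=8)
long_tall = make_video(frames=8, height=8, width=4)

a = patchify(short_wide, pt=2, ph=2, pw=2)
b = patchify(long_tall, pt=2, ph=2, pw=2)

# Number of patches differs per clip; the size of each patch does not.
print(len(a), len(a[0]))
print(len(b), len(b[0]))
```

In this toy setup, clips of different lengths and shapes simply yield different numbers of same-sized patches, which is one way to see why a patch-based representation can handle varied durations, resolutions, and aspect ratios.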