Sora’s Initial Trial: Encountering Video Errors and Technical Challenges

OpenAI’s Sora, launched in December 2024, is a tool designed to generate ultra-realistic videos from text prompts. Despite its user-friendly interface and impressive speed, Sora faces significant technical limitations, particularly for subscribers on the cheaper ChatGPT Plus plan. Video quality varies widely: some results are lifelike, while others are flawed or unrealistic. For now, Sora is more an experimental tool than a reliable video generation resource, and it needs further development before it is ready for professional use.

OpenAI’s Sora: A Groundbreaking Video Generation Tool

In February 2024, OpenAI made waves in the tech world with the announcement of Sora, a tool designed to produce ultra-realistic videos from mere text prompts. This innovation sparked a debate about the potential for misinformation in an era where distinguishing fact from fiction is increasingly difficult. The videos showcased by OpenAI were so lifelike that it became nearly impossible to tell reality from fabrication.

Fast forward ten months, and Sora has officially launched. The service opened its doors to the public on December 9, 2024, although in France it is currently accessible only through a VPN. But does Sora live up to the hype? After several hours of testing Sora V1, here’s what we discovered.

A User-Friendly Experience with an Intuitive Interface

Let’s begin with the positives: Sora’s interface is impressively designed. OpenAI opted to create a standalone platform at sora.com, avoiding the clutter of adding video generation to ChatGPT. This thoughtful decision allows users to access specific video tools without overwhelming the chatbot, paving the way for a more focused web application.

The homepage of Sora features a sidebar with various tabs, leading to well-constructed video editing tools that allow users to create what OpenAI refers to as a “storyboard.” In simple terms, you can instruct Sora to generate an image at the first second, incorporate a blur transition, and display something entirely different at the third second, essentially directing your own video. Notably, Sora enhances user prompts by automatically suggesting longer and more specific queries.
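One way to picture the storyboard described above is as a list of timed cards. The sketch below is purely illustrative: the `Card` class, its fields, the `active_card` helper, and the example prompts are all hypothetical names and values made up here, and none of this reflects Sora’s internals or any OpenAI API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Card:
    """One storyboard entry (hypothetical model, not Sora's own format)."""
    at_second: float          # when this card starts on the timeline
    prompt: str               # what should be shown from that moment
    transition: str = "cut"   # how the previous card gives way to this one


def active_card(cards: List[Card], t: float) -> Optional[Card]:
    """Return the most recent card that has started by time t, or None."""
    current = None
    for card in sorted(cards, key=lambda c: c.at_second):
        if card.at_second <= t:
            current = card
    return current


# Made-up example: one scene at second 1, a blur into another at second 3.
storyboard = [
    Card(at_second=1.0, prompt="a lighthouse at dawn"),
    Card(at_second=3.0, prompt="a crowded night market", transition="blur"),
]
```

Under this model, asking which card is active at second 2 returns the first scene, and from second 3 onward it returns the second scene with its blur transition.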

Another highlight of Sora is its speed. When it runs smoothly, which was not always the case during the initial server overloads, Sora generates videos in approximately 30 seconds. The service runs in a web interface, though downloading videos can be cumbersome on certain devices.

Technical Limitations That Hinder Full Potential

However, not everything is seamless. One significant drawback of Sora is the numerous technical limitations that OpenAI has instituted to manage server load. Currently, only paying subscribers of ChatGPT can access the service, with full features locked behind a $200 monthly fee for ChatGPT Pro. Those opting for ChatGPT Plus at $20 per month face severe restrictions.

With a ChatGPT Plus subscription, users receive just 1,000 credits per month, enough for only 50 videos at the lowest quality (480p, ten seconds, one request at a time). Clips are capped at 10 seconds in 480p and just 5 seconds in 720p. This design seems intended to discourage heavy use, and users risk facing further restrictions down the line. With clips this short, it is hard to find practical applications for the generated videos.
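The Plus figures imply a per-video cost that is easy to work out. The short sketch below does the arithmetic, assuming credits are charged at a flat rate per video at a given quality; the function and constant names are our own, and only the 1,000 credits, 50 videos, and 10-second cap come from the figures quoted above.

```python
# Figures quoted in this article for ChatGPT Plus.
MONTHLY_CREDITS = 1_000
VIDEOS_AT_LOWEST_QUALITY = 50
MAX_SECONDS_480P = 10


def credits_per_video(monthly_credits: int, videos: int) -> float:
    """Implied cost of one video, assuming a flat per-video charge."""
    return monthly_credits / videos


def total_footage_seconds(videos: int, seconds_per_video: int) -> int:
    """Upper bound on monthly footage if every clip uses its full length."""
    return videos * seconds_per_video


cost = credits_per_video(MONTHLY_CREDITS, VIDEOS_AT_LOWEST_QUALITY)
footage = total_footage_seconds(VIDEOS_AT_LOWEST_QUALITY, MAX_SECONDS_480P)

print(f"Implied cost: {cost:.0f} credits per 480p video")
print(f"Maximum footage: {footage} s (about {footage / 60:.1f} minutes) per month")
```

On these assumptions, a 480p clip costs 20 credits, and a full month of Plus credits yields at most 500 seconds of footage, a little over eight minutes.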

Additionally, while Sora can animate photos into videos, this feature is only fully available to those paying the $200 subscription. For others, only objects and animals can be animated, leaving out human figures. This limitation suggests that OpenAI is keen on promoting its premium subscription.

Another technical hurdle is the process of sending images to create videos, which often yields results that are either too realistic or implausible. As it stands, Sora feels more like an experimental tool than a reliable resource for creating believable content.

Mixed Results in Video Generation

So, how do the videos stack up? We generated about fifteen examples, and the results were a mixed bag. While some videos impressed, the majority fell short of expectations.

A notable example of Sora’s shortcomings was a video of a football player missing a penalty, in which the player performed inexplicable acrobatics and an extra shooter appeared on the field. In another, Santa Claus running down the Champs-Élysées looked unrealistic: his face was obscured, he had two beards, and his hat did not move in a physically plausible way. Other attempts, like a superhero cat or an eagle soaring above a canyon, exhibited unnatural movements.

On the flip side, Sora produced lifelike results in simpler scenarios, such as a man using his phone in the subway and a koala munching on a leaf. However, when tasked with creating a 2D infographic of a delivery truck, Sora struggled, generating nonsensical text and mixing images.

Geographical accuracy was also an issue. Sora misrepresented Nice, showing sand where the city’s beaches are pebble, and confused male and female figures. In one instance, it made up an airport that doesn’t exist in the city and failed to depict an accurate boat. Though the human faces were decent, our main character ended up with just four fingers.

Finally, Sora was inconsistent in how it handled copyright concerns. While it correctly refused to animate a photo of the Phrygian cap, it also inexplicably rejected the term “panda.” The tool attempts to transform prompts into a “story,” but the resulting video fails to materialize.

Unlike ChatGPT, which quickly captivated users, Sora is still evolving. The potential for ultra-realistic video generation is there, but OpenAI has a considerable way to go before achieving that goal.

As it stands, Sora V1 serves primarily as an intriguing toy for tech enthusiasts, offering occasional pleasant surprises but lacking the robustness needed for professional applications. For that, we might need to wait a few more years.