Update
  • Kling AI is now accessible as a web version with an enhanced model. A Chinese phone number is still required for access.

Update from July 7, 2024:

Kuaishou introduces a web version of Kling AI based on an "enhanced model" that can generate videos up to ten seconds long. The new model also allows control of camera movements such as panning, tilting and zooming, and supports negative prompts. Free users can create three 10-second videos per day with the enhanced model.

A Chinese cell phone number is still required to register with Kling, even for the web version. However, you can watch demo videos created by other users without registering.

Original article from June 8, 2024:

KLING is the latest AI video generator that could rival OpenAI's Sora

Chinese tech company Kuaishou has unveiled KLING, a new video generation model. Based on the demos, it could rival OpenAI's Sora.

Kuaishou says KLING can make videos up to two minutes long at 1080p resolution and 30 frames per second. It can also model complex motion sequences that are physically accurate.
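
To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: the duration and frame rate are the numbers Kuaishou quotes, while the 1920x1080 frame size is an assumption about what "1080p" means here, since the company does not state the aspect ratio.

```python
# Rough arithmetic for the figures Kuaishou quotes for KLING.
DURATION_SECONDS = 2 * 60   # "up to two minutes"
FRAMES_PER_SECOND = 30      # "30 frames per second"
WIDTH, HEIGHT = 1920, 1080  # assumption: "1080p" read as a 16:9 frame


def total_frames(duration_seconds: int, fps: int) -> int:
    """Number of frames in a clip of the given length and frame rate."""
    return duration_seconds * fps


def total_pixels(frames: int, width: int, height: int) -> int:
    """Number of pixel positions across all frames of the clip."""
    return frames * width * height


frames = total_frames(DURATION_SECONDS, FRAMES_PER_SECOND)
pixels = total_pixels(frames, WIDTH, HEIGHT)

print(frames)  # 3600
print(pixels)  # 7464960000
```

Under these assumptions, a full-length clip means 3,600 frames that all have to stay consistent with each other.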

One video shows a two-minute train ride made with the prompt "Train ride with different landscapes seen through the window." For comparison, OpenAI announced its video model Sora in mid-February 2024, showing relatively consistent videos up to one minute long.

Video: kling.kuaishou.com

Another longer example shows a boy riding a bike through a garden as the seasons change. The landscape changes with the seasons, which may be the trick that makes the length possible, but the boy on the bike stays fairly consistent. It would be more impressive if he rode in circles around the same garden, though.

Video: kling.kuaishou.com

A video of a boy eating a cheeseburger at a fast food restaurant is also noteworthy. The burger gets smaller after he takes the first bite.

Video: kling.kuaishou.com

A knife cutting an onion and a man eating pasta from a plate are similar examples of a physical interaction between two objects that causes a change in the video. However, these examples only last a few seconds, so it's not clear how consistent this "physical simulation" is.

Video: kling.kuaishou.com

The developers say KLING uses a 3D space-time attention system to better model motion and physical interaction, and the model is able to generate long, high-resolution videos thanks to a scalable framework and optimized inference.

Kuaishou claims the model correctly simulates the physical properties of the real world. Using a diffusion transformer, it can also combine concepts and create fictional scenes, such as a cat driving a car through a busy city.

Video: kling.kuaishou.com

Sora also uses a diffusion transformer, and OpenAI describes the video generator as a "world simulator," though AI experts like Meta's Yann LeCun have criticized the startup for making such a bold claim.

KLING is currently available as a public demo in China. Kuaishou is a Beijing-based tech company best known in China for its social media apps. With KLING, it's now entering the race for large-scale generative AI models.

Tech investor and actor Ashton Kutcher has access to a beta version of Sora. He believes that generative AI for video will transform the film market and Hollywood.

Update:

  • X user guizang.ai says he has gained access to the model through a smartphone app that requires a Chinese phone number.
  • On X, he shows a series of prompts and the resulting videos, which are of good quality but no longer than five seconds.
  • The user claims he wasn't cherry-picking: all videos show the first result for each prompt.
  • Another user, Junie, says that it doesn't take more than 3 minutes to generate a video, and you can generate 5 clips at the same time.
Summary
  • Kuaishou, a Chinese technology company, has introduced KLING, an AI model for video generation that can produce videos of up to two minutes in length at 1080p resolution and 30 frames per second.
  • According to the company, KLING is able to model complex motion sequences in a physically correct way using a 3D space-time attention system. A "diffusion transformer" allows the combination of concepts and the generation of fictitious scenes that were not part of the training dataset.
  • The model is currently available as a public demo in China and could compete with Sora from OpenAI, which describes its own video generator as a "world simulator."
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.