QuickVid uses AI to generate short videos with voiceovers (TechCrunch)

Generative AI is coming for video. A new website called QuickVid combines several generative AI systems into a single tool for automatically creating short YouTube, Instagram, TikTok and Snapchat videos.
Just type in a word, and QuickVid selects a background video from a stock library, writes a script and keywords, overlays images generated by DALL-E 2, and adds a synthesized voice-over and background music from YouTube’s library of royalty-free music. QuickVid creator Daniel Habib said he’s building the service to help creators meet the “growing” demands of fans.
“By providing creators with the tools to produce high-quality content quickly and easily, QuickVid helps creators increase content output and reduce the risk of burnout,” Habib told TechCrunch in an email interview. “Our goal is to empower your favorite creators to meet the needs of their audience by leveraging advances in artificial intelligence.”
But depending on how they’re used, tools like QuickVid could flood already crowded channels with spam and duplicate content. They also face potential opposition from creators who choose not to use these tools, either because of cost ($10 per month) or on principle, yet may have to compete with a flood of new AI-generated videos.
Going after video
QuickVid, which Habib, a self-taught developer who previously worked on Facebook Live and video infrastructure at Meta, built in a matter of weeks, launched on December 27th. It’s relatively simple at the moment (Habib says more personalization options are coming in January), but QuickVid can piece together the components that make up a typical informative YouTube clip or TikTok video, including subtitles and even avatars.
It’s easy to use. First, users enter a prompt describing the topic of the video they want to create. QuickVid uses the prompt to generate a script, taking advantage of GPT-3’s generative text capabilities. Based on keywords automatically extracted from the script or entered manually, QuickVid selects a background video from the royalty-free stock media library Pexels and uses DALL-E 2 to generate an overlay image. The voice-over is then output via Google Cloud’s text-to-speech API (Habib says users will soon be able to clone their own voices), and all those elements are combined into a video.
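The flow described above can be sketched roughly as follows. This is a minimal illustration, not QuickVid’s actual code: every name in it (`VideoPlan`, `build_video_plan`, `naive_keywords` and the `fake_*` helpers) is hypothetical, and the stand-in functions only return placeholder strings where a real service would call GPT-3, Pexels, DALL-E 2 and Google Cloud’s text-to-speech API.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class VideoPlan:
    """The pieces a final render step would combine (illustrative fields only)."""
    prompt: str
    script: str
    keywords: List[str]
    background_clip: str
    overlay_image: str
    voiceover: str


def naive_keywords(script: str, limit: int = 3) -> List[str]:
    """Hypothetical keyword picker: most frequent words longer than 3 letters."""
    words = [w.strip(".,!?\"'").lower() for w in script.split()]
    counts = Counter(w for w in words if len(w) > 3)
    return [word for word, _ in counts.most_common(limit)]


def build_video_plan(
    prompt: str,
    write_script: Callable[[str], str],            # stand-in for a text model
    find_stock_clip: Callable[[List[str]], str],   # stand-in for a stock search
    generate_overlay: Callable[[List[str]], str],  # stand-in for an image model
    synthesize_voice: Callable[[str], str],        # stand-in for text-to-speech
    manual_keywords: Optional[List[str]] = None,
) -> VideoPlan:
    """Run the steps in the order the article describes them."""
    script = write_script(prompt)
    # Keywords are either entered manually or extracted from the script.
    keywords = list(manual_keywords) if manual_keywords else naive_keywords(script)
    return VideoPlan(
        prompt=prompt,
        script=script,
        keywords=keywords,
        background_clip=find_stock_clip(keywords),
        overlay_image=generate_overlay(keywords),
        voiceover=synthesize_voice(script),
    )


# Placeholder services so the sketch runs without any network access.
def fake_script(prompt: str) -> str:
    topic = prompt.capitalize()
    return f"{topic} are popular pets. {topic} sleep a lot."


def fake_clip(keywords: List[str]) -> str:
    return f"stock/{'-'.join(keywords)}.mp4"


def fake_overlay(keywords: List[str]) -> str:
    return f"overlay/{'-'.join(keywords)}.png"


def fake_voice(script: str) -> str:
    return f"voice/{len(script.split())}-words.wav"


if __name__ == "__main__":
    print(build_video_plan("cats", fake_script, fake_clip, fake_overlay, fake_voice))
```

Passing each step in as a function keeps the sketch independent of any particular provider, which echoes Habib’s later point that one image generator could be swapped for another.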
Image credits: QuickVid
Watch this video made with the prompt “cat”:
Or this one:
QuickVid certainly doesn’t push the boundaries of what’s possible in generative AI. Both Meta and Google have demonstrated AI systems that can generate completely original clips based on text prompts. But QuickVid combines existing AI systems to exploit the repetitive, templated format of B-roll-heavy short videos, sidestepping the problem of having to generate the footage itself.
“Successful creators have a very high quality standard and are not interested in releasing content that they don’t think is their voice,” Habib said. “That’s the use case we’re focusing on.”
QuickVid’s videos are generally a mixed bag in terms of quality. Background videos tend to be a bit random or only loosely related to the topic, which isn’t surprising considering QuickVid is currently limited to the Pexels catalog. At the same time, the images generated by DALL-E 2 show the limitations of today’s text-to-image techniques, such as garbled text and distorted proportions.
In response to my feedback, Habib said that QuickVid “is being tested and patched every day.”
Copyright issues
According to Habib, QuickVid users retain the right to use the content they create commercially and have the right to monetize it on platforms like YouTube. But the copyright status of AI-generated content is… murky, at least for now. For example, the United States Patent and Trademark Office (USPTO) recently moved to revoke copyright protection for an AI-generated comic, saying copyrightable works require human authorship.
Asked how the USPTO decision might affect QuickVid, Habib said he believes it only concerns the “patentability” of AI-generated products, not the rights of creators to use and profit from their content. He noted that creators don’t often file patents for videos and usually lean into the creator economy, letting other creators repurpose their clips to increase their own reach.
“Creators care about using their voice to publish high-quality content that will help grow their channel,” Habib said.
Another legal challenge on the horizon could affect QuickVid’s DALL-E 2 integration and, in turn, the site’s ability to generate image overlays. Microsoft, GitHub and OpenAI are being sued in a class action alleging they violated copyright law by allowing the code generation system Copilot to regurgitate sections of licensed code without giving credit. (Copilot was jointly developed by OpenAI and Microsoft-owned GitHub.) The case has implications for generative art AIs like DALL-E 2, which have likewise been found to copy and paste from the datasets they were trained on (i.e., images).
Habib isn’t worried, arguing that the generative AI genie is out of the bottle. “If another lawsuit comes up tomorrow and OpenAI disappears, there are several alternatives that could power QuickVid,” he said, referring to Stable Diffusion, an open-source system similar to DALL-E 2. QuickVid is already testing Stable Diffusion for generating avatar images.
Moderation and spam
In addition to legal woes, QuickVid may soon face a moderation problem. While OpenAI has implemented filters and techniques to prevent them, generative AI has well-known toxicity and factual accuracy issues. GPT-3 spouts misinformation, especially about recent events, which lie beyond the scope of its knowledge base. And ChatGPT, a fine-tuned offspring of GPT-3, has been shown to use sexist and racist language.
This is worrisome, especially for people who use QuickVid to make informational videos. In a quick test, I asked my partner (who is more creative than I am, especially in this regard) to enter some offensive prompts and see what QuickVid would generate. To QuickVid’s credit, obviously questionable prompts like “Jewish New World Order” and “9/11 Conspiracy Theory” didn’t produce a toxic script. But for “Instilling Critical Race Theory in Students,” QuickVid generated a video suggesting that critical race theory could be used to brainwash schoolchildren.
Look:
Habib said he relies on OpenAI’s filters to do most of the moderation work, and claimed that it’s the user’s responsibility to manually review each video QuickVid creates to make sure “everything is within the law.”
“As a general rule, I think people should be able to express themselves and create whatever content they want,” Habib said.
That apparently includes spam. Habib argues that video platforms’ algorithms, not QuickVid, are best positioned to determine the quality of a video, and that people who produce low-quality content “will only damage their own reputations.” The reputational damage would naturally deter people from using QuickVid to launch large-scale spam campaigns, he said.
“If people don’t want to watch your videos, then you’re not going to get distributed on platforms like YouTube,” he added. “Creating low-quality content can also make people see your channel in a negative light.”
But it’s instructive to look at an ad firm like Fractl, which in 2019 used an artificial intelligence system called Grover to generate an entire website of marketing material, reputation be damned. In an interview with The Verge, Fractl partner Kristin Tynski said she foresees generative AI causing “a massive tsunami of computer-generated content in every niche imaginable.”
Regardless, video-sharing platforms like TikTok and YouTube haven’t yet had to deal with moderating AI-generated content at scale. Deepfakes, synthetic videos that replace an existing person with someone else’s likeness, became popular on platforms like YouTube a few years ago, fueled by tools that made them easier to create. But unlike even today’s most convincing deepfakes, the types of videos QuickVid creates aren’t obviously AI-generated in any way.
Google Search’s policy on AI-generated text could be a preview of what’s to come in the video space. Google does not treat synthetic text differently from human-written text when it comes to search rankings, but instead takes action on content “designed to manipulate search rankings rather than help users.” This includes content spliced together or combined from different web pages that “[doesn’t] add enough value,” as well as content generated through a purely automated process, both of which may apply to QuickVid.
In other words, if AI-generated videos become popular, they may not be banned outright by platforms, but simply become a cost of doing business. That’s unlikely to allay the fears of experts that platforms like TikTok are becoming the new home for misleading videos, but, as Habib put it in the interview, “nothing is going to stop [the] generative AI revolution.”