These days, all it takes is around $1,000 and a sample video less than five minutes long, and a company never has to worry about being unable to sell its products or services online again.
The next time you sift through the livestreams that run in the early hours of the morning (usually around 4 a.m.) on TaoBao, China's most popular e-commerce platform, you will notice that the platform is bizarrely full of activity. At a time of day when many of us are in dreamland, an army of hard-working presenters is promoting products and services in front of the cameras via live feeds, all while offering huge discounts.
However, upon closer inspection, it becomes apparent that the people trying to sell to you are not real humans, and they appear somewhat robotic in their movements. That is because they are, in fact, clones or 'deepfakes' of actual live streamers, generated in digital form by artificial intelligence tools.
Their lips do tend to match what's being said, but under closer scrutiny there's something not quite right with some of the movements and mannerisms.
The companies that have created this technology have made it possible for any brand to use AI-generated clones (or avatars) of today's most famous live streamers. Today, they are more realistic than ever in terms of their voices and movements.
Their sophisticated AI-generated characters are now extremely hard to tell apart from the actual humans they are based on. These deepfakes have become a huge trend on many of China's biggest e-commerce platforms. In China today, livestreaming is the leading marketing channel for both digital and traditional brands, so it stands to reason that this new tool is so popular.
Major influencers on well-established platforms, like Kuaishou, TaoBao, and Douyin, can command mega-money contracts if they are successful. Some of the industry's biggest livestreaming personalities can push over a billion dollars' worth of merchandise for the companies they promote in a single night. The best ones have become more like celebrated Hollywood stars, not merely salespeople selling products.
For a smaller business with less money to spend on marketing, finding and then training a human live streamer to sell new products is typically very costly. It can also be a challenge to keep hold of those same live streamers for a prolonged period, and the company must still arrange the cheapest and most effective way to broadcast them to the masses. Combining all of these essential elements can be extremely expensive.
It is much cheaper, and far less grueling, for a company to simply invest in more affordable deepfake services to advertise its products to a much bigger market. This is why many smaller companies today are using deepfake technology to shift their products.
In the past year alone, the business of creating deepfake avatars to livestream products on e-commerce platforms has exploded. The pioneering service is now offered by an untold number of Chinese startups and leading tech companies. For a one-off fee of somewhere in the region of $1,100, and with at most a few minutes of sample video footage of a live streamer, brands can now clone a human and put that clone to work as the face of their company, selling their products every minute of the day.
From deepfake technology to e-commerce
Since the late 2010s, the technology labeled 'synthetic media' has repeatedly hit the headlines. The term is used broadly to describe the production, modification, and manipulation of data and media, mainly with automated AI-powered tools, in ways that can change the original meaning and mislead people.
It all started when a Reddit user who went by the name 'deepfakes' was able to swap faces into pornographic material. The technology has come a long way in just a few short years, although the idea has remained pretty much the same.
Pretty much anyone who wants to can use simple AI-powered tools to take an existing face (say a politician, famous actor, or, in this case, a famous live streamer) and manipulate it to move and speak like the real person, even though that person has never said those things. Global leaders such as Vladimir Putin and Barack Obama, tech giants such as Mark Zuckerberg, and Hollywood legends such as Tom Cruise have been 'deepfaked' many times, some more convincingly than others.
The emerging technology has also gained notoriety over the past few years through its use for political misinformation, identity scams, and revenge porn. Some have endeavored to commercialize it and use it in a safer and less damaging way, but for most, it has more or less remained a novelty. Now, China-based artificial intelligence companies seem to have developed a new way to use the technology, and it appears to be flourishing.
Silicon Intelligence, a startup based in Nanjing that was formed in 2017, uses this tech and mainly focuses on natural language processing, especially robocalls and other text-to-speech technologies. The company's founder, Sima Huapeng, said that it was back in 2020 that his company first noticed the endless possibilities AI-generated clones might unlock using livestreaming as a tool.
At the time, his company needed around half an hour of sample video to generate a digital version of a human that could act and talk just like the original. The following year, that half hour was reduced to just ten minutes. Soon after, it fell to three minutes, and now only around 60 seconds of footage is needed to create a clone (or deepfake, as some people still prefer to call it).
As with most things, the service has become far more affordable as the technology has improved. Producing a basic artificial intelligence clone of someone today costs somewhere in the region of $1,100 (RMB 8,000). A client looking to invest in a far more complex streamer, capable of more than just the basics, could end up paying up to a few thousand more. The fee covers the actual AI generation of the clone and 12 months of maintenance.
Once a digital clone has been generated by the software, both the mouth and body can synchronize with the words it has been instructed to say. When the technology first came about, scripts pre-written by humans were the only option, but it has evolved to a stage where LLMs (large language models) can now generate scripts as well.
The only thing left for humans to do at this point is input simple data, such as how much money the product or service is selling for and the brand/product/service name. They may also want to proofread the AI-generated script to check if it sounds okay and then simply sit back and watch as the newly created digital influencer live streams across the internet.
Better still, to make it appear as though the AI-powered live streamer is communicating with the audience in real time, the more advanced versions can pick up on live chat comments and give the most relevant answer. They can also take a different marketing approach depending on how many viewers are currently tuned in to the live stream.
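To make that idea concrete, here is a minimal sketch of how such behavior could work in principle. It is not how any of the companies mentioned actually implement it: the canned answers, keyword lists, viewer thresholds, and function names below are all hypothetical, and a real system would presumably use a language model instead of simple keyword matching.

```python
import re

# Hypothetical canned answers, keyed by the keywords that should trigger them.
FAQ = {
    ("price", "cost", "much"): "It is on sale right now at the price shown on screen.",
    ("ship", "shipping", "delivery"): "Shipping details are listed on the product page.",
    ("size", "sizes", "fit"): "Available sizes are listed under the video.",
}
FALLBACK = "Thanks for watching! Ask me anything about the product."


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def best_reply(comment):
    """Return the canned answer whose keywords overlap most with the comment."""
    words = tokenize(comment)
    best_score, best_answer = 0, FALLBACK
    for keywords, answer in FAQ.items():
        score = len(words & set(keywords))
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer


def pick_strategy(viewer_count):
    """Choose a sales approach from the audience size (thresholds are made up)."""
    if viewer_count < 50:
        return "introduce_product"
    if viewer_count < 500:
        return "answer_questions"
    return "flash_discount"


if __name__ == "__main__":
    print(best_reply("How much does it cost?"))
    print(pick_strategy(1200))
```

Keyword overlap is used here only because it is easy to follow; the point is the overall loop of reading a comment, scoring candidate answers, and adjusting the pitch to the size of the audience.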
According to Huang Wei, the director of a China-based virtual influencer livestreaming business called Xiaoice, the AI clones of live streamers are taught to mimic mannerisms, common scripts, and gestures. The company's database currently holds somewhere in the region of 100 pre-designed common movements that correspond with what's being said.
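As a rough illustration of pairing movements with speech, the sketch below assigns one gesture to each sentence of a script by looking for trigger words. The gesture names, trigger words, and the `plan_gestures` function are invented for this example and say nothing about how Xiaoice's database is actually organized.

```python
import re

# Hypothetical gesture library: trigger word -> name of a pre-designed movement.
GESTURES = {
    "welcome": "wave",
    "look": "hold_product_up",
    "discount": "point_at_price_tag",
    "today": "open_palms",
}


def plan_gestures(script):
    """Split a script into sentences and attach one gesture to each sentence."""
    plan = []
    for sentence in re.split(r"(?<=[.!?])\s+", script.strip()):
        if not sentence:
            continue
        words = re.findall(r"[a-z]+", sentence.lower())
        # Use the first trigger word found; otherwise fall back to an idle pose.
        gesture = next((GESTURES[w] for w in words if w in GESTURES), "idle")
        plan.append((sentence, gesture))
    return plan


if __name__ == "__main__":
    for sentence, gesture in plan_gestures("Welcome everyone! Look at this bag. It is soft."):
        print(f"{gesture:>20}  {sentence}")
```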
One of the main focuses of Xiaoice has been to try to create artificial intelligence that is as human-like as possible, particularly clones that can show a range of human emotions. Huang described traditional e-commerce websites as being cold, pointing out that they are more like a shelf of merchandise. AI-powered digital avatars can create more of a connection between the live streamer and the person watching, and it's a much better way of explaining what the product or service is.
In 2022, Xiaoice trialed its AI-powered tool with a handful of clients before officially launching its ground-breaking new service, which creates digital avatars for under $1,000. Only a one-minute video of a human live streamer is needed to make a digital clone. However, as with the company's competitors, Xiaoice clients with a slightly larger budget can spend extra to receive a more detailed clone.
For example, an excellent digital clone of the famous Chinese sports announcer, Liu Jianhong, was generated during the 2022 FIFA World Cup in Qatar purely for the purpose of announcing the results of the matches and other relevant stories on a popular streaming platform. It's becoming more difficult for people to tell the difference between a clone and a real person.
An affordable alternative to human live streamers
The artificially generated livestreamers are unlikely to be as successful or popular as the leading human e-commerce influencers, but they are just as good as those in the tier below. Some human content creators, including ones who have trained their own AI clones using their videos, say they are starting to feel the pinch of the emerging trend of AI streamers.
Starting out as a new livestreamer has also become more difficult in recent times, and newcomers currently earn around a fifth as much as they did last year. You are now starting to see fewer human livestreamers and more AI-generated clones, especially during the early hours of the morning, because the clones are much cheaper than hiring humans.
Deepfake technologies have reached a point where they are convincing enough to fool most people into believing they are real humans, so the technology has worked in that respect, especially for the bigger brands. Virtual livestream hosts are fundamentally an economical way to replace humans and bring down costs. Over the past year or so, there has been a great deal of interest from brands looking to use AI streamers to sell their products.
Companies like Xiaoice now provide AI-generated clones of human livestreamers to over a hundred different clients, and they have all been successful in terms of sales made while on air. One of their digitally generated livestreamers was said to have sold around RMB 10,000 (approx. $1,400) worth of product in only one hour.
The technology still has its flaws, though. For instance, some products are much harder to sell than others, especially for digitally generated livestreamers. It's easy for them to sit at desks and sell smaller products using only a limited number of gestures and movements, but selling much bigger things, like sofas and other large items of furniture, is more challenging for the tech to pull off.
Despite this, there is still a steady rise in the number of brands and marketing companies focusing more on AI-generated livestreams. However, in recent years, the Chinese government introduced several new pieces of legislation around synthetic media and generative artificial intelligence to make AI-generated livestreamers more transparent, especially concerning the e-commerce sector. The impact of these new regulations on brands using AI in this way to sell their products remains to be seen.
One of the next steps for AI-generated livestreamers, especially for companies like Silicon Intelligence, is to try to incorporate 'emotional intelligence' into their clones' behavior. They are also making progress in finding ways for the AI streamers to learn from each other and interact more.
The company boldly claims that it plans to create an astonishing 100,000,000 digitally generated livestreamers within just a few years. There are currently around 400,000 of these silicon-based laborers, as they have been called, meaning there's still much more work to be done.