Three federal agencies released a joint report Tuesday calling on companies to brace themselves against the dangers of AI-generated media, particularly deepfakes, which increasingly threaten to undermine trust in digital media and its authenticity.
The FBI, National Security Agency, and Cybersecurity and Infrastructure Security Agency released the 18-page information sheet, which outlines how deepfakes can affect organizations, describes emerging trends in these threats, and offers extensive recommendations for resisting deepfakes.
“Deepfakes are a particularly concerning type of synthetic media that utilizes artificial intelligence/machine learning (AI/ML) to create believable and highly realistic media,” reads the information sheet. Abusive techniques that leverage this tech “threaten an organization’s brand, impersonate leaders and financial officers,” and can “enable access to an organization’s networks, communications and sensitive information.”
The information sheet forecast that “phishing using deepfakes” (impersonation schemes that involve synthetic video, audio, or both) will eventually become “an even harder challenge than it is today,” and the agencies advised companies to “proactively prepare to identify and counter it.”
Deepfakes get their name from deep learning, which is a class of machine learning algorithms that uses multiple layers of neural networks to extract progressively higher-level features from data.
For example, in a deep learning algorithm trained to recognize language, one layer might parse basic parts of speech (verbs, nouns, adjectives, etc.) while the next layer might parse basic sentence structures. Progressively deeper layers might parse more complex linguistic features and context, like idioms or sentiments.
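As a rough illustration of that layering, the sketch below passes an input through a small stack of layers, each of which transforms the output of the one before it. The weights are arbitrary hand-picked numbers and the function names are hypothetical. It is not a trained model and has nothing to do with how any particular deepfake tool is built.

```python
def relu(values):
    """Zero out negative values, a common nonlinearity between layers."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Run the input through every layer, keeping each intermediate result."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(relu(dense(activations[-1], weights, biases)))
    return activations


# Arbitrary illustrative weights: two inputs -> two hidden units -> one output.
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-0.5]),
]

if __name__ == "__main__":
    for depth, values in enumerate(forward([2.0, 1.0], LAYERS)):
        print(f"layer {depth}: {values}")
```

In a real system the weights are learned from data rather than chosen by hand, and that training is what lets deeper layers respond to progressively more abstract features.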
Just as large language models have recently begun to display mastery of high-order features of language (for example, how to follow written instructions), models designed to generate images have also grown more capable, with OpenAI’s DALL·E and Stability AI’s Stable Diffusion the most popular examples. Video and audio generation has also advanced, according to Rijul Gupta, CEO and co-founder of AI communications company DeepMedia.
“Deepfakes have gotten more sophisticated — not to mention easier to create — over the years,” Gupta said. “Today, a hacker can manipulate a person’s voice using just seconds of audio.”
In their joint report, the FBI, NSA and CISA highlighted two examples from May in which unknown malicious actors deployed deepfakes as part of a phishing campaign. In one case, a product line manager was contacted over WhatsApp and invited to a call with a sender claiming to be the CEO of the same company.
“The voice sounded like the CEO and the image and background used likely matched an existing image from several years before and the home background belonging to the CEO,” the agencies reported.
In a similar scheme, a CEO impersonator conducted video calls with an employee over both WhatsApp and Microsoft Teams, then switched to text because the connection was poor. In this case, the employee caught on and terminated communication. The agencies said the same unnamed executive had been impersonated via text message on other occasions.
Although cybercriminals have deployed deepfakes to defraud organizations for financial gain, there are “limited indications of significant use of synthetic media techniques by malicious state-sponsored actors,” according to the agencies.
However, “the increasing availability and efficiency of synthetic media techniques available to less capable malicious cyber actors indicate these types of techniques will likely increase in frequency and sophistication,” the agencies said.
Deepfakes that mimic a person’s voice and face have advanced for many of the same reasons that large language models have recently become so capable, according to the federal agencies’ information sheet: “advances in computational power” and technology such as deep learning have made fake multimedia not just easier to create “but also less expensive to mass produce,” the document said.
These advances also offer defensive capabilities, according to Yinglian Xie, CEO and co-founder of fraud prevention firm DataVisor, because they enable companies to build models that recognize subtle but telltale signs of deepfakes.
“By training these systems with real-world examples of deepfake attacks, their ability to recognize and counter such threats will improve significantly over time,” Xie said.
Beyond deploying deepfake detection technologies, the federal agencies that issued the Tuesday report had many additional recommendations for companies looking to protect themselves against deepfakes. These include recording and making copies of suspicious media, using reverse image searches, and examining content metadata.
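The first of those steps, keeping a copy of suspicious media, could be sketched roughly as follows. The example copies a file into an evidence folder and records a SHA-256 hash plus basic file system details, so that later changes to the copy can be detected. The function name, folder layout, and record fields are assumptions made for illustration and are not taken from the agencies’ report. The file system details shown here are also not the same as metadata embedded in the media itself, which would need a dedicated tool to read.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def preserve_copy(source, evidence_dir):
    """Copy a suspicious file and return a record describing the copy.

    Hypothetical helper for illustration only.
    """
    source = Path(source)
    evidence_dir = Path(evidence_dir)
    evidence_dir.mkdir(parents=True, exist_ok=True)

    # copy2 also tries to preserve timestamps from the original file.
    copy_path = Path(shutil.copy2(source, evidence_dir / source.name))

    # Hash the copy in chunks so large video files do not need to fit in memory.
    digest = hashlib.sha256()
    with copy_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)

    stat = copy_path.stat()
    return {
        "copy": str(copy_path),
        "sha256": digest.hexdigest(),
        "size_bytes": stat.st_size,
        "modified_utc": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
        "recorded_utc": datetime.now(timezone.utc).isoformat(),
    }
```

Recomputing the hash later and comparing it with the stored value is a simple way to check that the preserved copy has not been altered.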
The agencies also listed more advanced examinations that companies can conduct on suspicious media. These include physics-based examinations (borrowing from Hany Farid, a professor at UC Berkeley who specializes in detecting digitally manipulated images) and content-based examinations, which can be done with numerous free and open-source tools, many of which are cataloged by the Antispoofing Wiki project.