Anti-deepfake technology refers to methods of detecting deepfakes. A deepfake is a form of synthetic media in which images, video, or audio are altered to show content not present in the original recording. The technology has legitimate uses, such as de-aging an actor or including a previously deceased actor in a film. More often, however, it is used for malicious purposes, such as faked images or videos of politicians saying or doing things they never said or did. For example, videos purporting to show Ukrainian President Volodymyr Zelensky telling Ukrainian troops to "surrender" in the face of the Russian invasion were later proven to be fake.
Deepfakes have been linked to an erosion of public trust in news media and politicians, which has increased interest in technology capable of defeating them; according to some experts, reliable detection could help restore trust in media and journalists. Detection is difficult, however, because much of the deepfake detection technology uses the same underlying automation and methods as the software and systems used to create deepfakes in the first place.
Because that technology—including convolutional neural networks (CNNs) and generative adversarial networks (GANs)—can itself absorb detection techniques, it can be trained to produce better deepfakes. For example, a study published in 2018 found that the faces in deepfake videos rarely, if ever, blinked, which led to the development and adoption of detection algorithms that looked for blinking. Deepfake systems were, in turn, trained to ensure that future deepfakes would blink, thereby defeating that detection method.
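A minimal sketch of the kind of blink-based check described above is shown below. It assumes that six eye landmarks per frame have already been extracted by some face-landmark detector; the landmark ordering and the thresholds are illustrative assumptions, not part of the 2018 study.

```python
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """Eye aspect ratio (EAR) from six 2-D eye landmarks.

    eye: array of shape (6, 2) ordered around the eye contour
         (corners at indices 0 and 3, upper lid at 1-2, lower lid at 4-5).
    The EAR drops sharply when the eye closes, so dips in the EAR
    signal over time correspond to blinks.
    """
    vertical_1 = np.linalg.norm(eye[1] - eye[5])
    vertical_2 = np.linalg.norm(eye[2] - eye[4])
    horizontal = np.linalg.norm(eye[0] - eye[3])
    return (vertical_1 + vertical_2) / (2.0 * horizontal)

def blink_rate(ear_series, fps, closed_threshold=0.2):
    """Count blinks per minute from a per-frame EAR series."""
    closed = [ear < closed_threshold for ear in ear_series]
    # A blink is a transition from open to closed.
    blinks = sum(1 for prev, cur in zip(closed, closed[1:]) if cur and not prev)
    minutes = len(ear_series) / fps / 60.0
    return blinks / minutes if minutes > 0 else 0.0

# A face that never blinks over a long clip (a rate near zero) would be
# flagged as suspicious under the blinking heuristic described above.
```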
Further, as new detection methods are developed, the malicious users of deepfake technology and software can learn how to dupe them and, in turn, develop more sophisticated, harder-to-detect deepfakes.
Developing anti-deepfake technology begins with understanding how deepfakes can be detected. Typically, a deepfake contains artifacts that can be identified through human or algorithmic analysis, shifting the assessed likelihood that an image, video, or audio clip is authentic or inauthentic.
Typical artifacts in face manipulations include the following:
- Visible transitions—When a face is swapped or superimposed onto another person, visible artifacts often appear around the edge of the face, with abrupt transitions in skin color and texture; parts of the original face can also remain noticeable at the edge of the target face.
- Sharp contours are blurred—Many deepfake algorithms fail to create sharp contours, such as those of the teeth and eyes, which on closer inspection appear conspicuously blurred when faked (a simple programmatic version of this check is sketched after this list).
- Limited facial expressions and shadows—A lack of training data can limit the model's ability to accurately reproduce facial expressions or lighting conditions; profile views in particular are often insufficiently learned, resulting in blurring or related visual errors when the head turns quickly.
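The blurred-contour artifact can be checked programmatically in a crude way. The sketch below uses OpenCV's Laplacian variance as a sharpness measure over eye and mouth regions; the region boxes and the threshold are illustrative assumptions and would need tuning against real and manipulated samples.

```python
import cv2

def region_sharpness(gray_image, box):
    """Variance of the Laplacian inside a region; low values suggest blur."""
    x, y, w, h = box
    region = gray_image[y:y + h, x:x + w]
    return cv2.Laplacian(region, cv2.CV_64F).var()

def contour_blur_check(image_path, eye_box, mouth_box, threshold=50.0):
    """Flag a face crop whose eye and teeth regions are conspicuously soft.

    eye_box and mouth_box are (x, y, w, h) rectangles, for example taken
    from a face-landmark detector.
    """
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    scores = {
        "eyes": region_sharpness(gray, eye_box),
        "mouth": region_sharpness(gray, mouth_box),
    }
    suspicious = any(score < threshold for score in scores.values())
    return suspicious, scores
```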
Voice and audio-only deepfakes contain related manipulation artifacts that can help a detection system, human or algorithmic, judge whether media is authentic or inauthentic. These include the following:
- Metallic sound—Many processes can produce an audio signal that is perceived as metallic by the human ear.
- Incorrect pronunciations—Many procedures mispronounce some words, especially if the model has been trained in one language and is expected to reproduce a word in a different language.
- Monotone speech output—Especially when the training data is not ideal, the generated audio can sound monotonous, with little of the word emphasis a natural speaker would apply (a simple pitch-variation heuristic is sketched after this list).
- Incorrect diction—Deepfake techniques can be quite good at faking the timbre of a voice but may struggle to copy its specific characteristics, so accents or stress patterns may not match those of the target speaker.
- Unnatural sounds—If a deepfake model receives input data different from the training data used, the method can produce unnatural sounds or odd silences.
- High delay—Even high-quality fakes are frequently accompanied by a certain time delay since the semantic content has to be pronounced and captured before it can be processed by a deepfake procedure.
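As a rough illustration of the monotone-speech artifact, the sketch below estimates how much the voiced pitch of a recording varies, assuming the librosa library is available. The variation threshold is an illustrative assumption; a real system would calibrate it against known natural speech.

```python
import numpy as np
import librosa

def pitch_variation(audio_path, fmin=65.0, fmax=500.0):
    """Return the standard deviation of voiced pitch, in semitones."""
    y, sr = librosa.load(audio_path, sr=None, mono=True)
    f0, voiced_flag, _ = librosa.pyin(y, fmin=fmin, fmax=fmax, sr=sr)
    voiced_f0 = f0[voiced_flag & ~np.isnan(f0)]
    if voiced_f0.size == 0:
        return 0.0
    # Work in semitones so the measure is independent of absolute pitch.
    semitones = 12.0 * np.log2(voiced_f0 / np.median(voiced_f0))
    return float(np.std(semitones))

def sounds_monotone(audio_path, min_variation=1.0):
    """Flag audio whose pitch varies less than a (tunable) threshold."""
    return pitch_variation(audio_path) < min_variation
```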
The broader context of a piece of media is also important to consider when judging whether it is deepfaked or authentic. Contextual clues include the following:
- Source reliability—The credibility and trustworthiness of a source sharing an image or video can provide a contextual clue to the authenticity of a piece of media. Often deepfaked videos are propagated through unverified or suspicious channels.
- Reverse image/video search—A reverse image or video search engine can show whether the same or similar content appears elsewhere on the internet; media that has been widely circulated or presented in multiple contexts is more likely to be authentic (a local analogue of this comparison, using perceptual hashing, is sketched after this list).
- Awareness of current trends—Being informed about the latest advancements in deepfake technology and detection methods can allow individuals to enhance their ability to spot deepfakes effectively.
- Synthetic artifacts—Artifacts or distortions in a video, such as unnatural lighting, inconsistent shadows, or pixelation, can indicate tampering, although they do not always mean the media has been tampered with, as these types of artifacts can occur based on the conditions under which a video is captured.
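Commercial reverse-search services work at internet scale, but the underlying idea can be illustrated locally. The sketch below uses the imagehash package to compare a suspect image against a set of known originals by perceptual hash; the distance cutoff is an illustrative assumption.

```python
from PIL import Image
import imagehash

def nearest_known_source(suspect_path, known_paths, max_distance=10):
    """Compare a suspect image against known originals by perceptual hash.

    A small Hamming distance suggests the suspect is derived from (or is)
    a known image, which is one contextual signal about where it came from.
    """
    suspect_hash = imagehash.phash(Image.open(suspect_path))
    matches = []
    for path in known_paths:
        distance = suspect_hash - imagehash.phash(Image.open(path))
        if distance <= max_distance:
            matches.append((path, distance))
    return sorted(matches, key=lambda item: item[1])
```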
The amount of deepfaked media is believed to double roughly every six months, with some estimates suggesting an even faster rate. This has led many technology companies, including Microsoft, Google, and Meta, to take steps toward detecting deepfakes, notably through the 2020 Deepfake Detection Challenge, which invited researchers and developers to build technology, especially algorithms, capable of detecting deepfakes.
The challenge offered a large dataset of visual deepfakes and produced a benchmark made freely available to researchers developing deepfake video detection systems. However, the best systems to emerge from the challenge achieved only around 65 percent accuracy. While this may sound impressive, untrained individuals shown deepfake videos often score above 50 percent at spotting them, and individuals trained on a dataset have been shown to keep pace with or outpace automated detection systems. In another case, the Microsoft Azure Cognitive Services API, which has been used to detect deepfakes, was fooled by 78 percent of the deepfakes it was fed.
Similar to the Deepfake Detection Challenge, the United States Defense Advanced Research Projects Agency (DARPA) introduced the Media Forensics (MediFor) program, which awards grants intended to fund deepfake research and detection methods, with a mission to ensure greater trust in media.
Generally, technological countermeasures focus on automation, both to increase detection capability and to allow detection to occur in real time, before deepfaked content can do harm. However, these countermeasures face several challenges, not least of which is that they only work reliably under certain conditions. A key challenge is the limited extent to which detection methods generalize: many have been trained on specific data and work reliably only on similar data, so if the parameters change, these methods can stop working. As noted above, the best model in the 2020 Deepfake Detection Challenge achieved an average accuracy of only 65.18 percent.
Another key problem with these techniques is that AI-specific attacks can overcome them. For example, an adaptive attacker could create targeted noise and superimpose it on a manipulated image, which could prevent a detection procedure from classifying the image as fake while being subtle enough to not be noticeable to the human eye.
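The kind of targeted noise described above can be illustrated with a fast-gradient-sign-style perturbation. The PyTorch sketch below assumes a hypothetical differentiable detector that outputs logits over the classes [real, fake]; it is not any specific published attack, and the step size is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def evade_detector(detector, fake_image, eps=2 / 255):
    """Add small, near-imperceptible noise so a detector stops flagging a fake.

    detector:   a differentiable model returning logits over [real, fake]
    fake_image: tensor of shape (1, 3, H, W) with values in [0, 1]
    eps:        maximum per-pixel change (kept tiny so the edit is invisible)
    """
    image = fake_image.clone().detach().requires_grad_(True)
    fake_label = torch.tensor([1])           # class index 1 = "fake"
    loss = F.cross_entropy(detector(image), fake_label)
    loss.backward()
    # Stepping along the gradient sign *increases* the loss for the "fake"
    # class, pushing the detector's decision away from the correct answer.
    perturbed = image + eps * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```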
One suggested method for combatting deepfakes has been image authentication. One such solution has been developed by Truepic, with a proprietary method that requires users to capture photos or videos through its mobile application, which records data points, like timing or image contents, to verify the provenance of the given image or video. This image or video is then certified at capture. However, this technology does not detect deepfaked images; rather, it ensures images captured through its system can be verified as authentic.
Similar methods have been developed using blockchain-backed cameras accessed through proprietary applications, which create an authentication certification comparable to Truepic's system. In these cases, however, the blockchain application is intended to store the authentication transparently, prove the media's authenticity, and record any instance of the image being changed in any way.
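Neither Truepic's proprietary pipeline nor any specific blockchain product is reproduced here, but the underlying idea can be sketched simply: hash the media and its capture metadata at creation time, and chain the records so later tampering is detectable. The field names below are illustrative.

```python
import hashlib
import json
import time

def record_capture(ledger, media_bytes, metadata):
    """Append a capture record to a hash-chained ledger (a toy 'blockchain').

    Each entry commits to the media hash, its capture metadata, and the hash
    of the previous entry, so altering any earlier record (or the media it
    describes) breaks the chain.
    """
    previous_hash = ledger[-1]["entry_hash"] if ledger else "0" * 64
    entry = {
        "media_hash": hashlib.sha256(media_bytes).hexdigest(),
        "metadata": metadata,               # e.g. timestamp, device, location
        "previous_hash": previous_hash,
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    ledger.append(entry)
    return entry

def verify_media(ledger, media_bytes):
    """Check whether a media file matches any certified capture record."""
    media_hash = hashlib.sha256(media_bytes).hexdigest()
    return any(entry["media_hash"] == media_hash for entry in ledger)

# Example usage:
# ledger = []
# record_capture(ledger, open("photo.jpg", "rb").read(),
#                {"timestamp": time.time(), "device": "example-phone"})
```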
Another approach to image authentication has been developed by researchers at the NYU Tandon School of Engineering. This method verifies the authenticity of an image from acquisition to delivery by using a neural network in the photo development pipeline to embed artifacts into the digital image. If the image is later altered, these sensitive artifacts distort, serving as a hidden watermark. The artifacts are, however, designed to tolerate the typical post-processing applied to digital images, such as stabilization and lighting adjustments.
Most deepfake detection methods focus on automated detection, which emphasizes speed and ease of detection: the goal is for deepfaked media to be flagged as fake before it can spread widely on social media sites, where most deepfaked media circulates, and before it can be picked up by media agencies. Deepfaked media that has been picked up and repeated by media agencies has already ruined individual and corporate reputations. Further, when deepfaked media spreads unchecked, it can undermine trust in institutions, especially as individuals can come to believe, with full confidence, in the deepfaked media rather than in its refutation. This has contributed to further declines in trust in media agencies and government, and the detection and public identification of deepfaked media has been posited as a chance to restore that trust.
Most automated deepfake detection approaches have relied on deep learning models, in large part because these are the same technologies typically used to create deepfakes. Deep learning approaches can continue to learn new deepfake techniques, support new methods for dealing with increasingly challenging deepfaked media, and even extend to stopping deepfakes before they can mislead viewers. These approaches can also be combined with handcrafted features and supervised methods to ensure that new and emerging detection insights are incorporated into the deep learning model.
Most mainstream methods of detecting deepfakes have used convolutional neural networks (CNNs) as their backbone. Many of these networks rely on local texture information, which is largely determined by the forgery methods present in the training data; as a result, CNN-based methods often generalize poorly to unseen data. This has led some researchers to propose guiding CNNs to analyze anomalies in face images instead, although in many cases this offers only comparable performance to other methods.
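To make the overall shape of a CNN-based detector concrete, the sketch below defines a deliberately small PyTorch classifier over face crops. Production detectors typically start from much larger pretrained backbones and far more data; this is only an illustrative minimal model, not any published architecture.

```python
import torch
import torch.nn as nn

class SmallDeepfakeCNN(nn.Module):
    """A deliberately small CNN that classifies a face crop as real or fake."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, 2)   # logits for [real, fake]

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# model = SmallDeepfakeCNN()
# logits = model(torch.randn(1, 3, 224, 224))   # one 224x224 face crop
```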
Another popular neural network approach for automated deepfake detection has been the generative adversarial network (GAN), which pairs two neural networks: a generator that produces fake images and a discriminator that evaluates them. The discriminator is trained on real images mixed with generated ones, while the generator improves at faking images and videos with each training cycle; often the generator ends up fooling the discriminator more often than the discriminator catches the fakes. GANs have been used to build algorithms that automatically detect deepfakes in the wild, but because deepfake developers use GANs themselves, those same algorithms can, in turn, be used to create more convincing deepfakes.
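The adversarial loop between generator and discriminator can be sketched compactly in PyTorch. The toy networks below operate on flattened 64x64 images and exist only to show how the discriminator (the "detector") and the generator train against each other; real GANs use convolutional architectures and many more training refinements.

```python
import torch
import torch.nn as nn

latent_dim = 100
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 3 * 64 * 64), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(3 * 64 * 64, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),            # logit: real vs. fake
)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def training_step(real_images):
    """One adversarial round on a batch of real images flattened to vectors."""
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1. Train the discriminator (the "detector") on real and generated images.
    fakes = generator(torch.randn(batch, latent_dim)).detach()
    d_loss = (bce(discriminator(real_images), real_labels)
              + bce(discriminator(fakes), fake_labels))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2. Train the generator to make the discriminator call its output "real".
    fakes = generator(torch.randn(batch, latent_dim))
    g_loss = bce(discriminator(fakes), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```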
One proposed method for detecting deepfakes is the use of multiple data modalities. This type of detection is not as rapid as other artificially intelligent detection processes; instead, it takes a more forensic approach, checking semantic consistency across media assets of different modalities. Maintaining such consistency tends to be more challenging for current deepfake tools, so the approach is considered more of a forensic tool than an initial detection tool. However, one attempt to develop such a tool with rapid detection in mind has been Microsoft's Video Authenticator, which uses this type of multi-modal or cross-modal analysis to generate confidence scores.
In 2020, Microsoft developed and launched its video authentication tool ahead of the 2020 United States presidential election, intending to help agencies and social media sites combat deepfakes during the election process. The Video Authenticator tool is a media analysis program that determines a percentage chance, called a "confidence score," that a given piece of media has been manipulated. For videos, the software can provide a real-time, frame-by-frame analysis. The technology works by detecting subtle fading or greyscale elements that can be invisible to the human eye.
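Microsoft has not published the internals of the Video Authenticator, but frame-by-frame confidence scoring in general can be sketched as below, using OpenCV to read frames and any per-frame detector (such as the small CNN shown earlier) to score them. The sampling interval and preprocessing are illustrative assumptions.

```python
import cv2
import torch

def frame_confidence_scores(video_path, detector, every_n_frames=5):
    """Run a per-frame detector over a video and return manipulation scores.

    detector is any model mapping a (1, 3, H, W) float tensor in [0, 1] to
    logits over [real, fake]; the score reported per frame is the softmax
    probability of the "fake" class, loosely analogous to a confidence score.
    """
    capture = cv2.VideoCapture(video_path)
    scores, index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % every_n_frames == 0:
            rgb = cv2.cvtColor(cv2.resize(frame, (224, 224)), cv2.COLOR_BGR2RGB)
            tensor = torch.from_numpy(rgb).permute(2, 0, 1).float().unsqueeze(0) / 255.0
            with torch.no_grad():
                probability_fake = torch.softmax(detector(tensor), dim=1)[0, 1].item()
            scores.append((index, probability_fake))
        index += 1
    capture.release()
    return scores
```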
Microsoft is also in the process of releasing a system of digital signatures that can be added to a video's encoding to prove authenticity. Microsoft is careful to say that its tool is not a complete solution for identifying deepfakes and is not perfect, and that an individual's critical eye can be equally important.
Another approach toward the detection and recognition of deepfakes has been the use of data science to develop detection algorithms. This approach combines supervised and unsupervised learning to analyze videos and identify the patterns and features that distinguish authentic from inauthentic content. The process can involve extracting facial landmarks, examining inconsistencies in facial expressions and movements, and analyzing audio-visual synchronization. Data scientists can also work with domain experts, such as forensic analysts and digital media professionals, to refine and enhance detection algorithms, and the approach can incorporate a wide range of data, spanning various demographics, ethnicities, and social backgrounds, to further improve accuracy.
Various techniques using computer vision have been proposed or developed by researchers to detect deepfakes. These techniques apply deep-learning-based computer vision to a digital image or video to determine its integrity. In this context, computer vision is used to check the features of image frames using fuzzy clustering feature extraction, which studies have reported can raise detection accuracy to around 98 percent when tested on various datasets.
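The cited studies' exact pipelines are not reproduced here, but fuzzy clustering itself is straightforward. The sketch below is a minimal NumPy implementation of fuzzy c-means that could be applied to per-frame feature vectors as one stage of such a pipeline; all parameter values are illustrative.

```python
import numpy as np

def fuzzy_c_means(features, n_clusters=2, m=2.0, iterations=100, tol=1e-5, seed=0):
    """Minimal fuzzy c-means: soft-cluster feature vectors (rows of `features`).

    Returns cluster centers and a membership matrix U of shape
    (n_samples, n_clusters), where each row sums to one.
    """
    rng = np.random.default_rng(seed)
    n_samples = features.shape[0]
    U = rng.dirichlet(np.ones(n_clusters), size=n_samples)   # random memberships
    for _ in range(iterations):
        weights = U ** m
        centers = (weights.T @ features) / weights.sum(axis=0)[:, None]
        distances = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        distances = np.fmax(distances, 1e-12)                 # avoid divide-by-zero
        inverse = distances ** (-2.0 / (m - 1.0))
        new_U = inverse / inverse.sum(axis=1, keepdims=True)
        if np.linalg.norm(new_U - U) < tol:
            U = new_U
            break
        U = new_U
    return centers, U

# Per-frame feature vectors (e.g. texture statistics) could be soft-clustered
# into "consistent" and "anomalous" groups, with high membership in the
# anomalous cluster treated as one manipulation cue.
```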
Intel has developed a deepfake detector as part of the company's Responsible AI work. The tool, called FakeCatcher, was designed by Ilke Demir in collaboration with Umur Ciftci from the State University of New York at Binghamton. It uses Intel's hardware and software, runs on servers, interfaces through web-based platforms, and uses OpenVINO to run the AI models behind its detection. Intel's detection method uses computer vision to look for authentic clues in videos, assessing blood flow in the pixels of a video: the color of a person's face changes subtly as the heart pumps blood through its veins. These blood flow signals are collected across the face and translated algorithmically into spatiotemporal maps, which a deep learning engine then uses to judge whether a video is real or fake.
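FakeCatcher's actual pipeline is proprietary, but the general idea of extracting photoplethysmography-like signals from face regions and stacking them into a spatiotemporal map can be illustrated crudely, as in the NumPy sketch below. The grid size and normalization are illustrative assumptions, and a real system would apply far more careful signal processing.

```python
import numpy as np

def region_green_means(face_frames, grid=(4, 4)):
    """Average the green channel over a grid of face regions for each frame.

    face_frames: array of shape (T, H, W, 3) containing aligned face crops.
    Returns an array of shape (grid_rows * grid_cols, T): one weak
    photoplethysmography-like signal per region.
    """
    T, H, W, _ = face_frames.shape
    rows, cols = grid
    signals = np.zeros((rows * cols, T))
    for r in range(rows):
        for c in range(cols):
            patch = face_frames[:, r * H // rows:(r + 1) * H // rows,
                                c * W // cols:(c + 1) * W // cols, 1]
            signals[r * cols + c] = patch.mean(axis=(1, 2))
    return signals

def spatiotemporal_map(signals):
    """Center and scale each regional signal; the stacked result forms a 2-D
    map (regions x time) that a downstream classifier could consume."""
    centered = signals - signals.mean(axis=1, keepdims=True)
    scale = centered.std(axis=1, keepdims=True) + 1e-8
    return centered / scale
```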
However, Intel's solution has raised questions, especially as Intel has claimed a 98 percent accuracy rate for the tool. That claim comes as real-time analysis has proven difficult, with most successful detectors being reactive. The approach also appears to depend on visual alignment, which suggests the technology needs specific conditions to work effectively. For example, it is unclear whether the technology works if the subject's face is in profile, if there are visibility issues, whether the lighting of the subject matters, and whether it works for people of all racial and ethnic backgrounds.
Despite the technological approaches to detecting deepfakes, human detection remains a key element. For individuals seeking not to be taken in by deepfakes, a critical or skeptical attitude toward media, especially media that seems incredible or too good to be true, remains the best defense. However, studies have found people's ability to detect deepfakes to be only just above chance, at around 62 percent, and interventions intended to train people in better detection did not significantly improve this accuracy. The same research found that participants' confidence in their answers was unreasonably high and unrelated to their actual detection accuracy.
Regardless of the technology brought to bear on deepfake media, many governments are developing regulations around the development and distribution of deepfakes. Much of this regulation aims to stop users from spreading deepfakes, which fuel misinformation and distrust and cause other disruptions in the wider economy.
This has led to calls for regulations and laws against deepfakes, especially where they can be deemed the root cause of economic or reputational damage. Laws that have been suggested as possible avenues for litigating against deepfakes include copyright law, especially where a deepfake uses copyrighted material; defamation law, particularly where a deepfake causes reputational harm to an individual; privacy law, which could be especially useful where an individual's personal information is leaked or distributed in a deepfake; appropriation of personality, where anti-identity-theft laws could be stretched to cover the use of a person's likeness for economic advantage through deepfake technology; and the criminal code, especially in cases of deepfaked pornography.
However, in many cases existing legislation does not cover the complexity inherent in deepfakes, especially when so much deepfaked media is spread through anonymous accounts on social media sites with little to no information about its origin. This has led to calls to regulate the technology that enables the creation of deepfakes, which is itself complicated because much of that technology is part of broader artificial intelligence systems.
The first country to adopt anti-deepfake regulation was China. In January 2023, China introduced regulations intended to combat the development and distribution of deepfakes, which the Chinese government views as a risk to its control over the country's internet. The key provisions require a person's consent before their image can be used in any deep synthesis technology; prohibit deep synthesis services from being used to disseminate "fake news," while also requiring those services to authenticate the real identity of their users; require that any synthetic content carry a notice informing users that the image or video has been altered; and prohibit any content that violates existing laws, as the country considers such content a danger to national security and its interests.