- The Danger of Deepfakes
- A Look Into Our Model
- Putting It All Together: Example Input and Response
- Final Thoughts
The Danger of Deepfakes
When generative AI models first gained popularity in the late 2010s, they brought with them the ability to create deepfakes. Deepfakes are synthetic media, typically video, in which one person’s likeness is replaced by another’s using deep learning. They are powerful tools for fraud and misinformation, allowing for the creation of synthetic videos of political leaders and letting scammers easily take on new identities.
The primary use of deepfake technology, though, is the fabrication of nonconsensual pornography. The term “deepfake” itself was coined in 2017 by a Reddit user of the same name who made fake pornographic videos featuring popular female celebrities. In 2019, the company Sensity AI catalogued deepfakes across the web and reported that 96% of them were pornographic, and that all of the pornographic ones depicted women. In the years since, more of this sort of deepfake pornography has become readily available online, with countless forums and even entire porn sites dedicated to it. The targets are not just celebrities. Everyday women are also superimposed into adult content by request, which amounts to on-demand revenge porn for anyone with an internet connection.
Many sites have banned deepfakes entirely, since they are far more often used for harm than for good. At Hive, we’re committed to providing API-accessible solutions for challenging moderation problems like this one. We’ve built our new Deepfake Detection API to empower enterprise customers to easily identify and moderate deepfake content hosted on their platforms.
This blog post explains how our model identifies deepfakes and the new API that makes this functionality accessible.
A Look Into Our Model
Hive’s Deepfake Detection model is essentially a version of our Demographic API that is optimized to identify deepfakes as opposed to demographic attributes. When a query is submitted, this visual detection model locates any faces present in the input. It then performs an additional classification step that determines whether or not each detected face is a deepfake. In its response, it provides a bounding-box location and classification (with confidence scores) for each face.
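The two-stage flow described above (find faces, then classify each one) can be sketched in a few lines. This is a minimal illustration, not Hive's implementation: `detect_faces` and `deepfake_score` are hypothetical callables standing in for the two model stages, and the box format and the 0.5 cutoff between labels are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (left, top, right, bottom) in pixels; this coordinate convention is an assumption.
Box = Tuple[int, int, int, int]


@dataclass
class FaceResult:
    box: Box           # where the face was found
    label: str         # "yes_deepfake" or "no_deepfake"
    confidence: float  # confidence in `label`, from 0.0 to 1.0


def analyze_frame(
    frame: object,
    detect_faces: Callable[[object], List[Box]],
    deepfake_score: Callable[[object, Box], float],
) -> List[FaceResult]:
    """Stage 1 locates faces; stage 2 classifies each face independently."""
    results = []
    for box in detect_faces(frame):
        # Hypothetical classifier output: probability that this face is fake.
        p_fake = deepfake_score(frame, box)
        if p_fake >= 0.5:  # assumed cutoff, for illustration only
            results.append(FaceResult(box, "yes_deepfake", p_fake))
        else:
            results.append(FaceResult(box, "no_deepfake", 1.0 - p_fake))
    return results
```

Keeping the detector and classifier as separate callables mirrors the point made above: the face detection stage can be shared with another model while only the classification stage is specialized for deepfakes.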
While the face detection aspect of this process is the same as the one used for our industry-leading Demographic API, the classification step was fine-tuned for deepfake identification by training on a vast repository of synthetic and real video data. Many of these examples were pulled from genres commonly associated with deepfakes, such as pornography, celebrity interviews, and movie clips. We also included other types of examples in order to create a classifier that identifies deepfakes across many different content genres.
Putting It All Together: Example Input and Response
Because the model has only one classification head, its response is easy to interpret. When an image or video query is submitted, it is first split into frames. Each frame is then analyzed by our visual detection model to find any faces present. Every face then receives a deepfake classification: either yes_deepfake or no_deepfake. Confidence scores for these classifications range from 0.0 to 1.0, with a higher score indicating higher confidence in the model’s result.
Here we see the deepfaked image and, to its left, the two original images used to create it. This input image doesn’t appear to be fake at first glance, especially when the image is displayed at a small size. Even with a close examination, a human reviewer could fail to realize that it is actually a deepfake. As the example illustrates, the model correctly identifies this realistic deepfake with a high confidence score of more than 0.99. Since there is only one face present in this image, we see one corresponding “bounding poly” in the response. This “bounding poly” contains all model response information for that face. Vertices and dimensions are also provided, though those fields are truncated here for clarity.
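To make the shape of such a response concrete, here is a small sketch of how a client might read it. The field names (`bounding_poly`, `classes`, `class`, `score`) and the sample values are assumptions chosen to match the description above; they are not copied from the API reference, so check the actual response schema before relying on them.

```python
from typing import Any, Dict, List

# Illustrative response for one frame containing one face.
# Field names and values are assumed, not taken from the real API.
example_frame_output = {
    "bounding_poly": [
        {
            "classes": [
                {"class": "yes_deepfake", "score": 0.995},
                {"class": "no_deepfake", "score": 0.005},
            ],
            # vertices and dimensions omitted, as in the text above
        }
    ]
}


def summarize_faces(frame_output: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the top label and its confidence for every face in a frame."""
    faces = []
    for poly in frame_output.get("bounding_poly", []):
        scores = {c["class"]: c["score"] for c in poly.get("classes", [])}
        if not scores:
            continue  # skip entries that carry no classification
        label = max(scores, key=scores.get)
        faces.append({"label": label, "confidence": scores[label]})
    return faces
```

Calling `summarize_faces(example_frame_output)` yields one entry, because the frame contains one face and therefore one bounding poly.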
Because deepfakes like this one can be very convincing, they are difficult to moderate with manual flagging alone. Automating this task not only accelerates moderation but also catches realistic deepfakes that human reviewers might miss.
Digital platforms, particularly those that host NSFW media, can integrate this Deepfake Detection API into their workflows by automatically screening all content as it is posted. Video communication platforms and applications that use any kind of visual identity verification can also utilize our model to counter deepfake fraud.
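One way a platform might turn per-face scores into a moderation decision at upload time is sketched below. The function name, the three outcomes, and both thresholds are hypothetical choices for illustration; appropriate thresholds depend on each platform's tolerance for false positives and false negatives.

```python
from typing import Iterable, List


def screen_content(
    frames: Iterable[List[float]],
    block_at: float = 0.9,   # assumed threshold for automatic removal
    review_at: float = 0.5,  # assumed threshold for human review
) -> str:
    """Decide what to do with an upload.

    `frames` holds, for each frame, the yes_deepfake score of every
    face detected in that frame. The single highest score decides.
    """
    worst = max((score for frame in frames for score in frame), default=0.0)
    if worst >= block_at:
        return "block"
    if worst >= review_at:
        return "review"
    return "allow"
```

Using the highest score across all faces and frames is a deliberately conservative policy: one confidently flagged face is enough to hold back the whole upload, while borderline cases are routed to a human reviewer rather than decided automatically.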
Hive’s Deepfake Detection API joins our recently released AI-Generated Media Recognition API in our effort to expand content moderation to keep pace with the fast-growing domain of generative AI. Moving forward, we plan to update both models continually to keep up with new generative techniques, popular content genres, and emerging customer needs.
The recent popularity of diffusion models like Stable Diffusion, Midjourney, and DALL-E 2 has brought deepfakes back into the spotlight and sparked conversation on whether these newer generative techniques can be used to develop brand-new ways of making them. Whether or not this happens, deepfakes aren’t going away any time soon and are only growing in number, popularity, and quality. Identifying and removing them across online platforms is crucial to limit the fraud, misinformation, and digital sexual abuse that they enable.
If you’d like to learn more about our Deepfake Detection API and other solutions we’re building, please feel free to reach out to firstname.lastname@example.org or contact us here.